This is an old revision of the document!
Chapter 7: Network Flow
A bipartite graph can be divided into two sets of nodes where all edges have one end in each. The graph must be undirected. In a matching, no element of a set may appear in more than one pair ( each node may appear in at most one edge). Perfect edge means exactly one edge for every node(no disconnected nodes). These concepts are firmly imbued with Network Flow. We will use concepts shown by Network Flow to attempt to solve the problem: What is the size of the largest matching in a some bipartite graph G? As we will learn, this problem requires advanced techniques to solve which we have not yet covered.
Section 7.1: The Maximum-Flow Problem and the Ford-Fulkerson Algorithm
First, we need to understand the basics of flow networks. Flow is what travels along the edges in a flow network between nodes. Each edge has a capacity Ce representing the maximum amount of flow that it can transport. Further, flow networks must have two special nodes: a source node which the original flow comes from and a sink node which is the final node receiving the flow(takes all that is transported through). There are several required conditions for a graph to represent a flow network. No node may exceed its capacity(capacity condition) and the amount of flow from a given edge cannot exceed the amount of flow into it(conservation condition). The maximum flow problem involves finding a flow of maximum possible value in some flow network. To solve this, we will focus on how some node's capacities put bounds on flow. A “cut” of a flow network is basically just a path that some quantity of flow follows between nodes from source to sink. Unfortunately, Dynamic Programming does not work here. It is important to note that we can divide flow distribution along multiple cuts in our network. For this reason, we cannot appear to use a greedy algorithm either. A graph representation made after the flow has been distributed is called a residual graph. Capacity values are placed by each edge(now represent actual expected flow values) and each node can have both backward and forward edges. A backward edge is made when flow is diverted from the current path back to go on a different cut. Flow can only be diverted when the current node's capacity is exceeded. Clearly, residual graphs will often have more edges than non residual graphs. New capacities of each edge are called residual capacities. Bottleneck capacity is the lowest capacity of any node in some path(cut?). We build a function augment to manage the flows of various edges in residual graphs. For each edge in some cut, if the edge is forward, its flow is increased by the bottleneck and if it is backward its flow is decreased by the bottleneck. Finally, we can use a simple algorithm to compute the residual graph with max flow. All that this algorithm does is apply the augment function to some s-t path in the graph while an s-t path exists. We call this the Ford-Fulkerson Algorithm. This section was a little complicated at times but I found the material interesting and intuitive. This gets an 8/10.
Section 7.2: Maximum Flows and Minimum Cuts in a Network
We now dive into an explanation of why the Ford Algorithm works. Clearly, the conservation principle applies to cuts just as it applies to individual edges: the amount of flow going into a cut can never be less than the amount of flow which leaves. Further, we define the capacity of a cut as the sums of the capacities of the nodes contained within it. We know that if a flow does not allow a path in the resulting residual graph, then it must have the highest capacity of all flows and lowest bottleneck. I am confused how the book derived this or what exactly it means(very unclear). It appears that these proofs show there must be a definite best possible cut that maximizes flow in the residual graph and by trying all possibilities the algorithm inevitably finds it. This section gets a 2/10. It appears to make jumps and concepts that should be intuitively simple into symbolic monstrosities. Seems too dense and abstract to understand without further discussion. I can still grasp most of the concepts but the proofs and wording provided in the book are very hard to make sense of.
Section 7.5: A First Application: The Bipartite Matching Algorithm
We now apply what our knowledge of networks to the bipartite question itself. That is, What is the size of the largest matching in a some bipartite graph G? First, we show the equivalence between maximum flow and maximum matching. We discuss a version of the network problem in which all capacities of flow are 1. Clearly, here the residual graph contains the same number of edges as the original graph. It is also obvious that only the flow from a single edge can enter any given edge in our graph since max capacity is 1 and there can be no non-integer flows. Now, we make the leap that the size of the maximum flow with these conditions applied is equal to the size of the maximum possible matching. This would not be the case if all capacities were not 1, but 1 capacity creates the geometric conditions that we need for this relationship. Since these are equivalent, we can simply use the Ford Fulkerson algorithm on our graph to find maximum matching. This requires O(mn) time. Not all bipartite graphs have perfect matchings. We look for something tangible to help us identify rather some bipartite graph G has a perfect matching or not. Luckily, this is very simple. G has a perfect matching if the size of its two sets of nodes(two sides) are equal. I rate this section at 5/10. The concepts themselves are again somewhat intuitive. However, there is so much mathematical and proof based noise here that understanding the content proved difficult for me. I had to read this section three times before I really knew what was going on and even then I failed to grasp some of the denser material.
Section 7.7: Extensions to the Maximum Flow Problem
The significance of the maximum flow problem is not its relation to Networks. Rather, concepts from our solution can be used to model many other types of problems and solve them in polynomial time. One such problem is circulation with demands. Here, there can be more than one sink and more than one source generating/receiving flow. Sources have “supplies” and sinks have “demands”. Some flows send out more flow than they receive and others take. We note that all flows still must be integers and the capacity principle still applies. We now have a demand condition which replaces our previous conservation condition. Flow must also adhere to the value of the demand:enters +- demand must leave each node. Unlike our original application, we wish to find out if all demands can possibly be satisfied in some graph rather than maximizing flow. This is a feasibility problem. Clearly, if the demand condition (and the capacity condition) is satisfied, meeting all demands is feasible. By creating a master source and master sink, we can again reduce this down to a situation very similar to that of the maximum flow problem, a tactic also used for matching.
