Maximum Matching in Bipartite Graph

A matching in a graph is a sub set of edges such that no two edges share a vertex. The maximum matching of a graph is a matching with the maximum number of edges. This is very difficult problem.

We study only maximum matching in a bipartite graph. In a bipartite graph the vertices can be partition into two disjoint sets V and U, such that all the edges of the graph have one end point vertex in U and the other end in V.

Example of bipartite graph: boys and girls that will dance with each other.

Any other examples?

On the side: A graph is a 2-colorable graph when the vertices can be colored one of two colors such that no edge connects the same color. Naturally, the bipartite graph is 2-colorable.

Guess the Problem: Maximum matching in a bipartite graph or two-color graph, meaning matching one color with the other using edges.

How could we solve? Consider problem transformation to maximum flow. The cost would be O(nm²) using Edmonds-Karp Algorithm.

We will develop an alternative algorithm using an Augmenting Path.

For an iterative method to increase the number of matches there must be unmatched (free) vertices in both V and U.

So, the matching will increase by adding an edge of two free vertices, one vertex from each U and V.

Augmenting Path

In general, we can increase the size of the matching, M, by constructing a simple path from V to U. Starting from a free vertex in V and ending in a free vertex in U.

How many edges in this path, is it odd or even? It must be odd.

The path is augmenting if the edges of the simple path alternate from edges in E-M and edges M, meaning edges not in the matching and edges in the matching.

The first edge is not in M, so the second and every even number edge is in M.

Because the path begins in V and ends in U, the length of the path is odd. So, augmenting the match, M, by adding the odd edges to M and subtracting the even edges from M will increase the size of M by one.

Illustrate an Augmenting Path.

Theorem

A matching M is maximal if and only if there does not exist an augmenting path with respect to M.

Proof

If there is an augmenting path then we have already shown that M is not maximal.

Now we prove that if there are not any augmenting path with respect to M then M is maximal. We will prove by contradiction.

Assume that the matching, M, does not have augmenting path but is not maximal.

Let M* be a maximum matching in G.

So, |M*| > |M| because M* must have at least one more edge than M.

M*-M is M* less the shared edges and M-M* is M less the shared edges

| M*-M| > |M-M*|

because |M*| > |M|.

Consider the edges in the symmetric difference

Definition of symmetric difference:

M*sym diff M = (M-M*) union (M*-M),

which represents the set of edges that is either in M or M* but not in both.

Let G' be sub graph made of the edges of M*sym diff M.

Because the edges of G' are from one of two matches, there can be no more than two edges incident on any vertex in G'.

So, each vertex in G' has degree 2 or less.

Therefore any connected component of G', is either a path or a cycle with an even number of edges.

The cycle or path must have alternating edges from M-M* and M*-M,

because G' = M*sym diff M and M* and M are matchings.

To convince your self of the above suppose the first edge is in M ending in vertex v, if the second edge is in M, the two edges share the same vertex, so M was not a matching.

Since |M*-M| > |M-M*| and any cycle is even, there must exist one path that starts and ends in M*-M but alternates between M*-M and M-M*. This is an augmenting path for M, but that is a contradiction.

Maximum Matching Algorithm

The theorem suggest a technique for iterative improvements of a matching, M. The algorithm iteratively finds an augmenting path and augments the matching, M.

We will find the augmented path by performing several BFS traversals.

Along the way, vertices are label, labels for V vertices represent edges in the matching, M. Labels for U vertices represent edges not in M, meaning in E-M.

The BFS begins on free V vertices, and ends when it finds a free U vertex. It can follow the labels, adding new edges from M and removing edges from M.

The BFS is achieved by filling the queue first with all the free vertices in V. U vertices do not enter the queue unless they have a mate.

The BFS examines the front of the queue, w and adjacent vertices adjacent to w. There are two cases for w:

Case 1: The front of the queue, w is in V.

If u is free then this edge can be the last edge of an augmenting paths. The algorithm augments the matching with the path beginning with (u, w) and following the labeling in w.

If u is not free and (w, u) is not in M then label u with w only if it does not already have a label. Add u to the queue.

This is a potential edge for the augmenting path that is in M, and is the only way for a vertex from U to enter the queue.

Note that u is not free means it has a mate, v*. The edge (v*, u) is in M. The algorithm needs to find it. Case 2 does this

Case 2: The front of the queue, w is in U. In this case w must be matched (see above) so we label the mate (V vertex) with w and enqueue the mate of w.

Note w (a U vertex) could only have entered the queue because an alternative edge in M was found (see Case 1), labeling w's mate is representing the old edge in M that should be removed.

Note that U vertices labels represent edges in E-M and V vertices label represent edges in M.

Algorithm MaximumBigartiteMatching(G)

initialize set M of edges // can be the empty set

initialize queue Q with all the free vertices in V

while not Empty(Q) do

w ← Front(Q)

if w ε V then

for every vertex u adjacent to w do // u must be in U

if u is free then // augment

M ← M union (w, u)

v ← w

while v is labeled do // follow the augmenting path

u ← label of v

M ← M - (v, u) // (v, u) was in previous M

v ← label of u

M ← M union (v, u) // add the edge to the path

// start over

remove all vertex labels

reinitialize Q with all the free vertices in V

break // exit the for loop

else // u is matched

if (w, u) not in M and u is unlabeled then

label u with w // represents an edge in E-M

Enqueue(Q, u)

// only way for a U vertex to enter the queue

else // w ε U and therefore is matched with v

v ← w's mate // (w, v) is in M

label v with w // represents in M

Enqueue(Q, v) // only way for a mated v to enter Q

Illustrate the algorithm

Cost

Each of iteration matches two free vertices, one from V and one from U. So the total number of iterations can not exceed floor(n/2)+1, where n = |U|+|V|.

The time spend on each iteration is O(n+m), where m = |E|, assuming that labeling and determining free or mate are constant. Note that augmenting cost O(m) and is performed only once during an iteration.

So the total cost is O(n(n+m))