2.3 Implementing The Stable Matching Algorithm using Lists and Arrays
To analyze the running time of an algorithm asymptotically, we don't need to implement and run it, but we do have to think about how the data will be represented and manipulated in an implementation, so that we can bound the number of computational steps it takes. This section looks at how to implement the Gale-Shapley algorithm using lists and arrays. In the Stable Matching Problem that the algorithm solves, each man and each woman has a ranking of all members of the opposite gender, so we need to decide how these rankings will be represented. In addition, we need to keep track of which men and women are free, and who is matched with whom. Finally, we need to decide which data structures to use for all of this data. The goal for the designer is to choose data structures that make the algorithm efficient and easy to implement.
Algorithm Analysis
Let's focus on a single list, such as a list of women in order of preference by a single man.
The simplest way to keep a list of n elements is to use an Array A of length n.
In such an array, A[i] is the ith element of the list.
With such an array we can:
Determine the ith element of the list in O(1) time, by accessing the value A[i].
Determine if a given element e belongs to the list by checking the elements one by one in O(n) time, assuming we don't know the order in which the elements are arranged in A.
Determine whether a given element e belongs to A in O(log n) time using binary search, if A is sorted.
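The three array operations above can be sketched in Python as follows. This is only an illustration: the helper names contains_linear and contains_sorted and the sample values are assumptions, and Python lists are indexed from 0 rather than from 1.

```python
from bisect import bisect_left

def contains_linear(A, e):
    """O(n): check the elements one by one; makes no assumption about order."""
    for x in A:
        if x == e:
            return True
    return False

def contains_sorted(A, e):
    """O(log n): binary search; only valid if A is sorted in ascending order."""
    i = bisect_left(A, e)          # leftmost position where e could be inserted
    return i < len(A) and A[i] == e

A = [3, 8, 15, 21, 42]             # hypothetical sorted array
third = A[2]                       # O(1) indexed access (0-based, so the 3rd element)
```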
An array is less suitable for dynamically maintaining a list of elements that changes a lot over time, such as the list of free men. Our collection needs to grow and shrink frequently throughout the execution of the algorithm, and frequently adding or deleting elements in an array is cumbersome. Thus we choose an alternative to an array: the linked list.
In a linked list, each element points to the next in the list. Thus:
For each element v, we maintain a pointer to the next element
The “next” pointer is set to null (None in Python) if v is the last element in the list
The pointer “First” points to the first element in the list
By starting from First, the traversal of the entire list can be done in time proportional to its length
To implement such a list:
We would allocate a record e for each element we want to put in the list
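For each record e: e.val holds the value of the element, and e.next points to the next element in the list (e.next = null if e is the last element)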
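A singly linked list of this kind might look like the sketch below. The class name Node, the field names val and next, and the helpers build_list and traverse are assumptions chosen for illustration.

```python
class Node:
    """One record per list element (field names val / next are assumptions)."""
    def __init__(self, val, next=None):
        self.val = val
        self.next = next           # None plays the role of null

def build_list(values):
    """Return the First pointer of a singly linked list holding values in order."""
    first = None
    for v in reversed(values):
        first = Node(v, first)     # prepend, so the final order matches values
    return first

def traverse(first):
    """Follow next pointers from First; time proportional to the list length."""
    out = []
    e = first
    while e is not None:
        out.append(e.val)
        e = e.next
    return out
```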
If we want to implement a doubly linked list, which is traversable in both directions:
In addition to what the singly linked list contains, we add e.Prev, which points to the previous element in the list (e.Prev = null if e is the first element).
We also keep a pointer “Last” that points to the last element in the list
Thus, with a doubly linked list:
Deletion: to delete e, make e.Prev and e.next point directly to each other, i.e. set the next pointer of e.Prev to e.next and the Prev pointer of e.next to e.Prev.
Insertion: to insert e between d and f, update d.next and f.Prev to point to e, and set e.Prev to point to d and e.next to point to f.
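The deletion and insertion steps for a doubly linked list can be sketched as below. The class and method names (DNode, DoublyLinkedList, insert_after, delete, to_list) are assumptions; the extra branches handle the cases where e is the first or last element, so that First and Last stay correct.

```python
class DNode:
    def __init__(self, val):
        self.val = val
        self.prev = None
        self.next = None

class DoublyLinkedList:
    def __init__(self):
        self.first = None
        self.last = None

    def insert_after(self, d, e):
        """Insert node e right after node d (d=None inserts at the front). O(1)."""
        f = self.first if d is None else d.next
        e.prev, e.next = d, f
        if d is None:
            self.first = e
        else:
            d.next = e
        if f is None:
            self.last = e
        else:
            f.prev = e

    def delete(self, e):
        """Unlink e by making its neighbours point to each other. O(1)."""
        if e.prev is None:
            self.first = e.next
        else:
            e.prev.next = e.next
        if e.next is None:
            self.last = e.prev
        else:
            e.next.prev = e.prev
        e.prev = e.next = None

    def to_list(self):
        """Traverse from First, collecting values (for inspection only)."""
        out, e = [], self.first
        while e is not None:
            out.append(e.val)
            e = e.next
        return out
```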
Disadvantage of linked lists: to find the ith element we have to follow the “next” pointers starting from “First” until we reach it, which takes O(i) time, whereas it takes O(1) time with an array.
Converting between the array and linked list representations takes O(n) time in either direction.
Implementation of the algorithm
The algorithm terminates after at most n^2 iterations, so to get an O(n^2) running time we need to implement each iteration in O(1) time.
Let's assume the sets of men and women are both {1,…,n}.
To do this, we can order the men and the women, and associate the number i with the ith man m_i or the ith woman w_i in that order.
In this way, we have an indexed array of all men or all women
We need preference lists for each man and each woman:
We use 2 arrays: one for women's preference lists and another for men's preference lists
ManPref[m,i] represents the ith woman on man m's preference list
Similarly, WomanPref[w,i] represents the ith man on woman w's preference list
The space needed to store the preferences of all 2n people is O(n^2), since each person has a list of length n.
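As a small illustration of the two preference arrays, here is a made-up instance with n = 3. The preference values are hypothetical, and people are numbered 0..n-1 instead of 1..n because Python lists index from 0.

```python
# Hypothetical instance with n = 3; men and women are numbered 0..n-1.
n = 3
ManPref = [
    [0, 1, 2],   # man 0 ranks woman 0 first, then 1, then 2
    [1, 0, 2],
    [0, 2, 1],
]
WomanPref = [
    [1, 0, 2],   # woman 0 ranks man 1 first, then 0, then 2
    [0, 1, 2],
    [2, 1, 0],
]

# ManPref[m][i] is the i-th woman on man m's list, read in O(1) time.
second_choice_of_man_1 = ManPref[1][1]

# 2n lists of length n each give 2 * n * n entries, i.e. O(n^2) space.
total_entries = sum(len(row) for row in ManPref) + sum(len(row) for row in WomanPref)
```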
To implement each iteration of the algorithm in constant time, we need to be able to do each of the following in constant time:
Identify a free man
For a man m, identify the highest-ranked woman to whom he has not yet proposed
For a woman w, decide whether she is currently engaged and, if she is, identify her current partner
For a woman w and two men m and m', decide which of m or m' is preferred by w
Selecting a free man: we maintain the set of free men as a linked list. To select a man, we take the first man m on the list. If m becomes engaged, we delete m from the list. If some other man m' becomes free, he is inserted at the front of the list. All of these operations take constant time with a linked list.
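The bookkeeping for free men can be sketched as follows. Note the substitution: instead of a hand-written linked list, this sketch uses Python's collections.deque, which also supports O(1) insertion and removal at the front. The instance (three men, man 0 later displaced by man 1) is hypothetical.

```python
from collections import deque

# Hypothetical instance: men are numbered 0..2 and all start out free.
free_men = deque([0, 1, 2])

m = free_men[0]               # select the first free man, O(1)
free_men.popleft()            # m becomes engaged: delete him from the front, O(1)

# Later, suppose the next free man (man 1) proposes and displaces m.
m_prime = m
free_men.popleft()            # man 1 becomes engaged and leaves the list
free_men.appendleft(m_prime)  # m_prime is free again: insert at the front, O(1)
```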
Identifying the highest-ranked woman to whom a man m has not yet proposed: To achieve this:
We maintain an extra array, call it Next, that indicates for each man m the position on his list of the next woman he will propose to.
We initialize Next[m] = 1 for all men m.
If a man m needs to propose to a woman, he will propose to w = ManPref[m,Next[m]]
Once m proposes to w, we increment the value of Next[m] by one, whether or not w accepts m's proposal.
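A minimal sketch of the Next array follows. The helper name next_proposal and the preference values are assumptions, and because Python lists are 0-based, Next[m] starts at 0 here rather than 1.

```python
# Hypothetical preference lists for n = 3 men (0-based numbering).
ManPref = [
    [0, 1, 2],
    [1, 0, 2],
    [0, 2, 1],
]
n = len(ManPref)
Next = [0] * n        # position of the next woman each man will propose to

def next_proposal(m):
    """Return the woman m proposes to next and advance his pointer. O(1)."""
    w = ManPref[m][Next[m]]
    Next[m] += 1      # incremented whether or not w accepts
    return w

first = next_proposal(1)
second = next_proposal(1)
```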
Identifying the man m' (if such a man exists) that w is currently engaged to when m proposes to w: To do this:
We can maintain an array Current of length n, where Current[w] is woman w's current partner m'
Current[w] = null if w is not currently engaged
Thus Current[w] = null for all women w at the start of the algorithm.
Deciding which of m or m' is preferred by w: Scanning w's preference list to find m and m' would take O(n) time, so to achieve constant time for this step:
At the start of the algorithm, we create an n × n array Ranking
Ranking[w,m] contains the rank of man m in the sorted order of w's preferences
Creating the array Ranking takes O(n^2) time, since for each of the n women w we make a single O(n) pass over her preference list.
To decide which of m or m' is preferred by w, we just compare the values Ranking[w,m] and Ranking[w,m'], which takes constant time.
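Building Ranking from WomanPref and using it for a constant-time comparison can be sketched like this. The function names build_ranking and prefers and the sample preferences are assumptions; numbering is 0-based, and a smaller rank means more preferred.

```python
def build_ranking(WomanPref):
    """Invert each woman's preference list: Ranking[w][m] = position of m. O(n^2)."""
    n = len(WomanPref)
    Ranking = [[0] * n for _ in range(n)]
    for w in range(n):
        for i, m in enumerate(WomanPref[w]):   # one O(n) pass per woman
            Ranking[w][m] = i
    return Ranking

def prefers(Ranking, w, m, m_prime):
    """True if woman w prefers m to m_prime; two array lookups, O(1)."""
    return Ranking[w][m] < Ranking[w][m_prime]

# Hypothetical preferences for n = 3 women.
WomanPref = [
    [1, 0, 2],
    [0, 1, 2],
    [2, 1, 0],
]
Ranking = build_ranking(WomanPref)
```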
Together, the data structures described above let us implement the Gale-Shapley algorithm in O(n^2) time.
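Putting the pieces together, one possible Python sketch of the whole algorithm is shown below. It is an illustration under stated assumptions rather than the book's code: people are numbered 0..n-1, a collections.deque stands in for the linked list of free men, every preference list is assumed to be complete, and the function name gale_shapley is my own choice.

```python
from collections import deque

def gale_shapley(ManPref, WomanPref):
    """Return Current, where Current[w] is the man matched to woman w."""
    n = len(ManPref)

    # Ranking[w][m] = position of m on w's list, built in O(n^2) time.
    Ranking = [[0] * n for _ in range(n)]
    for w in range(n):
        for i, m in enumerate(WomanPref[w]):
            Ranking[w][m] = i

    Next = [0] * n               # next position each man will propose to
    Current = [None] * n         # None means the woman is not engaged
    free_men = deque(range(n))   # every man starts out free

    while free_men:
        m = free_men[0]                  # first free man
        w = ManPref[m][Next[m]]          # best woman he has not proposed to
        Next[m] += 1
        m_prime = Current[w]
        if m_prime is None:              # w is free: she accepts
            Current[w] = m
            free_men.popleft()
        elif Ranking[w][m] < Ranking[w][m_prime]:
            Current[w] = m               # w trades up; m_prime becomes free
            free_men.popleft()
            free_men.appendleft(m_prime)
        # otherwise w rejects m and he stays at the front of the list
    return Current
```

Each pass through the loop does a constant amount of work, and there are at most n^2 proposals, which matches the O(n^2) bound.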
To be honest, I hadn't quite understood how we can identify the woman w that a man m will propose to, and do it in constant time, until I reread the section and saw how it works. The representation of the preference lists was a little trickier, because I was thinking of a different way of implementing them, but I'm now convinced the method in the book is really efficient.
This section was really interesting; the material in the book seems to get more and more interesting as we advance. I give this section a 9/10.