We have spent some time discussing approaches to solving a number of common recurrences. We shall illustrate the application of divide-and-conquer to problems form a number of different domains. The Problem. We will consider a problem that arises in the analysis of rankings such as couple of websites use a technique known as collaborative filtering, in which they try to match preferences of books, movies, e.t.c with others on the internet. A website can recommend stuff by finding people with similar tastes as you. A problem that websites face is one of comparing two rankings. What is a good way to measure, numerically, how similar two people's rankings are? A natural method would be to label the movies from 1 to n according to your ranking, then order these labels according to the stranger's ranking, and see how many pairs are out of order. A natural way to quantify this notion is by counting the number of inversions. A sequence 2,4,1,3,5 has three inversions. Designing and Analyzing the algorithm. What is the simplest algorithm to count inversions? We could look at every pair and decide but this would take us O(n^2) run time. We shall now aim at O(n log n) time. The basic idea is to split the problem into two parts, find the inversions in the first half, then in the second half, and finally any inversions in both halves. The tricky part is dealing with inversions in both halves(Merge-and-Count). We shall use two pointers at the beginning of both halves and remove the smaller number at which the pointer is currently at, and append it to a new list. If the number at which the pointer is in the second half is smaller than that in the first half, that is when we only increase the number of inversions. Merge-and-count algorithm is on page 224. Since we are only comparing and adding elements to a new array, the two pointers go through all elements in respective halves until both halves are empty, this is what the while loop is doing. Thus, Merge-and-Count will take O(n) time. This is not the overall algorithm, since we need to count the inversions in both halves separately. The final algorithm, Sort-and-Count is on page 225. The Sort-and-Count algorithm correctly sorts the input list and counts the number of inversions; it runs in O(n log n) time for a list with n elements.

Although this section was kinda long, I enjoyed learning how to solve this problem using divide-and-conquer algorithms, thus I give it a 9/10.

courses/cs211/winter2014/journals/fred/5.3_counting_inversions.txt · Last modified: 2014/03/12 02:11 by gisaf
CC Attribution-Noncommercial-Share Alike 4.0 International
Driven by DokuWiki Recent changes RSS feed Valid CSS Valid XHTML 1.0