Chapter 5.3: Counting Inversions

Consider comparing two rankings of the same set of n movies. A natural way to quantify this notion is by counting the number of inversions. We say that two indices i < j form an inversion if a(i) > a(j), that is, if the two elements a(i) and a(j) are “out of order.”

Designing/Analyzing the Algorithm

We can count the number of inversions in O(nlogn) time since there can be a quadratic number of inversions and therefore there must be such an algorithm to compute the total number without ever looking at each inversion individually. We split the list into two pieces and first count the number of inversions of each of these two halves. Then we count the number of inversions (a(i), a(j)), where the two numbers belong to different halves.

Now the hard part. Merge-and-Count is the process we use once we have recursively sorted the first and second halves of the list and counted the inversions in each. We now have two sorted lists A and B, and we want to produce a single sorted list C from their union, while also counting the number of pairs (a,b) with a in A, b in B, and a > b. Merging is very similar to mergesort, however we also need to count the inversions. Every time the element a(i) is appended to C, no new inversions are encountered, since a(i) is smaller than everything left in list B, and it comes before all of them . On the other hand, if b(j) is appended to list C, then it is smaller than all the remaining items in A, and it comes after all of them, so we increase our count of the number of inversions by the number of elements remaining in A.

Here is the algorithm in full:

Merge-and-Count(A,B)
  Maintain a //Current// pointer into each list, initialized to point to the front elements
  Maintain a variable //Count// for the number of inversions = 0
  While both lists are nonempty:
    Let a(i) and j(j) be the elements pointed to by the //Current// pointer
    Append the smaller of these two two the output list
    If b(j) is the smaller element then
      Increment //Count// by the number of elements remaining in A
    Advance the //Current// pointer in the list from which the smaller element was selected
  Once one list is empty, append the remainder of the other list to the output
  Return //Count// and the merged list
  

This runtime is O(n).

The whole algorithm is as follows:

Sort-and-Count(L)
  If the list has one element then there are no inversions
  Else
    Divide the list into two halves:
      A contains the first [n/2] elements
      B contains the remaining [n/2] elements
    (r(a), A) = Sort-and-Count(A)
    (r(b), B) = Sort-and-Count(B)
    (r, L) = Merge-and-Count(A,B)
    
  Return r = r(a) + r(b) + r, and the sorted list L

Since merge-and-count takes O(n) the running time of the full Sort-and-Count takes O(nlogn) due to its sorting.

This is a very easily understood chapter and a very useful algorithm. 10/10.

courses/cs211/winter2014/journals/stephen/home/chapter-5.3.txt · Last modified: 2014/03/11 17:43 by rowleys
CC Attribution-Noncommercial-Share Alike 4.0 International
Driven by DokuWiki Recent changes RSS feed Valid CSS Valid XHTML 1.0