
# FOCS 2006 Problem Set 3 Solutions

## Problem 1

Consider an optimization of merge sort in which the original list of length n is divided into disjoint sublists of length k, these lists are sorted using insertion sort, and then the n/k sublists of length k are merged (pairwise, successively, as in merge-sort) until the final sorted list of length n is achieved.

1. How many steps will it take to sort each of the sublists of length k? Give your answer in asymptotic (Theta, like big-Oh) notation. Indicate whether your analysis is best, worst, or average case.

Insertion sort is (worst and average cases) Theta(n^2). Since our input is of size k, this is Theta(k^2). (Best case, i.e., assuming the sublists were magically sorted already, would be Theta(k) just to verify. But it ain't gonna happen.)
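For concreteness, here's a minimal Python sketch of insertion sort (the function name and details are mine, not part of the problem set); the nested loop is what gives the Theta(k^2) worst/average behavior:

```python
def insertion_sort(a):
    """Sort list a in place. Worst/average case Theta(n^2); best case Theta(n)."""
    for i in range(1, len(a)):
        key = a[i]
        j = i - 1
        # Shift larger elements right until key's slot opens up.
        while j >= 0 and a[j] > key:
            a[j + 1] = a[j]
            j -= 1
        a[j + 1] = key
    return a
```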

2. How many steps will it take to sort all of the sublists of length k? Again, use Theta and indicate what you're analyzing.

Saying that sorting a sublist is Theta(k^2) means that there is some constant c for which c k^2 is a good estimate of the number of steps. (Well, technically it means that there are two constants, c1 and c2, and the actual number of steps is bounded by

• c1 k^2 <= steps <= c2 k^2

but I'm going to cheat and just use c so I don't have to keep writing the double bounds.)

Since there are n/k sublists, each taking c k^2 steps, this is (n/k) * c k^2 = c n k^2 / k = c n k, which is Theta(nk). (For those who want the gory details, this argument can be made twice with c1, <=, c2, and >= to really prove Theta. I will be satisfied with any reasonable approximation of the derivation, including waving hands madly in the obviously right direction or stating the correct answer by fiat.) This is again a worst or average case. (They're the same here, as the difference is just in the expected constant if the list isn't pessimally organized.)
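As a sketch of this phase (names are mine, chosen for illustration): cut the list into n/k pieces and insertion-sort each one, for (n/k) * Theta(k^2) = Theta(nk) total work.

```python
def sort_sublists(a, k):
    """Cut a into ceil(n/k) sublists of length <= k and insertion-sort each.
    Cost: (n/k) sublists * Theta(k^2) apiece = Theta(nk) overall."""
    def insertion_sort(run):
        for i in range(1, len(run)):
            key, j = run[i], i - 1
            while j >= 0 and run[j] > key:
                run[j + 1] = run[j]
                j -= 1
            run[j + 1] = key
        return run
    return [insertion_sort(a[i:i + k]) for i in range(0, len(a), k)]
```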

3. Once the n/k sublists of length k are individually sorted, how long will it take to merge them all back together? Again, Theta and case....

Now that the sublists are sorted, we do a standard merge on them. This will take log2(n/k) passes, each of size n, for a bound of n log2(n/k). That is, the (worst, average) case is Theta(n log(n/k)).

To see this, consider the merging problem: We start with n/k little lists and merge them pairwise, generating n/2k somewhat larger lists. This progression is building up the binary tree from the little lists at the leaves to the whole big merged list at the root (which is of course on top because this is an upside-down computational tree...). We know that this tree has depth log2 (n/k), so there are log2 (n/k) passes. Each pass touches each of the n elements exactly once, when it's being put into place in its own merge-list, so each pass is Theta(n). Theta(log (n/k) ) passes of size Theta(n) yields a running time for this phase of Theta(n log (n/k)).
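The pairwise-merging progression described above can be sketched as follows (a sketch, with names of my choosing): each trip through the `while` loop is one pass up the tree, halving the number of runs.

```python
def merge(left, right):
    """Standard merge of two sorted lists: Theta(len(left) + len(right))."""
    out, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            out.append(left[i]); i += 1
        else:
            out.append(right[j]); j += 1
    out.extend(left[i:])
    out.extend(right[j:])
    return out

def merge_all(runs):
    """Merge sorted runs pairwise, halving the count each pass.
    log2(#runs) passes, each touching every element once: Theta(n log(n/k))."""
    while len(runs) > 1:
        runs = [merge(runs[i], runs[i + 1]) if i + 1 < len(runs) else runs[i]
                for i in range(0, len(runs), 2)]
    return runs[0] if runs else []
```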

4. Given your analysis and what you know about merge sort, how should we choose k (as a function of n) so that this algorithm is asymptotically (Theta-notation) comparable to merge sort?

This would be a good time to remind ourselves that merge sort is Theta(n log n). Before we answer this question, we need to take a minute to put the algorithm back together. The total algorithm has two basic steps:

1. Sort the little lists.
2. Merge the sorted little lists into one big sorted list.

The time for this algorithm is Theta(nk) + Theta(n log (n/k)). Note that, without knowing what k is, we don't know which term dominates.

For example, if we set k=1, this is Theta(n*1) + Theta(n log (n/1)) = Theta(n) + Theta(n log n) = Theta(n log n), i.e., the n log n term dominates. This is asymptotically comparable to merge sort, which isn't particularly surprising because it is merge sort.

If, on the other hand, we choose k=n, we get Theta(n*n) + Theta(n log (n/n)) = Theta(n^2) + Theta(n) = Theta(n^2), with the nk (now n^2) term dominating.

The question, then, is how to pick k so that we get the most improvement -- make the n log (n/k) term small -- without doing worse than merge sort. If we don't worry about constants -- which is the next question (and also this one specifically says "asymptotically comparable") -- we can observe that the Theta(n log (n/k)) term will never do worse than Theta(n log n), which means that we can safely ignore it in picking k. That means that we just need to ensure that the other term -- Theta(nk) -- is no worse than Theta(n log n).

Happily, nk will be O(n log n) as long as k is O(log n), so we should pick k = Theta(log n).
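Putting the two phases back together with k ~ log2 n gives one possible end-to-end sketch (all names and the exact choice of k are mine; any k = Theta(log n) works asymptotically):

```python
import math

def hybrid_sort(a):
    """Hybrid sort with k ~ log2 n: insertion-sort length-k runs, then
    merge them pairwise. Both phases are then Theta(n log n)."""
    n = len(a)
    if n <= 1:
        return list(a)
    k = max(1, int(math.log2(n)))

    # Phase 1 -- Theta(nk): insertion-sort each length-k sublist.
    runs = []
    for i in range(0, n, k):
        run = a[i:i + k]
        for p in range(1, len(run)):
            key, q = run[p], p - 1
            while q >= 0 and run[q] > key:
                run[q + 1] = run[q]
                q -= 1
            run[q + 1] = key
        runs.append(run)

    # Phase 2 -- Theta(n log(n/k)): merge runs pairwise until one remains.
    while len(runs) > 1:
        merged = []
        for i in range(0, len(runs), 2):
            if i + 1 < len(runs):
                merged.append(_merge(runs[i], runs[i + 1]))
            else:
                merged.append(runs[i])
        runs = merged
    return runs[0]

def _merge(left, right):
    """Standard two-way merge, linear in the combined length."""
    out, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            out.append(left[i]); i += 1
        else:
            out.append(right[j]); j += 1
    out.extend(left[i:])
    out.extend(right[j:])
    return out
```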

5. Merge sort typically has a higher constant factor on its (asymptotic) running time than insertion sort. How could this be taken into account in selecting k? Oops, here come those pesky constants.

What the above analysis really means is that there are some constants (again, I'm only going to use one for each Theta rather than constant above and constant below) for which our new algorithm runs in c_insert n k + c_merge n log (n/k) steps. The question points out that c_merge >> c_insert, i.e., merge sort has a much bigger constant.

Any time that we can make k bigger, it helps the c_merge n log (n/k) term, since it reduces the number of merge passes. And making k bigger isn't so unreasonable in the first term if c_insert << c_merge. We want

• c_insert n k < c_merge n log n

i.e., c_insert k < c_merge log n

or k < (c_merge/c_insert) log n

As long as we adhere to this constraint, we're going to be improving on the running time of merge sort by using this algorithm. Since c_merge >> c_insert, this could be a large constant multiplier on the value of k.
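As a small illustration of that constraint (the constants here are placeholders of my own invention; in practice they'd be measured on the target machine):

```python
import math

def choose_k(n, c_insert=1.0, c_merge=4.0):
    """Largest k satisfying c_insert * k < c_merge * log2(n).
    c_insert and c_merge are assumed, machine-dependent constants."""
    if n <= 1:
        return 1
    return max(1, int((c_merge / c_insert) * math.log2(n)))
```

With the (made-up) ratio c_merge/c_insert = 4, the cutoff grows from plain log2 n by that constant factor, exactly as the inequality above predicts.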

## Problem 2

View the solution in this PDF.

## Problem 3

Here's an example of one acceptable answer to parts a and b. Other variant answers are acceptable as well:

```
sentence --> noun_phrase, verb_phrase, prepositional_phrase.
sentence --> noun_phrase, verb_phrase.
noun_phrase --> core_noun_phrase.
core_noun_phrase --> determiner, noun.
core_noun_phrase --> proper_noun.
verb_phrase --> verb_intransitive.
verb_phrase --> verb_transitive, noun_phrase.
prepositional_phrase --> preposition, noun_phrase.
```