2006Assignment3

Contents

Sort analysis
Context Free Grammars
Grammars in Prolog
Programming Project
Miscellany

This assignment is due on Thursday, 9 March. Please email your completed assignment to las and to sean.mcbride. sean, please add any format requests here.

Sort analysis

Consider an optimization of merge sort in which

the original list of length n is divided into disjoint sublists of length k,
these lists are sorted using insertion sort, and
then the n/k sublists of length k are merged (pairwise, successively, as in merge-sort) until the final sorted list of length n is achieved.

In this problem, k is a parameter that you are going to be setting, so do not assume that it is constant in your analysis.

How many steps will it take to sort each of the sublists of length k? Give your answer in asymptotic (theta, like big-Oh) notation. Indicate whether your analysis is best, worst, or average case. See note.
How many steps will it take to sort all of the sublists of length k? Again, use theta and indicate what you're analyzing.
Once the n/k sublists of length k are individually sorted, how long will it take to merge them all back together? Again, theta and case....
Given your analysis and what you know about merge sort, how should we choose k (as a function of n) so that this algorithm is asymptotically (theta-notation) comparable to (or better than) merge sort? Your goal is to come up with an inequality governing the value of k (no more than x or no less than y) that gives a total running time on this sort that is comparable to or better than the (asymptotic) running time of merge sort. (Otherwise, you might as well just use merge sort!)
Merge sort typically has a higher constant factor on its (asymptotic) running time than insertion sort. How could this be taken into account in selecting k? (If you wish, you may give the constant factor on merge sort a name. This is particularly encouraged if it figures into setting the value of k.)

See ReadingRoom/OrdersOfGrowth for additional resources.

Context Free Grammars

Consider the grammar: S --> a S | a S b S | epsilon

This grammar is ambiguous. Show in particular that the string a a b has

two:

parse trees
leftmost derivations (These are the ones that, starting from the top, rewrite the leftmost nonterminal first.)
rightmost derivations (These are the ones that, starting from the top, rewrite the rightmost nonterminal first. Said the other way, starting from the bottom, these are the ones that group together the leftmost input symbols lowest in the tree.)

Extra Credit/Challenge: Prove that this grammar generates all and only the strings of as and bs

such that every prefix has at least as many as as bs.

Extra Credit/Challenge: Find an unambiguous grammar for this language. Hint: Do the readings!

Grammars in Prolog

Expand the grammar that we discussed in class (for English sentences) so that it handles:

adjectives (e.g., [ambidextrous, ophelia, pirouettes] and [rodriguez, chastizes, the, ambidextrous, whangdoodle])
prepositional phrases (e.g., [a, hippogriff, chastizes, the, whangdoodle, with, compassion] or [ophelia, pirouettes, with, the, hippogriff])
Extra Credit: Also handle adverbs (which can modify either adjectives or verbs).
Extra Credit/Challenge: Read about additional arguments to prolog grammars and modify the numeric matching grammar in grammarNM.pl (either as above or to handle person matching, pronoun case matching or -- presumably in a language other than English -- gender matching). in another language

You need not be able to generate all possible sentences with your grammar; it is sufficient for your grammar to accept sentences in this form. Remember that test parses have the form:

sentence( [ ambidextrous, ophelia, pirouettes ], [] ).

while "generate-all-examples" queries have the form:

sentence( CapitalizedName, [] ).

Also remember that you can consult i.e., read in) a file named filename.pl (in the same directory from which you're running gprolog) using

[filename]. Brendan: Watch out! Prolog is old school - this will not work if gprolog.exe is running from a directory with spaces in it.

and you can consult the user (i.e, read what you type from the keyboard as a grammar) using

[user].

In both cases, a successful consult will result in various messages (perhaps including warnings) ending in

yes

Also note that the results of prior consults remain in prolog's context, so if you want to actually overwrite what you've previously consulted in, you should restart gprolog. (There are other ways, but restarting is probably easiest to explain....)

Programming Project

For ENGR3520a students only

Learn to use the parsing library for your favorite programming languge (and/or LEX or YACC) and write a parser for some interesting data. You could choose:

a human language (or subset)
a programming language (or subset)
a web language like html or xml (or the subset used on a particular web page, e.g., a Google or Amazon or .... screen scraper)

Miscellany

footnote on bounds

O, Omega, Theta are upper, lower, and tight bounds on a function. This has nothing to do with best/worst/average case analysis. The best/worst/average case running time of an algorithm is described by a function. A function can have an upper or lower bound (or both).

English grammar

grammar.pl:

sentence --> noun_phrase, verb_phrase.
noun_phrase --> determiner, noun.
noun_phrase --> proper_noun.
verb_phrase --> verb_intransitive.
verb_phrase --> verb_transitive, noun_phrase.


determiner --> [a].
determiner --> [the].
noun --> [whangdoodle].
noun --> [hippogriff].
proper_noun --> [ophelia].
proper_noun --> [rodriguez].
verb_intransitive --> [pirouettes].
verb_transitive --> [chastises].

number matching grammar

grammarNM.pl:

sentence --> noun_phrase(Number), verb_phrase(Number).

noun_phrase(singular) --> determiner, noun(singular).
noun_phrase(plural) --> [the], noun(plural).
noun_phrase(plural) --> noun(plural).
noun_phrase(singular) --> proper_noun.

verb_phrase(Number) --> verb_intransitive(Number).
verb_phrase(Number) --> verb_transitive(Number), noun_phrase(Any).


determiner --> [a].
determiner --> [the].

noun(singular) --> [whangdoodle].
noun(singular) --> [hippogriff].
noun(plural) --> noun(singular), [s].

proper_noun --> [ophelia].
proper_noun --> [rodriguez].

verb_intransitive(plural) --> [pirouette].
verb_intransitive(singular) --> verb_intransitive(plural), [s].

verb_transitive(plural) --> [chastises].
verb_transitive(singular) --> verb_transitive(plural), [s].