Maybe add topological sorts

Generating Combinatorial Objects

Important combinatorial objects are: permutations, combinations, and subsets of a set.

 

Generating Permutations

Without loss of generality, we can assume that the objects to permute are consecutive integers, since the integers can represent indices into an array. The number of permutations of n elements is n!.

 

We can use a bottom-up approach: start with 1, then insert 2 to the right of 1 and move it left. Then insert 3 to the right of 12 and move it left; likewise insert 3 into 21 and move it left, and so on. See the figure for n = 3.

 

1

12, 21

123, 132, 312,

321, 231, 213

 

Make a tree

 

1

            12,                                           21

123, 132, 312,                         321, 231, 213

 

 

The sequence

123, 132, 312, 321, 231, 213

has the property of minimal change between consecutive elements. This property may be advantageous for algorithms using the sequence, because the solution for the prior element can be reused in the next iteration, for example when calculating path lengths in the traveling salesman problem.
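The bottom-up insertion approach can be sketched in Python (a minimal sketch; the function name is illustrative). Alternating the direction in which the new largest value sweeps through each shorter permutation yields the minimal-change order above:

```python
def insert_permutations(n):
    """Generate permutations of 1..n bottom-up: insert k into every
    position of each permutation of 1..k-1, alternating the sweep
    direction to preserve the minimal-change property."""
    perms = [[1]]
    for k in range(2, n + 1):
        new_perms = []
        for i, p in enumerate(perms):
            # Even-indexed permutations are swept right-to-left,
            # odd-indexed left-to-right, so consecutive outputs
            # differ by a single swap of adjacent elements.
            if i % 2 == 0:
                positions = range(len(p), -1, -1)
            else:
                positions = range(len(p) + 1)
            for pos in positions:
                new_perms.append(p[:pos] + [k] + p[pos:])
        perms = new_perms
    return perms
```

For n = 3 this produces 123, 132, 312, 321, 231, 213, matching the figure.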

 

The algorithm does more work than it needs to because it also generates all the permutations of the smaller sets of integers.

How much more work? (Hint: Each level, i, has i! elements.)

1! + 2! + ... + (n-1)! << n! because the factorial grows very quickly.

 

The Johnson-Trotter algorithm generates the permutations without generating the permutations of smaller sets of integers. Each integer in a permutation has an arrow (a direction) associated with it, for example

 

→3 ←2 ←4 ←1

 

An integer is considered mobile if its arrow points to an adjacent smaller integer. In this example 3 and 4 are mobile, but 1 and 2 are not.

 

 

 

Algorithm JohnsonTrotter(n)

initialize the first permutation to 1 2 ... n

while the last permutation has a mobile integer do

find the largest mobile integer, k

swap k with the adjacent element that the arrow points to

reverse the direction of all the elements that are larger than k

add the permutation to the list
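The pseudocode above can be sketched in Python (a minimal sketch; directions are stored as -1 for a left arrow and +1 for a right arrow, and the names are illustrative):

```python
def johnson_trotter(n):
    """Johnson-Trotter: repeatedly swap the largest mobile integer
    with the neighbor its arrow points to, then reverse the arrows
    of all larger integers."""
    perm = list(range(1, n + 1))
    direction = {k: -1 for k in perm}   # all arrows start pointing left
    result = [perm[:]]
    while True:
        # Find the position of the largest mobile integer.
        mobile_pos = -1
        for i, k in enumerate(perm):
            j = i + direction[k]
            if 0 <= j < n and perm[j] < k:          # k is mobile
                if mobile_pos < 0 or k > perm[mobile_pos]:
                    mobile_pos = i
        if mobile_pos < 0:
            break                                    # no mobile integer: done
        k = perm[mobile_pos]
        j = mobile_pos + direction[k]
        # Swap k with the adjacent element its arrow points to.
        perm[mobile_pos], perm[j] = perm[j], perm[mobile_pos]
        # Reverse the direction of every integer larger than k.
        for m in range(k + 1, n + 1):
            direction[m] = -direction[m]
        result.append(perm[:])
    return result
```

For n = 3 this reproduces the worked example below: 123, 132, 312, 321, 231, 213.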

 

 

 

 

Example

123, the first permutation, so write it down

123, Note 2 and 3 are mobile, 3 is the largest, swap it with 2, and write it down. Are any integers larger than 3? No, so no arrows reverse.

132, Only 3 is mobile, swap it with 1, and write it down

312,  Now 2 is mobile and the largest, swap it with 1; don't forget to change the direction of 3 because it is larger than 2

321,  Now 3 is mobile again, swap with 2

231,  Again 3 is mobile, swap it with 1

213, None are mobile, all done!

 

Note that this algorithm runs in Θ(n!), which is the best that can be done considering that we have to generate that many permutations.

Generating Subsets

The number of subsets of a set with n elements is 2^n.

 

We can do a bottom up approach by adding one element at a time to all current sets.

 

Another approach is to represent membership in the subset by a bit vector whose length is the number of elements in the original set. Zero means the element is not in the subset and one means it is.

Then generate all the binary numbers from 0 to 2^n - 1, padding on the left with zeros if necessary.
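The bit-vector approach can be sketched in Python (the function name is illustrative); the loop counter itself serves as the membership bit vector:

```python
def all_subsets(elements):
    """Enumerate every subset of `elements` by counting from
    0 to 2^n - 1 and reading each counter as a bit vector."""
    n = len(elements)
    subsets = []
    for mask in range(2 ** n):
        # Bit i of `mask` says whether elements[i] is in this subset.
        subsets.append([elements[i] for i in range(n) if mask >> i & 1])
    return subsets
```

The counter 0 (all zero bits) gives the empty set, and 2^n - 1 (all one bits) gives the full set.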

 

If the goal is to construct all the subsets of a particular size then the two prior approaches are not efficient because they generate all the smaller subsets.

 

The binary reflected Gray code is an example of a minimal-change sequence.
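A sketch of the binary reflected Gray code in Python (a hypothetical helper, following the standard reflect-and-prefix construction):

```python
def gray_code(n):
    """Binary reflected Gray code: the list for n bits is the list
    for n-1 bits prefixed with '0', followed by the same list in
    reverse prefixed with '1'. Consecutive codewords then differ
    in exactly one bit (the minimal-change property)."""
    if n == 0:
        return ['']
    prev = gray_code(n - 1)
    return ['0' + c for c in prev] + ['1' + c for c in reversed(prev)]
```

For example, gray_code(2) produces 00, 01, 11, 10.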

 

Squashed ordering lists the subsets so that any subset involving element aj appears only after all the subsets involving only a1, ..., aj-1 have been listed.

 

Variable Size Decrease Algorithms

These algorithms attempt to accelerate the reduction of the subproblem by exploiting insight into the problem.

 

Computing a Median and the Selection Problem

The selection problem is finding the kth smallest element in an array of n elements. This number is called the kth order statistic.

 

For k = 1 or k = n the algorithm is easy: just scan the array for the smallest or largest value. More interesting is finding the median, the value that splits the array into two equal parts, half less than the median and half greater. We could sort the array and then choose the kth element. This would cost Θ(n lg n), but it is overkill. A quicker algorithm, called quick select, is based on the ideas of quick sort.

 

 

Outline of quick select for the kth order statistic

1. Select a pivot value from the array

2. Partition the input array around the pivot into a less-than set and a greater-than set

3. If the size of the less-than set is at least k, then the kth order statistic is in the less-than set, so recursively call on the less-than set

If the size of the less-than set equals k - 1, then the pivot is the kth order statistic, so return the pivot

If the size of the less-than set is less than k - 1, then the kth order statistic is in the greater-than set, so recursively call on the greater-than set. Note the value of k needs to be adjusted: subtract the size of the less-than set and one more (for the pivot).
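The outline can be sketched in Python (a simplified version that copies the partitions into new lists; an in-place version would partition within index bounds instead, and a random pivot would guard against bad inputs):

```python
def quick_select(a, k):
    """Return the kth smallest element of a (k = 1 .. len(a)) by
    partitioning around a pivot and recursing into only one side."""
    pivot = a[0]                       # simplest pivot choice
    less = [x for x in a[1:] if x < pivot]
    greater = [x for x in a[1:] if x >= pivot]
    if len(less) >= k:
        return quick_select(less, k)       # kth smallest is in `less`
    if len(less) == k - 1:
        return pivot                       # pivot is the kth smallest
    # Otherwise recurse right, skipping `less` and the pivot.
    return quick_select(greater, k - len(less) - 1)
```

Unlike quick sort, only one of the two partitions is visited at each level.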

 

 

 

Quick select suffers a worst-case cost like quick sort, Θ(n^2)

 

The best-case cost (short of getting lucky) comes from splitting the array exactly in half at each partitioning. In this case

C(n) = C(n/2) + n + 1

The nonrecursive term, n + 1, is the cost of the partitioning at each recursion. How does this differ from the recurrence relation for quick sort?

Using the Master Theorem, the cost is Θ(n)

 

There are algorithms that work in linear time, but are too complicated for practical use.

 

Interpolation Search

Interpolation search, a variation of binary search for a key, uses information about the difference between the key value, v, and the values at the bounds of the current range. Note that each iteration of binary search guesses that the key is midway in the range (specified by two indices, l and r), and so decreases the range by half, achieving Θ(lg n) worst-case cost. Interpolation search instead uses the following interpolation formula for its next guess at the index of the key value

 

x = l + floor((v-A[l])(r-l)/(A[r]-A[l]))
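The search can be sketched in Python (names are illustrative), assuming a sorted array and using the probe formula above; the extra checks guard against the key lying outside the current value range, which would make the probe index invalid:

```python
def interpolation_search(a, v):
    """Search sorted array a for value v, probing where a straight
    line between the endpoint values predicts v should sit."""
    l, r = 0, len(a) - 1
    while l <= r and a[l] <= v <= a[r]:
        if a[l] == a[r]:               # flat range: avoid division by zero
            break
        # The interpolation probe: floor((v - A[l])(r - l) / (A[r] - A[l]))
        x = l + (v - a[l]) * (r - l) // (a[r] - a[l])
        if a[x] == v:
            return x
        if a[x] < v:
            l = x + 1
        else:
            r = x - 1
    return l if l <= r and a[l] == v else -1   # -1 means not found
```

On uniformly distributed keys the probe lands very close to the target, which is where the O(lg lg n) average cost comes from.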

 

Show graph

 

An analysis can show that the average-case cost is O(lg lg n + 1). The double lg is not a typo. The double log function grows very slowly (HW problem) and is effectively constant for practical problem sizes.

 

But the worst-case cost is linear (an HW problem shows this). This is a very bad worst case, so interpolation search is not generally used unless accessing an element of the array is expensive. This can occur for very large arrays when caching fails.