Divide and Conquer Closest Pair and Convex-Hull Algorithms

 

Closest Pair Problem

Recall the closest pair problem.

 

The brute force algorithm checks the distance between every pair of points and keep track of the min. The cost is O(n(n-1)/2), quadratic.

 

 

The general approach of a merge-sort like algorithm is to sort the points along the x-dimensions then recursively divide the array of points and find the minimum. The only trick is that we must check distance between points from the two sets. This could have quadratic cost if we checked each point with the other. But, is there is only a finite number of points then cost could be less.

 

Alogrithm Closest Pair

0. Initially sort the n points, Pi = (xi, yi) by their x dimensions.

 

1. Then recursively divide the n points, S1 = {P1,...,Pn/2} and S2 = {Pn/2+1,...,Pn}

so that S1 points are two the left of x = xn/2  and S2 are to the right of x = xn/2.

 

2. Recursively find the closest pair in each set, d1 of S1 and d2 for S2, d = min(d1, d2).

 

3. We must check all the S1 points lying in this strip to every S2 points in the strip, and get closest distance dbetween

 

4. To efficiently do the above, need to sort the points along the y dimensions, using a merge sort approach.

 

5. Then the minimum distance is minimum distance is min(d, dbetween)

 

 

Analyzing and Cost:

Alogrithm Closest Pair

0. Initially sort the n points, Pi = (xi, yi) by their x dimensions.

 

1. Then recursively divide the n points, S1 = {P1,...,Pn/2} and S2 = {Pn/2+1,...,Pn}

so that S1 points are two the left of x = xn/2  and S2 are to the right of x = xn/2.  Cost is O(1) for each recursive call

 

2. Recursively find the closest pair in each set, d1 of S1 and d2 for S2, d = min(d1, d2). Cost is O(1) for each recursive call.

Note that d is not the solution because the closest pair could be a pair between the sets, meaning on from each set.

These points must lie in the vertical stripe described by x = xn/2-d and x = xn/2+d. Draw the diagram.

 

3. We must check all the S1 points lying in this strip to every S2 points in the strip, and get closest distance dbetween

Note that there can be only 6 S2 points. Note the points must lie also [yi - d, yi + d]. Illustrate the worst case. So the time for this step is Θ(6n/2) = Θ(3n). Draw diagram showing the six points in S2 with respect to the point in S1.

 

4. To accomplish this we also need to sort the points along the y dimensions. We do not want to a sort from scratch for each recursive division. So we use a merge sort approach and the cost is of maintaining the sort along y is O(n).

 

5. Then the minimum distance is minimum distance is min(d, dbetween)

 

 

The recursive relation is

T(n) = 2T(n/2) + M(n), where M(n) is linear in n.

 

Using Master's Theorem (a =2, b = 2, d = 1)

T(n) ε O(n lg n)

 

Note that it has been shown that the best that can be done is Ω(n lg n). So we have found one of the best solutions.

 

Convex-Hull Problem

Recall the convex hull is the smallest polygon containing all the points in a set, S, of n points Pi = (xi, yi). The set of vertices defines the polygon and the points of the vertices are found in the original set of points.

 

Recall the brute force algorithm. Make all possible lines from pairs of points and then check if the rest of the points are all on the same side of the line. How much? There are n(n-1)/2 such lines and then we check with n-2 remaining points. So the cost is cubic.

 

Algorthim quickhull

1. Sort the set of points, S, by the x-dimension with ties resolved by the y-dimension.

 

2. Identify the first and last points of the sort P1 and Pn

Note P1 and Pn are vertices of the hull

The ray P1Pn divides S into sets of points, by points left (S1) or right (S2) of the line, defined later.

We need to find the upper and lower hulls. We'll do this recursively.

Note also that S1 or S2 could be empty sets.

 

3. For S1 find the Pmax which is the maximum distance from line P1Pn, tires can be resolved by the point that maximizes the angle PmaxP1Pn.

Note that the ray P1Pmax divides points of S1 into left and right sets. The left points are S11.

Also PmaxPn identifies the left points S12 of S1

 

Pmax is vertex of the hull

The points inside the triangle P1PmaxPn cannot be vertices of the hull

 There are no points to the left of both P1Pmax and PmaxPn

 

4. Recursively find the upper hull of the union of P1, S11 and Pmax, and the union of Pmax, S12, and Pn

5. Do the like to find the lower hull

 

We need to identify if point (x3, y3) is left or right of the ray defined by points (x1, y1) and (x2, y2). We use the sign of the determinate

 

x1  y1  1│

x2  y2  1│

x3  y3  1│

 

Which has value of the area of the triangle with sign determine by order of the three points. The sign has the properties we need.

 

Sorting along the x-dimensions cost Θ(n lg n). Finding Pmax cost  Θ(n). Cost of determining the sets S1, S2, S11, and S12 are each Θ(n).

How many recursive call in the worst case? O(n).

The worst case cost is Θ(n2) which beats the brute force O(n3)

 

We expect the average case to do much better because of the divide and conquer approach, much like quick sort does. In addition for any reasonable and random distribution of points many points in the triangle are eliminated.  In fact for randomly chosen points in a circle the average case cost is linear.