					      The Pancake Problem: Improving the Bounds for Sorting by Prefix
                                      Date: April 28, 2005

                                    Kathryn J. Sullivan
                                  Mathematics Discipline
                              University of Minnesota, Morris
                        Morris, MN 56267, sull0294@morris.umn.edu

                                  Faculty Advisor: Peh Ng
                                   Mathematics Discipline
                              University of Minnesota, Morris
                         Morris, MN 56267, pehng@morris.umn.edu


The pancake number Pn is the smallest number of prefix reversals that will transform any
permutation of the integers from 1 to n into the identity permutation. Currently, exact values of
Pn are known for n≤13. For n≥14, it is known that n+1 ≤ Pn ≤ 5(n+1)/3. This paper will discuss
two approaches to improving the bounds for P14. It will also present the methods used by Gates,
Papadimitriou, Heydari, and Sudborough when they improved the bounds on Pn in general.
Results include numerical data for an attempt to improve the current upper bounds by modifying
the underlying algorithm, two options for P14 and improved bounds on Pn for n<38.
1. Introduction to Problem
The Pancake Problem is a problem of a waiter navigating a busy restaurant while carrying a
plate of pancakes. To avoid disaster, the waiter wants to sort the pancakes in increasing order by
diameter (each pancake is assumed to have a distinct size). Since the waiter has only one
available hand, the only possible action is to lift the top portion of the stack, flip it over, and
place it back on top. [Dweighter]

For instance, in Figure 1, a process of "1 flip" is shown. Given a stack of pancakes, the waiter
wants to order them using the least number of flips. Finding the minimum number of flips
required to sort any possible arrangement of n pancakes is the Pancake Problem.

 Figure 1:

A stack of pancakes can be represented by an ordered list of the integers 1 to n. For the rest of
this paper the term list is used with the assumption that it is an ordered list. Each pancake is
represented by an integer according to size (the smallest is 1, the largest is n) and written so that
the top “pancake” is in the leftmost position and the bottom “pancake” is in the last position.
The initial stack in Figure 1 is represented by {1, 3, 2, 5, 4}. The flips are then prefix reversals
where the first k numbers in the list have their order reversed, and all later elements remain in the
same position. For example in Figure 1, the initial list {1, 3, 2, 5, 4} becomes {2, 3, 1, 5, 4} by
reversing the order of the first 3 elements (k=3).

The following is an example of a way to sort the stack {1, 3, 2, 5, 4} (the initial arrangement in Figure 1):
        Step 1: flip the whole list to obtain {4, 5, 2, 3, 1}
        Step 2: reverse the first two elements, resulting in {5, 4, 2, 3, 1}
        Step 3: reverse the order of the whole list again giving {1, 3, 2, 4, 5}
        Step 4: flip the first three (k=3) to get {2, 3, 1, 4, 5}
        Step 5: flip with k=2, the result is the list {3, 2, 1, 4, 5}
        Step 6: flip the first three elements giving the desired list: {1, 2, 3, 4, 5}.
So, 6 flips are sufficient for this list. However, this may not be the best way to sort the stack, as
there may be other flip sequences that sort the same arrangement in fewer steps.
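
The six steps above can be replayed mechanically. The following is a minimal sketch, not part of the original project (which used Mathematica); the function name flip is an assumption chosen for illustration.

```python
def flip(stack, k):
    """Return a new list with the first k elements of `stack` reversed.

    `flip` is a hypothetical helper name; the rest of the list is unchanged.
    """
    return stack[:k][::-1] + stack[k:]


stack = [1, 3, 2, 5, 4]
# The values of k used in Steps 1 to 6 of the worked example.
for step, k in enumerate([5, 2, 5, 3, 2, 3], start=1):
    stack = flip(stack, k)
    print(f"Step {step}: k={k} -> {stack}")
```

The last line printed shows the sorted list {1, 2, 3, 4, 5}, confirming that this particular sequence of six prefix reversals works. It says nothing about whether six is the minimum for this list.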

As stated above, the goal of the pancake problem is to find the number of flips needed in the
worst case for a stack of n pancakes. Mathematically, suppose we want to sort, using prefix
reversals, a list of length n starting from a particular ordered list s. We define
                f(s) = minimum number of prefix reversals needed to sort the list s.

                                          Sullivan, Page 2
Then the pancake number Pn is the smallest number of prefix reversals that suffices to sort any
list of length n into the identity permutation:
                Pn = max { f(s) : all s of length n } = max { f(s) : s ∈ Π(n) },
where Π(n) is the set of all permutations of the list {1, 2, 3,…, n}.
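
For small n, the definitions of f(s) and Pn can be evaluated directly. The sketch below is an illustration under stated assumptions, not the method used in this project: it relies on the fact that a prefix reversal undoes itself, so the number of flips needed to sort s equals the number needed to reach s from the sorted list, and a single breadth-first search from the sorted list yields f(s) for every s. The names min_flips_table and pancake_number are hypothetical.

```python
from collections import deque


def flip(stack, k):
    """Reverse the first k entries of a tuple and keep the rest in place."""
    return stack[:k][::-1] + stack[k:]


def min_flips_table(n):
    """Map every permutation s of 1..n (as a tuple) to f(s).

    Breadth-first search from the sorted list; since each flip is its own
    inverse, the distance from the sorted list to s equals f(s).
    """
    identity = tuple(range(1, n + 1))
    dist = {identity: 0}
    queue = deque([identity])
    while queue:
        s = queue.popleft()
        for k in range(2, n + 1):  # k = 1 changes nothing, so skip it
            t = flip(s, k)
            if t not in dist:
                dist[t] = dist[s] + 1
                queue.append(t)
    return dist


def pancake_number(n):
    """Pn = max of f(s) over all permutations s of length n."""
    return max(min_flips_table(n).values())


print([pancake_number(n) for n in range(2, 8)])  # [1, 3, 4, 5, 7, 8]
```

The table holds n! entries, so this approach is only practical for small n; it is meant to make the definition concrete rather than to attack P14.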

The main goal of this project is to improve the bounds for Pn and more specifically to find P14.
The current bounds are Pn ≥ n+1 for n≥6 and Pn ≤ (5n+5)/3 [Weisstein] [Carey] [Gates].

In Section 2, we will present the motivation behind this research and the applications of prefix
permutations. We will then discuss the concept of pancake numbers, and the most recently
known results for bounds of Pn, in Section 3. We will describe our approaches to "tighten" the
bounds and present numerical results and analysis in Section 4.

2. Motivation and Applications
In this section we will briefly describe why mathematicians, computer scientists and molecular
biologists have been interested in this problem.

2.1. Mathematics

The Pancake Problem is directly related to discrete optimization, which is a part of discrete
mathematics. The ordered lists can be used to create “pancake networks” in which a graph is
created by representing each distinct list as a vertex and connecting two vertices only if you can
get from one to the other in one prefix reversal. The pancake number is then the diameter of the
graph: the largest distance between any two vertices, where distance is measured along a
shortest path. So the pancake problem can become a graph theory problem. The related problem
of finding the longest path in a general graph is NP-hard, meaning that no efficient
(polynomial-time) algorithm for it is known. The pancake graphs may be an interesting case to
look at within that larger problem, especially if more information about the pancake numbers can
be found.
2.2. Computer Networking:

Computer structure changed fairly recently from a single elaborate processor to a parallel
architecture structure using many relatively less powerful chips. The advantage is that problems
that can be broken down into parts and worked on separately can be solved quicker using a
parallel architecture computer than the serial kind. This is true despite the fact that parallel
processors are nowhere near as powerful as would be needed to solve the same problem in the
same amount of time on a serial computer. The processors must be connected to each other so
that they can share information when necessary. There are a variety of ways these processors are
connected together; the pancake network is one of them. [Malkevitch]

 Figure: Pancake Network *P4 (vertices labeled by the permutations of 1234): 24 nodes, 36 edges,
 Diameter = P4 = 4.

An n-dimensional pancake network, *Pn, has processors labeled with each of the n! distinct
permutations of length n. Two processors are connected when the label of one is obtained from
the other by some prefix reversal. The pancake number Pn, the maximum number of prefix
reversals required to sort any stack of n pancakes, is the diameter of the pancake network *Pn.
The diameter is the maximum distance between any two vertices, where the distance between
two vertices is the length of a shortest path between them. The diameter of *Pn corresponds to
the worst communication delay for transmitting information in a system. [Heydari]
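
The values quoted for *P4 (24 nodes, 36 edges, diameter 4) can be checked by building the network explicitly. This is a small sketch under stated assumptions: the function names neighbors and network_stats are hypothetical, and the diameter is found by a breadth-first search from every vertex, which is only reasonable for very small n.

```python
from collections import deque
from itertools import permutations


def neighbors(node):
    """All labels reachable from `node` by one prefix reversal (k = 2..n)."""
    n = len(node)
    return [node[:k][::-1] + node[k:] for k in range(2, n + 1)]


def network_stats(n):
    """Return (number of nodes, number of edges, diameter) of *Pn."""
    nodes = list(permutations(range(1, n + 1)))
    # Every edge is seen once from each of its two endpoints.
    edges = sum(len(neighbors(v)) for v in nodes) // 2

    def eccentricity(src):
        # Largest shortest-path distance from src to any other vertex.
        dist = {src: 0}
        queue = deque([src])
        while queue:
            u = queue.popleft()
            for v in neighbors(u):
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        return max(dist.values())

    diameter = max(eccentricity(v) for v in nodes)
    return len(nodes), edges, diameter


print(network_stats(4))  # (24, 36, 4)
```

Each vertex has n-1 neighbors, one for each k from 2 to n, which is where the edge count n!(n-1)/2 comes from.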

2.3. Molecular Biology:

Recent advances in genome identification have brought to light questions in molecular biology
very similar to the pancake problem. Differences in genomes are usually explained by
accumulated differences built up in the genetic material due to random mutation and random
mating. In the 1980's another mechanism of evolution was discovered. The gene sequences in
some plants are the same but differ in order, similar to how “tar” and “rat” have the same letters,
just in a different order. This has inspired some molecular biologists to look at the mechanisms
which might shuffle the order of the genetic material. One way of doing this is the pancake
permutations with more available flips (not limited to the first section only; middle and end
section flips are acceptable). [Malkevitch]

  “Transformation” of cabbage into turnip:
  B. oleracea (cabbage):   1 -5  4 -3  2
                           1 -5  4 -3 -2
                           1 -5 -4 -3 -2
  B. campestris (turnip):  1  2  3  4  5
                                        [Berman]

3. Pancake Theory Foundation
The Pancake Problem has been worked on for 30 years. In this time many people have made
significant progress. This section is broken up into three different parts: “Upper Bound Theory”,
“Lower Bound Theory”, and “Known Pancake Numbers”. “Upper Bound Theory” and “Lower
Bound Theory” will introduce some of the ideas that have been used to determine upper and
lower bounds to the Pancake Numbers. “Known Pancake Numbers” will present a chart of the
values found for the pancake numbers up to P13.

3.1. Upper Bound Theory

To prove an upper bound M for Pn, one must show that any permutation of length n can be sorted
in at most M steps. The two upper bounds we will present use a sorting algorithm as the basis for
their proof. The upper bound is obtained by defining the sorting algorithm so that it sorts any
ordered list of length n in at most M steps.

3.1.1. Obvious Upper Bound

Given any list s, we can sort it by following these steps:
       Step 1: flip the largest element to the front.
       Step 2: flip the whole list.
       Step 3: flip the largest unsorted element to the front.
       Step 4: flip the whole unsorted portion of the list.
  Repeat Steps 3-4 until the list is sorted.

  Example: s = {3,2,4,1} sorted by the method used to find the upper bound Pn ≤ 2n:
       {3,2,4,1} → {4,2,3,1} → {1,3,2,4} → {3,1,2,4} → {2,1,3,4} → {1,2,3,4}

This algorithm takes at most 2 steps to place each element, so Pn is at most 2n. This upper bound
can be improved to Pn ≤ 2n - c (where c is a constant) by sorting the last few elements by a
more clever method [Gates].
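
The steps above can be sketched in a few lines. This is an illustrative version, not the code used in this project; the name simple_pancake_sort is hypothetical, and as a small deviation from the steps as written it skips a flip whenever the element is already where that flip would put it.

```python
def flip(stack, k):
    """Reverse the first k entries of the list and keep the rest in place."""
    return stack[:k][::-1] + stack[k:]


def simple_pancake_sort(stack):
    """Sort with at most two flips per element; return (sorted list, flips).

    For each size from n down to 2, bring the largest element of the
    unsorted top portion to the front, then flip that whole portion so
    the element lands at the bottom of it.
    """
    stack = list(stack)
    flips = []
    for size in range(len(stack), 1, -1):
        pos = stack.index(max(stack[:size]))  # 0-based position of the largest
        if pos == size - 1:
            continue  # already in place, no flips needed
        if pos != 0:
            stack = flip(stack, pos + 1)  # bring it to the front
            flips.append(pos + 1)
        stack = flip(stack, size)  # send it to the bottom of the unsorted part
        flips.append(size)
    return stack, flips


print(simple_pancake_sort([3, 2, 4, 1]))  # ([1, 2, 3, 4], [3, 4, 2, 3, 2])
```

For s = {3,2,4,1} this reproduces the five flips of the example, and the number of flips never exceeds 2n for a list of length n.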

3.1.2. Gates and Papadimitriou Upper Bound

Gates and Papadimitriou developed a looping algorithm which had 9 cases depending on the
structure of the list at the beginning of each loop through. The loop is performed until n
adjacencies are present in the list and each time through it determines which of the nine cases is
appropriate, then gives the flip sequence to follow (1-4 flips in length). An adjacency is defined
to be two elements next to each other in the list which are either consecutive integers or the pair
1 and n (the smallest and biggest elements). They then add on a series of at most 4 steps to sort
the resulting list which could have 2 blocks of sorted elements (Example: {3, 4, 5, 1, 2}).

  Figure: Sorting by Gates' initial algorithm can result in an incompletely sorted list:
       {4,2,5,1,3} → {2,4,5,1,3} → {1,5,4,2,3} → {4,5,1,2,3}
  There is an at most 4 step follow-up sorting procedure:
       {4,5,1,2,3} → {5,4,1,2,3} → {3,2,1,4,5} → {1,2,3,4,5}

They prove the maximum number of steps this algorithm will take by combining information
about the number of adjacencies and the number of blocks (groups of adjacencies) created for
each of the 9 cases. Then they manipulate it algebraically and, using the Duality Theorem of von
Neumann, Kuhn and Tucker, Gale, and Dantzig, prove that this process will take at most
(5n+5)/3 flips for a list of length n. Therefore, Pn ≤ (5n+5)/3. [Gates]

3.2. Lower Bound Theory

Lower bounds are tough to prove for general n. The following two sections describe the baseline
lower bound and a method for obtaining a better (higher) lower bound. If bounds on Pn are
desired for a specific n, the lower bound can be improved by finding a list which requires more
steps to sort than the current lower bound. The bound must then be increased accordingly.

3.2.1. Obvious Lower Bound

In any ordered list an adjacency is two consecutive integers that are placed next to each other.
The sorted list contains n adjacencies if the definition is extended to include the placement of the
largest element in the last place (or the largest pancake next to the plate). For n≥4, there exists at
least one stack in which there are no adjacencies. Also, each prefix reversal can create at most
one adjacency. So if we start with a stack that has no adjacencies and each flip creates at most
one adjacency, it will take at least n flips to obtain n adjacencies. Hence Pn ≥ n for n≥4.
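
The two facts used in this argument can be checked by brute force for small n. The sketch below is an illustration only; the names adjacencies and max_gain_per_flip are hypothetical, and the adjacency count follows the extended definition given above (consecutive integers side by side, plus the largest element in the last place).

```python
from itertools import permutations


def adjacencies(stack):
    """Count adjacencies: neighbors differing by 1, plus n in the last place."""
    n = len(stack)
    count = sum(1 for a, b in zip(stack, stack[1:]) if abs(a - b) == 1)
    if stack[-1] == n:
        count += 1  # largest pancake sitting next to the plate
    return count


def max_gain_per_flip(n):
    """Largest increase in adjacencies produced by any single prefix reversal."""
    best = 0
    for s in permutations(range(1, n + 1)):
        for k in range(2, n + 1):
            t = s[:k][::-1] + s[k:]
            best = max(best, adjacencies(t) - adjacencies(s))
    return best


print(adjacencies((1, 2, 3, 4)))  # 4: the sorted list has n adjacencies
print(adjacencies((2, 4, 1, 3)))  # 0: a stack with no adjacencies
print(max_gain_per_flip(5))       # 1: no flip creates more than one
```

A flip of the first k elements only changes which elements meet at the boundary between positions k and k+1 (or at the plate when k = n), which is why the gain can never exceed one.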

M. R. Carey, D.S. Johnson and S. Lin claim that they can show that Pn ≥ n+1 for n ≥ 7 [Carey].
There is no proof given in their comments in American Mathematical Monthly. It is likely that
they derived this from knowledge that P7 = 8 and then created a list which needed one step to
sort the “8” and then required the maximum number of steps to sort the remaining seven
elements. If this same process could be carried out iteratively for all n≥7, it would prove that
going from Pn to Pn+1 increases the number of reversals by at least one flip. This results in the
lower bound Pn ≥ P7 + (n-7), which is the same as Pn ≥ n+1.

3.2.2. Gates, Papadimitriou, Heydari, and Sudborough “Infinitely Often” Lower Bounds

Gates and Papadimitriou proved that Pn ≥ 17n/16 infinitely often (for n which are multiples of
16) by constructing a list Xn with length n = 16k (where k is an integer with k ≥ 1) that requires
17k prefix reversals to sort. [Gates] They also conjectured that 19n/16 flips are required to sort
the list Xn. Heydari and Sudborough disproved Gates and Papadimitriou's conjecture and
improved the bounds to (9/8)n + 2 ≥ f(Xn) ≥ 15n/14. [Heydari]

3.3. Known Pancake Numbers

    N     2        3       4      5       6      7       8      9      10      11      12       13
   Pn     1        3       4      5       7      8       9     10      11      13      14       15

One pattern that we observed in the known pancake numbers is that they increase by 1 for each
increase in n, except after n = k²+1 for k = 1, 2, 3, at which points Pn increases by 2. When we
continue the pattern we find potential pancake numbers; let them be represented by ~Pn. More
specifically, we find that
                                             ~P144 = 154.

However, f(Xn) ≥ 15n/14 (refer back to 3.2.2) for a chosen list Xn where n is a multiple of 16. So,

                            P144 ≥ f(X144) = f(X9*16) ≥ 15*144/14 ≈ 154.3
                                          P144 ≥ 155     (since all Pn are integers)

Because P144 ≥ 155 and ~P144 =154, ~Pn cannot be the pancake number Pn. Therefore, this pattern
of increases does not work for all n.
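
The extrapolation and the comparison can be reproduced as follows. This is a sketch of the pattern as described in this section, with hypothetical names (KNOWN, extrapolated); it starts from P2 = 1 and adds 2 instead of 1 on each step that leaves an n of the form k²+1.

```python
from fractions import Fraction
from math import ceil

# Known pancake numbers from the chart in Section 3.3.
KNOWN = {2: 1, 3: 3, 4: 4, 5: 5, 6: 7, 7: 8, 8: 9,
         9: 10, 10: 11, 11: 13, 12: 14, 13: 15}


def extrapolated(n):
    """~Pn obtained by continuing the observed pattern of increases."""
    value = 1  # P2 = 1
    for m in range(2, n):
        # The step from m to m+1 adds 2 when m = k^2 + 1, otherwise 1.
        is_jump = any(k * k + 1 == m for k in range(1, m))
        value += 2 if is_jump else 1
    return value


assert all(extrapolated(n) == p for n, p in KNOWN.items())
print(extrapolated(144))                # 154
print(ceil(Fraction(15 * 144, 14)))     # 155
```

The pattern matches every known value up to n = 13, yet at n = 144 it falls one short of the lower bound, which is the contradiction used above.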

4. Approach to finding P14
Given the current bounds on Pn, we know that 14+1≤ P14 ≤ (5*14+5)/3 or 15 ≤ P14 ≤ 25. In the
following sections we are going to inspect ways to improve the bounds of P14.

4.1. Enumeration of all Possibilities

The first and most obvious approach is to enumerate all possible lists of length 14, find the best
way to sort each, and then find the largest required number of prefix reversals. Before beginning
to write out all possible lists of length 14, let's first look at how many possible lists of length 14
exist and how many possible ways there are to sort each list.

There are 14! = 87,178,291,200 possible ordered lists of length 14. Some could be easily
deleted as not possible worst case stacks, but even if half of the stacks could be deleted that
would still leave us with roughly 40 billion stacks. For each of these you could flip 1 to 14
pancakes first (14 options) then at each subsequent step you could flip any number of pancakes
other than the number you flipped in the previous step (13 options). Therefore, each list has at
most 14*13^24 = 7,599,210,785,241,187,178,802,335,054 ways to obtain flip sequences of length
25 (our current upper bound). Some of these would not be sorted after 25 steps or would be
duplicates and could be thrown out, others would obtain a sorted stack in fewer steps and the
process could be stopped then. Even if we found an approach to weed out half of the possible
sort sequences as “bad” flip processes, we still would have roughly 3.8 × 10^27 possible ways to
sort any given stack. This results in on the order of 10^38 potential sorting sequence/stack
combinations that must be inspected. Even with filtering that cuts out half of the possibilities as
unfeasible as worst case stacks or best case flip sequences, there are clearly far too many
possible flip sequences and ordered sets to enumerate. Therefore this is not a feasible method.
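
The sizes involved are easy to recompute. The short sketch below is only an arithmetic check of the counts discussed in this section; the variable names are assumptions made for illustration.

```python
from math import factorial

# Number of ordered lists (stacks) of length 14.
stacks = factorial(14)

# Flip sequences of length 25: 14 choices for the first flip, then 13
# choices (anything except repeating the previous k) for each of the
# remaining 24 flips.
sequences = 14 * 13 ** 24

print(f"stacks:       {stacks:,}")
print(f"sequences:    {sequences:.3e}")
print(f"combinations: {stacks * sequences:.3e}")
```

Even at a billion combinations checked per second, a count of this size is far beyond what direct enumeration could cover.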

4.2. Improvement of Current Bounding Methods

The next promising direction for improving our knowledge of P14 was to look more closely at the
Gates and Papadimitriou Algorithm for determining the upper bound and investigate if a few
changes to their algorithm that give promising results in smaller lists (lengths 4 to 6) could
improve the bounds for P14.

As discussed in 3.1.2, Gates and Papadimitriou based their algorithm on the following basic ideas:
       - Review: The algorithm is made up of flip sequences which are determined by the
       arrangement at the beginning of each cycle through the process. When the determined
       series of steps is completed, it goes back and looks at the new arrangement to determine
       the next set of flips.
       -1: The flip sequences take the initial arrangement and use the information to create as
       many adjacencies as possible in each cycle through the loop.
       -2: An adjacency is between two consecutive integers, or between the largest and
       smallest integers in the list. After sorting is complete they correct for this by adding a
       1 to 4 step process at the end.

We modified their Algorithm in the following ways:
       -1: The flip sequence aims to create an adjacency between the first element and some
       other element in the list in the least number of flips. This cut out extra steps required to
       bring parts forward for the sole purpose of sorting and instead dealt primarily with the
       initial elements.
       -2: An adjacency is defined to be between two consecutive integers or the largest integer
       and the end of the list. This cut out the additional 4 step sorting process Gates and
       Papadimitriou added at the end of their algorithm to deal with the potential adjacency
       between n and 1.
We then programmed this new algorithm into Mathematica, and applied it using total
enumeration to find an improved upper bound for 3 ≤ n ≤ 9. We could not obtain values using
this method for n=14 because the large number of lists involved (explained in section 4.1)
causes the computation to take too much time and memory to compute the bound.

     n     Pn     Gates upper bound     Our upper     Time it took
                  5(n+1)/3              bound         Mathematica
     3     3      6.6                   3             0. sec
     4     4      8.3                   4             0.016 sec
     5     5      10                    6             0.094 sec
     6     7      11.6                  8             0.719 sec
     7     8      13.3                  11            6.828 sec
     8     9      15                    13            66.203 sec
     9     10     16.6                  15            706.109 sec
     10    11     18.3
     11    13     20
     12    14     21.6
     13    15     23.3

Given that our method obtained P9 ≤ 15 and the Gates algorithm gives P9 ≤ 16.6, and the fact
that the bound determined by our method appeared to be increasing faster than the bound given
by the Gates Algorithm, we predict that they will approach the same values. Therefore this
“improved” method would probably not be any better in the long run.

4.3. Iterative Bounding

Lastly we investigated the possibility that we could obtain more specific information about P14
from what we know about P13. We will first explain in general, then apply this to P14, and we will
show how this method can improve bounds for n<38.

Upper Bound: Given an ordered set of n+1 elements, it takes at most 2 prefix reversals to move
“n+1” to the end of the set. (The first is to move “n+1” to the front, the second is to flip the
whole stack.) The remaining n elements can then be sorted in at most Pn reversals. Therefore,

                                        P_{n+1} ≤ P_n + 2                             [*1]

Lower Bound: Let m be an ordered set of n elements that requires Pn reversals to sort. Let mR be
the set m with all elements in reverse order. Create a set with length n+1 by inserting “n+1” into
the first place of the set mR and shifting all elements of mR one place back. Then flipping the
entire stack results in the set m with “n+1” added in the last place. It is going to require at least
one extra move to place “n+1” into its adjacency with the end, so

                                        P_n + 1 ≤ P_{n+1}                             [*2]

Combining equations [*1] and [*2] for P13 and P14 gives:

                                        P_n + 1 ≤ P_{n+1} ≤ P_n + 2
                                        P13 + 1 ≤ P14 ≤ P13 + 2
                                           16 ≤ P14 ≤ 17                             [*3]

All Pn can be bounded in relation to a known Pk. An increase in n by 1 increases Pn by at least
one flip and by at most two flips, so

                                   Pk + (n-k) ≤ Pn ≤ Pk +2(n-k).

Then we can use the information we know about P13 (P13 = 15) to obtain new bounds on Pn for n ≥ 13:

                                 P13 + (n-13) ≤ Pn ≤ P13 +2(n-13)
                                 15 + (n-13) ≤ Pn ≤ 15 +2(n-13)
                                        n+2 ≤ Pn ≤ 2n-11 for n ≥ 13                 [*4]

The lower bound in [*4] is an improvement on the bound found by Carey, Johnson and Lin. It is
likely that they used a similar process to prove that Pn ≥ n+1 for n ≥ 7 (refer back to 3.2.1),
except that they used P7=8 (the highest known pancake number at the time) to obtain their
bound. The upper bound in [*4] is better than the upper bound Pn ≤ 5(n+1)/3 which was obtained
by Gates and Papadimitriou (refer back to 3.1.2) for 13≤n<38, since 2n-11 < 5(n+1)/3 exactly
when n < 38.
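
The bounds in [*4] and the range where they beat the Gates and Papadimitriou bound can be tabulated directly. The sketch below uses hypothetical function names (iterative_bounds, gates_bound) and exact fractions so that the comparison at the crossover is not affected by rounding.

```python
from fractions import Fraction


def iterative_bounds(n, k=13, p_k=15):
    """Bounds Pk + (n-k) <= Pn <= Pk + 2(n-k), defaulting to P13 = 15."""
    return p_k + (n - k), p_k + 2 * (n - k)


def gates_bound(n):
    """Gates and Papadimitriou upper bound 5(n+1)/3 as an exact fraction."""
    return Fraction(5 * (n + 1), 3)


# Values of n (searched up to 59) where the iterative upper bound is
# strictly better than the Gates and Papadimitriou bound.
better = [n for n in range(13, 60) if iterative_bounds(n)[1] < gates_bound(n)]

print(iterative_bounds(14))    # (16, 17)
print(better[0], better[-1])   # 13 37
```

At n = 38 both upper bounds equal 65, and beyond that the iterative bound grows faster (2 per step against 5/3 per step), so it is no longer the better of the two.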

5. Results and Analysis
Both of our methods obtained improved bounds for a limited size of ordered list (or stack of
pancakes). Modifying Gates and Papadimitriou's Algorithm resulted in improved upper bounds
for n less than or equal to 9. However, for these n, exact pancake numbers are already known so
this does not present us with any new information. It also appears that this method, if continued
higher, would converge quickly to the same values as found by Gates and Papadimitriou. In the
long run, we would expect that the changes we made will have little effect on the upper bound.

The iterative method obtained much better results, giving improved bounds for n less than 38.
These bounds are tighter close to the known pancake numbers. If more pancake numbers were
determined, equation [*4] could be modified to include the new information. This would
improve either the upper or the lower bound determined by the iterative process.

6. Conclusions
This project did not obtain improved bounds for all n. It did, however, improve bounds on Pn for
smaller n and did result in more precise knowledge of P14. We reduced the range of possible
values for P14 from 15 ≤ P14 ≤ 25 to 16 ≤ P14 ≤ 17. There is still work to be done on this problem,
but the work we have done has both combined past research on this problem and made some
small progress toward determining P14.

7. Future Research Work
Future work in this area could include looking for a list of size 14 which cannot be sorted in 16
steps and thus requires 17 (proving that P14=17). Or research could investigate ways to prove that
all lists with n=14 can be sorted in at most 16 steps, proving that P14=16. I experimented a bit
with both of these but more extensive research could be done on these topics. Other possible
research areas related to this problem include the longest path problem in graph theory for the
family of graphs that represent pancake networks and using information found about this family
of graphs to relate back to bounds on Pn. These are just a small number of the many possibilities
for future research in this area.

8. Acknowledgements
I would like to thank the Undergraduate Research Opportunities Program for partially supporting
this project. I would also like to thank Professor Peh Ng for her help advising me through this
long process. I would like to thank Professor Mark Logan for being a second reader for this

9. References
[1]   Carey, M.R.; Johnson, D.S.; and S. Lin: American Mathematical Monthly, vol. 84 p. 296,
[2]   Dweighter, Harry. American Mathematical Monthly 82 (1975), 110.
[3]   Gates W.H.; Papadimitriou, C.H. Bounds for sorting by prefix reversal. Discrete
      Mathematics 27 (1979), 47--57.
[4]   Heydari M.H.; Sudborough, I.H. On the diameter of the pancake network. Journal of
      Algorithms 25 (1997), no. 1, 67--94.
[5]   Malkevitch, Joseph. Pancakes, Graphs, and the Genome of Plants. The UMAP Journal
      23 (4), pp 373-382, 2002.
[6]   Weisstein. “Pancake Sorting.” From MathWorld—A Wolfram Web Resource.
      http://mathworld.wolfram.com/PancakeSorting.html
