					                                        (IJCSIS) International Journal of Computer Science and Information Security,
                                        Vol. 8, No. 8, November 2010




A Comparative Study on Kakkot Sort and Other Sorting Methods

Rajesh Ramachandran
HOD, Department of Computer Science
Naipunnya Institute of Management & Information Technology, Pongam, Kerala
Email: ryanrajesh@hotmail.com

Dr. E. Kirubakaran
Sr. DGM (Outsourcing), BHEL, Trichy
Email: e_kiru@yahoo.com


Abstract: Several efficient algorithms have been developed to cope with the
common task of sorting. Kakkot sort is a new variant of Quick sort and
Insertion sort. The Kakkot sort algorithm requires O(n log n) comparisons in
the worst case and the average case. Typically, Kakkot Sort is significantly
faster in practice than other O(n log n) algorithms, because its inner loop
can be efficiently implemented on most architectures. This sorting method
requires data movement, but less than insertion sort does. This data movement
can be reduced by implementing the algorithm using a linked list. In this
comparative study the mathematical results for Kakkot sort were verified
experimentally on ten randomly generated unsorted numbers. To obtain
experimental data to support this comparison, four different sorting methods
were chosen, the code was executed, and the execution time was noted to verify
and analyze the performance. The performance of the Kakkot Sort algorithm was
found to be better than that of the other sorting methods.

Key words: Complexity, performance of algorithms, sorting

Introduction

Sorting is any process of arranging items in some sequence and/or in different
sets, and accordingly it has two common, yet distinct meanings:

     1. ordering: arranging items of the same kind, class, nature, etc. in
        some ordered sequence,
     2. categorizing: grouping and labeling items with similar properties
        together (by sorts).

In computer science and mathematics, a sorting algorithm is an algorithm that
puts the elements of a list in a certain order. The most-used orders are
numerical order and lexicographical order. Efficient sorting is important for
optimizing the use of other algorithms (such as search and merge algorithms)
that require sorted lists to work correctly.

To analyze an algorithm is to determine the amount of resources (such as time
and storage) necessary to execute it. Most algorithms are designed to work
with inputs of arbitrary length. Usually the efficiency or complexity of an
algorithm is stated as a function relating the input length to the number of
steps (time complexity) or storage locations (space complexity). Algorithm
analysis is an important part of a broader computational complexity theory,


                                                 150                               http://sites.google.com/site/ijcsis/
                                                                                   ISSN 1947-5500




which provides theoretical estimates for the resources needed by any algorithm
which solves a given computational problem. These estimates provide an insight
into reasonable directions of search for efficient algorithms. In theoretical
analysis of algorithms it is common to estimate their complexity in the
asymptotic sense, i.e., to estimate the complexity function for arbitrarily
large input. Big O notation, omega notation and theta notation are used to
this end.

Time complexity

Time efficiency estimates depend on what we define to be a step. For the
analysis to correspond usefully to the actual execution time, the time
required to perform a step must be guaranteed to be bounded above by a
constant. In mathematics, computer science, and related fields, Big Oh
notation describes the limiting behavior of a function when the argument tends
towards a particular value or infinity, usually in terms of simpler functions.
Big O notation allows its users to simplify functions in order to concentrate
on their growth rates: different functions with the same growth rate may be
represented using the same O notation.

Although developed as a part of pure mathematics, this notation is now
frequently also used in computational complexity theory to describe an
algorithm's usage of computational resources: the worst case or average case
running time or memory usage of an algorithm is often expressed as a function
of the length of its input using big O notation.

Space complexity

The better the time complexity of an algorithm is, the faster the algorithm
will carry out its work in practice. Apart from time complexity, its space
complexity is also important: this is essentially the number of memory cells
which an algorithm needs. A good algorithm keeps this number as small as
possible, too. The space complexity of a program (for a given input) is the
number of elementary objects that this program needs to store during its
execution. This number is computed with respect to the size n of the input
data.

There is often a time-space tradeoff involved in a problem, that is, it cannot
be solved with both little computing time and low memory consumption. One then
has to make a compromise and exchange computing time for memory consumption or
vice versa, depending on which algorithm one chooses and how one parameterizes
it.

In addition to varying complexity, sorting algorithms also fall into two basic
categories: comparison based and non-comparison based. A comparison based
algorithm orders a sorting array by weighing the value of one element against
the value of other elements. Algorithms such as Quicksort, Mergesort,
Heapsort, Bubble sort, and Insertion sort are comparison based. Alternatively,
a non-comparison based algorithm sorts an array without consideration of
pairwise data elements. Radix sort is a non-comparison based algorithm that
treats the sorting elements as numbers represented in a base-M number system,
and then works with the individual base-M digits.

Another factor which influences the performance of a sorting method is the
behavior pattern of the input. In computer science, the best, worst and
average cases of a given algorithm express what the resource usage is at
least, at most and on average, respectively. Usually the resource being
considered is running time, but it could also be memory or other resources.
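The difference between the best and worst case can be made concrete by
counting comparisons. The following is a minimal sketch, assuming a plain
array-based insertion sort; the function name insertion_sort_comparisons is
illustrative and does not come from the paper.

```python
def insertion_sort_comparisons(values):
    """Return (sorted copy, number of element comparisons) for a plain insertion sort."""
    data = list(values)
    comparisons = 0
    for i in range(1, len(data)):
        j = i
        while j > 0:
            comparisons += 1                      # compare two adjacent elements
            if data[j - 1] > data[j]:
                data[j - 1], data[j] = data[j], data[j - 1]
                j -= 1
            else:
                break                             # element has reached its place
    return data, comparisons


n = 10
best = insertion_sort_comparisons(range(1, n + 1))[1]    # already sorted input
worst = insertion_sort_comparisons(range(n, 0, -1))[1]   # reverse-sorted input
print(best, worst)   # 9 45
```

For ten elements the sorted input needs n-1 = 9 comparisons, while the
reverse-sorted input needs n(n-1)/2 = 45, so the same algorithm behaves very
differently depending on the input pattern.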







Kakkot Sort

Kakkot Sort is a sorting algorithm that makes O(n log n) (Big Oh notation)
comparisons to sort n items. Typically, Kakkot Sort is significantly faster in
practice than other O(n log n) algorithms, because its inner loop can be
efficiently implemented on most architectures. This sorting method requires
data movement, but less than insertion sort does. This data movement can be
reduced by implementing the algorithm using a linked list. The major advantage
of this sorting method is that its behavior pattern is the same for all cases,
i.e., the time complexity of this method is the same for the best, average and
worst cases.

How it sorts

From the given set of unsorted numbers, take the first two numbers and name
them key one and key two, i.e., K1 and K2. Read all the remaining numbers one
by one. Compare each number first with K2. If the number is greater than or
equal to K2, place the number to the right of K2; otherwise compare the same
number with K1. If the number is greater than K1, place the number immediately
to the right of K1; otherwise place it to the left of K1. Continue the same
process for all the remaining numbers in the list. Finally we will get three
sub lists: one with numbers less than or equal to K1, one with numbers greater
than or equal to K2, and the other with numbers between K1 and K2. Repeat the
same process for each sub list. Continue this process until the sub list
contains zero elements or one element.

Algorithm

Kakkot Sort(N: Array of Numbers, K1, K2, A: integers)
Step 1. Read the first two numbers from N; let them be K1 and K2
Step 2. Sort K1 and K2
Step 3. Read the next number; let it be A
Step 4. Compare A with K2
Step 5. If A is greater than or equal to K2
        then place A right of K2
        else
        compare A with K1.
        If A is less than K1
        then place A left of K1
        else
        place A immediately right of K1
Step 6. If the list contains any more elements, go to Step 3
Step 7. Now we have 3 sub lists:
        First list with all values less than or equal to K1.
        Second with values between K1 and K2.
        Final with values greater than or equal to K2.
Step 8. If a list contains more than 1 element, go to Step 1 for that list
Step 9. End.

Time complexity

If there are n numbers, then each iteration needs a maximum of 2(n-2)
comparisons and a minimum of n-2 comparisons, plus one comparison to sort the
two keys. So if we take the average it will be

= (2n-4+n-2)/2 + 1
= (3n-6)/2 + 1
= 3n/2 - 2

In the average case each list would have 3 sub lists, and the number of
iterations x will satisfy
3^x = n
Taking logarithms on both sides we get







x log 3 = log n
x = log n / log 3
x = log n / 0.4771
Ignoring the constant we can write x = log n

That is, there will be log n iterations and each requires 3n/2 - 2
comparisons. So the time complexity of Kakkot Sort in the average case is
(3n/2 - 2) * log n. In Big Oh notation constants can be ignored, so we get
O(n log n).

If the list is already in sorted order, then only one comparison will be
required for each number, so the total number of comparisons required for an
iteration will be (n-2)+1, i.e. n-1, and the total over all iterations will be
(n-1) + (n-3) + (n-5) + ... + 1
This can be written as
1 + 3 + 5 + ... + (n-3) + (n-1).

The sum of this series is
S = N/2 * (2a + (N-1)*d)
where N is the number of terms in the series,
a is the first term, and
d is the common difference.
The Nth term is a + (N-1)*d, and here the Nth term is n-1, so
1 + (N-1)*2 = n-1
2N = n
N = n/2
S = N/2 * (2*1 + (N-1)*2)
S = N/2 * (2 + 2N - 2)
S = N/2 * (2N)
S = N²
Substituting the value of N we get
S = (n/2)²

This is equal to one fourth of n². So Kakkot Sort requires only one fourth of
the comparisons of Quick sort in the worst case. This is almost equal to the
average case time complexity. So we can say that the time complexity of Kakkot
sort is similar in all the cases.

Now let us manually calculate the number of comparisons that Kakkot sort
takes.

Consider the following ten randomly generated unsorted numbers

       1, 60, 33, 3, 35, 21, 53, 19, 70, 94
                    List 1

The first two numbers are 1 and 60; sort them. Here K1 is 1 and K2 is 60.
Now the total number of comparisons is one.
Read the remaining numbers one by one.
Read 33: since 33 is less than K2 and greater than K1 it needs two
comparisons. Now the total is increased to 3.
Read 3, total is now 5
Read 35, total is now 7
Read 21, total is now 9
Read 53, total is now 11
Read 19, total is now 13
Read 70, total is now 14
Read 94, total is now 15
Now the list will be
1, 3, 35, 21, 53, 19, 60, 70, 94
Here we have 3 sub lists.
The first one has zero elements.
The second list is 3, 35, 21, 53, 19
The third list is 70, 94
Now do the same process for the second and third lists.
Second list
Read the first two numbers and sort them.
We have K1 = 3 and K2 = 35
Now the total is 16
Read 21, total is now 18
Read 53, total is now 19
Read 19, total is now 21
Now the list will be
3, 19, 21, 35, 53
Now there is only one list with more than one element, i.e. 19 and 21.
Read the first two numbers and sort them.
Here K1 = 19 and K2 = 21
Now the total is 22
Now, regarding sub list 3, we have two numbers, 70 and 94.
Read the numbers and sort them.
Now the total number of comparisons is 23
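The hand count can be cross-checked with a short program. The following is a
minimal sketch of the nine steps, not the authors' implementation. It assumes
that the three sub lists are kept as separate Python lists in arrival order
and that a one-element list is used as a mutable comparison counter; the names
kakkot_sort and count_comparisons are illustrative.

```python
def kakkot_sort(items, counter):
    """Sort items with two keys per pass; counter[0] accumulates comparisons."""
    if len(items) < 2:
        return list(items)
    k1, k2 = items[0], items[1]
    counter[0] += 1                  # Step 2: one comparison to order the keys
    if k1 > k2:
        k1, k2 = k2, k1
    left, middle, right = [], [], []
    for a in items[2:]:
        counter[0] += 1              # Step 4: compare A with K2
        if a >= k2:
            right.append(a)
        else:
            counter[0] += 1          # Step 5: compare A with K1
            if a < k1:
                left.append(a)
            else:
                middle.append(a)
    # Step 8: repeat the process on each sub list
    return (kakkot_sort(left, counter) + [k1]
            + kakkot_sort(middle, counter) + [k2]
            + kakkot_sort(right, counter))


def count_comparisons(items):
    """Return (sorted list, total comparisons) for the sketch above."""
    counter = [0]
    return kakkot_sort(list(items), counter), counter[0]


print(count_comparisons([1, 60, 33, 3, 35, 21, 53, 19, 70, 94]))
# ([1, 3, 19, 21, 33, 35, 53, 60, 70, 94], 25)
print(count_comparisons(range(1, 11))[1])   # 25, i.e. 9 + 7 + 5 + 3 + 1
```

Under these assumptions the sketch reports 25 comparisons for List 1 rather
than 23. The hand trace above omits the element 33 when it forms the second
sub list, which seems to account for the difference. For an already sorted
list of ten numbers the count is 9 + 7 + 5 + 3 + 1 = 25, which agrees with
(n/2)² for n = 10.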







So, using Kakkot sort, sorting the given ten randomly generated numbers
requires only 23 comparisons.

Kakkot Sort and Quick Sort

The time complexity of Quick sort is O(n log n) in the average case and O(n²)
in the worst case. From this it is clear that Kakkot sort is better than Quick
sort. While sorting, Quick sort does not require any data movement, whereas
Kakkot sort needs data movement when the item is greater than the first key
element and less than the second key element. But this data movement can be
avoided by implementing the algorithm using a linked list.
To sort the above ten numbers in List 1, Quick sort requires 29 comparisons.

Kakkot Sort and Heap Sort

Heapsort is a much more efficient version of selection sort. It also works by
determining the largest (or smallest) element of the list, placing that at the
end (or beginning) of the list, then continuing with the rest of the list, but
accomplishes this task efficiently by using a data structure called a heap, a
special type of binary tree. Once the data list has been made into a heap, the
root node is guaranteed to be the largest (or smallest) element. When it is
removed and placed at the end of the list, the heap is rearranged so the
largest element remaining moves to the root. Using the heap, finding the next
largest element takes O(log n) time, instead of O(n) for a linear scan as in
simple selection sort. This allows Heapsort to run in O(n log n) time, and
this is also the worst case complexity.
With the same set of unsorted numbers in List 1, Heap sort requires 30
comparisons.

Kakkot Sort and Bubble Sort

Bubble sort is a straightforward and simplistic method of sorting data that is
used in computer science education. The algorithm starts at the beginning of
the data set. It compares the first two elements, and if the first is greater
than the second, then it swaps them. It continues doing this for each pair of
adjacent elements to the end of the data set. It then starts again with the
first two elements, repeating until no swaps have occurred on the last pass.
This algorithm is highly inefficient, and is rarely used except as a
simplistic example. For example, if we have 100 elements then the total number
of comparisons will be 10000. A slightly better variant, cocktail sort, works
by inverting the ordering criteria and the pass direction on alternating
passes. The modified Bubble sort will stop 1 shorter each time through the
loop, so the total number of comparisons for 100 elements will be 4950.

Bubble sort average case and worst case are both O(n²).

For the above unsorted numbers in List 1, Bubble sort requires 45 comparisons.

Kakkot Sort and Insertion Sort

Insertion sort is a simple sorting algorithm that is relatively efficient for
small lists and mostly-sorted lists, and often is used as part of more
sophisticated algorithms. It works by taking elements from the list one by one
and inserting them in their correct position into a new sorted list. In
arrays, the new list and the remaining elements can share the array's space,
but insertion is expensive, requiring shifting all following elements over by
one. Shell sort is a








variant of insertion sort that is more efficient for larger lists.
Insertion sort requires 38 comparisons to sort the above ten randomly
generated numbers in List 1.

Conclusion

From the above examples it is clear that the time complexity of Kakkot Sort is
better than that of the other sorting methods. Even though Kakkot sort
requires data movement of items when the item is less than the key K2 and
greater than the key K1, this data movement can be reduced by implementing the
algorithm using a linked list.



