KEMBAR78
Implementation of Sorting Algorithms | PDF | Array Data Structure | Time Complexity
0% found this document useful (0 votes)
849 views97 pages

Implementation of Sorting Algorithms

This document discusses the implementation of sorting algorithms. It was submitted by Erinfolami Daniel Oluwatosin in partial fulfillment of the requirements for a Bachelor of Science degree in Computer Science from Ekiti State University, Nigeria. The document includes an introduction to sorting concepts, classifications of sorting algorithms, descriptions of popular sorting algorithms, and comparisons of different algorithms. Source code implementations of some algorithms in Visual Basic are provided in the appendices.

Uploaded by

DanielErinfolami
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
849 views97 pages

Implementation of Sorting Algorithms

This document discusses the implementation of sorting algorithms. It was submitted by Erinfolami Daniel Oluwatosin in partial fulfillment of the requirements for a Bachelor of Science degree in Computer Science from Ekiti State University, Nigeria. The document includes an introduction to sorting concepts, classifications of sorting algorithms, descriptions of popular sorting algorithms, and comparisons of different algorithms. Source code implementations of some algorithms in Visual Basic are provided in the appendices.

Uploaded by

DanielErinfolami
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 97

IMPLEMENTATION OF

SORTING ALGORITHMS

BY

ERINFOLAMI DANIEL OLUWATOSIN


(074617)

BEING A PROJECT SUBMITTED TO THE


DEPARTMENT OF MATHEMATICAL SCIENCE
(COMPUTER SCIENCE OPTION)
FACULTY OF SCIENCE,
EKITI STATE UNIVERSITY, ADO EKITI, NIGERIA

IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE AWARD OF

BACHELOR OF SCIENCE (B.SC) DEGREE IN COMPUTER SCIENCE

MARCH, 2012

i
CERTIFICATION

This is to certify that this work was carried out by ERINFOLAMI, Daniel

Oluwatosin with matriculation number 074617 of the Department of

Mathematical Science (Computer Science), Faculty of Science, Ekiti State

University, Ado-Ekiti,

____________________ _____________________
Mr. A.A. Obayomi DATE
PROJECT SUPERVISOR

____________________ _____________________
Dr. R.A. Adeleke DATE

HEAD OF DEPARTMENT

____________________ _____________________
External Examiner DATE

ii
DEDICATION

This project is dedicated to Almighty God for his guidance, protection and

provision throughout my course of study in Ekiti State University and also

to my family and friends for the encouragement, moral and financial support

given to me.

iii
ACKNOWLEDGEMENTS

A Big thanks to Almighty God the maker of Heaven and Earth who

has always been my Guide, my mentor, my Provider who has been

protecting me up till now, I owe him everything.

To my able Supervisor Mr. A.A Obayomi who is always being there

for me like a father to his son, who took time in going through this project

manuscript and making constructive criticisms, useful advice and

corrections for the success of this project, I say a big thank you.

I also appreciate the H.O.D of the department Dr R.A Adeleke, my

level coordinator, Mr. Ekong, and other lecturers in the department.

To my parents, Pastor & Mrs. Erinfolami who has always being there

for me, spiritually, financially and morally, you gave me the best gift of life

by sponsoring my education; I love you so much.

There are many to whom I owe a debt of gratitude for their valuable

suggestions, love and wonderful support all the way, More importantly Mr.

Abimbola Akande and a special crystal, Olutolami Falaiye. Also to my

siblings, I say thank you for your love and support, May God reward you

bountifully. I love you so much, thanks for being there. I cannot forget my
iv
friends; Akinwale Emmanuel Michael Falade, Samuel, Cecilia, and the

members of CAC Victory Assembly, EKSU, who all have been so loving and

have been such good people to me, it’s been a privilege meeting you all.

v
TABLE OF CONTENTS

CONTENTS PAGES

TITLE PAGE i

CERTIFICATION ii

DEDICATION iii

ACKNOWLEDGEMENTS iv

TABLE OF CONTENTS vi

ABSTRACT xi

CHAPTER ONE: INTRODUCTION

1.1 Introduction 1

1.2 Aims and Objectives of The Project 5

1.3 Methodology 6

1.4 Scope and Limitations of the Project 6

1.5 Structure of The Project 7

vi
CHAPTER TWO: LITERATURE REVIEW

2.1 History of Sorting 9

2.2 Terminologies 10

2.3 Concepts 13

2.4 Classification 18

2.5 Stability 19

2.6 Popular Sorting Algorithms 20

2.6.1 Bubble Sort 20

2.6.2 Cocktail Sort 22

2.6.3 Comb Sort 22

2.6.4 Gnome Sort 23

2.6.5 Heap Sort 24

2.6.6 Insertion Sort 25

2.6.7 JSort 27

2.6.8 Jump Sort 27

2.6.9 Merge Sort 28

2.6.10 Quick Sort 29

2.6.11 Quick Sort 3 33

vii
2.6.12 Selection Sort 34

2.6.13 Shaker Sort 35

2.6.14 Shell Sort 35

2.6.15 Smooth Sort 38

2.6.16 Snake Sort 38

2.7 Comparison 39

2.7.1 Comparison Sorts 40

2.7.2 Non-Comparison Sorts 43

2.7.3 Impractical Sorts 45

CHAPTER THREE: SYSTEM ANALYSIS AND DESIGN

3.1 Introduction 47

3.2 The Ideal Sorting Algorithm 47

3.3 Characteristics of a Good System Design 48

3.4 Development Tools 49

3.5 Overview of the Proposed Program 49

3.6 Program Architecture 50

3.7 The User Interface 50

viii
CHAPTER FOUR: SYSTEM IMPLEMENTATION

4.1 System Requirement 52

4.2 Installation Description 53

4.3 Program Implementation 54

4.3.1 Tab 1: Basic Sorting 54

4.3.2 Tab 2: Sort File 56

4.3.3 Tab 3: Sort by Key Field 58

CHAPTER FIVE: CONCLUSION AND RECOMMENDATION

5.1 Conclusion 60

5.2 Recommendation 60

REFERENCES 61

APPENDICES 63

Appendix A: List of Sorting Algorithms 63

Appendix B: Pseudocode Implementation of some Sorting Algorithms 66

ix
Appendix C: Sorting Algorithms in Visual Basic 69

Appendix D: Program Source Code 77

x
ABSTRACT

This project presents and implements a collection of algorithms for sorting.

Descriptions are brief with just enough theory. The first section introduces

basic sorting concepts and classifications of sorting algorithms. The next

section presents several sorting algorithms. Source code for the most

popular algorithms, in Visual Basic, is included.

The implementation, in Visual Basic, is a Windows Forms application.

At first the sorting algorithms will be applied to an array of integers.

Although this is a good way to introduce the algorithms, it is not very

common in practical situations to sort an array of simple numbers. Usually,

the items being sorted are more complex, typically objects or structs, so that

each item is itself made up of a number of elements.

Hence, the application will be able (include procedures) implement a

predefined set of more complex tasks.

The bibliography appearing at the end of this project lists 38 sorting

algorithms and various books and articles published in recent years.

xi
xii
CHAPTER ONE

1.1 INTRODUCTION

According to Wikipedia,

Sorting is any process of arranging items in some sequence and/or in

different sets, and accordingly, it has two common, yet distinct meanings:

1. Ordering: arranging items of the same kind, class, nature, etc. in some

ordered sequence,

2. Categorizing: grouping and labeling items with similar properties

together (by sorts).

Although applying sorting algorithms to arrays of integers this is a good

way to introduce the algorithms, it is not very common in practical

situations to sort an array of simple numbers. Usually, the items being

sorted are more complex, typically objects or structs, so that each item is

itself made up of a number of elements. We will refer to the items that are

being sorted as records and to the elements that make up the items as fields

of the records. Sorting n-tuples (depending on context also called e.g.

records consisting of fields) can be done based on one or more of its

1
components. For example, if each record is a name and address, then the

fields might be first name, last name, street address, city, state, and zip code.

When an array of records is to be sorted, the sorted order typically depends

on only one of the fields of the record. This field is said to be the key field for

the sort (or sort key), and the values of this field are called keys. In other

words, when we sort an array of records, we compare keys, and we arrange

the records so that their keys are in ascending (or descending) order.

A new sort key can be created from two or more sort keys by

lexicographical order. The first is then called the primary sort key, the

second the secondary sort key, etc.

For example, addresses could be sorted using the city as primary sort key,

and the street as secondary sort key.

If the sort key values are totally ordered, the sort key defines a weak order of

the items: items with the same sort key are equivalent with respect to

sorting. See also stable sorting. If different items have different sort key

values then this defines a unique order of the items.

2
A standard order is often called ascending (corresponding to the fact that

the standard order of numbers is ascending, i.e. A to Z, 0 to 9), the reverse

order descending (Z to A, 9 to 0). For dates/times ascending means that

earlier values precede later ones e.g. 1/1/2000 will sort ahead of 1/1/2001.

A SORTING ALGORITHM is an algorithm that puts elements of a list in a

certain order. The most used orders are numerical and lexicographical order.

Sorting is a commonly encountered programming task in computing, hence,

efficient sorting is important for optimizing the use of other algorithms

(such as search and merge algorithms) that require sorted lists to work

correctly; it is often useful for canonicalizing data and for producing human-

readable output.

More formally, the output must satisfy two conditions:

1. The output is in non-decreasing order (each element is no smaller than

the previous element according to the desired total order)

2. The output is a permutation, or reordering, of the output.

3
SORTING EXAMPLES:

 List containing exam scores sorted from Lowest to Highest or from

Highest to Lowest.

 List containing words that were misspelled and be listed in

alphabetical order.

 List of student records and sorted by student number or

alphabetically by first or last name.

WHY DO WE SORT?

 Searching for an element in an array will be more efficient. (Example:

looking up for information like phone number).

 It’s always nice to see data in a sorted display. (Example: spreadsheet

or database application).

4
1.2 AIMS AND OBJECTIVES OF THE PROJECT

This project attempts to provide an overview of Lists, Arrays and searching

of such data structures.

It also provides an understanding of the basics and application of Sorting.

This project also aims to;

 Show how each algorithm operates

 Show that there is no best sorting algorithm

 Show the advantages and disadvantages of each algorithm

 Show that the worse-case asymptotic behavior is not always the best

deciding factor in choosing an algorithm

 Show that the initial condition (input order and key distribution)

affects performance as much as the algorithm choice.

This project also aimed at developing a simple application that

demonstrates simple and practical applications of sorting. The application

will be able to:

 Sort the contents of a text file,

 Sort a list and remove duplicates,

5
 Randomize an array, and

 Sort objects by different key fields by building a comparer class

(in Visual Basic .NET)

1.3 METHODOLOGY

This project work has benefitted majorly from information gathered from

various articles, publications and presentations. These materials are results

of high-quality research from both the Academics background with lasting

significance in the theory, design, implementation, analysis and applications

of sorting.

To develop the application the following will be used

Programming Language: Microsoft Visual Basic .NET

.NET Framework: Version 4

1.4 SCOPE AND LIMITATION OF THE STUDY

Sorting is a vast topic; the application will demonstrate basic

implementation of sorting algorithms.

6
External sorting, radix sorting, string sorting, and linked list sorting – all

wonderful and interesting topics – are deliberately omitted to limit the

scope of the project.

1.5 STRUCTURE OF THE PROJECT

i. Chapter One (Introduction): the background of the study, aims of the

study, scopes of the study and the methodology used.

ii. Chapter Two (Literature Review): analyses core algorithm concepts,

popular sorting algorithms and compares them. The chapter also

explores available written literatures.

iii. Chapter Three (System Design and Analysis): discusses how the

proposed application will be implemented by giving an overview of the

proposed application, describing the working operations, analysis of it

and the design.

iv. Chapter Four (System Implementation): takes a broad look at the

working operations of the application, the chapter exposes it to real life

scenarios to confirm performance. It starts by taking a look at the

7
hardware and software requirement for installation and the process of

installation itself.

v. Chapter Five (Conclusion and Recommendation): discusses the future

of sorting and sorting algorithms and based on that, recommendations

will be suggested.

8
CHAPTER TWO

LITERATURE REVIEW

2.1 History of Sorting

Sorting is one of the most important operations performed by computers. In

the days of magnetic tape storage before modern databases, database

updating was done by sorting transactions and merging them with a master

file. It's still important for presentation of data extracted from databases:

most people prefer to get reports sorted into some relevant order before

flipping through pages of data.

In computer science, sorting is one of the most extensively researched

subjects because of the need to speed up the operation on thousands or

millions of records during a search operation.

The main purpose of sorting information is to optimize its usefulness for

specific tasks. In general, there are two ways of grouping information: by

category e.g. a shopping catalogue where items are compiled together under

headings such as 'home', 'sport & leisure', 'women's clothes' etc. (nominal

9
scale) and by the intensity of some property, such as price, e.g. from the

cheapest to most expensive (ordinal scale.)

2.2 Terminologies

a) Algorithm:

An algorithm is an effective method expressed as a finite list of well-defined

instructions for calculating a function. Starting from an initial state and

initial input (perhaps empty), the instructions describe a computation that,

when executed, will proceed through a finite number of well-defined

successive states, eventually producing "output" and terminating at a final

ending state.

b) Lists:

A list or sequence is an abstract data type that implements an ordered

collection of values, where the same value may occur more than once. An

instance of a list is a computer representation of the mathematical concept

of a finite sequence. Each instance of a value in the list is usually called an

10
item, entry, or element of the list; if the same value occurs multiple times,

each occurrence is considered a distinct item.

c) Array:

An array data structure or simply array is a data structure consisting of a

collection of elements (values or variables), each identified by at least one

array index or key. An array is stored so that the position of each element

can be computed from its index tuple by a mathematical formula.

Arrays are among the oldest and most important data structures, and are

used by almost every program. They are also used to implement many other

data structures, such as lists and strings.

When data objects are stored in an array, individual objects are selected by

an index that is usually a non-negative scalar integer. Indices are also called

subscripts. An index maps the array value to a stored object.

There are three ways in which the elements of an array can be indexed:

 0 (zero-based indexing): The first element of the array is indexed by

subscript of 0.

11
 1 (one-based indexing): The first element of the array is indexed by

subscript of 1.

 n (n-based indexing): The base index of an array can be freely chosen.

Arrays can have multiple dimensions, thus it is not uncommon to access an

array using multiple indices. The number of indices needed to specify an

element is called the dimension, dimensionality, or rank of the array.

d) Search Algorithm:

A search algorithm, broadly speaking, is an algorithm for finding an item

with specified properties among a collection of items. The items may be

stored individually as records in a database; or may be elements of a search

space defined by a mathematical formula or procedure, such as the roots of

an equation with integer variables.

e) N-Tuple:

An n-tuple, sometimes simply called a “tuple” when the number n is known

implicitly, is another word for a list, i.e., an ordered set of n elements. It can

be interpreted as a vector, or more specifically, an n-vector.

12
2.3 Concepts

a) Big O Notation:

In Mathematics, Big O Notation is used to describe the limiting behavior of

a function when the argument tends towards a particular value or infinity,

usually in terms of simpler functions. Big O notation characterizes functions

according to their growth rate.

b) Timing Estimates

We can place an upper-bound on the execution time of algorithms using O

(big-oh) notation. An algorithm that runs in O(n2) time indicates that

execution time increases with the square of the dataset size. For example, if

we increase dataset size by a factor of ten, execution time will increase by a

factor of 100. A more precise explanation of big-oh follows.

Assume that execution time is some function t(n), where n is the dataset

size. The statement

t(n) = O(g(n))

implies that there exists positive constants c and n0 such that

13
t(n) <= c·g(n)

for all n greater than or equal to n0. This is illustrated graphically in the

following figure.

Figure 2-1: Big-Oh

Big-oh notation does not describe the exact time that an algorithm takes,

but only indicates an asymptotic upper bound on execution time within a

constant factor. If an algorithm takes O(n2) time, then execution time grows

no worse than the square of the size of the list.

14
7/6 2
n lg n n n lg n n
1 0 1 0 1
16 4 25 64 256
256 8 645 2,048 65,536
4,096 12 16,384 49,152 16,777,216
65,536 16 416,128 1,048,565 4,294,967,296
1,048,576 20 10,568,983 20,971,520 1,099,511,627,776
16,777,216 24 268,435,456 402,653,183 281,474,976,710,656

Table 2-1: Growth Rates

Table 1-1 illustrates growth rates for various functions. A growth rate of

O(lg n) occurs for algorithms similar to the binary search. The lg (logarithm,

base 2) function increases by one when n is doubled. Recall that we can

search twice as many items with one more comparison in the binary search.

Thus the binary search is a O(lg n) algorithm.

If the values in Table 1-1 represented microseconds, then a O(n1.25)

algorithm may take 10 microseconds to process 1,048,476 items, a O(lg n)

algorithm 20 seconds, and a O(n2) algorithm up to 12 days! In the following

chapters a timing estimate for each algorithm, using big-O notation, will be

included. For a more formal derivation of these formulas you may wish to

consult the references.

15
c) Divide and Conquer Algorithms:

Divide and Conquer (D&C) is an important algorithm design paradigm

based on multi-branched recursion. A divide and conquer algorithm works

by recursively breaking down a problem into two or more sub-problems of

the same (or related) type, until these problems become simple enough to be

solved directly. The solutions to the sub-problems are then combined to

give a solution to the original problem.

This technique is the basis of efficient algorithms for all kinds of problems

such as sorting (e.g., Quicksort, merge sort.)

d) Data Structures:

A data structure is a particular way of storing and organizing data in a

computer so that it can be used efficiently. Different kinds of data structures

are suited to different kinds of applications. For example, compiler

implementations usually use hash tables to look up identifiers. Data

structures are used in almost every program or software system. Data

structures provide a means to manage huge amounts of data efficiently such

as large databases and internet indexing services.

16
e) Best, Worst and Average Case:

Best, Worst and Average Cases of a given algorithm express what the

resource usage is at least, at most and on average, respectively. Usually the

resource being considered is running time, but it could also be memory or

other resources. In real-time computing, the worst-case execution time is

often of particular concern since it is important to know how much time

might be needed in the worst case to guarantee that the algorithm will

always finish on time.

f) Space-Time Tradeoffs (or time-memory tradeoff):

This is a situation where the memory use can be reduced at the cost of

slower program execution (and, conversely, the computation time can be

reduced at the cost of increased memory use). As the relative costs of CPU

cycles, RAM space, and hard drive space change – hard drive space has for

some time been getting cheaper at a much faster rate than other components

of computers – the appropriate choices of space-time tradeoffs have changed

radically. Often, by exploiting a space-time tradeoff, a program or algorithm

can be made to run much faster.

17
2.4 CLASSIFICATION

Sorting algorithms used in computer science are often classified by:

a) Recursion. Some algorithms are either recursive or non-recursive, while

others may be both (e.g., merge sort).

b) Whether or not they are a comparison sort. A comparison sort examines

the data only by comparing two elements with a comparison operator.

c) Memory usage. Strictly, a sorting algorithm is said to be an in-place

sorting algorithm if the amount of extra space required by the algorithm

is O(1); sometimes O(log(n)) additional memory is considered "in place".

That is, the amount of extra space is bounded by a constant, independent

of the size of the array.

d) Computational complexity of swaps (for "in place" algorithms)

e) Computational complexity (worst, average and best behavior) of

element comparisons in terms of the size of the list (n). For typical

sorting algorithms good behavior is O(n log n) and bad behavior is

O(n2). Ideal behavior for a sort is O(n), but this is not possible in the

average case. Comparison-based sorting algorithms, which evaluate the

18
elements of the list via an abstract key comparison operation, need at

least O(n log n) comparisons for most inputs.

f) Stability: stable sorting algorithms maintain the relative order of records

with equal keys

g) General method: insertion, exchange, selection, merging, etc.. Exchange

sorts include bubble sort and quicksort. Selection sorts include shaker

sort and heapsort.

h) Adaptability: Whether or not the presortedness of the input affects the

running time. Algorithms that take this into account are known to be

adaptive.

2.5 STABILITY

A sorting algorithm is said to be stable if items that have the same key

remain in the same relative order after the sorting algorithm is applied as

they were in before the sort.

Insertion Sort is a stable sorting algorithm because it never changes the

ordering of items with the same key. In Selection Sort, on the other hand, an

19
item can jump from one part of the array to another, moving past other

items with the same key. Selection Sort is not stable. Similarly, Heap Sort is

not a stable sorting algorithm.

Stability is not always needed in a sorting algorithm, but it is desirable in

some cases. Suppose, for example, we are sorting records that contain first

name and last name fields. Suppose that we sort the array using the first

name as the key, and then we sort using the last name as key. If we use a

stable sorting algorithm, then records with the same last name will still be

in alphabetical order by first name. If the algorithm is not stable, then

people with the same last name can end up in an arbitrary order. [3]

2.6 POPULAR SORTING ALGORITHMS

2.6.1. Bubble Sort

Bubble sort is a simple sorting algorithm. It works by repeatedly stepping

through the list to be sorted, comparing two items at a time and swapping

them if they are in the wrong order. The pass through the list is repeated

until no swaps are needed, which means the list is sorted.

20
Figure 2-2: A Visual Representation of How Bubble Sort Works

The algorithm gets its name from the way smaller elements "bubble" to the

top (i.e. the beginning) of the list via the swaps. One way to optimize

bubble sort (implemented here) is to note that, after each pass, the largest

element will always move down to the bottom. Thus it suffices to sort the

remaining n - 1 elements each subsequent pass.

Although simple, this algorithm is highly inefficient and is rarely used

except in education.

21
2.6.2. Cocktail Sort

Cocktail sort is a variation of bubble sort that sorts in both directions each

pass through the list.

One optimization (implemented here) is to add an if-statement that checks

whether there has been a swap after the first pass each iteration. If there

hasn't been a swap the list is sorted and the algorithm can stop.

2.6.3. Comb Sort

Comb sort was invented by Stephen Lacey and Richard Box, who first

described it to Byte Magazine in 1991. It improves on bubble sort and rivals

in speed more complex algorithms like quick sort. The idea is to eliminate

turtles, or small values near the end of the list, since in a bubble sort these

slow the sorting down tremendously. (Rabbits, large values around the

beginning of the list, do not pose a problem in bubble sort.)

In bubble sort, when any two elements are compared, they always have a

gap (distance from each other) of one. The basic idea of comb sort is that the

gap can be much more than one.

22
The gap starts out as the length of the list being sorted divided by the shrink

factor, and the list is sorted with that value (rounded down to an integer if

needed) for the gap. Then the gap is divided by the shrink factor again, the

list is sorted with this new gap, and the process repeats until the gap is one.

At this point, comb sort reverts to a true bubble sort, using a gap of one

until the list is fully sorted. In this final stage of the sort most turtles have

already been dealt with, so a bubble sort will be efficient.

The shrink factor has a great effect on the efficiency of comb sort. In the

original article, the authors suggested 1.3 after trying some random lists and

finding it to be generally the most effective. A value too small slows the

algorithm down because more comparisons must be made, whereas a value

too large may not kill enough turtles to be practical.

2.6.4. Gnome Sort

Gnome sort is a sorting algorithm which is similar to insertion sort except

that moving an element to its proper place is accomplished by a series of

swaps, as in bubble sort. It is conceptually simple, requiring no nested

loops. In practice the algorithm has been reported to generally run as fast as

23
Insertion sort, although this depends on the details of the architecture and

the implementation

2.6.5. Heap Sort

Heap sort is a much more efficient version of selection sort. Invented by

John William Joseph Williams in 1964, it works efficiently by using a data

structure called a heap.

A heap is a specialized tree-based data structure that satisfies the heap

property: if B is a child node of A, then key(A) >= key(B). This implies that

the element with the greatest key is always in the root node. All elements to

be sorted are inserted into a heap, and the heap organizes the elements

added to it in such a way that the largest value can be quickly extracted.

Once the data list has been made into a heap, the root node is guaranteed to

be the largest element. It is removed and placed at the end of the list, then

the heap is rearranged so the largest element remaining moves to the root.

Using the heap, finding the next largest element takes much less time than

scanning every remaining element, which gives heap sort much better

24
performance than selection sort. Similar to selection sort, the initial

conditions have little or no effect on the amount of time required.

Heap sort is one of the best general-purpose sorting algorithms. Although

somewhat slower in practice on most machines than a good implementation

of quick sort, it has a better worst-case runtime.

2.6.6. Insertion Sort

One of the simplest methods to sort an array is an insertion sort. Insertion

sort is a simple comparison sort in which the sorted array (or list) is built

one entry at a time. It is much less efficient on large lists than more

advanced algorithms such as quicksort, heap sort, or merge sort, but it's very

efficient on small (5-50 key) lists, as well as lists that are mostly sorted to

begin with.

25
Figure 2-3: Insertion Sort

In abstract terms, every iteration of an insertion sort removes an element

from the input data, inserting it at the correct position in the already sorted

list, until no elements are left in the input. The choice of which element to

remove from the input is arbitrary and can be made using almost any choice

algorithm.

26
2.6.7. JSort

JSort is a hybrid of heap sort and insertion sort developed by Jason

Morrison. It works by running two heap passes to roughly order the array,

and then finishes with an insertion sort.

The first heap pass converts the array to a heap, moving the smallest item to

the top. The second heap pass works in reverse, moving the largest element

to the bottom. These two passes combine to roughly order the array, though

much work is still left to the final insertion sort.

For small lists, JSort is extremely efficient, but due to its design it does not

scale well.

2.6.8. Jump Sort

Similar to shell sort and comb sort, jump sort employs a gap value that

decreases in each successive pass that allows out-of-place elements to be

moved very far initially. The underlying framework is based on bubble sort

instead of the more efficient insertion sort, but due to the initial ordering in

the early passes it ends up being very efficient in practice, though still

somewhat slower than either shell sort or comb sort.


27
2.6.9. Merge Sort

Similar to quick sort, merge sort is a recursive algorithm based on a divide-

and-conquer strategy. First, the sequence to be sorted is split into two

halves. Each half is then sorted independently, and the two sorted halves are

merged to a sorted sequence. Merge sort takes advantage of the ease of

merging together already sorted lists.

In many implementations, merge sort calls out to an external algorithm --

usually insertion sort -- when it reaches a level of around 10-20 elements.

This is not necessary in Visual Basic; in fact such an implementation appears

to slow merge sort's overall performance in practice.

Instead, the implementation here uses the purest form of merge sort, where

the list is recursively divided into halves until it reaches a list size of two, at

which point those two elements are sorted.

Unfortunately, extra memory is required to combine two sorted lists

together. The extra memory in this implementation is in the form of a full

copy of the initial array. There are various optimizations that can be made to

improve on this, but that is left as an exercise for the reader.

28
Invented in 1945 by John von Neumann, merge sort is far and away the

fastest stable algorithm.

2.6.10. Quick Sort

Quick sort was originally invented in 1960 by Charles Antony Richard

Hoare. It is a divide and conquer algorithm which relies on a partition

operation.

To partition an array, a pivot element is first randomly selected, and then

compared against every other element. All smaller elements are moved

before the pivot, and all larger elements are moved after. The lesser and

greater sublists are then recursively processed until the entire list is sorted.

This can be done efficiently in linear time and in-place.

Quick sort turns out to be the fastest sorting algorithm in practice.

However, in the (very rare) worst case quick sort is as slow as bubble sort.

There are good sorting algorithms with a better worst case, e.g. heap sort

and merge sort, but on the average they are slower than quick sort by a

consistent margin.

29
int function Partition (Array A, int Lb, int Ub);
begin
select a pivot from A[Lb]...A[Ub];
reorder A[Lb]...A[Ub] such that:
all values to the left of the pivot are <= pivot
all values to the right of the pivot are >= pivot
return pivot position;
end;

procedure QuickSort (Array A, int Lb, int Ub);


begin
if Lb < Ub then
M = Partition (A, Lb, Ub);
QuickSort (A, Lb, M - 1);
QuickSort (A, M, Ub);
end;

Figure 2-4: Quicksort Algorithm

The implementation here uses Niklaus Wirth's variant for selecting the

pivot value, which is simply using the middle value. This works particularly

well for already sorted lists.

In Figure 2-5(a), the pivot selected is 3. Indices are run starting at both ends

of the array. One index starts on the left and selects an element that is larger

than the pivot, while another index starts on the right and selects an

element that is smaller than the pivot. In this case, numbers 4 and 1 are

selected. These elements are then exchanged, as is shown in Figure 2-5(b).

This process repeats until all elements to the left of the pivot <= the pivot,

30
and all elements to the right of the pivot are >= the pivot. Quick Sort

recursively sorts the two sub arrays, resulting in the array shown in Figure

2-5(c).

Figure 2-5: Quicksort Example 1

Its speed and modest space usage makes quick sort one of the most popular

sorting algorithms, available in many standard libraries.

Full example of quicksort on a random set of numbers (Figure 2-6)

The shaded element is the pivot. It is always chosen as the last element of

the partition. However, always choosing the last element in the partition as

the pivot in this way results in poor performance ( ) on already sorted

31
Figure 2-5: Quicksort Example 2

lists, or lists of identical elements. Since sub-lists of sorted / identical

elements crop up a lot towards the end of a sorting procedure on a large set,

versions of the quicksort algorithm which choose the pivot as the middle

element run much more quickly than the algorithm described in this

diagram on large sets of numbers.

32
2.6.11. Quick Sort 3

The critical operation in the standard quick sort is choosing a pivot: the

element around which the list is partitioned. The simplest pivot selection

algorithm is to take the first or the last element of the list as the pivot,

causing poor behavior for the case of sorted or nearly-sorted input. Niklaus

Wirth's variant uses the middle element to prevent these occurrences,

degenerating to O(n²) for contrived sequences.

The median-of-3 pivot selection algorithm takes the median of the first,

middle, and last elements of the list; however, even though this performs

well on many real-world inputs, it is still possible to contrive a median-of-3

killer list that will cause dramatic slowdown of a quicksort based on this

pivot selection technique. Such inputs could potentially be exploited by an

aggressor, for example by sending such a list to an Internet server for sorting

as a denial of service attack.

The quicksort3 implementation here uses a median-of-3 technique, but

instead of using the first, last and middle elements, three elements are

chosen at random. This has the advantage of being immune to intentional

33
attacks, though there is still a possibility (however remote) of realizing the

worst case scenario.

2.6.12. Selection Sort

Selection sort is a simple sorting algorithm that mimics the way humans

instinctively sort. It works by first scanning the entire list to find the

smallest element, swapping it into the first position. It then finds the next

smallest element, swapping that into the second position, and so on until

the list is sorted.

Selection sort is unique compared to almost any other algorithm in that its

running time is not affected by the prior ordering of the list; it always

performs the same number of operations because of its simple structure.

Selection sort also requires only n swaps, which can be very attractive if

writes are the most expensive operation.

Unless writes are very expensive, selection sort will usually be

outperformed by the more complicated algorithms, though it will always

outperform a basic bubble sort. Heap sort is an efficient variation of

selection sort that is both very fast and also scales well.

34
2.6.13. Shaker Sort

Shaker sort is a gap-based bubble sort with a twist. Most gap sorts -- shell

sort, comb sort, et al. -- begin with a large gap and gradually shrink it down

to one. By the time the gap reaches one, the list should be mostly ordered so

the final pass should be efficient.

Like other gap sorts, shaker sort begins with a large gap which gradually

shrinks. However, once the gap reaches one, the gap gets expanded again

before shrinking back toward one. The expanding and contracting gap sizes

constitute the "shaking" part of shaker sort. Each additional expansion is

smaller and smaller until it eventually resolves to one, when no further

expansion is done. At this point the list is almost certain to be nearly sorted,

so the final bubble sort pass is very efficient.

2.6.14. Shell Sort

Shell sort is a variation of insertion sort that was invented by (and takes its

name from) Donald Shell, who published the algorithm in 1959.

Shell sort improves on insertion sort by comparing elements separated by a

gap of several positions. This lets an element take "bigger steps" toward its
35
expected position. Multiple passes over the data are taken using insertion

sort with smaller and smaller gap sizes. The last step of Shell sort is with a

gap size of one -- meaning it is a standard insertion sort -- guaranteeing that

the final list is sorted. By then, the list will be almost sorted already, so the

final pass is efficient.

The gap sequence is an integral part of the shellsort algorithm. Any

increment sequence will work, so long as the last element is 1. Donald Shell

originally suggested a gap sequence starting at half the size of the list,

dividing by half, every iteration until it reached one. While offering

significant improvement over a standard insertion sort, it was later found

that steps of three improve performance even further.

In the implementation here, the initial gap size is calculated by an iterative

formula x=3x+1, where x starts at 0 and grows until the gap is larger than the

list size. Each insertion sort loop begins by dividing the gap by three, thus

ensuring a very large starting gap.

36
Figure 2-7: Shell Sort

A key feature of shell sort is that the elements remain k-sorted even as the

gap diminishes. For instance, if a list was 5-sorted and then 3-sorted, the list

is now not only 3-sorted, but both 5- and 3-sorted. If this were not true, the

algorithm would undo work that it had done in previous iterations, and

would not achieve such a low running time.

Although shell sort is inefficient for large data sets, it is one of the fastest

algorithms for sorting small numbers of elements (sets with less than 1000

or so elements). Another advantage of this algorithm is that it requires

relatively small amounts of memory. It is worth noting that shell sort

37
enjoyed a brief period when it was the fastest sorting algorithm known, only

to be eclipsed by quick sort one year later.

2.6.15. Smooth Sort

Smooth sort is a variation of heap sort developed by Edsger Dijkstra in 1981.

Smooth sort has a similar average case to heap sort but a much better best

case, with a smooth transition between the two. This is where the name

comes from. The advantage of smooth sort is that it's faster if the input is

already sorted to some degree.

Due to its complexity, smooth sort is rarely used.

2.6.16. Snake Sort

An original algorithm, snake sort was invented in 2007. It is as fast as or

faster than any other algorithm, including quick sort, and it scales up as well

as quick sort excluding memory constraints. It is in the merge sort family,

and is unstable, out-of-place, offline, and non-recursive. I call it snake sort

due to its similarities to fantasy football snake drafts.

38
The idea is simple: A random ordering will result in very small contiguous

ordered blocks in either direction. Snake sort begins by identifying all those

blocks, and then merges them together. Each merge pass will halve the

remaining number of blocks, so it very quickly resolves to a sorted state.

It uses quite a bit of memory: a full copy of the original array, plus an index

array (to remember the block cutoffs) half the size of the original array.

The most interesting feature of snake sort is that the more ordered the array

is initially, the faster it runs. Because it is in the unique position of knowing

when the initial order is descending, it is optimally efficient at transposing

such a list to an ordered state.

2.7 COMPARISON

In this table, n is the number of records to be sorted. The columns "Average"

and "Worst" give the time complexity in each case, under the assumption

that the length of each key is constant, and that therefore all comparisons,

swaps, and other needed operations can proceed in constant time. "Memory"

denotes the amount of auxiliary storage needed beyond that used by the list

39
itself, under the same assumption. These are all comparison sorts. The run

time and the memory of algorithms could be measured using various

notations like theta, omega, Big-O, small-o, etc. The memory and the run

times below are applicable for all the 5 notations.

2.7.1 COMPARISON SORTS

Name Best Average Worst Memory Stable Method Other notes

When using a self-


Binary tree
Yes Insertion balancing binary
sort
search tree
Randomly permute
Bogosort No Luck the array and check
if sorted.
Bubble sort Yes Exchanging Tiny code size
Cocktail sort Yes Exchanging

Comb sort — — No Exchanging Small code size


In-place with
theoretically
Cycle sort — No Insertion
optimal number of
writes
Gnome sort Yes Exchanging Tiny code size
Heapsort No Selection

Implemented in
Standard Template
Library (STL); can
In-place
Yes Merging be implemented as
Merge sort
a stable sort based
on stable in-place
merging

40
Name Best Average Worst Memory Stable Method Other notes

O(n + d), where d is


Insertion sort Yes Insertion the number
of inversions

Partition-
Used
ing &
Introsort No in SGI STL implem
Select-
entations
ion

Library sort — Yes Insertion

Depends;
worst
Merge sort Yes Merging
case
is
Finds all
Insertion the longest
Patience
— — No & increasing
sorting
Selection subsequences withi
n O(n log n)
Quicksort is usually
done in place with
O(log(n)) stack
space. Most
implementations
Partition- are unstable, as
Quicksort Depends
ing stable in-place
partitioning is more
complex. Naïve vari
ants use an O(n)
space array to
store the partition.

Stable with O(n) extra


Selection sort No Selection space, for example
using lists.

41
Name Best Average Worst Memory Stable Method Other notes

Depends
on gap
sequence;
Shell sort or No Insertion
best known
is

Remarkably
Slow sort — — — No Selection inefficient sorting
algorithm
An adaptive sort -
comparisons
Smoothsort No Selection when the data is
already sorted, and
0 swaps.
Strand sort Yes Selection

comparisons
Insertion
when the data is
Timsort Yes &
already sorted or
Merging
reverse sorted.
Tournament
— Selection
sort

Table 2-2: Comparison Sorts

The following table describes integer sorting algorithms and other sorting

algorithms that are not comparison sorts. As such, they are not limited by a

lower bound. Complexities below are in terms of n, the number of items to


42
be sorted, k, the size of each key, and d, the digit size used by the

implementation. Many of them are based on the assumption that the key

size is large enough that all entries have unique key values, and hence that n

<< 2k, where << means "much less than."

2.7.2 Non-Comparison Sorts

k
Name Best Average Worst Memory Stable n << 2 Notes

Pigeonhole sort — Yes Yes

Bucket Assumes uniform


sort (uniform — Yes No distribution of elements from
keys) the domain in the array.

Bucket r is the range of numbers to


sort (integer — Yes Yes be sorted. If r = then
keys) Avg RT =

r is the range of numbers to


Counting sort — Yes Yes be sorted. If r = then
Avg RT =

43
2.7.2 Non-Comparison Sorts

k
Name Best Average Worst Memory Stable n << 2 Notes

LSD Radix Sort — Yes No

Stable version uses an


MSD Radix
— Yes No external array of size n to
Sort
hold all of the bins

MSD Radix In-Place. k / d recursion


— No No d
Sort levels, 2 for count array

Asymptotics are based on


k
the assumption that n << 2 ,
Spreadsort — No No
but the algorithm does not
require this.

Table 2-3: Non-Comparison Sorts

The following table describes some sorting algorithms that are impractical

for real-life use due to extremely poor performance or a requirement for

specialized hardware.

44
2.7.3 Impractical Sorts

Name Best Average Worst Memory Stable Comparison Other notes

Bead
— N/A N/A — N/A No Requires specialized hardware
sort

Simple
pancake — No Yes Count is number of flips.
sort

This A linear-time, analog


algorithm for sorting a sequence
Spaghetti
of items, requiring O(n) stack
(Poll) Yes Polling
space, and the sort is stable. This
sort
requires parallel
processors. Spaghetti sort

Sorting Requires a custom circuit of


— Yes No
networks size

Table 2-3: Impractical Sorts

45
Algorithms not yet compared above include:

 Odd-even sort

 Flash sort

 Burst sort

 Postman sort

 Stooge sort

 Sample sort

 Bitonic sorter

46
CHAPTER THREE

SYSTEM ANALYSIS AND DESIGN

This chapter discusses how the proposed application will be implemented by giving an

overview of the proposed application, describing the working operations, analysis of it and

the design.

3.0. INTRODUCTION

A complete functional understanding of the sorting program is presented. In

addition, the various algorithms are graphically visualized the running

process of each algorithm.

3.1. THE IDEAL SORTING ALGORITHM

Would have the following properties

 Stable: Equal keys aren’t reordered.

 Operates in-place, requiring O(1) extra space.

47
 Worst-case O(n∙lg(n)) key comparisons.

 Worst-case O(n) swaps.

 Adaptive: Speeds up to O(n) when data is nearly sorted or when there

are few unique keys.

There is no algorithm that has all these properties, and so the choice of

sorting algorithm depends on the application.

3.2. CHARACTERISTICS OF A GOOD SYSTEM DESIGN

I. A good system must meet user’s requirements and also be cost

effective.

II. A good system must be user friendly and the user interface must be

easy to understand and very simple.

III. A good system design must be conscious of the growth in the

institution for which such system is designed and must be flexible

enough to accommodate such growth.

IV. A good system design must be maintainable with a very simple and

self-explanatory documentation.

48
V. A good system design must be reliable, secured and put into

consideration features that would protect it from fraudulent attempt

or abuse.

3.3. DEVELOPMENT TOOLS

The Program is written in Visual Basic .NET (Visual Studio 2010, .NET

Framework 4.)

Visual Basic .NET (VB.NET) is an object-oriented computer programming

language that can be viewed as an evolution of the classic Visual Basic (VB),

which is implemented on the .NET Framework.

3.4. OVERVIEW OF THE PROPOSED PROGRAM

The proposed program is meant to show practical situations and

implementation of sorting algorithms. Each unit of the program will be able

to solve one task or the other a using sorting algorithm.

The program does not need installation or any security features.

49
3.5. PROGRAM ARCHITECTURE

The program has one module and two classes.

The first class contains the various subroutines used in the program while

the module contains the actual sorting algorithm (insert sort) used in the

program.

The second class is nested in the first class because it’s a Private class.

Input for each text box is verified to be of integer type.

3.6. THE USER INTERFACE

There are 3 tabs in the application. Each tab solves a particular problem.

Tab 1: Basic Sorting: It allows the user to input a set of numbers and sorts it

and also removes duplicates. This process can also be automated in which

the computer randomly generates the set of numbers.

Tab 2: Sort Text File: This example, takes a text file, splits the contents,

sorts the resulting string alphabetically and rewrites it back to the text file.

50
Tab 3: Sort by Key Fields: This example sorts different objects by different

key fields.

51
CHAPTER FOUR

SYSTEM IMPLEMENTATION

This chapter takes a broad look at the working operations of the application, the chapter

exposes it to real life scenarios to confirm the performance, It starts by taking a look at the

hardware and software requirement for running.

4.1 SYSTEM REQUIREMENT

The program is a very light program that requires very little processing

power and memory space to run. For a successful running of the program,

the following minimum system specification is required among other things.

(a) Minimum Hardware requirements

Pentium III or higher with memory not lesser than 128MB

(b) Software requirements

I. Operating System:

o Microsoft Windows 2000, Xp, Vista, 7 or any other higher

version.

52
o Linux or other variants like Ubuntu

II. .NET Framework 4.0

4.2 INSTALLATION DESCRIPTION

The program is very light (about 100kb) and requires no installation. The

user simply clicks and runs it from the storage media on which it is saved.

4.3 PROGRAM IMPLEMENTATION

4.3.1 Tab 1 – Basic Sorting

It allows the user to input a set of numbers and sorts it and also removes

duplicates. This process can also be automated in which the computer

randomly generates the set of numbers.

1. To input the set of numbers manually,

a) The user clicks on the text box and inputs a number

b) The user clicks the add button

53
Figure 4-1: Tab 1: Basic Sorting

c) The program checks if the input is valid. A valid entry is a non-empty

text box containing numeric characters. If the entry contains

alphanumeric characters, then we use the val() function to

convert it to the appropriate type. The val() function Returns the

numbers contained in a string as a numeric value of appropriate type.

54
d) The number is then added to Listbox1

e) The text button is automatically focused again

The user will go over these steps till he has input all the numbers he

wishes to sort.

2. After the inputs, the user clicks the Sort button and the add button is

disabled to prevent input during the running of the program. The

program then copies all the items in listbox1 to an array and sorts the

array using the Insert sort algorithm. The resulting sorted array is then

copies to ListBox2

3. The user can remove duplicate items from the new sorted array by

clicking the “Remove Duplicates” button. A simple algorithm scans

ListBox2 and removes every duplicate therein.

4. Alternatively, the program can automatically generate a set of numbers

into ListBox1 and sort them into ListBox2. To do this, the user

clicks the Button labeled “Automate Add And Sort”

5. An input box is displayed prompting the user to input the range of the

dataset. If the range is higher than 6,400, the user is alerted that datasets

55
higher than that figure results in exponentially longer run time and

hence more computer memory and CPU resources.

4.3.2 Tab 2: Sort File

This example, takes a text file, splits the contents, sorts the resulting string

alphabetically and rewrites it back to the text file.

Figure 4-2: Tab 2: Sort Text File

56
1. The file to be analyzed must be a text (.txt) file. Hence if the user

wants to sort the contents of a Word (.docx) or a Pfd (.pdf) file, the

user must copy the contents of the file onto a blank Notepad.

2. The user copies the file address into the text box supplied.

3. The student then selects the delimiter that separates each item in the

text file.

4. Once the “Open + Split” Button is clicked, the program first checks if

the file address specified in the text box exists. If it does, it then

checks if the text file is notEmpty.

5. If the text file path does not exist or is empty, the user is notified and

the control is transferred back to the text box.

6. Once the input is valid (i.e. the file exists and is not empty), the

program copies the content of the text file and splits it using the

delimiter specified earlier by the user.

7. The output of the split, a string array, is then copied into the

ListBox.

8. The user clicks the “Sort” button to sort the contents of the

ListBox.

57
9. Once the ListBox is sorted, the user clicks on “Rewrite” button to

copy the contents back into the text file.

10. The Len box displays the number of characters in the text file while

the Num box displays the number of words in the text file.

4.3.3 Tab 3: Sort by Key Field

This simple example sorts a list of 10 names either by Name or by ID.

The example simply shows that objects with several fields can be sorted by

different sort keys.

The user selects the radio button by Name or by ID and the program sorts

the list box according to the selected choice.

58
Figure 4-3: Tab 3: Sort by Key Field

59
CHAPTER FIVE

CONCLUSION AND RECOMMENDATION

5.1 CONCLUSION

This project has been successfully implemented and concluded.

The choice of sorting algorithm ultimately depends on the program.

Suggestions and constructive criticism is highly welcomed for the

improvement of the program.

5.2 RECOMMENDATION

This application and procedures therein can be further expanded and used

for developing a password storage and encryption system.

60
REFERENCES

Black, Paul E. (13 November 2008). "array". Dictionary of Algorithms and


Data Structures. National Institute of Standards and Technology. Retrieved
2010-08-22

Bjoern Andres; Ullrich Koethe; Thorben Kroeger; Hamprecht (2010).


"Runtime-Flexible Multi-dimensional Arrays and Views for C++98 and
C++0x". arXiv:1008.2909 [cs.DS].

Charles E., Rivest, Ronald L., Stein, Clifford (2009) [1990]. Introduction to
Algorithms (3rd ed.). MIT Press and McGraw-Hill. ISBN 0-262-03384-4

David R. Richardson (2002), The Book on Data Structures. iUniverse, 112


pages. ISBN 0-595-24039-9, 9780595240395

Donald Knuth. The Art of Computer Programming, Volume 3: Sorting and


Searching, Third Edition. Addison-Wesley, 1997. ISBN 0-201-89685-0

Garcia, Ronald; Lumsdaine, Andrew (2005). "MultiArray: a C++ library for


generic programming with arrays". Software: Practice and Experience 35
(2): 159–188. doi:10.1002/spe.630. ISSN 0038-0644.Cormen, Thomas H.;
Leiserson

Insertion Sort & Shellsort, Andy Le, CS146 – Dr. Sin Min Lee, Spring 2004,

ISSort-AndyLe.ppt, Ret. March 01, 2012

61
LaMarca, A.; Ladner, R. E. (1997). "The influence of caches on the
performance of sorting". Proc. 8th Ann. ACM-SIAM Symp. on Discrete
Algorithms (SODA97): 370–379

Sedgewick, Robert (2003), "6.10 Key-Indexed Counting", Algorithms in Java,


Parts 1-4: Fundamentals, Data Structures, Sorting, and Searching (3rd ed.),
Addison-Wesley, pp. 312–314

Sorting and Searching Algorithms, Thomas Niemann, epaperpress.com

T. Veldhuizen. Arrays in Blitz++. In Proc. of the 2nd Int. Conf. on Scientific


Computing in Object-Oriented Parallel Environments (ISCOPE), LNCS
1505, pages 223-220. Springer, 1998

http://www.sorting-algorithms.com

WIKIPEDIA - http://en.wikipedia.org/wiki/Sorting_algorithm

http://en.wikipedia.org/wiki/Sorting

62
APPENDIX A

List of Sorting Algorithms


Exchange Sorts
1. Bubble sort: for each pair of indices, swap the items if out of order
2. Cocktail sort
3. Comb sort
4. Gnome sort
5. Odd-even sort
6. Quicksort: divide list into two, with all items on the first list coming
before all items on the second list; then sort the two lists. Often the
method of choice

Humorous or ineffective
7. Bogosort
8. Stooge sort

Hybrid
9. Introsort: begin with quicksort and switch to heapsort when the
recursion depth exceeds a certain level
10. JSort
11. Spread sort
12. Timsort: adaptive algorithm derived from merge sort and insertion
sort. Used in Python 2.3 and up, and Java SE 7.
13. UnShuffle sort

Insertion sorts
14. Cycle sort: in-place with theoretically optimal number of writes
15. Insertion sort: determine where the current item belongs in the list of
sorted ones, and insert it there
16. Library sort

63
17. Patience sorting
18. Shell sort: an attempt to improve insertion sort
19. Tree sort (binary tree sort): build binary tree, then traverse it to
create sorted list

Merge sorts
20. Merge sort: sort the first and second half of the list separately, then
merge the sorted lists
21. Polyphase merge sort
22. Strand sort

Distribution (Non-comparison) sorts


23. American flag sort
24. Bead sort
25. Bucket sort
26. Burstsort: build a compact, cache efficient burst trie and then traverse
it to create sorted output
27. Counting sort
28. Flashsort
29. Pigeonhole sort
30. Proxmap sort: variant of Bucket sort which takes advantage of
hierarchical structure
31. Radix sort: sorts strings letter by letter

Selection sorts
32. Cartesian Tree sort
33. Cycle sort
34. Heap sort: convert the list into a heap, keep removing the largest
element from the heap and adding it to the end of the list
35. Selection sort: pick the smallest of the remaining elements, add it to
the end of the sorted list

64
36. Smooth sort
37. Tournament sort

Concurrent sorts
38. Batch odd-even mergesort
39. Bitonic sorter
40. Pairwise sorting network

Other
41. Pancake sorting
42. Topological sort

Unknown class
43. Distribution sort
44. Jump sort
45. Samplesort
46. Shaker sort
47. Snake sort

65
APPENDIX B
Pseudocode Implementation of some Sorting Algorithms

(a) Bubble Sort

procedure bubbleSort( A : list of sortable items )


n = length(A)
repeat
swapped = false
for i = 1 to n-1 inclusive do
if A[i-1] > A[i] then
swap(A[i-1], A[i])
swapped = true
end if
end for
n = n - 1
until not swapped
end procedure

(b) Bucket Sort

function bucketSort(array, n) is
buckets ← new array of n empty lists
for i = 0 to (length(array)-1) do
insert array[i] into buckets[msbits(array[i], k)]
for i = 0 to n - 1 do
nextSort(buckets[i])
return the concatenation of buckets[0], ..., buckets[n-1]

66
(c) Counting Sort

''' allocate an array Count[0..k] ; initialize each array cell to zero


; THEN '''
for each input item x:
Count[key(x)] = Count[key(x)] + 1
total = 0
for i = 0, 1, ... k:
c = Count[i]
Count[i] = total
total = total + c

''' allocate an output array Output[0..n-1] ; THEN '''


for each input item x:
store x in Output[Count[key(x)]]
Count[key(x)] = Count[key(x)] + 1
return Output

67
(d) Merge Sort

function merge_sort(list m)
// if list size is 1, consider it sorted and return it
if length(m) <= 1
return m
// else list size is > 1, so split the list into two sublists
var list left, right
var integer middle = length(m) / 2
for each x in m up to middle
add x to left
for each x in m after or equal middle
add x to right
// recursively call merge_sort() to further split each sublist
// until sublist size is 1
left = merge_sort(left)
right = merge_sort(right)
// merge the sublists returned from prior calls to merge_sort()
// and return the resulting merged sublist
return merge(left, right)

68
APPENDIX C
Some Sorting Algorithms in Visual Basic

(a) Insert Sort


(b) QSort
(c) Quick Sort
(d) Shell Sort

= = = = = = = = = = = = = = = = = = = = = = = =
(A) insert sort
= = = = = = = = = = = = = = = = = = = = = = = =

Public Sub InsertSort(ByRef A() As Variant, _


ByVal Lb As Long, ByVal Ub As
Long)
Dim t As Variant
Dim i As Long
Dim j As Long

'sort A[Lb..Ub]
For i = Lb + 1 To Ub
t = A(i)

'shift elements down until insertion point


'found
For j = i - 1 To Lb Step -1
If A(j) <= t Then Exit For
A(j + 1) = A(j)
Next j
' insert
A(j + 1) = t
Next i
End Sub

69
= = = = = = = = = = = = = = = = = = = = = = = =
(B) QSort
= = = = = = = = = = = = = = = = = = = = = = = =

Public Sub QSort(ByRef A() As Variant, _


ByVal Lb As Long, ByVal Ub As
Long)

Dim lbStack(32) As Long


Dim ubStack(32) As Long
Dim sp As Long ' stack pointer
Dim lbx As Long ' current lower-
bound
Dim ubx As Long ' current upper-
bound
Dim m As Long
Dim p As Long ' index to pivot
Dim i As Long
Dim j As Long
Dim t As Variant ' temp used for exchanges

lbStack(0) = Lb
ubStack(0) = Ub
sp = 0
Do While sp >= 0
lbx = lbStack(sp)
ubx = ubStack(sp)

Do While (lbx < ubx)

' select pivot and exchange with 1st


' element
p = lbx + (ubx - lbx) \ 2

70
' exchange lbx, p
t = A(lbx)
A(lbx) = A(p)
A(p) = t

' partition into two segments


i = lbx + 1
j = ubx
Do
Do While i < j
If A(lbx) <= A(i) Then Exit Do
i = i + 1
Loop

Do While j >= i
If A(j) <= A(lbx) Then Exit Do
j = j - 1
Loop

If i >= j Then Exit Do

' exchange i, j
t = A(i)
A(i) = A(j)
A(j) = t

j = j - 1
i = i + 1
Loop

' pivot belongs in A[j]


' exchange lbx, j
t = A(lbx)
A(lbx) = A(j)
A(j) = t

71
m = j

' keep processing smallest segment, and


' stack largest
If m - lbx <= ubx - m Then
If m + 1 < ubx Then
lbStack(sp) = m + 1
ubStack(sp) = ubx
sp = sp + 1
End If
ubx = m - 1
Else
If m - 1 > lbx Then
lbStack(sp) = lbx
ubStack(sp) = m - 1
sp = sp + 1
End If
lbx = m + 1
End If
Loop
sp = sp - 1
Loop
End Sub

= = = = = = = = = = = = = = = = = = = = = = = =
(C) Quick Sort
= = = = = = = = = = = = = = = = = = = = = = = =

Public Sub QuickSort(ByRef A() As Variant, _


ByVal Lb As Long, ByVal Ub As
Long)
Dim m As Long

' sort array A(lb..ub)

Do While Lb < Ub

72
' quickly sort short lists
If (Ub - Lb <= 12) Then
Call InsertSort(A, Lb, Ub)
Exit Sub
End If

' partition into two segments


m = Partition(A, Lb, Ub)

' sort the smallest partition to minimize


' stack requirements
If m - Lb <= Ub - m Then
Call QuickSort(A, Lb, m - 1)
Lb = m + 1
Else
Call QuickSort(A, m + 1, Ub)
Ub = m - 1
End If
Loop
End Sub

---------

Private Function Partition(ByRef A() As Variant,_


ByVal Lb As Long, ByVal Ub As Long) _
As Long
Dim t As Variant
Dim pivot As Variant
Dim i As Long
Dim j As Long
Dim p As Long

' partition array[lb..ub]

' select pivot and exchange with 1st element


p = Lb + (Ub - Lb) \ 2

73
pivot = A(p)
A(p) = A(Lb)

' sort Lb+1 .. Ub based on pivot


i = Lb
j = Ub + 1
Do
Do
j = j - 1
While j > i And A(j) > pivot
Do
i = i + 1
While i < j And A(i) < pivot

If i >= j Then Exit Do

' swap A(i), A(j)


t = A(i)
A(i) = A(j)
A(j) = t
Loop

' pivot belongs in A(j)


A(Lb) = A(j)
A(j) = pivot
Partition = j
End Function

= = = = = = = = = = = = = = = = = = = = = = = =
(D) Shell Sort
= = = = = = = = = = = = = = = = = = = = = = = =

Public Sub ShellSort(ByRef A() As Variant, ByVal Lb


As Long, ByVal Ub As Long)

74
Dim n As Long
Dim h As Long
Dim i As Long
Dim j As Long
Dim s As Long
Dim t As Variant
Dim seq(28) As Long

' sort array[lb..ub]

s = ShellSeq(seq, Ub - Lb + 1)
Do While s >= 0

' sort by insertion in increments of h


h = seq(s)
For i = Lb + h To Ub
t = A(i)
For j = i - h To Lb Step -h
If A(j) <= t Then Exit For
A(j + h) = A(j)
Next j
A(j + h) = t
Next i
s = s - 1
Loop
End Sub

Private Function ShellSeq(ByRef h() As Long, ByVal


n As Long)

' establish increment sequence (see Knuth, Vol


3)

Dim p1 As Long
Dim p2 As Long

75
Dim p3 As Long
Dim s As Long

p1 = 1
p2 = 1
p3 = 1
s = -1

Do
s = s + 1
If s Mod 2 Then
h(s) = 8 * p1 - 6 * p2 + 1
Else
h(s) = 9 * p1 - 9 * p3 + 1
p2 = 2 * p2
p3 = 2 * p3
End If
p1 = 2 * p1
Loop While 3 * h(s) < n

If s > 0 Then s = s - 1
ShellSeq = s

End Function

76
APPENDIX D
PROGRAM SOURCE CODE
MICROSOFT VISUAL STUDIO 2010 ULTIMATE
.NET FRAMEWORK 4 BETA 2

Imports System.IO
Imports Microsoft.VisualBasic.FileSystem
Imports System.Windows.Forms.KeyEventArgs

Public Class insertsortform


Inherits System.Windows.Forms.Form

Public Sub sort(ByVal aa As Windows.Forms.ListBox, _


ByVal bb As Windows.Forms.ListBox)

Dim elem As Integer = aa.Items.Count - 1


Dim array1(elem) As Object
Dim nu As Integer

For nu = 0 To elem
array1(nu) = aa.Items.Item(nu)
Next

array1 = all.InsertSort(array1, LBound(array1), UBound(array1))

If bb.Items.Count > 0 Then


bb.Items.Clear()
End If

For nu = 0 To elem
bb.Items.Add(array1(nu))
Next

''

End Sub

#Region "Form Load"

Private m_Items(9) As CustomerInfo

Private Sub insertsortform_Load(ByVal sender As System.Object, _


ByVal e As System.EventArgs) Handles
MyBase.Load

77
m_Items(0) = New CustomerInfo("Archer", 4)
m_Items(1) = New CustomerInfo("Beck", 7)
m_Items(2) = New CustomerInfo("Cantu", 2)
m_Items(3) = New CustomerInfo("Deevers", 1)
m_Items(4) = New CustomerInfo("Edwards", 9)
m_Items(5) = New CustomerInfo("Finnagan", 3)
m_Items(6) = New CustomerInfo("Guy", 5)
m_Items(7) = New CustomerInfo("Hennesey", 8)
m_Items(8) = New CustomerInfo("Jacquinth", 6)
m_Items(9) = New CustomerInfo("Irving", 10)
'' Uncomment to test duplicate objects.
'm_Items(6) = New CustomerInfo("Finnagan", 3)
'm_Items(7) = New CustomerInfo("Edwards", 9)
'm_Items(8) = New CustomerInfo("Finnagan", 3)
'm_Items(9) = New CustomerInfo("Deevers", 1)

' Start sorting by name.


radByName.Checked = True

txt.Focus()

End Sub

' Holds data.


Private Class CustomerInfo
Public Name As String
Public ID As Integer ' We sort on this value.

Public Sub New(ByVal new_name As String, ByVal new_id As Integer)


Name = new_name
ID = new_id
End Sub

Public Overrides Function ToString() As String


Return ID.ToString() & ". " & Name
End Function
End Class

#End Region

#Region "Tab1 Basics"

Private Sub txt_GotFocus(ByVal sender As Object, _


ByVal e As System.EventArgs) Handles txt.GotFocus

If addBtn.Enabled = False Then


ListBox2.Items.Clear()
addBtn.Enabled = True
End If

78
End Sub

Private Sub txt_KeyDown(ByVal sender As Object, _


ByVal e As System.Windows.Forms.KeyEventArgs)
If e.KeyCode = Windows.Forms.Keys.Return Or e.KeyValue() = 13 Then
add()
End If
End Sub

Private Sub Label5_Click(ByVal sender As System.Object, _


ByVal e As System.EventArgs) Handles
Label5.Click
ListBox1.Items.Clear()
ListBox2.Items.Clear()
End Sub

Private Sub addBtn_Click(ByVal sender As System.Object, _


ByVal e As System.EventArgs) Handles
addBtn.Click
add()
End Sub

Sub add()
If addBtn.Enabled = True Then
Dim item As Single = Val(txt.Text)
If item <= 0 Then
txt.Clear()
Exit Sub
End If
ListBox1.Items.Add(item)
txt.Clear()
End If
txt.Focus()
End Sub

Private Sub Button5_Click(ByVal sender As System.Object, _


ByVal e As System.EventArgs) Handles Button5.Click

With ListBox2.Items

For i = 1 To 7
Dim oOK As Integer = .Count - 1
Dim c1 As Integer = 0

While oOK > c1

If .Item(c1) = .Item(c1 + 1) Then

79
.RemoveAt(c1)
oOK -= 1
End If
c1 += 1
End While
Next

End With

End Sub

Private Sub sortBtn_Click(ByVal sender As System.Object, _


ByVal e As System.EventArgs) Handles sortBtn.Click

sort(ListBox1, ListBox2)
addBtn.Enabled = False

End Sub

' to Remove individual items from listbox1


Private Sub ListBox1_KeyDown(ByVal sender As Object, ByVal e As
System.Windows.Forms.KeyEventArgs)
If e.KeyCode = Windows.Forms.Keys.Delete Then
ListBox1.Items.RemoveAt(ListBox1.SelectedIndex)
End If
End Sub

Private Sub clearBtn_Click(ByVal sender As System.Object, ByVal e As


System.EventArgs) Handles clearBtn.Click
ListBox1.Items.Clear()
End Sub

Private Sub Button1_Click(ByVal sender As System.Object, ByVal e As


System.EventArgs) Handles Button1.Click

Dim auto As Integer = Val(InputBox("Number of items to iterate


through. (1 - 262144)" & vbNewLine & "Dataset greater than 131072 may slow
down your PC (depending on your processor)", "Automate", 0))

If auto < 1 Or auto > 262144 Then


MsgBox("Input should be a positive Integer between 1 and 262144",
MsgBoxStyle.Critical, "Error!")
Exit Sub
End If

ListBox1.Items.Clear()
ListBox2.Items.Clear()
Dim tim = Timer

80
For autoGen = 1 To auto
ListBox1.Items.Add(Int((1000 * Rnd()) + 1))
Next

sort(ListBox1, ListBox2)

MsgBox("Dataset of " & auto & " items was sorted in " & (Timer - tim)
& " seconds!", MsgBoxStyle.Information, "TIME")
End Sub

#End Region

#Region "Tab3 Sort Text File"

Private Sub openSplitText_Click(ByVal sender As System.Object, _


ByVal e As System.EventArgs) Handles openSplitText.Click

Dim content As String

Try
content = FileIO.FileSystem.ReadAllText(dir.Text)
Catch
MsgBox("check your file dir")
dir.Focus()
Exit Sub
End Try

' while the text file is not empty,


Dim lent As Integer = content.Length
len.Text = lent

If lent < 1 Then


MsgBox("Empty File")
dir.Focus()
Exit Sub
End If

Dim delimit As String = delim.Text


Dim split As String()

Select Case delimit


Case "<any>"
split = content.Split(New [Char]() {" "c, ","c, ":"c, "/"c,
"*"c, CChar(vbTab), CChar(vbNewLine)})
Case "New Line"
split = content.Split(New [Char]() {CChar(vbNewLine)})
Case "Tab"
split = content.Split(New [Char]() {CChar(vbTab)})
Case "Space"

81
split = content.Split(New [Char]() {" "c})
Case "Comma"
split = content.Split(New [Char]() {","c})
Case "Underscore"
split = content.Split(New [Char]() {"_"c})
Case "@"
split = content.Split(New [Char]() {"@"c})
Case "/"
split = content.Split(New [Char]() {"/"c})
Case "\"
split = content.Split(New [Char]() {"\"c})
Case "#"
split = content.Split(New [Char]() {"#"c})
Case "$"
split = content.Split(New [Char]() {"$"c})
Case ":"
split = content.Split(New [Char]() {":"c})
Case "&"
split = content.Split(New [Char]() {"&"c})
Case "*"
split = content.Split(New [Char]() {"*"c})
Case "-"
split = content.Split(New [Char]() {"-"c})
Case "+"
split = content.Split(New [Char]() {"+"c})
Case "="
split = content.Split(New [Char]() {"="c})
Case Else
split = content.Split(New [Char]() {" "c, ","c, ":"c, "/"c,
CChar(vbTab), CChar(vbNewLine)})
End Select

Dim numb As Integer = split.GetLength(0)


num.Text = numb

ListBox3.Items.Clear()

For Each s As String In split


If s.Trim() <> "" Then
ListBox3.Items.Add(s)
End If
Next

End Sub

Private Sub sortText_Click(ByVal sender As System.Object, _


ByVal e As System.EventArgs) Handles sortText.Click
sort(ListBox3, ListBox3)

82
End Sub

Private Sub reWriteText_Click(ByVal sender As System.Object, _


ByVal e As System.EventArgs) Handles
reWriteText.Click
Dim a As String
Dim content As String

content = FileIO.FileSystem.ReadAllText(dir.Text)
FileIO.FileSystem.WriteAllText(dir.Text, "", False)

Try
For i = 0 To ListBox3.Items.Count - 1
a = ListBox3.Items.Item(i) & vbNewLine
FileIO.FileSystem.WriteAllText(dir.Text, a, True)
Next
Catch ex As Exception
FileIO.FileSystem.WriteAllText(dir.Text, content, False)
MsgBox("an error occured")
End Try

End Sub

#End Region

#Region "Tab5 Key Fields"

' Compares CustomerInfo objects for sorting.


Private Class CustomerInfoComparer
Implements System.Collections.IComparer

' Determines whether we compare by name or ID.


Public Enum CompareTypes
ByName
ById
End Enum

Public CompareType As CompareTypes = CompareTypes.ById

Public Sub New(ByVal compare_type As CompareTypes)


CompareType = compare_type
End Sub

' Compare two CustomerInfo objects.


Public Function Compare(ByVal x As Object, ByVal y As Object) _
As Integer Implements System.Collections.IComparer.Compare
Dim x_cust As CustomerInfo = DirectCast(x, CustomerInfo)
Dim y_cust As CustomerInfo = DirectCast(y, CustomerInfo)

83
Select Case CompareType
Case CompareTypes.ById
Return x_cust.ID.CompareTo(y_cust.ID)
Case CompareTypes.ByName
Return x_cust.Name.CompareTo(y_cust.Name)
End Select
End Function

End Class

' Sort by name.


Private Sub radByName_CheckedChanged(ByVal sender As System.Object, _
ByVal e As System.EventArgs) Handles
radByName.CheckedChanged
If radByName.Checked Then
SortBy(CustomerInfoComparer.CompareTypes.ByName)
End Sub

' Sort by ID.


Private Sub radById_CheckedChanged(ByVal sender As System.Object, _
ByVal e As System.EventArgs) Handles
radById.CheckedChanged
If radById.Checked Then SortBy(CustomerInfoComparer.CompareTypes.ById)
End Sub

' Sort by name or ID.


Private Sub SortBy(ByVal compare_type As
CustomerInfoComparer.CompareTypes)
' Sort.
Dim cust_comparer As New CustomerInfoComparer(compare_type)
Array.Sort(m_Items, cust_comparer)

' Display the items.


lstItems.Items.Clear()
For i As Integer = 0 To m_Items.Length - 1
lstItems.Items.Add(m_Items(i).ToString())
Next i
End Sub

#End Region

End Class

Module all

' insert sort


Public Function InsertSort(ByRef A() As Object, ByVal Lb As Long, _
ByVal Ub As Long)

84
Dim t As Object
Dim i As Long
Dim j As Long

' sort A[Lb..Ub]


For i = Lb + 1 To Ub
t = A(i)

' shift elements down until insertion point found


For j = i - 1 To Lb Step -1
If A(j) <= t Then Exit For
A(j + 1) = A(j)
Next j
' insert
A(j + 1) = t
Next i
InsertSort = A
End Function

End Module

85

You might also like