Data Structure and Algorithm
CS-102
CHETAN AGARWAL
1
Data Structure
vs
Storage Structure
• Data Structure : The logical or mathematical
model of a particular organization of data
• Storage Structure : Representation of a
particular data structure in the memory of a
computer
• There are many possible storage structure to a
particular data structure
• Ex: there are a number of storage structure for a
data structure such as array
– It is possible for two DS to represented by the
same storage structure
2
Classification
• Data Structure
– Linear
– Non-Linear
• A Data structure is said to be
linear if its elements form a
sequence or in other words form a
linear list
3
Representation in Memory
• Two basic representation of Linear
structure in memory
– Have a linear relationship between the
elements represented by
means of sequential memory
locations [ Arrays]
– Have the linear relationship between the
elements represented by
means of pointer or links [ Linked List]
4
Operation on Linear Structure
• Traversal : Processing each element in the
list
• Search : Finding the location of the element
with a given value or the record with a given
key
• Insertion: Adding a new element to the list
• Deletion: Removing an elements from the list
• Sorting : Arranging the elements in some
type of order
• Merging : Combining two list into a single list
5
Array
6
Linear Arrays
• A linear array is a list of a finite number
of n homogeneous data elements
( that is data elements of the same
type) such that
– The elements of the arrays are
referenced respectively by an index set
consisting of n consecutive numbers
– The elements of the arrays are stored
respectively in successive memory
locations
7
Linear Arrays
• The number n of elements is called the length
or size of the array.
• Usually, the index set consists of the integer
1, 2, … n
but in fact it may consist of any sequence of
integers.
• Length or the number of data elements of the
array can be obtained from the index set by
Length = UB – LB + 1 where UB is the largest
index called the upper bound and LB is the
smallest index called the lower bound of the
arrays
8
Linear Arrays
• Element of an array A may be denoted
by
– Subscript notation A1, A2, , …. , An
– Parenthesis notation A(1), A(2), …. ,
A(n)
– Bracket notation A[1], A[2], ….. , A[n]
• The number K in A[K] is called
subscript or an index and A[K] is called
a subscripted variable
9
Declaration of Array
• Each declaration of an array must
give, implicitly or explicitly
three items of information
(i) Name of Array
(ii) Data type of Array (iii) Index
set
Example :- in Fortran77 INTEGER
A()
10
Representation of Linear Array in
Memory
100
0
100
1
100
2
100
3
100 :
4
100 Computer Memory
5
11
Representation of Linear Array in
Memory
• Let LA be a linear array in the memory of the
computer
• LOC(LA[K]) = address of the element
LA[K] of the array LA
• The element of LA are stored in the
successive memory cells
• Computer does not need to keep track of the
address of every element of LA, but need
to track only the address of the first element
of the array denoted by Base(LA) called the
base address of LA
12
Representation of Linear Array in
Memory
• LOC(LA[K]) = Base(LA) + w(K – lower
bound) where w is the number of
words per memory cell of the array
LA [w is the size of the data type]
13
Example 1
200 LA[1]
Find the address for LA[6] 201 LA[2]
Each element of the array
202 LA[3]
occupy 1 byte
203 LA[4]
204 LA[5]
205 LA[6]
206 LA[7]
207 LA[8]
LOC(LA[K]) = Base(LA) + w(K – lower :
bound)
LOC(LA[6]) = 200 + 1(6 – 1) = 205
14
Example 2
200
Find the address for LA[16] LA[1]
201
Each element of the array
202
occupy 2 byte LA[2]
203
204
LA[3]
205
206
LA[4]
207
LOC(LA[K]) = Base(LA) + w(K – lower :
bound)
LOC(LA[16]) = 200 + 2(16 – 1) = 230
15
Representation of Linear Array in
Memory
• Given any value of K, time to calculate
LOC(LA[K]) is same
• Given any subscript K one can access
and locate the content of LA[K] without
scanning any other element of LA
• A collection A of data element is said to
be index if any element of A called Ak
can be located and processed in time
that is independent of K
16
Traversing Linear Arrays
• Let A be a collection of data elements
stored in memory of the computer.
• Suppose we want to print the contents
of each element of A or suppose we
want to count the number of elements
of A with a given property.
• This can be accomplished by
accessing and processing each element
of A exactly once
• Linear Array
•••
17
Traversing Linear Arrays
• Traversing is accessing and
processing (aka visiting ) each
element of the data structure exactly
ones
Linear Array
•••
18
Traversing Linear Arrays
• Traversing is accessing and
processing (aka visiting ) each
element of the data structure exactly
ones
Linear Array
•••
19
Traversing Linear Arrays
• Traversing is accessing and
processing (aka visiting ) each
element of the data structure exactly
ones
Linear Array
•••
20
Traversing Linear Arrays
• Traversing is accessing and
processing (aka visiting ) each
element of the data structure exactly
ones
Linear Array
•••
21
Traversing Linear Arrays
• Traversing is accessing and
processing (aka visiting ) each
element of the data structure exactly
ones
Linear Array
•••
22
Traversing Linear Arrays
• Traversing is accessing and
processing (aka visiting ) each
element of the data structure exactly
ones
Linear Array
•••
1. Repeat for K = LB to UB
Apply PROCESS to LA[K]
[End of Loop]
2. Exit 23
Inserting and Deleting
• Insertion: Adding an element
– Beginning
– Middle
– End
• Deletion: Removing an element
– Beginning
– Middle
– End
24
Insertion
1 Brown 1 Brown
2 Davis 2 Davis
3 Johnson 3 Johnson
4 Smith 4 Smith
5 Wagner 5 Wagner
6 6 Ford
7 7
8 8
Insert Ford at the End of Array
25
Insertion
1 Brown 1 Brown 1 Brown 1 Brown
2 Davis 2 Davis 2 Davis 2 Davis
3 Johnson 3 Johnson 3 Johnson 3 Ford
4 Smith 4 Smith 4 4 Johnson
5 Wagner 5 5 Smith 5 Smith
6 6 Wagner 6 Wagner 6 Wagner
7 7 7 7
8 8 8 8
Insert Ford as the 3rd Element of Array
Insertion is not Possible without
loss of data if the array is FULL
26
Deletion
1 Brown 1 Brown
2 Davis 2 Davis
3 Ford 3 Ford
4 Johnson 4 Johnson
5 Smith 5 Smith
6 Taylor 6 Taylor
7 Wagner 7
8 8
Deletion of Wagner at the End of Array
27
Deletion
1 Brown 1 Brown 1 Brown
1 Brown
2 Davis 2 2 Ford
2 Ford
3 Ford 3 Ford 3
3 Johnson
4 Johnson 4 Johnson 4 Johnson
4
5 Smith 5 Smith 5 Smith
5 Smith
6 Taylor 6 Taylor 6 Taylor
6 Taylor
7 Wagner 7 Wagner 7 Wagner
7 Wagner
8 8 8
8
Deletion of Davis from the Array
28
Deletion
1 Brown
2 Ford
3 Johnson
4 Smith
5 Taylor
6 Wagner
7
8
No data item can be deleted from an
empty array
29
Insertion Algorithm
• INSERT (LA, N , K , ITEM) [LA is a
linear array with N elements and K is a positive
integers such that K ≤ N. This algorithm insert an
element ITEM into the Kth position in LA ]
1. [Initialize Counter] Set J := N
2. Repeat Steps 3 and 4 while J ≥ K
3. [Move the Jth element downward ]
Set LA[J + 1] := LA[J]
4. [Decrease Counter] Set J := J -1
5 [Insert Element] Set LA[K] := ITEM
6. [Reset N] Set N := N +1;
7. Exit
30
Deletion Algorithm
• DELETE (LA, N , K , ITEM) [LA is a
linear array with N elements and K is a positive
integers such that K ≤ N. This algorithm deletes
Kth element from LA ]
1. Set ITEM := LA[K]
2. Repeat for J = K to N -1:
[Move the J + 1st element upward] Set LA[J]
:= LA[J + 1]
3. [Reset the number N of elements] Set N :=
N - 1;
4. Exit
31
Searching of an element in Linear
Array
•Let DATA be a collection of N data elements in
memory i.e. (DATA is a Linear Array stored in
memory)
• And suppose a specific ITEM of information is given.
• Searching refers to the operation of
-- Finding the LOC of ITEM in DATA,
-- or printing some message that ITEM does not
appear there.
•The Search is said to be
--successful if ITEM does appear in DATA
--and unsuccessful otherwise.
32
Searching Algorithm
• SearchLinear(DATA,N,K,ITEM,Loc) [Here
DATA is a linear array with N elements and ITEM is a given
item of information. This algorithm finds the Loc of ITEM in
DATA, or sets LOC = 0 if the search is unsuccessful.]
1. [Insert ITEM at the end of DATA ]
DATA[N+1] = ITEM
2. [Initialize counter] set LOC = 1
3. [Search for ITEM]
Repeat while DATA[LOC] =! ITEM
Set LOC = LOC + 1
[End of Loop]
4. [Successful?]
If LOC = N +1, then , Set LOC = 0
5 Exit
Complexity of Linear search in worst case is f(n) = n + 1
33
Implementation of Linear Search
with Example
Consider the array NAME with n = 6
elements
Witnes Gidieo Shilin
Samwell Allen Kasim
s n g
1 2 3 4 5 6 7 8
(Though there are 8 places in the array,
right now only 6 elements are there,
and 2 places are vacate.)
34
Searching an non existing
element
Suppose, we want to know whether ”Japhet”
appears in the array, and, if so, then where.
Algorithm temporarily places ”Japhet” at the end of
the array, as pictured in below fig. , by setting
NAME[7] = ”Japhet”
Witnes Gidieo Shilin
Samwell Allen Kasim Japhet
s n g
1 2 3 4 5 6 7
8
Then, algorithm searches the array from top to
bottom. Since, ”Japhet” first appears in
NAME[N+1], ”Japhet” is not in the original Array.
35
Searching an existing element
Suppose, we want to know whether ”Shilong”
appears in the array, and, if so, then where.
Algorithm temporarily places ”Shilong” at the end
of the array, as pictured in below fig. , by setting
NAME[8] = ”Shilong”
Witnes Gidieo Shilon
Samwell Allen Kasim Japhet Shilong
s n g
1 2 3 4 5 6 7
8
Then, algorithm searches the array from top to bottom.
Since, ”Shilong” first appears in NAME[6], (where 6 =<
N), ”Shilong” is in the original Array.
36
Witnes Gidieo Shilin
Samwell Allen Kasim Japhet Stephen
s n g
1 2 3 4 5 6 7
8
Witnes Gidieo Shilin
Samwell Allen Kasim Japhet Stephen
s n g
1 2 3 4 5 6 7
8
Witnes Gidieo Shilin
Samwell Allen Kasim Japhet Stephen
s n g
1 2 3 4 5 6 7
8
37
Multidimensional Array
• One-Dimensional Array
• Two-Dimensional Array
• Three-Dimensional Array
• Some programming Language allows
as many as 7 dimension
38
Two-Dimensional Array
• A Two-Dimensional m x n array A is a
collection of m . n data elements
such that each element is specified by
a pair of integer (such as J, K) called
subscript with property that
1 ≤ J ≤ m and 1 ≤ K ≤ n
The element of A with first subscript J
and second subscript K will be
denoted by AJ,K or A[J][K]
39
2D Arrays Representation
The elements of a 2-dimensional array a
is shown as below
a[0][0] a[0][1] a[0][2] a[0][3]
a[1][0] a[1][1] a[1][2] a[1][3]
a[2][0] a[2][1] a[2][2] a[2][3]
Rows Of A 2D Array
a[0][0] a[0][1] a[0][2] a[0][3] row 0
a[1][0] a[1][1] a[1][2] a[1][3] row 1
a[2][0] a[2][1] a[2][2] a[2][3] row 2
Columns Of A 2D Array
a[0][0] a[0][1] a[0][2] a[0][3]
a[1][0] a[1][1] a[1][2] a[1][3]
a[2][0] a[2][1] a[2][2] a[2][3]
column 0 column column column
1 2 3
2D Array
• Let A be a two-dimensional array m x n
• The array A will be represented in the
memory by a block of m x n sequential
memory location
• Usually, the first dimension of A contains the
index set 1, 2, 3,……m,
with lower bound 1 and upper bound M;
and the second dimension contains
the index set 1,2,3,……n, with lower
bound 1 and upper bound n.
43
• If the index set of a dimension has different Lower
bound and upper bound (i.e. index set does not
start with 1 and last index is not m or n(as the
dimension shown))
• The length of a dimension is the number of
integers in its index set, and can be computed as
follow :-
• Lengthi = Upper Boundi - Lower Bound i + 1
• i.e 1 <= i <= 2
44
Storing 2 D array in Programming
Language
Programming language will store array
A either
Column by Column (Called Column-
Major Order) Ex: Fortran, MATLAB
Row by Row (Called Row-Major Order)
Ex: C, C++ , Java
45
2D Array in Memory
A Subscript A Subscript
(1,1) (1,1)
(2,1) (1,2)
Column Row 1
(3,1)1 (1,3)
(1,2) (1,4)
(2,2) Column (2,1)
(3,2) 2 (2,2)
(1,3) Row 2
(2,3)
(2,3) Column (2,4)
3
(3,3) (3,1)
(1,4) (3,2) Row 3
(2,4) Column (3,3)
4
(3,4) (3,4)
Column-Major Order Row-Major Order 46
Address calculation of an element
in 2D Array
• LOC(LA[K]) = Base(LA) + w(K -1)
• LOC(A[J,K]) of A[J,K]
Column-Major Order
LOC(A[J,K]) = Base(A) + w[m(K-1) + (J-
1)]
Row-Major Order
LOC(A[J,K]) = Base(A) + w[n(J-1) + (K-
1)]
47
2D Array Example
• Consider a 25 x 4 array A. Suppose the
Base(A) = 200 and w =4. Suppose the
programming store 2D array using row-
major. Compute LOC(A[12,3])
• LOC(A[J,K]) = Base(A) + w[n(J-1) + (K-1)]
• LOC(A[12,3]) = 200 + 4[4(12-1) + (3 -1)]
= 384
48
Multidimensional Array
• An n-dimensional m1 x m2 x …. X mn array
B is a collection of m1.m2…mn data
elements in which each element is specified
by a list of n integers – such as K1, K2, …., Kn
– called subscript with the property that
1≤K1≤m1, 1≤K2≤m2, …. 1≤Kn≤mn
The Element B with subscript K1, K2, …,Kn will
be denoted by
BK1,K2, …,Kn or B[K1,K2,….,Kn]
49
Multidimensional Array
• Let C be a n-dimensional array
• Length Li of dimension i of C is the
number of elements in the index set
Li = UB – LB + 1
• For a given subscript Ki, the
effective index Ei of Li is the number
of indices preceding Ki in the index
set
Ei = Ki – LB 50
Multidimensional Array
• Address LOC(C[K1,K2, …., Kn]) of an
arbitrary element of C can be obtained
as
Column-Major Order
Base( C) + w[((( … (ENLN-1 + EN-
1)LN-2) + ….. +E3)L2+E2)L1+E1]
Row-Major Order
Base( C) + w[(… ((E1L2 + E2)L3 +
E3)L4 + ….. +EN-1)LN +EN]
51
Example
• MAZE(2:8, -4:1, 6:10)
• Calculate the address of MAZE[5,-
1,8]
• Given: Base(MAZE) = 200, w = 4,
MAZE is stored in Row-Major order
• L1 = 8-2+1 = 7, L2 = 6, L3 = 5
• E1 = 5 -2 = 3, E2 = 3, E3 = 2
52
Example Contd ..
• Base( C) + w[(… ((E1L2 + E2)L3 + E3)L4
+ ….. +EN-1)LN +EN]
• E1L2 = 3 . 6 = 18
• E1L2 + E2 = 18 + 3 = 21
• (E1L2 + E2)L3 = 21 . 5 = 105
• (E1L2+E2)L3 + E3 = 105 + 2 = 107
• MAZE[5,-1,8] = 200 + 4(107) = 200
+ 248 = 628
53
Pointer, Pointer Array
• Let DATA be any array
• A variable P is called a pointer if P
points to an element in DATA i.e if P
contains the address of an element
in DATA
• An array PTR is called a pointer
array if each element of PTR is a
pointer
54
Group Group Group Group 4
1 2 3
Evan Conrad Davis Baker A Company
Harris Felt Segal Cooper wants to keep
Lewis Glass Ford
the records of
Shaw Hill Gray
Penn Jones
all employees
Silver Reed according to
Troy their
Wagner departments as
King shown in table.
55
One way to store these records is to declare
a Two Dimensional 4x9 or 9x4 array
• Advantage- Easy access to each group
• Disadvantage
- Waste of space
-Insertion and deletion of the new
employees
-- What if employees increases in a
department than 10
56
Group Group Group Group 4
1 2 3
Evan Conrad Davis Baker
Harris Felt Segal Cooper
Lewis Glass Ford
Shaw Hill Gray
Penn Jones
Silver Reed
Troy
Wagner
King
57
1 Evan
2 Harris
3 Lewis Group 1
4 Shaw
5 Conrad
: : Group 2
13 Wagner
14 Davis
Group 3
15 Segal
16 Baker
: : Group 4
21 Reed
58
1 Evan
: : Group 1
4 Shaw
5 $$$
6 Conrad Group are not
: : Group 2index in this
representation
14 Wagner
15 $$$
16 Davis Group 3
17 Segal
18 $$$
19 Baker
: : Group 4
24 Reed 59
1 Evan
2 Harris Group 1
Grou 3 Lewis
p
1 1 4 Shaw
2 5
5 Conrad
3 14
: : Group 2
4 16
5 22 1 Wagner
3
1 Davis Group 3
4
1 Segal
5 Group 4
1 Baker
6
: : 60