Unit I

A data structure is a special way of organizing and storing data in a computer so
that it can be used efficiently. Array, Linked List, Stack, Queue, Tree, Graph, etc. are
all data structures that store data in a particular way so that we can access and
use the data efficiently. Each of these data structures has its own
way of organizing data, so we choose the data structure based on the
requirement. In short, a data structure is a collection of data elements together with the
functions that manipulate them.

Classification of data structures

Data structures are basically classified into two categories:

 Primitive data type
 Non-primitive data type
But data structures are often classified by their data-storage characteristics.
Possible characteristics are:
1. Linear or non-linear: This characteristic describes whether the data items are
arranged in an ordered sequence, as in an array, or in an unordered
arrangement, as in a graph.
2. Homogeneous or non-homogeneous: This characteristic describes whether all
data items in a given repository are of the same data type or of various types. For
example, an array stores elements of the same data type contiguously, whereas a structure
is a collection of members of different data types.
3. Static or dynamic: This characteristic describes whether the structure is fixed at compile
time. Static data structures have fixed sizes, structures and memory locations at
compile time. Dynamic data structures have sizes, structures and memory locations
that can shrink or expand depending on the use.
Common linear data structures:

 Array: An array is a data structure that stores a sequence of elements of the same type.
Arrays store their data contiguously in memory, but they can also be used to represent
relationships between pieces of data.

 Stacks: A stack is a data structure that stores a sequence of elements, with the most
recently added element at the top of the stack.

A stack is a data structure that organizes several pieces of information in the order in
which operations are applied to them. The order is Last In, First Out (LIFO), which is
the same as First In, Last Out (FILO).

 Queue: A queue is a data structure that stores a sequence of elements, with the most
recently added element at the back of the queue.

A queue is a data structure that organizes several pieces of information in the order in
which operations are applied to them. The order is First In, First Out (FIFO), which is
the same as Last In, Last Out (LILO).

 Linked list: A linked list (also known as a list) is a linear collection of data elements
of any kind, called nodes, with each node having a value and a link to the next node in the
chain.
There are different kinds of linked lists, such as the singly-linked list, circular linked list,
and doubly-linked list.
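As an illustration, a singly-linked list node could be declared in C as follows (a minimal sketch; the struct and function names are illustrative, not part of any standard library):

#include <stdlib.h>

/* A node of a singly-linked list: holds one value and a link to the next node. */
struct Node {
    int data;          /* the value stored in this node */
    struct Node *next; /* pointer to the next node, NULL at the end of the chain */
};

/* Create a new node holding the given value; returns NULL if allocation fails. */
struct Node *createNode(int value)
{
    struct Node *n = malloc(sizeof(struct Node));
    if (n != NULL) {
        n->data = value;
        n->next = NULL;
    }
    return n;
}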

The nonlinear data structures:

 Graph: A graph is a data structure representing relationships between pieces of data.
Graphs often represent social networks and roadmaps. A graph G(V, E) is a set of
vertices V and a set of edges E. An edge connects a pair of vertices and may have a
weight such as a length or cost. Vertices of the graph are drawn as points or circles and
edges are drawn as arcs or line segments.

 Tree: A tree data structure stores a hierarchy of information. Trees are commonly
used to represent the structure of XML documents and file systems.
Many different tree types exist, including binary trees, B-trees, and AVL trees.
‍Comparison of Linear and Non-Linear Data Structures

1. In a linear data structure, data elements are arranged in a linear order where each
element is attached to its previous and next adjacent elements, whereas in a non-linear
data structure data elements are attached in a hierarchical manner.
2. In a linear data structure a single level is involved, whereas in a non-linear data
structure multiple levels are involved.
3. The implementation of a linear data structure is easy in comparison to a non-linear
data structure, while the implementation of a non-linear data structure is more complex.
4. In a linear data structure, data elements can be traversed in a single run, while in a
non-linear data structure they cannot be traversed in a single run.
5. In a linear data structure, memory is not utilized in an efficient way, while in a
non-linear data structure memory is utilized efficiently.
6. Examples of linear data structures are arrays, stacks, queues, linked lists, etc., while
examples of non-linear data structures are trees and graphs.
7. Applications of linear data structures are mainly in application software development,
while applications of non-linear data structures are in Artificial Intelligence and image
processing.
8. Linear data structures are useful for simple data storage and manipulation, while
non-linear data structures are useful for representing complex relationships and data
hierarchies, such as in social networks, file systems, or computer networks.
9. For linear data structures, performance is usually good for simple operations like adding
or removing at the ends, but slower for operations like searching or removing elements in
the middle. For non-linear data structures, performance can vary depending on the structure
and the operation, but can be optimized for specific operations.

Basic Operations
The data in data structures are processed by certain operations. The particular
data structure chosen largely depends on the frequency of the operations that need
to be performed on it.
 CREATE: Reserves memory for the program's data elements.
 DESTROY: Releases the memory space allocated for the specified data structure.
The malloc() and free() functions of the C language are used for this (see the sketch after this list).
 SELECTION: Accesses a particular data item within a data structure.
 UPDATE: Updates or modifies the data in the data structure.
 SEARCHING: Finds the presence of a desired data item in the list of data items.
 SORTING: Arranges the data items of a data structure in a particular order, either
ascending or descending.
 MERGING: Combines the data items of two different sorted lists into a single sorted
list.
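As a minimal sketch of the CREATE and DESTROY operations, the following C fragment reserves space for ten integers with malloc() and releases it with free() (the variable names are illustrative):

#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    int n = 10;
    int *arr = malloc(n * sizeof(int)); /* CREATE: reserve memory for n integers */
    if (arr == NULL) {
        printf("Memory allocation failed\n");
        return 1;
    }

    for (int i = 0; i < n; i++) /* use the data structure */
        arr[i] = i * i;

    free(arr);                  /* DESTROY: release the memory when done */
    return 0;
}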

Advantages of DS
We need data structures because there are several advantages to using them; a few
of them are as follows:
1. Data Organization: We need a proper way of organizing data so that it
can be accessed efficiently when we need it. Data structures provide
different ways of organizing data, so we have the option of storing the data in
different data structures based on the requirement.
2. Efficiency: The main reason we organize data is to improve
efficiency. If we can store data in arrays, why do we need linked lists
and other data structures? Because when we need to perform
operations such as add, delete, update and search, arrays can take more
time than some of the other data structures. So the reason we are
interested in other data structures is efficiency.

Arrays

C provides a data structure, the array, which stores a fixed-size sequential
collection of elements of the same type. An array is used to store a collection of
data, but it is often more useful to think of an array as a collection of variables of the
same type.
Properties:
 All elements of an array must be of the same data type.
 All elements of an array are stored in contiguous memory locations, meaning they
occupy a block of memory next to each other. This allows for efficient random access
and memory management.
 The size of an array is fixed at the time of declaration and cannot be changed during
the program's execution. This means you need to know the maximum number of
elements you need beforehand.
 Each element in an array is identified by an index starting at 0. The lower bound of
an array is the index of its first element, which is always 0 in C. The last element of the
array has size - 1 as its index.
 The name of an array is equivalent to a constant pointer to its first element.
 There is no bounds-checking for arrays in C.
 Since an array stores elements of the same type, the total memory it occupies
depends on the element data type and the number of elements.
Declaring Arrays
To declare an array in C, the programmer specifies the type of the elements and
the number of elements required by the array as follows:

type arrayName[arraySize];

This is called a single-dimension array. The arraySize must be an integer constant
greater than zero and type can be any valid C data type. For example, to declare
a 5-element array called Arr of type int, use this statement:
int Arr[5];

Array Initialization
Initialization in C is the process of assigning an initial value to a variable. When an array is
declared without an initializer, its elements may contain garbage (indeterminate) values, so we
need to initialize the array to some meaningful values.
1. Array Initialization with Declaration
In this method, we initialize the array along with its declaration. We use an initializer list to
initialize multiple elements of the array. An initializer list is a list of values enclosed within
braces { } and separated by commas.

data_type array_name [size] = {value1, value2, ... valueN};

2. Array Initialization with Declaration without Size
If we initialize an array using an initializer list, we can skip declaring the size of the array:
the compiler automatically deduces the size as the number of elements present in the
initializer list.
data_type array_name[] = {1, 2, 3, 4, 5};
The size of the above array is 5, which is automatically deduced by the compiler.
3. Array Initialization after Declaration (Using Loops)
We initialize the array after the declaration by assigning the initial value to each element
individually. We can use for loop, while loop, or do-while loop to assign the value to each
element of the array.
for (int i = 0; i < N; i++) {
    array_name[i] = value;
}
Accessing 1D Array

Individual elements of the array can be accessed using the following syntax:
array_name[index or subscript];

e.g.: arr[1] = 15;  // assigns the value 15 to the second location (index 1) of the array

scanf("%d", &arr[1]);
// reads a value from the keyboard and assigns it to the second location of the array

C program to read and write elements into the array


#include <stdio.h>

// Main function
int main()
{
    int arr[10]; // Declare an array of size 10 to store integer values
    int i;

    // Prompt the user to input 10 elements into the array
    printf("Input 10 elements in the array :\n");
    for (i = 0; i < 10; i++)
    {
        scanf("%d", &arr[i]); // Read the input and store it in the array
    }

    // Display the elements in the array
    printf("\nElements in array are: ");
    for (i = 0; i < 10; i++)
    {
        printf("%d ", arr[i]); // Print each element in the array
    }
    printf("\n");
    return 0;
}

Implementation Of 1D Array In Memory

The one-dimensional array is the simplest form of a linear array. The array is represented
by its name, and its elements are referred to by the subscripts or indices of the array. A
one-dimensional array is used to store a large number of items in memory and references all
items in a uniform manner. Now consider a linear array arr in the memory of the computer. As
we know, the memory of the computer is simply a sequence of addressed locations.

The address of a particular element in a one-dimensional array is given by the relation

Address of element a[k] = B + W * k

where B is the base address of the array, W is the size of each element of the array, and k is
the index of the required element, which is an integer quantity.

Eg:
Let the base address of the array be 2000 and let each element occupy four
bytes in memory; then the address of the element a[5] of a 1D array a[10] is given as:
Address of element a[5] = 2000 + 4 * 5
                        = 2000 + 20
                        = 2020
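The relation can be checked in C by comparing the address the compiler produces for a[k] with the address computed by the formula (a small illustrative sketch, not part of the syllabus text):

#include <stdio.h>

int main(void)
{
    int a[10];
    int k = 5;

    /* B + W * k: base address plus element size times index */
    int *computed = (int *)((char *)a + sizeof(int) * k);
    int *actual = &a[k]; /* the address the compiler computes for a[k] */

    printf("computed = %p, actual = %p\n", (void *)computed, (void *)actual);
    return 0;
}

The two printed addresses are identical, confirming the relation B + W * k.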

Insertion in 1D Array

Insertion of a new element into an array can be done in two ways:

1) Insertion at the end of the array
Inserting an element at the end of an array involves adding the new element at
the last available index of the array.
2) Insertion at a required position
The elements from that position onwards must be shifted one place toward the end of the
array to create room for the new element (see the sketch below).
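A minimal sketch of insertion at a required position, assuming the array has spare capacity and n holds the current number of elements (the function name insertAt is illustrative):

/* Insert 'value' at index 'pos' in arr[0..n-1]; returns the new element count.
   Assumes the array has room for at least n + 1 elements and 0 <= pos <= n. */
int insertAt(int arr[], int n, int pos, int value)
{
    for (int i = n; i > pos; i--)   /* shift elements one place toward the end */
        arr[i] = arr[i - 1];
    arr[pos] = value;               /* place the new element in the freed slot */
    return n + 1;
}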

Deletion Of An Element From 1D Array


Deleting an element at the end of an array presents no difficulty, but deleting an element
from the middle of the array requires shifting all the following elements to fill the space
emptied by the deleted element (see the sketch below).
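A corresponding sketch of deleting the element at a given position (again, n is the current element count and the function name is illustrative):

/* Delete the element at index 'pos' from arr[0..n-1]; returns the new count.
   Assumes 0 <= pos < n. */
int deleteAt(int arr[], int n, int pos)
{
    for (int i = pos; i < n - 1; i++)   /* shift later elements one place left */
        arr[i] = arr[i + 1];
    return n - 1;
}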

Traversing Array

Traversing is the process of visiting (accessing) each element of the array exactly once, usually from the first element to the last.

Merging Two Arrays

Merging means combining the elements of two arrays to form a new array. The simplest way of
merging two arrays is to first copy all elements of one array into a third, empty array, and
then copy all the elements of the other array into that third array. The elements of the second
array are copied into the third array after the position at which the last element of the first
array was copied. If you want the result to be sorted, you can sort the resultant array. Another
way is to sort the arrays during merging. This technique, i.e., sorting while merging, is more
efficient, since the other technique requires sorting of the resultant array even if the two
source arrays are sorted beforehand.

Algorithm
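One possible sketch of the sort-while-merging technique described above, assuming two already sorted input arrays a (of size m) and b (of size n) and a result array c with room for m + n elements (all names are illustrative):

/* Merge sorted arrays a[0..m-1] and b[0..n-1] into c[0..m+n-1]. */
void mergeSorted(const int a[], int m, const int b[], int n, int c[])
{
    int i = 0, j = 0, k = 0;
    while (i < m && j < n)              /* copy the smaller front element each time */
        c[k++] = (a[i] <= b[j]) ? a[i++] : b[j++];
    while (i < m)                       /* copy any remaining elements of a */
        c[k++] = a[i++];
    while (j < n)                       /* copy any remaining elements of b */
        c[k++] = b[j++];
}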

Searching

Array searching can be defined as the operation of finding a particular element or a group of
elements in the array.

1. Linear Search
Linear search is a simple searching algorithm used to find a particular value in a list or
array. It works by checking each element in the list one by one, starting from the first
element, until the desired element is found or the entire list has been traversed.
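A minimal linear search sketch in C, returning the index of the key or -1 if it is not present (the function name is illustrative):

/* Return the index of 'key' in arr[0..n-1], or -1 if it is not found. */
int linearSearch(const int arr[], int n, int key)
{
    for (int i = 0; i < n; i++) {
        if (arr[i] == key)  /* compare each element with the key in turn */
            return i;
    }
    return -1;              /* the whole array was traversed without a match */
}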

2. Binary Search
Binary search works on a sorted collection. It looks for a particular key value by comparing it
with the middle-most item of the collection. If a match occurs, the index of that item is
returned. If the middle item has a value greater than the key value, the left sub-array of the
middle item is searched; otherwise, the right sub-array is searched.

The algorithm follows the procedure below −


Step 1 − Select the middle item in the array and compare it with the key value to be
searched. If it is matched, return the position of the median.
Step 2 − If it does not match the key value, check if the key value is either greater
than or less than the median value.
Step 3 − If the key is greater, perform the search in the right sub-array; but if the key
is lower than the median value, perform the search in the left sub-array.
Step 4 − Repeat Steps 1, 2 and 3 iteratively until the key is found or the sub-array becomes empty.
Step 5 − If the key value does not exist in the array, the algorithm reports an
unsuccessful search.

Example
Consider the following array and search element to understand the binary search
technique.
Array considered: 9 17 25 34 49
Element to be searched: 34
Step 1: Start the Binary Search algorithm by using the formula middle = (left +
right )/2 Here, left = 0 and right = 4. So the middle will be 2. This means 25 is the
middle element of the array.
Step 2: Now, you have to compare 25 with our search element i.e. 34. Since 25 < 34,
left = middle + 1 and right = 4.
Step 3: So, the new middle = (3 + 4) / 2, which is 3.5, truncated to 3.
Step 4: Now, if you observe, the element at the new middle index equals the element to be
searched. This implies that the element is found at a[3].
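An iterative binary search sketch in C following the steps above (the array must be sorted in ascending order; the function name is illustrative):

/* Return the index of 'key' in the sorted array arr[0..n-1], or -1 if absent. */
int binarySearch(const int arr[], int n, int key)
{
    int left = 0, right = n - 1;
    while (left <= right) {
        int middle = (left + right) / 2;   /* middle = (left + right) / 2 */
        if (arr[middle] == key)
            return middle;                 /* match: return the position */
        else if (arr[middle] < key)
            left = middle + 1;             /* key is larger: search the right sub-array */
        else
            right = middle - 1;            /* key is smaller: search the left sub-array */
    }
    return -1;                             /* unsuccessful search */
}

For the array {9, 17, 25, 34, 49} and key 34, this returns index 3, matching the worked example.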

Sorting
Sorting is a fundamental operation in data structures and algorithms, which involves
arranging a collection of elements in a particular order. Sorting can be done in
ascending or descending order, depending on the context.
1. Bubble Sort
It works by repeatedly stepping through the list, comparing adjacent elements, and
swapping them if they are in the wrong order. This process is repeated until the
list is sorted.
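A straightforward bubble sort sketch in C for ascending order (the function name is illustrative):

/* Sort arr[0..n-1] in ascending order using bubble sort. */
void bubbleSort(int arr[], int n)
{
    for (int i = 0; i < n - 1; i++) {       /* after pass i, the last i+1 elements are in place */
        for (int j = 0; j < n - 1 - i; j++) {
            if (arr[j] > arr[j + 1]) {      /* adjacent elements out of order: swap them */
                int temp = arr[j];
                arr[j] = arr[j + 1];
                arr[j + 1] = temp;
            }
        }
    }
}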

2. Selection Sort
Selection sort works by repeatedly selecting the smallest element from the unsorted portion
of the list and swapping it into place at the end of the sorted portion.
3. Insertion Sort
Insertion sort is a simple sorting algorithm that works by iteratively inserting each
element of an unsorted list into its correct position in a sorted portion of the list.
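A minimal insertion sort sketch in C following the description above (the function name is illustrative):

/* Sort arr[0..n-1] in ascending order using insertion sort. */
void insertionSort(int arr[], int n)
{
    for (int i = 1; i < n; i++) {
        int key = arr[i];                /* element to insert into the sorted part */
        int j = i - 1;
        while (j >= 0 && arr[j] > key) { /* shift larger elements one place right */
            arr[j + 1] = arr[j];
            j--;
        }
        arr[j + 1] = key;                /* place the element in its correct slot */
    }
}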
Two-Dimensional Arrays
A 2D array is also known as a matrix (a table of rows and columns).
Syntax
data_type array_name[rows][columns];
Eg:
int twodimen[4][3];

Initialization of 2D Array in C
For a 1D array, we do not need to specify the size of the array if the declaration and
initialization are done simultaneously. However, this does not fully apply to 2D
arrays: we must specify at least the second dimension of the array. A two-
dimensional array can be declared and defined in the following way.
int arr[4][3]={{1,2,3},{2,3,4},{3,4,5},{4,5,6}};

Access the Elements of a 2D Array


To access an element of a two-dimensional array, you must specify the index numbers
of both the row and the column.
The statement in the example below accesses the value of the element in the first row (0)
and third column (2) of the matrix array.

Eg:
int matrix[2][3] = { {1, 4, 2}, {3, 6, 8} };
printf("%d", matrix[0][2]); // Outputs 2

Loop Through a 2D Array

To loop through a multi-dimensional array, you need one loop for each of the array's
dimensions.
The following example outputs all elements in the matrix array:
int matrix[2][3] = { {1, 4, 2}, {3, 6, 8} };

int i, j;
for (i = 0; i < 2; i++) {
    for (j = 0; j < 3; j++) {
        printf("%d\n", matrix[i][j]);
    }
}
Memory Implementation of 2D Array
When it comes to organizing and accessing elements in a multi-dimensional array,
two prevalent methods are Row Major Order and Column Major Order. These
approaches define how elements are stored in memory and impact the efficiency of
data access in computing.

Row Major Order


In row-major order, successive elements are assigned to successive memory locations moving
across a row and then down to the next row. In simple language, the elements of the array are
stored row-wise.
The address of an element under row-major order is found using the following formula:
Address of A[I][J] = B + W * ((I – LR) * N + (J – LC))
I = row subscript of the element whose address is to be found,
J = column subscript of the element whose address is to be found,
B = base address,
W = storage size of one element stored in the array (in bytes),
LR = lower limit of row/start row index of the matrix (if not given, assume it as zero),
LC = lower limit of column/start column index of the matrix (if not given, assume it as zero),
N = number of columns in the matrix.
How to find an address using Row Major Order?
Given an array arr[1………10][1………15] with base value 100 and the size of each element
being 1 byte in memory, find the address of arr[8][6] with the help of row-major order.
Solution:
Given:
Base address B = 100
Storage size of one element stored in the array W = 1 byte
Row subscript of the element whose address is to be found I = 8
Column subscript of the element whose address is to be found J = 6
Lower limit of row/start row index of the matrix LR = 1
Lower limit of column/start column index of the matrix LC = 1
Number of columns in the matrix N = Upper Bound – Lower Bound + 1
= 15 – 1 + 1
= 15
Formula:
Address of A[I][J] = B + W * ((I – LR) * N + (J – LC))
Address of A[8][6] = 100 + 1 * ((8 – 1) * 15 + (6 – 1))
= 100 + 1 * ((7) * 15 + (5))
= 100 + 1 * (110)
Address of A[8][6] = 210
Column Major Order
In column-major order, successive elements are stored moving down a column and then on to
the next column. The address of an element under column-major order is found using the
following formula:
Address of A[I][J] = B + W * ((J – LC) * M + (I – LR))
I = row subscript of the element whose address is to be found,
J = column subscript of the element whose address is to be found,
B = base address,
W = storage size of one element stored in the array (in bytes),
LR = lower limit of row/start row index of the matrix (if not given, assume it as zero),
LC = lower limit of column/start column index of the matrix (if not given, assume it as zero),
M = number of rows in the matrix.
How to find an address using Column Major Order?
Given an array arr[1………10][1………15] with a base value of 100 and the size of each
element being 1 byte in memory, find the address of arr[8][6] with the help of column-major
order.
Solution:
Given:
Base address B = 100
Storage size of one element stored in the array W = 1 byte
Row subscript of the element whose address is to be found I = 8
Column subscript of the element whose address is to be found J = 6
Lower limit of row/start row index of the matrix LR = 1
Lower limit of column/start column index of the matrix LC = 1
Number of rows in the matrix M = Upper Bound – Lower Bound + 1
= 10 – 1 + 1
= 10
Formula used:
Address of A[I][J] = B + W * ((J – LC) * M + (I – LR))
Address of A[8][6] = 100 + 1 * ((6 – 1) * 10 + (8 – 1))
= 100 + 1 * ((5) * 10 + (7))
= 100 + 1 * (57)
Address of A[8][6] = 157
From the above examples, it can be observed that for the same position two different
addresses are obtained. This is because in row-major order the movement is across a row
and then down to the next row, while in column-major order the movement is first down a
column and then on to the next column. Both answers are therefore correct for their
respective layouts.
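The two formulas can also be written as small C helper functions (a sketch; the parameter names follow the notation used above and the function names are illustrative):

/* Address of A[i][j] in row-major order: b = base address, w = element size,
   lr/lc = lower row/column bounds, n = number of columns. */
long rowMajorAddress(long b, long w, long i, long j, long lr, long lc, long n)
{
    return b + w * ((i - lr) * n + (j - lc));
}

/* Address of A[i][j] in column-major order: m = number of rows. */
long colMajorAddress(long b, long w, long i, long j, long lr, long lc, long m)
{
    return b + w * ((j - lc) * m + (i - lr));
}

With the values from the examples above, rowMajorAddress(100, 1, 8, 6, 1, 1, 15) evaluates to 210 and colMajorAddress(100, 1, 8, 6, 1, 1, 10) evaluates to 157.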
Row Major Order vs Column Major Order
Aspect: Memory Organization
Row Major Order: Elements are stored row by row in contiguous locations.
Column Major Order: Elements are stored column by column in contiguous locations.

Aspect: Memory Layout Example
Row Major Order: For a 2D array A[m][n]: [A[0][0], A[0][1], ..., A[m-1][n-1]]
Column Major Order: For the same array: [A[0][0], A[1][0], ..., A[m-1][n-1]]

Aspect: Traversal Direction
Row Major Order: Moves through an entire row before progressing to the next row.
Column Major Order: Moves through an entire column before progressing to the next column.

Aspect: Access Efficiency
Row Major Order: Efficient for row-wise access, less efficient for column-wise access.
Column Major Order: Efficient for column-wise access, less efficient for row-wise access.

Aspect: Common Use Cases
Row Major Order: Commonly used in languages like C and C++.
Column Major Order: Commonly used in languages like Fortran.

Aspect: Applications
Row Major Order: Suitable for row-wise operations, e.g., image processing.
Column Major Order: Suitable for column-wise operations, e.g., matrix multiplication.

So the result depends on the position of the element whose address is to be found: for some
positions row-major and column-major order give the same address, and for other positions
they give different addresses.
Algorithm
An algorithm is a finite set of instructions that, if followed, accomplishes a particular
task.
Five characteristics of an Algorithm are:
 Input: Accepts zero or more externally supplied inputs.
 Output: Produces at least one output as the result of the procedure.
 Definiteness: Each instruction must be clear and unambiguous.
 Finiteness: For every set of inputs, the algorithm must terminate after a finite
number of steps.
 Effectiveness: Every instruction must be basic enough that it can be carried
out and must be feasible to perform on any machine.

Time – Space Trade off

Performance analysis of an algorithm depends upon two factors: the
amount of memory used and the amount of compute time consumed on
any CPU. Formally these are expressed as complexities in terms of:
 Space Complexity
 Time Complexity

Space Complexity of an algorithm is the amount of memory it needs to
run to completion, i.e. from the start of execution to its termination. The space
needed by any algorithm is the sum of the following components:
1. Fixed Component: This is independent of the characteristics of the
inputs and outputs. This part includes the instruction space and the space for
simple variables, fixed-size component variables, and constants.
2. Variable Component: This consists of the space needed by
component variables whose size depends on the particular
problem instances (inputs/outputs) being solved, the space needed
by referenced variables, and the recursion stack space, which is one of the
most prominent components. It also includes data structure
components like linked lists, heaps, trees, graphs, etc.
Therefore the total space requirement of any algorithm 'A' can be
expressed as

S(P) = c + S_P

S(P) – space requirement
S_P – instance characteristics (variable part)
c – constant (fixed part)

Of the fixed and variable components, the variable part is the important one
to determine accurately, so that the actual space requirement of an algorithm 'A'
can be identified. To identify the space complexity of any
algorithm, the following steps can be followed:
1. Determine the variables which are instantiated with some default
values.
2. Determine which instance characteristics should be used to
measure the space requirement; this is problem specific.
3. Generally the choices are limited to quantities related to the number
and magnitudes of the inputs to and outputs from the algorithm.
4. Sometimes more complex measures of the interrelationships among
the data items can be used.

Example: Space Complexity


Algorithm Sum(number, size)   \\ produces the sum of all numbers in the 'number' list
{
    result = 0.0;
    for count = 1 to size do   \\ repeats for count = 1, 2, 3, ..., size
        result = result + number[count];
    return result;
}

In the above example, when calculating the space complexity we look at
both the fixed and the variable components.

The fixed component consists of the 'result', 'count' and 'size' variables, so the
total fixed space required is three (3) words.

The variable component is characterized by the value stored in the 'size' variable
(suppose the value stored in 'size' is 'n'), because this decides the
size of the 'number' list and also drives the for loop. Therefore, if the space
used by 'size' is one word, the total space required by the 'number'
list will be 'n' words.

Therefore the space complexity can be written as Space(Sum) = 3 + n.


Time Complexity of an algorithm (basically, once it is converted to a program)
is the amount of computer time it needs to run to completion. The time
taken by a program is the sum of the compile time and the run/execution
time. The compile time is independent of the instance (problem-specific)
characteristics. The following factors affect the time complexity:
1. Characteristics of the compiler used to compile the program.
2. The computer/machine on which the program is executed and physically
clocked.
3. Multiuser execution system.
4. Number of program steps.

Therefore the time complexity again consists of two components, a
fixed part (factor 1 only) and a variable/instance part (factors 2, 3 and 4), so for any
algorithm 'A' it is given as:

Time(A) = Fixed Time(A) + Instance Time(A)

Here the number of steps is the most prominent instance characteristic,
and the number of steps assigned to any program statement depends on
the kind of statement, for example:
 comments count as zero steps,
 an assignment statement which does not involve any calls to other
algorithms is counted as one step,
 for iterative statements we consider the step count only for the
control part of the statement, etc.

Therefore, to calculate the total number of program steps we use the
following procedure. We build a table in which we list the total
number of steps contributed by each statement. This is often arrived at by
first determining the number of steps per execution of the statement and
the frequency with which each statement is executed. This procedure is explained
using an example.

Example: Time Complexity
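The step table for the Sum algorithm (reconstructed here from the analysis below; steps per execution, s/e, times frequency gives each statement's contribution):

Statement                             s/e    Frequency    Total steps
result = 0.0;                          1         1             1
for count = 1 to size do               1     size + 1      size + 1
  result = result + number[count];     1       size          size
return result;                         1         1             1
---------------------------------------------------------------------
Total                                                     2*size + 3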

If you analyze the example carefully, the frequency of "for count = 1 to size
do" is 'size + 1'. This is because the statement is executed one extra time
for the condition check that evaluates to false and terminates the loop.
Once the total steps are calculated, they represent the instance
characteristics in the time complexity of the algorithm. Therefore the
time complexity can be expressed as: Time(Sum) = 2*size + 3

In this way both the space complexity and the time complexity can be
calculated. Together, the two complexities comprise the performance
analysis of an algorithm; neither is sufficient on its own. These
complexities also help in defining the parameters on the basis of which we
optimize algorithms.

Asymptotic Notation

In algorithm analysis, the best case, worst case, and average case are used to describe the
performance of an algorithm under different conditions. They provide a way to understand
the time or space complexity of an algorithm in various scenarios. Here's an overview of
each:

1. Best Case

 Definition: The best-case scenario refers to the situation where the input data is
structured in such a way that the algorithm performs the fewest possible operations. It
represents the minimum time or space required for the algorithm to complete.

2. Worst Case

 Definition: The worst-case scenario is when the algorithm encounters the most
difficult or least favourable input, resulting in the maximum number of operations. It gives an
upper bound on the algorithm's running time or space usage.

3. Average Case

 Definition: The average-case scenario reflects the expected performance of an
algorithm over a distribution of all possible inputs. It is calculated by averaging the
running time or space over all inputs, assuming a certain distribution of inputs.
For example, in linear search the best case occurs when the key is the first element (one
comparison), the worst case when the key is the last element or absent (n comparisons), and
the average case requires about n/2 comparisons.

Big Oh, O: Asymptotic Upper Bound

The notation Ο(n) is the formal way to express the upper bound of an algorithm's running time.
It is the most commonly used notation. It measures the worst-case time complexity, or the
longest amount of time an algorithm can possibly take to complete.
Big Omega, Ω: Asymptotic Lower Bound
The notation Ω(n) is the formal way to express the lower bound of an algorithm's running
time. It measures the best-case time complexity, or the minimum amount of time an algorithm
can possibly take to complete.

Theta, θ: Asymptotic Tight Bound

The notation θ(n) is the formal way to express both the lower bound and the upper bound of
an algorithm's running time. Some may confuse the theta notation with the average-case time
complexity; while big theta notation can often describe the average case fairly accurately,
other notations could be used as well.
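For reference, the three notations can be stated formally; these are the standard definitions, added here for completeness:

\[
\begin{aligned}
f(n) = O(g(n)) &\iff \exists\, c > 0,\ n_0 > 0 \text{ such that } 0 \le f(n) \le c \cdot g(n) \text{ for all } n \ge n_0,\\
f(n) = \Omega(g(n)) &\iff \exists\, c > 0,\ n_0 > 0 \text{ such that } f(n) \ge c \cdot g(n) \ge 0 \text{ for all } n \ge n_0,\\
f(n) = \Theta(g(n)) &\iff f(n) = O(g(n)) \text{ and } f(n) = \Omega(g(n)).
\end{aligned}
\]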
