KEMBAR78
Data Struct | PDF | Data Type | Time Complexity
0% found this document useful (0 votes)
17 views182 pages

Data Struct

Data structure Unit-1

Uploaded by

sakshi.swami25
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
17 views182 pages

Data Struct

Data structure Unit-1

Uploaded by

sakshi.swami25
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 182

Data Structure

Mr. Dinesh Satre


MMIT, Lohgaon, Pune-47
dinesh.satre@mmit.edu.in
Prerequisite Courses
• Fundamentals of Programming Languages
• C Programming, Operators and Expressions, Control Flow, Arrays, User
Defined Functions

• Programming and Problem Solving


• Python Programming, Decision Control, Functions and Strings, File
Handling and Dictionaries, Object Oriented Programming
Companion Course
• Data Structures Laboratory
• Group A: Arrays and Searching Sorting Algorithms
• Group B: Stacks Queues and Linked List
• Group C: Hashing
• Group D: Graphs and Trees
• Group E: Mini project
Course Objectives
• To understand importance of data structures in context of writing
efficient programs
• To be able to implement linear and non-linear data structures
Course Outcomes (COs)
● On completion of the course, learner will be able to–
● CO1:Understand and Analyze various types of data structures and
algorithms
● CO2:Apply various sorting and searching algorithms for given problem
● CO3:Make Use of Stacks and Queues to solve the given problem
● CO4:Analyze different hashing techniques and collision resolution
strategies.
● CO5:Demonstrate basic operations on trees and graphs
Syllabus
• Unit I: Introduction to Data Structures and Algorithms
• Introduction to Data Structures: Abstract Data Types (ADT), Linear and
Non-linear, Static and Dynamic, Persistent and Ephemeral data structures
• Algorithms: Space complexity, Time complexity, Asymptotic notation-
Big-O, Theta and Omega, finding complexity using step count method,
Analysis of programming constructs-Linear, Quadratic, Cubic, Logarithmic.
• Algorithmic Strategies: Introduction to algorithm design strategies- Divide
and Conquer, and Greedy strategy
• Case Study:E-commerce Product Sorting using Divide and Conquer
strategy Google Calendar application using Greedy strategy
Syllabus
• Unit II: Linear Data Structures, searching and sorting
• Overview of Array, Array as an Abstract Data Type, Operations on Array,
Storage Representation, Multidimensional Arrays[2D, nD], Sparse matrix
representation using 2D
• Searching: Sequential Search/Linear Search, Binary Search, Fibonacci
Search, and Indexed Sequential Search.
• Sorting: Concepts- Stability, Efficiency, and Number of Passes, Internal and
External Sorting, Bubble sort, Insertion Sort, Selection Sort , Quick Sort,
Merge sort
• Case Study : Social Network Adjacency Matrix Representing friendship
connections among millions of users.
Syllabus
• Unit III: Stacks, Queues and Linked Lists
• Stacks: Stack operations, Multiple Stacks, Applications of Stack for
Expression Conversion [infix, prefix and postfix], Postfix expression
evaluation
• Queues: Queue Operations, Circular Queue, Priority Queue and its
advantages and applications
• Linked list: Introduction of Linked Lists, Primitive Operations on Linked
List- Create, Traverse, Search, Insert, Delete, Sort, and Concatenate. Types
of Linked List: Singly linked, linear and Circular Linked Lists, Doubly
Linked List,
• Case study: Implementation of Stack and Queue operations using Linked
lists
Syllabus
• Unit IV: Hashing
• Concepts-hash table, hash function, basic operations, bucket, collision,
probe, synonym, overflow, open hashing, closed hashing, perfect hash
function, load density, full table, load factor, rehashing, properties of good
hash function, Collision resolution strategies- open addressing and chaining,
Hash table overflow- open addressing and chaining, extendible hashing,
closed addressing and separate chaining
• Case study : Dictionary Application using Hash Tables, Description:
Implement a dictionary where words and meanings are stored and retrieved
using hashing with collision resolution
Syllabus
• Unit V: Graphs and Trees
• Graphs: Basic Concepts, Storage representation, Adjacency matrix,
adjacency list, Traversals-depth first and breadth first, Minimum spanning
Tree, Greedy algorithms for computing minimum spanning tree- Prims and
Kruskal Algorithms
• Trees: General tree and its representation: sequential and linked
organization, Binary tree- properties, converting tree to binary tree, binary
tree traversals (recursive and non-recursive) - inorder, preorder, post order,
Operations on binary tree. Binary Search Tree (BST) and its operation
• Case study: GPS/Navigation system that models a city map as a weighted
graph and applies core graph algorithms ZIP/GZIP file compression using
frequency-based encoding. using Huffman tree
Examination Scheme
• Theory
• Comprehensive Continuous Evaluation (CCE): 30 Marks
• End-Semester Examination (ESE): 70 Marks
Unit No. CCE Exam End Sem Exam (ESE)
Institute Level University Level
Unit-I 06 14
Unit-II 06 14
Unit-III 06 14
Unit-IV 06 14
Unit-V 06 14
Marks Weightage Per Unit For Examination
Examination Scheme
• Comprehensive Continuous Evaluation (CCE): 30 Marks
• Based on all the Units of course syllabus to be scheduled and conducted at
institute level.
• Case studies included under each unit are intended to support applied
learning and are part of Comprehensive Continuous Evaluation
• These case studies will be assessed through internal assessment components
such as presentations, assignments, or group discussions. They shall not be
included in the End-Semester Theory Examination.
Examination Scheme
• Comprehensive Continuous Evaluation (CCE): 30 Marks

Sr. No. Parameter Marks Coverage of Unit


1 Unit Test 12 Units 1 & Unit 2 (6 Marks/Unit)

2 Assignments / Case Study 12 Units 3 & Unit 4 (6 Marks/Unit)

3 Seminar Presentation / Open Book Test/ Quiz 06 Unit 5


Text Book
• Data structures and algorithms in python by Michael T. Goodrich,
ISBN-13: 978- 1118290279, ISBN-10: 1118290275, Publisher:
Wiley; 1st edition (March 18, 2013).

• Problem Solving with Algorithms and Data Structures Using Python


by Bradley N Miller and David L. Ranum. ISBN-13:
978-1590282571, ISBN-10: 1590282574, Publisher: Franklin,
Beedle & Associates; 2nd edition (August 22, 2011).
MOOC/ Video Lectures
• https://nptel.ac.in/courses/106/102/106102064/
• https://nptel.ac.in/courses/106/105/106105085
• https://nptel.ac.in/courses/106/106/106106127
Pre-requisite Test
• 20 Marks Pre-requisite test is based on Pre-requisite courses
• Fundamentals of Programming Languages
• 2 Marks each unit

• Programming and Problem Solving


• 2 Marks each unit
Introduction To Data Structure
• Computer science includes the study of data, its representation, and
its processing by computers

• The success of a software project often depends upon the choices


made in the representation of the data and the choice of algorithms

• The term data structure refers to the organization of data elements


and the interrelationships among them
Introduction To Data Structure
• There is a clear distinction between the data structure specification
and its realization
• The specification comes before the programming language
• Its realization comes with a specific programming language

• Specification of data structures requires explaining the functioning


and overall behaviour of the data structure, whereas the
implementation of the data structure requires simulating the data
structure in some programming language
Data
• Data is nothing but a piece of information

• Data input, data manipulation (or data processing), and data output
are the functions of computers

• It can be a number, a string, or a set of many numbers and strings


Atomic and Composite Data
• Atomic data is the data that we choose to consider as a single,
non-decomposable entity
• For example, the integer 1234 may be considered as a single integer value
• In some languages, atomic data is known as scalar data because of its
numeric properties.

• Composite data can be broken down into subfields that have


meaning.
• For example, a student’s record consists of Roll_Number, Name, Branch,
Year, and so on.
• Composite data is also referred to as structured data and can be
implemented using a structure or a class in C++.
Data Type
• Data type refers to the kind of data a variable may store

• Data type is a term that specifies the type of data that a variable may
hold in the programming language.
Built-in Data Types
• In general, languages have their built-in data types

• For example, in the C/C++ languages, int, float, and char are built-in
data types
User-defined Data Types
• Using built-in data types, we can
design (define) our own data
types by means of structures,
unions, and classes

• Suppose we want to maintain a


record of 100 students. Then we
use the C++ class

• Class, structure, and union are


the user-defined data types in
C++.
Data Object
• A data object represents a container for data values — a place where
data values may be stored and later retrieved.

• A data object is characterized by a set of attributes.

• The attributes determine the number and type of values that the data
object may contain
Data Object
• A data object is nothing but a set of elements, say D.
• The data object ‘alphabets’ can be defined as
D = {A, B, …, Z, a, b, …, z}
• The data object ‘integers’ as
D = {…, -3, -2, -1, 0, 1, 2, 3, …}.

The data object set may be finite or infinite.


Data Object
• A data object is a run-time instance of data structures.

• The programmer explicitly creates and manipulates data objects


through declarations and statements in the program.

• System-defined data objects are ordinarily generated automatically


as needed during program execution without explicit specification by
the programmer.
Data Structure
• Data structures refer to data and representation of data objects within
a program

• A data structure is a collection of atomic and composite data types


into a set
Data Structure
• A data structure is a set of domains D, a set of functions F, and a set
of axioms A.

• The triple structure (D, F, A) denotes the data structure with the
following elements:
• Domain (D) This is the range of values that the data may have.
• Functions (F) This is the set of operations for the data.
• Axioms (A) This is a set of rules with which the different operations
belonging to F can actually be implemented.
Data Structure
• The data structure d = Integer
Abstract Data Type
• Software engineering is the establishment and the use of good
engineering methodologies and a principle for writing reliable
software.

• Abstraction allows us to organize the complexity of a task by


focussing on logical properties of data and actions rather than on the
implementation details.

• Logical properties refer to the ‘what’ and implementation details


refer to the ‘how’
Abstract Data Type
• An ADT is a mathematical model that includes data with various
operations defined

• Implementation details of an ADT are hidden, which is why it is


called abstract

• To represent the mathematical model underlying an ADT, we use the


data structure, which is a collection of the variables and the data
types inter-related in different ways.
Abstract Data Type
• Data abstraction is the separation of logical properties of the data
from details of how the data is represented.

• Procedural abstraction means separation of the logical properties of


action from implementation.

• An ADT encompasses both procedural as well as data abstraction;


the set of operations are defined for any data type that might make up
the set of values
Abstract Data Type
• In brief, an ADT includes declaration of data, implementation of
operations, and encapsulation of data and operations.

• Consider the concept of a queue.


• At least three data structures will support a queue. We can use an array, a
linked list, or a file.
• If we place our queue in an ADT, users should not be aware of the structure
we use.
• As long as they can enqueue (insert) and dequeue (retrieve) data, how we
store the data should make no difference
Abstract Data Type
• Let us redefine ADT for the
Integer
Abstract Data Type
• Five basic functions are defined on a set of integer data object
1. zero() — It is a function which takes no input but generates the
integer zero as result. That is, its output is 0.
2. ifzero(int) — This function takes one integer input and checks
whether that number is 0 or not.
3. increment(int) — This function reads one integer and produces its
incremented value, that is, (integer + 1), which is again an
integer.
4. add(int, int) — This function reads two integers and adds them
producing another integer.
5. equal(int, int) — This function takes two integer values and
checks whether they are equal or not.
Q&A
1. Which of the following best describes a data structure?
A) A way to store data in a database
B) A way to organize and store data for efficient access and modification
C) A set of mathematical operations
D) A data type predefined by a programming language
Q&A
2. What is atomic data?
A) Data divided into smaller parts
B) Data that can be divided into subfields
C) Indivisible data that cannot be broken down further
D) A data type that contains multiple fields
Q&A
3. Which of the following is an example of composite data?
A) Integer
B) Character
C) Array
D) Float
Q&A
4. What is meant by data type?
A) The method used to sort data
B) The kind of value a variable can hold and the operations that can be performed
on it
C) The memory location of a variable
D) The order of data elements
Q&A
5. Which of the following is NOT a built-in data type in C?
A) int
B) float
C) struct
D) char
Q&A
6. Which one of the following is an example of a user-defined data type?
A) int
B) float
C) enum
D) double
Q&A
7. What is a data object?
A) The implementation of a program
B) A set of data elements and the operations defined on them
C) A type of variable
D) A memory address
Q&A
8. Which of the following best defines an Abstract Data Type (ADT)?
A) Data type defined without specifying concrete implementation
B) Built-in data type in the programming language
C) Data stored as primitive values only
D) Data type with only numeric data
Q&A
9. Which of the following statements about Abstract Data Types (ADT) is true?
A) ADTs only work with primitive data types
B) ADTs define what operations are to be performed but not how they are
implemented
C) ADTs must be directly supported by hardware
D) ADTs cannot be used in object-oriented programming
TYPES OF DATA STRUCTURES
• We defined a data structure as a way of organizing data that specifies
1. a set of data elements, that is, a data object
2. a set of operations that are applied to this data object
TYPES OF DATA STRUCTURES
• The data structure is independent of their implementation.

• The various types of data structures are as follows


1. primitive and non-primitive
2. linear and non-linear
3. static and dynamic
4. persistent and ephemeral
5. sequential and direct access
Primitive and Non-primitive Data Structures
• Primitive data structures define a set of primitive elements that do
not involve any other elements as its subparts
• For example, data structures defined for integers and characters.

• Non-primitive data structures are those that define a set of derived


elements such as arrays.
• Arrays consist of a set of similar type of elements.
• Class and structure are other examples of non-primitive data structures
Linear and Non-linear Data Structures
• A data structure is said to be linear if its elements form a sequence or
a linear list.

• In a linear data structure, every data element has a unique successor


and predecessor.

• There are two basic ways of representing linear structures in


memory.
• One way is to have the relationship between the elements by means of
pointers (links), called linked lists.
• The other way is using sequential organization, that is, arrays.
Linear and Non-linear Data Structures
• Non-linear data structures are used to represent the data containing
hierarchical or network relationship among the elements.
• Trees and graphs are examples of non-linear data structures.

• In non-linear data structures, every data element may have more than
one predecessor as well as successor.

• Elements do not form any particular linear sequence.


Linear and Non-linear Data Structures
• Figure depicts both linear and non-linear data structures.
Static and Dynamic Data Structures
• A data structure is referred to as a static data structure if it is created
before program execution begins

• The variables of static data structure have user-specified names.

• An array is a static data structure


Static and Dynamic Data Structures
• In many applications, it is desirable to be able to start a program with the
smallest amount of memory necessary and then allocate extra memory as
the need arises.

• This facility is provided by many programming languages and in C++,


through the operator “new”.

• These functions allow programmers to allocate memory during execution.

• Hence, the programmer can realize the data structure which dynamically
grows and shrinks.
Static and Dynamic Data Structures
• A data structure that is created at run-time is called dynamic data structure.

• The variables of this type are not always referenced by a user-defined


name.

• These are accessed indirectly using their addresses through pointers.

• A linked list is a dynamic data structure

• Non-linear data structures are generally implemented implemented as


dynamic data structures.
Persistent and Ephemeral Data Structures
• Data structures comprise a set of operations and a set of data to
operate on.

• The operations that process the data may modify the data.

• This may create two versions of a data structure namely the recently
modified data structure and the previous version

• The list data type has the associated operations—append and reverse.
These operations preserve two copies of the data structure as the
recent version and the previous version
Persistent and Ephemeral Data Structures
• A data structure that supports operations on the most recent version
as well as the previous version is termed as a persistent data
structure.

• A persistent data structure is partially persistent if any version can be


accessed but only the most recent one can be updated; it is fully
persistent if any version can be both accessed and updated.
Persistent and Ephemeral Data Structures
• An ephemeral data structure is one that supports operations only on
the most recent version.

• The distinction between ephemeral and persistent data structure is


essentially the distinction between functional (also called effect free)
and imperative (also called effect full) programming paradigms.

• The functional data structures are persistent and the imperative data
structures are ephemeral
Sequential Access and Direct Access Data
Structures
• Sequential access means that to access the nth element, we must
access the preceding (n - 1) data elements.
• A linked list is a sequential access data structure.

• Direct access means that any element can be accessed without


accessing its predecessor or successor; we can directly access the nth
element.
• An array is an example of a direct access data structure.
Q&A
1. Which of the following is a primitive data structure?
A) Stack
B) Queue
C) Integer
D) Linked list
Q&A
2. Which of the following data structures is linear?
A) Tree
B) Graph
C) Queue
D) Binary search tree
Q&A
3. What is a static data structure?
A) A data structure whose size can change during runtime
B) A data structure whose size is fixed at compile time
C) A data structure that cannot store numbers
D) A data structure that only uses pointers
Q&A
4. Which of the following is true about persistent data structures?
A) They lose their previous version when modified
B) They always occupy less memory
C) All previous versions remain accessible after updates
D) They cannot be implemented using trees
Q&A
5. Which of the following best describes an ephemeral data structure?
A) Maintains all past versions after updates
B) Only the latest version is kept, past versions are lost
C) Cannot be used in programs
D) Stores data permanently in secondary storage
Q&A
6. Arrays are usually categorized as:
A) Dynamic and linear data structures
B) Static and linear data structures
C) Dynamic and non-linear data structures
D) Static and non-linear data structures
Algorithms
• Algorithm: An algorithm is an finite set of instructions that, if
followed, accomplished a particular task

• Analysis: Separation into its constituent parts for individual


study

• Analysis of Algorithm: Investigation of algorithm efficiency


Algorithms
• There are two kind of efficiency
• Space Efficiency: the amount of memory units required by the
algorithm
• Time Efficiency: indicates how fast an algorithm
Space Complexity
• All algorithms run longer on larger inputs
• For example, sort larger arrays, multiply larger matrices, and so on

• The appropriate size metric can be influenced by


operations of the algorithm
• If the algorithm examines individual characters of its input, then size
will be the number of characters
• If the algorithm examines a positive integers n in an algorithm, then
size will be count of n.
Space Complexity
• The space needed by the algorithm is based on the
following criteria
• A fixed part that is independent of input and output. This includes
instruction space, space of simple variable, fixed sized component,
etc
• A variable part that consists of variables whose size is depends on
the instance of problem solve, space needed by reference variables
and recursive stack.
Space Complexity
• The space requirement S(P) of any algorithm P may
be written as S(P) = c + Sp, where c is constant and
Sp is the instance characteristics.
Space Complexity


• The problem instance is characterized by specific values of a, b and c.


Making assumption, one word is adequate to store a, b and c. It is
independent of instance characteristic, therefore, S(P) = c + Sp = c
Space Complexity
• The problem instance are
characterized by n, the
number of elements to be
summed. The space needed
by n is one word, since it is
of type integer. The variable
a requires at least n words,
since a must be large
enough to hold n elements.
So, S(P) = c + Sp ≥ 3 + n (3
for each i, s and n, n for a[])
Space Complexity

• The recursion stack includes the space formal parameter, local


variable and return address. Assume that return address require
one word of memory. Each call of algorithm requires at least 3
words. Since the depth of recursion is n+1, the stack space needed
is S(P) ≥ 3(n+1)
Time Complexity
• It is an units for measuring an algorithm’s running time
• Standard unit of time can be used for measurement - a second, or
millisecond, and so on

• Drawbacks of using Standard unit of time


• dependence on the speed of a particular computer
• dependence on the quality of a program implementing the algorithm
• the compiler used in generating the machine code
Time Complexity
• One possible approach is to count the number of times
each of the algorithm’s operations is executed
• The thing to do is to identify the most important operation of the
algorithm, called the basic operation

• The basic operation of an algorithm: It is usually the


most time-consuming operation in the algorithm
Time Complexity
• The time T(P) taken by a program P is sum of compile time and run time

• Compile Time:
• The compile time does not depends on instance characteristics
• Compiled program can be run several time without recompiling

• Run Time:
• The run time is denoted by t (instance characteristic)
p
• The run time t (n) is calculated as t (n) = c ADD(n) + c SUB(n) + c MUL(n) + c Div(n) +
p p a s m d
Time Complexity
• A program statement is a meaningful statement of a program and
has a execution time.

• The number of steps assigned in program is depends on the kind of


statement

• For example,
• Comment count as zero
• Assignment statement is counted as one step
• In iterative statement number of steps depends on control parameters
Time Complexity
• Number of steps needed by the program is
determined by two ways
st
• In 1 method, variable count is introduced
• 2nd method is tabular method
st
1 Method: Example
• Find the running time of the following algorithm
st
1 Method: Example
• Use count variable to compute running time
st
1 Method: Example
• Simplification of algorithm showing only count variable

• Each execution of the algorithm runs 2n+3 steps


st
1 Method: Example-2
• Find the running time of the following algorithm
st
1 Method: Example-2
st
1 Method: Example-2

• Let t (n) be the increase in the value of count


RSum

• When analyzing the algorithm, we obtain the following


recursion formula
st
1 Method: Example-2
• The recursive formula is referred as recurrence relation
• One way of solving the recurrence relation is repeated
substitution of each occurrence of function
st
1 Method: Example-3
st
1 Method: Example-3
● Line 7 executes n times
and line 5 executes m
times and line 9
executes once.
Therefore, the step
count will be 2mn + 2n +
1
nd
2 Method
nd
• The 2 method to determine the step count of
algorithm is to build the table

• Table includes steps per execution (s/e) and total


number of time each step executed
nd
2 Method

nd
2 Method

nd
2 Method

Order of Growth
Worst, Best & Average Case Complexity
Worst, Best & Average Case Complexity

• The running time of this algorithm can be quite


different for the same list size n
• In the worst case, when there are no matching elements
• The algorithm makes the largest number of key comparisons
Worst, Best & Average Case Complexity
• Worst-case Efficiency:
• The longest among all possible inputs of that size n
• Determine the kind of inputs for which the count C(n) largest among all
possible inputs of size n
• Provides very important information about an algorithm’s efficiency by
bounding its running time from above
• Guarantees that for any instance of size n, the running time will not exceed

C (n)
worst
• C (n) = n for possible inputs of size n
worst
Worst, Best & Average Case Complexity

• Best-case Efficiency:
• The algorithm runs the fastest among all possible inputs of that size n
• Determine the kind of inputs for which the count C(n) will be the
smallest among all possible inputs of size n

• C (n) = 1 for possible inputs of size n


best
Worst, Best & Average Case Complexity

• Average-case Efficiency
• To analyze the algorithm’s average case, make some assumptions
about possible inputs of size n

• The probability of a successful search is p (0 ≤ p ≤ 1)


th
• The probability of the first match occurring in the i position of the list is
the same for every i
Worst, Best & Average Case Complexity

• Find the average number of key comparisons C (n)


avg
as follows
• In the case of a successful search, the probability of the first match
th
occurring in the i position of the list is p/n for every i, and the
number of comparisons made by the algorithm in such a situation is
obviously i.
• In the case of an unsuccessful search, the number of comparisons
will be n with the probability of such a search being (1− p).
Worst, Best & Average Case Complexity


Worst, Best & Average Case Complexity

• If p = 1 (the search must be successful), the average


number of key comparisons made by sequential
search is (n + 1)/2; that is, the algorithm will inspect,
on average

• If p = 0 (the search must be unsuccessful), the


average number of key comparisons will be n
Asymptotic Notation
• The efficiency analysis framework concentrates on the
order of growth of an algorithm’s basic operation

• To compare and rank such orders of growth, computer


scientists use three notations
• O (big oh)
• Ω (big omega)
• Ө (big theta).
Big O Notation
• The function f(n) = O(g(n) iff there exist a positive constant c and n such that f(n) ≤ c * g(n)
0
for all n, n ≥ n .
0

• For example,
• 3n + 2 = O(n) as 3n+2≤4n for all n ≥ 2
• 3n + 3 = O(n) as 3n+3≤4n for all n ≥ 3
• 100n + 6 = O(n) as 100n+6≤101n for all n ≥ 6
2 2 2 2
• 10n + 4n + 2 = O(n ) as 10n + 4n + 2 ≤11n for all n ≥ 5
2 2 2 2
• 1000n + 100n - 6 = O(n ) as 1000n + 100n - 6 ≤1001n for all n ≥ 100
n 2 n n 2 n
• 6*2 +n = O(2 ) as 6*2 +n ≤7*2 for all n ≥ 4
2 2
• 3n + 3 = O(n ) as 3n+3≤3n for all n ≥ 2
Big O Notation
• O(1) mean the computing time is constant
• O(n) mean the computing time is linear
2
• O(n ) mean the computing time is quadratic
3
• O(n ) mean the computing time is cubic
n
• O(2 ) mean the computing time is exponential
• O(log n) mean the computing time is logarithmic
Ω Notation
• The function f(n) = Ω(g(n) iff there exist a positive constant c
and n such that f(n) ≥ c * g(n) for all n, n ≥ n .
0 0

• For example,
• 3n + 2 = Ω(n) as 3n+2≥3n for all n ≥ 1
• 3n + 3 = Ω (n) as 3n+3≥3n for all n ≥ 1
• 100n + 6 = Ω(n) as 100n+6≥100n for all n ≥ 1
2 2 2 2
• 10n + 4n + 2 = Ω(n ) as 10n + 4n + 2 ≥n for all n≥1
n 2 n n 2 n
• 6*2 +n = Ω(2 ) as 6*2 +n ≥2 for all n ≥ 1
Ө Notation

• The function f(n) = Ө(g(n) iff there exist a positive constant c , c and
1 2
n such that c * g(n) ≤ f(n) ≤ c * g(n) for all n, n ≥ n .
0 1 2 0

• For example,
• 3n + 2 = Ө(n) as 3n+2≥3n for all n ≥ 2 3n+2≤4n for all n ≥ 2, so c = 3 c = 4 and n =2
• 3n + 3 = Ө(n)
1 2 0
2 2
• 10n + 4n + 2 = Ө(n )
n 2 n
• 6*2 +n = Ө(2 )
Analysis of Programming Constructs-Linear,
Quadratic, Cubic, Logarithmic
1. Linear Time Complexity (O(n)):

● An algorithm has linear time complexity if its execution time grows proportionally to the input size 'n'.
● For example, searching for an element in a list by iterating through each element once has O(n) complexity.
● If you double the input size, the execution time roughly doubles as well.

2. Quadratic Time Complexity (O(n^2)):

● An algorithm has quadratic time complexity if its execution time grows proportionally to the square of the input
size 'n'.
● Nested loops where each loop iterates 'n' times are a common example.
● If you double the input size, the execution time increases by a factor of four.
Analysis of Programming Constructs-Linear,
Quadratic, Cubic, Logarithmic
3. Cubic Time Complexity (O(n^3)):

● An algorithm has cubic time complexity if its execution time grows proportionally to the cube of the input size
'n'.
● An example is a triple nested loop where each loop iterates 'n' times.
● If you double the input size, the execution time increases by a factor of eight.

4. Logarithmic Time Complexity (O(log n)):

● An algorithm has logarithmic time complexity if its execution time grows proportionally to the logarithm of the
input size 'n'.
● Binary search is a classic example. In each step, the search space is halved.
● As the input size increases, the execution time increases much slower compared to linear, quadratic, or cubic
time complexities.
Q&A
1. What does space complexity of an algorithm refer to?
A) Time taken by the algorithm
B) Number of arithmetic operations
C) Amount of memory required
D) Number of recursive calls
Q&A
2. Time complexity of an algorithm quantifies:
A) Amount of memory used
B) Number of input elements
C) Time taken relative to input size
D) Time taken on specific hardware
Q&A
3. What is the worst-case time complexity of linear search on an
unsorted array?
A) O(1)
B) O(n log n)
C) O(log n)
D) O(n)
Q&A
4. What does average case time complexity measure?
A) The complexity for the smallest input
B) The complexity for the largest input
C) The expected complexity over all possible inputs
D) The memory usage of an algorithm
Q&A
5. Which of the following represents best-case time complexity?
A) Minimum number of steps for any input
B) Maximum memory usage
C) Time for the largest input
D) All of the above
Q&A
6. Which of the following asymptotic notations gives the upper bound
of an algorithm?
A) Θ (Theta)
B) Ω (Omega)
C) O (Big O)
D) None of the above
Q&A
7. What does Ω (Omega) notation describe?
A) Worst-case scenario
B) Best-case scenario
C) Average-case scenario
D) Memory usage
Q&A
8. Θ (Theta) notation provides:
A) Only upper bound
B) Only lower bound
C) Both upper and lower bound
D) Neither bound
Q&A
9. Which of the following is the correct hierarchy of asymptotic
notations in terms of bounds?
A) O < Θ < Ω
B) Θ < O < Ω
C) Ω < Θ < O
D) O (Upper Bound), Θ (Tight Bound), Ω (Lower Bound)
Q&A
10. If an algorithm has time complexity O(n²), which of the following
statements is true?
A) The algorithm will always run in exactly n² steps
B) The algorithm will never take more than n² steps
C) The algorithm will take less than n² steps
D) The algorithm will run in constant time
Algorithmic Strategies
• Divide and Conquer
• Greedy strategy
Divide & Conquer
• Divide-and-conquer algorithms work according to the
following general plan
• A problem is divided into several subproblems of the same type,
ideally of about equal size
• The subproblems are solved - typically recursively
• If necessary, the solutions to the subproblems are combined to get a
solution to the original problem
Divide & Conquer

Divide & Conquer
• The strategy suggest the splitting the n inputs into k
distinct subset, 1 < k ≤ n.

Example

• Let us consider the problem of computing the sum of n numbers a , . .


0
.,a .
n−1

• If n > 1, divide the problem into two instances of the same problem:
• compute the sum of the first n/2 numbers
• compute the sum of the remaining n/2 numbers

• Once each of these two sums is computed by applying the same


method recursively, add their values to get the sum in question
Efficiency Analysis
• An instance of size n can be divided into b instances of
size n/b, with a of them needing to be solved
• Assuming that size n is a power of b, the recurrence
relation is

• where f (n) is a function that accounts for the time spent


on dividing an instance of size n into instances of size n/b
and combining their solutions
Efficiency Analysis
• The efficiency analysis of many divide-and-conquer
algorithms is greatly simplified by the following
theorem
• Master Theorem:
d
• If f (n) ∈ Ө(n ) where d ≥ 0
Efficiency Analysis
• You cannot use the Master Theorem if
• T(n) is not monotone, ex: T(n) = sin n
n
• f(n) is not a polynomial, ex: T(n) = 2T(n/2 ) + 2
• b cannot be expressed as a constant, ex: T(n) = T(√n)
Example 1
•Consider the recurrence relation
•Recurrence relation for D&C is

•Therefore,
–a = 2
–b = 2
–f(n) = 1
Example 1
d
•If f (n) ∈ Ө(n ) where d ≥ 0
•Then, d = 0, since f(n) = 1
•According to master theorem,





d
•Since, a > b

Example 2
•Consider the recurrence relation

•Recurrence relation for D&C is

•Therefore,
–a = 2
–b = 2
–f(n) = n
Example 2
d
•If f (n) ∈ Ө(n ) where d ≥ 0
•Then, d = 1, since f(n) = n
•According to master theorem,



•Therefore,

d
•Since, a = b
Example 3
•Consider the recurrence relation

•Recurrence relation for D&C is

•Therefore,
–a = 3
–b = 2
2
–f(n) = n
Example 3
d
•If f (n) ∈ Ө(n ) where d ≥ 0
2
•Then, d = 2, since f(n) = n
•According to master theorem,



•Therefore,

d
•Since, a < b

Merge Sort
•Merge sort is a perfect example of a successful
application of the divide-and-conquer technique
•It sorts a given array by dividing it into two halves,
sorting each of them recursively, and then merging the
two smaller sorted arrays into a single sorted one
Merge Sort
Merge Sort
Merge Sort
•The operation of
the algorithm on the
list 8, 3, 2, 9, 7, 1,
5, 4 is illustrated as
Merge Sort

•Assuming for simplicity that n is a power of 2, the recurrence


relation for the number of key comparisons C(n) is

•Let us analyze C (n), the number of key comparisons


merge
performed during the merging stage
Merge Sort

•The worst case, C (n) = n − 1


merge
•Therefore the recurrence relation is

•Hence, according to master theorem
–a = 3, b = 2, f(n) = n
d
–If f (n) ∈ Ө(n ) where d ≥ 0
•Then, d = 1, since f(n) = n
Binary Search

• Let a , 1≤i≤n, be a list of element that are sorted in ascending


i
order
• Determine the given element x is present in the list.
• If x is present, determine value of j such that a = x
j
• If x is not present, j is set to zero
• Let P=(n, a , …, a , x) is an instance of search problem, where
i l
n is number of elements, a , …, a is the list of elements and x
i l
is the element to be search.
Binary Search
• Let us consider the list

Binary Search

Binary Search

Binary Search
• The recurrence relation for Binary Search

• Recurrence relation for D&C is

• Therefore,
• –a = 1
• –b = 2
• –f(n) = 1

Binary Search
d
• If f (n) ∈ Ө(n ) where d ≥ 0
• Then, d = 0, since f(n) = 1
• According to master theorem,



• Therefore,

d
• Since, a = b
Q&A
Which of the following best describes the divide and conquer approach?
A) Solve the problem by iteratively improving the solution until it becomes
optimal
B) Break the problem into smaller subproblems, solve them independently,
and combine their solutions
C) Use randomness to simplify the problem
D) Use a greedy choice to construct the solution incrementally
Q&A
Which of the following algorithms is based on the divide and conquer
strategy?
A) Prim’s algorithm
B) Dijkstra’s algorithm
C) Merge Sort
D) Linear Search
Q&A
In the divide and conquer strategy, after dividing the problem into
subproblems, what is typically done next?
A) Solve only the first subproblem and ignore others
B) Solve each subproblem recursively
C) Use dynamic programming to solve overlapping subproblems
D) Combine all subproblems into a single large one
Q&A
Which of the following correctly lists the three main steps of a divide and
conquer algorithm?
A) Divide, Repeat, Finish
B) Split, Execute, Merge
C) Divide, Conquer, Combine
D) Divide, Loop, Merge
Q&A
Which of the following is NOT typically solved using divide and conquer?
A) Quick Sort
B) Binary Search
C) Strassen’s matrix multiplication
D) Breadth-First Search
Reference
• Varsha H. Patil, “Data Structures Using C++”, Oxford University
Press 2012

Greedy Technique
• Algorithm that works in stages
• At each stage decision is made regarding whether a
particular input is an optimal solution
• Order of input is determined by some selection procedure
• On each step the choice made must be
• feasible, i.e., it has to satisfy the problem’s constraints
• locally optimal, i.e., it has to be the best local choice among all
feasible choices available on that step
• irrevocable, i.e., once made, it cannot be changed on
subsequent steps of the algorithm
Greedy Technique

Greedy Technique
• Problems based on greedy techniques
• Prim’s Algorithm
• Krushkal’s Algorithm
• Huffman Coding
• Dijkstra’s Algorithm
Prim’s Algorithm
• It has direct applications to the design of all kinds of
networks
• including communication, computer, transportation, and
electrical
• It identifies clusters of points in data sets
• It has been used for classification purposes in archeology,
biology, sociology, and other sciences
• It is also helpful for constructing approximate solutions to
more difficult problems such the traveling salesman
problem
Prim’s Algorithm
• Definition: A spanning tree of an undirected connected
graph is its connected acyclic subgraph that contains all
the vertices of the graph
• If such a graph has weights assigned to its edges, a
minimum spanning tree is its spanning tree of the
smallest weight
• The minimum spanning tree problem is the problem of
finding a minimum spanning tree for a given weighted
connected graph
Prim’s Algorithm

Prim’s Algorithm
• If we were to try constructing a minimum spanning tree
by exhaustive search we would face two serious
obstacles
• First, the number of spanning trees grows exponentially with the
graph size
• Second, generating all spanning trees for a given graph is not easy
Prim’s Algorithm
• Prim’s algorithm constructs a minimum spanning tree
through a sequence of expanding subtrees
• The initial subtree in such a sequence consists of a single vertex
selected arbitrarily from the set V of the graph’s vertices
• On each iteration, the algorithm expands the current tree in the
greedy manner by simply attaching to it the nearest vertex
• The algorithm stops after all the graph’s vertices have been
included in the tree being constructed
Prim’s Algorithm

Example
• Find MST using Prim’s Algo

Example

Example

Example
• How efficient is Prim’s algorithm?
• The answer depends on the data structures chosen for the graph itself and
for the priority queue of the set V − V whose vertex priorities are the
T
distances to the nearest tree vertices
• In particular, if a graph is represented by its weight matrix and the
priority queue is implemented as an unordered array the algorithm’s
2
running time will be in Ө(|V | ).
• If a graph is represented by its adjacency lists and the priority queue
is implemented as a min-heap, the running time of the algorithm is in
O(|E| log |V |)
Kruskal’s Algorithm
• The algorithm begins by sorting the graph’s edges in
nondecreasing order of their weights
• Then, starting with the empty subgraph, it scans this
sorted list adding the next edge on the list to the
current subgraph if such an inclusion does not create a
cycle and simply skipping the edge otherwise
Kruskal’s Algorithm

Kruskal’s Algorithm
• Solve using Kruskal’s Algorithm

Kruskal’s Algorithm

Kruskal’s Algorithm

Kruskal’s Algorithm
• On each iteration, the algorithm takes the next edge
(u, v) from the sorted list of the graph’s edges, finds
the trees containing the vertices u and v, and, if these
trees are not the same, unites them in a larger tree by
adding the edge (u, v).
• with an efficient sorting algorithm, the time efficiency of
Kruskal’s algorithm will be in O(|E| log |E|).
Knapsack Problem
• Problem Statement
• We are given n objects and a knapsack or bag.

• Object i has weight w and has capacity m.


i
• If an fraction x , 0≤ x ≤1, of object i is placed into the knapsack, then
i i
profit of p x is earned.
ii
• The objective is to obtain a filling of knapsack that maximize the total
profit earned
Knapsack Problem
• The problem can be stated as





• The profits and weights are positive numbers
Example 1
• Consider the following instance of knapsack problem
• n = 3, m = 20, (p , p , p ) = (25, 24, 15) and (w , w , w ) = (18,
1 2 3 1 2 3
15, 10)
• The feasible solutions are
Example 2
• Consider the following instance of knapsack problem
• n = 3, m = 6, (p , p , p ) = (1, 2, 5) and (w , w , w ) = (2, 3, 4)
1 2 3 1 2 3
Job Scheduling
• Problem Statement:
• Set of n jobs are given.
• Integer deadline d ≥ 0 and profit p ≥ 0, is associated with job i.
i i
• For any job i the profit p is earned iff job is completed by its deadline.
i
• To complete a job, one has to process a job on the machine for one unit of
time.
• Solution:
• The feasible solution for this problem is a subset J of jobs such that each
job in this subset can be completed by its deadline.
• The value of a feasible solution J is the sum of profit of its jobs.
• An optimal solution is a feasible solution with max value
Example-1

• Let n = 4, (p , p , p , p )
1 2 3 4
= (100, 10, 15, 27), (d ,
1
d , d , d ) = (2, 1, 2, 1)
2 3 4
• The feasible solution and
there values are
Example-1

•Greedy Approach
–Sort the jobs according to their profit
–Find the max deadline value
–Assign jobs Jobs
p p p p
1 4 3 2
Profit 100 27 15 10
Deadline 2 1 2 1

p p
4 1
Example 2

• Let n = 4, (p , p , p , p , p ) = (60, 100, 20, 40, 20),


1 2 3 4 5
(d , d , d , d ) = (2, 1, 3, 2, 1)
1 2 3 4
• Sort the jobs according to profit
• Jobs
P p P p p
2 1 4 3 5
Profit 100 60 40 20 20 p p p
2 1 3
Deadlin 1 2 2 3 1
e
Example 3

• Let n = 4, (p , p , p , p ) = (9, 7, 7, 2), (d , d , d ,


1 2 3 4 1 2 3
d ) = (3, 2, 3, 1)
4
• Sort the jobs according to profit
Jobs
p p P p
1 2 3 4
Profit 9 7 7 2
Deadline 3 2 3 1 p p p
3 2 1
Huffman Tree and Code
• Encode a text that comprises from some n-symbol alphabet by
assigning to each of the text’s symbols some sequence of bits
called the codeword
• we can use a fixed-length encoding that assigns to each
symbol a bit string of the same length m
• This idea was used, in particular, in the telegraph code invented
th
in the mid-19 century by Samuel Morse
• In that code, frequent letters such as e (.) and a (.−) are assigned
short sequences of dots and dashes while infrequent letters such
as q (−−.−) and z (−−..) have longer ones.
Huffman Tree and Code
• Variable-length encoding, which assigns codewords
of different lengths to different symbols
• how can we tell how many bits of an encoded text
represent the first symbol?
• To avoid this complication, we can limit ourselves to the
so-called prefix-free (or simply prefix) codes
• In a prefix code, no codeword is a prefix of a codeword of
another symbol
Huffman Tree and Code
• If we want to create a binary prefix code for some
alphabet, it is natural to associate the alphabet’s
symbols with leaves of a binary tree in which all the left
edges are labeled by 0 and all the right edges are
labeled by 1
Huffman Tree and Code
• Huffman’s algorithm
• Step 1 Initialize n one-node trees and label them with the
symbols of the alphabet given. Record the frequency of each
symbol in its tree’s root to indicate the tree’s weight
• Step 2 Repeat the following operation until a single tree is
obtained. Find two trees with the smallest weight Make them the left
and right subtree of a new tree and record the sum of their weights in
the root of the new tree as its weight
Huffman Tree and Code
• Consider the five-symbol alphabet {A, B, C, D, _} with
the following occurrence frequencies in a text made up
of these symbols
Huffman Tree and Code
• The Huffman tree
construction for
this input is
shown in Figure
Huffman Tree and Code

You might also like