Chapter 1: Introduction to data structure
Concept of data
In general, data is a distinct piece of information that is gathered and translated for some
purpose. If data is not formatted in a specific way, it does not valuable to computers or humans.
Data can be available in terms of different forms, such as bits and bytes stored in electronic
memory, numbers or text on pieces of paper, or facts stored in a person's mind. Since the
invention of computers, people have used the word data to mean computer information, and this
information is transmitted or stored.
There are different kinds of data; such are as follows:
o Sound
o Video
o Single character
o Number (integer or floating-point)
o Picture
o Boolean (true or false)
o Text (string)
In a computer's storage, data is stored in the form of a series of binary digits (bits) that contain
the value 1 or 0. The information can be in terms of pictures, text documents, software programs,
audio or video clips, or other kinds of data. The computer data may be stored in files and folders
on the computer's storage, and processed by the computer's CPU, which utilizes logical
operations to generate output (new data) form input data.
As the data is stored on the computer in binary form (zero or one), which can be processed,
created, saved, and stored digitally. This allows data to be sent from one computer to another
with the help of various media devices or a network connection. Furthermore, if you use data
multiple times, it does not deteriorate over time or lose quality.
Data objects
A data object is a structure for describing a data entity by grouping a set of related fields. For
example, a supermarket Online orders application might contain an Customer data object.
The Customer data object includes fields that describe the supermarket's customer, such as First
name, Last name, Full name, Email, and Phone.
Data objects hold data for your application and simplify the organization of fields, user interface
views, and integration settings that your application needs to access the right data at the right
time to successfully resolve a Case.
Each data object is reusable across an application's Case Types, saving development effort as
well as ensuring data consistency.
Data Structure
A data structure is a particular way of organizing data in a computer so that it can be used
effectively. The idea is to reduce the space and time complexities of different tasks.
The choice of a good data structure makes it possible to perform a variety of critical operations
effectively. An efficient data structure also uses minimum memory space and execution time to
process the structure. A data structure is not only used for organizing the data. It is also used
for processing, retrieving, and storing data. There are different basic and advanced types of
data structures that are used in almost every program or software system that has been
developed. So we must have good knowledge of data structures.
Need of Data Structure:
The structure of the data and the synthesis of the algorithm are relative to each other. Data
presentation must be easy to understand so the developer, as well as the user, can make an
efficient implementation of the operation.
Data structures provide an easy way of organizing, retrieving, managing, and storing data.
Here is a list of the needs for data.
Data structure modification is easy.
It requires less time.
Save storage memory space.
Data representation is easy.
Easy access to the large database
Classification/Types of Data Structures:
1. Linear Data Structure
2. Non-Linear Data Structure.
Linear Data Structure:
Elements are arranged in one pattern, also known as linear dimension.
Example: lists, stack, queue, etc.
Non-Linear Data Structure
Elements are arranged in one-many, many-one and many-many dimensions.
Example: tree, graph, table, etc.
In Static data structure the size of the structure is fixed. The content of the data structure can
be modified but without changing the memory space allocated to it.
Example of Static Data Structures: Array
In Dynamic data structure the size of the structure is not fixed and can be modified during
the operations performed on it. Dynamic data structures are designed to facilitate change of
data structures in the run time.
Example of Dynamic Data Structures: Linked List
Advantage of Static data structure:
Fast access time: Static data structures offer fast access time because memory is allocated
at compile-time and the size is fixed, which makes accessing elements a simple indexing
operation.
Predictable memory usage: Since the memory allocation is fixed at compile-time, the
programmer can precisely predict how much memory will be used by the program, which
is an important factor in memory-constrained environments.
Ease of implementation and optimization: Static data structures may be easier to
implement and optimize since the structure and size are fixed, and algorithms can be
optimized for the specific data structure, which reduces cache misses and can increase the
overall performance of the program.
Efficient memory management: Static data structures allow for efficient memory
allocation and management. Since the size of the data structure is fixed at compile-time,
memory can be allocated and released efficiently, without the need for frequent
reallocations or memory copies.
Simplified code: Since static data structures have a fixed size, they can simplify code by
removing the need for dynamic memory allocation and associated error checking.
Reduced overhead: Static data structures typically have lower overhead than dynamic data
structures, since they do not require extra bookkeeping to manage memory allocation and
de-allocation.
Advantage of Dynamic Data Structure:
Flexibility: Dynamic data structures can grow or shrink at runtime as needed, allowing
them to adapt to changing data requirements. This flexibility makes them well-suited for
situations where the size of the data is not known in advance or is likely to change over
time.
Reduced memory waste: Since dynamic data structures can resize themselves, they can
help reduce memory waste. For example, if a dynamic array needs to grow, it can allocate
additional memory on the heap rather than reserving a large fixed amount of memory that
might not be used.
Improved performance for some operations: Dynamic data structures can be more
efficient than static data structures for certain operations. For example, inserting or deleting
elements in the middle of a dynamic list can be faster than with a static array, since the
remaining elements can be shifted over more efficiently.
Simplified code: Dynamic data structures can simplify code by removing the need for
manual memory management. Dynamic data structures can also reduce the complexity of
code for data structures that need to be resized frequently.
Scalability: Dynamic data structures can be more scalable than static data structures, as
they can adapt to changing data requirements as the data grows.
Difference between static and dynamic
Most Popular Data Structures:
1. Array:
An array is a collection of data items stored at contiguous memory locations. The idea is to
store multiple items of the same type together. This makes it easier to calculate the position of
each element by simply adding an offset to a base value, i.e., the memory location of the first
element of the array (generally denoted by the name of the array).
2. Linked Lists:
Like arrays, Linked List is a linear data structure. Unlike arrays, linked list elements are not
stored at a contiguous location; the elements are linked using pointers.
3. Stack:
Stack is a linear data structure which follows a particular order in which the operations are
performed. The order may be LIFO(Last In First Out) or FILO(First In Last Out). In stack, all
insertion and deletion are permitted at only one end of the list.
Stack Operations:
push(): When this operation is performed, an element is inserted into the stack.
pop(): When this operation is performed, an element is removed from the top of the stack
and is returned.
top(): This operation will return the last inserted element that is at the top without
removing it.
size(): This operation will return the size of the stack i.e. the total number of elements
present in the stack.
isEmpty(): This operation indicates whether the stack is empty or not.
4. Queue:
Like Stack, Queue is a linear structure which follows a particular order in which the operations
are performed. The order is First In First Out (FIFO). In the queue, items are inserted at one
end and deleted from the other end. A good example of the queue is any queue of consumers
for a resource where the consumer that came first is served first. The difference between stacks
and queues is in removing. In a stack we remove the item the most recently added; in a queue,
we remove the item the least recently added.
Queue Operations:
Enqueue(): Adds (or stores) an element to the end of the queue..
Dequeue(): Removal of elements from the queue.
Peek() or front(): Acquires the data element available at the front node of the queue
without deleting it.
rear(): This operation returns the element at the rear end without removing it.
isFull(): Validates if the queue is full.
isNull(): Checks if the queue is empty.
5. Trie:
Trie is an efficient information retrieval data structure. Using Trie, search complexities can be
brought to an optimal limit (key length). If we store keys in the binary search tree, a well-
balanced BST will need time proportional to M * log N, where M is maximum string length
and N is the number of keys in the tree. Using Trie, we can search the key in O(M) time.
However, the penalty is on Trie storage requirements.
6. Graph:
Graph is a data structure that consists of a collection of nodes (vertices) connected by edges.
Graphs are used to represent relationships between objects and are widely used in computer
science, mathematics, and other fields. Graphs can be used to model a wide variety of real-
world systems, such as social networks, transportation networks, and computer networks.
Classification of Data Structure
Data structure is normally divided into two broad categories:
◦ Primitive Data Structure
⚫ There are basic structures and directly operated upon by the machine instructions.
⚫ Data structures that are directly operated upon the machine-level instructions are known
as primitive data structures
⚫ Integer, Floating-point number, Character constants, string constants, pointers etc, fall in
this category.
◦ Non-Primitive Data Structure
⚫ There are more sophisticated data structures.
⚫ The Data structures that are derived from the primitive data structures are called Non-
primitive data structure.
⚫ The non-primitive data structures emphasize on structuring of a group of homogeneous
(same type) or heterogeneous (different type) data items.
Persistence
Persistent Data Structures preserve previous versions of themselves allowing us to revisit and
audit any historical version. Depending on the operations allowed on the previous versions,
persistence is classified into three categories
Partially Persistent
Partially Persistent Data Structures allows access to all the historical versions but allows
modification to only the newest one. This typically makes historical versions of the data structure
immutable (read-only).
Fully Persistent
Fully Persistent Data Structures allows access and modification to all the historical versions. It
does not restrict any modifications whatsoever. This means we can typically revisit any historical
version and modify it and thus fork out a new branch.
ephemeral data structure
An ephemeral data structure is one for which only one version is available at a time: after an
update operation, the structure as it existed before the update is lost. A persistent structure is one
where multiple versions are simultaneously accessible: after an update, both old and new
versions can be used.
ADT (Abstract Data type)
In this article, we will learn about ADT but before understanding what ADT is let us consider
different in-built data types that are provided to us. Data types such as int, float, double, long,
etc. are considered to be in-built data types and we can perform basic operations with them such
as addition, subtraction, division, multiplication, etc. Now there might be a situation when we
need operations for our user-defined data type which have to be defined. These operations can be
defined only as and when we require them. So, in order to simplify the process of solving
problems, we can create data structures along with their operations, and such data structures that
are not in-built are known as Abstract Data Type (ADT).
Abstract Data type (ADT) is a type (or class) for objects whose behavior is defined by a set of
values and a set of operations. The definition of ADT only mentions what operations are to be
performed but not how these operations will be implemented. It does not specify how data will
be organized in memory and what algorithms will be used for implementing the operations. It is
called “abstract” because it gives an implementation-independent view.
The process of providing only the essentials and hiding the details is known as abstraction.
The user of data type does not need to know how that data type is implemented, for example, we
have been using Primitive values like int, float, char data types only with the knowledge that
these data type can operate and be performed on without any idea of how they are implemented.
Features of ADT:
Abstract data types (ADTs) are a way of encapsulating data and operations on that data into a
single unit. Some of the key features of ADTs include:
Abstraction: The user does not need to know the implementation of the data structure only
essentials are provided.
Better Conceptualization: ADT gives us a better conceptualization of the real world.
Robust: The program is robust and has the ability to catch errors.
Encapsulation: ADTs hide the internal details of the data and provide a public interface for
users to interact with the data. This allows for easier maintenance and modification of the
data structure.
Data Abstraction: ADTs provide a level of abstraction from the implementation details of
the data. Users only need to know the operations that can be performed on the data, not how
those operations are implemented.
Data Structure Independence: ADTs can be implemented using different data structures,
such as arrays or linked lists, without affecting the functionality of the ADT.
Information Hiding: ADTs can protect the integrity of the data by allowing access only to
authorized users and operations. This helps prevent errors and misuse of the data.
Modularity: ADTs can be combined with other ADTs to form larger, more complex data
structures. This allows for greater flexibility and modularity in programming.
Advantages:
Encapsulation: ADTs provide a way to encapsulate data and operations into a single unit,
making it easier to manage and modify the data structure.
Abstraction: ADTs allow users to work with data structures without having to know the
implementation details, which can simplify programming and reduce errors.
Data Structure Independence: ADTs can be implemented using different data structures,
which can make it easier to adapt to changing needs and requirements.
Information Hiding: ADTs can protect the integrity of data by controlling access and
preventing unauthorized modifications.
Modularity: ADTs can be combined with other ADTs to form more complex data structures,
which can increase flexibility and modularity in programming.
Disadvantages:
Overhead: Implementing ADTs can add overhead in terms of memory and processing,
which can affect performance.
Complexity: ADTs can be complex to implement, especially for large and complex data
structures.
Learning Curve: Using ADTs requires knowledge of their implementation and usage, which
can take time and effort to learn.
Limited Flexibility: Some ADTs may be limited in their functionality or may not be suitable
for all types of data structures.
Cost: Implementing ADTs may require additional resources and investment, which can
increase the cost of development.