S.Y.B.Sc. (Computer Science) - Sem III


Course Type: Major Core
Course Code: CS-202-MJ-T
Course Title: Database Management System I

Chapter 1 : Introduction to DBMS (3 Hours)


1.1 Introduction to Data, Database and DBMS.
1.2 File system vs DBMS
1.3 Levels of abstraction and data independence
1.4 Architectures of DBMS
1.5 Users of DBMS
1.6 Advantages and Disadvantages of DBMS
1.7 Applications of DBMS

1.1 Introduction to Data, Database and DBMS


Data :
 Data are raw facts that can be recorded and that have meaning.
 In a database context, data refers to the information stored within the database. This
information can be of various types, such as text, numbers, dates, images, videos and
files. It is organized into structures such as tables for efficient storage, retrieval
and manipulation.

Database :
A database is a structured collection of data designed to meet the data management
needs of an organization. Databases are used to store, manage, and retrieve
information.

Database Management System (DBMS) :


 A database-management system (DBMS) is a collection of interrelated data and a
set of programs to access the data.
 The collection of data, usually referred to as the database, contains information
relevant to a domain.
 The primary goal of a DBMS is to provide a way to store and retrieve database
information that is both convenient and efficient.
 Database systems are designed to manage large bodies of information.

ACA, Department of Computer Science, GES’s HPT Arts and RYK Science College, Nashik-05 Page No. 1
 Management of data involves :
 defining structures for the storage of information
 providing mechanisms for the manipulation of information
 ensuring the safety of the information stored, despite system crashes or
attempts at unauthorized access. If data are to be shared among several
users, the system must also avoid possible anomalous results.
 Thus DBMS allows users to create, modify and query databases while ensuring
data integrity, security and efficient data access.
Key Functions of a DBMS :
 Data Storage and Retrieval : DBMS provides the mechanisms for storing and
retrieving data efficiently.
 Data Definition : DBMS allows users to define the structure of the database
(schema) using languages like SQL.
 Data Manipulation : DBMS provides tools for manipulating data (adding,
updating, deleting).
 Data Integrity : DBMS enforces rules and constraints to ensure the accuracy
and consistency of data.
 Security : DBMS manages access control and user authentication to protect
data from unauthorized access.
 Backup and Recovery : DBMS provides mechanisms for backing up and
restoring data in case of failures.
 Concurrency Control : DBMS manages simultaneous access to data by
multiple users.
In essence, a DBMS acts as a bridge between users and data, providing a
structured and controlled environment for data management.
 Unlike a traditional file system, a DBMS minimizes data redundancy, prevents
inconsistencies and simplifies data management with features such as concurrent
access and backup mechanisms.
 DBMS plays a vital role in supporting data-driven decision-making and operational
efficiency.
 DBMS products include relational systems (such as Oracle, MySQL, and PostgreSQL)
and NoSQL systems (such as MongoDB and Cassandra).
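The key functions listed above (data definition, data manipulation, retrieval and data integrity) can be illustrated with a small sketch. It uses Python's built-in sqlite3 module purely as a convenient stand-in for a DBMS; the student table, its columns and the sample names are assumptions made for illustration and do not come from these notes.

```python
import sqlite3


def demo_key_functions():
    """Run a tiny define / manipulate / query cycle on an in-memory database."""
    conn = sqlite3.connect(":memory:")

    # Data definition: describe the structure (schema) and its constraints.
    conn.execute(
        "CREATE TABLE student ("
        " roll_no INTEGER PRIMARY KEY,"
        " name TEXT NOT NULL,"
        " marks INTEGER CHECK (marks BETWEEN 0 AND 100))"
    )

    # Data manipulation: add and update rows.
    conn.execute("INSERT INTO student VALUES (1, 'Asha', 82)")
    conn.execute("INSERT INTO student VALUES (2, 'Ravi', 67)")
    conn.execute("UPDATE student SET marks = 70 WHERE roll_no = 2")

    # Data integrity: the DBMS itself rejects a row that breaks the CHECK rule.
    try:
        conn.execute("INSERT INTO student VALUES (3, 'Meera', 150)")
        rejected = False
    except sqlite3.IntegrityError:
        rejected = True

    # Data retrieval: query the stored rows.
    rows = conn.execute(
        "SELECT roll_no, name, marks FROM student ORDER BY roll_no"
    ).fetchall()
    conn.close()
    return rows, rejected
```

Calling demo_key_functions() returns the two valid rows together with a flag showing that the invalid row was refused by the database rather than by application code.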

1.2 File System vs DBMS


A file system and a Database Management System (DBMS) are both methods to store
and manage data, but they differ significantly in their structure, capabilities, and
intended use.
File System :
 File systems are used to manage files and directories stored on disk and provide
basic operations for creating, deleting, renaming, and accessing files.
 They typically store data in a hierarchical structure, where files are organized in
directories and subdirectories.
 File systems are simple and efficient, but they lack the ability to manage complex
data relationships and ensure data consistency.
DBMS :
 A DBMS is a software system designed to manage large volumes of structured,
interrelated data.
 It provides basic and advanced operations for storing, retrieving, and manipulating
data.
 DBMS provides a centralized and organized way of storing data in relations/tables
which can be accessed and modified by multiple users or applications.
 It offers advanced features like data validation, indexing, transactions, concurrency
control, backup and recovery mechanisms.
 DBMS ensures data consistency, accuracy, and integrity by enforcing data
constraints, such as primary keys, foreign keys, and data types.
 Comparison between File System and DBMS :

Particulars | File System | DBMS
Storage Structure | Data is stored in files and folders, organized in a hierarchical structure on a storage medium of a computer. | A DBMS is application software that provides an efficient and effective environment to create, store, retrieve and manage data stored in tables / relations.
Relationships | Limited or no inherent relationships between data items. | Explicitly defined and enforced relationships, ensuring data consistency.
Scalability | Limited scalability. Performance degrades with increasing data volume. | High. Designed to handle large volumes of data and can be optimized for performance.
Data Redundancy | Prone to data redundancy, as the same data might be stored in multiple files. | Data redundancy is minimized through normalization and other techniques.
Data Consistency | There is less data consistency. | More data consistency, due to the process of normalization.
Data Integrity | Lacks built-in mechanisms for enforcing data consistency and accuracy. | Offers features for enforcing data integrity using constraints and data types.
Data Independence | There is no data independence. | Data independence exists, mainly of two types: Logical Data Independence and Physical Data Independence.
Access | Operating system commands. | Data is accessed and manipulated using a database language such as SQL.
Backup and Recovery | It does not provide an inbuilt mechanism for backup and recovery of data if it is lost. | It provides built-in tools for backup and recovery of data, even if it is lost.
Query processing | There is no efficient query processing in the file system. | Efficient query processing.
Complexity | Less complex as compared to DBMS. | More complex to handle the data.
Security Constraints | Less data security is provided. | More data security mechanisms.
Cost | Less expensive than DBMS. | It has a comparatively higher cost than a file system.
User Access | Concurrent access is not coordinated; typically only one user can safely access the data at a time. | Multiple users can access data at a time.
Meaning | The users are not required to write procedures. | The user has to write procedures for managing databases.
Sharing | Data is distributed in many files, so it is not easy to share data. | Due to its centralized nature, data sharing is easy.
Data Abstraction | It gives details of the storage and representation of data. | It hides the internal details of the database.
Integrity Constraints | Integrity constraints are difficult to implement. | Integrity constraints are easy to implement.
Attributes | To access data in a file, the user requires attributes such as file name and file location. | No such attributes are required.
Example | File handling in languages such as COBOL and C++ | Oracle, MySQL, PostgreSQL
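The Relationships and Data Integrity rows of the comparison can be made concrete with a short sketch. It assumes a CSV file (held in memory here) as the file-based store and SQLite as the DBMS; the function names and the non-existent department number 99 are illustrative assumptions. The file accepts an employee row that points to a department that does not exist, while the DBMS rejects the same row through a foreign key.

```python
import csv
import io
import sqlite3


def file_accepts_orphan():
    """A plain file stores whatever is written; it knows nothing about departments."""
    buf = io.StringIO()
    writer = csv.writer(buf)
    writer.writerow(["eid", "ename", "dno"])
    writer.writerow([1, "Mr. Patil", 99])  # department 99 exists nowhere
    return len(buf.getvalue().splitlines()) - 1  # number of data rows stored


def dbms_rejects_orphan():
    """The same row violates a foreign key, so the DBMS refuses to store it."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
    conn.execute("CREATE TABLE dept (dno INTEGER PRIMARY KEY, dname TEXT NOT NULL)")
    conn.execute(
        "CREATE TABLE emp ("
        " eid INTEGER PRIMARY KEY,"
        " ename TEXT NOT NULL,"
        " dno INTEGER NOT NULL REFERENCES dept(dno))"
    )
    conn.execute("INSERT INTO dept VALUES (1, 'HR')")
    try:
        conn.execute("INSERT INTO emp VALUES (1, 'Mr. Patil', 99)")
        return False
    except sqlite3.IntegrityError:
        return True
    finally:
        conn.close()
```

With a file, catching the bad row would be the job of every program that reads or writes the file; with a DBMS the rule is declared once in the schema.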

1.3 Levels of Data Abstraction and Data Independence :
A Database System (DBS) is a collection of interrelated data and a set of programs that
allow users to access and modify the data. A major purpose of a database system is to
provide users with an abstract view of the data. Thus the system hides certain details of
how the data are stored and maintained.

1.3.1 Data Abstraction :


Data Abstraction is the process of hiding unwanted and irrelevant details from the end
user. It lets the end user access the data that is necessary without seeing how that
data is stored in the database. Data abstraction also helps to keep data secure from
unauthorized access, because it hides all the implementation details.
Thus Data Abstraction :
 shows only relevant data to users by hiding unwanted and irrelevant details.
 keeps data safely by hiding where and how data is stored.
 simplifies access while maintaining data security and efficiency.
There are three levels of Data Abstraction :
 Physical ( Internal ) level
 Logical ( Conceptual ) level
 View ( External ) level

Figure : The three levels of Data Abstraction

Physical ( Internal ) level :


 It is the lowest level of abstraction, which describes how the data is actually stored
on the storage medium.
 The physical level describes the physical storage structure of data in the database.
 It also describes how data is retrieved from the database, and any compression
and encryption techniques applied to the data.
 It deals with physical storage details such as files, indexing, and compression.
 It describes complex low-level data structures in detail, specifying how the
relations described in the conceptual schema are actually stored on secondary
storage devices such as disks and tapes.
 It also decides what file organizations to use to store the relations, and creates
auxiliary data structures, called indexes, to speed up data retrieval operations.

Logical ( Conceptual ) level :


 It is the next-higher level of abstraction which describes what data are stored in the
database and what relationships exist among those data.
 The logical level thus describes the entire database in terms of a small number of
relatively simple structures.

View level :
 The highest level of abstraction describes only part of the entire database.
 Even though the logical level uses simpler structures, complexity remains because of
the variety of information stored in a large database.
 Many users of the database system need to access only a part of the database.
 The view level of abstraction exists to simplify their interaction with the system.
 The system may provide many views for the same database.
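As a rough illustration of the view level, the sketch below defines a view over an employee table so that a user of the view sees only part of the stored data. SQLite (through Python's sqlite3 module) is used only as an example system; the view name emp_directory and the choice to hide the salary column are assumptions made for this sketch.

```python
import sqlite3


def build_view_demo():
    """Create a table (logical level) and a narrower view of it (view level)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE emp (eid INTEGER PRIMARY KEY, ename TEXT, desig TEXT, salary INTEGER)"
    )
    conn.executemany(
        "INSERT INTO emp VALUES (?, ?, ?, ?)",
        [(1, "Mr. Patil", "supervisor", 25000), (2, "Mrs. Sinha", "asst. mgr", 35000)],
    )

    # The view exposes only the columns this class of user needs; salary stays hidden.
    conn.execute("CREATE VIEW emp_directory AS SELECT eid, ename, desig FROM emp")

    cur = conn.execute("SELECT * FROM emp_directory ORDER BY eid")
    columns = [d[0] for d in cur.description]
    rows = cur.fetchall()
    conn.close()
    return columns, rows
```

Several such views can be defined over the same table, one per group of users, without storing the data more than once.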

A Database Schema : The overall design of a database is called the Database Schema.
It describes
 how the data will be organized,
 how the relationships between different entities will be maintained,
 how constraints will be applied.
Database Schemas are important to define the structure of the database and once
defined, they remain relatively stable over time.
Example:
dept (dno, dname, no_emp)
emp (eid, ename, desig, salary, dno)
dept and emp are related to each other by a one-to-many (binary) relationship: one
department has many employees, and emp.dno refers to dept.dno.
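The schema above could be written down in SQL roughly as follows. This is a sketch under assumptions: the column types and constraints are guesses, SQLite is used only as a convenient example system, and emp is given a dno column (as in the instance example of these notes) so that the one-to-many relationship has something to refer to.

```python
import sqlite3

# Assumed column types; the notes give only the column names.
SCHEMA = """
CREATE TABLE dept (
    dno    INTEGER PRIMARY KEY,
    dname  TEXT NOT NULL,
    no_emp INTEGER
);
CREATE TABLE emp (
    eid    INTEGER PRIMARY KEY,
    ename  TEXT NOT NULL,
    desig  TEXT,
    salary INTEGER,
    dno    INTEGER REFERENCES dept(dno)  -- many employees belong to one department
);
"""


def column_names(conn, table):
    """Read column names back from the catalog (table_info is an SQLite-specific pragma)."""
    return [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
dept_cols = column_names(conn, "dept")
emp_cols = column_names(conn, "emp")
conn.close()
```

The schema holds no rows at all; it only fixes the structure that every later state of the database must follow.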

An Instance : The collection of information stored in the database at a particular
moment is called an instance. It represents the current state of the database, reflecting
the values of attributes and the content of tables. The data items in a record can be
inserted, modified, or deleted at any time. Thus the data can change from one state to
another.
Example:
dept :
dno | dname | no_emp
1 | HR | 3
2 | Production | 3
emp :
eid | ename | desig | salary | dno
1 | Mr. Patil | supervisor | 25000 | 2
2 | Mrs. Sinha | asst. mgr | 35000 | 1
3 | Mr. Verma | accountant | 30000 | 1
4 | Mr. Patil | mgr | 45000 | 2
5 | Mrs. Dixit | mgr | 42000 | 1
6 | Mrs. Borse | store incharge | 30000 | 2
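The difference between a schema and an instance can be seen by loading the rows above and then changing one of them. The sketch below is only illustrative: it assumes SQLite as the DBMS, the column types are guesses, and the extra employee added in the usage note is hypothetical.

```python
import sqlite3

DEPT_ROWS = [(1, "HR", 3), (2, "Production", 3)]
EMP_ROWS = [
    (1, "Mr. Patil", "supervisor", 25000, 2),
    (2, "Mrs. Sinha", "asst. mgr", 35000, 1),
    (3, "Mr. Verma", "accountant", 30000, 1),
    (4, "Mr. Patil", "mgr", 45000, 2),
    (5, "Mrs. Dixit", "mgr", 42000, 1),
    (6, "Mrs. Borse", "store incharge", 30000, 2),
]


def load_instance():
    """Create the schema, then fill it with the instance shown in the notes."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        "CREATE TABLE dept (dno INTEGER PRIMARY KEY, dname TEXT, no_emp INTEGER);"
        "CREATE TABLE emp (eid INTEGER PRIMARY KEY, ename TEXT, desig TEXT,"
        " salary INTEGER, dno INTEGER REFERENCES dept(dno));"
    )
    conn.executemany("INSERT INTO dept VALUES (?, ?, ?)", DEPT_ROWS)
    conn.executemany("INSERT INTO emp VALUES (?, ?, ?, ?, ?)", EMP_ROWS)
    return conn


def employees_per_dept(conn):
    """Count employees per department from the current instance."""
    return conn.execute(
        "SELECT dno, COUNT(*) FROM emp GROUP BY dno ORDER BY dno"
    ).fetchall()
```

For the instance shown, employees_per_dept returns three employees for each department, which agrees with the no_emp column. Inserting one more employee into department 1 changes the instance (the count becomes four) while the schema stays exactly the same.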

Data Independence
Data independence in a Database Management System (DBMS) is the ability to modify
the database schema (structure) at one level without affecting the schema at another
level. It is achieved through use of the three levels of data abstraction.

Figure : Logical and Physical Data Independence

Logical Data Independence : It is the ability to change the logical structure (tables,
columns, relationships) without affecting external views or application programs. Its
purpose is to allow the database structure to evolve without impacting user access or
requiring changes in application code. This is the independence to change the conceptual
schema without having to change the external schemas and their application programs.
Example:
 Adding new columns in an existing table/relation.
 Creating a new relationship between two tables.
 Merging two tables into a view for simplification.
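The first example (adding a new column) can be tried out directly. In the sketch below, which assumes SQLite and uses hypothetical names (the emp_public view and the email column), an external view is queried before and after the logical schema is changed; the view returns the same rows both times.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE emp (eid INTEGER PRIMARY KEY, ename TEXT, desig TEXT, salary INTEGER)"
)
conn.executemany(
    "INSERT INTO emp VALUES (?, ?, ?, ?)",
    [(1, "Mr. Patil", "supervisor", 25000), (2, "Mrs. Sinha", "asst. mgr", 35000)],
)
# External schema: what one group of users or programs is written against.
conn.execute("CREATE VIEW emp_public AS SELECT eid, ename, desig FROM emp")


def external_view():
    """The query an existing application would keep running unchanged."""
    return conn.execute("SELECT * FROM emp_public ORDER BY eid").fetchall()


before = external_view()
# Change to the conceptual (logical) schema: a new column is added to the table.
conn.execute("ALTER TABLE emp ADD COLUMN email TEXT")
after = external_view()
```

Because the application depends on the view and not on the full table definition, before and after are identical even though the table now has an extra column.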

Physical Data Independence : It is the ability to change the physical storage of data
without affecting the logical schema. The purpose is to improve performance, storage
efficiency, or hardware configurations without changing how the data is structured
logically. This is the independence to change the internal schema without having to
change the conceptual schema.
Example:
 Moving data files from one drive to another, e.g. from drive C: to drive D:.
 Creating an index to speed up queries.
 Switching from HDD to SSD for better performance.
 Compressing data files to save space.
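The index example can be sketched as follows, assuming SQLite as the DBMS. The same query is run before and after an index is created: the rows returned do not change, only the access path chosen internally does. The index name idx_emp_dno is an assumption, and EXPLAIN QUERY PLAN is an SQLite-specific statement whose exact wording varies between versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (eid INTEGER PRIMARY KEY, ename TEXT, dno INTEGER)")
conn.executemany(
    "INSERT INTO emp VALUES (?, ?, ?)",
    [(1, "Mr. Patil", 2), (2, "Mrs. Sinha", 1), (3, "Mr. Verma", 1), (6, "Mrs. Borse", 2)],
)

QUERY = "SELECT eid, ename FROM emp WHERE dno = 2"


def run_query():
    """The logical request: unchanged by any physical tuning."""
    return sorted(conn.execute(QUERY).fetchall())


def plan_text():
    """Human-readable access path; the detail text is the last column of each row."""
    return " ".join(str(row[-1]) for row in conn.execute("EXPLAIN QUERY PLAN " + QUERY))


rows_before, plan_before = run_query(), plan_text()
# Physical-level change: add an auxiliary structure to speed up lookups by dno.
conn.execute("CREATE INDEX idx_emp_dno ON emp (dno)")
rows_after, plan_after = run_query(), plan_text()
```

The query text and the table definition are untouched, which is the point of physical data independence: tuning the storage side does not force any change at the logical level.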

1.4 Architectures of DBMS


Databases :
A database is a system for storing, managing, and retrieving the data. It organizes the
information in a way to make it easy to access and use for various applications.
There are two types of databases :
 Centralized Databases
 Decentralized (Distributed) Databases
Centralized Databases :
In a centralized database, the complete data is stored on a single location, such as a
server. Users can access this database through a network. Centralized databases are
easier to manage and secure. However, if the server fails, the entire system may stop
working.
Decentralized (Distributed) Databases :
A decentralized database stores data across multiple locations or servers. Each server
holds part of the data and works independently. This system is more reliable because
even if one server fails, others continue to function. However, managing decentralized
databases can be more complex.
A Database System is partitioned into modules that deal with each of the responsibilities
of the overall system. The functional components of a database system can be broadly
divided into the storage manager and the query processor.

Figure : Database System Structure


Storage Manager :
 The storage manager provides the interface between the low-level data stored in the
database and the application programs and queries submitted to the system.
 It is responsible for the interaction with the file manager.
 It is also responsible for storing, retrieving, and updating data in the database.
 The storage manager includes the following components :

 Authorization and Integrity manager : It tests for the satisfaction of integrity
constraints and checks the authority of users to access data.
 Transaction manager : It ensures that the database remains in a consistent
(correct) state despite system failures, and that concurrent transaction executions
proceed without conflicting.
 File manager : It manages the allocation of space on disk storage and the data
structures used to represent information stored on disk.
 Buffer manager : It is responsible for fetching data from disk storage into main
memory, and deciding what data to cache in main memory. The buffer manager
enables the database to handle data that is much larger than the size of
main memory.
The storage manager implements several data structures as part of the physical
system implementation:
 Data files : which store the database itself.
 Data dictionary : which stores metadata (Data about the data) i.e. the structure of
the database and the schema of the database.
 Indices : which can provide fast access to data items. A database index provides
pointers to those data items that hold a particular value. Hashing is an alternative
to indexing that is faster in some but not all cases.
 Statistical Data : It stores statistical information about the data in the database.
This information is used by the query processor to select efficient ways to execute a
query.
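Many systems let users query the data dictionary like any other table. As one example, SQLite keeps its schema metadata in a catalog table called sqlite_master; other systems use different catalog names, so the sketch below is specific to SQLite. The table and index created here are illustrative assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE dept (dno INTEGER PRIMARY KEY, dname TEXT, no_emp INTEGER)")
conn.execute("CREATE INDEX idx_dept_dname ON dept (dname)")

# The catalog holds data about the data: object type, name and defining statement.
catalog = conn.execute(
    "SELECT type, name, sql FROM sqlite_master ORDER BY name"
).fetchall()
objects = [(obj_type, name) for obj_type, name, _sql in catalog]
conn.close()
```

Each catalog row describes one object of the database (here one table and one index), not any of the data stored in it.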

The Query Processor :


The query processor includes the following components :
 DDL Interpreter : It interprets DDL (Data Definition Language) statements and
records the definitions in the data dictionary.
 DML Compiler : It translates DML (Data Manipulation Language) statements in a
query language into an evaluation plan consisting of low-level instructions that the
query evaluation engine understands. A query can usually be translated into any of a
number of alternative evaluation plans that all give the same result. The DML
compiler also performs query optimization. It picks the lowest cost evaluation plan
from among the alternatives.
 Embedded DML Pre-compiler : It converts DML statements embedded in an
application program to normal procedure calls in the host language. The pre-compiler
must interact with the DML compiler to generate the appropriate code.
 Query Evaluation Engine : It executes low-level instructions generated by the DML
compiler.
 Query Optimizer : The query optimizer is responsible for improving query
performance. It evaluates multiple ways to execute a query and selects the most
efficient one. This reduces processing time, so the query execution is done quickly.
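The idea that a query is translated into low-level instructions for the evaluation engine can be observed in SQLite, which compiles each statement into a small program for its internal virtual machine. The sketch below is specific to SQLite and is offered only as one concrete example of an evaluation plan; the EXPLAIN output format is intended for inspection and may differ between versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (eid INTEGER PRIMARY KEY, ename TEXT, salary INTEGER)")

# EXPLAIN returns the compiled program instead of running the query.
program = conn.execute(
    "EXPLAIN SELECT ename FROM emp WHERE salary > 30000"
).fetchall()

# Each row is one instruction; the second column is the operation name.
opcodes = [row[1] for row in program]
conn.close()
```

Printing opcodes shows a sequence of simple steps (opening the table, stepping through rows, comparing values, emitting results) that the engine executes in place of the declarative SQL text.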

DBMS Architecture :
 The DBMS architecture refers to the structural design and its interconnected
components that manage and maintain databases efficiently and effectively.
 The widely used approach is the client-server architecture. Client and Server
components are separated to streamline data handling, application logic, and user
interactions.
 Studying the DBMS architecture helps us to understand the components of a database
system and the relations among them.
 The architecture of a DBMS depends on the computer system on which it runs.
 For example, in a client-server DBMS architecture, the database system on the server
machine serves the requests made by the client machines.
Types of DBMS Architecture
 Two Tier Architecture
 Three Tier Architecture
Two tier Architecture :

Figure : Two-Tier Architecture


 In this architecture, the database application is partitioned into two parts.
 The two-tier architecture is based on the client-server model.
 The database system resides on the server machine and the application resides on
the client machine. The two machines are connected with each other through a
reliable network, as shown in the figure.
 Whenever the client machine makes a request to access the database on the server
using a query language such as SQL, the server performs the request on the database
and returns the result to the client.
 Communication takes place directly between client and server. Application
connection interfaces such as JDBC and ODBC are used for the interaction between
client and server.
 There is no intermediate layer between client and server.

Three tier Architecture :

Figure : Three-Tier Architecture


 Three tier architecture adds an intermediate layer between the client and the
database server.
 This intermediate layer is called the application server or the Web server, depending
on the application.
 Here, the client application does not communicate directly with the database system
on the server machine. Instead, the client application communicates with the server
application, and the server application in turn communicates with the database
system to access data.
 Three-tier architecture has three layers :
1. Client (Presentation) layer
2. Application (Business logic) layer
3. Database (Data) layer

These layers are designed to separate concerns and improve the overall functionality,
security, and maintainability of the database system.
Presentation Layer (Client Layer) :
This layer is the user interface (UI) that interacts directly with the user. It handles user
input and displays output.
Application Layer (Business Logic Layer) :
This layer contains the business rules and logic of the application. It processes user
requests, interacts with the data layer to retrieve or update data, and sends results back
to the presentation layer.
Data Layer (Database Layer) :
This layer stores and manages the data. It handles data storage, retrieval, update, and
deletion operations. It is responsible for the physical storage and access of the database.
This separation allows for independent development, maintenance, and scalability of
each layer, making it easier to manage and adapt the system to changing needs.
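The separation of the three layers can be sketched in a few lines. This is a simplified, single-process illustration and not a real networked deployment: the class and function names (EmployeeStore, PayrollService, handle_request) and the rule limiting a raise to 50 percent are assumptions invented for the example, and SQLite stands in for the database server.

```python
import sqlite3


class EmployeeStore:
    """Data layer: the only code that talks to the database."""

    def __init__(self):
        self._conn = sqlite3.connect(":memory:")
        self._conn.execute(
            "CREATE TABLE emp (eid INTEGER PRIMARY KEY, ename TEXT, salary INTEGER)"
        )

    def add(self, eid, ename, salary):
        with self._conn:  # commits on success
            self._conn.execute("INSERT INTO emp VALUES (?, ?, ?)", (eid, ename, salary))

    def salary_of(self, eid):
        row = self._conn.execute("SELECT salary FROM emp WHERE eid = ?", (eid,)).fetchone()
        return None if row is None else row[0]

    def set_salary(self, eid, salary):
        with self._conn:
            self._conn.execute("UPDATE emp SET salary = ? WHERE eid = ?", (salary, eid))


class PayrollService:
    """Application layer: business rules, no SQL and no user interface."""

    def __init__(self, store):
        self._store = store

    def give_raise(self, eid, percent):
        if not 0 < percent <= 50:  # hypothetical business rule
            raise ValueError("raise must be between 1 and 50 percent")
        current = self._store.salary_of(eid)
        if current is None:
            raise LookupError(f"no employee with id {eid}")
        new_salary = current + current * percent // 100
        self._store.set_salary(eid, new_salary)
        return new_salary


def handle_request(service, eid, percent):
    """Presentation layer: turns a user request into text shown to the user."""
    try:
        return f"New salary for employee {eid}: {service.give_raise(eid, percent)}"
    except (ValueError, LookupError) as exc:
        return f"Request rejected: {exc}"
```

The presentation function never sees SQL and the data layer never sees business rules, so each layer can be changed or replaced without rewriting the other two.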

1.5. Users of DBMS :


There are four different types of database-system users, differentiated by the way
they expect to interact with the system. Different types of user interfaces have
been designed for the different types of users :
 Naive Users
 Application programmers
 Sophisticated users
 Specialized users

Naive users : These are the unsophisticated users who interact with the system by
invoking one of the application programs that have been written previously. The typical
user interface for naive users is a forms interface, where the user can fill in appropriate
fields of the form. Naive users may also simply read reports generated from the database.

Application programmers : These are the computer professionals who write application
programs. Application programmers can choose from many tools to develop user
interfaces. Rapid Application Development (RAD) tools are tools that enable an
application programmer to construct forms and reports with minimal programming effort.

Sophisticated users : These interact with the system without writing programs. Instead,
they form their requests either using a database query language or by using tools such
as data analysis software. Analysts who submit queries to explore data in the database
fall in this category.

Specialized users : These are sophisticated users who write specialized database
applications that do not fit into the traditional data-processing framework. Among these
applications are computer-aided design systems, knowledge-base and expert systems,
systems that store data with complex data types (for example, graphics data and audio
data), and environment-modeling systems.

Database Administrator (DBA) :


A person who has central control over both the data and the programs that access
those data is called a database administrator (DBA).
The functions of a DBA include :
 Schema definition : The DBA creates the original database schema by executing
a set of data definition statements in the DDL.
 Storage structure and access-method definition.
 Schema and physical-organization modification : The DBA carries out changes
to the schema and physical organization to reflect the changing needs of the
organization, or to alter the physical organization to improve performance.
 Granting of authorization for data access : By granting different types of
authorization, the database administrator can regulate which parts of the database
various users can access. The authorization information is kept in a special system
structure that the database system consults whenever someone attempts to access
the data in the system.
 Routine maintenance : The DBA’s routine maintenance activities are:
 Periodically backing up the database, either onto tapes or onto remote servers, to
prevent loss of data in case of disasters such as flooding.
 Ensuring that enough free disk space is available for normal operations, and
upgrading disk space as required.
 Monitoring jobs running on the database and ensuring that performance is not
degraded by very expensive tasks submitted by some users.
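As a small illustration of the routine backup task, the sketch below copies a database with the backup method that Python's sqlite3 connections provide. It is a sketch under assumptions: both databases are kept in memory so that the example is self-contained, whereas a real backup would be written to a file on separate storage, and the helper names are invented for the example.

```python
import sqlite3


def make_source():
    """Create a small committed database to be backed up."""
    conn = sqlite3.connect(":memory:")
    with conn:  # commit the setup so the backup sees it
        conn.execute("CREATE TABLE dept (dno INTEGER PRIMARY KEY, dname TEXT)")
        conn.executemany(
            "INSERT INTO dept VALUES (?, ?)", [(1, "HR"), (2, "Production")]
        )
    return conn


def take_backup(source):
    """Copy every page of the source database into a separate database."""
    target = sqlite3.connect(":memory:")
    source.backup(target)
    return target


def dept_names(conn):
    return [row[0] for row in conn.execute("SELECT dname FROM dept ORDER BY dno")]
```

If rows are later deleted from the source by mistake, the copy returned by take_backup still holds them, which is what makes recovery possible.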

1.6 Advantages and Disadvantages of DBMS :


Advantages of DBMS :
 Data independence: Application programs should be as independent as possible
from details of data representation and storage. The DBMS can provide an abstract
view of the data to insulate application code from such details.
 Efficient data access: A DBMS utilizes a variety of sophisticated techniques to store
and retrieve data efficiently. This feature is especially important if the data is stored
on external storage devices.
 Data integrity and security: The DBMS can enforce integrity constraints on the
data. The DBMS can enforce access controls that govern what data is visible to
different classes of users.
 Data administration: When several users share the data, centralizing the
administration of data can offer significant improvements. It can be used for
organizing the data representation to minimize redundancy and for fine-tuning the
storage of the data to make retrieval efficient.
 Concurrent access and crash recovery: A DBMS schedules concurrent accesses
to the data in such a manner that users can think of the data as being accessed by
only one user at a time. Further, the DBMS protects users from the effects of system
failures.
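 Reduced application development time: The DBMS supports many important
functions that are common to many applications accessing data stored in the DBMS.
Because these functions do not have to be rewritten for each application,
applications can be developed more quickly.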
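The protection from failures mentioned under concurrent access and crash recovery rests on transactions: either all changes of a unit of work are kept or none are. A minimal sketch, assuming SQLite and a hypothetical account table with a rule that balances cannot go negative:

```python
import sqlite3


def open_bank():
    conn = sqlite3.connect(":memory:")
    with conn:  # commit the initial state
        conn.execute(
            "CREATE TABLE account (id INTEGER PRIMARY KEY,"
            " balance INTEGER CHECK (balance >= 0))"
        )
        conn.executemany("INSERT INTO account VALUES (?, ?)", [(1, 100), (2, 50)])
    return conn


def transfer(conn, src, dst, amount):
    """Move money between accounts as one all-or-nothing transaction."""
    try:
        with conn:  # commits if the block succeeds, rolls back if it raises
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE id = ?", (amount, dst)
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE id = ?", (amount, src)
            )
        return True
    except sqlite3.IntegrityError:
        return False


def balances(conn):
    return conn.execute("SELECT id, balance FROM account ORDER BY id").fetchall()
```

When the debit step fails because the source account would go negative, the credit that was already applied is undone as well, so the database never shows money that appeared from nowhere.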

Disadvantages of DBMS
 A DBMS is a complex piece of software, optimized for certain kinds of workloads and
its performance may not be adequate for certain specialized applications.
Example : Applications with tight real-time constraints or just a few well-defined
critical operations for which efficient custom code must be written.
 An application may need to manipulate the data in ways not supported by the
query language. In such a situation, a DBMS is not useful because the abstract view of
the data presented by the DBMS does not match the application's needs and actually
gets in the way.
Example : Relational databases traditionally offer limited support for flexible
analysis of text data.
 If specialized performance or data manipulation requirements are central to an
application, the application may choose not to use a DBMS, especially when the added
benefits of a DBMS (e.g., flexible querying, security, concurrent access, and crash
recovery) are not required. In most situations calling for large-scale data management,
however, DBMSs have become an indispensable tool.
 A further disadvantage of a DBMS is its overhead cost. The processing overhead
introduced by the DBMS to implement security, integrity, and sharing of the data
degrades response time and throughput. An additional cost is that
of migration from a traditionally separate application environment to an integrated
one.
 Even though centralization reduces duplication, the lack of duplication requires that
the database be adequately backed up so that in the case of failure the data can be
recovered.
 Backup and recovery operations are complex in a DBMS environment, and this
complexity increases in a concurrent multi-user database system. A database system
requires a certain amount of controlled redundancy and duplication to enable access
to related data items.
 Complexity : The provision of the functionality that is expected of a good DBMS
makes the DBMS an extremely complex piece of software. Failure to understand the
system can lead to bad design decisions, which can have serious consequences for
an organization.
 Size : The complexity and breadth of functionality make the DBMS an extremely
large piece of software, occupying a large amount of disk space and requiring
substantial amounts of memory to run efficiently.
 Performance : Unlike a file-based system written for one specific application, a DBMS
is written to be more general, to cater for many applications rather than just one.
The effect is that some applications may not run as fast as they used to.
 Higher impact of a failure : The centralization of resources increases the
vulnerability of the system. Since all users and applications rely on the availability of
the DBMS, the failure of any component can bring operations to a halt.
 Cost of DBMS : The cost of DBMS varies significantly, depending on the
environment and functionality provided. There is also the recurrent annual
maintenance cost.
 Additional Hardware costs : To achieve the required performance it may be
necessary to purchase a larger machine, perhaps even a machine dedicated to
running the DBMS. The procurement of additional hardware results in further
expenditure.
 Cost of Conversion : In some situations, the cost of DBMS and extra hardware may
be insignificant compared with the cost of converting existing applications to run on
the new DBMS and hardware. This cost is one of the main reasons why some
organizations feel tied to their current systems and cannot switch to modern database
technology.

1.7 Applications of DBMS :
Databases are widely used in many applications such as :
 Enterprise Information :
 Sales : For customer, product, and purchase information.
 Accounting : For payments, receipts, account balances, assets and other
accounting information.
 Human resources : For information about employees, salaries, payroll taxes,
and benefits, and for generation of paychecks.
 Manufacturing : For management of the supply chain and for tracking
production of items in factories, inventories of items in warehouses and stores,
and orders for items.
 Online retailers : For sales data noted above plus online order tracking,
generation of recommendation lists, and maintenance of online product
evaluations.
 Banking and Finance :
 Banking: For customer information, accounts, loans, and banking
transactions.
 Credit card transactions: For purchases on credit cards and generation of
monthly statements.
 Finance : For storing information about holdings, sales, and purchases of
financial instruments such as stocks and bonds; also for storing real-time market
data to enable online trading by customers and automated trading by the firm.
 Universities : For student information, course registrations, and grades.
 Airlines : For reservations and schedule information. Airlines were among the
first to use databases in a geographically distributed manner.
 Telecommunication : For keeping records of calls made, generating monthly
bills, maintaining balances on prepaid calling cards, and storing information
about the communication networks.
