Dbms Unit I | PDF | Databases | Relational Model

Dbms Unit I

Uploaded by

didlavikas764

DATABASE MANAGEMENT SYSTEM

UNIT – I

What is a Database?

A database is a collection of related data which represents some aspect of the real world. A database system is
designed to be built and populated with data for a certain task.

What is DBMS?

A Database Management System (DBMS) is software for storing and retrieving users' data while considering
appropriate security measures. It consists of a group of programs which manipulate the database. The DBMS
accepts the request for data from an application and instructs the operating system to provide the specific data.
In large systems, a DBMS helps users and other third-party software to store and retrieve data.

DBMS allows users to create their own databases as per their requirement. The term “DBMS” includes the
user of the database and other application programs. It provides an interface between the data and the software
application.

Database-System Applications:
Applications where we use Database Management Systems are:

● Telecom: A database keeps track of the information regarding calls made, network usage,
customer details etc. Without the database systems it is hard to maintain that huge amount of data that
keeps updating every millisecond.
● Industry: Whether it is a manufacturing unit, warehouse or distribution centre, each one needs a database
to keep the records of ins and outs. For example, a distribution centre should keep a track of the product
units that are supplied into the centre as well as the products that got delivered out from the distribution
centre on each day; this is where DBMS comes into picture.
● Banking System: For storing customer info, tracking day to day credit and debit transactions, generating
bank statements etc. All this work is done with the help of database management systems.
● Sales: To store customer information, production information and invoice details.
● Airlines: To travel by air, we make reservations in advance; this reservation information, along with
the flight schedule, is stored in the database.
● Education sector: Database systems are frequently used in schools and colleges to store and retrieve the
data regarding student details, staff details, course details, exam details, payroll data, attendance details,
fees details etc. There is a large amount of interrelated data that needs to be stored and retrieved in an
efficient manner.
● Online shopping: You must be aware of the online shopping websites such as Amazon, Flipkart etc.
These sites store the product information, your addresses and preferences, credit details and provide you
the relevant list of products based on your query. All this involves a Database management system.

User interfaces hide the details of access to a database, and most people are not even aware that they are
dealing with a database. Nevertheless, accessing databases forms an essential part of almost everyone’s life today.

There are two modes in which databases are used:
● The first mode is to support online transaction processing, where a large number of users use the
database, with each user retrieving relatively small amounts of data, and performing small updates. This is
the primary mode of use for the vast majority of users of database applications such as those that we
outlined earlier.
● The second mode is to support data analytics, that is, the processing of data to draw conclusions, and
infer rules or decision procedures, which are then used to drive business decisions.

Purpose of Database Systems


In a conventional file-processing system, development proceeds as follows:

o New application programs must be written as the need arises.
o New permanent files are created as required.
o Over a long period of time, files may end up in different formats, and
o application programs may be written in different languages.

There are problems with the straight file-processing approach:


o Data redundancy and inconsistency
▪ Same information may be duplicated in several places.
▪ All copies may not be updated properly.
o Difficulty in accessing data
▪ May have to write a new application program to satisfy an unusual request.
▪ E.g. find all customers with the same postal code.
▪ Could generate this data manually, but a long job...
o Data isolation
▪ Data in different files.
▪ Data in different formats.
▪ Difficult to write new application programs.
o Multiple users
▪ Want concurrency for faster response time.
▪ Need protection for concurrent updates.
▪ E.g. two customers withdrawing funds from the same account at the same time -
account has $500 in it, and they withdraw $100 and $50. The result could be $350,
$400 or $450 if no protection.
o Security problems
▪ Every user of the system should be able to access only the data they are permitted to see.
▪ E.g. payroll people only handle employee records, and cannot see customer accounts;
tellers only access account data and cannot see payroll data.
▪ Difficult to enforce this with application programs.
o Integrity problems
▪ Data may be required to satisfy constraints.
▪ E.g. no account balance below $25.00.
▪ Again, difficult to enforce or to change constraints with the file-processing approach.
These problems and others led to the development of database management systems.
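
The concurrent-withdrawal example above can be traced step by step. The following is a minimal sketch, not a real banking system: the function names are hypothetical, and the interleaving is written out by hand instead of using real threads so that the outcome is deterministic.

```python
def withdraw_interleaved(balance, first, second):
    """Both withdrawals read the balance before either one writes it back."""
    read_a = balance             # withdrawal A reads 500
    read_b = balance             # withdrawal B also reads 500
    balance = read_a - first     # A writes 400
    balance = read_b - second    # B overwrites with 450; A's update is lost
    return balance


def withdraw_serial(balance, first, second):
    """Each withdrawal finishes completely before the next one starts."""
    balance = balance - first
    balance = balance - second
    return balance


lost_update = withdraw_interleaved(500, 100, 50)
correct = withdraw_serial(500, 100, 50)
print(lost_update, correct)  # prints: 450 350
```

If the two writes happen in the opposite order, the unprotected result is $400 instead of $450. Only the serial order gives the correct $350.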

View of Data
A major purpose of a database system is to provide users with an abstract view of the data. That is, the
system hides certain details of how the data are stored and maintained.

1 Data Models
A collection of conceptual tools for describing data, data relationships, data semantics, and consistency
constraints. The data models can be classified into four different categories:

a. Relational Data Model: This type of model designs the data in the form of rows and columns within a
table. Thus, a relational model uses tables for representing data and in-between relationships. Tables are also
called relations. This model was first described by Edgar F. Codd in 1969. The relational data model is the
most widely used model and is primarily used by commercial data processing applications.

b. Entity-Relationship Data Model: An ER model is the logical representation of data as objects and
relationships among them. These objects are known as entities, and relationship is an association among these
entities. This model was designed by Peter Chen and published in a 1976 paper. It is widely used in database
design. A set of attributes describes an entity. For example, student_name and student_id describe the
'student' entity. A set of the same type of entities is known as an 'Entity set', and the set of the same type of
relationships is known as 'relationship set'.

c. Object-based Data Model: An extension of the ER model with notions of functions, encapsulation, and
object identity, as well. This model supports a rich type system that includes structured and collection types.
Thus, in the 1980s, various database systems following the object-oriented approach were developed. Here,
the objects are nothing but the data carrying its properties.

d. Semistructured Data Model: This type of data model is different from the other three data models
(explained above). The semistructured data model permits data specifications in which individual
data items of the same type may have different sets of attributes. The Extensible Markup Language, also known as
XML, is widely used for representing semistructured data. Although XML was initially designed for adding
markup information to text documents, it gained importance because of its application in the exchange of
data.
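
To illustrate the semistructured model, here is a small sketch using Python's standard xml.etree.ElementTree module. The XML document, its element names and its values are made up for illustration; the point is only that two items of the same type (student) carry different attribute sets.

```python
import xml.etree.ElementTree as ET

# Hypothetical data: the second student has a phone element, the first does not.
document = """
<students>
  <student><id>1</id><name>Asha</name></student>
  <student><id>2</id><name>Ravi</name><phone>555-0100</phone></student>
</students>
"""

root = ET.fromstring(document)

# Collect the attribute (child element) names present on each student.
attribute_sets = [sorted(field.tag for field in student) for student in root]
print(attribute_sets)
```

A relational table would need a phone column for every row (possibly null); here the element is simply absent where it does not apply.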

2 Data Abstraction
There are mainly three levels of data abstraction:

1. Internal Level: Actual physical storage structure and access paths.
2. Conceptual or Logical Level: Structure and constraints for the entire database.
3. External or View Level: Describes various user views.

a. Internal Level/Schema

The internal schema defines the physical storage structure of the database. The internal schema is a very
low-level representation of the entire database. It contains multiple occurrences of multiple types of internal
record. In ANSI terminology, an internal record is also called a "stored record".

● The internal schema is the lowest level of data abstraction.
● It holds information about the actual representation of the entire database, such as the actual
storage of the data on disk in the form of records.
● The internal view tells us what data is stored in the database and how.
● It never deals with the physical devices directly. Instead, the internal schema views a physical device as a
collection of physical pages.

b. Conceptual Schema/Level

The conceptual schema describes the structure of the whole database for the community of users.
This schema hides information about the physical storage structures and focuses on describing data types,
entities, relationships, etc. This logical level comes between the user level and the physical storage view.
There is only a single conceptual view of a single database.

● Defines all database entities, their attributes, and their relationships.
● Holds security and integrity information.
● At the conceptual level, the data available to a user must be contained in or derivable from the physical
level.

c. External Schema/Level

An external schema describes the part of the database which a specific user is interested in. It hides the
unrelated details of the database from the user. There may be "n" number of external views for each database.
Each external view is defined using an external schema, which consists of definitions of various types of
external record of that specific view. An external view is just the content of the database as it is seen by some
specific user. For example, a user from the sales department will see only sales related data.

● An external level is only related to the data which is viewed by specific end users.
● This level includes several external schemas.
● The external schema level is nearest to the user.
● The external schema describes the segment of the database which is needed by a certain user group
and hides the remaining details of the database from that user group.
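
One common way to realize an external schema is a view defined over the conceptual-level tables. The sketch below uses Python's built-in sqlite3 module with an in-memory database; the employee table, its columns and the sales_view name are assumptions made for illustration and do not come from these notes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Conceptual level: one table describing all employees (hypothetical schema).
conn.execute(
    "CREATE TABLE employee (id INTEGER PRIMARY KEY, name TEXT, dept TEXT, salary REAL)"
)
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?, ?)",
    [(1, "Asha", "Sales", 50000.0), (2, "Ravi", "Payroll", 60000.0)],
)

# External level: a sales user sees only sales staff, and no salary column.
conn.execute(
    "CREATE VIEW sales_view AS SELECT id, name FROM employee WHERE dept = 'Sales'"
)

sales_rows = conn.execute("SELECT * FROM sales_view").fetchall()
print(sales_rows)  # prints: [(1, 'Asha')]
```

How the rows are laid out in pages on disk (the internal level) is left entirely to SQLite and is invisible to both the table definition and the view.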

3 Instances and Schemas
● Definition of instance: The data stored in the database at a particular moment in time is called an instance of
the database. The database schema defines the variable declarations in the tables that belong to a particular
database; the values of these variables at a moment in time are called the instance of that database.
For example, let's say we have a single table student in the database. Today the table has 100 records, so
today the instance of the database has 100 records. If we add another 100 records to
this table by tomorrow, the instance of the database tomorrow will have 200 records in the table. In short, the
data stored in the database at a particular moment is called the instance, and it changes over time as we add or
delete data.
● Definition of schema: The design of a database is called the schema. There are three types of schema: physical
schema, logical schema and view schema.
For example: In the following diagram, we have a schema that shows the relationship between three
tables: Course, Student and Section. The diagram only shows the design of the database; it doesn’t show
the data present in those tables. A schema is only a structural view (design) of a database, as shown in the
diagram below.

The design of a database at the physical level is called the physical schema. How the data is stored in blocks of
storage is described at this level.
The design of a database at the logical level is called the logical schema. Programmers and database administrators
work at this level. Here, data is described as certain types of data records stored in data structures;
however, internal details such as the implementation of those data structures are hidden at this level (they are
available at the physical level).
The design of a database at the view level is called the view schema. This generally describes end-user interaction
with database systems.
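
The student example (100 records today, 200 tomorrow) can be reproduced with a short sketch using Python's built-in sqlite3 module. The column names roll_no and name and the helper functions are assumptions for illustration. The schema is read back with SQLite's PRAGMA table_info, and the size of the instance with COUNT(*).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (roll_no INTEGER PRIMARY KEY, name TEXT)")


def schema(connection):
    """Return (column name, declared type) pairs: the design of the table."""
    return [(row[1], row[2]) for row in connection.execute("PRAGMA table_info(student)")]


def instance_size(connection):
    """Return how many records the current instance holds."""
    return connection.execute("SELECT COUNT(*) FROM student").fetchone()[0]


def add_students(connection, first, last):
    connection.executemany(
        "INSERT INTO student VALUES (?, ?)",
        [(n, f"student{n}") for n in range(first, last + 1)],
    )


schema_before = schema(conn)
add_students(conn, 1, 100)
size_today = instance_size(conn)        # 100 records
add_students(conn, 101, 200)
size_tomorrow = instance_size(conn)     # 200 records
schema_after = schema(conn)

print(size_today, size_tomorrow, schema_before == schema_after)  # prints: 100 200 True
```

The instance changed twice while the schema stayed the same throughout.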

DATABASE DESIGN
Database Design is a collection of processes that facilitate the designing, development, implementation and
maintenance of enterprise data management systems. Properly designed databases are easy to maintain,
improve data consistency and are cost effective in terms of disk storage space. The database designer decides
how the data elements correlate and what data must be stored.
The main objective of database design is to produce logical and physical design models of the proposed
database system.
The logical model concentrates on the data requirements and the data to be stored independent of physical
considerations. It does not concern itself with how the data will be stored or where it will be stored physically.
The physical data design model involves translating the logical design of the database onto physical media
using hardware resources and software systems such as database management systems (DBMS).

DATABASE ENGINE
A database system is partitioned into modules that deal with each of the responsibilities of the overall system.
The functional components of a database system can be broadly divided into the storage manager, the query
processor components, and the transaction management component.

1. Storage Manager: A storage manager is a program module which is responsible for storing,
retrieving and updating data in the database.
The components of the storage manager are:

a. Authorization and Integrity Manager: It tests the integrity constraints and checks the
authorization of users to access data.
b. Transaction Manager: It ensures that the database remains in a consistent state: no partial change is
made permanent until a transaction has completed in full.
c. File Manager: It manages the allocation of space on disk storage and the data structures used to
represent information stored on disk.
d. Buffer Manager: It decides which data needs to be cached in main memory and fetches it from disk
into main memory. This is very important because it largely determines how fast the database can be used.

The storage manager implements several data structures as part of the physical system implementation:
• Data files, which store the database itself.
• Data dictionary, which stores metadata about the structure of the database, in particular the schema of the
database.
• Indices, which can provide fast access to data items. Like the index in this textbook, a database index
provides pointers to those data items that hold a particular value. For example, we could use an index to find
the instructor record with a particular ID, or all instructor records with a particular name.
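
The idea of an index as a set of pointers can be sketched in a few lines of Python. This is only an in-memory illustration under simplifying assumptions: real database indices are usually disk-based structures such as B+-trees, and the instructor records below are made-up sample data.

```python
from collections import defaultdict

# Hypothetical instructor records; list positions stand in for disk locations.
instructors = [
    {"ID": "10101", "name": "Srinivasan"},
    {"ID": "12121", "name": "Wu"},
    {"ID": "15151", "name": "Mozart"},
    {"ID": "22222", "name": "Wu"},
]


def build_index(records, attribute):
    """Map each value of the attribute to the positions of records holding it."""
    index = defaultdict(list)
    for position, record in enumerate(records):
        index[record[attribute]].append(position)
    return index


name_index = build_index(instructors, "name")

# Look up all instructors named "Wu" without scanning every record.
wu_records = [instructors[position] for position in name_index["Wu"]]
print([record["ID"] for record in wu_records])  # prints: ['12121', '22222']
```

Building the index costs one pass over the data; after that, each lookup follows the stored pointers directly.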

2. Query Processor:
It translates the requests (queries) received from end users via an application program into instructions
that the system can execute, and then carries them out.
The query processor contains the following components:
● DML Compiler –
It translates DML statements into low-level instructions that the query evaluation engine can
execute.
● DDL Interpreter –
It processes DDL statements and records the definitions in a set of tables containing metadata (data about data).
● Embedded DML Pre-compiler –
It converts DML statements embedded in an application program into procedure calls.
● Query Optimizer –
It chooses an efficient evaluation plan for each query from among the alternatives; the low-level instructions
generated by the DML compiler are then executed by the query evaluation engine.

3. Transaction Management
A transaction is a set of operations that performs a logical unit of work. A transaction usually means that the
data in the database has changed.

ACID Properties are used for maintaining the integrity of database during transaction processing. ACID in
DBMS stands for Atomicity, Consistency, Isolation, and Durability.

● Atomicity: A transaction is a single unit of operation. You either execute it entirely or do not execute
it at all. There cannot be partial execution.
● Consistency: Once the transaction is executed, the database should move from one consistent state to another.
● Isolation: Each transaction should be executed in isolation from other transactions. During
concurrent transaction execution, intermediate results from simultaneously executed
transactions should not be made available to each other. The strictness of this separation is graded by
isolation levels (levels 0, 1, 2, 3).
● Durability: After successful completion of a transaction, the changes in the database should persist,
even in the case of system failures.
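
Atomicity can be observed with a small sketch using Python's built-in sqlite3 module. The account table, the CHECK constraint and the transfer function are assumptions for illustration. Using the connection as a context manager commits the transaction on success and rolls it back if an exception is raised.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account (id TEXT PRIMARY KEY, balance INTEGER CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [("A", 500), ("B", 100)])
conn.commit()


def transfer(connection, source, target, amount):
    """Move money between accounts; either both updates happen or neither does."""
    try:
        with connection:  # commit on success, roll back on any exception
            connection.execute(
                "UPDATE account SET balance = balance + ? WHERE id = ?", (amount, target)
            )
            connection.execute(
                "UPDATE account SET balance = balance - ? WHERE id = ?", (amount, source)
            )
        return True
    except sqlite3.IntegrityError:
        return False


def balances(connection):
    return dict(connection.execute("SELECT id, balance FROM account ORDER BY id"))


ok_transfer = transfer(conn, "A", "B", 200)       # succeeds: A=300, B=300
failed_transfer = transfer(conn, "A", "B", 1000)  # debit would make A negative
print(ok_transfer, failed_transfer, balances(conn))
```

In the failed transfer the credit to B is executed first, yet it does not survive: the rollback undoes it, so there is no partial execution.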

Database and Application Architecture
A Database Architecture is a representation of DBMS design. It helps to design, develop, implement,
and maintain the database management system. A DBMS architecture divides the database
system into individual components that can be independently modified or
replaced. It also helps to understand the components of a database.

A Database stores critical information and helps access data quickly and securely. Therefore, selecting
the correct Architecture of DBMS helps in easy and efficient data management.

Types of DBMS Architecture


There are mainly three types of DBMS architecture:

● One Tier Architecture (Single Tier Architecture)
● Two Tier Architecture
● Three Tier Architecture

1-Tier Architecture

1 Tier Architecture in DBMS is the simplest architecture of Database in which the client, server, and
Database all reside on the same machine. A simple one tier architecture example would be anytime you install
a Database in your system and access it to practice SQL queries. But such architecture is rarely used in
production.

2-Tier Architecture

A 2 Tier Architecture in DBMS is a Database architecture where the presentation layer runs on a client (PC,
Mobile, Tablet, etc.), and data is stored on a server called the second tier. Two tier architecture provides added
security to the DBMS as it is not exposed to the end-user directly. It also provides direct and faster
communication.

3-Tier Architecture

A 3 Tier Architecture in DBMS is the most popular client-server architecture, in which the
development and maintenance of functional processes, logic, data access, data storage, and user interface are
done independently as separate modules. Three Tier architecture contains a presentation layer, an application
layer, and a database server.

3-Tier database Architecture design is an extension of the 2-tier client-server architecture. A 3-tier architecture
has the following layers:

1. Presentation layer (your PC, Tablet, Mobile, etc.)
2. Application layer (server)
3. Database Server

Database Users and Administrators
A primary goal of a database system is to retrieve information from and store new information into the
database. People who work with a database can be categorized as database users or database
administrators.

1. Database users are the persons who interact with the database and benefit from it.
They are divided into different types based on the way they expect to interact with the
system.

● Naive users: They are unsophisticated users who interact with the system by using predefined
applications that already exist. Example: Online Library Management System, ATMs (Automated
Teller Machine), etc.
● Application programmers: They are computer professionals who write application programs that
interact with the system through DML calls.
● Sophisticated users: They interact with the system by writing SQL queries directly through the query
processor without writing application programs.
● Specialized users: They are also sophisticated users who write specialized database applications that
do not fit into the traditional data processing framework. Example: Expert System, Knowledge Based
System, etc.

2. Database Administrator
One of the main reasons for using DBMSs is to have central control of both the data and the programs that
access those data. A person who has such central control over the system is called a database administrator
(DBA). The functions of a DBA include:
● Schema definition. The DBA creates the original database schema by executing a set of data
definition statements in the DDL.
● Storage structure and access-method definition. The DBA may specify some parameters pertaining
to the physical organization of the data and the indices to be created.
● Schema and physical-organization modification. The DBA carries out changes to the schema and
physical organization to reflect the changing needs of the organization, or to alter the physical
organization to improve performance.
● Granting of authorization for data access. By granting different types of authorization, the database
administrator can regulate which parts of the database various users can access. The authorization
information is kept in a special system structure that the database system consults whenever a user
tries to access the data in the system.
● Routine maintenance. Examples of the database administrator’s routine maintenance activities are:
° Periodically backing up the database onto remote servers, to prevent loss of data in case of
disasters such as flooding.
° Ensuring that enough free disk space is available for normal operations, and upgrading disk space
as required.
° Monitoring jobs running on the database and ensuring that performance is not degraded by very
expensive tasks submitted by some users.

INTRODUCTION TO THE RELATIONAL MODEL
STRUCTURE OF RELATIONAL DATABASES
Relational Model (RM) represents the database as a collection of relations. A relation is nothing but a table of
values. Every row in the table represents a collection of related data values. These rows in the table denote a
real-world entity or relationship.

The table name and column names are helpful to interpret the meaning of values in each row. The data are
represented as a set of relations. In the relational model, data are stored as tables. However, the physical
storage of the data is independent of the way the data are logically organized.

Relational Model Concepts


1. Attribute: Each column in a table. Attributes are the properties which define a relation, e.g.,
Student_Rollno, NAME, etc.
2. Table: In the relational model, relations are saved in table format. It is stored along with its
entities. A table has two properties, rows and columns. Rows represent records and columns represent
attributes.
3. Tuple: A single row of a table, which contains a single record.
4. Relation schema: A relation schema represents the name of the relation with its attributes.
5. Degree: The total number of attributes in the relation is called the degree of the relation.
6. Cardinality: The total number of rows present in the table.
7. Column: The column represents the set of values for a specific attribute.
8. Atomic domain: A domain is atomic if elements of the domain are considered to be indivisible units.
9. Relation instance: A relation instance is a finite set of tuples in the RDBMS. Relation
instances never have duplicate tuples.
10. Relation key: Every row has one or more attributes that identify it uniquely; this set of attributes is called
the relation key.
11. Attribute domain: Every attribute has a predefined set of permitted values, which is known as the attribute
domain.
12. Null value: The null value is a special value that signifies that the value is unknown or does not exist.
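
Several of these terms can be made concrete with a tiny sketch that models a relation instance as a Python set of tuples. The attribute names come from the list above; the three student rows are made-up sample data.

```python
# Relation schema: the ordered attribute names of the relation.
schema = ("Student_Rollno", "NAME")

# Relation instance: a set of tuples, so duplicates cannot occur.
student = {(1, "Asha"), (2, "Ravi"), (3, "Asha")}
student.add((2, "Ravi"))  # adding an existing tuple changes nothing

degree = len(schema)        # number of attributes
cardinality = len(student)  # number of tuples (rows)

# Column: the values of one attribute, read in roll-number order.
name_position = schema.index("NAME")
name_column = [row[name_position] for row in sorted(student)]

print(degree, cardinality, name_column)  # prints: 2 3 ['Asha', 'Ravi', 'Asha']
```

Note that the column may repeat a value ("Asha" appears twice) even though no complete tuple is repeated.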

DATABASE SCHEMA
A distinction is made between the database schema, which is the logical design of the database, and the database
instance, which is a snapshot of the data in the database at a given instant in time.
A database schema is the skeleton structure that represents the logical view of the entire database. It defines
how the data is organized and how the relations among them are associated. It formulates all the constraints
that are to be applied on the data.
A database schema defines its entities and the relationship among them. It contains a descriptive detail of the
database, which can be depicted by means of schema diagrams. It’s the database designers who design the
schema to help programmers understand the database and make it useful.
A database schema can be divided broadly into two categories −
● Physical Database Schema − This schema pertains to the actual storage of data and its form of
storage like files, indices, etc. It defines how the data will be stored in a secondary storage.
● Logical Database Schema − This schema defines all the logical constraints that need to be applied
on the data stored. It defines tables, views, and integrity constraints.

KEYS
o Keys play an important role in the relational database.
o A key is used to uniquely identify any record or row of data in a table. Keys are also used to establish and
identify relationships between tables.

1. Primary key
o It is the key chosen to identify one and only one instance of an entity uniquely. An entity
can contain multiple keys, as we saw in the PERSON table. The key which is most suitable among them
becomes the primary key.
o In the EMPLOYEE table, ID can be the primary key since it is unique for each employee. In the
EMPLOYEE table, we could also select License_Number or Passport_Number as the primary key since
they are also unique.
o For each entity, the selection of the primary key is based on the requirements and on the developers' judgment.

2. Super Key
A super key is a set of attributes which can uniquely identify a tuple. A super key is a superset of a candidate key.
For example: In the above EMPLOYEE table, for (EMPLOYEE_ID, EMPLOYEE_NAME), the names of two
employees can be the same, but their EMPLOYEE_ID can't be the same. Hence, this combination can also be a
key. The super keys would be EMPLOYEE_ID, (EMPLOYEE_ID, EMPLOYEE_NAME), etc.

3. Foreign key
o A foreign key is a column of a table which is used to point to the primary key of another table.
o In a company, every employee works in a specific department, and employee and department are two
different entities. So we should not store the information of the department in the employee table. Instead,
we link these two tables through the primary key of one table.
o We add the primary key of the DEPARTMENT table, Department_Id as a new attribute in the
EMPLOYEE table.
o Now in the EMPLOYEE table, Department_Id is the foreign key, and both the tables are related.

4. Candidate Key: A minimal set of attributes which can uniquely identify a tuple is known as a
candidate key. For Example, STUD_NO in STUDENT relation.

● The value of Candidate Key is unique and non-null for every tuple.
● There can be more than one candidate key in a relation. For Example, both STUD_NO and STUD_PHONE are
candidate keys for relation STUDENT.
● The candidate key can be simple (having only one attribute) or composite as well. For Example,
{STUD_NO, COURSE_NO} is a composite candidate key for relation STUDENT_COURSE.
5. Alternate Key: A candidate key other than the primary key is called an alternate key. For Example,
STUD_NO and STUD_PHONE are both candidate keys for relation STUDENT; if STUD_NO is chosen as the
primary key, STUD_PHONE will be an alternate key.
6. Composite Key: If no single attribute of a table is capable of being the key, i.e., it cannot identify
a row uniquely, then we combine two or more attributes to form a key. This is known as a composite
key.
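
The key types above map onto SQL constraints. The sketch below uses Python's built-in sqlite3 module and loosely follows the EMPLOYEE and DEPARTMENT example; the exact column list, the sample rows and the try_insert helper are assumptions. SQLite enforces foreign keys only after PRAGMA foreign_keys = ON has been issued on the connection.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled

conn.execute("CREATE TABLE department (Department_Id INTEGER PRIMARY KEY, name TEXT)")
conn.execute(
    """
    CREATE TABLE employee (
        ID INTEGER PRIMARY KEY,                  -- primary key
        License_Number TEXT UNIQUE NOT NULL,     -- alternate (candidate) key
        name TEXT,
        Department_Id INTEGER REFERENCES department (Department_Id)  -- foreign key
    )
    """
)
conn.execute("INSERT INTO department VALUES (10, 'Sales')")
conn.execute("INSERT INTO employee VALUES (1, 'L-100', 'Asha', 10)")


def try_insert(row):
    """Attempt to insert an employee row and report whether the DBMS accepted it."""
    try:
        conn.execute("INSERT INTO employee VALUES (?, ?, ?, ?)", row)
        return "ok"
    except sqlite3.IntegrityError:
        return "rejected"


duplicate_primary = try_insert((1, "L-200", "Ravi", 10))    # ID 1 already exists
duplicate_alternate = try_insert((2, "L-100", "Ravi", 10))  # licence already used
missing_department = try_insert((3, "L-300", "Ravi", 99))   # no department 99
valid_row = try_insert((4, "L-400", "Ravi", 10))

print(duplicate_primary, duplicate_alternate, missing_department, valid_row)
```

The first three inserts are rejected by the primary key, the alternate key and the foreign key respectively; only the last row is accepted.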

RELATIONAL QUERY LANGUAGES


A query language is a language in which a user requests information from the database.
Query languages can be categorized as imperative, functional, or declarative.
1. In an imperative query language, the user instructs the system to perform a specific sequence of operations
on the database to compute the desired result; such languages usually have a notion of state variables, which
are updated in the course of the computation.
2. In a functional query language, the computation is expressed as the evaluation of functions that may
operate on data in the database or on the results of other functions; functions are side-effect free, and they do
not update the program state.
3. In a declarative query language, the user describes the desired information without giving a specific
sequence of steps or function calls for obtaining that information; the desired information is typically
described using some form of mathematical logic.
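
The three styles can be compared on one small query: find the names of Physics instructors earning more than 90000. This is a sketch with made-up rows; plain Python stands in for the imperative and functional styles, and SQL run through the built-in sqlite3 module stands in for the declarative style.

```python
import sqlite3

# Hypothetical rows: (name, dept_name, salary)
rows = [("Asha", "Physics", 95000), ("Ravi", "Physics", 80000), ("Mei", "Music", 99000)]

# Imperative: spell out the steps and update a state variable as we go.
result_imperative = []
for name, dept_name, salary in rows:
    if dept_name == "Physics" and salary > 90000:
        result_imperative.append(name)

# Functional: compose side-effect-free functions over the data.
result_functional = list(
    map(lambda r: r[0], filter(lambda r: r[1] == "Physics" and r[2] > 90000, rows))
)

# Declarative: describe the desired result and let the system decide how to get it.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE instructor (name TEXT, dept_name TEXT, salary INTEGER)")
conn.executemany("INSERT INTO instructor VALUES (?, ?, ?)", rows)
result_declarative = [
    r[0]
    for r in conn.execute(
        "SELECT name FROM instructor WHERE dept_name = 'Physics' AND salary > 90000"
    )
]

print(result_imperative, result_functional, result_declarative)
```

All three produce the same answer; they differ only in how much of the evaluation strategy the user has to write down.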
Relational algebra
Relational algebra is a procedural query language. It gives a step by step process to obtain the result of the
query. It uses operators to perform queries.
Select Operation:
o The select operation selects tuples that satisfy a given predicate.
o It is denoted by sigma (σ).

Notation: σ p (r)
Where:
σ denotes the selection operator
r is the relation
p is the selection predicate, a propositional logic formula which may use connectives like AND (∧), OR (∨)
and NOT (¬). The predicate may compare values using relational operators like =, ≠, ≥, <, >, ≤.

Example:
σ dept_name = “Physics” (instructor)
σ salary > 90000 (instructor)
σ dept_name = building (department)
σ dept_name = “Physics” ∧ salary > 90000 (instructor)
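
The select examples above can be mimicked in Python as a sketch. Relations are modelled here as lists of dictionaries and the predicate p as a function; the select helper and all sample rows are assumptions for illustration.

```python
def select(relation, predicate):
    """Sigma: keep only the tuples for which the predicate holds."""
    return [t for t in relation if predicate(t)]


# Hypothetical sample data.
instructor = [
    {"name": "Asha", "dept_name": "Physics", "salary": 95000},
    {"name": "Ravi", "dept_name": "Physics", "salary": 80000},
    {"name": "Mei", "dept_name": "Music", "salary": 99000},
]
department = [
    {"dept_name": "Physics", "building": "Watson"},
    {"dept_name": "Taylor", "building": "Taylor"},
]

physics = select(instructor, lambda t: t["dept_name"] == "Physics")
high_paid = select(instructor, lambda t: t["salary"] > 90000)
# A predicate may also compare two attributes of the same tuple.
same_name = select(department, lambda t: t["dept_name"] == t["building"])
physics_high = select(
    instructor, lambda t: t["dept_name"] == "Physics" and t["salary"] > 90000
)

print([t["name"] for t in physics_high])  # prints: ['Asha']
```

Select never changes the attributes of the result; it only reduces the number of tuples.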

Project Operation:
o This operation shows the list of those attributes that we wish to appear in the result. The rest of the
attributes are eliminated from the table.
o It is denoted by ∏.

Notation: ∏ A1, A2, ..., An (r)

Where A1, A2, ..., An are attribute names of relation r.

Example:
∏ NAME, CITY (CUSTOMER)
∏ CustomerName, Status (Customers)
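
A sketch of the project operation in Python follows. The project helper and the CUSTOMER rows are made up for illustration. Because the result of a relational-algebra operation is a set, rows that become identical after projection are kept only once.

```python
def project(relation, attributes):
    """Pi: keep only the listed attributes and drop duplicate result rows."""
    seen = set()
    result = []
    for t in relation:
        row = tuple(t[a] for a in attributes)
        if row not in seen:
            seen.add(row)
            result.append(dict(zip(attributes, row)))
    return result


# Hypothetical sample data.
customer = [
    {"NAME": "Asha", "CITY": "Pune", "STATUS": "Active"},
    {"NAME": "Asha", "CITY": "Pune", "STATUS": "Inactive"},
    {"NAME": "Ravi", "CITY": "Delhi", "STATUS": "Active"},
]

name_city = project(customer, ["NAME", "CITY"])
print(name_city)
```

The first two customer rows differ only in STATUS, so after projecting onto NAME and CITY they collapse into a single row.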

Union Operation:
o Suppose there are two relations R and S. The union operation contains all the tuples that are either in R
or S or both in R & S.
o It eliminates the duplicate tuples. It is denoted by ∪.

Notation: R ∪ S

A union operation must satisfy the following condition:

o R and S must have the same number of attributes, and corresponding attributes must have compatible domains.

Example:
∏ CUSTOMER_NAME (BORROW) ∪ ∏ CUSTOMER_NAME (DEPOSITOR)

Set Intersection:
o Suppose there are two relations R and S. The set intersection operation contains all tuples that are in both
R & S.
o It is denoted by ∩.

Notation: R ∩ S

Example:
∏ CUSTOMER_NAME (BORROW) ∩ ∏ CUSTOMER_NAME (DEPOSITOR)

Set Difference:
o Suppose there are two relations R and S. The set difference operation contains all tuples that are in R
but not in S.
o It is denoted by minus (−).

Notation: R − S

Example:
∏ CUSTOMER_NAME (BORROW) − ∏ CUSTOMER_NAME (DEPOSITOR)
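
Union, intersection and difference correspond directly to Python's set operators when each relation is modelled as a set of tuples. In this sketch the customer names are made up, and union_compatible is a hypothetical helper that checks only the number of attributes, not their domains.

```python
# Results of projecting CUSTOMER_NAME from BORROW and DEPOSITOR (hypothetical data).
borrow_names = {("Asha",), ("Ravi",)}
depositor_names = {("Ravi",), ("Mei",)}


def union_compatible(r, s):
    """True when every tuple in both relations has the same number of attributes."""
    return len({len(t) for t in r | s}) == 1


union = borrow_names | depositor_names         # in either relation, duplicates removed
intersection = borrow_names & depositor_names  # in both relations
difference = borrow_names - depositor_names    # in BORROW but not in DEPOSITOR

print(sorted(union), sorted(intersection), sorted(difference))
```

Here ("Ravi",) appears in both inputs but only once in the union, and the difference is not symmetric: depositor_names - borrow_names would give {("Mei",)} instead.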

Cartesian product
o The Cartesian product is used to combine each row in one table with each row in the other table. It is
also known as a cross product.
o It is denoted by ×.

Notation: E × D
Example: EMPLOYEE × DEPARTMENT
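
A sketch of the Cartesian product using itertools.product from Python's standard library. The EMPLOYEE and DEPARTMENT rows are made up for illustration; each result tuple is an employee tuple concatenated with a department tuple.

```python
from itertools import product

# Hypothetical sample data.
employee = [("E1", "Asha"), ("E2", "Ravi")]
department = [("D1", "Sales"), ("D2", "Payroll"), ("D3", "IT")]

# Pair every employee row with every department row.
cross = [e + d for e, d in product(employee, department)]

print(len(cross))  # prints: 6
print(cross[0])    # prints: ('E1', 'Asha', 'D1', 'Sales')
```

The result has 2 × 3 = 6 tuples and 2 + 2 = 4 attributes, which is why a Cartesian product is normally followed by a select that keeps only the meaningful combinations.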

Rename Operation:
The rename operation is used to rename the output relation. It is denoted by rho (ρ).

Example: We can use the rename operator to rename STUDENT relation to STUDENT1.
ρ (STUDENT1, STUDENT)
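
The rename operation can be sketched by modelling a relation as a named record. The Relation type, its fields and the sample tuples are assumptions for illustration; the argument order of rename mirrors the ρ (STUDENT1, STUDENT) notation used above.

```python
from typing import NamedTuple


class Relation(NamedTuple):
    name: str
    attributes: tuple
    tuples: frozenset


def rename(new_name, relation):
    """Rho: return the same relation under a new name; the data is unchanged."""
    return relation._replace(name=new_name)


student = Relation(
    name="STUDENT",
    attributes=("STUD_NO", "STUD_NAME"),
    tuples=frozenset({(1, "Asha"), (2, "Ravi")}),
)
student1 = rename("STUDENT1", student)

print(student1.name, student1.tuples == student.tuples)  # prints: STUDENT1 True
```

Renaming is mainly useful when a relation has to be combined with itself, since the two copies then need different names.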
