Data Base Management System
(DBMS)
Unit -2
Data Base Management System
Data: Data is the basic raw facts and figures
Ex: a name, a digit, a picture etc.
Data Base: Collection of related data
Ex. the names, telephone numbers and addresses of all the
people you know
Data Base Management System:
A DBMS is a set of programs that controls creation, storage,
management, and retrieval of data in a database.
Ex: MS-Access, Oracle, MY SQL, Sybase, IBM DB2, Ingres etc
Use of DBMS
❖Corporate
❖Airlines
❖Hotels
❖Banks
❖Colleges /University
❖Railway reservation
❖Telecommunication Industry
❖Data mining
❖Libraries
Advantages of Using DBMS
❖ No Data Redundancy
❖ Data Consistency
❖ Mass Data Storage
❖ Centralized Access
❖ Automatic Backup Possible
❖ Data Recovery Possible
❖ Integrity Constraints
❖ Easily updating & fetching of data
❖ Only authorized Access
Disadvantages of Flat File Systems
❖No centralized control
❖Data Redundancy
❖Data Inconsistency
❖Data can not be shared
❖Standards can not be enforced
❖Security issues
❖Integrity can not be maintained
❖Data Dependence
Data Base Characteristics
❖ Controls data redundancy.
❖ Enforces user defined rules.
❖ Ensures data sharing.
❖ It has automatic and intelligent backup and recovery
procedures.
❖ It has central dictionary to store information.
❖ Pertaining to data and its manipulation.
❖ It has different interfaces via which user can manipulate the
data.
❖ Enforces data access authorization.
❖ Represents complex relationship between data.
Classification of DBMS
• Relational DBMS:
– Modeling concept: tables and constraints on tables
– Query Language: SQL
– Applications: suited for traditional business processing
• Object-Oriented DBMS
– Modeling concepts: objects, classes, inheritance
– Query Language: object oriented OQL
– Applications: suited for CAD databases, CASE databases, office
automation
• XML DBMS
Data Base Users
•DBMS designers and implementers
•Database administrator (DBA)
“superuser” of a database, similar to a system
administrator.
Define schemas, views, authorization, indexes,
tuning parameters, etc.
•Application programmers
•End users
Roles of Data Base Administrator
A database administrator (DBA) is a person responsible for the design,
implementation, maintenance and repair of an organization's database. The key
roles of a DBA are :
❖To Provide space to each user.
❖To create the external and logical Schema.
❖To Provide security from unauthorized access.
❖To grant permissions to the user
❖Installation, configuration and upgrading of Oracle server software and
related products.
❖To take Back up and Recovery of data.
❖Performance monitoring of the machine and database.
DBMS Overview
user
Applications/queries
Query processor
Storage manager
metadata data
• Data: collection of interrelated information about world being modeled
• DBMS: general-purpose software to define, create, modify, retrieve, delete and
manipulate a database
Data Abstraction
• Hiding system complexity and physical storage details
from users and applications
Customized view View1 View 2 View n
(External level)
Conceptual representation Logical Level
Physical data description Physical level
(Internal level)
3-tier Architecture
Physical Level:
At the physical level, the information about the
location of database objects in the data store is
kept. Various users of DBMS are unaware of the
locations of these objects. In simple terms,
physical level of a database describes how the
data is being stored in secondary storage devices
like disks and tapes and also gives insights on
additional storage details.
Conceptual Level:
• This level is designed by data base administrator.
• Under this level a schema of data base is created
by DBA.
• It represents the entire database and there can be
only one conceptual view per database.
• It represents entities, their attributes and
relationships between them.
• It is independent on the hardware and software.
• This is also known as Logical Level.
External Level:
An external level specifies a view of the data in
terms of conceptual level tables. Each external
level view is used to cater to the needs of a
particular category of users. For Example, FACULTY
of a university is interested in looking course details
of students, STUDENTS are interested in looking at
all details related to academics, accounts, courses
and hostel details as well. So, different views can be
generated for different users. The main focus of
external level is data abstraction.
Data Independence
• When a schema at a lower level is changed, only
the mappings between this schema and
higher-lever schemas need to be changed in a
DBMS that fully supports data independence.
• The higher-level schemas themselves are
unchanged. Hence, the application programs
need not be changed since they refer to the
external schemas.
• Disadvantages of two levels of mappings:
Overhead during compilation or execution of a
query or program
Mapping between views
Two mappings are required in a database system with three
different views:
• Mapping between conceptual and external view
• Mapping between internal and conceptual view
Mapping between views specifies the methods of deriving the
record at one level from the record at lower level.
• Conceptual/Internal Mapping
The conceptual/internal mapping defines the
correspondence between the conceptual view
and the stored database. It specifies how
conceptual records and fields are represented at
internal level. If the structure of stored database
is changed, then the conceptual/internal
mapping must be changed accordingly so that
the conceptual schema remains consistent, it is
the responsibility of the DBA to manage such
change.
• External/Conceptual view
An external/conceptual mapping defines the
correspondence between a particular external
view and a conceptual view. For example, fields
can have different datatype and record name
can be changed , several conceptual fields can
be combined into single field .
Data Independence
Logical Data Independence: The capacity to change the conceptual
schema without having to change the external schemas and their
application programs.
Physical Data Independence: The capacity to change the internal
schema without having to change the conceptual schema.
DBMS Languages
• Data Definition Language (DDL)
– Used to describe a schema
• Data Manipulation Language (DML)
– Used by users to query the DB and change the data
• View Definition Language (VDL)
– Define views
Data Model
•Concepts and tools used to describe DB schemas
•Examples:
•Entity-Relationship Model
•Relational Model
•Object-Oriented Model (e.g., ODL)
Different Data Models
❖ Flat file
❖ Hierarchical Data Model
❖ Network Data model
❖ Relational Data model
Features of Flat Files
A flat file database is a type of database that stores data in a single
table or a file. Placing data in a flat file offers following advantages:
• All records are stored at one place
• Easy to set up using different office applications
• Easy to understand
• Records can be viewed or extracted based on simple criteria
Disadvantages of flat files
• Potential duplication
• Data Inconsistency
• No centralized access
• Harder to change data format
• Poor at complex queries
• Poor at authorized access
Hierarchical Data Model
In this model data is organized into a tree-like
structure, implying a single upward link in each
record to describe the nesting, and a sort field
to keep the records in a particular order in each
same-level list.
Drawbacks: Hierarchical DBMS
❖Can not handle Many-Many relationships.
❖Can not reflect all real life situations
❖Difficult to perform insert, delete and update
operations.
Network Data Model
The basic data modelling construct in the network model is
the set construct. A set consists of an owner record type, a
set name, and a member record type. A member record type
can have that role in more than one set, hence the
multiparent concept is supported. An owner record type can
also be a member or owner in another set.
Relational Data Model
❖ Relational model is based on relations of the tables.
❖ It is bounded with 12 codd ’s rules.
❖ Every information is stored in the form of columns and
rows.
Relational Data Model
Example of tabular data in the relational model
Attributes
customer- customer- customer- account-
Customer-id
name street city number
192-83-7465 Johnson
Alma Palo Alto A-101
019-28-3746 Smith
North Rye A-215
192-83-7465 Johnson
Alma Palo Alto A-201
321-12-3123 Jones
Main Harrison A-217
019-28-3746 Smith
North Rye A-201
Sample Relational Database
Schema
The overall design of the database is called the
database schema.
A schema is the structure of the table which is
decided before storing the data.
Example:
Create table student
( rollno number(5),
name char(15),
address varchar2(25));
Tuple
A tuple is a related record stored in a row of
the table.
Ex: 101,Alok,MCA,85%
Tuple : Record
Attributes: columns
Entity : Tables
Entity/Relationship (E/R) Model
• Entities: objects
• Relationships: associate entities
• Roles of entities in a relationship
• Constraints on entities:
– domain constraints
– key constraints
• Constraints on relationships:
– Cardinality constraints
– Participation constraints
– Weak Entity Sets
• Multiway relationships
• Subclass/superclass Relationships
• Aggregation
Symbols Used in E-R Notation
Entities and Entity Sets
name street number balance
city
id
Customer custacct Account
• Entities:
– nouns, “things” in the world
– Have attributes: course name, id, address, dept,
age, room, …
• Entity sets: a set of entities
Attributes
• Single-valued versus multi-valued:
– “telephone number”: multi-valued
– “Salary”: single-valued
• Atomic versus composite:
– “Age”: atomic
– “Address”: composite
• Derived versus stored:
– Derived: derived from other attributes or entities,
e.g., “age” derived from “date of birth.”
– Stored: all other attributes
Relationships
• Relationship: association of multiple entities
• Relationship Set:
– set of relationships over the same entity sets
– binary, ternary, 4-nary, …n-nary
Custom-Account account
customer Relationship set
259, 10K
Tom, 62900, Main, LA
305,
Jane, 62901, North, 20K
Irvine 245,
2400
Visualizing relationships as a table.
Each row: pair of entities
participating in the relationship
ER Diagram
id name street balance
city number
age
Customer custacct Account
dob
tel
opendate
• Graphical representation of ER schema. Put as much information as possible.
• Entity set: rectangle
• Attribute: ellipse
• Derived attribute: dashed ellipse (“age”)
• Multivalued attribute: double ellipse (“tel”)
• Relationship set: diamond, with lines connected to its entity sets. May have
attributes, called “relationship attributes.”
•Not specified how to represent a composite attribute.
E-R Diagrams
● Rectangles represent entity sets.
● Diamonds represent relationship sets.
● Lines link attributes to entity sets and entity sets to relationship sets.
● Ellipses represent attributes
● Double ellipses represent multivalued attributes.
● Dashed ellipses denote derived attributes.
● Underline indicates primary key attributes (will study later)
E-R Diagram With Composite, Multivalued, and
Derived Attributes
E-R Diagram for Hospital Management System
ER Diagram for Library Management System
Dr. Edgar F. Codd (1923-2003)
❖ Codd completed his PhD at the
University of Michigan in 1963, and
presented a thesis on the topic of a
self-reproducing computer consisting
of a large number of simple identical
cells, each of which interacts in a
uniform manner with its four
immediate neighbors.
❖ Codd reported this work in a book
entitled Cellular Automata published
by Academic Press in 1968.
12 Codd's Rules
Subclass/Superclass Relationships
• Superclass and Subclass relationships arise during schema design due
to the process of specialization and generalization
• Specialization: process of classifying a class of objects into more
specialized subclasses
– E.g., start with an employee ES, then specialize it into different
types of employees.
• Generalization: Reverse of specialization. A process of synthesis of
two or more lower-level ES to produce a higher-level ES.
Specialization
❖ An entity set may include sub grouping of entities that are
distinct in some way from other entities in the set.
❖ For Instance, a subset of entities with in an entity set may
have attributes that are not shared by all the entities in the
entity set.
❖ The process of introducing new characteristics to an
existing class of objects to create one or more new classes
of objects is called Specialization..
Specialization Example
Generalization
❖ A bottom-up design process – combine a number of
entity sets that share the same features into a
higher-level entity set.
❖ Generalization is the process of viewing objects as
a single general class by concentrating on the
general properties of the constituent sets while
ignoring their differences.
❖ Specialization and generalization are simple
inversions of each other. They are represented in an
E-R diagram in the same way.
Multiple Inheritance
•Subclass inherits all its attributes from its superclass.
•If a subclass has 2 or more superclasses, then it inherits from all
the superclasses.
Relationships
One to One Relationship
One to Many Relationship
Many to One Relationship
Many to Many Reationship
Case Study 1
• Design a DB consistent with the following facts.
– Trains are either local trains or express trains, but never both.
– A train has a unique number and an engineer.
– Stations are either express stops or local stops, but never
both.
– A station has a unique name and an address.
– All local trains stop at all stations.
– Express trains stop only at express stations.
– For each train and each station the train stops at, there is a
time.
Design 1: bad
numb
er type time name
addr
Stops
trains stations
At
engin
eer type
Problem: does not capture the constraints that express trains only stop only at
express stations and local trains stop at all local stations
Design 2: good
numbe enginee
r r
train
tim nam
e e addres
d I
s
S
A StopsAt stations
local trains
2
I
d
S
express trains tim A
e
StopsAt express stations local stations
1
Relational Model
• Main idea:
– Table: relation
– Column header: attribute
– Row: tuple
• Relational schema: name(attributes)
– Example: employee(ssno,name,salary)
• Attributes:
– Each attribute has a domain – domain constraint
– Each attribute is atomic: we cannot refer to or directly see a subpart
of the value.
Relation Example
Account Customer
• Database schema consists of
– a set of relation schema
– Account(AccountId, CustomerId, Balance)
– Customer(Id, Name, Addr)
– a set of constraints over the relation schema
– AccountId, CustomerId must an integer
– Name and Addr must be a string of characters
– CustomerId in Account must be of Ids in Customer
– etc.
NULL value
Customer(Id, Name, Addr)
• Attributes can take a special value: NULL
– Either not known: we don’t know Jack’s address
– or does not exist: savings account 1001 does not have “overdraft”
• This is the single-value constrain on Attr: at most one
– Either one: a string
– Or zero: NULL
Why Constraints?
• Make tasks of application programmers easier:
– If DBMS guarantees account >=0, then debit application
programmers do not worry about overdrawn accounts.
• Enable us to identify redundancy in schemas:
– Help in database design
– E.g., if we know course names are unique, then we may not need
another “course id” attribute
• Help the DBMS in query processing.
– They can help the query optimizer choose a good execution plan
Domain Constraints
• Every attribute has a type:
– integer, float, date, boolean, string, etc.
• An attribute can have a domain. E.g.:
– Id > 0
– Salary > 0
– age < 100
– City in {Irvine, LA, Riverside}
• An insertion can violate the domain constraint.
– DBMS checks if insertion violates domain constraint and reject the insertion.
Integer String String
violations
Key Constraints
• Superkey: a set of attributes such that if two tuples agree on these
attributes, they must agree on all the attributes
– All attributes always form a superkey.
• Example:
– AccountID forms a superkey, I.e., if two records agree on this
attribute, then they must agree on other attributes
– Notice that the relational model allow duplicates
– Any superset of {Account} is also a superkey
– There can be multiple superkeys
• Log: assume LogID is a superkey
Log(LogId, AccountId, Xact#, Time, Amount) Illegal
Keys
• Key:
– Minimal superkey (no proper subset is a superkey)
– If more than one key: choose one as a primary key
• Example:
– Key 1: LogID (primary key)
– Key 2: AccountId, Xact#
– Superkeys: all supersets of the keys
Log(LogId, AccountId, Xact#, Time, Ammount)
OK
Integrity Rules
There are two Integrity Rules that every relation
should follow :
1. Entity Integrity (Rule 1)
2. Referential Integrity (Rule 2)
Entity Integrity states that –
If attribute A of a relation R is a prime attribute
of R, then A can not accept null and duplicate
values.
Referential Integrity Constraints
This is used to maintain the data consistency between two tables.
• Given two relations R and S, R has a primary key X (a set of attributes)
• A set of attributes Y is a foreign key of S if:
– Attributes in Y have same domains as attributes X
– For every tuple s in S, there exists a tuple r in R: s[Y] = r[X].
• A referential integrity constraint from attributes Y of S to R means that Y is
a foreign that refers to the primary key of R.
• The foreign key must be either equal to the primary key or be entirely null.
Foreign key Y X (primary key of R)
r
s
S R
There are two referential integrity constraint:
• Insert Constraint: Value cannot be inserted in
child table if the value is not lying in parent
table.
• Delete Constraint: value cannot be deleted
from parent table if the value is lying in child
data.
Examples of Referential Integrity
Account Customer
Account.customerId to Customer.Id
Student Dept
Student.dept to Dept.name: every value of
Student.dept must also be a value of Dept.name.
Short Questions:
Q.1 What is data independence?
Q.2.What do you mean by DBMS?
Q.3 A transaction is a set of operations that must be performed
completely or not at all. Explain with suitable example.
Q.4What is the difference between Generalization and
specialization?
Q.5 Describe the characteristics of DBMS.
Q.6 Explain all components of E-R Diagram.
Q.7 What is of keys in DBMS and explain how many types of keys
are there.
Long Questions:
Q.1 Describe 3-level of abstractions of DBMS.
Q.2 Differentiate between physical and logical
data independency.
Q.3 Discuss all 12 rules of Dr. E.F. Codd’s.
Q.4 What is data model? Discuss various data
models available in DBMS.
Q.5Differentiate between weak and strong
entity sets with example.
Q.6 What is a DBMS? How does it differ from a
conventional file system?