
Dbms Notes


Database Management System

By,
Mr. Rahul Shyam
(BCA, B.Ed, MCA, M.Tech(CSE), Ph.D(Pursuing))
Assistant Professor
Dept. of Computer Science & Engineering
Basic Definitions

Data:
Data is a collection of facts, figures and statistics that
can be recorded and have an implicit meaning.
e.g., names, telephone numbers, addresses.
Information: Information is data that has been processed so that it is
meaningful and useful to the person who receives it.

A database is a collection of logically related records
that is organized so that it can easily be accessed,
managed, and updated. It consists of an organized
collection of data for one or more uses, typically in
digital form.
Why database?

• Redundancy can be reduced
• Inconsistency can be avoided
• The data can be shared
• Standards can be enforced
• Security restrictions can be applied
• Integrity can be maintained
• Provision of data independence
Examples of databases

• Banking system
• Hospital management system
• University database
• Library database
• Airline reservation system, etc.
Some well-known applications built on large commercial databases are
Amazon, Flipkart, Facebook, WhatsApp, etc.
Note: Earlier, data was maintained using a
traditional file processing system,
i.e., files of records.
An example of a large commercial database is Amazon.com,
which contains data on over 200 million books, CDs,
videos, DVDs, games, electronics, apparel and other items.
The data in this database is stored on more than 200 servers.
About 100 million visitors access Amazon.com each day and
use the database to make purchases, and the database is
continually updated.
About 500 people are responsible for keeping
the Amazon.com database up to date.
Drawbacks of the file processing system
1. Data redundancy
2. Data dependency
3. Lack of flexibility
4. Poor data security
5. Data isolation
Database System vs. File System
What is a DBMS?

Database Management System (DBMS):
A database management system (DBMS) is system software for creating and managing databases.
The DBMS provides users and programmers with a systematic way to create, retrieve, update and
manage data.

Database System
A database system is the combination of a database and the DBMS software that manages it.
DBMS Functionality
The DBMS is a general-purpose software system. Its basic functions are
1. Defining,
2. Constructing and
3. Manipulating databases for various applications.
Other features or functionalities are
• Protection or security measures to prevent unauthorized access
• Presentation and visualization of data
• Maintaining the database and associated programs over the lifetime of the
database applications.
1. Defining a database involves specifying the data types, structures and
constraints for the data to be stored in the database.
Ex: Consider creating the EMPLOYEE database for a typical organization.
Defining this database involves the following steps:
• Specifying the data type and structure of each field in the database.
• Identifying the constraints on the different elements.

Entity Set   Attribute         Data type     Constraints
EMPLOYEE     Emp_name          Char(20)      Alphabetic characters only
             Emp_no            Number(6)     Value > 0
             Emp_address       Char(50)      -
             Emp_designation   Char(10)      -
             Emp_department    Char(15)      -
             Emp_salary        Number(7,2)   -
2. Constructing the database is the process of storing the data itself on some storage
medium that is controlled by the DBMS.
Ex: EMPLOYEE

Emp_name   Emp_no   Emp_address                  Emp_designation   Emp_department   Emp_salary
Arun       101      #18, Hebbal, Bangalore       Manager           Sales            10237.85
John       102      #27, Jayanagara, Bangalore   Admin             Administration   9865.54
David      108      Yelahanka, Bangalore         Accountant        Accounts         5455.55

3. Manipulating a database includes functions such as querying the database to retrieve specific
data, updating the database to reflect changes and generating reports from the data.
Ex: List all employees whose salaries are greater than 9000.00
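The three functions above can be tried out in a few lines. The following is only a minimal sketch using Python's built-in sqlite3 module with an in-memory database; the lower-case table and column names, the SQLite types, and the choice of emp_no as primary key are assumptions made for illustration and are not part of the notes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# 1. Defining: data types, structure and constraints.
#    (emp_no as PRIMARY KEY is an assumption; the notes only require Value > 0.)
conn.execute("""
    CREATE TABLE employee (
        emp_no          INTEGER PRIMARY KEY CHECK (emp_no > 0),
        emp_name        TEXT NOT NULL,
        emp_address     TEXT,
        emp_designation TEXT,
        emp_department  TEXT,
        emp_salary      REAL
    )
""")

# 2. Constructing: storing the data itself.
rows = [
    (101, "Arun", "#18, Hebbal, Bangalore", "Manager", "Sales", 10237.85),
    (102, "John", "#27, Jayanagara, Bangalore", "Admin", "Administration", 9865.54),
    (108, "David", "Yelahanka, Bangalore", "Accountant", "Accounts", 5455.55),
]
conn.executemany("INSERT INTO employee VALUES (?, ?, ?, ?, ?, ?)", rows)
conn.commit()

# 3. Manipulating: list all employees whose salaries are greater than 9000.00.
high_paid = conn.execute(
    "SELECT emp_name, emp_salary FROM employee "
    "WHERE emp_salary > ? ORDER BY emp_no",
    (9000.00,),
).fetchall()
print(high_paid)
```

With the three sample rows, the query should return Arun and John but not David.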
Examples of various Database Management Systems

SL No   DBMS        COMPANY
1       IMS         IBM
2       2K DBMS     SAS Inc
3       IDMS        Cullinet [now Computer Associates]
4       DMS 1100    UNIVAC [now Unisys]
5       IMAGE       HP [Hewlett Packard]
6       VAX-DBMS    Digital [now Compaq]
7       SUPRA       Cincom
Different RDBMSs
• MS SQL Server (Sybase/SQL Server): 6.0, 6.5, 7.0, 2000, 2005, 2008, 2012, ...
• Oracle: 7.x, 8.0.x, 8.1.x (8i), 9.2, 10g, 11g, ...
• Other database tools: DB2, Informix, MySQL, etc.
Example of Simple Student database
Simplified database system environment
Note:
What is the DBMS catalog? The catalog contains information such as the structure of each file,
the type and storage format of each data item and various constraints on the data.
What is metadata?
Data about data is called metadata.
Metadata is the information stored in the catalog that describes the database.
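As a small illustration of a catalog, SQLite keeps the description of every table in a built-in table called sqlite_master and exposes column details through PRAGMA table_info. The sketch below assumes a hypothetical student table; only the catalog queries themselves are the point.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student (regno INTEGER PRIMARY KEY, name TEXT NOT NULL, dept TEXT)"
)

# The catalog: one row per schema object (the metadata, not the student data).
catalog = conn.execute("SELECT type, name FROM sqlite_master").fetchall()

# Column-level metadata: each row is (cid, name, type, notnull, dflt_value, pk).
columns = conn.execute("PRAGMA table_info(student)").fetchall()
column_info = [(name, col_type) for _cid, name, col_type, _nn, _dflt, _pk in columns]

print(catalog)
print(column_info)
```

Note that the table holds no student rows at all, yet the DBMS can still describe its structure: this is the self-describing nature discussed next.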
**** Characteristics of Database approach (5/10m)

1. Self-describing nature of a database system
2. Insulation between programs and data
3. Data abstraction
4. Support of multiple views of data
5. Sharing of data and multi-user transaction processing
Characteristics of the Database Approach

1. Self-describing nature of a database system:
The database system contains not only the data but also the description of
the data.
2. Insulation between programs and data:
• Called program-data independence.
• Allows changing data structures and storage
organization without having to change the
DBMS access programs.
PDI [Program-Data Independence]
In a database system environment, the structure of the data is stored
in the catalog, separately from the programs that access it. This is known as
program-data independence.
POI [Program-Operation Independence]
In a database system environment, programs invoke operations on the data
by name, independently of how those operations are implemented. This is known as
program-operation independence.
Characteristics of the Database Approach
(continued)
3. Data abstraction:
• The characteristic that allows program-data
independence and program-operation
independence is called data abstraction.
• A data model is used to hide storage details
and present the users with a conceptual view
of the database.
4. Support of multiple views of the data:
• Each user may see a different view of the
database, which describes only the data of
interest to that user.
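Multiple views can be sketched with SQL views. The example below is only an illustration using sqlite3; the view names employee_directory and payroll, and the reduced employee table, are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employee (
        emp_no INTEGER PRIMARY KEY,
        emp_name TEXT,
        emp_department TEXT,
        emp_salary REAL
    );
    INSERT INTO employee VALUES (101, 'Arun', 'Sales', 10237.85);
    INSERT INTO employee VALUES (108, 'David', 'Accounts', 5455.55);

    -- View for general staff: no salary information.
    CREATE VIEW employee_directory AS
        SELECT emp_no, emp_name, emp_department FROM employee;

    -- View for the accounts section: salary information only.
    CREATE VIEW payroll AS
        SELECT emp_no, emp_salary FROM employee;
""")

directory_rows = conn.execute(
    "SELECT * FROM employee_directory ORDER BY emp_no").fetchall()
payroll_rows = conn.execute(
    "SELECT * FROM payroll ORDER BY emp_no").fetchall()
print(directory_rows)
print(payroll_rows)
```

Both views are derived from the same stored table, so each user group sees only the data of interest to it without the data being duplicated.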
Characteristics of the Database Approach (continued)

5. Sharing of data and multi-user transaction processing:
• Allows a set of concurrent users to
retrieve from and update the database.
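A transaction is the unit in which a DBMS applies such updates. The sketch below does not demonstrate concurrency; it only illustrates the all-or-nothing behaviour of a single transaction, using sqlite3. The account table, the transfer function and the CHECK constraint are assumptions chosen for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account (acc_no INTEGER PRIMARY KEY, "
    "balance REAL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [(1, 500.0), (2, 100.0)])
conn.commit()

def transfer(conn, src, dst, amount):
    """Move amount from src to dst as one transaction; return True on success."""
    try:
        with conn:  # commits if the block succeeds, rolls back if it raises
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE acc_no = ?",
                (amount, dst))
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE acc_no = ?",
                (amount, src))
        return True
    except sqlite3.IntegrityError:
        return False  # the CHECK constraint rejected an overdraft

ok = transfer(conn, 1, 2, 200.0)        # enough money: both updates are kept
failed = transfer(conn, 1, 2, 1000.0)   # overdraft: the credit is undone as well
balances = conn.execute(
    "SELECT acc_no, balance FROM account ORDER BY acc_no").fetchall()
print(ok, failed, balances)
```

In the failed transfer the credit to account 2 is executed first and then rolled back, so no money appears or disappears.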
***** Different types of Database Users (10m)
• Users may be divided into
I) "Actors on the Scene"
II) "Workers Behind the Scene"
I) "Actors on the Scene"
Users who design, develop, maintain and use
database applications are called "Actors on the
Scene". They are
1. Database administrators (DBA)
2. Database designers
3. Application programmers
4. End users
a. Casual end users
b. Naive or parametric end users
c. Sophisticated end users
d. Stand-alone users
1. Database Administrators:
• A DBA is responsible for authorizing access to the database,
coordinating and monitoring its use, and acquiring the necessary software and
hardware resources as and when needed.
• A DBA is also accountable for problems such as breaches of security or
poor system performance or response time.
2. Database Designers:
• Database designers are responsible for identifying the data to be stored in the
database and for choosing appropriate structures to represent and store this data.
• They communicate with all prospective database users in order to understand
their requirements and create a design that meets these requirements.
3. Application programmers:
An application programmer is responsible for writing
database application programs in some
programming language such as COBOL, C++,
Java or another high-level language.
4. End users:
• End users are the people whose jobs require access to the database for querying, updating and
generating reports.

End users can be categorized into:
i. Casual end users
ii. Naive end users
iii. Sophisticated end users and
iv. Stand-alone end users
i) Casual end users
• They access the database occasionally.
• They need different information each time.
• They use a database query language to specify their requests to the database.
ii) Naive end users [parametric users]
• Naive end users include bank tellers, reservation clerks, clerks at receiving stations of
shipping companies, etc.
• They use previously implemented and tested programs (called "canned transactions") to
access or update the database.
Categories of End users (continued)

iii) Sophisticated end users
• This group includes engineers, scientists, business analysts, etc.
• They thoroughly familiarize themselves with the facilities provided by the DBMS in
order to implement their applications to meet their complex requirements.
iv) Stand-alone end users
• Stand-alone end users mostly maintain personal databases using ready-to-use packaged
applications.
II) "Workers Behind the Scene"
Those who design and develop the DBMS software and
related tools, and the computer system operators, are
called "Workers Behind the Scene".
They are
1. DBMS system designers and implementers
2. Tool developers
3. Operators and maintenance personnel
1. DBMS System Designers and Implementers
• They design and implement the DBMS modules and interfaces as a software package.
2. Tool Developers
• Tools are software packages that facilitate database modeling and design and also
improve the performance of database systems.
• In many cases, independent software vendors develop and market these tools.
3. Operators and maintenance personnel
• They are responsible for the actual running and maintenance of the hardware and software
environment for the database system.
**** Implications (significance) of the Database Approach

1. Potential for enforcing standards
2. Reduced application development time
3. Flexibility
4. Availability of up-to-date information
5. Economies of scale
1. Potential for enforcing standards
The database approach permits the DBA to define and enforce standards among database users in
a large organization.
Ex: Names of data elements, data formats
2. Reduced application development time
New applications can be created in less time using the DBMS facilities in a database environment.
3. Flexibility
• A DBMS allows evolutionary changes to the structure of the database without affecting the
stored data and the existing application programs.
4. Availability of up-to-date information
• A DBMS makes up-to-date information available, which is essential for many
online transaction processing [OLTP] systems such as railway reservation, banking, etc.
5. Economies of scale
The DBMS approach reduces the overall cost of operation and management by sharing
communication devices, powerful processors and storage devices, rather than each
department purchasing its own equipment.
Advantages of using DBMS

1. Controlling redundancy
2. Restricting unauthorized access to the database
3. Providing persistent storage for program objects
4. Providing storage structures for efficient query processing
5. Providing backup and recovery
6. Providing multiple user interfaces
7. Representing complex relationships among data
8. Enforcing integrity constraints
1. Controlling redundancy
• Redundancy leads to several problems such as duplication of effort and inconsistency.
• Storage space is wasted when the same data is stored repeatedly.
2. Restricting unauthorized access to the database
• Depending on the type of data, only authorized users are allowed to access the database.
3. Providing persistent storage for program objects
• A DBMS provides an environment in which program objects can be stored persistently,
so that they survive after the program that created them terminates.
4. Providing storage structures for efficient query processing
• A DBMS provides storage structures for the efficient execution of queries and
updates.
5. Providing backup and recovery
• A DBMS provides facilities to recover from hardware and software failures, i.e., a backup and
recovery mechanism.
6. Providing multiple user interfaces
• Query languages for casual users
• Programming language interfaces for application programmers
• Forms and command codes for parametric users
• Menu-based (graphics-based) interfaces for stand-alone users
7. Representing complex relationships among data
• A DBMS is capable of representing a variety of relationships among the data.
8. Enforcing integrity constraints
• A DBMS provides facilities for defining and enforcing integrity constraints.
• The simplest integrity constraint is specifying a data type for each data item.
• Ex: Primary key, foreign key, referential integrity
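How a DBMS enforces such constraints can be sketched with sqlite3. The department and employee tables and the helper try_insert are hypothetical; note that SQLite checks foreign keys only after PRAGMA foreign_keys = ON, which is specific to SQLite.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite-specific: enable FK checking
conn.executescript("""
    CREATE TABLE department (
        dept_no   INTEGER PRIMARY KEY,
        dept_name TEXT NOT NULL
    );
    CREATE TABLE employee (
        emp_no   INTEGER PRIMARY KEY,
        emp_name TEXT NOT NULL,
        dept_no  INTEGER NOT NULL REFERENCES department(dept_no)
    );
    INSERT INTO department VALUES (1, 'Sales');
    INSERT INTO employee VALUES (101, 'Arun', 1);
""")

def try_insert(emp_no, emp_name, dept_no):
    """Attempt an insert and report whether the DBMS accepted it."""
    try:
        with conn:
            conn.execute("INSERT INTO employee VALUES (?, ?, ?)",
                         (emp_no, emp_name, dept_no))
        return "ok"
    except sqlite3.IntegrityError:
        return "rejected"

duplicate_key = try_insert(101, "David", 1)   # violates the primary key
missing_dept = try_insert(102, "David", 99)   # violates referential integrity
valid_row = try_insert(102, "David", 1)       # satisfies both constraints
print(duplicate_key, missing_dept, valid_row)
```

The application code contains no checks of its own; the rules are declared once in the schema and enforced by the DBMS for every program that uses the database.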
Modern/Advanced Databases
1. Distributed Databases
• A distributed DB system consists of several sites
• Sites are connected by a network
• Each site can hold data and process it
• It shouldn't matter where the data is: the system appears as a single entity
[Figure: Distributed Databases. Several sites, each with a server and its client(s), connected by a network.]
Client/Server Architecture

• The client/server architecture is a general model for systems
where a service is provided by one system (the server) to
another (the client)
• Server
  • Hosts the DBMS and database
  • Stores the data
• Client
  • User programs that use the database
  • Use the server for database access
2. Web-based Databases

• Database access over the internet
  • Web-based clients
  • Web server
  • Database server(s)
• Web server serves pages to browsers (clients) and can access database(s)
• Typical operation
  • Client sends a request for a page to the web server
  • Web server sends SQL to the database
  • The web server uses the results to create the page
  • The page is returned to the client
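The typical operation can be mimicked without a real network. In the sketch below a plain function plays the role of the web server; the function handle_request, the /books path and the book table are all hypothetical, and no actual HTTP is involved.

```python
import sqlite3
from html import escape

# Stand-in for the database server.
db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE book (title TEXT, publisher TEXT);
    INSERT INTO book VALUES ('Database Basics', 'Alpha Press');
    INSERT INTO book VALUES ('SQL Primer', 'Beta Books');
""")

def handle_request(path):
    """Stand-in for the web server: turn one page request into an HTML page."""
    if path != "/books":
        return "<html><body>Not found</body></html>"
    # The web server sends SQL to the database ...
    rows = db.execute(
        "SELECT title, publisher FROM book ORDER BY title").fetchall()
    # ... and uses the results to create the page.
    items = "".join(
        f"<li>{escape(title)} ({escape(publisher)})</li>"
        for title, publisher in rows)
    return f"<html><body><ul>{items}</ul></body></html>"

# The client (browser) requests a page and receives HTML, never the tables.
page = handle_request("/books")
print(page)
```

The client only ever sees the finished page, which is why the database structure stays hidden from it.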
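[Figure: Web-based Databases. The client (browser) sends an HTTP request to the web server; the web server sends an SQL query to the database server, receives the SQL result, and returns an HTML page to the client.]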
Web-based Databases

• Advantages
  • World-wide access
  • Internet protocols (HTTP, SSL, etc.) give uniform access and security
  • Database structure is hidden from clients
  • Uses a familiar interface
• Disadvantages
  • Security can be a problem if you are not careful
  • Interface is less flexible using standard browsers
  • Limited interactivity over slow connections
3. Object Oriented Databases

• An object oriented database (OODB) is a collection of persistent objects
  • Objects: instances of a defined class
  • Persistent: objects exist independently of any program
• An object oriented DBMS
  • Manages a collection of objects
  • Allows objects to be made persistent
  • Permits queries to be made of the objects
  • Does all the normal DBMS things as well

Object Oriented Databases

• Advantages
  • Good integration with Java, C++, etc.
  • Can store complex information
  • Fast to recover whole objects
  • Has the advantages of the (familiar) object paradigm
• Disadvantages
  • There is no underlying theory to match the relational model
  • Can be more complex and less efficient
  • OODB queries tend to be procedural, unlike SQL
4. Object Relational Databases

• Extend an RDBMS with object concepts
  • Data values can be objects of arbitrary complexity
  • These objects have inheritance, etc.
  • You can query the objects as well as the tables
• An object relational database
  • Retains most of the structure of the relational model
  • Needs extensions to query languages (SQL or relational algebra)

5. Multimedia Databases

• Multimedia DBs can store complex information
  • Images
  • Music and audio
  • Video and animation
  • Full texts of books
  • Web pages
• They can be used in a wide range of application areas
  • Entertainment
  • Marketing
  • Medical imaging
  • Digital publishing
  • Geographic Information Systems
When not to use DBMS

• In spite of the several advantages of the DBMS approach, it is desirable to avoid using a
DBMS in the following circumstances, in order to avoid the overhead costs of hardware,
software and training:
1. For simple, well-defined database applications that are not expected to change
frequently.
2. When multiple-user access to the data is not required.
Five or seven marks Questions:
1. Explain the different functions of a DBMS.
2. Explain the overall structure of a simplified database system
environment with a diagram.
3. Explain the characteristics of the database approach.
4. Explain the advantages of a DBMS.
5. List and explain the different database users. OR
Explain the different people behind a DBMS.
6. Explain the implications of the database approach.
7. Explain when not to use a DBMS.
8. Explain the drawbacks of the file system.
9. Explain the different types of databases in detail.
Chapter 2

Entity-Relationship [ ER ] model

Entity-Relationship [ER] model

The ER model is used to represent objects in the real world and the relationships among these
objects; it represents the overall logical structure of a database.
The ER model is a high-level conceptual data model developed by Chen in 1976 to facilitate
database design.
High-level conceptual data models for database design

Database design consists of the following steps:
1. Requirement collection and analysis
2. Conceptual design
3. Logical design (data model mapping) and
4. Physical design
1. Requirement collection and analysis
• During this step, database designers interview prospective database users to understand
and document their data requirements.
2. Conceptual design
• The next step is to create a conceptual schema for the database, using a high-level
conceptual data model. Hence, this step is known as "conceptual design".
• The conceptual schema is a description of the entity types, relationships and
constraints.
3. Logical design (data model mapping)
• Logical design is the step that involves the actual implementation of the database, using
a commercial DBMS.
• In this step, the conceptual schema is transformed from the high-level data model into the
implementation data model of the DBMS. Hence, it is called "data model mapping".
4.Physical design:
During this step, the internal storage structures, indexes, access paths and file
organization for the database files are specified.

Define the following

1. Entity
2. Attributes
3. Entity sets.
4. Entity types,
5. Key attributes of an entity type.
6. Value sets (Domain) of attributes.

Examples of entities and attributes:

1. Entity: An entity is a thing in the real world.
2. Attributes:
An attribute is a property that describes an entity.

Entity   Attribute   Value
Car      Color       Red
         Make        Volkswagen
         Model       Bora
         Year        2014
• An entity type defines a collection of entities that have the same attributes.

Entity Type Example:
Entity type: Student
Entity attributes: StudentID, Name, Surname, Date of Birth, Department
Entity set:
• The collection of all entities of a particular entity type in the database at any point in time
is called an entity set.
Key attributes of an entity type: A key attribute is a minimal set of attributes of an entity type
whose values uniquely identify each entity in the entity set. Example:
For the company entity, name can be the key attribute if no two companies are allowed to have the
same name.
For the student entity, regno can be the key attribute.
Different types of attributes

1. Simple or atomic attributes
2. Composite attributes
3. Single-valued attributes
4. Multi-valued attributes
5. Derived attributes
6. Stored attributes
7. Null attributes
8. Key attributes
1. Simple or atomic attributes:
These attributes cannot be subdivided further.
Example: Regno, ID, age, Deptno.
2. Composite attributes: Composite attributes can be divided into smaller subparts.
Example: Address can be divided into Street No., Area, City, State and Pin code.
3. Single-valued attributes: An attribute that can take only one value for a given entity is
called a single-valued attribute.
Example: The age attribute has a single value.
4. Multi-valued attributes: An attribute that can have more than one value for the same entity
is called a multi-valued attribute.
Example: A college degree attribute can have multiple values.
Degree [BCA, B.Sc, MCA, PhD]
Admin [Bangalore, Mumbai]
Carcolor [Red, Black]
5. Derived attributes: If the value of an attribute can be derived from some other attributes,
then such an attribute is called a derived attribute.
Example: The gross pay of an employee can be derived from the basic pay, allowances and
deductions of that employee.
6. Stored attributes: A stored attribute is one whose value
cannot be obtained or derived from other
attributes, so it must be stored in the
database.
Example: Birth date, Book_ID
7. Null attributes: A null value is used when an
attribute does not have any value for an entity.
Example: Email. Not all employees in an employee database
may have an e-mail address.
8. Key attributes: An entity type usually has an
attribute whose values are distinct for each
individual entity. Such an attribute is called a key
attribute.
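The attribute types above can be pictured as fields of a record. The following is a rough sketch using Python dataclasses; the class and field names, the sample values, and the formula gross pay = basic pay + allowances - deductions are assumptions made for illustration only.

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class Address:
    """Composite attribute: can be divided into smaller subparts."""
    street_no: str
    area: str
    city: str
    state: str
    pin_code: str

@dataclass
class Employee:
    emp_no: int                      # key attribute: distinct for each entity
    name: str                        # simple, single-valued, stored attribute
    address: Address                 # composite attribute
    basic_pay: float                 # stored attribute
    allowances: float                # stored attribute
    deductions: float                # stored attribute
    degrees: List[str] = field(default_factory=list)  # multi-valued attribute
    email: Optional[str] = None      # may be null: not every employee has one

    @property
    def gross_pay(self) -> float:
        # Derived attribute: computed from other attributes, never stored.
        # (Assumed formula; the notes only say it is derived from these three.)
        return self.basic_pay + self.allowances - self.deductions

# Hypothetical sample entity.
emp = Employee(
    emp_no=101,
    name="Arun",
    address=Address("18", "Hebbal", "Bangalore", "Karnataka", "560024"),
    basic_pay=8000.0,
    allowances=2500.0,
    deductions=500.0,
    degrees=["BCA", "MCA"],
)
print(emp.gross_pay, emp.degrees, emp.email)
```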

What is a Relationship?
Definition:
A relationship is an association among two or more entities. A relationship captures how two
or more entities are related to one another.
Relationship set: A collection of relationships
of the same type is called a relationship set.
Degree of Relationship:
The degree of a relationship is the number of entity types
participating in the relationship.
Relationship types
1. Binary relationship: A relationship of degree 2 is
known as a binary relationship.
2. Ternary relationship: A relationship of degree 3 is
known as a ternary relationship.
1. Binary relationship: A relationship of degree 2 is called a binary relationship. The
figure shows an example of a binary relationship.

[Figure: Publishers, Publishes, Book]
2. Ternary relationship: A relationship of degree 3 is called a ternary relationship.

[Figure: Publishes relationship connecting Teacher, Book and Publishers]
Different types of relationships, or cardinality in DBMS

1. One to One (1:1)
2. One to Many (1:M)
3. Many to One (M:1)
4. Many to Many (M:N)
1. One-to-one (1:1) relationship: An entity in A is associated with at most one
entity in B, and vice versa.

[Figure: Manager, Manages, Department (1:1)]
2. One-to-many (1:M) relationship: An entity in A is associated with any
number of entities in B. An entity in B, however, can be associated with at most one entity in A.

[Figure: Department, Manages, Employee (1:M)]
3. Many-to-one (M:1) relationship: An entity in A is associated with at most
one entity in B. An entity in B, however, can be associated with any number of entities in A.
Example: Many depositors deposit to a single account.

[Figure: Depositor, Deposit, Account (M:1)]
4. Many-to-many (M:N) relationship: An entity in A is associated with any
number of entities in B, and an entity in B is associated with any number of entities in A.

[Figure: Employee, Works_on, Projects (M:N)]
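One common way to realize these cardinalities in a relational database is sketched below with sqlite3: a 1:M relationship becomes a foreign key on the "many" side, and an M:N relationship becomes a separate table of pairs. The table names, column names and sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE department (dept_no INTEGER PRIMARY KEY, dept_name TEXT);

    -- 1:M: each employee row points to at most one department.
    CREATE TABLE employee (
        emp_no   INTEGER PRIMARY KEY,
        emp_name TEXT,
        dept_no  INTEGER REFERENCES department(dept_no)
    );

    CREATE TABLE project (proj_no INTEGER PRIMARY KEY, proj_name TEXT);

    -- M:N: one row per (employee, project) pair.
    CREATE TABLE works_on (
        emp_no  INTEGER REFERENCES employee(emp_no),
        proj_no INTEGER REFERENCES project(proj_no),
        PRIMARY KEY (emp_no, proj_no)
    );

    INSERT INTO department VALUES (1, 'Sales');
    INSERT INTO employee VALUES (101, 'Arun', 1);
    INSERT INTO employee VALUES (102, 'John', 1);
    INSERT INTO project VALUES (1, 'Payroll');
    INSERT INTO project VALUES (2, 'Website');
    INSERT INTO works_on VALUES (101, 1);
    INSERT INTO works_on VALUES (101, 2);
    INSERT INTO works_on VALUES (102, 1);
""")

# One employee, many projects ...
arun_projects = [name for (name,) in conn.execute(
    "SELECT p.proj_name FROM project p JOIN works_on w ON w.proj_no = p.proj_no "
    "WHERE w.emp_no = 101 ORDER BY p.proj_name")]

# ... and one project, many employees.
payroll_staff = [name for (name,) in conn.execute(
    "SELECT e.emp_name FROM employee e JOIN works_on w ON w.emp_no = e.emp_no "
    "WHERE w.proj_no = 1 ORDER BY e.emp_name")]

print(arun_projects, payroll_staff)
```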

Relationship type vs. relationship set

• Relationship type:
  • Is the schema description of a relationship
  • Identifies the relationship name and the participating entity types
  • Also identifies certain relationship constraints
• Relationship set:
  • The current set of relationship instances represented in the database
  • The current state of a relationship type
Relationships of Higher Degree

• Relationship types of degree 2 are called binary
• Relationship types of degree 3 are called ternary and of degree n are called n-ary
• In general, an n-ary relationship is not equivalent to n binary relationships
• Constraints are harder to specify for higher-degree relationships (n > 2) than for binary
relationships
Discussion of n-ary relationships (n > 2)

• If a particular binary relationship can be derived from a higher-degree relationship at all
times, then it is redundant
• For example, the TAUGHT_DURING binary relationship in Figure 3.18 (see next slide)
can be derived from the ternary relationship OFFERS (based on the meaning of the
relationships)
[Figure: Example of a ternary relationship compared with binary relationships]
[Figure: ER diagram of the COMPANY database, with relationship types WORKS_FOR, MANAGES, WORKS_ON, CONTROLS, SUPERVISION, DEPENDENTS_OF]
Displaying constraints on higher-degree relationships

• The (min, max) constraints can be displayed on the edges;
however, they do not fully describe the constraints
• Displaying a 1, M, or N indicates additional constraints
  • An M or N indicates no constraint
  • A 1 indicates that an entity can participate in at most one relationship
    instance that has a particular combination of the other participating
    entities
• In general, both (min, max) and 1, M, or N are needed to describe
the constraints fully
COMPANY ER Schema Diagram using (min, max) notation
Memory Hierarchies:

1. Cache memory [static RAM]
2. Main memory [dynamic RAM]
3. Flash memory [EEPROM]
4. Magnetic disks
5. Optical disks [CD-ROM]
6. Magnetic tapes
• Primary storage: This category includes storage media that can be operated on directly by
the CPU, such as the computer's main memory and the smaller but faster cache memories. Primary
storage usually provides fast access to data but is of limited storage capacity.
Types of Primary Memory

• RAM (random access memory):
  • SRAM (static RAM) (flip-flop gates)
  • DRAM (dynamic RAM)
• ROM (read-only memory)
  • PROM (programmable)
  • EPROM (erasable programmable)
  • EEPROM (electrically erasable programmable)
Difference between RAM and ROM

• RAM is the memory available for the operating system, programs and processes to use
when the computer is running.
ROM is the memory that comes with your computer and is pre-written to hold the
instructions for booting up the computer.
• RAM requires a flow of electricity to retain data (e.g. the computer powered on).
ROM retains data without the flow of electricity (e.g. when the computer is powered off).
• RAM is a type of volatile memory. Data in RAM is not permanently written. When you
power off your computer, the data stored in RAM is lost.
ROM is a type of non-volatile memory. Data in ROM is permanently written and is not
erased when you power off your computer.
• There are different types of RAM, including DRAM (dynamic random access memory)
and SRAM (static random access memory).
There are different types of ROM, including PROM (programmable read-only memory),
which is manufactured as blank memory and can be programmed once, and EPROM (erasable
programmable read-only memory).
Cache memory is static RAM.
Cache memory is used to speed up the execution of programs.
• Flash memory is an electronic (i.e. no moving parts) non-volatile computer storage device
that can be electrically erased and reprogrammed.
Flash memory was developed from EEPROM (electrically erasable programmable read-only
memory).
Secondary memory
Disk Storage Devices [HDD]
• Preferred secondary storage device for high storage capacity and low cost.
• Data is stored as magnetized areas on magnetic disk surfaces.
• A disk pack contains several magnetic disks connected to a rotating spindle.
• Each disk surface is divided into concentric circular tracks.
• Track capacities typically vary from 4 to 50 Kbytes or more.
Disk Storage Devices (contd.)

• A track is divided into smaller blocks or sectors
  • because it usually contains a large amount of information
• The division of a track into sectors is hard-coded on the disk
surface and cannot be changed.
• A track is divided into blocks.
  • The block size B is fixed for each system.
  • Typical block sizes range from B=512 bytes to B=4096 bytes.
• Whole blocks are transferred between disk and main memory for
processing.
Disk Storage Devices (contd.)
• Disks consist of platters, each with two surfaces
• Each track consists of sectors separated by gaps

[Figure: a disk surface on a spindle, showing tracks, track k, sectors and gaps]
Disk structure
(Multiple-Platter View)
• Aligned tracks form a cylinder

[Figure: platters 0 to 2 with surfaces 0 to 5 on a common spindle; cylinder k is formed by the aligned tracks]
Disk Operation (Single-Platter View)
• The disk surface spins at a fixed rotational rate.
• The read/write head is attached to the end of the arm and flies over the
disk surface on a thin cushion of air.
• By moving radially, the arm can position the read/write head over
any track.
Disk Access Time

• The average time to access some target sector is approximated by the sum of:
  • Seek time: time to position the heads over the cylinder containing the target sector
  • Rotational latency: time waiting for the first bit of the target sector to pass under the r/w head
  • Transfer time: time to read the bits in the target sector
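As a worked example, the three components can be added up for an imaginary disk. The figures used below (9 ms average seek, 7200 rpm, 400 sectors per track) are assumed values, and taking the average rotational latency as half a rotation is the usual approximation rather than something stated in these notes.

```python
def average_access_time_ms(avg_seek_ms, rpm, sectors_per_track):
    """Approximate average time (ms) to access one sector."""
    rotation_ms = 60_000 / rpm                      # time for one full rotation
    rotational_latency_ms = rotation_ms / 2         # on average, half a rotation
    transfer_ms = rotation_ms / sectors_per_track   # one sector passes under the head
    return avg_seek_ms + rotational_latency_ms + transfer_ms

# Assumed example disk: 9 ms seek, 7200 rpm, 400 sectors per track.
t = average_access_time_ms(avg_seek_ms=9.0, rpm=7200, sectors_per_track=400)
print(round(t, 4))
```

For such a disk the seek time and rotational latency dominate; the transfer time of a single sector is tiny in comparison.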
• Buffering of blocks: Before any data is processed, it is copied block by block into a
main memory buffer. When several blocks need to be transferred from disk to main
memory and all the block addresses are known, several buffers can be reserved in main
memory to speed up the transfer: while one buffer is being read or written, the CPU can
process data in the other buffer.
Double buffering:

• Once a block has been transferred from secondary memory to primary memory,
the CPU can start processing it. Simultaneously, the disk I/O processor can be
reading and transferring the next block into a different buffer. This technique is called
double buffering. It can also be used to write a continuous stream of blocks from memory to
the disk.
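The overlap of reading and processing can be imitated in software. In the sketch below a single worker thread stands in for the disk I/O processor and the main thread stands in for the CPU; the function names and the use of a thread pool are illustrative assumptions, not how a DBMS actually drives a disk.

```python
from concurrent.futures import ThreadPoolExecutor

def read_block(disk_blocks, i):
    """Stand-in for a disk read: copy block i into a fresh buffer."""
    return list(disk_blocks[i])

def process_with_double_buffering(disk_blocks, process):
    """Process every block while the next one is being read in the background."""
    results = []
    if not disk_blocks:
        return results
    with ThreadPoolExecutor(max_workers=1) as io:          # the "I/O processor"
        pending = io.submit(read_block, disk_blocks, 0)    # start filling buffer A
        for i in range(len(disk_blocks)):
            current = pending.result()                     # wait for the full buffer
            if i + 1 < len(disk_blocks):
                # Start filling the other buffer before processing this one.
                pending = io.submit(read_block, disk_blocks, i + 1)
            results.append(process(current))               # the "CPU" works meanwhile
    return results

blocks = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
totals = process_with_double_buffering(blocks, sum)
print(totals)
```

At any moment at most two buffers are in use: the one being processed and the one being filled.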
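Blocking factor

• The blocking factor bfr for a file is the (average) number of file records stored in a disk
block. For fixed-length records of R bytes that do not span blocks, with block size B bytes
(B >= R), it is the quotient rounded down:

Blocking factor [bfr] = floor(Block size (B) / Record size (R))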
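A short worked example, assuming fixed-length records that are not allowed to span blocks; the block size of 512 bytes, record size of 100 bytes and file of 1000 records are made-up figures.

```python
import math

def blocking_factor(block_size, record_size):
    """Whole records that fit in one block (fixed-length, unspanned records)."""
    return block_size // record_size

def blocks_needed(num_records, block_size, record_size):
    """Number of blocks required to store the whole file."""
    return math.ceil(num_records / blocking_factor(block_size, record_size))

bfr = blocking_factor(512, 100)        # 5 records per block
blocks = blocks_needed(1000, 512, 100) # 1000 records / 5 per block = 200 blocks
print(bfr, blocks)
```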


Advantages of double buffering:
• It permits continuous reading or writing of data on consecutive disk blocks, which
eliminates the seek time and rotational delay for all but the first block transfer.
• Since data is kept ready for processing, waiting time is reduced.
Buffering of blocks
• Buffers are temporary storage areas reserved in
main memory to speed up the transfer.
• Consider two processes A and B that run concurrently in
an interleaved fashion, and two processes C and D
that run concurrently in a parallel fashion.
• When a single CPU controls multiple processes,
parallel execution is not possible, but interleaving is.
• When processes run concurrently in a parallel fashion,
buffering is most useful, because either a separate disk I/O
processor is available or multiple CPUs
exist.
Placing file records on disks:

• Records: Data is usually stored in the form of records. Each record consists of a collection
of related data values or items, where each value corresponds to a particular field of the
record.
• Record types: A collection of field names and their corresponding data types constitutes
a record type or record format definition.
• Data type: The data type associated with each field specifies the types of values the field can
take.
• Files: A file is a sequence of records. All records in a file are of the same
record type.
• Fixed-length records: If every record in the file has exactly the same
size in bytes, the file is said to be made up of fixed-length records.
• Variable-length records: If different records in the file have different
sizes, the file is said to be made up of variable-length records.
• A file descriptor (or file header) includes information that describes
the file, such as the field names and their data types, and the addresses
of the file blocks on disk.
Allocating file blocks on disks:

 There are various methods for allocating the blocks of a file on a disk.
1. Contiguous allocation: File blocks are allocated to consecutive disk blocks. This
makes reading the whole file very fast using double buffering, but it makes expanding
the file difficult.
2. Linked allocation: Each file block contains a pointer to the next file block. This makes
expanding the file easy but reading the whole file slow. A combination of the two allocates
clusters of consecutive disk blocks and links the clusters together; clusters are sometimes
called segments or extents.
3. Indexed allocation: Addresses the problems of contiguous and linked allocation.
Each file has its own index block, which stores pointers to (the addresses of) the disk
blocks occupied by the file. The directory contains the addresses of the index blocks of
the files.
Operations on files:

Operations on files are usually grouped into retrieval operations and update operations. Update
operations change the file by insertion or deletion of records or by modification of field values.
1. Find (or Locate): Searches for the first record satisfying a search condition.
2. Read (or Get): Copies the current record from the buffer to a program variable.
3. Find next: Searches for the next record satisfying the search condition.
4. Delete: Deletes the current record.
5. Modify: Modifies some field values of the current record.
6. Insert: Inserts a new record into the file.
Files of Records

 File records can be unspanned or spanned.
 Unspanned: A record must reside entirely in one block.
 Spanned: A record can reside in more than one block.
Unspanned and Spanned Records
File organization[10]

Files are basically organized in three ways:

1) Unordered Files
2) Ordered Files
3) Hashing Techniques
1.Unordered Files or [Heap files]

 Also called a heap or a pile file.


 New records are inserted at the end of the file.
 A linear search through the file records is necessary to search for a record.
 Record insertion is quite efficient.
 Reading the records in order of a particular field requires sorting the file records.
2.Ordered Files

 Also called a sequential file.


 File records are kept sorted by the values of an ordering field.
 Insertion is expensive: records must be inserted in the correct order.
 A binary search can be used to search for a record on its ordering field value.
 Reading the records in order of the ordering field is quite efficient.
Hashing technique

What is Hashing?
Hashing is an effective technique to calculate the direct location of a data record on the disk
without using index structure.
The basic terms associated with the hashing technique are:
 Bucket − A hash file stores data in bucket format. A bucket is the unit of
storage; it typically corresponds to one complete disk block, which in turn can
store one or more records.
 Hash function − A hash function h is a mapping from the set of all search keys K
to the addresses where the actual records are placed, i.e., a function from
search keys to bucket addresses.
 Example: H(K) = K MOD M
Consider a hash function H(K) = K MOD M, which produces a remainder between 0 and M-1
depending on the value of the key K. This value is then used as the bucket address of
the record.
With M = 10, H(K) = K MOD 10 produces a bucket address between 0 and 9.
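
The modulo hash function can be sketched in a few lines of Python. This is only an illustration: the bucket count M = 10 follows the example in the notes, while the sample keys and the list-of-lists bucket layout are assumptions made for the sketch.

```python
# Sketch of the hash function H(K) = K MOD M.
M = 10  # number of buckets, as in the K MOD 10 example

def h(key: int) -> int:
    """Return the bucket address (0 .. M-1) for an integer search key."""
    return key % M

# One list per bucket; each bucket holds the keys hashed to it.
buckets = [[] for _ in range(M)]
for key in (27, 43, 58, 60, 93):  # hypothetical search keys
    buckets[h(key)].append(key)

print(h(27))       # 7
print(buckets[3])  # [43, 93]
```

Keys 43 and 93 land in the same bucket, which is the situation that the overflow handling techniques address.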
********Hashing Types
Hashing is basically of two types:
1.Static Hashing
2.Dynamic Hashing
1.Static Hashing

In static hashing, when a search-key value is provided, the hash function always
computes the same address.
For example, if a mod 4 hash function is used, it generates only 4 values (0 to 3).
The output address is always the same for a given key.
The number of buckets provided remains unchanged at all times.
Operation in static Hashing
Insertion − When a record is required to be entered
using static hash, the hash function h computes the
bucket address for search key K, where the record
will be stored.
Bucket address = h(K)
Search − When a record needs to be retrieved, the
same hash function can be used to retrieve the
address of the bucket where the data is stored.
Delete − This is simply a search followed by a
deletion operation.
Bucket Overflow
The condition of bucket overflow is known as a collision. It is a serious problem
for any static hash function, because the number of buckets is fixed. In this
case, overflow chaining or linear probing can be used.
Overflow Chaining − When buckets are full, a new
bucket is allocated for the same hash result and is
linked after the previous one. This mechanism is
called Closed Hashing.
Example
Linear Probing − When a hash function
generates an address at which data is already
stored, the next free bucket is allocated to it.
This mechanism is called Open Hashing.
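
The two overflow techniques above can be sketched as follows. This is a minimal illustration, not a disk-based implementation: the bucket count, the bucket capacity of 2, and the sample keys are all assumptions chosen to force an overflow.

```python
M = 4          # fixed number of buckets (static hashing)
CAPACITY = 2   # records per bucket, assumed small to force overflow

def h(key):
    return key % M

# Overflow chaining: each home bucket is a chain of fixed-size buckets.
chains = [[[]] for _ in range(M)]

def insert_chained(key):
    chain = chains[h(key)]
    if len(chain[-1]) == CAPACITY:  # last bucket is full: link a new overflow bucket
        chain.append([])
    chain[-1].append(key)

# Linear probing: one record per address; on collision take the next free address.
slots = [None] * M

def insert_probing(key):
    for step in range(M):
        addr = (h(key) + step) % M
        if slots[addr] is None:
            slots[addr] = key
            return addr
    raise OverflowError("hash file is full")

for k in (4, 8, 12):  # all three keys hash to address 0
    insert_chained(k)
print(chains[0])  # [[4, 8], [12]]

print([insert_probing(k) for k in (4, 8, 12)])  # [0, 1, 2]
```

With chaining, the third key goes into a linked overflow bucket; with probing, it spills into the neighbouring addresses.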
2.Dynamic Hashing

The problem with static hashing is that it does not expand or shrink dynamically
as the size of the database grows or shrinks. Dynamic hashing provides a mechanism
in which data buckets are added and removed dynamically and on demand. Dynamic
hashing is also known as extendible hashing.
In dynamic hashing, the hash function is made to produce a large number of values,
of which only a few are used initially.
Unit -3
The Relational Model

 The relational model for database management is a database model based on
first-order predicate logic, first formulated and proposed in 1969 by Edgar F. Codd. In the
relational model of a database, all data is represented in terms of tuples, grouped into
relations. A database organized in terms of the relational model is a relational database.

 Note: A relational database is a collection of relations.

Relation structure

Relation: customer

customer_name   customer_street   customer_city
Jones           Main              Harrison
Smith           North             Rye
Curry           North             Rye
Lindsay         Park              Pittsfield

The columns are the attributes; the rows are the tuples.
*****Relational Model constraints or
Schema based constraints :
1. Key constraint
2. Constraints on NULL.
3. Integrity constraint:
a) Domain constraint
b)Entity integrity constraints.
c)Referential Integrity Constraints

1.Key constraint:
 All tuples in a relation must be distinct. This means that no two tuples can have the
same combination of values for the key attributes.
2.Constraints on NULL:

A constraint on an attribute specifies whether NULL values are permitted for it.
If NULL values are not permitted, the constraint is specified as NOT NULL.

Integrity constraint:

a) Domain constraint:
It specifies that the value of each attribute A must be an atomic value from the domain
dom(A) of that attribute.
E.g.: an attribute of integer type can hold only integer values, not float values.
b)Entity integrity constraints:
It states that no primary key value can be NULL.

c)Referential Integrity Constraints:
Every non-NULL foreign key value must match a value of the primary key (or a
unique key) of the referenced table, which may be the same table or a
different table.
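
The key, NOT NULL and referential integrity constraints can be tried out with Python's built-in `sqlite3` module. This is a hedged sketch: the `department` and `project` tables and the helper `rejected` are illustrative names, and SQLite is used only because it ships with Python (it checks foreign keys only after the `PRAGMA` shown).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.execute("CREATE TABLE department (dnum INTEGER PRIMARY KEY, dname TEXT NOT NULL)")
conn.execute("""CREATE TABLE project (
    pno   INTEGER PRIMARY KEY,
    pname TEXT NOT NULL,
    dnum  INTEGER REFERENCES department(dnum))""")
conn.execute("INSERT INTO department VALUES (1, 'Admin')")
conn.execute("INSERT INTO project VALUES (20, 'ERP', 1)")

def rejected(sql):
    """Return True if the statement violates an integrity constraint."""
    try:
        conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True

print(rejected("INSERT INTO department VALUES (1, 'Research')"))  # key constraint
print(rejected("INSERT INTO department VALUES (2, NULL)"))        # NOT NULL constraint
print(rejected("INSERT INTO project VALUES (30, 'Payroll', 9)"))  # referential integrity
```

Each of the three statements is refused by the database itself, without any checking code in the application.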

*****Codd’s 12 Rules

• In 1985 Dr. E. F. Codd, the originator of the relational model, proposed 12 rules to
test whether a DBMS conforms to his relational model.
Rule 1: The Information Rule

 The information rule requires all information in the database to be represented in
one and only one way, namely as values in tables.
Rule 2: Guaranteed Access

 All data must be accessible.
 Each unique piece of data (atomic value) should be accessible by the combination of
table name + primary key value + attribute (column) name.
E.g.: select name from student where id=10;

Rule 3: Systematic Treatment of Null Values

 The RDBMS must allow an attribute to remain NULL. Specifically, it must support a
systematic representation of missing information and inapplicable information.
Exception: primary key attributes cannot be NULL.

Rule 4: Dynamic Online Catalog

 The data dictionary should be stored as relational tables and be accessible through the
regular data access language.
 The data dictionary (catalog) holds the description of the database.
 The same query language is used on the catalog as on the application database.
 E.g.: SQL is used for both purposes.

Rule 5: Data Sub language

 One well-defined language must provide all manners of access to data.
 Example: SQL, because it supports data definition, data manipulation, security, integrity
constraints and transaction management.

Rule 6: View Updating

 All views that are theoretically updatable should be updatable by the system.
E.g.: if a view is formed from tables, changes made to the view should be reflected in the
base tables.
View: a virtual table derived from one or more base tables.

Rule 7: High-level Insert, Update, Delete

 The system must support set-at-a-time insert, update and delete operations.
 Set operations like union, intersection and minus should be supported.

Rule 8: Physical Data Independence

 The physical storage of data should not matter to the system.
 For example, if a file supporting a table is renamed or moved from one disk to another, it
should not affect the applications.

Rule 9: Logical Data Independence

 If there is a change in the logical structure (table structure) of the database, the user's
view of the data should not change. This is implemented through views.
 For example, if a table is split into two tables, a new view should give the same result
as the join of the two tables.

Rule 10: Integrity Independence

 The database should be able to enforce its own integrity rather than relying on other programs.
 A minimum of the following two integrity constraints must be
supported:
 Entity integrity: No component of a primary key may be NULL.
 Referential integrity: For each distinct non-null foreign key value in
a relational database, there must exist a matching primary key value
from the same domain.
 Integrity rules act as filters that allow only correct data; they should be stored in the
data dictionary.

Rule 11: Distribution Independence

 The distribution of portions of the database to various locations should be invisible to the
users of the database.
 A database should work properly regardless of its distribution across a network.

Rule 12: Non subversion Rule
 If low level access is allowed it must not bypass security nor
integrity rules
 If low level access is allowed to a system it should not be able to
subvert or bypass integrity rules to change data

*******Relational algebra[10M]

 The basic set of operations for the relational model is the relational algebra.
It enables the specification of basic retrievals.
 The result of a retrieval is a new relation, which may have been formed from one or more
relations.
 Algebra operations thus produce new relations, which can be further manipulated using
operations of the same algebra.

Applications

The main application of relational algebra is providing a theoretical foundation for


relational databases, particularly query languages for such databases.

*****Different operations of Relational Algebra:

The relational algebra operations are divided into two groups.

1.Set operations
2.Additional operations
1.Set operations
 UNION
 INTERSECTION
 DIFFERENCE and
 CARTESIAN PRODUCT

2.Additional Operations:
Developed specifically for the relational database.

1. SELECT
2. PROJECT and
3. Join operations

UNION [∪]:

 The result of this operation, denoted by R ∪ S, is a relation that includes all
tuples that are either in R or in S or in both. Duplicate tuples do not appear
in the output.
 Notation: R ∪ S
Relation R and S

R
ENO  NAME
1    Arun
2    John
3    Smith
4    David

S
ENO  NAME
3    Smith
4    David
5    Anand
6    Nikil
Q = R ∪ S

ENO  NAME
1    Arun
2    John
3    Smith
4    David
5    Anand
6    Nikil
INTERSECTION [∩]:

 The intersection operation selects the tuples that are common to the two
relations.
 Q = R ∩ S

Q = R ∩ S

ENO  NAME
3    Smith
4    David
DIFFERENCE (-):

 The result of the difference operation consists of all tuples that are in R but not in S.
 Q = R - S

Q = R - S

ENO  NAME
1    Arun
2    John
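
The three set operations map directly onto Python's set type when each tuple of a relation is written as a Python tuple. The sketch below uses the R and S relations from the slides; representing a relation as a `set` of `(ENO, NAME)` pairs is an assumption made for illustration.

```python
# Relations R and S from the example, as sets of (ENO, NAME) tuples.
R = {(1, "Arun"), (2, "John"), (3, "Smith"), (4, "David")}
S = {(3, "Smith"), (4, "David"), (5, "Anand"), (6, "Nikil")}

union = R | S          # tuples in R or S or both; duplicates collapse automatically
intersection = R & S   # tuples common to R and S
difference = R - S     # tuples in R but not in S

print(len(union))            # 6
print(sorted(intersection))  # [(3, 'Smith'), (4, 'David')]
print(sorted(difference))    # [(1, 'Arun'), (2, 'John')]
```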
CARTESIAN PRODUCT (X):

 The Cartesian product or cross product is a binary operation that is used to combine two
relations. Assuming R and S are relations of n and m attributes respectively, the Cartesian
product R X S can be written as
R(A1, A2, A3, …, An) X S(B1, B2, B3, …, Bm)
The result of the above set operation is
 Q(A1, A2, A3, …, An, B1, B2, B3, …, Bm) = R X S,
which has n + m attributes and one tuple for every combination of a tuple from R with a
tuple from S.
R
DNO  DNAME
1    COMPUTER
2    MANAGEMENT
3    SCIENCE

S
PNO  PNAME
10   NETWORKING
11   PAYROLL

The Cartesian product of R and S can be written as R X S:

DNO  DNAME       PNO  PNAME
1    COMPUTER    10   NETWORKING
1    COMPUTER    11   PAYROLL
2    MANAGEMENT  10   NETWORKING
2    MANAGEMENT  11   PAYROLL
3    SCIENCE     10   NETWORKING
3    SCIENCE     11   PAYROLL
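
The Cartesian product of the two example relations can be sketched with `itertools.product`. Storing each relation as a list of tuples is an assumption made for illustration.

```python
from itertools import product

R = [(1, "COMPUTER"), (2, "MANAGEMENT"), (3, "SCIENCE")]  # (DNO, DNAME)
S = [(10, "NETWORKING"), (11, "PAYROLL")]                 # (PNO, PNAME)

# Every tuple of R is concatenated with every tuple of S.
Q = [r + s for r, s in product(R, S)]

print(len(Q))  # 6
print(Q[0])    # (1, 'COMPUTER', 10, 'NETWORKING')
```

The result has 3 x 2 = 6 tuples of 2 + 2 = 4 attributes each, matching the table above.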
UNARY RELATIONAL OPERATIONS:
Basic operations:

1. Selection (σ): Selects a subset of rows from a relation.

2. Projection (π): Selects only a few columns from a relation.

3. Renaming (ρ): Renames relations or attributes.
The syntax is as follows:
Rename <old table> to <new table>

Select operation (σ):

It selects the required rows from the table. This operation is used to select the subset of the
tuples of a relation that satisfy a selection condition or search criterion.
Syntax:

σ <selection condition> (<relation name>)
Projection (π)

 The projection operation is used to select only a few columns from a relation. The
mathematical symbol π (pi) is used to denote the project operation.
 The general syntax for the projection operation is shown below.

π <attribute list> (<relation>)
Selection and projection
 Selection picks the rows that satisfy the selection condition.

σ rating>8 (S2)

sid  sname  rating  age
28   John   9       35.0
58   Smith  10      35.0

π sname,rating (σ rating>8 (S2))

sname  rating
John   9
Smith  10
Projection

π sname,age (S2)

sname  age
Arun   25
Anand  28
Smith  30
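
Selection and projection can be sketched as two small Python functions over a list of dictionaries. Only the first two rows of S2 come from the slides; the other two rows, and the function names `select` and `project`, are hypothetical and added so that the condition actually filters something.

```python
S2 = [
    {"sid": 28, "sname": "John",  "rating": 9,  "age": 35.0},
    {"sid": 58, "sname": "Smith", "rating": 10, "age": 35.0},
    {"sid": 31, "sname": "Ravi",  "rating": 8,  "age": 55.5},  # hypothetical row
    {"sid": 44, "sname": "Meena", "rating": 5,  "age": 35.0},  # hypothetical row
]

def select(relation, predicate):
    """Selection: keep the tuples that satisfy the condition."""
    return [t for t in relation if predicate(t)]

def project(relation, attrs):
    """Projection: keep only the listed attributes and drop duplicate rows."""
    rows = []
    for t in relation:
        row = tuple(t[a] for a in attrs)
        if row not in rows:
            rows.append(row)
    return rows

high = select(S2, lambda t: t["rating"] > 8)
print(project(high, ["sname", "rating"]))  # [('John', 9), ('Smith', 10)]
print(project(S2, ["age"]))                # [(35.0,), (55.5,)]
```

The second call shows that projection, like every relational algebra operation, returns a relation without duplicate tuples.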
Binary Relational Operations

Various forms of join operation


1. Natural join
2. Outer join
a. Left outer join
b. Right outer join
c. Full outer join

.
Pearson Education © 2009 181
What is join?
Join is a combination of a Cartesian product followed by a selection process. A Join operation
pairs two tuples from different relations, if and only if a given join condition is satisfied.
Natural Join (⋈) operation:

A natural join can be performed only if there is at least one common attribute
between the two relations.
In addition, the common attributes must have the same name and domain.
The natural join acts on those matching attributes: it pairs the tuples for which
the values of the common attributes are the same in both relations.
Example for natural join

Project
PNO  PNAME         DNUM
10   Library MGT   2
20   ERP           1
30   Hospital MGT  3
40   Wireless n/w  2

Department
DNUM  DNAME
1     Admin
2     Research
3     Accounts

PROJ_DEPT
PNO  PNAME         DNUM  DNAME
10   Library MGT   2     Research
20   ERP           1     Admin
30   Hospital MGT  3     Accounts
40   Wireless n/w  2     Research
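
The natural join of Project and Department can be sketched as a nested loop that matches tuples on their common attribute names. The function name `natural_join` and the list-of-dictionaries representation are assumptions made for illustration.

```python
PROJECT = [
    {"PNO": 10, "PNAME": "Library MGT",  "DNUM": 2},
    {"PNO": 20, "PNAME": "ERP",          "DNUM": 1},
    {"PNO": 30, "PNAME": "Hospital MGT", "DNUM": 3},
    {"PNO": 40, "PNAME": "Wireless n/w", "DNUM": 2},
]
DEPARTMENT = [
    {"DNUM": 1, "DNAME": "Admin"},
    {"DNUM": 2, "DNAME": "Research"},
    {"DNUM": 3, "DNAME": "Accounts"},
]

def natural_join(r, s):
    """Pair tuples of r and s that agree on all attributes the two relations share."""
    if not r or not s:
        return []
    common = [a for a in r[0] if a in s[0]]  # here: ["DNUM"]
    result = []
    for t in r:
        for u in s:
            if all(t[a] == u[a] for a in common):
                result.append({**t, **u})  # common attributes appear only once
    return result

PROJ_DEPT = natural_join(PROJECT, DEPARTMENT)
print(PROJ_DEPT[0])
# {'PNO': 10, 'PNAME': 'Library MGT', 'DNUM': 2, 'DNAME': 'Research'}
```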
OUTER JOIN: It returns both matching and non-matching rows; that is, it outputs rows even if
they do not satisfy the join condition. In Oracle's older SQL syntax, the outer join operator (+)
is placed on the side of the table that may have no matching rows.
WORKER
NAME      AGE  ADDRESS
Ankith    23   -
Bhrath    21   -
Chaya     20   -
Deepa     21   -
Ganesh    24   -
Lokesh    25   -
Nataraju  23   -
Madhu     22   -

SKILLS
NAME    SKILL
Ankith  Work
Ganesh  Smithy
Lokesh  Driver
Mahesh  Fitter
Pandu   Smithy
Madhu   Fitting

RESULT (every WORKER row is kept; SKILL is empty where there is no match)
NAME      AGE  SKILL
Ankith    23   Work
Bhrath    21
Chaya     20
Deepa     21
Ganesh    24   Smithy
Lokesh    25   Driver
Nataraju  23
Madhu     22   Fitting
The outer join can be used when we want to keep all the tuples in R, or all those in S, or all
those in both relations in the result, whether or not they have matching tuples in the other
relation.

 LEFT OUTER JOIN (⟕): Keeps every tuple of the first, or left, relation R in the
result of joining R and S. If no matching tuple is found in S, the attributes of S in
the join result are filled with NULL values.
 RIGHT OUTER JOIN (⟖): Keeps every tuple of the second, or right, relation S in the
result of joining R and S. If no matching tuple is found in R, the attributes of R in
the join result are filled with NULL values.
 FULL OUTER JOIN (⟗): Keeps all tuples of both the left and the right relations. When
no matching tuple is found, the missing attributes are filled with NULL values as needed.
Left outer join

BRANCH_LOAN
BNAME  LOANNO  LOAN_AMT
X      L120    30000
Y      L220    50000
Z      L440    60000

CUSTOMER_LOAN
CNAME  LOANNO
A      L120
B      L320
C      L440

BRANCH_LOAN left outer join CUSTOMER_LOAN
BNAME  LOANNO  LOAN_AMT  CNAME  LOANNO
X      L120    30000     A      L120
Z      L440    60000     C      L440
Y      L220    50000     NULL   NULL
Right outer join

With the same BRANCH_LOAN and CUSTOMER_LOAN relations:

BRANCH_LOAN right outer join CUSTOMER_LOAN
BNAME  LOANNO  LOAN_AMT  CNAME  LOANNO
X      L120    30000     A      L120
Z      L440    60000     C      L440
NULL   NULL    NULL      B      L320
Full outer join

With the same BRANCH_LOAN and CUSTOMER_LOAN relations:

BRANCH_LOAN full outer join CUSTOMER_LOAN
BNAME  LOANNO  LOAN_AMT  CNAME  LOANNO
X      L120    30000     A      L120
Z      L440    60000     C      L440
Y      L220    50000     NULL   NULL
NULL   NULL    NULL      B      L320
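
The three outer joins of BRANCH_LOAN and CUSTOMER_LOAN can be sketched in plain Python, with `None` standing in for NULL. The function names are illustrative, the join condition is equality of LOANNO as in the example, and the rows come out in input order rather than in the order drawn on the slides.

```python
BRANCH_LOAN = [("X", "L120", 30000), ("Y", "L220", 50000), ("Z", "L440", 60000)]
CUSTOMER_LOAN = [("A", "L120"), ("B", "L320"), ("C", "L440")]

def left_outer(branch, customer):
    """Keep every branch tuple; pad with None when no customer has the same LOANNO."""
    rows = []
    for b in branch:
        matches = [c for c in customer if c[1] == b[1]]
        rows += [b + c for c in matches] if matches else [b + (None, None)]
    return rows

def right_outer(branch, customer):
    """Keep every customer tuple; pad with None when no branch has the same LOANNO."""
    rows = []
    for c in customer:
        matches = [b for b in branch if b[1] == c[1]]
        rows += [b + c for b in matches] if matches else [(None, None, None) + c]
    return rows

def full_outer(branch, customer):
    """Left outer join plus the customer tuples that found no match."""
    unmatched = [r for r in right_outer(branch, customer) if r[0] is None]
    return left_outer(branch, customer) + unmatched

print(left_outer(BRANCH_LOAN, CUSTOMER_LOAN)[1])   # ('Y', 'L220', 50000, None, None)
print(right_outer(BRANCH_LOAN, CUSTOMER_LOAN)[1])  # (None, None, None, 'B', 'L320')
print(len(full_outer(BRANCH_LOAN, CUSTOMER_LOAN))) # 4
```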
Aggregate Functions: Many database applications
require the aggregation or summarization of data. The
aggregation operations are counting, summing,
averaging and finding the maximum and minimum
values on tables.
COUNT(*) Counts the number of rows of the query result

SUM( ) Finds the sum of the values in a column

AVG( ) Returns the average of the values in a column

MAX( ) Returns the maximum value in a column

MIN( ) Returns the minimum value in a column
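
The five aggregate functions can be tried against a small table using Python's built-in `sqlite3` module. This is a sketch under assumptions: the `foodcart` rows are borrowed from the FoodCart example in these notes, and SQLite stands in for whichever DBMS is actually used.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE foodcart (date TEXT, food TEXT, sold INTEGER)")
conn.executemany(
    "INSERT INTO foodcart VALUES (?, ?, ?)",
    [("02/25/08", "pizza", 350), ("02/26/08", "hotdog", 500), ("02/26/08", "pizza", 70)],
)

# All five aggregates computed over the sold column in one query.
count, total, average, highest, lowest = conn.execute(
    "SELECT COUNT(*), SUM(sold), AVG(sold), MAX(sold), MIN(sold) FROM foodcart"
).fetchone()

print(count, total, round(average, 2), highest, lowest)  # 3 920 306.67 500 70
```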

Structured Query Language
(SQL)
The ANSI standard language for the definition and manipulation of relational
databases.
Includes a data definition language (DDL): statements that specify and modify
database schemas.
Includes a data manipulation language (DML): statements that manipulate
database content.
Some Facts on SQL
SQL data is case-sensitive; SQL commands are not.

The first version was developed at IBM by Donald D. Chamberlin and Raymond F.
Boyce. [SQL]

It was developed using Dr. E.F. Codd's paper, “A Relational Model of Data for Large
Shared Data Banks.”

An SQL query includes references to tuple variables and the attributes of those
variables.
Tables in SQL

Table name: Product

PName        Price    Category     Manufacturer
Gizmo        $19.99   Gadgets      GizmoWorks
Powergizmo   $29.99   Gadgets      GizmoWorks
SingleTouch  $149.99  Photography  Canon
MultiTouch   $203.99  Household    Hitachi

The column headings are the attribute names; each line below them is a tuple (row).
Different SQL Languages and their
commands
1. Data Definition Language [DDL]
2. Data Manipulation Language [DML]
3. Data Control Language [DCL]
4. Transaction Control Language [TCL]
5. Data Query Language [DQL]

Different data types of SQL.
1. CHAR(size) - Fixed-length character string.
2. VARCHAR(size) - Varying-length character string.
In Oracle this is also specified as VARCHAR2(size).
3. NUMBER(size) - Integer number without decimal point.
4. INTEGER or INT - Integer number. Size can't be specified.
5. DATE - For representing a date, e.g., YYYY-MM-DD.
6. TIME - For representing time, HH:MM:SS.
SQL: DDL Commands
CREATE TABLE: used to create a table.

ALTER TABLE: modifies a table after it was created.

DROP TABLE: removes a table from a database.
SQL: CREATE TABLE Statement
Things to consider before you create your table are:
The type of data
The table name
What column(s) will make up the primary key
The names of the columns
CREATE TABLE statement syntax:
CREATE TABLE <table name>
( columnname1 datatype [NOT NULL],
columnname2 datatype );
SQL: ALTER TABLE Statement
To add or drop columns on existing tables.

ALTER TABLE statement syntax:
ALTER TABLE <table name>
ADD attribute datatype;
E.g.: ALTER TABLE student ADD phno NUMBER(10);

DROP TABLE statement syntax:
DROP TABLE <table name>;
E.g.: DROP TABLE student;
SQL-DML COMMANDS

INSERT: adds new rows to a table.
SELECT: selects (retrieves) content from a table.
DELETE: deletes one or more rows from a table.
UPDATE (modify): modifies the data in the table.
 Data Control Language [DCL]: These statements are used to give permissions to the
user.
 GRANT: - Giving permissions on object privileges to user.
 REVOKE: - Take back the given permission from the user.

Transaction Control Language [TCL]:
 It is used to control transactions.
 COMMIT: - To permanently save the changes made by a transaction.
 ROLLBACK: - To undo or cancel (discard) the changes made since the previous commit point.

 Data Query Language [DQL]: It is used to retrieve data from the database.
SQL: INSERT Statement
To insert a row into a table, it is necessary to supply a value for each attribute,
in the order in which the attributes were defined.
INSERT statement syntax:
INSERT INTO <table name>
VALUES ('value1', 'value2', NULL);
Example: INSERT INTO FoodCart
VALUES ('02/26/08', 'pizza', 70);

FoodCart (after the insert)
date      food    sold
02/25/08  pizza   350
02/26/08  hotdog  500
02/26/08  pizza   70
SQL: UPDATE Statement
To update the content of the table:
UPDATE statement syntax:
UPDATE <table name> SET <attr> = <value>
WHERE <selection condition>;
Example: UPDATE FoodCart SET sold = 349
WHERE date = '02/25/08' AND food = 'pizza';

FoodCart (before)
date      food    sold
02/25/08  pizza   350
02/26/08  hotdog  500
02/26/08  pizza   70

FoodCart (after)
date      food    sold
02/25/08  pizza   349
02/26/08  hotdog  500
02/26/08  pizza   70
SQL: DELETE Statement
To delete rows from the table:
DELETE statement syntax:
DELETE FROM <table name>
WHERE <condition>;
Example: DELETE FROM FoodCart
WHERE food = 'hotdog';

FoodCart (before)
date      food    sold
02/25/08  pizza   349
02/26/08  hotdog  500
02/26/08  pizza   70

FoodCart (after)
date      food    sold
02/25/08  pizza   349
02/26/08  pizza   70

Note: If the WHERE clause is omitted, all rows of data are deleted
from the table.
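
The INSERT, UPDATE and DELETE examples on FoodCart can be replayed end to end with Python's built-in `sqlite3` module. This is a sketch, assuming SQLite as a stand-in DBMS and `TEXT`/`INTEGER` column types, which the notes do not specify.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE FoodCart (date TEXT, food TEXT, sold INTEGER)")

# INSERT: one value per attribute, in column order.
conn.execute("INSERT INTO FoodCart VALUES ('02/25/08', 'pizza', 350)")
conn.execute("INSERT INTO FoodCart VALUES ('02/26/08', 'hotdog', 500)")
conn.execute("INSERT INTO FoodCart VALUES ('02/26/08', 'pizza', 70)")

# UPDATE: only the rows matching the WHERE condition change.
conn.execute("UPDATE FoodCart SET sold = 349 WHERE date = '02/25/08' AND food = 'pizza'")

# DELETE: removes the rows matching the WHERE condition.
conn.execute("DELETE FROM FoodCart WHERE food = 'hotdog'")

rows = conn.execute("SELECT date, food, sold FROM FoodCart ORDER BY date").fetchall()
print(rows)  # [('02/25/08', 'pizza', 349), ('02/26/08', 'pizza', 70)]
```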
SQL Statements, Operations, Clauses

SQL Statements:
Select
SQL Operations:
Join
Left Join
Right Join
Like
SQL Clauses:
Order By
Group By
Having
SQL: SELECT Statement
A basic SELECT statement includes 3 clauses:

SELECT <attribute name> FROM <tables> WHERE <condition>

SELECT: Specifies the attributes that are part of the resulting relation.
FROM: Specifies the tables that serve as the input to the statement.
WHERE: Specifies the selection condition, including the join condition.
SQL: SELECT Statement (cont.)
Using a “*” in a select statement indicates that every attribute of the input
table is to be selected.
Example: SELECT * FROM … WHERE …;

To get unique rows, type the keyword DISTINCT after SELECT.
Example: SELECT DISTINCT * FROM … WHERE …;
Example:

Person
Name    Age  Weight
Harry   34   80
Sally   28   64
George  29   70
Helena  54   54
Peter   34   80

1) SELECT *
FROM person
WHERE age > 30;

Name    Age  Weight
Harry   34   80
Helena  54   54
Peter   34   80

2) SELECT weight
FROM person
WHERE age > 30;

Weight
80
54
80

3) SELECT distinct weight
FROM person
WHERE age > 30;

Weight
80
54
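
The three queries on the Person table can be run with Python's built-in `sqlite3` module to confirm the results. This is a sketch that assumes SQLite as the DBMS; an `ORDER BY` is added only so that the printed order is predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person (name TEXT, age INTEGER, weight INTEGER)")
conn.executemany(
    "INSERT INTO person VALUES (?, ?, ?)",
    [("Harry", 34, 80), ("Sally", 28, 64), ("George", 29, 70),
     ("Helena", 54, 54), ("Peter", 34, 80)],
)

q1 = conn.execute("SELECT * FROM person WHERE age > 30 ORDER BY name").fetchall()
q2 = conn.execute("SELECT weight FROM person WHERE age > 30 ORDER BY weight").fetchall()
q3 = conn.execute("SELECT DISTINCT weight FROM person WHERE age > 30 ORDER BY weight").fetchall()

print(q1)  # [('Harry', 34, 80), ('Helena', 54, 54), ('Peter', 34, 80)]
print(q2)  # [(54,), (80,), (80,)]
print(q3)  # [(54,), (80,)]
```

Query 2 keeps the duplicate weight 80, while DISTINCT in query 3 removes it.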
 The DROP Schema Command: The Drop schema command can be used to drop or
delete a whole schema if it is not needed any more. There are two drop behavior options,
CASCADE & RESTRICT .

 CASCADE option: Drops a schema along with all its elements. For example, to
remove the COMPANY database schema and all its tables, domains and other elements, the
CASCADE option is used as follows.
DROP SCHEMA COMPANY CASCADE;

 RESTRICT option: Drops a schema only if it has no elements in it.
 For example, to remove the COMPANY schema if it has no elements in it, the RESTRICT
option is used as follows.
DROP SCHEMA COMPANY RESTRICT;
Otherwise the DROP command will not be executed.

 The DROP TABLE command: If a base relation within a schema is not needed any longer,
the relation and its definition can be deleted by using the DROP TABLE command.
 For example, to delete the EMPLOYEE table from the COMPANY schema:

DROP TABLE EMPLOYEE;

 DROP TABLE EMPLOYEE CASCADE;
This command drops the EMPLOYEE table and automatically drops all constraints and views
that reference the table from the schema, along with the table itself.
DROP TABLE EMPLOYEE RESTRICT;
This command drops the table only if it is not referenced in any constraints or views.

Views in SQL

 VIEWS (Virtual Tables) in SQL: A View in SQL is a single table that is derived from
other tables. These other tables could be base tables or previously defined Views. A view
does not exist in physical form. It is called as virtual table in contrast to base tables whose
tuples are actually stored in the databases.

The general syntax for creating a VIEW is as follows:
CREATE VIEW <view name> AS
SELECT <column list>
FROM <table name1>, <table name2>, …
WHERE <condition>;
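
A view can be created and queried through Python's built-in `sqlite3` module. In this sketch the view name `older_person` is hypothetical, the base table reuses the Person rows from the SELECT example, and SQLite stands in for the DBMS.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person (name TEXT, age INTEGER, weight INTEGER)")
conn.executemany(
    "INSERT INTO person VALUES (?, ?, ?)",
    [("Harry", 34, 80), ("Sally", 28, 64), ("George", 29, 70),
     ("Helena", 54, 54), ("Peter", 34, 80)],
)

# The view stores only its defining query, not the rows.
conn.execute("CREATE VIEW older_person AS SELECT name, age FROM person WHERE age > 30")

before = conn.execute("SELECT name FROM older_person ORDER BY name").fetchall()

# A change to the base table is visible through the view immediately.
conn.execute("UPDATE person SET age = 31 WHERE name = 'Sally'")
after = conn.execute("SELECT name FROM older_person ORDER BY name").fetchall()

print(before)  # [('Harry',), ('Helena',), ('Peter',)]
print(after)   # [('Harry',), ('Helena',), ('Peter',), ('Sally',)]
```

Because the view is virtual, Sally appears in it as soon as her row in the base table satisfies the condition.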

Limitations of SQL:
 SQL does not have any procedural capabilities. It does not provide the programming
techniques of condition checking, looping and branching that are vital for testing data
before its permanent storage.
 SQL statements are passed to the Oracle engine one at a time. Each time an SQL statement
is executed, a call is made to the engine's resources.
 If an error occurs while processing an SQL statement, the Oracle engine displays its own
error messages. SQL has no facilities for programmed handling of errors that arise during the
manipulation of data.
 It is not a fully structured programming language.
PL/SQL

Advantages of PL/SQL:
 PL/SQL is a development tool that not only supports SQL data manipulation but also
provides facilities for conditional checking, branching and looping.
 PL/SQL sends an entire block of SQL statements to the Oracle engine all at once, so
communication between the program block and the Oracle engine is reduced considerably,
which reduces network traffic. The code is processed much faster.
 PL/SQL allows declaration and use of variables in blocks of code.
 These variables can be used to store intermediate results of a query
for later processing, or to calculate values and insert them into an
Oracle table later.
 Using PL/SQL all sorts of calculations can be done quickly and
efficiently without the use of the Oracle engine. This improves
transaction performance.
 Applications written in PL/SQL are portable to any computer
hardware and operating system where Oracle is operational.
Pl/SQL Block: A PL/SQL block has a definite structure, which can be divided into 4 sections.
1. The declaration section
2. The begin section
3. The exception section
4. The end section.
PL/SQL Character set:
Upper case letters: A…Z
Lower case letters: a…z
Numerals: 0…9
Special symbols: (, ), +, -, *, /, <, >, =, !, :, ;, ., ‘, ”, @, #, $, %, &, _, \, {, }, [, ], ?, |.
Literals: A sequence of valid PL/SQL characters is called literal.
PL/SQL Data types:
1. Number
2. Char
3. Date
4. Boolean
PL/SQL Execution:

1. Invoke the editor from the edit menu of the Oracle SQL*Plus environment; the
Notepad editor is invoked.
2. Enter the PL/SQL program.
3. Save and exit using the file menu.
4. To execute the PL/SQL block, type ‘/’ at the SQL prompt and press enter.
5. Errors, if any, will be indicated; go back and correct them.
6. When execution is successful, the message ‘PL/SQL procedure
successfully completed’ is displayed.
PL/SQL Variable: PL/SQL variables are declared in the declarative part of a PL/SQL
block.
A variable name must begin with a letter, followed by letters or digits.
The maximum length of a variable name is 30 characters.
The underscore (_) is the only special character allowed.
Syntax:
<variable_name> <data type>(size);
Example:
STUD_NAME varchar (30);
SALARY number (6, 2);
Constant: If a particular item does not change its value in a PL/SQL block, we can declare it as
a constant in the block; its value must be assigned along with the declaration.

Syntax: <constant_name> constant <data type> := <value>;

Example: discount CONSTANT number (5, 2) := 3.50;
 Comments: A comment line is not executable. We can place comments
anywhere inside a PL/SQL block. Comments are very useful to
programmers for later reference and debugging.
 Single line comments (--): A comment having only one line is
called a single line comment. It begins with two dashes.
 Multiline comments (/* ... */): If a comment runs over more than
one line, we need not begin each line with --, but just mark the
beginning of the comment with /* and mark the end of the comment
with */.
Program to compute the sum of two numbers
SQL> set serveroutput on
SQL> ed
Wrote file afiedt.buf
declare
  n1 number;
  n2 number;
  s number;
begin
  n1 := 10;
  n2 := 20;
  s := n1 + n2;
  -- trailing space in the literal so the value is not glued to the text
  dbms_output.put_line('Sum of 10 and 20 is ' || s);
end;
SQL> /
Sum of 10 and 20 is 30
PL/SQL procedure successfully completed.
Program to display the current date

SQL> ed
Wrote file afiedt.buf
declare
  d1 date;
begin
  d1 := sysdate;  -- current date and time of the database server
  dbms_output.put_line(d1);
end;
SQL> /
31-MAR-10
PL/SQL procedure successfully completed.
Difference between SQL and PL/SQL

1.) SQL is a data-oriented language for selecting and manipulating sets
of data.
PL/SQL is a procedural language used to create applications.

2.) PL/SQL can be the application language, just as Java or PHP
can. PL/SQL might be the language we use to build, format and
display screens, web pages and reports.
SQL may be the source of data for those screens, web pages and reports.
Embedded SQL
Embedded SQL combines a high-level language with a database
language such as SQL. It allows application programs to
communicate with the database and get the requested results.
A high-level language that supports embedding SQL within it
is known as a host language.
Host languages that support embedded SQL include C, C++, Ada,
Pascal, FORTRAN, Java, etc. When SQL is embedded within C or
C++ using Oracle's precompiler, it is known as Pro*C/C++ or
simply Pro*C. Pro*C is a commonly used form of embedded SQL.
Example

EXEC SQL BEGIN DECLARE SECTION;


int STD_ID;
char STD_NAME [15];
char ADDRESS[20];
EXEC SQL END DECLARE SECTION;
Questions:
1. What is SQL? Explain Insert, Delete and Update in SQL.
2. Explain the different data types in SQL.
3. Explain briefly the history of SQL.
4. Explain the ALTER command with an example.
5. What is a view? Explain with an example how it is created in SQL.
6. Explain the different aggregate functions available in SQL.
7. Write a short note on VIEWS and INDEXES.
Definition of important terms:
 Super key: A set of one or more attributes that, taken collectively, allows
us to uniquely identify an entity in the entity set.
 The combination of stdname and regno is a super key for the entity set student.
 Candidate key: A super key may contain extraneous attributes. A super key
for which no proper subset is a super key, i.e., a minimal super key, is called a
candidate key.
 The combination (stdname, regno) forms a super key; the attribute regno
alone is a candidate key.

 Primary key: It is a candidate key chosen by the database designers as the
principal means of identifying entities within an entity set.

 Prime attribute: An attribute of relation schema R is called a prime
attribute if it is a member of some candidate key of R.

 Non-prime attribute: An attribute is called non-prime if it is not a prime
attribute, i.e., it is not a member of any candidate key.
 Weak entity set: An entity set not having sufficient attributes to form a
primary key is called weak entity set.

 Strong entity set: An entity set that has a primary key is termed a strong
entity set.

 Foreign key: A set of attributes in a relation is a foreign key if it satisfies
the following conditions.
 It has the same domain as the primary key attributes of another relation
schema, and is said to refer to that relation.
 A value of the foreign key in a tuple t1 either occurs as a value of the primary key
for some tuple t2 in the referenced relation, or is NULL.
 Composite key: If a key has more than one attribute, it is called a composite
key.

 Relation key: Given a relation, if the value of an attribute X uniquely
determines the values of all other attributes in a row, then X is said to be
the key of that relation.
The

Evils of Redundancy
Redundancy is at the root of several problems associated with relational
schemas:
 redundant storage, insert/delete/update anomalies
 Integrity constraints, in particular functional dependencies, can be used to
identify schemas with such problems and to suggest refinements.
 Main refinement technique: decomposition (replacing ABCD with, say, AB
and BCD, or ACD and ABD).
 Decomposition should be used judiciously:
 Is there reason to decompose a relation?
 What problems (if any) does the decomposition cause?

Redundancy

• Dependencies between attributes cause redundancy
• Ex. All addresses in the same town have the same zip code

SSN   Name  Town         Zip
1234  Joe   Stony Brook  11733
4321  Mary  Stony Brook  11733
5454  Tom   Stony Brook  11733
…
(the Town and Zip values are stored redundantly)

Redundancy and Other Problems

 Set valued attributes result in multiple rows in corresponding table


 Example: Person (SSN, Name, Address, Hobbies)
 A person entity with multiple hobbies yields multiple rows in
table Person
– Hence, Name and Address are stored redundantly
• SSN should be the key, but instead (SSN, Hobby) is the key of the
corresponding relation
– Person can’t describe people without hobbies

SSN   Name  Address   Hobby
1111  Joe   123 Main  biking
1111  Joe   123 Main  hiking
…

Anomalies

 Redundancy leads to anomalies:


 Update anomaly: A change in Address must be made in several places
 Deletion anomaly: Suppose a person gives up all hobbies. Do we:
 Set Hobby attribute to null (no, since Hobby is part of key)
 Delete the entire row (no, since we lose other information in the row)
 Insertion anomaly: Hobby value must be supplied for any inserted row (since
Hobby is part of key)

246
Decomposition

• Solution: use two relations to store Person information
– Person1 (SSN, Name, Address)
– Hobbies (SSN, Hobby)
• The decomposition is more general: people without hobbies can now be described
• No update anomalies:
– Name and address are stored once
– A hobby can be separately supplied or deleted

Normalization Theory
 Anomalies:
 update anomaly occurs if changing the value of an attribute leads to an inconsistent
database state.
 insertion anomaly occurs if we cannot insert a tuple due to some design flaw.
 deletion anomaly occurs if deleting a tuple results in unexpected loss of
information.
 Normalization is the systematic process for removing all such anomalies in
database design, based on functional dependencies

248
**What are the Building blocks of Normalization?
(1)Functional dependency;
(2)Determinants
(3)Key attributes
(4)Non-key attributes

.Slide 2- 249
Different Types of Functional Dependencies
 Fully Functional dependency
 Partially Functional dependency
 Transitive Functional dependency

.Slide 2- 250
Functional Dependencies
• A functional dependency is a constraint between two sets of
attributes in a relational database.
• If X and Y are two sets of attributes in the same relation T, then X → Y
means that X functionally determines Y, so that
– the values of the attributes in X uniquely determine the values of the
attributes in Y
– for any two tuples t1 and t2 in T, t1[X] = t2[X] implies that t1[Y] =
t2[Y]
– if two tuples in T agree in their X column(s), then their Y column(s)
must also be the same.

Another Example: FDs & Redundancy
 Consider relation obtained from Hourly_Emps:
 Hourly_Emps (ssn, name, lot, rating, hrly_wages, hrs_worked)
 Notation: We will denote this relation schema by listing the attributes:
SNLRWH
 This is really the set of attributes {S,N,L,R,W,H}.
 Sometimes, we will refer to all attributes of a relation by using the relation name.
(e.g., Hourly_Emps for SNLRWH)
• Some FDs on Hourly_Emps:
– ssn is the key: S → SNLRWH
– rating determines hrly_wages: R → W

Example (Contd.)
• Problems due to R → W:
– Update anomaly: Can we change W in just the 1st tuple of SNLRWH?
– Insertion anomaly: What if we want to insert an employee and don’t know the hourly
wage for his rating?
– Deletion anomaly: If we delete all employees with rating 5, we lose the
information about the wage for rating 5!

Hourly_Emps
S            N          L   R  W   H
123-22-3666  Attishoo   48  8  10  40
231-31-5368  Smiley     22  8  10  30
131-24-3650  Smethurst  35  5  7   30
434-26-3751  Guldu      35  5  7   32
612-67-4134  Madayan    35  8  10  40

Will 2 smaller tables be better?

Hourly_Emps2
S            N          L   R  H
123-22-3666  Attishoo   48  8  40
231-31-5368  Smiley     22  8  30
131-24-3650  Smethurst  35  5  30
434-26-3751  Guldu      35  5  32
612-67-4134  Madayan    35  8  40

Wages
R  W
8  10
5  7

Decompositions
• Do we need to decompose a relation?
– Several normal forms have been defined for relations. If a schema is in one of these normal forms, certain
problems don’t arise.
• What problems does decomposition cause?
– Lossless-join property: we can recover the original relation by joining the resulting relations
– Dependency-preservation property: we can enforce the constraints on the original relation by
enforcing some constraints on the resulting relations
– Queries may require a join of the decomposed relations!

Functional Dependencies

R
A   B   C   D   E   F
a1  b1  c1  d1  e1  f1
a1  b1  c2  d1  e2  f3
a2  b1  c2  d3  e2  f3
a3  b2  c3  d4  e3  f2
a2  b1  c3  d3  e4  f4
a4  b1  c1  d5  e1  f1

• Dependencies for this relation:
– A → B
– A → D
– B,C → E,F
• Do they all hold in this instance of the relation R?

• Functional dependencies are specified by the database
programmer based on the intended meaning of the
attributes.

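The question on this slide can be checked mechanically. The following is a minimal Python sketch, not part of the original notes: the helper name fd_holds and the list-of-dicts representation of R are assumptions made for illustration. It applies the definition directly: an FD fails as soon as two tuples agree on the left-hand side but differ on the right-hand side.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if the FD lhs -> rhs holds in this instance (a list of dicts)."""
    seen = {}
    for row in rows:
        x = tuple(row[a] for a in lhs)
        y = tuple(row[a] for a in rhs)
        # setdefault stores y the first time x is seen, otherwise returns the stored value
        if seen.setdefault(x, y) != y:
            return False
    return True


# The instance of R shown on the slide.
R = [dict(zip("ABCDEF", line.split())) for line in (
    "a1 b1 c1 d1 e1 f1",
    "a1 b1 c2 d1 e2 f3",
    "a2 b1 c2 d3 e2 f3",
    "a3 b2 c3 d4 e3 f2",
    "a2 b1 c3 d3 e4 f4",
    "a4 b1 c1 d5 e1 f1",
)]

for lhs, rhs in [("A", "B"), ("A", "D"), ("BC", "EF"), ("C", "E")]:
    print(lhs, "->", rhs, fd_holds(R, lhs, rhs))
```

A check like this can only refute an FD. An FD that happens to hold in one instance is still a design decision about the meaning of the attributes, as the slide notes.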
Functional Dependencies
 What are the functional dependencies in:
COMPANIES(company_name, company_address,
date_founded, owner_name,
owner_title, #shares )

company_name → company_address
company_name → date_founded
company_name, owner_id → owner_title
company_name, owner_id → #shares
company_name, owner_title → owner_id
owner_id → owner_name

Types of functional dependencies

 Full dependency
 Partial dependency
 Transitive dependency

257
Full dependencies

• An attribute B of a relation R is fully functionally dependent on a set of
attributes A of R if it is functionally dependent on A and not
functionally dependent on any proper subset of A.
• Report( S#,C#,Title,Lname,Room#,Marks)
• S#, C# → Marks

• This implies that for a given pair of (S#,C#) values occurring in the
relation Report there is exactly one value of Marks, i.e., Marks is
dependent on S# and C# as a composite pair, but not on either
individually.

Partial dependencies

• An attribute B of a relation R is partially dependent on a set of attributes A of R if it is
functionally dependent on some proper subset of A.
• Report( S#,C#,Title,Lname,Room#,Marks)
• C# → Title
• C# → LName

• The attributes Title and LName are said to be partially dependent on the key (S#, C#), since
they are dependent only on C# and not on S#.

Transitive dependencies

• An attribute B of a relation R is transitively dependent on attribute A
of R if it is functionally dependent on an attribute C, which in turn
is functionally dependent on A or any proper subset of A.

Report( S#,C#,Title,Lname,Room#,Marks)
C# → LName, LName → Room#

• The attribute Room# is said to be transitively dependent on C#
(a proper subset of the key (S#, C#)), since it is dependent on LName, which in turn is dependent on C#.

Keys in the relational model
• Superkey
– A set of one or more attributes which, taken collectively, allow us to identify
uniquely a tuple in a relation.
– Let R be a relation scheme. A subset K of R is a superkey of R if, in any legal
relation [instance] r of R, for all pairs t1 and t2 of tuples in r, t1[K] = t2[K]
⇒ t1 = t2.
• Candidate key
– A superkey for which no proper subset is a superkey.
• Primary key
– The candidate key that is chosen by the database designer as the principal key.

FD and Keys

• A key constraint is a special kind of functional dependency
– The key is on the LHS, all attributes are on the RHS
– SSN → SSN, Name, Address

• For a key, no two rows share the same values; thus, whenever two tuples agree on the LHS they
also agree on the RHS.

Armstrong’s Axioms of FDs

• Reflexivity: If Y ⊆ X then X → Y (trivial FD)
– Name, Address → Name
• Augmentation: If X → Y then XZ → YZ
– If Town → Zip then Town, Name → Zip, Name
• Transitivity: If X → Y and Y → Z then X → Z

Other derived rules

• Union: If X → Y and X → Z, then X → YZ
– X → YX (augment), YX → YZ (augment)
– thus X → YZ (transitive)
• Decomposition: If X → YZ, then X → Y and X → Z
– YZ → Y (reflexive), thus X → Y (transitive)
• Pseudotransitivity: If X → Y and WY → Z,
then XW → Z
• Accumulation rule: If X → YZ and Z → W,
then X → YZW

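In practice the axioms and derived rules are rarely applied by hand. A common shortcut is the attribute closure X+, the set of all attributes determined by X, because X → Y follows from F exactly when Y ⊆ X+. The sketch below is an illustration only: the function names closure and implies, and the encoding of each FD as a pair of Python sets, are assumptions rather than anything defined in these notes.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) pairs of sets."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If we already have the whole left side, we gain the right side.
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def implies(fds, lhs, rhs):
    """True if lhs -> rhs can be derived from fds."""
    return set(rhs) <= closure(lhs, fds)


# Augmentation example from the slide: Town -> Zip gives Town, Name -> Zip, Name.
F1 = [({"Town"}, {"Zip"})]
print(implies(F1, {"Town", "Name"}, {"Zip", "Name"}))

# Pseudotransitivity: X -> Y and WY -> Z give XW -> Z.
F2 = [({"X"}, {"Y"}), ({"W", "Y"}, {"Z"})]
print(implies(F2, {"X", "W"}, {"Z"}))
```

The loop stops once a full pass adds nothing, so it always terminates: the result can only grow, and it is bounded by the set of attributes mentioned in the FDs.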
Normalization

• Normalization was proposed by Codd (1972).
• Codd first proposed three normal forms, which he called 1NF, 2NF and 3NF.
• A stronger definition of 3NF, called Boyce-Codd Normal Form (BCNF), was proposed later.
• Later still, a fourth normal form (4NF) and a fifth normal form (5NF) were proposed, based on multivalued dependencies and
join dependencies respectively.

What is a Normalization?

 Normalization of data can be looked upon as a process of analyzing the given relation
schemas based on their FDs and primary keys to achieve the desirable properties of
 Minimizing the redundancy
 Minimizing the insertion, deletion and update anomalies.

266
Some Definitions

 Prime Attributes:
 An attribute of relation schema R is called a prime attribute of R if it is a
member of some candidate key of R.
• Non-Prime Attributes:
– An attribute is called nonprime if it is not a prime attribute,
that is, it is not a member of any candidate key.

• E.g. Report( S#,C#,Title,Lname,Room#,Marks)
– S# is a prime attribute
– C# is a prime attribute
– Title is a non-prime attribute

First normal form: 1NF
• A relation schema is in 1NF if all of its attributes are:
– single-valued
– restricted to assuming atomic values

• 1NF disallows having a set of values, a tuple of values, or a combination of both as an
attribute value for a single tuple.

• 1NF implies:
– Composite attributes are represented only by their component attributes
– Attributes cannot have multiple values

Example of 1NF
DEPARTMENT

DNAME           DNUMBER  DMGRSSN    DLOCATION
Research        5        333445555  {Bellaire, Sugarland, Houston}
Administration  4        987654321  {Stafford}
Headquarters    1        888665555  {Houston}

Is it in 1NF?

There are 3 main techniques to achieve first normal form for such a relation
• Remove the attribute DLocation that violates 1NF and place it in a separate
relation Dept_location.
• Expand the key so that there will be a separate tuple in the original
Department relation for each location of a department.
• If it is known that at most three locations can exist for a department, replace DLocation
by three atomic attributes, for example Dloc1, Dloc2, Dloc3.

Then, what is 1NF?
DEPARTMENT

DNAME           DNUMBER  DMGRSSN    DLOCATION
Research        5        333445555  Bellaire
Research        5        333445555  Sugarland
Research        5        333445555  Houston
Administration  4        987654321  Stafford
Headquarters    1        888665555  Houston

Is it in 1NF?

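The second technique (one tuple per location) amounts to flattening the set-valued attribute. A small Python sketch of that step follows; the function name to_1nf and the tuple layout are assumptions chosen for illustration.

```python
# DEPARTMENT with a set-valued DLOCATION attribute (not in 1NF).
department = [
    ("Research", 5, "333445555", {"Bellaire", "Sugarland", "Houston"}),
    ("Administration", 4, "987654321", {"Stafford"}),
    ("Headquarters", 1, "888665555", {"Houston"}),
]


def to_1nf(rows):
    """Emit one tuple per (department, location) pair so every value is atomic."""
    return [
        (dname, dnumber, mgrssn, location)
        for dname, dnumber, mgrssn, locations in rows
        for location in sorted(locations)
    ]


flat = to_1nf(department)
for row in flat:
    print(row)
```

The flattened relation is in 1NF, but DNAME and DMGRSSN are now repeated for every location of a department, which is exactly the redundancy that the later normal forms address.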
Another Example

Name   EmpId  Address
Susan  205    525 Mabury Rd. San Jose, CA 95133
Susan  206    875 Gridley St. San Jose, CA 95127

This table is not in 1NF… why?

Example Cont.

This is how the table should look:

Name   EmpId  Street           City      State  Zip
Susan  205    525 Mabury Rd.   San Jose  CA     95133
Susan  206    875 Gridley St.  San Jose  CA     95127

Second normal form: 2NF

 A relation schema R is in 2NF if it is in 1NF and every non-prime attribute is fully


functionally dependent on every key of R.

 Consider the relational schema:


 Empdetails( E#, Project#, Role,Number_Of_shares, Share_worth)

In this,
• E#, Project# -> Role
• E# -> Number_Of_shares

Taking (E#, Project#) as the key, Number_Of_shares depends on E# alone, which is only part
of the key, so Empdetails is not in 2NF.

A typical snapshot may look like…

Professor  Subject    Office
Jones      Math42     MH 410
Jones      CS49C      MH 410
Smith      Chem1A     DH 211
Smith      Chem100W   DH 211
Lee        Math161A   MH 320

2NF Example cont.

To put the previous table in 2NF, it must be split into two separate
tables:

Professor  Office
Jones      MH 410
Smith      DH 211
Lee        MH 320

Professor  Subject
Jones      Math42
Jones      CS49C
Smith      Chem1A
Smith      Chem100W
Lee        Math161A

Third normal form: 3NF
A relation schema R is in 3NF if it is in 2NF and if, whenever a nontrivial
functional dependency X → A holds in R, either
(a) X is a super key of R, or
(b) A is a prime attribute of R.

Example of 3NF
Emp_dept
Ename  SSN  Bdate  Address  Dnumber  Dname  Dmgssn

Functional dependencies
SSN → Ename, Bdate, Address, Dnumber
Dnumber → Dname, Dmgssn

The dependency SSN → Dmgssn is transitive through Dnumber in Emp_dept:
the dependencies SSN → Dnumber and Dnumber → Dmgssn both hold, and Dnumber is neither a key
itself nor a subset of the key of Emp_dept. Emp_dept is therefore not in 3NF.

What is a solution?

280
3NF Normalization

ED1
Ename  SSN  Bdate  Address  Dnumber

Functional dependencies
SSN → Ename, Bdate, Address, Dnumber

ED2
Dnumber  Dname  Dmgssn

Functional dependencies
Dnumber → Dname, Dmgssn

Boyce Codd Normal Form

• A relation schema R is in BCNF if, whenever a nontrivial functional dependency X → A
holds in R, X is a super key of R.

• The formal definition of BCNF differs slightly from the definition of 3NF. The only
difference between 3NF and BCNF is that condition (b) of 3NF, which allows A to be a
prime attribute, is absent from BCNF.

An example
• Consider the relation:
Courses (Dept#, Course#, Lecturer#, Num_Students)

• Assumptions
– Each department offers many courses
– Course# is unique within a department only
– Each lecturer belongs to one department only
– Each lecturer may handle several courses within the department
– A particular course offered by a department may be handled by a single lecturer

The functional dependencies

• {Dept#,Course#} -> Lecturer#
• {Dept#,Course#} -> Num_Students
• {Lecturer#,Course#} -> Num_Students
• Lecturer# -> Dept#

The candidate keys are:

• {Dept#,Course#}
• {Lecturer#,Course#}

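The two candidate keys can be confirmed by brute force over attribute subsets. The sketch below is illustrative only: closure and candidate_keys are hypothetical helper names, and the FD set uses Lecturer# -> Dept#, which follows from the stated assumption that each lecturer belongs to one department only.

```python
from itertools import combinations

SCHEMA = {"Dept#", "Course#", "Lecturer#", "Num_Students"}
FDS = [
    ({"Dept#", "Course#"}, {"Lecturer#"}),
    ({"Dept#", "Course#"}, {"Num_Students"}),
    ({"Lecturer#", "Course#"}, {"Num_Students"}),
    ({"Lecturer#"}, {"Dept#"}),  # each lecturer belongs to one department
]


def closure(attrs, fds):
    """Attribute closure of attrs under fds (pairs of sets)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def candidate_keys(schema, fds):
    """Minimal attribute sets whose closure is the whole schema."""
    keys = []
    for size in range(1, len(schema) + 1):
        for combo in combinations(sorted(schema), size):
            candidate = set(combo)
            if any(key <= candidate for key in keys):
                continue  # contains a smaller key, so it is not minimal
            if closure(candidate, fds) == set(schema):
                keys.append(candidate)
    return keys


keys = candidate_keys(SCHEMA, FDS)
# FDs whose left side is not a super key are the BCNF violations.
violations = [(lhs, rhs) for lhs, rhs in FDS if closure(lhs, FDS) != SCHEMA]
print(keys)
print(violations)
```

Enumerating subsets is exponential in the number of attributes, which is acceptable for a four-attribute teaching example but not for large schemas.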
A sample table

Dept#  Course#  Lecturer#  Num_students
D1     C1       L1         20
D1     C2       L1         15
D1     C3       L2         42
…      …        …          …
D2     C5       L3         12
D2     C6       L4         19

Observations

• In the table, the only non-prime attribute is Num_Students.
– It depends on every key of the table non-transitively
• So, the relation is in 3NF
• But the fact that lecturer L1 belongs to department D1 is stored repeatedly (redundancy)
– Lecturer# -> Dept#. Here Dept# depends on Lecturer#, which is only part of the key
{Lecturer#,Course#} and not a super key, so the relation is not in BCNF

The solution

 Course_Offering(Lecturer#, Course#, Num-of-Students)


• Lecturer(Lecturer#, Dept#)

287
Difference between BCNF and 3NF

• Let X → A be a nontrivial functional dependency.
• 3NF: either X is a super key, or A is a prime attribute.
• BCNF: X must be a super key; the option that A is prime is absent (it does not matter whether A is prime or not).

Conclusion

 The primary objective of normalization is to avoid anomalies.

289
Properties of Relational Decompositions.

 Relation Decomposition and Insufficiency of Normal Forms:


• Universal Relation Schema:
 A relation schema R = {A1, A2, …, An} that includes all the attributes of the database.
• Universal relation assumption:
 Every attribute name is unique
• Decomposition:
 The process of decomposing the universal relation schema R into a set of relation schemas D = {R1,R2,
…, Rm} that will become the relational database schema by using the functional dependencies.
• Attribute preservation condition:
 Each attribute in R will appear in at least one relation schema Ri in the decomposition so that no
attributes are “lost”.

290
Properties of Relational Decompositions (2)
• Another goal of decomposition is to have each individual relation Ri in the decomposition
D be in BCNF or 3NF.
• Additional properties of decomposition are needed to prevent the generation of spurious
tuples.

Properties of Relational Decompositions (3)

• Lossless (Non-additive) Join Property of a Decomposition:
– Definition: Lossless join property: a decomposition D = {R1, R2, ...,
Rm} of R has the lossless (nonadditive) join property with respect to the set
of dependencies F on R if, for every relation state r of R that satisfies F, the
following holds, where * is the natural join of all the relations in D and π_Ri(r) is
the projection of r onto the attributes of Ri:
* (π_R1(r), ..., π_Rm(r)) = r

Note: The word loss in lossless refers to loss of information, not to
loss of tuples. In fact, for “loss of information” a better term is
“addition of spurious information”.

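(Example) Lossless (Non-additive) Join
Property

R(S, P, D) is decomposed into R1(S, P) and R2(P, D):

R             R1        R2
S   P   D     S   P     P   D
S1  P1  D1    S1  P1    P1  D1
S2  P2  D2    S2  P2    P2  D2
S3  P1  D3    S3  P1    P1  D3

R1 * R2
S   P   D
S1  P1  D1
S2  P2  D2
S3  P1  D3
S1  P1  D3
S3  P1  D1

The join contains two spurious tuples, (S1, P1, D3) and (S3, P1, D1), so this
decomposition does not have the lossless join property.
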
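The example can be reproduced in a few lines. This is a minimal Python sketch using sets of tuples; the variable names r, r1, r2, joined and spurious are chosen here for illustration.

```python
# Relation state r of R(S, P, D) from the example.
r = {("S1", "P1", "D1"), ("S2", "P2", "D2"), ("S3", "P1", "D3")}

# Project onto R1(S, P) and R2(P, D).
r1 = {(s, p) for s, p, d in r}
r2 = {(p, d) for s, p, d in r}

# Natural join of r1 and r2 on the shared attribute P.
joined = {(s, p, d) for s, p in r1 for p2, d in r2 if p == p2}

# Tuples present in the join but not in the original relation.
spurious = joined - r
print(sorted(joined))
print(sorted(spurious))
```

The join never loses original tuples (r is always a subset of the join), so a decomposition is lossless exactly when no spurious tuples appear.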
Dependency Preservation property of a
Decomposition.
• Informally, a decomposition D preserves dependencies if each FD X → Y specified in F either
appears directly in one of the relation schemas Ri in the decomposition D or can be inferred
from the dependencies that appear in some Ri.

Formal Definition for Dependency
Preservation.
• Given a set of dependencies F on R, the projection of F on Ri, denoted by π_Ri(F) where Ri
is a subset of R, is the set of dependencies X → Y in F+ such that the attributes in X ∪ Y
are all contained in Ri.

A decomposition D = {R1, R2, …, Rm} of R is dependency-preserving with
respect to F if the union of the projections of F on each Ri in D is equivalent to F:
(π_R1(F) ∪ π_R2(F) ∪ ... ∪ π_Rm(F))+ = F+

UNIT 05
Transaction Processing Concept

What is Transaction?
A transaction is an atomic unit comprised of one or more SQL statements.
A transaction begins with the first executable statement and ends when it is
committed or rolled back.

Transaction and system concepts: A transaction is a logical unit of
database processing that includes one or more database access operations.

• Examples of transaction processing systems:
1. Reservation systems
2. Credit card processing systems
3. Stock market processing systems
4. Supermarket processing systems
5. Insurance processing systems, etc.

Single-User Versus Multiuser Systems
• Single-user: at most one user at a time can use the
system.
• Multiuser: many users can use the system
concurrently.

****Desirable Properties of transactions or
ACID properties of transactions[6M]

ACID should be enforced by the concurrency control and
recovery methods of the DBMS.
ACID properties of transactions:

1. Atomicity: A transaction is an atomic unit of
processing; it is either performed entirely or
not performed at all.
2. Consistency: A transaction must preserve database consistency.
A transaction transforms the database from one
consistent state to another consistent state.
3. Isolation:
The execution of a transaction should be isolated
from other transactions (e.g., by locking).
– No interference with other transactions.
4. Durability:
Once a transaction completes, the changes it made to the database are permanent and are
available to all the transactions that follow it.

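Atomicity is easy to observe with a small experiment. The sketch below uses Python's built-in sqlite3 module with an in-memory database. The account table, its CHECK constraint and the transfer function are hypothetical and exist only for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account (name TEXT PRIMARY KEY, "
    "balance INTEGER CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [("A", 100), ("B", 50)])
conn.commit()


def transfer(conn, src, dst, amount):
    """Move amount from src to dst as one transaction; return True on commit."""
    try:
        with conn:  # commits if the block succeeds, rolls back on an exception
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False  # the CHECK constraint failed, so nothing was applied


ok = transfer(conn, "A", "B", 30)       # succeeds
failed = transfer(conn, "A", "B", 500)  # debit would make A negative
balances = dict(conn.execute("SELECT name, balance FROM account"))
print(ok, failed, balances)
```

In the failed transfer the credit to B had already been executed when the debit was rejected. The rollback undoes it, so the database shows either both updates or neither.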
*****Transaction state diagram and additional operations[6M]

 A transaction is an atomic unit of work that is either completed in its entirety


or not done at all.
 For recovery purposes the system needs to keep track of when the transaction
starts, terminates, and commits or aborts.
 The recovery manager keeps track of the following operations :
 BEGIN_TRANSACTION
 READ OR WRITE
 END_TRANSACTION
 COMMIT_TRANSACTION
 ROLLBACK

301
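Transaction states and additional operations
(continued)

[Figure 19.4: State transition diagram illustrating the states for transaction execution.
BEGIN TRANSACTION leads to the ACTIVE state, where READ/WRITE operations take place;
END TRANSACTION leads to PARTIALLY COMMITTED; COMMIT leads to COMMITTED;
ABORT from ACTIVE or PARTIALLY COMMITTED leads to FAILED;
COMMITTED and FAILED both lead to TERMINATED.]
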
Basic database access operations :

read_item(X): reads a database item named X
into a program variable, also named X.

write_item(X): writes the value of
program variable X into the database item
named X.

303
read_item(X) includes the following steps:

1. Find the address of the disk block that contains
item X.
2. Copy the disk block into a buffer in main
memory.
3. Copy item X from the buffer to the program
variable X.

Executing write_item(X) includes the following steps:

1. Find the address of the disk block that contains item
X.
2. Copy that disk block into a buffer in main memory.
3. Copy item X from the program variable into its
current location in the buffer.
4. Store the updated block from the buffer back to disk.

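The steps of read_item and write_item can be mimicked with a toy model. Everything in the sketch below is an assumption made for illustration: the disk is a dictionary of blocks, BLOCK_OF is a hypothetical directory from item name to block address, and the updated block is written back immediately, whereas a real DBMS usually defers that write.

```python
# Hypothetical directory: which disk block holds each item.
BLOCK_OF = {"X": 0, "Y": 0, "Z": 1}

# Toy "disk": block address -> block contents.
disk = {0: {"X": 100, "Y": 50}, 1: {"Z": 7}}

# Main-memory buffers: block address -> copy of the block.
buffer_pool = {}


def read_item(name):
    block = BLOCK_OF[name]                      # step 1: find the block address
    if block not in buffer_pool:
        buffer_pool[block] = dict(disk[block])  # step 2: copy block into a buffer
    return buffer_pool[block][name]             # step 3: copy item to the variable


def write_item(name, value):
    block = BLOCK_OF[name]                      # step 1: find the block address
    if block not in buffer_pool:
        buffer_pool[block] = dict(disk[block])  # step 2: copy block into a buffer
    buffer_pool[block][name] = value            # step 3: update item in the buffer
    disk[block] = dict(buffer_pool[block])      # step 4: store the block on disk


x = read_item("X")
write_item("X", x - 5)
print(disk)
```

Because whole blocks are transferred, writing X also rewrites Y, which lives in the same block.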
Concurrency control

What is concurrency control?


 In a multiprogramming environment where multiple transactions can be executed
simultaneously, it is highly important to control the concurrency of transactions.
 We have concurrency control protocols to ensure atomicity, isolation, and serializability of
concurrent transactions.

 Concurrency control protocols can be broadly divided into two categories −
i. Lock-based protocols
ii. Timestamp-based protocols
Advantages of concurrent execution:
1. Increased performance
2. Better resource utilization
3. Decreased waiting time
Why is concurrency control needed?
In a multiuser database, transactions submitted by the various users may execute concurrently
and may update the same data. Concurrently executing transactions must be guaranteed to
produce the same effect as a serial execution of the transactions (one by one).

****Problems with concurrent execution

The problems that can occur when two transactions run concurrently are:
1. The lost update problem.
2. The temporary update (or dirty read) problem.
3. The incorrect summary problem.

***The Lost update problem:

Suppose transactions T1 and T2 are submitted at
the same time. When these two transactions are
executed concurrently with their operations interleaved, the final value of X
is incorrect, because T2 reads the value of X
before T1 changes it in the database, and hence
the updated value resulting from T1 is lost.

Example: X = 100 at the start (100 reservations at the beginning), N = 10 (T1 transfers 10
seat reservations from flight Y to X) and M = 20 (T2 transfers 20 seats from X to Y). The final result
should be X = 90, but due to the interleaving of operations X = 80, because the update by T1 that added
10 seats from Y was lost.

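The arithmetic of the example can be replayed step by step. The sketch below is a simulation in plain Python, not real concurrent code. It assumes, as the numbers in the example imply, that T1 adds N = 10 to X and T2 subtracts M = 20 from X; the function names are hypothetical.

```python
def serial(x, n, m):
    """T1 runs to completion, then T2."""
    x = x + n  # T1 adds n seats to X
    x = x - m  # T2 removes m seats from X
    return x


def interleaved(x, n, m):
    """T2 reads X before T1 has written it back."""
    db = {"X": x}
    t1_local = db["X"]       # T1: read_item(X)
    t2_local = db["X"]       # T2: read_item(X), still the old value
    db["X"] = t1_local + n   # T1: write_item(X)
    db["X"] = t2_local - m   # T2: write_item(X), overwrites T1's update
    return db["X"]


print(serial(100, 10, 20), interleaved(100, 10, 20))
```

The interleaved schedule ends with 80 instead of 90 because T2 computed its result from a value that T1 had already superseded.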
Example
***2. Dirty read problem:

This problem occurs when one transaction
updates a database item and then the
transaction fails for some reason, and the updated
item is accessed by another transaction before
it is changed back to its original value.

• Example: T1 updates item X and then fails before completion, so the system must change
X back to its original value. Before it can do so, however, transaction T2 reads the
temporary value of X, which will not be recorded permanently in the database because of the
failure of T1. The value of item X that is read by T2 is called dirty data, because it has
been created by a transaction that has not completed and committed yet; hence this
problem is also known as the temporary update problem.

Example for dirty read problem
***3. Incorrect summary problem:

• If one transaction is calculating an aggregate
summary function on a number of records
while other transactions are updating some of
these records, the aggregate function may
calculate some values before they are updated
and others after they are updated.

Example: Transaction T3 is calculating the total number of reservations on all
the flights; meanwhile transaction T1 is executing. T3 reads the value of X
after N seats have been subtracted from it but reads the value of Y before those N
seats have been added to it.
T1                    T3
                      sum := 0;
                      read_item(A);
                      sum := sum + A;
read_item(X);
X := X - N;
write_item(X);
                      read_item(X);
                      sum := sum + X;
                      read_item(Y);
                      sum := sum + Y;
read_item(Y);
Y := Y + N;
write_item(Y);

T3 reads X after N is subtracted and reads Y before N is added; a wrong summary is the result.
fig(a): Transaction T1    fig(b): Transaction T3
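The same schedule can be replayed with concrete numbers. The values of A, X, Y and N below are made up for illustration, and the code is a sequential simulation of the interleaving, not real concurrent execution.

```python
db = {"A": 30, "X": 100, "Y": 50}
N = 10

# A transfer from X to Y never changes the true total.
true_total = sum(db.values())

total = 0
total += db["A"]   # T3: read_item(A); sum := sum + A
db["X"] -= N       # T1: read_item(X); X := X - N; write_item(X)
total += db["X"]   # T3: reads X after N was subtracted
total += db["Y"]   # T3: reads Y before N is added
db["Y"] += N       # T1: read_item(Y); Y := Y + N; write_item(Y)

print(true_total, total, sum(db.values()))
```

T3 reports a total that is N too small even though the database itself is consistent both before and after T1.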
Why Recovery Is Needed

There are several possible reasons for a transaction to fail


1. A computer failure : A hardware, software, or network error occurs in the
computer system during transaction execution.

2. A transaction or system error : Some operations in the transaction may cause it to


fail.

3. Local errors or exception conditions detected by the transaction.

319
Why Recovery Is Needed (continued)
4.Concurrency control enforcement :
The concurrency control method may decide to abort the transaction.

5.Disk failure : all disk or some disk blocks may lose their data

6.Physical problems : Disasters, theft, fire, etc.

The system must keep sufficient information to recover from the failure.

320
****Concurrency control techniques

Some of the main techniques used to control
concurrent execution of transactions are based on the
concept of locking data items.
• A lock is a restriction on access to the data in a
multiuser environment.
• It prevents multiple users from changing the same
data simultaneously.

The following types of locks can be used:
1. Binary lock
2. Shared lock
3. Exclusive lock
1. Binary lock: A binary lock can have two states or values:
i) Locked (1)
ii) Unlocked (0)
A distinct lock is associated with each database item A.
If the value of the lock on A is 1, item A cannot be accessed by a database operation that
requests the item. If the value of the lock on A is 0, then the item can be accessed when requested.

Locking Technique

2. Shared lock (S):
It is used for read-only operations, i.e., for operations that do not change or update the
data.
Once a transaction puts an S lock on a particular resource, other transactions can only read
it; they cannot modify or update it.

3. Exclusive lock (X): Exclusive locks are used for data modification operations such as
update, delete and insert.
Once a transaction puts an X lock on a particular resource, no other transaction can put any
kind of lock on this resource.
This resource is exclusively reserved for the first transaction, and no other transaction can use
it for a read or write operation.
Hence the X lock allows the least concurrency.

Locking Technique
• The effect of a lock is to lock other transactions out of the object.
• Compatibility matrix (row: lock held by A; column: lock requested by B)

A \ B   X   S   –
X       N   N   Y
S       N   Y   Y
–       Y   Y   Y

– : no lock
N : request not compatible
Y : request compatible

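A lock manager consults this matrix every time a lock is requested. The following minimal Python sketch shows that check; the names COMPATIBLE and can_grant are assumptions made for illustration, and the "no lock" row is represented by an empty list of held locks.

```python
# (lock held by another transaction, lock requested) -> compatible?
COMPATIBLE = {
    ("X", "X"): False,
    ("X", "S"): False,
    ("S", "X"): False,
    ("S", "S"): True,
}


def can_grant(held, requested):
    """held: lock modes other transactions hold on the item; requested: 'S' or 'X'."""
    # With no locks held, all() over an empty sequence is True, matching the last row.
    return all(COMPATIBLE[(mode, requested)] for mode in held)


print(can_grant([], "X"))          # nobody holds a lock
print(can_grant(["S", "S"], "S"))  # readers can share
print(can_grant(["S"], "X"))       # a writer must wait for readers
print(can_grant(["X"], "S"))       # a reader must wait for the writer
```

Only the S/S combination is compatible, which is why shared locks allow concurrent readers while an exclusive lock serializes all access to the item.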
OBJECTIVE

• Diagrammatic study of VIEW,
• Concept of VIEW,
• Creation of VIEW,
• How to update, modify, drop VIEW,
• Concept of ASSERTION,
• Examples of ASSERTION.
DIAGRAMMATIC REPRESENTATION OF VIEW

ACTUAL TABLE:       A B C D E F
VIEW OF THE TABLE:  A C D

Remember that the view exists as a query, not as a table.

WHAT IS VIEW

• A view is a virtual table. It does not physically exist. Rather, it is created by a query
joining one or more tables.

• A view contains rows and columns, just like a real table.

• The fields in a view are fields from one or more real tables in the database.

Creating an SQL VIEW

Syntax:
CREATE VIEW view_name AS
SELECT column_name(s)
FROM table_name
WHERE conditions;
VIEW CREATION-EXAMPLE

CREATE VIEW sup_orders
AS SELECT suppliers.supplier_id, orders.quantity, orders.price
FROM suppliers, orders
WHERE suppliers.supplier_id = orders.supplier_id AND
suppliers.supplier_name = 'IBM';

• The CREATE VIEW statement creates a virtual table based on the result set of the
SELECT statement. You can now query the view as follows:

SELECT * FROM sup_orders;
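The view example can be tried out with SQLite through Python's built-in sqlite3 module. This is a sketch under assumptions: the column types, the order_id column and the sample rows are invented here, since the slides show only the view definition.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE suppliers (supplier_id INTEGER PRIMARY KEY, supplier_name TEXT);
CREATE TABLE orders (order_id INTEGER PRIMARY KEY, supplier_id INTEGER,
                     quantity INTEGER, price REAL);
INSERT INTO suppliers VALUES (1, 'IBM'), (2, 'Microsoft');
INSERT INTO orders VALUES (10, 1, 5, 99.5), (11, 2, 3, 20.0), (12, 1, 1, 10.0);

CREATE VIEW sup_orders AS
SELECT suppliers.supplier_id, orders.quantity, orders.price
FROM suppliers, orders
WHERE suppliers.supplier_id = orders.supplier_id
  AND suppliers.supplier_name = 'IBM';
""")

rows = conn.execute("SELECT * FROM sup_orders ORDER BY quantity").fetchall()
print(rows)

# The view stores no data: a new base-table row shows up the next time it is queried.
conn.execute("INSERT INTO orders VALUES (13, 1, 7, 1.0)")
count_after = conn.execute("SELECT COUNT(*) FROM sup_orders").fetchone()[0]
print(count_after)
```

The second query illustrates the point that a view exists as a query, not as a table: it is re-evaluated against the current contents of suppliers and orders each time.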
Updating View

You can modify the definition of a view without dropping it by using the following syntax:

CREATE OR REPLACE VIEW view_name
AS SELECT columns
FROM table
WHERE predicates;

Modify View-Example

CREATE OR REPLACE VIEW sup_orders
AS SELECT suppliers.supplier_id, orders.quantity, orders.price
FROM suppliers, orders
WHERE suppliers.supplier_id = orders.supplier_id AND
suppliers.supplier_name = 'Microsoft';

Dropping View

• The syntax for dropping a VIEW:
DROP VIEW view_name;

• View Drop – Example
DROP VIEW sup_orders;
ASSERTIONS

• An expression that should always be true
• When created, the expression must be true
• The DBMS checks the assertion after any change that may violate the
expression
• The expression must return True or False (not a relation)

EXAMPLE 1

Sum of loans taken by a customer does not exceed 100,000

Create Assertion SumLoans Check
( 100000 >= ALL
( Select Sum(amount)
From borrower B , loan L
Where B.loan_number = L.loan_number
Group By customer_name ));
EXAMPLE 2

Number of accounts for each customer in a given branch is at most two

Create Assertion NumAccounts Check
( 2 >= ALL
( Select count(*)
From account A , depositor D
Where A.account_number = D.account_number
Group By customer_name, branch_name ));
EXAMPLE 3

Customer city is never null

Create Assertion CityCheck Check
( NOT EXISTS (
Select *
From customer
Where customer_city is null));
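CREATE ASSERTION is part of the SQL standard, but SQLite (used below through Python's sqlite3 module) does not implement it. The sketch therefore only evaluates the condition of Example 3 by hand to show what the DBMS would be checking. The sample rows and the helper name assertion_holds are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer (customer_name TEXT PRIMARY KEY, customer_city TEXT);
INSERT INTO customer VALUES ('Hayes', 'Harrison'), ('Jones', 'Brooklyn');
""")

# The condition of the CityCheck assertion, written as an ordinary query.
CITY_CHECK = """
SELECT NOT EXISTS (SELECT * FROM customer WHERE customer_city IS NULL)
"""


def assertion_holds(conn):
    """Return True if no customer has a null city."""
    return bool(conn.execute(CITY_CHECK).fetchone()[0])


before = assertion_holds(conn)
conn.execute("INSERT INTO customer VALUES ('Smith', NULL)")
after = assertion_holds(conn)
print(before, after)
```

A DBMS that supported the assertion would run this check after the INSERT and reject the statement. For this particular rule, a NOT NULL column constraint achieves the same effect more cheaply.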
INDEXING
OBJECTIVE

• Concept of Indexing

• Examples with Query of Indexing

• Types of Indexing
INDEXING

• Indexes are used by queries to find data in tables quickly. Indexes are created on tables
and views. An index on a table or a view is very similar to the index that we find in a book.

• If you don’t have an index, and I ask you to locate a specific chapter in the book, you will
have to look at every page, starting from the first page of the book.

• On the other hand, if you have the index, you look up the page number of the chapter in
the index, and then go directly to that page number to locate the chapter.

• Obviously, the book index drastically reduces the time it takes to find the
chapters.

• In a similar way, table and view indexes can help the query to find data quickly.

• In fact, the existence of the right indexes can drastically improve the performance of the
query. If there is no index to help the query, then the query engine checks every row in
the table from the beginning to the end. This is called a table scan. A table scan is bad for
performance.
INDEX EXAMPLE

At the moment, the tblEmployee table does not have an index on the Salary column.

Id  Name  Salary  Gender
1   Sam   2500    Male
2   Pam   6500    Female
3   John  4500    Male
4   Sara  5500    Female
5   Todd  3100    Male

Select * from tblEmployee
Where Salary > 5000 and Salary < 7000

To find all the employees who have a salary greater than 5000 and less than 7000, the query engine has to
check each and every row in the table, resulting in a table scan, which can adversely affect
performance, especially if the table is large. Since there is no index to help the query, the query engine
performs an entire table scan.
CREATING AN INDEX
CREATE INDEX IX_tblEmployee_Salary
ON tblEmployee (SALARY ASC)
The index stores the salaries of employees in ascending order, as shown below. The actual index may
look slightly different.

tblEmployee
Id  Name  Salary  Gender
1   Sam   2500    Male
2   Pam   6500    Female
3   John  4500    Male
4   Sara  5500    Female
5   Todd  3100    Male

Index on Salary
Salary  Row Address
2500    Row address
3100    Row address
4500    Row address
5500    Row address
6500    Row address

Now, when SQL Server has to execute the same query, it has an index on the Salary column to help
this query. Salaries in the range 5000 to 7000 are present at the bottom of the index, since the
salaries are arranged in ascending order. SQL Server picks up the row addresses from the index and
directly fetches the records from the table, rather than scanning each row in the table. This is called
an Index Seek.
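The difference between a scan and an index lookup can be observed in any engine that exposes its query plan. The sketch below uses SQLite through Python's sqlite3 module rather than SQL Server, so the plan wording differs (SQLite reports SCAN and SEARCH ... USING INDEX instead of Table Scan and Index Seek). The helper name plan is an assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tblEmployee (Id INTEGER PRIMARY KEY, Name TEXT, "
    "Salary INTEGER, Gender TEXT)"
)
conn.executemany(
    "INSERT INTO tblEmployee VALUES (?, ?, ?, ?)",
    [(1, "Sam", 2500, "Male"), (2, "Pam", 6500, "Female"),
     (3, "John", 4500, "Male"), (4, "Sara", 5500, "Female"),
     (5, "Todd", 3100, "Male")],
)

QUERY = "SELECT * FROM tblEmployee WHERE Salary > 5000 AND Salary < 7000"


def plan(conn):
    """Return SQLite's textual query plan for QUERY."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + QUERY).fetchall()
    return " ".join(row[-1] for row in rows)  # last column is the description


plan_before = plan(conn)
conn.execute("CREATE INDEX IX_tblEmployee_Salary ON tblEmployee (Salary ASC)")
plan_after = plan(conn)

names = sorted(row[1] for row in conn.execute(QUERY))
print(plan_before)
print(plan_after)
print(names)
```

The query result is the same with or without the index; only the access path changes.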
As we studied, indexes are used to retrieve data from the database very quickly. Users
cannot see the indexes; they are just used to speed up searches/queries.

Syntax for Creating Index

CREATE INDEX index_name
ON table_name (column1, column2, …);

CREATE UNIQUE INDEX index_name
ON table_name (column1, column2, …);

Syntax for Dropping Index

MS Access:
DROP INDEX index_name
ON table_name;
SQL Server:
DROP INDEX table_name.index_name;
DB2/Oracle:
DROP INDEX index_name;
MySQL:
ALTER TABLE table_name
DROP INDEX index_name;
CLUSTERED INDEX
A clustered index determines the physical order of data in a table. For this reason, a table can have
only one clustered index.
CREATE TABLE tblEmployee
(
[Id] int Primary key,
[Name] nvarchar (50),
[Salary] int,
[Gender] nvarchar (10),
[City] nvarchar (50)
)
Note that the Id column is marked as primary key. A primary key constraint creates a clustered index
automatically if no clustered index already exists on the table.

To confirm: Execute sp_helpindex tblEmployee

Insert into tblEmployee Values (3, 'John', 4500, 'Male', 'New York')
Insert into tblEmployee Values (1, 'Sam', 2500, 'Male', 'London')
Insert into tblEmployee Values (4, 'Sara', 5500, 'Female', 'Tokyo')
Insert into tblEmployee Values (5, 'Todd', 3100, 'Male', 'Toronto')
Insert into tblEmployee Values (2, 'Pam', 6500, 'Female', 'Sydney')

Although the rows are inserted out of order, they are stored in the order of the clustered index key Id:

Id  Name  Salary  Gender  City
1   Sam   2500    Male    London
2   Pam   6500    Female  Sydney
3   John  4500    Male    New York
4   Sara  5500    Female  Tokyo
5   Todd  3100    Male    Toronto
CLUSTERED INDEX
A clustered index is analogous to a telephone directory, where the data is arranged by
the last name.

Create a composite clustered index on the Gender and Salary columns:

Create Clustered Index IX_tblEmployee_Gender_Salary
ON tblEmployee (Gender DESC, Salary ASC)

Select * from tblEmployee

Id  Name  Salary  Gender  City
1   Sam   2500    Male    London
2   Pam   6500    Female  Sydney
3   John  4500    Male    New York
4   Sara  5500    Female  Tokyo
5   Todd  3100    Male    Toronto

NONCLUSTERED INDEX
Create Nonclustered Index IX_tblEmployee_name
ON tblEmployee (name)

tblEmployee
Id  Name  Salary  Gender  City
1   Sam   2500    Male    London
5   Todd  3100    Male    Toronto
3   John  4500    Male    New York
4   Sara  5500    Female  Tokyo
2   Pam   6500    Female  Sydney

Index on Name
Name  Row Address
John  Row address
Pam   Row address
Sam   Row address
Sara  Row address
Todd  Row address

A nonclustered index is analogous to an index in a textbook. The data is stored in one place, the index
in another place. The index has pointers to the storage location of the data.

In the index itself, the data is stored in ascending or descending order of the index key, which
doesn’t in any way influence the storage of data in the table.
DBMS
SYLLABUS
COMPLETED.

346
ANY QUESTIONS FROM
UNIT 1,2,3,4 & 5.

FEEL FREE TO ASK QUESTIONS


…………….

347
Thank you
