1. Database Management Systems
1. A database management system (DBMS), or simply a database system (DBS),
consists of
o A collection of interrelated and persistent data (usually referred to as the database
(DB)).
o A set of application programs used to access, update and manage that data (which
form the data management system (MS)).
2. The goal of a DBMS is to provide an environment that is both convenient and efficient
to use in
o Retrieving information from the database.
o Storing information into the database.
3. Databases are usually designed to manage large bodies of information. This involves
o Definition of structures for information storage (data modeling).
o Provision of mechanisms for the manipulation of information (file and systems
structure, query processing).
o Providing for the safety of information in the database (crash recovery and
security).
o Concurrency control if the system is shared by users.
2. Purpose of Database Systems/DBMS VS File System
o Data redundancy and inconsistency in file system
   Same information may be duplicated in several places.
   All copies may not be updated properly.
o Difficulty in accessing data with file system.
   May have to write a new application program to satisfy an unusual
   request.
   E.g. find all customers with the same postal code. Could generate this data
   manually, but a long job...
o Data isolation
   Data in different files.
   Data in different formats.
   Difficult to write new application programs.
o Multiple users/Concurrency Control mechanism
   Want concurrency for faster response time.
   Need protection for concurrent updates.
   E.g. two customers withdrawing funds from the same account at the same
   time - account has $500 in it, and they withdraw $100 and $50. The result
   could be $350, $400 or $450 if no protection.
o Security problems
   Every user of the system should be able to access only the data they are
   permitted to see.
   E.g. payroll people only handle employee records, and cannot see customer
   accounts; tellers only access account data and cannot see payroll data.
   Difficult to enforce this with file system.
o Integrity problems
   Integrity means maintaining the correctness of the data in a database.
   Data may be required to satisfy constraints. E.g. no account balance below
   $25.00.
   Again, difficult to enforce or to change constraints with the file-processing
   approach.
   Types of integrity:
   i. Domain Integrity: Data types in columns, null values allowed or
   not.
   ii. Entity Integrity: All rows in a table have a unique identifier called the
   primary key.
   iii. Referential Integrity: A foreign key value must match an existing primary
   key, so a row that is still referenced by a foreign key cannot be deleted.
   iv. User defined Integrity: E.g. No a/c balance can be below $25.
These problems and others led to the development of database management systems.
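The two-withdrawal example above can be replayed in a few lines. This is only a sketch: the function name and the step labels ("r1", "w2", ...) are made up for illustration, and a real DBMS prevents the bad interleavings with its concurrency control rather than by replaying steps.

```python
def withdraw_interleaved(balance, first, second, order):
    """Replay two read-then-write withdrawals in a fixed interleaving.

    `order` lists steps such as "r1" (customer 1 reads the balance) or
    "w2" (customer 2 writes back the balance it computed).
    """
    amounts = {"1": first, "2": second}
    seen = {}  # the balance each customer read, possibly stale by write time
    for step in order:
        action, who = step[0], step[1]
        if action == "r":
            seen[who] = balance
        else:  # "w": write back the value read earlier minus the withdrawal
            balance = seen[who] - amounts[who]
    return balance


# Serial execution: customer 1 finishes before customer 2 starts.
serial = withdraw_interleaved(500, 100, 50, ["r1", "w1", "r2", "w2"])
# Both read 500; customer 1 writes last, so the withdrawal of 50 is lost.
lost_second = withdraw_interleaved(500, 100, 50, ["r1", "r2", "w2", "w1"])
# Both read 500; customer 2 writes last, so the withdrawal of 100 is lost.
lost_first = withdraw_interleaved(500, 100, 50, ["r1", "r2", "w1", "w2"])

print(serial, lost_second, lost_first)  # prints: 350 400 450
```

Only the serial order gives the correct final balance. The other two orders are the "no protection" outcomes.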
3. Basic Building Block of Database:
For a complete mechanism we have to define:
1) Structure of Database: fields (columns), records (rows), files
2) Database query language: provision of a mechanism for the manipulation of data, e.g.
insertion, deletion, updating.
3) Transaction Mechanism:
Any logical operation on data is a transaction.
During transaction processing, the DBMS must guarantee the ACID properties.
I. Atomicity: "Either all tasks of a transaction must be performed or none of
them."
E.g. Transfer of funds.
II. Consistency: If a transaction fails, the database must still maintain the structure it
had before the transaction occurred.
E.g. if you have deleted a user, he cannot be signed in again.
III. Isolation: While a transaction is in an intermediate state, the data it uses
cannot be accessed by other transactions; its changes are hidden from them
until the transaction is completed.
IV. Durability: Once the user is notified of success (e.g. an insertion is done or a
transfer is complete), the result of the transaction persists.
If NAC issued seat 22A and minutes later the hard disk crashes, it won't forget
that you will be sitting in seat 22A.
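Atomicity for the transfer-of-funds example can be sketched with Python's built-in sqlite3 module. The table layout, account numbers and the transfer function are assumptions made for this illustration. The credit is deliberately applied first, so that a failed debit shows the rollback undoing work that was already done.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema: the CHECK rule forbids a negative balance.
conn.execute(
    "CREATE TABLE account (number TEXT PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [("A-1", 500), ("A-2", 100)])
conn.commit()


def transfer(conn, source, target, amount):
    """Move `amount` between accounts; both updates commit or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE number = ?",
                (amount, target),
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE number = ?",
                (amount, source),
            )
        return True
    except sqlite3.IntegrityError:
        return False


def balances(conn):
    """Return {account number: balance} for every account."""
    return dict(conn.execute("SELECT number, balance FROM account"))


ok = transfer(conn, "A-1", "A-2", 200)
after_ok = balances(conn)
failed = transfer(conn, "A-1", "A-2", 1000)  # would overdraw A-1
after_failed = balances(conn)
```

After the failed transfer the balances are the same as before it. The credit to A-2 was rolled back together with the rejected debit.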
4. Data Abstraction
Data abstraction is the practice by which the system hides certain details of how
data is stored, created and maintained. Complexity should be hidden from database
users. The major purpose of a database system is to provide users with an abstract view
of the system.
There are several levels of abstraction:
1. Physical Level:
   How the data are stored.
   E.g. index, B-tree, hashing.
   Lowest level of abstraction.
   Complex low-level structures described in detail.
2. Conceptual Level:
   Next highest level of abstraction.
   Describes what data are stored.
   Describes the relationships among data.
   Database administrator level.
3. View Level:
   Highest level.
   Describes part of the database for a particular group of users.
   Can be many different views of a database.
   E.g. tellers in a bank get a view of customer accounts, but not of payroll
   data.
Fig. 1.1 (figure 1.1 in the text) illustrates the three levels.
Figure 1.1: The three levels of data abstraction
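The view level can be illustrated with an SQL view, here run through Python's sqlite3 module. The table names, column names and sample rows are hypothetical. SQLite itself has no user accounts, so the view only shows what a teller would be given; a multi-user DBMS would add access control on top.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Conceptual level: what data are stored (hypothetical tables).
    CREATE TABLE account (number TEXT PRIMARY KEY, customer TEXT, balance INTEGER);
    CREATE TABLE payroll (employee TEXT PRIMARY KEY, salary INTEGER);
    INSERT INTO account VALUES ('A-101', 'Hayes', 500);
    INSERT INTO payroll VALUES ('Lindsay', 40000);
    -- View level: the part of the database a teller works with.
    CREATE VIEW teller_accounts AS
        SELECT number, customer, balance FROM account;
    """
)
teller_rows = conn.execute("SELECT * FROM teller_accounts").fetchall()
print(teller_rows)  # account data only; nothing from payroll
```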
5. Data Models
Data models are a collection of conceptual tools for describing data, data relationships,
data semantics and data constraints. There are three different groups:
1. Object-based Logical Models.
2. Record-based Logical Models.
3. Physical Data Models.
a) Object-based Logical Models
Object-based logical models:
o Describe data at the conceptual and view levels.
o Provide fairly flexible structuring capabilities.
o Allow one to specify data constraints explicitly.
o Over 30 such models, including
   Entity-relationship model.
   Object-oriented model.
   Binary model.
   Semantic data model.
   Infological model.
   Functional data model.
At this point, we'll take a closer look at the entity-relationship (E-R) and
object-oriented models.
i) The E-R Model
The entity-relationship model is based on a perception of the world as consisting of a
collection of basic objects (entities) and relationships among these objects.
o An entity is a distinguishable object that exists.
o Each entity has associated with it a set of attributes describing it.
o E.g. number and balance for an account entity.
o A relationship is an association among several entities.
o E.g. a cust_acct relationship associates a customer with each account he or she
has.
o The set of all entities or relationships of the same type is called the entity set or
relationship set.
o Another essential element of the E-R diagram is the mapping cardinalities,
which express the number of entities to which another entity can be associated via
a relationship set.
The overall logical structure of a database can be expressed graphically by an E-R
diagram:
o rectangles: represent entity sets.
o ellipses: represent attributes.
o diamonds: represent relationships among entity sets.
o lines: link attributes to entity sets and entity sets to relationships.
See figure 1.2 for an example.
Figure 1.2: A sample E-R diagram.
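The E-R vocabulary can be written down with plain Python data structures. This is a sketch only: the customer attributes (name, city), the sample values and the two helper functions are assumptions made for illustration; number, balance and cust_acct come from the notes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Customer:  # entity; the attributes name and city are assumed
    name: str
    city: str


@dataclass(frozen=True)
class Account:  # entity with the attributes number and balance
    number: str
    balance: int


# Entity sets: all entities of the same type.
customers = {Customer("Hayes", "Harrison"), Customer("Jones", "Rye")}
accounts = {Account("A-101", 500), Account("A-102", 400), Account("A-201", 900)}

# Relationship set cust_acct: (customer name, account number) pairs.
cust_acct = {("Hayes", "A-101"), ("Hayes", "A-102"), ("Jones", "A-201")}


def accounts_of(name):
    """Return the account numbers associated with one customer, sorted."""
    return sorted(number for owner, number in cust_acct if owner == name)


def is_one_to_many(pairs):
    """True if every account is associated with at most one customer."""
    numbers = [number for _, number in pairs]
    return len(numbers) == len(set(numbers))


print(accounts_of("Hayes"))       # ['A-101', 'A-102']
print(is_one_to_many(cust_acct))  # True
```

The second helper checks one possible mapping cardinality: a customer may have many accounts, but each account belongs to a single customer.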
ii) The Object-Oriented Model
The object-oriented model is based on a collection of objects, like the E-R model.
o An object contains values stored in instance variables within the object.
o Unlike the record-oriented models, these values are themselves objects.
o Thus objects contain objects to an arbitrarily deep level of nesting.
o An object also contains bodies of code that operate on the object.
o These bodies of code are called methods.
o Objects that contain the same types of values and the same methods are grouped
into classes.
o A class may be viewed as a type definition for objects.
o Analogy: the programming language concept of an abstract data type.
o The only way in which one object can access the data of another object is by
invoking a method of that other object.
o This is called sending a message to the object.
o Internal parts of the object, the instance variables and method code, are not visible
externally.
o Result is two levels of data abstraction.
For example, consider an object representing a bank account.
o The object contains instance variables number and balance.
o The object contains a method pay-interest which adds interest to the balance.
o Under most data models, changing the interest rate entails changing code in
application programs.
o In the object-oriented model, this only entails a change within the pay-interest
method.
Unlike entities in the E-R model, each object has its own unique identity, independent of
the values it contains:
o Two objects containing the same values are distinct.
o Distinction is created and maintained at the physical level by assigning distinct object
identifiers.
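The bank account object can be sketched as a Python class. The interest rate of 5 percent and the whole-unit balances are assumptions for illustration. Python does not actually hide instance variables, so the encapsulation here is by convention only.

```python
class Account:
    """An object with instance variables number and balance and one method."""

    INTEREST_PERCENT = 5  # assumed rate, for illustration only

    def __init__(self, number, balance):
        self.number = number
        self.balance = balance

    def pay_interest(self):
        """Add interest to the balance; a rate change touches only this class."""
        self.balance += self.balance * self.INTEREST_PERCENT // 100


first = Account("A-101", 500)
second = Account("A-101", 500)

# Same values, yet two distinct objects: identity does not depend on values.
same_values = (first.number, first.balance) == (second.number, second.balance)
same_object = first is second

first.pay_interest()  # "sending a message" to the object
print(same_values, same_object, first.balance)  # True False 525
```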
b) Record-based Logical Models
Record-based logical models:
o Also describe data at the conceptual and view levels.
o Unlike object-oriented models, are used to
   Specify overall logical structure of the database, and
   Provide a higher-level description of the implementation.
o Named so because the database is structured in fixed-format records of several
types.
o Each record type defines a fixed number of fields, or attributes.
o Each field is usually of a fixed length (this simplifies the implementation).
o Record-based models do not include a mechanism for direct representation of
code in the database.
o Separate languages associated with the model are used to express database queries
and updates.
o The three most widely-accepted models are the relational, network, and
hierarchical.
o This course will concentrate on the relational model.
o The network and hierarchical models are covered in appendices in the text.
i) The Relational Model
Data and relationships are represented by a collection of tables.
Each table has a number of columns with unique names, e.g. customer, account.
Figure 1.3 shows a sample relational database.
Figure 1.3: A sample relational database.
ii) The Network Model
Data are represented by collections of records.
Relationships among data are represented by links.
Organization is that of an arbitrary graph.
Figure 1.4 shows a sample network database that is the equivalent of the relational
database of Figure 1.3.
Figure 1.4: A sample network database
iii) The Hierarchical Model
Similar to the network model.
Organization of the records is as a collection of trees, rather than arbitrary graphs.
Figure 1.5 shows a sample hierarchical database that is the equivalent of the relational
database of Figure 1.3.
Figure 1.5: A sample hierarchical database
The relational model does not use pointers or links, but relates records by the values they
contain. This allows a formal mathematical foundation to be defined.
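Relating records by value, rather than by pointers, can be seen in a small join. The sketch below uses Python's sqlite3 module; the three tables and their rows are hypothetical and are not the contents of the figures in the text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE customer (name TEXT PRIMARY KEY, city TEXT);
    CREATE TABLE account (number TEXT PRIMARY KEY, balance INTEGER);
    -- The relationship is itself a table of matching values, not pointers.
    CREATE TABLE cust_acct (name TEXT, number TEXT);
    INSERT INTO customer VALUES ('Hayes', 'Harrison'), ('Jones', 'Rye');
    INSERT INTO account VALUES ('A-101', 500), ('A-201', 900);
    INSERT INTO cust_acct VALUES ('Hayes', 'A-101'), ('Jones', 'A-201');
    """
)
# Rows are connected only because the name and number values are equal.
rows = conn.execute(
    """
    SELECT c.name, a.number, a.balance
    FROM customer AS c
    JOIN cust_acct AS ca ON ca.name = c.name
    JOIN account AS a ON a.number = ca.number
    ORDER BY c.name
    """
).fetchall()
print(rows)
```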
c) Physical Data Models
Physical data models are used to describe data at the lowest level.
Very few models exist, e.g. unifying model, frame memory.
6. Instances and Schemes
o Databases change over time.
o The information in a database at a particular point in time is called an instance of the
database.
o The overall design of the database is called the database scheme.
o Analogy with programming languages:
   Data type definition - scheme
   Value of a variable - instance
o There are several schemes, corresponding to levels of abstraction:
   Physical scheme
   Conceptual scheme
   Subscheme (can be many)
Information changes frequently but the scheme doesn't.
7. Data Independence
In older systems, data were dependent upon the system both physically and logically. "Physically" refers to
the way data is represented in secondary storage, e.g. the hashing method used for records. The
application program through which we access the data must know whether the file uses an indexing method
or hashing, so it is not easy to change from indexing to hashing without changing the application.
The ability to modify a scheme definition in one level without affecting a scheme definition in a
higher level is called data independence.
There are two kinds:
i) Physical data independence:
o The ability to modify the physical scheme without causing application programs
to be rewritten.
o Modifications at this level are usually to improve performance
ii) Logical data independence:
o The ability to modify the conceptual scheme without causing application
programs to be rewritten
o Usually done when logical structure of database is altered
Logical data independence is harder to achieve as the application programs are usually heavily
dependent on the logical structure of the data. An analogy is made to abstract data types in
programming languages.
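Physical data independence can be demonstrated in miniature: the application's query stays the same while the physical scheme gains an index. The sketch uses Python's sqlite3 module; the table, the index name and the sample rows are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (name TEXT, postal_code TEXT)")
conn.executemany(
    "INSERT INTO customer VALUES (?, ?)",
    [("Hayes", "44600"), ("Jones", "44600"), ("Smith", "33700")],
)

QUERY = "SELECT name FROM customer WHERE postal_code = ? ORDER BY name"


def find_by_postal_code(code):
    """The application's query; it never mentions how the rows are stored."""
    return [name for (name,) in conn.execute(QUERY, (code,))]


before = find_by_postal_code("44600")
# Physical scheme change: add an index, typically to speed up lookups.
conn.execute("CREATE INDEX idx_customer_postal ON customer (postal_code)")
after = find_by_postal_code("44600")

print(before == after)  # True: same answer, application code untouched
```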
8. Data Definition Language (DDL)
o Data Definition Language (DDL) is a standard for commands that define the different
structures in a database. DDL statements create, modify, and remove database objects
such as tables, indexes, and users. Common DDL statements are CREATE, ALTER, and
DROP.
o Simply, DDL is used to create database schemes.
o DDL statements are compiled, resulting in a set of tables stored in a special file called a
data dictionary or data directory.
o The data directory contains metadata (data about data).
o The storage structure and access methods used by the database system are specified by a
set of definitions in a special type of DDL called a data storage and definition language.
o Basic idea: hide implementation details of the database schemes from the users
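The three common DDL statements, and the metadata they leave behind, can be tried out with Python's sqlite3 module. This is a sketch: the table and column names are made up, and SQLite's catalog table sqlite_master stands in here for the data dictionary described above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (number TEXT PRIMARY KEY, balance INTEGER)")
conn.execute("ALTER TABLE account ADD COLUMN branch TEXT")
conn.execute("CREATE TABLE scratch (note TEXT)")
conn.execute("DROP TABLE scratch")

# Metadata (data about data): SQLite records each table in sqlite_master.
tables = [
    name
    for (name,) in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
]
# PRAGMA table_info lists one row per column; index 1 is the column name.
columns = [row[1] for row in conn.execute("PRAGMA table_info(account)")]

print(tables)   # ['account']
print(columns)  # ['number', 'balance', 'branch']
```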
9. Data Manipulation Language (DML)
1. Data Manipulation is:
o retrieval of information from the database
o insertion of new information into the database
o deletion of information in the database
o modification of information in the database
2. A DML is a language which enables users to access and manipulate data.
The goal is to provide efficient human interaction with the system.
3. There are two types of DML:
o procedural: the user specifies what data is needed and how to get it
o nonprocedural: the user only specifies what data is needed
   Easier for user
   May not generate code as efficient as that produced by procedural
   languages
4. A query language is a portion of a DML involving information retrieval only. The terms
DML and query language are often used synonymously.
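The difference between the procedural and nonprocedural styles can be shown on the same request, "accounts with a balance above 100". The sketch below is in Python with the sqlite3 module; the data and the threshold are made up. The loop is only an analogy for a procedural DML, not an actual one.

```python
import sqlite3

accounts = [("A-101", 500), ("A-102", 20), ("A-201", 900)]

# Procedural style: the program says how to find the data, step by step.
procedural = []
for number, balance in accounts:
    if balance > 100:
        procedural.append(number)
procedural.sort()

# Nonprocedural style: the query says only what data is wanted.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (number TEXT, balance INTEGER)")
conn.executemany("INSERT INTO account VALUES (?, ?)", accounts)
declarative = [
    number
    for (number,) in conn.execute(
        "SELECT number FROM account WHERE balance > 100 ORDER BY number"
    )
]

print(procedural == declarative)  # True
```

In the second form the system, not the user, decides how to scan or index the table.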
10. Database Manager
1. The database manager is a program module which provides the interface between the
low-level data stored in the database and the application programs and queries submitted
to the system.
2. Databases typically require lots of storage space (gigabytes). This must be stored on
disks. Data is moved between disk and main memory (MM) as needed.
3. The goal of the database system is to simplify and facilitate access to data. Performance
is important. Views provide simplification.
4. So the database manager module is responsible for
o Interaction with the file manager: Storing raw data on disk using the file system
usually provided by a conventional operating system. The database manager must
translate DML statements into low-level file system commands (for storing,
retrieving and updating data in the database).
o Integrity enforcement: Checking that updates in the database do not violate
consistency constraints (e.g. no bank account balance below $25)
o Security enforcement: Ensuring that users only have access to information they
are permitted to see
o Backup and recovery: Detecting failures due to power failure, disk crash,
software errors, etc., and restoring the database to its state before the failure
o Concurrency control: Preserving data consistency when there are concurrent
users.
5. Some small database systems may miss some of these features, resulting in simpler
database managers. (For example, no concurrency is required on a PC running MS-DOS.)
These features are necessary on larger systems.
Database Administrator
1. The database administrator is a person having central control over data and programs
accessing that data. Duties of the database administrator include:
o Scheme definition: the creation of the original database scheme. This involves
writing a set of definitions in a DDL (data storage and definition language),
compiled by the DDL compiler into a set of tables stored in the data dictionary.
o Storage structure and access method definition: writing a set of definitions
translated by the data storage and definition language compiler
o Scheme and physical organization modification: writing a set of definitions
used by the DDL compiler to generate modifications to appropriate internal
system tables (e.g. data dictionary). This is done rarely, but sometimes the
database scheme or physical organization must be modified.
o Granting of authorization for data access: granting different types of
authorization for data access to various users
o Integrity constraint specification: generating integrity constraints. These are
consulted by the database manager module whenever updates occur.
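As an illustration of integrity constraints that are specified once and then checked on every update, the sketch below declares three kinds of constraint through Python's sqlite3 module. The schema, the names and the helper function are hypothetical. Note that SQLite checks foreign keys only after the PRAGMA shown, and that it is lenient about column types, so domain integrity is represented here mainly by NOT NULL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript(
    """
    CREATE TABLE customer (name TEXT PRIMARY KEY);
    CREATE TABLE account (
        number  TEXT PRIMARY KEY,                         -- entity integrity
        owner   TEXT NOT NULL REFERENCES customer (name), -- referential integrity
        balance INTEGER NOT NULL CHECK (balance >= 25)    -- user-defined rule
    );
    INSERT INTO customer VALUES ('Hayes');
    INSERT INTO account VALUES ('A-101', 'Hayes', 500);
    """
)


def rejected(sql):
    """Return True if the DBMS refuses the statement as a constraint violation."""
    try:
        with conn:  # roll back automatically when the statement fails
            conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True


low_balance = rejected("UPDATE account SET balance = 10 WHERE number = 'A-101'")
duplicate_key = rejected("INSERT INTO account VALUES ('A-101', 'Hayes', 100)")
unknown_owner = rejected("INSERT INTO account VALUES ('A-102', 'Nobody', 100)")
orphaning_delete = rejected("DELETE FROM customer WHERE name = 'Hayes'")

print(low_balance, duplicate_key, unknown_owner, orphaning_delete)
```

All four statements are refused, and the application programs need no checking code of their own.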
Database Users
1. The database users fall into several categories:
o Application programmers are computer professionals interacting with the
system through DML calls embedded in a program written in a host language (e.g.
C, PL/1, Pascal).
   These programs are called application programs.
   The DML precompiler converts DML calls (prefaced by a special
   character like $, #, etc.) to normal procedure calls in a host language.
   The host language compiler then generates the object code.
   Some special types of programming languages combine Pascal-like
   control structures with control structures for the manipulation of a
   database.
   These are sometimes called fourth-generation languages.
   They often include features to help generate forms and display data.
o Sophisticated users interact with the system without writing programs.
   They form requests by writing queries in a database query language.
   These are submitted to a query processor that breaks a DML statement
   down into instructions for the database manager module.
o Specialized users are sophisticated users writing special database application
programs. These may be CADD systems, knowledge-based and expert systems,
complex data systems (audio/video), etc.
o Naive users are unsophisticated users who interact with the system by using
permanent application programs (e.g. automated teller machine).
Overall System Structure
1. Database systems are partitioned into modules for different functions. Some functions
(e.g. file systems) may be provided by the operating system.
2. Components include:
o File manager manages allocation of disk space and data structures used to
represent information on disk.
o Database manager: The interface between low-level data and application
programs and queries.
o Query processor translates statements in a query language into low-level
instructions the database manager understands. (May also attempt to find an
equivalent but more efficient form.)
o DML precompiler converts DML statements embedded in an application
program to normal procedure calls in a host language. The precompiler interacts
with the query processor.
o DDL compiler converts DDL statements to a set of tables containing metadata
stored in a data dictionary.
In addition, several data structures are required for physical system implementation:
o Data files: store the database itself.
o Data dictionary: stores information about the structure of the database. A set of
tables that contains descriptive information about the database's
components, such as the data files, tablespaces, tables, and users.
o Indices: provide fast access to data items holding particular values.
3. Figure 1.6 shows these components.
Figure 1.6: Database system structure.