Copyright © 2007 Ramez Elmasri and Shamkant B.
Navathe 1
Chapter 2
Database System Concepts and
Architecture
Outline
■ Data Models and Their Categories
■ History of Data Models
■ Schemas, Instances, and States
■ Three-Schema Architecture
■ Data Independence
■ DBMS Languages and Interfaces
■ Database System Utilities and Tools
■ Centralized and Client-Server Architectures
■ Classification of DBMSs
Copyright © 2007 Ramez Elmasri and Shamkant B. Navathe 3
Data Models
■ Data Model:
■ A set of concepts to describe the structure of a database,
the operations for manipulating these structures, and
certain constraints that the database should obey.
■ Data Model Structure and Constraints:
■ Constructs are used to define the database structure
■ Constructs typically include elements (and their data
types) as well as groups of elements (e.g. entity, record,
table), and relationships among such groups
■ Constraints specify some restrictions on valid data; these
constraints must be enforced at all times
Copyright © 2007 Ramez Elmasri and Shamkant B. Navathe 4
Data Models (continued)
■ Data Model Operations:
■ These operations are used for specifying
database retrievals and updates by referring to the
constructs of the data model.
■ Operations on the data model may include basic
model operations (e.g. generic insert, delete,
update) and user-defined operations (e.g.
compute_student_gpa, update_inventory)
Copyright © 2007 Ramez Elmasri and Shamkant B. Navathe 5
Categories of Data Models
■ Conceptual (high-level, semantic) data models:
■ Provide concepts that are close to the way many users
perceive data.
■ (Also called entity-based or object-based data models.)
■ Physical (low-level, internal) data models:
■ Provide concepts that describe details of how data is stored
in the computer. These are usually specified in an ad-hoc
manner through DBMS design and administration manuals
■ Implementation (representational) data models:
■ Provide concepts that fall between the above two, used by
many commercial DBMS implementations (e.g. relational
data models used in many commercial systems).
Copyright © 2007 Ramez Elmasri and Shamkant B. Navathe 6
Schemas versus Instances
■ Database Schema:
■ The description of a database.
■ Includes descriptions of the database structure,
data types, and the constraints on the database.
■ Schema Diagram:
■ An illustrative display of (most aspects of) a
database schema.
■ Schema Construct:
■ A component of the schema or an object within
the schema, e.g., STUDENT, COURSE.
Copyright © 2007 Ramez Elmasri and Shamkant B. Navathe 7
Schemas versus Instances
■ Database State:
■ The actual data stored in a database at a
particular moment in time. This includes the
collection of all the data in the database.
■ Also called database instance (or occurrence or
snapshot).
■ The term instance is also applied to individual
database components, e.g. record instance, table
instance, entity instance
Copyright © 2007 Ramez Elmasri and Shamkant B. Navathe 8
Database Schema
vs. Database State
■ Database State:
■ Refers to the content of a database at a moment
in time.
■ Initial Database State:
■ Refers to the database state when it is initially
loaded into the system.
■ Valid State:
■ A state that satisfies the structure and constraints
of the database.
Copyright © 2007 Ramez Elmasri and Shamkant B. Navathe 9
Database Schema
vs. Database State (continued)
■ Distinction
■ The database schema changes very infrequently.
■ The database state changes every time the
database is updated.
■ Schema is also called intension.
■ State is also called extension.
Copyright © 2007 Ramez Elmasri and Shamkant B. Navathe 10
Example of a Database Schema
Copyright © 2007 Ramez Elmasri and Shamkant B. Navathe 11
Example of a database state
Copyright © 2007 Ramez Elmasri and Shamkant B. Navathe 12
Three-Schema Architecture
■ Proposed to support DBMS characteristics of:
■ Program-data independence.
■ Support of multiple views of the data.
■ Not explicitly used in commercial DBMS
products, but has been useful in explaining
database system organization
Copyright © 2007 Ramez Elmasri and Shamkant B. Navathe 13
Three-Schema Architecture
■ Defines DBMS schemas at three levels:
■ Internal schema at the internal level to describe physical
storage structures and access paths (e.g indexes).
■ Typically uses a physical data model.
■ Conceptual schema at the conceptual level to describe
the structure and constraints for the whole database for a
community of users.
■ Uses a conceptual or an implementation data model.
■ External schemas at the external level to describe the
various user views.
■ Usually uses the same data model as the conceptual schema.
Copyright © 2007 Ramez Elmasri and Shamkant B. Navathe 14
The three-schema architecture
Copyright © 2007 Ramez Elmasri and Shamkant B. Navathe 15
Three-Schema Architecture
■ Mappings among schema levels are needed to
transform requests and data.
■ Programs refer to an external schema, and are
mapped by the DBMS to the internal schema for
execution.
■ Data extracted from the internal DBMS level is
reformatted to match the user’s external view (e.g.
formatting the results of an SQL query for display
in a Web page)
Copyright © 2007 Ramez Elmasri and Shamkant B. Navathe 16
Data Independence
■ Logical Data Independence:
■ The capacity to change the conceptual schema
without having to change the external schemas
and their associated application programs.
■ Physical Data Independence:
■ The capacity to change the internal schema
without having to change the conceptual schema.
■ For example, the internal schema may be
changed when certain file structures are
reorganized or new indexes are created to
improve database performance
Copyright © 2007 Ramez Elmasri and Shamkant B. Navathe 17
Data Independence (continued)
■ When a schema at a lower level is changed, only
the mappings between this schema and
higher-level schemas need to be changed in a
DBMS that fully supports data independence.
■ The higher-level schemas themselves are
unchanged.
■ Hence, the application programs need not be
changed since they refer to the external schemas.
Copyright © 2007 Ramez Elmasri and Shamkant B. Navathe 18
DBMS Languages
■ Data Definition Language (DDL)
■ Data Manipulation Language (DML)
■ High-Level or Non-procedural Languages: These
include the relational language SQL
■ May be used in a standalone way or may be
embedded in a programming language
■ Low Level or Procedural Languages:
■ These must be embedded in a programming
language
Copyright © 2007 Ramez Elmasri and Shamkant B. Navathe 19
DBMS Languages
■ Data Definition Language (DDL):
■ Used by the DBA and database designers to
specify the conceptual schema of a database.
■ In many DBMSs, the DDL is also used to define
internal and external schemas (views).
■ In some DBMSs, separate storage definition
language (SDL) and view definition language
(VDL) are used to define internal and external
schemas.
■ SDL is typically realized via DBMS commands
provided to the DBA and database designers
Copyright © 2007 Ramez Elmasri and Shamkant B. Navathe 20
DBMS Languages
■ Data Manipulation Language (DML):
■ Used to specify database retrievals and updates
■ DML commands (data sublanguage) can be
embedded in a general-purpose programming
language (host language), such as COBOL, C,
C++, or Java.
■ A library of functions can also be provided to
access the DBMS from a programming language
■ Alternatively, stand-alone DML commands can be
applied directly (called a query language).
Copyright © 2007 Ramez Elmasri and Shamkant B. Navathe 21
Types of DML
■ High Level or Non-procedural Language:
■ For example, the SQL relational language
■ Are “set”-oriented and specify what data to
retrieve rather than how to retrieve it.
■ Also called declarative languages.
■ Low Level or Procedural Language:
■ Retrieve data one record-at-a-time;
■ Constructs such as looping are needed to retrieve
multiple records, along with positioning pointers.
Copyright © 2007 Ramez Elmasri and Shamkant B. Navathe 22
DBMS Interfaces
■ Stand-alone query language interfaces
■ Example: Entering SQL queries at the DBMS
interactive SQL interface (e.g. SQL*Plus in
ORACLE)
■ Programmer interfaces for embedding DML in
programming languages
■ User-friendly interfaces
■ Menu-based, forms-based, graphics-based, etc.
Copyright © 2007 Ramez Elmasri and Shamkant B. Navathe 23
DBMS Programming Language Interfaces
■ Programmer interfaces for embedding DML in a
programming languages:
■ Embedded Approach: e.g embedded SQL (for C,
C++, etc.), SQLJ (for Java)
■ Procedure Call Approach: e.g. JDBC for Java,
ODBC for other programming languages
■ Database Programming Language Approach:
e.g. ORACLE has PL/SQL, a programming
language based on SQL; language incorporates
SQL and its data types as integral components
Copyright © 2007 Ramez Elmasri and Shamkant B. Navathe 24
User-Friendly DBMS Interfaces
■ Menu-based, popular for browsing on the web
■ Forms-based, designed for naïve users
■ Graphics-based
■ (Point and Click, Drag and Drop, etc.)
■ Natural language: requests in written English
■ Combinations of the above:
■ For example, both menus and forms used
extensively in Web database interfaces
Copyright © 2007 Ramez Elmasri and Shamkant B. Navathe 25
Other DBMS Interfaces
■ Speech as Input and Output
■ Web Browser as an interface
■ Parametric interfaces, e.g., bank tellers using
function keys.
■ Interfaces for the DBA:
■ Creating user accounts, granting authorizations
■ Setting system parameters
■ Changing schemas or access paths
Copyright © 2007 Ramez Elmasri and Shamkant B. Navathe 26
Database System Utilities
■ To perform certain functions such as:
■ Loading data stored in files into a database.
Includes data conversion tools.
■ Backing up the database periodically on tape.
■ Reorganizing database file structures.
■ Report generation utilities.
■ Performance monitoring utilities.
■ Other functions, such as sorting, user monitoring,
data compression, etc.
Copyright © 2007 Ramez Elmasri and Shamkant B. Navathe 27
Other Tools
■ Data dictionary / repository:
■ Used to store schema descriptions and other
information such as design decisions, application
program descriptions, user information, usage
standards, etc.
■ Active data dictionary is accessed by DBMS
software and users/DBA.
■ Passive data dictionary is accessed by
users/DBA only.
Copyright © 2007 Ramez Elmasri and Shamkant B. Navathe 28
Other Tools
■ Application Development Environments and
CASE (computer-aided software engineering)
tools:
■ Examples:
■ PowerBuilder (Sybase)
■ JBuilder (Borland)
■ JDeveloper 10G (Oracle)
Copyright © 2007 Ramez Elmasri and Shamkant B. Navathe 29
Typical DBMS Component Modules
Copyright © 2007 Ramez Elmasri and Shamkant B. Navathe 30
Centralized and
Client-Server DBMS Architectures
■ Centralized DBMS:
■ Combines everything into single system including-
DBMS software, hardware, application programs,
and user interface processing software.
■ User can still connect through a remote terminal –
however, all processing is done at centralized site.
Copyright © 2007 Ramez Elmasri and Shamkant B. Navathe 31
A Physical Centralized Architecture
Copyright © 2007 Ramez Elmasri and Shamkant B. Navathe 32
Basic 2-tier Client-Server Architectures
■ Specialized Servers with Specialized functions
■ Print server
■ File server
■ DBMS server
■ Web server
■ Email server
■ Clients can access the specialized servers as
needed
Copyright © 2007 Ramez Elmasri and Shamkant B. Navathe 33
Logical two-tier client server architecture
Copyright © 2007 Ramez Elmasri and Shamkant B. Navathe 34
Clients
■ Provide appropriate interfaces through a client
software module to access and utilize the various
server resources.
■ Clients may be diskless machines or PCs or
Workstations with disks with only the client
software installed.
■ Connected to the servers via some form of a
network.
■ (LAN: local area network, wireless network, etc.)
Copyright © 2007 Ramez Elmasri and Shamkant B. Navathe 35
DBMS Server
■ Provides database query and transaction services to the
clients
■ Relational DBMS servers are often called SQL servers,
query servers, or transaction servers
■ Applications running on clients utilize an Application
Program Interface (API) to access server databases via
standard interface such as:
■ ODBC: Open Database Connectivity standard
■ JDBC: for Java programming access
■ Client and server must install appropriate client module
and server module software for ODBC or JDBC
■ See Chapter 9
Copyright © 2007 Ramez Elmasri and Shamkant B. Navathe 36
Two Tier Client-Server Architecture
■ A client program may connect to several DBMSs,
sometimes called the data sources.
■ In general, data sources can be files or other
non-DBMS software that manages data.
■ Other variations of clients are possible: e.g., in
some object DBMSs, more functionality is
transferred to clients including data dictionary
functions, optimization and recovery across
multiple servers, etc.
Copyright © 2007 Ramez Elmasri and Shamkant B. Navathe 37
Three Tier Client-Server Architecture
■ Common for Web applications
■ Intermediate Layer called Application Server or Web
Server:
■ Stores the web connectivity software and the business logic
part of the application used to access the corresponding
data from the database server
■ Acts like a conduit for sending partially processed data
between the database server and the client.
■ Three-tier Architecture Can Enhance Security:
■ Database server only accessible via middle tier
■ Clients cannot directly access database server
Copyright © 2007 Ramez Elmasri and Shamkant B. Navathe 38
Three-tier client-server architecture
Copyright © 2007 Ramez Elmasri and Shamkant B. Navathe 39
Classification of DBMSs
■ Based on the data model used
■ Traditional: Relational, Network, Hierarchical.
■ Emerging: Object-oriented, Object-relational.
■ Other classifications
■ Single-user (typically used with personal
computers)
vs. multi-user (most DBMSs).
■ Centralized (uses a single computer with one
database)
vs. distributed (uses multiple computers, multiple
databases)
Copyright © 2007 Ramez Elmasri and Shamkant B. Navathe 40
Variations of Distributed DBMSs
(DDBMSs)
■ Homogeneous DDBMS
■ Heterogeneous DDBMS
■ Federated or Multidatabase Systems
■ Distributed Database Systems have now come to
be known as client-server based database
systems because:
■ They do not support a totally distributed
environment, but rather a set of database servers
supporting a set of clients.
Copyright © 2007 Ramez Elmasri and Shamkant B. Navathe 41
Cost considerations for DBMSs
■ Cost Range: from free open-source systems to
configurations costing millions of dollars
■ Examples of free relational DBMSs: MySQL,
PostgreSQL, others
■ Commercial DBMS offer additional specialized modules,
e.g. time-series module, spatial data module, document
module, XML module
■ These offer additional specialized functionality when
purchased separately
■ Sometimes called cartridges (e.g., in Oracle) or blades
■ Different licensing options: site license, maximum number
of concurrent users (seat license), single user, etc.
Copyright © 2007 Ramez Elmasri and Shamkant B. Navathe 42
History of Data Models
■ Network Model
■ Hierarchical Model
■ Relational Model
■ Object-oriented Data Models
■ Object-Relational Models
Copyright © 2007 Ramez Elmasri and Shamkant B. Navathe 43
History of Data Models
■ Network Model:
■ The first network DBMS was implemented by
Honeywell in 1964-65 (IDS System).
■ Adopted heavily due to the support by CODASYL
(Conference on Data Systems Languages)
(CODASYL - DBTG report of 1971).
■ Later implemented in a large variety of systems -
IDMS (Cullinet - now Computer Associates), DMS
1100 (Unisys), IMAGE (H.P. (Hewlett-Packard)),
VAX -DBMS (Digital Equipment Corp., next
COMPAQ, now H.P.).
Copyright © 2007 Ramez Elmasri and Shamkant B. Navathe 44
Example of Network Model Schema
Copyright © 2007 Ramez Elmasri and Shamkant B. Navathe 45
Network Model
■ Advantages:
■ Network Model is able to model complex
relationships and represents semantics of
add/delete on the relationships.
■ Can handle most situations for modeling using
record types and relationship types.
■ Language is navigational; uses constructs like
FIND, FIND member, FIND owner, FIND NEXT
within set, GET, etc.
■ Programmers can do optimal navigation through
the database.
Copyright © 2007 Ramez Elmasri and Shamkant B. Navathe 46
Network Model
■ Disadvantages:
■ Navigational and procedural nature of processing
■ Database contains a complex array of pointers
that thread through a set of records.
■ Little scope for automated “query optimization”
Copyright © 2007 Ramez Elmasri and Shamkant B. Navathe 47
History of Data Models
■ Hierarchical Data Model:
■ Initially implemented in a joint effort by IBM and
North American Rockwell around 1965. Resulted
in the IMS family of systems.
■ IBM’s IMS product had (and still has) a very large
customer base worldwide
■ Hierarchical model was formalized based on the
IMS system
■ Other systems based on this model: System 2k
(SAS inc.)
Copyright © 2007 Ramez Elmasri and Shamkant B. Navathe 48
Hierarchical Model
■ Advantages:
■ Simple to construct and operate
■ Corresponds to a number of natural hierarchically
organized domains, e.g., organization (“org”) chart
■ Language is simple:
■ Uses constructs like GET, GET UNIQUE, GET NEXT, GET
NEXT WITHIN PARENT, etc.
■ Disadvantages:
■ Navigational and procedural nature of processing
■ Database is visualized as a linear arrangement of records
■ Little scope for "query optimization"
Copyright © 2007 Ramez Elmasri and Shamkant B. Navathe 49
History of Data Models
■ Relational Model:
■ Proposed in 1970 by E.F. Codd (IBM), first commercial
system in 1981-82.
■ Now in several commercial products (e.g. DB2, ORACLE,
MS SQL Server, SYBASE, INFORMIX).
■ Several free open source implementations, e.g. MySQL,
PostgreSQL
■ Currently most dominant for developing database
applications.
■ SQL relational standards: SQL-89 (SQL1), SQL-92 (SQL2),
SQL-99, SQL3, …
■ Chapters 5 through 11 describe this model in detail
Copyright © 2007 Ramez Elmasri and Shamkant B. Navathe 50
History of Data Models
■ Object-oriented Data Models:
■ Several models have been proposed for implementing in a
database system.
■ One set comprises models of persistent O-O Programming
Languages such as C++ (e.g., in OBJECTSTORE or
VERSANT), and Smalltalk (e.g., in GEMSTONE).
■ Additionally, systems like O2, ORION (at MCC - then
ITASCA), IRIS (at H.P.- used in Open OODB).
■ Object Database Standard: ODMG-93, ODMG-version 2.0,
ODMG-version 3.0.
■ Chapters 20 and 21 describe this model.
Copyright © 2007 Ramez Elmasri and Shamkant B. Navathe 51
History of Data Models
■ Object-Relational Models:
■ Most Recent Trend. Started with Informix
Universal Server.
■ Relational systems incorporate concepts from
object databases leading to object-relational.
■ Exemplified in the latest versions of Oracle-10i,
DB2, and SQL Server and other DBMSs.
■ Standards included in SQL-99 and expected to be
enhanced in future SQL standards.
■ Chapter 22 describes this model.
Copyright © 2007 Ramez Elmasri and Shamkant B. Navathe 52
Summary
■ Data Models and Their Categories
■ History of Data Models
■ Schemas, Instances, and States
■ Three-Schema Architecture
■ Data Independence
■ DBMS Languages and Interfaces
■ Database System Utilities and Tools
■ Centralized and Client-Server Architectures
■ Classification of DBMSs
Copyright © 2007 Ramez Elmasri and Shamkant B. Navathe 53