Chapter 1
Introduction to Database Systems
• A database is an organized collection of structured information, or data, typically stored
electronically in a computer system.
• A database is usually controlled by a database management system (DBMS).
• Database systems are designed to manage large data sets in an organization.
• Today, databases are essential to every business.
• They are used to maintain internal records, to present data to customers and clients on
the World Wide Web, and to support many other commercial processes.
• Databases are likewise found at the core of many modern organizations.
• The power of databases comes from a body of knowledge and technology that has
developed over several decades and is embodied in specialized software called a
database management system, or DBMS.
• A DBMS is a powerful tool for creating and managing large amounts of data efficiently
and allowing it to persist over long periods of time, safely.
• These systems are among the most complex types of software available.
• So, to answer our question, "What is a database?": in essence, a database is nothing more than a
collection of shared information that exists over a long period of time, often many years.
• In common language, the term database refers to a collection of data that is managed by
a DBMS.
• Data management has passed through different levels of development, in step
with developments in technology and services.
• These can best be described as three levels of development.
• Although each new level brings an advantage and overcomes a problem of the
previous one, all three methods of data handling are still in use to some extent.
• The three major levels are:
1. Manual Approach
2. Traditional File Based Approach
3. Database Approach
Manual Approach
• In the manual approach, data storage and retrieval follow the primitive, traditional way
of information handling, in which cards and paper are used.
• Data storage and retrieval are performed by human labour.
• Files are kept for as many events and objects as the organization has, and are used to store information.
• Each of the files containing various kinds of information is labelled and stored in one or more
cabinets.
• The cabinets may be kept in safe places for security purposes, depending on the sensitivity of the
information they contain.
• Insertion and retrieval are done by searching first for the right cabinet, then for the right
file, and then for the information.
• An indexing system can be used to facilitate access to the data.
Limitations of the Manual Approach
• Prone to error
• Difficult to update, retrieve, integrate
• You have the data but it is difficult to compile the
information
• Limited to small size information
• Cross referencing is difficult
Traditional File Based Approach
• After computers were introduced to the business community for data
processing, the need to use them for data storage and processing
increased.
• There were, and still are, several computer applications that use
file-based processing for data handling.
• File-based systems were an early attempt to computerize the
manual filing system.
• In such systems, every application program that provides a
service to end users defines and manages its own data.
• Since every application defines and manages its own data, the
system is subject to a serious data duplication problem.
• A file, in the traditional file-based approach, is a collection of records
that contain logically related data.
Limitations of the Traditional File Based Approach
• Separation or isolation of data: information available in one application may not be
known to other applications. Data synchronization is done manually.
• Limited data sharing: every application maintains its own data.
• Lengthy development and maintenance time.
• Duplication or redundancy of data, which costs money and time and leads to loss of data integrity.
• Data dependency on the application: the data structure is embedded in the application;
hence, a change in the data structure requires a change in the application as well.
• Incompatible file formats or data structures between different applications and programs
(e.g. programs written in C and in COBOL), creating inconsistency and making joint processing difficult.
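The duplication and synchronization problems listed above can be illustrated with a minimal Python sketch. The two "applications" (billing and shipping), their file layouts, and the customer record are all hypothetical, invented only for illustration; in-memory text buffers stand in for files on disk.

```python
import csv
import io

# Hypothetical example: two applications each keep a private copy of the
# same customer, in incompatible layouts (CSV versus pipe-separated).
billing_file = io.StringIO("id,name,address\n1,Abebe,Old Street 5\n")
shipping_file = io.StringIO("1|Abebe|Old Street 5\n")


def billing_update_address(file, customer_id, new_address):
    """The billing application rewrites only its own CSV file."""
    rows = list(csv.DictReader(io.StringIO(file.getvalue())))
    for row in rows:
        if row["id"] == str(customer_id):
            row["address"] = new_address
    out = io.StringIO()
    writer = csv.DictWriter(
        out, fieldnames=["id", "name", "address"], lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return out


def billing_address(file, customer_id):
    """Read the address back from the billing application's CSV layout."""
    for row in csv.DictReader(io.StringIO(file.getvalue())):
        if row["id"] == str(customer_id):
            return row["address"]
    return None


def shipping_address(file, customer_id):
    """The shipping application parses its own pipe-separated layout."""
    for line in file.getvalue().splitlines():
        cid, _name, address = line.split("|")
        if cid == str(customer_id):
            return address
    return None


billing_file = billing_update_address(billing_file, 1, "New Avenue 9")

# The two copies now disagree until someone synchronizes them manually.
print(billing_address(billing_file, 1))    # New Avenue 9
print(shipping_address(shipping_file, 1))  # Old Street 5
```

Because each program embeds its own file layout, neither can read the other's data without extra conversion code, and an update made in one place is silently missing from the other.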
Database Approach
• A database is essentially a computerized record-keeping system, a kind of electronic filing cabinet.
• A database is a repository for a collection of computerized data files.
• A database is a shared collection of logically related data, and a description of that data, designed to meet the information
needs of an organization.
• A database is a collection of logically related data, where the logically related data comprise the entities, attributes,
relationships, and business rules of an organization's information.
• In addition to the data required by an organization, a database also contains a description of the data, which is
known as “Metadata”, “Data Dictionary”, “Systems Catalogue”, “Data about Data”, or sometimes “Data
Directory”.
• A database is designed once and used simultaneously by many users.
• Unlike the traditional file-based approach, the database approach provides program-data independence, that is, the
separation of the data definition from the application. Thus the application is not affected by changes made to the
data structure and file organization.
• Each database application performs some combination of creating the database and reading, updating, and deleting data.
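Metadata and program-data independence can be sketched with Python's built-in `sqlite3` module. SQLite is used here only because it ships with Python; the `customer` table and the `customer_names` function are hypothetical names chosen for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute("INSERT INTO customer (name) VALUES ('Abebe')")


def customer_names(connection):
    """An 'application' query that names only the columns it needs."""
    return [row[0] for row in connection.execute("SELECT name FROM customer ORDER BY id")]


before = customer_names(conn)

# The data structure changes, but the application code above does not.
conn.execute("ALTER TABLE customer ADD COLUMN email TEXT")
after = customer_names(conn)

# Metadata: SQLite records table definitions in its catalogue, sqlite_master.
catalogue = conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'"
).fetchall()
columns = [row[1] for row in conn.execute("PRAGMA table_info(customer)")]

print(before == after)  # True
print(catalogue)        # [('customer',)]
print(columns)          # ['id', 'name', 'email']
```

The application keeps working after the new column is added because the data definition lives in the database's catalogue rather than inside the program.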
Benefits of the database approach
• Data can be shared: two or more users can access and use the same data, instead of the data being stored
redundantly for each user.
• Redundancy can be reduced: isolated data is integrated in the database, which reduces the redundant data
stored by different applications.
• Quality data can be maintained: the integrity constraints of the database approach maintain
data quality, leading to better decision making.
• Inconsistency can be avoided: controlling data redundancy avoids inconsistency of the data in the
database to some extent.
• Transaction support can be provided: the basic demands of any transaction support system are implemented
in a full-scale DBMS.
• Integrity can be maintained: data from different applications is integrated, with additional
constraints that ensure the validity and consistency of the shared data resource.
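As a minimal sketch of how integrity constraints protect data quality, the following Python example uses the built-in `sqlite3` module. The `employee` table, its columns, and the `try_insert` helper are hypothetical names chosen for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE employee (
        id     INTEGER PRIMARY KEY,
        email  TEXT NOT NULL UNIQUE,
        salary REAL NOT NULL CHECK (salary >= 0)
    )
    """
)
conn.execute("INSERT INTO employee (email, salary) VALUES ('a@example.com', 1000)")


def try_insert(connection, email, salary):
    """Return True if the row is accepted, False if a constraint rejects it."""
    try:
        connection.execute(
            "INSERT INTO employee (email, salary) VALUES (?, ?)", (email, salary)
        )
        return True
    except sqlite3.IntegrityError:
        return False


print(try_insert(conn, "b@example.com", 500))  # True
print(try_insert(conn, "a@example.com", 700))  # False: duplicate email
print(try_insert(conn, "c@example.com", -50))  # False: negative salary
```

The rules are declared once, in the database, so every application that shares the data is held to the same constraints.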
Limitations and Risks of the Database Approach
• Need for new professional and specialized personnel
• Complexity in designing and managing data
• The cost and risk of converting from the old system to the new one
• High cost of developing and maintaining the system
• Complex backup and recovery services from the user's perspective
• Reduced performance due to centralization and data independence
• High impact on the whole system when the central system fails
Database Management System (DBMS)
• A Database Management System (DBMS) is a software package that provides EFFICIENT, CONVENIENT and SAFE
storage of and access to MASSIVE amounts of PERSISTENT data (data that outlives the programs that operate on it).
• A DBMS also provides a systematic method for creating, updating, storing, and retrieving data in a database. It also
provides services for controlling data access, enforcing data integrity, managing concurrency, and recovery.
• With this in mind, a full-scale DBMS should provide at least the following services to the user.
1. Data storage, retrieval and update in the database
2. A user-accessible catalogue
3. Transaction support service: ALL or NONE transactions, which minimize data inconsistency.
4. Concurrency control services: simultaneous access to and update of the database by different users should be
handled correctly.
5. Recovery services: a mechanism for recovering the database after a failure must be
available.
6. Authorization services (security): the DBMS must support access control and
authorization for the database administrator and users.
7. Support for data communication: the DBMS should be able to integrate with data
transfer software or data communication managers.
8. Integrity services: rules about the data and about changes made to the data, which ensure the
correctness and consistency of stored data and the quality of data based on business
constraints.
9. Services to promote data independence between the data and the application
10. Utility services: a set of utility facilities such as:
• Importing data
• Statistical analysis support
• Index reorganization
• Garbage collection
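The ALL or NONE behaviour of the transaction support service can be sketched with Python's built-in `sqlite3` module. The `account` table, the account names, and the `transfer` and `balances` functions are hypothetical, invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account ("
    "name TEXT PRIMARY KEY, balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [("alice", 100), ("bob", 20)])
conn.commit()


def transfer(connection, src, dst, amount):
    """Move money between accounts; either both updates apply or neither does."""
    try:
        with connection:  # commits on success, rolls back on an exception
            connection.execute(
                "UPDATE account SET balance = balance + ? WHERE name = ?", (amount, dst)
            )
            connection.execute(
                "UPDATE account SET balance = balance - ? WHERE name = ?", (amount, src)
            )
        return True
    except sqlite3.IntegrityError:
        return False


def balances(connection):
    """Return the current balances as a {name: balance} dictionary."""
    return dict(connection.execute("SELECT name, balance FROM account"))


print(transfer(conn, "alice", "bob", 30))   # True
print(transfer(conn, "alice", "bob", 500))  # False: would make alice negative
print(balances(conn))                       # {'alice': 70, 'bob': 50}
```

In the failed transfer the credit to `bob` had already been executed when the debit violated the constraint, yet the rollback undid it, so the database never shows money that was created from nothing.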
• A DBMS is a software package used to design, manage, and maintain
databases.
• Each DBMS should have facilities to define the database, manipulate the
content of the database and control the database.
• These facilities help the designer, the user and the database
administrator to discharge their responsibilities in designing, using and
managing the database.
• It provides the following facilities:
Data Definition Language (DDL)
• Language used to define each data element required by the
organization.
• Commands for setting up the schema, or intension, of the database.
• These commands are used to set up a database and to create, delete and alter
tables, including the handling of constraints.
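A minimal sketch of DDL commands, run through Python's built-in `sqlite3` module. The `department`, `employee` and `scratch` tables are hypothetical examples, and the exact DDL syntax supported varies between DBMS products.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# CREATE: define tables together with their constraints.
conn.execute(
    "CREATE TABLE department (id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE)"
)
conn.execute(
    """
    CREATE TABLE employee (
        id      INTEGER PRIMARY KEY,
        name    TEXT NOT NULL,
        dept_id INTEGER REFERENCES department(id)
    )
    """
)

# ALTER: change the definition of an existing table.
conn.execute("ALTER TABLE employee ADD COLUMN hired_on TEXT")

# DROP: delete a table definition together with its data.
conn.execute("CREATE TABLE scratch (note TEXT)")
conn.execute("DROP TABLE scratch")

tables = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
]
employee_columns = [row[1] for row in conn.execute("PRAGMA table_info(employee)")]

print(tables)            # ['department', 'employee']
print(employee_columns)  # ['id', 'name', 'dept_id', 'hired_on']
```

None of these statements stores any business data; they only define, change, or remove the structure that will hold it.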
Data Manipulation Language (DML)
• The core set of commands used by end users and programmers to store, retrieve,
and access the data in the database, e.g. SQL.
• Because the data a user asks for (the query) is extracted using this
type of language, it is also called a “Query Language”.
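A minimal sketch of the four basic DML operations in SQL, run through Python's built-in `sqlite3` module. The `student` table and its rows are hypothetical data invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT, gpa REAL)")

# INSERT: store new rows.
conn.executemany(
    "INSERT INTO student (name, gpa) VALUES (?, ?)",
    [("Sara", 3.6), ("Dawit", 2.9), ("Hana", 3.2)],
)

# UPDATE: change stored values.
conn.execute("UPDATE student SET gpa = 3.0 WHERE name = 'Dawit'")

# DELETE: remove rows.
conn.execute("DELETE FROM student WHERE name = 'Hana'")

# SELECT: the query part of the language; it states what is wanted,
# not how the DBMS should find it.
honours = conn.execute(
    "SELECT name, gpa FROM student WHERE gpa >= 3.0 ORDER BY name"
).fetchall()

print(honours)  # [('Dawit', 3.0), ('Sara', 3.6)]
```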
Data Control Language (DCL)
• Database is a shared resource that demands control of data access and usage.
• The database administrator should have the facility to control the overall operation
of the system.
• Data Control Language commands help the Database Administrator
to control the database.
• The commands are used to grant or revoke privileges to access the database or a
particular object within the database, and to commit (store) or roll back (remove) database transactions.
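In a server DBMS, privileges are typically managed with SQL statements of the form `GRANT SELECT ON employee TO clerk` and `REVOKE SELECT ON employee FROM clerk`. SQLite, the engine bundled with Python, has no user accounts, so these statements cannot be run against it. The sketch below therefore only models the idea in plain Python: the `AccessControl` class, its methods, and the `clerk` user are hypothetical and not part of any DBMS.

```python
class AccessControl:
    """Toy model of a DBMS privilege table (hypothetical, for illustration)."""

    def __init__(self):
        # Each entry is a (user, action, object) triple.
        self._privileges = set()

    def grant(self, user, action, obj):
        """Model of GRANT: record that the user may perform the action."""
        self._privileges.add((user, action, obj))

    def revoke(self, user, action, obj):
        """Model of REVOKE: withdraw a previously granted privilege."""
        self._privileges.discard((user, action, obj))

    def is_allowed(self, user, action, obj):
        """Check a request against the recorded privileges."""
        return (user, action, obj) in self._privileges


acl = AccessControl()
acl.grant("clerk", "SELECT", "employee")
print(acl.is_allowed("clerk", "SELECT", "employee"))  # True
print(acl.is_allowed("clerk", "UPDATE", "employee"))  # False
acl.revoke("clerk", "SELECT", "employee")
print(acl.is_allowed("clerk", "SELECT", "employee"))  # False
```

A real DBMS performs the equivalent check on every request, which is how the administrator keeps control over a shared resource.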
Database Development Life Cycle (DDLC)
• Because a database is one component of most information system development tasks, there are several steps
in designing a database system.
• Here, more emphasis is given to the design phases of the system development life cycle. The
major steps in database design are:
• Planning: identifying an information gap in the organization and proposing a database
solution to solve the problem.
• Analysis: concentrates on fact finding about the problem or the opportunity.
Feasibility analysis, requirement determination and structuring, and selection of the best design
method are also performed in this phase.
• Design: in database development, more emphasis is given to this phase. The phase is
further divided into three sub-phases.
• Conceptual Design: a concise description of the data, the data types, the relationships between
data, and the constraints on the data.
• No implementation or physical detail is considered at this stage.
• Logical Design: a refinement of the conceptual design in which a specific data
model is selected to implement the data structure.
• It is independent of any particular DBMS and involves no other physical considerations.
• Physical Design: the physical implementation of the logical design of the database with
respect to the internal storage and file structure of the database for the selected DBMS.
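The three design sub-phases can be sketched on a very small example. Everything here is an assumption made for illustration: the Student and Course entities, the choice of the relational model at the logical stage, and SQLite as the selected DBMS at the physical stage.

```python
import sqlite3

# Conceptual design: entities, attributes and a relationship, with no data
# model or DBMS chosen yet (hypothetical mini example).
conceptual = {
    "entities": {"Student": ["id", "name"], "Course": ["code", "title"]},
    "relationships": [("Student", "enrols in", "Course", "many-to-many")],
}

# Logical design: the relational model is chosen, so the many-to-many
# relationship becomes a table of its own with foreign keys.
logical_schema = """
CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE course  (code TEXT PRIMARY KEY, title TEXT NOT NULL);
CREATE TABLE enrolment (
    student_id  INTEGER NOT NULL REFERENCES student(id),
    course_code TEXT    NOT NULL REFERENCES course(code),
    PRIMARY KEY (student_id, course_code)
);
"""

# Physical design: storage and access-path decisions for the selected DBMS,
# here an index to speed up finding all students in a given course.
physical_tuning = "CREATE INDEX idx_enrolment_course ON enrolment (course_code);"

conn = sqlite3.connect(":memory:")
conn.executescript(logical_schema)
conn.executescript(physical_tuning)

tables = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
]
indexes = [row[1] for row in conn.execute("PRAGMA index_list(enrolment)")]

print(tables)                             # ['course', 'enrolment', 'student']
print("idx_enrolment_course" in indexes)  # True
```

The index changes how quickly rows are found but not which rows a query returns, which is why it belongs to the physical rather than the logical design.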
• Implementation: the testing and deployment of the designed database
for use.
• Operation and Support: administering and maintaining the operation
of the database system, providing support to users, and
tuning the database operations for best performance.
Roles in Database Design and Use
• As people are one of the components of the DBMS environment, different stakeholders
play a group of roles in the design and operation of a database
system.
1. Database Administrator (DBA)
• Responsible for overseeing, controlling and managing the database resources (the database
itself, the DBMS and other related software)
• Authorizes access to the database
• Coordinates and monitors the use of the database
• Responsible for determining and acquiring hardware and software resources
• Accountable for problems such as poor security or poor system performance
• Involved in all steps of database development
• This role is further divided in big organizations that have
huge amounts of data and many user requirements.
• Data Administrator (DA): responsible for the management of data
resources. This involves database planning and the development and maintenance
of standards, policies and procedures at the conceptual and logical design
phases.
• Database Administrator (DBA): a more technically oriented role.
The DBA is responsible for the physical realization of the database.
• The DBA is involved in the physical design, implementation, security and integrity
control of the database.
2. Database Designer (DBD)
• Identifies the data to be stored and chooses the appropriate structures to
represent and store the data.
• Should understand the user requirements and decide how the user views
the database.
• Involved in the design phase, before the implementation of the database system.
• There are two kinds of database designer: one involved in the logical
and conceptual design and another involved in the physical design.
Logical and Conceptual DBD
• Identifies the data (entities, attributes and relationships) relevant to the
organization.
• Identifies the constraints on each data item
• Understands the data and business rules of the organization
• Sees the database independently of any data model at the conceptual level and
considers one specific data model at the logical design phase.
Physical DBD
• Takes the logical design specification as input and decides how it should be
physically realized.
• Maps the logical data model onto the specified DBMS with respect to
tables and integrity constraints (DBMS-dependent design).
• Selects the specific storage structures and access paths to the database
• Designs the security measures required on the database
3. Application Programmer and Systems Analyst
• The systems analyst determines the user requirements and how the user
wants to view the database.
• The application programmer implements these specifications as
programs: codes, tests, debugs, documents and maintains the application
program.
• The application programmer determines the interface for
retrieving, inserting, updating and deleting data in the database.
• The application can be written in any high-level programming language,
depending on availability, facilities and the required service.
4. End Users
• Workers whose jobs require accessing the database frequently for various
purposes. There are different groups of users in this category.
Naïve Users
• Sizable proportion of users
• Unaware of the DBMS
• Only access the database based on their access level and demand
• Use standard and pre-specified types of queries
Sophisticated Users
• Users familiar with the structure of the Database and facilities of the
DBMS.
• Have complex requirements
• Have higher level queries
• Are usually engineers, scientists, business analysts, etc.
Casual Users
• Users who access the database occasionally.
• Need different information from the database each time.
• Use sophisticated database queries to satisfy their needs.
• Are usually middle- to high-level managers.
• These roles can in turn be classified as “Actors on the Scene” and
“Workers Behind the Scene”.
Actors on the Scene
• Data Administrator
• Database Administrator
• Database Designer
• End Users
Workers Behind the Scene
• DBMS designers and implementers: people who design and implement DBMS
software.
• Tool Developers: experts who develop software packages that facilitate database
system design and use. Developers of prototyping, simulation and code generation tools are
examples. Independent software vendors can also be placed in this group.
• Operators and Maintenance Personnel: system administrators who are
responsible for actually running and maintaining the hardware and software of the
database system and the information technology facilities.