KEMBAR78
Module 1 Introduction To DBMS | PDF | Databases | Data Model
0% found this document useful (0 votes)
2 views67 pages

Module 1 Introduction To DBMS

Uploaded by

snow25jon
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PPTX, PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
2 views67 pages

Module 1 Introduction To DBMS

Uploaded by

snow25jon
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PPTX, PDF, TXT or read online on Scribd
You are on page 1/ 67

Introduction to

DBMS
Course Overview
Course Overview
Course Overview
Literature
Literature
Data, Database, DBMS
Data: Known facts that can be recorded and have an implicit meaning; raw

Database: A highly organized, interrelated, and structured set of data about a


particular enterprise
 Controlled by a database management system (DBMS)

DBMS:
 A collection of programs that enables users to create and maintain a database
 Set of programs to access the data
 An environment that is both convenient and efficient to use
 A software package/system to facilitate the creation and maintenance of a computerized
database
Mini-world:
 Some part of the real world about which data is stored in a database. For example,
student grades and transcripts at a university.

Database systems are used to manage collections of data that are:


 Highly valuable
 Relatively large
 Accessed by multiple users and applications, often at the same time.

A modern database system is a complex software system whose task is to


manage a large, complex collection of data.
Types of Databases and
Database Applications
Traditional applications:
 Numeric and textual databases

More recent applications:


 Multimedia databases

 Geographic Information Systems (GIS)

 Data warehouses

 Mobile databases

 Real-time and active databases


Recent Developments
Social Networks started capturing a lot of information about people and about
communications among people-posts, tweets, photos, videos in systems such as:
- Facebook
- Twitter
- Linked-In
All of the above constitutes data
Search Engines, Google, Bing, Yahoo: collect their own repository of web pages for
searching purposes
New technologies are emerging from the so-called non-SQL, non-database
software vendors to manage vast amounts of data generated on the web:
 Big data storage systems involving large clusters of distributed computers
 NOSQL (Non-SQL, Not Only SQL) systems are non-tabular databases
A large amount of data now resides on the “cloud” which means it is in huge data
centers using thousands of machines. Cloud means over the internet.
What a DBMS Facilitates
DBMS is a general-purpose software system that facilitates the processes of
defining, constructing, manipulating, and sharing databases among various
users and applications.
Define a particular database in terms of its data types, structures, and
constraints
Construct or load the initial database contents
Manipulating the database:
 Retrieval: Querying, generating reports
 Modification: Insertions, deletions and updates to its content
 Accessing the database through Web applications (e.g., GLA Module)
Processing and sharing by a set of concurrent users and application programs
– yet, keeping all data valid and consistent
DBMS vs File System
File System
File based systems were an early attempt to computerize the manual system.

It is also called a traditional based approach in which a decentralized


approach was taken where each department stored and controlled its own
data with the help of a data processing specialist.

The main role of a data processing specialist was to create the necessary
computer file structures, and also manage the data within structures and
design some application programs that create reports based on file data.
File System
Consider an example of a student's file system
The student file will contain information regarding
the student (i.e. roll no, student name, course etc.).

Similarly, we have a subject file that contains


information about the subject and the
result file which contains the information
regarding the result.

Some fields are duplicated in more than one file, which leads to data
redundancy. So to overcome this problem, we need to create a centralized
DBMS File System
DBMS

A database approach is a well-organized collection


of data that are related in a meaningful way which
can be accessed by different users but stored only
once in a system.
The various operations performed by the DBMS
system are: Insertion, deletion, selection,
sorting etc.
In the above figure, duplication of data is reduced due to centralization of
data.
Difference between File system &
DBMS
File System Database Management System
1. File system is a software that manages and 1. DBMS is a software that is used for accessing ,
organizes the data files in a computer system. creating and managing the databases.

2. The File system provides the details of data 2. DBMS gives an abstract view of data that hides the
representation and storage of data. details.

3. Redundant data can be present in a file system. 3. In DBMS there is no redundant data.

4.There is no efficient query processing in File system. 4. DBMS contains efficient query processing (SQL).

5.Backup and recovery of data is not efficient, 5. DBMS provides backup and recovery of data even if
because it is not possible to recover the lost data. it is lost.
Difference between File system &
DBMS
File System Database Management System
6. Provide less security. 6. DBMS has more security mechanism as compared
to file system.

7. Less complex 7. Complex

8. The file system does not have a crash recovery 8. DBMS provides a crash recovery mechanism i.e.
mechanism. i.e. if the system crashes while entering DBMS protects the user from the system failure
some data, than the content of the file will lost.

9.In the file system, concurrent access has many 9. DBMS takes care of concurrent access of data using
problems. some form of locking mechanism.

10. File system is appropriate to handle data of a 10. DBMS is suitable for medium to large organization
small scale organization or individual user. or multiple users.
Schema and State
1. Database Schema

Database Schema
 The description of a database.
 Specified during database design and is not expected to change frequently.
 Includes descriptions of the database structure, data types, and the constraints on the
database.
Schema Diagram
 An illustrative display of (most aspects of) a database schema.
2. Database State

The actual data stored in a database at a particular moment in time. This


includes the collection of all the data in the database.

Also called database instance (or occurrence or snapshot).

Refers to the content of a database at a moment in time.


Example of Database State
Database Schema vs. Database
State

The database schema changes very infrequently.


The database state changes every time the database is updated.
Schema is also called intension.
State is also called extension.
Database Users
Database administrators
 In any organization where many people use the same resources, there is a need for a chief
administrator to oversee and manage these resources.
 In a database environment, the primary resource is the database itself, and the secondary
resource is the DBMS and related software.
 Database administrator (DBA) is responsible for authorizing access to the database, coordinating
and monitoring its use, acquiring software and hardware resources as needed, controlling its
use and monitoring efficiency of operations.
 The DBA is accountable for problems such as security breaches and poor system response time.
Database designers
 Database designers are responsible for identifying the data to be stored in the database and for
choosing appropriate structures to represent and store this data.
 Responsible to define the content, the structure, the constraints, and functions or transactions
against the database. They must communicate with the end-users and understand their needs.
Database Users

End-users
 End users are the people whose jobs require access to the database for querying,
updating, and generating reports; the database primarily exists for their use.
 They use the data for queries, reports and some of them update the database content.
Three-Tier Architecture
Three-Schema Architecture
The goal of the three-schema architecture is to separate user applications from the
physical database.
 Defines DBMS schemas at three levels:
 Internal schema(or Physical Schema): it describes physical storage structures
and access paths (e.g. indexes).
 Also Known as Low-level.
 In simple terms, it tells “How data physically stored in DB”.
Three-Schema Architecture
 Conceptual schema at the conceptual level to describe the structure and
constraints for the whole database for a community of users.
• hides the details of physical storage structures and concentrates on
describing entities, data types, relationships, user operations, and constraints.
• In simple terms, it describes “What data exists in the Database”.

 External schemas at the external level to describe the various user views
• Each external schema describes the part of the database that a
particular user group is interested in and hides the rest of the database
from that user group.
Three-Schema Architecture

The processes of transforming


requests and results between
levels are called mappings.
Three-Schema Architecture
Three-Schema Architecture
Three-Schema Architecture
Three-Schema Architecture
Three-Schema Architecture
Data Independence
Data Independence
It is defined as a property of a DBMS that helps to change the
schema at one level of a database system without having to change
the schema at the next higher level.

Types of Data Independence


• Physical data independence
• Logical data independence
Data Independence
1. Physical data independence

 With Physical data independence, you can easily change the physical storage structures
or devices without an effect on the conceptual schema.
 Any change done would be absorbed by the mapping between the conceptual and
internal levels.
Data Independence
• Examples of changes under Physical Data Independence: Due to Physical
independence, any of the below change will not affect the conceptual layer.

o Using a new storage device like Hard Drive or Magnetic Tapes


o Modifying the file organization technique in the Database
o Switching to different data structures.
o Changing the access method.
o Modifying indexes.
o Changes to compression techniques or hashing algorithms.
o Change of Location of Database from say C drive to D Drive
Data Independence
 2. Logical data independence
• Logical Data Independence is the ability to change the conceptual scheme
without changing
• External views
• External API or programs
• We may change the conceptual schema to expand the database (by adding
records), to change the constraints, or to reduce the database.
• For example: for user1 application program is different, for user2 application
program is different, and so on.

 Logical data independence is achieved by implementing VIEWS. Views are virtual table.
Data Independence
• Examples of changes under Logical Data Independence: Due to Logical
independence, any of the below change will not affect the external layer.
o Add/Modify/Delete a new attribute, entity or relationship is possible without a rewrite
of existing application programs
o Merging two records into one
o Breaking an existing record into two or more records
Difference between Physical and Logical Data
Independence

Logica Data Independence Physical Data Independence

Logical Data Independence is mainly concerned


with the structure or changing the data Mainly concerned with the storage of the data.
definition.

Compared to Logic Physical independence it is Compared to Logical Independence it is easy to


difficult to achieve logical data independence. achieve physical data independence.

Concerned with conceptual schema Concerned with internal schema

Example: change in compression techniques,


Example: Add/Modify/Delete a new attribute
hashing algorithms, storage devices, etc
Classification of DBMSs
Classification of DBMSs
 Based on the data model used
 Legacy: Hierarchical, Network.
 Currently Used: Relational, Object-oriented, Object-relational
 Recent Technologies: Key-value storage systems, NOSQL systems: document based,
graph-based DBMSs etc.
 Based on Architecture
 Single-user (typically used with personal computers) vs. multi-user (most DBMSs).
 Centralized (uses a single computer with one database) vs. distributed (multiple
computers, multiple DBs)
1. Based on Data Models

• A set of rules and standards that defines how the database


organizes/store data is called database model.
• It also defines how users views the organization of data.
Hierarchical Model
Hierarchical Database model is one of the oldest database models.
The hierarchical model represents data as hierarchical tree structures. Each
hierarchy represents a number of related records.
One to many relationship.
Network Model
It is the extension of Hierarchical Model.
The Network model represents data as a graph.
The main difference of the network model from the hierarchical model, is its
ability to handle many to many (M:N) relations.
Relational Model
Relational model stores data in the form of Tables.
Tables consist of rows and columns.
Each column has specific data types and constraints.
Tables are connected with other tables.
Object-oriented Data Models
An Object oriented data model defines a database in terms of objects, their
properties, and their operations.
All the data and relationships of each object are combined as a single unit.
Objects with the same structure and behavior belong to a class.
The operations of each class are specified in terms of predefined procedures
called methods.
Object-Relational Models
Relational DBMSs have been extending their models to incorporate object
database concepts.
These systems are referred to as object-relational or extended relational
systems.
An Object relational model is a combination of a Object oriented database
model and a Relational database model. So, it supports objects, classes,
inheritance etc. just like Object Oriented models and has support for data
types, tabular structures etc. like Relational data model.
One of the major goals of Object relational data model is to close the
gap between relational databases and the object oriented practises
frequently used in many programming languages such as C++, C#, Java
etc.
The problem with this model is that this can get complex and difficult
to handle. So, proper understanding of this model is required.

Relational model = ER diagrams


Object-Relational = EER diagrams
XML model
 Extensible Markup language (XML) is widely used for representing the semi-
structured data.
Semi structured model is an evolved form of the relational model.
The XML model has emerged as a standard for exchanging data over the Web.
 XML uses hierarchical tree structures.
 It combines database concepts with concepts from document representation
models.
 Data is represented as elements with the use of tags, data can be nested to
create complex hierarchical structures.
XML
• <student 1>
<Roll No.>....................</Roll No.>
<Name>....................</Name>
<Class>....................</Class>
<Age>....................</Age>
</student 1>

<student 2>
<Name>....................</Name>
<Class>....................</Class>
<Age>....................</Age>
</student 2>
2. Based on Architecture
1. Centralised Database
• A database that is stored, located and maintained at a single location this
centralized database is accessed via an internet connection.
Advantages
• it is less costly.
• it is easy to manage and update.
• very minimal data redundancy because all the
data is stored at a single place.
Centralised Database
Disadvantages
• Performance issues( multiple users access the database at the same time).
• Due to single failure data will be lost and user cant access the database
• Highly dependent on network.
2. Distributed Database
• It is an integrated collection of database that is physically distributed across
sites in a computer network.
• that appears to users as a single database.
Advantage
• If one database fails then users can
access the other database files.
• This databases is more secure as
compared to centralised database.
Distributed Database
• More reliable -Due to single failure entire data will not lost.
• Data traffic is less.
• Performance- No bottlenecks as the load is spread over multiple server.
Disadvantages
• Data integrity concerns- need to update data in multiple sites.
• Time for synchronization between sites.
• This database is costly.
• It is difficult to maintain and update due to its complexity.
Database Languages
Database Languages
A database system provides:
data-definition language (DDL) to specify the database schema.
data-manipulation language (DML) to express database queries and updates.
data-control language (DCL) : it controls the access to the data.

DDL: Create, alter, drop, rename


DML: insert, update, delete, select
DCL: revoke, grant
Data Definition Language (DDL)
We specify a database schema by a set of definitions expressed by a special
language called a data-definition language (DDL).
Specification notation for defining the database schema.
Data values stored in the database must satisfy certain consistency constraints.
For example, suppose the university requires that the account balance of a
department must never be negative. The DDL provides facilities to specify such
constraints. The database system checks these constraints every time the
database is updated.
 Domain Constraints: A domain of possible values must be associated with every
attribute (for example, integer types, character types, date/time types). Declaring
an attribute to be of a particular domain acts as a constraint on the values that it
can take. Domain constraints are the most elementary form of integrity constraint.
They are tested easily by the system whenever a new data item is entered into the
database.
Data Definition Language (DDL)
Referential Integrity: There are cases where we wish to ensure that a
value that appears in one relation for a given set of attributes also
appears in a certain set of attributes in another relation (referential
integrity). For example, the department listed for each course must be
one that actually exists in the university. More precisely, the dept_name
value in a course record must appear in the dept_name attribute of some
record of the department relation.
Authorization: We may want to differentiate among the users as far as
the type of access they are permitted on various data values in the
database. These differentiations are expressed in terms of authorization:
read authorization, which allows reading, but not modification of data;
insert authorization, which allows insertion of new data, but not
modification of existing data; update authorization, which allows
modification, but not deletion of data; and delete authorization, which
allows deletion of data. We may assign the user all, none, or a
combination of these types of authorization.
Data Definition Language (DDL)
The output of the DDL is placed in the data dictionary, which contains
metadata—that is, data about data.
DDL compiler generates a set of table templates stored in a data dictionary.
Example:
create table instructor (
ID char(5),
name varchar(20),
dept_name varchar(20),
salary double(8,2));
Data dictionary contains metadata (i.e., data about data)
 Database schema
 Integrity constraints
• Primary key (ID uniquely identifies instructors)
 Authorization
• Who can access what
Data Manipulation Language

(DML)
Language for accessing and updating the data organized by the
appropriate data model. The types of access are:
 Retrieval of information stored in the database
 Insertion of new information into the database
 Deletion of information from the database
 Modification of information stored in the database.
There are basically two types of data-manipulation language:
 Procedural DMLs - require a user to specify what data are needed and how
to get those data.
• Declarative DMLs (Nonprocedural DMLs) - require a user to specify what
data are needed without specifying how to get those data.
Declarative DMLs are usually easier to learn and use than are
procedural DMLs.
Data Manipulation Language
(DML)
A query is a statement requesting the retrieval of information.
The portion of a DML that involves information retrieval is called a
query language.
SQL
SQL is nonprocedural. A query takes as input one or several tables
and always returns a single table.
Example to find all instructors in Comp. Sci. dept:
select name
from instructor
where dept_name = 'Comp. Sci.’;
The query specifies that those rows from the table instructor where
the dept_name is Comp. Sci. must be retrieved, and the name
attribute of these rows must be displayed. The result of executing this
query is a table with a single column labeled name and a set of rows,
each of which contains the name of an instructor whose dept_name is
Comp. Sci.
SQL
Queries may involve information from more than one table.
The following query finds the instructor ID and department name of
all instructors associated with a department with a budget of more
than Rs. 95,000.
select instructor.ID, department.dept_name
from instructor, department
where instructor.dept_name= department.dept_name and
department.budget > 95000;
SQL is NOT a Turing machine equivalent language.
To be able to compute complex functions SQL is usually embedded in
some higher-level language.

You might also like