BCA Notes

Very important for all students of CS.

Uploaded by

MS Mourya
UNIT – I

Introduction to Database System

Er. Rahul Mishra Page 1


INTRODUCTION
A database is a collection of related data. It contains information about one
particular enterprise. Some examples of enterprises and their databases are:

Bank - stores customer data
Hospital - stores patient data
University - stores student data

A DBMS (Database Management System) is a collection of programs, that is,
software that enables users to create and maintain a database. A DBMS also
allows users to insert, update and retrieve the data in the database.

Some examples of DBMS are:

MS-Access
dBASE
FoxPro
Oracle etc.

APPLICATIONS OF DBMS
Some of the application areas of DBMS are listed below:

Banking
Airlines
Universities
Credit card transactions
Telecommunication
Sales
Manufacturing etc.

NEED FOR DBMS


Before DBMS came into existence, the file processing system was used to handle
the data of various organizations. A file system needs a set of application
programs to add information to files, to extract information from files and to
update the existing information. The file system has several problems, which
are explained as follows:

Data Redundancy and Inconsistency

In a file system, the same information is stored in multiple places. This
duplication of data is called data redundancy. Because of redundancy, the file
system suffers from data inconsistency during updates. Inconsistency means that
the value of an attribute is different in different places.

Difficulty in Accessing Data

A conventional file processing system does not allow the needed data to be
retrieved in a convenient and efficient manner. For example, consider a data
file named saving-account with the fields acc-no, name, address and balance,
together with the application programs written to access it. If the user wants
to display only those records for which the balance is greater than Rs. 10,000
and no program has been written for that request, it is very difficult to
access that data.
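The situation above can be sketched in C. This is only an illustration under assumptions: the record layout `struct saving_account` and the function `filter_by_balance` are hypothetical names, and the field sizes are guesses. The point is that in a file processing system someone has to write a routine like this for each new kind of request.

```c
#include <stddef.h>

/* Hypothetical in-memory form of one saving-account record. */
struct saving_account {
    int    acc_no;
    char   name[20];
    char   address[40];
    double balance;
};

/* Copies every record whose balance exceeds min_balance into out
   (out must have room for n records) and returns how many were copied. */
size_t filter_by_balance(const struct saving_account *recs, size_t n,
                         double min_balance, struct saving_account *out)
{
    size_t count = 0;
    for (size_t i = 0; i < n; i++) {
        if (recs[i].balance > min_balance) {
            out[count++] = recs[i];
        }
    }
    return count;
}
```

Calling `filter_by_balance(recs, n, 10000.0, out)` answers this one request. A request on a different field, such as address, would need another hand-written routine, whereas a DBMS answers such requests through its query language.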

Data Isolation

Because data are scattered in various files, and the files may be in different
formats, it is difficult to write new application programs to retrieve the
appropriate data.

Difficulty in Enforcing Integrity Constraints

The data values stored in the database must satisfy certain consistency
constraints. A constraint is a restriction that we want to impose on some data
values. In a file processing system, these constraints are enforced by adding
appropriate code to the various application programs. However, when a new
constraint is to be added, it is difficult to change all the programs to
enforce it.

Atomicity Problem

A computer system can fail at any time. In many applications, it must be
ensured that once a failure has occurred and has been detected, the data are
restored to the consistent state that existed prior to the failure. It is very
difficult to ensure this property in a conventional file processing system.

For example, consider a program to transfer Rs. 600 from account X to account
Y. If a failure occurs after removing Rs. 600 from account X but before adding
Rs. 600 to account Y, the data is left in an inconsistent state. The transfer
must be atomic, that is, it must happen in its entirety or not at all.
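The all-or-nothing behavior of the transfer can be sketched in C as follows. This is a minimal illustration, not how a real DBMS recovers: `struct account`, `transfer` and the `fail_midway` flag (which simulates a failure between the two steps) are assumptions made for the example, and a real system would normally rely on a log kept on stable storage rather than on saved local variables.

```c
#include <stdbool.h>

struct account {
    long balance; /* amount in rupees */
};

/* Moves amount from x to y. fail_midway simulates a failure between the
   debit and the credit. Returns true if the transfer completed; on
   failure, both balances are restored to their values before the call. */
bool transfer(struct account *x, struct account *y, long amount,
              bool fail_midway)
{
    /* Remember the state that existed before the transfer started. */
    const long old_x = x->balance;
    const long old_y = y->balance;

    if (amount <= 0 || x->balance < amount) {
        return false;           /* nothing has been changed yet */
    }

    x->balance -= amount;       /* step 1: debit X */

    if (fail_midway) {
        /* Recovery: undo the partial update. */
        x->balance = old_x;
        y->balance = old_y;
        return false;
    }

    y->balance += amount;       /* step 2: credit Y */
    return true;
}
```

With X holding Rs. 1000 and Y holding Rs. 200, a failed transfer of Rs. 600 leaves both balances unchanged, while a successful one leaves X with Rs. 400 and Y with Rs. 800.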

Difficulty in Concurrency Control

In a file processing system, data is not centralized. Sometimes two or more
users want to access the data at the same time, but it is very difficult to
build concurrency control into the application programs.

Security Problems

Since application programs are added to the system in an ad hoc manner and the
information does not have a centralized access path, it is difficult to enforce
security constraints.

ADVANTAGES OF DBMS
DBMS was developed to overcome the limitations of the file processing system.
The following are some of the advantages of DBMS:

Controlling Redundancy

In file processing systems, every user group maintains its own files for
handling data. For example, in a university, two groups of users might be the
course-registration office and the accounting office. The accounting office
keeps data on registration and related billing information, whereas the
course-registration office keeps data on student courses and grades. Much of
the data is stored twice, once in each group's files. This redundancy leads to
several problems. Firstly, storage space is wasted because data is duplicated.
Another problem is inconsistency, which may arise when an update is applied to
some of the files but not to others. For example, suppose a student's address
changes and the student informs the course-registration office but not the
accounting office. The address is updated by the course-registration office but
not by the accounting office, so the value of the address is different in two
places. This results in data inconsistency. For consistency, we should have a
database design that stores each logical data item, such as name or birth date,
in only one place in the database. Such a design avoids this kind of
inconsistency and also saves storage space.

Restricting Unauthorized Access

When multiple users share a database, some users will not be authorized to
access all the information in the database. For example, financial data is
often confidential, and hence only authorized persons are allowed to access
such data. Also, some users may be permitted only to retrieve data, whereas
others are allowed both to retrieve and to update the data. For this purpose,
the DBA (database administrator) gives a password to each user. Thus, a user
can perform only those operations on the database for which the authority has
been given.

Centralized Control

The DBA (database administrator) is the person who has centralized control over
the database. Several of the drawbacks of the file processing system are
eliminated in DBMS because of this centralized control.

Backup and Recovery

Hardware and software can fail at any time, so there should be some mechanism
for recovering from such failures. The backup and recovery subsystem of the
DBMS is responsible for recovery from such failures. For example, if the
computer system fails in the middle of a complex update program, the recovery
subsystem ensures that the database is restored to the state it was in before
the program started executing.

Enforcing Integrity Constraints

For some applications, integrity constraints must hold for the data values. For
example, the values in a column of numeric type must be integers between 1 and
7, or the value of name must be a string of no more than 20 characters. Another
constraint is that a data value should not be null. DBMS provides the means to
define and enforce all these constraints.
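The constraints mentioned above can be pictured as simple checks. The sketch below is written in C only for illustration: the function names, the column called rating and the constant `MAX_NAME_CHARS` are assumptions, and in a DBMS such rules are declared once with the schema instead of being coded in every application program.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MAX_NAME_CHARS 20   /* longest name allowed, in characters */

/* Range constraint: the value must be an integer between 1 and 7. */
bool rating_is_valid(int value)
{
    return value >= 1 && value <= 7;
}

/* Not-null and length constraints: the name must be present and must
   not be longer than MAX_NAME_CHARS characters. */
bool name_is_valid(const char *name)
{
    if (name == NULL) {
        return false;
    }
    return strlen(name) <= MAX_NAME_CHARS;
}
```

An insert or update would be accepted only if every check on the new values returns true.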

Providing Multiple User Interfaces

Many users of different levels use the database. DBMS provides a variety of
interfaces for these users. These include query languages for casual users,
programming language interface for application programmers etc.

Shared Data

DBMS allows multiple users to share the database, that is, more than one user
can use the same database.

Representing Complex Relationships among Data

DBMS can represent not only simple relationships among the data but also
complex relationships.

DISADVANTAGES OF DBMS
Along with several advantages, DBMS also has some disadvantages. However, these
disadvantages are negligible compared with the advantages offered by DBMS.

Some of the disadvantages of DBMS are as follows:

A number of problems are associated with centralized data.
Cost of hardware and software.
Complexity of the backup and recovery mechanism.

DBMS ARCHITECTURE
DBMS architecture is also called the three-level architecture. The goal of the
three-level architecture is to separate the user applications from the physical
database. The three-level architecture is shown in the figure.

A major purpose of DBMS is to provide users with an abstract view of data.
Many users of database systems are not computer-trained, hence the complexity
is hidden from them through several levels of abstraction.

Abstraction means hiding certain details of how the data is stored and
maintained. The levels of abstraction can be explained with the help of the
three-level architecture.

The following are the different levels of abstraction:

Physical Level

It is the lowest level of abstraction. It describes how the data is actually
stored and also describes the data structures and access methods used by the
database. The physical level has an internal schema, which describes the
physical storage structure of the database. The internal schema uses a physical
data model and describes the complete details of data storage and access paths
for the database.

Conceptual Level

It is the next higher level of abstraction. The conceptual level has a
conceptual schema, which describes the structure of the whole database for a
community of users. This schema hides the details of physical storage
structures and concentrates on describing entities, data types, relationships
and constraints.

View Level

It is the highest level of abstraction. The view level includes a number of
user views, each of which describes only a part of the entire database. Many
users are not concerned with all of the information and need only a part of
the database. The view level is defined to simplify their interaction with the
database. A high-level data model can be used at this level.

Between each pair of adjacent levels, there is a mapping, as shown in the
figure. The process of transforming requests and results between the different
levels is called mapping.

Example 1.1

To differentiate between the different levels, consider a record defined using
a structure in the C language as follows:

struct client
{
    char client_name[20];    /* name of the client */
    char client_street[20];  /* street address */
    char client_city[20];    /* city */
} clnt;



In this example, client is the name of a structure that has three fields. Each
field has a name and a data type. Now, let us compare the three levels with the
above example.

Physical Level

In a programming language, at the physical level, the client structure can be
described as a block of consecutive storage locations (60 bytes for the three
20-character fields above). The compiler hides this level of detail from
programmers. Similarly, the database system hides many of the lowest-level
storage details from database programmers. The database administrator may be
aware of certain details of the physical organization of data.

Conceptual Level

At the conceptual level, each structure is described by a type definition, and
the interrelationships among these structures are defined. Programmers using a
programming language work at this level of abstraction. Similarly, database
administrators usually work at this level of abstraction.

View Level

At view level, users can see the final results of their programs by giving
different inputs. Similarly, at view level, several views of the database are
defined and database users can get the required output.

INSTANCES AND SCHEMAS

A database is a collection of related data. A database changes over time as
information is inserted into and deleted from it. The collection of information
stored in the database at a particular moment is called an instance of the
database.

The overall design of the database is called the database schema.

Consider the following C program statements:

int a, b, c; // variable declaration

c = 5; // assigning 5 to variable c



A database schema corresponds to the variable declarations in a programming
language (i.e., int a, b, c). Each variable has a particular value at a given
instant. Thus, the values of the variables in a program at a given moment
correspond to an instance of the database schema (i.e., c = 5).

According to the levels of abstraction, a database system has several schemas,
which are as follows:

Physical Schema: The physical schema describes the database design at the
physical level.
Logical Schema: The logical schema describes the database design at the
logical level.
Subschema: A database may have several schemas at the view level. These
schemas are called subschemas, and they describe different views of the
database.

Once defined, the database schema rarely changes, whereas the instance may
change as a result of insertion, deletion and update commands.

DATABASE LANGUAGES
A database is a collection of inter-related data. To perform various operations
on the schema and the data, we need languages in which commands can be issued.
Database languages act as an interface between the database and the user. For
example, SQL (Structured Query Language) is a database language. SQL has a set
of commands. Some of these commands are used to specify the database schema,
whereas others are used to express database queries and updates. SQL is
classified into the following three parts:

Data definition language (DDL)
Data manipulation language (DML)
Data control language (DCL)



Data Definition Language

A database schema is specified by a set of definitions expressed in a special
language called the data definition language. DDL includes the commands that
are used to create, alter and drop the structure of the tables. In simple
words, we can say that DDL deals with the structure of the database.

Data Manipulation Language

DML is a language that enables the user to access or manipulate the data. The
data manipulation language includes the commands:

• To retrieve information from the database.
• To insert new information into the database.
• To delete information from the database.
• To modify the information stored in the database.

DML is classified into two types:

Procedural DML: It requires the user to specify what data are required and
also how to get those data.

Non-procedural DML: It requires the user to specify only what data are
required, without specifying how to get those data.

The DML component of the SQL is non-procedural.

Data Control Language

DCL includes the statements that control access to the data and the database.
A privilege is granted to a user with the GRANT statement. The privileges that
can be assigned include select, alter, delete, execute, insert and index.
Privileges are assigned to the users because we do not want every user to
perform all the operations on the database. In addition, we can revoke (take
back) privileges with the REVOKE statement.
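The effect of granting and revoking privileges can be modeled with a small C sketch. This is only a model of the idea, not the way any particular DBMS stores authorization data: the type `struct db_user`, the `PRIV_*` constants and the three functions are hypothetical names chosen for the example.

```c
#include <stdbool.h>

/* One bit per privilege; only a few of the privileges are modeled. */
enum privilege {
    PRIV_SELECT = 1 << 0,
    PRIV_INSERT = 1 << 1,
    PRIV_DELETE = 1 << 2,
    PRIV_ALTER  = 1 << 3
};

struct db_user {
    unsigned privileges;    /* set of privileges currently held */
};

/* Models GRANT: adds the given privileges to the user's set. */
void grant_privilege(struct db_user *u, unsigned privs)
{
    u->privileges |= privs;
}

/* Models REVOKE: takes the given privileges back. */
void revoke_privilege(struct db_user *u, unsigned privs)
{
    u->privileges &= ~privs;
}

/* Checks whether the user may perform the operation priv. */
bool is_allowed(const struct db_user *u, unsigned priv)
{
    return (u->privileges & priv) == priv;
}
```

A user who has been granted `PRIV_SELECT | PRIV_INSERT` and then has `PRIV_INSERT` revoked can still retrieve data but can no longer insert it.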



DATA INDEPENDENCE
The three-level architecture can be used to explain the concept of data
independence. Data independence is defined as the capability to change the
schema at one level of the database system without having to change the schema
at the next higher level. The types of data independence are:

Physical data independence: It is the capability to change the internal
schema without having to change the conceptual schema. A change to the
internal schema may be needed because some physical files have to be
reorganized.
Logical data independence: It is the capability to change the conceptual
schema without having to change the external schema. The conceptual schema
may be changed to expand the database or to reduce it.

Data independence is achieved when the schema at some level is changed but the
schema at the next higher level remains unchanged. Only the mapping between
the two levels is changed.

USERS OF DBMS
There are different types of the users who use the database in different ways
and they are explained as follows:

Database Administrator (DBA)

In any organization in which many persons use the same resources, there is a
need for a chief administrator to oversee and manage those resources. The DBA
is the person or group of persons who manages these resources. The DBA is
responsible for authorizing access to the database. In large organizations,
the DBA is assisted by a staff that helps in carrying out these functions.
Also, the centralized control of the database is exerted by the DBA.

Role of DBA: The main role of the DBA is explained as follows:

• Schema definition: The DBA creates the original database schema by
writing a set of definitions that is translated by the DDL compiler into a
set of tables that is stored permanently in the data dictionary.
• Storage structure and access method definition: The DBA creates the
appropriate storage structures and access methods by writing a set of
definitions that is translated by the data storage and definition language
compiler.
• Schema and physical organization modification: It is also the
responsibility of the DBA to modify the schema and the physical
organization of the database by writing a set of definitions that is used
either by the DDL compiler or by the data storage and definition language
compiler to generate the modifications to the database.
• Granting authorization for data access: DBMS allows multiple users to
share the data. Not all users are allowed to access all the data because
some data is confidential. So, it is the responsibility of the DBA to grant
authorization to different users to access the data. A user can access a
part of the database if and only if the DBA has granted the authority to do
so. The authorization information is kept in a special system structure.
• Integrity constraint specification: Data values stored in the database
must satisfy certain constraints. Such constraints must be specified
explicitly by the DBA. The integrity constraints are kept in a special
system structure.

Database Designer
Database designers are responsible for identifying the data to be stored in the
database and for choosing the appropriate structures to represent and store
that data. The database designers communicate with all the database users in
order to understand their requirements and to come up with a design that
meets these requirements. In many cases, the designers are on the staff of
the DBA.

End Users
End users are the people whose jobs require access to the database for
querying, updating and generating reports. The types of end users are:
• Casual end users: These are the users who occasionally access the
database but may need different information each time. They use a
query language to specify their requests.
• Naive or parametric end users: Their main job function revolves
around constantly querying and updating the database using standard
types of queries and updates, called canned transactions, that have
been carefully programmed and tested. Examples are bank tellers who
check account balances and post withdrawals and deposits, and
reservation clerks for airlines, hotels and railways who make
reservations and cancellations.
• Sophisticated end users: These include engineers, scientists and
analysts who use the database to implement applications that meet
their complex requirements.
• Stand-alone users: These users maintain personal databases by using
ready-made program packages. For example, users of a tax package.
• System Analysts and Application Programmers: System analysts
determine the requirements of end users, especially naive users, and
develop specifications for canned transactions that meet these
requirements. Application programmers implement these
specifications as programs, and they test, debug, document and
maintain these canned transactions.

STORAGE MANAGER RESPONSIBILITIES


The storage manager is also called the database manager. It is an important
component of the DBMS structure. It is a program module that provides the
interface between the low-level data stored in the database and the
application programs and queries submitted to the system.

The following are the responsibilities of the storage manager:

• Interaction with the file manager: The actual data is stored in the file
system. The database manager is responsible for the actual storing,
retrieving and updating of the data in the database.
• Integrity constraints enforcement: Consistency constraints are
specified by the DBA, but it is the responsibility of the database manager
to enforce, implement or check those constraints.
• Security enforcement: It is the responsibility of the database
manager to enforce the security requirements.
• Backup and recovery: It is the responsibility of the database manager to
detect system failures and restore the database to a consistent state.
• Concurrency control: Interaction among the concurrent users is
controlled by the database manager.

TYPES OF DATABASE SYSTEMS


DBMS can be classified on the basis of the number of users and the database
site locations. This categorization is shown in the following figure.

Types of Database

On the basis of number of users:
    Single-user DBMS
    Multi-user DBMS

On the basis of site location:
    Centralized system
    Parallel system
    Distributed system
    Client-server system


On The Basis of Number of Users

The following are the types of DBMS on the basis of the number of users:

• Single-user DBMS: In a single-user DBMS, the database resides on
one computer and is accessed by only one user at a time.
• Multi-user DBMS: In a multi-user system, multiple users access
the data from one central storage area so that the database remains in
an integrated form. A database is integrated when the same
information is not stored in two places. Most DBMSs are multi-user.

On The Basis of Site Location


The following are the types of DBMS on the basis of site location:

• Centralized System

In centralized systems, the database resides at a single central location. A
number of processors can access this central database. These processors are
connected to the central database via a computer network. The figure shows the
centralized database arrangement.

A railway reservation system is an example of a centralized system, where
the database is centrally located at New Delhi. A number of reservation
counters from all over the country access this database to make
reservations and cancellations. The main drawback of a centralized
system is that when the central site computer goes down, the users
are blocked from using the system until it comes back.

• Parallel System

Parallel systems improve processing and I/O speed by using multiple
CPUs and disks in parallel. In a parallel system, many operations are
executed simultaneously. There are two main measures of the performance of
a database system:
◦ Throughput: It is the number of tasks that can be completed in a given time
interval.
◦ Response time: It is the amount of time the system takes to complete a
single task from the time it is submitted to the system.
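The two measures can be written as simple formulas, shown here as C functions. The function and parameter names are assumptions made for this illustration, and times are taken to be in seconds.

```c
/* Throughput: number of tasks completed per second over an interval.
   Returns 0.0 if the interval is not positive. */
double throughput(unsigned long tasks_completed, double interval_seconds)
{
    if (interval_seconds <= 0.0) {
        return 0.0;
    }
    return (double)tasks_completed / interval_seconds;
}

/* Response time of a single task: completion time minus the time at
   which the task was submitted to the system. */
double response_time(double submitted_at, double completed_at)
{
    return completed_at - submitted_at;
}
```

For example, a system that completes 120 tasks in 60 seconds has a throughput of 2 tasks per second, and a task submitted at 3.0 s that completes at 5.5 s has a response time of 2.5 s.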

A system that processes large transactions can improve response time as
well as throughput by performing subtasks of each transaction in parallel.
There are three architectures for a parallel DBMS, as illustrated in the
figures given below.

• Shared memory: In this model, all the processors share a common
memory. The components communicate through an interconnection
network. A processor can send messages to another processor using
memory writes. This architecture provides high-speed data access for a
limited number of processors, but it is not scalable beyond about 64
processors because the interconnection network becomes a bottleneck:
the processors have to spend most of their time waiting for their turn
to access the memory.

[Figure: Shared memory architecture. Disk 1 to Disk 3 and CPU 1 to CPU 3
are connected through an interconnection network to a single shared memory.]

• Shared disk: In this model, all the processors share common disks.
Shared disk systems are sometimes called clusters. Here, all the
processors can access all the disks directly via an interconnection
network. Each processor has its own private memory.

[Figure: Shared disk architecture. CPU 1 to CPU 3, each with its own memory,
are connected through an interconnection network to Disk 1 to Disk 3.]

• Shared nothing: In this model, the processors share neither a common
memory nor a common disk. Each node of the machine consists of a
processor, memory and one or more disks. The processors communicate
through the interconnection network. Shared-nothing architecture is
more scalable than shared memory and can easily support a large
number of processors.

[Figure: Shared nothing architecture. Three nodes, each with its own CPU,
memory and disk, are connected through an interconnection network.]

• Distributed System

In a distributed database management system, the database is split into a
number of fragments. Each fragment is stored on one or more computers.
These computers are connected by a communication network. The
computers in a distributed system may vary in size and function, and they
are referred to by a number of different names, such as sites or nodes. The
structure of a distributed system is shown in the following figure:

In a distributed system, the database is geographically separated, is
administered separately and has slower interconnections. Users access the
distributed database via applications. Applications can be local or global.
Local applications do not require data from other sites, whereas global
applications require data from other sites.

A banking system is an example of a distributed system, where the database
system is implemented on a number of computer systems rather than on a
centralized computer. A network connecting the computers enables the
different branches to communicate with each other. Thus, a customer living
in one city can also check his account during a stay in another city.

The main advantage of a distributed system is that if one site fails, the
remaining sites may be able to continue operating. The failure of a site does
not necessarily imply the shutdown of the system. The drawbacks of this
system are software development cost and increased processing overhead.

• Client-Server System

A client-server system has two main components, namely the client and the
server. A client is a computer that requests a service from the server. A
server is a computer that provides the service to the client. All data resides
at the server site. All the applications execute at the client side. The client
and server computers are connected through a network. A general structure of
a client-server system is shown in the following figure:

A client requests a service from the server, and the server returns the result
to its client. The Internet is an example of client-server architecture.


Assignment – 1

1. What is a File Processing System? Explain its advantages.
2. What is DBMS? What are its advantages over the traditional file system?
3. Draw and explain the 3-Schema Architecture of DBMS.
4. What is a Data Model? Explain its types.
5. What is data independence? What are its types?
6. Explain the various users of DBMS.
7. What is a DBA? Explain the role of the DBA.
8. Explain the various types of Database Systems.