RDBMS concepts
Concepts
● The relational model represents the database as a collection of relations.
Informally, each relation resembles a table of values.
● Each row in the table represents a collection of related data values
● In the relational model, each row in the table represents a fact that typically
corresponds to a real-world entity or relationship.
● The table name and column names are used to help in interpreting the
meaning of the values in each row.
● Some popular Relational Database management systems are:
○ DB2 and Informix Dynamic Server - IBM
○ Oracle and RDB – Oracle
○ SQL Server and Access - Microsoft
● In the formal relational model terminology, a row is called a tuple, a column
header is called an attribute, and the table is called a relation.
● The data type describing the types of values that can appear in each column
is represented by a domain of possible values
Concepts
● Attribute: Each column in a Table. Attributes are the properties which define
a relation. e.g., Student_Rollno, NAME,etc.
● Tables – In the Relational model the, relations are saved in the table format.
It is stored along with its entities. A table has two properties rows and
columns. Rows represent records and columns represent attributes.
● Tuple – It is nothing but a single row of a table, which contains a single
record.
● Relation Schema: A relation schema represents the name of the relation with
its attributes.
● Degree: The total number of attributes which in the relation is called the
degree of the relation.
● Cardinality: Total number of rows present in the Table.
● Column: The column represents the set of values for a specific attribute.
● Relation instance – Relation instance is a finite set of tuples in the RDBMS
system. Relation instances never have duplicate tuples.
● Relation key - Every row has one, two or multiple attributes, which is called
relation key.
● Attribute domain – Every attribute has some pre-defined value and scope
which is known as attribute domain
● A relation schema/ R, denoted by R(A1, A2, ... , An) is made up of a relation
name R and a list of attributes A1, A2, ..., An .
● Each attribute Ai is the name of a role played by some domain D in the
relation schema R.
● D is called the domain of Ai and is denoted by dom(Ai).
● A relation schema is used to describe a relation; R is called the name of this
relation.
● The degree (or arity) of a relation is the number of attributes n of its relation
schema.
An example of a relation schema for a relation of degree seven, which describes
university students, is the following:
STUDENT(Name, SSN, HomePhone, Address, OfficePhone, Age, GPA)
Using the data type of each attribute, the definition is sometimes written as:
STUDENT(Name: string, SSN: string, HomePhone: string, Address: string,
OfficePhone: string, Age: integer, GPA: real)
For this relation schema, STUDENT is the name of the relation, which has seven
attributes. In the above definition, we showed assignment of generic types such
as string or integer to the attributes.
Dr Edgar F Codd
● Dr E.F.Codd, also known to the world as the ‘Father of Database
Management Systems’ had propounded 12 rules which are in-fact 13 in
number.
● The rules are numbered from zero to twelve.
● According to him, a DBMS is fully relational if it abides by all his twelve rules.
● Till now, only few databases abide by all the eleven rules.
● His twelve rules are fondly called ‘E.F.Codd’s Twelve Commandments’.
● His brilliant and seminal research paper ‘A Relational Model of Data for
Large Shared Data Banks’ in its entirety is a visual treat to eyes.
● If a management system or software follows any of 5-6 rules proposed by E.
F.Codd, it qualifies to be a Database Management System (DBMS).
● If a management system or software follows any of 7-9 rules proposed by E.
F.Codd, it qualifies to be a semi-Relational Database Management System
(semi- RDBMS).
● If a management system or software follows 9-12 rules proposed by E.F.
Codd, it qualifies to be a complete Relational Database Management System
(RDBMS).
Rule 0 − Foundation rule
Any relational database management system that is propounded to be RDBMS or
advocated to be a RDBMS should be able to manage the stored data in its
entirety through its relational capabilities.
Rule 1 − Rule of Information
Relational Databases should store the data in the form of relations. Tables are
relations in Relational Database Management Systems. Be it any user defined
data or meta-data, it is important to store the value as an entity in the table cells.
Rule 2 − Rule of Guaranteed Access
The use of pointers to access data logically is strictly forbidden. Every data entity
which is atomic in nature should be accessed logically by using a right
combination of the name of table, primary key represented by a specific row
value and column name represented by attribute value.
Each unique piece of data(atomic value) should be accesible by : Table Name +
Primary Key(Row) + Attribute(column).
NOTE: Ability to directly access via POINTER is a violation of this rule.
Rule 3 − Rule of Systematic Null Value Support
Null values are completely supported in relational databases. They should be
uniformly considered as ‘missing information’. Null values are independent of any
data type. They should not be mistaken for blanks or zeroes or empty strings.
Null values can also be interpreted as ‘inapplicable data’ or ‘unknown information.’
Rule 4 − Rule of Active and online relational Catalog
In the Database Management Systems lexicon, ‘metadata’ is the data about the
database or the data about the data. The active online catalog that stores the
metadata is called ‘Data dictionary’. The so called data dictionary is accessible
only by authored users who have the required privileges and the query languages
used for accessing the database should be used for accessing the data of data
dictionary.
Rule 5 − Rule of Comprehensive Data Sub-language
A single robust language should be able to define integrity constraints, views,
data manipulations, transactions and authorizations. Example: SQL, etc. If the
database allows access to the data without the use of this language, then that is
a violation.
Rule 6 − Rule of Updating Views
Views should reflect the updates of their respective base tables and vice versa. A
view is a logical table which shows restricted data. Views generally make the data
readable but not modifiable. Views help in data abstraction.
Rule 7 − Rule of Set level insertion, update and deletion
A single operation should be sufficient to retrieve, insert, update and delete the
data.
Rule 8 − Rule of Physical Data Independence
Batch and end user operations are logically separated from physical storage and
respective access methods.
The physical storage of data should not matter to the system. If say, some file
supporting table is renamed or moved from one disk to another, it should not
effect the application.
Rule 9 − Rule of Logical Data Independence
Batch and end users can change the database schema without having to
recreate it or recreate the applications built upon it.
If there is change in the logical structure(table structures) of the database the
user view of data should not change. Say, if a table is split into two tables, a new
view should give result as the join of the two tables. This rule is most difficult to
satisfy.
Rule 10 − Rule of Integrity Independence
Integrity constraints should be available and stored as metadata in data dictionary
and not in the application programs.
The database should be able to enforce its own integrity rather than using other
programs. Key and Check constraints, trigger etc, should be stored in Data
Dictionary. This also make RDBMS independent of front-end.
Rule 11 − Rule of Distribution Independence
The Data Manipulation Language of the relational system should not be
concerned about the physical data storage and no alterations should be required
if the physical data is centralized or distributed.
A database should work properly regardless of its distribution across a network.
Even if a database is geographically distributed, with data stored in pieces, the
end user should get an impression that it is stored at the same place. This lays
the foundation of distributed database.
Rule 12 − Rule of Non Subversion
Any row should obey the security and integrity constraints imposed. No special
privileges are applicable.
If low level access is allowed to a system it should not be able to subvert or
bypass integrity rules to change the data. This can be achieved by some sort of
looking or encryption.
● Almost all full scale DBMSs are RDMSs.
● Oracle implements 11+ rules and so does Sybase.
● SQL Server also implements 11+ rules while FoxPro implements 7+ rules.
Domain
● A domain D is a set of atomic values. By atomic we mean that each value in
the domain is indivisible as far as the relational model is concerned.
● A common method of specifying a domain is to specify a data type from
which the data values forming the domain are drawn.
● It is also useful to specify a name for the domain, to help in interpreting its
values.
● Some examples of domains follow:
○ USA_phone_numbers: The set of ten-digit phone numbers valid in the United States.
○ Local_phone_numbers: The set of seven-digit phone numbers valid within a particular area
code in the United States.
○ Social_securiry_numbers: The set of valid nine-digit social security numbers.
○ Names: The set of character strings that represent names of persons.
○ Grade_paint_averages: Possible values of computed grade point averages; each must be a
real (floating-point) number between 0 and 4.
○ Employee_ages: Possible ages of employees of a company; each must be a value between
15 and 80 years old.
○ Academic_department_names: The set of academic department names in a university, such
as Computer Science, Economics, and Physics.
○ Academic_departmenccodes: The set of academic department codes, such as CS, ECON,
and PHYS.
● Above examples are called logical definitions of domains.
● A data type or format is also specified for each domain.
● For example, the data type for the domain uSA_phone_numbers can be
declared as a character string of the form (ddd)ddd-dddd, where each d is a
numeric (decimal) digit and the first three digits form a valid telephone area
code.
● The data type for Employee_ages is an integer number between 15 and 80.
● For Academic_department_names, the data type is the set of all character
strings that represent valid department names.
● A domain is thus given a name, data type, and format.
● Additional information for interpreting the values of a domain can also be
given; for example, a numeric domain such as Person_weights should have
the units of measurement, such as pounds or kilograms.