ENGISTAN.COM
DBMS Concepts For IBPS IT-Officer 2014
Contents
1. Basic Terms
2. Database Models
3. RDBMS
4. Database Keys
5. Database Users
6. Normalization
7. E-R Diagram
8. Generalization & Specialization
9. SQL Basics
10. Data Languages
11. SQL Queries
12. Transactions-ACID Properties
This document is prepared for the IBPS SO (IT-Officer) Examination 2014. The key concepts of DBMS are explained in a precise and lucid way to assist aspirants in their preparation. If you have any queries, doubts, or suggestions, please share them with us in our Forum. We wish you all the best.
TEAM Engistan
Engistan.com [DBMS CONCEPTS FOR IBPS IT-OFFICER 2014]
Data: Data is the quantities, characters, or symbols on which operations are performed by a computer.
Data (or) Information Processing: The process of converting raw facts into meaningful information is known as data processing. It is also known as information processing.
Meta Data: The term metadata refers to "data about data". Metadata is data that provides information about one or more aspects of other data, such as:
 - Means of creation of the data
 - Purpose of the data
 - Time and date of creation
 - Creator or author of the data
 - Location on a computer network where the data were created
 - Standards used
Database: A database is a structured collection of data, which is organized into files called tables. More precisely, it is a logically coherent collection of related data that (i) describes the entities and their inter-relationships, and (ii) is designed, built and populated for a specific purpose.
Database Model
A database model defines the logical design of data. The model describes the relationships between different parts of the data. In the history of database design, three models have been in use:
 - Hierarchical Model
 - Network Model
 - Relational Model
Hierarchical Model: In this model each entity has only one parent but can have several children. At the top of the hierarchy there is only one entity, which is called the Root.
Engistan.com | Engineers Community
Network Model: In the network model, entities are organised in a graph, in which some entities can be accessed through several paths.
Relational Model: In this model, data is organised in two-dimensional tables called relations. The tables (relations) are related to each other.
RDBMS Concepts
A Relational Database Management System (RDBMS) is a database management system based on the relational model introduced by E. F. Codd. In the relational model, data is represented in terms of tuples (rows). An RDBMS is used to manage a relational database. A relational database is an organized collection of tables from which data can be accessed easily. The relational database is the most commonly used type of database. It consists of a number of tables, and each table has its own primary key.
What is a Table? In a relational database, a table is a collection of data elements organised in terms of rows and columns. A table is also considered a convenient representation of a relation. However, a table can have duplicate tuples, while a true relation cannot. A table is the simplest form of data storage. Below is an example of an Employee table.

ID   Name     Age   Salary
1    Adam     34    13000
2    Alex     28    15000
3    Stuart   20    18000
4    Ross     42    19020
What is a Record? A single entry in a table is called a Record or Row. A record in a table represents a set of related data. For example, the above Employee table has 4 records. Following is an example of a single record.

1    Adam     34    13000

What is a Field? A table consists of several records (rows), and each record can be broken into several smaller entities known as fields. The above Employee table consists of four fields: ID, Name, Age and Salary.
What is a Column? In a relational table, a column is a set of values of a particular type. The term Attribute is also used to represent a column. For example, in the Employee table, Name is a column that represents the names of employees.

Name
Adam
Alex
Stuart
Ross
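The terms table, record, field and column can be tried out directly. The following is a minimal sketch using Python's built-in sqlite3 module with an in-memory database; the choice of SQLite and the column types are assumptions made for illustration, not something the notes prescribe.

```python
import sqlite3

# In-memory database; the table mirrors the Employee example in these notes.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Employee ("
    " ID INTEGER PRIMARY KEY,"
    " Name TEXT,"
    " Age INTEGER,"
    " Salary INTEGER)"
)
conn.executemany(
    "INSERT INTO Employee (ID, Name, Age, Salary) VALUES (?, ?, ?, ?)",
    [(1, "Adam", 34, 13000), (2, "Alex", 28, 15000),
     (3, "Stuart", 20, 18000), (4, "Ross", 42, 19020)],
)

# A record (row): all fields of one employee.
record = conn.execute("SELECT * FROM Employee WHERE ID = 1").fetchone()

# A column (attribute): one field taken across all records.
names = [row[0] for row in conn.execute("SELECT Name FROM Employee ORDER BY ID")]

# The field names of the table, read from the cursor description.
fields = [col[0] for col in conn.execute("SELECT * FROM Employee").description]

print(record)  # (1, 'Adam', 34, 13000)
print(names)   # ['Adam', 'Alex', 'Stuart', 'Ross']
print(fields)  # ['ID', 'Name', 'Age', 'Salary']
```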
Database Management System (DBMS):
A collection of programs that enables users to perform certain actions on a particular database:
 - define the structure of database information (descriptive attributes, data types, constraints, etc.), storing this as metadata
 - populate the database with appropriate information
 - manipulate the database (for retrieval/update/removal/insertion of information)
 - protect the database contents against accidental or deliberate corruption (involves secure access by users and automatic recovery in the case of user/hardware faults)
 - share the database among multiple users, possibly concurrently
Examples of DBMS are Oracle, Sybase, MySQL, DB2, SQL Server, Informix, MS-Access, FileMaker etc.
Sample Databases
Shown below is an extract from a (relational) database that might be part of a University's Academic Information System:
[Figure: extract from a university Academic Information System database]
Terminology:
 - relation = table (file)
 - attribute = column (field)
 - tuple = row (record)
Database Keys:
Keys are a very important part of a relational database. They are used to establish and identify relations between tables. They also ensure that each record within a table can be uniquely identified by a combination of one or more fields within the table.
 - Super Key: A super key is a set of attributes within a table that uniquely identifies each record within the table. Every candidate key is a super key; a candidate key is a minimal super key, i.e. one from which no attribute can be removed.
 - Candidate Key: Candidate keys are the keys from which the primary key can be selected. A candidate key is an attribute or set of attributes that can act as a primary key for a table to uniquely identify each record in that table.
 - Primary Key: The primary key is the candidate key that is most appropriate to become the main key of the table. It uniquely identifies each record in a table.
 - Foreign Key: A foreign key is generally a primary key from one table that appears as a field in another, where the first table has a relationship to the second. In other words, if table A has a primary key X, and X also appears as a field in a related table B, then X is a foreign key in B.
 - Composite Key: A key that consists of two or more attributes that together uniquely identify an entity occurrence is called a composite key. No single attribute of a composite key identifies the record on its own.
 - Secondary or Alternative Key: The candidate keys which are not selected as the primary key are known as secondary keys or alternative keys.
 - Non-key Attribute: Non-key attributes are the attributes of a table other than candidate key attributes.
 - Non-prime Attribute: Non-prime attributes are attributes other than prime attributes (a prime attribute is one that belongs to some candidate key).
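The difference between a super key and a candidate key can be checked mechanically on sample rows. The sketch below is only an illustration: the enrollment rows and the helper names is_superkey and candidate_keys are hypothetical, and a key is really a rule about all possible data, so sample rows can rule a key out but never prove one.

```python
from itertools import combinations

# Hypothetical enrollment rows, invented for this illustration.
rows = [
    {"student_id": 401, "subject_id": 10, "grade": "A"},
    {"student_id": 401, "subject_id": 11, "grade": "A"},
    {"student_id": 402, "subject_id": 10, "grade": "A"},
    {"student_id": 403, "subject_id": 11, "grade": "B"},
]

def is_superkey(rows, attrs):
    """True if no two rows share the same values for attrs."""
    seen = {tuple(row[a] for a in attrs) for row in rows}
    return len(seen) == len(rows)

def candidate_keys(rows):
    """Minimal superkeys, judged only from the rows given."""
    attributes = list(rows[0])
    keys = []
    for size in range(1, len(attributes) + 1):
        for combo in combinations(attributes, size):
            # A combination containing a smaller key is not minimal.
            if any(set(k) <= set(combo) for k in keys):
                continue
            if is_superkey(rows, combo):
                keys.append(combo)
    return keys

print(is_superkey(rows, ("student_id",)))                        # False
print(is_superkey(rows, ("student_id", "subject_id", "grade")))  # True
print(candidate_keys(rows))  # [('student_id', 'subject_id')]
```

Here (student_id, subject_id) is a composite candidate key, while the three-attribute set is a super key that is not minimal.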
Database Users:
 - Database Administrators (DBA): individual(s) who determine and implement policy regarding users, their permissions on a database, and the design and construction of that database.
 - Database Designers: individual(s), possibly also software engineers, who apply design techniques to produce database structures pertinent to a specific application.
 - End Users: people who, from time to time, access the contents of a database:
    - Casual end users may submit ad-hoc queries as the need arises, using a high-level query language.
    - Naive, or parametric, end users access the database through pre-written programs that provide an appropriate interface to the database.
    - Database programmers write code, using a relevant programming language and the high-level query language, that can later be used by parametric users.
Normalization
Normalization is a systematic approach of decomposing tables to eliminate data redundancy and undesirable characteristics like Insertion, Update and Deletion Anomalies. It is a two-step process that puts data into tabular form by removing duplicated data from the relation tables.
Normalization is used mainly for two purposes:
 - Eliminating redundant (duplicated) data.
 - Ensuring data dependencies make sense, i.e. data is logically stored.
Problem Without Normalization
Without normalization, it becomes difficult to handle and update the database without facing data loss. Insertion, Update and Deletion Anomalies are very frequent if the database is not normalized. To understand these anomalies, let us take the example of a Student table.

S_id   S_Name   S_Address   Subject_opted
401    Adam     Noida       Bio
402    Alex     Panipat     Maths
403    Stuart   Jammu       Maths
404    Adam     Noida       Physics

Update Anomaly: To update the address of a student who occurs twice or more in the table, we have to update the S_Address column in all of those rows, else the data will become inconsistent.
Insertion Anomaly: Suppose that for a new admission we have the student id (S_id), name and address of a student, but the student has not opted for any subject yet. We then have to insert NULL in Subject_opted, leading to an Insertion Anomaly.
Deletion Anomaly: If student 401 has only one subject and temporarily drops it, then when we delete that row the entire student record is deleted along with it.
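The insertion and deletion anomalies can be reproduced on the Student table above. This is a minimal sketch using Python's sqlite3 module; the new student (405, Ross, Delhi) is a made-up row added only for the demonstration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Student (S_id INTEGER, S_Name TEXT,"
    " S_Address TEXT, Subject_opted TEXT)"
)
conn.executemany(
    "INSERT INTO Student VALUES (?, ?, ?, ?)",
    [(401, "Adam", "Noida", "Bio"), (402, "Alex", "Panipat", "Maths"),
     (403, "Stuart", "Jammu", "Maths"), (404, "Adam", "Noida", "Physics")],
)

# Insertion anomaly: a student without a subject forces a NULL into the row.
conn.execute("INSERT INTO Student VALUES (405, 'Ross', 'Delhi', NULL)")
null_rows = conn.execute(
    "SELECT S_id FROM Student WHERE Subject_opted IS NULL"
).fetchall()

# Deletion anomaly: student 401 drops Bio, so the only row for 401 is deleted.
conn.execute("DELETE FROM Student WHERE S_id = 401 AND Subject_opted = 'Bio'")

# The name and address of 401 were stored only in that row, so they are gone too.
remaining = conn.execute(
    "SELECT S_Name, S_Address FROM Student WHERE S_id = 401"
).fetchall()

print(null_rows)  # [(405,)]
print(remaining)  # []
```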
Normalization Rule
Normalization rules are divided into the following normal forms.
1. First Normal Form
2. Second Normal Form
3. Third Normal Form
4. BCNF

1. First Normal Form (1NF): A row of data cannot contain a repeating group of data, i.e. each column must hold a single (atomic) value. Each row of data must have a unique identifier, i.e. a primary key. For example, consider the following Student table:

Student Table :
S_id   S_Name   subject
401    Adam     Biology
401    Adam     Physics
402    Alex     Maths
403    Stuart   Maths

You can see that the student name Adam is stored twice and the subject Maths is also repeated, and that S_id alone no longer identifies a row. Strictly speaking, values repeated across rows are redundancy rather than a 1NF violation (1NF is broken when one cell holds several values, such as both of Adam's subjects in a single row). To remove the repetition and give every row a unique identifier, break the table into two different tables.

New Student Table :
S_id   S_Name
401    Adam
402    Alex
403    Stuart

Subject Table :
subject_id   student_id   subject
10           401          Biology
11           401          Physics
12           402          Maths
12           403          Maths

In the Subject table, the concatenation of subject_id and student_id is the primary key. Now both the Student table and the Subject table are normalized to First Normal Form.
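The two-table design can be sketched with Python's sqlite3 module as follows. The column types are assumptions; the rows are the ones from the tables in this section. Joining the two tables gives back the original combined rows, and the composite primary key rejects a duplicate (subject_id, student_id) pair.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Student (S_id INTEGER PRIMARY KEY, S_Name TEXT);
    CREATE TABLE Subject (
        subject_id INTEGER,
        student_id INTEGER REFERENCES Student(S_id),
        subject    TEXT,
        PRIMARY KEY (subject_id, student_id)
    );
    INSERT INTO Student VALUES (401, 'Adam'), (402, 'Alex'), (403, 'Stuart');
    INSERT INTO Subject VALUES
        (10, 401, 'Biology'), (11, 401, 'Physics'),
        (12, 402, 'Maths'),   (12, 403, 'Maths');
""")

# The composite primary key refuses a second (12, 402) row.
try:
    conn.execute("INSERT INTO Subject VALUES (12, 402, 'Maths')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

# Joining the two tables reproduces the rows of the original table.
joined = conn.execute(
    "SELECT s.S_id, s.S_Name, j.subject"
    " FROM Student s JOIN Subject j ON j.student_id = s.S_id"
    " ORDER BY s.S_id, j.subject_id"
).fetchall()

print(duplicate_rejected)  # True
print(joined)
```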
2. Second Normal Form (2NF): A table in Second Normal Form must meet all the requirements of First Normal Form, and there must not be any partial dependency of any column on the primary key. This means that for a table with a concatenated primary key, each column that is not part of the primary key must depend upon the entire concatenated key for its existence. If any column depends only on one part of the concatenated key, then the table fails Second Normal Form. For example, consider a table which is not in Second Normal Form.

Customer Table :
customer_id   Customer_Name   Order_id   Order_name   Sale_detail
101           Adam            10         order1       sale1
101           Adam            11         order2       sale2
102           Alex            12         order3       sale3
103           Stuart          13         order4       sale4

In the Customer table, the concatenation of customer_id and Order_id is the primary key. This table is in First Normal Form but not in Second Normal Form, because there are partial dependencies of columns on the primary key. Customer_Name depends only on customer_id, Order_name depends only on Order_id, and there is no link between Sale_detail and Customer_Name. To reduce the Customer table to Second Normal Form, break it into the following three tables.

Customer_Detail Table :
customer_id   Customer_Name
101           Adam
102           Alex
103           Stuart

Order_Detail Table :
Order_id   Order_Name
10         Order1
11         Order2
12         Order3
13         Order4

Sale_Detail Table :
customer_id   Order_id   Sale_detail
101           10         sale1
101           11         sale2
102           12         sale3
103           13         sale4

Now all three tables comply with Second Normal Form.
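A partial dependency is a functional dependency on only part of the key, and a dependency can be tested against sample rows. The helper determines below is a hypothetical name introduced for this sketch. Note that sample rows can only refute a dependency; whether it holds in general is a design decision.

```python
def determines(rows, lhs, rhs):
    """True if rows that agree on lhs always agree on rhs."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        # setdefault stores the first value seen for this key.
        if seen.setdefault(key, value) != value:
            return False
    return True

# Rows of the Customer table from this section.
columns = ["customer_id", "Customer_Name", "Order_id", "Order_name", "Sale_detail"]
data = [
    (101, "Adam", 10, "order1", "sale1"),
    (101, "Adam", 11, "order2", "sale2"),
    (102, "Alex", 12, "order3", "sale3"),
    (103, "Stuart", 13, "order4", "sale4"),
]
customer = [dict(zip(columns, row)) for row in data]

# Partial dependencies: each column depends on only one part of the key.
name_on_customer = determines(customer, ["customer_id"], ["Customer_Name"])
order_on_order = determines(customer, ["Order_id"], ["Order_name"])

# Sale_detail is not fixed by customer_id alone (101 has sale1 and sale2).
sale_on_customer = determines(customer, ["customer_id"], ["Sale_detail"])

print(name_on_customer, order_on_order, sale_on_customer)  # True True False
```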
3. Third Normal Form (3NF): Third Normal Form requires that every non-prime attribute of a table depends directly on the primary key, so any transitive functional dependency must be removed from the table. The table must also be in Second Normal Form. For example, consider a table with the following fields.

Student_Detail Table :
Student_id   Student_name   DOB   Street   city   State   Zip

In this table Student_id is the primary key, but Street, city and State depend upon Zip. Since Zip itself depends on Student_id, these fields depend on the primary key only through Zip; this is called a transitive dependency. Hence, to apply 3NF, we need to move Street, city and State to a new table, with Zip as its primary key.

New Student_Detail Table :
Student_id   Student_name   DOB   Zip

Address Table :
Zip   Street   city   State

The advantages of removing the transitive dependency are:
 - The amount of data duplication is reduced.
 - Data integrity is achieved.
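The 3NF decomposition can be sketched in plain Python. All rows below are hypothetical values invented for the illustration; only the column layout comes from the notes. Rejoining the two tables on Zip rebuilds the original rows, while the address of a shared Zip is stored only once.

```python
# Hypothetical rows: (Student_id, Student_name, DOB, Street, city, State, Zip)
students = [
    (1, "Adam", "2000-01-01", "MG Road", "Noida", "UP", "201301"),
    (2, "Alex", "2000-02-02", "MG Road", "Noida", "UP", "201301"),
    (3, "Stuart", "2000-03-03", "Canal Road", "Jammu", "JK", "180001"),
]

# New Student_Detail: attributes that depend directly on Student_id, plus Zip.
student_detail = [
    (sid, name, dob, zip_code)
    for sid, name, dob, _street, _city, _state, zip_code in students
]

# Address: one entry per Zip, which is the key of the new table.
address = {
    zip_code: (street, city, state)
    for _sid, _name, _dob, street, city, state, zip_code in students
}

# Rejoin on Zip to confirm that no information was lost.
rebuilt = [
    (sid, name, dob, *address[zip_code], zip_code)
    for sid, name, dob, zip_code in student_detail
]

print(len(address))         # 2 address rows instead of 3 repeated copies
print(rebuilt == students)  # True
```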
4. Boyce and Codd Normal Form (BCNF): Boyce and Codd Normal Form is a stricter version of Third Normal Form. It deals with a certain type of anomaly that is not handled by 3NF. A table is in BCNF if, for every non-trivial functional dependency X -> Y, X is a super key. A 3NF table which does not have multiple overlapping candidate keys is guaranteed to be in BCNF.
E-R Diagram
An E-R Diagram is a visual representation of data that describes how entities in the data are related to each other.
Symbols and Notations
Components of E-R Diagram
The E-R diagram has three main components.
1) Entity
An entity can be any object, place, person or class. In an E-R diagram, an entity is represented using a rectangle. Consider the example of an organisation: Employee, Manager, Department, Product and many more can be taken as entities of an organisation.
Weak Entity
A weak entity is an entity that depends on another entity. A weak entity doesn't have a key attribute of its own. A double rectangle represents a weak entity.
2) Attribute
An attribute describes a property or characteristic of an entity. For example, Name, Age, Address etc. can be attributes of a Student. An attribute is represented using an ellipse.
Key Attribute
A key attribute represents the main characteristic of an entity. It is used to represent the primary key. An ellipse with the attribute name underlined represents a key attribute.
Composite Attribute
An attribute can itself be made up of other attributes. Such an attribute is known as a composite attribute.
3) Relationship
A relationship describes relations between entities. A relationship is represented using a diamond.
There are three types of relationship that exist between entities:
 - Binary Relationship
 - Recursive Relationship
 - Ternary Relationship
Binary Relationship
A binary relationship is a relation between two entities. It is further divided into three types.
1. One to One: This type of relationship is rarely seen in the real world.
The above example describes that one student can enroll for only one course, and a course will also have only one student. This is not what you will usually see in a relationship.
2. One to Many: It reflects the business rule that one entity is associated with many instances of another entity. For example, a Student enrolls for only one Course, but a Course can have many Students.
The arrows in the diagram describe that one student can enroll for only one course.
3. Many to Many:
The above diagram represents that many students can enroll for more than one course.
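When an E-R design is turned into tables, a many-to-many relationship is commonly stored in a third table that holds one row per (student, course) pair. The sketch below uses Python's sqlite3 module; the table name Enrolls, the column names and the sample rows are all assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Student (student_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE Course  (course_id  INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE Enrolls (
        student_id INTEGER REFERENCES Student(student_id),
        course_id  INTEGER REFERENCES Course(course_id),
        PRIMARY KEY (student_id, course_id)
    );
    INSERT INTO Student VALUES (1, 'Adam'), (2, 'Alex');
    INSERT INTO Course  VALUES (10, 'DBMS'), (11, 'Networks');
    INSERT INTO Enrolls VALUES (1, 10), (1, 11), (2, 10);
""")

# One student can be enrolled in many courses ...
courses_of_adam = [r[0] for r in conn.execute(
    "SELECT c.title FROM Enrolls e JOIN Course c ON c.course_id = e.course_id"
    " WHERE e.student_id = 1 ORDER BY c.course_id")]

# ... and one course can have many students.
students_in_dbms = [r[0] for r in conn.execute(
    "SELECT s.name FROM Enrolls e JOIN Student s ON s.student_id = e.student_id"
    " WHERE e.course_id = 10 ORDER BY s.student_id")]

print(courses_of_adam)   # ['DBMS', 'Networks']
print(students_in_dbms)  # ['Adam', 'Alex']
```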
Recursive Relationship
When an entity is related to itself, it is known as a recursive relationship.
Ternary Relationship
A relationship of degree three is called a ternary relationship.
Generalization and Specialization
Generalization: Generalization is a bottom-up approach in which two lower-level entities combine to form a higher-level entity. In generalization, the higher-level entity can also combine with other lower-level entities to make a further higher-level entity.
Specialization: Specialization is the opposite of generalization. It is a top-down approach in which one higher-level entity can be broken down into two lower-level entities. In specialization, some higher-level entities may not have lower-level entity sets at all.
Aggregation: Aggregation is a process in which the relation between two entities is treated as a single entity. Here the relation between Center and Course acts as an entity in relation with Visitor.
SQL Basics
Introduction to SQL
Structured Query Language (SQL) is a language used for storing and managing data in an RDBMS. SQL was the first commercial language introduced for E. F. Codd's relational model. Today almost all RDBMS (MySQL, Oracle, Informix, Sybase, MS Access) use SQL as the standard database language. SQL is used to perform all types of data operations in an RDBMS.
SQL Command
SQL defines the following data languages to manipulate the data of an RDBMS.
DDL : Data Definition Language
In many RDBMS (for example Oracle and MySQL), DDL commands are auto-committed. That means all the changes are saved permanently in the database.

Command    Description
create     to create a new table or database
alter      for alteration
truncate   to delete all data from a table
drop       to drop a table
rename     to rename a table
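The DDL commands can be tried with Python's sqlite3 module, as in the minimal sketch below. SQLite is used here only because it ships with Python; it has no TRUNCATE command, so a DELETE without a WHERE clause stands in for it, and renaming is done through ALTER TABLE ... RENAME TO. The helper table_names is a hypothetical name introduced for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

def table_names():
    """Names of the user tables currently in the database."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")
    return [r[0] for r in rows]

# create: a new table
conn.execute("CREATE TABLE Employee (ID INTEGER PRIMARY KEY, Name TEXT)")
# alter: add a column to the existing table
conn.execute("ALTER TABLE Employee ADD COLUMN Age INTEGER")
# rename: give the table a new name
conn.execute("ALTER TABLE Employee RENAME TO Staff")
after_rename = table_names()
columns = [r[1] for r in conn.execute("PRAGMA table_info(Staff)")]

# truncate: SQLite has no TRUNCATE, so all rows are removed with DELETE.
conn.execute("INSERT INTO Staff VALUES (1, 'Adam', 34)")
conn.execute("DELETE FROM Staff")
row_count = conn.execute("SELECT COUNT(*) FROM Staff").fetchone()[0]

# drop: remove the table itself
conn.execute("DROP TABLE Staff")
after_drop = table_names()

print(after_rename, columns, row_count, after_drop)
# ['Staff'] ['ID', 'Name', 'Age'] 0 []
```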
DML : Data Manipulation Language
Within a transaction, DML commands are not auto-committed. This means the changes are not permanent in the database until they are committed, and they can be rolled back.

Command    Description
insert     to insert a new row
update     to update an existing row
delete     to delete a row
merge      to merge two rows or two tables

TCL : Transaction Control Language
These commands keep a check on other commands and their effect on the database. They can annul changes made by other commands by rolling the database back to its original state. They can also make changes permanent.

Command     Description
commit      to permanently save
rollback    to undo changes
savepoint   to save temporarily
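How DML and TCL commands work together can be sketched with Python's sqlite3 module. Passing isolation_level=None is a choice made for this example so that the BEGIN, COMMIT, SAVEPOINT and ROLLBACK statements are issued explicitly; the savepoint name before_delete and the helper salaries are hypothetical.

```python
import sqlite3

# isolation_level=None leaves transaction control to the explicit statements.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute(
    "CREATE TABLE Employee (ID INTEGER PRIMARY KEY, Name TEXT, Salary INTEGER)")

def salaries():
    """Current Name -> Salary mapping."""
    return dict(conn.execute("SELECT Name, Salary FROM Employee"))

conn.execute("BEGIN")
conn.execute("INSERT INTO Employee VALUES (1, 'Adam', 13000)")   # DML: insert
conn.execute("COMMIT")                                            # TCL: make permanent

conn.execute("BEGIN")
conn.execute("UPDATE Employee SET Salary = 14000 WHERE ID = 1")  # DML: update
conn.execute("SAVEPOINT before_delete")                           # TCL: mark a point
conn.execute("DELETE FROM Employee WHERE ID = 1")                 # DML: delete
conn.execute("ROLLBACK TO before_delete")                         # undo only the delete
after_savepoint = salaries()

conn.execute("ROLLBACK")                                          # undo the whole transaction
after_rollback = salaries()

print(after_savepoint)  # {'Adam': 14000}
print(after_rollback)   # {'Adam': 13000}
```

The committed insert survives both rollbacks, while the uncommitted update and delete are undone.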
DCL : Data Control Language
Data control language provides commands to grant and take back authority.

Command    Description
grant      to grant a permission or right
revoke     to take back a permission

DQL : Data Query Language

Command    Description
select     to retrieve records from one or more tables
Basic Structure of SQL Queries:
The basic structure of an SQL query consists of three clauses: SELECT, FROM, and WHERE.
1. SELECT Statement: Defines WHAT is to be returned (items separated by commas):
 - Database columns (from tables or views)
 - Constant text values
 - Formulas
 - Pre-defined functions
 - Group functions (COUNT, SUM, MAX, MIN, AVG)
 - * means all columns from all tables in the FROM statement
Example: SELECT state_code, state_name
2. FROM Statement: Defines the table(s) or view(s) used by the SELECT or WHERE statements.
 - You MUST have a FROM statement
 - Multiple tables/views are separated by commas
3. WHERE Clause: Defines which records are to be included in the query.
 - It is optional
 - Uses comparison operators (=, >, >=, <, <=, !=, <>)
 - Multiple conditions are linked with AND & OR statements
 - Strings are contained within SINGLE QUOTES
AND & OR Statements:
 - Multiple WHERE conditions are linked by AND / OR statements
 - AND means all conditions are TRUE for the record
 - OR means at least 1 of the conditions is TRUE
 - You may group statements with ( )
 - BE CAREFUL MIXING AND & OR conditions
Examples:
1. SELECT * FROM annual_summaries WHERE sd_duration_code = 1
2. SELECT state_name FROM states WHERE state_population > 15000000
3. SELECT state_name, state_population FROM states WHERE state_name LIKE '%NORTH%'
4. SELECT * FROM annual_summaries WHERE sd_duration_code IN ('1', 'W', 'X') AND annual_summary_year = 2000
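The queries on the states table can be run with Python's sqlite3 module, as sketched below. The four rows and their population figures are made-up sample values, and the helper names is hypothetical. The last two calls show why mixing AND and OR needs care: AND is evaluated before OR unless parentheses group the conditions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE states (state_code TEXT, state_name TEXT,"
    " state_population INTEGER)")
# Made-up sample rows; the populations are not real figures.
conn.executemany("INSERT INTO states VALUES (?, ?, ?)", [
    ("NC", "NORTH CAROLINA", 10000000),
    ("ND", "NORTH DAKOTA", 700000),
    ("TX", "TEXAS", 29000000),
    ("CA", "CALIFORNIA", 39000000),
])

def names(where):
    """State names matching a WHERE condition, in alphabetical order."""
    sql = "SELECT state_name FROM states WHERE " + where + " ORDER BY state_name"
    return [r[0] for r in conn.execute(sql)]

big = names("state_population > 15000000")
north = names("state_name LIKE '%NORTH%'")

# Without parentheses this reads as: NORTH OR (TX AND population > 5000000)
ungrouped = names(
    "state_name LIKE '%NORTH%' OR state_code = 'TX'"
    " AND state_population > 5000000")

# With parentheses: (NORTH OR TX) AND population > 5000000
grouped = names(
    "(state_name LIKE '%NORTH%' OR state_code = 'TX')"
    " AND state_population > 5000000")

print(big)        # ['CALIFORNIA', 'TEXAS']
print(north)      # ['NORTH CAROLINA', 'NORTH DAKOTA']
print(ungrouped)  # ['NORTH CAROLINA', 'NORTH DAKOTA', 'TEXAS']
print(grouped)    # ['NORTH CAROLINA', 'TEXAS']
```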
Transaction Management:
Transaction: A transaction is a unit of program execution that accesses and possibly updates various data items. In simple words, a transaction is an event which occurs on the database. Generally a transaction reads a value from the database or writes a value to the database.
Goal of Transactions: The ACID properties
 - Atomicity: Either all actions are carried out, or none are.
 - Consistency: If each transaction is consistent, and the database is initially consistent, then it is left consistent.
 - Isolation: Transactions are isolated, or protected, from the effects of other scheduled transactions.
 - Durability: If a transaction completes successfully, then its effects persist.

1. Atomicity: A transaction can
 - Commit after completing its actions, or
 - Abort because of
    - Internal DBMS decision: restart
    - System crash: power, disk failure, ...
    - Unexpected situation: unable to access disk, data value, ...
 - A transaction interrupted in the middle could leave the database inconsistent.
 - The DBMS needs to remove the effects of partial transactions to ensure atomicity: either all of a transaction's actions are performed, or none.

2. Consistency: Database consistency is the property that every transaction sees a consistent database instance. It follows from transaction atomicity, isolation and transaction consistency.
 - Users are responsible for ensuring transaction consistency: when run to completion against a consistent database instance, the transaction leaves the database consistent.
 - For example, a consistency criterion may be that an inter-account transfer transaction does not change the total amount of money in the accounts.

3. Isolation: A guarantee that even though transactions may be interleaved, the net effect is identical to executing the transactions serially.
 - For example, if transactions T1 and T2 are executed concurrently, the net effect is equivalent to executing
    - T1 followed by T2, or
    - T2 followed by T1
 - NOTE: The DBMS provides no guarantee about which of these serial orders the result corresponds to.

4. Durability: The DBMS uses the log to ensure durability.
 - If the system crashes before the changes made by a completed transaction are written to disk, the log is used to remember and restore these changes when the system is restarted.
 - Again, this is handled by the recovery manager.
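Atomicity and consistency can be illustrated with an inter-account transfer, sketched below with Python's sqlite3 module. The accounts table, its CHECK constraint, the starting balances and the helper names transfer and total are all assumptions made for the example. A transfer either commits both updates or, when the debit would make a balance negative, aborts and rolls back the credit already applied.

```python
import sqlite3

# isolation_level=None so that BEGIN / COMMIT / ROLLBACK are issued explicitly.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute(
    "CREATE TABLE accounts (name TEXT PRIMARY KEY,"
    " balance INTEGER CHECK (balance >= 0))")
conn.execute("INSERT INTO accounts VALUES ('A', 100), ('B', 50)")

def total():
    """Total money in all accounts (the consistency criterion)."""
    return conn.execute("SELECT SUM(balance) FROM accounts").fetchone()[0]

def transfer(src, dst, amount):
    """Move amount from src to dst; apply both updates or neither."""
    conn.execute("BEGIN")
    try:
        conn.execute(
            "UPDATE accounts SET balance = balance + ? WHERE name = ?",
            (amount, dst))
        conn.execute(
            "UPDATE accounts SET balance = balance - ? WHERE name = ?",
            (amount, src))
        conn.execute("COMMIT")
        return True
    except sqlite3.IntegrityError:
        # Abort: the credit to dst is undone together with the failed debit.
        conn.execute("ROLLBACK")
        return False

ok = transfer("A", "B", 30)       # both updates are committed
failed = transfer("A", "B", 500)  # the debit breaks the CHECK, so nothing is kept
balances = dict(conn.execute("SELECT name, balance FROM accounts"))

print(ok, failed)         # True False
print(balances, total())  # {'A': 70, 'B': 80} 150
```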