What is a database? How does it differ from a traditional file system?

A database is an organized collection of structured data stored electronically and managed by a Database Management System (DBMS), which enables efficient data retrieval, updating, and management.

### Differences from Traditional File System:
1. **Data Organization**: Databases use tables within a schema to organize data, while file systems store data in separate files without inherent structure for relationships.
2. **Data Integrity**: Databases enforce integrity and consistency automatically through constraints, whereas file systems rely on application code.
3. **Redundancy**: Databases minimize redundancy through normalization; file systems often have duplicated data.
4. **Security**: Databases offer robust security features like user roles and permissions; file systems have basic file-level permissions.
5. **Concurrency**: Databases handle concurrent multi-user access efficiently, while file systems can lead to conflicts without careful management.

What do you mean by data isolation? Explain.

Data isolation refers to the challenge of accessing and integrating data stored in different files or systems, which complicates data retrieval, processing, and maintenance. In traditional file systems, data is kept in separate, isolated files, often with varying formats and without a unified structure. This makes it difficult to ensure data consistency and integrity across multiple sources, requiring complex and error-prone application code to manage and integrate the isolated data.

For example, an organization might store customer information in one file, sales data in another, and inventory details in a third. Combining this data for comprehensive reporting or analysis is complex and prone to errors. Databases address data isolation by organizing data in a unified structure, typically using tables within a schema, allowing for easy definition and management of relationships between data sets. This organization simplifies data retrieval and integration, enhances reliability, and ensures consistency and integrity across the system.

### Advantages of DBMS:
1. **Data Integrity**: Ensures data accuracy and consistency.
2. **Data Security**: Controls access to sensitive information.
3. **Data Redundancy Control**: Minimizes duplicate data.
4. **Efficient Querying**: SQL allows for complex data retrieval.
5. **Concurrent Access**: Manages simultaneous user interactions.
6. **Backup and Recovery**: Automated processes protect against data loss.

### Disadvantages of DBMS:
1. **Complexity**: Requires specialized knowledge and skills.
2. **Cost**: Initial setup and maintenance can be expensive.
3. **Performance Overhead**: May impact performance for small-scale applications.
4. **Scalability Challenges**: Some systems struggle with large datasets.
5. **Maintenance Burden**: Regular updates and optimization are necessary.

Applications of Database Management Systems (DBMS) are diverse:
- **Business Management**: ERP systems use DBMS for finance, HR, and supply chain.
- **Customer Relations**: CRM systems store customer data for sales and service.
- **Healthcare**: Patient records and medical data are managed using DBMS.
- **Banking**: DBMS handles transactions, accounts, and loans securely.
- **E-commerce**: Online stores manage inventory and orders with DBMS.
- **Telecom**: Subscriber data and call records are stored and managed.
- **Education**: Student records and course schedules are managed using DBMS.
- **Government**: Citizen data, permits, and public services are organized.
- **Manufacturing**: Production schedules and inventory are managed with DBMS.
- **Media**: Content management and distribution are streamlined with DBMS.

**Entity-Relationship (ER) diagrams** are graphical representations used in database design to illustrate the logical structure of a database. They are primarily used to model the entities and relationships within a database system.

### Various Symbols Used in ER Diagrams:
1. **Entity**: Represented by a rectangle. It denotes a real-world object, such as a person, place, or thing, that can be uniquely identified. Example: "Employee" or "Customer".
2. **Attribute**: Represented by an oval connected to the entity. It describes a property or characteristic of an entity. Example: "Employee ID" or "Customer Name".
3. **Relationship**: Represented by a diamond shape connecting entities. It signifies an association between two or more entities. Example: "Works_For" or "Owns".
4. **Primary Key**: Denoted by underlining an attribute within an entity. It uniquely identifies each entity instance. Example: Underlining "Employee ID" in the "Employee" entity.
5. **Foreign Key**: Represented similarly to the primary key. It refers to a primary key in another entity, establishing a relationship between entities. Example: "Department_ID" in the "Employee" entity referencing the "Department" entity's primary key.
6. **One-to-Many Relationship**: Illustrated by a line connecting an entity to a relationship with a crow's foot at the "many" end. It indicates that one instance of an entity can be associated with many instances of another entity. Example: "One Department has many Employees".
7. **Many-to-Many Relationship**: Shown by connecting two entities with lines and placing a crow's foot at both ends. It signifies that many instances of one entity can be associated with many instances of another entity. Example: "Many Students enroll in many Courses".
8. **Participation Constraint**: Indicates whether the participation of entities in a relationship is mandatory or optional. In some notations it is denoted by a solid line (mandatory) or dashed line (optional) connecting the entity to the relationship.
9. **Overlap Constraint**: Determines whether an entity of a superclass can belong to more than one subclass of a specialization at the same time (overlapping) or to at most one (disjoint). Commonly marked with an "o" or a "d" in the specialization circle.
10. **Covering Constraint**: Specifies whether every entity of a superclass must belong to at least one of its subclasses. Denoted by double lines (total covering) or single lines (partial covering) connecting the superclass to the specialization.
11. **Weak Entity Set**: An entity set that cannot be uniquely identified by its attributes alone and relies on a related entity set for identification. Represented by double rectangles.
12. **Aggregation**: Denoted by a rectangle drawn around a relationship and its participating entities. It treats that relationship as a higher-level entity that comprises the related entities.
13. **Role Indicator**: Labels on relationships specifying the role played by each entity in the relationship. Example: "Manager" and "Employee" in a "Manages" relationship.

### Brief Explanations:
- **Attribute**: Describes a property or characteristic of an entity.
- **Domain**: The set of allowable values for an attribute.
- **Entity**: A real-world object with a distinct identity.
- **Relationship**: Association between entities.
- **Entity Set**: A collection of similar entities.
- **Relationship Set**: A collection of similar relationships.
- **One-to-Many Relationship**: One instance of an entity is associated with many instances of another entity.
- **Many-to-Many Relationship**: Many instances of one entity are associated with many instances of another entity.
- **Participation Constraint**: Specifies if entity participation in a relationship is mandatory or optional.
- **Overlap Constraint**: Determines if an entity can belong to more than one subclass of a specialization simultaneously.
- **Covering Constraint**: Specifies if every superclass entity must belong to at least one subclass.
- **Weak Entity Set**: An entity set that cannot be uniquely identified by its attributes alone.
- **Aggregation**: Represents a higher-level entity composed of related entities.
- **Role Indicator**: Labels specifying the role of entities in a relationship.

### Object-Based Data Models:
1. **Conceptual Basis**: Objects encapsulate both data and behavior.
2. **Complexity**: Support for complex relationships and hierarchies.
3. **Flexibility**: Dynamic and mutable data structures.
4. **Expressiveness**: Inheritance, polymorphism, and encapsulation.
5. **Suitability**: Well-suited for modeling complex systems with evolving data structures.

### Record-Based Data Models:
1. **Conceptual Basis**: Records represent fixed sets of fields or attributes.
2. **Simplicity**: Straightforward representation of structured data.
3. **Scalability**: Efficiency in large-scale data storage and processing.
4. **Performance**: Optimized for querying and indexing.
5. **Common Usage**: Widely used in relational databases for structured data management.

Rules for Reducing ER Diagram to Tables:
1. **Entity Sets to Tables**:
   - Each entity set becomes a table, with attributes as columns and a primary key.
2. **Relationship Sets to Tables**:
   - Relationship sets may result in separate tables, often for many-to-many relationships.
3. **Attribute Domains**:
   - Attribute domains define data types for table columns, ensuring consistency.
4. **Normalization**:
   - The schema is normalized to Third Normal Form (3NF) or Boyce-Codd Normal Form (BCNF) to minimize redundancy.
5. **Foreign Keys**:
   - Foreign keys establish relationships between tables, enforcing referential integrity.

What are the roles of the system analyst and the application programmer in a database system?

**System Analyst**:
1. **Requirement Gathering**: Gathers and analyzes user requirements for the database system.
2. **System Design**: Designs the overall architecture and structure of the database system based on user needs.
3. **Database Design**: Collaborates with DBAs to design the database schema, including tables, relationships, and constraints.
4. **Documentation**: Documents system requirements, design specifications, and user manuals for guidance.
5. **Testing and Quality Assurance**: Oversees testing processes to ensure system functionality and user satisfaction.

**Application Programmer**:
1. **Software Development**: Develops software applications that interact with the database system.
2. **Database Connectivity**: Establishes connections between applications and the database, enabling data retrieval and manipulation.
3. **Querying and Data Manipulation**: Writes SQL queries and scripts to retrieve, update, insert, and delete data.
4. **Application Logic**: Implements business logic within applications to process data and perform operations.
5. **User Interface Development**: Designs and develops user interfaces for applications to facilitate user interaction.

Define the concept of aggregation. Give two examples of where this concept is useful.

**Aggregation** is a modeling concept in database design where a higher-level entity, known as the aggregate, is composed of related lower-level entities. It represents a whole-part relationship where the aggregate entity encompasses one or more constituent entities.

### Examples of Aggregation:
1. **Order and Order Items**:
   - In an e-commerce system, an "Order" entity can aggregate multiple "Order Item" entities.
   - Each "Order Item" represents a specific product purchased within an order.
   - The "Order" entity aggregates all "Order Items" associated with a single customer transaction.
2. **Library and Books**:
   - In a library management system, a "Library" entity can aggregate multiple "Book" entities.
   - Each "Book" entity represents an individual book in the library's collection.
   - The "Library" entity aggregates all "Books" held within its inventory.

Keys in a database are used to uniquely identify records within a table and establish relationships between tables. There are several types of keys:
- **Primary Key (PK)**: Uniquely identifies each record in a table, ensuring data integrity and facilitating efficient data retrieval.
- **Foreign Key (FK)**: Establishes relationships between tables by referencing the primary key of another table, enforcing referential integrity.
- **Composite Key**: Combination of multiple attributes that uniquely identifies each record, used when a single attribute is insufficient.
- **Candidate Key**: Minimal set of attributes that can uniquely identify records; any candidate key can be chosen as the primary key.
- **Super Key**: Set of attributes that uniquely identifies records; it may contain more attributes than necessary for uniqueness.
- **Alternate Key**: Candidate key not selected as the primary key; it still uniquely identifies records and can serve as a backup identifier.

A **Relational Database Management System (RDBMS)** is a software system that manages relational databases. It organizes data into tables, where each table consists of rows and columns. Here are the key properties of a valid relation within an RDBMS:
1. **Atomic Values**: Each cell in the table holds a single, indivisible value, ensuring data integrity and consistency.
2. **Unique Column Names**: Every column in a table must have a unique name to avoid ambiguity and facilitate data retrieval.
3. **Unique Rows**: Each row in a table must be unique, preventing duplicate data entries and ensuring data accuracy.
4. **Ordering Independence**: The order of rows and columns in a table does not affect the meaning of the data, allowing for flexible data retrieval and manipulation.
5. **No Duplicate Columns**: Each column within a table should store a distinct attribute, so the same data is not repeated across columns, preventing redundancy and maintaining data integrity.
6. **Entity Integrity**: Each table must have a primary key that uniquely identifies each row; no null values are allowed in the primary key column.
7. **Referential Integrity**: If a foreign key exists in a table, it must reference a valid primary key in another table, maintaining referential integrity and preserving the relationships between tables.

Relational algebra is a formal system for manipulating relations, which are sets of tuples representing data in a relational database. Its fundamental operations are:
1. **Selection (σ)**: Selects tuples from a relation that satisfy a specified condition.
   - Example: σ<condition>(relation)
2. **Projection (π)**: Extracts specified columns (attributes) from a relation, eliminating duplicates.
   - Example: π<attribute list>(relation)
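As a rough illustration, selection and projection map onto SQL's `WHERE` clause and `SELECT DISTINCT` column list. The sketch below uses Python's built-in `sqlite3` module with an in-memory database; the `employee` table, its columns, and the sample rows are made up for this example and do not come from the notes.

```python
import sqlite3

# In-memory database holding a small, hypothetical relation:
# employee(emp_id, name, dept)
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee (emp_id INTEGER PRIMARY KEY, name TEXT, dept TEXT)"
)
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?)",
    [(1, "Alice", "HR"), (2, "Bob", "IT"), (3, "Charlie", "IT")],
)

# Selection: keep only the tuples that satisfy the condition dept = 'IT'.
selected = conn.execute(
    "SELECT emp_id, name, dept FROM employee WHERE dept = 'IT' ORDER BY emp_id"
).fetchall()

# Projection: keep only the dept attribute.
# DISTINCT is needed because relational algebra works on sets,
# while SQL keeps duplicate rows unless told otherwise.
projected = conn.execute(
    "SELECT DISTINCT dept FROM employee ORDER BY dept"
).fetchall()

print(selected)   # [(2, 'Bob', 'IT'), (3, 'Charlie', 'IT')]
print(projected)  # [('HR',), ('IT',)]
```

Without `DISTINCT`, the second query would return `'IT'` twice, which is the main practical difference between SQL's multiset semantics and the set semantics of relational algebra.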
3. **Union (∪)**: Combines two relations, eliminating duplicates.
   - Example: relation1 ∪ relation2
4. **Intersection (∩)**: Retrieves tuples that appear in both relations.
   - Example: relation1 ∩ relation2
5. **Difference (-)**: Retrieves tuples from the first relation that do not appear in the second relation.
   - Example: relation1 - relation2
6. **Cartesian Product (×)**: Generates a new relation containing all possible combinations of tuples from two relations.
   - Example: relation1 × relation2
7. **Join (⋈)**: Combines tuples from two relations based on a join condition, typically equality on a common attribute.
   - Example: relation1 ⋈<condition> relation2

What is normalization? Why do we need it?

Normalization is the process of organizing the attributes and tables of a relational database so that each fact is stored once and the dependencies between attributes are reflected in the schema. It is essential because it:
1. **Reduces Redundancy:**
   - Stores each piece of data in one place instead of repeating it across rows or tables.
2. **Prevents Update Anomalies:**
   - A change is made in a single row, so copies of the same fact cannot drift out of sync.
3. **Prevents Insertion and Deletion Anomalies:**
   - Data about one entity can be added or removed without requiring, or losing, data about another.
4. **Improves Data Integrity:**
   - Keys and functional dependencies are built into the table structure, which makes constraints easier to enforce.
5. **Simplifies Maintenance:**
   - Smaller, well-defined tables are easier to extend and modify as requirements change.

Types (levels) of normalization

1. **First Normal Form (1NF):**
   - **Definition:** 1NF ensures that each column in a table contains atomic values, and there are no repeating groups or arrays within columns.
   - **Example:** In a table representing student enrollments, if a column contains multiple course names separated by commas, breaking this into individual rows for each course would achieve 1NF.
2. **Second Normal Form (2NF):**
   - **Definition:** 2NF builds upon 1NF and ensures that all non-key attributes are fully functionally dependent on the entire primary key.
   - **Example:** In a table representing orders with columns (OrderID, ProductID, ProductName, ProductPrice) and the composite key (OrderID, ProductID), ProductName and ProductPrice depend on ProductID alone. Moving them to a separate Products table with ProductID as the primary key would achieve 2NF.
3. **Third Normal Form (3NF):**
   - **Definition:** 3NF builds upon 2NF and removes transitive dependencies. It ensures that no non-key attribute depends on another non-key attribute.
   - **Example:** In a table representing employees with columns (EmployeeID, Department, ManagerName), ManagerName depends on Department rather than directly on EmployeeID. Moving the manager data to a separate table with its own primary key would achieve 3NF.
4. **Boyce-Codd Normal Form (BCNF):**
   - **Definition:** BCNF is a stricter version of 3NF, where every determinant is a candidate key. It eliminates any non-trivial functional dependency where the determinant is not a superkey.
   - **Example:** In a table representing courses with columns (CourseID, InstructorName, Department), if InstructorName depends on Department, it violates BCNF unless Department is a candidate key.
5. **Fourth Normal Form (4NF):**
   - **Definition:** 4NF eliminates multi-valued dependencies. It ensures that there are no two independent multi-valued facts within the same table.
   - **Example:** In a table representing student activities with columns (StudentID, Language, Sport), if a student can have multiple languages and multiple sports independently of each other, splitting this into separate tables would achieve 4NF.

What is a functional dependency? Explain its types with examples.

**Functional Dependency:**
In a relational database, a functional dependency (FD) is a relationship that exists when one attribute (or set of attributes) uniquely determines another attribute. Denoted as \(X \rightarrow Y\), it means if two rows have the same value for \(X\), they must also have the same value for \(Y\).

**Types of Functional Dependencies:**
1. **Trivial Functional Dependency:**
   - \(Y\) is a subset of \(X\).
   - Example: \(\{A, B\} \rightarrow A\).
2. **Non-Trivial Functional Dependency:**
   - \(Y\) is not a subset of \(X\).
   - Example: \(A \rightarrow B\).
3. **Completely Non-Trivial Functional Dependency:**
   - \(X\) and \(Y\) have no common attributes.
   - Example: \(A \rightarrow B\) with \(A\) and \(B\) disjoint.
4. **Partial Functional Dependency:**
   - \(Y\) is determined by only part of a composite \(X\).
   - Example: \(\{A, B\} \rightarrow C\) is partial if \(A \rightarrow C\) also holds.
5. **Transitive Functional Dependency:**
   - \(X \rightarrow Y\) and \(Y \rightarrow Z\) implies \(X \rightarrow Z\).
   - Example: \(A \rightarrow B\) and \(B \rightarrow C\) gives \(A \rightarrow C\).

**Example:**
For a relation **Student** with attributes **(StudentID, StudentName, CourseID, CourseName, Instructor)**:
- \(StudentID \rightarrow StudentName\)
- \(CourseID \rightarrow CourseName, Instructor\)
- \((StudentID, CourseID) \rightarrow Instructor\) (a partial dependency, since \(CourseID\) alone already determines \(Instructor\))

Explain the insert, select, delete and update operations in a relational database.

1. **INSERT:**
   - **Purpose:** INSERT is used to add new records (rows) into a table.
   - **Syntax:** `INSERT INTO table_name (column1, column2, ...) VALUES (value1, value2, ...);`
   - **Example:** `INSERT INTO Students (StudentID, Name, Age) VALUES (1, 'Alice', 20);`
   - This statement inserts a new record into the Students table with StudentID, Name, and Age values.
2. **SELECT:**
   - **Purpose:** SELECT retrieves data from one or more tables based on specified criteria.
   - **Syntax:** `SELECT column1, column2, ... FROM table_name WHERE condition;`
   - **Example:** `SELECT * FROM Students WHERE Age > 18;`
   - This statement retrieves all columns from the Students table where the Age is greater than 18.
3. **DELETE:**
   - **Purpose:** DELETE removes existing records from a table based on specified conditions.
   - **Syntax:** `DELETE FROM table_name WHERE condition;`
   - **Example:** `DELETE FROM Students WHERE StudentID = 1;`
   - This statement deletes the record from the Students table where the StudentID is equal to 1.
4. **UPDATE:**
   - **Purpose:** UPDATE modifies existing records in a table based on specified conditions.
   - **Syntax:** `UPDATE table_name SET column1 = value1, column2 = value2, ... WHERE condition;`
   - **Example:** `UPDATE Students SET Age = 21 WHERE StudentID = 1;`
   - This statement updates the Age column in the Students table to 21 where the StudentID is equal to 1.

Is it essential to have more than one table to define a foreign key? If no, then explain using a suitable example.

Answer: No, it's not essential to have more than one table to define a foreign key. A foreign key can reference the same table it resides in, which is known as a self-referencing or self-referential foreign key.

Example:
Consider a table named Employees with the following attributes: EmployeeID (primary key), Name, and ManagerID (foreign key referencing EmployeeID).

| EmployeeID | Name    | ManagerID |
|------------|---------|-----------|
| 1          | Alice   | NULL      |
| 2          | Bob     | 1         |
| 3          | Charlie | 2         |

In this example, the ManagerID column references the EmployeeID column in the same Employees table. Each employee's ManagerID indicates who their manager is.
- Alice (EmployeeID 1) is a manager and has no manager (ManagerID is NULL).
- Bob (EmployeeID 2) reports to Alice (ManagerID 1).
- Charlie (EmployeeID 3) reports to Bob (ManagerID 2).

What is a constraint? Explain the different types of constraints.

A constraint in a database refers to a rule or condition enforced on the data stored within tables to maintain the integrity, accuracy, and consistency of the database. Constraints help ensure that the data remains valid and meaningful. There are several types of constraints commonly used in databases:

**Primary Key Constraint:**
Ensures each row is uniquely identified.
Example:

    CREATE TABLE Students (
        StudentID INT PRIMARY KEY,
        Name VARCHAR(50),
        Age INT
    );

**Foreign Key Constraint:**
Establishes referential integrity between tables.
Example:

    CREATE TABLE Orders (
        OrderID INT PRIMARY KEY,
        CustomerID INT,
        FOREIGN KEY (CustomerID) REFERENCES Customers(CustomerID)
    );

**Unique Constraint:**
Ensures values in a column are unique.
Example:

    CREATE TABLE Products (
        ProductID INT PRIMARY KEY,
        ProductName VARCHAR(50) UNIQUE,
        Price DECIMAL(10,2)
    );

**Check Constraint:**
Ensures values in a column meet specific conditions.
Example:

    CREATE TABLE Employees (
        EmployeeID INT PRIMARY KEY,
        Age INT CHECK (Age >= 18),
        Department VARCHAR(50)
    );

**Not Null Constraint:**
Ensures a column does not contain NULL values.
Example:

    CREATE TABLE Customers (
        CustomerID INT PRIMARY KEY,
        Name VARCHAR(50) NOT NULL,
        Email VARCHAR(100)
    );

Define aggregate function.

An aggregate function in SQL is a function that performs a calculation on a set of values and returns a single value. Aggregate functions are often used with the `GROUP BY` clause of the `SELECT` statement to perform calculations on groups of rows. They are essential for summarizing data in various ways.

Common aggregate functions include:
1. **COUNT()** - Returns the number of rows that match the specified criteria.
   - Example: `SELECT COUNT(*) FROM Employees;`
   - This returns the total number of employees.
2. **SUM()** - Returns the sum of a numeric column.
   - Example: `SELECT SUM(Salary) FROM Employees;`
   - This returns the total sum of all salaries in the Employees table.
3. **AVG()** - Returns the average value of a numeric column.
   - Example: `SELECT AVG(Salary) FROM Employees;`
   - This returns the average salary of employees.
4. **MAX()** - Returns the maximum value in a set.
   - Example: `SELECT MAX(Salary) FROM Employees;`
   - This returns the highest salary in the Employees table.
5. **MIN()** - Returns the minimum value in a set.
   - Example: `SELECT MIN(Salary) FROM Employees;`
   - This returns the lowest salary in the Employees table.

Explain the ways to insert records into a table with examples.

There are several ways to insert records into a table in SQL. Here are the main methods with examples:

**Inserting a Single Row:**
Syntax:

    INSERT INTO table_name (column1, column2, ...) VALUES (value1, value2, ...);

Example:

    INSERT INTO Employees (EmployeeID, Name, Age, Department)
    VALUES (1, 'Alice', 30, 'HR');

**Inserting Multiple Rows:**
Syntax:

    INSERT INTO table_name (column1, column2, ...)
    VALUES (value1, value2, ...), (value3, value4, ...), ...;

Example:

    INSERT INTO Employees (EmployeeID, Name, Age, Department)
    VALUES (2, 'Bob', 25, 'IT'), (3, 'Charlie', 28, 'Finance');

**Inserting Data from Another Table:**
Syntax:

    INSERT INTO table_name (column1, column2, ...)
    SELECT column1, column2, ...
    FROM another_table
    WHERE condition;

Example:

    INSERT INTO Employees_Backup (EmployeeID, Name, Age, Department)
    SELECT EmployeeID, Name, Age, Department
    FROM Employees
    WHERE Department = 'HR';

**Inserting a Row with Default Values:**
Syntax:

    INSERT INTO table_name DEFAULT VALUES;

Example:

    CREATE TABLE DefaultsExample (
        ID INT PRIMARY KEY,
        Name VARCHAR(50) DEFAULT 'Unknown',
        Age INT DEFAULT 0
    );
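To see how column defaults like these behave, here is a small sketch that runs the `DefaultsExample` definition against an in-memory SQLite database through Python's built-in `sqlite3` module. Using SQLite is an assumption made for the sake of a runnable example, and the second row (`'Bob'`) is invented; other database systems fill in omitted columns the same way for this simple case.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Same table definition as in the notes: Name and Age carry DEFAULT values.
conn.execute(
    """
    CREATE TABLE DefaultsExample (
        ID INT PRIMARY KEY,
        Name VARCHAR(50) DEFAULT 'Unknown',
        Age INT DEFAULT 0
    )
    """
)

# Only ID is supplied, so Name and Age fall back to their defaults.
conn.execute("INSERT INTO DefaultsExample (ID) VALUES (1)")

# Name is supplied here, so only Age falls back to its default.
conn.execute("INSERT INTO DefaultsExample (ID, Name) VALUES (2, 'Bob')")

rows = conn.execute(
    "SELECT ID, Name, Age FROM DefaultsExample ORDER BY ID"
).fetchall()
print(rows)  # [(1, 'Unknown', 0), (2, 'Bob', 0)]
```

A default is applied only when a column is left out of the column list; explicitly inserting `NULL` into `Name` would store `NULL`, not `'Unknown'`.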
    INSERT INTO DefaultsExample (ID) VALUES (1);

This will insert a row with the default values specified in the table definition.

**Inserting Using Subquery:**
Each subquery inside `VALUES` must return a single value.
Syntax:

    INSERT INTO table_name (column1, column2, ...)
    VALUES (value1, (SELECT column FROM another_table WHERE condition), ...);

Example:

    INSERT INTO Employees (EmployeeID, Name, Age, Department)
    VALUES (4, 'Daisy', 27, (SELECT 'Marketing' FROM DUAL));

What do you mean by transaction processing systems? Define transaction with an example.

Transaction Processing Systems (TPS)

Transaction Processing Systems (TPS) are information systems that collect, store, modify, and retrieve the data transactions of an enterprise. They ensure the smooth and accurate processing of business transactions.

Transaction

A transaction is a single unit of work that is performed within a database management system. It adheres to the ACID properties:
- **Atomicity**: Ensures that either all operations within the transaction are completed or, if any fails, the transaction is aborted and none take effect.
- **Consistency**: Ensures the transaction brings the database from one consistent state to another.
- **Isolation**: Ensures operations of a transaction are isolated from other transactions.
- **Durability**: Ensures changes are permanent once the transaction is committed.

Example of a Transaction

Transferring $100 from Account A to Account B:

Start Transaction

Check Balance:
```sql
SELECT Balance FROM Accounts WHERE AccountID = 'A';
```
Debit Account A:
```sql
UPDATE Accounts SET Balance = Balance - 100 WHERE AccountID = 'A';
```
Credit Account B:
```sql
UPDATE Accounts SET Balance = Balance + 100 WHERE AccountID = 'B';
```
Commit Transaction:
```sql
COMMIT;
```

What is concurrency control? Write the purpose of concurrency control.

**Concurrency control** is a database management technique used to manage simultaneous operations on a database without conflicts, ensuring data integrity and consistency.

### Purpose of Concurrency Control
1. **Maintain Data Integrity:**
   - Ensures transactions are completed accurately and consistently without interfering with each other.
2. **Prevent Data Anomalies:**
   - Avoids issues like:
     - **Lost Updates:** When simultaneous updates overwrite each other.
     - **Dirty Reads:** When a transaction reads uncommitted changes from another transaction.
     - **Non-repeatable Reads:** When repeated reads yield different results due to other transactions' modifications.
     - **Phantom Reads:** When insertions or deletions by other transactions affect a transaction's query results.
3. **Ensure Serializability:**
   - Provides the illusion that transactions are executed one after another (serially), even though they may be executed concurrently.
4. **Increase Throughput:**
   - Allows more transactions to be processed simultaneously without

correctness of the database. Each isolation level balances the trade-off between data consistency and system performance.

Isolation Levels

**Read Uncommitted:**
- Description: Transactions can read data that has been modified but not yet committed by other transactions.
- Pros: Highest concurrency, least restrictive.
- Cons: Prone to dirty reads, non-repeatable reads, and phantom reads.
- Example: A transaction reads an uncommitted update from another transaction, which is later rolled back.

**Read Committed:**
- Description: Transactions can only read committed data. Data being modified by another transaction cannot be read until that transaction is committed.
- Pros: Prevents dirty reads.
- Cons: Prone to non-repeatable reads and phantom reads.
- Example: A transaction reads a row, another transaction modifies and commits that row, and the first transaction reads the updated row.

**Repeatable Read:**
- Description: Ensures that if a transaction reads a row, any subsequent reads of that row will return the same data, even if other transactions modify the row and commit.
- Pros: Prevents dirty reads and non-repeatable reads.
- Cons: Prone to phantom reads.
- Example: A transaction reads a set of rows, another transaction inserts a new row that matches the query criteria, and the first transaction runs the same query again and may now see the new row (a phantom read).

**Serializable:**
- Description: Ensures complete isolation from other transactions, making it appear as if transactions are executed sequentially, one after the other.
- Pros: Prevents dirty reads, non-repeatable reads, and phantom reads.
- Cons: Lowest concurrency, most restrictive, can lead to higher latency and lower throughput.
- Example: A transaction reads rows with a certain condition, another transaction tries to insert a row that meets the same condition, and the first transaction does not see the new row until it finishes.

### Differences Between Starvation and Deadlock
1. **Definition:**
   - **Deadlock:** A situation where two or more transactions are waiting for each other to release resources, causing all involved transactions to be unable to proceed.
   - **Starvation:** A situation where a transaction is perpetually delayed from acquiring resources because other transactions continuously acquire them.
2. **Cause:**
   - **Deadlock:** Arises from circular waiting conditions among transactions.
   - **Starvation:** Caused by unfair resource allocation or scheduling policies that favor certain transactions over others.
3. **Characteristics:**
   - **Deadlock:** Involves mutual exclusion, hold and wait, no preemption, and circular wait.
   - **Starvation:** Involves continuous delay of a transaction without circular waiting.
4. **Detection:**
   - **Deadlock:** Can be detected using algorithms designed to identify circular waits.
   - **Starvation:** Harder to detect; typically identified through performance monitoring and observing prolonged delays.
5. **Resolution:**
   - **Deadlock:** Resolved by prevention, avoidance, or detection and recovery mechanisms.
   - **Starvation:** Resolved by implementing fair scheduling algorithms, aging techniques, or adjusting resource allocation strategies.

Explain the purpose and scope of database security.

1. **Protect Data Confidentiality:**
   - **Purpose:** Prevent unauthorized access to sensitive data stored in the database.
   - **Scope:** Implement access controls, encryption, and data masking techniques to restrict access and obscure sensitive information.
2. **Ensure Data Integrity:**
   - **Purpose:** Maintain the accuracy and consistency of data stored in the database.
   - **Scope:** Utilize integrity constraints, auditing mechanisms, and data validation techniques to detect and prevent unauthorized modifications or tampering with data.
3. **Safeguard Data Availability:**
   - **Purpose:** Ensure that authorized users and applications can access data when needed.
   - **Scope:** Implement backup and recovery procedures, redundancy measures, and disaster recovery plans to mitigate risks of data loss and ensure uninterrupted access to the database.
4. **Mitigate Security Risks:**
   - **Purpose:** Identify and mitigate security threats and vulnerabilities that could compromise the database.
   - **Scope:** Conduct regular vulnerability assessments, apply security patches and updates, and deploy intrusion detection and prevention systems to detect and respond to security incidents.
5. **Comply with Regulations and Standards:**
   - **Purpose:** Adhere to industry regulations, data protection laws, and organizational security policies.
   - **Scope:** Align database security measures with relevant compliance standards (e.g., GDPR, HIPAA, PCI DSS), implement controls to protect sensitive data, and maintain audit trails for regulatory compliance.

Explain query processing in detail with an example.

Query processing is the mechanism by which a database management system (DBMS) interprets and executes user queries. It involves several steps, from analyzing the query syntax to retrieving the requested data. Let's explore each step in detail with an example:

### Example Query:
Consider the following SQL query:
```sql
SELECT FirstName, LastName
FROM Employees
WHERE Department = 'IT' AND Salary > 50000;
```

### Data Administration vs. Database Administration
1. **Scope:**
   - Data Administration (DA): Manages overall data assets, policies,

### Steps in Query Processing:
1. **Lexical Analysis (Tokenization):**
   - **Purpose:** Break the query into individual tokens (keywords, identifiers, operators).
   - **Example:** The query is broken down into tokens like SELECT, FirstName, FROM, Employees, WHERE, etc.
2. **Syntax Analysis (Parsing):**
   - **Purpose:** Validate the syntax and structure of the query.
   - **Example:** Verify that the query follows the SQL syntax rules, such as proper placement of keywords and clauses.
3. **Semantic Analysis:**
   - **Purpose:** Ensure the query's semantics are valid against the database schema.
   - **Example:** Check if the specified table (Employees) and columns (FirstName, LastName, Department, Salary) exist in the database.
4. **Query Optimization:**
   - **Purpose:** Generate an efficient execution plan to retrieve data.
   - **Example:** Consider different execution strategies (e.g., join methods, index selection) and choose the optimal plan based on factors like cost and resource utilization.
5. **Execution Plan Generation:**
   - **Purpose:** Convert the logical query plan into a physical execution plan.
   - **Example:** Determine the order of operations (e.g., table scans,
compromising data integrity. and standards. index lookups, joins), select appropriate access paths, and generate
### Methods of Concurrency Control - Database Administration (DBA): Focuses on technical aspects of the execution plan.
- **Locking:** specific databases. 6. **Data Retrieval and Processing:**
- **Pessimistic Locking:** Locks data to prevent other transactions 2. **Responsibilities:** - **Purpose:** Retrieve data from the database and perform
from modifying it until the lock is released. - DA: Defines data architecture, ensures compliance, and manages necessary operations.
- **Optimistic Locking:** Transactions execute without initial locks data quality. - **Example:** Access the Employees table, filter rows where
but check for conflicts before committing. - DBA: Installs, configures, and maintains databases, optimizing Department is 'IT' and Salary is greater than 50000, and project the
- **Timestamp Ordering:** performance and security. FirstName and LastName columns.
- Assigns timestamps to transactions to ensure they are executed in 3. **Focus:**
timestamp order to maintain consistency. - DA: Strategic use of data to support organizational objectives. 7. **Result Formation:**
- **Multiversion Concurrency Control (MVCC):** - DBA: Operational tasks to ensure database availability and - **Purpose:** Format the retrieved data into the final result set.
- Maintains multiple versions of data, allowing reads to access a reliability. - **Example:** Format the selected FirstName and LastName
snapshot without blocking writes. 4. **Level of Involvement:** columns into rows, producing the final result set that satisfies the
- **Serialization Graph Checking:** - DA: Collaborates with business leaders on high-level decisions. query conditions.
- Constructs a graph to ensure that the schedule of transactions is - DBA: Engages in day-to-day operational tasks and IT interactions.
conflict-serializable. 5. **Impact:** 8. **Result Delivery:**
- DA: Affects business strategy and decision-making by ensuring - **Purpose:** Return the result set to the user or application.
Transaction Isolation Levels
Transaction isolation levels are a key concept in database data usability. - **Example:** Send the result set containing employee names to
- DBA: Directly impacts database performance, security, and the client application or display it in a user interface.
management systems (DBMS) that determine how transaction
integrity is visible to other transactions and ensure consistency and reliability.
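The query-processing steps for the example query (`SELECT FirstName, LastName FROM Employees WHERE Department = 'IT' AND Salary > 50000`) can be observed in a small sketch. It assumes SQLite, driven from Python's built-in `sqlite3` module, only because it runs without a server; other DBMSs expose the same stages through their own tools. The sample rows, the index name `idx_emp_department`, and the helper functions `plan` and `rejected` are hypothetical and exist only for illustration.

```python
import sqlite3

# In-memory database with a hypothetical Employees table and sample rows.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Employees ("
    "FirstName TEXT, LastName TEXT, Department TEXT, Salary INTEGER)"
)
conn.executemany(
    "INSERT INTO Employees VALUES (?, ?, ?, ?)",
    [
        ("Asha", "Rao", "IT", 72000),
        ("Ben", "Lee", "IT", 48000),
        ("Cara", "Diaz", "HR", 65000),
    ],
)

QUERY = (
    "SELECT FirstName, LastName FROM Employees "
    "WHERE Department = 'IT' AND Salary > 50000"
)


def plan(sql):
    """Return the plan SQLite chose for sql as a single string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[3] for row in rows)  # column 3 holds the plan text


def rejected(sql):
    """Return the error message if SQLite rejects sql, otherwise None."""
    try:
        conn.execute(sql)
    except sqlite3.OperationalError as exc:
        return str(exc)
    return None


# Steps 1-2 (tokenizing and parsing): a malformed query is rejected.
syntax_error = rejected("SELECT FirstName FROM WHERE Department = 'IT'")

# Step 3 (semantic analysis): valid syntax, but the column does not exist.
semantic_error = rejected("SELECT Nickname FROM Employees")

# Steps 4-5 (optimization and plan generation): the plan changes once an
# index on Department becomes available.
plan_before = plan(QUERY)
conn.execute("CREATE INDEX idx_emp_department ON Employees (Department)")
plan_after = plan(QUERY)

# Steps 6-8 (retrieval, result formation, delivery): run the query.
rows = conn.execute(QUERY).fetchall()

print("syntax error:  ", syntax_error)
print("semantic error:", semantic_error)
print("plan before:   ", plan_before)
print("plan after:    ", plan_after)
print("result:        ", rows)
```

The exact wording of the plan text differs between SQLite versions, so it is best read as a description of the chosen strategy (full table scan versus index search) rather than a stable output format.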
Objectives of DBMS:
- **Efficiently manage data storage and retrieval:** DBMS organizes data in a structured manner, facilitating efficient storage and retrieval operations.
- **Ensure data integrity and security:** DBMS enforces constraints and security measures to maintain data accuracy and protect against unauthorized access.
- **Support concurrent access by multiple users:** DBMS provides mechanisms for managing concurrent access to ensure data consistency and avoid conflicts.
- **Facilitate data sharing and collaboration:** DBMS allows multiple users to access and modify data simultaneously, enabling collaboration within an organization.
- **Provide data consistency and reliability:** DBMS ensures that data remains consistent and reliable across different operations and transactions.

Importance of DBMS:
- **Reduces data redundancy and inconsistency:** By centralizing data storage and enforcing data integrity constraints, DBMS minimizes redundancy and ensures consistency across the database.
- **Improves data security and access control:** DBMS provides authentication, authorization, and encryption mechanisms to protect sensitive data and control access to it.
- **Enhances data integrity and reliability:** DBMS enforces data integrity constraints and transactional properties to maintain the accuracy and reliability of data.
- **Facilitates efficient data querying and analysis:** With SQL-based query languages and optimization techniques, DBMS enables users to retrieve and analyze data efficiently.
- **Simplifies application development and maintenance:** DBMS abstracts the complexities of data management, allowing developers to focus on application logic rather than low-level data operations.

Applications of DBMS:
- **Business applications:** Customer relationship management (CRM), enterprise resource planning (ERP), inventory management.
- **Banking and finance systems:** Transaction processing, account management, risk analysis.
- **Healthcare information systems:** Electronic health records (EHR), patient management, medical imaging.
- **E-commerce and online transactions:** Online shopping, order processing, payment processing.
- **Social media platforms:** User profiles, content management, social network analysis.
- **Scientific research and data analysis:** Data mining, bioinformatics, climate modeling.
- **Government and administrative systems:** Public records management, taxation, voting systems.

Data encryption is the process of converting plaintext data into ciphertext using cryptographic algorithms and keys, ensuring data security and confidentiality.
Data Encryption Process:
1. Select Encryption Algorithm: Choose a cryptographic algorithm (e.g., AES, RSA) suitable for the security requirements and application.
2. Generate Encryption Key: Create a secret encryption key that will be used to encrypt the plaintext data.
3. Apply Encryption Algorithm: Use the selected encryption algorithm and the encryption key to transform plaintext data into ciphertext.
4. Store or Transmit Ciphertext: The resulting ciphertext, which appears as random and unintelligible data, can be securely stored or transmitted over networks.

Data decryption is the reverse process of converting ciphertext back into plaintext using decryption algorithms and keys, allowing authorized users to access the original data.
Data Decryption Process:
1. Retrieve Ciphertext: Obtain the encrypted ciphertext that needs to be decrypted.
2. Select Decryption Algorithm: Choose the appropriate decryption algorithm corresponding to the encryption algorithm used.
3. Provide Decryption Key: Use the correct decryption key, which corresponds to the encryption key used during encryption.
4. Apply Decryption Algorithm: Utilize the decryption algorithm and the decryption key to reverse the encryption process, converting ciphertext back into plaintext.
5. Access Original Data: The decrypted plaintext data is now accessible and can be used for its intended purpose.

Data security issues
- **Unauthorized Access:** Preventing unauthorized users from accessing sensitive data through stringent access controls and authentication mechanisms.
- **Data Breaches:** Implementing measures to detect, prevent, and respond to data breaches, including intrusion detection systems and incident response plans.
- **Malware and Ransomware:** Deploying antivirus software, firewalls, and security patches to protect against malware and ransomware attacks.
- **Insider Threats:** Implementing employee monitoring, access controls, and training programs to mitigate risks associated with insider threats.
- **Encryption and Data Protection:** Utilizing encryption techniques to safeguard data during transmission and storage, ensuring confidentiality and integrity.

Measuring query cost
1. **Execution Time:** Measure the time taken for query execution, including parsing, optimization, and execution phases.
2. **Resource Utilization:** Monitor CPU, memory, and disk I/O usage during query execution to gauge resource consumption.
3. **I/O Operations:** Count the number of disk reads and writes performed by the query, as excessive I/O operations can impact performance.
4. **Query Plan Complexity:** Evaluate the complexity of the query execution plan generated by the database optimizer, including the number of joins, sorts, and aggregations.
5. **Index Usage:** Analyze the effectiveness of indexes in optimizing query performance by measuring the number of index scans, seeks, and lookups.

Describe the different types of triggers used in databases with a suitable practical example.
Triggers in databases are procedural code blocks that are automatically executed in response to certain events occurring in the database. There are mainly three types of triggers:
1. **Before Triggers (BEFORE INSERT, UPDATE, DELETE):** These triggers fire before the execution of an INSERT, UPDATE, or DELETE operation on a table.
2. **After Triggers (AFTER INSERT, UPDATE, DELETE):** These triggers fire after the execution of an INSERT, UPDATE, or DELETE operation on a table.
3. **Instead of Triggers (INSTEAD OF INSERT, UPDATE, DELETE):** These triggers are executed instead of the triggering INSERT, UPDATE, or DELETE operation. They are typically defined on views.

**Example:**
Consider a scenario where you have a database table called "employees" with columns `id`, `name`, `salary`, `bonus`, and `total_salary`. You want to automatically calculate the total salary (including bonus) whenever a new employee is inserted or their salary or bonus is updated. You can use triggers to achieve this.
```sql
-- Create the employees table
CREATE TABLE employees (
    id INT PRIMARY KEY,
    name VARCHAR(100),
    salary DECIMAL(10, 2),
    bonus DECIMAL(10, 2),
    total_salary DECIMAL(10, 2)  -- maintained by the triggers below
);

-- MySQL-style syntax: each trigger handles a single event, so INSERT and
-- UPDATE need separate triggers. BEFORE triggers are used because the row
-- being written can only be changed through NEW before it is stored.
CREATE TRIGGER calculate_total_salary_insert
BEFORE INSERT ON employees
FOR EACH ROW
SET NEW.total_salary = NEW.salary + COALESCE(NEW.bonus, 0);

CREATE TRIGGER calculate_total_salary_update
BEFORE UPDATE ON employees
FOR EACH ROW
SET NEW.total_salary = NEW.salary + COALESCE(NEW.bonus, 0);
```

How is a trigger implemented in a database? Explain with a suitable example.
Triggers in databases are implemented using procedural code blocks that are associated with specific database tables and automatically executed in response to predefined events. Here's how triggers are implemented, along with a suitable example:
**Implementation Steps:**
1. **Create Trigger:** Define the trigger using SQL syntax, specifying the trigger event (e.g., INSERT, UPDATE, DELETE), the triggering table, and the trigger action (code block).
2. **Associate Trigger with Table:** Link the trigger to a specific database table, indicating the events that should activate the trigger.
3. **Define Trigger Action:** Write the procedural code block that will be executed when the trigger is activated, which typically includes SQL statements or program logic.
4. **Enable or Activate Trigger:** Ensure that the trigger is enabled and active within the database environment so that it can respond to the specified events.

**Example:**
Consider a scenario where you have a database table called "orders" that stores information about customer orders. You want to implement a trigger that automatically sets the "order_status" column to "Shipped" whenever a new order is inserted into the "orders" table with a shipping date.
```sql
-- MySQL-style syntax. In the mysql client, change the statement delimiter
-- (DELIMITER //) before running a trigger with a BEGIN ... END body.
-- Step 1: Create Trigger
CREATE TRIGGER update_order_status
-- Step 2: Associate Trigger with Table
-- (BEFORE is used because a MySQL trigger cannot UPDATE the table that fired it)
BEFORE INSERT ON orders
FOR EACH ROW
BEGIN
    -- Step 3: Define Trigger Action
    IF NEW.shipping_date IS NOT NULL THEN
        SET NEW.order_status = 'Shipped';
    END IF;
END;
```

How to create and drop an index in a database? Explain.
Creating and dropping indexes in a database is essential for optimizing query performance and managing database resources efficiently. Here's how to create and drop indexes, along with explanations:
**Creating Index:**
```sql
-- Syntax to create an index
CREATE INDEX index_name ON table_name (column_name);

-- Example: Create an index on the "email" column of the "users" table
CREATE INDEX idx_email ON users (email);
```
Explanation:
- The `CREATE INDEX` statement is used to create an index in a database.
- `index_name` is the name given to the index.
- `table_name` is the name of the table on which the index is created.
- `column_name` is the name of the column(s) for which the index is created.

**Benefits of Indexes:**
- Indexes improve query performance by allowing the database engine to quickly locate rows based on indexed columns.
- They facilitate faster data retrieval and can optimize the execution of SELECT, JOIN, and WHERE clauses.

**Dropping Index:**
```sql
-- Syntax to drop an index
DROP INDEX index_name ON table_name;

-- Example: Drop the "idx_email" index from the "users" table
DROP INDEX idx_email ON users;
```
Explanation:
- The `DROP INDEX` statement is used to remove an index from a database.
- `index_name` is the name of the index to be dropped.
- `table_name` is the name of the table from which the index is dropped.
- The `ON table_name` form is used by MySQL and SQL Server; PostgreSQL, Oracle, and SQLite use `DROP INDEX index_name;` without the table name.

Define row-level trigger. How does it differ from a statement-level trigger? Explain.
A row-level trigger is a type of database trigger that is automatically executed for each row affected by the triggering event. This means that the trigger action is performed individually for every row touched by the triggering INSERT, UPDATE, or DELETE operation. Row-level triggers provide granular control over data manipulation and allow for specific actions to be taken on a per-row basis.
Here's how a row-level trigger differs from a statement-level trigger:
**Row-Level Trigger:**
- A row-level trigger fires once for each row affected by the triggering event.
- The trigger action is executed separately for each affected row.
- Useful for performing operations that depend on individual row data, such as validating or modifying specific column values.
- Example: Updating a column value based on conditions specific to each row.

**Statement-Level Trigger:**
- A statement-level trigger fires once for each SQL statement that triggers the event.
- The trigger action is executed once for the entire set of affected rows, rather than individually for each row.
- Typically used for performing operations that apply to the entire set of affected rows, such as auditing or logging.
- Example: Logging the details of a bulk data modification operation.

What do you mean by subqueries? Write the advantages of subqueries.
Subqueries are SQL queries embedded within another query. Their advantages include modularity, data filtering, comparison, aggregation, and optimization, which make queries more flexible and powerful. A subquery can also be correlated, meaning it refers to columns of the outer query.
1. **Modularity:** Subqueries break down complex queries into smaller, more manageable components, improving code readability and maintainability.
2. **Data Filtering:** They enable precise data retrieval by filtering rows based on specific conditions derived from another query.
3. **Data Comparison:** Subqueries facilitate comparisons between data from different tables or the same table, enhancing data analysis capabilities.
4. **Data Aggregation:** They allow for the use of aggregate functions to analyze subsets of data, providing insights into data trends and patterns.
5. **Optimization:** Subqueries can sometimes improve query performance by reducing the amount of data processed at each stage of execution, although the actual effect depends on the execution plan the optimizer chooses.
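Two common uses of subqueries, filtering against an aggregate and a correlated subquery, can be sketched as follows. This is a minimal illustration that assumes SQLite through Python's built-in `sqlite3` module; the `Employees` table layout and its sample rows are hypothetical.

```python
import sqlite3

# In-memory database with a hypothetical Employees table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Employees (Name TEXT, Department TEXT, Salary INTEGER)")
conn.executemany(
    "INSERT INTO Employees VALUES (?, ?, ?)",
    [
        ("Asha", "IT", 72000),
        ("Ben", "IT", 48000),
        ("Cara", "HR", 65000),
        ("Dev", "HR", 55000),
    ],
)

# Data filtering with an aggregate subquery: the inner query computes the
# average salary once, and the outer query keeps rows above that value.
above_average = [
    row[0]
    for row in conn.execute(
        "SELECT Name FROM Employees "
        "WHERE Salary > (SELECT AVG(Salary) FROM Employees) "
        "ORDER BY Name"
    )
]

# Correlated subquery: the inner query refers to the outer row (e.Department),
# so it is evaluated per row to find the top earner of each department.
top_per_department = conn.execute(
    "SELECT Name, Department FROM Employees AS e "
    "WHERE Salary = (SELECT MAX(Salary) FROM Employees "
    "                WHERE Department = e.Department) "
    "ORDER BY Department"
).fetchall()

print(above_average)       # ['Asha', 'Cara']
print(top_per_department)  # [('Cara', 'HR'), ('Asha', 'IT')]
```

The first query shows modularity and filtering: the average is expressed once, inside the subquery, instead of being computed separately by the application. The second shows comparison within the same table.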