Advanced Database Systems Study Guide - Unit 1
Short Answer Quiz:
What is a Database Management System (DBMS) and describe
its role in a database system.
A Database Management System (DBMS) is software that manages and
provides controlled access to a database. It acts as an intermediary
between users/applications and the database itself, enabling data
storage, retrieval, modification, and administration.
Differentiate between logical and physical database design.
What considerations are relevant to each?
Logical design focuses on the conceptual structure of the database,
including entities, attributes, and relationships, while physical design
deals with the physical storage and organization of data on disk. Logical
design prioritizes data modelling and integrity, while physical design
prioritizes efficiency and performance.
Explain the key differences between centralized and distributed
database architectures.
Centralized databases store data in a single physical location, while
distributed databases spread data across multiple physical locations,
potentially geographically dispersed. Centralized systems offer
simplicity and ease of management, while distributed systems offer
scalability, fault tolerance, and localized data access.
List and briefly describe three types of NoSQL databases.
Three types of NoSQL databases are:
(1) Key-Value Stores (e.g., Redis), storing data as key-value pairs for fast
retrieval;
(2) Document Databases (e.g., MongoDB), storing data in flexible JSON-
like documents;
(3) Graph Databases (e.g., Neo4j), representing data as nodes and
relationships, ideal for interconnected data.
What are the three main components of a database engine, and
what are their primary responsibilities?
The three main components:
(1) Storage Manager: handles data storage and retrieval from physical
storage;
(2) Query Processor: interprets and executes queries, optimizing for
efficiency;
(3) Transaction Manager: ensures atomicity, consistency, isolation, and
durability (ACID) of transactions.
What is the purpose of a buffer manager in a DBMS, and how
does it contribute to performance optimization?
The buffer manager acts as a cache between the DBMS and the disk
storage. It keeps frequently accessed data blocks in memory, reducing
disk I/O and improving query performance. It manages the buffer pool,
determining which blocks to load/evict based on usage patterns.
What are the three levels of database schemas, and what does
each level represent?
The three levels are:
(1) External Schema: user's view of the data, specific to their needs;
(2) Logical Schema: overall conceptual structure, including entities,
attributes, and relationships;
(3) Physical Schema: how data is physically stored, including files,
indexes, and data formats
What is normalization in the context of relational databases,
and what are its primary goals?
Normalization is a process of organizing data in relations (tables) to
reduce data redundancy and improve data integrity. It aims to eliminate
update anomalies (inconsistencies from data updates), reduce storage
space, and improve query efficiency.
Describe the ACID properties of transactions and explain why
they are crucial in database management.
ACID properties ensure data reliability:
(1) Atomicity: all parts of a transaction succeed or fail together;
(2) Consistency: transactions maintain database rules and constraints;
(3) Isolation: concurrent transactions don't interfere with each other;
(4) Durability: committed changes are permanent, even after failures.
What is Role-Based Access Control (RBAC) in database security,
and how does it work to protect sensitive data?
RBAC simplifies security management by grouping users with similar
access needs into roles. Permissions are assigned to roles, and users are
then assigned to specific roles. This allows for easier administration, as
permissions can be granted or revoked at the role level.
Glossary of Key Terms:
Term Definition
Database - An organized collection of data, typically managed by a
database management system (DBMS).
DBMS Database Management System - Software for managing and
accessing databases, providing tools for data storage, retrieval,
modification, and administration.
Schema - A blueprint or description of the logical structure of a
database, defining tables, columns, data types, and relationships.
Relational Database - A database that organizes data into tables with
rows and columns, using relationships (foreign keys) to link data
between tables.
SQL Structured Query Language: A standard language for interacting
with relational databases, used for data definition, manipulation, and
querying.
Normalization - The process of organizing data in a relational database
to reduce data redundancy and improve data integrity.
Entity Relationship Diagram (ERD) - A visual representation of
entities (concepts) and the relationships between them in a database.
Transaction - A unit of work in a database that must be completed
entirely or not at all, ensuring data consistency and integrity.
ACID Properties - Atomicity, Consistency, Isolation, Durability: Four
properties that guarantee database transactions' reliability and
integrity.
Concurrency Control - Mechanisms that manage simultaneous access
to data in a database, preventing conflicts and ensuring data integrity
when multiple transactions occur concurrently.
Recovery - The process of restoring a database to a consistent state
after a failure, such as a system crash or disk failure.
Security - Measures taken to protect a database from unauthorized
access, modification, or destruction, including authentication,
authorization, and auditing.
Buffer Manager - A DBMS component responsible for managing the
buffer pool, a cache of data blocks in memory that improves query
performance by reducing disk I/O.
Query Processor - The component of a DBMS that interprets and
executes SQL queries, optimizing them for efficient execution.
Index - A data structure that improves the speed of data retrieval
operations on a database table at the cost of additional writes and
storage space to maintain the index data structure.
Stored Procedure - A pre-compiled SQL code block stored and
executed within the database server, allowing for the creation of
reusable database operations.
Trigger - A stored procedure in a database that automatically executes
in response to specific events, such as data modifications.
NoSQL Database - A non-relational database that provides a
mechanism for storage and retrieval of data that is modelled in means
other than the tabular relations used in relational databases.