DS 5110: Introduction to Data Management and Processing
Sample Midterm Exam
• You have 2.5 hours.
• The exam is closed book.
• You are allowed two letter-sized pages of notes (both sides).
• No electronic devices are allowed.
• Please read all questions carefully before answering them.
• Good luck!
Name:
NEU ID (optional):
1
1 Multiple-Choice Questions (40 points)
Circle ALL the correct choices: there may be more than one correct choice, but there is always
at least one correct choice. NO partial credit: the set of all the correct answers must be checked.
There are 10 multiple choice questions worth 4 points each.
1. Which of the following describes the role of a Data Definition Language (DDL) in a DBMS?
(a) Manages database transactions and ensures data integrity.
(b) Facilitates the querying and retrieval of data from a database.
(c) Defines and manages the database structure.
(d) Manages the user access rights and security roles within a database.
2. The following ERD describes a relation between employees, managers, and departments:
Which of the following statements is true?
(a) An employee can work in more than one department.
(b) Every employee has a manager.
(c) Two employees in the same department can have different managers.
(d) There may be departments with no managers.
3. Suppose that S(A, ...) is a relation containing m rows and T (B, ...) is a relation containing
n rows. Assume that A is the primary key of S and B is a foreign key reference to S(A).
Suppose that we perform an inner join between S and T on S.A = T.B. The maximum
number of rows that can be in the output is:
(a) m
(b) n
(c) m + n
(d) mn
(e) max(m, n)
4. Assume that we have executed the following SQL statements:
CREATE TABLE s ( a INT , b INT );
CREATE TABLE r ( a INT , b INT , c INT );
INSERT INTO s (a , b ) VALUES (1 ,2);
INSERT INTO s (a , b ) VALUES (3 ,3);
2
INSERT INTO s (a , b ) VALUES (1 ,2);
INSERT INTO s (a , b ) VALUES (1 ,4);
INSERT INTO s (a , b ) VALUES (3 ,4);
INSERT INTO r (a ,b , c ) VALUES (1 ,2 ,10);
INSERT INTO r (a ,b , c ) VALUES (3 ,3 ,3);
INSERT INTO r (a ,b , c ) VALUES (1 ,7 ,3);
INSERT INTO r (a ,b , c ) VALUES (1 ,2 ,8);
INSERT INTO r (a ,b , c ) VALUES (3 ,3 ,5);
INSERT INTO r (a ,b , c ) VALUES (1 ,1 ,10);
INSERT INTO r (a ,b , c ) VALUES (2 ,2 ,10);
What will be the result of the following query:
SELECT c
FROM r
GROUP BY c
HAVING COUNT ( c ) = ( SELECT ( COUNT ( DISTINCT a ))
FROM s );
(a) 1
(b) 3
(c) 5
(d) 10
5. Which of the following statements about foreign keys is true?
(a) A foreign key always references a primary key in another table.
(b) A table can have multiple foreign keys.
(c) Foreign keys prevent actions that would destroy links between tables.
(d) A foreign key cannot have NULL values.
6. Consider the following table:
Employees(EmployeeID, FirstName, LastName, ManagerID)
You are asked to find the entire lineage (all superiors) of employee with id 100 up to the
highest level in the organization. Which SQL statement will retrieve this information?
(a) WITH RecursiveCTE AS (
SELECT EmployeeID , FirstName , LastName , ManagerID
FROM Employees
WHERE EmployeeID = 100
UNION ALL
SELECT e . EmployeeID , e . FirstName , e . LastName , e . ManagerID
FROM Employees e
JOIN RecursiveCTE r ON e . EmployeeID = r . ManagerID
)
SELECT FirstName , LastName FROM RecursiveCTE ;
3
(b) WITH RecursiveCTE AS (
SELECT EmployeeID , FirstName , LastName , ManagerID
FROM Employees
WHERE EmployeeID = 100
UNION ALL
SELECT e . EmployeeID , e . FirstName , e . LastName , e . ManagerID
FROM Employees e
JOIN RecursiveCTE r ON e . ManagerID = r . EmployeeID
)
SELECT FirstName , LastName FROM RecursiveCTE ;
(c) WITH RecursiveCTE AS (
SELECT EmployeeID , FirstName , LastName , ManagerID
FROM Employees
WHERE EmployeeID = 100
UNION ALL
SELECT e . EmployeeID , e . FirstName , e . LastName , e . ManagerID
FROM Employees e
JOIN RecursiveCTE r ON e . EmployeeID = r . EmployeeID
)
SELECT FirstName , LastName FROM RecursiveCTE ;
(d) WITH RecursiveCTE AS (
SELECT EmployeeID , FirstName , LastName , ManagerID
FROM Employees
WHERE EmployeeID = 100
UNION ALL
SELECT e . EmployeeID , e . FirstName , e . LastName , e . ManagerID
FROM Employees e
JOIN RecursiveCTE r ON e . ManagerID = r . ManagerID
)
SELECT FirstName , LastName FROM RecursiveCTE ;
7. Given the schema R = (A, B, C, D, E, G) with the following functional dependencies:
F = {AB → C, AD → E, B → E, BC → A, AD → B, E → G}.
Which of the following decompositions of R preserves the dependencies?
(a) R1 = (A, B, C), R2 = (A, C, D, E), R3 = (A, D, G)
(b) R1 = (A, B, C), R2 = (A, C, D, E), R3 = (B, D, E, G)
(c) R1 = (A, B, C, D), R2 = (A, D, E, G)
(d) R1 = (A, B, C, D), R2 = (B, D, E, G)
4
8. What normal form the following table is in?
OrderId CustomerId CustomerName ProductId ProductName
1 1001 Alice P01 Apple
2 1001 Alice P02 Banana
3 1002 Bob P03 Cherry
(a) 1NF
(b) 2NF
(c) 3NF
(d) BCNF
9. What is the primary use of a View in SQL?
(a) To speed up the performance of database queries.
(b) To provide a virtual table on the result-set of an SQL statement.
(c) To impose additional security on database transactions.
(d) To abstract the complexity of underlying database schemas.
10. What is the purpose of a cursor object when working with databases in Python?
(a) Create and manage database connections.
(b) Store tables from the database in memory for fast access.
(c) Execute SQL queries and fetch the results from the database.
(d) Define the database schema.
5
2 ERD (15 points)
Design a system for movie rentals that manages movie collections, customer rentals, payments, and
employee interactions. Draw an ERD using the provided entities and their relationships. Mark
primary keys, and indicate the cardinality and participation (total/partial) of relationships.
System description:
• A movie has id, title, release year, and a director.
• A director has an id, first name, last name, and birth date.
• Each movie has one director, but a director can direct multiple movies.
• A customer has id, first name, last name, address, and email.
• A movie can be rented multiple times by different customers at different times. For each
rental, the system stores the rental date, due date, and return date.
• Each rental is associated with a payment. A payment has amount and date.
• An employee has id, first name, last name, and hire date.
• A rental can be processed by only one employee, but an employee can process multiple rentals.
6
3 Data Normalization (15 points)
Consider the relational schema R(A, B, C, D, E, G) with the functional dependencies:
F = {AB → C, C → A, BC → D, ACD → B, D → EG, BE → C, CG → BD, CE → AG}.
1. Is the dependency BD → ACG in F + ?
2. Find all the candidate keys in R.
3. Indicate all BCNF violations in R.
4. Decompose R into a set of relations that are in BCNF.
7
4 SQL (30 points)
You are given the following schemas of a database for a coaching club.
coach(coach_id, name, e_mail, from_date, hourly_rate)
types(type_name, description)
coaches(coach_id, type_name)
clients(client_id, address, mobile)
training_program(client_id, start_date, coach_id, type_name, hours)
Description of the relations:
• The relation coach contains data on coaches that work in the club. Each coach has an id,
name, e-mail address, date of starting his/her job as a coach (from_date) and hourly rate.
• The relation types contains data on the coaching types offered in the club, including the type
name (e.g., ”life coaching”, ”career coaching”, etc.) and description.
• The relation coaches describes which coaching types are offered by each coach. Each coaching
type has at least one coach who offers it.
• The relation clients contains data on clients of the club. Each client has an id, name,
address and mobile phone. Each client has at least one training program.
• The relation training_program contains data on the training programs of the clients. For
each client, it stores the starting date of the program (start_date), id of the coach, the
coaching type (type_name), and the number of total hours planned for the program. The
cost of the program is calculated based on the number of hours and the hourly rate of the
selected coach.
Write the following queries in SQL (5 points for each query):
1. Find all the clients that have been trained in a coaching type whose description contains the
word ”life”.
2. Find pairs of different clients who started their training at the same day with the same coach.
The result columns should include the ids of the customers and the starting date of the
training. Each such pair should appear only once in the result.
8
3. Find all the clients who have never been trained by a coach named Levi.
4. Find for each client the total amount he/she has to pay for all his/her training programs.
5. Find customers who have been trained in all the coaching types offered by the club.
6. Find customers who have had at least 3 different training programs, and all their training
programs have been carried out by the same coach.