Databases Unit Test 3

Uploaded by

ayushvanani01


Question 1: Define a data warehouse. Compare and contrast the design of a data warehouse with the design of a third normal form database.

A data warehouse is a centralized repository designed to store, manage, and
analyze large volumes of structured and sometimes semi-structured data. It is
optimized for querying and reporting, enabling organizations to derive insights
from historical and current data. Data in a warehouse is often subject-oriented,
integrated from multiple sources, time-variant, and non-volatile.

Comparison: Data Warehouse vs. Third Normal Form (3NF) Database

| Aspect | Data Warehouse | 3NF Database |
|---|---|---|
| Purpose | Optimized for querying, reporting, and data analysis. | Optimized for transactional processing (OLTP). |
| Data Structure | Uses denormalized structures like star or snowflake schemas. | Uses normalized structures to reduce redundancy. |
| Focus | Historical and aggregated data for decision-making. | Real-time data for day-to-day operations. |
| Performance | Query performance is prioritized, often through denormalization and indexing. | Write and update performance prioritized. |
| Data Sources | Integrates data from multiple heterogeneous sources. | Typically, single-source, handling specific operational data. |
| Data Updates | Rarely updated; data is appended for analysis purposes. | Frequent updates to support transactional needs. |
| Schema Design | Star Schema or Snowflake Schema. | Highly normalized schema (up to 3NF or higher). |
| Storage | Handles large datasets; uses partitioning and compression. | Smaller datasets are designed for compact storage. |

Key Differences

1. Design Goals:
   ◦ A data warehouse focuses on efficient querying and analytics, often employing denormalization for performance.
   ◦ A 3NF database ensures data integrity and minimizes redundancy, aligning with operational needs.
2. Schema Design:
   ◦ Data warehouses use star or snowflake schemas, which need fewer joins and so give quicker query performance.
   ◦ 3NF databases typically need many joins to answer a query, because normalization splits the data across many tables.
3. Data Updates:
   ◦ Data in a warehouse is primarily read-only and updated through periodic ETL (Extract, Transform, Load) processes.
   ◦ 3NF databases undergo frequent updates to handle transactional workflows.
4. Usage Scenarios:
   ◦ Data warehouses support strategic decision-making through historical data analysis.
   ◦ 3NF databases support day-to-day business operations.

Question 2: Define and explain the difference between a fact and a dimension.

• Fact: A fact represents the numerical or measurable data in a dataset.
These are typically quantitative metrics or values that an organization wants
to analyze, such as sales revenue, quantities, or profit. Facts are stored in
fact tables in a data warehouse.
• Dimension: A dimension provides the descriptive context for facts. These
are qualitative attributes or categories that help interpret the facts, such as
product names, customer details, geographic locations, or time. Dimensions
are stored in dimension tables and are used to slice, filter, or aggregate
the fact data during analysis.

Key Differences Between Fact and Dimension

| Aspect | Fact | Dimension |
|---|---|---|
| Nature of Data | Quantitative measurable data. | Qualitative or descriptive data. |
| Purpose | Represents "what is being measured." | Provides context or "how to describe the facts." |
| Example | Sales revenue, units sold, profit, cost. | Product name, customer demographics, region. |
| Storage | Stored in fact tables. | Stored in dimension tables. |
| Table Characteristics | Contains foreign keys linking to dimension tables and measurable columns. | Contains primary keys and descriptive attributes. |
| Granularity | Highly granular, reflecting individual transactions or events. | Less granular, reflecting categories or hierarchies. |
| Analysis role | Provides the "values" for calculations like sum, average, or count. | Provides the context to group, filter, or drill down into the facts. |

Example in a Data Warehouse

Scenario: A retailer analyzing sales data.

• Fact Table:
   ◦ Columns: Sale_ID (key), Product_ID (foreign key), Customer_ID (foreign key), Sale_Amount, Quantity.
   ◦ Facts: Sale_Amount and Quantity are the measurable data points.
• Dimension Table:
   ◦ Product Dimension:
      ▪ Columns: Product_ID (key), Product_Name, Category, Brand.
      ▪ Dimensions: Product_Name and Category provide descriptive details for the facts.
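
The retailer scenario above can be sketched with Python's built-in sqlite3 module. This is a minimal illustration, not a production design: the table names (sales_fact, product_dim) and every data row are assumptions made up for the example. It shows the fact table supplying the values to sum while the dimension supplies the attribute to group by.

```python
import sqlite3

# In-memory database; table names and rows are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE product_dim (
    product_id   INTEGER PRIMARY KEY,
    product_name TEXT,
    category     TEXT,
    brand        TEXT
);
CREATE TABLE sales_fact (
    sale_id     INTEGER PRIMARY KEY,
    product_id  INTEGER REFERENCES product_dim(product_id),
    customer_id INTEGER,
    sale_amount REAL,
    quantity    INTEGER
);
""")

conn.executemany("INSERT INTO product_dim VALUES (?, ?, ?, ?)", [
    (1, "Laptop", "Electronics", "BrandA"),
    (2, "Phone", "Electronics", "BrandB"),
    (3, "Desk", "Furniture", "BrandC"),
])
conn.executemany("INSERT INTO sales_fact VALUES (?, ?, ?, ?, ?)", [
    (1, 1, 101, 1000.0, 1),
    (2, 2, 102, 500.0, 2),
    (3, 3, 101, 200.0, 1),
    (4, 1, 103, 1000.0, 1),
])


def sales_by_category(connection):
    """Aggregate the facts (amount, quantity) by a dimension attribute."""
    return connection.execute("""
        SELECT d.category, SUM(f.sale_amount), SUM(f.quantity)
        FROM sales_fact AS f
        JOIN product_dim AS d ON d.product_id = f.product_id
        GROUP BY d.category
        ORDER BY d.category
    """).fetchall()


category_totals = sales_by_category(conn)
print(category_totals)
```

Only one join is needed to move from the measures to a descriptive attribute, which is the property the star schema is designed around.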

Question 3:

1. Provide a high-level description (without modeling) of the fact and dimension tables required

Fact Table

The fact table stores the key metrics (e.g., total users, usage hours) and
captures the activity in the computer labs. It includes metrics such as the
number of students, session durations, and applications used.

Dimension Tables
The dimension tables provide descriptive attributes such as time, student
demographics, majors, classes, and software details.

Time Dimension: Stores data about date and time for aggregating metrics by
different time periods.
Student Dimension: Stores demographic and academic details about students,
such as gender, age, major, semester, and class.
Lab Dimension: Contains information about the computer labs, such as location
and available resources.
Application Dimension: Tracks details about the applications used, including the
name, version, and operating system.
Class Dimension: Represents the academic classes or courses students are
associated with.
Major Dimension: Tracks details about the academic majors offered.

2. Describe the granularity of each of the fact tables

Lab Usage Fact Table: One row per student session in a lab, per application
used, with a timestamp. For example, a single session in which a student used two
different applications will generate two rows.
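
As a small sketch of this grain, the snippet below expands one lab session into one fact row per application. The LabSession class and the to_fact_rows helper are hypothetical names introduced only for illustration, and the session values are invented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LabSession:
    """One visit by one student to one lab (hypothetical source record)."""
    student_id: int
    lab_id: int
    start_time: str
    applications: tuple


def to_fact_rows(session):
    """Expand a session into one fact row per application used."""
    return [
        {
            "student_id": session.student_id,
            "lab_id": session.lab_id,
            "start_time": session.start_time,
            "application": app,
        }
        for app in session.applications
    ]


# A single session with two applications yields two fact rows.
session = LabSession(7, 2, "2024-09-03 10:00", ("MATLAB", "Microsoft Word"))
fact_rows = to_fact_rows(session)
print(len(fact_rows))
```

One consequence of this grain is worth keeping in mind: if the full session duration were copied onto every row, summing durations would count a multi-application session more than once, so a measure like that would need to be apportioned per application or aggregated with care.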
3. Present the full table attributes of both the fact and dimension tables

Lab Usage Fact

| Column Name | Key | Data Type | Description |
|---|---|---|---|
| usage_id | PK | INT | Unique identifier for the fact row |
| time_id | FK | INT | Foreign key to the Time Dimension |
| student_id | FK | INT | Foreign key to the Student Dimension |
| lab_id | FK | INT | Foreign key to the Lab Dimension |
| application_id | FK | INT | Foreign key to the Application Dimension |
| class_id | FK | INT | Foreign key to the Class Dimension |
| major_id | FK | INT | Foreign key to the Major Dimension |
| num_students | | INT | Number of students in this session |
| session_duration | | FLOAT | Duration of the session in hours |

Time Dimension
| Column Name | Key | Data Type | Description |
|---|---|---|---|
| time_id | PK | INT | Unique identifier |
| date | | DATE | Full date |
| day_of_week | | VARCHAR | Day of the week |
| month | | VARCHAR | Month name |
| year | | INT | Year |
| quarter | | VARCHAR | Quarter (e.g., Q1, Q2) |

Student Dimension
| Column Name | Key | Data Type | Description |
|---|---|---|---|
| student_id | PK | INT | Unique identifier |
| gender | | CHAR(1) | Gender code (M, F, or O for Other) |
| age | | INT | Age of the student |
| major_id | FK | INT | Foreign key to Major Dimension |
| semester | | VARCHAR | Semester (e.g., Fall 2024) |

Lab Dimension

| Column Name | Key | Data Type | Description |
|---|---|---|---|
| lab_id | PK | INT | Unique identifier |
| location | | VARCHAR | Lab location |
| capacity | | INT | Number of workstations |

Application Dimension

| Column Name | Key | Data Type | Description |
|---|---|---|---|
| application_id | PK | INT | Unique identifier |
| app_name | | VARCHAR | Name of the application |
| version | | VARCHAR | Version of the application |
| os | | VARCHAR | Operating system (e.g., Windows) |

Class Dimension
| Column Name | Key | Data Type | Description |
|---|---|---|---|
| class_id | PK | INT | Unique identifier |
| class_name | | VARCHAR | Name of the class/course |

Major Dimension
| Column Name | Key | Data Type | Description |
|---|---|---|---|
| major_id | PK | INT | Unique identifier |
| major_name | | VARCHAR | Name of the major |
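
The attribute lists above translate fairly directly into table definitions. The sketch below creates them in an in-memory SQLite database from Python. The table names (lab_usage_fact, time_dim, and so on) are assumptions chosen for the example, and SQLite's INTEGER, REAL, and TEXT types stand in for INT, FLOAT, VARCHAR, and DATE.

```python
import sqlite3

# Table names are illustrative; columns follow the attribute tables above.
SCHEMA = """
CREATE TABLE major_dim (major_id INTEGER PRIMARY KEY, major_name TEXT);
CREATE TABLE class_dim (class_id INTEGER PRIMARY KEY, class_name TEXT);
CREATE TABLE time_dim (
    time_id     INTEGER PRIMARY KEY,
    date        TEXT,
    day_of_week TEXT,
    month       TEXT,
    year        INTEGER,
    quarter     TEXT
);
CREATE TABLE student_dim (
    student_id INTEGER PRIMARY KEY,
    gender     TEXT,
    age        INTEGER,
    major_id   INTEGER REFERENCES major_dim(major_id),
    semester   TEXT
);
CREATE TABLE lab_dim (
    lab_id   INTEGER PRIMARY KEY,
    location TEXT,
    capacity INTEGER
);
CREATE TABLE application_dim (
    application_id INTEGER PRIMARY KEY,
    app_name       TEXT,
    version        TEXT,
    os             TEXT
);
CREATE TABLE lab_usage_fact (
    usage_id         INTEGER PRIMARY KEY,
    time_id          INTEGER REFERENCES time_dim(time_id),
    student_id       INTEGER REFERENCES student_dim(student_id),
    lab_id           INTEGER REFERENCES lab_dim(lab_id),
    application_id   INTEGER REFERENCES application_dim(application_id),
    class_id         INTEGER REFERENCES class_dim(class_id),
    major_id         INTEGER REFERENCES major_dim(major_id),
    num_students     INTEGER,
    session_duration REAL
);
"""

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)

# List the dimension tables that the fact table points to.
referenced_dims = sorted(
    {row[2] for row in conn.execute("PRAGMA foreign_key_list(lab_usage_fact)")}
)
print(referenced_dims)
```

Note that SQLite only enforces these references when `PRAGMA foreign_keys = ON` is set on the connection; the sketch declares them to document the design rather than to enforce it.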

4. Provide a diagram representing the data warehouse design

[Diagram: star schema with the Lab Usage Fact table (usage_id PK; time_id, student_id, lab_id, application_id, class_id, and major_id FKs; num_students; session_duration) at the center, linked to the Time, Student, Lab, Application, Class, and Major dimension tables. The Student Dimension also references the Major Dimension through major_id.]

5. Using sample data (which you will create), provide a sample of three
reports that may be generated from this warehouse data.

Report: Total Number of Users by Month and Major

• Columns: Month, Major, Total Users

• Example Data:

| Month | Major | Total Users |
|---|---|---|
| January | Computer Sci | 450 |
| January | Business | 300 |

Report: Application Usage by Gender

• Columns: Application Name, Gender, Total Sessions

• Example Data:

| App Name | Gender | Total Sessions |
|---|---|---|
| Microsoft Word | M | 1200 |
| MATLAB | F | 850 |

Report: Average Session Duration by Lab Location

• Columns: Lab Location, Average Duration

• Example Data:

| Lab Location | Average Duration (hrs) |
|---|---|
| Lab A | 1.5 |
| Lab B | 2.0 |
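
Reports like these would typically come from a join between the fact table and one or two dimensions, followed by a GROUP BY. The sketch below shows two of them against a cut-down schema in SQLite. The table names and all rows are invented for illustration (they are much smaller than the example figures above), and "total users" is assumed to mean distinct students.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Cut-down, hypothetical version of the warehouse with made-up rows.
CREATE TABLE time_dim  (time_id INTEGER PRIMARY KEY, month TEXT);
CREATE TABLE major_dim (major_id INTEGER PRIMARY KEY, major_name TEXT);
CREATE TABLE lab_dim   (lab_id INTEGER PRIMARY KEY, location TEXT);
CREATE TABLE lab_usage_fact (
    usage_id         INTEGER PRIMARY KEY,
    time_id          INTEGER REFERENCES time_dim(time_id),
    student_id       INTEGER,
    lab_id           INTEGER REFERENCES lab_dim(lab_id),
    major_id         INTEGER REFERENCES major_dim(major_id),
    session_duration REAL
);
INSERT INTO time_dim  VALUES (1, 'January'), (2, 'February');
INSERT INTO major_dim VALUES (1, 'Computer Sci'), (2, 'Business');
INSERT INTO lab_dim   VALUES (1, 'Lab A'), (2, 'Lab B');
INSERT INTO lab_usage_fact VALUES
    (1, 1, 10, 1, 1, 1.0),
    (2, 1, 11, 1, 1, 2.0),
    (3, 1, 12, 2, 2, 2.0),
    (4, 1, 10, 2, 1, 2.0),
    (5, 2, 12, 1, 2, 1.5);
""")

# Report 1: distinct students per month and major.
users_by_month_major = conn.execute("""
    SELECT t.month, m.major_name, COUNT(DISTINCT f.student_id)
    FROM lab_usage_fact AS f
    JOIN time_dim  AS t ON t.time_id  = f.time_id
    JOIN major_dim AS m ON m.major_id = f.major_id
    GROUP BY t.time_id, t.month, m.major_name
    ORDER BY t.time_id, m.major_name
""").fetchall()

# Report 3: average session duration per lab location.
avg_duration_by_lab = conn.execute("""
    SELECT l.location, AVG(f.session_duration)
    FROM lab_usage_fact AS f
    JOIN lab_dim AS l ON l.lab_id = f.lab_id
    GROUP BY l.location
    ORDER BY l.location
""").fetchall()

print(users_by_month_major)
print(avg_duration_by_lab)
```

The application-by-gender report would follow the same pattern, joining the fact table to the application and student dimensions and counting rows per group.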

6. Provide a solution handling the following cases and explain why your
solution works:

I. When a Student Changes Majors

Solution: Add a new record for the student with the updated major while
preserving the old record for historical tracking (a Type 2 slowly changing
dimension). Include fields such as Effective_Start_Date and
Effective_End_Date to manage validity periods. Because a student can then
have more than one row, each row needs its own surrogate key, and fact rows
should reference the row that was valid when the session took place.

Why This Works: It preserves the historical association of the student
with their previous major while linking new activities to the updated
major. This ensures accurate reporting and trend analysis over time.
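
A minimal sketch of this approach in Python with SQLite follows. The student_key surrogate column, the helper function names, and the sample dates are assumptions for illustration; the effective date columns mirror the fields named in the solution, and a NULL end date is assumed to mark the current row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE student_dim (
    student_key          INTEGER PRIMARY KEY AUTOINCREMENT, -- surrogate key (assumed)
    student_id           INTEGER NOT NULL,                  -- natural key
    major_name           TEXT NOT NULL,
    effective_start_date TEXT NOT NULL,                     -- ISO dates sort as text
    effective_end_date   TEXT                               -- NULL marks the current row
)
""")


def add_student(conn, student_id, major_name, start_date):
    """Insert the first (current) row for a student."""
    conn.execute(
        "INSERT INTO student_dim (student_id, major_name, effective_start_date) "
        "VALUES (?, ?, ?)",
        (student_id, major_name, start_date),
    )


def change_major(conn, student_id, new_major, change_date):
    """Close the current row and open a new one; history is never overwritten."""
    conn.execute(
        "UPDATE student_dim SET effective_end_date = ? "
        "WHERE student_id = ? AND effective_end_date IS NULL",
        (change_date, student_id),
    )
    add_student(conn, student_id, new_major, change_date)


def major_on(conn, student_id, on_date):
    """Return the major that was valid for the student on a given date."""
    row = conn.execute(
        "SELECT major_name FROM student_dim "
        "WHERE student_id = ? AND effective_start_date <= ? "
        "AND (effective_end_date IS NULL OR ? < effective_end_date)",
        (student_id, on_date, on_date),
    ).fetchone()
    return row[0] if row else None


add_student(conn, 10, "Business", "2023-09-01")
change_major(conn, 10, "Computer Sci", "2024-09-01")

print(major_on(conn, 10, "2024-03-01"))  # before the change
print(major_on(conn, 10, "2024-09-15"))  # after the change
```

During ETL, a lookup like major_on would let each new fact row be linked to the dimension row that was valid at the time of the session, so older sessions keep pointing at the earlier major.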

II. When a Lab Moves Locations

Solution: Add a new record with the updated location details while
keeping the old record for historical reference. Use a surrogate key to
differentiate between the old and new locations.

Why This Works: It maintains a historical record of the lab's prior
location, allowing reports to reflect the correct location for activities
conducted before and after the move.

III. When a New Version of a Software Package Is Installed in the Lab

Solution: Create a new record for the software with the updated version
and operating system details. Link any new usage data to this record.

Why This Works: This allows the data warehouse to track software
usage at a granular level, distinguishing between different versions of the
same application. It ensures that reporting can provide insights into
which version was used during specific periods.

7. Does your solution include any snowflaking? If so, why?

Yes, snowflaking is used for the Lab Dimension. The Lab Dimension might
reference a separate table for Campus Details (e.g., campus name, city,
state) to normalize the design. This creates a snowflaked schema for labs.
The Student Dimension's major_id foreign key to the Major Dimension is a
second instance of the same pattern.

Why? If multiple labs share common attributes (e.g., campus),
snowflaking reduces redundancy by storing the shared attributes in a
separate table. This can improve data consistency and reduce storage
overhead for repetitive data.
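
The snowflaked lab design described above could look like the following sketch in Python with SQLite. The campus_dim table, its campus_id key, and all rows are hypothetical; the campus name, city, and state columns follow the Campus Details example in the answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Shared campus attributes live in one place (hypothetical table and rows).
CREATE TABLE campus_dim (
    campus_id   INTEGER PRIMARY KEY,
    campus_name TEXT,
    city        TEXT,
    state       TEXT
);
CREATE TABLE lab_dim (
    lab_id    INTEGER PRIMARY KEY,
    location  TEXT,
    capacity  INTEGER,
    campus_id INTEGER REFERENCES campus_dim(campus_id)
);
INSERT INTO campus_dim VALUES (1, 'Main Campus', 'Springfield', 'IL');
INSERT INTO lab_dim VALUES (1, 'Lab A', 30, 1), (2, 'Lab B', 25, 1);
""")

# Reports reach campus attributes through one extra join from the lab dimension.
labs_with_campus = conn.execute("""
    SELECT l.location, c.campus_name, c.city
    FROM lab_dim AS l
    JOIN campus_dim AS c ON c.campus_id = l.campus_id
    ORDER BY l.lab_id
""").fetchall()
print(labs_with_campus)
```

Both labs share a single campus row, so the campus name and city are stored once. The trade-off is the additional join whenever a report needs campus-level attributes.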
