TARGET DATA ANALYST SQL INTERVIEW QUESTIONS
1. Identify the top 5 products that have shown the highest increase in weekly sales
over the last quarter.
Create a Table
Create a table named weekly_sales to store product sales data.
CREATE TABLE weekly_sales (
product_id INT,
product_name VARCHAR(100),
week_start_date DATE,
sales INT
);
Insert Sample Data
Insert some sample data representing weekly sales for different products over the last quarter.
INSERT INTO weekly_sales (product_id, product_name, week_start_date, sales)
VALUES
(1, 'Product A', '2024-09-01', 100),
(1, 'Product A', '2024-09-08', 120),
(1, 'Product A', '2024-09-15', 150),
(2, 'Product B', '2024-09-01', 80),
(2, 'Product B', '2024-09-08', 200),
(2, 'Product B', '2024-09-15', 220),
(3, 'Product C', '2024-09-01', 50),
(3, 'Product C', '2024-09-08', 60),
(3, 'Product C', '2024-09-15', 80),
(4, 'Product D', '2024-09-01', 300),
(4, 'Product D', '2024-09-08', 280),
(4, 'Product D', '2024-09-15', 260),
(5, 'Product E', '2024-09-01', 400),
(5, 'Product E', '2024-09-08', 420),
(5, 'Product E', '2024-09-15', 500),
(6, 'Product F', '2024-09-01', 100),
(6, 'Product F', '2024-09-08', 120),
(6, 'Product F', '2024-09-15', 200);
Query to Find Top 5 Products
Calculate the increase in weekly sales and identify the top 5 products.
WITH weekly_difference AS (
SELECT
product_id,
product_name,
MAX(sales) - MIN(sales) AS sales_increase
FROM
weekly_sales
WHERE
week_start_date >= '2024-09-01' AND week_start_date <= '2024-09-30' -- restrict to the last quarter (sample data covers September 2024)
GROUP BY
product_id, product_name
)
SELECT
product_id,
product_name,
sales_increase
FROM
weekly_difference
ORDER BY
sales_increase DESC
LIMIT 5;
Explanation
1. Insert Data: The data is structured to capture weekly sales across different weeks for
multiple products.
2. Calculate Sales Increase:
o Use the WITH clause to create a CTE (weekly_difference) that computes the spread in weekly sales for each product (MAX(sales) - MIN(sales)). Note that this measures spread rather than direction; see the window-function sketch after the Expected Output.
3. Order and Limit Results: Order the results in descending order of sales_increase and limit to the top 5.
Expected Output
For the sample data, the query returns the five products with the largest sales spread over the quarter: Product B (140), Product E (100), Product F (100), Product A (50), and Product D (40). Product C (30) is the one excluded.
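One caveat: MAX(sales) - MIN(sales) measures spread, not direction, so a product whose sales fell through the quarter (like Product D) would still show a positive "increase." A direction-aware sketch using window functions, assuming MySQL 8+ (any engine with FIRST_VALUE/LAST_VALUE works the same way):
-- Sketch: compare each product's latest week to its earliest week,
-- so declining products get a negative change instead of a positive one
WITH first_last AS (
    SELECT
        product_id,
        product_name,
        FIRST_VALUE(sales) OVER (
            PARTITION BY product_id ORDER BY week_start_date
        ) AS first_week_sales,
        LAST_VALUE(sales) OVER (
            PARTITION BY product_id ORDER BY week_start_date
            ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
        ) AS last_week_sales
    FROM weekly_sales
    WHERE week_start_date BETWEEN '2024-09-01' AND '2024-09-30'
)
SELECT DISTINCT
    product_id,
    product_name,
    last_week_sales - first_week_sales AS sales_increase
FROM first_last
ORDER BY sales_increase DESC
LIMIT 5;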
2. Calculate the percentage of out-of-stock items for each store and identify the
stores where, on average, more than 20% of items are out of stock.
Create a Table
Create a table named inventory_status to store inventory data for various stores.
CREATE TABLE inventory_status (
store_id INT,
store_name VARCHAR(100),
item_id INT,
item_name VARCHAR(100),
in_stock BOOLEAN -- 1 for in-stock, 0 for out-of-stock
);
Insert Sample Data
INSERT INTO inventory_status (store_id, store_name, item_id, item_name, in_stock)
VALUES
(1, 'Store A', 101, 'Item X', 1),
(1, 'Store A', 102, 'Item Y', 0),
(1, 'Store A', 103, 'Item Z', 1),
(1, 'Store A', 104, 'Item W', 0),
(2, 'Store B', 201, 'Item X', 0),
(2, 'Store B', 202, 'Item Y', 0),
(2, 'Store B', 203, 'Item Z', 0),
(2, 'Store B', 204, 'Item W', 1),
(3, 'Store C', 301, 'Item X', 1),
(3, 'Store C', 302, 'Item Y', 1),
(3, 'Store C', 303, 'Item Z', 1),
(3, 'Store C', 304, 'Item W', 1),
(4, 'Store D', 401, 'Item X', 0),
(4, 'Store D', 402, 'Item Y', 0),
(4, 'Store D', 403, 'Item Z', 0),
(4, 'Store D', 404, 'Item W', 0);
Query to Calculate Percentage and Identify Stores
WITH out_of_stock_percentage AS (
SELECT
store_id,
store_name,
COUNT(CASE WHEN in_stock = 0 THEN 1 END) * 100.0 / COUNT(*) AS out_of_stock_pct
FROM
inventory_status
GROUP BY
store_id, store_name
)
SELECT
store_id,
store_name,
out_of_stock_pct
FROM
out_of_stock_percentage
WHERE
out_of_stock_pct > 20;
Explanation
1. Count Out-of-Stock Items: Use COUNT(CASE WHEN in_stock = 0 THEN 1 END) to
count out-of-stock items for each store.
2. Calculate Percentage: Divide the count of out-of-stock items by the total count of
items in the store (COUNT(*)) and multiply by 100 to get the percentage.
3. Filter by Threshold: Use WHERE out_of_stock_pct > 20 to identify stores with more
than 20% out-of-stock items.
Expected Output
store_id store_name out_of_stock_pct
1 Store A 50.0
2 Store B 75.0
4 Store D 100.0
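A compact MySQL-specific alternative: a boolean comparison evaluates to 0 or 1, so AVG(in_stock = 0) yields the out-of-stock fraction directly. A minimal sketch:
-- One-pass equivalent (MySQL): AVG of a 0/1 expression is the out-of-stock fraction
SELECT
    store_id,
    store_name,
    AVG(in_stock = 0) * 100 AS out_of_stock_pct
FROM inventory_status
GROUP BY store_id, store_name
HAVING out_of_stock_pct > 20;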
3. Find products that were consistently sold in every store across a region but saw
no sales in at least one store last month.
Create a Table
Create a table named store_sales to store sales data for various products in different stores.
CREATE TABLE store_sales (
store_id INT,
store_name VARCHAR(100),
product_id INT,
product_name VARCHAR(100),
sales INT,
sales_date DATE
);
Insert Sample Data
Insert sample records to represent sales data across different stores and products.
INSERT INTO store_sales (store_id, store_name, product_id, product_name, sales, sales_date)
VALUES
(1, 'Store A', 101, 'Product X', 50, '2024-09-01'),
(1, 'Store A', 102, 'Product Y', 30, '2024-09-01'),
(1, 'Store A', 103, 'Product Z', 20, '2024-09-01'),
(2, 'Store B', 101, 'Product X', 40, '2024-09-01'),
(2, 'Store B', 102, 'Product Y', 0, '2024-09-01'),
(2, 'Store B', 103, 'Product Z', 10, '2024-09-01'),
(3, 'Store C', 101, 'Product X', 60, '2024-09-01'),
(3, 'Store C', 102, 'Product Y', 20, '2024-09-01'),
(3, 'Store C', 103, 'Product Z', 15, '2024-09-01'),
(1, 'Store A', 101, 'Product X', 50, '2024-10-01'),
(1, 'Store A', 102, 'Product Y', 0, '2024-10-01'),
(1, 'Store A', 103, 'Product Z', 25, '2024-10-01'),
(2, 'Store B', 101, 'Product X', 50, '2024-10-01'),
(2, 'Store B', 102, 'Product Y', 10, '2024-10-01'),
(2, 'Store B', 103, 'Product Z', 0, '2024-10-01'),
(3, 'Store C', 101, 'Product X', 30, '2024-10-01'),
(3, 'Store C', 102, 'Product Y', 10, '2024-10-01'),
(3, 'Store C', 103, 'Product Z', 20, '2024-10-01');
Query to Find Products
Identify products that were consistently sold in all stores but saw no sales in at least one store
in the last month.
WITH all_stores_sales AS (
SELECT
product_id,
product_name,
COUNT(DISTINCT store_id) AS stores_sold_in
FROM
store_sales
GROUP BY
product_id, product_name
HAVING
stores_sold_in = (SELECT COUNT(DISTINCT store_id) FROM store_sales)
),
last_month_sales AS (
SELECT
product_id,
product_name,
store_id,
SUM(sales) AS total_sales
FROM
store_sales
WHERE
MONTH(sales_date) = MONTH(CURRENT_DATE - INTERVAL 1 MONTH)
AND
YEAR(sales_date) = YEAR(CURRENT_DATE - INTERVAL 1 MONTH)
GROUP BY
product_id, product_name, store_id
),
zero_sales AS (
SELECT
product_id,
product_name
FROM
last_month_sales
WHERE
total_sales = 0
GROUP BY
product_id, product_name
)
SELECT DISTINCT
a.product_id,
a.product_name
FROM
all_stores_sales a
JOIN
zero_sales z
ON
a.product_id = z.product_id;
Explanation
1. Step 1: Identify Consistent Sales Across All Stores
o Use COUNT(DISTINCT store_id) to find products sold in every store.
o Compare the count with the total number of stores.
2. Step 2: Filter Last Month's Data
o Use WHERE to filter sales data for the previous month using CURRENT_DATE -
INTERVAL 1 MONTH.
3. Step 3: Find Products with Zero Sales
o Group by product and store, sum the sales, and flag products whose total is
zero in at least one store.
4. Step 4: Combine Results
o Join consistent sales data with zero-sales data to find products meeting both
conditions.
Expected Output
Assuming the query runs in November 2024, so that "last month" is October 2024:
product_id product_name
102 Product Y
103 Product Z
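A limitation worth flagging: the zero_sales CTE only catches stores that logged an explicit sales = 0 row; a store with no row at all for a product last month slips through. A sketch of the complementary check, assuming MySQL and the same store_sales table:
-- Sketch: find products missing entirely from at least one store last month.
-- Build every product x store pair, then anti-join against last month's rows.
SELECT DISTINCT p.product_id, p.product_name
FROM (SELECT DISTINCT product_id, product_name FROM store_sales) p
CROSS JOIN (SELECT DISTINCT store_id FROM store_sales) st
LEFT JOIN store_sales s
    ON s.product_id = p.product_id
    AND s.store_id = st.store_id
    AND s.sales_date >= DATE_FORMAT(CURRENT_DATE - INTERVAL 1 MONTH, '%Y-%m-01')
    AND s.sales_date < DATE_FORMAT(CURRENT_DATE, '%Y-%m-01')
WHERE s.product_id IS NULL; -- no row at all for this product/store pair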
4. Design a query to track customer retention by calculating how many customers
made purchases every month for the last 6 months.
Create a Table
Create a table named customer_purchases to store purchase data for customers.
CREATE TABLE customer_purchases (
customer_id INT,
customer_name VARCHAR(100),
purchase_date DATE,
purchase_amount DECIMAL(10, 2)
);
Insert Sample Data
INSERT INTO customer_purchases (customer_id, customer_name, purchase_date,
purchase_amount)
VALUES
(1, 'Alice', '2024-06-15', 100.50),
(1, 'Alice', '2024-07-10', 200.00),
(1, 'Alice', '2024-08-05', 150.75),
(1, 'Alice', '2024-09-12', 250.00),
(1, 'Alice', '2024-10-20', 300.00),
(1, 'Alice', '2024-11-15', 400.00),
(2, 'Bob', '2024-06-25', 80.00),
(2, 'Bob', '2024-07-15', 90.00),
(2, 'Bob', '2024-09-18', 120.00),
(2, 'Bob', '2024-11-10', 140.00),
(3, 'Charlie', '2024-06-05', 110.00),
(3, 'Charlie', '2024-07-05', 130.00),
(3, 'Charlie', '2024-08-05', 150.00),
(3, 'Charlie', '2024-09-05', 170.00),
(3, 'Charlie', '2024-10-05', 190.00),
(3, 'Charlie', '2024-11-05', 210.00);
Query to Calculate Retention
WITH months_data AS (
SELECT
customer_id,
DATE_FORMAT(purchase_date, '%Y-%m') AS purchase_month
FROM
customer_purchases
WHERE
purchase_date >= DATE_FORMAT(CURRENT_DATE - INTERVAL 6 MONTH, '%Y-%m-01')
),
monthly_count AS (
SELECT
customer_id,
COUNT(DISTINCT purchase_month) AS months_purchased
FROM
months_data
GROUP BY
customer_id
)
SELECT
customer_id
FROM
monthly_count
WHERE
months_purchased = 6;
Explanation
1. Filter for the Last 6 Months:
o WHERE purchase_date >= DATE_FORMAT(CURRENT_DATE - INTERVAL 6
MONTH, '%Y-%m-01') ensures we only consider purchases made in the last 6
months.
2. Extract Month-Year:
o Use DATE_FORMAT(purchase_date, '%Y-%m') to extract the month and year for
grouping.
3. Count Distinct Months:
o Count the number of distinct months each customer made purchases in using
COUNT(DISTINCT purchase_month).
4. Filter Retained Customers:
o Retained customers are those who made purchases in exactly 6 distinct months
(WHERE months_purchased = 6).
Sample Output
customer_id
1
3
• Customer 1 (Alice) and Customer 3 (Charlie) made a purchase in every one of the last 6 months (assuming the query runs in December 2024), indicating high retention.
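For reporting, the same logic can also return names in a single statement; a sketch assuming customer_name never varies within a customer_id:
-- Retained customers with names, in one pass (MySQL allows the alias in HAVING)
SELECT
    customer_id,
    MAX(customer_name) AS customer_name,
    COUNT(DISTINCT DATE_FORMAT(purchase_date, '%Y-%m')) AS months_purchased
FROM customer_purchases
WHERE purchase_date >= DATE_FORMAT(CURRENT_DATE - INTERVAL 6 MONTH, '%Y-%m-01')
GROUP BY customer_id
HAVING months_purchased = 6;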
5. Explain how indexing works in SQL and how you would use it to optimize a query
that involves multiple joins on a large dataset of store transactions.
How Indexing Works in SQL
Indexing in SQL is a performance optimization technique that speeds up data retrieval. An index
is a separate data structure that SQL databases maintain to allow faster lookups for specific
columns or combinations of columns. Think of it as a table of contents in a book—it helps you
quickly locate the page containing the information you need without scanning the entire book.
Key Concepts of Indexing:
1. Structure: Most indexes are implemented as B-trees or hash tables, depending on the
database and type of index.
2. Primary Index: Automatically created for the primary key column(s).
3. Secondary Index: Created on other columns to enhance query performance.
4. Composite Index: Includes multiple columns to optimize queries involving those
columns in conjunction.
5. Clustered vs. Non-clustered:
o Clustered Index: Reorders the physical storage of data.
o Non-clustered Index: Points to the physical data without reordering it.
Use of Indexing in Queries with Multiple Joins
In a large dataset involving multiple joins, indexes help reduce the time needed to scan rows
and perform lookups during join operations. Here's how indexing can be applied:
Example Scenario
You have a large dataset of store transactions in tables:
1. transactions: Contains transaction IDs, store IDs, and product IDs.
2. stores: Contains store details.
3. products: Contains product details.
You want to optimize this query:
SELECT
t.transaction_id, s.store_name, p.product_name
FROM
transactions t
JOIN
stores s ON t.store_id = s.store_id
JOIN
products p ON t.product_id = p.product_id
WHERE
t.transaction_date BETWEEN '2024-01-01' AND '2024-12-31';
Steps to Optimize with Indexing
1. Analyze Join Columns:
o The query joins on store_id and product_id. Index these columns because they
are used in the ON clause.
2. Index Filtering Columns:
o The transaction_date column is filtered in the WHERE clause, so index it to
optimize the date range filtering.
3. Create Composite Indexes for Common Access Patterns:
o Create a composite index for (store_id, transaction_date) on the transactions
table if both columns are frequently queried together.
SQL Commands to Create Indexes:
-- Create indexes on join columns
CREATE INDEX idx_transactions_store_id ON transactions(store_id);
CREATE INDEX idx_transactions_product_id ON transactions(product_id);
-- Create index on filtering column
CREATE INDEX idx_transactions_date ON transactions(transaction_date);
-- Composite index for common patterns
CREATE INDEX idx_transactions_store_date ON transactions(store_id, transaction_date);
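After creating the indexes, it is worth confirming the optimizer actually uses them. In MySQL, EXPLAIN reports the chosen index in its key column (output format varies by engine and version):
-- Verify index usage on the query being optimized
EXPLAIN
SELECT t.transaction_id, s.store_name, p.product_name
FROM transactions t
JOIN stores s ON t.store_id = s.store_id
JOIN products p ON t.product_id = p.product_id
WHERE t.transaction_date BETWEEN '2024-01-01' AND '2024-12-31';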
Why Indexing Helps in Joins
1. Efficient Lookup:
o Instead of performing a full table scan, the database uses the index to quickly
locate matching rows in the transactions, stores, and products tables.
2. Reduced I/O Operations:
o Indexing reduces disk reads, which are the most expensive part of query
execution in large datasets.
3. Improved Join Performance:
o When multiple tables are joined, indexes allow SQL to fetch matching rows from
each table faster.
Cautions When Using Indexes
1. Index Overhead:
o Indexes take additional disk space and slow down INSERT, UPDATE, and DELETE
operations because they need to update the indexes.
2. Too Many Indexes:
o Avoid indexing every column; focus on frequently queried and joined columns.
3. Index Maintenance:
o Periodically monitor and optimize indexes, especially for tables with high
transaction volumes.
Conclusion
Indexes can dramatically improve the performance of queries involving multiple joins on large
datasets. In the given example, indexing the join columns (store_id, product_id) and filtering
columns (transaction_date) enables faster lookups and efficient join processing, leading to
significant query optimization.
6. Discuss how you would manage and query a database containing billions of rows
of sales data across multiple time zones.
Managing and querying a database with billions of rows of sales data across multiple time
zones requires careful planning to ensure scalability, performance, and accuracy. Below is a
detailed discussion on strategies for managing such a database:
1. Database Design
Proper design is the foundation for managing large-scale datasets.
a. Data Partitioning
Partition the sales data to improve performance:
• Horizontal Partitioning: Split data based on region, time_zone, or sales_date. For
example:
o Data for each time zone can be stored in separate partitions.
• Range Partitioning: Split data by date ranges (e.g., monthly or quarterly).
• Hash Partitioning: Distribute data evenly across partitions based on a hash of
transaction_id.
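As a minimal sketch of what range partitioning might look like in MySQL (the table layout and partition boundaries here are illustrative, not a prescribed schema):
-- Sketch: range-partition sales by quarter so date-bounded queries
-- scan only the relevant partitions (partition pruning)
CREATE TABLE sales_data_partitioned (
    transaction_id BIGINT,
    store_id INT,
    sales_date DATETIME,
    sales_amount DECIMAL(12, 2)
)
PARTITION BY RANGE (TO_DAYS(sales_date)) (
    PARTITION p2024_q3 VALUES LESS THAN (TO_DAYS('2024-10-01')),
    PARTITION p2024_q4 VALUES LESS THAN (TO_DAYS('2025-01-01')),
    PARTITION pmax VALUES LESS THAN MAXVALUE
);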
b. Indexing
Create indexes to optimize query performance:
• Primary Indexes: On unique identifiers like transaction_id.
• Secondary Indexes: On frequently queried columns like sales_date, time_zone, or
store_id.
• Composite Indexes: For common query patterns, such as (sales_date, store_id).
c. Use Proper Data Types
Use efficient data types for storage:
• Use DATETIME or TIMESTAMP with time zone support for date fields.
• Use normalized structures for region, store, and time_zone tables.
2. Storage Optimization
Efficient storage is critical for handling billions of rows.
a. Data Compression
Enable data compression to reduce storage requirements and improve I/O:
• Many databases, like MySQL (InnoDB), PostgreSQL, and others, support table or column
compression.
b. Archiving
Archive older data into cheaper storage (e.g., data from 5+ years ago):
• Use separate databases or file-based storage (e.g., AWS S3, Google Cloud Storage).
c. Data Warehousing
Offload historical data to a data warehouse (e.g., Snowflake, BigQuery) for advanced analytics.
3. Query Optimization
Design queries to handle large volumes of data efficiently.
a. Time Zone Handling
Store all DATETIME values in UTC and convert to local time zones during queries:
-- Convert UTC to specific time zone (e.g., PST)
SELECT transaction_id,
CONVERT_TZ(sales_date, 'UTC', 'America/Los_Angeles') AS local_sales_date
FROM sales_data
WHERE CONVERT_TZ(sales_date, 'UTC', 'America/Los_Angeles') BETWEEN '2024-11-01' AND '2024-11-15';
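One caution with this pattern: wrapping sales_date in CONVERT_TZ inside the WHERE clause runs the function on every row and defeats any index on sales_date. Converting the boundaries instead keeps the filter index-friendly; a sketch (CONVERT_TZ with named zones requires MySQL's time zone tables to be loaded):
-- Convert the *boundaries* to UTC so the stored UTC column can use its index
SELECT transaction_id,
    CONVERT_TZ(sales_date, 'UTC', 'America/Los_Angeles') AS local_sales_date
FROM sales_data
WHERE sales_date >= CONVERT_TZ('2024-11-01 00:00:00', 'America/Los_Angeles', 'UTC')
  AND sales_date < CONVERT_TZ('2024-11-16 00:00:00', 'America/Los_Angeles', 'UTC');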
b. Aggregate Data
Use pre-aggregated data for common queries to reduce computation:
• Create materialized views or summary tables (support and syntax vary by engine; the statement below mixes PostgreSQL's MATERIALIZED VIEW with MySQL's DATE_FORMAT for illustration):
CREATE MATERIALIZED VIEW monthly_sales_summary AS
SELECT store_id,
DATE_FORMAT(sales_date, '%Y-%m') AS sales_month,
SUM(sales_amount) AS total_sales
FROM sales_data
GROUP BY store_id, sales_month;
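MySQL itself has no CREATE MATERIALIZED VIEW; a plain summary table rebuilt (or incrementally updated) by a scheduled job plays the same role. A minimal sketch:
-- MySQL substitute for a materialized view: a summary table
-- refreshed on a schedule (cron, an EVENT, or the ETL pipeline)
CREATE TABLE monthly_sales_summary AS
SELECT store_id,
    DATE_FORMAT(sales_date, '%Y-%m') AS sales_month,
    SUM(sales_amount) AS total_sales
FROM sales_data
GROUP BY store_id, sales_month;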
c. Query Execution Plans
Regularly analyze execution plans to identify bottlenecks:
EXPLAIN SELECT * FROM sales_data WHERE sales_date = '2024-11-01';
d. Parallel Query Execution
Use database engines that support parallel processing (e.g., PostgreSQL, Redshift) to handle
large queries.
4. Scalability
Scale the database as data grows.
a. Horizontal Scaling
• Sharding: Distribute data across multiple servers (e.g., by region or time_zone).
• Use sharded databases like MongoDB or CockroachDB for horizontal scalability.
b. Vertical Scaling
• Upgrade hardware resources (e.g., CPU, memory, SSDs) for better performance.
c. Distributed Databases
• Use distributed systems like Google Bigtable, Cassandra, or Amazon DynamoDB for
high availability and scalability.
5. Tools for Managing Large Databases
• ETL Tools: Use Apache Spark, Talend, or Airflow for efficient data processing and
loading.
• Database Monitoring: Use tools like Percona Monitoring and Management (PMM) or
AWS CloudWatch.
• Data Analytics Platforms: Use tools like Tableau or Power BI for visualization.
6. Real-World Example
Suppose you need to analyze sales trends in the last quarter for North America across multiple
time zones:
1. Write an optimized query that filters data by region, aggregates sales, and accounts for
time zones:
SELECT region,
CONVERT_TZ(sales_date, 'UTC', 'America/New_York') AS local_sales_date,
SUM(sales_amount) AS total_sales
FROM sales_data
WHERE region = 'North America'
AND sales_date >= '2024-07-01' AND sales_date <= '2024-09-30'
GROUP BY region, local_sales_date;
2. Use a materialized view for pre-aggregated results to speed up repeated analysis.
7. Conclusion
Managing and querying billions of rows of sales data requires:
• Efficient database design (partitioning, indexing).
• Storage optimization (compression, archiving).
• Scalability solutions (sharding, distributed databases).
• Query best practices (use of execution plans, materialized views).
By following these strategies, you can ensure high performance, scalability, and accuracy in
handling large datasets across multiple time zones.
7. In the case of seasonal promotions, how would you design an SQL query to
measure the effectiveness of discounts on specific product categories?
To measure the effectiveness of discounts on specific product categories during seasonal
promotions, you can design an SQL query that compares sales performance metrics such as
total sales, units sold, or average revenue per product before and during the promotion period.
Here's a structured approach:
Step 1: Define the Key Metrics
1. Baseline Performance:
o Sales performance of the product category before the promotion.
2. Promotion Performance:
o Sales performance of the product category during the promotion.
3. Effectiveness Metrics:
o Percentage change in sales.
o Increase in units sold.
o Average revenue per product.
Step 2: Create and Insert Sample Data
Sample Tables
• sales_data (Tracks transactions):
o transaction_id, product_id, category, sale_date, units_sold, sale_amount,
discount_applied.
• products (Tracks product details):
o product_id, product_name, category.
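The inserts below assume these two tables already exist. A minimal DDL sketch (column types are assumptions consistent with the sample values):
-- Assumed table definitions for the sample data
CREATE TABLE sales_data (
    transaction_id INT,
    product_id INT,
    category VARCHAR(100),
    sale_date DATE,
    units_sold INT,
    sale_amount DECIMAL(10, 2),
    discount_applied DECIMAL(10, 2) -- 0 means no discount
);

CREATE TABLE products (
    product_id INT,
    product_name VARCHAR(100),
    category VARCHAR(100)
);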
Sample Data Insertion
-- Sample sales_data
INSERT INTO sales_data (transaction_id, product_id, category, sale_date, units_sold,
sale_amount, discount_applied)
VALUES
(1, 101, 'Electronics', '2024-11-01', 2, 200, 0),
(2, 102, 'Electronics', '2024-11-02', 1, 150, 10),
(3, 103, 'Clothing', '2024-11-10', 3, 75, 20),
(4, 101, 'Electronics', '2024-11-20', 5, 400, 50);
-- Sample products
INSERT INTO products (product_id, product_name, category)
VALUES
(101, 'Smartphone', 'Electronics'),
(102, 'Laptop', 'Electronics'),
(103, 'Jacket', 'Clothing');
Step 3: SQL Query to Measure Effectiveness
The query should:
1. Compare sales and units sold before and during the promotion.
2. Calculate percentage changes in these metrics.
SQL Query
WITH baseline AS (
SELECT
category,
SUM(units_sold) AS total_units_before,
SUM(sale_amount) AS total_sales_before
FROM
sales_data
WHERE
sale_date < '2024-11-15' -- before the promotion window
GROUP BY
category
),
promotion AS (
SELECT
category,
SUM(units_sold) AS total_units_during,
SUM(sale_amount) AS total_sales_during
FROM
sales_data
WHERE
sale_date >= '2024-11-15' -- during the promotion window
GROUP BY
category
)
SELECT
p.category,
b.total_units_before,
p.total_units_during,
ROUND(((p.total_units_during - b.total_units_before) / b.total_units_before) * 100, 2) AS units_change_percentage,
b.total_sales_before,
p.total_sales_during,
ROUND(((p.total_sales_during - b.total_sales_before) / b.total_sales_before) * 100, 2) AS sales_change_percentage
FROM
baseline b
JOIN
promotion p ON b.category = p.category;
Explanation
1. Baseline CTE:
o Calculates total units sold and sales amount before the promotion window (sales dated before 2024-11-15).
2. Promotion CTE:
o Calculates total units sold and sales amount during the promotion window (sales dated on or after 2024-11-15).
3. Final Query:
3. Final Query:
o Joins the baseline and promotion data by category.
o Computes the percentage change in sales and units sold:
▪ units_change_percentage: Percentage change in units sold.
▪ sales_change_percentage: Percentage change in total sales.
Step 4: Output
category | total_units_before | total_units_during | units_change_percentage | total_sales_before | total_sales_during | sales_change_percentage
Electronics | 3 | 5 | 66.67 | 350 | 400 | 14.29
(Clothing has no promotion-period rows in the sample, so the inner join excludes it.)
Insights from the Query
1. Units Change:
o Electronics saw a 66.67% increase in units sold during the promotion.
2. Sales Change:
o Electronics achieved a 14.29% increase in sales revenue.
3. Missing Categories:
o Clothing recorded no sales at all during the promotion window, so the inner join drops it; the LEFT JOIN sketch below keeps such categories visible.
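A sketch of that LEFT JOIN variant, so categories with no promotion-period sales still appear (with NULLs) instead of vanishing from the report:
-- Same CTEs as above, but a LEFT JOIN preserves categories
-- that sold nothing during the promotion window
WITH baseline AS (
    SELECT category,
        SUM(units_sold) AS total_units_before,
        SUM(sale_amount) AS total_sales_before
    FROM sales_data
    WHERE sale_date < '2024-11-15'
    GROUP BY category
),
promotion AS (
    SELECT category,
        SUM(units_sold) AS total_units_during,
        SUM(sale_amount) AS total_sales_during
    FROM sales_data
    WHERE sale_date >= '2024-11-15'
    GROUP BY category
)
SELECT b.category,
    b.total_units_before,
    p.total_units_during, -- NULL when the category had no promotion sales
    b.total_sales_before,
    p.total_sales_during
FROM baseline b
LEFT JOIN promotion p ON b.category = p.category;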
Step 5: Additional Considerations
1. Time Zone Adjustments:
o Use CONVERT_TZ(sale_date, 'UTC', 'local_time_zone') if sales occur across
multiple time zones.
2. Product Segmentation:
o Analyze individual product-level performance using product_id.
3. Promotion ROI:
o Include the discount percentage to calculate the return on investment (ROI).
This approach provides actionable insights into the effectiveness of discounts, allowing for
data-driven promotional strategies.
8. Explain the difference between OLTP and OLAP databases, and provide
examples of how Target might use each for its operations.
The key difference between OLTP (Online Transaction Processing) and OLAP (Online Analytical
Processing) databases lies in their purpose, structure, and usage.
OLTP (Online Transaction Processing)
Purpose
• Designed to handle day-to-day transactional data.
• Optimized for real-time, quick insert, update, and delete operations.
Characteristics
1. Data Model: Normalized to reduce redundancy and ensure efficient storage.
2. Operations: High frequency of small, atomic transactions (e.g., single product
purchases).
3. Concurrency: Supports multiple users performing transactions simultaneously.
4. Data Volume: Holds current transactional data with limited historical data.
Examples of Use in Target
• Point-of-Sale System: Processes transactions when customers purchase items at
checkout, updating inventory levels.
• Order Management: Handles customer orders and tracks their fulfillment status.
• Inventory Management: Updates inventory quantities in real-time as products are sold
or restocked.
Example Technology
• MySQL, PostgreSQL, SQL Server (Transactional mode).
OLAP (Online Analytical Processing)
Purpose
• Designed for complex queries and analysis of historical data.
• Optimized for read-heavy workloads and data aggregation.
Characteristics
1. Data Model: Denormalized, often stored in a star or snowflake schema for faster
querying.
2. Operations: Infrequent, complex queries involving joins and aggregations.
3. Concurrency: Fewer users performing analytical queries, often by data analysts or
business intelligence tools.
4. Data Volume: Large volumes of historical data, often integrated from multiple sources.
Examples of Use in Target
• Sales Analysis: Analyzing historical sales data to identify trends, seasonal patterns,
and top-selling products.
• Customer Insights: Segmenting customers based on purchase history to tailor
marketing campaigns.
• Supply Chain Optimization: Evaluating supplier performance and optimizing inventory
replenishment.
• Profitability Analysis: Comparing sales and operational costs across regions or stores
to assess profitability.
Example Technology
• Snowflake, Google BigQuery, Amazon Redshift, SQL Server (Analytical mode).
Comparison Table
Feature OLTP OLAP
Purpose Day-to-day transactions Analytical and reporting
Denormalized (Star/Snowflake
Data Model Normalized (3NF)
schema)
Simple, single-row queries (INSERT, Complex, multi-row queries
Query Types
UPDATE) (aggregations)
Performance Optimized for transaction speed Optimized for query performance
Feature OLTP OLAP
Concurrency High concurrency Low concurrency
Example Use
Real-time inventory update Monthly sales trend analysis
Case
How Target Might Use Each
1. OLTP Example:
o Target's checkout systems use an OLTP database to:
▪ Record each transaction.
▪ Update inventory levels.
▪ Process customer payments in real time.
2. OLAP Example:
o Target's corporate analytics team uses an OLAP database to:
▪ Generate a report comparing Black Friday sales across years.
▪ Identify the most popular products in different regions during the holiday
season.
Conclusion
Target would rely on OLTP systems for operational efficiency and OLAP systems for strategic
decision-making. Together, they create a robust ecosystem that supports both transactional
integrity and insightful analytics.