Data Testing and ETL Testing Real-time Scenarios with SQL queries:
Data Testing Scenarios
1. Data Integrity Checks:
○ Scenario: Verify that all records in a table have unique primary keys and foreign keys
reference existing records in the related table.
○ SQL Query:
SQL
-- Check for duplicate primary keys
SELECT column_name
FROM table_name
GROUP BY column_name
HAVING COUNT(*) > 1;
-- Check for foreign key constraints
SELECT table_name, constraint_name
FROM user_constraints
WHERE constraint_type = 'R';
2. Data Accuracy Checks:
○ Scenario: Ensure that data in specific columns adheres to predefined business rules
(e.g., date format, valid email addresses, numeric ranges).
○ SQL Query:
SQL
-- Check for valid email addresses
SELECT *
FROM table_name
WHERE NOT REGEXP_LIKE(email_column, '[a-zA-Z0-9._%+-]+@[a-zA-Z0-
9.-]+\.[a-zA-Z]{2,4}');
-- Check for date format
SELECT *
FROM table_name
WHERE NOT REGEXP_LIKE(date_column, '^[0-9]{4}-(0[1-9]|1[0-2])-
(0[1-9]|[1-2][0-9]|3[0-1])$');
3. Data Completeness Checks:
○ Scenario: Verify that all mandatory fields in a table are populated with valid values.
○ SQL Query:
SQL
-- Check for null values in mandatory columns
SELECT *
FROM table_name
WHERE column1 IS NULL OR column2 IS NULL;
4. Data Consistency Checks:
○ Scenario: Ensure that data across different tables or within the same table is consistent
(e.g., calculated fields match expected values, derived values are accurate).
○ SQL Query:
SQL
-- Check if calculated field matches expected value
SELECT *
FROM table_name
WHERE calculated_field != (SELECT expected_calculation FROM
expected_values);
-- Check for data consistency across tables
SELECT t1.*, t2.*
FROM table1 t1
JOIN table2 t2 ON t1.id = t2.id
WHERE t1.value != t2.value;
ETL Testing Scenarios
1. Data Volume Checks:
○ Scenario: Verify that the number of records extracted, transformed, and loaded matches
the expected count.
○ SQL Query:
SQL
-- Count records in source table
SELECT COUNT(*) FROM source_table;
-- Count records in target table
SELECT COUNT(*) FROM target_table;
2. Data Transformation Checks:
○ Scenario: Ensure that data transformations (e.g., data cleansing, data type conversions,
data aggregations) are performed correctly.
○ SQL Query:
SQL
-- Compare transformed data with expected values
SELECT *
FROM target_table
WHERE transformed_field != expected_value;
3. Data Integrity Checks (ETL):
○ Scenario: Verify that data integrity constraints (e.g., unique keys, foreign keys) are
maintained during the ETL process.
○ SQL Query: (Similar to Data Integrity Checks above)
4. Data Quality Checks (ETL):
○ Scenario: Verify that data quality rules (e.g., data accuracy, completeness, consistency)
are enforced during the ETL process.
○ SQL Query: (Similar to Data Quality Checks above)
Important Notes:
● These are just examples, and the specific scenarios and SQL queries will vary depending on
your business requirements and data environment.
● Consider using automated testing tools and frameworks to streamline your data testing and
ETL testing processes.
● Regularly review and update your test cases to ensure they remain relevant and effective.
I hope this helps! Let me know if you have any further questions.