Course Title: Fundamentals of Data Engineering
Program: Associate in Computer Technology
Course Duration: 14 weeks (1 semester)
Credits: 3 units
Course Description
This course introduces the foundational concepts, tools, and practices of data
engineering. Students will learn how to design, build, and maintain data pipelines and
systems that prepare raw data for analysis. Emphasis is on practical skills, including
data ingestion, storage, transformation, data modeling, and querying using SQL and
Python. The course also covers cloud data platforms and real-world use cases,
preparing students for careers in data engineering and related fields.
Course Objectives
By the end of this course, students will be able to:
● Understand the data engineering lifecycle and its role in data-driven
organizations
● Design and implement data pipelines for batch and stream processing
● Use relational and NoSQL databases and apply data modeling techniques
● Perform ETL/ELT processes and automate workflows
● Query and transform data using SQL and Python
● Understand cloud data engineering fundamentals and tools
● Apply best practices in data quality, security, and pipeline maintenance
Weekly Topics & Activities
Week | Topics | Hands-on / Activities
1 | Introduction to Data Engineering: roles, lifecycle, and tools | Set up Python environment and a cloud/local VM
2 | Data Sources and Ingestion: batch vs. streaming | Build simple data ingestion pipelines
3 | Relational Databases & SQL Basics: schema design, normalization | Design schemas and write SQL queries
4 | NoSQL Databases: document, key-value, and columnar stores | Install and experiment with MongoDB
5 | Data Modeling: conceptual, logical, physical | Create ER diagrams and physical schemas
6 | ETL vs. ELT Processes and Tools (Apache Airflow, Kafka) | Build an ETL pipeline with Python and Airflow
7 | Data Warehousing and Data Lakes: architecture and use cases | Design a simple data warehouse schema
8 | Data Quality, Validation, and Security Best Practices | Implement data validation checks
9 | Batch Processing with Hadoop and Spark Basics | Run batch jobs on sample datasets
10 | Stream Processing Fundamentals | Build a basic streaming pipeline
11 | Cloud Data Engineering Essentials (AWS/GCP/Azure) | Deploy data pipelines on a cloud platform
12 | Data Pipeline Orchestration and Monitoring | Automate workflows and monitor pipelines
13 | Data Visualization and Reporting Basics | Create dashboards using BI tools
14 | Final Project Presentations and Course Review | Present an end-to-end data engineering project
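The short code sketches below illustrate several of the weekly hands-on activities (Weeks 3, 4, 6, 8, 9, and 10). All are minimal Python examples; any file names, schemas, topics, and connection strings are illustrative assumptions, not required course infrastructure.

For Week 3, a sketch of schema design and SQL querying, using the standard-library sqlite3 module so it runs without a database server; the tables and data are hypothetical:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces FKs when asked

# Two normalized tables: customers and their orders.
cur.executescript("""
CREATE TABLE customers (
    customer_id INTEGER PRIMARY KEY,
    name        TEXT NOT NULL
);
CREATE TABLE orders (
    order_id    INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL REFERENCES customers(customer_id),
    amount      REAL NOT NULL
);
""")

cur.executemany("INSERT INTO customers VALUES (?, ?)",
                [(1, "Ana"), (2, "Ben")])
cur.executemany("INSERT INTO orders VALUES (?, ?, ?)",
                [(10, 1, 19.99), (11, 1, 5.00), (12, 2, 42.50)])

# A join with aggregation: total order amount per customer.
for row in cur.execute("""
    SELECT c.name, SUM(o.amount) AS total
    FROM customers c JOIN orders o ON o.customer_id = c.customer_id
    GROUP BY c.name
    ORDER BY total DESC
"""):
    print(row)  # ('Ben', 42.5) then ('Ana', 24.99)

conn.close()
```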
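For Week 4, a sketch of document-store basics, assuming a local MongoDB server and the pymongo package; the database, collection, and documents are hypothetical:

```python
from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017/")
collection = client["course_demo"]["events"]

# Documents need no fixed schema: fields can vary per document.
collection.insert_many([
    {"user": "ana", "action": "login"},
    {"user": "ben", "action": "purchase", "amount": 42.5},
])

# Query by field value, analogous to a WHERE clause in SQL.
for doc in collection.find({"action": "purchase"}):
    print(doc["user"], doc.get("amount"))
```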
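For Week 6, a sketch of a three-task ETL DAG, assuming Apache Airflow 2.4+ (older 2.x releases use schedule_interval= instead of schedule=); the task bodies are stand-ins for real extract, transform, and load logic:

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


def extract():
    # In a real pipeline this would pull from an API, file, or database.
    return [{"id": 1, "value": " 42 "}, {"id": 2, "value": "7"}]


def transform(ti):
    # Pull the upstream task's return value from XCom and clean it.
    rows = ti.xcom_pull(task_ids="extract")
    return [{**r, "value": int(r["value"].strip())} for r in rows]


def load(ti):
    rows = ti.xcom_pull(task_ids="transform")
    print(f"Loading {len(rows)} rows")  # stand-in for a database write


with DAG(
    dag_id="simple_etl",
    start_date=datetime(2024, 1, 1),
    schedule="@daily",
    catchup=False,
) as dag:
    e = PythonOperator(task_id="extract", python_callable=extract)
    t = PythonOperator(task_id="transform", python_callable=transform)
    l = PythonOperator(task_id="load", python_callable=load)
    e >> t >> l  # extract, then transform, then load
```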
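For Week 8, a sketch of row-level validation checks run before loading; the rules and field names are hypothetical:

```python
def validate_row(row):
    """Return a list of problems found in one record (empty list = valid)."""
    problems = []
    if not row.get("id"):
        problems.append("missing id")
    if not isinstance(row.get("amount"), (int, float)) or row["amount"] < 0:
        problems.append("amount must be a non-negative number")
    if "@" not in str(row.get("email", "")):
        problems.append("email looks malformed")
    return problems


rows = [
    {"id": 1, "amount": 19.99, "email": "ana@example.com"},
    {"id": None, "amount": -5, "email": "not-an-email"},
]

valid, rejected = [], []
for r in rows:
    problems = validate_row(r)
    if problems:
        rejected.append((r, problems))  # quarantine bad rows with reasons
    else:
        valid.append(r)

print(len(valid), "valid,", len(rejected), "rejected")
```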
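For Week 9, a sketch of a small batch job, assuming PySpark is installed; the CSV path and column names are hypothetical:

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("batch-demo").getOrCreate()

# Read a sample dataset and compute a simple aggregate in one batch pass.
df = spark.read.csv("sales.csv", header=True, inferSchema=True)
totals = df.groupBy("region").agg(F.sum("amount").alias("total_amount"))
totals.show()

spark.stop()
```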
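For Week 10, a sketch of a minimal streaming consumer, assuming a local Kafka broker and the kafka-python package; the topic name and message fields are hypothetical:

```python
import json

from kafka import KafkaConsumer

consumer = KafkaConsumer(
    "events",  # hypothetical topic
    bootstrap_servers="localhost:9092",
    value_deserializer=lambda b: json.loads(b.decode("utf-8")),
    auto_offset_reset="earliest",
)

# Process records as they arrive: the simplest form of stream processing.
for message in consumer:
    event = message.value
    print(event.get("user"), event.get("action"))
```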
Assessment
● Weekly quizzes and exercises (30%)
● Hands-on lab assignments (30%)
● Midterm exam (15%)
● Final project (25%)
Recommended Tools & Technologies
● Programming: Python, SQL
● Databases: PostgreSQL, MongoDB
● ETL/Workflow: Apache Airflow, Apache Kafka
● Big Data: Hadoop, Spark
● Cloud Platforms: AWS, GCP, or Azure
● Visualization: Power BI, Tableau, or similar
References
● Fundamentals of Data Engineering guides and tutorials
● University syllabi on Data Engineering
● Industry best practices and tools overview