6-Month Data Engineering Study Plan
Month 1: Python & SQL Foundations
- Week 1-2: Python Refresher - loops, functions, OOP, file I/O
- Week 3-4: SQL - SELECT, JOIN, GROUP BY, window functions
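A possible practice exercise that combines both halves of this month: a minimal sketch that runs a window function from Python using the standard-library sqlite3 module. The orders table, its columns, and the sample rows are hypothetical, and the sketch assumes the bundled SQLite is version 3.25 or newer (the first release with window functions).

```python
import sqlite3


def running_totals(rows):
    """Return (customer, amount, running_total) rows via a SQL window function."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical practice table; not part of any real dataset.
    conn.execute(
        "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO orders (customer, amount) VALUES (?, ?)", rows)
    query = """
        SELECT customer,
               amount,
               SUM(amount) OVER (PARTITION BY customer ORDER BY id) AS running_total
        FROM orders
        ORDER BY customer, id
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result


# Made-up sample data for illustration only.
sample_orders = [("ana", 10.0), ("bo", 5.0), ("ana", 7.5), ("bo", 2.5)]

for row in running_totals(sample_orders):
    print(row)
```

Unlike GROUP BY, the window function keeps one output row per order while still computing a per-customer aggregate, which is the distinction worth practicing in weeks 3-4.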
Month 2: Data Modeling & Warehousing
- Week 5-6: Database normalization, schemas, indexes
- Week 7-8: Dimensional modeling, star/snowflake schemas
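To make the star schema idea concrete, here is a small sketch under stated assumptions: one fact table joined to two dimension tables in an in-memory SQLite database. All table names, column names, and rows (dim_product, dim_date, fact_sales) are hypothetical examples, not a recommended production design.

```python
import sqlite3


def build_star_schema():
    """Create a tiny, hypothetical star schema: one fact table, two dimensions."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE dim_product (
            product_key INTEGER PRIMARY KEY,
            name TEXT,
            category TEXT
        );
        CREATE TABLE dim_date (
            date_key INTEGER PRIMARY KEY,
            iso_date TEXT,
            month TEXT
        );
        -- The fact table stores measures plus foreign keys to each dimension.
        CREATE TABLE fact_sales (
            product_key INTEGER REFERENCES dim_product(product_key),
            date_key INTEGER REFERENCES dim_date(date_key),
            quantity INTEGER,
            revenue REAL
        );
        INSERT INTO dim_product VALUES (1, 'widget', 'hardware'), (2, 'manual', 'books');
        INSERT INTO dim_date VALUES
            (20240101, '2024-01-01', '2024-01'),
            (20240201, '2024-02-01', '2024-02');
        INSERT INTO fact_sales VALUES
            (1, 20240101, 3, 30.0),
            (2, 20240101, 1, 12.0),
            (1, 20240201, 2, 20.0);
        """
    )
    return conn


def revenue_by_category_and_month(conn):
    """Aggregate the fact table by attributes that live in the dimensions."""
    return conn.execute(
        """
        SELECT p.category, d.month, SUM(f.revenue)
        FROM fact_sales AS f
        JOIN dim_product AS p ON p.product_key = f.product_key
        JOIN dim_date AS d ON d.date_key = f.date_key
        GROUP BY p.category, d.month
        ORDER BY p.category, d.month
        """
    ).fetchall()


conn = build_star_schema()
for row in revenue_by_category_and_month(conn):
    print(row)
```

A snowflake variant would split a dimension further, for example moving category out of dim_product into its own table, at the cost of one more join per query.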
Month 3: ETL & Data Pipelines
- Week 9-10: ETL concepts, batch vs stream processing
- Week 11-12: Tools - Apache Airflow, dbt basics
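Before picking up Airflow or dbt, it may help to write a batch ETL job by hand. The sketch below uses only the standard library and is not Airflow or dbt code; the CSV content, the users table, and the function names are hypothetical. An orchestrator such as Airflow would typically schedule steps like these as separate tasks.

```python
import csv
import io
import sqlite3

# Made-up raw input; the second record is missing its signup date.
RAW_CSV = """user_id,signup_date,country
1,2024-01-05,us
2,,de
3,2024-02-10,US
"""


def extract(text):
    """Extract: parse CSV text into a list of dict records."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(records):
    """Transform: drop records without a signup date and normalize country codes."""
    cleaned = []
    for rec in records:
        if not rec["signup_date"]:
            continue
        cleaned.append(
            (int(rec["user_id"]), rec["signup_date"], rec["country"].upper())
        )
    return cleaned


def load(rows, conn):
    """Load: upsert rows so that re-running the batch does not create duplicates."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS users "
        "(user_id INTEGER PRIMARY KEY, signup_date TEXT, country TEXT)"
    )
    conn.executemany("INSERT OR REPLACE INTO users VALUES (?, ?, ?)", rows)
    conn.commit()


def run_pipeline(text, conn):
    """Run extract, transform, load in order and return the loaded table."""
    load(transform(extract(text)), conn)
    return conn.execute(
        "SELECT user_id, signup_date, country FROM users ORDER BY user_id"
    ).fetchall()


conn = sqlite3.connect(":memory:")
print(run_pipeline(RAW_CSV, conn))
```

Keeping the load step idempotent (here via INSERT OR REPLACE on the primary key) is a design choice that makes retries safe, which matters once a scheduler starts re-running failed batches.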
Month 4: Big Data & Spark
- Week 13-14: Hadoop, distributed systems overview
- Week 15-16: PySpark - RDDs, DataFrames, transformations
Month 5: Cloud Platforms
- Week 17: AWS/GCP basics - S3, Redshift, BigQuery
- Week 18-19: Google Cloud Dataflow / AWS Glue - building cloud ETL jobs
Month 6: Projects & Portfolio
- Week 20-21: Build 1-2 end-to-end pipelines (ETL + Cloud + SQL)
- Week 22-24: Add documentation, GitHub portfolio, resume prep