Machine Learning Roadmap (From Start to Advanced)
- Introduction to Machine Learning and its types
- Setting up Python environment for ML (Jupyter, scikit-learn, PyTorch, TensorFlow)
- NumPy recap: vectors, matrices, broadcasting
- Pandas recap: Series, DataFrames, groupby, joins
- Basic Linear Algebra for ML (dot product, matrix multiplication)
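The linear-algebra and broadcasting topics above can be tried directly in NumPy; a minimal sketch with illustrative arrays:

```python
import numpy as np

v = np.array([1.0, 2.0, 3.0])
w = np.array([4.0, 5.0, 6.0])

# Dot product: sum of elementwise products
dp = v @ w  # 1*4 + 2*5 + 3*6 = 32.0

# Matrix multiplication: (2x3) @ (3x2) -> (2x2)
A = np.arange(6).reshape(2, 3)
B = np.arange(6).reshape(3, 2)
C = A @ B

# Broadcasting: the row vector is applied to each row of A
shifted = A + np.array([10, 20, 30])

print(dp)          # 32.0
print(C.shape)     # (2, 2)
print(shifted[0])  # [10 21 32]
```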
- Eigenvalues and Eigenvectors intuition
- Calculus for ML: derivatives and gradients
- Partial derivatives and gradient vectors
- Probability basics: random variables, distributions
- Bayes' theorem and conditional probability
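A classic way to build intuition for Bayes' theorem is the medical-test example; the prevalence and test rates below are hypothetical numbers chosen for illustration:

```python
# Bayes' theorem: P(A|B) = P(B|A) * P(A) / P(B)
# Hypothetical numbers: 1% prevalence, 99% sensitivity, 5% false-positive rate.
p_disease = 0.01
p_pos_given_disease = 0.99
p_pos_given_healthy = 0.05

# Total probability of a positive test (law of total probability)
p_pos = p_pos_given_disease * p_disease + p_pos_given_healthy * (1 - p_disease)

# Posterior: probability of disease given a positive test
p_disease_given_pos = p_pos_given_disease * p_disease / p_pos
print(round(p_disease_given_pos, 3))  # ~0.167
```

Despite the accurate-sounding test, the posterior is only about 17%, because the disease is rare to begin with.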
- PROJECT: Implement Linear Regression from scratch using NumPy
- Statistics recap: mean, variance, standard deviation
- Hypothesis testing and p-values
- Introduction to datasets: features, labels, training/test split
- Bias-variance tradeoff
- Overfitting and underfitting
- Gradient descent algorithm intuition
- Implementing gradient descent from scratch in Python
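The gradient-descent loop above can be sketched in a few lines of NumPy, here fitting y = 2x + 1 by minimizing mean squared error (the toy data, learning rate, and iteration count are illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, size=100)
y = 2.0 * x + 1.0  # noiseless target: slope 2, intercept 1

w, b = 0.0, 0.0  # parameters to learn
lr = 0.1         # learning rate

for _ in range(500):
    y_hat = w * x + b
    error = y_hat - y
    # Gradients of MSE = mean((y_hat - y)^2) with respect to w and b
    grad_w = 2.0 * np.mean(error * x)
    grad_b = 2.0 * np.mean(error)
    w -= lr * grad_w
    b -= lr * grad_b

print(round(w, 2), round(b, 2))  # converges toward 2.0 and 1.0
```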
- Linear regression model theory
- Linear regression implementation (scikit-learn)
- PROJECT: House Price Prediction using Linear Regression
- Logistic regression model theory
- Logistic regression implementation (classification example)
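A minimal classification sketch with scikit-learn's `LogisticRegression`, using synthetic data for illustration:

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic binary-classification data for illustration
X, y = make_classification(n_samples=200, n_features=4, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42
)

clf = LogisticRegression()
clf.fit(X_train, y_train)

# score() returns accuracy; predict_proba() gives class probabilities
print(round(clf.score(X_test, y_test), 3))
```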
- k-Nearest Neighbors algorithm
- Decision Trees algorithm
- Random Forest algorithm
- Naive Bayes classifier
- Support Vector Machines (SVM)
- k-Means clustering
- Hierarchical clustering
- PROJECT: Titanic Survival Prediction (classification)
- DBSCAN clustering
- PCA (Principal Component Analysis)
- t-SNE and UMAP for visualization
- Train-test split and cross-validation
- Performance metrics: accuracy, precision, recall, F1-score
- ROC curve and AUC
- Hyperparameter tuning: GridSearchCV and RandomizedSearchCV
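A small `GridSearchCV` sketch on the built-in iris dataset; the parameter grid here is just an illustrative choice:

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)

# Exhaustively try every combination in param_grid with 5-fold CV
param_grid = {"C": [0.1, 1, 10], "kernel": ["linear", "rbf"]}
search = GridSearchCV(SVC(), param_grid, cv=5)
search.fit(X, y)

print(search.best_params_)
print(round(search.best_score_, 3))
```

`RandomizedSearchCV` has the same interface but samples a fixed number of combinations instead of trying them all, which scales better to large grids.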
- Feature scaling: normalization and standardization
- Handling missing values in datasets
- Encoding categorical variables (one-hot, label encoding)
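The two encoding schemes above, sketched with pandas on a toy column:

```python
import pandas as pd

df = pd.DataFrame({"color": ["red", "green", "blue", "green"]})

# One-hot encoding: one binary column per category
one_hot = pd.get_dummies(df, columns=["color"])
print(one_hot.columns.tolist())  # ['color_blue', 'color_green', 'color_red']

# Label encoding: one integer code per category (the ordering is arbitrary,
# so this can mislead models that treat the codes as ordered)
df["color_code"] = df["color"].astype("category").cat.codes
print(df["color_code"].tolist())  # [2, 1, 0, 1]
```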
- PROJECT: Customer Segmentation with Clustering
- Feature selection techniques (filter, wrapper, embedded)
- Building pipelines in scikit-learn
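A minimal scikit-learn `Pipeline` sketch chaining scaling and a model; keeping preprocessing inside the pipeline matters because it prevents test-set statistics from leaking into the scaler during cross-validation:

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_iris(return_X_y=True)

# Each step is a (name, estimator) pair; fit() runs them in order
pipe = Pipeline([
    ("scale", StandardScaler()),
    ("model", LogisticRegression(max_iter=1000)),
])
pipe.fit(X, y)
print(round(pipe.score(X, y), 3))
```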
- Ensemble learning: bagging vs boosting
- Gradient Boosting intuition
- XGBoost hands-on
- LightGBM hands-on
- CatBoost hands-on
- PROJECT: Kaggle competition with XGBoost
- Neural networks basics: perceptron model
- Activation functions (sigmoid, ReLU, tanh, softmax)
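The activation functions listed above are one-liners in NumPy; a quick sketch:

```python
import numpy as np

def sigmoid(z):
    # Squashes any real number into (0, 1)
    return 1.0 / (1.0 + np.exp(-z))

def relu(z):
    # Zero for negative inputs, identity for positive ones
    return np.maximum(0.0, z)

def softmax(z):
    # Subtract the max for numerical stability; output sums to 1
    e = np.exp(z - np.max(z))
    return e / e.sum()

z = np.array([-1.0, 0.0, 2.0])
print(sigmoid(0.0))      # 0.5
print(relu(z))           # [0. 0. 2.]
print(np.tanh(0.0))      # 0.0
print(softmax(z).sum())  # 1.0
```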
- Forward propagation explained
- Backpropagation explained
- Building a neural network from scratch (NumPy)
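A compact from-scratch sketch tying forward propagation and backpropagation together: a one-hidden-layer network trained on XOR (which a single perceptron cannot learn). The hidden size, learning rate, and iteration count are illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(0)

# XOR is not linearly separable, so a hidden layer is required
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

W1 = rng.normal(0, 1, (2, 8)); b1 = np.zeros((1, 8))
W2 = rng.normal(0, 1, (8, 1)); b2 = np.zeros((1, 1))
lr = 0.5

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

for _ in range(5000):
    # Forward pass
    h = sigmoid(X @ W1 + b1)
    out = sigmoid(h @ W2 + b2)
    # Backward pass: with binary cross-entropy loss, the gradient at the
    # sigmoid output simplifies to (out - y)
    d_out = out - y
    d_W2 = h.T @ d_out
    d_h = (d_out @ W2.T) * h * (1 - h)  # chain rule through sigmoid
    d_W1 = X.T @ d_h
    W2 -= lr * d_W2; b2 -= lr * d_out.sum(0, keepdims=True)
    W1 -= lr * d_W1; b1 -= lr * d_h.sum(0, keepdims=True)

print((out > 0.5).astype(int).ravel())  # should converge to [0 1 1 0]
```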
- Introduction to PyTorch
- Training a simple MLP classifier in PyTorch
- Introduction to TensorFlow and Keras
- Convolutional Neural Networks (CNN) basics
- PROJECT: Handwritten Digit Classification (MNIST)
- Convolution and pooling operations
- Dropout and Batch Normalization
- Image classification with CNNs in PyTorch
- Transfer learning with pretrained CNNs (ResNet, VGG)
- Text preprocessing: tokenization, stemming, lemmatization
- Bag-of-Words and TF-IDF representations
- Word embeddings (Word2Vec, GloVe)
- Recurrent Neural Networks (RNN) basics
- LSTMs and GRUs
- PROJECT: Sentiment Analysis on Movie Reviews
- Attention mechanism explained
- Transformers architecture basics
- BERT and GPT overview
- Autoencoders explained
- Variational Autoencoders (VAE)
- Generative Adversarial Networks (GANs) basics
- Implementing a simple GAN in PyTorch
- PROJECT: Image Generation with GANs
- Reinforcement learning basics: agents, environments, rewards
- Q-Learning algorithm explained
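Tabular Q-learning fits in a few lines; here is a sketch on a made-up 1-D gridworld (five states, reward at the rightmost one), with hyperparameters chosen for illustration:

```python
import numpy as np

# Tiny 1-D gridworld: states 0..4, actions 0=left, 1=right,
# reward 1 for reaching state 4 (which ends the episode).
n_states, n_actions = 5, 2
Q = np.zeros((n_states, n_actions))
alpha, gamma, eps = 0.5, 0.9, 0.1
rng = np.random.default_rng(0)

def step(s, a):
    s2 = max(0, s - 1) if a == 0 else min(n_states - 1, s + 1)
    reward = 1.0 if s2 == n_states - 1 else 0.0
    return s2, reward, s2 == n_states - 1

for _ in range(500):
    s, done = 0, False
    while not done:
        # Epsilon-greedy: mostly exploit, sometimes explore
        a = rng.integers(n_actions) if rng.random() < eps else int(np.argmax(Q[s]))
        s2, r, done = step(s, a)
        # Q-learning update: bootstrap from the best next-state action
        Q[s, a] += alpha * (r + gamma * np.max(Q[s2]) - Q[s, a])
        s = s2

print(np.argmax(Q, axis=1)[:4])  # learned greedy policy: move right everywhere
```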
- Policy gradient methods explained
- Deploying ML models with Flask
- Deploying ML models with FastAPI
- Introduction to Docker for ML deployment
- PROJECT: Deploy Sentiment Analysis API
- Model monitoring and retraining strategies
- Basics of MLOps (CI/CD pipelines for ML)
- Reading ML research papers effectively
- Introduction to Large Language Models (LLMs)
- Prompt engineering basics
- Fine-tuning a transformer model on a custom dataset
- PROJECT: Fine-tune a Transformer Model on Custom Text