Roadmap
Week 1: Python Basics
Day 1: Understanding Libraries
Day 2: Data Loading and Manipulation
Day 3: Graphs and Plots
Day 4: Machine Learning Basics
Day 5: Machine Learning Algorithms
Week 2: Machine Learning Tasks
Day 1: Linear Regression
Day 2: Classification
Day 3: Clustering
Day 4: Deep Learning Basics
Day 5: Neural Networks, TensorFlow and Keras
Week 3:
Day 1: Deep Learning Algorithms
Day 2: Convolutional Neural Networks
Day 3: Recurrent Neural Networks
Day 4: NLP
Day 5: GANs
Week 4: Deep Learning Tasks
Python Basics: Pandas, NumPy, Matplotlib
Machine Learning Libraries
Regression
Classification
Deep Learning
Python has emerged as the premier language for machine learning (ML), playing a pivotal role in both
academic research and industry applications. Its popularity stems from a combination of ease of use,
extensive libraries, strong community support, and flexibility. Here, we delve into the reasons why
Python is the language of choice for machine learning, highlighting its key features and benefits.
1. Ease of Learning and Use
Python's syntax is intuitive and readable, making it accessible to both beginners and experienced
developers. Its clean and straightforward structure allows users to focus on understanding machine
learning concepts rather than grappling with complex language syntax. This ease of learning accelerates
the onboarding process for new developers and facilitates rapid prototyping and experimentation, which
are crucial in the fast-paced field of machine learning.
2. Extensive Libraries and Frameworks
Python boasts a rich ecosystem of libraries and frameworks that significantly streamline the
development of machine learning models:
NumPy: Essential for numerical computations, offering support for arrays, matrices, and a collection of
mathematical functions.
Pandas: Provides data structures and data analysis tools, making it easier to handle and preprocess data.
Matplotlib and Seaborn: Powerful libraries for data visualization, allowing for the creation of informative
and aesthetically pleasing plots and graphs.
Scikit-learn: A comprehensive library for traditional machine learning algorithms, including tools for
model selection, preprocessing, and evaluation.
TensorFlow and Keras: Popular libraries for deep learning, offering flexible and efficient tools for building
neural networks.
PyTorch: Another leading deep learning library known for its dynamic computation graph and ease of
use, particularly favored in research.
These libraries and frameworks offer pre-built modules and functions, reducing the need for writing
boilerplate code and enabling developers to focus on refining their models and algorithms.
NUMPY:
Basic Examples of NumPy Library
NumPy (Numerical Python) is a powerful library for numerical computing in Python. It provides support
for arrays, matrices, and a wide range of mathematical functions. Here are some basic examples to help
you get started with NumPy.
1. Installing NumPy
First, ensure you have NumPy installed. You can install it using pip:
sh
pip install numpy
2. Importing NumPy
To use NumPy, you need to import it. It is common to import it with the alias np:
python
import numpy as np
3. Creating Arrays
1D Array:
python
import numpy as np
# Creating a 1D array
array_1d = np.array([1, 2, 3, 4, 5])
print("1D Array:", array_1d)
2D Array:
python
# Creating a 2D array (matrix)
array_2d = np.array([[1, 2, 3], [4, 5, 6]])
print("2D Array:\n", array_2d)
Array of Zeros:
python
# Creating an array of zeros
zeros_array = np.zeros((3, 3))
print("Zeros Array:\n", zeros_array)
Array of Ones:
python
# Creating an array of ones
ones_array = np.ones((2, 4))
print("Ones Array:\n", ones_array)
Array with a Range of Values:
python
# Creating an array with a range of values
range_array = np.arange(0, 10, 2) # Start at 0, end before 10, step by 2
print("Range Array:", range_array)
Array of Linearly Spaced Values:
python
# Creating an array of linearly spaced values
linspace_array = np.linspace(0, 1, 5) # 5 evenly spaced values from 0 to 1, both endpoints included
print("Linearly Spaced Array:", linspace_array)
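The difference between arange and linspace is easy to mix up, so here is a minimal sketch that puts them side by side. The variable names are only illustrative; the endpoint parameter is part of np.linspace.

```python
import numpy as np

# arange takes a step size and always excludes the stop value
stepped = np.arange(0, 10, 2)

# linspace takes a number of points and includes both endpoints by default
spaced = np.linspace(0, 1, 5)

# endpoint=False leaves out the stop value, similar to arange
open_ended = np.linspace(0, 1, 5, endpoint=False)

print("arange:", stepped)                      # values 0, 2, 4, 6, 8
print("linspace:", spaced)                     # values 0, 0.25, 0.5, 0.75, 1
print("linspace without endpoint:", open_ended)  # values 0, 0.2, 0.4, 0.6, 0.8
```

As a rule of thumb, arange fits when you know the step, and linspace fits when you know how many points you want.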
4. Basic Operations
Element-wise Operations:
python
# Adding, subtracting, multiplying, and dividing arrays
a = np.array([1, 2, 3])
b = np.array([4, 5, 6])
print("Addition:", a + b)
print("Subtraction:", a - b)
print("Multiplication:", a * b)
print("Division:", a / b)
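Note that * on arrays multiplies matching elements; it is not matrix multiplication. The short sketch below contrasts it with the dot product and shows broadcasting, where NumPy stretches a scalar or a smaller array to match a larger one. The variable names are illustrative only.

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

# Element-wise product: each element of a times the matching element of b
elementwise = a * b

# Dot product of two 1D arrays: 1*4 + 2*5 + 3*6
dot = a @ b

# Broadcasting a scalar: every element is multiplied by 10
scaled = a * 10

# Broadcasting a 1D array across the rows of a 2D array
matrix = np.array([[1, 2, 3], [4, 5, 6]])
shifted = matrix + a

print("Element-wise:", elementwise)  # 4, 10, 18
print("Dot product:", dot)           # 32
print("Scaled:", scaled)             # 10, 20, 30
print("Shifted rows:\n", shifted)    # rows 2, 4, 6 and 5, 7, 9
```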
Universal Functions (ufuncs):
python
# Applying mathematical functions
a = np.array([1, 2, 3, 4, 5])
print("Square Root:", np.sqrt(a))
print("Exponential:", np.exp(a))
print("Sine:", np.sin(a))
5. Array Indexing and Slicing
Indexing:
python
# Accessing elements
a = np.array([1, 2, 3, 4, 5])
print("First Element:", a[0])
print("Last Element:", a[-1])
Slicing:
python
# Slicing arrays
b = np.array([10, 20, 30, 40, 50])
print("Elements from index 1 to 3:", b[1:4])
print("Elements from start to index 2:", b[:3])
print("Elements from index 3 to end:", b[3:])
2D Array Indexing and Slicing:
python
# Accessing elements and subarrays in 2D arrays
matrix = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print("Element at (0, 0):", matrix[0, 0])
print("First row:", matrix[0, :])
print("Second column:", matrix[:, 1])
print("Subarray:\n", matrix[1:, 1:])
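One detail worth knowing about slicing: a basic slice is a view onto the original array, not a copy, so writing to the slice changes the original. Here is a small sketch of that behaviour, with illustrative variable names.

```python
import numpy as np

original = np.array([10, 20, 30, 40, 50])

# A basic slice shares memory with the original array
view = original[1:4]
view[0] = 99
print("After editing the view:", original)  # the 20 has become 99

# .copy() gives an independent array
independent = original[1:4].copy()
independent[0] = -1
print("After editing the copy:", original)  # unchanged by the edit above

# Boolean (mask) indexing returns a copy as well
large_values = original[original > 30]
large_values[0] = 0
print("After editing the masked result:", original)  # still unchanged
```

If you need to modify a slice without touching the source data, call .copy() first.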
6. Reshaping Arrays
Reshape:
python
# Reshaping arrays
a = np.arange(1, 10)
reshaped_array = a.reshape((3, 3))
print("Reshaped Array:\n", reshaped_array)
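Reshaping only works when the new shape holds exactly the same number of elements. The sketch below also shows -1, which asks NumPy to work out one dimension for you. Variable names are illustrative.

```python
import numpy as np

a = np.arange(1, 13)  # 12 elements: 1 through 12

# -1 lets NumPy infer the missing dimension from the total size
two_rows = a.reshape(2, -1)       # becomes 2 rows x 6 columns
three_columns = a.reshape(-1, 3)  # becomes 4 rows x 3 columns

# ravel flattens back to one dimension
flat = three_columns.ravel()

print("Shapes:", two_rows.shape, three_columns.shape, flat.shape)

# A shape with the wrong number of elements raises ValueError
try:
    a.reshape(5, 3)  # 15 slots for 12 elements
except ValueError as err:
    print("Reshape failed:", err)
```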
7. Aggregations and Statistics
Sum, Mean, and Standard Deviation:
python
# Aggregation functions
a = np.array([1, 2, 3, 4, 5])
print("Sum:", np.sum(a))
print("Mean:", np.mean(a))
print("Standard Deviation:", np.std(a))
Max and Min:
python
# Finding the maximum and minimum values
b = np.array([10, 20, 30, 40, 50])
print("Maximum Value:", np.max(b))
print("Minimum Value:", np.min(b))
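The aggregation functions above also work on 2D arrays, where the axis argument chooses between columns and rows. The sketch below additionally shows that np.std computes the population standard deviation unless you pass ddof=1. The data and variable names are made up for illustration.

```python
import numpy as np

scores = np.array([[1, 2, 3],
                   [4, 5, 6]])

total = scores.sum()              # all elements: 21
column_sums = scores.sum(axis=0)  # collapse rows, one sum per column
row_means = scores.mean(axis=1)   # collapse columns, one mean per row

print("Total:", total)
print("Column sums:", column_sums)  # 5, 7, 9
print("Row means:", row_means)      # 2.0, 5.0

a = np.array([1, 2, 3, 4, 5])
population_std = np.std(a)      # divides by n (default, ddof=0)
sample_std = np.std(a, ddof=1)  # divides by n - 1

print("Population std:", round(population_std, 4))  # about 1.4142
print("Sample std:", round(sample_std, 4))          # about 1.5811
```

Pandas uses ddof=1 by default for its own std method, so the two libraries can give slightly different numbers on the same data.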
These examples cover the basics of NumPy, providing a foundation for more advanced numerical
computations and data manipulation. NumPy's efficiency and functionality make it a crucial tool for
scientific computing and machine learning in Python.
Basic Example of Matplotlib
Matplotlib is a comprehensive library for creating static, animated, and interactive visualizations in
Python. It is particularly useful for generating plots, histograms, bar charts, scatter plots, and much
more. Here’s a basic guide to get you started with Matplotlib.
1. Installing Matplotlib
First, ensure you have Matplotlib installed. You can install it using pip:
sh
pip install matplotlib
2. Importing Matplotlib
To use Matplotlib, you need to import it. It is common to import the pyplot module as plt:
python
import matplotlib.pyplot as plt
3. Creating a Simple Line Plot
A line plot is the simplest type of plot in Matplotlib. Here's how to create one:
python
import matplotlib.pyplot as plt
# Data
x = [1, 2, 3, 4, 5]
y = [2, 3, 5, 7, 11]
# Create the plot
plt.plot(x, y)
# Add titles and labels
plt.title('Simple Line Plot')
plt.xlabel('X-axis')
plt.ylabel('Y-axis')
# Show the plot
plt.show()
4. Creating a Scatter Plot
A scatter plot is useful for displaying the relationship between two numerical variables.
python
import matplotlib.pyplot as plt
# Data
x = [1, 2, 3, 4, 5]
y = [2, 3, 5, 7, 11]
# Create the scatter plot
plt.scatter(x, y)
# Add titles and labels
plt.title('Simple Scatter Plot')
plt.xlabel('X-axis')
plt.ylabel('Y-axis')
# Show the plot
plt.show()
5. Creating a Bar Plot
A bar plot is used to display values for different categorical data.
python
import matplotlib.pyplot as plt
# Data
categories = ['A', 'B', 'C', 'D', 'E']
values = [4, 7, 1, 8, 5]
# Create the bar plot
plt.bar(categories, values)
# Add titles and labels
plt.title('Simple Bar Plot')
plt.xlabel('Categories')
plt.ylabel('Values')
# Show the plot
plt.show()
6. Creating a Histogram
A histogram is used to display the distribution of a dataset.
python
import matplotlib.pyplot as plt
import numpy as np
# Data
data = np.random.randn(1000)
# Create the histogram
plt.hist(data, bins=30)
# Add titles and labels
plt.title('Simple Histogram')
plt.xlabel('Value')
plt.ylabel('Frequency')
# Show the plot
plt.show()
7. Creating a Pie Chart
A pie chart is used to display proportions of a whole.
python
import matplotlib.pyplot as plt
# Data
labels = ['A', 'B', 'C', 'D']
sizes = [15, 30, 45, 10]
# Create the pie chart
plt.pie(sizes, labels=labels, autopct='%1.1f%%')
# Add a title
plt.title('Simple Pie Chart')
# Show the plot
plt.show()
8. Adding Customizations
Matplotlib allows extensive customization to enhance the readability and aesthetics of the plots.
Changing Line Styles and Colors:
python
plt.plot(x, y, linestyle='--', color='r', marker='o') # Dashed red line with circle markers
Adding Grid:
python
plt.grid(True)
Setting Axis Limits:
python
plt.xlim(0, 6)
plt.ylim(0, 12)
Adding a Legend:
python
plt.plot(x, y, label='Prime Numbers')
plt.legend()
Saving the Plot:
You can save your plots to a file using savefig. Call it before plt.show(), because the figure may be
cleared once the plot window is closed:
python
plt.savefig('plot.png')
Complete Example
Here’s a complete example combining several elements:
python
import matplotlib.pyplot as plt
import numpy as np
# Data
x = np.linspace(0, 10, 100)
y = np.sin(x)
# Create the plot
plt.plot(x, y, label='Sine Wave', color='b', linestyle='--')
# Customize the plot
plt.title('Sine Wave Example')
plt.xlabel('X-axis')
plt.ylabel('Y-axis')
plt.grid(True)
plt.legend()
# Save the plot
plt.savefig('sine_wave.png')
# Show the plot
plt.show()
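The examples so far use the plt.* functions, which always act on the current figure. Matplotlib also has an object-oriented interface built around plt.subplots, which is handy once you want more than one plot in a figure. Here is a minimal sketch; the file name sine_cosine.png is just an example, and the Agg backend is selected as an assumption so that the script runs without opening a window.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: draw to files only, no display needed
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 10, 100)

# One figure with two side-by-side axes
fig, (ax_sin, ax_cos) = plt.subplots(1, 2, figsize=(10, 4))

ax_sin.plot(x, np.sin(x), color='b', linestyle='--', label='sin(x)')
ax_sin.set_title('Sine')
ax_sin.set_xlabel('X-axis')
ax_sin.set_ylabel('Y-axis')
ax_sin.grid(True)
ax_sin.legend()

ax_cos.plot(x, np.cos(x), color='r', label='cos(x)')
ax_cos.set_title('Cosine')
ax_cos.set_xlabel('X-axis')
ax_cos.grid(True)
ax_cos.legend()

fig.tight_layout()              # keep labels from overlapping
fig.savefig('sine_cosine.png')  # example file name
plt.close(fig)                  # free the figure's memory
```

Remove the matplotlib.use("Agg") line and add plt.show() before plt.close(fig) if you want the usual interactive window.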
This should give you a solid foundation for creating and customizing basic plots using Matplotlib. The
library’s extensive documentation and tutorials can further help you explore more advanced features
and customizations.
Basic Examples of Pandas
Pandas is a powerful and flexible data analysis and manipulation library for Python. It provides data
structures like Series and DataFrame, which are essential for handling structured data efficiently. Here’s
an overview of the basics of Pandas to get you started.
1. Installing Pandas
First, ensure you have Pandas installed. You can install it using pip:
sh
pip install pandas
2. Importing Pandas
To use Pandas, you need to import it, commonly using the alias pd:
python
import pandas as pd
3. Pandas Data Structures
Pandas has two primary data structures: Series and DataFrame.
Series
A Series is a one-dimensional labeled array capable of holding any data type.
python
import pandas as pd
# Creating a Series
s = pd.Series([1, 3, 5, 7, 9])
print(s)
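The "labeled" part of a Series becomes clearer when you give it your own index instead of the default 0, 1, 2. Here is a small sketch with made-up temperature data.

```python
import pandas as pd

# A Series with custom index labels (illustrative data)
temps = pd.Series([21.5, 19.0, 23.2], index=['Mon', 'Tue', 'Wed'])

tuesday = temps['Tue']   # look up by label
first = temps.iloc[0]    # look up by position
warm = temps[temps > 20] # filter with a condition

print("Tuesday:", tuesday)
print("First value:", first)
print("Warm days:\n", warm)
```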
DataFrame
A DataFrame is a two-dimensional labeled data structure with columns of potentially different types.
python
import pandas as pd
# Creating a DataFrame
data = {
    'Name': ['Alice', 'Bob', 'Charlie', 'David'],
    'Age': [24, 27, 22, 32],
    'City': ['New York', 'Los Angeles', 'Chicago', 'Houston']
}
df = pd.DataFrame(data)
print(df)
4. Reading Data
Pandas can read data from various file formats such as CSV, Excel, SQL databases, and more.
python
# Reading data from a CSV file
df = pd.read_csv('data.csv')
print(df)
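If you do not have a data.csv file at hand, you can still try read_csv by feeding it text through io.StringIO, which behaves like a file. The CSV content below is made up to match the earlier DataFrame example.

```python
import io
import pandas as pd

# CSV text standing in for a file on disk (illustrative data)
csv_text = """Name,Age,City
Alice,24,New York
Bob,27,Los Angeles
Charlie,22,Chicago
David,32,Houston
"""

# read_csv accepts a file path or any file-like object
df = pd.read_csv(io.StringIO(csv_text))

print(df)
print("Shape:", df.shape)  # 4 rows, 3 columns
print(df.dtypes)           # Age is parsed as an integer column
```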
5. Basic Operations
Viewing Data:
python
# Display the first few rows
print(df.head())
# Display the last few rows
print(df.tail())
Getting Basic Information:
python
# Summary statistics
print(df.describe())
# DataFrame information (info() prints its report itself and returns None)
df.info()
Selecting Data:
python
# Selecting a column
print(df['Name'])
# Selecting multiple columns
print(df[['Name', 'Age']])
# Selecting rows by index
print(df.iloc[1]) # Second row (index starts at 0)
print(df.iloc[1:3]) # Second and third rows
# Selecting rows by condition
print(df[df['Age'] > 25])
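Two things often trip people up when selecting data: iloc and loc treat the end of a slice differently, and conditions must be combined with & and | (with parentheses), not with and / or. Here is a minimal sketch using the same sample data as the DataFrame example.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie', 'David'],
    'Age': [24, 27, 22, 32],
    'City': ['New York', 'Los Angeles', 'Chicago', 'Houston']
})

# iloc is position-based and excludes the end of the slice: rows 0 and 1
by_position = df.iloc[0:2]

# loc is label-based and includes the end of the slice: rows 0, 1 and 2
by_label = df.loc[0:2, ['Name', 'Age']]

# Combine conditions with & (and) or | (or); each condition needs parentheses
in_age_range = df[(df['Age'] > 22) & (df['Age'] < 30)]
young_or_houston = df[(df['Age'] < 25) | (df['City'] == 'Houston')]

print(by_position)
print(by_label)
print(in_age_range)
print(young_or_houston)
```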
Adding a New Column:
python
# Adding a new column
df['Salary'] = [50000, 60000, 45000, 80000]
print(df)
Modifying a Column:
python
# Modifying a column
df['Age'] = df['Age'] + 1
print(df)
Dropping a Column:
python
# Dropping a column
df = df.drop(columns=['Salary'])
print(df)
6. Handling Missing Data
Detecting Missing Data:
python
# Checking for missing values
print(df.isnull())
print(df.isnull().sum())
Filling Missing Data:
python
# Filling missing values with a specified value
df['Age'] = df['Age'].fillna(0)
print(df)
Dropping Missing Data:
python
# Dropping rows with missing values
df = df.dropna()
print(df)
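The missing-data functions are easier to follow on a DataFrame that actually has gaps. The sketch below builds one with made-up data and compares filling with the column mean against dropping rows. Filling with the mean is only one possible choice, shown here as an assumption about what suits the data.

```python
import numpy as np
import pandas as pd

# Sample data with one missing Age and one missing City (illustrative)
df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie', 'David'],
    'Age': [24, np.nan, 22, 32],
    'City': ['New York', 'Los Angeles', None, 'Houston']
})

# Count of missing values per column
missing_per_column = df.isnull().sum()
print(missing_per_column)

# Fill missing ages with the mean of the known ages: (24 + 22 + 32) / 3 = 26
filled = df.copy()
filled['Age'] = filled['Age'].fillna(filled['Age'].mean())

# Drop every row that has any missing value
dropped_any = df.dropna()

# Drop rows only when a specific column is missing
dropped_missing_age = df.dropna(subset=['Age'])

print(filled)
print(dropped_any)
print(dropped_missing_age)
```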
7. Grouping and Aggregating Data
Grouping Data:
python
# Grouping by a column and calculating the mean of the numeric columns
# (numeric_only=True skips text columns such as 'Name')
grouped = df.groupby('City').mean(numeric_only=True)
print(grouped)
Aggregating Data:
python
# Aggregating data with different functions
# (the DataFrame must have a 'Salary' column; re-add it first if you dropped it earlier)
agg = df.groupby('City').agg({'Age': 'mean', 'Salary': 'sum'})
print(agg)
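Here is a self-contained sketch of grouping and aggregating, using made-up data in which two cities appear more than once so that the groups are not trivial. The output column names (mean_age, total_salary, people) are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie', 'David', 'Eve'],
    'City': ['New York', 'Chicago', 'New York', 'Chicago', 'Houston'],
    'Age': [24, 27, 22, 32, 29],
    'Salary': [50000, 60000, 45000, 80000, 70000]
})

# Mean of a single column per group
mean_age = df.groupby('City')['Age'].mean()
print(mean_age)

# Named aggregation: output_name=(column, function)
summary = df.groupby('City').agg(
    mean_age=('Age', 'mean'),
    total_salary=('Salary', 'sum'),
    people=('Name', 'count')
)
print(summary)
```

Selecting the column before calling mean avoids errors from text columns that cannot be averaged.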
8. Merging and Joining DataFrames
Merging DataFrames:
python
# Creating two DataFrames
df1 = pd.DataFrame({'Key': ['A', 'B', 'C'], 'Value1': [1, 2, 3]})
df2 = pd.DataFrame({'Key': ['A', 'B', 'D'], 'Value2': [4, 5, 6]})
# Merging DataFrames on a key column
merged_df = pd.merge(df1, df2, on='Key', how='inner')
print(merged_df)
Joining DataFrames:
python
# Setting the index for join operation
df1.set_index('Key', inplace=True)
df2.set_index('Key', inplace=True)
# Joining DataFrames
joined_df = df1.join(df2, how='outer')
print(joined_df)
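The how argument decides which keys survive a merge. The sketch below runs the same two DataFrames through inner, left and outer merges so you can compare the results. The indicator=True option adds a _merge column showing where each row came from.

```python
import pandas as pd

df1 = pd.DataFrame({'Key': ['A', 'B', 'C'], 'Value1': [1, 2, 3]})
df2 = pd.DataFrame({'Key': ['A', 'B', 'D'], 'Value2': [4, 5, 6]})

# inner: only keys present in both DataFrames (A and B)
inner = pd.merge(df1, df2, on='Key', how='inner')

# left: every key from df1; Value2 is NaN where df2 has no match (C)
left = pd.merge(df1, df2, on='Key', how='left')

# outer: every key from either side (A, B, C and D)
outer = pd.merge(df1, df2, on='Key', how='outer', indicator=True)

print(inner)
print(left)
print(outer)
```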
9. Saving Data
To CSV:
python
# Saving DataFrame to a CSV file
df.to_csv('output.csv', index=False)
To Excel:
python
# Saving DataFrame to an Excel file (needs an Excel writer package such as openpyxl installed)
df.to_excel('output.xlsx', index=False)
Conclusion
Pandas is an essential tool for data analysis and manipulation in Python. It provides powerful, flexible
data structures that make it easy to work with structured data. This basic overview should give you a
good starting point for using Pandas in your data projects.
Introduction to Scikit-learn
Scikit-learn is a popular open-source machine learning library for Python. It provides simple and efficient
tools for data mining and data analysis, built on NumPy, SciPy, and Matplotlib. Scikit-learn offers a wide
range of supervised and unsupervised learning algorithms for classification, regression, clustering,
dimensionality reduction, and more. Here’s a basic overview of Scikit-learn to help you get started.
Key Features:
Simple and Consistent API: Scikit-learn provides a consistent interface for various machine learning
algorithms, making it easy to experiment with different models.
Wide Range of Algorithms: It includes implementations of many popular machine learning algorithms,
including support vector machines (SVM), random forests, k-nearest neighbors (KNN), decision trees,
and more.
Efficient and Optimized: Scikit-learn's core routines are implemented in optimized compiled code, so it
performs well on small and medium-sized datasets that fit in memory.
Built-in Datasets: It comes with several built-in datasets for practice and experimentation, allowing users
to get started quickly without the need for external data sources.
Model Evaluation: Scikit-learn provides tools for evaluating model performance through metrics such as
accuracy, precision, recall, F1-score, and area under the curve (AUC).
Data Preprocessing: It offers a range of preprocessing techniques for scaling, normalization, imputation,
feature extraction, and feature selection.
Integration with Other Libraries: Scikit-learn integrates seamlessly with other Python libraries such as
NumPy, Pandas, and Matplotlib, enabling a smooth workflow for data analysis and visualization.
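To make the evaluation metrics named above concrete, here is a minimal sketch that computes them on a small pair of hand-written label lists. The labels are made up for illustration and do not come from a trained model.

```python
from sklearn.metrics import (
    accuracy_score,
    confusion_matrix,
    f1_score,
    precision_score,
    recall_score,
)

# Made-up true labels and predictions for a binary problem
y_true = [1, 0, 1, 1, 0, 1, 0, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

# Rows are true classes, columns are predicted classes
cm = confusion_matrix(y_true, y_pred)
print("Confusion matrix:\n", cm)

accuracy = accuracy_score(y_true, y_pred)    # 6 correct out of 8
precision = precision_score(y_true, y_pred)  # 3 true positives out of 4 predicted positives
recall = recall_score(y_true, y_pred)        # 3 true positives out of 4 actual positives
f1 = f1_score(y_true, y_pred)                # harmonic mean of precision and recall

print("Accuracy:", accuracy)
print("Precision:", precision)
print("Recall:", recall)
print("F1-score:", f1)
```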
Basic Usage:
1. Importing Scikit-learn:
python
import sklearn
2. Loading Datasets:
Scikit-learn provides built-in datasets that can be loaded using the load_* functions.
python
from sklearn.datasets import load_iris
# Load the Iris dataset
iris = load_iris()
# Accessing features and target
X = iris.data # Features
y = iris.target # Target
3. Splitting Data:
It is common to split the dataset into training and testing sets for model evaluation.
python
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
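For classification data it often helps to keep the class proportions the same in both splits. train_test_split supports this through its stratify parameter. The sketch below is self-contained and uses the Iris dataset, which has 50 samples per class.

```python
from collections import Counter

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)

# stratify=y keeps the share of each class equal in the train and test sets
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

print("Training set shape:", X_train.shape)  # 120 samples, 4 features
print("Test set shape:", X_test.shape)       # 30 samples, 4 features
print("Test class counts:", Counter(y_test.tolist()))  # 10 samples per class
```

The random_state value fixes the shuffle so that the split is the same every time you run the code.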
4. Creating and Training Models:
Instantiate the desired model and fit it to the training data.
python
from sklearn.linear_model import LogisticRegression
# Create the model (max_iter raised above the default of 100 to help the solver converge)
model = LogisticRegression(max_iter=200)
# Train the model
model.fit(X_train, y_train)
5. Making Predictions:
Use the trained model to make predictions on new data.
python
# Predictions on the test set
y_pred = model.predict(X_test)
6. Evaluating Model Performance:
Evaluate the model's performance using appropriate metrics.
python
from sklearn.metrics import accuracy_score
accuracy = accuracy_score(y_test, y_pred)
print("Accuracy:", accuracy)
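The steps above can be put together in one script. The sketch below also adds two things that are not part of the steps above and are offered only as a suggestion: a StandardScaler chained to the model with make_pipeline, and 5-fold cross-validation as a second check on the accuracy.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Load the data and hold out 20% for testing
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# The pipeline fits the scaler and the classifier together, so the
# scaler learns its statistics from the training data only
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=200))
model.fit(X_train, y_train)

# Accuracy on the held-out test set
test_accuracy = accuracy_score(y_test, model.predict(X_test))
print("Test accuracy:", test_accuracy)

# 5-fold cross-validation on the training set
cv_scores = cross_val_score(model, X_train, y_train, cv=5)
print("Cross-validation accuracy:", cv_scores.mean())
```

Because the pipeline is a single estimator, it can be passed to cross_val_score just like a plain model.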