MLDAP Module1
Machine Learning (ML) is a subset of Artificial Intelligence (AI) that focuses on developing
algorithms that enable computers to learn patterns from data and make decisions or predictions
without being explicitly programmed for every task.
In more technical terms, machine learning involves training a model on a dataset, so that it can
learn relationships or structures within the data. Once trained, the model can be used to make
predictions or decisions based on new, unseen data.
1. Automation of Decision-Making
o ML systems can automate complex decision-making processes in real time, which
increases efficiency and reduces the need for manual intervention.
2. Data-Driven Insights
o Machine learning can extract useful insights from large and complex datasets that
are often beyond human capabilities to analyze manually.
3. Improved Accuracy and Predictions
o With continuous learning from data, ML models can improve over time, leading
to more accurate forecasts, recommendations, and decisions.
4. Wide Range of Applications
o ML is applied in diverse fields:
Healthcare: Disease prediction, medical imaging, personalized treatment.
Finance: Fraud detection, risk assessment, algorithmic trading.
Marketing: Customer segmentation, personalized recommendations.
Transportation: Route optimization, self-driving cars.
Manufacturing: Predictive maintenance, quality control.
5. Real-Time Applications
o Machine learning enables systems like real-time language translation, facial
recognition, and spam filtering, which are increasingly integral to daily digital
experiences.
6. Scalability
o ML models can handle vast amounts of data and scale effectively as data grows,
making them suitable for big data environments.
1. Supervised Learning
Definition
Supervised learning is a type of machine learning where the algorithm is trained on a labeled
dataset — meaning each training example is paired with an output label. The model learns to
map inputs to known outputs.
Goal: To learn a function that, given an input, provides the correct output based
on examples.
How It Works
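As a minimal sketch of this workflow (not a production method), the snippet below fits a straight line to a tiny labeled dataset with ordinary least squares and then predicts the output for an unseen input. The data values and the helper names `fit_line` and `predict` are made up for illustration.

```python
# Labeled training data: each input x is paired with a known output y.
# The values are invented for illustration and follow y = 2x + 1 exactly.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [3.0, 5.0, 7.0, 9.0, 11.0]

def fit_line(xs, ys):
    """Fit y = w*x + b by ordinary least squares (closed form)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    w = cov_xy / var_x
    b = mean_y - w * mean_x
    return w, b

def predict(x, w, b):
    """Apply the learned mapping to a new input."""
    return w * x + b

w, b = fit_line(xs, ys)          # training step
print(w, b)                      # 2.0 1.0
print(predict(6.0, w, b))        # 13.0 (prediction for an unseen input)
```

The same train-then-predict pattern applies to the algorithms listed below; only the form of the learned function changes.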
Types
Common Algorithms
Linear Regression
Logistic Regression
Decision Trees
Random Forest
Support Vector Machines (SVM)
Neural Networks
k-Nearest Neighbors (k-NN)
Applications
2. Unsupervised Learning
Definition
Unsupervised learning is a type of machine learning where the algorithm is trained on unlabeled
data. It tries to find hidden patterns or intrinsic structures in the input data without any
supervision.
Goal: To explore the underlying structure or distribution in the data to learn
more about it.
How It Works
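A rough sketch of the idea, assuming a one-dimensional dataset and hand-picked starting centroids: the K-Means-style loop below groups unlabeled points by alternating an assignment step and an update step. The function name `k_means_1d` and the data are illustrative only.

```python
def k_means_1d(points, centroids, iterations=10):
    """Cluster 1-D points around the given starting centroids."""
    clusters = []
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster.
        centroids = [
            sum(c) / len(c) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
    return centroids, clusters

# Unlabeled data (invented): two visible groups, but no labels are given.
points = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
centroids, clusters = k_means_1d(points, centroids=[0.0, 5.0])
print(centroids)  # [2.0, 11.0]
print(clusters)   # [[1.0, 2.0, 3.0], [10.0, 11.0, 12.0]]
```

No label is ever supplied; the grouping emerges from the distances between the points alone.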
Types
Common Algorithms
K-Means Clustering
Hierarchical Clustering
DBSCAN
Principal Component Analysis (PCA)
Autoencoders
t-SNE
Applications
Customer segmentation
Market basket analysis
Anomaly detection
Document/topic modeling
Recommendation systems
3. Reinforcement Learning (RL)
Definition
Reinforcement learning is a type of machine learning where an agent learns to make decisions
by interacting with an environment, aiming to maximize cumulative rewards through trial and
error.
Goal: To learn an optimal policy or strategy that maximizes rewards over time.
How It Works
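The trial-and-error loop can be sketched with tabular Q-Learning on a toy environment. Everything here is an assumption made for illustration: a five-cell corridor with the goal at the right end, a purely random exploration policy, and the values chosen for the learning rate and discount factor.

```python
import random

N_STATES = 5                 # states 0..4 in a corridor; state 4 is the goal
ACTIONS = ["left", "right"]
ALPHA, GAMMA = 0.5, 0.9      # learning rate and discount factor (assumed values)

def step(state, action):
    """Environment: move one cell; reward 1 only on reaching the goal."""
    next_state = max(state - 1, 0) if action == "left" else state + 1
    reward = 1.0 if next_state == N_STATES - 1 else 0.0
    return next_state, reward

random.seed(0)
q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}

for _ in range(500):                       # episodes of trial and error
    state = 0
    while state != N_STATES - 1:
        action = random.choice(ACTIONS)    # explore with random actions
        next_state, reward = step(state, action)
        best_next = max(q[(next_state, a)] for a in ACTIONS)
        # Q-Learning update: move the estimate toward
        # reward + discounted value of the best next action.
        q[(state, action)] += ALPHA * (reward + GAMMA * best_next - q[(state, action)])
        state = next_state

# The learned policy picks the action with the highest Q-value in each state.
policy = [max(ACTIONS, key=lambda a: q[(s, a)]) for s in range(N_STATES - 1)]
print(policy)
```

After training, the greedy policy should choose "right" in every non-goal state, because that action leads to the reward in the fewest steps.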
Key Concepts
Common Algorithms
Q-Learning
Deep Q Networks (DQN)
SARSA
Policy Gradient Methods
Actor-Critic Models
Applications
Feature       | Supervised Learning                    | Unsupervised Learning                  | Reinforcement Learning
Labeled Data  | Yes                                    | No                                     | No (uses rewards instead)
Goal          | Predict outputs                        | Find patterns/structures               | Learn optimal actions (policy)
Feedback      | Direct (correct labels)                | None                                   | Indirect (rewards/penalties)
Examples      | Email classification, price prediction | Market segmentation, anomaly detection | Game AI, robotics, trading bots
1. Healthcare
Applications:
Disease Prediction & Diagnosis: ML models can predict diseases like diabetes, cancer,
and heart conditions using patient data (e.g., IBM Watson).
Medical Imaging: ML analyzes X-rays, MRIs, and CT scans for early detection of
conditions (e.g., tumors).
Drug Discovery: Accelerates development by predicting molecular interactions.
Personalized Treatment: Recommends tailored treatments based on a patient’s history
and genetics.
Remote Monitoring: Predicts health issues from wearable device data (e.g., Fitbit, Apple
Watch).
2. Finance
Applications:
Fraud Detection: ML identifies unusual patterns in transactions (e.g., credit card fraud).
Credit Scoring: Assesses creditworthiness of individuals or businesses.
Algorithmic Trading: Automated trading based on market trends and patterns.
Loan Approval: Predicts default risks and streamlines underwriting.
Customer Service: AI chatbots for support and queries.
3. Retail and E-commerce
Applications:
Recommendation Systems: Suggests products based on browsing and purchase history
(e.g., Amazon, Netflix).
Customer Segmentation: Groups customers for targeted marketing.
Inventory Management: Predicts demand and optimizes stock.
Price Optimization: Adjusts prices dynamically based on competition, demand, and
seasonality.
Chatbots and Virtual Assistants: Improve customer support.
4. Transportation
Applications:
Autonomous Vehicles: Self-driving cars use ML for object detection, navigation, and
decision-making (e.g., Tesla Autopilot).
Route Optimization: Finds efficient routes using traffic and weather data (e.g., Google
Maps, Uber).
Predictive Maintenance: Prevents failures by analyzing sensor data from vehicles.
Fleet Management: Optimizes logistics and delivery schedules.
5. Manufacturing
Applications:
6. Education
Applications:
7. Gaming
Applications:
Game AI: NPCs (non-player characters) learn and adapt to player behavior.
Dynamic Difficulty Adjustment: Modifies game complexity based on player skill.
Procedural Content Generation: ML creates game levels, stories, and assets.
Player Behavior Analysis: Detects toxic behavior or cheating.
Applications:
9. Cybersecurity
Applications:
10. Agriculture
Applications:
Applications:
Applications:
PYTHON PROGRAMMING
Python Syntax
Definition
Python syntax refers to the set of rules that defines how Python code should be written and
interpreted.
Key Characteristics
Python uses indentation (whitespace) to define blocks of code instead of braces {} like
other languages.
It is case-sensitive (Variable and variable are different).
Statements do not require a semicolon ; (though allowed).
Example
def greet(name):
    print("Hello,", name)
Variables
Definition
Variables are containers for storing data values. Python is dynamically typed, meaning you
don’t need to declare the type of a variable.
Syntax
x = 5
name = "Alice"
price = 12.99
Examples
my_var = 10
MyVar = "text" # different from my_var
_name = "John"
age2 = 25
Comments
Definition: Comments are notes in the code that are not executed. They help explain the code.
Single-Line Comment
# This is a comment
x = 10 # This is an inline comment
Python has no dedicated multi-line comment syntax, but a triple-quoted string that is not assigned to a variable is often used as one:
"""
This is a multi-line comment
used for documentation
"""
Data Types
Python has several built-in data types, which are automatically assigned when values are stored
in variables.
Type Checking
x = 5
print(type(x)) # <class 'int'>
Type Casting
Definition: Type casting means converting the data type of a value to another
type.
Common Functions
Examples
x = "123"
y = int(x) # y is now 123 (int)
a = 3.14
b = str(a) # b is "3.14" (str)
c = 0
d = bool(c) # d is False
Operators
Arithmetic Operators: Used for mathematical calculations.
Operator  Description          Example  Result
+         Addition             5 + 3    8
-         Subtraction          5 - 3    2
*         Multiplication       5 * 3    15
/         Division             5 / 3    1.6667 (approx.)
//        Floor Division       5 // 3   1
%         Modulus (remainder)  5 % 3    2
**        Exponentiation       5 ** 3   125
Example:
a = 7
b = 3
print(a + b) # 10
print(a / b) # 2.3333
print(a // b) # 2
print(a % b) # 1
print(a ** b) # 343
Comparison Operators
Operator  Description  Example  Result
==        Equal to     5 == 3   False
Example:
x = 10
y = 20
print(x == y) # False
print(x < y) # True
print(x >= 10) # True
Logical Operators
Example:
a = 5
b = 10
print(a > 3 and b < 15) # True
print(a == 5 or b == 5) # True
print(not(a == b)) # True
Assignment Operators : Used to assign values to variables, often combined with arithmetic.
Operator  Description  Example  Equivalent To
=         Assign       x = 5    -
Example:
x = 10
x += 5 # x = 15
x *= 2 # x = 30
print(x)
Bitwise Operators
Operator  Description  Example  Result
|         OR           5 | 3    7
Example:
a = 5 # binary 0101
b = 3 # binary 0011
print(a & b) # 1
print(a | b) # 7
print(a ^ b) # 6
print(~a) # -6
print(a << 1) # 10
print(a >> 1) # 2
Membership Operators
Example:
my_list = [1, 2, 3, 4]
print(3 in my_list) # True
print(5 not in my_list) # True
Identity Operators
Example:
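a = [1, 2, 3]
b = a
c = [1, 2, 3]
print(a is b) # True (b refers to the same object as a)
print(a is c) # False (equal contents, but a different object)
print(a == c) # True (== compares values, is compares identity)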
Strings in Python
What is a String?
A string is a sequence of characters enclosed in quotes.
Strings are immutable, meaning once created, their contents cannot be changed.
Creating Strings
s1 = 'Hello'
s2 = "World"
s3 = '''This is
a multi-line
string'''
String Operations
Concatenation (+)
greeting = s1 + " " + s2 # 'Hello World'
Repetition (*)
echo = "Ha" * 3 # 'HaHaHa'
Indexing starts at 0
Supports negative indexing (from the end)
text = "Python"
print(text[0]) # 'P'
print(text[-1]) # 'n'
print(text[1:4]) # 'yth' (from index 1 to 3)
print(text[:3]) # 'Pyt' (start to 2)
print(text[3:]) # 'hon' (3 to end)
print(text[:]) # 'Python' (whole string)
4.1. Length
len(text) # 6
4.2. Changing Case
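A short sketch of the built-in case methods on `str`. The sample strings are arbitrary; each method returns a new string, since strings are immutable.

```python
text = "Hello, Python"

print(text.upper())                     # HELLO, PYTHON
print(text.lower())                     # hello, python
print(text.swapcase())                  # hELLO, pYTHON
print("machine learning".title())       # Machine Learning
print("machine learning".capitalize())  # Machine learning
print(text)                             # Hello, Python (original is unchanged)
```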
Method            Description                                  Example
.startswith(sub)  Checks if string starts with sub             "hello".startswith("he") → True
.endswith(sub)    Checks if string ends with sub               "hello".endswith("lo") → True
.join(iterable)   Joins list/tuple into string with separator  ",".join(['a', 'b', 'c']) → "a,b,c"
Escape Characters
\\ Backslash \
\n New line
\t Tab
Example:
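The escape sequences in the table can be tried as follows. The sample strings are invented, and the raw-string form (`r"..."`) is an addition not listed in the table above, shown only as a common way to avoid escaping backslashes.

```python
print("C:\\data\\file.txt")   # C:\data\file.txt
print("Line 1\nLine 2")       # printed on two separate lines
print("Name\tAge")            # a tab separates the two words
print(len("a\nb"))            # 3 (the escape sequence is a single character)
print(r"C:\new\table")        # raw string: backslashes are kept literally
```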
String Formatting
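Python offers several ways to build strings from values. A minimal sketch of three common ones (f-strings, `str.format`, and `%` formatting), reusing the sample values `name` and `price` from the Variables section:

```python
name = "Alice"
price = 12.99

print(f"{name} paid {price:.1f}")        # Alice paid 13.0
print("{} paid {}".format(name, price))  # Alice paid 12.99
print("%s paid %.2f" % (name, price))    # Alice paid 12.99
print(f"{name:>10}|")                    # right-aligned in a field of width 10
print(f"{0.256:.1%}")                    # 25.6%
```

f-strings (available since Python 3.6) are usually the most readable of the three.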
Example Code
text = " Hello, Python! "
words = text.strip().split(",")
print(words) # ['Hello', ' Python!']
print("Python".startswith("Py")) # True
print("Python".find("th")) # 2
List
Definition
Accessing Elements
my_list = [1, 2, 3, 4]
print(my_list[0]) # 1
print(my_list[-1]) # 4 (last element)
print(my_list[1:3]) # [2, 3]
Example
lst = [3, 1, 4]
lst.append(2) # [3,1,4,2]
lst.sort() # [1,2,3,4]
lst.remove(3) # [1,2,4]
print(lst.pop()) # 4
print(lst) # [1,2]
Tuple
Definition
Creating Tuples
t = (1, 2, 3)
singleton = (5,) # single element tuple requires comma
empty = ()
Accessing Elements
print(t[0]) # 1
print(t[-1]) # 3
print(t[1:3]) # (2, 3)
Tuple Methods
Method Description
Example
t = (1, 2, 3, 2)
print(t.count(2)) # 2
print(t.index(3)) # 2
Set
Definition
Creating Sets
s = {1, 2, 3}
empty_set = set() # must use set() to create an empty set; {} creates an empty dict
Adding/Removing Elements
s.add(4) # add element
s.remove(2) # remove element, raises KeyError if not found
s.discard(5) # remove element if present, no error if absent
s.pop() # removes and returns arbitrary element
s.clear() # remove all elements
Example
a = {1, 2, 3}
b = {2, 3, 4}
print(a | b) # {1, 2, 3, 4}
print(a & b) # {2, 3}
print(a - b) # {1}
Dictionary
Definition
Creating Dictionaries
d = {'name': 'Alice', 'age': 30}
empty_dict = {}
d2 = dict(a=1, b=2)
Accessing Values
print(d['name']) # 'Alice'
print(d.get('age')) # 30
print(d.get('salary', 0)) # 0 (default if key not found)
Adding/Updating Entries
d['salary'] = 5000
d.update({'age': 31, 'city': 'NY'})
Removing Entries
d.pop('age') # removes 'age' key and returns its value
d.popitem() # removes and returns the last inserted key-value pair (Python 3.7+)
del d['name'] # delete by key
d.clear() # empty dictionary
Dictionary Methods
Method Description
Example
person = {'name': 'Bob', 'age': 25}
print(person.keys()) # dict_keys(['name', 'age'])
print(person.values()) # dict_values(['Bob', 25])
print(person.items()) # dict_items([('name', 'Bob'), ('age', 25)])
person['age'] = 26
person['city'] = 'LA'
age = person.pop('age')
print(age) # 26
Conditional Statements (if, elif, else)
1. Purpose
Used to execute certain blocks of code only if specific conditions are true.
Controls program flow by making decisions.
2. Basic Syntax
if condition:
    # code to execute if condition is True
else:
    # code to execute if condition is False
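A small runnable instance of the template above. The function name `describe_age` and the threshold of 18 are chosen for illustration.

```python
def describe_age(age):
    """Return a label using a plain if/else decision."""
    if age >= 18:
        return "adult"
    else:
        return "minor"

print(describe_age(18))  # adult
print(describe_age(17))  # minor
```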
3. Example
age = 18
Output:
Syntax:
if condition1:
    # block 1
elif condition2:
    # block 2
elif condition3:
    # block 3
else:
    # else block
Example:
score = 85
if score >= 90:
    print("Grade: A")
elif score >= 80:
    print("Grade: B")
elif score >= 70:
    print("Grade: C")
else:
    print("Grade: F")
Output:
Grade: B
5. Conditions
== Equal to x == 5
!= Not equal to x != 10
6. Logical Operators
Example:
x = 15
if x > 10 and x < 20:
    print("x is between 10 and 20")
7. Nested if Statements
Example:
x = 10
if x > 5:
    if x < 15:
        print("x is between 6 and 14")
    else:
        print("x is 15 or more")
else:
    print("x is 5 or less")
Output:
x is between 6 and 14
Syntax:
value_if_true if condition else value_if_false
Example:
age = 17
status = "Adult" if age >= 18 else "Minor"
print(status) # Output: Minor
num = 5 # example value (assumed; the original snippet does not define num)
if num > 0:
    print("Positive number")
elif num == 0:
    print("Zero")
else:
    print("Negative number")
Python Loops and Functions in Detail
1. For Loop
Purpose:
Syntax:
for <variable> in <sequence>:
    # code block to execute
Example:
fruits = ['apple', 'banana', 'cherry']
for fruit in fruits:
    print(fruit)
Output:
apple
banana
cherry
Using range()
for i in range(5): # 0 to 4
    print(i)
2. While Loop
Purpose:
Syntax:
while condition:
    # code block to execute
Example:
count = 0
while count < 3:
    print(count)
    count += 1
Output:
0
1
2
Output:
1
3
4. Functions
Purpose:
Defining a Function
def function_name(parameters):
    # code block
    return value # optional
Calling a Function
result = function_name(arguments)
Example:
def greet(name):
    print(f"Hello, {name}!")
5. Function Components
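def add(a, b): # assumed definition; the original omits it
    return a + b
total = add(3, 4) # 7 (named total so the built-in sum is not shadowed)
def describe_pet(animal_type, name): # assumed definition; the original omits it
    print(f"{name} is a {animal_type}.")
describe_pet(animal_type="cat", name="Whiskers") # keyword arguments
def add_numbers(*args):
    return sum(args)
print(add_numbers(1, 2, 3, 4)) # 10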
from math import factorial # assumed: the original definition of factorial is missing
for i in range(6):
    print(f"{i}! = {factorial(i)}")
Output:
0! = 1
1! = 1
2! = 2
3! = 6
4! = 24
5! = 120
NumPy in Python: Detailed Overview
1. What is NumPy?
Python lists are slow for numerical operations because they store objects of different
types and lack vectorized operations.
NumPy arrays (called ndarrays) are:
o Fixed-type (homogeneous).
o Stored compactly in memory.
o Support vectorized operations (operate on whole arrays element-wise without
explicit loops).
Essential for:
o Data analysis
o Machine learning
o Image processing
o Scientific research
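To illustrate the difference between a list and an ndarray, the sketch below doubles the same made-up values both ways. It assumes NumPy is installed.

```python
import numpy as np

prices = [10.0, 20.0, 30.0]   # invented values

# Plain Python list: an explicit loop (here a comprehension) is needed.
doubled_list = [p * 2 for p in prices]

# NumPy array: the operation applies to every element at once.
arr = np.array(prices)
doubled_arr = arr * 2

print(doubled_list)   # [20.0, 40.0, 60.0]
print(doubled_arr)    # [20. 40. 60.]
print(arr.dtype)      # float64 (one fixed type for all elements)
print(prices * 2)     # list repetition, not arithmetic: six elements
```

Note that `*` on a list repeats the list, while `*` on an array multiplies element-wise.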
3. Installing NumPy
pip install numpy
4. Importing NumPy
import numpy as np
5. NumPy Arrays
a = np.array([1, 2, 3, 4])
print(a) # [1 2 3 4]
7. Array Operations
a = np.array([1, 2, 3])
b = np.array([4, 5, 6])
print(a + b) # [5 7 9]
print(a * b) # [ 4 10 18]
print(a ** 2) # [1 4 9]
print(np.sqrt(a)) # [1.         1.41421356 1.73205081]
a = np.array([1, 2, 3])
print(a + 10) # [11 12 13]
8. Indexing and Slicing
a = np.array([1, 2, 3, 4, 5])
print(a[0]) # 1
print(a[1:4]) # [2 3 4]
9. Reshaping Arrays
a = np.arange(6) # [0 1 2 3 4 5]
b = a.reshape((2, 3))
print(b)
# [[0 1 2]
# [3 4 5]]
Calculate summaries:
a = np.array([1, 2, 3, 4])
print(a.sum()) # 10
print(a.mean()) # 2.5
print(a.min()) # 1
print(a.max()) # 4
print(a.std()) # approx. 1.118 (population standard deviation)
a = np.array([1, 2, 3, 4, 5])
print(a[a > 3]) # [4 5]
12. Copy vs View
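A minimal sketch of the distinction, assuming basic slicing on a one-dimensional array: a slice is a view that shares memory with the original, while `.copy()` creates independent data.

```python
import numpy as np

a = np.array([1, 2, 3, 4, 5])

view = a[1:4]          # basic slicing returns a view onto a's data
view[0] = 99           # this write is visible through a as well
print(a)               # [ 1 99  3  4  5]

copy = a[1:4].copy()   # .copy() allocates independent data
copy[0] = -1           # a is not affected
print(a)               # [ 1 99  3  4  5]

print(np.shares_memory(a, view))  # True
print(np.shares_memory(a, copy))  # False
```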
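a = np.array([[1, 2], [3, 4]]) # assumed inputs, chosen to match the output below
b = np.array([[5, 6], [7, 8]])
c = np.dot(a, b)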
print(c)
# [[19 22]
# [43 50]]
Or using operator:
c = a @ b
print("Matrix:\n", mat)
1. What is Pandas?
3. Installing Pandas
pip install pandas
4. Importing Pandas
import pandas as pd
5.1 Series
A 1D labeled array capable of holding any data type.
Has an index for labels (default integer index if not provided).
import pandas as pd
s = pd.Series([10, 20, 30, 40], index=['a', 'b', 'c', 'd'])
print(s)
Output:
a    10
b    20
c    30
d    40
dtype: int64
5.2 DataFrame
data = {
'Name': ['Alice', 'Bob', 'Charlie'],
'Age': [25, 30, 35],
'City': ['NY', 'LA', 'Chicago']
}
df = pd.DataFrame(data)
print(df)
Output:
      Name  Age     City
0    Alice   25       NY
1      Bob   30       LA
2  Charlie   35  Chicago
8. Selecting Data
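Common ways to select columns and rows can be sketched as below. The DataFrame repeats the Name/Age/City sample from section 5.2 so the block runs on its own.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Age': [25, 30, 35],
    'City': ['NY', 'LA', 'Chicago'],
})

ages = df['Age']                 # one column -> Series
subset = df[['Name', 'City']]    # list of columns -> DataFrame
first_row = df.iloc[0]           # row by integer position
bob_city = df.loc[1, 'City']     # row label 1, column 'City'
over_28 = df[df['Age'] > 28]     # boolean mask keeps matching rows

print(bob_city)                  # LA
print(over_28['Name'].tolist())  # ['Bob', 'Charlie']
```

`loc` selects by label and `iloc` by position; with the default integer index the two look similar, but they differ once the index is customized.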
grouped = df.groupby('City')
print(grouped['Age'].mean()) # Average age per city
# Multiple aggregations
grouped['Age'].agg(['mean', 'min', 'max'])
12. Sorting
df.sort_values(by='Age') # Sort ascending by Age
df.sort_values(by='Age', ascending=False) # Sort descending
15. Reshaping
# Create DataFrame
data = {'Name': ['Anna', 'Bob', 'Cara', 'Dave'],
'Age': [28, 24, 35, 40],
'Score': [85, 90, 88, 92]}
df = pd.DataFrame(data)
# Filter people older than 30
older_than_30 = df[df['Age'] > 30]
# Calculate average score
avg_score = df['Score'].mean()
print("People older than 30:\n", older_than_30)
print("Average score:", avg_score)
Python Matplotlib
1. What is Matplotlib?
Matplotlib is a widely used Python library for creating static, interactive, and
animated visualizations.
It provides a MATLAB-like interface for plotting, making it familiar to those from
scientific and engineering backgrounds.
The most commonly used module is pyplot, which provides functions to create plots and
charts easily.
Enables creation of a wide variety of plots: line, bar, scatter, histogram, pie, etc.
Highly customizable plots.
Integrates well with NumPy and Pandas for quick data visualization.
Essential tool for data analysis, exploratory data analysis (EDA), and reporting.
3. Installing Matplotlib
pip install matplotlib
4. Importing Matplotlib
import matplotlib.pyplot as plt
5. Basic Plotting
x = [1, 2, 3, 4, 5]
y = [2, 3, 5, 7, 11]
plt.plot(x, y) # draw a line through the points
plt.show() # display the figure
7. Plot Customization
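A brief sketch of typical customizations (line style, marker, title, axis labels, grid, legend), reusing the x and y values from the basic plotting section. The label text and styling choices are arbitrary.

```python
import matplotlib.pyplot as plt

x = [1, 2, 3, 4, 5]
y = [2, 3, 5, 7, 11]

fig, ax = plt.subplots(figsize=(6, 4))
ax.plot(x, y, color="green", linestyle="--", marker="o", label="sample data")
ax.set_title("Customized Line Plot")
ax.set_xlabel("x")
ax.set_ylabel("y")
ax.grid(True)      # light background grid
ax.legend()        # uses the label passed to plot()
# plt.show()       # uncomment to display the figure in an interactive session
```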
9. Subplots
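Several plots can share one figure. The sketch below, with arbitrary titles and the same sample values as earlier, places a line chart and a bar chart side by side using `plt.subplots`.

```python
import matplotlib.pyplot as plt

x = [1, 2, 3, 4, 5]
y = [2, 3, 5, 7, 11]

# One row, two columns of axes in a single figure.
fig, (ax_line, ax_bar) = plt.subplots(1, 2, figsize=(8, 3))
ax_line.plot(x, y)
ax_line.set_title("Line")
ax_bar.bar(x, y)
ax_bar.set_title("Bar")
fig.tight_layout()   # reduce overlap between the two panels
# plt.show()         # uncomment to display the figure in an interactive session
```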
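import numpy as np
x = np.random.rand(50)
y = np.random.rand(50)
sizes = np.random.randint(20, 200, size=50)
colors = np.random.rand(50)
plt.scatter(x, y, s=sizes, c=colors) # marker size and color vary per point
plt.show()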
import pandas as pd
data = {'Apples': [3, 2, 0, 1],
'Oranges': [0, 3, 7, 2]}
df = pd.DataFrame(data)
df.plot(kind='bar')
plt.show()
Data Preprocessing in Python: Step-by-Step
Data preprocessing is the process of cleaning and transforming raw data into a format
that is suitable for modeling.
It typically involves:
o Handling missing values
o Encoding categorical variables
o Scaling/normalizing features
o Handling outliers
o Splitting data into train/test sets
Sample Dataset
import pandas as pd
import numpy as np
data = {
'Age': [25, 30, np.nan, 22, 40, np.nan, 28],
'Salary': [50000, 60000, 58000, 52000, np.nan, 62000, 58000],
'City': ['New York', 'Los Angeles', 'New York', 'Chicago', 'Chicago',
'Los Angeles', np.nan],
'Purchased': ['Yes', 'No', 'Yes', 'No', 'Yes', 'No', 'Yes']
}
df = pd.DataFrame(data)
print(df)
from sklearn.preprocessing import StandardScaler
scaler = StandardScaler()
df[['Age', 'Salary']] = scaler.fit_transform(df[['Age', 'Salary']])
from sklearn.preprocessing import MinMaxScaler
scaler = MinMaxScaler()
df[['Age', 'Salary']] = scaler.fit_transform(df[['Age', 'Salary']])
X = df.drop('Purchased', axis=1)
y = df['Purchased']
import pandas as pd
import numpy as np
from sklearn.preprocessing import StandardScaler
# Sample data
data = {
'Age': [25, 30, np.nan, 22, 40, np.nan, 28],
'Salary': [50000, 60000, 58000, 52000, np.nan, 62000, 58000],
'City': ['New York', 'Los Angeles', 'New York', 'Chicago', 'Chicago',
'Los Angeles', np.nan],
'Purchased': ['Yes', 'No', 'Yes', 'No', 'Yes', 'No', 'Yes']
}
df = pd.DataFrame(data)
# Handle missing values
df['Age'] = df['Age'].fillna(df['Age'].mean())
df['Salary'] = df['Salary'].fillna(df['Salary'].median())
df['City'] = df['City'].fillna(df['City'].mode()[0])
# Feature scaling
scaler = StandardScaler()
df[['Age', 'Salary']] = scaler.fit_transform(df[['Age', 'Salary']])
# Split data
X = df.drop('Purchased', axis=1)
y = df['Purchased']
This is the basic flow of preprocessing. Depending on your dataset and problem, you may
want to include feature engineering, handling imbalanced data, or more advanced scaling.
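Three of the steps listed at the start of this section (outlier handling, categorical encoding, and the train/test split) can be sketched as follows. This is one possible approach, not the only one: the IQR clipping rule, one-hot encoding via `pd.get_dummies`, the 75/25 split, and the dataset itself (with an exaggerated salary outlier and no missing values) are assumptions made for illustration.

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Invented data with one obvious Salary outlier (400000).
df = pd.DataFrame({
    'Age': [25, 30, 27, 22, 40, 31, 28, 35],
    'Salary': [50000, 60000, 58000, 52000, 400000, 62000, 58000, 61000],
    'City': ['New York', 'Los Angeles', 'New York', 'Chicago',
             'Chicago', 'Los Angeles', 'New York', 'Chicago'],
    'Purchased': ['Yes', 'No', 'Yes', 'No', 'Yes', 'No', 'Yes', 'No'],
})

# Handle outliers: clip Salary to the 1.5 * IQR fences.
q1, q3 = df['Salary'].quantile([0.25, 0.75])
iqr = q3 - q1
df['Salary'] = df['Salary'].clip(lower=q1 - 1.5 * iqr, upper=q3 + 1.5 * iqr)

# Encode categoricals: one-hot columns for City, 0/1 for the target.
X = pd.get_dummies(df.drop('Purchased', axis=1), columns=['City'])
y = df['Purchased'].map({'No': 0, 'Yes': 1})

# Split: 75% train, 25% test, reproducible via random_state.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42)

print(df['Salary'].max())            # 68375.0 (the outlier was clipped)
print(X_train.shape, X_test.shape)   # (6, 5) (2, 5)
```

In a real project, scalers and encoders are usually fitted on the training split only and then applied to the test split, so that no information from the test data leaks into training.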