Unit 1

Introduction to Machine Learning

1.1 Introduction
1.2 Working, Features and Need
1.3 Applications
1.4 Life Cycle
1.5 Machine Learning Required Skills
1.6 Difference between Data Science, Artificial Intelligence, Machine Learning and Deep Learning
1.7 Types of Machine Learning: Supervised, Unsupervised, and Reinforcement Learning
1.8 Key Concepts: Features, Labels, Models, Training and Testing, Overfitting, Underfitting

Introduction-
Machine Learning (ML) is a branch of artificial intelligence (AI) that allows
computers to learn from experience (data) and make decisions without being
explicitly programmed to perform specific tasks. Instead of relying on a fixed
set of rules to handle every possible scenario, machine learning algorithms
analyze data, detect patterns, and make predictions or decisions based on
what they've learned. Over time, as these algorithms are exposed to more
data, they can adapt and improve, becoming better at their tasks through
experience.

The term "machine learning" was first introduced by Arthur Samuel in 1959.

Machine learning algorithms create mathematical models that help the computer make predictions or decisions based on sample data, known as training data. By combining concepts from statistics and computer science, machine learning helps us build models that can predict outcomes. The more data these models have, the better they perform and the more accurate their predictions become.

For Example:

Social Media Recommendations:


On platforms like Facebook or LinkedIn, ML algorithms analyze data about your
friends, interactions, groups, and interests to recommend new people you may
know or want to connect with. By analyzing mutual friends, shared interests,
and interaction patterns, ML algorithms make friend suggestions that align
with your social circle. As you interact more with the platform, the
recommendations become increasingly accurate.

Working, Features and Need-

Working-
Machine learning uses a step-by-step approach to make accurate predictions,
where each step is essential. Here’s a breakdown of the process in simpler
terms:

1. Data Collection: Collecting data is the first and most important step. The
quality of the data largely affects the accuracy of the model’s
predictions. We can gather data from sources like APIs, websites, social
media, or use built-in datasets for learning. It’s important to use data
responsibly, respecting privacy and fairness.
2. Data Preprocessing: Before using data in a model, we clean it up. This
means removing duplicates, handling missing values, addressing outliers,
and standardizing the format. Good preprocessing helps prevent errors
and improves the model’s accuracy.
3. Model Training: Once the data is ready, we choose an algorithm that fits
our problem and use it to build the model. We usually split the data into
training and testing sets. Common algorithms include linear regression,
logistic regression, and decision trees. To get better results, we adjust
model settings, or "hyperparameters," using techniques like grid search
or random search.

Imagine baking cookies and trying to get the perfect batch. You might
adjust the baking time, oven temperature, or amount of ingredients,
which can impact the final taste and texture. Hyperparameters are similar
for machine learning models; they’re settings we tweak to try to get the
best results.

To find the best hyperparameters, we can try methods like:

1. Grid Search: This method tests every possible combination of settings from a grid of options. For example, if you're not sure which temperature and time work best, you try each possible combination until you find the best one. In machine learning, grid search tries each possible hyperparameter combination to find what works best for the model.
2. Random Search: Instead of trying every single combination, random
search picks random combinations of hyperparameters to test. This is
faster because it skips some combinations, but it often still finds a good
result.

4. Model Evaluation: After training the model, we need to test its accuracy. Evaluation metrics like accuracy, precision, recall, F1-score, and AUC measure how well the model performs. We also use cross-validation methods like k-fold and leave-one-out to ensure the model's reliability.
5. Model Deployment: This is the step where we put the trained model
into action. Deployment integrates the model into real-world
applications, allowing it to solve practical problems.

Besides these steps, we also visualize the model’s performance and predictions
to gain insights. For example, creating feature importance plots helps identify
which features most affect predictions, aiding in feature selection and
engineering.
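
The short Python sketch below ties these steps together using scikit-learn. It is only a minimal illustration under assumptions: the file houses.csv, the price target column, and the hyperparameter grid are all hypothetical, not taken from any specific dataset.

```python
# Minimal sketch of the steps above with scikit-learn. The data file,
# column names, and hyperparameter grid are hypothetical.
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import GridSearchCV, train_test_split

# 1-2. Data collection and preprocessing: remove duplicates and missing values.
df = pd.read_csv("houses.csv").drop_duplicates().dropna()
X, y = df.drop(columns=["price"]), df["price"]

# 3. Model training: split the data, then tune hyperparameters with grid search.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
search = GridSearchCV(
    RandomForestRegressor(random_state=42),
    param_grid={"n_estimators": [100, 300], "max_depth": [None, 10]},
    cv=5,
)
search.fit(X_train, y_train)

# 4. Model evaluation on data the model has never seen.
predictions = search.predict(X_test)
print("Best hyperparameters:", search.best_params_)
print("Mean absolute error:", mean_absolute_error(y_test, predictions))

# Feature importance values, which can be plotted to guide feature selection.
importances = pd.Series(search.best_estimator_.feature_importances_, index=X.columns)
print(importances.sort_values(ascending=False))
```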
Features-
 Adaptability: ML models improve as they’re exposed to more data,
making them highly adaptable.
 Scalability: ML can process large datasets, uncovering insights and
trends from large volumes of data that would be difficult for humans to
analyze.
 Efficiency: Automates repetitive or complex tasks, saving time and
effort.
Need-
 Rising Demand: Machine learning is increasingly in demand due to its
ability to handle tasks too complex for direct human intervention.
 Data Processing Power: Humans can’t manually process the vast
amounts of data available today, so we rely on machine learning to
manage and interpret this data.
 Automatic Learning from Data: Machine learning algorithms learn from
large datasets, automatically building models and making predictions
based on patterns in the data.
 Efficiency and Cost Savings: Machine learning saves time and money by
automating tasks and reducing the need for manual intervention.
 Performance Measurement: We use a cost function to evaluate how
well a machine learning model is performing and adjust it as needed.
 Real-World Applications: Machine learning is used in self-driving cars,
fraud detection, face recognition, and social media recommendations.
 Industry Adoption: Companies like Netflix and Amazon use machine
learning to analyze user data, understand customer preferences, and
recommend products.
Key Benefits:

 Rapid Growth in Data: Machine learning helps us manage and utilize the
vast amounts of data being produced.
 Solving Complex Problems: It tackles complex tasks that are challenging
for humans.
 Improving Decision-Making: Machine learning supports decision-making
across various sectors, including finance and healthcare.
 Uncovering Patterns: It identifies hidden patterns and extracts valuable
information from data.

Applications-
Machine learning is a buzzword in today's technology, and it is growing rapidly day by day. We use machine learning in our daily lives, often without knowing it, in tools such as Google Maps, Google Assistant, and Alexa. Below are some of the most popular real-world applications of machine learning:
1. Image Recognition:
Image recognition is one of the most common applications of machine learning. It is used to identify objects, people, places, and more in digital images. A popular use case of image recognition and face detection is the automatic friend tagging suggestion:

Facebook provides a feature of automatic friend tagging suggestions. Whenever we upload a photo with our Facebook friends, we automatically get a tagging suggestion with their names, and the technology behind this is machine learning's face detection and recognition algorithm.

It is based on the Facebook project named "DeepFace," which is responsible for face recognition and person identification in the picture.

2. Speech Recognition
While using Google, we get an option to "Search by voice"; this falls under speech recognition, a popular application of machine learning.
Speech recognition is the process of converting voice instructions into text, and it is also known as "speech to text" or "computer speech recognition." At present, machine learning algorithms are widely used in speech recognition applications. Google Assistant, Siri, Cortana, and Alexa use speech recognition technology to follow voice instructions.

3. Traffic prediction:
If we want to visit a new place, we take the help of Google Maps, which shows us the correct path with the shortest route and predicts the traffic conditions.

It predicts traffic conditions, such as whether traffic is clear, slow-moving, or heavily congested, in two ways:

o Real-time location of the vehicle from the Google Maps app and sensors
o Average time taken on past days at the same time of day
Everyone who uses Google Maps helps make the app better. It takes information from the user and sends it back to its database to improve performance.

4. Product recommendations:
Machine learning is widely used by e-commerce and entertainment companies such as Amazon and Netflix for product recommendations. Whenever we search for a product on Amazon, we start seeing advertisements for the same product while browsing the internet in the same browser, and this is because of machine learning.

Google understands the user's interests using various machine learning algorithms and suggests products according to customer interest.

Similarly, when we use Netflix, we see recommendations for entertainment series, movies, and more; this is also done with the help of machine learning.

5. Self-driving cars:
One of the most exciting applications of machine learning is self-driving cars.
Machine learning plays a significant role in self-driving cars. Tesla, a well-known car manufacturer, is working on self-driving cars and uses machine learning methods to train its car models to detect people and objects while driving.
6. Email Spam and Malware Filtering:
Whenever we receive a new email, it is automatically filtered as important, normal, or spam. Important mail arrives in our inbox marked with the important symbol, while spam emails go to our spam box, and the technology behind this is machine learning. Below are some spam filters used by Gmail:

o Content Filter
o Header filter
o General blacklists filter
o Rules-based filters
o Permission filters
Some machine learning algorithms such as Multi-Layer Perceptron, Decision
tree, and Naïve Bayes classifier are used for email spam filtering and malware
detection.
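
As a hedged illustration of the Naïve Bayes approach mentioned above, the sketch below trains a tiny spam classifier with scikit-learn; the example emails and their labels are invented purely for demonstration.

```python
# Minimal sketch of a Naive Bayes spam filter with scikit-learn.
# The tiny example emails and labels below are invented for illustration.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

emails = ["win a free prize now", "meeting agenda for monday",
          "free money click here", "lunch tomorrow?"]
labels = ["spam", "not spam", "spam", "not spam"]

# Turn the raw text into word-count features, then fit the classifier.
vectorizer = CountVectorizer()
X = vectorizer.fit_transform(emails)
model = MultinomialNB()
model.fit(X, labels)

# Classify a new, unseen email.
print(model.predict(vectorizer.transform(["claim your free prize"])))
```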

7. Virtual Personal Assistant:


We have various virtual personal assistants such as Google Assistant, Alexa, Cortana, and Siri. As the name suggests, they help us find information using voice instructions. These assistants can help us in various ways just through voice instructions, such as playing music, calling someone, opening an email, or scheduling an appointment.

Machine learning algorithms are an important part of these virtual assistants. They record our voice instructions, send them to a server in the cloud, decode them using ML algorithms, and act accordingly.

8. Online Fraud Detection:


Machine learning is making our online transactions safe and secure by detecting fraudulent transactions. Whenever we perform an online transaction, fraud can take place in various ways, such as fake accounts, fake IDs, or money being stolen in the middle of a transaction. To detect this, a feed-forward neural network helps us by checking whether a transaction is genuine or fraudulent.

For each genuine transaction, the output is converted into hash values, and these values become the input for the next round. Genuine transactions follow a specific pattern, which changes for fraudulent transactions; this is how fraud is detected and our online transactions are made more secure.
9. Stock Market trading:
Machine learning is widely used in stock market trading. In the stock market, there is always a risk of ups and downs in share prices, so machine learning's long short-term memory (LSTM) neural networks are used to predict stock market trends.

10. Medical Diagnosis:


In medical science, machine learning is used for disease diagnosis. With it, medical technology is growing very fast and can build 3D models that predict the exact position of lesions in the brain.

It helps in finding brain tumors and other brain-related diseases more easily.


11. Automatic Language Translation:
Nowadays, if we visit a new place and do not know the language, it is not a problem at all, as machine learning helps us by converting the text into a language we know. Google's GNMT (Google Neural Machine Translation) provides this feature; it is a neural machine translation system that translates text into our familiar language, and this is called automatic translation.

The technology behind automatic translation is a sequence-to-sequence learning algorithm, which is used together with image recognition to translate text from one language to another.

Machine learning Life cycle-


The machine learning lifecycle is a process that helps build effective machine
learning projects. This lifecycle involves a series of steps, each essential for
developing a reliable and accurate model that can solve a particular problem.
Here’s a simplified overview of these steps:

1. Gathering Data

 The first step is to gather data from various sources, like files, databases,
or the internet.
 The quantity and quality of data greatly impact the accuracy of
predictions, so more data typically means better results.
2. Data Preparation

 After gathering data, we organize it and prepare it for the next steps.
 This includes putting the data in order and combining it into a single set,
ready for analysis.

3. Data Wrangling

 Data wrangling involves cleaning and structuring the raw data.
 This step removes errors, like missing or duplicate values, and formats
data for easier analysis.

4. Data Analysis

 With clean data, we begin analyzing it to understand patterns and trends.
 During this stage, we choose the type of model (such as classification or
regression) that best fits our problem.

5. Model Training

 In this step, we use algorithms to train our model, allowing it to learn from the data patterns.
 Training helps the model recognize patterns and rules that it can apply
to make predictions.

6. Model Testing

 After training, we test the model using new data to see how accurate it
is.
 Testing helps measure how well the model performs and whether it
meets our requirements.

7. Deployment

 Once the model works well, we deploy it in the real world, where it can
help solve practical problems.
 Before finalizing, we verify that the model continues to perform well
with real-time data.
These steps ensure that the machine learning model is built, tested, and ready
for practical use. Each stage is essential for creating a model that meets project
needs and delivers accurate results.
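
For the data gathering, wrangling, and analysis stages, a few pandas operations are typical. The sketch below is a minimal illustration; the file customers.csv and the column names (age, signup_date, country, purchases) are assumptions made for the example.

```python
# Minimal sketch of the gathering, wrangling, and analysis stages with pandas.
# The file name and column names are hypothetical.
import pandas as pd

# 1. Gathering data from a file (it could also come from a database or an API).
df = pd.read_csv("customers.csv")

# 2-3. Data preparation and wrangling: remove duplicates, fill missing values, fix formats.
df = df.drop_duplicates()
df["age"] = df["age"].fillna(df["age"].median())
df["signup_date"] = pd.to_datetime(df["signup_date"])

# 4. Data analysis: a quick look at patterns before choosing a model.
print(df.describe())
print(df.groupby("country")["purchases"].mean())
```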

Machine Learning Required Skills-


To work in machine learning, individuals need skills in several key areas:

 Mathematics and Statistics: A solid understanding of concepts like linear algebra, calculus, probability, and statistics is essential. These concepts help in understanding how algorithms make predictions.
 Programming Skills: Knowledge of programming languages, especially
Python or R, is critical for implementing and testing ML models.
 Data Handling: Knowing how to clean, preprocess, and transform data is
crucial for building high-quality models. This includes understanding
libraries like Pandas and NumPy in Python.
 Critical Thinking: Understanding and interpreting the results, fine-tuning
models, and solving problems creatively are important for successful ML
work.
Example: When building a model to recommend music, data handling skills
help process user data, math and stats help analyze patterns, and
programming allows developers to implement and deploy the model.

Difference between Data Science, Artificial Intelligence, Machine Learning and Deep Learning-

| Aspect | Data Science | Artificial Intelligence (AI) | Machine Learning (ML) | Deep Learning |
|---|---|---|---|---|
| Definition | Field focused on extracting insights from data through analysis, statistics, and computation | Branch of computer science aimed at creating intelligent systems that can simulate human behavior | Subset of AI focused on enabling machines to learn from data and improve with experience | Subset of ML that uses neural networks with multiple layers for complex data processing |
| Primary Goal | To analyze and interpret data to aid decision-making | To create systems that can mimic human intelligence | To allow machines to learn from data and make predictions | To enable machines to learn and make decisions on complex data, often involving images or text |
| Techniques Used | Statistics, data mining, data cleaning, visualization | Algorithms, logical reasoning, problem-solving, expert systems | Supervised, unsupervised, and reinforcement learning | Neural networks, especially deep neural networks (CNNs, RNNs) |
| Data Dependency | Heavily data-dependent for insights and decision support | Not always data-dependent; may use rule-based or logic systems | Requires data for training models | Requires large amounts of data, especially labeled data |
| Example Use Cases | Customer segmentation, fraud detection, recommendation systems | Robotics, natural language processing, autonomous vehicles | Email filtering, recommendation engines, predictive maintenance | Image recognition, language translation, self-driving car perception |
| Skill Requirements | Statistics, programming, data wrangling, machine learning basics | Logic, programming, understanding of AI algorithms | Statistics, probability, programming, knowledge of ML algorithms | Advanced math, deep learning frameworks (TensorFlow, PyTorch) |
| Computational Power | Moderate to high, depending on data size | Varies; can be low for simple tasks, high for complex AI | Moderate to high, depending on algorithm complexity | High, due to intensive training of deep neural networks |
| Key Tools/Libraries | Python, R, SQL, Pandas, Matplotlib | Varies; includes AI platforms like IBM Watson, Google AI | Scikit-Learn, TensorFlow, PyTorch | TensorFlow, Keras, PyTorch |

Types of Machine Learning:


Machine learning (ML) can be categorized into three main types based on the
way they learn from data: Supervised Learning, Unsupervised Learning, and
Reinforcement Learning. Each type is distinct in how it approaches learning
from data and solving problems.
1. Supervised Learning

 Definition: In supervised learning, the model is trained on a labeled dataset. Each training example consists of an input-output pair, meaning the input data is accompanied by the correct output (label). The goal of the model is to learn the relationship between the inputs and outputs so it can predict the output for new, unseen data.
 How It Works: The algorithm analyzes the labeled examples and
attempts to map inputs to outputs by recognizing patterns in the data.
During training, it continuously adjusts its parameters to reduce the
difference between its predicted outputs and the actual labels.

 Common Algorithms: Linear Regression, Logistic Regression, Support Vector Machines (SVM), Decision Trees, k-Nearest Neighbors (k-NN), and Neural Networks.
 Example Use Cases:
o Spam Detection: Emails are labeled as "spam" or "not spam," and
the algorithm learns to classify new emails into these categories.
o Image Recognition: The algorithm is trained with labeled images
(e.g., a photo of a cat labeled as "cat") to recognize new images.
o Credit Scoring: Financial institutions use supervised learning
models to evaluate credit risk based on labeled data (e.g., past
loans with repayment or default labels).
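
A minimal sketch of supervised learning in code, assuming scikit-learn: the feature values (counts of suspicious words and links per email) and the spam/not-spam labels below are invented purely for illustration.

```python
# Minimal sketch of supervised learning: labeled examples train a classifier.
# The feature values and labels below are invented for illustration.
from sklearn.linear_model import LogisticRegression

# Features: [number of suspicious words, number of links] for each email.
X = [[5, 3], [0, 1], [7, 4], [1, 0], [6, 5], [0, 0]]
# Labels: 1 = spam, 0 = not spam.
y = [1, 0, 1, 0, 1, 0]

model = LogisticRegression()
model.fit(X, y)                 # learn the mapping from features to labels
print(model.predict([[4, 2]]))  # predict the label of a new, unseen email
```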
2. Unsupervised Learning

 Definition: In unsupervised learning, the model is trained on unlabeled data, meaning there are no predefined categories or outputs. Instead, the algorithm tries to discover patterns, groupings, or structures within the data on its own.
 How It Works: Since there are no labels, the algorithm explores the data
to identify patterns, often based on similarities or differences in
features. The goal is to find underlying structures that can organize the
data into meaningful groupings or segments.
 Common Algorithms: K-means Clustering, Hierarchical Clustering,
Principal Component Analysis (PCA), and Association Rule Learning.
 Example Use Cases:
o Customer Segmentation: E-commerce sites analyze customer
behavior (e.g., browsing and purchase history) to segment users
into groups, allowing for more personalized marketing.
o Anomaly Detection: Detecting unusual patterns or outliers in
data, such as fraudulent transactions in financial datasets.
o Market Basket Analysis: In retail, unsupervised learning can
identify associations between products frequently bought
together, helping businesses improve their product
recommendations.
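
A minimal sketch of unsupervised learning, assuming scikit-learn: K-means groups a handful of invented customer records (annual spend, visits per month) into segments without being given any labels.

```python
# Minimal sketch of unsupervised learning: K-means groups unlabeled data.
# The customer records (annual spend, visits per month) are invented for illustration.
import numpy as np
from sklearn.cluster import KMeans

customers = np.array([[200, 2], [250, 3], [2200, 25], [240, 2], [2100, 22], [2300, 27]])

# No labels are given; the algorithm finds the groupings on its own.
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0)
segments = kmeans.fit_predict(customers)
print(segments)  # e.g. [0 0 1 0 1 1]: low-spend vs. high-spend customers
```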

3. Reinforcement Learning

 Definition: Reinforcement learning (RL) is a type of machine learning where an agent learns by interacting with an environment and receiving feedback in the form of rewards or penalties. The agent makes a series of decisions and adjusts its actions over time to maximize cumulative rewards.


 How It Works: The agent observes the current state of the environment
and decides on an action. Based on this action, the environment
changes, and the agent receives a reward or penalty. Through trial and
error, the agent learns the best actions to take in various situations to
maximize the total reward over time. Reinforcement learning involves balancing the exploration of new actions (exploration) against sticking to known successful actions (exploitation).
 Common Algorithms: Q-learning, Deep Q-Networks (DQN), Policy
Gradient Methods, and Actor-Critic Methods.
 Example Use Cases:
o Game-playing AI: RL is commonly used to train AI systems in
games such as chess, Go, and video games, where the agent
learns strategies through self-play and competition, refining its
moves with each game.
o Autonomous Driving: Reinforcement learning helps self-driving
cars learn how to navigate roads by interacting with simulated or
real environments, receiving feedback based on safe driving and
rule adherence.
o Robotic Control: Robots learn to perform tasks, such as picking up
objects or walking, by receiving rewards or penalties based on the
success of their actions.
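
The sketch below shows tabular Q-learning on a tiny invented environment (a corridor of five states with a reward at the right end). It is a simplified illustration of the reward-driven update loop described above, not a production RL setup.

```python
# Minimal sketch of tabular Q-learning on a tiny corridor environment.
# The environment (5 states in a row, reward at the right end) is invented for illustration.
import random

n_states, actions = 5, [-1, +1]          # actions: move left or move right
alpha, gamma, epsilon = 0.5, 0.9, 0.2    # learning rate, discount factor, exploration rate
Q = {(s, a): 0.0 for s in range(n_states) for a in actions}

def best_action(state):
    # Greedy action with random tie-breaking.
    best = max(Q[(state, a)] for a in actions)
    return random.choice([a for a in actions if Q[(state, a)] == best])

for episode in range(200):
    state = 0
    while state != n_states - 1:  # an episode ends when the goal state is reached
        # Exploration vs. exploitation.
        action = random.choice(actions) if random.random() < epsilon else best_action(state)
        next_state = min(max(state + action, 0), n_states - 1)
        reward = 1.0 if next_state == n_states - 1 else 0.0
        # Q-learning update: move Q toward reward + discounted best future value.
        best_next = max(Q[(next_state, a)] for a in actions)
        Q[(state, action)] += alpha * (reward + gamma * best_next - Q[(state, action)])
        state = next_state

# After training, the learned policy prefers moving right (+1) in every state.
print({s: best_action(s) for s in range(n_states)})
```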
Key Concepts:
1. Features

Features are the individual measurable properties or characteristics of the data used to make predictions in machine learning models. They are the input variables or predictors that the model uses to identify patterns and relationships. The quality and relevance of features significantly impact the model's performance.

For example, in a model predicting house prices, the features could include:

 Square footage: The size of the house.
 Location: The geographical area or neighborhood where the house is
located.
 Number of bedrooms: The number of rooms designated for sleeping.
 Age of the house: How old the house is, which could affect its condition
and value.

Good feature engineering, which involves selecting and transforming the features, is crucial for building effective machine learning models.
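
A minimal sketch of features and simple feature engineering with pandas; the housing values below are invented for illustration.

```python
# Minimal sketch of features for a house price model; the values are invented.
import pandas as pd

houses = pd.DataFrame({
    "square_footage": [1200, 2500, 900],
    "location":       ["suburb", "city", "rural"],
    "bedrooms":       [3, 4, 2],
    "age_years":      [10, 2, 35],
})

# Simple feature engineering: derive a new feature and encode the categorical one.
houses["sqft_per_bedroom"] = houses["square_footage"] / houses["bedrooms"]
features = pd.get_dummies(houses, columns=["location"])  # one-hot encode location
print(features)
```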

2. Labels

Labels represent the output variables the model is trying to predict. They are
also referred to as target variables or dependent variables. Labels are what the
model aims to learn to predict based on the input features.

In supervised learning, the model is trained using data where both the features
and labels are known. For example:

 In the house price prediction model, the label would be the house price,
which the model tries to predict based on the features like square
footage, location, and number of bedrooms.
 In a classification problem, the label could be a category such as "spam"
or "not spam" for an email classification model.

3. Models

A model is an algorithm that maps the features to the labels by learning patterns in the data. Different types of models are used depending on the nature of the data and the task (regression, classification, etc.).

Common examples of machine learning models include:

 Linear Regression: A statistical model that predicts a continuous output (e.g., house price) by finding the best-fitting linear relationship between the features and the label.
 Decision Trees: A non-linear model that splits the data into subsets
based on feature values, aiming to make predictions at each leaf node
based on the majority class or average label.
 Neural Networks: A family of models inspired by the human brain,
composed of interconnected nodes (neurons), often used for complex
tasks like image recognition, speech processing, and natural language
understanding.
Each model has its strengths and weaknesses, and the choice of model
depends on factors like the complexity of the data, the interpretability of the
model, and the task at hand.

4. Training and Testing

To evaluate the performance of a machine learning model, the data is typically split into two sets: training and testing.

 Training Set: This is the portion of the data used to train the model. The
model learns patterns from this data by adjusting its internal parameters
(such as weights in a neural network or coefficients in a linear regression
model).
 Testing Set: Once the model has been trained, it is tested on a separate
set of data that it has not seen before. This helps evaluate how well the
model can generalize to new, unseen data. The testing set gives an
indication of how the model might perform in real-world scenarios.

This split is often done using techniques like:

 Train-test split: A simple random division of the data into two parts (e.g.,
80% for training, 20% for testing).
 Cross-validation: A more robust method where the data is divided into
multiple folds, and the model is trained and tested multiple times, with
each fold serving as the testing set once.
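
Both techniques take only a few lines with scikit-learn. The sketch below uses the built-in Iris dataset so it is self-contained; the choice of logistic regression is just an example.

```python
# Minimal sketch of a train-test split and k-fold cross-validation with scikit-learn.
# Uses a built-in toy dataset so the example is self-contained.
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score, train_test_split

X, y = load_iris(return_X_y=True)

# Train-test split: 80% for training, 20% held out for testing.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
print("Test accuracy:", model.score(X_test, y_test))

# 5-fold cross-validation: each fold serves as the testing set once.
scores = cross_val_score(LogisticRegression(max_iter=1000), X, y, cv=5)
print("Cross-validation accuracy per fold:", scores)
```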

5. Overfitting

Overfitting occurs when a model learns the details and noise in the training
data to such an extent that it negatively impacts its performance on new,
unseen data. It essentially memorizes the training data rather than
generalizing the underlying patterns.

This can happen when:

 The model is too complex (e.g., a very deep decision tree or a neural
network with too many layers).
 There is insufficient training data, which leads to the model learning
irrelevant patterns that don't generalize.
Overfitting can be detected by comparing the performance of the model on
the training set and the testing set. If the model performs well on the training
set but poorly on the testing set, overfitting is likely.
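
A minimal sketch of this check, assuming scikit-learn: an unconstrained decision tree tends to memorize the built-in breast cancer dataset, so its training accuracy comes out noticeably higher than its test accuracy.

```python
# Minimal sketch of detecting overfitting by comparing train and test scores.
# Uses a built-in toy dataset; the gap between the two scores is the warning sign.
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# An unconstrained (very deep) tree tends to memorize the training data.
deep_tree = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)
print("Train accuracy:", deep_tree.score(X_train, y_train))  # typically close to 1.0
print("Test accuracy: ", deep_tree.score(X_test, y_test))    # noticeably lower
```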

6. Underfitting

Underfitting occurs when a model is too simple to capture the underlying patterns in the data. The model fails to learn from the training data and, as a result, performs poorly on both the training data and new data.

This happens when:

 The model is too simple (e.g., a linear model for data that has a non-
linear relationship).
 The training data is insufficient or lacks diversity.
 The model is not trained for long enough (e.g., too few training iterations or too little data).
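
A minimal sketch of underfitting: fitting a straight line (linear regression) to data with a clearly quadratic relationship gives a poor score even on the training data. The data here is generated purely for illustration.

```python
# Minimal sketch of underfitting: a straight line fitted to clearly non-linear data.
import numpy as np
from sklearn.linear_model import LinearRegression

X = np.linspace(-3, 3, 50).reshape(-1, 1)
y = X.ravel() ** 2  # quadratic relationship between input and output

# A linear model is too simple for this pattern: it scores poorly even on the training data.
model = LinearRegression().fit(X, y)
print("R^2 on the training data:", model.score(X, y))  # close to 0 -> underfitting
```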
