Unit 1 Introduction To Machine Learning
UNIT - 1
INTRODUCTION TO MACHINE LEARNING
Machine learning is a subfield of artificial intelligence (AI) that focuses on the
development of algorithms and statistical models that enable computers to
perform tasks without explicit programming instructions. The primary goal of
machine learning is to allow computers to learn from data and make predictions
or decisions based on that learning.
Key Algorithms:
1. Linear Regression: A basic algorithm used for modeling the
relationship between a dependent variable and one or more independent
variables.
2. Logistic Regression: Used for binary classification problems, logistic
regression estimates the probability that an instance belongs to a
particular class.
3. Decision Trees: Decision trees recursively split the data based on
features, resulting in a tree-like structure used for classification or
regression tasks.
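As a rough illustration of the first algorithm above, the sketch below fits a straight line to one independent variable using ordinary least squares. The function name fit_line and the hours/scores data are made up for this example and are not from the notes.

```python
def fit_line(xs, ys):
    """Ordinary least squares for one input: y is approximated by slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # slope = covariance(x, y) / variance(x)
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    slope = cov_xy / var_x
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical data: hours studied (independent) vs. exam score (dependent).
hours = [1, 2, 3, 4, 5]
scores = [52, 54, 56, 58, 60]

slope, intercept = fit_line(hours, scores)
print(slope, intercept)  # slope 2.0, intercept 50.0 for this made-up data
```

Once the line is fitted, a prediction for a new input is simply slope * x + intercept.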
Page 1 of 22
lOMoARcPSD|37347388
Evaluation Metrics:
1. Various metrics are used to assess the performance of machine learning
models, including accuracy, precision, recall, F1-score, ROC-AUC,
mean squared error (MSE), etc.
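To make some of these metrics concrete, here is a minimal sketch that computes accuracy, precision, recall, and F1-score for a binary classifier, plus MSE for a regression model. The labels and predictions are invented for illustration, and the function names are our own.

```python
def classification_metrics(y_true, y_pred):
    """Compute basic metrics for binary labels (1 = positive, 0 = negative)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}

def mean_squared_error(y_true, y_pred):
    """Average squared difference between true and predicted values."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

# Hypothetical true labels and model predictions.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
metrics = classification_metrics(y_true, y_pred)
print(metrics)
print(mean_squared_error([3, 5], [2, 7]))
```

Accuracy alone can be misleading when one class is rare, which is why precision and recall are usually reported alongside it.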
Machine learning finds applications across various domains, including but not
limited to healthcare, finance, marketing, image recognition, natural language
processing, and autonomous vehicles. It continues to evolve rapidly with
advancements in algorithms, computing power, and data availability.
2. Data Representation:
The machine learning life cycle involves seven major steps:
1. Gathering Data
2. Data preparation
3. Data Wrangling
4. Analyze Data
5. Train the model
6. Test the model
7. Deployment
The most important part of the whole process is understanding the problem and
its purpose. Therefore, before starting the life cycle, we need to understand the
problem, because good results depend on understanding it well.
Throughout the life cycle, we solve a problem by creating a machine learning
system called a "model", and we create this model by "training" it. Training a
model requires data, so the life cycle starts with collecting data.
1. Gathering Data:
➢ The first step in the machine learning journey is gathering data. This
means finding and collecting all the information we need.
➢ In this step, we look for data in different places like files, databases, the
internet, or even from mobile devices.
➢ This step matters a great deal because the amount and quality of the
data we collect determine how well our predictions will work. In general,
the more relevant, good-quality data we have, the better our predictions
will be.
➢ We do a few things in this step:
• Figure out where to get data from
• Collect the data
• Put all the data together from different places into one big set
called a dataset.
Once we've got our dataset, we can move on to the next steps in our machine
learning adventure!
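The "put all the data together" idea can be sketched as follows. Both sources here are hypothetical stand-ins: a small CSV text inlined in the code (in place of a file) and a list of records (in place of rows fetched from a database).

```python
import csv
import io

# Hypothetical source 1: a CSV export, inlined so the example is self-contained.
csv_text = "name,age\nAsha,34\nBen,29\n"

# Hypothetical source 2: records already in memory, e.g. read from a database.
db_rows = [{"name": "Chen", "age": 41}]

def load_csv(text):
    """Parse CSV text into a list of records with typed fields."""
    rows = []
    for row in csv.DictReader(io.StringIO(text)):
        rows.append({"name": row["name"], "age": int(row["age"])})
    return rows

# Combine everything into one dataset with a common structure.
dataset = load_csv(csv_text) + db_rows
print(len(dataset), "records collected")
```

The key point is that every source is converted to the same record layout before it is merged.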
2. Data preparation
After we've collected our data, the next step is getting it ready for the rest of the
machine learning process. This step is called data preparation.
2. Data Exploration:
➢ We take some time to really understand the data we have. This
means looking at what kind of data it is, how it's structured, and if
there are any mistakes or missing pieces.
➢ Understanding our data well helps us get better results later on.
During this stage, we try to find patterns, trends, and any unusual
bits of data.
3. Data Pre-processing:
➢ Once we know our data inside out, we get it ready for analysis.
So, in simple terms, data preparation involves getting all our data organized,
understanding it, and then getting it ready to be analyzed by our machine learning
system.
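A simple way to start exploring a dataset is to count missing values and look at basic statistics for each column. The sketch below does this for a tiny invented dataset; the column names and the summarize helper are assumptions made for illustration.

```python
from statistics import mean

# Hypothetical records; None marks a missing value.
records = [
    {"age": 34, "income": 52000},
    {"age": None, "income": 61000},
    {"age": 29, "income": None},
    {"age": 41, "income": 58000},
]

def summarize(rows):
    """Report missing-value counts and basic statistics for each column."""
    summary = {}
    for column in rows[0]:
        values = [row[column] for row in rows]
        present = [v for v in values if v is not None]
        summary[column] = {
            "missing": len(values) - len(present),
            "min": min(present),
            "max": max(present),
            "mean": mean(present),
        }
    return summary

report = summarize(records)
for column, stats in report.items():
    print(column, stats)
```

A report like this quickly shows which columns need attention during pre-processing.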
3. Data Wrangling
Data wrangling is all about getting your data into shape so it's ready for analysis.
This process involves cleaning up the data, selecting the important parts, and
transforming it into a format that's easy to work with in the next steps.
1. Cleaning Data: Sometimes, the data we collect isn't perfect. It might have
missing values, duplicates, invalid entries, or even random noise. Data
wrangling helps us identify and fix these issues to make our data more
reliable.
2. Selecting Variables: Not all the data we collect will be useful for our
analysis. Data wrangling allows us to pick out the important variables that
we actually need.
By using various techniques like filtering, we can clean up our data and make
sure it's in good shape for the next stages of our project. This is crucial because
the quality of our data directly impacts the quality of our final results.
Example
Let's say you're planning a picnic with your friends, and you need to prepare the
food. Data wrangling is like preparing the ingredients for your picnic dishes.
1. Cleaning Data: Imagine you're making a fruit salad, but some of the fruits
have bruises or are overripe. You'd want to clean them up or discard them
before adding them to your salad. Similarly, in data wrangling, you clean
up any messy or irrelevant data, like removing duplicate names or fixing
typos in your guest list.
2. Selecting Variables: For your picnic, you might decide to make
sandwiches, salads, and drinks. You wouldn't need to bring every
ingredient from your kitchen—just the ones you'll use for these specific
dishes. In data wrangling, you select the important pieces of information,
or variables, that you'll need for your analysis, like choosing the names of
the guests and their dietary preferences.
Once you've finished data wrangling, you have a neatly organized set of
ingredients ready to create your picnic dishes—or in this case, to analyze and
draw insights from your data!
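Continuing the picnic example, the sketch below cleans a small guest list (trimming stray spaces, normalizing capitalization, filling in missing values, dropping duplicates) and keeps only the two variables needed. The guest data and field names are invented for illustration.

```python
# Hypothetical raw guest list with a duplicate, inconsistent casing,
# a missing dietary preference, and a column we do not need.
raw_guests = [
    {"name": "Asha ", "diet": "vegetarian", "phone": "555-0101"},
    {"name": "asha", "diet": "vegetarian", "phone": "555-0101"},
    {"name": "Ben", "diet": "", "phone": "555-0102"},
    {"name": "Chen", "diet": "Vegan", "phone": "555-0103"},
]

def wrangle(rows):
    """Clean the records and keep only the name and diet variables."""
    cleaned, seen = [], set()
    for row in rows:
        name = row["name"].strip().title()               # fix spacing and casing
        diet = row["diet"].strip().lower() or "unknown"  # fill missing values
        if name in seen:                                 # drop duplicate guests
            continue
        seen.add(name)
        cleaned.append({"name": name, "diet": diet})     # select variables
    return cleaned

guests = wrangle(raw_guests)
for guest in guests:
    print(guest)
```

The phone column is dropped because it is not needed for planning the food, which mirrors the "selecting variables" step.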
4. Data Analysis
Now that we've got our data all cleaned up and ready to go, it's time to start
analyzing it. This step involves a few key tasks:
1. Choosing Analysis Techniques: Just like picking the right tools for a job,
we need to select the best techniques for analyzing our data. This might
involve methods like sorting, categorizing, or finding patterns.
3. Reviewing Results: Once our models are built, we take a look at what they
tell us. Are they giving us the insights we expected? Do they accurately
represent our data? This step helps us fine-tune our analysis and make any
necessary adjustments.
So, in simple terms, in this step, we take our cleaned-up data and use special
algorithms to build models that help us understand it better.
5. Train Model
In the "Train Model" step, we teach our model to get better at its job. Here's how
it works:
1. Training the Model: Just like teaching a student, we show our model lots
of examples from our datasets. This helps it learn different patterns, rules,
and features in the data.
2. Using Machine Learning Algorithms: Think of these as different
teaching methods. We use various algorithms to train our model, each one
helping it understand the data in a different way.
The goal here is to improve the model's performance so that it can give us better
results when we apply it to real-world problems.
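One common way to "teach" a model is gradient descent: the model makes predictions, we measure the error, and we nudge the parameters to reduce it. The sketch below trains a one-variable linear model this way. The data, learning rate, and number of epochs are illustrative choices, not values from the notes.

```python
def train(xs, ys, learning_rate=0.05, epochs=1000):
    """Fit y = w * x + b by gradient descent on mean squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    history = []  # loss after each pass, to confirm the model is improving
    for _ in range(epochs):
        preds = [w * x + b for x in xs]
        loss = sum((p - y) ** 2 for p, y in zip(preds, ys)) / n
        history.append(loss)
        # Gradients of the loss with respect to w and b.
        grad_w = sum(2 * (p - y) * x for p, y, x in zip(preds, ys, xs)) / n
        grad_b = sum(2 * (p - y) for p, y in zip(preds, ys)) / n
        # Move the parameters a small step against the gradient.
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b, history

# Hypothetical training examples generated from y = 2x + 1.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]

w, b, history = train(xs, ys)
print(f"w={w:.3f}, b={b:.3f}, first loss={history[0]:.3f}, last loss={history[-1]:.6f}")
```

The falling loss is what "getting better at its job" looks like in practice.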
6. Test Model
In the "Test Model" step, we evaluate how well our trained machine learning
model performs. Here's what happens:
1. Testing the Model: After we've trained our model on a specific dataset,
we give it a different dataset to see how well it performs. This dataset is
called a test dataset.
The goal of this step is to ensure that our model is reliable and gives accurate
predictions when applied to new data.
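The idea of holding back a test dataset can be sketched like this. The "model" is deliberately simple (a single threshold), and the data, split fraction, and seed are assumptions made for the example.

```python
import random

def train_test_split(rows, test_fraction=0.25, seed=0):
    """Shuffle a copy of the rows and hold out a fraction as the test dataset."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

def fit_threshold(train_rows):
    """Toy model: a threshold halfway between the average of each class."""
    passed = [hours for hours, label in train_rows if label == 1]
    failed = [hours for hours, label in train_rows if label == 0]
    return (sum(passed) / len(passed) + sum(failed) / len(failed)) / 2

def accuracy(threshold, rows):
    """Fraction of rows where the threshold rule agrees with the true label."""
    correct = sum(1 for hours, label in rows if (hours > threshold) == (label == 1))
    return correct / len(rows)

# Hypothetical data: (hours studied, passed?) with a pass mark at 7 hours.
data = [(hours, 1 if hours >= 7 else 0) for hours in range(1, 13)]

train_rows, test_rows = train_test_split(data)
threshold = fit_threshold(train_rows)           # learn from training data only
test_accuracy = accuracy(threshold, test_rows)  # evaluate on unseen data
print(f"accuracy on the test dataset: {test_accuracy:.2f}")
```

Because the test rows were never used during training, the score gives a fairer picture of how the model handles new data.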
7. Deployment
In the "Deployment" phase, we put our machine learning model to work in the
real world. Here's how it unfolds:
Think of the deployment phase as finalizing and presenting our project's findings.
It's the culmination of all the hard work put into developing and refining the
machine learning model.
1. Image Recognition:
Behind the scenes, Facebook uses a project called "DeepFace" for face
recognition and identifying people in photos. It makes it easier for us to tag
our friends in pictures without having to do it manually.
2. Speech Recognition:
Speech recognition, a common use of machine learning, powers features like
"Search by voice" in Google.
It works by converting spoken words into text, also known as "Speech to text."
Many applications, like Google Assistant, Siri, Cortana, and Alexa, use
machine learning algorithms for speech recognition. They understand and
respond to voice commands, making it easier for users to interact with technology
using their voice.
3. Traffic prediction:
When we use Google Maps to navigate to a new destination, it not only shows us
the shortest route but also predicts the traffic conditions along the way.
2. Historical data: Google Maps also considers past traffic patterns for the
same route and time of day. By analyzing data from previous days, it
estimates how long it typically takes to travel the route at that particular
time.
By combining real-time and historical data, Google Maps can provide accurate
predictions about traffic conditions, helping users plan their journeys more
efficiently.
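The idea of combining real-time and historical data can be illustrated with a simple weighted average. This is only a toy sketch: the function, the weight of 0.7, and the trip times are all assumptions, and it does not represent how Google Maps actually computes its estimates.

```python
def estimate_travel_minutes(historical_minutes, live_minutes, live_weight=0.7):
    """Blend a live estimate with the historical average for the same route and time."""
    historical_avg = sum(historical_minutes) / len(historical_minutes)
    return live_weight * live_minutes + (1 - live_weight) * historical_avg

# Hypothetical travel times (minutes) for the same route at this time on past days.
past_trips = [30, 32, 28, 30]
# Hypothetical estimate from current traffic conditions.
current_estimate = 40

eta = estimate_travel_minutes(past_trips, current_estimate)
print(f"estimated travel time: {eta:.0f} minutes")
```

A higher live_weight trusts current conditions more; a lower one leans on what usually happens.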
4. Product recommendations:
Many e-commerce and entertainment companies like Amazon and Netflix use
machine learning to recommend products to users.
For example, when we search for a product on Amazon, we might start seeing
ads for similar products while browsing the internet. This is because Amazon's
machine learning algorithms understand our interests and suggest relevant
products.
5. Self-driving cars:
Virtual personal assistants like Google Assistant, Alexa, Cortana, and Siri are like
our digital helpers. We can ask them questions or give them commands using our
voice.
They use machine learning to understand and respond to our voice instructions.
Here's how it works:
Machine learning is helping keep our online transactions safe from fraud. When
we make a transaction online, there are various ways fraudsters can try to steal
money, like using fake accounts or stealing information during the transaction.
To detect fraud, one approach uses a type of machine learning model called a
feed-forward neural network. The network analyzes patterns in our transactions
to determine whether they are genuine or fraudulent.
1. Each genuine transaction is converted into hash values, which are unique
identifiers.
2. These hash values are used as inputs for the next round of transactions.
3. Machine learning algorithms look for patterns in these transactions.
Genuine transactions have a specific pattern that changes for fraudulent
ones.
4. If the algorithm detects a pattern that suggests fraud, it alerts us, helping
keep our online transactions secure.
So, thanks to machine learning, our online transactions are safer and more secure,
protecting us from potential fraudsters.
1. Historical data on stock prices, trading volumes, and other relevant factors
are collected.
2. This data is fed into machine learning algorithms like LSTM neural
networks.
3. The algorithms analyze patterns and trends in the data to predict future
stock market movements.
4. Based on these predictions, traders can make informed decisions about
buying, selling, or holding stocks.
By using machine learning, traders can better understand market dynamics and
make better-informed investment choices, with the aim of reducing risk and
improving returns. Such predictions remain uncertain and cannot guarantee either.
Machine learning has made language translation easier than ever before. When
we visit a new place and don't know the language, we can rely on automatic
translation tools to help us communicate.
Google's GNMT (Google Neural Machine Translation) is one such tool. It uses a
type of machine learning called a sequence-to-sequence learning algorithm to
translate text from one language to another.
Concept Learning
✓ Inducing general functions from specific training examples is a central
problem in machine learning.
Imagine you have a big box of LEGO bricks, and your task is to build a specific
type of vehicle, let's say a car. You've never built this exact car before, but you
have a few examples of cars that others have built using LEGO bricks. Now, you
want to figure out how to build your own car using those examples.
1. Training Examples (LEGO Cars): You have some examples of cars that
others have built using LEGO bricks. Each car is built in a specific way,
with certain types of bricks arranged in a particular order.
3. Inducing from Examples (Learning): You start by studying the cars you
have as examples. You look at how they're built, what types of bricks are
used, and how they're arranged. By observing these examples, you try to
induce or infer general rules or patterns that explain how cars are built.
So, the main issue in machine learning is figuring out how to generalize from
specific training examples to create models or functions that can accurately make
predictions or classifications on new, unseen data. Just like in the LEGO car
example, it's about learning from what you've seen to make educated guesses
about what you haven't seen yet.
Imagine you have a collection of fruits on a table, and you're trying to learn what
makes a fruit an "apple" based on examples.
2. Negative Examples (Non-Apples): You also have some other fruits like
oranges, bananas, and pears in your collection. These are the fruits that you
know are definitely not apples because they lack some of the characteristics
you associate with apples – they're not red, they're not round, or they don't
have a stem like apples do.
5. Testing the Concept: Finally, to validate your concept, you may encounter
new fruits that you haven't seen before. Using the concept you've learned,
you can now make predictions about whether these new fruits are apples
or not based on their characteristics.
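One classic way to learn a concept like "apple" from examples is the Find-S algorithm, which is not named in these notes but fits the idea. It starts from the first positive example and generalizes only as much as later positive examples require. The attributes (color, shape, stem) and the fruit data below are made up for illustration.

```python
def find_s(examples):
    """Return the most specific conjunctive hypothesis covering all positive examples."""
    hypothesis = None
    for attributes, is_positive in examples:
        if not is_positive:
            continue  # Find-S ignores negative examples
        if hypothesis is None:
            hypothesis = list(attributes)  # start with the first positive example
        else:
            # Keep values that agree; replace disagreements with "?" (any value).
            hypothesis = [h if h == a else "?" for h, a in zip(hypothesis, attributes)]
    return hypothesis

def matches(hypothesis, attributes):
    """True if the instance satisfies every constraint in the hypothesis."""
    return all(h == "?" or h == a for h, a in zip(hypothesis, attributes))

# Hypothetical training examples: (color, shape, stem) -> is it an apple?
examples = [
    (("red", "round", "stem"), True),
    (("green", "round", "stem"), True),
    (("orange", "round", "no stem"), False),
    (("yellow", "long", "stem"), False),
]

concept = find_s(examples)
print(concept)  # ['?', 'round', 'stem']

# Testing the concept on fruits not seen during training.
print(matches(concept, ("yellow", "round", "stem")))  # True
print(matches(concept, ("red", "long", "stem")))      # False
```

Here the learned concept says "any color, round, with a stem", because the two positive examples differed only in color.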
Imagine you're trying to find a missing puzzle piece that fits into a puzzle you're
working on.
2. The Missing Piece (Best Fit): You're searching for a specific puzzle piece
that fits perfectly into the gap in your puzzle. This missing piece represents
the hypothesis that best fits the training examples you have.
3. Searching for the Missing Piece (Searching for the Best Hypothesis):
You start by examining each puzzle piece in your collection, trying to find
the one that fits the gap in your puzzle. Similarly, in concept learning, you
search through a predefined space of potential hypotheses (possible
definitions or rules) to find the one that best fits the training examples.
5. Selecting the Best Fit (Selecting the Best Hypothesis): After examining
all the puzzle pieces, you select the one that fits the gap in your puzzle the
best. This piece represents the hypothesis that best fits the training
examples. In concept learning, you select the hypothesis that best fits the
examples based on certain criteria, such as accuracy or simplicity.
6. Using the Selected Piece (Using the Best Hypothesis): Once you've
found the missing puzzle piece that fits the gap in your puzzle, you place
it in the puzzle to complete it. Similarly, in concept learning, you use the
selected hypothesis to classify or make predictions about new instances
based on the learned concept.
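The search described above can be sketched as a brute-force loop over a small, predefined hypothesis space. Each hypothesis is scored by accuracy on the training examples, with ties broken in favor of the simpler hypothesis (here, the one with more "?" wildcards). The attributes, examples, and scoring rule are assumptions chosen for illustration.

```python
from itertools import product

# Hypothetical training examples: (color, shape) -> belongs to the concept?
examples = [
    (("red", "round"), True),
    (("green", "round"), True),
    (("yellow", "long"), False),
    (("red", "long"), False),
]

def matches(hypothesis, attributes):
    """True if the instance satisfies every constraint in the hypothesis."""
    return all(h == "?" or h == a for h, a in zip(hypothesis, attributes))

def accuracy(hypothesis, rows):
    """Fraction of examples the hypothesis classifies correctly."""
    return sum(matches(hypothesis, attrs) == label for attrs, label in rows) / len(rows)

# Predefined hypothesis space: each attribute is a specific value or "?" (any value).
colors = ["red", "green", "yellow", "?"]
shapes = ["round", "long", "?"]
hypothesis_space = list(product(colors, shapes))

# Examine every "puzzle piece" and keep the best fit.
best = max(hypothesis_space, key=lambda h: (accuracy(h, examples), h.count("?")))
print(best, accuracy(best, examples))  # ('?', 'round') 1.0
```

Exhaustive search only works for tiny hypothesis spaces; practical algorithms use the structure of the space to avoid checking every candidate.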
Imagine you're playing a guessing game where you have to figure out what kind
of thing your friend is thinking of. They'll give you hints, but you have to guess
based on those hints.
1. Hypothesis Space: Think of this as your list of guesses about what your
friend might be thinking of. Each guess is a different idea, like "animal,"
"vehicle," "food," etc.
friend gives. Then, based on the feedback, you can make more specific
guesses until you figure out what your friend is thinking of.
So, by organizing your guesses from general to specific and searching efficiently,
you can quickly determine what your friend is thinking of in the game. This is
similar to how machine learning algorithms work when trying to classify or
identify things based on given information.
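The general-to-specific ordering can be made precise: one hypothesis is at least as general as another if it accepts everything the other accepts. The sketch below checks this for simple hypotheses made of specific values and "?" wildcards. The two attributes (category, kind) and the guesses are invented to match the guessing game.

```python
def more_general_or_equal(h1, h2):
    """True if every instance accepted by h2 is also accepted by h1."""
    return all(a == "?" or a == b for a, b in zip(h1, h2))

# Hypothetical guesses as (category, kind); "?" means "anything".
anything = ("?", "?")
any_animal = ("animal", "?")
pet_animal = ("animal", "pet")

print(more_general_or_equal(anything, any_animal))    # True
print(more_general_or_equal(any_animal, pet_animal))  # True
print(more_general_or_equal(pet_animal, any_animal))  # False

# Ordering guesses from most general to most specific (more "?" = more general).
guesses = [pet_animal, anything, any_animal]
ordered = sorted(guesses, key=lambda h: h.count("?"), reverse=True)
print(ordered)
```

Starting with the general guesses and narrowing down after each hint avoids testing specific guesses that a broader one has already ruled out.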
Imagine you have a magical box that can predict whether it will rain or not based
on certain clues, like the temperature, humidity, and wind speed. Your goal is to
figure out how this magical box makes its prediction by observing examples of
when it has correctly predicted rain or no rain.
2. Training Examples: These are like past instances where you know the
inputs (like temperature, humidity, etc.) and the corresponding outputs
(whether it rained or not). For example, you might have data for days when
it was sunny and the magical box said "no rain," and for days when it was
cloudy and the box said "rain."
3. Inferring: Your task is to figure out the secret rule or formula inside the
magical box based on these examples. You want to learn how it decides
whether it will rain or not based on the inputs it receives.
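As a minimal sketch of inferring the "secret rule", suppose the box only looks at humidity and predicts rain above some unknown threshold. We can recover a plausible threshold by trying candidate values and keeping the one that agrees best with the observed examples. The observations and the single-threshold assumption are invented for illustration.

```python
# Hypothetical observations: (humidity in percent, did the box say "rain"?)
observations = [(30, False), (45, False), (60, False),
                (75, True), (80, True), (90, True)]

def learn_threshold(examples):
    """Pick the humidity threshold that agrees with the most examples."""
    values = sorted(x for x, _ in examples)
    # Candidate thresholds: midpoints between neighboring observed values.
    candidates = [(a + b) / 2 for a, b in zip(values, values[1:])]

    def accuracy(threshold):
        return sum((x > threshold) == label for x, label in examples) / len(examples)

    return max(candidates, key=accuracy)

threshold = learn_threshold(observations)

def predict_rain(humidity):
    """Apply the inferred rule to a new input."""
    return humidity > threshold

print(threshold)         # 67.5
print(predict_rain(85))  # True
print(predict_rain(40))  # False
```

The learned rule is only a guess that is consistent with the examples seen so far; more examples could move the threshold.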