Lecture 1


Artificial Intelligence

Artificial intelligence (AI) is the ability of a digital computer or computer-controlled robot
to perform tasks commonly associated with intelligent beings. The term is frequently
applied to the project of developing systems endowed with the intellectual processes
characteristic of humans, such as the ability to reason, discover meaning, generalize, or
learn from past experience.

Topics

Machine learning. Supervised learning. Unsupervised learning.
Classification. Classification metrics. Decision tree. Random forest. K-nearest neighbors (KNN) algorithm.
Linear regression.
Clustering. Clustering with K-Means.
Dimensionality reduction. Principal Components Analysis (PCA).
Artificial Neural Networks (ANN).
Feed Forward Neural Networks. Multilayer Perceptron. Logistic regression.
Loss. MSE Loss. Cross-Entropy Loss.
Optimizers.
Regularization techniques.
Computer vision tasks. Convolutional Neural Networks (CNN). Examples of CNN.
Recurrent Neural Networks (RNN). Long Short-Term Memory (LSTM).
Attention in Neural Networks. Transformers.
Natural Language Processing. Word embeddings. Sentiment analysis.
Generative Adversarial Networks (GAN).
Reinforcement Learning.

Assignments (Python, Scikit-Learn, PyTorch) – 50%
Exam – 50%

Gintautas Daunys
gintautas.daunys@sa.vu.lt

Gavin Hackeling (2017). Mastering Machine Learning with scikit-learn. Packt Publishing.
https://www.packtpub.com/product/mastering-machine-learning-with-scikit-learn-second-edition/9781788299879
Ian Goodfellow, Yoshua Bengio, Aaron Courville (2016). Deep Learning.
https://www.deeplearningbook.org/
Eli Stevens, Luca Antiga, Thomas Viehmann (2020). Deep Learning with PyTorch. Manning Publications.
https://www.manning.com/books/deep-learning-with-pytorch
Turing test

The Turing test is a measure of a machine's ability to exhibit
intelligent behavior that is indistinguishable from that of a human.
Proposed by mathematician and computer science pioneer
Alan Turing in 1950, the test involves a human evaluator who
engages in natural language conversations with both a human
and a machine, without knowing which is which. If the
evaluator is unable to reliably determine which is the machine,
the machine is said to have passed the Turing test and
demonstrated a sufficient level of human-like intelligence.
While the Turing test has been a subject of much debate and
criticism, it remains a widely recognized benchmark for AI
research and development.
Chinese Room
The Chinese Room argument is a philosophical critique of the Turing test, and it
challenges the notion that a machine can truly understand human language and thought.
The argument, first presented by philosopher John Searle in 1980, is based on the idea of a
person who does not understand Chinese but is able to respond to Chinese messages by
looking up the answers in a rulebook. In the same way, Searle argues, a machine could
manipulate symbols and produce appropriate responses without truly understanding the
meaning of the language.
According to the Chinese Room argument, the Turing test only measures a machine's
ability to simulate human-like behavior, but it does not prove that the machine has true
understanding or consciousness. This argument highlights the limitations of using
behavior as a measure of intelligence and raises important questions about the nature of
human thought and the possibility of creating truly intelligent machines.
General AI versus Narrow AI
General or Strong AI: General or strong AI refers to systems that have the capability to
perform a wide range of tasks, including those that typically require human intelligence,
such as decision-making and problem-solving.
Narrow or Weak AI: Narrow or weak AI is focused on performing specific tasks and is
trained to perform a specific task such as image recognition, language translation, or
playing a game.

Examples of general AI: Minerva, ChatGPT, Bard


https://ai.googleblog.com/2022/06/minerva-solving-quantitative-reasoning.html
https://minerva-demo.github.io/#category=Algebra&index=1
https://openai.com/blog/chatgpt/
https://chat.openai.com/chat
Narrow Tasks
Classification (Recognition)
Clustering
Data modeling
Natural Language Processing
Feature extraction (for classification)
Content generation

Narrow AI example: Google Translate


Machine Learning

Machine learning is a subfield of artificial intelligence that focuses on the development of
algorithms and statistical models that enable machines to improve their performance on a
specific task over time, without being explicitly programmed.

In machine learning, a model is trained on a large dataset and uses this data to make
predictions or decisions. The model's accuracy is then evaluated and the parameters of the
model are adjusted accordingly. This process is repeated multiple times until the model is
able to perform the task with a high degree of accuracy.

There are several types of machine learning algorithms, including:

1. Supervised Learning
2. Unsupervised Learning
3. Reinforcement Learning
1. Supervised Learning: This involves training a model on a labeled dataset, where the
correct output for each input is known. The model is then used to make predictions on
new, unseen data.
2. Unsupervised Learning: This involves training a model on an unlabeled dataset, where
the structure and relationships within the data are learned by the model.
3. Reinforcement Learning: This type of learning involves a model that receives rewards
or penalties for its actions and uses this feedback to improve its performance.
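The first two paradigms can be contrasted with a minimal scikit-learn sketch (tiny synthetic data chosen here purely for illustration; reinforcement learning is omitted, since it requires an environment that supplies rewards):

```python
# Supervised vs. unsupervised learning on toy data (illustrative values).
from sklearn.tree import DecisionTreeClassifier
from sklearn.cluster import KMeans

X = [[0], [1], [2], [3]]

# Supervised: labeled data (X, y); the model learns the input-to-label mapping.
y = [0, 0, 1, 1]
clf = DecisionTreeClassifier().fit(X, y)
print(clf.predict([[2.5]]))

# Unsupervised: no labels; the model discovers structure (here, two clusters).
km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
print(km.labels_)
```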

Machine learning is used in a wide range of applications, such as speech recognition,
image classification, recommendation systems, and autonomous vehicles. The technology
is constantly evolving, with new breakthroughs and advancements being made all the
time.

Data split is a process of dividing the data into different subsets for training, testing, and
validation. The main purpose of data split is to avoid overfitting and to evaluate the
performance of the model.

The data split is typically done as follows:


Training set: The training set is the largest subset of the data, and it is used to train the
model. The model uses the training set to learn the relationships between the features and
the target variable.

Testing set: The testing set is a smaller subset of the data, and it is used to evaluate the
performance of the model. The model is tested on the testing set to determine how well it
generalizes to new data. The testing set should be independent of the training set, and it
should not be used during the training process.

Validation set: The validation set is used to fine-tune the model and to select the best
model. The validation set is used to evaluate different models and to choose the model
with the best performance. The validation set should also be independent of the training
set and the testing set.

The size of each subset depends on the size of the data and the type of problem. A
common split is 70% training data, 20% validation data, and 10% testing data. However,
the split ratio can vary depending on the specific requirements and goals of the analysis.
It is important to randomly split the data into subsets to ensure that the data is
representative of the entire dataset and to prevent any biases in the results. This can be
achieved by randomly sampling the data or by using cross-validation techniques.
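A 70/20/10 split like the one described above can be produced with two calls to scikit-learn's train_test_split (synthetic data for illustration; the 2/9 ratio turns the remaining 90% into 20% of the original):

```python
# Sketch of a 70% train / 20% validation / 10% test split.
import numpy as np
from sklearn.model_selection import train_test_split

X = np.arange(100).reshape(100, 1)
y = np.arange(100)

# First carve off 10% of the data as the held-out test set.
X_rest, X_test, y_rest, y_test = train_test_split(
    X, y, test_size=0.10, random_state=0)

# Then split the remaining 90% so that 20% of the original becomes validation.
X_train, X_val, y_train, y_val = train_test_split(
    X_rest, y_rest, test_size=2/9, random_state=0)

print(len(X_train), len(X_val), len(X_test))  # 70 20 10
```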
Classification
Classification is a type of machine learning problem where the goal is to predict the class
or category to which a given input belongs. For example, in image classification, an
algorithm might be trained to recognize the presence of specific objects (e.g. cats, dogs,
buildings) in an image and classify each image based on which object(s) it contains. In
spam email classification, an algorithm might be trained to classify incoming emails as
spam or not spam.

Classification algorithms use various statistical and machine learning techniques to learn
the relationship between the input data and the corresponding class labels. Once the model
is trained, it can then be used to make predictions on new, unseen data.
Binary classification is a type of machine learning problem where an algorithm is trained
to predict one of two possible outcomes, usually represented by 1 and 0 or "yes" and "no".
Examples of binary classification problems include spam filtering, sentiment analysis, and
diagnosing a medical condition as positive or negative.

Multiclass classification, on the other hand, is a classification problem where an algorithm
is trained to predict one of more than two possible outcomes. It is also known as
multinomial classification or multicategory classification. An example of a multiclass
classification problem is image classification, where an algorithm is trained to recognize
different objects in an image such as a dog, cat, person, etc. In multiclass classification,
there can be several outcomes, and the algorithm must learn to map inputs to one of those
outcomes.

Binary classification
Confusion matrix

                   Negative class          Positive class
Negative input     True Negative (TN)      False Positive (FP)
Positive input     False Negative (FN)     True Positive (TP)

Accuracy: (TP + TN) / (TP + TN + FP + FN)
Precision: TP / (TP + FP)
Recall: TP / (TP + FN)
F1 score: 2 * Precision * Recall / (Precision + Recall)

Worked example (medical screening):

                         No disease     Disease was found
10000 healthy subjects   TN = 9990      FP = 10
100 with disease         FN = 50        TP = 50

Accuracy = (9990 + 50) / 10100 ≈ 0.994
Precision = 50 / (50 + 10) ≈ 0.833
Recall = 50 / (50 + 50) = 0.5
F1 score = 2 * 0.833 * 0.5 / (0.833 + 0.5) ≈ 0.625

Python, Scikit-Learn
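These metrics can be computed directly from the confusion-matrix counts; a minimal Python sketch using the worked numbers (TN = 9990, FP = 10, FN = 50, TP = 50):

```python
# Binary classification metrics from confusion-matrix counts.
TN, FP, FN, TP = 9990, 10, 50, 50

accuracy = (TP + TN) / (TP + TN + FP + FN)   # fraction of correct predictions
precision = TP / (TP + FP)                   # how many flagged cases are real
recall = TP / (TP + FN)                      # how many real cases are found
f1 = 2 * precision * recall / (precision + recall)

print(f"accuracy={accuracy:.3f} precision={precision:.3f} "
      f"recall={recall:.3f} f1={f1:.3f}")
```

Note how the high accuracy (0.994) hides the fact that half of the diseased subjects are missed (recall = 0.5), which is why accuracy alone is a poor metric on imbalanced data.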
Decision Tree
Decision tree classification is a machine learning algorithm. The algorithm starts by
selecting a single variable from the data set to split the data into two groups. This process
is repeated on each subgroup until a stopping criterion is reached. The stopping criterion is
usually based on the number of samples in a subgroup, the quality of the split, or the depth
of the tree.
The result of this process is a tree-like structure where each node represents a variable,
each edge represents a split, and each leaf node represents a class label. The class label is
determined by majority voting among the samples that reach the leaf node.
When making a prediction, a new sample is fed into the decision tree and is compared to
each variable in each node. The sample follows the path in the tree that corresponds to its
values until it reaches a leaf node. The class label of the leaf node is the prediction for that
sample.
The decision tree classification algorithm is widely used because it is simple to
understand, easy to implement, and can handle both continuous and categorical variables.
However, it can also be prone to overfitting, which means that the tree can become too
complex and capture noise in the data. To prevent this, techniques such as pruning,
limiting the tree depth, or using random forests are used.
The selection of the feature for the first split in a decision tree algorithm is determined by
evaluating the quality of the split for each feature. The quality of the split is usually
measured by a metric such as Gini impurity, entropy, or the information gain.

Gini impurity measures the probability of misclassifying a sample if it is randomly chosen
from the current group of samples. The feature that results in the lowest Gini impurity is
selected as the feature for the first split.

Entropy measures the amount of uncertainty in the current group of samples. The feature
that results in the highest information gain (the reduction in entropy) is selected as the
feature for the first split.

Information gain is the reduction in entropy achieved by splitting the samples into two
groups based on a feature. The feature that results in the highest information gain is
selected as the feature for the first split.

Once the feature for the first split is selected, the data is split into two groups based on the
values of the selected feature. This process is repeated for each subgroup until a stopping
criterion is reached. The stopping criterion is usually based on the number of samples in a
subgroup, the quality of the split, or the depth of the tree.
For continuous features, the split value is often selected as the midpoint between the
minimum and maximum value of the feature. Alternatively, the split value can be selected
by evaluating the quality of the split for multiple potential split points and selecting the
split point with the highest reduction in impurity or the highest information gain.
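As an illustration of how a candidate split is scored, here is a small sketch of Gini impurity (illustrative helper functions, not scikit-learn's internal implementation):

```python
# Gini impurity of a group of labels, and the size-weighted impurity of a split.
from collections import Counter

def gini(labels):
    """Probability of misclassifying a randomly drawn, randomly labeled sample."""
    n = len(labels)
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

def weighted_gini(left, right):
    """Impurity of a split: size-weighted average of the two child groups."""
    n = len(left) + len(right)
    return len(left) / n * gini(left) + len(right) / n * gini(right)

labels = [0, 0, 0, 1, 1, 1]
print(gini(labels))                          # 0.5 for a 50/50 class mix
print(weighted_gini([0, 0, 0], [1, 1, 1]))   # 0.0 for a perfect split
```

The split point whose children give the lowest weighted impurity is the one the tree keeps.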

The internal implementation of the Decision Tree Classification algorithm involves the
following steps:

Calculating the information gain or impurity measure for each feature: The algorithm
calculates the information gain or impurity measure (e.g., Gini impurity or entropy) for
each feature in the dataset. The information gain measures the reduction in impurity (i.e.,
uncertainty) that results from splitting the data based on a specific feature.

Selecting the feature with the highest information gain: The algorithm selects the feature
with the highest information gain as the feature for the first split.

Determining the split value: The algorithm determines the split value for the selected
feature by finding the value that maximizes the reduction in impurity.
Splitting the data: The algorithm splits the data into two subsets based on the split value.

Building the tree recursively: The algorithm builds the tree recursively by repeating the
above steps for each subset until a stopping criterion is reached (e.g., maximum depth,
minimum samples per leaf, or minimum information gain).

Creating the decision tree: The algorithm creates a tree representation of the splits, where
each node represents a split and each leaf node represents a prediction.

Making predictions: To make predictions, the algorithm follows the path in the tree that
corresponds to the values of the features for a new observation. The prediction is made at
the leaf node that is reached.

The internal implementation of the Decision Tree Classification algorithm involves
several trade-offs and decisions, such as choosing the criterion for splitting the nodes,
choosing the stopping criterion, and choosing the method for handling missing values or
dealing with noisy data. The scikit-learn library provides several options for customizing
the implementation of the Decision Tree Classification algorithm.
The Decision Tree Classification algorithm can be implemented in the scikit-learn module
by using the DecisionTreeClassifier class. The following is an example of how to
implement Decision Tree Classification in scikit-learn:

Python code using Scikit-Learn module

1. Import the required libraries:

import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score

2. Load the data into a Pandas DataFrame:

data = pd.read_csv('data.csv')

3. Split the data into features (X) and target variable (y):
X = data.iloc[:, :-1].values
y = data.iloc[:, -1].values
4. Split the data into training and testing subsets:

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

5. Create an instance of the DecisionTreeClassifier:

classifier = DecisionTreeClassifier(criterion='entropy', random_state=0)

6. Fit the classifier to the training data:

classifier.fit(X_train, y_train)

7. Predict the target variable for the test data:

y_pred = classifier.predict(X_test)

8. Evaluate the performance of the model using the accuracy_score method:


accuracy = accuracy_score(y_test, y_pred)
print("Accuracy:", accuracy)

Note: The above code assumes that the data is stored in a CSV file named "data.csv". The
code uses the entropy criterion to split the nodes in the tree, and it sets the random_state
parameter to 0 to ensure reproducibility. The accuracy of the model can be improved by
tuning the parameters, such as the criterion, max_depth, or min_samples_split, or by using
other preprocessing techniques.

Random Forest
Random Forest is a machine learning algorithm used for both regression and classification
problems. It is an extension of the decision tree algorithm, where multiple decision trees
are created and combined to form a forest of trees.

The algorithm works by randomly selecting a subset of the features and a random sample
of the data to create each decision tree. Each decision tree is grown to the maximum depth,
and the class label of each sample is determined by the majority vote of the trees in the
forest.
One of the key benefits of Random Forest is that it reduces the overfitting problem that
occurs in decision trees. Since each decision tree is created from a different subset of the
data and features, the trees are less likely to capture noise in the data and therefore produce
more accurate predictions.

In addition, Random Forest provides a measure of feature importance, which can be useful
in identifying the most important features in the data. The feature importance is
determined by the average decrease in impurity (e.g. Gini impurity) for each feature across
all decision trees in the forest.

Random Forest is widely used in a variety of applications, including image classification,
sentiment analysis, and fraud detection. However, it can be computationally expensive,
especially when dealing with large datasets and a large number of features.
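A minimal scikit-learn sketch of Random Forest classification and feature importances, using the bundled iris dataset (chosen here purely for illustration):

```python
# Random Forest: an ensemble of decision trees, each grown on a bootstrap
# sample of the data with a random subset of features considered per split.
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0)

forest = RandomForestClassifier(n_estimators=100, random_state=0)
forest.fit(X_train, y_train)

print("test accuracy:", forest.score(X_test, y_test))
# Mean impurity decrease per feature, averaged over all trees (sums to 1).
print("feature importances:", forest.feature_importances_)
```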
