Intro To ML

The document outlines various methods and concepts in applied management research, focusing on machine learning (ML) and deep learning (DL). It discusses the differences between supervised, unsupervised, and reinforced learning, as well as parametric and non-parametric models. Additionally, it covers specific algorithms such as linear and logistic regression, their applications, and their pros and cons.

Applied Management Research Methods

Machine Learning: approaches & algorithms

SPEAKER: Dott. Federico Mangiò
ROOM: Room 23
DATE: 18 March 2025


Agenda

• What does it mean that a machine learns?
• Parametric vs Non-parametric models
• Supervised ML: classification and regression (Decision Trees, Linear & Logistic regression, kNN, Naïve Bayes)
• Unsupervised ML: Association Rules, Clustering (hierarchical & k-means), dimensionality reduction (PCA)


AI
Machine (ML) & Deep (DL) Learning

A. Definition
ML: “Field of study that gives computers the ability to learn without being explicitly programmed” (Samuel, 1959). Machines “learn” from experience, in that they practice a generalization: performing on an unseen task to adapt to new circumstances and to spot and extrapolate patterns after having experienced a data set, called “training data” (Bishop, 2006).

DL: field of ML which tries to mimic the human brain’s neural networks, using multiple layers to progressively extract higher-level features from the raw input.

B. Key functionalities
1. ML: classification and prediction, clustering and dimensionality reduction
2. DL: enhances the effectiveness of ML through the use of deeper layers of Artificial Neural Networks

Statistical Learning
How do we estimate f?

• Training data: observations used to train, or teach, our method how to estimate f.
• Testing data: observations used to test the fit of our model to new, unseen data.
• Parametric methods: assume a functional form for f instead of strictly following its true form. This narrows the problem of modelling f down to estimating a set of parameters.
• The more parameters, the higher the flexibility. E.g., linear vs polynomial regression modelling.
• Non-parametric methods: do not make explicit assumptions about the functional form of f. Instead, they seek an estimate of f that gets as close to the data points as possible without being too rough or wiggly. E.g., splines.
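
The distinction between parametric and non-parametric estimation, and between training and testing data, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the data are simulated, the true f is taken to be linear, and a k-nearest-neighbours average stands in for the non-parametric method (the slide's own example is a spline). The helper names `fit_line`, `fit_knn` and `mse` are hypothetical.

```python
import random

def fit_line(xs, ys):
    """Parametric: assume f(x) = b0 + b1*x and estimate only b0 and b1."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b1 = sxy / sxx
    b0 = mean_y - b1 * mean_x
    return lambda x: b0 + b1 * x

def fit_knn(xs, ys, k=3):
    """Non-parametric: no functional form; average the k nearest training responses."""
    def predict(x):
        nearest = sorted(zip(xs, ys), key=lambda pair: abs(pair[0] - x))[:k]
        return sum(y for _, y in nearest) / k
    return predict

def mse(model, xs, ys):
    """Mean squared error of a fitted model on a set of observations."""
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

# Simulated data (assumption): the true f is linear, plus random noise.
random.seed(0)
xs = [random.uniform(0, 10) for _ in range(60)]
ys = [2.0 + 0.5 * x + random.gauss(0, 1) for x in xs]

# Training data are used to estimate f; testing data are held out.
train_x, test_x = xs[:40], xs[40:]
train_y, test_y = ys[:40], ys[40:]

line = fit_line(train_x, train_y)
knn = fit_knn(train_x, train_y, k=3)

for name, model in [("linear", line), ("kNN", knn)]:
    print(name,
          "train MSE:", round(mse(model, train_x, train_y), 3),
          "test MSE:", round(mse(model, test_x, test_y), 3))
```

Comparing the training and testing errors of each model gives a rough sense of how well the fit carries over to unseen data.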
So …
Machine Learning =/= Statistical Modelling

MACHINE LEARNING
• Approach: inductive
• Aim: «learn» from all types of data and predict
• Hypotheses: no strict hypotheses about the problem or about the data distribution
• Generalization: empirically achieved via training, validation, and test data sets
• Heuristic search for a «sound solution»
• Feature «redundancy» is often useful; the more data, the better
• Algorithmic modelling culture

TRADITIONAL STATISTICS
• Approach: deductive
• Aim: analyse and report data (inference)
• Hypotheses: strict hypotheses about the problem, strict assumptions about the data distribution
• Generalization: statistical testing
• Uses strict initial hypotheses about the problem and assumptions about the data, and searches for the optimal solution under such assumptions
• Often requires independent features; a small set of features is preferable; data reduction, sampling…
• Data modelling culture



There’s learning …
Supervised Learning
The computer is provided with example data in a format where each input is associated with the desired output («label»), and the goal is to extract a rule which pairs inputs to outputs.
For each observation of the predictor measurement(s) xi, i = 1, ..., n, there is an associated response measurement yi.

Unsupervised Learning
The computer is required to find a recurrent pattern among input data, without being provided with any label.
For every observation i = 1, ..., n, we observe a vector of measurements xi but no associated response yi.

Reinforcement Learning
The computer interacts with a dynamic environment in which it tries to achieve a goal. It can only check whether the goal has been achieved, and it uses this information to modify the algorithm.
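
The difference between having and not having a response yi can be shown with a toy sketch. Everything here is an assumption made for illustration: the numbers are invented, a 1-nearest-neighbour rule stands in for "a rule which pairs inputs to output", and a two-group k-means stands in for "finding a recurrent pattern". The function names are hypothetical.

```python
def nearest_neighbour_label(train, x):
    """Supervised: train holds (x_i, y_i) pairs; return the label of the closest x_i."""
    return min(train, key=lambda pair: abs(pair[0] - x))[1]

def two_means(xs, iterations=10):
    """Unsupervised: only x_i is available; split the values into two groups
    with a basic k-means (k = 2). Assumes at least two distinct values."""
    c0, c1 = min(xs), max(xs)  # initial group centres
    for _ in range(iterations):
        g0 = [x for x in xs if abs(x - c0) <= abs(x - c1)]
        g1 = [x for x in xs if abs(x - c0) > abs(x - c1)]
        c0, c1 = sum(g0) / len(g0), sum(g1) / len(g1)
    return g0, g1

# Supervised: every input comes with its label.
labelled = [(1.0, "low"), (1.5, "low"), (8.0, "high"), (9.0, "high")]
print(nearest_neighbour_label(labelled, 7.2))

# Unsupervised: the same inputs without any label.
unlabelled = [1.0, 1.5, 8.0, 9.0]
print(two_means(unlabelled))
```

In the supervised call the algorithm returns one of the labels it was given; in the unsupervised call it can only return groups, and naming them is left to the analyst.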
There’s learning …
and learning …

• Deep Learning: set of techniques based on different levels of abstraction, corresponding to hierarchies of characteristics of factors, where the high-level ones are defined on the basis of lower ones
• Uses Artificial Neural Networks (ANN)
• Many algorithms: LSTM, RNN, Transformers ...
Let the machine learn… what for?
Machine learning in the social sciences (Grimmer, Roberts, Stewart, 2021)

«we doubt our models, but trust our validations»
Caveat

• Different algorithms for different aims
• Different algorithms for the same aim
Supervised Machine Learning
SML: Regression

• The outcome variable is quantitative
• Aim: inference, prediction
• Examples:
A. A social media manager has to determine whether, and which types of, advertising strategies and cues actually lead potential customers to engage with branded content on social media platforms.
B. At the beginning of Q1, a financial analyst is asked to craft a data-based forecast of the organization’s future health. S/he draws on historical data from previous financial statements, as well as data from the broader industry, to project sales, revenue, and expenses.
Linear Regression
Simple Linear Regression

• Estimation of a significant linear relationship between the dependent variable and one or more independent variables
Y = β0 + β1 X + e
where e is a random error term, independent of X, with mean equal to zero and variance equal to σ^2

Multiple Linear Regression

• The algorithm estimates the coefficients by “least squares” (OLS), i.e. by minimizing the sum of squared residuals
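
A minimal sketch of least-squares estimation follows. It assumes the usual multiple-regression form Y = β0 + β1 X1 + ... + βp Xp + e and solves the normal equations directly. The data are invented and the helper names `solve`, `ols` and `rss` are hypothetical. In practice a statistics library would normally be used instead of hand-written linear algebra.

```python
def solve(a, b):
    """Solve the linear system a·x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b_i] for row, b_i in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

def ols(rows, ys):
    """Return [b0, b1, ..., bp] minimizing the sum of squared residuals.
    rows: one list of predictor values per observation."""
    design = [[1.0] + list(r) for r in rows]  # prepend the intercept column
    p = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(p)] for i in range(p)]
    xty = [sum(d[i] * y for d, y in zip(design, ys)) for i in range(p)]
    return solve(xtx, xty)  # normal equations: (X'X) b = X'y

def rss(coefs, rows, ys):
    """Residual sum of squares for a given coefficient vector."""
    total = 0.0
    for r, y in zip(rows, ys):
        fitted = coefs[0] + sum(c * x for c, x in zip(coefs[1:], r))
        total += (y - fitted) ** 2
    return total

# Invented data: two predictors per observation and one quantitative outcome.
rows = [[1.0, 2.0], [2.0, 1.0], [3.0, 4.0], [4.0, 3.0], [5.0, 6.0], [6.0, 5.0]]
ys = [8.1, 7.2, 16.9, 16.1, 26.2, 24.8]

coefs = ols(rows, ys)
nudged = [coefs[0] + 0.1] + coefs[1:]  # move the intercept away from the estimate

print("coefficients:", [round(c, 3) for c in coefs])
print("RSS at the OLS estimate:", rss(coefs, rows, ys))
print("RSS after nudging the intercept:", rss(nudged, rows, ys))
```

Passing a single predictor per row gives the simple linear regression case. Moving any coefficient away from the OLS estimate increases the residual sum of squares, which is what "least squares" means.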
Pros and Cons of Linear Regression

Pros
• Simple to implement and to train
• Performs well when the relationship between the variables is linear
• If it works (assumptions being met)... More complex algorithms should work too!

Cons
• Strict assumptions
• Sensitive to outliers
Logistic Regression

• Generalized linear model (GLM) where the DV can take only TRUE or FALSE values (0, 1).
• The algorithm models the probability that Y belongs to a particular category
• Used for both regression and classification tasks
• Why is LRM not suited to qualitative DVs?
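
One common answer to the question above, offered as an illustration and not as the slide's own answer, is that a straight line fitted to a 0/1 outcome can predict values below 0 or above 1, whereas the logistic function always returns a probability. The sketch below assumes invented data, a single predictor, and plain gradient descent on the log-loss. `fit_linear` and `fit_logistic` are hypothetical helper names.

```python
import math

def sigmoid(z):
    """Logistic function: maps any real number into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def fit_linear(xs, ys):
    """Least-squares line fitted directly to the 0/1 outcome."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    b1 = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
          / sum((x - mean_x) ** 2 for x in xs))
    return mean_y - b1 * mean_x, b1

def fit_logistic(xs, ys, lr=0.1, epochs=5000):
    """Logistic regression with one predictor, fitted by gradient descent."""
    b0 = b1 = 0.0
    n = len(xs)
    for _ in range(epochs):
        grad0 = grad1 = 0.0
        for x, y in zip(xs, ys):
            error = sigmoid(b0 + b1 * x) - y  # predicted probability minus label
            grad0 += error
            grad1 += error * x
        b0 -= lr * grad0 / n
        b1 -= lr * grad1 / n
    return b0, b1

# Invented data: one predictor and a binary outcome (0 = FALSE, 1 = TRUE).
xs = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0]
ys = [0, 0, 0, 1, 0, 1, 1, 1]

lin_b0, lin_b1 = fit_linear(xs, ys)
log_b0, log_b1 = fit_logistic(xs, ys)

for x in [0.0, 2.0, 10.0]:
    linear_pred = lin_b0 + lin_b1 * x
    logistic_prob = sigmoid(log_b0 + log_b1 * x)
    print("x =", x,
          "| linear prediction:", round(linear_pred, 3),
          "| logistic probability:", round(logistic_prob, 3))
```

For classification, the estimated probability is usually compared with a threshold such as 0.5. That threshold is a modelling choice, not part of the model itself.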
Pros and Cons of Logistic Regression

Pros
• Less prone to over-fitting
• Easy to implement and train
• Can be adopted for classification tasks

Cons
• Strict assumptions
• Should not be used when the number of observations is smaller than the number of variables
