Dr Athanasios Tsanas (‘Thanasis’)
Associate Prof. in Data Science
Usher Institute, Medical School
University of Edinburgh
Day 1 • Introduction and overview; reminder of basic concepts
Day 2 • Data collection and sampling
Day 3 • Data mining: signal/image processing and information extraction
Day 4 • Data visualization: density estimation, statistical descriptors
Day 5 • Exploratory analysis: hypothesis testing and quantifying relationships
Day 6 • Feature selection and feature transformation
Day 7 • Statistical machine learning and model validation
Day 8 • Statistical machine learning and model validation
Day 9 • Practical examples: bringing things together
Day 10 • Revision and exam preparation
Data sources: ECG, EEG, activity, location, …

Design matrix X (N subjects × M features or characteristics):

Subjects   feature 1   feature 2   ...   feature M
P1         3.1         1.3         ...   0.9
P2         3.7         1.0         ...   1.3
P3         2.9         2.6         ...   0.6
…
PN         1.7         2.0         ...   0.7
Feature generation from raw data → Feature selection or transformation → Statistical mapping

Design matrix X and outcome y:

Subjects   feature 1   feature 2   ...   feature M   |   result
P1         3.1         1.3         ...   0.9         |   1
P2         3.7         1.0         ...   1.3         |   2
P3         2.9         2.6         ...   0.6         |   1
…                                                    |   …
PN         1.7         2.0         ...   0.7         |   3

X: N × M matrix of features or characteristics; y: the outcome column.
Depending on the problem, “features” can be demographics, genes, …
y = f(X), where f is the mechanism, X the feature set, and y the outcome.
Data visualization (density estimation, scatter plots) → Exploratory analysis: hypothesis testing and statistical associations → Feature selection or transformation (e.g. PCA) → Statistical mapping (regression/classification)
Understanding the setting of statistical mapping
Assessing the accuracy of a statistical model
Everything we have done in the course culminates in today’s two lectures on statistical mapping.
Information has been collected and presented in the form of the design matrix X.
In the biomedical domain, experts typically provide the outcome of interest, y.
Having both X and y, determining the functional mapping y = f(X) is known as supervised learning.
When the outcome y is not available, we can still work in unsupervised learning mode, for example clustering.
Outcome y is not available → unsupervised learning:
• Visualization
• Transformation (e.g. PCA)
• Clustering (not covered here)

Outcome y is available → supervised learning:
• Determine the functional mapping strategy: y = f(X)
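A minimal sketch contrasting the two modes, assuming synthetic data and scikit-learn; the variable names and data are illustrative only, not from the course:

import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))               # design matrix: N=100 subjects, M=5 features
y = (X[:, 0] + X[:, 1] > 0).astype(int)     # hypothetical outcome, for illustration

# Unsupervised: only X is used, e.g. a PCA transformation
X_2d = PCA(n_components=2).fit_transform(X)

# Supervised: both X and y are used to learn the mapping y = f(X)
f = LogisticRegression().fit(X, y)
print(f.predict(X[:3]))                     # predicted outcomes for the first 3 subjects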
Classification: discrete outcome (oftentimes binary)
• Learners f(X) = y: classifiers
• Examples: kNN, Logistic Regression (LR), Naïve Bayes, Support Vector Machines (SVM), Random Forests (RF), …

Regression: continuous outcome (typically real numbers)
• Learners f(X) = y: regressors
• Examples: Ordinary Least Squares (OLS) regression (linear regression), Support Vector Machines (SVM), Random Forests (RF), …
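A hedged sketch contrasting one classifier and one regressor from the lists above (kNN and RF), on made-up data with scikit-learn assumed installed:

import numpy as np
from sklearn.neighbors import KNeighborsClassifier
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 4))

y_discrete = (X[:, 0] > 0).astype(int)         # binary outcome -> classification
knn = KNeighborsClassifier(n_neighbors=5).fit(X, y_discrete)

y_continuous = 2.0 * X[:, 0] + rng.normal(scale=0.3, size=200)  # real-valued outcome -> regression
rf = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, y_continuous)

print(knn.predict(X[:2]))   # class labels
print(rf.predict(X[:2]))    # real-valued predictions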
Determine the functional relationship in a simple linear model form: y = a + b·x

Indicative regression model:

UPDRS = 3 + 8.5 · Jitter

Here 3 is the intercept, 8.5 is the coefficient (or slope), and Jitter is the explanatory variable.
Coefficient: a unit increase in x produces an increase of b in y.
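A quick numerical illustration of the indicative model, using the slide’s coefficients (the jitter values plugged in are hypothetical):

def predict_updrs(jitter):
    intercept, slope = 3.0, 8.5    # from UPDRS = 3 + 8.5 * Jitter
    return intercept + slope * jitter

print(predict_updrs(1.0))   # 11.5
print(predict_updrs(2.0))   # 20.0: one unit more Jitter -> 8.5 units more UPDRS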
Ordinary least squares estimates the intercept and slope by minimizing the sum of squared errors:

min Σ_{i=1}^{N} e_i²

[Figure: scatter plot with the fitted least-squares line UPDRS = 3 + 8.5 · Jitter, annotated with the intercept and the coefficient (or slope); the vertical distances from the data points to the line are the residual errors e_i, e.g. e_132 is the indicative error for sample 132. Horizontal axis: X (explanatory variable).]
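A minimal least-squares fit in numpy on synthetic data, showing the residuals e_i whose squared sum is minimized (all numbers illustrative):

import numpy as np

rng = np.random.default_rng(2)
x = rng.uniform(0, 1800, size=200)                 # explanatory variable
y = 3 + 8.5 * x + rng.normal(scale=50, size=200)   # noisy linear relationship

b, a = np.polyfit(x, y, deg=1)                     # polyfit returns [slope, intercept] for deg=1
e = y - (a + b * x)                                # residual errors e_i
print(f"intercept a = {a:.1f}, slope b = {b:.2f}, SSE = {np.sum(e**2):.0f}")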
y = a + b_1·x_1 + b_2·x_2 + ⋯ + b_M·x_M

UPDRS = 3 + 8.5 · Jitter − 3.2 · Shimmer + ⋯

The coefficients express how much each variable contributes to the outcome; their signs express the direction of the contribution.
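A sketch recovering multiple-regression coefficients with scikit-learn; the two columns stand in for Jitter and Shimmer, and the data are synthetic:

import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(3)
X = rng.normal(size=(300, 2))        # columns play the role of "Jitter" and "Shimmer"
y = 3 + 8.5 * X[:, 0] - 3.2 * X[:, 1] + rng.normal(scale=0.1, size=300)

model = LinearRegression().fit(X, y)
print(model.intercept_)   # ~3 (intercept a)
print(model.coef_)        # ~[8.5, -3.2]: magnitude = contribution, sign = direction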
Many algorithms have been proposed for regression problems; this is an area beyond the scope of this course. We will now look into classification.
▪ Find the optimal approach to separate the following two types:
▪ Given {(x_i, y_i)}_{i=1…N}, with data samples x_i ∈ ℝ^M and corresponding responses y_i ∈ {−1, +1}
▪ Design a classifier f(x_i) such that:
  y_i = −1 if f(x_i) < 0, and y_i = +1 if f(x_i) ≥ 0
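A direct implementation of this sign rule, with a hypothetical linear score f(x) = w·x + c whose weights are chosen only for illustration:

import numpy as np

def classify(X, w, c):
    """Assign -1 where f(x) < 0 and +1 where f(x) >= 0."""
    scores = X @ w + c                  # f(x_i) for every sample
    return np.where(scores >= 0, 1, -1)

X = np.array([[0.5, -1.2], [2.0, 0.3], [-1.5, 0.8]])
w, c = np.array([1.0, 2.0]), -0.5       # hypothetical classifier parameters
print(classify(X, w, c))                # [-1  1 -1]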
The logistic function maps its input to the range [0, 1]:

p = 1 / (1 + e^(−(a + b·x)))

[Figure: the S-shaped logistic curve compared with a straight linear fit; p on the vertical axis ranges from 0 to 1, x on the horizontal axis.]

“Logistic regression” is a misnomer: it is a classification algorithm!
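A few evaluations of the logistic function, confirming its outputs stay within (0, 1):

import numpy as np

def logistic(x, a=0.0, b=1.0):
    return 1.0 / (1.0 + np.exp(-(a + b * x)))

x = np.array([-5.0, 0.0, 5.0])
print(logistic(x))    # ~[0.0067 0.5 0.9933]: squashed into (0, 1)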
A model was computed as follows:

p(discharge) = 1 / (1 + e^(−(5 + 2·blood_test)))

Find the probability that the patient should be discharged if blood_test = 5.

Substituting values (e ≈ 2.7182):
p(discharge) = 1 / (1 + e^(−(5 + 2·5))) = 1 / (1 + e^(−15)) ≈ 0.9999997, i.e. effectively 1.
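A one-line numerical check of the worked example:

import math

blood_test = 5
p = 1.0 / (1.0 + math.exp(-(5 + 2 * blood_test)))   # exponent = -15
print(p)   # 0.999999694..., effectively 1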
Reading: G. James et al., An Introduction to Statistical Learning (pages 15-42, 59-83, 127-138)
https://www-bcf.usc.edu/~gareth/ISL/ISLR%20First%20Printing.pdf

OPTIONAL: G. James et al., An Introduction to Statistical Learning (pages 83-104)

© A. Tsanas, 2020