Neural Networks
& Fuzzy Logic
Introduction
Neural Networks
Neural Networks
Definition & Area of Application
Neural Networks (NN) are:
• mathematical models that resemble nonlinear regression models, but
are also useful to model nonlinearly separable spaces
• “knowledge acquisition tools” that learn from examples
Neural Networks are used for:
• pattern recognition (objects in images, voice, medical diagnostics for
diseases, etc.)
• exploratory analysis (data mining)
• predictive models and control
4.07.12
Neural Networks
Biological Analogy
Neural Networks
Perceptrons
Input to unit i: $a_i$ (measured value of variable i)
Input to unit j: $a_j = \sum_i w_{ij} a_i$
Output of unit j: $o_j = f(a_j)$
[Figure: input units i connected through weights $w_{ij}$ to output units j]
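A minimal sketch of the computation performed by one unit j, assuming the inputs and weights are given as plain Python lists. The function name unit_output and the example numbers are illustrative and do not come from the slides.

```python
def unit_output(inputs, weights, f):
    """Return o_j = f(a_j), where a_j is the weighted sum of the inputs a_i."""
    a_j = sum(w_ij * a_i for w_ij, a_i in zip(weights, inputs))
    return f(a_j)


def identity(a):
    """Linear activation: passes the weighted sum through unchanged."""
    return a


# Hypothetical measured values a_i and weights w_ij for a single unit j.
example = unit_output([1.0, 0.0, 2.0], [0.5, -1.0, 0.25], identity)
print(example)
```

Any other activation function f can be passed in place of identity.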
Neural Networks
Example: Logical AND function with NN
A single output unit y with threshold θ = 0.5 receives inputs x1 and x2 through
weights w1 and w2:
y = f(a), where a = x1w1 + x2w2 and
f(a) = 1, for a > θ
f(a) = 0, for a ≤ θ

input (x1 x2)   output   f(x1w1 + x2w2) = y
0 0             0        f(0w1 + 0w2) = 0
0 1             0        f(0w1 + 1w2) = 0
1 0             0        f(1w1 + 0w2) = 0
1 1             1        f(1w1 + 1w2) = 1

Some possible values for w1 and w2:
w1     w2
0.20   0.35
0.20   0.40
0.25   0.30
0.40   0.20
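The AND example can be checked with a short sketch. It uses the threshold θ = 0.5 and the four weight pairs listed on the slide. The function names are illustrative.

```python
THETA = 0.5  # threshold from the slide


def step(a, theta=THETA):
    """Threshold activation: 1 for a > theta, 0 for a <= theta."""
    return 1 if a > theta else 0


def and_unit(x1, x2, w1, w2):
    """Output y = f(x1*w1 + x2*w2) of the single threshold unit."""
    return step(x1 * w1 + x2 * w2)


# Truth table of logical AND and the weight pairs given on the slide.
TRUTH_TABLE = {(0, 0): 0, (0, 1): 0, (1, 0): 0, (1, 1): 1}
WEIGHT_PAIRS = [(0.20, 0.35), (0.20, 0.40), (0.25, 0.30), (0.40, 0.20)]

for w1, w2 in WEIGHT_PAIRS:
    correct = all(and_unit(x1, x2, w1, w2) == y
                  for (x1, x2), y in TRUTH_TABLE.items())
    print(w1, w2, "implements AND:", correct)
```

Each pair works because each weight alone stays at or below θ while their sum exceeds it.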
Neural Networks
Example: Perceptrons (NN) in Medical Diagnostics
Input units: Cough, Headache
Output units: No disease, Pneumonia, Flu, Meningitis
Δ rule: change weights to decrease the error
error = what we got − what we wanted
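A minimal sketch of the Δ rule. The slide gives no training data for the diagnostic network, so the sketch is applied to the logical AND data from the earlier example instead. The learning rate, the zero starting weights, and the number of epochs are assumptions.

```python
def step(a, theta=0.5):
    """Threshold activation with the fixed threshold used in the AND example."""
    return 1 if a > theta else 0


def train_delta(samples, lr=0.1, epochs=20):
    """Adjust the weights after each sample so that the error decreases."""
    weights = [0.0, 0.0]  # assumed starting point
    for _ in range(epochs):
        for x, wanted in samples:
            got = step(sum(w * xi for w, xi in zip(weights, x)))
            error = got - wanted  # what we got - what we wanted
            # Move each weight against the error, scaled by its input.
            weights = [w - lr * error * xi for w, xi in zip(weights, x)]
    return weights


AND_SAMPLES = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
trained = train_delta(AND_SAMPLES)
print(trained)
```

With these assumed settings the weights stop changing once all four cases are classified correctly.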
Neural Networks
Linear Separation
Neural Networks
Nonlinear Separation
[Figure panels: Linear Activation; Nonlinear Activation]
Neural Networks
Multilayered Perceptrons
Neural Networks
Example: Multilayer NN for Diagnosis of Abdominal Pain
Output units: Appendicitis = 0, Diverticulitis = 0, Perforated Duodenal Ulcer = 0,
Non-specific Pain = 1, Cholecystitis = 0, Small Bowel Obstruction = 0,
Pancreatitis = 0
(adjustable weights)
Input units: Male = 1, Age = 20, Temp = 37, WBC = 10, Pain Intensity = 1,
Pain Duration = 1
Neural Networks
Regression vs. Neural Networks
Jargon Pseudo‐Correspondence
• Independent variable = input variable
• Dependent variable = output variable
• Coefficients = “weights”
• Estimates = “targets”
• Cycles = “epochs”
Neural Networks
Logistic Regression Model
Inputs (independent variables x1, x2, x3): Age = 34, Gender = 2, Stage = 4
Coefficients (a, b, c): .5, .4, .8
Σ = 34∗.5 + 1∗.4 + 4∗.8 = 20.6
Output (dependent variable, prediction): 0.6, “Probability of being alive”
Neural Networks
Neural Network Model
Inputs (input variables): Age = 34, Gender = 2, Stage = 4
Weights (input to hidden layer, hidden layer to output), as listed in the figure:
.6, .4, .2, .1, .5, .3, .2, .8, .7, .2
Hidden Layer: Σ units
Output variable (prediction): Σ = 0.6, “Probability of being alive”
Activation functions
• Linear
• Threshold or step function
• Logistic, sigmoid, “squash”
• Hyperbolic tangent
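The four activation functions named on this slide can be sketched as plain Python functions. The default threshold of 0 for the step function is an assumption, since the slide does not give one.

```python
import math


def linear(a):
    """Linear activation: output equals the weighted sum."""
    return a


def threshold(a, theta=0.0):
    """Threshold or step function: 1 above theta, 0 otherwise."""
    return 1.0 if a > theta else 0.0


def logistic(a):
    """Logistic (sigmoid, "squash") function: maps any real a into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-a))


def hyperbolic_tangent(a):
    """Hyperbolic tangent: maps any real a into (-1, 1)."""
    return math.tanh(a)


for f in (linear, threshold, logistic, hyperbolic_tangent):
    print(f.__name__, [round(f(a), 3) for a in (-2.0, 0.0, 2.0)])
```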
Neural Networks
Learning: Hidden Units and Backpropagation
Neural Networks
Minimizing the Error
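Error Functions
• Mean Squared Error (for most problems)
$\frac{1}{n} \sum (t - o)^2$
• Cross Entropy Error (for dichotomous or binary outcomes)
$-\sum \left[ t \ln o + (1 - t) \ln (1 - o) \right]$
Here t is the target and o is the output.
[Figure: error surface, plotting error against a weight w. Training moves from the
initial error at w_initial to the final error at w_trained, a local minimum; where
the derivative is negative, the weight change is positive.]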
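Both error functions can be written directly from their formulas. In this sketch t is the target and o the output of the network. The cross entropy version assumes every output lies strictly between 0 and 1, because the logarithm is undefined otherwise.

```python
import math


def mean_squared_error(targets, outputs):
    """Sum of (t - o)^2 over all cases, divided by the number of cases n."""
    n = len(targets)
    return sum((t - o) ** 2 for t, o in zip(targets, outputs)) / n


def cross_entropy_error(targets, outputs):
    """-sum of [t ln o + (1 - t) ln(1 - o)] for binary targets t in {0, 1}."""
    return -sum(t * math.log(o) + (1 - t) * math.log(1 - o)
                for t, o in zip(targets, outputs))


# Hypothetical targets and outputs for two cases.
targets = [1, 0]
outputs = [0.5, 0.5]
print(mean_squared_error(targets, outputs))
print(cross_entropy_error(targets, outputs))
```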
Neural Networks
Implementation of Learning:
Gradient descent & Minima
[Figure: error plotted against epochs, with a local minimum and the global minimum
marked]
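A small sketch of how gradient descent can end in either minimum depending on where it starts. The one-weight error curve below is a made-up example with one local and one global minimum. It is not the error surface of any network in these slides, and the learning rate and number of epochs are assumptions.

```python
def error(w):
    """Hypothetical error curve with a local and a global minimum."""
    return w ** 4 - 3 * w ** 2 + w


def d_error(w):
    """Derivative of the hypothetical error curve."""
    return 4 * w ** 3 - 6 * w + 1


def gradient_descent(w, lr=0.01, epochs=1000):
    """Repeatedly step the weight in the direction of the negative derivative."""
    for _ in range(epochs):
        w -= lr * d_error(w)
    return w


w_from_right = gradient_descent(2.0)   # settles in the local minimum
w_from_left = gradient_descent(-2.0)   # settles in the global minimum
print(w_from_right, error(w_from_right))
print(w_from_left, error(w_from_left))
```

The two runs differ only in the initial weight, which is why training is often repeated from several starting points.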
Neural Networks
Implementation of Learning: Problem of
Overfitting
[Figure: CHD plotted against age, comparing an overfitted model with the “real”
model; and error plotted against epochs for the training and holdout sets, marking
the overfitted model]
Neural Networks
Implementation of Learning: Problem of
Overfitting
[Figure: tss plotted against epochs for a = test set and b = training set. The
stopping criterion is marked at min(Δtss); beyond it the model is overfitted.]
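One common way to act on these curves is early stopping: keep the weights from the epoch where the error on the held-out set was lowest. The sketch below assumes a list of held-out errors recorded once per epoch. The patience parameter and the error values are hypothetical, and this is not necessarily the exact min(Δtss) criterion shown in the figure.

```python
def early_stopping_epoch(holdout_errors, patience=3):
    """Return the epoch with the lowest held-out error seen before stopping.

    Scanning stops once `patience` epochs pass without a new best error.
    """
    best_epoch, best_error = 0, float("inf")
    for epoch, err in enumerate(holdout_errors):
        if err < best_error:
            best_epoch, best_error = epoch, err
        elif epoch - best_epoch >= patience:
            break  # held-out error has been rising: likely overfitting
    return best_epoch


# Hypothetical held-out error per epoch: falls first, then rises again.
holdout_errors = [0.90, 0.70, 0.55, 0.48, 0.45, 0.47, 0.50, 0.56, 0.63]
print(early_stopping_epoch(holdout_errors))
```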
Neural Networks
Parameter Estimation
Logistic Regression
• It models just one function
– Maximum likelihood
– Fast
– Optimizations
   • Fisher
   • Newton‐Raphson
Neural Network
• It models several functions
– Backpropagation
– Iterative
– Slow
– Optimizations
   • Quickprop
   • Scaled conjugate gradient descent
   • Adaptive learning rate
Neural Networks
What Do You Want?
Insight versus Prediction
Insight into the model
• Explain importance of each variable
• Assess model fit to existing data
Accurate predictions
• Make a good estimate of the “real” probability
• Assess model prediction in new data
Neural Networks
Model Selection:
Finding Influential Variables
Logistic
• Forward
• Backward
• Stepwise
• Arbitrary
• All combinations
• Relative risk
Neural Network
• Weight elimination
• Automatic Relevance Determination
• “Relevance”
Neural Networks
Regression Diagnostics:
Finding Influential Observations
Logistic
• Analysis of residuals
• Cook’s distance
• Deviance
• Difference in coefficients when case is left out
Neural Network
• Ad‐hoc
Neural Networks
How Accurate are Predictions?
• Construct training and test sets or bootstrap to assess “unbiased” error
• Assess
– Discrimination
• How model “separates” alive and dead
– Calibration
• How close the estimates are to the “real” probability
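The slide does not name specific measures, so the sketch below uses two assumed ones: the c-index (area under the ROC curve) for discrimination, and the gap between the mean estimate and the observed event rate as a crude calibration check. The outcomes and estimates are made-up values.

```python
def c_index(outcomes, estimates):
    """Fraction of (alive, dead) pairs where the alive case has the higher estimate.

    Ties count as one half. 1.0 is perfect separation, 0.5 is chance level.
    """
    alive = [e for y, e in zip(outcomes, estimates) if y == 1]
    dead = [e for y, e in zip(outcomes, estimates) if y == 0]
    score = sum(1.0 if a > d else 0.5 if a == d else 0.0
                for a in alive for d in dead)
    return score / (len(alive) * len(dead))


def mean_calibration_gap(outcomes, estimates):
    """Mean estimated probability minus the observed proportion of events."""
    return sum(estimates) / len(estimates) - sum(outcomes) / len(outcomes)


# Hypothetical test-set outcomes (1 = alive) and model estimates.
outcomes = [1, 1, 0, 1, 0, 0]
estimates = [0.9, 0.8, 0.7, 0.6, 0.3, 0.2]
print(c_index(outcomes, estimates))
print(mean_calibration_gap(outcomes, estimates))
```

A model can discriminate well and still be poorly calibrated, so the two are assessed separately.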
Neural Networks
“Unbiased” Evaluation:
Training and Test Sets
• Training set is used to build the model (may include holdout set to control
for overfitting)
• Test set left aside for evaluation purposes
• Ideal: yet another validation data set, from different source to test if
model generalizes to other settings
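A minimal sketch of such a split, assuming the cases are held in a single list. The 20% fractions, the fixed seed, and the function name are illustrative choices and not values from the slides.

```python
import random


def split_cases(cases, holdout_frac=0.2, test_frac=0.2, seed=0):
    """Shuffle the cases and divide them into training, holdout and test sets."""
    shuffled = list(cases)
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_frac)
    n_holdout = int(len(shuffled) * holdout_frac)
    test = shuffled[:n_test]                         # left aside for evaluation
    holdout = shuffled[n_test:n_test + n_holdout]    # used to control overfitting
    training = shuffled[n_test + n_holdout:]         # used to build the model
    return training, holdout, test


training, holdout, test = split_cases(range(100))
print(len(training), len(holdout), len(test))
```

The external validation set mentioned on the slide has to come from a different source, so it cannot be produced by splitting the same data.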
Neural Networks
Evaluation of NN
Neural Networks
More Examples: ECG Interpretation
Neural Networks
More Examples: Thyroid Diseases
Neural Networks
Expert Systems and Neural Nets
Fuzzy Logic
Fuzzy Logic
Definition
• Experts rely on common sense when they solve problems.
• How can we represent expert knowledge that uses vague and ambiguous
terms in a computer?
• Fuzzy logic is not logic that is fuzzy, but logic that is used to describe
fuzziness. Fuzzy logic is the theory of fuzzy sets, sets that calibrate
vagueness.
• Fuzzy logic is based on the idea that all things admit of degrees.
Temperature, height, speed, distance, beauty – all come on a sliding scale.
– The motor is running really hot.
– Tom is a very tall guy.
Fuzzy Logic
Definition
• Many decision-making and problem-solving tasks are too complex to be
understood quantitatively; however, people succeed by using knowledge
that is imprecise rather than precise.
• Fuzzy set theory resembles human reasoning in its use of approximate
information and uncertainty to generate decisions.
• It was specifically designed to mathematically represent uncertainty and
vagueness and provide formalized tools for dealing with the imprecision
intrinsic to many engineering and decision problems in a more natural
way.
• Boolean logic uses sharp distinctions. It forces us to draw lines between
members of a class and non‐members. For instance, we may say, Tom is
tall because his height is 181 cm. If we drew a line at 180 cm, we would
find that David, who is 179 cm, is small.
• Is David really a small man, or have we just drawn an arbitrary line in the
sand?
Fuzzy Logic
Bit of History
• Fuzzy, or multi‐valued logic, was introduced in the 1930s by Jan
Lukasiewicz, a Polish philosopher. While classical logic operates with only
two values 1 (true) and 0 (false), Lukasiewicz introduced logic that
extended the range of truth values to all real numbers in the interval
between 0 and 1.
• For example, the possibility that a man 181 cm tall is really tall might be
set to a value of 0.86. It is likely that the man is tall. This work led to an
inexact reasoning technique often called possibility theory.
• In 1965 Lotfi Zadeh published his famous paper “Fuzzy sets”. Zadeh
extended the work on possibility theory into a formal system of
mathematical logic, and introduced a new concept for applying natural
language terms. This new logic for representing and manipulating fuzzy
terms was called fuzzy logic.
Fuzzy Logic
Why Fuzzy Logic?
• Why fuzzy?
As Zadeh said, the term is concrete, immediate and descriptive; we all know
what it means. However, many people in the West were repelled by the word
fuzzy, because it is usually used in a negative sense.
• Why logic?
Fuzziness rests on fuzzy set theory, and fuzzy logic is just a small part of that
theory.
• The term fuzzy logic is used in two senses:
– Narrow sense: Fuzzy logic is a branch of fuzzy set theory, which deals (as
logical systems do) with the representation and inference from
knowledge. Fuzzy logic, unlike other logical systems, deals with imprecise
or uncertain knowledge. In this narrow, and perhaps correct sense, fuzzy
logic is just one of the branches of fuzzy set theory.
– Broad sense: fuzzy logic is used synonymously with fuzzy set theory.
Fuzzy Logic
Fuzzy Applications
• The theory of fuzzy sets and fuzzy logic has been applied to problems in a
variety of fields:
– taxonomy; topology; linguistics; logic; automata theory; game theory;
pattern recognition; medicine; law; decision support; information
retrieval; etc.
• And more recently fuzzy machines have been developed including:
– automatic train control; tunnel digging machinery; washing machines;
rice cookers; vacuum cleaners; air conditioners, etc.
Fuzzy Logic
Fuzzy Applications
Advertisement: …
• Extraklasse Washing Machine ‐ 1200 rpm. The Extraklasse machine has a
number of features which will make life easier for you.
• Fuzzy Logic detects the type and amount of laundry in the drum and allows
only as much water to enter the machine as is really needed for the loaded
amount. And less water will heat up quicker ‐ which means less energy
consumption.
• Foam detection
Too much foam is compensated by an additional rinse cycle: If Fuzzy Logic
detects the formation of too much foam in the rinsing spin cycle, it simply
activates an additional rinse cycle. Fantastic!
• Imbalance compensation
In the event of imbalance, Fuzzy Logic immediately calculates the maximum
possible speed, sets this speed and starts spinning. This provides optimum
utilization of the spinning time at full speed […]
• Washing without wasting ‐ with automatic water level adjustment
Fuzzy automatic water level adjustment adapts water and energy consumption
to the individual requirements of each wash programme, depending on the
amount of laundry and type of fabric […]
Fuzzy Logic
More Definitions
• Fuzzy logic is a set of mathematical principles for knowledge
representation based on degrees of membership.
• Unlike two‐valued Boolean logic, fuzzy logic is multi‐valued. It deals with
degrees of membership and degrees of truth.
• Fuzzy logic uses the continuum of logical values between 0 (completely
false) and 1 (completely true). Instead of just black and white, it employs
the spectrum of colours, accepting that things can be partly true and
partly false at the same time.
(a) Boolean Logic: 0 0 0 1 1 1
(b) Multi-valued Logic: 0 0 0.2 0.4 0.6 0.8 1 1
Fuzzy Logic
Fuzzy Sets
• The concept of a set is fundamental to mathematics.
• However, our own language is also the supreme expression of sets. For example,
car indicates the set of cars. When we say a car, we mean one out of the set
of cars.
• The classical example in fuzzy sets is tall men. The elements of the fuzzy set
“tall men” are all men, but their degrees of membership depend on their height.

                        Degree of Membership
Name     Height, cm     Crisp     Fuzzy
Chris    208            1         1.00
Mark     205            1         1.00
John     198            1         0.98
Tom      181            1         0.82
David    179            0         0.78
Mike     172            0         0.24
Bob      167            0         0.15
Steven   158            0         0.06
Bill     155            0         0.01
Peter    152            0         0.00
Fuzzy Logic
Crisp vs. Fuzzy Sets
The x-axis represents the universe of discourse, the range of all possible values
applicable to a chosen variable. In our case, the variable is a man’s height.
According to this representation, the universe of men’s heights consists of all
tall men.
The y-axis represents the membership value of the fuzzy set. In our case, the
fuzzy set of “tall men” maps height values into corresponding membership
values.
[Figure: degree of membership (0.0 to 1.0) against height (150 to 210 cm) for
“Tall Men”; panels: Crisp Sets, Fuzzy Sets]
Fuzzy Logic
A Fuzzy Set has Fuzzy Boundaries
• Let X be the universe of discourse and its elements be denoted as x. In
classical set theory, crisp set A of X is defined by the function $f_A(x)$, called
the characteristic function of A:
$$f_A(x): X \to \{0, 1\}, \quad \text{where } f_A(x) = \begin{cases} 1, & \text{if } x \in A \\ 0, & \text{if } x \notin A \end{cases}$$
For any element x of universe X, the characteristic function $f_A(x)$ is equal to 1
if x is an element of set A, and is equal to 0 if x is not an element of A.
• In fuzzy theory, fuzzy set A of universe X is defined by the function $\mu_A(x)$,
called the membership function of set A:
$$\mu_A(x): X \to [0, 1], \quad \text{where } \begin{cases} \mu_A(x) = 1, & \text{if } x \text{ is totally in } A \\ \mu_A(x) = 0, & \text{if } x \text{ is not in } A \\ 0 < \mu_A(x) < 1, & \text{if } x \text{ is partly in } A \end{cases}$$
For any element x of universe X, the membership function $\mu_A(x)$ gives the
degree to which x is an element of set A.
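The difference between the two functions can be sketched with a few rows of the “tall men” table from the Fuzzy Sets slide. The crisp cut-off of 180 cm follows the Tom and David example; treating exactly 180 cm as tall is an assumption, since no one in the table has that height. The names f_A and mu_A mirror the symbols above.

```python
# Heights and fuzzy degrees of membership taken from the "tall men" table.
HEIGHT_CM = {"Chris": 208, "Tom": 181, "David": 179, "Peter": 152}
TALL_MEN_FUZZY = {"Chris": 1.00, "Tom": 0.82, "David": 0.78, "Peter": 0.00}


def f_A(x):
    """Characteristic function of the crisp set "tall men": only 0 or 1."""
    return 1 if HEIGHT_CM[x] >= 180 else 0  # assumed: 180 cm counts as tall


def mu_A(x):
    """Membership function of the fuzzy set "tall men": any value in [0, 1]."""
    return TALL_MEN_FUZZY[x]


for name in HEIGHT_CM:
    print(name, HEIGHT_CM[name], f_A(name), mu_A(name))
```

Tom and David differ by 2 cm, which flips the crisp value from 1 to 0 but changes the fuzzy degree only from 0.82 to 0.78.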
Fuzzy Logic
Fuzzy Set Representation
• First, we determine the membership functions. In our “tall men” example, we
can obtain fuzzy sets of tall, short and average men.
• The universe of discourse – the men’s heights – consists of three sets: short,
average and tall men. As you will see, a man who is 184 cm tall is a member of
the average men set with a degree of membership of 0.1, and at the same time,
he is also a member of the tall men set with a degree of 0.4.
[Figure: degree of membership (0.0 to 1.0) against height (150 to 210 cm) for the
sets Short, Average and Tall; panels: Crisp Sets, Fuzzy Sets]
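A sketch of three piecewise-linear membership functions for short, average and tall men. The breakpoints (160, 165, 170, 175, 180, 185, 190 cm) are assumptions chosen so that a height of 184 cm reproduces the degrees quoted above, 0.1 for average and 0.4 for tall. The exact shapes in the original figure may differ.

```python
def clamp(value):
    """Restrict a value to the membership range [0, 1]."""
    return max(0.0, min(1.0, value))


def short(height_cm):
    """Fully short up to 160 cm, falling linearly to 0 at 170 cm (assumed)."""
    return clamp((170 - height_cm) / 10)


def average(height_cm):
    """Triangle rising from 165 cm, peaking at 175 cm, back to 0 at 185 cm (assumed)."""
    return clamp(min((height_cm - 165) / 10, (185 - height_cm) / 10))


def tall(height_cm):
    """Rising linearly from 0 at 180 cm to fully tall at 190 cm (assumed)."""
    return clamp((height_cm - 180) / 10)


height = 184
print(short(height), average(height), tall(height))
```

Because the sets overlap, one height can belong to two sets at once with different degrees.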
Fuzzy Logic
Linguistic Variables and Inference
• At the root of fuzzy set theory lies the idea of linguistic variables.
• A linguistic variable is a fuzzy variable. For example, the statement “John
is tall” implies that the linguistic variable John takes the linguistic value
tall.
• In fuzzy expert systems, linguistic variables are used in fuzzy rules. For
example:
IF wind is strong
THEN sailing is good
IF project_duration is long
THEN completion_risk is high
IF speed is slow
THEN stopping_distance is short
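One possible way to hold such rules in a program is as small records, with a membership function attached to each linguistic value. In the sketch below the FuzzyRule class, the speed scale in km/h, and the shape of the “slow” membership function are all hypothetical. It shows only how strongly a rule's IF part is satisfied, not a complete inference method.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FuzzyRule:
    """IF <if_variable> is <if_value> THEN <then_variable> is <then_value>."""
    if_variable: str
    if_value: str
    then_variable: str
    then_value: str


RULES = [
    FuzzyRule("wind", "strong", "sailing", "good"),
    FuzzyRule("project_duration", "long", "completion_risk", "high"),
    FuzzyRule("speed", "slow", "stopping_distance", "short"),
]


def slow(speed_kmh):
    """Hypothetical membership function: fully slow at 20 km/h, not slow at 60."""
    return max(0.0, min(1.0, (60.0 - speed_kmh) / 40.0))


# Only one linguistic value is given a membership function in this sketch.
MEMBERSHIP = {("speed", "slow"): slow}


def firing_strength(rule, inputs):
    """Degree to which the IF part of the rule holds for the given inputs."""
    membership = MEMBERSHIP[(rule.if_variable, rule.if_value)]
    return membership(inputs[rule.if_variable])


print(firing_strength(RULES[2], {"speed": 30.0}))
```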