Gaussian Mixture Models in Fault Detection

This document discusses Gaussian mixture models (GMM), their construction and applications. GMMs represent a probability distribution as a weighted sum of multiple Gaussian distributions. The document provides a 3-step process for constructing a standard GMM using maximum likelihood and the EM algorithm. It then presents two case studies: [1] using GMMs to model and monitor sludge profiles in a secondary wastewater treatment settler, and [2] developing residual and fault detection criteria based on the GMM. The document concludes that GMMs provide a novel tool for fault detection in wastewater treatment processes.
See discussions, stats, and author profiles for this publication at: https://www.researchgate.net/publication/321245699

Gaussian Mixture Model - method and application

Presentation · November 2017


DOI: 10.13140/RG.2.2.32667.77602

Jesús Zambrano
The MathWorks
All content following this page was uploaded by Jesús Zambrano on 23 November 2017.


Gaussian Mixture Models
- method and applications
Jesús Zambrano
PostDoctoral Researcher
School of Business, Society and Engineering
www.mdh.se

FUDIPO project. Machine Learning course. Oct.-Dec. 2017


Outline

● Method
● Introduction to Gaussian Mixture Process (GMM)
● Standard construction of GMM
● Clustering (Silhouette and Akaike criterion)

● Case studies
● Monitoring a secondary settler tank
● Residual and fault detection criteria

● Conclusions
Gaussian Mixture Model (GMM)
- standard construction
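A linear superposition of $K$ Gaussians

$$p(x) = \sum_{k=1}^{K} \pi_k \, \mathcal{N}(x \mid \mu_k, \sigma_k)$$

($\mu_k$: mean, $\sigma_k$: covariance) is called a Gaussian mixture (GM). The mixture coefficient $\pi_k$ satisfies

$$0 \le \pi_k \le 1, \qquad \sum_{k=1}^{K} \pi_k = 1$$

Interpretation: The density $\mathcal{N}(x \mid \mu_k, \sigma_k) = p(x \mid k)$ is the probability of $x$, given that component $k$ was chosen. The probability of choosing component $k$ is given by the prior probability $\pi_k = p(k)$.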
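The interpretation above can be sketched in a few lines of Python for the one-dimensional case, where the covariance reduces to a variance. This is a minimal illustration, not code from the presentation: the function names and the parameter values are made up for the example.

```python
import math
import random


def normal_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def mixture_pdf(x, weights, means, variances):
    """p(x) = sum_k pi_k * N(x | mu_k, var_k)."""
    return sum(w * normal_pdf(x, m, v) for w, m, v in zip(weights, means, variances))


def sample_mixture(rng, weights, means, variances):
    """Choose component k with prior probability pi_k, then draw x from it."""
    k = rng.choices(range(len(weights)), weights=weights)[0]
    return rng.gauss(means[k], math.sqrt(variances[k]))


# Illustrative (made-up) parameters for a K=2 mixture.
weights = [0.3, 0.7]      # mixture coefficients, must sum to 1
means = [-2.0, 3.0]
variances = [1.0, 0.5]

rng = random.Random(0)
samples = [sample_mixture(rng, weights, means, variances) for _ in range(5)]
density_at_zero = mixture_pdf(0.0, weights, means, variances)
```

Sampling in two stages (component first, then the Gaussian) mirrors the reading of $\pi_k$ as a prior over components.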
GMM - standard construction (cont.)

For example, consider the following GMM:


GMM - standard construction (cont.)
The form of the GM distribution is governed by the parameters $\pi$, $\mu$ and $\sigma$. One way to estimate them is by maximum likelihood.

Given $N$ observations $X = \{x_1, \dots, x_N\}$, the log-likelihood function is

$$\ln p(X \mid \pi, \mu, \sigma) = \sum_{n=1}^{N} \ln \left( \sum_{k=1}^{K} \pi_k \, \mathcal{N}(x_n \mid \mu_k, \sigma_k) \right)$$

There is no closed-form solution available (due to the sum inside the logarithm).

This problem can be separated into two simpler problems using the expectation-maximization (EM) algorithm.
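As a rough illustration of the quantity being maximized, the log-likelihood of a one-dimensional mixture can be evaluated directly. The data and both parameter sets below are invented for the example; the function names are hypothetical.

```python
import math


def normal_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def log_likelihood(data, weights, means, variances):
    """ln p(X) = sum_n ln( sum_k pi_k * N(x_n | mu_k, var_k) )."""
    total = 0.0
    for x in data:
        # The sum over components sits inside the logarithm, which is
        # what prevents a closed-form maximizer.
        total += math.log(
            sum(w * normal_pdf(x, m, v) for w, m, v in zip(weights, means, variances))
        )
    return total


# Made-up observations and two candidate parameter sets.
data = [-2.1, -1.8, 3.2, 2.9, 3.1]
ll_good = log_likelihood(data, [0.4, 0.6], [-2.0, 3.0], [1.0, 1.0])
ll_bad = log_likelihood(data, [0.4, 0.6], [0.0, 10.0], [1.0, 1.0])
```

Parameters that place the components near the data yield the larger value, which is the sense in which maximum likelihood ranks candidate models.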
GMM - standard construction (cont.)
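Conditions to be satisfied at a maximum of the likelihood function: the derivatives of $\ln p(X \mid \pi, \mu, \sigma)$ with respect to $\mu_k$ and $\sigma_k$ are zero, which gives (standard result, following the reference cited on this slide)

$$\mu_k = \frac{1}{N_k} \sum_{n=1}^{N} \gamma(z_{nk}) \, x_n, \qquad \sigma_k = \frac{1}{N_k} \sum_{n=1}^{N} \gamma(z_{nk}) \, (x_n - \mu_k)(x_n - \mu_k)^{T}$$

where

$$\gamma(z_{nk}) = \frac{\pi_k \, \mathcal{N}(x_n \mid \mu_k, \sigma_k)}{\sum_{j=1}^{K} \pi_j \, \mathcal{N}(x_n \mid \mu_j, \sigma_j)}, \qquad N_k = \sum_{n=1}^{N} \gamma(z_{nk})$$

Maximizing with respect to $\pi_k$ (using a Lagrange multiplier to enforce $\sum_k \pi_k = 1$) gives

$$\pi_k = \frac{N_k}{N}$$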

For more details of EM and GMM see: C. Bishop, Pattern Recognition and Machine Learning, Springer, 2007.
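The two alternating steps of EM can be sketched for one-dimensional data as follows. This is a minimal sketch under stated assumptions, not the presentation's implementation: it uses a fixed number of iterations instead of a convergence test, starts all components at the overall sample variance, and applies a small variance floor. The function name `em_gmm_1d` and the synthetic data are hypothetical.

```python
import math
import random


def normal_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def em_gmm_1d(data, init_means, n_iter=100):
    """Fit a 1-D Gaussian mixture by EM, starting from the given means."""
    n, k = len(data), len(init_means)
    means = list(init_means)
    weights = [1.0 / k] * k
    grand_mean = sum(data) / n
    variances = [sum((x - grand_mean) ** 2 for x in data) / n] * k
    for _ in range(n_iter):
        # E-step: responsibility of each component for each point.
        resp = []
        for x in data:
            joint = [w * normal_pdf(x, m, v) for w, m, v in zip(weights, means, variances)]
            total = sum(joint)
            resp.append([p / total for p in joint])
        # M-step: re-estimate parameters from the responsibilities.
        for j in range(k):
            n_j = sum(r[j] for r in resp)
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / n_j
            var_j = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data)) / n_j
            variances[j] = max(var_j, 1e-6)  # floor to avoid a collapsed component
            weights[j] = n_j / n
    return weights, means, variances


# Synthetic data: two clusters with made-up parameters.
rng = random.Random(0)
data = [rng.gauss(-2.0, 1.0) for _ in range(300)] + [rng.gauss(3.0, 0.7) for _ in range(700)]
weights, means, variances = em_gmm_1d(data, init_means=[min(data), max(data)])
```

EM only finds a local maximum, so in practice the result depends on the starting values; several restarts are a common precaution.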
GMM - standard construction (cont.)
A simple Matlab example
● Matlab functions:
● fitgmdist (Fit a Gaussian mixture distribution to data)
● pdf (Density function of a specific ditribution)

Raw data Data model with 2 Gaussian


(2 clusters of 1000 points each) Mixture distributions

Run: gmm_example.m
A simple Matlab example (cont.)
● Silhouette value (S)
It is a measure of how similar a point is to a point in its own cluster.
Minimum average distance from the Average distance from 𝑖𝑖 𝑡𝑡𝑡 point to
𝑖𝑖 𝑡𝑡𝑡 point to points in a different cluster 𝑏𝑏𝑖𝑖 − 𝑎𝑎𝑖𝑖
other points in the same cluster
𝑆𝑆𝑖𝑖 = For well match of 𝑖𝑖 in its own cluster,
max(𝑎𝑎𝑖𝑖 , 𝑏𝑏𝑖𝑖 ) 𝑏𝑏𝑖𝑖 should be large and 𝑎𝑎𝑖𝑖 small.

𝑆𝑆𝑖𝑖 ranges between -1 to +1. High 𝑆𝑆𝑖𝑖 indicates that 𝑖𝑖 is well-matched to its
own cluster, and poorly-matched to neighboring clusters.
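The definition of $S_i$ translates almost directly into code. The sketch below assumes one-dimensional points, absolute difference as the distance, and at least two clusters; a point alone in its cluster gets the value 0, which is a common convention and an assumption here. The function name and the toy data are illustrative.

```python
def silhouette_values(points, labels):
    """Silhouette value S_i for each 1-D point, using |x - y| as the distance."""
    clusters = {}
    for idx, label in enumerate(labels):
        clusters.setdefault(label, []).append(idx)

    values = []
    for i, (x, label) in enumerate(zip(points, labels)):
        own = [j for j in clusters[label] if j != i]
        if not own:
            values.append(0.0)  # singleton cluster: assumed convention
            continue
        # a_i: average distance to the other points in the same cluster.
        a = sum(abs(x - points[j]) for j in own) / len(own)
        # b_i: smallest average distance to the points of any other cluster.
        b = min(
            sum(abs(x - points[j]) for j in members) / len(members)
            for other, members in clusters.items()
            if other != label
        )
        denom = max(a, b)
        values.append((b - a) / denom if denom > 0 else 0.0)
    return values


# Toy data: two tight groups, labelled once sensibly and once badly.
points = [0.0, 0.2, 0.4, 5.0, 5.2, 5.4]
good = silhouette_values(points, [0, 0, 0, 1, 1, 1])
bad = silhouette_values(points, [0, 1, 0, 1, 0, 1])
```

Averaging the values over all points gives a single score per clustering, which is one way to compare different numbers of components.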
A simple Matlab example (cont.)
● Silhouette value (S)

K=2 GM

K=3 GM
A simple Matlab example (cont.)
● Akaike’s Information Criterion (AIC)

Provides a measure of the relative quality of a model for a given set of


data.
Number of estimated parameters Model parameters

2𝑛𝑛𝑝𝑝
Then, the aim is to get: min 1 + ∑𝑁𝑁
𝑡𝑡=1 𝜀𝜀 2 (𝑡𝑡, 𝜃𝜃)
𝑛𝑛𝑝𝑝 ,𝜃𝜃 𝑁𝑁
Number of values in the estimation data set
Prediction error

The most accurate model has the smallest AIC.

AIC=17584 AIC=14233 AIC=14238
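Model selection by AIC can be sketched as follows. Note the swap: instead of the prediction-error form shown on the slide, this example uses the common likelihood form $\mathrm{AIC} = 2 n_p - 2 \ln L$, which is the natural choice when the model is a fitted density. The EM fit, the quantile-based initialization, the parameter count $3K - 1$ for a one-dimensional mixture, and the synthetic data are all assumptions of the example.

```python
import math
import random


def normal_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def fit_log_likelihood(data, k, n_iter=100):
    """Fit a K-component 1-D GMM by EM; return the final log-likelihood."""
    n = len(data)
    ordered = sorted(data)
    means = [ordered[int((j + 0.5) * n / k)] for j in range(k)]  # quantile start
    grand_mean = sum(data) / n
    variances = [sum((x - grand_mean) ** 2 for x in data) / n] * k
    weights = [1.0 / k] * k
    for _ in range(n_iter):
        resp = []
        for x in data:  # E-step
            joint = [w * normal_pdf(x, m, v) for w, m, v in zip(weights, means, variances)]
            total = sum(joint)
            resp.append([p / total for p in joint])
        for j in range(k):  # M-step
            n_j = sum(r[j] for r in resp)
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / n_j
            var_j = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data)) / n_j
            variances[j] = max(var_j, 1e-6)
            weights[j] = n_j / n
    return sum(
        math.log(sum(w * normal_pdf(x, m, v) for w, m, v in zip(weights, means, variances)))
        for x in data
    )


def aic(log_lik, n_params):
    """Likelihood form of Akaike's criterion: smaller is better."""
    return 2.0 * n_params - 2.0 * log_lik


# Synthetic two-cluster data with made-up parameters.
rng = random.Random(1)
data = [rng.gauss(-2.0, 1.0) for _ in range(200)] + [rng.gauss(3.0, 1.0) for _ in range(200)]

# A 1-D mixture has (K-1) free weights, K means and K variances.
aic_by_k = {k: aic(fit_log_likelihood(data, k), 3 * k - 1) for k in (1, 2, 3)}
best_k = min(aic_by_k, key=aic_by_k.get)
```

The penalty term grows with the number of components, so an extra component is only preferred when it raises the log-likelihood by more than the added parameters cost.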


Case study

Jesús Zambrano
jesus.zambrano@mdh.se
A wastewater treatment plant
A wastewater treatment plant (cont.)
The Process

[Figure: process diagram with influent, effluent and waste streams]

Q: flowrate
S: conc. soluble substrate
X: conc. biomass
r: recycle ratio
w: wastage ratio
The Process (cont.)
[Figure: secondary settler showing the clarification zone, the thickening zone and the sludge blanket]
Scanning a secondary settler

Sludge profile
The Problem

[Figure: an SS sensor scans the settler and produces sludge profiles, plotted as level [m] against SS conc. [g/L]]

How to detect faulty settler profiles?

Let’s apply Gaussian Mixture Models!
GMM for the settler
15 sludge profiles in non-faulty conditions
GMM for the settler (cont.)

GMM parameters $\pi_k$, $\mu_k$, $\sigma_k$:

We denote
Settler monitoring

• Sludge profiles from day 1 (blue) to day 33 (red).


• New profile every 15 minutes = 3168 profiles.

Day 1 -10 Day 11 - 20 Day 21 -33

(Red does not mean alarm!)


Residual and Fault detection criteria

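The residual $r(t)$ is compared with a threshold $h$:

$r(t) \le h$: normal
$r(t) > h$: faulty!

where

$$h = \max_{t \in H_0} r(t)$$

is the largest residual observed under the fault-free hypothesis $H_0$. This is a classical binary hypothesis testing problem.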
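The threshold rule can be sketched as follows. The definition of the residual did not survive in this transcript, so the example assumes one for illustration only: the negative log-density of a scalar observation under a GMM fitted to fault-free data. The GMM parameters, the fault-free sample and all function names are hypothetical.

```python
import math


def normal_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def residual(x, weights, means, variances):
    """Assumed residual: negative log-density of x under the GMM."""
    density = sum(w * normal_pdf(x, m, v) for w, m, v in zip(weights, means, variances))
    return -math.log(max(density, 1e-300))  # guard against log(0)


def threshold_from_normal_data(normal_data, weights, means, variances):
    """h = max of r(t) over the fault-free (H0) observations."""
    return max(residual(x, weights, means, variances) for x in normal_data)


def is_faulty(x, h, weights, means, variances):
    """Flag an observation whose residual exceeds the threshold."""
    return residual(x, weights, means, variances) > h


# Made-up GMM (as if fitted to fault-free data) and fault-free observations.
weights, means, variances = [0.5, 0.5], [0.0, 4.0], [1.0, 1.0]
normal_data = [-1.5, -0.3, 0.2, 1.1, 3.0, 3.8, 4.4, 5.6]

h = threshold_from_normal_data(normal_data, weights, means, variances)
typical_flag = is_faulty(0.5, h, weights, means, variances)
outlier_flag = is_faulty(12.0, h, weights, means, variances)
```

Taking the maximum over the fault-free data means that none of those observations raises an alarm, at the cost of a threshold that is sensitive to any outlier in the reference set.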
Settler monitoring (cont.)
[Figure: residual over the monitoring period, compared with the threshold]
Conclusions

● Valuable information can be obtained by monitoring a


Secondary Settler in a wastewater treatment plant.

● Gaussian Mixture Models provide a novel tool for fault


detection in this process.

● The proposed method is general and could be


implemented in settlers with different geometries and
sludge profiles.

● The method is also suitable for monitoring deviations


in a process with repetitive data profiles.
Thanks for your attention!

Jesús Zambrano
jesus.zambrano@mdh.se
Jesús Zambrano
jesus.zambrano@mdh.se
