Disclaimer
Presentations are intended for educational
purposes only and do not replace independent
professional judgment. Statements of fact and
opinions expressed are those of the
participants individually and don’t necessarily
reflect those of blibli.com.
Blibli.com does not endorse or approve, and
assumes no responsibility for, the content,
accuracy or completeness of the information
presented.
Machine Learning:
an Introduction and cases
Hendri Karisma
hendri.karisma@gdn-commerce.com
Hendri Karisma
• Sr. Research and Development Engineer
at blibli.com (PT. Global Digital Niaga)
• R&D Team in AI Squad
• Working for Fraud Detection System,
Customer Group for abuser detection,
dynamic recommendation system project
and Customer Segmentation.
• https://about.me/hendriKarisma
Materials
• Definition and background
• Methods
• Problems and Solutions
• Technologies
• Cases
The Definition of Informatics
“Automation of Information” –
Prof. Dr. Ing. Iping Supriana
Artificial Intelligence
• S. Russell and P. Norvig, Artificial
Intelligence: A Modern Approach
Problem-solving agent
• Searching for a solution
• Knowledge base and planning
• Reasoning
• Learning
Learning
Generalization
Specific Cases
Inductive Learning
Deductive Learning
Data ???
Machine Learning Definition
“A computer program is said to learn from
experience E with respect to some class of
tasks T and performance measure P, if its
performance at tasks in T, as measured by P,
improves with experience E.” – Prof. Tom Mitchell
Problem and Solutions
• Analytical (exact)
Example:
– Analytical solution: 22/3 ≈ 7.3333
– Numerical solution: 7.25
– Error = |7.25 - 22/3| = |7.25 - 7.3333| = 0.0833
• Numerical (approximate)
– Are numerical methods only the ML methods we know from textbooks?
– Newton-Raphson, Gaussian elimination, Gauss-Jordan, Jacobi method, Gauss-Seidel,
Lagrange, Newton-Gregory, Richardson extrapolation, etc.
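The integral behind this example did not survive on the slide. One integral that reproduces both numbers in the error line is the integral of 4 - x^2 from -1 to 1, evaluated with the trapezoidal rule at strip width 0.5. That integrand, the interval, and the function names below are assumptions chosen to match 22/3 and 7.25, not necessarily the presenter's own example.

```python
def trapezoid(f, a, b, n):
    """Composite trapezoidal rule for the integral of f over [a, b] using n strips."""
    h = (b - a) / n
    total = 0.5 * (f(a) + f(b))
    for i in range(1, n):
        total += f(a + i * h)
    return h * total


def f(x):
    # Assumed integrand; picked because it reproduces the slide's numbers.
    return 4 - x ** 2


exact = 22 / 3                        # analytical: [4x - x^3/3] evaluated from -1 to 1
approx = trapezoid(f, -1.0, 1.0, 4)   # numerical: 4 strips, h = 0.5, gives 7.25
error = abs(approx - exact)           # about 0.0833

print(approx, error)
```

Using more strips, for example trapezoid(f, -1.0, 1.0, 8), shrinks the error, which is the usual trade-off between cost and accuracy in numerical methods.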
How it works
Compute the error (y - y')
Then minimize the error
or
Maximize the likelihood
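A minimal sketch of "compute the error, then minimize it": fitting a line y' = w * x + b by gradient descent on the mean squared error. The toy data, learning rate, and function name are assumptions for illustration. The update uses the full batch on every step rather than the stochastic variant.

```python
def fit_line(xs, ys, lr=0.05, epochs=2000):
    """Fit y' = w * x + b by batch gradient descent on mean squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # Error for each sample: y - y'
        errors = [y - (w * x + b) for x, y in zip(xs, ys)]
        # Gradients of mean(error^2) with respect to w and b
        grad_w = -2.0 / n * sum(e * x for e, x in zip(errors, xs))
        grad_b = -2.0 / n * sum(errors)
        # Step against the gradient to reduce the error
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


# Toy data generated from y = 2x + 1 (hypothetical)
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]

w, b = fit_line(xs, ys)
print(w, b)  # close to 2 and 1
```

Maximizing a likelihood follows the same loop with the sign flipped: step along the gradient of the log-likelihood instead of against the gradient of the error.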
Machine Learning Function
• Information theory (decision trees: ID3,
C4.5, etc.)
• Probability (Bayesian: Naive Bayes, belief
networks, etc.)
• Graphical models (belief networks, HMM, CRF,
neural networks, etc.)
• Numerical methods / regression (stochastic
gradient descent: linear regression, multiple
linear regression, neural networks;
stochastic gradient ascent: E-M algorithm)
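To make the information-theory family concrete: decision-tree learners such as ID3 choose the split that yields the largest information gain, meaning the largest drop in label entropy. The sketch below computes both quantities. The four-row data set and its feature names are invented purely for illustration.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def information_gain(rows, labels, feature):
    """Entropy reduction obtained by splitting the rows on one feature."""
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[feature], []).append(label)
    remainder = sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


# Hypothetical toy data: does an order look like promo abuse?
rows = [
    {"promo": "yes", "cod": "yes"},
    {"promo": "yes", "cod": "no"},
    {"promo": "no", "cod": "yes"},
    {"promo": "no", "cod": "no"},
]
labels = ["abuse", "abuse", "normal", "normal"]

gain_promo = information_gain(rows, labels, "promo")  # splits the classes perfectly
gain_cod = information_gain(rows, labels, "cod")      # tells us nothing here
print(gain_promo, gain_cod)
```

A tree learner would split on "promo" first, because that feature removes all uncertainty about the label in this toy set.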
Machine Learning Taxonomy
• Supervised
• Unsupervised
• Reinforcement learning
• Semi-supervised
• Deep learning
Machine Learning Taxonomy #2
Methods
Machine Learning Taxonomy
Regression
Deep Learning
The Four Layers of Data Mining
Problems
Complexity #1
Complexity #2
Solutions
Solutions #1
Solutions #2
• In-memory data fabric: provides low-latency access to and
processing of large quantities of data by distributing the data
across the dynamic random access memory (DRAM), flash, or SSD
of a distributed computer system
Solutions #3
• Cluster machines
• GPU machines (OpenCL and NVIDIA CUDA)
Technologies
Tools on Python
• NumPy
• SciPy
• pandas
• scikit-learn
• Matplotlib
• seaborn
• TensorFlow
• *pydata.org
• *Anaconda
• Other tech (to support
ML):
– Apache Kafka
– Apache Spark
– DB: MongoDB, PostgreSQL
– Elasticsearch
– CUDA/OpenCL
Tools on the Java Virtual Machine (JVM)
• Weka
• Deeplearning4j (works with Spark and GPUs)
• H2O (works with Spark and GPUs; supports
TensorFlow, MXNet, and Caffe)
• JCudnn (JNI wrapper for NVIDIA cuDNN)
• Mahout (Hadoop)
• Spark MLlib
Tools
Stack and Services
GCP
Compile all components
How we applied Machine Learning
Process mining
• We are using microservices
Another problem
What data do we need?
How to Collect
Event-Driven Architecture
Service Service Service Service Service
Message queue (Kafka)
Recommendation, Fraud Service
Machine Learning Engine
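The flow in this diagram can be sketched in a few lines: services publish events to a queue, and the fraud and recommendation consumers each pick up the event types they care about. This is only an in-process illustration. Python's queue.Queue stands in for a Kafka topic, and the service names, event types, and payload fields are hypothetical.

```python
import json
import queue

events = queue.Queue()  # in-process stand-in for a Kafka topic


def publish(service, event_type, payload):
    """Called by a microservice whenever something happens."""
    message = {"service": service, "type": event_type, "payload": payload}
    events.put(json.dumps(message))


def consume_all(subscriptions):
    """Drain the queue and route each event to every consumer subscribed to its type."""
    routed = []
    while not events.empty():
        event = json.loads(events.get())
        for consumer, wanted_types in subscriptions.items():
            if event["type"] in wanted_types:
                routed.append((consumer, event["type"]))
    return routed


# Hypothetical events emitted by three services
publish("order", "order_created", {"order_id": 1, "amount": 250000})
publish("product", "product_viewed", {"product_id": 42})
publish("payment", "payment_failed", {"order_id": 1})

# Hypothetical subscriptions of the two downstream consumers
subscriptions = {
    "fraud": {"order_created", "payment_failed"},
    "recommendation": {"product_viewed", "order_created"},
}

routed = consume_all(subscriptions)
print(routed)
```

The point of the design is that the producing services never call the machine learning engine directly. They only emit events, so new consumers can be added without touching the services.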
Cases
Anomaly Detection
• Anomalies are patterns in data that do not conform to a well-defined
notion of normal behavior
• These nonconforming patterns are called outliers, discordant
observations, exceptions, aberrations, surprises, peculiarities, or
contaminants in different application domains
Challenges
• Defining a normal region
• In many domains, normal behavior keeps evolving
• Availability of labeled data for training/validation of models used
by anomaly detection techniques is usually a major issue
• Often the data contains noise that tends to be similar to the
actual anomalies
Anomaly Detection
Solution method:
• Gaussian Mixture Model
• Fitted by the EM algorithm
Gaussian Distribution
Before GMM, recall the Gaussian distribution
Multivariate Gaussian Distribution
Gaussian Mixture
We have 3 Gaussians,
meaning 3 clusters, in the
right picture.
We have 4 Gaussians,
meaning 4 clusters, in the
left picture.
Gaussian Mixture (Multivariate)
Having more than one Gaussian in the mixture means there is more
than one possible component for each data point that we want to
assign to the GMM.
For example, to assign a data point x to the GMM, we calculate
the probability of x under the first Gaussian, then under the second,
and so on up to the last Gaussian. That is, we evaluate p(x) given
each Gaussian's parameters and pick the component with the highest value:
f(x) = argmax{p(x | μ_1, Σ_1), p(x | μ_2, Σ_2), p(x | μ_3, Σ_3), ..., p(x | μ_{n+1}, Σ_{n+1})}
where p is the multivariate Gaussian density for a d-dimensional x:
p(x | μ, Σ) = (2π)^(-d/2) |Σ|^(-1/2) exp(-(1/2) (x - μ)^T Σ^(-1) (x - μ))
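The assignment rule above can be sketched for the two-dimensional case: evaluate the Gaussian density of x under every component and take the argmax. The three components and the query point are made up for illustration, and the 2x2 covariance is inverted by hand so that only the standard library is needed.

```python
from math import exp, pi, sqrt


def gaussian_pdf_2d(x, mean, cov):
    """Density of a 2-D Gaussian N(mean, cov) evaluated at point x."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    inv = [[d / det, -b / det], [-c / det, a / det]]
    dx = [x[0] - mean[0], x[1] - mean[1]]
    # Mahalanobis term (x - mean)^T cov^{-1} (x - mean)
    m = (dx[0] * (inv[0][0] * dx[0] + inv[0][1] * dx[1])
         + dx[1] * (inv[1][0] * dx[0] + inv[1][1] * dx[1]))
    # For d = 2 the normalizer (2*pi)^(d/2) * |cov|^(1/2) is 2*pi*sqrt(det)
    return exp(-0.5 * m) / (2 * pi * sqrt(det))


def most_likely_component(x, components):
    """Return (index of the component with the highest density at x, all densities)."""
    densities = [gaussian_pdf_2d(x, mean, cov) for mean, cov in components]
    best = max(range(len(components)), key=lambda k: densities[k])
    return best, densities


identity = [[1.0, 0.0], [0.0, 1.0]]
# Hypothetical mixture components: (mean, covariance)
components = [
    ((0.0, 0.0), identity),
    ((5.0, 5.0), identity),
    ((0.0, 5.0), identity),
]

best, densities = most_likely_component((4.5, 5.2), components)
print(best, densities)
```

A full GMM also weights each density by the component's mixing coefficient before comparing. The slide's formula omits those weights, so the sketch does too.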
EM (Expectation Maximization) Algorithm
• GMM Ref : https://brilliant.org/wiki/gaussian-mixture-model/
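A minimal sketch of EM for a one-dimensional Gaussian mixture, followed by the anomaly rule it enables: a point is flagged when its density under the fitted mixture falls below a threshold. The data, the initial means, the iteration count, and the threshold are all assumptions chosen for illustration, and this is not the fraud system's actual implementation.

```python
from math import exp, pi, sqrt


def normal_pdf(x, mu, var):
    """Density of a 1-D Gaussian with mean mu and variance var."""
    return exp(-((x - mu) ** 2) / (2 * var)) / sqrt(2 * pi * var)


def fit_gmm_1d(data, initial_means, n_iter=50):
    """Fit a 1-D Gaussian mixture with EM, starting from the given means."""
    k = len(initial_means)
    mus = list(initial_means)
    variances = [1.0] * k
    weights = [1.0 / k] * k
    for _ in range(n_iter):
        # E-step: responsibility of each component for each point
        resp = []
        for x in data:
            p = [weights[j] * normal_pdf(x, mus[j], variances[j]) for j in range(k)]
            total = sum(p)
            resp.append([pj / total for pj in p])
        # M-step: re-estimate weights, means, and variances
        for j in range(k):
            nj = sum(r[j] for r in resp)
            weights[j] = nj / len(data)
            mus[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var = sum(r[j] * (x - mus[j]) ** 2 for r, x in zip(resp, data)) / nj
            variances[j] = max(var, 1e-6)  # floor to avoid a collapsed component
    return weights, mus, variances


def mixture_density(x, weights, mus, variances):
    """Density of x under the fitted mixture."""
    return sum(w * normal_pdf(x, m, v) for w, m, v in zip(weights, mus, variances))


# Hypothetical "normal behavior": two tight clusters around 1 and 5
data = [1.0, 1.2, 0.8, 1.1, 0.9, 5.0, 5.2, 4.8, 5.1, 4.9]
weights, mus, variances = fit_gmm_1d(data, initial_means=[0.0, 6.0])

THRESHOLD = 1e-3  # assumed cut-off; in practice tuned on validation data


def is_anomaly(x):
    return mixture_density(x, weights, mus, variances) < THRESHOLD


print(mus, is_anomaly(3.0), is_anomaly(1.05))
```

The point 3.0 sits between the two clusters, so its mixture density is tiny and it is flagged, while 1.05 lies inside the first cluster and is not.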
Fraud Detection
Payment fraud (phishing,
account takeover, carding)
System abuse (promo, content,
account, logistics, and payment
methods, especially COD)
Fraud not only results in financial
losses but also creates
reputational risk.
Some security measures have
been taken by banks and other
multinational financial services.
[E. Duman et al., 2013]
Annual Reports Cybersource
THANK YOU
Any question?
