RuleML 2015: When Processes Rule Events
Lecture Outline
Big Data: the New Playground
Events, Processes, and Anything in Between
Complex Event Processing Optimization
Process Mining with Schedules
When Processes Rule Events
Avigdor Gal
Technion – Israel Institute of Technology
Presentation Outline
Big data: the New Playground
Events, Processes, and Anything in Between
Complex Event Processing Optimization
Process Mining with Schedules
Big Data: is it a Storm in a Teacup?
Big data is a game changer
From Theory to Systems: empirical evaluation counts
From Systems to Data: large-scale empirical evaluation counts
Who is a Data Scientist?
The ability to take data – to be able to understand it, to
process it, to extract value from it, to visualize it, to
communicate it – that’s going to be a hugely important skill in
the next decades. (Hal Varian, Google’s Chief Economist)
Data Volume: No Longer the Size of a Teacup
Volume
Table: Big Data Cross Table
Big data may be a single dataset with a lot of data
Data Velocity: Replacing a Teacup with a Tea Hose
Volume
Velocity
Table: Big Data Cross Table
Big data may be data that rapidly changes
Data Variety: When One Tea Type is Just not Enough
Volume
Velocity
Variety
Table: Big Data Cross Table
Big data may be a small dataset with many different schemata
Data Veracity: Is it Coffee or Black Tea with Milk?
Volume
Velocity
Variety
Veracity
Table: Big Data Cross Table
Big data may be data with varying levels of trustworthiness
Data Gathering: where and when to expect the fountain to burst
Gathering
Volume
Velocity
Variety
Veracity
Signal and Event Processing
Table: Big Data Cross Table
Data Management: Not your typical DBA anymore
Gathering Managing
Volume
Velocity
Variety
Veracity
Cloud Computing, NoSQL, NewSQL
Table: Big Data Cross Table
Data Analytics: When Data Analysis Explodes Multi-Dimensionally
Gathering Managing Analyzing
Volume
Velocity
Variety
Veracity
Data & Process Mining
ML, IR, NLP
Table: Big Data Cross Table
Data Visualization: The Machine Offering to Mankind
Gathering Managing Analyzing Visualizing
Volume
Velocity
Variety
Veracity
User Experience
Table: Big Data Cross Table
Big Data Cross Table
Columns: Gathering, Managing, Analyzing, Visualizing
Rows: Volume, Velocity, Variety, Veracity
Labels written vertically across the rows: Events, Processes
Table: Big Data Cross Table
Event Processing
Events
An event e is an occurrence within a particular system or
domain.
It is something that has happened, or is contemplated as
having happened in that domain.
[Etzion and Niblett, 2010]
Point-based semantics.
An event type E ∈ ℰ is a specification for a set of events
that share the same semantic intent and structure.
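The definitions above can be illustrated with a minimal sketch. It assumes that an event type is described by a name and a set of attribute names, and that point-based semantics means each event carries a single timestamp. The class names, the `conforms` check, and the attribute names in the usage note are hypothetical and are not taken from the lecture.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, FrozenSet


@dataclass(frozen=True)
class EventType:
    """Specification shared by a set of events (assumed: a name plus attribute names)."""
    name: str
    attributes: FrozenSet[str]


@dataclass
class Event:
    """A single occurrence; point-based semantics gives it exactly one timestamp."""
    event_type: EventType
    timestamp: float  # assumed unit: seconds since some epoch
    payload: Dict[str, Any] = field(default_factory=dict)

    def conforms(self) -> bool:
        # The event matches its type when it carries exactly the declared attributes.
        return set(self.payload) == set(self.event_type.attributes)
```

For example, a hypothetical `BusPosition` type with attributes `bus_id`, `lat`, and `lon` accepts an event that carries all three values and rejects one that carries only `bus_id`.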
Complex Event Processing
Systems: Amit [Adi and Etzion, 2004],
SASE [Wu et al., 2006], Cayuga [Demers et al., 2007],
CEDR [Barga et al., 2007], ESPER.
DEBS 2016: Orange County, California
Event Processing
Urban Traffic Management
Traffic Flow
Bus Log
Events and Big Data
Volume: 23 Million records per month (∼ 4GB)
Velocity: 770,000 new records per day (an event every 2-6 seconds)
Variety: Homogeneous
Veracity: GPS locations
Processes
Processes
Process models describe time dependencies among
activities:
Business processes
Scheduled activities
Used as a template for execution by a process engine.
A process model can be modeled as a graph containing
activity nodes and control nodes:
Petri nets [Reisig, 1985]
BPMN [OMG, 2011]
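A process model seen as a graph of activity nodes and control nodes can be sketched as follows. This is an illustrative data structure only, not Petri net or BPMN semantics; the class, its method names, and the node names used in the usage note are assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class ProcessModel:
    """Directed graph whose nodes are either activities or control nodes (assumed encoding)."""
    kinds: Dict[str, str] = field(default_factory=dict)        # node -> "activity" | "control"
    edges: Dict[str, List[str]] = field(default_factory=dict)  # node -> successor nodes

    def add_node(self, name: str, kind: str) -> None:
        if kind not in ("activity", "control"):
            raise ValueError(f"unknown node kind: {kind}")
        self.kinds[name] = kind
        self.edges.setdefault(name, [])

    def add_edge(self, src: str, dst: str) -> None:
        self.edges[src].append(dst)

    def next_activities(self, node: str) -> Set[str]:
        """Activities reachable from `node` by passing through control nodes only."""
        found: Set[str] = set()
        seen: Set[str] = set()
        stack = list(self.edges[node])
        while stack:
            current = stack.pop()
            if current in seen:
                continue
            seen.add(current)
            if self.kinds[current] == "activity":
                found.add(current)
            else:
                stack.extend(self.edges[current])
        return found
```

With a hypothetical activity `submit` followed by a control node that branches to `accept` and `reject`, `next_activities("submit")` returns both branch activities. This is the kind of time dependency a process engine would read from the model.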
Process Models
[Figure: Bus Log and Bus Model, with a route drawn from s to d through the stops ω_2, ω_3, …, ω_i, …, ω_{n-1}]
Between Events and Processes
Given processes, detect (complex) events
Given events, discover processes
From Processes to CEP
Optimisation of event pattern matching on three levels
Approach based on domain knowledge
Results taken from: M. Weidlich, H. Ziekow, A. Gal, J.
Mendling, M. Weske - Optimising Event Pattern Matching
using Business Process Models. IEEE Transactions on
Knowledge and Data Engineering (TKDE), accepted for
publication, 2015.
From Processes to CEP
Thanks to Matthias Weidlich for the slides
Optimization by Transformation
Sequentialization Rule
Optimization by Plan Selection
Sequentialization Rule
Optimization by Early Termination
Sequentialization Rule
Performance Analysis
Datasets
A publicly available process log that contains recorded execution sequences of a paper reviewing process (http://www.processmining.org/logs/start):
The model defines 20 activities.
The log comprises 3730 events that are related to 100 process instances.
Each event is associated with a timestamp and a reference to an activity of the process model.
Process models of a German insurance company:
1021 process models, ranging from 4 to 339 nodes.
The average size of the process models is around 23 nodes.
The log was simulated using annotations of the process models.
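A log of this shape, where each event carries a timestamp and a reference to an activity and belongs to one process instance, can be turned into per-instance activity sequences as sketched below. The record layout (`LogEvent`), the function name, and the activity names in the usage note are assumptions for illustration and do not reflect the format of the published log.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, NamedTuple


class LogEvent(NamedTuple):
    instance_id: str   # process instance the event belongs to
    activity: str      # reference to an activity of the process model
    timestamp: float   # assumed unit: seconds


def traces(log: Iterable[LogEvent]) -> Dict[str, List[str]]:
    """Group events by process instance and order each group by timestamp."""
    by_instance: Dict[str, List[LogEvent]] = defaultdict(list)
    for event in log:
        by_instance[event.instance_id].append(event)
    return {
        instance_id: [e.activity for e in sorted(events, key=lambda e: e.timestamp)]
        for instance_id, events in by_instance.items()
    }
```

Given three hypothetical events, two for instance `p1` and one for instance `p2`, the function returns one ordered activity list per instance, which is the input that event pattern matching over a process log would work on.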
Performance Analysis
Complex Events Processing with Processes
Gathering ...
Volume
Velocity Optimization
Variety Optimisation in event processing networks
Veracity
Table: Big Data Cross Table
Complex Events Processing with Processes
... Analysis
Volume Mining of constraints
Velocity
Variety
Veracity Probabilistic mining of constraints
Table: Big Data Cross Table
From Events to Processes
Online Traveling Time Prediction: when Processes Rule Events
Using information on bus stops, the prediction of the journey traveling time T(⟨ω_1, …, ω_n⟩, t_{ω_1}) is traced back to the sum of traveling times per segment:
T(⟨ω_1, …, ω_n⟩, t_{ω_1}) = T(⟨ω_1, ω_2⟩, t_{ω_1}) + … + T(⟨ω_{n-1}, ω_n⟩, t_{ω_{n-1}})
where
t_{ω_{n-1}} = t_{ω_1} + T(⟨ω_1, ω_{n-1}⟩, t_{ω_1}).
[Figure: a route from s to d through the stops ω_2, ω_3, …, ω_i, …, ω_{n-1}]
Traveling Time = Drive Time + Delay Time + Stop Time
(Thanks to Arik Senderovich for the slides)
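The per-segment decomposition can be sketched in code: each segment is entered at the journey start time plus the time already spent on earlier segments. This is a minimal illustration, assuming a caller-supplied function `segment_time(a, b, t)` that returns the traveling time of segment ⟨a, b⟩ when entered at time t. That function, its signature, and the stop names in the usage note are hypothetical.

```python
from typing import Callable, Sequence

# Hypothetical signature: (from_stop, to_stop, entry_time) -> traveling time
SegmentTime = Callable[[str, str, float], float]


def journey_time(stops: Sequence[str], t_start: float, segment_time: SegmentTime) -> float:
    """Sum per-segment traveling times, propagating the entry time of each segment."""
    total = 0.0
    for a, b in zip(stops, stops[1:]):
        # The segment <a, b> is entered at t_start plus the time spent so far.
        total += segment_time(a, b, t_start + total)
    return total
```

With a toy `segment_time` that returns 60 time units before t = 100 and 90 afterwards, a journey over stops A, B, C, D starting at t = 0 takes 60 + 60 + 90 = 210, because the third segment is entered at t = 120.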
Prediction: The Snapshot Principle in Single-Station Queues
The snapshot principle stems from a heavy-traffic approximation of a queueing system under limits of its parameters, as the workload converges to capacity.
[Figure: a single-station queue, Station1]
The principle states that the total time in the station (waiting + service) remains constant.
In our context, a bus that passes through a segment, e.g., ⟨ω_i, ω_{i+1}⟩ ∈ S × S, will have the same traveling time as another bus that has just passed through that segment (not necessarily of the same type, line, etc.).
The Snapshot Principle in Single-Station Queues
Based on the above, we define a single-segment snapshot predictor, Last-Bus-to-Travel-Segment (LBTS), denoted by θ_{LBTS}(⟨ω_i, ω_{i+1}⟩, t_{ω_1}).
In real-life settings, the applicability of snapshot-principle predictors should be tested ad hoc.
The snapshot principle was shown to be of empirical value in previous research, where queueing techniques were applied to predict delays.
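A single-segment snapshot predictor of this kind can be sketched as a small online structure that remembers, per segment, the traveling time of the bus that most recently completed it. The class and method names are assumptions for illustration. The sketch does not model how the lecture's system extracts segment entry and exit times from the bus log.

```python
from typing import Dict, Optional, Tuple

Segment = Tuple[str, str]  # (from_stop, to_stop)


class LastBusToTravelSegment:
    """Predict a segment's traveling time as that of the last bus to complete it."""

    def __init__(self) -> None:
        # segment -> (exit time of the last bus, its traveling time)
        self._last: Dict[Segment, Tuple[float, float]] = {}

    def observe(self, segment: Segment, enter_time: float, exit_time: float) -> None:
        previous = self._last.get(segment)
        # Keep only the most recent completion, even if records arrive out of order.
        if previous is None or exit_time >= previous[0]:
            self._last[segment] = (exit_time, exit_time - enter_time)

    def predict(self, segment: Segment) -> Optional[float]:
        entry = self._last.get(segment)
        # No bus has completed the segment yet, so there is nothing to predict from.
        return None if entry is None else entry[1]
```

Returning `None` when no bus has yet completed the segment covers the case of a first trip, for which no last traveling time exists.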
Snapshot Principle in a Network
In our case, the LBTS predictor needs to be lifted to a network
setting.
The snapshot principle holds for networks of queues, when the
routing through this network is known in advance.
In scheduled transportation such as buses this is the case as the
order of stops (and segments) is predefined:
[Figure: a queueing network of seven stations, Station1 to Station7]
Snapshot Principle in a Network
We define a multi-segment (network) snapshot predictor that we refer to as the Last-Bus-to-Travel-Network (LBTN), denoted by θ_{LBTN}(⟨ω_1, …, ω_n⟩, t_{ω_1}), given a sequence of stops (with ω_1 being the start stop and ω_n being the end stop).
According to the snapshot principle in networks we get that:
θ_{LBTN}(⟨ω_1, …, ω_n⟩, t_{ω_1}) = Σ_{i=1}^{n-1} θ_{LBTS}(⟨ω_i, ω_{i+1}⟩, t_{ω_1}).
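The network predictor is then a plain sum of single-segment snapshot predictions, all taken at the same prediction time. A minimal sketch follows, assuming the latest traveling time per segment is already available in a dictionary. The function name, the dictionary, and the choice to return `None` when a segment has no observation are illustrative assumptions.

```python
from typing import Dict, Optional, Sequence, Tuple

Segment = Tuple[str, str]  # (from_stop, to_stop)


def last_bus_to_travel_network(
    stops: Sequence[str],
    last_travel_time: Dict[Segment, float],
) -> Optional[float]:
    """Sum the latest observed traveling time of every consecutive segment on the route."""
    total = 0.0
    for segment in zip(stops, stops[1:]):
        if segment not in last_travel_time:
            # No snapshot exists for this segment, so no network prediction is made.
            return None
        total += last_travel_time[segment]
    return total
```

A route with n stops has n - 1 consecutive segments, which is why the loop pairs each stop with its successor. Unlike the exact decomposition of the journey time, every term here uses the snapshot available at the start of the journey.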
Performance Analysis
Data
8 days of bus data, between September and October of 2014.
Each day: approximately 11,500 traveled segments.
First trip for each day: no associated last travel time.
Prediction for line 046A.
Data comes from all buses that share segments with line 046A.
Performance Analysis
[Figure: sample square estimation error and root mean square error, plotted against the index of the segment in the trip]
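The root mean square error named on the plot can be computed from paired predictions and observed traveling times as sketched below. This is the standard definition of the metric. How the lecture's evaluation groups samples by segment index is not shown here, and the function name is an assumption.

```python
import math
from typing import Sequence


def rmse(predicted: Sequence[float], actual: Sequence[float]) -> float:
    """Root mean square error between predicted and observed values."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))
```

Each entry of `squared_errors` is one sample square estimation error, and the returned value is the square root of their mean.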
Process Mining with Schedules
... Analysis
Volume Better prediction
Velocity Segmentation
Variety
Veracity
Table: Big Data Cross Table
Process Mining with Schedules
... Management ...
Volume
Velocity
Variety
Veracity Event Cleaning
Table: Big Data Cross Table
Thank You
Avigdor Gal
Technion – Israel Institute of Technology
A. Adi and O. Etzion. Amit - the situation manager. The International Journal on Very Large Data Bases, 13(2):177–203, May 2004.
Roger S. Barga, Jonathan Goldstein, Mohamed H. Ali, and Mingsheng Hong. Consistent streaming through time: A vision for event stream processing. In CIDR [DBL, 2007], pages 363–374.
Object Management Group (OMG). Business Process Model and Notation (BPMN) Version 2.0. Technical report, January 2011.
CIDR 2007, Third Biennial Conference on Innovative Data Systems Research, Asilomar, CA, USA, January 7-10, 2007, Online Proceedings. www.cidrdb.org, 2007.
Alan J. Demers, Johannes Gehrke, Biswanath Panda, Mirek Riedewald, Varun Sharma, and Walker M. White. Cayuga: A general purpose event monitoring system. In CIDR [DBL, 2007], pages 412–422.
Opher Etzion and Peter Niblett. Event Processing in Action. Manning Publications Company, 2010.
Wolfgang Reisig. Petri Nets: An Introduction, volume 4 of Monographs in Theoretical Computer Science. An EATCS Series. Springer, 1985.
Eugene Wu, Yanlei Diao, and Shariq Rizvi. High-performance complex event processing over streams. In SIGMOD ’06: Proceedings of the 2006 ACM SIGMOD international conference on Management of data, pages 407–418, New York, NY, USA, 2006. ACM.

RuleML 2015: When Processes Rule Events

  • 1.
    Lecture Outline Big Data: the New Playground Events, Processes,and Anything in Between Complex Event Processing Optimization Process Mining with Schedules When Processes Rule Events Avigdor Gal Technion – Israel Institute of Technology
  • 2.
    Lecture Outline Big Data: the New Playground Events, Processes,and Anything in Between Complex Event Processing Optimization Process Mining with Schedules Presentation Outline Big data: the New Playground Events, Processes, and Anything in Between Complex Event Processing Optimizaion Process Mining with Schedules
  • 3.
    Lecture Outline Big Data: the New Playground Events, Processes,and Anything in Between Complex Event Processing Optimization Process Mining with Schedules Big Data: is it a Storm in a Teacup?
  • 4.
    Lecture Outline Big Data: the New Playground Events, Processes,and Anything in Between Complex Event Processing Optimization Process Mining with Schedules Big data is a game changer From Theory to Systems: empirical evaluation counts From Systems to Data: large scale empirical evaluation counts
  • 5.
    Lecture Outline Big Data: the New Playground Events, Processes,and Anything in Between Complex Event Processing Optimization Process Mining with Schedules Who is a Data Scientist? The ability to take data – to be able to understand it, to process it, to extract value from it, to visualize it, to communicate it – that’s going to be a hugely important skill in the next decades. (Hal Varian, Google’s Chief Economist)
  • 6.
    Lecture Outline Big Data: the New Playground Events, Processes,and Anything in Between Complex Event Processing Optimization Process Mining with Schedules Data Volume: No Longer the Size of a Teacup Volume Table: Big Data Cross Table Big data may be a single dataset with a lot of data
  • 7.
    Lecture Outline Big Data: the New Playground Events, Processes,and Anything in Between Complex Event Processing Optimization Process Mining with Schedules Data Volume: No Longer the Size of a Teacup Table: Big Data Cross Table Big data may be a single dataset with a lot of data
  • 8.
    Lecture Outline Big Data: the New Playground Events, Processes,and Anything in Between Complex Event Processing Optimization Process Mining with Schedules Data Velocity: Replacing a Teacup with a Tea Hose Volume Velocity Table: Big Data Cross Table Big data may be data that rapidly changes
  • 9.
    Lecture Outline Big Data: the New Playground Events, Processes,and Anything in Between Complex Event Processing Optimization Process Mining with Schedules Data Velocity: Replacing a Teacup with a Tea Hose Table: Big Data Cross Table Big data may be data that rapidly changes
  • 10.
    Lecture Outline Big Data: the New Playground Events, Processes,and Anything in Between Complex Event Processing Optimization Process Mining with Schedules Data Velocity: Replacing a Teacup with a Tea Hose Table: Big Data Cross Table Big data may be data that rapidly changes
  • 11.
    Lecture Outline Big Data: the New Playground Events, Processes,and Anything in Between Complex Event Processing Optimization Process Mining with Schedules Data Velocity: Replacing a Teacup with a Tea Hose Table: Big Data Cross Table Big data may be data that rapidly changes
  • 12.
    Lecture Outline Big Data: the New Playground Events, Processes,and Anything in Between Complex Event Processing Optimization Process Mining with Schedules Data Variety: When One Tea Type is Just not Enough Volume Velocity Variety Table: Big Data Cross Table Big data may be a small dataset with many different schemata
  • 13.
    Lecture Outline Big Data: the New Playground Events, Processes,and Anything in Between Complex Event Processing Optimization Process Mining with Schedules Data Variety: When One Tea Type is Just not Enough Table: Big Data Cross Table Big data may be a small dataset with many different schemata
  • 14.
    Lecture Outline Big Data: the New Playground Events, Processes,and Anything in Between Complex Event Processing Optimization Process Mining with Schedules Data Veracity: Is it Coffee or Black Tea with Milk? Volume Velocity Variety Veracity Table: Big Data Cross Table Big data may be data with varying levels of trustworthiness
  • 15.
    Lecture Outline Big Data: the New Playground Events, Processes,and Anything in Between Complex Event Processing Optimization Process Mining with Schedules Data Veracity: Is it Coffee or Black Tea with Milk? Table: Big Data Cross Table Big data may be data with varying levels of trustworthiness
  • 16.
    Lecture Outline Big Data: the New Playground Events, Processes,and Anything in Between Complex Event Processing Optimization Process Mining with Schedules Data Gathering: where and when to expect the fountain to burst Gathering Volume Velocity Variety Veracity Signal and Event Processing Table: Big Data Cross Table
  • 17.
    Lecture Outline Big Data: the New Playground Events, Processes,and Anything in Between Complex Event Processing Optimization Process Mining with Schedules Data Gathering: where and when to expect the fountain to burst Table: Big Data Cross Table
  • 18.
    Lecture Outline Big Data: the New Playground Events, Processes,and Anything in Between Complex Event Processing Optimization Process Mining with Schedules Data Management: Not your typical DBA anymore Gathering Managing Volume Velocity Variety Veracity Cloud Computing, NoSQL, NewSQL Table: Big Data Cross Table
  • 19.
    Lecture Outline Big Data: the New Playground Events, Processes,and Anything in Between Complex Event Processing Optimization Process Mining with Schedules Data Analytics: When Data Analysis Explodes Multi-Dimensionally Gathering Managing Analyzing Volume Velocity Variety Veracity Data & Process Mining ML, IR, NLP Table: Big Data Cross Table
  • 20.
    Lecture Outline Big Data: the New Playground Events, Processes,and Anything in Between Complex Event Processing Optimization Process Mining with Schedules Data Visualization: The Machine Offering to Mankind Gathering Managing Analyzing Visualizing Volume Velocity Variety Veracity User Experience Table: Big Data Cross Table
  • 21.
    Lecture Outline Big Data: the New Playground Events, Processes,and Anything in Between Complex Event Processing Optimization Process Mining with Schedules Data Visualization: The Machine Offering to Mankind Table: Big Data Cross Table
  • 22.
    Lecture Outline Big Data: the New Playground Events, Processes,and Anything in Between Events Processes Complex Event Processing Optimization Process Mining with Schedules Big Data Cross Table Gathering Managing Analyzing Visualizing Volume Ev Pro Velocity en ce Variety t ss Veracity s es Table: Big Data Cross Table
  • 23.
    Lecture Outline Big Data: the New Playground Events, Processes,and Anything in Between Events Processes Complex Event Processing Optimization Process Mining with Schedules Event Processing Events An event e is an occurrence within a particular system or domain. It is something that has happened, or is contemplated as having happened in that domain. [Etzion and Niblett, 2010] Point-based semantics. An event type E ∈ E is a specification for a set of events that share the same semantic intent and structure. Complex Event Processing Systems: Amit [Adi and Etzion, 2004], SASE [Wu et al., 2006], Cayuga [Demers et al., 2007], CEDR [Barga et al., 2007], ESPER []. DEBS 2016: Oragne County, California
  • 24.
    Lecture Outline Big Data: the New Playground Events, Processes,and Anything in Between Events Processes Complex Event Processing Optimization Process Mining with Schedules Event Processing Urban Traffic Management
  • 25.
    Lecture Outline Big Data: the New Playground Events, Processes,and Anything in Between Events Processes Complex Event Processing Optimization Process Mining with Schedules Traffic Flow
  • 26.
    Lecture Outline Big Data: the New Playground Events, Processes,and Anything in Between Events Processes Complex Event Processing Optimization Process Mining with Schedules Bus Log
  • 27.
    Lecture Outline Big Data: the New Playground Events, Processes,and Anything in Between Events Processes Complex Event Processing Optimization Process Mining with Schedules Events and Big Data Volume: 23 Million records per month (∼ 4GB) Velocity: 770,000 new records per day (an event each 2-6 seconds) Variety: Homogeneous Veracity: GPS locations
  • 28.
    Lecture Outline Big Data: the New Playground Events, Processes,and Anything in Between Events Processes Complex Event Processing Optimization Process Mining with Schedules Processes Processes Process models describe time dependencies among activities: Business processes Scheduled activities Used as a template for execution by a process engine. A process model can be modeled as a graph containing activity nodes and control nodes: Petri nets [Reisig, 1985] BPMN [bpm, 2011]
  • 29.
    Process Models
    [Figure: a bus log and the corresponding bus model, a route from s to d through stops ω_2, ω_3, ..., ω_i, ..., ω_{n-1}]
  • 30.
    Between Events and Processes
    Given processes, detect (complex) events.
    Given events, discover processes.
  • 31.
    From Processes to CEP
    Optimisation of event pattern matching on three levels.
    Approach based on domain knowledge.
    Results taken from: M. Weidlich, H. Ziekow, A. Gal, J. Mendling, M. Weske. Optimising Event Pattern Matching using Business Process Models. IEEE Transactions on Knowledge and Data Engineering (TKDE), accepted for publication, 2015.
  • 32.
    From Processes to CEP
    (Thanks to Matthias Weidlich for the slides)
  • 33.
    Optimization by Transformation: Sequentialization Rule
  • 34.
    Optimization by Plan Selection: Sequentialization Rule
  • 35.
    Optimization by Early Termination: Sequentialization Rule
  • 36.
    Performance Analysis: Datasets
    A publicly available process log (http://www.processmining.org/logs/start) that contains recorded execution sequences of a paper reviewing process. The model defines 20 activities. The log comprises 3730 events that are related to 100 process instances. Each event is associated with a timestamp and a reference to an activity of the process model.
    Process models of a German insurance company: 1021 process models, ranging from 4 to 339 nodes, with an average size of around 23 nodes. The log was simulated using annotations of the process models.
  • 37.
    Performance Analysis
  • 38.
    Performance Analysis
  • 39.
    Complex Event Processing with Processes
    Table: Big Data Cross Table (columns shown: Gathering, ...)
    Volume: (empty)
    Velocity: Optimization
    Variety: Optimisation in event processing networks
    Veracity: (empty)
  • 40.
    Complex Event Processing with Processes
    Table: Big Data Cross Table (columns shown: ..., Analysis)
    Volume: Mining of constraints
    Velocity: (empty)
    Variety: (empty)
    Veracity: Probabilistic mining of constraints
  • 41.
    From Events to Processes
    Online Traveling Time Prediction: when Processes Rule Events
    Using information on bus stops, the prediction of the journey traveling time $T(\langle \omega_1, \ldots, \omega_n \rangle, t_{\omega_1})$ is traced back to the sum of traveling times per segment:
    $T(\langle \omega_1, \ldots, \omega_n \rangle, t_{\omega_1}) = T(\langle \omega_1, \omega_2 \rangle, t_{\omega_1}) + \ldots + T(\langle \omega_{n-1}, \omega_n \rangle, t_{\omega_{n-1}})$, where $t_{\omega_{n-1}} = t_{\omega_1} + T(\langle \omega_1, \omega_{n-1} \rangle, t_{\omega_1})$.
    Traveling Time = Drive Time + Delay Time + Stop Time
    [Figure: a route from s to d through stops ω_2, ω_3, ..., ω_i, ..., ω_{n-1}]
    (Thanks to Arik Senderovich for the slides)
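The decomposition can be sketched in a few lines: each segment is evaluated at the time the bus is expected to reach its first stop, which is the start time plus the traveling time accumulated so far. The function names and the toy per-segment model below are hypothetical; any per-segment predictor could be plugged in.

```python
from typing import Callable, Sequence


def journey_time(stops: Sequence[str],
                 t_start: float,
                 segment_time: Callable[[str, str, float], float]) -> float:
    """Sum per-segment traveling times, advancing the clock segment by segment."""
    total = 0.0
    t = t_start
    for a, b in zip(stops, stops[1:]):
        dt = segment_time(a, b, t)   # T(<a, b>, t) at the arrival time at stop a
        total += dt
        t += dt
    return total


def toy_segment_time(a: str, b: str, t: float) -> float:
    # Hypothetical model: every segment takes 5 minutes before t = 10 and 8 after.
    return 5.0 if t < 10 else 8.0


total = journey_time(["w1", "w2", "w3"], 8.0, toy_segment_time)
```

With a start time of 8, the first segment is entered before the threshold and the second after it, so the two segments contribute different traveling times to `total`.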
  • 43.
    Prediction: The Snapshot Principle in Single-Station Queues
    The snapshot principle stems from a heavy-traffic approximation of a queueing system under limits of its parameters, as the workload converges to capacity. [Figure: a single-station queue, Station1]
    The principle states that the total time in the station (waiting + service) remains constant. In our context, a bus that passes through a segment, e.g., $\langle \omega_i, \omega_{i+1} \rangle \in S \times S$, will have the same traveling time as another bus that has just passed through that segment (not necessarily of the same type, line, etc.).
  • 46.
    The Snapshot Principle in Single-Station Queues
    Based on the above, we define a single-segment snapshot predictor, Last-Bus-to-Travel-Segment (LBTS), denoted by $\theta_{LBTS}(\langle \omega_i, \omega_{i+1} \rangle, t_{\omega_1})$.
    In real-life settings, the applicability of snapshot-principle predictors should be tested ad hoc.
    The snapshot principle was shown to be of empirical value in previous research, where queueing techniques were applied to predict delays.
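A minimal sketch of an LBTS-style predictor over a log of segment traversals. It assumes that "the last bus" means the bus that most recently finished the segment no later than the prediction time; the `Traversal` record, the function name, and the sample log are hypothetical. The predictor returns `None` when no bus has completed the segment yet.

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Tuple


@dataclass(frozen=True)
class Traversal:
    segment: Tuple[str, str]   # (from_stop, to_stop)
    entry_time: float
    exit_time: float


def lbts(log: Iterable[Traversal],
         segment: Tuple[str, str],
         t: float) -> Optional[float]:
    """Traveling time of the last bus that completed `segment` by time `t`."""
    completed = [tr for tr in log
                 if tr.segment == segment and tr.exit_time <= t]
    if not completed:
        return None   # nothing observed yet, so no snapshot is available
    last = max(completed, key=lambda tr: tr.exit_time)
    return last.exit_time - last.entry_time


log = [
    Traversal(("w1", "w2"), 0.0, 4.0),
    Traversal(("w1", "w2"), 10.0, 17.0),
    Traversal(("w2", "w3"), 4.0, 9.0),
]
```

Because the lookup is keyed by segment only, traversals by any bus count toward the prediction, regardless of line or type.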
  • 48.
    Snapshot Principle in a Network
    In our case, the LBTS predictor needs to be lifted to a network setting.
    The snapshot principle holds for networks of queues when the routing through the network is known in advance.
    In scheduled transportation such as buses, this is the case, as the order of stops (and segments) is predefined.
    [Figure: a network of queues, Station1 to Station7]
  • 51.
    Snapshot Principle in a Network
    We define a multi-segment (network) snapshot predictor that we refer to as the Last-Bus-to-Travel-Network, or $\theta_{LBTN}(\langle \omega_1, \ldots, \omega_n \rangle, t_{\omega_1})$, given a sequence of stops (with $\omega_1$ being the start stop and $\omega_n$ being the end stop).
    According to the snapshot principle in networks, we get that:
    $\theta_{LBTN}(\langle \omega_1, \ldots, \omega_n \rangle, t_{\omega_1}) = \sum_{i=1}^{n-1} \theta_{LBTS}(\langle \omega_i, \omega_{i+1} \rangle, t_{\omega_1})$.
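The network predictor can be sketched as a sum of single-segment snapshot values, one per consecutive pair of stops, all taken at the start time of the journey. The sketch assumes the snapshot is available as a mapping from segment to the last observed traveling time; the function name and the sample values are hypothetical.

```python
from typing import Dict, Optional, Sequence, Tuple


def lbtn(stops: Sequence[str],
         snapshot: Dict[Tuple[str, str], float]) -> Optional[float]:
    """Sum the last observed traveling time of every segment along the route."""
    total = 0.0
    for segment in zip(stops, stops[1:]):
        if segment not in snapshot:
            return None   # a segment without an observation blocks the prediction
        total += snapshot[segment]
    return total


# Hypothetical snapshot taken at the start time of the journey.
snapshot = {("w1", "w2"): 7.0, ("w2", "w3"): 5.0, ("w3", "w4"): 6.5}
prediction = lbtn(["w1", "w2", "w3", "w4"], snapshot)
```

Unlike a predictor that re-evaluates each segment at its expected arrival time, every term here comes from the same snapshot, which is what makes the prediction computable at departure.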
  • 54.
    Performance Analysis: Data
    8 days of bus data, between September and October of 2014.
    Each day: approximately 11,500 traveled segments.
    First trip of each day: no associated last travel time.
    Prediction for line 046A.
    Data comes from all buses that share segments with line 046A.
  • 55.
    Performance Analysis
    [Figure: sample square estimation error and root mean square error, plotted against the index of the segment in the trip]
  • 56.
    Process Mining with Schedules
    Table: Big Data Cross Table (columns shown: ..., Analysis)
    Volume: Better prediction
    Velocity: Segmentation
    Variety: (empty)
    Veracity: (empty)
  • 57.
    Process Mining with Schedules
    Table: Big Data Cross Table (columns shown: ..., Management, ...)
    Volume: (empty)
    Velocity: (empty)
    Variety: (empty)
    Veracity: Event Cleaning
  • 58.
    Thank You
    Avigdor Gal, Technion – Israel Institute of Technology
  • 59.
    A. Adi and O. Etzion. Amit - the situation manager. The International Journal on Very Large Data Bases, 13(2):177–203, May 2004.
    Roger S. Barga, Jonathan Goldstein, Mohamed H. Ali, and Mingsheng Hong. Consistent streaming through time: A vision for event stream processing. In CIDR [DBL, 2007], pages 363–374.
    Business Process Model and Notation (BPMN) Version 2.0. Technical report, Object Management Group (OMG), January 2011.
    CIDR 2007, Third Biennial Conference on Innovative Data Systems Research, Asilomar, CA, USA, January 7-10, 2007, Online Proceedings. www.cidrdb.org, 2007.
    Alan J. Demers, Johannes Gehrke, Biswanath Panda, Mirek Riedewald, Varun Sharma, and Walker M. White. Cayuga: A general purpose event monitoring system. In CIDR [DBL, 2007], pages 412–422.
    Opher Etzion and Peter Niblett. Event Processing in Action. Manning Publications Company, 2010.
  • 60.
    Wolfgang Reisig. Petri Nets: An Introduction, volume 4 of Monographs in Theoretical Computer Science. An EATCS Series. Springer, 1985.
    Eugene Wu, Yanlei Diao, and Shariq Rizvi. High-performance complex event processing over streams. In SIGMOD '06: Proceedings of the 2006 ACM SIGMOD International Conference on Management of Data, pages 407–418, New York, NY, USA, 2006. ACM.