Chapter 1 Introduction to
Simulation
Banks, Carson, Nelson & Nicol
Discrete-Event System Simulation
Outline
When Simulation Is the Appropriate Tool
When Simulation Is Not Appropriate
Advantages and Disadvantages of Simulation
Areas of Application
Systems and System Environment
Components of a System
Discrete and Continuous Systems
Model of a System
Types of Models
Discrete-Event System Simulation
Steps in a Simulation Study
2
Definition
A simulation is the imitation of the operation of real-world
process or system over time.
Generation of artificial history and observation of that
observation history
A model construct a conceptual framework that
describes a system
The behavior of a system that evolves over time is
studied by developing a simulation model.
The model takes a set of expressed assumptions:
Mathematical, logical
Symbolic relationship between the entities
Goal of modeling and simulation
A model can be used to investigate a wide verity of what
if questions about real-world system.
Potential changes to the system can be simulated and predicate
their impact on the system.
Find adequate parameters before implementation
So simulation can be used as
Analysis tool for predicating the effect of changes
Design tool to predicate the performance of new system
It is better to do simulation before Implementation.
How a model can be developed?
Mathematical Methods
Probability
theory, algebraic method ,
Their results are accurate
They have a few Number of parameters
It is impossible for complex systems
Numerical computer-based simulation
It
is simple
It is useful for complex system
When Simulation Is the Appropriate Tool
Simulation enable the study of internal interaction of a subsystem
with complex system
Informational, organizational and environmental changes can be
simulated and find their effects
A simulation model help us to gain knowledge about improvement of
system
Finding important input parameters with changing simulation inputs
Simulation can be used with new design and policies before
implementation
Simulating different capabilities for a machine can help determine
the requirement
Simulation models designed for training make learning possible
without the cost disruption
A plan can be visualized with animated simulation
The modern system (factory, wafer fabrication plant, service
organization) is too complex that its internal interaction can be
treated only by simulation
6
When Simulation Is Not Appropriate
When the problem can be solved by common
sense.
When the problem can be solved analytically.
If it is easier to perform direct experiments.
If cost exceed savings.
If resource or time are not available.
If system behavior is too complex.
Like
human behavior
Advantages and disadvantages of simulation
In contrast to optimization models, simulation
models are run rather than solved.
Given
as a set of inputs and model characteristics the
model is run and the simulated behavior is observed
Advantages of simulation
New policies, operating procedures, information flows and son on
can be explored without disrupting ongoing operation of the real
system.
New hardware designs, physical layouts, transportation systems
and can be tested without committing resources for their
acquisition.
Time can be compressed or expanded to allow for a speed-up or
slow-down of the phenomenon( clock is self-control).
Insight can be obtained about interaction of variables and important
variables to the performance.
Bottleneck analysis can be performed to discover where work in
process, the system is delayed.
A simulation study can help in understanding how the system
operates.
What if questions can be answered.
Disadvantages of simulation
Model building requires special training.
Vendors
of simulation software have been actively
developing packages that contain models that only
need input (templates).
Simulation results can be difficult to interpret.
Simulation modeling and analysis can be time
consuming and expensive.
Many
simulation software have output-analysis.
10
Areas of application
Manufacturing Applications
Semiconductor Manufacturing
Construction Engineering and project management
Military application
Logistics, Supply chain and distribution application
Transportation modes and Traffic
Business Process Simulation
Health Care
Automated Material Handling System (AMHS)
Risk analysis
Insurance, portfolio,...
Computer Simulation
Test beds for functional testing of control-system software
CPU, Memory,
Network simulation
Internet backbone, LAN (Switch/Router), Wireless, PSTN (call center),...
11
Systems and System Environment
A system is defined as a groups of objects that
are joined together in some regular interaction
toward the accomplishment of some purpose.
An
automobile factory: Machines, components parts
and workers operate jointly along assembly line
A system is often affected by changes occurring
outside the system: system environment.
Factory
: Arrival orders
Effect of supply on demand : relationship between factory
output and arrival (activity of system)
Banks
: arrival of customers
12
Components of system
Entity
Attribute
A instantaneous occurrence that might change the state of the system:
breakdown
Endogenous
A collection of variables that describe the system in any time : status of machine
(busy, idle, down,)
Event
A time period of specified length :welding, stamping
State
The property of an entity : speed, capacity
Activity
An object of interest in the system : Machines in factory
Activities and events occurring with the system
Exogenous
Activities and events occurring with the environment
13
Discrete and Continues Systems
A discrete system is one in which the state variables
change only at a discrete set of points in time : Bank
example
14
Discrete and Continues Systems (cont.)
A continues system is one in which the state variables
change continuously over time: Head of water behind the
dam
15
Model of a System
To study the system
it is sometimes possible to experiments with system
This is not always possible (bank, factory,)
A new system may not yet exist
Model: construct a conceptual framework that
describes a system
It
is necessary to consider those accepts of systems
that affect the problem under investigation
(unnecessary details must remove)
16
Types of Models
17
Characterizing a Simulation Model
Deterministic or Stochastic
Does
the model contain stochastic components?
Randomness is easy to add to a DES
Static or Dynamic
Is
time a significant variable?
Continuous or Discrete
Does
the system state evolve continuously or only at
discrete points in time?
Continuous: classical mechanics
Discrete: queuing, inventory, machine shop models
18
Discrete-Event Simulation Model
Stochastic: some state variables are random
Dynamic: time evolution is important
Discrete-Event: significant changes occur at
discrete time instances
19
Model Taxonomy
20
DES Model Development
How to develop a model:
1)
2)
3)
4)
5)
6)
Determine the goals and objectives
Build a conceptual model
Convert into a specification model
Convert into a computational model
Verify
Validate
Typically an iterative process
21
Three Model Levels
Conceptual
Very high level
How comprehensive should the model be?
What are the state variables, which are dynamic, and which are
important?
Specification
On paper
May involve equations, pseudocode, etc.
How will the model receive input?
Computational
A computer program
General-purpose PL or simulation language?
22
Verification vs. Validation
Verification
Computational
model should be consistent with
specification model
Did we build the model right?
Validation
Computational
model should be consistent with the
system being analyzed
Did we build the right model?
Can an expert distinguish simulation output from
system output?
Interactive graphics can prove valuable
23
Steps in Simulation
Study
24
Computer Science, Informatik 4
Communication and Distributed Systems
Simulation
Discrete-Event System Simulation
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Chapter 9
Verification and Validation of Simulation Models
Computer Science, Informatik 4
Communication and Distributed Systems
Purpose & Overview
The goal of the validation process is:
To produce a model that represents true
behavior closely enough for decision-making
purposes
To increase the models credibility to an
acceptable level
Validation is an integral part of model
development:
Verification: building the model correctly,
correctly implemented with good input and
structure
Validation: building the correct model, an
accurate representation of the real system
Most methods are informal subjective
comparisons while a few are formal
statistical procedures
Chapter 9. Verification and Validation of Simulation Models
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Modeling-Building, Verification & Validation
Steps in Model-Building
Observing the real system
and the interactions among
their various components
and of collecting data on
their behavior.
Construction of a
conceptual model
Implementation of an
operational model
Chapter 9. Verification and Validation of Simulation Models
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Verification
Purpose: ensure the conceptual model is reflected accurately in the
computerized representation.
Many common-sense suggestions, for example:
Have someone else check the model.
Make a flow diagram that includes each logically possible action a
system can take when an event occurs.
Closely examine the model output for reasonableness under a variety of
input parameter settings.
Print the input parameters at the end of the simulation, make sure they
have not been changed inadvertently.
Make the operational model as self-documenting as possible.
If the operational model is animated, verify that what is seen in the
animation imitates the actual system.
Use the debugger.
If possible use a graphical representation of the model.
Chapter 9. Verification and Validation of Simulation Models
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Examination of Model Output for Reasonableness
Two statistics that give a quick indication of model
reasonableness are current contents and total counts
Current content: The number of items in each component of the
system at a given time.
Total counts: Total number of items that have entered each
component of the system by a given time.
Compute certain long-run measures of performance, e.g.
compute the long-run server utilization and compare to
simulation results.
Chapter 9. Verification and Validation of Simulation Models
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Examination of Model Output for Reasonableness
A model of a complex network of queues consisting of
many service centers.
If the current content grows in a more or less linear fashion as
the simulation run time increases, it is likely that a queue is
unstable
If the total count for some subsystem is zero, indicates no items
entered that subsystem, a highly suspect occurrence
If the total and current count are equal to one, can indicate that
an entity has captured a resource but never freed that resource.
Chapter 9. Verification and Validation of Simulation Models
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Other Important Tools
Documentation
A means of clarifying the logic of a model and verifying its
completeness.
Comment the operational model, definition of all variables and
parameters.
Use of a trace
A detailed printout of the state of the simulation model over time.
- Can be very labor intensive if the programming language does not
support statistic collection.
- Labor can be reduced by a centralized tracing mechanism
- In object-oriented simulation framework, trace support can be
integrated into class hierarchy. New classes need only to add little for
the trace support.
Chapter 9. Verification and Validation of Simulation Models
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Trace - Example
Simple queue from Chapter 2
Trace over a time interval [0, 16]
Allows the test of the results by pen-and-paper method
Definition of Variables:
CLOCK = Simulation clock
EVTYP = Event type (Start, Arrival, Departure, Stop)
NCUST = Number of customers in system at time CLOCK
STATUS = Status of server (1=busy, 0=idle)
State
CLOCK
CLOCK
CLOCK
CLOCK
CLOCK
CLOCK
...
of System Just After the Named Event Occurs:
= 0 EVTYP = Start
NCUST=0 STATUS = 0
= 3 EVTYP = Arrival NCUST=1 STATUS = 0
= 5 EVTYP = Depart
NCUST=0 STATUS = 0
= 11 EVTYP = Arrival NCUST=1 STATUS = 0
= 12 EVTYP = Arrival NCUST=2 STATUS = 1
= 16 EVTYP = Depart
NCUST=1 STATUS = 1
Chapter 9. Verification and Validation of Simulation Models
There is a customer,
but the status is 0
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Calibration and Validation
Validation: the overall process of comparing the model and its behavior to the
real system.
Calibration: the iterative process of comparing the model to the real system
and making adjustments.
Comparison of the model to
real system
Subjective tests
- People who are
knowledgeable about
the system
Objective tests
- Requires data on the
real systems behavior
and the output of the
model
Chapter 9. Verification and Validation of Simulation Models
10
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Calibration and Validation
Danger during the calibration phase
No model is ever a perfect representation of the system
Typically few data sets are available, in the worst case only one, and the
model is only validated for these.
Solution: If possible collect new data sets
The modeler must weigh the possible, but not guaranteed, increase in
model accuracy versus the cost of increased validation effort.
Three-step approach for validation:
1. Build a model that has high face validity.
2. Validate model assumptions.
3. Compare the model input-output transformations with the real systems
data.
Chapter 9. Verification and Validation of Simulation Models
11
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
High Face Validity
Ensure a high degree of realism:
Potential users should be involved in model construction from its
conceptualization to its implementation.
Sensitivity analysis can also be used to check a models face
validity.
Example: In most queueing systems, if the arrival rate of
customers were to increase, it would be expected that server
utilization, queue length and delays would tend to increase.
For large-scale simulation models, there are many input
variables and thus possibly many sensitity tests.
- Sometimes not possible to perform all of theses tests, select the
most critical ones.
Chapter 9. Verification and Validation of Simulation Models
12
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Validate Model Assumptions
General classes of model assumptions:
Structural assumptions: how the system operates.
Data assumptions: reliability of data and its statistical analysis.
Bank example: customer queueing and service facility in a bank.
Structural assumptions
- Customer waiting in one line versus many lines
- Customers are served according FCFS versus priority
Data assumptions, e.g., interarrival time of customers, service times for
commercial accounts.
- Verify data reliability with bank managers
- Test correlation and goodness of fit for data
Chapter 9. Verification and Validation of Simulation Models
13
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Validate Input-Output Transformation
Goal: Validate the models ability to predict future behavior
The only objective test of the model.
The structure of the model should be accurate enough to make good
predictions for the range of input data sets of interest.
One possible approach: use historical data that have been reserved
for validation purposes only.
Criteria: use the main responses of interest.
Input
System
Output
Model is viewed as an
input-output
transformation
Input
Chapter 9. Verification and Validation of Simulation Models
Model
14
Output
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Bank Example
Example: One drive-in window serviced by one teller, only one or two
transactions are allowed.
Data collection: 90 customers during 11 am to 1 pm.
- Observed service times {Si, i = 1,2, , 90}.
- Observed interarrival times {Ai, i = 1,2, , 90}.
Data analysis let to the conclusion that:
- Interarrival times: exponentially distributed with rate = 45
- Service times: N(1.1, 0.22)
Chapter 9. Verification and Validation of Simulation Models
15
Input variables
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Bank Example The Black Box
A model was developed in close consultation with bank management and
employees
Model assumptions were validated
Resulting model is now viewed as a black box:
Model Output Variables, Y
Input Variables
Uncontrolled
variables, X
Controlled
Decision
variables, D
Possion arrivals
= 45/hr: X11, X12,
Services times,
N(D2, 0.22): X21, X22,
Model
black box
f(X,D) = Y
D1 = 1 (one teller)
D2 = 1.1 min
(mean service time)
D3 = 1 (one line)
Chapter 9. Verification and Validation of Simulation Models
16
Primary interest:
Y1 = tellers utilization
Y2 = average delay
Y3 = maximum line length
Secondary interest:
Y4 = observed arrival rate
Y5 = average service time
Y6 = sample std. dev. of
service times
Y7 = average length of time
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Comparison with Real System Data
Real system data are necessary for validation.
System responses should have been collected during the same time
period (from 11am to 1pm on the same day.)
Compare the average delay from the model Y2 with the actual delay
Z2:
Average delay observed, Z2 = 4.3 minutes, consider this to be the true
mean value 0 = 4.3.
When the model is run with generated random variates X1n and X2n, Y2
should be close to Z2.
Chapter 9. Verification and Validation of Simulation Models
17
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Comparison with Real System Data
Six statistically independent replications of the model, each of
2-hour duration, are run.
Replication
Y4
Arrivals/Hour
Y5
Service Time [Minutes]
Y2
Average Delay [Minutes]
51
1.07
2.79
40
1.12
1.12
45.5
1.06
2.24
50.5
1.10
3.45
53
1.09
3.13
49
1.07
2.38
Sample mean
2.51
Standard deviation
0.82
Chapter 9. Verification and Validation of Simulation Models
18
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Hypothesis Testing
Compare the average delay from the model Y2 with the actual delay Z2
Null hypothesis testing: evaluate whether the simulation and the real
system are the same (w.r.t. output measures):
H 0: E(Y2 ) = 4.3 minutes
H1: E(Y2 ) 4.3 minutes
- If H0 is not rejected, then, there is no reason to consider the model
invalid
- If H0 is rejected, the current version of the model is rejected, and the
modeler needs to improve the model
Chapter 9. Verification and Validation of Simulation Models
19
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Hypothesis Testing
Conduct the t test:
Chose level of significance ( = 0.5) and sample size (n = 6).
Compute the same mean and sample standard deviation over
the n replications:
n
1 n
Y2 = Y2i = 2.51 minutes
n i =1
S=
(Y
i =1
2i
Y2 ) 2
n 1
= 0.82 minutes
Compute test statistics:
t0 =
Y2 0
2.51 4.3
=
= 5.34 > tcritical = 2.571 (for a 2 - sided test)
S/ n
0.82 / 6
Hence, reject H0.
- Conclude that the model is inadequate.
Check: the assumptions justifying a t test, that the observations
(Y2i) are normally and independently distributed.
Chapter 9. Verification and Validation of Simulation Models
20
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Hypothesis Testing
Similarly, compare the model output with the observed output for
other measures:
Y4 Z4, Y5 Z5, and Y6 Z6
Chapter 9. Verification and Validation of Simulation Models
21
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Type II Error
For validation, the power of the test is:
Probability[ detecting an invalid model ] = 1
= P(Type II error) = P(failing to reject H0 | H1 is true)
Consider failure to reject H0 as a strong conclusion, the modeler would
want to be small.
Value of depends on:
- Sample size, n
- The true difference, , between E(Y) and :
E (Y )
In general, the best approach to control error is:
Specify the critical difference, .
Choose a sample size, n, by making use of the operating characteristics
curve (OC curve).
Chapter 9. Verification and Validation of Simulation Models
22
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Type II Error
Operating characteristics curve
(OC curve).
Graphs of the probability of a
Type II Error () versus for a
given sample size n
For the same error
probability with smaller
difference the required
sample size increses!
Chapter 9. Verification and Validation of Simulation Models
23
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Type I and II Error
Type I error ():
Error of rejecting a valid model.
Controlled by specifying a small level of significance .
Type II error ():
Error of accepting a model as valid when it is invalid.
Controlled by specifying critical difference and find the n.
For a fixed sample size n, increasing will decrease .
Statistical Terminology
Modeling Terminology
Associated Risk
Type I: rejecting H0 when H0 is true
Rejecting a valid model
Type II: failure to reject H0 when H1 is true
Failure to reject an invalid model
Chapter 9. Verification and Validation of Simulation Models
24
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Confidence Interval Testing
Confidence interval testing: evaluate whether the simulation and the
real system performance measures are close enough.
If Y is the simulation output, and = E(Y)
The confidence interval (CI) for is:
Y t ,n1
2
Chapter 9. Verification and Validation of Simulation Models
25
S
n
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Confidence Interval Testing
Validating the model:
is a difference value chosen
Suppose the CI does not contain 0:
- If the best-case error is > , model
needs to be refined.
- If the worst-case error is , accept
the model.
- If best-case error is , additional
replications are necessary.
by the analyst, that is small
enough to allow valid
decisions to be based on
simulations!
Suppose the CI contains 0:
- If either the best-case or worst-case
error is > , additional replications
are necessary.
- If the worst-case error is , accept
the model.
Chapter 9. Verification and Validation of Simulation Models
26
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Confidence Interval Testing
Bank example: 0 = 4.3, and close enough is = 1 minute of
expected customer delay.
A 95% confidence interval, based on the 6 replications is
[1.65, 3.37] because:
S
n
0.82
2.51 2.571
6
Y t0.025,5
0 = 4.3 falls outside the confidence interval,
- the best case |3.37 4.3| = 0.93 < 1, but
- the worst case |1.65 4.3| = 2.65 > 1
Additional replications are needed to reach a decision.
Chapter 9. Verification and Validation of Simulation Models
27
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Using Historical Output Data
An alternative to generating input data:
Use the actual historical record.
Drive the simulation model with the historical record and then compare
model output to system data.
In the bank example, use the recorded interarrival and service times for
the customers {An, Sn, n = 1,2,}.
Procedure and validation process: similar to the approach used for
system generated input data.
Chapter 9. Verification and Validation of Simulation Models
28
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Using a Turing Test
Use in addition to statistical test, or when no statistical test is readily
applicable.
Turing Test
Described by Alan Turing in 1950. A human jugde is involved in a natural language
conversation with a human and a machine. If the judge cannot reliably tell which of the
partners is the machine, then the machine has passed the test.
Utilize persons knowledge about the system.
For example:
Present 10 system performance reports to a manager of the system.
Five of them are from the real system and the rest are fake reports
based on simulation output data.
If the person identifies a substantial number of the fake reports, interview
the person to get information for model improvement.
If the person cannot distinguish between fake and real reports with
consistency, conclude that the test gives no evidence of model
inadequacy.
Chapter 9. Verification and Validation of Simulation Models
29
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Summary
Model validation is essential:
Model verification
Calibration and validation
Conceptual validation
Best to compare system data to model data, and make comparison
using a wide variety of techniques.
Some techniques that we covered:
Insure high face validity by consulting knowledgeable persons.
Conduct simple statistical tests on assumed distributional forms.
Conduct a Turing test.
Compare model output to system output by statistical tests.
Chapter 9. Verification and Validation of Simulation Models
30
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Simulation
Discrete-Event System Simulation
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Chapter 10
Output Analysis for a Single Model
Computer Science, Informatik 4
Communication and Distributed Systems
Purpose
Objective: Estimate system performance via simulation
If is the system performance, the precision of the estimator can
be measured by:
The standard error of .
The width of a confidence interval (CI) for .
Purpose of statistical analysis:
To estimate the standard error or CI .
To figure out the number of observations required to achieve desired
error or CI.
Potential issues to overcome:
Autocorrelation, e.g. inventory cost for subsequent weeks lack statistical
independence.
Initial conditions, e.g. inventory on hand and number of backorders at
time 0 would most likely influence the performance of week 1.
Chapter 10. Output Analysis for a Single Model
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Outline
Distinguish the two types of simulation:
transient vs.
steady state
Illustrate the inherent variability in a stochastic discrete-event
simulation.
Cover the statistical estimation of performance measures.
Discusses the analysis of transient simulations.
Discusses the analysis of steady-state simulations.
Chapter 10. Output Analysis for a Single Model
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Type of Simulations
Terminating versus non-terminating simulations
Terminating simulation:
Runs for some duration of time TE, where E is a specified event that
stops the simulation.
Starts at time 0 under well-specified initial conditions.
Ends at the stopping time TE.
Bank example: Opens at 8:30 am (time 0) with no customers present
and 8 of the 11 teller working (initial conditions), and closes at 4:30 pm
(Time TE = 480 minutes).
The simulation analyst chooses to consider it a terminating system
because the object of interest is one days operation.
Chapter 10. Output Analysis for a Single Model
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Type of Simulations
Non-terminating simulation:
Runs continuously, or at least over a very long period of time.
Examples: assembly lines that shut down infrequently, hospital
emergency rooms, telephone systems, network of routers, Internet.
Initial conditions defined by the analyst.
Runs for some analyst-specified period of time TE.
Study the steady-state (long-run) properties of the system, properties
that are not influenced by the initial conditions of the model.
Whether a simulation is considered to be terminating or nonterminating depends on both
The objectives of the simulation study and
The nature of the system
Chapter 10. Output Analysis for a Single Model
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Stochastic Nature of Output Data
Model output consist of one or more random variables because the
model is an input-output transformation and the input variables are
random variables.
M/G/1 queueing example:
Poisson arrival rate = 0.1 per minute;
service time ~ N( = 9.5, =1.75).
System performance: long-run mean queue length, LQ(t).
Suppose we run a single simulation for a total of 5000 minutes
- Divide the time interval [0, 5000) into 5 equal subintervals of 1000 minutes.
- Average number of customers in queue from time (j-1)1000 to j(1000) is Yj .
Waiting line
Chapter 10. Output Analysis for a Single Model
2
2
=
LQ =
( ) 1
Server
7
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Stochastic Nature of Output Data
M/G/1 queueing example (cont.):
Batched average queue length for 3 independent replications:
Batching Interval
(minutes)
[0, 1000)
[1000, 2000)
[2000, 3000)
[3000, 4000)
[4000, 5000)
[0, 5000)
1, Y1j
3.61
3.21
2.18
6.92
2.82
3.75
Batch, j
1
2
3
4
5
Replication
2, Y2j
2.91
9.00
16.15
24.53
25.19
15.56
3, Y3j
7.67
19.53
20.36
8.11
12.62
13.66
Inherent variability in stochastic simulation both within a single replication
and across different replications.
The average across 3 replications, Y1. , Y2. , Y3. , can be regarded as
independent observations, but averages within a replication, Y11, , Y15,
are not.
Chapter 10. Output Analysis for a Single Model
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Measures of performance
Consider the estimation of a performance parameter, (or ), of a
simulated system.
Discrete time data: [Y1, Y2, , Yn], with ordinary mean:
Continuous-time data: {Y(t), 0 t TE} with time-weighted mean:
Point estimation for discrete time data.
The point estimator:
n
1
= Yi
n i =1
- Is unbiased if its expected value is , that is if:
E () =
Desired
- Is biased if: E () and E () is called bias of
Chapter 10. Output Analysis for a Single Model
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Point Estimator
Point estimation for continuous-time data.
The point estimator:
=
TE
TE
Y (t )dt
- Is biased in general where: E ()
.
- An unbiased or low-bias estimator is desired.
Usually, system performance measures can be put into the common
framework of or :
The proportion of days on which sales are lost through an out-of-stock
situation, let:
1, if out of stock on day i
Y (i) =
0, otherwise
Chapter 10. Output Analysis for a Single Model
10
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Point Estimator
Performance measure that does not fit:
quantile or percentile: Pr{Y } = p
Estimating quantiles: the inverse of the problem of estimating a proportion
or probability.
Consider a histogram of the observed values Y:
- Find such that 100p% of the histogram is to the left of (smaller than) .
A widely used used performance meausure is the median, which is the
0.5 quantile or 50-th percentile.
Chapter 10. Output Analysis for a Single Model
11
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Confidence-Interval Estimation
To understand confidence intervals fully, it is important to distinguish
between measures of error, and measures of risk, e.g., confidence
interval versus prediction interval.
Suppose the model is the normal distribution with mean , variance 2
(both unknown).
Let Yi. be the average cycle time for parts produced on the i-th replication
of the simulation (its mathematical expectation is ).
Average cycle time will vary from day to day, but over the long-run the
average of the averages will be close to .
Sample variance across R replications:
1 R
S =
(Yi. Y.. ) 2
R 1 i =1
2
Chapter 10. Output Analysis for a Single Model
12
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Confidence-Interval Estimation
Confidence Interval (CI):
A measure of error.
Where Yi are normally distributed.
Y.. t , R 1
2
Quantile of the t
distribution with R-1
degrees of freedom.
S
R
We cannot know for certain how far Y.. is from but CI attempts to bound
that error.
A CI, such as 95%, tells us how much we can trust the interval to actually
bound the error between Y.. and .
The more replications we make, the less error there is in Y.. (converging
to 0 as R goes to infinity).
Chapter 10. Output Analysis for a Single Model
13
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Confidence-Interval Estimation
Prediction Interval (PI):
A measure of risk.
A good guess for the average cycle time on a particular day is our
estimator but it is unlikely to be exactly right.
PI is designed to be wide enough to contain the actual average cycle time
on any particular day with high probability.
Normal-theory prediction interval:
1
Y.. t / 2, R 1S 1 +
R
The length of PI will not go to 0 as R increases because we can never
simulate away risk.
PIs limit is: z / 2
Chapter 10. Output Analysis for a Single Model
14
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Output Analysis for Terminating Simulations
A terminating simulation: runs over a simulated time interval [0, TE].
A common goal is to estimate:
1 n
= E Yi ,
for discrete output
n i =1
1 TE
= E Y (t )dt , for continuous output Y (t ),0 t TE
TE 0
In general, independent replications are used, each run using a
different random number stream and independently chosen initial
conditions.
Chapter 10. Output Analysis for a Single Model
15
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Statistical Background
Important to distinguish within-replication data from across-replication
data.
For example, simulation of a manufacturing system
Two performance measures of that system: cycle time for parts and work
in process (WIP).
Let Yij be the cycle time for the j-th part produced in the i-th replication.
Across-replication data are formed by summarizing within-replication
data Yi. .
Within-Replication Data
Across-Replication
Data
Y11
Y12
Y1n 1
Y1 , S 12 , H 1
Y 21
Y 22
Y 2n 2
Y 2 , S 22 , H 2
Y R1
YR 2
Y Rn R
Y R , S 2R , H R
Chapter 10. Output Analysis for a Single Model
16
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Statistical Background
Across Replication:
For example: the daily cycle time averages (discrete time data)
-
1 R
The average: Y.. = Yi.
R i =1
1 R
2
The sample variance: S =
(Yi. Y.. ) 2
R 1 i =1
- The confidence-interval half-width:
H = t / 2, R 1
S
R
Within replication:
For example: the WIP (a continuous time data)
1 TEi
Yi (t )dt
0
T Ei
1 TEi
2
2
(Yi (t ) Yi. ) dt
The sample variance: S i =
0
T Ei
- The average:
-
Chapter 10. Output Analysis for a Single Model
Yi. =
17
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Statistical Background
Overall sample average, Y.., and the interval replication sample
averages, Yi., are always unbiased estimators of the expected daily
average cycle time or daily average WIP.
Across-replication data are independent (different random numbers)
and identically distributed (same model), but within-replication data
do not have these properties.
Chapter 10. Output Analysis for a Single Model
18
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Confidence Intervals with Specified Precision
The half-length H of a100(1 )% confidence interval for a mean ,
based on the t distribution, is given by:
H = t / 2, R 1
S
R
S2 is the sample
variance
R is the # of
replications
Suppose that an error criterion is specified with probability 1- , a
sufficiently large sample size should satisfy:
P Y.. < 1
Chapter 10. Output Analysis for a Single Model
19
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Confidence Intervals with Specified Precision
Assume that an initial sample of size R0 (independent) replications has
been observed.
Obtain an initial estimate S02 of the population variance 2.
Then, choose sample size R such that R R0:
Since t/2, R-1 z/2, an initial estimate of R:
2
z S
R / 2 0 , z / 2 is the standard normal distribution.
2
t / 2, R 1S 0
R is the smallest integer satisfying R R0 and R
Collect R - R0 additional observations.
The 100(1- )% CI for :
Y.. t / 2, R 1
Chapter 10. Output Analysis for a Single Model
20
S
R
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Confidence Intervals with Specified Precision
Call Center Example: estimate the agents utilization over the first 2 hours of
the workday.
Initial sample of size R0 = 4 is taken and an initial estimate of the population variance
is S02 = (0.072)2 = 0.00518.
The error criterion is = 0.04 and confidence coefficient is 1- = 0.95, hence, the final
sample size must be at least:
2
2
z0.025 S0 1.96 * 0.00518
= 12.14
=
0.042
For the final sample size:
R
t 0.025, R-1
(t / 2, R1S 0 / )
13
14
15
2.18
2.16
2.14
15.39
15.1
14.83
R = 15 is the smallest integer satisfying the error criterion, so R - R0 = 11 additional
replications are needed.
After obtaining additional outputs, half-width should be checked.
Chapter 10. Output Analysis for a Single Model
21
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Quantiles
Here, a proportion or probability is treated as a special case of a
mean.
When the number of independent replications Y1, ,YR is large
enough that t/2,n-1 = z/2, the confidence interval for a probability p is
often written as:
p (1 p )
R 1
p z / 2
The sample proportion
A quantile is the inverse of the probability to the probability
estimation problem:
p is given
Find such that Pr(Y ) = p
Chapter 10. Output Analysis for a Single Model
22
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Quantiles
The best way is to sort the outputs and use the (R*p)-th smallest value,
i.e., find such that100p% of the data in a histogram of Y is to the left of
.
Example: If we have R=10 replications and we want the p = 0.8 quantile,
first sort, then estimate by the (10)(0.8) = 8-th smallest value (round if
necessary).
5.6 sorted data
7.1
8.8
8.9
9.5
9.7
10.1
12.2 this is our point estimate
12.5
12.9
Chapter 10. Output Analysis for a Single Model
23
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Quantiles
Confidence Interval of Quantiles: An approximate (1-)100%
confidence interval for can be obtained by finding two values l
and u.
l cuts off 100pl% of the histogram (the Rpl smallest value of the sorted
data).
u cuts off 100pu% of the histogram (the Rpu smallest value of the sorted
data).
Chapter 10. Output Analysis for a Single Model
where pl = p z / 2
p (1 p )
R 1
pu = p + z / 2
p (1 p )
R 1
24
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Quantiles
Example: Suppose R = 1000 reps, to estimate the p = 0.8 quantile
with a 95% confidence interval.
First, sort the data from smallest to largest.
Then estimate of by the (1000)(0.8) = 800-th smallest value, and the
point estimate is 212.03.
A portion of the 1000
And find the confidence interval:
pl = 0.8 1.96
.8(1 .8)
= 0.78
1000 1
.8(1 .8)
= 0.82
1000 1
The c.i. is the 780 th and 820 th smallest values
pu = 0.8 + 1.96
The point estimate is
The 95% c.i. is [188.96, 256.79]
Chapter 10. Output Analysis for a Single Model
25
sorted values:
Output
Rank
180.92
779
188.96
780
190.55
781
208.58
799
212.03
800
216.99
801
250.32
819
256.79
820
256.99
821
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Output Analysis for Steady-State Simulation
Consider a single run of a simulation model to estimate a steadystate or long-run characteristics of the system.
The single run produces observations Y1, Y2, ... (generally the samples of
an autocorrelated time series).
Performance measure:
1 n
= lim Yi ,
n n i =1
for discrete measure
1
= lim
TE TE
for continuous measure
TE
Y (t )dt ,
(with probability 1)
(with probability 1)
- Independent of the initial conditions.
Chapter 10. Output Analysis for a Single Model
26
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Output Analysis for Steady-State Simulation
The sample size is a design choice, with several considerations in
mind:
Any bias in the point estimator that is due to artificial or arbitrary initial
conditions (bias can be severe if run length is too short).
Desired precision of the point estimator.
Budget constraints on computer resources.
Notation: the estimation of from a discrete-time output process.
One replication (or run), the output data: Y1, Y2, Y3,
With several replications, the output data for replication r: Yr1, Yr2, Yr3,
Chapter 10. Output Analysis for a Single Model
27
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Initialization Bias
Methods to reduce the point-estimator bias caused by using artificial
and unrealistic initial conditions:
Intelligent initialization.
Divide simulation into an initialization phase and data-collection phase.
Intelligent initialization
Initialize the simulation in a state that is more representative of long-run
conditions.
If the system exists, collect data on it and use these data to specify more
nearly typical initial conditions.
If the system can be simplified enough to make it mathematically solvable,
e.g. queueing models, solve the simplified model to find long-run expected
or most likely conditions, use that to initialize the simulation.
Chapter 10. Output Analysis for a Single Model
28
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Initialization Bias
Divide each simulation into two phases:
An initialization phase, from time 0 to time T0.
A data-collection phase, from T0 to the stopping time T0+TE.
The choice of T0 is important:
- After T0 , system should be more nearly representative of steady-state behavior.
System has reached steady state: the probability distribution of the system
state is close to the steady-state probability distribution (bias of response
variable is negligible).
Chapter 10. Output Analysis for a Single Model
29
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Initialization Bias
M/G/1 queueing example: A total of 10 independent replications were
made.
Each replication beginning in the empty and idle state.
Simulation run length on each replication was T0+TE = 15000 minutes.
Response variable: queue length, LQ(t,r) (at time t of the r-th replication).
Batching intervals of 1000 minutes, batch means
Ensemble averages:
To identify trend in the data due to initialization bias
The average corresponding batch means across replications:
1 R
Y. j = Yrj
R r =1
R replications
The preferred method to determine deletion point.
Chapter 10. Output Analysis for a Single Model
30
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Initialization Bias
A plot of the ensemble averages, Y ..(n, d ) , versus 1000j,
for j = 1,2, ,15.
Chapter 10. Output Analysis for a Single Model
31
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Initialization Bias
Cumulative average sample mean (after deleting d observations):
1
Y.. (n, d ) =
nd
j = d +1
.j
Not recommended to determine the initialization phase.
It is apparent that downward bias is present and this bias can be reduced
by deletion of one or more observations.
Chapter 10. Output Analysis for a Single Model
32
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Initialization Bias
No widely accepted, objective and proven technique to guide how
much data to delete to reduce initialization bias to a negligible level.
Plots can, at times, be misleading but they are still recommended.
Ensemble averages reveal a smoother and more precise trend as the # of
replications, R, increases.
Ensemble averages can be smoothed further by plotting a moving
average.
Cumulative average becomes less variable as more data are averaged.
The more correlation present, the longer it takes for Y. j to approach
steady state.
Different performance measures could approach steady state at different
rates.
Chapter 10. Output Analysis for a Single Model
33
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Error Estimation
If {Y1, , Yn} are not statistically independent, then S2/n is a biased
estimator of the true variance.
Almost always the case when {Y1, , Yn} is a sequence of output
observations from within a single replication (autocorrelated sequence,
time-series).
Suppose the point estimator is the sample mean
1 n
Y = i =1 Yi
n
Variance of Y is very hard to estimate.
For systems with steady state, produce an output process that is
approximately covariance stationary (after passing the transient phase).
- The covariance between two random variables in the time series depends
only on the lag, i.e. the number of observations between them.
Chapter 10. Output Analysis for a Single Model
34
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Error Estimation
For a covariance stationary time series, {Y1, , Yn}:
Lag-k autocovariance is:
k = cov(Y1 , Y1+ k ) = cov(Yi , Yi + k )
Lag-k autocorrelation is:
k =
k
2
1 k 1
If a time series is covariance stationary, then the variance of
V (Y ) =
Y is:
k
1
2
+
1 k
n
n
k =1
n 1
The expected value of the variance estimator is:
S2
E = B V (Y ),
n
Chapter 10. Output Analysis for a Single Model
n / c 1
where B =
n 1
35
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Error Estimation
a)
k > 0 for most k
Stationary time series Yi exhibiting
positive autocorrelation.
b)
c)
Serie slowly drifts above and then
below the mean.
k < 0 for most k
Stationary time series Yi exhibiting
negative autocorrelation.
Nonstationary time series with an
upward trend
Chapter 10. Output Analysis for a Single Model
36
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Error Estimation
The expected value of the variance estimator is:
S2
E = B V (Y ),
n
where B =
n / c 1
and V (Y ) is the variance of Y
n 1
If Yi are independent, then S2/n is an unbiased estimator of V (Y )
If the autocorrelation k are primarily positive, then S2/n is biased low as
an estimator of V (Y ) .
If the autocorrelation k are primarily negative, then S2/n is biased high as
an estimator of V (Y ).
Chapter 10. Output Analysis for a Single Model
37
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Replication Method
Use to estimate point-estimator variability and to construct a confidence
interval.
Approach: make R replications, initializing and deleting from each one the
same way.
Important to do a thorough job of investigating the initial-condition bias:
Bias is not affected by the number of replications, instead, it is affected only by
deleting more data (i.e., increasing T0) or extending the length of each run (i.e.
increasing TE).
Basic raw output data {Yrj, r = 1, ..., R; j = 1, , n} is derived by:
Individual observation from within replication r.
Batch mean from within replication r of some number of discrete-time
observations.
Batch mean of a continuous-time process over time interval j.
Chapter 10. Output Analysis for a Single Model
38
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Replication Method
Each replication is regarded as a single sample for estimating .
For replication r:
1
Yr . (n, d ) =
nd
j = d +1
rj
The overall point estimator:
1 R
Y.. (n, d ) = Yr . (n, d )
R r =1
and
E[Y.. (n, d )] = n,d
If d and n are chosen sufficiently large:
n,d ~ .
Y.. (n, d ) is an approximately unbiased estimator of .
Chapter 10. Output Analysis for a Single Model
39
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Replication Method
To estimate the standard error of Y.. , the sample variance and
standard error:
1 R
1 R 2
2
2
(
)
=
S =
Y
Y
Y
R
Y
r. .. R 1
..
r.
R 1 r =1
r =1
Mean of the
undeleted
observations
from the r-th
replication.
Chapter 10. Output Analysis for a Single Model
Mean of
and
s.e.(Y.. ) =
S
R
Standard error
Y1 (n, d ),K, YR (n, d )
40
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Replication Method
Length of each replication (n) beyond deletion point (d):
( n d ) > 10d or
TE > 10T0
Number of replications (R) should be as many as time permits, up to
about 25 replications.
For a fixed total sample size (n), as fewer data are deleted (d):
CI shifts: greater bias.
Standard error of Y.. ( n, d ) decreases: decrease variance.
Reducing
bias
Chapter 10. Output Analysis for a Single Model
Trade off
41
Increasing
variance
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Replication Method
M/G/1 queueing example:
Suppose R = 10, each of length TE = 15000
minutes, starting at time 0 in the empty and
idle state, initialized for T0 = 2000 minutes
before data collection begins.
Each batch means is the average number of
customers in queue for a 1000-minute interval.
The 1-st two batch means are deleted (d = 2).
The point estimator and standard error are:
Y.. (15,2) = 8.43
and
s.e.(Y.. (15,2) ) = 1.59
The 95% CI for long-run mean queue length is:
Y.. t / 2, R 1S / R Y.. + t / 2, R 1S / R
8.43 2.26(1.59) LQ 8.42 + 2.26(1.59)
A high degree of confidence that the long-run mean queue length is between 4.84
and 12.02 (if d and n are large enough).
Chapter 10. Output Analysis for a Single Model
42
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Sample Size
To estimate a long-run performance measure, , within
with confidence 100(1- )%.
M/G/1 queueing example (cont.):
We know: R0 = 10, d = 2 and S02 = 25.30.
To estimate the long-run mean queue length, LQ, within = 2 customers
with 90% confidence ( = 10%).
Initial estimate:
2
z0.05 S0 1.645 (25.30)
R
= 17.1
=
2
2
Hence, at least 18 replications are needed, next try R = 18,19,
using R t0.05, R1S 0 / 2 . We found that:
R = 19 (t0.05,191S 0 / ) = (1.732 * 25.3 / 4) = 18.93
2
Additional replications needed is R R0 = 19-10 = 9.
Chapter 10. Output Analysis for a Single Model
43
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Sample Size
An alternative to increasing R is to increase total run length T0+TE
within each replication.
Approach:
- Increase run length from (T0+TE) to (R/R0)(T0+TE), and
- Delete additional amount of data, from time 0 to time (R/R0)T0.
Advantage: any residual bias in the point estimator should be further
reduced.
However, it is necessary to have saved the state of the model at time
T0+TE and to be able to restart the model.
Chapter 10. Output Analysis for a Single Model
44
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Batch Means for Interval Estimation
Using a single, long replication:
Problem: data are dependent so the usual estimator is biased.
Solution: batch means.
Batch means: divide the output data from 1 replication (after appropriate
deletion) into a few large batches and then treat the means of these batches
as if they were independent.
A continuous-time process, {Y(t), T0 t T0+TE}:
k batches of size m = TE / k, batch means:
1 jm
Yj =
Y (t + T0 )dt
j
m
(
1
)
m
A discrete-time process, {Yi, i = d+1,d+2, , n}:
k batches of size m = (n d)/k, batch means:
jm
1
Yj =
Yi + d
m i =( j 1) m +1
Chapter 10. Output Analysis for a Single Model
45
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Batch Means for Interval Estimation
Y1 , ..., Yd , Yd +1 , ..., Yd + m , Yd + m+1 , ..., Yd + 2 m , ... , Yd +( k 1) m+1 , ..., Yd + km
deleted
Y1
Yk
Y2
Starting either with continuous-time or discrete-time data, the
variance of the sample mean is estimated by:
2
S
1
=
k
k
j =1
(Y j Y )2 =
k 1
j =1
Y j2 kY 2
k (k 1)
If the batch size is sufficiently large, successive batch means will be
approximately independent, and the variance estimator will be
approximately unbiased.
No widely accepted and relatively simple method for choosing an
acceptable batch size m. Some simulation software does it
automatically.
Chapter 10. Output Analysis for a Single Model
46
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Summary
Stochastic discrete-event simulation is a statistical experiment.
Distinguish: terminating simulations and steady-state simulations.
Steady-state output data are more difficult to analyze
Purpose of statistical experiment: obtain estimates of the performance measures
of the system.
Purpose of statistical analysis: acquire some assurance that these estimates are
sufficiently precise.
Decisions: initial conditions and run length
Possible solutions to bias: deletion of data and increasing run length
Statistical precision of point estimators are estimated by standard-error or
confidence interval
Method of independent replications was emphasized.
Chapter 10. Output Analysis for a Single Model
47
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Simulation
Discrete-Event System Simulation
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Chapter 11
Comparison and Evaluation of Alternative System Designs
Computer Science, Informatik 4
Communication and Distributed Systems
Purpose
Purpose: comparison of alternative system designs.
Approach: discuss a few of many statistical methods that can be
used to compare two or more system designs.
Statistical analysis is needed to discover whether observed
differences are due to:
Differences in design or
The random fluctuation inherent in the models
Chapter 11. Comparison and Evaluation of Alternative Designs
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Outline
For two-system comparisons:
Independent sampling.
Correlated sampling (common random numbers).
For multiple system comparisons:
Bonferroni approach: confidence-interval estimation, screening, and
selecting the best.
Metamodels
Chapter 11. Comparison and Evaluation of Alternative Designs
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Comparison of Two System Designs
Goal: compare two possible configurations of a system
Two possible ordering policies in a supply-chain system, two possible
scheduling rules in a job shop.
Approach: the method of replications is used to analyze the output
data.
The mean performance measure for system i
Denoted by i , i = 1,2.
To obtain point and interval estimates for the difference in mean
performance, namely 1 2.
Chapter 11. Comparison and Evaluation of Alternative Designs
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Comparison of Two System Designs
Vehicle-safety inspection example:
The station performs 3 jobs: (1) brake check, (2) headlight check, and (3) steering
check.
Vehicles arrival: Possion with rate = 9.5/hour.
Present system:
Alternative system:
Performance measure: mean response time per vehicle (total time from vehicle
arrival to its departure).
- Three stalls in parallel (one attendant makes all 3 inspections at each stall).
- Service times for the 3 jobs: normally distributed with means 6.5, 6.0 and 5.5 minutes,
respectively.
- Each attendant specializes in a single task, each vehicle will pass through three work
stations in series
- Mean service times for each job decreases by 10% (5.85, 5.4, and 4.95 minutes).
Chapter 11. Comparison and Evaluation of Alternative Designs
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Comparison of Two System Designs
From replication r of system i, the simulation analyst obtains an estimate
Yir of the mean performance measure i .
Assuming that the estimators Yir are (at least approximately) unbiased:
1 = E(Y1r ), r = 1, , R1;
2 = E(Y2r ), r = 1, , R2
Goal: compute a confidence interval for 1 2 to compare the two
system designs
Confidence interval for 1 2:
- If CI is totally to the left of 0, strong evidence for the hypothesis that
1 2 < 0 (1 < 2 ).
- If CI is totally to the right of 0, strong evidence for the hypothesis that
1 2 > 0 (1 > 2 ).
- If CI contains 0, no strong statistical evidence that one system is better than
the other
If enough additional data were collected (i.e., Ri increased), the CI would
most likely shift, and definitely shrink in length, until conclusion of
1 < 2 or 1 > 2 would be drawn.
Chapter 11. Comparison and Evaluation of Alternative Designs
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Comparison of Two System Designs
In this chapter:
A two-sided 100(1-)% CI for 1 2 always takes the form of:
Y.1 Y.2 t / 2, s.e.(Y.1 Y.2 )
Sample
mean for
system i
Degree
of
freedom
Standard error
of the estimator
All three techniques assume that the basic data Yir are approximately
normally distributed.
Chapter 11. Comparison and Evaluation of Alternative Designs
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Comparison of Two System Designs
Statistically significant versus practically significant
Statistical significance: is the observed difference Y.1 Y.2 larger than
the variability in Y.1 Y.2 ?
Practical significance: is the true difference 1 2 large enough to
matter for the decision we need to make?
Confidence intervals do not answer the question of practical significance
directly, instead, they bound the true difference within the range:
Y1 Y2 t , s.e.(Y1 Y2 ) 1 2 Y1 Y2 + t , s.e.(Y1 Y2 )
2
Whether a difference within these bounds is practically significant
depends on the particular problem.
Chapter 11. Comparison and Evaluation of Alternative Designs
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Independent Sampling with Equal Variances
Different and independent random number streams are used
to simulate the two systems
All observations of simulated system 1 are statistically
independent of all the observations of simulated system 2.
The variance of the sample mean, Y.i , is:
V (Y.i ) i2
V Y.i =
=
,
Ri
Ri
( )
i = 1,2
For independent samples:
) ( ) ( )
V Y.1 Y.2 = V Y.1 + V Y.2 =
Chapter 11. Comparison and Evaluation of Alternative Designs
10
12
R1
22
R2
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Independent Sampling with Equal Variances
If it is reasonable to assume that 21 = 22 (approximately) or if R1 =
R2, a two-sample-t confidence-interval approach can be used:
The point estimate of the mean performance difference is:
The sample variance for system i is:
Y.1 Y.2
1 Ri
1 Ri 2
2
2
(Yri Y.i ) =
S =
Yri RiY.i
Ri 1 r =1
Ri 1 r =1
2
i
The pooled estimate of 2 is:
S p2
( R1 1) S12 + ( R2 1) S 22
=
,
R1 + R2 2
where = R1 + R2 -2 degrees of freedom
CI is given by: Y.1 Y.2 t / 2, s.e.(Y.1 Y.2 )
Standard error: s.e.(Y.1 Y.2 ) = S p
Chapter 11. Comparison and Evaluation of Alternative Designs
1
1
+
R1 R2
11
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Independent Sampling with Unequal Variances
If the assumption of equal variances cannot safely be made,
an approximate 100(1-)% CI can be computed as:
s.e. Y.1 Y.2 =
With degrees of freedom:
(S
/ R1 + S 22 / R2
=
S 2 / R 2 / (R 1) + S 2 / R
1
2 2
1 1
2
1
S12 S 22
+
R1 R2
) / (R
2
1)
, round to an interger
In this case, the minimum number of replications
R1 > 7 and R2 > 7 is recommended.
Chapter 11. Comparison and Evaluation of Alternative Designs
12
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Common Random Numbers (CRN)
For each replication, the same random numbers are used to simulate
both systems R1=R2=R.
For each replication r, the two estimates, Yr1 and Yr2, are correlated.
However, independent streams of random numbers are used on different
replications, so the pairs (Yr1 ,Ys2 ) are mutually independent for r s.
Purpose: induce positive correlation between Y.1 ,Y.2 (for each r) to
reduce variance in the point estimator of Y.1 Y.2 .
) ( ) ( )
V Y.1 Y.2 = V Y.1 + V Y.2 2 cov Y.1 , Y.2
12
Correlation:
12 is positive
2 12 1 2
=
+
R
R
R
Chapter 11. Comparison and Evaluation of Alternative Designs
22
13
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Common Random Numbers (CRN)
Compare variance from independent sampling with variance
from CRN:
VCRN
2 12 1 2
= VIND
R
Variance of Y.1 Y.2 arising from CRN is less than that of
independent sampling (with R1=R2).
Chapter 11. Comparison and Evaluation of Alternative Designs
14
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Common Random Numbers (CRN)
The estimator based on CRN is more precise, leading to a shorter
confidence interval for the difference.
Sample variance of the differences D = Y.1 Y.2 :
1 R
1 R 2
2
2
(
)
S =
D
D
D
R
D
r
r
R 1 r =1
R 1 r =1
2
D
where Dr = Yr1-Yr 2
Standard error:
1 R
and D = Dr , with degress of freedom = R-1
R r =1
s.e.( D ) = s.e. Y.1 Y.2 =
Chapter 11. Comparison and Evaluation of Alternative Designs
15
SD
R
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Common Random Numbers (CRN)
It is never enough to simply use the same seed for the randomnumber generator(s):
The random numbers must be synchronized: each random number used
in one model for some purpose should be used for the same purpose in
the other model.
Example: if the i-th random number is used to generate a service time at
work station 2 for the 5-th arrival in model 1, the i-th random number
should be used for the very same purpose in model 2.
Chapter 11. Comparison and Evaluation of Alternative Designs
16
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Common Random Numbers (CRN)
Vehicle inspection example:
4 input random variables:
- An interarrival time between vehicle n and vehicle n+1,
- Sn(i) inspection time for task i for vehicle n in model 1 (i=1,2,3; refers to brake,
headlight and steering task, respectively).
When using CRN:
- Same values should be generated for A1, A2, A3, in both models.
- However, mean service time for model 2 is 10% less.
- Two possible approaches to obtain correlated service times:
Let Sn(i), be the service times generated for model 1, use:
Sn(i) - 0.1E[Sn(i)]
Let Zn(i), as the standard normal variate, = 0.5 minutes, use:
E[Sn(i)] + Zn(i)
- For synchronized runs: the service times for a vehicle were generated at the
instant of arrival and stored as its attribute and used as needed.
Chapter 11. Comparison and Evaluation of Alternative Designs
17
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Common Random Numbers (CRN)
Vehicle inspection example (cont.)
Each replication run of 16 hours
Model 2 with independen random numbers
Model 2 with common random numbers
with synchronisation
Model 1
Chapter 11. Comparison and Evaluation of Alternative Designs
Model 2 with common random numbers
without synchronisation
18
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Common Random Numbers (CRN)
Vehicle inspection example (cont.): compare the two systems using
independent sampling and CRN where R = R1 = R2 =10:
Independent sampling: Y.1 Y.2 = 5.4 minutes
with = 17, t 0.05,17 = 2.11, S12 = 118.9 and S 22 = 244.3, CI : -18.1 1-2 7.3
CRN without synchronization: Y.1 Y.2 = 1.9 minutes
with = 9, t 0.05,9 = 2.26, S D2 = 208.9, CI : - 12.3 1 - 2 8.5
CRN with synchronization: Y.1 Y.2 = 0.4 minutes
with = 9, t 0.05,9 = 2.26, S D2 = 1.7, CI : - 0.50 1 - 2 1.30
Chapter 11. Comparison and Evaluation of Alternative Designs
19
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
CRN with Specified Precision
Goal: The error in our estimate of 1 2 to be less than .
Approach: determine the # of replications R such that the half-width of
CI:
H = t / 2, s.e.(Y.1 Y.2 )
Vehicle inspection example (cont.):
R0 = 10, complete synchronization of random numbers
yield 95% CI: 0.4 0.9 minutes
Suppose = 0.5 minutes for practical significance, we know R is the
smallest integer satisfying R R0 and:
t / 2, R 1S D
R
t / 2, R0 1S D
t
t
Since / 2, R 1 / 2, R0 1 , a conservation estimate of R is: R
Hence, 35 replications are needed (25 additional).
Chapter 11. Comparison and Evaluation of Alternative Designs
20
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Simulation
Discrete-Event System Simulation
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Chapter 2
Simulation Examples
Computer Science, Informatik 4
Communication and Distributed Systems
Simulation using a Table
Repetition
Introducing simulation by
manually simulating on a table
Inputs
xi1
Response
yi
xi2 xij xip
1
2
:
Can be done via pen-and-paper
or by using a spreadsheet
n
Three steps
1. Determine the characteristics of each inputs to the simulation.
2. Construct a simulation table consisting of
p inputs xij, j=1,2,,p
one response y , i=1,2,,n
i
3. For each repetition i, generate a value for each of the p inputs xij and
calculate the response yi.
Next some simulation examples using tables.
Chapter 2. Simulation Examples
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Simulation of Queueing Systems
A queueing system is described by
Calling population
Arrival rate
Service mechanism
System capacity
Queueing discipline
Calling population
Waiting line
Chapter 2. Simulation Examples
Server
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Simulation of Queueing Systems
Single server queue
Queueing system state
Calling population is infinite
System
Arrival rate does not change
Units are served according FIFO
Arrivals are defined by the
Units (in queue or being served)
- Clock
-
distribution of the time between
arrivals
State of the system
Events
- Number of units in the system
- Status of server (idle, busy)
Service times are according to
Arrival rate must be less than
service rate
inter-arrival time
distribution
Server
stable system
- Arrival of a unit
- Departure of a unit
Otherwise waiting line will grow
unbounded
unstable system
Calling population
ti+1
ti
Arrivals
Waiting line
Chapter 2. Simulation Examples
Server
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Simulation of Queueing Systems
Arrival Event
If server idle unit gets service,
otherwise unit enters queue.
Departure Event
If queue is not empty begin
servicing next unit, otherwise
server will be idle.
How do events occur?
Events occur randomly
Interarrival times
Service times
Chapter 2. Simulation Examples
{1,...,6}
{1,...,4}
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Simulation of Queueing Systems
Customer
Interarrival
Time
Arrival Time
on Clock
The interarrival and service
times are taken from
distributions!
Service
Time
15
Customer
Number
The simulation run is
build by meshing
clock, arrival and
service times!
Chapter 2. Simulation Examples
Arrival Time
[Clock]
Time Service
Begins [Clock]
Service Time
[Duration]
Time Service
Ends [Clock]
11
11
12
15
15
19
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Simulation of Queueing Systems
Chronological ordering of events
Clock
Time
Customer
Number
Event
Type
Number of
customers
Arrival
Departure
Arrival
Departure
Arrival
Arrival
Departure
Arrival
11
Departure
12
Departure
15
Arrival
19
Departure
Chapter 2. Simulation Examples
Number of customers in the system
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Simulation of Queueing Systems A Grocery
Analysis of a small grocery store
One checkout counter
Customers arrive at random times
Interarrival
Time
Probability
0.125
0.125
0.125
0.250
0.125
0.375
0.125
0.500
0.125
0.625
0.125
0.750
0.125
0.875
0.125
1.000
from {1,2,,8} minutes
Service times vary from {1,2,,6}
minutes
Consider the system for 100
customers
Problems/Simplifications
Sample size is too small to be able
to draw reliable conclusions
Initial condition is not considered
Chapter 2. Simulation Examples
Service
Time
Probability
Cumulative
Probability
Cumulative
Probability
0.10
0.10
0.20
0.30
0.30
0.60
0.25
0.85
0.10
0.95
0.05
1.00
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Simulation of Queueing Systems A Grocery
Customer
Interarrival
Time
[Minutes]
Arrival Time
[Clock]
Service
Time
[Minutes]
Time
Service
Begins
[Clock]
Time
Service
Ends
[Clock]
Waiting
Time in
Queue
[Minutes]
Time
Customer
in System
[Minutes]
Idle Time
of Server
[Minutes]
11
11
15
11
15
16
18
18
23
416
418
174
491
101
...
100
Total
415
Chapter 2. Simulation Examples
415
2
317
10
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Simulation of Queueing Systems A Grocery
Average waiting time
w=
Probability that a customer has to wait
Porpotion of server idle time
p( wait ) =
p(idle server) =
Average service time
s=
Number of customers
100
Number of customer who wait 46
=
= 0.46
Nunber of customers
100
Idle time of server = 101 = 0.24
Simulation run time
Service time
Number of customers
418
317
= 3.17 min
100
s p(s) = 0.110 + 0.2 20 + L + 0.05 6 = 3.2 min
Times between arrivals 415
=
= 4.19 min
=
E (s ) =
Average time between arrivals
Waiting time in queue = 174 = 1.74 min
s=0
Number of arrivals - 1
E ( ) =
Average waiting time of those who wait
Average time a customer spends in system
wwaited =
t=
99
a + b 1+ 8
=
= 4.5 min
2
2
Waiting time in queue
Number of customers that wait
174
= 3.22 min
54
Time customers spend in system = 491 = 4.91min
Number of customers
100
t = w + s = 1.74 + 3.17 = 4.91 min
Chapter 2. Simulation Examples
11
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Simulation of Queueing Systems A Grocery
Interesting results for a
Average of 50 Trials
manager, but
Histogram for the Average Customer Waiting Time
longer simulation run would
increase the accuracy
18
17
Some interpretations
Average waiting time is not
high
Server hast not undue amount
Occurrences (No. of Trials)
16
14
12
10
8
4
2
1
0
of idle time, it is well loaded ;-)
15
1
0
4,5
0
0
0,5
1,5
2,5
3,5
>4.5
Average customer waiting time [min]
Nearly half of the customers
have to wait (46%)
Chapter 2. Simulation Examples
12
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Summary
This chapter introduced simulation concepts my means of
examples
Example simulation were performed on a table manually
Use a spreadsheet for large experiments (Excel, OpenOffice)
Input data is important
Random variables can be used
Output analysis important and difficult
The used tables were of ad hoc, a more methodic approach is
needed
Chapter 2. Simulation Examples
13
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Simulation
Discrete-Event System Simulation
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Chapter 3
General Principles
Computer Science, Informatik 4
Communication and Distributed Systems
General Principles Introduction
Framework for modeling systems by discrete-event
simulation
A system is modeled in terms of its state at each point in time
This is appropriate for systems where changes occur only at
discrete points in time
Chapter 3. General Principles
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Concepts in Discrete-Event Simulation
Concepts of dynamic, stochastic systems that change in a discrete
manner
System
A collection of entities that interact together over time to accomplish one or more
goals.
Model
An abstract representation of a system, usually containing structural, logical, or
mathematical relationships that describe the system.
System state
A collection of variables that contain all the information necessary to describe the
system at any time.
Entity
An object in the system that requires explicit representation in the model, e.g., people,
machines, nodes, packets, server, customer.
Attributes
The properties of a given entity.
List, Set
A collection of associated entities ordered in some logical fashion in a waiting line.
Holds entities and event notices
Entities on a list are always ordered by some rule, e.g. FIFO, LIFO, or ranked by
some attribute, e.g. priority, due date
Event
An instantaneous occurrence that changes the state of a system.
Event notice
A record of an event to occur at the current or some future time, along with any
associated data necessary to execute the event.
Chapter 3. General Principles
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Concepts in Discrete-Event Simulation
Event list
A list of event notices for future events, ordered by time of occurrence; known as the
future event list (FEL).
Always ranked by the event time
Activity
A duration of time of specified length, which is known when it begins.
Represents a service time, interarrival time, or any other processing time whose
duration has been characterized by the modeler. The duration of an activity can be
specified as:
Deterministic Always 5 time units
Statistical Random draw from {2, 5, 7}
A function depending on system variables and entities
The duration of an activity is computable when it begins
The duration is not affected by other events
To track activities, an event notice is created for the completion time, e.g., let
clock=100 and service with duration 5 time units is starting
Schedule an end of service-event for clock + 5 = 105
Delay
A duration of time of unspecified indefinite length, which is not known until it ends.
Customers delay in waiting line depends on the number and service times of other
customers.
Typically a desired output of the simulation run.
Clock
A variable representing the simulated time.
Chapter 3. General Principles
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Concepts in Discrete-Event Simulation
Activity vs. Delay
Activity
Activity is known as unconditional wait
End of an activity is an event, for this an event notice is placed in
the future event list
This event is a primary event
Delay
Delay is known as conditional wait
Delays are managed by placing the entity on another list, e.g.,
representing a waiting line
Completion of delay is a secondary event, but they are not
placed in the future event list
Chapter 3. General Principles
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Concepts in Discrete-Event Simulation
Activity vs. Delay
Delay
Delay
Activity2
Activity1
t
A1
Chapter 3. General Principles
A2
D1
A3
D2
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Concepts in Discrete-Event Simulation
A model consists of
static description of the model and
the dynamic relationships and interactions between the components
Some questions that need to be answered for the dynamic behavior
Events
- How does each event affect system state, entity attributes, and set contents?
Activities
- How are activities defined?
- What event marks the beginning or end of each activity?
- Can the activity begin regardless of system state, or is its beginning conditioned on the
system being in a certain state?
Delays
- Which events trigger the beginning (and end) of each delay?
- Under what condition does a delay begin or end?
System state initialization
- What is the system state at time 0?
- What events should be generated at time 0 to prime the model that is, to get the
simulation started?
Chapter 3. General Principles
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Concepts in Discrete-Event Simulation
A discrete-event simulation proceeds by producing a
sequence of system snapshots over time
A snapshot of the system at a given time includes
System state
Status of all entities
Status of all sets
- Sets are used to collect required information for calculating
performance metrics
List of activities (FEL)
Statistics
Clock
System state
(x, y, z, ...)
...
...
Chapter 3. General Principles
Entities and
attributes
Set 1
Set 2
...
Future event list (FEL)
Statistics
(3,t1) Type 3 event to occur at t1
...
...
...
...
...
...
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Event-scheduling/Time-advance algorithm
Future event list (FEL)
All event notices are chronologically ordered in the FEL
At current time t, the FEL contains all scheduled events
The event times satisfy: t < t1 t2 t3 ... tn
The event associated with t1 is the imminent event, i.e., the next
event to occur
Scheduling of an event
- At the beginning of an activity the duration is computed and an endof-activity event is placed on the future event list
The content of the FEL is changing during simulation run
- Efficient management of the FEL has a major impact on the
performance of a simulation run
- Class: Data structures and algorithms
Chapter 3. General Principles
10
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Event-scheduling/Time-advance algorithm
Old system snapshot at time t
Event-scheduling/Time-advance algorithm
Clock
State
(5,1,6)
(3,t1) Type 3 event to occur at t1
(1,t2) Type 1 event to occur at t2
Step 1: Remove the event notice for the
imminent event from FEL
event (3, t1) in the example
(1,t3) Type 1 event to occur at t3
...
(2,tn) Type 2 event to occur at tn
Step 2: Advance Clock to imminent event time
Set clock = t1
Step 3: Execute imminent event
update system state
change entity attributes
set membership as needed
Step 4: Generate future events and place their
event notices on FEL
Event (4, t*)
Future event list
New system snapshot at time t1
Clock
State
t1
(5,1,5)
Future event list
(1,t2) Type 1 event to occur at t2
(4,t*) Type 4 event to occur at t*
(1,t3) Type 1 event to occur at t3
Step 5: Update statistics and counters
...
(2,tn) Type 2 event to occur at tn
Chapter 3. General Principles
11
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Event-scheduling/Time-advance algorithm
System snapshot at time 0
Initial conditions
Generation of exogenous events
- Exogenous event, is an event which happens outside the system,
but impinges on the system, e.g., arrival of a customer
Chapter 3. General Principles
12
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Event-scheduling/Time-advance algorithm
Generation of events
Arrival of a customer
- At t=0 first arrival is generated and scheduled
- When the clock is advanced to the time of the
first arrival, a second arrival is generated
- Generate an interarrival time a*
- Calculate t* = clock + a*
- Place event notice at t* on the FEL
Chapter 3. General Principles
13
Bootstrapping
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Event-scheduling/Time-advance algorithm
Generation of events
Service completion of a customer
-
A customer completes service at t
If the next customer is present a new service time s* is generated
Calculate t* = clock + s*
Schedule next service completion at t*
Additionally: Service completion event will scheduled at the arrival
time, when there is an idle server
- Service time is an activity
- Beginning service is a conditional event
Conditions: Customer is present and server is idle
- Service completion is a primary event
Chapter 3. General Principles
14
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Event-scheduling/Time-advance algorithm
Generation of events
Alternate generation of runtimes and downtimes
- At time 0, the first runtime will be generated and an end-of-runtime
event will be scheduled
- Whenever an end-of-runtime event occurs, a downtime will be
generated, and a end-of-downtime event will be scheduled
- At the end-of-downtime event, a runtime is generated and an endof-runtime event is scheduled
- Runtimes and downtimes are activities
- end-of-runtime and end-of-downtime are primary events
runtime
downtime
runtime
Time
Time 0
Chapter 3. General Principles
15
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Event-scheduling/Time-advance algorithm
Stopping a simulation
1. At time 0, schedule a stop simulation event at a specified future
time TE Simulation will run over [0, TE]
2. Run length TE is determined by the simulation itself.
TE is not known ahead.
Example: TE = When FEL is empty
Chapter 3. General Principles
16
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
World Views
World view
A world view is an
orientation for the model
developer
Simulation packages
typically support some
world views
Here, only world views for
discrete simulations
Chapter 3. General Principles
Discrete Simulation
Event-scheduling
17
Process-interaction
Activity-scanning
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
World Views
Start
Event-scheduling
Focus on events
Identify the entities and their
attributes
Identify the attributes of the
system
Define what causes a change
in system state
Write a routine to execute for
each event
Variable time advance
Initialization
Select next event
Event
routine 1
Event
routine 2
No
Event
routine n
Terminate?
Yes
Output
End
Chapter 3. General Principles
18
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
World Views
Process-interaction
Modeler thinks in terms of processes
A process is the lifecycle of one entity, which consists of various events and activities
Simulation model is defined in terms of entities or objects and their life cycle as they flow
through the system, demanding resources and queueing to wait for resources
Some activities might require the use of one or more resources whose capacities are limited
Processes interact, e.g., one process has to wait in a queue because the resource it needs is
busy with another process
A process is a time-sequenced list of events, activities and delays, including demands for
resource, that define the life cycle of one entity as it moves through a system
Variable time advance
Chapter 3. General Principles
19
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
World Views
Start
Activity-scanning
Initialization
Modeler concentrates on activities
of a model and those conditions
that allow an activity to begin
At each clock advance, the
conditions for each activity are
checked, and, if the conditions are
true, then the corresponding
activity begins
Fix time advance
Disadvantage: The repeated
scanning to discover whether an
activity can begin results in slow
runtime
Improvement: Three-phase
approach
Phase 1: Time Scan
Phase 2: Activity Scan
Activity 1
Condition
Actions
Activity 2
Condition
Actions
Other condition
satisfied?
Activity n
Condition
Actions
Yes
No
No
Terminate?
Yes
- Combination of event scheduling
with activity scanning
Output
End
Chapter 3. General Principles
20
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
World Views
Start
Three-phase approach
Initialization
Events are activities of duration
zero time units
Two types of activities
Phase A: Time Scan
- B activities: activities bound to
occur; all primary events and
unconditional activities
- C activities: activities or events
that are conditional upon certain
conditions being true
Phase B: Execute B activities due now
Phase C: Scan all C activities
Activity 1
Condition
Actions
The B-type activites can be
scheduled ahead of time, just as
in the event-scheduling approach
Activity 2
Condition
Actions
- Variable time advance
- FEL contains only B-type events
Other condition
satisfied?
Scanning to learn whether any Ctype activities can begin or C-type
events occur happen only at the
end of each time advance, after
all B-type events have completed
Activity n
Condition
Actions
Yes
No
No
Terminate?
Yes
Output
End
Chapter 3. General Principles
21
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
World Views
P2
E4
E3
A3
A4
P4
E8
E7
A7
P3
P1
E2
E1
A1
E6
E5
A5
A8
A6
A2
Time
Chapter 3. General Principles
22
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Manual Simulation Using Event Scheduling Grocery
Reconsider grocery example from Chapter 2
System state = ( LQ(t), LS(t) )
(A, t) arrival event at future time t
(D, t) departure event at future time t
(E, t) simulation stop at future time t
Activities
Arrival (A)
Departure (D)
Stopping event (E)
Event notices
Server and customers are not explicitly modeled
Events
LQ(t) = Number of customers in the waiting line at t
LS(t) = Number of customers being served at t (0 or 1)
Entities
In chapter 2: We used an ad hoc method to simulate the grocery
Interarrival time
Service time
Delay
Customer time spent in waiting line
Calling population
Waiting line
Chapter 3. General Principles
23
Server
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Manual Simulation Using Event Scheduling Grocery
System state = ( LQ(t), LS(t) ) is affected by the events
Arrival
Departure
Chapter 3. General Principles
24
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Manual Simulation Using Event Scheduling Grocery
Server Busy time
Maximum Queue Length
Initial conditions
First customer arrives at t=0
and gets service
An arrival and a departure
event is on FEL
Server was busy for 21 of
23 time units
Maximum queue length
was 2
Chapter 3. General Principles
25
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Manual Simulation Using Event Scheduling Grocery
When event scheduling is implemented, consider
Only one snapshot is kept in the memory
A new snapshot can be derived only from the previous snapshot
Past snapshot are ignored for advancing the clock
The current snapshot must contain all information necessary to
continue the simulation!
In the example
No information about particular customer
If needed, the model has to be extended
Chapter 3. General Principles
26
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Manual Simulation Using Event Scheduling Grocery
Analyst wants estimates per customer basis
Mean response time (system time)
Mean proportion of customers who spend more than 5 time units
Extend the model to represent customers explicitly
Entities: Customer entities denoted as C1, C2, C3,
- (Ci, t) customer Ci arrived at t
Event notices
- (A, t, Ci) arrival of customer Ci at t
- (D, t, Cj) departure of customer Cj at t
Set
- Checkout Line set of customers currently at the checkout counter ordered
by time of arrival
Statistics
- S: sum of customer response times for all customers who have departed by
the current time
- F: total number of customers who spend 5 time units
- ND: number of departures up to the current simulation time
Chapter 3. General Principles
27
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Manual Simulation Using Event Scheduling Grocery
Extended version of the simulation table from Slide 25
System State
Clock
LQ(t)
(C1,0)
(A,1,C2) (D,4,C1)(E,60)
(C1,0)(C2,1)
(A,2,C3)(D,4,C1)(E,60)
(C1,0)(C2,1)(C3,2)
(D,4,C1)(A,8,C4)(E,60)
(C2,1)(C3,2)
(D,6,C2)(A,8,C4)(E,60)
(C3,2)
(A,8,C4)(D,11,C3)(E,60)
(C3,2)(C4,8)
(D,11,C3)(A,11,C5)(E,60)
11
(C4,8)(C5,11)
(D,15,C4)(A,18,C6)(E,60)
18
15
(C5,11)
(D,16,C4)(A,18,C6)(E,60)
25
16
(A,18,C6)(E,60)
30
18
(C6,18)
(D,23,C6)(A,23,C7)(E,60)
30
23
(C7,23)
(A,25,C8)(D,27,C7)(E,60)
35
response time =
Chapter 3. General Principles
LS(t)
Statistics
Checkout Line
35
S
=
= 5.83
6
ND
Future Event List
N 5 =
28
ND
5
F
= = 0.83
ND 6
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Manual Simulation Using Event Scheduling Dump Truck
The DumpTruck Problem
Six dump trucks are used to haul coal from the entrace of a small mine
to the railroad
Each truck is loaded by one of two loaders
After loading, the truck immediately moves to the scale, to be weighed
Loader and Scale have a first-come-first-serve (FCFS) queue
The travel time from loader to scale is negligible
After being weighed, a truck begins a travel time, afterwards unloads
the coal and returns to the loader queue
Purpose of the study: Estimation of the loader and scale utilizations.
Chapter 3. General Principles
29
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Manual Simulation Using Event Scheduling Dump Truck
Loader queue Trucks waiting to begin loading, FCFS
Weigh queue Truck waiting to bei weighed, FCFS
Delay at loader queue
Delay at scale
Chapter 3. General Principles
30
CDF
0.30
0.30
10
0.50
0.80
15
0.20
1.00
Weighing Time
PDF
CDF
12
0.70
0.70
16
0.30
1.00
Travel Time Distribution
Travel Time
Loading Loading time
Weighing Weighing time
Travel Travel time
Delays
PDF
Weighing Time Distribution
The six dump trucks DT1, DT2, ..., DT6
Activities
Loading Time
Lists
(ALQ, t, DTi) dump truck i arrives at loader queue (ALQ) at time t
(EL, t, DTi) dump truck i ends loading (EL) at time t
(EW, t, DTi) dump truck i ends weighing (EW) at time t
Entities
LQ(t) = number of trucks in the loader queue {0,1,2,...}
L(t) = number of trucks being loaded {0,1,2}
WQ(t) = number of trucks in weigh queue {0,1,2,...}
W(t) = number of trucks being weighed {0,1}
Event notices
Loading Time Distribution
System state [ LQ(t), L(t), WQ(t), W(t) ]
PDF
CDF
40
0.40
0.40
60
0.30
0.70
80
0.20
0.90
100
0.10
1.00
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Manual Simulation Using Event Scheduling Dump Truck
Initialization
It is assumed that five trucks are at the loader and one is at the scale at
time 0
Activity times
Both loaders
are busy!
Loading time: 10, 5, 5, 10, 15, 10, 10
Weighing time: 12, 12, 12, 16, 12, 16
Travel time: 60, 100, 40, 40 80
System State
Lists
Loader Queue
Statistics
Clock
LQ(t)
L(t)
WQ(t)
W(t)
Weigh Queue
DT4, DT5, DT6
DT5, DT6
DT3
10
DT6
10
12
20
24
Future Event List
BS
(EL,10,DT2) (EL,5+5,DT4) (EW,12,DT1)
10
DT3, DT2
(EL,10,DT4) (EW,12,DT1)
(EL,10+10,DT5)
20
10
DT3, DT2, DT4
(EW,12,DT1) (EL,20,DT5)
(EL,10+15,DT6)
20
10
DT2, DT4
(EL,20,DT5) (EW,12+12,DT3)
(EL,25,DT6) (ALQ,12+60,DT1)
24
12
DT2, DT4, DT5
(EW,24,DT3) (EL,25,DT6) (ALQ,72,DT1)
40
20
DT4, DT5
(EL,25,DT6) (EW,24+12,DT2)
(ALQ,72,DT1) (ALQ,24+100,DT3)
44
24
Chapter 3. General Principles
(EL,5,DT3) (EL,10,DT2) (EW,12,DT1)
BL
31
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Simulation in Java
Computer Science, Informatik 4
Communication and Distributed Systems
Simulation in Java
Again the grocery example
Java is a general purpose
programming language
Single server queue
Run for 1000 customers
Interarrival times are
exponentially distributed with
mean 4.5
Service times are also
exponentially distributed with
mean 3.2
Known as: M/M/1 queueing
system
Object-oriented
First simple specific
simulation implementation
Later, object-oriented
framework for discrete event
simulation
Calling population
ti+1
ti
Arrivals
Waiting line
Chapter 3. General Principles
33
Server
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Simulation in Java
System state
queueLength
numberInService
Entity attributes
customers
futureEventList
Activity durations
meanInterArrivalTime
meanServiceTime
meanInterarrivalTime
meanServiceTime
totalCustomers
34
exponential(mu)
Methods
Input parameters
rho = BusyTime/Clock
avgr = Average response time
pc4 = Number of customers who spent
more than 4 minutes
Help functions
clock
lastEventTime
totalBusy
maxQueueLength
sumResponseTime
Statistics
Future event list
Chapter 3. General Principles
Simulation variables
initialization
processArrival
processDeparture
reportGeneration
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Simulation in Java
Overall structure of the Java program
Overall structure of an event-scheduling
simulation program
Chapter 3. General Principles
35
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Simulation in Java Class Event
class Event {
public double time;
private int type;
public Event(int _type, double _time) {
type = _type;
time = _time;
}
public int getType() {
return type;
}
public double getTime() {
return time;
}
}
Chapter 3. General Principles
36
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Simulation in Java Sim Class
class Sim {
// Class Sim variables
public static double clock,
meanInterArrivalTime,
meanServiceTime,
lastEventTime,
totalBusy,
maxQueueLength,
sumResponseTime;
public static long
numberOfCustomers,
queueLength,
numberInService,
totalCustomers,
numberOfDepartures,
longService;
public final static int arrival = 1;
public final static int departure = 2;
// Event type for an arrival
// Event type for a departure
public static EventList futureEventList;
public static Queue customers;
public static Random stream;
Chapter 3. General Principles
37
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Simulation in Java Main program
public static void main(String
meanInterArrivalTime
meanServiceTime
totalCustomers
long seed
argv[]) {
= 4.5;
= 3.2;
= 1000;
= Long.parseLong(argv[0]);
stream = new Random(seed);
futureEventList = new EventList();
customers = new Queue();
// Initialize rng stream
initialization();
// Loop until first totalCustomers" have departed
while( numberOfDepartures < totalCustomers ) {
Event event = (Event)futureEventList.getMin(); // Get imminent event
futureEventList.dequeue();
// Be rid of it
clock = event.getTime();
// Advance simulation time
if( event.getType() == arrival ) {
processArrival(event);
}
else {
processDeparture(event);
}
}
reportGeneration();
}
Chapter 3. General Principles
38
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Simulation in Java Initialization
// Seed the event list with TotalCustomers arrivals
public static void initialization()
{
clock = 0.0;
queueLength = 0;
numberInService = 0;
lastEventTime = 0.0;
totalBusy = 0 ;
maxQueueLength = 0;
sumResponseTime = 0;
numberOfDepartures = 0;
longService = 0;
// Create first arrival event
double eventTime = exponential(stream, MeanInterArrivalTime);
Event event = new Event(arrival, eventTime);
futureEventList.enqueue(event);
}
Chapter 3. General Principles
39
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Simulation in Java Event Arrival
public static void processArrival(Event event) {
customers.enqueue(event);
queueLength++;
// If the server is idle, fetch the event, do statistics and put into service
if( numberInService == 0 ) {
scheduleDeparture();
}
else {
totalBusy += (clock - lastEventTime); // server is busy
}
// Adjust max queue length statistics
if(maxQueueLength < queueLength) {
maxQueueLength = queueLength;
}
// Schedule the next arrival
Double eventTime = clock + exponential(stream, meanInterArrivalTime);
Event nextArrival = new Event(arrival, eventTime);
futureEventList.enqueue( nextArrival );
lastEventTime = clock;
}
Chapter 3. General Principles
40
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Simulation in Java Event Departure
public static void scheduleDeparture() {
double serviceTime = exponential(stream, meanServiceTime);
Event depart = new Event(departure, clock + serviceTime);
futureEventList.enqueue(depart);
numberInService = 1;
queueLength--;
}
public static void processDeparture(Event e) {
// Get the customer description
Event finished = (Event) customers.dequeue();
// If there are customers in the queue then schedule the departure of the next one
if( queueLength > 0 ) {
scheduleDeparture();
}
else {
numberInService = 0;
}
// Measure the response time and add to the sum
double response = clock - finished.getTime();
sumResponseTime += response;
if( response > 4.0 )
longService++; // record long service
totalBusy += (clock - lastEventTime);
numberOfDepartures++;
lastEventTime = clock;
}
Chapter 3. General Principles
41
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Simulation in Java Report Generator
public static void
double rho
=
double avgr =
double pc4
=
reportGeneration() {
totalBusy/clock;
sumResponseTime/totalCustomers;
((double)longService)/totalCustomers;
System.out.println( "SINGLE SERVER QUEUE SIMULATION - GROCERY STORE CHECKOUT COUNTER ");
System.out.println( "\tMEAN INTERARRIVAL TIME
" + meanInterArrivalTime );
System.out.println( "\tMEAN SERVICE TIME
" + meanServiceTime );
System.out.println( "\tNUMBER OF CUSTOMERS SERVED
" + totalCustomers );
System.out.println();
System.out.println( "\tSERVER UTILIZATION
" + rho );
System.out.println( "\tMAXIMUM LINE LENGTH
" + maxQueueLength );
System.out.println( "\tAVERAGE RESPONSE TIME
" + avgr + " Time Units");
System.out.println( "\tPROPORTION WHO SPEND FOUR ");
System.out.println( "\t MINUTES OR MORE IN SYSTEM
" + pc4 );
System.out.println( "\tSIMULATION RUNLENGTH
" + clock + " Time Units");
System.out.println( "\tNUMBER OF DEPARTURES
" + totalCustomers );
}
Chapter 3. General Principles
42
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Simulation in Java - Output
SINGLE SERVER QUEUE SIMULATION - GROCERY STORE CHECKOUT COUNTER
MEAN INTERARRIVAL TIME
4.5
MEAN SERVICE TIME
3.2
NUMBER OF CUSTOMERS SERVED
1000
SERVER UTILIZATION
MAXIMUM LINE LENGTH
AVERAGE RESPONSE TIME
PROPORTION WHO SPEND FOUR
MINUTES OR MORE IN SYSTEM
SIMULATION RUNLENGTH
NUMBER OF DEPARTURES
Chapter 3. General Principles
0.718
13.0
9.563
0.713
4485.635
1000
43
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Object-oriented Discrete-Event Simulation
Framework
Computer Science, Informatik 4
Communication and Distributed Systems
Object-Oriented Simulation Framework
Package rng
Package core
Chapter 3. General Principles
45
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Object-Oriented Simulation Framework
OO Discrete-Event Simulation Framework consists of
Two packages
Package core
SimEvent
SimEntity
SimQueue
SimControl
Package rng
RNG
Chapter 3. General Principles
46
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Object-Oriented Simulation Framework SimEvent
public class SimEvent {
double time;
int type;
SimEntity src;
SimEntity dst;
public long id;
public SimEvent(SimEntity _dst) {
type = 0;
time = 0;
src = null;
dst = _dst;
}
public SimEvent(double _time, SimEntity _dst) {
type = 0;
time = _time;
src = null;
dst = _dst;
}
public SimEvent(double _time, SimEntity _src, SimEntity _dst) {
type = 0;
time = _time;
src = _src;
dst = _dst;
}
Chapter 3. General Principles
47
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Object-Oriented Simulation Framework SimEntity
public abstract class SimEntity {
protected SimControl simControl;
/**
* An entity has to know the current instance of the simulator.
* @param _simControl
* @see SimControl
*/
public SimEntity(SimControl _simControl) {
simControl = _simControl;
}
/**
* This method handles the events destined to this entity.
* @param event
* @see SimEvent
*/
abstract public void handleEvent(SimEvent event);
}
Chapter 3. General Principles
48
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Object-Oriented Simulation Framework SimQueue
public abstract class SimQueue {
/**
* Schedule the given event according to the event time.
* @param event
* @see SimEvent
*/
abstract public void schedule(SimEvent event);
/**
* Return the next event in the queue.
* @return imminent event in queue.
* @see SimEvent
*/
abstract public SimEvent getNextEvent();
/**
* This method dumps the content of the queue.
* It is for debugging purposes.
*/
abstract public void dump();
abstract public boolean isEmpty();
}
Chapter 3. General Principles
49
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Object-Oriented Simulation Framework SimControl
public class SimControl {
private SimQueue queue;
private double time;
private double endTime;
public SimControl(SimQueue _queue) {
queue = _queue;
}
public void run() {
SimEvent event;
while( queue.isEmpty() == false ) {
// If there is an event in FEL and the sim-end is not reached ...
event = queue.getNextEvent();
time = event.getTime();
if( event.getTime() <= endTime )
dispatch(event); // ... call the destination object of this event
else
break;
}
}
private void dispatch(SimEvent event) {
event.getDestination().handleEvent(event);
}
Chapter 3. General Principles
50
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Object-Oriented Simulation Framework SimControl
... public class SimControl ...
public void setRunTime(double _runTime) {
endTime = _runTime;
}
public void schedule(SimEvent event) {
queue.schedule(event);
}
public void schedule(SimEvent event, double _delta) {
event.setTime(time +_delta);
schedule(event);
}
Chapter 3. General Principles
51
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Object-Oriented Simulation Framework RNG
public abstract class RNG {
abstract public double getNext();
}
public class Exponential extends RNG {
double lambda;
Random uniform;
public Exponential(double _lambda) {
lambda = _lambda;
uniform = new Random(System.currentTimeMillis());
}
/*
* @see rng.RNG#getNext()
*/
public double getNext() {
return -Math.log(uniform.nextDouble())/lambda;
}
}
Chapter 3. General Principles
52
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Object-Oriented Simulation Framework
Again our Grocery
example
Use of the objectoriented simulation
framework
MM1Generator
Generates new
customer
MM1Server
Serves customer
Chapter 3. General Principles
53
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Object-Oriented Simulation Framework
Chapter 3. General Principles
54
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Object-Oriented Simulation Framework
9
Simulation
Simulation
Theory
6
Queueing Time
System Time
Queueing Time
0
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.1
0.9
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
rho
rho
0.9
Simulation
p0 Probability that
a customer finds the
system idle
Theory
p0
0.8
0.7
Probability of Empty System
System Time
Theory
0.6
0.5
0.4
0.3
0.2
0.1
0
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
rho
Chapter 3. General Principles
55
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Summary
Introduced a general framework for discrete event
simulations
Event-scheduling and time-advance algorithm
Generation of events
World views for discrete simulations
Introduced manual discrete event simulation
Introduced simulation in Java
Object-oriented simulation framework in Java
Chapter 3. General Principles
56
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Simulation
Discrete-Event System Simulation
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Chapter 4
Statistical Models in Simulation
Computer Science, Informatik 4
Communication and Distributed Systems
Purpose & Overview
The world the model-builder sees is probabilistic rather than
deterministic.
Some statistical model might well describe the variations.
An appropriate model can be developed by sampling the
phenomenon of interest:
Select a known distribution through educated guesses
Make estimate of the parameters
Test for goodness of fit
In this chapter:
Review several important probability distributions
Present some typical application of these models
Chapter 4. Statistical Models in Simulation
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Review of Terminology and Concepts
In this section, we will review the following concepts:
Discrete random variables
Continuous random variables
Cumulative distribution function
Expectation
Chapter 4. Statistical Models in Simulation
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Discrete Random Variables
X is a discrete random variable if the number of possible values
of X is finite, or countable infinite.
Example: Consider jobs arriving at a job shop.
- Let X be the number of jobs arriving each week at a job shop.
Rx = possible values of X (range space of X) = {0,1,2,}
p(xi) = probability the random variable X is xi , p(xi) = P(X = xi)
p(xi), i = 1,2, must satisfy:
1. p( xi ) 0, for all i
2.
i =1
p( xi ) = 1
The collection of pairs [xi, p(xi)], i = 1,2,, is called the probability
distribution of X, and p(xi) is called the probability mass function (pmf) of
X.
Chapter 4. Statistical Models in Simulation
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Continuous Random Variables
X is a continuous random variable if its range space Rx is an interval or
a collection of intervals.
The probability that X lies in the interval [a, b] is given by:
b
P(a X b) = f ( x)dx
a
f(x) is called the probability density function (pdf) of X, satisfies:
1. f ( x) 0 , for all x in R X
2.
f ( x)dx = 1
RX
3. f ( x) = 0, if x is not in RX
Properties
x0
1. P ( X = x0 ) = 0, because f ( x)dx = 0
x0
2. P ( a X b ) = P ( a < X b ) = P ( a X < b ) = P ( a < X < b )
Chapter 4. Statistical Models in Simulation
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Continuous Random Variables
Example: Life of an inspection device is given by X, a
continuous random variable with pdf:
1 x / 2
e , x0
f ( x) = 2
0,
otherwise
Lifetime in Year
X has an exponential distribution with mean 2 years
Probability that the devices life is between 2 and 3 years is:
1 3 x / 2
P(2 x 3) = e dx = 0.14
2 2
Chapter 4. Statistical Models in Simulation
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Cumulative Distribution Function
Cumulative Distribution Function (cdf) is denoted by F(x), where
F(x) = P(X x)
If X is discrete, then
F ( x) = p ( xi )
xi x
If X is continuous, then
F ( x) = f (t )dt
Properties
1. F is nondecreasing function. If a b, then F (a ) F (b)
2. lim F ( x) = 1
x
3. lim F ( x) = 0
x
All probability question about X can be answered in terms of the cdf:
P (a X b) = F (b) F (a ), for all a b
Chapter 4. Statistical Models in Simulation
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Cumulative Distribution Function
Example: An inspection device has cdf:
1 x t / 2
F ( x) = e dt = 1 e x / 2
2 0
The probability that the device lasts for less than 2 years:
P(0 X 2) = F (2) F (0) = F (2) = 1 e 1 = 0.632
The probability that it lasts between 2 and 3 years:
P (2 X 3) = F (3) F (2) = (1 e ( 3 / 2 ) ) (1 e 1 ) = 0.145
Chapter 4. Statistical Models in Simulation
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Expectation
The expected value of X is denoted by E(X)
E ( x) = xi p ( xi )
all i
E ( x) = x f ( x)dx
If X is continuous
a.k.a the mean, m, , or the 1st moment of X
A measure of the central tendency
The variance of X is denoted by V(X) or var(X) or 2
If X is discrete
Definition:
V(X) = E( (X E[X])2 )
Also,
V(X) = E(X2) ( E(x) )2
A measure of the spread or variation of the possible values of X around the
mean
The standard deviation of X is denoted by
Definition: = V (x)
Expressed in the same units as the mean
Chapter 4. Statistical Models in Simulation
10
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Expectations
Example: The mean of life of the previous inspection device is:
1
x / 2
x / 2
E ( X ) = xe dx = xe
+ e x / 2 dx = 2
0
2 0
0
To compute variance of X, we first compute E(X2):
1 2 x / 2
x / 2
2
E ( X ) = x e dx = x e
+ e x / 2 dx = 8
0
2 0
0
2
Hence, the variance and standard deviation of the devices life
are:
2
V (X ) = 8 2 = 4
= V (X ) = 2
Chapter 4. Statistical Models in Simulation
11
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Expectations
1 x / 2
x / 2
E ( X ) = xe dx = xe
+ e x / 2 dx = 2
0
2 0
0
1
x / 2
+ e x / 2 dx = 2
E ( X ) = xe x / 2 dx = xe
0
2 0
0
Partial Integration
u ( x)v' ( x)dx = u ( x)v( x) u ' ( x)v( x)dx
Set
u ( x) = x
v' ( x) = e x / 2
u ' ( x) = 1
v( x) = 2e x / 2
1 x/ 2
1
x/ 2
E ( X ) = xe dx = ( x (2e ) 1 (2e x / 2 )dx)
0
2 0
2
0
Chapter 4. Statistical Models in Simulation
12
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Useful Statistical Models
In this section, statistical models appropriate to some
application areas are presented.
The areas include:
Queueing systems
Inventory and supply-chain systems
Reliability and maintainability
Limited data
Chapter 4. Statistical Models in Simulation
13
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Useful models Queueing Systems
In a queueing system, interarrival and service-time patterns
can be probabilistic.
Sample statistical models for interarrival or service time
distribution:
Exponential distribution: if service times are completely random
Normal distribution: fairly constant but with some random variability
(either positive or negative)
Truncated normal distribution: similar to normal distribution but with
restricted value.
Gamma and Weibull distribution: more general than exponential
(involving location of the modes of pdfs and the shapes of tails.)
Chapter 4. Statistical Models in Simulation
14
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Useful models Inventory and supply chain
In realistic inventory and supply-chain systems, there are at
least three random variables:
The number of units demanded per order or per time period
The time between demands
The lead time = Time between placing an order and the receipt of that
order
Sample statistical models for lead time distribution:
Gamma
Sample statistical models for demand distribution:
Poisson: simple and extensively tabulated.
Negative binomial distribution: longer tail than Poisson (more large
demands).
Geometric: special case of negative binomial given at least one demand
has occurred.
Chapter 4. Statistical Models in Simulation
15
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Useful models Reliability and maintainability
Time to failure (TTF)
Exponential: failures are random
Gamma: for standby redundancy where each component has an
exponential TTF
Weibull: failure is due to the most serious of a large number of
defects in a system of components
Normal: failures are due to wear
Chapter 4. Statistical Models in Simulation
16
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Useful models Other areas
For cases with limited data, some useful distributions
are:
Uniform
Triangular
Beta
Other distribution:
Bernoulli
Binomial
Hyperexponential
Chapter 4. Statistical Models in Simulation
17
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Discrete Distributions
Discrete random variables are used to describe random
phenomena in which only integer values can occur.
In this section, we will learn about:
Bernoulli trials and Bernoulli distribution
Binomial distribution
Geometric and negative binomial distribution
Poisson distribution
Chapter 4. Statistical Models in Simulation
18
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Bernoulli Trials and Bernoulli Distribution
Bernoulli Trials:
Consider an experiment consisting of n trials, each can be a success or
a failure.
- Xj = 1 if the j-th experiment is a success
- Xj = 0 if the j-th experiment is a failure
The Bernoulli distribution (one trial):
xj =1
p,
p j ( x j ) = p( x j ) =
,
=
=
q
:
1
p
,
x
0
j
j = 1,2,..., n
where E(Xj) = p and V(Xj) = p(1-p) = pq
Bernoulli process:
The n Bernoulli trials where trails are independent:
p(x1,x2,, xn) = p1(x1)p2(x2) pn(xn)
Chapter 4. Statistical Models in Simulation
19
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Binomial Distribution
The number of successes in n Bernoulli trials, X, has a binomial
distribution.
n x n x
p q , x = 0,1,2,..., n
p( x) = x
0,
otherwise
The number of
outcomes having the
required number of
successes and
failures
Probability that
there are
x successes and
(n-x) failures
The mean, E(x) = p + p + + p = n*p
The variance, V(X) = pq + pq + + pq = n*pq
Chapter 4. Statistical Models in Simulation
20
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Geometric Distribution
Geometric distribution
The number of Bernoulli trials, X, to achieve the 1st success:
q x 1 p, x = 0,1,2,..., n
p ( x) =
otherwise
0,
E(x) = 1/p, and V(X) = q/p2
Chapter 4. Statistical Models in Simulation
21
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Negative Binomial Distribution
Negative binomial distribution
The number of Bernoulli trials, X, until the kth success
If Y is a negative binomial distribution with parameters p and k,
then:
y 1 y k k
q p , y = k , k + 1, k + 2,...
p ( x) = k 1
0,
otherwise
y 1 y k k 1
q p {p
p ( x) =
k 1
1
442443 k th success
(k-1 ) successes
E(Y) = k/p, and V(X) = kq/p2
Chapter 4. Statistical Models in Simulation
22
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Poisson Distribution
Poisson distribution describes many random processes quite
well and is mathematically quite simple.
where > 0, pdf and cdf are:
p( x) = x! e , x = 0,1,...
0,
otherwise
i e
i =0
i!
F ( x) =
E(X) = = V(X)
Chapter 4. Statistical Models in Simulation
23
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Poisson Distribution
Example: A computer repair person is beeped each
time there is a call for service. The number of beeps per
hour ~ Poisson( = 2 per hour).
The probability of three beeps in the next hour:
p(3)
= 23/3! e-2 = 0.18
also,
p(3)
= F(3) F(2) = 0.857-0.677=0.18
The probability of two or more beeps in a 1-hour period:
p(2 or more)
= 1 ( p(0) + p(1) )
= 1 F(1)
= 0.594
Chapter 4. Statistical Models in Simulation
24
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Continuous Distributions
Continuous random variables can be used to describe
random phenomena in which the variable can take on any
value in some interval.
In this section, the distributions studied are:
Uniform
Exponential
Weibull
Normal
Lognormal
Chapter 4. Statistical Models in Simulation
25
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Uniform Distribution
A random variable X is uniformly distributed on the interval
(a, b), U(a, b), if its pdf and cdf are:
x<a
0,
x a
F ( x) =
, a x<b
b a
xb
1,
, a xb
f ( x) = b a
0,
otherwise
Properties
P(x1 < X < x2) is proportional to the length of the interval
[F(x2) F(x1) = (x2-x1)/(b-a)]
E(X) = (a+b)/2
V(X) = (b-a)2/12
U(0,1) provides the means to
generate random numbers, from
which random variates can be
generated.
Chapter 4. Statistical Models in Simulation
26
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Exponential Distribution
A random variable X is exponentially distributed with parameter
> 0 if its pdf and cdf are:
x<0
0,
F ( x ) = x t
x
0 e dt = 1 e , x 0
e x , x 0
f ( x) =
elsewhere
0,
E(X) = 1/ V(X) = 1/2
Chapter 4. Statistical Models in Simulation
27
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Exponential Distribution
Used to model interarrival
times when arrivals are
completely random, and to
model service times that are
highly variable
For several different
exponential pdfs (see figure),
the value of intercept on the
vertical axis is , and all pdfs
eventually intersect.
Chapter 4. Statistical Models in Simulation
28
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Exponential Distribution
Memoryless property
For all s and t greater or equal to 0:
P(X > s+t | X > s) = P(X > t)
Example: A lamp ~ exp( = 1/3 per hour), hence, on average, 1
failure per 3 hours.
- The probability that the lamp lasts longer than its mean life is:
P(X > 3) = 1-(1-e-3/3) = e-1 = 0.368
- The probability that the lamp lasts between 2 to 3 hours is:
P(2 <= X <= 3) = F(3) F(2) = 0.145
- The probability that it lasts for another hour given it is operating for
2.5 hours:
P(X > 3.5 | X > 2.5) = P(X > 1) = e-1/3 = 0.717
Chapter 4. Statistical Models in Simulation
29
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Exponential Distribution
Memoryless property
P( X > s + t )
P( X > s + t | X > s) =
P( X > s)
e ( s +t )
= s
e
= e t
= P( X > t )
Chapter 4. Statistical Models in Simulation
30
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Weibull Distribution
A random variable X has a Weibull distribution if its pdf has the form:
x 1
x
exp
, x
f ( x) =
0,
otherwise
3 parameters:
( < < )
Location parameter: ,
Scale parameter: , ( > 0)
Shape parameter. , (> 0)
Example: = 0 and = 1:
Chapter 4. Statistical Models in Simulation
31
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Weibull Distribution
Weibull Distribution
x 1
x
exp
, x
f ( x) =
0,
otherwise
For = 1, =0
1
x
1
f ( x) = exp , x
0,
otherwise
When = 1,
X ~ exp( = 1/)
Chapter 4. Statistical Models in Simulation
32
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Normal Distribution
A random variable X is normally distributed if it has the pdf:
1 x 2
1
f ( x) =
exp
, < x <
2
2
Mean: < <
2
Variance: > 0
Denoted as X ~ N(,2)
Special properties:
f ( x) = 0, and lim f ( x) = 0
xlim
x
f(-x)=f(+x); the pdf is symmetric about .
The maximum value of the pdf occurs at x = ; the mean and mode are
equal.
Chapter 4. Statistical Models in Simulation
33
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Normal Distribution
Evaluating the distribution:
Use numerical methods (no closed form)
Independent of and , using the standard normal distribution:
Z ~ N(0,1)
Transformation of variables: let Z = (X - ) / ,
F ( x ) = P ( X x ) = P Z
( x ) /
1 z2 / 2
=
e
dz
2
=
( x ) /
Chapter 4. Statistical Models in Simulation
( z )dz = ( x )
34
, where ( z ) =
1 t 2 / 2
e
dt
2
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Normal Distribution
Example: The time required to load an oceangoing vessel, X, is
distributed as N(12,4), =12, =2
The probability that the vessel is loaded in less than 10 hours:
10 12
F (10) =
= (1) = 0.1587
2
- Using the symmetry property, (1) is the complement of (-1)
Chapter 4. Statistical Models in Simulation
35
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Lognormal Distribution
A random variable X has a lognormal distribution if its pdf has
the form:
1
(ln x ) 2
exp
, x > 0
2
f ( x) = 2 x
2
0,
otherwise
=1,
2=0.5,1,2.
Mean E(X) = e+ /2
2
2
Variance V(X) = e2+ /2 (e - 1)
2
Relationship with normal distribution
When Y ~ N(, 2), then X = eY ~ lognormal(, 2)
Parameters and 2 are not the mean and variance of the lognormal
random variable X
Chapter 4. Statistical Models in Simulation
36
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Poisson Distribution
Definition: N(t) is a counting function that represents the
number of events occurred in [0,t].
A counting process {N(t), t>=0} is a Poisson process with
mean rate if:
Arrivals occur one at a time
{N(t), t>=0} has stationary increments
{N(t), t>=0} has independent increments
Properties
( t ) n t
P[ N (t ) = n] =
e ,
n!
for t 0 and n = 0,1,2,...
Equal mean and variance: E[N(t)] = V[N(t)] = t
Stationary increment: The number of arrivals in time s to t is also
Poisson-distributed with mean (t-s)
Chapter 4. Statistical Models in Simulation
37
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Poisson Distribution Interarrival Times
Consider the interarrival times of a Possion process (A1, A2, ), where
Ai is the elapsed time between arrival i and arrival i+1
The 1st arrival occurs after time t iff there are no arrivals in the interval [0,t],
hence:
P(A1 > t) = P(N(t) = 0) = e-t
P(A1 <= t) = 1 e-t
[cdf of exp()]
Interarrival times, A1, A2, , are exponentially distributed and independent with
mean 1/
Arrival counts
~ Poisson()
Interarrival time
~ Exp(1/)
Stationary & Independent
Chapter 4. Statistical Models in Simulation
Memoryless
38
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Poisson Distribution Splitting and Pooling
Splitting:
Suppose each event of a Poisson process can be classified as Type I,
with probability p and Type II, with probability 1-p.
N(t) = N1(t) + N2(t), where N1(t) and N2(t) are both Poisson processes
with rates p and (1-p)
N(t) ~ Poisson()
p
(1-p)
N1(t) ~ Poisson[p]
N2(t) ~ Poisson[(1-p)]
Pooling:
Suppose two Poisson processes are pooled together
N1(t) + N2(t) = N(t), where N(t) is a Poisson processes with rates 1 + 2
N1(t) ~ Poisson[1]
N2(t) ~ Poisson[2]
Chapter 4. Statistical Models in Simulation
1 + 2
N(t) ~ Poisson(1 + 2)
2
39
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Poisson Distribution Empirical Distributions
A distribution whose parameters are the observed values in a
sample of data.
May be used when it is impossible or unnecessary to establish that a
random variable has any particular parametric distribution.
Advantage: no assumption beyond the observed values in the sample.
Disadvantage: sample might not cover the entire range of possible values.
Chapter 4. Statistical Models in Simulation
40
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Summary
The world that the simulation analyst sees is probabilistic, not
deterministic.
In this chapter:
Reviewed several important probability distributions.
Showed applications of the probability distributions in a simulation context.
Important task in simulation modeling is the collection and
analysis of input data, e.g., hypothesize a distributional form for
the input data.
Student should know:
Difference between discrete, continuous, and empirical distributions.
Poisson process and its properties.
Chapter 4. Statistical Models in Simulation
41
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Simulation Techniques
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Chapter 7
Queueing Models
Computer Science, Informatik 4
Communication and Distributed Systems
Purpose
Simulation is often used in the analysis of queueing models.
A simple but typical queueing model
Calling population
Waiting line
Server
Queueing models provide the analyst with a powerful tool for
designing and evaluating the performance of queueing systems.
Typical measures of system performance
Server utilization, length of waiting lines, and delays of customers
For relatively simple systems, compute mathematically
For realistic models of complex systems, simulation is usually required
Chapter 7. Queueing Models
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Outline
Discuss some well-known models
Not development of queueing theory, for this see other class!
We will deal with
General characteristics of queues
Meanings and relationships of important performance measures
Estimation of mean measures of performance
Effect of varying input parameters
Mathematical solutions of some basic queueing models
Chapter 7. Queueing Models
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Characteristics of Queueing Systems
Key elements of queueing systems
Customer: refers to anything that arrives at a facility and requires
service, e.g., people, machines, trucks, emails.
Server: refers to any resource that provides the requested service, e.g.,
repairpersons, retrieval machines, runways at airport.
System
Customers
Server
Reception desk
People
Receptionist
Hospital
Patients
Nurses
Airport
Airplanes
Runway
Production line
Cases
Case-packer
Road network
Cars
Traffic light
Grocery
Shoppers
Checkout station
Computer
Jobs
CPU, disk, CD
Network
Packets
Router
Chapter 7. Queueing Models
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Calling Population
Calling population: the population of potential customers, may be
assumed to be finite or infinite.
Finite population model: if arrival rate depends on the number of
customers being served and waiting, e.g., model of one corporate jet, if it
is being repaired, the repair arrival rate becomes zero.
n-1
Infinite population model: if arrival rate is not affected by the number of
customers being served and waiting, e.g., systems with large population
of potential customers.
Chapter 7. Queueing Models
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
System Capacity
System Capacity: a limit on the number of customers that may
be in the waiting line or system.
Limited capacity, e.g., an automatic car wash only has room for 10 cars
to wait in line to enter the mechanism.
Waiting line
Server
Unlimited capacity, e.g., concert ticket sales with no limit on the number
of people allowed to wait to purchase tickets.
Waiting line
Chapter 7. Queueing Models
Server
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Arrival Process
For infinite-population models:
In terms of interarrival times of successive customers.
Random arrivals: interarrival times usually characterized by a probability
distribution.
- Most important model: Poisson arrival process (with rate ), where An
represents the interarrival time between customer n-1 and customer n, and is
exponentially distributed (with mean 1/).
Scheduled arrivals: interarrival times can be constant or constant plus or
minus a small random amount to represent early or late arrivals.
- Example: patients to a physician or scheduled airline flight arrivals to an
airport
At least one customer is assumed to always be present, so the server is
never idle, e.g., sufficient raw material for a machine.
Chapter 7. Queueing Models
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Arrival Process
For finite-population models:
Customer is pending when the customer is outside the queueing system,
e.g., machine-repair problem: a machine is pending when it is
operating, it becomes not pending the instant it demands service from
the repairman.
Runtime of a customer is the length of time from departure from the
queueing system until that customers next arrival to the queue, e.g.,
machine-repair problem, machines are customers and a runtime is time
to failure (TTF).
Let A1(i), A2(i), be the successive runtimes of customer i, and S1(i), S2(i)
be the corresponding successive system times:
Chapter 7. Queueing Models
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Queue Behavior and Queue Discipline
Queue behavior: the actions of customers while in a queue
waiting for service to begin, for example:
Balk: leave when they see that the line is too long
Renege: leave after being in the line when its moving too slowly
Jockey: move from one line to a shorter line
Queue discipline: the logical ordering of customers in a queue
that determines which customer is chosen for service when a
server becomes free, for example:
First-in-first-out (FIFO)
Last-in-first-out (LIFO)
Service in random order (SIRO)
Shortest processing time first (SPT)
Service according to priority (PR)
Chapter 7. Queueing Models
10
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Service Times and Service Mechanism
Service times of successive arrivals are denoted by S1, S2, S3.
May be constant or random.
{S1, S2, S3, } is usually characterized as a sequence of independent and
identically distributed random variables, e.g., exponential, Weibull,
gamma, lognormal, and truncated normal distribution.
A queueing system consists of a number of service centers and
interconnected queues.
Each service center consists of some number of servers, c, working in
parallel, upon getting to the head of the line, a customer takes the 1st
available server.
Chapter 7. Queueing Models
11
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Service Times and Service Mechanism
Example: consider a discount warehouse where customers may:
Serve themselves before paying at the cashier
Chapter 7. Queueing Models
12
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Service Times and Service Mechanism
Wait for one of the three clerks:
Batch service (a server serving several customers simultaneously), or
customer requires several servers simultaneously.
Chapter 7. Queueing Models
13
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Service Times and Service Mechanism
Chapter 7. Queueing Models
14
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Example
Candy production line
Three machines separated by buffers
Buffers have capacity of 1000 candies
Assumption:Allways
sufficient supply of
raw material.
Chapter 7. Queueing Models
15
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Queueing Notation Kendall Notation
A notation system for parallel server queues: A/B/c/N/K
represents the interarrival-time distribution
represents the service-time distribution
represents the number of parallel servers
represents the system capacity
represents the size of the calling population
are usually dropped, if they are infinity
Common symbols for A and B
A
B
c
N
K
N, K
M
D
Ek
H
G
Markov, exponential distribution
Constant, deterministic
Erlang distribution of order k
Hyperexponential distribution
General, arbitrary
Examples
M/M/1// same as M/M/1: Single-server with unlimited capacity and callpopulation. Interarrival and service times are exponentially distributed
G/G/1/5/5: Single-server with capacity 5 and call-population 5.
Chapter 7. Queueing Models
16
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Queueing Notation
Primary performance measures of queueing systems:
Pn
Pn(t)
An
Sn
Wn
WnQ
L(t)
LQ(t)
L
LQ
w
wQ
Chapter 7. Queueing Models
steady-state probability of having n customers in system
probability of n customers in system at time t
arrival rate
effective arrival rate
service rate of one server
server utilization
interarrival time between customers n-1 and n
service time of the n-th arriving customer
total time spent in system by the n-th arriving customer
total time spent in the waiting line by customer n
the number of customers in system at time t
the number of customers in queue at time t
long-run time-average number of customers in system
long-run time-average number of customers in queue
long-run average time spent in system per customer
long-run average time spent in queue per customer
17
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Evolving of a Queueing System
Number of
customers in the
system
Time
Chapter 7. Queueing Models
18
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Time-Average Number in System L
Consider a queueing system over a period of time T
Let Ti denote the total time during [0,T] in which the system contained
exactly i customers, the time-weighted-average number in a system is
defined by:
1
T
L =
iTi =
i i
T i =0
T
i =0
Consider the total area under the function is L(t), then,
1
L =
T
1
iTi =
T
i =0
L(t )dt
0
The long-run time-average number of customers in system, with
probability 1:
L=
T
Chapter 7. Queueing Models
L(t )dt T
L
19
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Time-Average Number in System L
The time-weighted-average number in queue is:
1
1
Q
LQ = iTi =
T i =0
T
LQ (t )dt T
LQ
G/G/1/N/K example: consider the results from the queueing system (N> 4,
K > 3).
L = [0(3) + 1(12) + 2(4) + 3(1)] / 20
= 23 / 20 = 1.15 cusomters
if L(t) = 0
0,
LQ (t ) =
L(t ) 1, if L(t) 1
0(15) + 1(4) + 2(1)
LQ =
= 0.3 customers
20
Chapter 7. Queueing Models
20
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Average Time Spent in System Per Customer w
The average time spent in system per customer, called the average
system time, is:
w=
N
W
i =1
where W1, W2, , WN are the individual times that each of the N
customers spend in the system during [0,T].
w as N
For stable systems: w
If the system under consideration is the queue alone:
1 N Q
w Q = Wi wQ
N i =1
as
G/G/1/N/K example (cont.): the average system time is
w =
W1 + W2 + ... + W5 2 + (8 3) + ... + (20 16)
=
= 4.6 time units
5
5
Chapter 7. Queueing Models
21
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
The Conservation Equation Littles Law
Conservation equation (a.k.a. Littles law)
Average # in
system
L = w
Average
System time
Arrival rate
L = w as T and N
Holds for almost all queueing systems or subsystems (regardless of the
number of servers, the queue discipline, or other special circumstances).
G/G/1/N/K example (cont.): On average, one arrival every 4 time units
and each arrival spends 4.6 time units in the system. Hence, at an
arbitrary point in time, there is (1/4)(4.6) = 1.15 customers present on
average.
Chapter 7. Queueing Models
22
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Server Utilization
Definition: the proportion of time that a server is busy.
Observed server utilization, , is defined over a specified time interval
[0,T].
Long-run server utilization is .
For systems with long-run stability: as T
Chapter 7. Queueing Models
23
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Server Utilization
For G/G/1// queues:
Any single-server queueing system with average arrival rate
customers per time unit, where average service time E(S) = 1/
time units, infinite queue capacity and calling population.
Conservation equation, L = w, can be applied.
For a stable system, the average arrival rate to the server, s,
must be identical to .
The average number of customers in the server is:
Ls =
T
Chapter 7. Queueing Models
(
T
T T0
L(t ) LQ (t ) dt =
T
24
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Server Utilization
In general, for a single-server queue:
Ls = T
Ls =
and
= E (s) =
- For a single-server stable queue:
<1
- For an unstable queue ( > ), long-run server utilization is 1.
Chapter 7. Queueing Models
25
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Server Utilization
For G/G/c// queues:
A system with c identical servers in parallel.
If an arriving customer finds more than one server idle, the
customer chooses a server without favoring any particular server.
For systems in statistical equilibrium, the average number of busy
servers, Ls, is: Ls, = E(s) = /.
The long-run average server utilization is:
Chapter 7. Queueing Models
Ls
=
, where < c for stable systems
c c
26
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Server Utilization and System Performance
System performance varies widely for a given utilization .
For example, a D/D/1 queue where E(A) = 1/ and E(S) = 1/,
where:
L = = /, w = E(S) = 1/, LQ = WQ = 0.
- By varying and , server utilization can assume any value between 0
and 1.
- Yet there is never any line.
In general, variability of interarrival and service times causes lines
to fluctuate in length.
Chapter 7. Queueing Models
27
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Server Utilization and System Performance
Consider the system is simulated
with service times: S1= 9, S2=12,
Example: A physician who
schedules patients every 10
minutes and spends Si minutes
with the i-th patient:
S3 = 9, S4 = 9, S5 = 9, .
The system becomes:
9 minutes with probability 0.9
Si =
12 minutes with probability 0.1
Arrivals are deterministic,
A1 = A2 = = -1 = 10.
Services are stochastic
- E(Si) = 9.3 min
- V(S0) = 0.81 min2
- = 0.9 min
The occurrence of a relatively
long service time (S2 = 12)
causes a waiting line to form
temporarily.
On average, the physician's
utilization = = / = 0.93 < 1.
Chapter 7. Queueing Models
28
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Costs in Queueing Problems
Costs can be associated with various aspects of the waiting line or
servers:
System incurs a cost for each customer in the queue, say at a rate of $10
per hour per customer.
WjQ is the time
- The average cost per customer is:
customer j spends
N
$10 W jQ
j =1
in queue
= $10 w Q
- If customers per hour arrive (on average), the average cost per
hour is:
L
$
10
customer $10 w Q
Q
= $10 w Q =
hour customer
hour
Server may also impose costs on the system, if a group of c parallel
servers (1 c ) have utilization r, each server imposes a cost of $5 per
hour while busy.
- The total server cost is: $5 c
Chapter 7. Queueing Models
29
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Steady-State Behavior of Markovian Models
Markovian models:
Exponential-distributed arrival process (mean arrival rate = 1/).
Service times may be exponentially (M) or arbitrary (G) distributed.
Queue discipline is FIFO.
A queueing system is in statistical equilibrium if the probability that the
system is in a given state is not time dependent:
P ( L(t ) = n) = Pn (t ) = Pn
Mathematical models in this chapter can be used to obtain approximate
results even when the model assumptions do not strictly hold, as a
rough guide.
Simulation can be used for more refined analysis, more faithful
representation for complex systems.
Chapter 7. Queueing Models
30
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Steady-State Behavior of Markovian Models
Properties of processes with statistical equilibrium
The state of statistical equilibrium is reached from any starting
state.
The process remain in statistical equilibrium once it has reached
it.
Chapter 7. Queueing Models
31
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Steady-State Behavior of Markovian Models
For the simple model studied in this chapter, the steady-state
parameter, L, the time-average number of customers in the
system is:
L=
nP
n =0
Apply Littles equation, L= w, to the whole system and to the
queue alone:
w=
wQ = w
LQ = wQ
G/G/c// example: to have a statistical equilibrium, a
necessary and sufficient condition is:
=
<1
c
Chapter 7. Queueing Models
32
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
M/G/1 Queues
Single-server queues with Poisson arrivals and unlimited capacity.
Suppose service times have mean 1/ and variance 2 and = / < 1, the
steady-state parameters of M/G/1 queue:
P0 = 1
2 (1 + 2 2 )
L=+
2(1 )
The particular
distribution is not
known!
2 (1 + 2 2 )
LQ =
2(1 )
(1 / 2 + 2 )
w= +
2(1 )
1
(1 / 2 + 2 )
wQ =
2(1 )
Chapter 7. Queueing Models
33
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
M/G/1 Queues
There are no simple expression for the steady-state probabilities P0,
P1,
L LQ = is the time-average number of customers being served.
Average length of queue, LQ, can be rewritten as:
2 2
+
LQ =
2(1 ) 2(1 )
If and are held constant, LQ depends on the variability, 2, of the
service times.
Chapter 7. Queueing Models
34
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
M/G/1 Queues
Example: Two workers competing for a job, Able claims to be faster than
Baker on average, but Baker claims to be more consistent,
Poisson arrivals at rate = 2 per hour (1/30 per minute).
Able: 1/ = 24 minutes and 2 = 202 = 400 minutes2:
(1 / 30) 2 [24 2 + 400]
LQ =
= 2.711 customers
2(1 4 / 5)
- The proportion of arrivals who find Able idle and thus experience no delay is P0 = 1- = 1/5
= 20%.
Baker: 1/ = 25 minutes and 2 = 22 = 4 minutes2:
(1 / 30) 2 [252 + 4]
LQ =
= 2.097 customers
2(1 5 / 6)
- The proportion of arrivals who find Baker idle and thus experience no delay is P0 = 1- =
1/6 = 16.7%.
Although working faster on average, Ables greater service variability results in an
average queue length about 30% greater than Bakers.
Chapter 7. Queueing Models
35
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
M/M/1 Queues
Suppose the service times in an M/G/1 queue are exponentially
distributed with mean 1/, then the variance is 2 = 1/2.
M/M/1 queue is a useful approximate model when service times
have standard deviation approximately equal to their means.
The steady-state parameters
=
Pn = (1 ) n
L=
P0 = 1
2
2
=
LQ =
( ) 1
w=
wQ =
Chapter 7. Queueing Models
1
1
=
(1 )
=
( ) (1 )
36
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
M/M/1 Queues
Single-chair unisex hair-styling shop
Interarrival and service times are exponentially distributed
=2 customers/hour and =3 customers/hour
=
2
=
3
L=
1
3
P0 = 1 =
1
n =0
Chapter 7. Queueing Models
2
= 2 Customers
3 2
4
4
2
LQ =
=
= Customers
( ) 3(3 2) 3
4 2
L = LQ + = + = 2 Customers
3 3
1 2
4
P2 = =
3 3
27
P 4 = 1 Pn =
2
= 1 hour
2
1
1 2
wQ = w = 1 = hour
3 3
w=
1 2 2
P1 = =
3 3 9
16
81
37
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
M/M/1 Queues
Example: M/M/1 queue with
service rate =10 customers
per hour.
Consider how L and w
increase as arrival rate, ,
increases from 5 to 8.64 by
increments of 20%
5,0
6,0
7,2
8,6
10,0
0,5
1,0
0,6
1,5
0,7
2,6
0,9
6,4
1,0
0,2
0,3
0,4
0,7
20
L
18 w
If / 1, waiting lines tend to
continually grow in length
Number of Customers
16
Increase in average system
time (w) and average number
in system (L) is highly
nonlinear as a function of .
14
12
10
8
6
4
2
0
0.5
0.6
0.7
0.8
0.9
rho
Chapter 7. Queueing Models
38
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Effect of Utilization and Service Variability
For almost all queues, if lines are too long, they can be reduced by
decreasing server utilization () or by decreasing the service time
variability (2).
A measure of the variability of a distribution,
coefficient of variation (cv):
(cv) 2 =
V (X )
[E ( X )]2
The larger cv is, the more variable is the distribution relative to its
expected value
For exponential service times with rate
- E(X)=1/
- V(X)=1/2
cv=1
Chapter 7. Queueing Models
39
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Effect of Utilization and Service Variability
Consider LQ for any M/G/1 queue:
2 (1 + 2 2 )
LQ =
2(1 )
2 1 + (cv) 2
1
2
LQ for M/M/1
queue
Corrects the M/M/1
formula to account
for a non-exponential
service time distn
Chapter 7. Queueing Models
40
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Multiserver Queue
M/M/c// queue: c servers operating in parallel
Arrival process is poisson with rate
Each server has an independent and identical exponential service-time
distribution, with mean 1/.
To achieve statistical equilibrium, the offered load (/) must satisfy /<c,
where /(c) = is the server utilization.
Calling population
1
Waiting line
c
Chapter 7. Queueing Models
41
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Multiserver Queue
The steady-state parameters
=
c 1 ( / ) n c 1 c
P0 =
+
n =0 n! c! c
(c ) c P0
P (L ( ) c ) =
c!(1 )
Probability
that all servers
are busy
(c ) c +1 P0
P (L ( ) c )
c
L = c +
=
+
1
c(c!)(1 ) 2
L
w=
P (L ( ) c )
LQ =
1
L LQ = c
Chapter 7. Queueing Models
42
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Multiserver Queue
Probability of empty system
Chapter 7. Queueing Models
Number of customers in system
43
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Multiserver Queue
Other common multiserver queueing models
2 1 + (cv) 2
LQ =
2
1
Corrects the M/M/1
formula
LQ for M/M/1
queue
M/G/c/: general service times and c parallel server. The parameters can
be approximated from those of the M/M/c// model.
M/G/: general service times and infinite number of servers.
M/M/c/N/: service times are exponentially distributed at rate and c
servers where the total system capacity is N c customer. When an arrival
occurs and the system is full, that arrival is turned away.
Chapter 7. Queueing Models
44
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Multiserver Queue
M/G/: general service times and infinite number of servers
- customer is its own server
- service capacity far exceeds service demand
- when we want to know how many servers are required so that
customers are rarely delayed
Pn = e
P0 = e
w=
()
n!
, n = 0,1, K
wQ = 0
L=
LQ = 0
Chapter 7. Queueing Models
45
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Multiserver Queue
How many users can be logged in simultaneously in a computer
system
Customers log on with rate =500 per hour
Stay connected in average for 1/=180 minutes = 3 hours
For planning purposes it is pretended that the simultaneous logged in
users is infinite
Expected number of simultaneous users L
L = = 500 3 = 1500
To ensure providing adequate capacity 95% of the time, the number of
parallel users c has to be restricted
e 1500 (1500) n
0.95
P ( L() c) = Pn =
n!
n =0
n =0
c
The capacity c=1564 simultaneous users satisfies this requirement
Chapter 7. Queueing Models
46
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Multiserver Queue
M/M/c/N/: service times are exponentially distributed at rate and
c servers where the total system capacity is N c customer
when an arrival occurs and the system is full, that arrival is turned away
Effective arrival rate e is defined as the mean number of arrivals per
time unit who enter and remain in the system
1
P0
PN
LQ
a n a c N n c
= 1 + +
!
!
n
c
1
1
=
=
+
n
n
c
aN
=
P0
c!c N c
P0 a c
1 N c ( N c) N c (1 )
=
c!(1 )
= (1 PN )
wQ =
LQ
w = wQ +
(1-PN) probability that a
customer will find a space
and be able to enter the
system
L = e w
Chapter 7. Queueing Models
47
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Multiserver Queue
Single-chair unisex hair-styling shop (again!)
Space only for 3 customers:
one in service and two waiting
First computer P0
P0 =
1
2 2 3 2 n 1
1 + +
3 3 n = 2 3
Queue time
wQ =
= 0.415
PN = P3 =
()
1!1
P0 =
w = wQ +
8
= 0.123
65
28
= 0.246
114
66
= 0.579
114
Expected number of
customers in shop
Average of the queue
LQ = 0.431
L = e w =
Effective arrival rate
66
= 1.015
65
Probability of busy shop
1 P0 = e = 0.585
8 114
e = 21 =
= 1.754
65 65
Chapter 7. Queueing Models
System time, time in shop
P(system is full)
2 3
3
2
LQ
48
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Steady-State Behavior of Finite-Population Models
In practical problems calling population is finite
When the calling population is small, the presence of one or more customers in
the system has a strong effect on the distribution of future arrivals.
Consider a finite-calling population model with K customers (M/M/c/K/K)
The time between the end of one service visit and the next call for service is
exponentially distributed with mean = 1/.
Service times are also exponentially distributed with mean 1/.
c parallel servers and system capacity is K.
K Customers
1
Waiting line
c
Chapter 7. Queueing Models
49
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Steady-State Behavior of Finite-Population Models
Some of the steady-state probabilities:
n
c 1 K n K
K!
P0 = +
n c
n =0 n n =c ( K n)!c!c
K n
n = 0,1,..., c 1
P0 ,
n
Pn =
n
!
K
( K n)!c!c n c , n = c, c + 1,...K
L = nPn ,
n =0
w = L / e ,
e
c
where e is the long run effective arrival rate of customers to queue (or entering/exiting service)
K
e = ( K n)Pn
n =0
Chapter 7. Queueing Models
50
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Steady-State Behavior of Finite-Population Models
Example: two workers who are responsible for 10 milling
machines.
Machines run on the average for 20 minutes, then require an
average 5-minute service period, both times exponentially
distributed: = 1/20 and = 1/5.
All of the performance measures depend on P0:
n
21 10 5 n 10
10!
5
P0 = +
n2
n
20
n
(
10
)!
2
!
2
20
n =0
n=2
= 0.065
Then, we can obtain the other Pn, and can compute the
expected number of machines in system:
10
L=
nP
= 3.17 machines
n =0
The average number of running machines:
K L = 10 3.17 = 6.83 machines
Chapter 7. Queueing Models
51
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Networks of Queues
Many systems are naturally modeled as networks of single queues
customers departing from one queue may be routed to another
The following results assume a stable system with infinite calling
population and no limit on system capacity:
Provided that no customers are created or destroyed in the queue, then
the departure rate out of a queue is the same as the arrival rate into the
queue, over the long run.
If customers arrive to queue i at rate i, and a fraction 0 pij 1 of them
are routed to queue j upon departure, then the arrival rate from queue i
to queue j is i pij over the long run.
Chapter 7. Queueing Models
52
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Networks of Queues
The overall arrival rate into queue j:
j = aj +
p
i
ij
all i
Arrival rate
from outside
the network
Sum of arrival rates
from other queues
in network
If queue j has cj < parallel servers, each working at rate j, then the
long-run utilization of each server is j=j /(cj) (where j < 1 for stable
queue).
If arrivals from outside the network form a Poisson process with rate aj
for each queue j, and if there are cj identical servers delivering
exponentially distributed service times with mean 1/j, then, in steady
state, queue j behaves likes an M/M/cj queue with arrival rate
j = aj +
p
i
ij
all i
Chapter 7. Queueing Models
53
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Network of Queues
Customer
Population
80 cust
hour
0.4
c=
c=1
0.6
Discount store example:
Suppose customers arrive at the rate 80 per hour and
40% choose self-service.
Hence:
- Arrival rate to service center 1 is 1 = 80(0.4) = 32 per hour
- Arrival rate to service center 2 is 2 = 80(0.6) = 48 per hour.
c2 = 3 clerks and 2 = 20 customers per hour.
The long-run utilization of the clerks is:
2 = 48/(3*20) = 0.8
All customers must see the cashier at service center 3,
the overall rate to service center 3 is 3 = 1 + 2 = 80
per hour.
- If 3 = 90 per hour, then the utilization of the cashier is:
3 = 80/90 = 0.89
Chapter 7. Queueing Models
54
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Summary
Introduced basic concepts of queueing models.
Show how simulation, and some times mathematical analysis, can be used
to estimate the performance measures of a system.
Commonly used performance measures: L, LQ, w, wQ, , and e.
When simulating any system that evolves over time, analyst must decide
whether to study transient behavior or steady-state behavior.
Simple formulas exist for the steady-state behavior of some queues.
Simple models can be solved mathematically, and can be useful in providing
a rough estimate of a performance measure.
Chapter 7. Queueing Models
55
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Simulation Techniques
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Chapter 7
Queueing Models
Computer Science, Informatik 4
Communication and Distributed Systems
Purpose
Simulation is often used in the analysis of queueing models.
A simple but typical queueing model
Calling population
Waiting line
Server
Queueing models provide the analyst with a powerful tool for
designing and evaluating the performance of queueing systems.
Typical measures of system performance
Server utilization, length of waiting lines, and delays of customers
For relatively simple systems, compute mathematically
For realistic models of complex systems, simulation is usually required
Chapter 7. Queueing Models
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Outline
Discuss some well-known models
Not development of queueing theory, for this see other class!
We will deal with
General characteristics of queues
Meanings and relationships of important performance measures
Estimation of mean measures of performance
Effect of varying input parameters
Mathematical solutions of some basic queueing models
Chapter 7. Queueing Models
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Characteristics of Queueing Systems
Key elements of queueing systems
Customer: refers to anything that arrives at a facility and requires
service, e.g., people, machines, trucks, emails.
Server: refers to any resource that provides the requested service, e.g.,
repairpersons, retrieval machines, runways at airport.
System
Customers
Server
Reception desk
People
Receptionist
Hospital
Patients
Nurses
Airport
Airplanes
Runway
Production line
Cases
Case-packer
Road network
Cars
Traffic light
Grocery
Shoppers
Checkout station
Computer
Jobs
CPU, disk, CD
Network
Packets
Router
Chapter 7. Queueing Models
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Calling Population
Calling population: the population of potential customers, may be
assumed to be finite or infinite.
Finite population model: if arrival rate depends on the number of
customers being served and waiting, e.g., model of one corporate jet, if it
is being repaired, the repair arrival rate becomes zero.
n-1
Infinite population model: if arrival rate is not affected by the number of
customers being served and waiting, e.g., systems with large population
of potential customers.
Chapter 7. Queueing Models
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
System Capacity
System Capacity: a limit on the number of customers that may
be in the waiting line or system.
Limited capacity, e.g., an automatic car wash only has room for 10 cars
to wait in line to enter the mechanism.
Waiting line
Server
Unlimited capacity, e.g., concert ticket sales with no limit on the number
of people allowed to wait to purchase tickets.
Waiting line
Chapter 7. Queueing Models
Server
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Arrival Process
For infinite-population models:
In terms of interarrival times of successive customers.
Random arrivals: interarrival times usually characterized by a probability
distribution.
- Most important model: Poisson arrival process (with rate ), where An
represents the interarrival time between customer n-1 and customer n, and is
exponentially distributed (with mean 1/).
Scheduled arrivals: interarrival times can be constant or constant plus or
minus a small random amount to represent early or late arrivals.
- Example: patients to a physician or scheduled airline flight arrivals to an
airport
At least one customer is assumed to always be present, so the server is
never idle, e.g., sufficient raw material for a machine.
Chapter 7. Queueing Models
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Arrival Process
For finite-population models:
Customer is pending when the customer is outside the queueing system,
e.g., machine-repair problem: a machine is pending when it is
operating, it becomes not pending the instant it demands service from
the repairman.
Runtime of a customer is the length of time from departure from the
queueing system until that customers next arrival to the queue, e.g.,
machine-repair problem, machines are customers and a runtime is time
to failure (TTF).
Let A1(i), A2(i), be the successive runtimes of customer i, and S1(i), S2(i)
be the corresponding successive system times:
Chapter 7. Queueing Models
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Queue Behavior and Queue Discipline
Queue behavior: the actions of customers while in a queue
waiting for service to begin, for example:
Balk: leave when they see that the line is too long
Renege: leave after being in the line when its moving too slowly
Jockey: move from one line to a shorter line
Queue discipline: the logical ordering of customers in a queue
that determines which customer is chosen for service when a
server becomes free, for example:
First-in-first-out (FIFO)
Last-in-first-out (LIFO)
Service in random order (SIRO)
Shortest processing time first (SPT)
Service according to priority (PR)
Chapter 7. Queueing Models
10
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Service Times and Service Mechanism
Service times of successive arrivals are denoted by S1, S2, S3.
May be constant or random.
{S1, S2, S3, } is usually characterized as a sequence of independent and
identically distributed random variables, e.g., exponential, Weibull,
gamma, lognormal, and truncated normal distribution.
A queueing system consists of a number of service centers and
interconnected queues.
Each service center consists of some number of servers, c, working in
parallel, upon getting to the head of the line, a customer takes the 1st
available server.
Chapter 7. Queueing Models
11
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Service Times and Service Mechanism
Example: consider a discount warehouse where customers may:
Serve themselves before paying at the cashier
Chapter 7. Queueing Models
12
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Service Times and Service Mechanism
Wait for one of the three clerks:
Batch service (a server serving several customers simultaneously), or
customer requires several servers simultaneously.
Chapter 7. Queueing Models
13
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Service Times and Service Mechanism
Chapter 7. Queueing Models
14
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Example
Candy production line
Three machines separated by buffers
Buffers have capacity of 1000 candies
Assumption:Allways
sufficient supply of
raw material.
Chapter 7. Queueing Models
15
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Queueing Notation Kendall Notation
A notation system for parallel server queues: A/B/c/N/K
represents the interarrival-time distribution
represents the service-time distribution
represents the number of parallel servers
represents the system capacity
represents the size of the calling population
are usually dropped, if they are infinity
Common symbols for A and B
A
B
c
N
K
N, K
M
D
Ek
H
G
Markov, exponential distribution
Constant, deterministic
Erlang distribution of order k
Hyperexponential distribution
General, arbitrary
Examples
M/M/1// same as M/M/1: Single-server with unlimited capacity and callpopulation. Interarrival and service times are exponentially distributed
G/G/1/5/5: Single-server with capacity 5 and call-population 5.
Chapter 7. Queueing Models
16
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Queueing Notation
Primary performance measures of queueing systems:
Pn
Pn(t)
An
Sn
Wn
WnQ
L(t)
LQ(t)
L
LQ
w
wQ
Chapter 7. Queueing Models
steady-state probability of having n customers in system
probability of n customers in system at time t
arrival rate
effective arrival rate
service rate of one server
server utilization
interarrival time between customers n-1 and n
service time of the n-th arriving customer
total time spent in system by the n-th arriving customer
total time spent in the waiting line by customer n
the number of customers in system at time t
the number of customers in queue at time t
long-run time-average number of customers in system
long-run time-average number of customers in queue
long-run average time spent in system per customer
long-run average time spent in queue per customer
17
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Evolving of a Queueing System
Number of
customers in the
system
Time
Chapter 7. Queueing Models
18
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Time-Average Number in System L
Consider a queueing system over a period of time T
Let Ti denote the total time during [0,T] in which the system contained
exactly i customers, the time-weighted-average number in a system is
defined by:
1
T
L =
iTi =
i i
T i =0
T
i =0
Consider the total area under the function is L(t), then,
1
L =
T
1
iTi =
T
i =0
L(t )dt
0
The long-run time-average number of customers in system, with
probability 1:
L=
T
Chapter 7. Queueing Models
L(t )dt T
L
19
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Time-Average Number in System L
The time-weighted-average number in queue is:
1
1
Q
LQ = iTi =
T i =0
T
LQ (t )dt T
LQ
G/G/1/N/K example: consider the results from the queueing system (N> 4,
K > 3).
L = [0(3) + 1(12) + 2(4) + 3(1)] / 20
= 23 / 20 = 1.15 cusomters
if L(t) = 0
0,
LQ (t ) =
L(t ) 1, if L(t) 1
0(15) + 1(4) + 2(1)
LQ =
= 0.3 customers
20
Chapter 7. Queueing Models
20
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Average Time Spent in System Per Customer w
The average time spent in system per customer, called the average
system time, is:
w=
N
W
i =1
where W1, W2, , WN are the individual times that each of the N
customers spend in the system during [0,T].
w as N
For stable systems: w
If the system under consideration is the queue alone:
1 N Q
w Q = Wi wQ
N i =1
as
G/G/1/N/K example (cont.): the average system time is
w =
W1 + W2 + ... + W5 2 + (8 3) + ... + (20 16)
=
= 4.6 time units
5
5
Chapter 7. Queueing Models
21
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
The Conservation Equation Littles Law
Conservation equation (a.k.a. Littles law)
Average # in
system
L = w
Average
System time
Arrival rate
L = w as T and N
Holds for almost all queueing systems or subsystems (regardless of the
number of servers, the queue discipline, or other special circumstances).
G/G/1/N/K example (cont.): On average, one arrival every 4 time units
and each arrival spends 4.6 time units in the system. Hence, at an
arbitrary point in time, there is (1/4)(4.6) = 1.15 customers present on
average.
Chapter 7. Queueing Models
22
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Server Utilization
Definition: the proportion of time that a server is busy.
Observed server utilization, , is defined over a specified time interval
[0,T].
Long-run server utilization is .
For systems with long-run stability: as T
Chapter 7. Queueing Models
23
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Server Utilization
For G/G/1// queues:
Any single-server queueing system with average arrival rate
customers per time unit, where average service time E(S) = 1/
time units, infinite queue capacity and calling population.
Conservation equation, L = w, can be applied.
For a stable system, the average arrival rate to the server, s,
must be identical to .
The average number of customers in the server is:
Ls =
T
Chapter 7. Queueing Models
(
T
T T0
L(t ) LQ (t ) dt =
T
24
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Server Utilization
In general, for a single-server queue:
Ls = T
Ls =
and
= E (s) =
- For a single-server stable queue:
<1
- For an unstable queue ( > ), long-run server utilization is 1.
Chapter 7. Queueing Models
25
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Server Utilization
For G/G/c// queues:
A system with c identical servers in parallel.
If an arriving customer finds more than one server idle, the
customer chooses a server without favoring any particular server.
For systems in statistical equilibrium, the average number of busy
servers, Ls, is: Ls, = E(s) = /.
The long-run average server utilization is:
Chapter 7. Queueing Models
Ls
=
, where < c for stable systems
c c
26
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Server Utilization and System Performance
System performance varies widely for a given utilization .
For example, a D/D/1 queue where E(A) = 1/ and E(S) = 1/,
where:
L = = /, w = E(S) = 1/, LQ = WQ = 0.
- By varying and , server utilization can assume any value between 0
and 1.
- Yet there is never any line.
In general, variability of interarrival and service times causes lines
to fluctuate in length.
Chapter 7. Queueing Models
27
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Server Utilization and System Performance
Consider the system is simulated
with service times: S1= 9, S2=12,
Example: A physician who
schedules patients every 10
minutes and spends Si minutes
with the i-th patient:
S3 = 9, S4 = 9, S5 = 9, .
The system becomes:
9 minutes with probability 0.9
Si =
12 minutes with probability 0.1
Arrivals are deterministic,
A1 = A2 = = -1 = 10.
Services are stochastic
- E(Si) = 9.3 min
- V(S0) = 0.81 min2
- = 0.9 min
The occurrence of a relatively
long service time (S2 = 12)
causes a waiting line to form
temporarily.
On average, the physician's
utilization = = / = 0.93 < 1.
Chapter 7. Queueing Models
28
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Costs in Queueing Problems
Costs can be associated with various aspects of the waiting line or
servers:
System incurs a cost for each customer in the queue, say at a rate of $10
per hour per customer.
WjQ is the time
- The average cost per customer is:
customer j spends
N
$10 W jQ
j =1
in queue
= $10 w Q
- If customers per hour arrive (on average), the average cost per
hour is:
L
$
10
customer $10 w Q
Q
= $10 w Q =
hour customer
hour
Server may also impose costs on the system, if a group of c parallel
servers (1 c ) have utilization r, each server imposes a cost of $5 per
hour while busy.
- The total server cost is: $5 c
Chapter 7. Queueing Models
29
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Steady-State Behavior of Markovian Models
Markovian models:
Exponential-distributed arrival process (mean arrival rate = 1/).
Service times may be exponentially (M) or arbitrary (G) distributed.
Queue discipline is FIFO.
A queueing system is in statistical equilibrium if the probability that the
system is in a given state is not time dependent:
P ( L(t ) = n) = Pn (t ) = Pn
Mathematical models in this chapter can be used to obtain approximate
results even when the model assumptions do not strictly hold, as a
rough guide.
Simulation can be used for more refined analysis, more faithful
representation for complex systems.
Chapter 7. Queueing Models
30
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Steady-State Behavior of Markovian Models
Properties of processes with statistical equilibrium
The state of statistical equilibrium is reached from any starting
state.
The process remain in statistical equilibrium once it has reached
it.
Chapter 7. Queueing Models
31
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Steady-State Behavior of Markovian Models
For the simple model studied in this chapter, the steady-state
parameter, L, the time-average number of customers in the
system is:
L=
nP
n =0
Apply Littles equation, L= w, to the whole system and to the
queue alone:
w=
wQ = w
LQ = wQ
G/G/c// example: to have a statistical equilibrium, a
necessary and sufficient condition is:
=
<1
c
Chapter 7. Queueing Models
32
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
M/G/1 Queues
Single-server queues with Poisson arrivals and unlimited capacity.
Suppose service times have mean 1/ and variance 2 and = / < 1, the
steady-state parameters of M/G/1 queue:
P0 = 1
2 (1 + 2 2 )
L=+
2(1 )
The particular
distribution is not
known!
2 (1 + 2 2 )
LQ =
2(1 )
(1 / 2 + 2 )
w= +
2(1 )
1
(1 / 2 + 2 )
wQ =
2(1 )
Chapter 7. Queueing Models
33
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
M/G/1 Queues
There are no simple expression for the steady-state probabilities P0,
P1,
L LQ = is the time-average number of customers being served.
Average length of queue, LQ, can be rewritten as:
2 2
+
LQ =
2(1 ) 2(1 )
If and are held constant, LQ depends on the variability, 2, of the
service times.
Chapter 7. Queueing Models
34
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
M/G/1 Queues
Example: Two workers competing for a job, Able claims to be faster than
Baker on average, but Baker claims to be more consistent,
Poisson arrivals at rate = 2 per hour (1/30 per minute).
Able: 1/ = 24 minutes and 2 = 202 = 400 minutes2:
(1 / 30) 2 [24 2 + 400]
LQ =
= 2.711 customers
2(1 4 / 5)
- The proportion of arrivals who find Able idle and thus experience no delay is P0 = 1- = 1/5
= 20%.
Baker: 1/ = 25 minutes and 2 = 22 = 4 minutes2:
(1 / 30) 2 [252 + 4]
LQ =
= 2.097 customers
2(1 5 / 6)
- The proportion of arrivals who find Baker idle and thus experience no delay is P0 = 1- =
1/6 = 16.7%.
Although working faster on average, Ables greater service variability results in an
average queue length about 30% greater than Bakers.
Chapter 7. Queueing Models
35
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
M/M/1 Queues
Suppose the service times in an M/G/1 queue are exponentially
distributed with mean 1/, then the variance is 2 = 1/2.
M/M/1 queue is a useful approximate model when service times
have standard deviation approximately equal to their means.
The steady-state parameters
=
Pn = (1 ) n
L=
P0 = 1
2
2
=
LQ =
( ) 1
w=
wQ =
Chapter 7. Queueing Models
1
1
=
(1 )
=
( ) (1 )
36
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
M/M/1 Queues
Single-chair unisex hair-styling shop
Interarrival and service times are exponentially distributed
=2 customers/hour and =3 customers/hour
=
2
=
3
L=
1
3
P0 = 1 =
1
n =0
Chapter 7. Queueing Models
2
= 2 Customers
3 2
4
4
2
LQ =
=
= Customers
( ) 3(3 2) 3
4 2
L = LQ + = + = 2 Customers
3 3
1 2
4
P2 = =
3 3
27
P 4 = 1 Pn =
2
= 1 hour
2
1
1 2
wQ = w = 1 = hour
3 3
w=
1 2 2
P1 = =
3 3 9
16
81
37
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
M/M/1 Queues
Example: M/M/1 queue with
service rate =10 customers
per hour.
Consider how L and w
increase as arrival rate, ,
increases from 5 to 8.64 by
increments of 20%
5,0
6,0
7,2
8,6
10,0
0,5
1,0
0,6
1,5
0,7
2,6
0,9
6,4
1,0
0,2
0,3
0,4
0,7
20
L
18 w
If / 1, waiting lines tend to
continually grow in length
Number of Customers
16
Increase in average system
time (w) and average number
in system (L) is highly
nonlinear as a function of .
14
12
10
8
6
4
2
0
0.5
0.6
0.7
0.8
0.9
rho
Chapter 7. Queueing Models
38
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Effect of Utilization and Service Variability
For almost all queues, if lines are too long, they can be reduced by
decreasing server utilization () or by decreasing the service time
variability (2).
A measure of the variability of a distribution,
coefficient of variation (cv):
(cv) 2 =
V (X )
[E ( X )]2
The larger cv is, the more variable is the distribution relative to its
expected value
For exponential service times with rate
- E(X)=1/
- V(X)=1/2
cv=1
Chapter 7. Queueing Models
39
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Effect of Utilization and Service Variability
Consider LQ for any M/G/1 queue:
2 (1 + 2 2 )
LQ =
2(1 )
2 1 + (cv) 2
1
2
LQ for M/M/1
queue
Corrects the M/M/1
formula to account
for a non-exponential
service time distn
Chapter 7. Queueing Models
40
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Multiserver Queue
M/M/c// queue: c servers operating in parallel
Arrival process is poisson with rate
Each server has an independent and identical exponential service-time
distribution, with mean 1/.
To achieve statistical equilibrium, the offered load (/) must satisfy /<c,
where /(c) = is the server utilization.
Calling population
1
Waiting line
c
Chapter 7. Queueing Models
41
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Multiserver Queue
The steady-state parameters
=
c 1 ( / ) n c 1 c
P0 =
+
n =0 n! c! c
(c ) c P0
P (L ( ) c ) =
c!(1 )
Probability
that all servers
are busy
(c ) c +1 P0
P (L ( ) c )
c
L = c +
=
+
1
c(c!)(1 ) 2
L
w=
P (L ( ) c )
LQ =
1
L LQ = c
Chapter 7. Queueing Models
42
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Multiserver Queue
Probability of empty system
Chapter 7. Queueing Models
Number of customers in system
43
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Multiserver Queue
Other common multiserver queueing models
2 1 + (cv) 2
LQ =
2
1
Corrects the M/M/1
formula
LQ for M/M/1
queue
M/G/c/: general service times and c parallel server. The parameters can
be approximated from those of the M/M/c// model.
M/G/: general service times and infinite number of servers.
M/M/c/N/: service times are exponentially distributed at rate and c
servers where the total system capacity is N c customer. When an arrival
occurs and the system is full, that arrival is turned away.
Chapter 7. Queueing Models
44
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Multiserver Queue
M/G/: general service times and infinite number of servers
- customer is its own server
- service capacity far exceeds service demand
- when we want to know how many servers are required so that
customers are rarely delayed
Pn = e
P0 = e
w=
()
n!
, n = 0,1, K
wQ = 0
L=
LQ = 0
Chapter 7. Queueing Models
45
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Multiserver Queue
How many users can be logged in simultaneously in a computer
system
Customers log on with rate =500 per hour
Stay connected in average for 1/=180 minutes = 3 hours
For planning purposes it is pretended that the simultaneous logged in
users is infinite
Expected number of simultaneous users L
L = = 500 3 = 1500
To ensure providing adequate capacity 95% of the time, the number of
parallel users c has to be restricted
e 1500 (1500) n
0.95
P ( L() c) = Pn =
n!
n =0
n =0
c
The capacity c=1564 simultaneous users satisfies this requirement
Chapter 7. Queueing Models
46
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Multiserver Queue
M/M/c/N/: service times are exponentially distributed at rate and
c servers where the total system capacity is N c customer
when an arrival occurs and the system is full, that arrival is turned away
Effective arrival rate e is defined as the mean number of arrivals per
time unit who enter and remain in the system
1
P0
PN
LQ
a n a c N n c
= 1 + +
!
!
n
c
1
1
=
=
+
n
n
c
aN
=
P0
c!c N c
P0 a c
1 N c ( N c) N c (1 )
=
c!(1 )
= (1 PN )
wQ =
LQ
w = wQ +
(1-PN) probability that a
customer will find a space
and be able to enter the
system
L = e w
Chapter 7. Queueing Models
47
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Multiserver Queue
Single-chair unisex hair-styling shop (again!)
Space only for 3 customers:
one in service and two waiting
First computer P0
P0 =
1
2 2 3 2 n 1
1 + +
3 3 n = 2 3
Queue time
wQ =
= 0.415
PN = P3 =
()
1!1
P0 =
w = wQ +
8
= 0.123
65
28
= 0.246
114
66
= 0.579
114
Expected number of
customers in shop
Average of the queue
LQ = 0.431
L = e w =
Effective arrival rate
66
= 1.015
65
Probability of busy shop
1 P0 = e = 0.585
8 114
e = 21 =
= 1.754
65 65
Chapter 7. Queueing Models
System time, time in shop
P(system is full)
2 3
3
2
LQ
48
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Steady-State Behavior of Finite-Population Models
In practical problems calling population is finite
When the calling population is small, the presence of one or more customers in
the system has a strong effect on the distribution of future arrivals.
Consider a finite-calling population model with K customers (M/M/c/K/K)
The time between the end of one service visit and the next call for service is
exponentially distributed with mean = 1/.
Service times are also exponentially distributed with mean 1/.
c parallel servers and system capacity is K.
K Customers
1
Waiting line
c
Chapter 7. Queueing Models
49
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Steady-State Behavior of Finite-Population Models
Some of the steady-state probabilities:
n
c 1 K n K
K!
P0 = +
n c
n =0 n n =c ( K n)!c!c
K n
n = 0,1,..., c 1
P0 ,
n
Pn =
n
!
K
( K n)!c!c n c , n = c, c + 1,...K
L = nPn ,
n =0
w = L / e ,
e
c
where e is the long run effective arrival rate of customers to queue (or entering/exiting service)
K
e = ( K n)Pn
n =0
Chapter 7. Queueing Models
50
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Steady-State Behavior of Finite-Population Models
Example: two workers who are responsible for 10 milling
machines.
Machines run on the average for 20 minutes, then require an
average 5-minute service period, both times exponentially
distributed: = 1/20 and = 1/5.
All of the performance measures depend on P0:
n
21 10 5 n 10
10!
5
P0 = +
n2
n
20
n
(
10
)!
2
!
2
20
n =0
n=2
= 0.065
Then, we can obtain the other Pn, and can compute the
expected number of machines in system:
10
L=
nP
= 3.17 machines
n =0
The average number of running machines:
K L = 10 3.17 = 6.83 machines
Chapter 7. Queueing Models
51
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Networks of Queues
Many systems are naturally modeled as networks of single queues
customers departing from one queue may be routed to another
The following results assume a stable system with infinite calling
population and no limit on system capacity:
Provided that no customers are created or destroyed in the queue, then
the departure rate out of a queue is the same as the arrival rate into the
queue, over the long run.
If customers arrive to queue i at rate i, and a fraction 0 pij 1 of them
are routed to queue j upon departure, then the arrival rate from queue i
to queue j is i pij over the long run.
Chapter 7. Queueing Models
52
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Networks of Queues
The overall arrival rate into queue j:
j = aj +
p
i
ij
all i
Arrival rate
from outside
the network
Sum of arrival rates
from other queues
in network
If queue j has cj < parallel servers, each working at rate j, then the
long-run utilization of each server is j=j /(cj) (where j < 1 for stable
queue).
If arrivals from outside the network form a Poisson process with rate aj
for each queue j, and if there are cj identical servers delivering
exponentially distributed service times with mean 1/j, then, in steady
state, queue j behaves likes an M/M/cj queue with arrival rate
j = aj +
p
i
ij
all i
Chapter 7. Queueing Models
53
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Network of Queues
Customer
Population
80 cust
hour
0.4
c=
c=1
0.6
Discount store example:
Suppose customers arrive at the rate 80 per hour and
40% choose self-service.
Hence:
- Arrival rate to service center 1 is 1 = 80(0.4) = 32 per hour
- Arrival rate to service center 2 is 2 = 80(0.6) = 48 per hour.
c2 = 3 clerks and 2 = 20 customers per hour.
The long-run utilization of the clerks is:
2 = 48/(3*20) = 0.8
All customers must see the cashier at service center 3,
the overall rate to service center 3 is 3 = 1 + 2 = 80
per hour.
- If 3 = 90 per hour, then the utilization of the cashier is:
3 = 80/90 = 0.89
Chapter 7. Queueing Models
54
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Summary
Introduced basic concepts of queueing models.
Show how simulation, and some times mathematical analysis, can be used
to estimate the performance measures of a system.
Commonly used performance measures: L, LQ, w, wQ, , and e.
When simulating any system that evolves over time, analyst must decide
whether to study transient behavior or steady-state behavior.
Simple formulas exist for the steady-state behavior of some queues.
Simple models can be solved mathematically, and can be useful in providing
a rough estimate of a performance measure.
Chapter 7. Queueing Models
55
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Simulation
Discrete-Event System Simulation
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Chapter 5
Random-Number Generation
Computer Science, Informatik 4
Communication and Distributed Systems
Purpose & Overview
Discuss the generation of random numbers.
Introduce the subsequent testing for randomness:
Frequency test
Autocorrelation test.
Chapter 5. Random-Number Generation
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Properties of Random Numbers
Two important statistical properties:
Uniformity
Independence.
Random Number, Ri, must be independently drawn from a uniform
distribution with pdf:
1, 0 x 1
f ( x) =
0, otherwise
2
x
E ( R) = xdx =
0
2
1
Chapter 5. Random-Number Generation
1
=
2
Figure: pdf for
random numbers
4
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Generation of Pseudo-Random Numbers
Pseudo, because generating numbers using a known method
removes the potential for true randomness.
Goal: To produce a sequence of numbers in [0,1] that simulates, or
imitates, the ideal properties of random numbers (RN).
Important considerations in RN routines:
Fast
Portable to different computers
Have sufficiently long cycle
Replicable
Closely approximate the ideal statistical properties of uniformity and
independence.
Chapter 5. Random-Number Generation
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Techniques for Generating Random Numbers
Linear Congruential Method (LCM).
Combined Linear Congruential Generators (CLCG).
Random-Number Streams.
Chapter 5. Random-Number Generation
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Linear Congruential Method
To produce a sequence of integers, X1, X2, between 0 and m-1
by following a recursive relationship:
X i +1 = (aX i + c) mod m, i = 0,1,2,...
The
multiplier
The
modulus
The
increment
The selection of the values for a, c, m, and X0 drastically affects
the statistical properties and the cycle length.
The random integers are being generated [0,m-1], and to convert
the integers to random numbers:
Ri =
Chapter 5. Random-Number Generation
Xi
, i = 1,2,...
m
7
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Linear Congruential Method Example
Use X0 = 27, a = 17, c = 43, and m = 100.
The Xi and Ri values are:
X1 = (17*27+43) mod 100 = 502 mod 100 = 2,
X2 = (17*2+43) mod 100 = 77,
X3 = (17*77+43) mod 100 = 52,
X4=(17*52+43) mod 100 = 27,
Chapter 5. Random-Number Generation
R1 = 0.02;
R2 = 0.77;
R3 = 0.52;
R3 = 0.27;
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Characteristics of a Good Generator
Maximum Density
Such that he values assumed by Ri, i = 1,2,, leave no large gaps on
[0,1]
Problem: Instead of continuous, each Ri is discrete
Solution: a very large integer for modulus m
- Approximation appears to be of little consequence
Maximum Period
To achieve maximum density and avoid cycling.
Achieve by: proper choice of a, c, m, and X0.
Most digital computers use a binary representation of numbers
Speed and efficiency are aided by a modulus, m, to be (or close to) a
power of 2.
Chapter 5. Random-Number Generation
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Random-Numbers in Java
Defined in java.util.Random
private final static long multiplier = 0x5DEECE66DL;
private final static long addend = 0xBL;
private final static long mask = (1L << 48) - 1;
protected int next(int bits) {
long oldseed, nextseed;
...
oldseed = seed.get();
nextseed = (oldseed * multiplier + addend) & mask;
...
return (int)(nextseed >>> (48 - bits));
}
Chapter 5. Random-Number Generation
10
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Combined Linear Congruential Generators
Reason: Longer period generator is needed because of the
increasing complexity of simulated systems.
Approach: Combine two or more multiplicative congruential
generators.
Let Xi,1, Xi,2, , Xi,k, be the i-th output from k different multiplicative
congruential generators.
The j-th generator:
- Has prime modulus mj and multiplier aj and period is mj -1
- Produces integers Xi,j is approx ~ Uniform on integers in [1, mj 1]
- Wi,j = Xi,j -1 is approx ~ Uniform on integers in [0, mj -2]
Chapter 5. Random-Number Generation
11
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Combined Linear Congruential Generators
Suggested form:
k
j 1
X i = (1) X i , j mod m1 1
j =1
Xi
m ,
Hence, Ri = 1
m 1
1 ,
m1
Xi > 0
Xi = 0
The coefficient:
Performs the
subtraction Xi,1-1
The maximum possible period is: P =
Chapter 5. Random-Number Generation
12
(m1 1)(m2 1)...(mk 1)
2 k 1
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Combined Linear Congruential Generators
Example: For 32-bit computers, combining k = 2 generators with
m1 = 2,147,483,563, a1 = 40,014, m2 = 2,147,483,399 and a2 = 20,692. The
algorithm becomes:
Step 1: Select seeds
- X1,0 in the range [1, 2,147,483,562] for the 1st generator
- X2,0 in the range [1, 2,147,483,398] for the 2nd generator.
Step 2:
For each individual generator,
X1,j+1 = 40,014 X1,j mod 2,147,483,563
X2,j+1 = 40,692 X1,j mod 2,147,483,399.
Step 3:
Step 4:
Xj+1 = (X1,j+1 - X2,j+1 ) mod 2,147,483,562.
Return
X j +1
2,147,483,
563
R j +1 =
2,147,483,562
2,147,483,563 ,
X j +1 > 0
X j +1 = 0
Step 5: Set j = j+1, go back to step 2.
Combined generator has period: (m1 1)(m2 1)/2 ~ 2 x 1018
Chapter 5. Random-Number Generation
13
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Random-Numbers in Excel 2003
In Excel 2003 new Random Number Generator
X, Y, Z {1,...,30000}
X = X 171 mod 30269
Y = Y 172 mod 30307
Z = Z 170 mod 30323
Y
Z
X
R =
+
+
mod 1.0
30269 30307 30323
It is stated that this method produces more than 10^13
numbers
Chapter 5. Random-Number Generation
14
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Random-Numbers Streams
The seed for a linear congruential random-number generator:
Is the integer value X0 that initializes the random-number sequence.
Any value in the sequence can be used to seed the generator.
Refers to a starting seed taken from the sequence X0, X1, , XP.
If the streams are b values apart, then stream i could defined by starting seed:
A random-number stream:
i = 1,2, K , Pb
S i = X b ( i 1)
Older generators: b = 105; Newer generators: b = 1037.
A single random-number generator with k streams can act like k distinct
virtual random-number generators
To compare two or more alternative systems.
Advantageous to dedicate portions of the pseudo-random number sequence to
the same purpose in each of the simulated systems.
Chapter 5. Random-Number Generation
15
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Tests for Random Numbers
Two categories:
Testing for uniformity:
H0: Ri ~ U[0,1]
H1: Ri ~/ U[0,1]
- Failure to reject the null hypothesis, H0, means that evidence of nonuniformity has not been detected.
Testing for independence:
H0: Ri ~ independently
H1: Ri ~/ independently
- Failure to reject the null hypothesis, H0, means that evidence of
dependence has not been detected.
Level of significance , the probability of rejecting H0 when it is true:
= P( reject H0 | H0 is true)
Chapter 5. Random-Number Generation
16
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Tests for Random Numbers
When to use these tests:
If a well-known simulation languages or random-number generators is
used, it is probably unnecessary to test
If the generator is not explicitly known or documented, e.g., spreadsheet
programs, symbolic/numerical calculators, tests should be applied to
many sample numbers.
Types of tests:
Theoretical tests: evaluate the choices of m, a, and c without actually
generating any numbers
Empirical tests: applied to actual sequences of numbers produced.
- Our emphasis.
Chapter 5. Random-Number Generation
17
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Frequency Tests
Test of uniformity
Two different methods:
Kolmogorov-Smirnov test
Chi-square test
Chapter 5. Random-Number Generation
18
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Kolmogorov-Smirnov Test
Compares the continuous cdf, F(x), of the uniform distribution with
the empirical cdf, SN(x), of the N sample observations.
We know: F ( x) = x, 0 x 1
If the sample from the RN generator is R1, R2, , RN, then the empirical
cdf, SN(x) is:
S N ( x) =
Number of Ri where Ri x
N
Based on the statistic: D = max | F(x) - SN(x)|
Sampling distribution of D is known
A more powerful test, recommended.
Chapter 5. Random-Number Generation
19
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Kolmogorov-Smirnov Test
The test consists of the following steps
Step 1: Rank the data from smallest to largest
R(1)R(2)... R(N)
Step 2: Compute
D + = max R( i )
1i N N
i 1
D = max R( i )
1i N
N
Step 3: Compute D = max(D+, D-)
Step 4: Get D for the significance level
Step 5: If D D accept, otherwise reject H0
Chapter 5. Random-Number Generation
20
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Kolmogorov-Smirnov Test
Example: Suppose N=5 numbers: 0.44, 0.81, 0.14, 0.05, 0.93.
i
Step 1:
Step 2:
R(i)
0.05
0.14
0.44
0.81
0.93
i/N
0.20
0.40
0.60
0.80
1.00
i/N R(i)
0.15
0.26
0.16
0.07
R(i) (i-1)/N
0.05
0.04
0.21
0.13
Arrange R(i) from
smallest to largest
D+ = max {i/N R(i)}
D- = max {R(i) - (i-1)/N}
Step 3: D = max(D+, D-) = 0.26
Step 4: For = 0.05,
D = 0.565 > D
Hence, H0 is not rejected.
Chapter 5. Random-Number Generation
21
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Chi-square test
Chi-square test uses the sample statistic:
n is the # of classes
n
02 =
i =1
(Oi Ei )
Ei
Ei is the expected
# in the i-th class
Oi is the observed
# in the i-th class
Approximately the chi-square distribution with n-1 degrees of freedom
For the uniform distribution, Ei, the expected number in each class is:
N
Ei = , where N is the total # of observation
n
Valid only for large samples, e.g. N >= 50
Chapter 5. Random-Number Generation
22
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Chi-square test
Example
100 numbers from [0,1]
=0.05
10 intervals
X20.05,9=16.9
Accept, since
Interval
Upper Limit
0.1
10
0.2
X20=11.2 < X20.05,9
Oi-Ei
(Oi-Ei)^2
(Oi-Ei)^2/Ei
10
10
-1
0.1
0.3
10
-5
25
2.5
0.4
10
-4
16
1.6
0.5
16
10
36
3.6
0.6
13
10
0.9
0.7
10
10
0.8
10
-3
0.9
0.9
10
10
10
1.0
14
10
16
1.6
100
100
11.2
Oi
Ei
X20=11.2
Chapter 5. Random-Number Generation
23
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Tests for Autocorrelation
Autocorrelation is concerned with dependence between
numbers in a sequence
Example:
0.12 0.01 0.23 0.28 0.89 0.31 0.64 0.28 0.83 0.93
0.99 0.15 0.33 0.35 0.91 0.41 0.60 0.27 0.75 0.88
0.68 0.49 0.05 0.43 0.95 0.58 0.19 0.36 0.69 0.87
Numbers at 5-th, 10-th, 15-th, ... are very similar
Numbers can be
Low
High
Alternating
Chapter 5. Random-Number Generation
24
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Tests for Autocorrelation
Testing the autocorrelation between every m numbers (m is a.k.a.
the lag), starting with the i-th number
The autocorrelation im between numbers: Ri, Ri+m, Ri+2m, Ri+(M+1)m
M is the largest integer such that i + (M + 1 )m N
Hypothesis:
H 0 : im = 0,
if numbers are independen t
H 1 : im 0,
if numbers are dependent
If the values are uncorrelated:
For large values of M, the distribution of the estimator of im, denoted im
is approximately normal.
Chapter 5. Random-Number Generation
25
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Tests for Autocorrelation
Test statistics is:
im
Z0 =
im
Z0 is distributed normally with mean = 0 and variance = 1, and:
1 M
im =
R
i + km
i +(k +1 )m 0.25
M + 1 k =0
im =
If im > 0, the subsequence has positive autocorrelation
13M + 7
12(M + 1 )
High random numbers tend to be followed by high ones, and vice versa.
If im < 0, the subsequence has negative autocorrelation
Low random numbers tend to be followed by high ones, and vice versa.
Chapter 5. Random-Number Generation
26
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Example
Test whether the 3rd, 8th, 13th, and so on, for the numbers on Slide
24.
Hence, = 0.05, i = 3, m = 5, N = 30, and M = 4
35 =
1 (0.23)(0.28) + (0.28)(0.33) + (0.33)(0.27 )
0.25
4 + 1 + (0.27 )(0.05) + (0.05)(0.36)
= 0.1945
13( 4) + 7
= 0.128
35 =
12( 4 + 1 )
0.1945
= 1.516
Z0 =
0.1280
z0.025 = 1.96 hence, the hypothesis is not rejected.
Chapter 5. Random-Number Generation
27
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Shortcomings
The test is not very sensitive for small values of M, particularly when
the numbers being tested are on the low side.
Problem when fishing for autocorrelation by performing numerous
tests:
If = 0.05, there is a probability of 0.05 of rejecting a true hypothesis.
If 10 independence sequences are examined,
- The probability of finding no significant autocorrelation, by chance
alone, is 0.9510 = 0.60.
- Hence, the probability of detecting significant autocorrelation when it
does not exist = 40%
Chapter 5. Random-Number Generation
28
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Summary
In this chapter, we described:
Generation of random numbers
Testing for uniformity and independence
Caution:
Even with generators that have been used for years, some of which still
in used, are found to be inadequate.
This chapter provides only the basic
Also, even if generated numbers pass all the tests, some underlying
pattern might have gone undetected.
Chapter 5. Random-Number Generation
29
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Simulation
Discrete-Event System Simulation
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Chapter 6
Random-Variate Generation
Computer Science, Informatik 4
Communication and Distributed Systems
Purpose & Overview
Develop understanding of generating samples from a
specified distribution as input to a simulation model.
Illustrate some widely-used techniques for generating random
variates.
Inverse-transform technique
Acceptance-rejection technique
Special properties
Chapter 6. Random-Variate Generation
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Preparation
It is assumed that a source of uniform [0,1] random numbers
exists.
Linear Congruential Method
Random numbers R, R1, R2, with
PDF
1 0 x 1
f R ( x) =
0 otherwise
CDF
0 x < 0
FR ( x) = x 0 x 1
1 x > 1
Chapter 6. Random-Variate Generation
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Inverse-transform Technique
The concept:
For CDF function: r = F(x)
Generate r from uniform (0,1), a.k.a U(0,1)
Find x,
x = F-1(r)
F(x)
F(x)
1
r = F(x)
r = F(x)
r1
r1
x
x1
x1
Chapter 6. Random-Variate Generation
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Inverse-transform Technique
The inverse-transform technique can be used in principle for
any distribution.
Most useful when the CDF F(x) has an inverse F -1(x) which
is easy to compute.
Required steps
1.
2.
3.
4.
Compute the CDF of the desired random variable X.
Set F(X)=R on the range of X.
Solve the equation F(X)=R for X in terms of R.
Generate uniform random numbers R1, R2, R3, and compute
the desired random variate by Xi = F-1(Ri)
Chapter 6. Random-Variate Generation
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Exponential Distribution
Exponential Distribution
To generate X1, X2, X3
PDF
f ( x ) = e
1 e X = R
e X = 1 R
X = ln(1 R)
ln(1 R)
X=
ln(1 R)
X =
CDF
F ( x ) = 1 e x
Simplification
X =
ln(R)
X = F 1 ( R)
Since R and (1-R) are
uniformly distributed on [0,1]
Chapter 6. Random-Variate Generation
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Exponential Distribution
Figure: Inverse-transform technique for
exp( = 1)
Chapter 6. Random-Variate Generation
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Exponential Distribution
Example: Generate 200 variates Xi with distribution exp(= 1)
Generate 200 Rs with U(0,1), the histogram of Xs become:
0,7
0,6
0,5
0,4
0,3
0,2
0,1
0
0,5
1,5
2,5
3,5
Empirical Histogram
4,5
5,5
6,5
Theor. PDF
Check: Does the random variable X1 have the desired distribution?
P ( X 1 x0 ) = P ( R1 F ( x0 )) = F ( x 0 )
Chapter 6. Random-Variate Generation
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Other Distributions
Examples of other distributions for which inverse CDF works
are:
Uniform distribution
Weibull distribution
Triangular distribution
Chapter 6. Random-Variate Generation
10
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Uniform Distribution
Random variable X uniformly distributed over [a, b]
F(X ) = R
X a
=R
ba
X a = R(b a )
X = a + R (b a )
Chapter 6. Random-Variate Generation
11
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Weibull Distribution
The Weibull Distribution is
described by
The variate is
F(X ) = R
PDF
1 e
1 ( )
f ( x) = x e
( X
( X
=R
= 1 R
( X ) = ln(1 R)
CDF
F ( X ) = 1 e
(x )
= ln(1 R )
X = ln(1 R )
x = ln(1 R )
X = ln(1 R )
Chapter 6. Random-Variate Generation
12
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Triangular Distribution
The CDF of a Triangular
Distribution with endpoints (0, 2)
is given by
0
x2
2
F ( x) =
(2 x) 2
1
2
For 0 X 1
x0
X2
R=
2
and for 1 X 2
0 < x 1
1< x 2
(2 X ) 2
R = 1
2
x>2
X is generated by
2R
X =
2 2(1 R)
Chapter 6. Random-Variate Generation
0 R
1
2
1
2
< R 1
13
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Empirical Continuous Distributions
When theoretical distribution is not applicable
To collect empirical data:
Resample the observed data
Interpolate between observed data points to fill in the gaps
For a small sample set (size n):
Arrange the data from smallest to largest
x (1) x (2) x (n)
Set x(0)=0
Assign the probability 1/n to each interval x (i-1)
The slope of each line segment is defined as
x(i ) x(i 1)
x(i ) x(i 1)
=
ai =
1 / n (i 1) / n
1/ n
The inverse CDf is given by
x x (i)
(i 1)
X = F 1 ( R) = x(i 1) + ai R
Chapter 6. Random-Variate Generation
14
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Empirical Continuous Distributions
i
Interval
Probability
Cumulative Probability
Slope ai
0.0 < x 0.8
0.2
0.2
4.00
0.8 < x 1.24
0.2
0.4
2.20
1.24 < x 1.45
0.2
0.6
1.05
1.45 < x 1.83
0.2
0.8
1.90
1.83 < x 2.76
0.2
1.0
4.65
R1 = 0.71
X 1 = x( 41) + a4 ( R1 (4 1) / n)
= 1.45 + 1.90(0.71 0.6)
= 1.66
Chapter 6. Random-Variate Generation
15
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Empirical Continuous Distributions
What happens for large samples of data
Several hundreds or tens of thousand
First summarize the data into a frequency distribution with
smaller number of intervals
Afterwards, fit continuous empirical CDF to the frequency
distribution
Slight modifications
Slope
ai =
ci cumulative probability of
the first i intervals
x(i ) x(i 1)
ci ci 1
The inverse CDf is given by
X = F 1 ( R) = x + a (R c
( i 1)
Chapter 6. Random-Variate Generation
i 1
16
)
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Empirical Continuous Distributions
Example: Suppose the data collected for 100 broken-widget repair
times are:
Cumulative Slope,
ai
Frequency, c i
Interval
(Hours)
Frequency
Relative
Frequency
0.25 x 0.5
31
0.31
0.31
0.81
0.5 x 1.0
10
0.10
0.41
5.0
1.0 x 1.5
25
0.25
0.66
2.0
1.5 x 2.0
34
0.34
1.00
1.47
Consider R1 = 0.83:
c3 = 0.66 < R1 < c4 = 1.00
X1 = x(4-1) + a4(R1 c(4-1))
= 1.5 + 1.47(0.83-0.66)
= 1.75
Chapter 6. Random-Variate Generation
17
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Empirical Continuous Distributions
Problems with empirical distributions
The data in the previous example is restricted in the range
0.25 X 2.0
The underlying distribution might have a wider range
Thus, try to find a theoretical distribution
Hints for building empirical distribution based on frequency
tables
It is recommended to use relatively short intervals
- Number of bin increase
This will result in a more accurate estimate
Chapter 6. Random-Variate Generation
18
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Discrete Distribution
All discrete distributions can be generated via inverse-transform
technique
Method: numerically, table-lookup procedure, algebraically, or a
formula
Examples of application:
Empirical
Discrete uniform
Gamma
Chapter 6. Random-Variate Generation
19
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Discrete Distribution
Example: Suppose the number of shipments, x, on the
loading dock of a company is either 0, 1, or 2
Data - Probability distribution:
x
p(x)
F(x)
0
1
2
0.50
0.30
0.20
0.50
0.80
1.00
The inverse-transform technique as table-lookup procedure
F ( xi 1 ) = ri 1 < R ri = F ( xi )
Set X1 = xi
Chapter 6. Random-Variate Generation
20
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Discrete Distribution
Method - Given R, the
generation scheme
becomes:
R 0.5
0,
x = 1, 0.5 < R 0.8
2, 0.8 < R 1.0
Table for generating
the discrete variate
X
Consider R1 = 0.73:
F(xi-1) < R <= F(xi)
F(x0) < 0.73 <= F(x1)
Hence, x1 = 1
i Input ri Output xi
1
0.5
0.8
1.0
Chapter 6. Random-Variate Generation
21
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Acceptance-Rejection technique
Useful particularly when inverse cdf does not exist in closed form, a.k.a.
thinning
Illustration: To generate random variates, X ~ U(1/4, 1)
Generate R
Procedures:
no
Step 1. Generate R ~ U[0,1]
Step 2a. If R >= , accept X=R.
Step 2b. If R < , reject R, return
to Step 1
Condition
yes
Output R
R does not have the desired distribution, but R conditioned (R) on the
event {R } does.
Efficiency: Depends heavily on the ability to minimize the number of
rejections.
Chapter 6. Random-Variate Generation
22
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Poisson Distribution
PMF of a Poisson Distribution
P ( N = n) =
n!
Exactly n arrivals during one time unit
A1 + A2 + L + An 1 < A1 + A2 + L + An + An +1
Since interarrival times are exponentially distributed we can set
Ai =
ln( Ri )
Well known, we derived this generator in the beginning of the class
Chapter 6. Random-Variate Generation
23
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Poisson Distribution
Substitute the sum by
n
Simplify by
ln( Ri )
i =1
n +1
1<
ln( Ri )
i =1
multiply by -, which reverses the inequality sign
sum of logs is the log of a product
n
n +1
i =1
i =1
i =1
i =1
ln Ri = ln( Ri ) > ln( Ri ) = ln Ri
Simplify by eln(x) = x
n
i =1
Chapter 6. Random-Variate Generation
n +1
> Ri
i =1
24
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Poisson Distribution
Procedure of generating a Poisson random variate N is as
follows
1. Set n=0, P=1
2. Generate a random number Rn+1, and replace P by P x Rn+1
3. If P < exp(-), then accept N=n
-
Otherwise, reject the current n, increase n by one, and return to
step 2.
Chapter 6. Random-Variate Generation
25
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Poisson Distribution
Example: Generate three Poisson variates with mean =0.2
exp(-0.2) = 0.8187
Step 1: Set n=0, P=1
Step 2: R1 = 0.4357, P = 1 x 0.4357
Step 3: Since P = 0.4357 < exp(- 0.2), accept N = 0
Step 1: Set n=0, P=1
Step 2: R1 = 0.4146, P = 1 x 0.4146
Step 3: Since P = 0.4146 < exp(-0.2), accept N = 0
Step 1: Set n=0, P=1
Step 2: R1 = 0.8353, P = 1 x 0.8353
Step 3: Since P= 0.8353 > exp(-0.2), reject n=0 and return to Step 2 with n=1
Step 2: R2 = 0.9952, P = 0.8353 x 0.9952 = 0.8313
Step 3: Since P= 0.8313 > exp(-0.2), reject n=1 and return to Step 2 with n=2
Step 2: R3 = 0.8004, P = 0.8313 x 0.8004 = 0.6654
Step 3: Since P = 0.6654 < exp(-0.2), accept N = 2
Variate 1
Variate 2
Variate 3
Chapter 6. Random-Variate Generation
26
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Poisson Distribution
It took five random numbers to generate three Poisson
variates
In long run, the generation of Poisson variates requires some
overhead!
N Rn+1
Accept/Reject
Result
0 0.4357 0.4357 P < exp(- )
Accept
N=0
0 0.4146 0.4146 P < exp(- )
Accept
N=0
0 0.8353 0.8353 P exp(- )
Reject
1 0.9952 0.8313 P exp(- )
Reject
2 0.8004 0.6654 P < exp(- )
Accept
Chapter 6. Random-Variate Generation
27
N=2
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Special Properties
Based on features of particular family of probability
distributions
For example:
Direct Transformation for normal and lognormal distributions
Convolution
Beta distribution (from gamma distribution)
Chapter 6. Random-Variate Generation
28
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Direct Transformation
Approach for N(0,1):
Consider two standard normal random variables, Z1 and Z2, plotted as a
point in the plane:
In polar coordinates:
Z1 = B cos()
Z2 = B sin()
B2 = Z21 + Z22 ~ 2 distribution with 2 degrees of freedom = Exp( = 2).
Hence,
B = (2 ln R)1/ 2
The radius B and angle are mutually independent.
Z1 = (2 ln R )1/ 2 cos(2R2 )
Z 2 = (2 ln R)1/ 2 sin(2R2 )
Chapter 6. Random-Variate Generation
29
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Direct Transformation
Approach for N(,2):
Generate Zi ~ N(0,1)
Xi = + Zi
Approach for Lognormal(,2):
Generate X ~ N((,2)
Yi = eXi
Chapter 6. Random-Variate Generation
30
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Summary
Principles of random-variate generation via
Inverse-transform technique
Acceptance-rejection technique
Special properties
Important for generating continuous and discrete distributions
Chapter 6. Random-Variate Generation
31
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Simulation
Discrete-Event System Simulation
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Chapter 8
Input Modeling
Computer Science, Informatik 4
Communication and Distributed Systems
Purpose & Overview
Input models provide the driving force for a simulation model.
The quality of the output is no better than the quality of inputs.
In this chapter, we will discuss the 4 steps of input model
development:
1) Collect data from the real system
2) Identify a probability distribution to represent the input process
3) Choose parameters for the distribution
4) Evaluate the chosen distribution and parameters for goodness
of fit.
Chapter 8. Input Modeling
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Data Collection
One of the biggest tasks in solving a real problem
GIGO Garbage-In-Garbage-Out
Raw Data
System
Performance
simulation
Input
Data
Output
Even when model structure is valid simulation results can be
misleading, if the input data are
inaccurately collected
inappropriately analyzed
not representative of the environment
Chapter 8. Input Modeling
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Data Collection
Suggestions that may enhance and facilitate data
collection:
Plan ahead: begin by a practice or pre-observing session, watch
for unusual circumstances
Analyze the data as it is being collected: check adequacy
Combine homogeneous data sets: successive time periods,
during the same time period on successive days
Be aware of data censoring: the quantity is not observed in its
entirety, danger of leaving out long process times
Check for relationship between variables: build scatter diagram
Check for autocorrelation:
Collect input data, not performance data
Chapter 8. Input Modeling
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Identifying the Distribution
Histograms
Scatter Diagrams
Selecting families of distribution
Parameter estimation
Goodness-of-fit tests
Fitting a non-stationary process
Chapter 8. Input Modeling
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Histograms
A frequency distribution or histogram is useful in determining
the shape of a distribution
The number of class intervals depends on:
The number of observations
The dispersion of the data
Suggested number of intervals: the square root of the sample size
For continuous data:
Corresponds to the probability density function of a theoretical
distribution
For discrete data:
Corresponds to the probability mass function
If few data points are available
combine adjacent cells to eliminate the ragged appearance of the
histogram
Chapter 8. Input Modeling
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Histograms
Vehicle Arrival Example: Number of vehicles arriving at an intersection
between 7 am and 7:05 am was monitored for 100 random workdays.
Arrivals per
Period
Frequency
0
12
1
10
2
19
3
17
4
10
5
8
6
7
7
5
8
5
9
3
10
3
11
1
Same data
with different
interval sizes
There are ample data, so the histogram may have a cell for each possible
value in the data range
Chapter 8. Input Modeling
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Histograms Example
Life tests were performed on electronic components at 1.5
times the nominal voltage, and their lifetime was recorded
Component Life Frequency
0x<3
23
3x<6
10
6x<9
9 x < 12
12 x < 15
42 x < 45
144 x < 147
Chapter 8. Input Modeling
1
9
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Histograms Example
Stanford University Mobile Activity Traces (SUMATRA)
Available traces
Target community: cellular
network research community
Traces contain mobility as well as
connection information
SULAWESI (S.U. Local Area Wireless
Environment Signaling Information)
BALI (Bay Area Location Information)
BALI Characteristics
San Francisco Bay Area
Trace length: 24 hour
Number of cells: 90
Persons per cell: 1100
Persons at all: 99.000
Active persons: 66.550
Move events: 243.951
Call events: 1.570.807
Chapter 8. Input Modeling
Question: How to transform the BALI
information so that it is usable with a
network simulator, e.g., ns-2?
10
Node number as well as connection
number is too high for ns-2
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Histograms Example
1800
Analysis of the BALI Trace
1600
Goal: Reduce the amount of
data by identifying user groups
1400
People
1200
User group
1000
800
600
Between 2 local minima
Communication characteristic
is kept in the group
A user represents a group
400
200
50
0 40
30
lls
Ca
Groups with different mobility
characteristics
0
5
20
10
nts
eme
Mov
15
10
20
0
25000
Intra- and inter group
communication
Number of People
20000
Interesting characteristic
Number of people with odd
number movements is
negligible!
15000
10000
5000
-1 0
9 10 11 12 13 14 15 16 17 18 19
Number of Movements
Chapter 8. Input Modeling
11
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Scatter Diagrams
A scatter diagram is a quality tool that can show the
relationship between paired data
Random Variable X = Data 1
Random Variable Y = Data 2
Draw random variable X on the x-axis and Y on the y-axis
Strong Correlation
Chapter 8. Input Modeling
Moderate Correlation
12
No Correlation
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Scatter Diagrams
Linear relationship
Correlation: Measures how well data line up
Slope: Measures the steepness of the data
Direction
Y Intercept
Chapter 8. Input Modeling
13
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Selecting the Family of Distributions
A family of distributions is selected based on:
The context of the input variable
Shape of the histogram
Frequently encountered distributions:
Easier to analyze: Exponential, Normal and Poisson
Harder to analyze: Beta, Gamma and Weibull
Chapter 8. Input Modeling
14
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Selecting the Family of Distributions
Use the physical basis of the distribution as a guide, for example:
Binomial: Number of successes in n trials
Poisson: Number of independent events that occur in a fixed amount of
time or space
Normal: Distribution of a process that is the sum of a number of
component processes
Exponential: time between independent events, or a process time that is
memoryless
Weibull: time to failure for components
Discrete or continuous uniform: models complete uncertainty
Triangular: a process for which only the minimum, most likely, and
maximum values are known
Empirical: resamples from the actual data collected
Chapter 8. Input Modeling
15
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Selecting the Family of Distributions
Remember the physical characteristics of the process
Is the process naturally discrete or continuous valued?
Is it bounded?
No true distribution for any stochastic input process
Goal: obtain a good approximation
Chapter 8. Input Modeling
16
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Quantile-Quantile Plots
Q-Q plot is a useful tool for evaluating distribution fit
If X is a random variable with CDF F, then the q-quantile of X is the
such that
F( ) = P(X ) = q,
for 0 < q < 1
When F has an inverse, = F-1(q)
Let {xi, i = 1,2, ., n} be a sample of data from X and {yj, j = 1,2, , n}
be the observations in ascending order:
j - 0.5
y j is approximately F -1
n
where j is the ranking or order number
Chapter 8. Input Modeling
17
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Quantile-Quantile Plots
The plot of yj versus F-1( ( j - 0.5 ) / n) is
Approximately a straight line if F is a member of an appropriate family of
distributions
The line has slope 1 if F is a member of an appropriate family of
distributions with appropriate parameter values
Chapter 8. Input Modeling
18
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Quantile-Quantile Plots
Example: Door installation times of a robot follows a normal
distribution.
The observations are ordered from the smallest to the largest:
j
1
2
3
4
5
Value
99.55
99.56
99.62
99.65
99.79
j
6
7
8
9
10
Value
99.98
100.02
100.06
100.17
100.23
j
11
12
13
14
15
Value
100.26
100.27
100.33
100.41
100.47
yj are plotted versus F-1( (j-0.5)/n) where F has a normal distribution with
the sample mean (99.99 sec) and sample variance (0.28322 sec2)
Chapter 8. Input Modeling
19
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Quantile-Quantile Plots
Example (continued): Check whether the door
installation times follow a normal distribution.
100,8
100,6
100,4
100,2
Straight line,
supporting the
hypothesis of a
normal distribution
100
99,8
99,6
99,4
99,2
99,2
99,4
99,6
99,8
100
100,2
100,4
100,6
100,8
0,35
0,3
0,25
0,2
Superimposed
density function of
the normal
distribution
0,15
0,1
0,05
0
99,4
Chapter 8. Input Modeling
20
99,6
99,8
100
100,2
100,4
100,6
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Quantile-Quantile Plots
Consider the following while evaluating the linearity of a Q-Q plot:
The observed values never fall exactly on a straight line
The ordered values are ranked and hence not independent, unlikely for
the points to be scattered about the line
Variance of the extremes is higher than the middle. Linearity of the
points in the middle of the plot is more important.
Q-Q plot can also be used to check homogeneity
It can be used to check whether a single distribution can represent two
sample sets
Given two random variables
- X and x1, x2, , xn
- Z and z1, z2, , zn
Plotting the ordered values of X and Z against each other reveals
approximately a straight line if X and Z are well represented by the same
distribution
Chapter 8. Input Modeling
21
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Parameter Estimation
Parameter Estimation: Next step after selecting a family of
distributions
If observations in a sample of size n are X1, X2, , Xn (discrete or
continuous), the sample mean and variance are:
i=1 X i
n
X=
2
2
X
n
X
i=1 i
n
S2 =
n 1
If the data are discrete and have been grouped in a frequency
distribution:
j =1 f j X j
X=
S2
2
2
f
X
n
X
j
j
j =1
n 1
where fj is the observed frequency of value Xj
Chapter 8. Input Modeling
22
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Parameter Estimation
When raw data are unavailable (data are grouped into class
intervals), the approximate sample mean and variance are:
j =1 f j m j
c
X=
S2
2
2
f
m
n
X
j
j
j =1
n 1
fj is the observed frequency in the j-th class interval
mj is the midpoint of the j-th interval
c is the number of class intervals
A parameter is an unknown constant, but an estimator is a
statistic.
Chapter 8. Input Modeling
23
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Parameter Estimation
Vehicle Arrival Example (continued): Table in the histogram of the example
on Slide 8 can be analyzed to obtain:
n = 100, f1 = 12, X 1 = 0, f 2 = 10, X 2 = 1,...
and
j =1 f j X j = 364, and j =1 f j X 2j = 2080
k
The sample mean and variance are
25
20
Frequency
364
X=
= 3.64
100
2080 100 (3.64) 2
2
S =
99
= 7.63
15
10
0
0
10
11
Number of Arrivals per Period
The histogram suggests X to have a Possion distribution
- However, note that sample mean is not equal to sample variance.
Theoretically: Poisson with parameter = 2 =
- Reason: each estimator is a random variable, it is not perfect.
Chapter 8. Input Modeling
24
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Parameter Estimation
Suggested Estimators for Distributions often used in Simulation
Maximum-Likelihood Esitmators
Distribution
Parameter Estimator
Poisson
Exponential
= X
=
1
X
Gamma
Normal
, 2
= X , 2 = S 2
Lognormal
, 2
= X , 2 = S 2
, =
1
X
After taking ln
of data.
Chapter 8. Input Modeling
25
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Goodness-of-Fit Tests
Conduct hypothesis testing on input data distribution using
Kolmogorov-Smirnov test
Chi-square test
No single correct distribution in a real application exists
If very little data are available, it is unlikely to reject any candidate
distributions
If a lot of data are available, it is likely to reject all candidate distributions
Be aware of mistakes in decision finding
Type I Error:
Type II Error:
Chapter 8. Input Modeling
Statistical
Decision
State of the null hypothesis
H0 True
H0 False
Reject H0
Type I Error
Correct
Accept H0
Correct
Type II Error
26
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Chi-Square Test
Intuition: comparing the histogram of the data to the shape of
the candidate density or mass function
Valid for large sample sizes when parameters are estimated
by maximum-likelihood
Arrange the n observations into a set of k class intervals
The test statistic is:
02
i =1
(Oi Ei ) 2
Ei
Observed Frequency
in the i-th class
Expected Frequency
Ei = n*pi
where pi is the theoretical
prob. of the i-th interval.
Suggested Minimum = 5
02 approximately follows the chi-square distribution with k-s-1
degrees of freedom
s = number of parameters of the hypothesized distribution
estimated by the sample statistics.
Chapter 8. Input Modeling
27
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Chi-Square Test
The hypothesis of a chi-square test is
H0: The random variable, X, conforms to the distributional
assumption with the parameter(s) given by the estimate(s).
H1: The random variable X does not conform.
2
2
H0 is rejected if 0 > ,k s 1
If the distribution tested is discrete and combining adjacent cell is not
required (so that Ei > minimum requirement):
Each value of the random variable should be a class interval, unless
combining is necessary, and
pi = p(xi ) = P(X = xi )
Chapter 8. Input Modeling
28
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Chi-Square Test
If the distribution tested is continuous:
pi =
ai
ai1
f ( x) dx = F (ai ) F (ai 1 )
where ai-1 and ai are the endpoints of the i-th class interval
f(x) is the assumed pdf, F(x) is the assumed cdf
Recommended number of class intervals (k):
Sample Size, n
Number of Class Intervals, k
20
Do not use the chi-square test
50
5 to 10
100
10 to 20
n1/2 to n/5
> 100
Caution: Different grouping of data (i.e., k) can affect the hypothesis
testing result.
Chapter 8. Input Modeling
29
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Chi-Square Test
Vehicle Arrival Example (continued):
H0: the random variable is Poisson distributed.
H1: the random variable is not Poisson distributed.
xi
Observed Frequency, Oi
Expected Frequency, Ei
0
1
2
3
4
5
6
7
8
9
10
> 11
12
10
19
17
19
6
7
5
5
3
3
1
100
2.6
9.6
17.4
21.1
19.2
14.0
8.5
4.4
2.0
0.8
0.3
0.1
100.0
22
17
12.2
(Oi - Ei)2/Ei
7.87
0.15
0.8
4.41
2.57
0.26
7.6
11.62
27.68
Ei = np ( x)
e x
=n
x!
Combined because
of the assumption of
min Ei = 5, e.g.,
E1 = 2.6 < 5, hence
combine with E2
Degree of freedom is k-s-1 = 7-1-1 = 5, hence, the hypothesis is rejected
at the 0.05 level of significance.
02 = 27.68 > 02.05,5 = 11.1
Chapter 8. Input Modeling
30
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Kolmogorov-Smirnov Test
Intuition: formalize the idea behind examining a Q-Q plot
Recall
The test compares the continuous cdf, F(x), of the hypothesized
distribution with the empirical cdf, SN(x), of the N sample observations.
Based on the maximum difference statistics
D = max| F(x) - SN(x) |
A more powerful test, particularly useful when:
Sample sizes are small
No parameters have been estimated from the data
When parameter estimates have been made:
Critical values are biased, too large.
More conservative, i.e., smaller Type I error than specified.
Chapter 8. Input Modeling
31
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
p-Values and Best Fits
p-value for the test statistics
The significance level at which one would just reject H0 for the given test
statistic value.
A measure of fit, the larger the better
Large p-value: good fit
Small p-value: poor fit
Vehicle Arrival Example (cont.):
H0: data is Poisson
Test statistics: 02 = 27.68 , with 5 degrees of freedom
p-value = 0.00004, meaning we would reject H0 with 0.00004 significance
level, hence Poisson is a poor fit.
Chapter 8. Input Modeling
32
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
p-Values and Best Fits
Many software use p-value as the ranking measure to automatically
determine the best fit.
Things to be cautious about:
Software may not know about the physical basis of the data, distribution
families it suggests may be inappropriate.
Close conformance to the data does not always lead to the most
appropriate input model.
p-value does not say much about where the lack of fit occurs
Recommended: always inspect the automatic selection using
graphical methods.
Chapter 8. Input Modeling
33
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Fitting a Non-stationary Poisson Process
Fitting a NSPP to arrival data is difficult, possible approaches:
Fit a very flexible model with lots of parameters
Approximate constant arrival rate over some basic interval of
time, but vary it from time interval to time interval.
Suppose we need to model arrivals over time [0,T], our
approach is the most appropriate when we can:
Observe the time period repeatedly
Count arrivals / record arrival times
Divide the time period into k equal intervals of length t =T/k
Over n periods of observation let Cij be the number of arrivals
during the i-th interval on the j-th period
Chapter 8. Input Modeling
34
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Fitting a Non-stationary Poisson Process
The estimated arrival rate during the i-th time period
(i-1) t < t i t is:
n
(t ) =
Cij
nt j =1
n = Number of observation periods,
t = time interval length
Cij = # of arrivals during the i-th time interval on the j-th observation
period
Example: Divide a 10-hour business day [8am,6pm] into equal
intervals k = 20 whose length t = , and observe over n=3 days
Time Period
Chapter 8. Input Modeling
Number of Arrivals
Day 1
Day 2
Day 3
Estimated Arrival
Rate (arrivals/hr)
8:00 - 8:30
12
14
10
24
8:30 - 9:00
23
26
32
54
9:00 - 9:30
27
18
32
52
9:30 - 10:00
20
13
12
30
35
For instance,
1/3(0.5)*(23+26+32)
= 54 arrivals/hour
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Selecting Model without Data
If data is not available, some possible sources to obtain
information about the process are:
Engineering data: often product or process has performance
ratings provided by the manufacturer or company rules specify
time or production standards.
Expert option: people who are experienced with the process or
similar processes, often, they can provide optimistic, pessimistic
and most-likely times, and they may know the variability as well.
Physical or conventional limitations: physical limits on
performance, limits or bounds that narrow the range of the input
process.
The nature of the process.
The uniform, triangular, and beta distributions are often used
as input models.
Speed of a vehicle?
Chapter 8. Input Modeling
36
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Selecting Model without Data
Example: Production planning
simulation.
Input of sales volume of various
products is required, salesperson
of product XYZ says that:
- No fewer than 1000 units and no
more than 5000 units will be sold.
- Given her experience, she believes
there is a 90% chance of selling
more than 2000 units, a 25%
chance of selling more than 2500
units, and only a 1% chance of
selling more than 4500 units.
Interval (Sales)
1000 <= X <= 2000
0,1
0,10
2000 < X <=2500
0,65
0,75
2500 < X <= 4500
0,24
0,99
4500 < X <= 5000
0,01
1,00
1,20
1,00
0,80
0,60
Translating these information into
a cumulative probability of being
less than or equal to those goals
for simulation input:
Chapter 8. Input Modeling
Cumulative
Frequency, ci
0,40
0,20
0,00
1000 <= X <= 2000
37
2000 < X <=2500
2500 < X <= 4500
4500 < X <= 5000
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Multivariate and Time-Series Input Models
The random variable discussed until now were considered to be
independent of any other variables within the context of the problem
However, variables may be related
If they appear as input, the relationship should be investigated and taken
into consideration
Multivariate input models
Fixed, finite number of random variables
For example, lead time and annual demand for an inventory model
An increase in demand results in lead time increase, hence variables are
dependent.
Time-series input models
Infinite sequence of random variables
For example, time between arrivals of orders to buy and sell stocks
Buy and sell orders tend to arrive in bursts, hence, times between arrivals
are dependent.
Chapter 8. Input Modeling
38
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Covariance and Correlation
Consider the model that describes relationship between X1 and X2:
( X 1 1 ) = ( X 2 2 ) +
is a random
variable with mean 0
and is independent
of X2
= 0, X1 and X2 are statistically independent
> 0, X1 and X2 tend to be above or below their means together
< 0, X1 and X2 tend to be on opposite sides of their means
Covariance between X1 and X2:
cov( X 1 , X 2 ) = E[( X 1 1 )( X 2 2 )] = E ( X 1 X 2 ) 1 2
Covariance between X1 and X2:
where
Chapter 8. Input Modeling
= 0
cov( X 1 , X 2 )< 0
> 0
= 0
< 0
> 0
39
< cov( X 1 , X 2 ) <
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Covariance and Correlation
Correlation between X1 and X2 (values between -1 and 1):
= corr( X 1 , X 2 ) =
where
= 0
corr ( X 1 , X 2 )< 0
> 0
cov( X 1 , X 2 )
1 2
= 0
< 0
> 0
The closer is to -1 or 1, the stronger the linear relationship is between
X1 and X2.
Chapter 8. Input Modeling
40
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Covariance and Correlation
A time series is a sequence of random variables X1, X2, X3,
which are identically distributed (same mean and variance)
but dependent.
cov(Xt, Xt+h) is the lag-h autocovariance
corr(Xt, Xt+h) is the lag-h autocorrelation
If the autocovariance value depends only on h and not on t, the
time series is covariance stationary
For covariance stationary time series, the shorthand for lag-h is
used
h = corr ( X t , X t + h )
Notice
autocorrelation measures the dependence between random
variables that are separated by h-1
Chapter 8. Input Modeling
41
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Multivariate Input Models
If X1 and X2 are normally distributed, dependence between them can
be modeled by the bivariate normal distribution with 1, 2, 12, 22
and correlation
To estimate 1, 2, 12, 22, see Parameter Estimation
To estimate , suppose we have n independent and identically distributed
pairs (X11, X21), (X12, X22), (X1n, X2n),
Then the sample covariance is
1 n
cov( X 1 , X 2 ) =
( X 1 j X 1 )( X 2 j X 2 )
n 1 j =1
The sample correlation is
=
Chapter 8. Input Modeling
cov( X 1 , X 2 )
1 2
42
Sample deviation
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Multivariate Input Models - Example
Let X1 the average lead time to deliver and X2 the annual demand
for a product.
Data for 10 years is available.
Demand
Lead Time
(X2)
(X1)
6,5
103
4,3
83
X 2 = 101.8, 2 = 9.93
6,9
116
6,0
97
csample = 8.66
6,9
112
6,9
104
5,8
106
7,3
109
4,5
92
6,3
96
X 1 = 6.14, 1 = 1.02
Covariance
8.66
= 0.86
1.02 9.93
Lead time and demand are strongly dependent.
Before accepting this model, lead time and demand should be checked
individually to see whether they are represented well by normal
distribution.
Chapter 8. Input Modeling
43
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Time-Series Input Models
If X1, X2, X3, is a sequence of identically distributed, but dependent
and covariance-stationary random variables, then we can represent
the process as follows:
Autoregressive order-1 model, AR(1)
Exponential autoregressive order-1 model, EAR(1)
- Both have the characteristics that:
h = corr( X t , X t + h ) = h ,
for h = 1,2,...
- Lag-h autocorrelation decreases geometrically as the lag increases, hence,
observations far apart in time are nearly independent
Chapter 8. Input Modeling
44
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
AR(1) Time-Series Input Models
Consider the time-series model:
X t = + ( X t 1 ) + t ,
for t = 2,3,...
where 2 , 3 , are i.i.d. normally distributed with = 0 and variance 2
If initial value X1 is chosen appropriately, then
X1, X2, are normally distributed with mean = , and variance = 2/(1-2)
Autocorrelation h = h
To estimate , , 2 :
= X ,
= (1 2 ) ,
2
cov( X t , X t +1 )
=
2
where cov( X t , X t +1 ) is the lag-1 autocovariance
Chapter 8. Input Modeling
45
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
EAR(1) Time-Series Input Models
Consider the time-series model:
with probability
X t 1 ,
Xt =
X t 1 + t , with probability 1-
for t = 2,3,...
where 2 , 3 , are i.i.d. exponentially distributed with = 1/, and 0 < 1
If X1 is chosen appropriately, then
X1, X2, are exponentially distributed with mean = 1/
Autocorrelation h = h , and only positive correlation is allowed.
To estimate , :
= 1 / X ,
cov( X t , X t +1 )
= =
2
where cov( X t , X t +1 ) is the lag-1 autocovariance
Chapter 8. Input Modeling
46
Dr. Mesut Gne
Computer Science, Informatik 4
Communication and Distributed Systems
Summary
In this chapter, we described the 4 steps in developing input data
models:
1)
2)
3)
4)
Collecting the raw data
Identifying the underlying statistical distribution
Estimating the parameters
Testing for goodness of fit
Chapter 8. Input Modeling
47
Dr. Mesut Gne