Proceedings of the 1991 Winter Simulation Conference
Barry L. Nelson, W. David Kelton, Gordon M. Clark (eds.)

SIMULATION MODEL VERIFICATION AND VALIDATION*

Robert G. Sargent
Simulation Research Group
449 Link Hall
Syracuse University
Syracuse, New York 13244

ABSTRACT

This paper discusses verification and validation of simulation models. The different approaches to deciding model validity are described; how model verification and validation relate to the model development process is specified; various validation techniques are defined; conceptual model validity, model verification, operational validity, and data validity are discussed; ways to document results are given; and a recommended validation procedure is presented.

1 INTRODUCTION

Simulation models are increasingly being used in problem solving and to aid in decision making. The developers and users of these models, the decision makers using information derived from the results of the models, and the people affected by decisions based on such models are all rightly concerned with whether a model and its results are "correct". This concern is addressed through model verification and validation. Model validation is usually defined to mean "substantiation that a computerized model within its domain of applicability possesses a satisfactory range of accuracy consistent with the intended application of the model" (Schlesinger et al. 1979), and that is the definition used here. Model verification is often defined as "ensuring that the computer program of the computerized model and its implementation is correct", and that is the definition adopted here. A related topic is model credibility (or acceptability), which is developing in the (potential) users of information from the models (e.g., decision makers) sufficient confidence in the information that they are willing to use it.

A model should be developed for a specific purpose or application, and its validity determined with respect to that purpose. If the purpose of a model is to answer a variety of questions, the validity of the model needs to be determined with respect to each question. (Different models of the same system are sometimes developed for different purposes.) Several sets of experimental conditions are usually required to define the domain of a model's intended applicability. A model may be valid for one set of experimental conditions and invalid in another. A model is considered valid for a set of experimental conditions if its accuracy is within its acceptable range of accuracy, which is the amount of accuracy required for the model's intended purpose. This generally requires that the variables of interest, i.e., the variables used in answering the questions in the purpose of the model, be identified and their required accuracy determined. If the variables of interest are random variables, then properties and functions of the random variables, such as their means and variances, are frequently what is of primary interest and are what are used in determining model validity. Several versions of a model are usually developed prior to obtaining a satisfactory valid model. The substantiation that a model is valid, i.e., model verification and validation, is generally considered to be a process and is usually part of the model development process.

* This paper is an updated version of "A Tutorial on Validation and Verification of Simulation Models", Proceedings of the 1988 Winter Simulation Conference, pp. 33-39.
It is often too costly and time consuming to determine that a model is absolutely valid over the complete domain of its intended applicability. Instead, tests and evaluations are conducted until sufficient confidence is obtained that a model can be considered valid for its intended application (Sargent 1982, 1984; Shannon 1975, 1981). The relationship between the cost of performing model validation and the value of the model to the user, each as a function of model confidence, is illustrated in Figure 1 (a similar relationship holds for the amount of time). The cost of model validation is usually quite significant, in particular where extremely high confidence is required because of the consequences of an invalid model.

[Figure 1: Model Confidence — cost of model validation and value of the model to the user plotted against model confidence from 0% to 100%]

The remainder of this paper is organized as follows: Section 2 discusses the three basic approaches used in deciding model validity; Section 3 defines the validation techniques used; Sections 4, 5, 6, and 7 contain descriptions of data validity, conceptual model validity, computerized model verification, and operational validity, respectively; Section 8 describes ways of presenting results; Section 9 contains a recommended validation procedure; and Section 10 gives the conclusions.

2 VALIDATION PROCESS

There are three basic decision-making approaches used in determining that a simulation model is valid. Each of these approaches requires the model development team to conduct verification and validation as part of the model development process, which is discussed below in some detail.

The most common decision-making approach is for the model development team to make the decision that the model is valid. This is a subjective decision based on the results of the various tests and evaluations conducted as part of the model development process.

Another approach, often called independent verification and validation (IV&V), uses a third (independent) party to decide whether the model is valid. The third party is independent of both the model development team and the model sponsor/user(s). After the model has been developed, the third party conducts an evaluation to determine whether the model is valid. Based upon this validation, the third party makes a subjective decision on the validity of the model. This approach is usually used when there is a large cost associated with the problem the simulation model is being used for and/or to help establish model credibility. The evaluation used in the IV&V approach can be as simple as reviewing the verification and validation performed by the model development team, or it may involve a complete verification and validation effort. Wood (1986) describes experiences over this range of evaluation by a third party on energy models. One conclusion Wood (1986) draws is that a complete IV&V evaluation is extremely costly and time consuming for what is gained. This author's view is that if a third party is to be used, they should be involved during the model development process. If the model has already been developed, this author believes that a third party should usually only evaluate the verification and validation that has already been performed and not repeat earlier work. (Also see Davis (1986) for an approach that simultaneously specifies and validates a model.)

The last decision-making approach is to use a scoring model (see, e.g., Balci (1989) and Gass (1979)) to determine whether a model is valid.
Scores (or weights) are determined subjectively when conducting various aspects of the validation process, and these scores are then combined to determine category scores and an overall score for the simulation model. A simulation model is considered valid if its overall and category scores are greater than some passing score(s). This approach is infrequently used in practice.

This author does not believe in the use of a scoring model for determining validity. One reason is that the subjectiveness of this approach tends to be hidden and thus it appears to be objective. A second reason is the difficulty of determining how "passing scores" are to be set. A third reason is that a model may receive a passing score and yet have a defect that needs correction. A fourth reason is that the score(s) may cause overconfidence in a model or be used to argue that one model is better than another.

We will now discuss how model verification and validation relate to the model development process. There are two common ways to view this relationship. One way uses a detailed model development process and the other uses a simple model development process. Banks, Gerstein, and Searles (1988) reviewed work in both of these ways and concluded that the simple way more clearly illuminates model verification and validation. This author recommends the use of the simple way (see, e.g., Sargent 1982), and it is the way presented here.

Consider the simplified version of the modelling process in Figure 2. The problem entity is the system (real or proposed), idea, situation, policy, or phenomena to be modelled; the conceptual model is the mathematical/logical/verbal representation (mimic) of the problem entity developed for a particular study; and the computerized model is the conceptual model implemented on a computer. The conceptual model is developed through an analysis and modelling phase, the computerized model is developed through a computer programming and implementation phase, and inferences about the problem entity are obtained by conducting computer experiments on the computerized model in the experimentation phase.

[Figure 2: Simplified Version of the Modelling Process — the problem entity, conceptual model, and computerized model are connected by the analysis and modelling, computer programming, and experimentation phases; conceptual model validity, computerized model verification, operational validity, and data validity relate to these elements]

We will now relate model validation and verification to this simplified version of the modelling process (see Figure 2). Conceptual model validity is defined as determining that the theories and assumptions underlying the conceptual model are correct and that the model representation of the problem entity is "reasonable" for the intended purpose of the model. Computerized model verification is defined as ensuring that the computer programming and implementation of the conceptual model is correct. Operational validity is defined as determining that the model's output behavior has sufficient accuracy for its intended purpose over the domain of the model's intended applicability. Data validity is defined as ensuring that the data necessary for model building, model evaluation and testing, and conducting the model experiments to solve the problem are adequate and correct.

Several versions of a model are usually developed in the modelling process prior to obtaining a satisfactory valid model. During each model iteration, model verification and validation are performed (Sargent 1984). A variety of (validation) techniques are used, which are described below.
Unfortunately, no algorithm or procedure exists to select which techniques to use. Some of their attributes are discussed in Sargent (1984).

3 VALIDATION TECHNIQUES

This section describes various validation techniques (and tests) used in model verification and validation. Most of the techniques described here are found in the literature (see Balci and Sargent (1984a) for a detailed bibliography), although some may be described slightly differently. They can be used either subjectively or objectively. By objectively, we mean using some type of statistical test or procedure, e.g., hypothesis tests or confidence intervals. A combination of techniques is usually used. These techniques are used for validating and verifying both the overall model and its submodels.

Animation (Operational Graphics): The model's operational behavior is displayed graphically as the model moves through time. Examples are (i) a graphical plot of the status of a server over time, e.g., whether it is busy, idle, or blocked, and (ii) a graphical display of parts moving through a factory.

Comparison to Other Models: Various results (e.g., outputs) of the simulation model being validated are compared to results of other (valid) models. For example, (i) simple cases of a simulation model may be compared to known results of analytic models, and (ii) the model may be compared to other (simpler) models that have already been validated. (Sometimes the simulation model being validated requires modification to allow the comparisons to be made.)

Degenerate Tests: The degeneracy of the model's behavior is tested by removing portions of the model or by appropriate selection of values of the input and internal parameters. For example, does the average number in the queue of a single server continue to increase over time when the arrival rate is larger than the service rate?

Event Validity: The "events" or occurrences of the simulation model are compared to those of the real system to determine if they are the same. An example of events is deaths in a fire department simulation.

Extreme-Condition Tests: The model structure and output should be plausible for any extreme and unlikely combination of levels of factors in the system; e.g., if in-process inventories are zero, production output should be zero. Also, the model should be bounded to restrict behavior outside of normal operating ranges.

Face Validity: Face validity is asking people knowledgeable about the system whether the model and/or its behavior is reasonable. This technique can be used to determine whether the logic in the conceptual model is correct and whether a model's input-output relationships are reasonable.

Fixed Values: Fixed values are used for all model input and internal variables. This allows the model results to be checked against hand-calculated values.

Historical Data Validation: If historical data exist (or if data are collected on a system for building or testing the model), part of the data is used to build the model and the remaining data are used to determine (test) whether the model behaves as the system does. (This testing is conducted by driving the simulation model with either distributions or traces (Balci and Sargent 1982a, 1982b, 1984b).)
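Two of the techniques defined above, Comparison to Other Models and Degenerate Tests, lend themselves to short programmed checks for simple queueing models. The sketch below is illustrative only and is not taken from the paper: it assumes an M/M/1 structure, compares the simulated mean delay in queue with the analytic value lambda/(mu(mu - lambda)), and then checks that the delay keeps growing when the arrival rate exceeds the service rate.

```python
# A minimal sketch combining Comparison to Other Models and a Degenerate Test
# for a single-server (M/M/1) queueing model.  The arrival and service rates
# used here are illustrative assumptions.
import random

def mean_delay(arrival_rate, service_rate, n_customers, seed=0):
    """Simulate a FIFO single-server queue and return the mean delay in queue."""
    rng = random.Random(seed)
    clock = departure = total_delay = 0.0
    for _ in range(n_customers):
        clock += rng.expovariate(arrival_rate)          # next arrival time
        start = max(clock, departure)                   # wait if the server is busy
        total_delay += start - clock
        departure = start + rng.expovariate(service_rate)
    return total_delay / n_customers

# Comparison to Other Models: for an M/M/1 queue the analytic mean delay in
# queue is lambda / (mu * (mu - lambda)); the simulated value should be close.
lam, mu = 0.8, 1.0
analytic = lam / (mu * (mu - lam))
simulated = mean_delay(lam, mu, n_customers=200_000)
print(f"analytic {analytic:.3f}  simulated {simulated:.3f}")

# Degenerate Test: with the arrival rate above the service rate the queue is
# unstable, so the mean delay should keep growing as the run length increases.
for n in (1_000, 10_000, 100_000):
    print(n, round(mean_delay(1.2, 1.0, n), 2))
```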
Historical Methods: The three historical methods of validation are Rationalism, Empiricism, and Positive Economics. Rationalism assumes that everyone knows whether the underlying assumptions of a model are true; logical deductions from these assumptions are then used to develop the correct (valid) model. Empiricism requires every assumption and outcome to be empirically validated. Positive Economics requires only that the model be able to predict the future and is not concerned with a model's assumptions or structure (causal relationships or mechanisms).

Internal Validity: Several replications (runs) of a stochastic model are made to determine the amount of internal stochastic variability in the model. A high amount of variability (lack of consistency) may cause the model's results to be questionable and, if typical of the problem entity, may call into question the appropriateness of the policy or system being investigated.

Multistage Validation: Naylor and Finger (1967) proposed combining the three historical methods of Rationalism, Empiricism, and Positive Economics into a multistage process of validation. This validation method consists of (1) developing the model's assumptions on theory, observations, general knowledge, and function, (2) validating the model's assumptions where possible by empirically testing them, and (3) comparing (testing) the input-output relationships of the model to the real system.

Parameter Variability - Sensitivity Analysis: This validation technique consists of changing the values of the input and internal parameters of a model to determine the effect upon the model's behavior and its output. The same relationships should occur in the model as in the real system. Those parameters that are sensitive, i.e., cause significant changes in the model's behavior, should be made sufficiently accurate prior to using the model. (This may require iterations in model development.)

Predictive Validation: The model is used to predict (forecast) the system behavior, and comparisons are made to determine if the system behavior and the model's forecast are the same. The system data may come from an operational system or from experiments performed on the system, e.g., field tests.

Traces: The behavior of different types of specific entities in the model are traced (followed) through the model to determine if the model's logic is correct and if the necessary accuracy is obtained.

Turing Tests: People who are knowledgeable about the operations of a system are asked if they can discriminate between system and model outputs. (Schruben (1980) contains statistical procedures for Turing tests.)
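Internal Validity and Parameter Variability - Sensitivity Analysis can also be supported by short scripts. The following hedged sketch reuses the same illustrative single-server queue as the earlier example (repeated so the snippet runs on its own); the parameter values are assumptions, not values from the paper.

```python
# A minimal sketch of Internal Validity and Parameter Variability - Sensitivity
# Analysis for an illustrative single-server queueing model.
import random
import statistics

def mean_delay(arrival_rate, service_rate, n_customers, seed):
    """Mean delay in queue for a FIFO single-server queue (one replication)."""
    rng = random.Random(seed)
    clock = departure = total = 0.0
    for _ in range(n_customers):
        clock += rng.expovariate(arrival_rate)
        start = max(clock, departure)
        total += start - clock
        departure = start + rng.expovariate(service_rate)
    return total / n_customers

# Internal Validity: independent replications (different seeds) show how much
# internal stochastic variability the model produces.
reps = [mean_delay(0.8, 1.0, 10_000, seed) for seed in range(10)]
print(f"mean {statistics.mean(reps):.2f}  std dev across replications {statistics.stdev(reps):.2f}")

# Parameter Variability - Sensitivity Analysis: vary the service rate and check
# that the output moves in the direction expected of the real system
# (faster service should mean shorter delays).
for mu in (0.9, 1.0, 1.1, 1.25):
    print(f"service rate {mu:.2f}  mean delay {mean_delay(0.8, mu, 10_000, seed=42):.2f}")
```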
4 DATA VALIDITY

Even though data validity is usually not considered part of model validation, we discuss it because it is usually difficult, time consuming, and costly to obtain sufficient, accurate, and appropriate data, and because data problems are frequently the reason that early attempts to validate a model fail. Basically, data are needed for three purposes: for building the conceptual model, for validating the model, and for performing experiments with the validated model. In model validation, we are concerned only with the first two types of data.

To build a conceptual model, we must have sufficient data on the problem entity to develop theories that can be used in building the model, to develop the mathematical and logical relationships in the model so that it adequately represents the problem entity for its intended purpose, and to test the model's underlying assumptions. Also needed are behavior data on the problem entity, to be used in the operational validity step of comparing the problem entity's behavior with the model's behavior. (Usually, these data are system input/output data.) If these data are not available, high model confidence usually cannot be obtained because sufficient operational validity cannot be achieved.

The concern with data is that appropriate, accurate, and sufficient data are available and that, if any data transformations such as disaggregation are made, they are correctly performed. Unfortunately, there is not much that can be done to ensure that the data are correct. The best that one can do is to develop good procedures for collecting and maintaining data, and to test the collected data using techniques such as internal consistency checks and screening for outliers, determining whether the data are correct. If the amount of data is large, a data base should be developed and maintained.

5 CONCEPTUAL MODEL VALIDATION

Conceptual model validity is determining that the theories and assumptions underlying the conceptual model are correct, and that the model representation of the problem entity and the model's structure, logic, and mathematical and causal relationships are "reasonable" for the intended purpose of the model. The theories and assumptions underlying the model should be tested using mathematical analysis and statistical methods on problem entity data. Examples of theories and assumptions are linearity, independence, stationarity, and Poisson arrivals. Examples of applicable statistical methods are fitting distributions to data; estimating parameter values such as means, variances, and correlations among data observations; and plotting data to see whether they are stationary. In addition, all theories used should be reviewed to ensure they were applied correctly; for example, if a Markov chain is used, does the system have the Markov property, and are the states and transition probabilities correct?

Next, each submodel and the overall model must be evaluated to determine if they are reasonable and correct for the intended purpose of the model. This should include determining whether the appropriate detail and aggregate relationships have been used for the model's intended purpose, and whether the appropriate structure, logic, and mathematical and causal relationships have been used. The primary validation techniques used for these evaluations are face validation and traces. Face validation is having experts on the problem entity evaluate the conceptual model to determine if they believe it is correct and reasonable for its purpose. This usually means examining the flowchart or graphical model, or the set of model equations. The use of traces is the tracking of entities through each submodel and the overall model to determine if the logic is correct and the necessary accuracy is maintained. If any errors are found in the conceptual model, it must be revised and conceptual model validation performed again.
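As an illustration of testing a distributional assumption with a statistical method, the short sketch below checks recorded inter-arrival times against the exponential distribution implied by a Poisson-arrival assumption, using SciPy's Kolmogorov-Smirnov test. The data are synthetic stand-ins for problem entity data; this is an illustrative sketch, not a procedure prescribed by the paper.

```python
# A sketch of testing the Poisson-arrival assumption: under that assumption the
# inter-arrival times should be exponentially distributed.  The synthetic data
# below stand in for measurements taken on the problem entity.
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)
interarrivals = rng.exponential(scale=2.0, size=500)   # stand-in for observed data

# Fit the exponential distribution (estimate the mean inter-arrival time) and run
# a Kolmogorov-Smirnov goodness-of-fit test against the fitted distribution.
# (Strictly, estimating the parameter from the same data makes the test approximate.)
scale = interarrivals.mean()
statistic, p_value = stats.kstest(interarrivals, "expon", args=(0, scale))
print(f"estimated mean inter-arrival time: {scale:.2f}")
print(f"K-S statistic {statistic:.3f},  p-value {p_value:.3f}")

# A small p-value would cast doubt on the Poisson-arrival assumption; a plot of
# the data over time (not shown) would also be examined for stationarity.
```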
6 COMPUTERIZED MODEL VERIFICATION

Computerized model verification is ensuring that the computer programming and implementation of the conceptual model is correct. To help ensure that a correct computer program is obtained, program design and development procedures found in the field of software engineering should be used in developing and implementing the computer program. These include such techniques as top-down design, structured programming, and program modularity. A separate program module should be used for each submodel, for the overall model, and for each simulation function (e.g., the time-flow mechanism, random number and random variate generators, and integration routines) when using general purpose higher order languages, e.g., FORTRAN or PASCAL, and where possible when using simulation languages (Chattergy and Pooch 1977). (See Whitner and Balci (1989) for a more detailed discussion of model verification.)

One should be aware that the type of computer language used affects the probability of having a correct program. The use of a special purpose simulation language, if appropriate, will generally result in fewer errors than a general purpose simulation language, and the use of a general purpose simulation language will generally result in fewer errors than a general purpose higher order language. Not only does the use of simulation languages increase the probability of having a correct program, it usually reduces programming time as well.

After the computer program has been developed, implemented, and, hopefully, most of the programming "bugs" removed, the program must be tested for correctness and accuracy. First, the simulation functions should be tested to see if they are correct; usually straightforward tests can be used here to determine whether they are working properly. Next, each submodel and the overall model should be tested to see if they are correct. Here the testing is much more difficult. There are two basic approaches to testing: static and dynamic testing (analysis) (Fairley 1976). In static testing, the computer program of the computerized model is analyzed to determine if it is correct by using such techniques as correctness proofs, structured walk-throughs, and examination of the structural properties of the program. The commonly used structured walk-through technique consists of each program developer explaining their computer program code, statement by statement, to other members of the modelling team until all are convinced it is correct (or incorrect).

In dynamic testing, the computerized model is executed under different conditions, and the values obtained are used to determine if the computer program and its implementation are correct. This includes both the values obtained during program execution and the final values obtained. There are three different strategies to use in dynamic testing: bottom-up testing, which means, e.g., testing the submodels first and then the overall model; top-down testing, which means, e.g., testing the overall model first using programming stubs (sets of data) for each of the submodels and then testing the submodels; and mixed testing, which uses a combination of bottom-up and top-down testing (Fairley 1976). The techniques commonly used in dynamic testing are traces, investigations of input-output relations using the validation techniques, internal consistency checks, and reprogramming critical components to determine if the same results are obtained. If there are a large number of variables, one might aggregate to reduce the number of tests needed, or use certain types of design of experiments (Kleijnen 1987), e.g., factor screening experiments (Smith and Mauro 1982), to identify the key variables in order to reduce the number of experimental conditions that need to be tested.
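As a small illustration of the "straightforward tests" mentioned above for simulation functions, and of dynamic bottom-up testing of a submodel, the sketch below uses Python's unittest module. The exponential variate generator and the single-server submodel are hypothetical stand-ins for modules of a computerized model; the fixed-values case checks the submodel against a hand-calculated result.

```python
# A sketch of dynamic, bottom-up testing of a simulation function and a submodel
# using Python's unittest module.  The functions under test are hypothetical.
import random
import unittest

def exponential_variate(rng, rate):
    """Simulation function under test: an exponential random variate generator."""
    return rng.expovariate(rate)

def single_server_delays(interarrivals, services):
    """Submodel under test: delays in queue for a FIFO single-server queue."""
    delays, arrival, departure = [], 0.0, 0.0
    for a, s in zip(interarrivals, services):
        arrival += a
        start = max(arrival, departure)
        delays.append(start - arrival)
        departure = start + s
    return delays

class VerificationTests(unittest.TestCase):
    def test_variate_generator_mean(self):
        # Straightforward test of a simulation function: the sample mean of many
        # variates should be close to the theoretical mean 1/rate.
        rng = random.Random(0)
        sample = [exponential_variate(rng, 2.0) for _ in range(100_000)]
        self.assertAlmostEqual(sum(sample) / len(sample), 0.5, delta=0.01)

    def test_submodel_fixed_values(self):
        # Fixed Values technique: with deterministic inputs the delays can be
        # hand calculated (arrivals at t = 1, 2, 3; each service takes 2 units).
        self.assertEqual(single_server_delays([1, 1, 1], [2, 2, 2]), [0.0, 1.0, 2.0])

if __name__ == "__main__":
    unittest.main()
```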
One must continuously be aware, while checking the correctness of the computer program and its implementation, that errors may be caused by the data, the conceptual model, the computer program, or the computer implementation.

7 OPERATIONAL VALIDITY

Operational validity is primarily concerned with determining that the model's output behavior has the accuracy required for the model's intended purpose over the domain of the model's intended applicability. This is where most of the validation testing and evaluation takes place. The computerized model is used in operational validity, and thus any deficiencies found may be due to an inadequate conceptual model, an improperly programmed or implemented conceptual model (e.g., due to programming errors or insufficient numerical accuracy), or invalid data.

All of the validation techniques discussed in Section 3 are applicable to operational validity. Which techniques to use, and whether to use them objectively or subjectively, must be decided by the model development team and the other interested parties. The major attribute affecting operational validity is whether the problem entity (or system) is observable, where observable means it is possible to collect data on the operational behavior of the problem entity. Figure 3 gives one classification of the validation approaches for operational validity. "Explore model behavior" means examining the behavior of the model, using appropriate validation techniques, for various sets of experimental conditions from the domain of the model's intended applicability, and usually includes parameter variability - sensitivity analysis.

To obtain a high degree of confidence in a model and its results, comparison of the model's and system's input-output behavior for at least two different sets of experimental conditions is usually required. There are three basic comparison approaches used: (i) graphs of the model and system behavior data, (ii) confidence intervals, and (iii) hypothesis tests. Graphs are the most commonly used approach, and confidence intervals are next.

Figure 3: Operational Validity Classification
- Observable system, subjective approach: comparison of data using graphical displays; explore model behavior.
- Observable system, objective approach: comparison of data using statistical tests and procedures.
- Non-observable system, subjective approach: explore model behavior; comparison to other models.
- Non-observable system, objective approach: comparison to other models using statistical tests and procedures.

7.1 Graphical Comparison of Data

The model's and system's behavior data are plotted on graphs for various sets of experimental conditions to determine if the model's output behavior has sufficient accuracy for its intended purpose. (See Figures 4 and 5 for examples of such graphs.) A variety of graphs using different types of measures and relationships are required. Examples of measures and relationships are (i) time series, means, variances, and maximums of each output variable, (ii) relationships between parameters of each output variable, e.g., means and standard deviations, and (iii) relationships between different output variables. It is important that appropriate measures and relationships be used in validating a model and that they be determined with respect to the model's intended purpose. As an example of a set of graphs used in the validation of a model, see Anderson and Sargent (1974).

[Figure 4: Disk Access — model and real system data plotted against the number of disk accesses]

[Figure 5: Reaction Time — real system and simulation model data plotted against the average value of reaction time (seconds)]
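A graphical comparison of this kind can be produced with a few lines of plotting code. The sketch below is illustrative only: the model and system observations are placeholder arrays and matplotlib is assumed to be available; with real data, one such plot would be produced for each measure, relationship, and set of experimental conditions.

```python
# A sketch of a graphical comparison of model and system behavior data: the mean
# and standard deviation of one output variable are plotted for several sets of
# experimental conditions.  The arrays below are placeholders for real data.
import numpy as np
import matplotlib.pyplot as plt

conditions = ["cond 1", "cond 2", "cond 3", "cond 4"]
system_obs = [np.array([12.1, 13.0, 11.7]), np.array([15.2, 14.8, 15.9]),
              np.array([18.4, 19.1, 17.6]), np.array([22.0, 21.3, 23.1])]
model_obs = [np.array([11.8, 12.6, 12.3]), np.array([15.5, 16.1, 14.9]),
             np.array([19.0, 18.2, 18.8]), np.array([23.4, 22.7, 24.0])]

# Plot mean +/- one standard deviation for the system and the model side by side.
x = np.arange(len(conditions))
for data, label, offset in ((system_obs, "system", -0.1), (model_obs, "model", 0.1)):
    means = [d.mean() for d in data]
    stds = [d.std(ddof=1) for d in data]
    plt.errorbar(x + offset, means, yerr=stds, fmt="o", capsize=4, label=label)

plt.xticks(x, conditions)
plt.ylabel("output variable (e.g., mean response time)")
plt.legend()
plt.title("Model versus system behavior by experimental condition")
plt.show()
```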
These graphs can be used in model validation in three ways. First, the model development team can use the graphs in the model development process to make a subjective judgement on whether the model does or does not possess sufficient accuracy for its intended purpose. Secondly, they can be used in the face validity technique, where experts are asked to make subjective judgements on whether a model does or does not possess sufficient accuracy for its intended purpose. The third way the graphs can be used is in Turing tests: sets of data from the model and from the system are plotted on separate graphs, the graphs are shuffled, and experts are then asked to determine which graphs are from the system and which are from the model. The results for each measure and relationship can be evaluated either subjectively or statistically. The subjective method requires a subjective decision on whether the results are satisfactory; the statistical method requires that the results be analyzed statistically. See Schruben (1980) for a variety of statistical methods for analyzing the results of Turing tests and examples of their use.

7.2 Confidence Intervals

Confidence intervals (c.i.), simultaneous confidence intervals (s.c.i.), and joint confidence regions (j.c.r.) can be obtained for the differences between the population parameters (e.g., means and variances) and distributions of the model and system output variables for each set of experimental conditions. These c.i., s.c.i., and j.c.r. can be used as the model range of accuracy for model validation.

To construct the model range of accuracy, a statistical procedure containing a statistical technique and a method of data collection must be developed for each set of experimental conditions and for each parameter of interest. The statistical techniques used can be divided into two groups: (a) univariate statistical techniques and (b) multivariate statistical techniques. The univariate techniques can be used to develop c.i. and, with the use of the Bonferroni inequality (Law and Kelton 1991), s.c.i. The multivariate techniques can be used to develop s.c.i. and j.c.r. Both parametric and nonparametric techniques can be used. The method of data collection must satisfy the underlying assumptions of the statistical technique being used. The standard statistical techniques and data collection methods used in simulation output analysis can be used for developing the model range of accuracy, namely (1) replication, (2) batch means, (3) regenerative, (4) spectral, (5) time series, and (6) standardized time series (Banks and Carson 1984; Law and Kelton 1991).
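As an illustration of the replication method with a univariate technique and the Bonferroni inequality, the sketch below builds simultaneous confidence intervals for the differences between model and system means of two output variables. The replication data are placeholders, and the two-sample t interval with the Welch degrees-of-freedom approximation is one reasonable choice among several; it is not the only procedure the paper allows.

```python
# A sketch of the confidence-interval approach: c.i. for the difference between
# the model and system means of two output variables, combined into simultaneous
# intervals with the Bonferroni inequality.  The replication data are placeholders.
import numpy as np
from scipy import stats

def diff_ci(model, system, confidence):
    """Two-sample t confidence interval (Welch) for mean(model) - mean(system)."""
    model, system = np.asarray(model, float), np.asarray(system, float)
    diff = model.mean() - system.mean()
    se2_m, se2_s = model.var(ddof=1) / len(model), system.var(ddof=1) / len(system)
    se = np.sqrt(se2_m + se2_s)
    dof = (se2_m + se2_s) ** 2 / (se2_m ** 2 / (len(model) - 1) + se2_s ** 2 / (len(system) - 1))
    half = stats.t.ppf(1 - (1 - confidence) / 2, dof) * se
    return diff - half, diff + half

# Placeholder data for two output variables (e.g., mean delay and mean queue
# length), one value per independent model replication / system observation.
outputs = {
    "mean delay":        ([4.1, 3.8, 4.4, 4.0, 4.3], [3.9, 4.2, 4.0, 3.7, 4.1]),
    "mean queue length": ([3.2, 3.5, 3.1, 3.4, 3.3], [3.0, 3.3, 3.2, 2.9, 3.1]),
}

# Bonferroni inequality: to obtain overall confidence of at least 90% across the
# two intervals, build each individual interval at 95%.
overall, k = 0.90, len(outputs)
each = 1 - (1 - overall) / k
for name, (model_reps, system_obs) in outputs.items():
    low, high = diff_ci(model_reps, system_obs, each)
    print(f"{name}: difference in means in [{low:.2f}, {high:.2f}] (each interval at {each:.0%})")
```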
It is usually desirable to construct the model range of accuracy with the lengths of the c.i. and s.c.i. and the sizes of the j.c.r. as small as possible. The shorter the lengths or the smaller the sizes, the more useful and meaningful the specification of the model range of accuracy will usually be. The lengths and sizes are affected by the values of the confidence levels, the variances of the model and system response variables, and the sample sizes. The lengths can be shortened, or the sizes made smaller, by decreasing the confidence levels. Variance reduction techniques (Law and Kelton 1991) can be used when collecting observations from a simulation model to decrease the variability and thus obtain a smaller range of accuracy. The lengths can also be shortened, or the sizes decreased, by increasing the sample sizes. A tradeoff needs to be made among the sample sizes, confidence levels, and estimates of the lengths or sizes of the model range of accuracy. In those cases where the cost of data collection is significant for either the model or the system, the data collection cost should also be considered in the tradeoff analysis. Tradeoff curves can be constructed to aid in this analysis. Figure 6 is an example of a set of tradeoff curves showing the relationship between the significance level, the estimated half length of the confidence interval, and the cost of data collection.

[Figure 6: Tradeoff Curves — estimated half length of the confidence interval versus data collection cost in dollars, for several significance levels]

Details on the use of c.i., s.c.i., and j.c.r. for operational validity, including a general methodology, are contained in Balci and Sargent (1984b). A brief discussion on the use of c.i. for model validation is also contained in Law and Kelton (1991).

7.3 Hypothesis Tests

Hypothesis tests can be used in the comparison of parameters, distributions, and time series of the output data of a model and a system, for each set of experimental conditions, to determine if the model's output behavior has an acceptable range of accuracy. An acceptable range of accuracy is the amount of accuracy that is required of a model to be valid for its intended purpose.

The first step in hypothesis testing is to state the hypotheses to be tested:

H0: The model is valid for the acceptable range of accuracy under the set of experimental conditions.    (1)
H1: The model is invalid for the acceptable range of accuracy under the set of experimental conditions.

Two types of errors are possible in testing the hypotheses in (1). The first, or type I, error is rejecting the validity of a valid model; the second, or type II, error is accepting the validity of an invalid model. The probability of a type I error is called the model builder's risk (α), and the probability of a type II error is called the model user's risk (β). In model validation the model user's risk is extremely important and must be kept small. Thus both type I and type II errors must be considered in using hypothesis testing for model validation.

The amount of agreement between a model and a system can be measured by a validity measure. The validity measure is chosen such that the model accuracy, or the amount of agreement between the model and the system, decreases as the value of the validity measure increases. The acceptable range of accuracy can be used to determine an acceptable validity range, 0 ≤ λ ≤ λ*, for the validity measure λ.
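As a concrete illustration of this approach (not a procedure prescribed in the paper), the sketch below performs a two-sample t-test on the difference between model and system means for one output variable and reports the estimated difference against a stated acceptable range of accuracy. Keeping the model user's risk β small would in practice also require choosing sample sizes with the power of the test in mind; that calculation is omitted here.

```python
# A sketch of an objective comparison by hypothesis test: a two-sample t-test on
# the difference between model and system means for one output variable, judged
# against an illustrative acceptable range of accuracy of +/- 0.5 time units.
import numpy as np
from scipy import stats

model_reps = np.array([4.1, 3.8, 4.4, 4.0, 4.3, 4.2, 3.9, 4.5])   # placeholder data
system_obs = np.array([3.9, 4.2, 4.0, 3.7, 4.1, 3.8, 4.0, 4.3])   # placeholder data
acceptable_difference = 0.5

t_stat, p_value = stats.ttest_ind(model_reps, system_obs, equal_var=False)
difference = model_reps.mean() - system_obs.mean()

print(f"estimated difference in means: {difference:+.2f}")
print(f"t = {t_stat:.2f}, p-value = {p_value:.3f}")
if abs(difference) <= acceptable_difference:
    print("observed difference lies within the stated acceptable range of accuracy")
else:
    print("observed difference exceeds the stated acceptable range of accuracy")
```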
