In contrast to validation, which is concerned with the appropriateness of the conceptual model for the questions of the intended study, output analysis of the results of computer experiments is concerned with the outputs of this model and its performance. It is therefore mainly a statistical task involving problems such as determining the run length of a simulation or the necessary number of replications (Law and Kelton, 1991). Which simulation outputs and performance measures are relevant for analyzing a simulation and for drawing appropriate conclusions about the real-world system depends on the nature of the system output. With respect to their output and behavior over time, systems and their conceptual models for simulation can be divided into (i) steady-state or transient systems (concerning the behavior of output over time) and (ii) terminating or non-terminating systems (concerning the time horizon of the system) (Liebl, 1992; Pidd, 1992). These two factors, each with two levels, combine into only three reasonable types of systems (Liebl, 1992):15
• non-terminating steady-state systems: These systems reach a long-term equilibrium state with no trend components, so that the system outputs are time-invariant. A system is considered to be in a steady state if its current behavior is independent of the starting conditions and if the probability of being in one of its states is governed by a fixed probability function, which means that the system may change its state but the probability of doing so can be determined. A steady-state system therefore embodies a stationary stochastic process. Systems that can reach a steady state clearly should be evaluated when in this state, i.e. when the effects of the initial conditions are no longer noticeable (Liebl, 1992; Pidd, 1992).
• non-terminating transient systems: Most systems, however, do not reach a steady state but are non-terminating as well, for example due to inputs that vary over time ('rush hour' or seasonal trends). An example is an airport that is not closed at night but has lower airplane arrival rates during the night and higher ones during rush hours (Liebl, 1992; Pidd, 1992).
• terminating transient systems: Whereas non-terminating systems are considered as a continuum (like airports without night closure, or phone networks that are started only once and are then considered to run infinitely), terminating transient systems have a natural start and end. Their time horizon is finite because they are terminated by particular events (Pidd, 1992). An example is a post office that opens at 9:00 am and closes at 5:00 pm.
15 The fourth combination would be a system that reaches a steady state but is terminating, which is a contradiction, as systems with a logical start and end do not satisfy the time invariance of output measures necessary for a steady state (Liebl, 1992).
The output or response of a simulation run depends on the input. Even in steady-state systems, the starting conditions as input to the system influence if and when the system reaches a steady state. In simulations where these inputs are random variables, a single run of the simulation only yields information about the simulation output for these specific values of the input variable (or combination of values in the case of more than one input variable). Therefore we are usually not interested in one input-specific response but rather in the distribution of the response, or in summary measures of this output distribution over different input values, such as the mean, standard deviation, variance, quantiles, and minimum or maximum (Kelton, 1999).
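The idea of summarizing the response distribution over many random inputs can be sketched as follows. This is only an illustration under assumptions that are not taken from the text: a hypothetical single-clerk post office with exponential interarrival and service times, an eight-hour day measured in minutes, and the function names `simulate_post_office` and `summarize`.

```python
import random
import statistics


def simulate_post_office(seed, arrival_rate=0.5, service_rate=0.6,
                         closing_time=480.0):
    """One replication of a hypothetical single-clerk post office.

    Times are in minutes. Returns the mean waiting time of all customers
    who arrive before closing (one observation of the response).
    """
    rng = random.Random(seed)          # each replication gets its own inputs
    clock = 0.0
    server_free_at = 0.0
    waits = []
    while True:
        clock += rng.expovariate(arrival_rate)   # next arrival
        if clock >= closing_time:
            break                                # terminating event
        start = max(clock, server_free_at)       # wait if the clerk is busy
        waits.append(start - clock)
        server_free_at = start + rng.expovariate(service_rate)
    return statistics.fmean(waits) if waits else 0.0


def summarize(responses):
    """Summary measures of the response distribution over replications."""
    q1, median, q3 = statistics.quantiles(responses, n=4)
    return {
        "mean": statistics.fmean(responses),
        "stdev": statistics.stdev(responses),
        "variance": statistics.variance(responses),
        "q1": q1, "median": median, "q3": q3,
        "min": min(responses), "max": max(responses),
    }


# 200 replications, each driven by a different realization of the inputs.
responses = [simulate_post_office(seed) for seed in range(200)]
summary = summarize(responses)
```

A single call to `simulate_post_office` corresponds to one input-specific response; only the collection `responses` says something about the distribution.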
The output measures of interest for non-terminating systems are usually average measures, as aggregate measures make no sense given the infinite time horizon of the system. For non-terminating steady-state systems, independence of the average measures from the initial input variables can be achieved by deleting observations from the so-called run-in phase, in which the output measures of the system are still influenced by the starting conditions of the simulation and are not yet in a steady state.16 Even then, however, the observations will not be independent (a major requirement for the applicability of many statistical techniques) but are likely to be autocorrelated (Liebl, 1992; Pidd, 1992). In a simple queuing system, for example, the waiting time of a customer depends on the number of people already waiting when the customer arrives, and therefore on the waiting times of preceding customers (Pidd, 1992). To achieve independence of observations in non-terminating steady-state systems (after removing the observations of the run-in phase), the whole output time series can be divided into batches for which average measures are calculated. These batch averages are likely to be independent of each other when the batches are long enough. Batches can also be used for non-terminating transient systems if their output shows cyclical or periodic behavior; the cycles or periods then form the batches for which average measures are calculated. If such cyclical behavior cannot be observed, the only way to account for the input dependence of the outcomes is, as for terminating systems, to perform several replications of the simulation run and to calculate average or aggregated measures over these replications. In terminating systems, one run from the start condition of the simulation until the critical event that terminates the simulation yields a single observation of the response of interest. This observation clearly incorporates start-up and end effects due to the specific input variables. To obtain several observations in terminating systems, from which summary measures of the response variables can be calculated that are then independent of the simulation input, several replications of the simulation over its time horizon are necessary (Kleijnen, 1987; Pidd, 1992).
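The deletion of the run-in phase and the batching of a steady-state output series can be sketched as below. The queue model (a single-server queue simulated with the Lindley recursion), the warm-up length of 5,000 customers, the number of batches, and all function names are assumptions chosen for illustration, not values from the text.

```python
import random
import statistics


def waiting_times(n_customers, arrival_rate=0.8, service_rate=1.0, seed=1):
    """Waiting times of successive customers in a single-server FIFO queue,
    started empty and idle (Lindley recursion)."""
    rng = random.Random(seed)
    waits = []
    wait = 0.0
    for _ in range(n_customers):
        waits.append(wait)
        service = rng.expovariate(service_rate)
        interarrival = rng.expovariate(arrival_rate)
        # The next customer's wait depends on the current customer's wait,
        # which is the source of the autocorrelation.
        wait = max(0.0, wait + service - interarrival)
    return waits


def batch_means(series, warmup, n_batches):
    """Drop the run-in phase, then average equally sized batches."""
    kept = series[warmup:]
    size = len(kept) // n_batches
    return [statistics.fmean(kept[i * size:(i + 1) * size])
            for i in range(n_batches)]


def lag1_autocorrelation(xs):
    """Sample lag-1 autocorrelation of a series."""
    mean = statistics.fmean(xs)
    num = sum((a - mean) * (b - mean) for a, b in zip(xs, xs[1:]))
    den = sum((x - mean) ** 2 for x in xs)
    return num / den


waits = waiting_times(100_000)
means = batch_means(waits, warmup=5_000, n_batches=20)
raw_rho = lag1_autocorrelation(waits[5_000:])   # individual observations
batch_rho = lag1_autocorrelation(means)         # batch averages
```

Comparing `raw_rho` with `batch_rho` gives a rough indication of whether the chosen batch length is long enough; with only 20 batch means the second estimate is itself quite noisy, so it should be read as a diagnostic rather than a test.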
16 Whether the simulation is started from an 'empty-and-idle' state or with 'typical' starting conditions (which might be difficult to know exactly and must be equal for all runs or system versions), there will be such a run-in phase, which has to be deleted from the observations. It might, however, be shorter with the latter approach of starting the simulation (Pidd, 1992).
An additional possibility to increase the accuracy of the outcome measurement of a system is the artificial reduction of the output variance by means of so-called variance reduction techniques (Law and Kelton, 1991; Liebl, 1992). Variance reduction is a procedure to increase the precision of the estimates that can be obtained from a given number of replications of a simulation. When random input variables are used, every output variable of a simulation is itself a random variable with a particular variance, which limits the precision of the simulation results. To render a simulation statistically more efficient, i.e. to obtain greater precision for the output variables of interest, variance reduction techniques such as common random numbers, antithetic variates, control variates, indirect estimation, conditioning, importance sampling, and stratified sampling can be used (Law and Kelton, 1991; Liebl, 1992). We will focus on common random numbers, which is, perhaps due to its simplicity, the most popular and powerful variance reduction technique. Furthermore, the common random numbers technique is the only one of those mentioned above that is directly applicable to comparisons of two or more alternative system configurations, which is the objective of this dissertation, while the other variance reduction techniques are applicable only to the investigation of a single configuration (Law and Kelton, 1991).17
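How the output variance limits precision can be made explicit with the usual confidence interval for a mean over n independent replications. The notation below (responses Y_i, sample mean, sample variance S^2, confidence level 1 - alpha) is introduced here for illustration and assumes approximately normal, independent replication results.

```latex
\bar{Y} \pm t_{n-1,\,1-\alpha/2}\,\sqrt{\frac{S^2}{n}},
\qquad
\bar{Y} = \frac{1}{n}\sum_{i=1}^{n} Y_i,
\qquad
S^2 = \frac{1}{n-1}\sum_{i=1}^{n}\bigl(Y_i - \bar{Y}\bigr)^2
```

For a fixed n, the half-width of the interval shrinks with the square root of the variance, so reducing the variance to a quarter has roughly the same effect on precision as quadrupling the number of replications.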
The basic idea of common random numbers is that alternative configurations should be compared under similar experimental conditions, so that observed differences are due to differences in the system configuration rather than to fluctuations of the experimental conditions, i.e. the different realizations of the random number generator used in the simulation program for interarrival times, demand sizes, etc. The common random numbers technique requires the synchronization of the random number streams, i.e. the use of the same realizations of the random input variables for the same purposes in all system configurations to be compared. The simulation program thereby experiences the same environmental input for the different system configurations, and differences in the output can only be due to the differences in system configuration (Law and Kelton, 1991).18
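A minimal sketch of synchronized streams follows. It rests on the identity Var(A - B) = Var(A) + Var(B) - 2 Cov(A, B): if both configurations see the same inputs, their responses tend to be positively correlated and the variance of the estimated difference shrinks. The queue model, the two service rates, the seeding scheme, and the function names are hypothetical and serve only as an illustration.

```python
import math
import random
import statistics


def exponential(rng, rate):
    """Inverse-transform sample: exactly one uniform number per variate,
    which keeps the streams synchronized across configurations."""
    return -math.log(1.0 - rng.random()) / rate


def mean_wait(service_rate, arrival_seed, service_seed,
              n_customers=2000, arrival_rate=0.8):
    """Mean waiting time in a single-server queue (Lindley recursion)."""
    arrivals = random.Random(arrival_seed)   # stream used only for interarrival times
    services = random.Random(service_seed)   # stream used only for service times
    wait, total = 0.0, 0.0
    for _ in range(n_customers):
        total += wait
        interarrival = exponential(arrivals, arrival_rate)
        service = exponential(services, service_rate)
        wait = max(0.0, wait + service - interarrival)
    return total / n_customers


def differences(n_reps, common):
    """Response differences between a slow (1.0) and a fast (1.1) server."""
    diffs = []
    for rep in range(n_reps):
        seeds_a = (2 * rep, 2 * rep + 1)
        # Common random numbers: configuration B reuses A's seeds.
        # Otherwise B gets seeds that do not overlap with A's.
        seeds_b = seeds_a if common else (10_000 + 2 * rep, 10_001 + 2 * rep)
        diffs.append(mean_wait(1.0, *seeds_a) - mean_wait(1.1, *seeds_b))
    return diffs


crn = differences(100, common=True)
ind = differences(100, common=False)
var_crn = statistics.variance(crn)   # variance of the difference with CRN
var_ind = statistics.variance(ind)   # variance with independent streams
```

Using one dedicated stream per purpose (arrivals, services) is what keeps the configurations synchronized here: a change in the service rate rescales the service times but does not shift which random number is used for which customer.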