In this section, we investigate whether adding the empirical decision rules obtained from the scenario-based questionnaire data improves the validity of the base model. Two techniques are used for validation: mean error estimation and regression analysis. Mean error estimation measures the magnitude of the deviation of the model outputs from the real data, while regression analysis shows how well the model outputs reproduce the trends in the real data. To validate the ABS models, we used data on the cattle population, cow population, milk production and number of farmer households obtained from the farmer cooperative (KPBS, 2016). Both the government and the cooperative consider these variables important when recording their statistics.
Table 3.3 Cattle population, cow population, average daily milk production and the number of farmer households in Pangalengan West Java 2010-2016 (KPBS, 2016)

| Year | Cattle population (head) | Cow population (head) | Average daily production (litre) | Farmer households |
|---|---|---|---|---|
| January 2010 | 21,322 | 21,083 | 159,333 | 5072 |
| January 2011 | 21,438 | 20,960 | 136,694 | 4204 |
| January 2012 | 22,366 | 22,073 | 138,904 | 3439 |
| January 2013 | 16,173 | 16,080 | 97,476 | 3053 |
| January 2014 | 13,415 | 13,399 | 84,207 | 2888 |
| January 2015 | 12,563 | 12,555 | 76,372 | 2852 |

To estimate the mean error, we first measured the difference between the model outputs at the end of each simulation year and the real data, from January 2010 until December 2011 (i.e., $Error_i = Data_i - Simulation_i$, where $i = 2011, 2012$). This time interval was chosen because a drastic decline occurred in the cow population, cattle population and average daily milk production in 2012 (which appears in the data for January 2013). This decline was caused by an external factor that was not considered in the model, namely the policy to stop beef imports, which created an incentive for farmers to sell their productive cows for meat.
We then computed the mean error (ME) from 2011 to 2012 (i.e., $ME = \frac{1}{2}\sum_{i=2011}^{2012} Error_i$). Table 3.4 shows the average ($\overline{ME}$) and standard deviation ($S_{ME}$) of the ME over 25 replications. A t-test was then carried out to infer whether, in the long run, the model's average ME is zero. The two-tailed significance (Sig. column) of the t-test at the 95% confidence level is also presented in Table 3.4.
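As an illustration of the mean error calculation described above, the following Python sketch computes the ME of each replication and then the average and standard deviation across replications. The cattle figures for January 2011 and January 2012 are taken from Table 3.3. The simulated values, the three-replication setup, and the names `mean_error` and `summarise` are hypothetical and serve only to show the arithmetic. The sketch also assumes that the output at the end of a simulation year is compared with the record for the following January.

```python
import statistics

# Real cattle population (head) for January 2011 and January 2012 (Table 3.3).
REAL_CATTLE = [21438, 22366]

# Hypothetical simulated cattle populations, one pair per replication.
# These numbers are placeholders, not outputs of any model in this study.
SIMULATED_CATTLE = [
    [22000, 23500],
    [21000, 22900],
    [23800, 24100],
]


def mean_error(data, simulation):
    """ME = average of (Data_i - Simulation_i) over the compared years."""
    errors = [d - s for d, s in zip(data, simulation)]
    return sum(errors) / len(errors)


def summarise(data, replications):
    """Average and sample standard deviation of ME across replications."""
    me_values = [mean_error(data, rep) for rep in replications]
    return statistics.mean(me_values), statistics.stdev(me_values)


if __name__ == "__main__":
    avg_me, sd_me = summarise(REAL_CATTLE, SIMULATED_CATTLE)
    print(f"average ME = {avg_me:.1f}, S_ME = {sd_me:.1f}")
```

Because the error is defined as data minus simulation, a negative ME means that the simulation overestimates the real value on average.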
Table 3.4 Descriptive statistics and t-test results of the ABS models' errors relative to the real data

| Model name | Cattle population $(\overline{ME}, S_{ME})$ | Sig. | Cow population $(\overline{ME}, S_{ME})$ | Sig. | Daily milk production $(\overline{ME}, S_{ME})$ | Sig. | Farmer households $(\overline{ME}, S_{ME})$ | Sig. |
|---|---|---|---|---|---|---|---|---|
| M0 | (-2272.3, 4395.4) | 0.02 | (-1443.1, 4025.3) | 0.09 | (20600.2, 13421.1) | 0.00 | (62.6, 130.6) | 0.02 |
| Empirical models with one empirical decision rule | | | | | | | | |
| MSBQ1 | (-1494.8, 5075.0) | 0.15 | (-876.3, 4703.7) | 0.36 | (16811.3, 15151.1) | 0.00 | (48.3, 125.1) | 0.07 |
| MSBQ2 | (-1458.1, 5095.7) | 0.17 | (-867.4, 4573.9) | 0.35 | (6267.3, 18333.9) | 0.10 | (422.4, 313.5) | 0.00 |
| MSBQ3 | (-1755.2, 4643.9) | 0.07 | (-975.3, 4244.8) | 0.26 | (20693.8, 13108.0) | 0.00 | (61.0, 122.3) | 0.02 |
| Empirical models with two empirical decision rules | | | | | | | | |
| MSBQ4 | (-1506.9, 5088.8) | 0.15 | (-876.0, 4718.4) | 0.36 | (16943.8, 15206.1) | 0.00 | (45.1, 124.1) | 0.08 |
| MSBQ5 | (-1523.6, 5144.1) | 0.15 | (-965.3, 4593.2) | 0.30 | (5912.9, 18434.2) | 0.12 | (408.8, 315.7) | 0.00 |
| MSBQ6 | (-1472.9, 5104.2) | 0.16 | (-874.6, 4586.1) | 0.35 | (6359.1, 18349.6) | 0.10 | (422.3, 311.9) | 0.00 |
| Empirical model with three empirical decision rules | | | | | | | | |
| MSBQ7 | (-1504.0, 5116.5) | 0.15 | (-904.4, 4588.1) | 0.33 | (6118.6, 18383.3) | 0.11 | (411.1, 310.0) | 0.00 |

In Table 3.4, a lower $|\overline{ME}|$ value indicates that, on average, the model output is closer to the real data. A significance value higher than 5% indicates that we fail to reject the null hypothesis that the average ME is zero, that is, that the simulation output reflects the real-world data (i.e., a valid model). Table 3.4 shows that the base model is valid only for predicting the cow population. However, it also shows that the model's operational validity can be improved by using the empirical decision rules. The empirical buying and selling decision rules, and their combinations, improve the model's operational validity for most output variables, whereas the empirical sorting decision rule improves validity only for the cattle and cow populations. Table 3.4 also shows that the significance values of the base model and of the model that uses the empirical sorting decision rule are not very different.
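The Sig. values in Table 3.4 can be cross-checked from the reported summary statistics. The sketch below assumes that the test is a one-sample t-test of the ME against zero with n = 25 replications (24 degrees of freedom); the text states the null hypothesis and the number of replications but does not name the exact test variant, so this is an assumption. The function names are hypothetical, and the p-value is obtained by numerically integrating the Student t density with Simpson's rule, using only the Python standard library.

```python
import math


def t_pdf(x, df):
    """Density of the Student t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def two_tailed_p(t_stat, df, steps=2000):
    """Two-tailed p-value: 1 - P(-|t| < T < |t|), via Simpson's rule."""
    upper = abs(t_stat)
    if upper == 0.0:
        return 1.0
    h = upper / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * t_pdf(k * h, df)
    central = 2 * (h / 3) * total  # the density is symmetric around zero
    return 1.0 - central


def one_sample_t(mean_me, sd_me, n):
    """t statistic and two-tailed p-value for H0: the average ME is zero."""
    t_stat = mean_me / (sd_me / math.sqrt(n))
    return t_stat, two_tailed_p(t_stat, n - 1)


if __name__ == "__main__":
    # (average ME, S_ME) for the base model M0, taken from Table 3.4.
    m0 = {
        "cattle population": (-2272.3, 4395.4),
        "cow population": (-1443.1, 4025.3),
    }
    for name, (avg_me, sd_me) in m0.items():
        t_stat, p = one_sample_t(avg_me, sd_me, 25)
        print(f"M0 {name}: t = {t_stat:.3f}, p = {p:.3f}")
```

Under these assumptions, the p-values for the M0 cattle and cow populations should fall on the same side of the 5% threshold as the reported 0.02 and 0.09.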
Table 3.5 summarizes the results of the regression analysis between the simulation outputs and the real data. In this analysis, the mean of the simulation outputs over 25 replications (for example, the mean simulated cow population in 2012, $\overline{Cow}_{2012} = \frac{1}{25}\sum_{i=1}^{25} Cow_{2012,i}$, where $i$ denotes the replication) was used as the independent variable and the real data as the dependent variable. This regression analysis focuses on the match between the trends produced by the simulation and the trends in the real data rather than on the accuracy of the predicted values. Consequently, the external factor mentioned earlier is not very influential and all data from 2011-2015 can be incorporated. The significance column (Sig.) in Table 3.5 shows the significance of the ANOVA test and confirms the validity of the regression analysis. A lower significance value indicates a smaller probability that the relationship between the average simulation outputs and the real data occurs by chance. A positive regression coefficient, presented in column B, indicates that the simulation outputs and the real data have a similar trend (i.e., they move in the same direction). The $R^2$ values show the proportion of variation in the real data that can be explained by the variation in the simulation outputs. A higher $R^2$ value indicates that a particular model has a better fit to the real data.
Table 3.5 Summary of regression analysis between the simulation outputs and the real data

| Model name | Cattle population Sig. | B | $R^2$ | Cow population Sig. | B | $R^2$ | Daily milk production Sig. | B | $R^2$ | Farmer households Sig. | B | $R^2$ |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| M0 | 0.00 | 2.77 | 0.98 | 0.00 | 3.27 | 0.96 | 0.03 | 1.44 | 0.83 | 0.00 | 0.60 | 0.97 |
| Empirical models with one empirical decision rule | | | | | | | | | | | | |
| MSBQ1 | 0.00 | 3.00 | 0.99 | 0.00 | 3.75 | 0.98 | 0.02 | 1.66 | 0.86 | 0.00 | 0.62 | 0.97 |
| MSBQ2 | 0.00 | 4.11 | 0.95 | 0.00 | 3.89 | 0.96 | 0.00 | 5.31 | 0.97 | 0.00 | 0.67 | 0.99 |
| MSBQ3 | 0.00 | 2.70 | 0.99 | 0.00 | 3.21 | 0.97 | 0.03 | 1.43 | 0.83 | 0.00 | 0.60 | 0.97 |
| Empirical models with two empirical decision rules | | | | | | | | | | | | |
| MSBQ4 | 0.00 | 3.04 | 0.99 | 0.00 | 3.82 | 0.98 | 0.02 | 1.66 | 0.86 | 0.00 | 0.62 | 0.97 |
| MSBQ5 | 0.00 | 4.16 | 0.95 | 0.00 | 3.98 | 0.96 | 0.00 | 5.41 | 0.97 | 0.00 | 0.67 | 0.99 |
| MSBQ6 | 0.00 | 4.09 | 0.98 | 0.00 | 3.93 | 0.97 | 0.00 | 5.29 | 0.98 | 0.00 | 0.67 | 0.99 |
| Empirical model with three empirical decision rules | | | | | | | | | | | | |
| MSBQ7 | 0.00 | 4.10 | 0.95 | 0.00 | 3.95 | 0.97 | 0.00 | 5.40 | 0.98 | 0.00 | 0.66 | 0.99 |

Table 3.5 shows that, for all output variables, all models have a significant linear relationship with the real data. All models are also able to imitate the trends in the real data. However, the models that use the empirical decision rules often have a better fit to the real data. Specifically, the empirical buying decision can increase the $R^2$ value for most output variables.
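
The regression set-up behind Table 3.5 can be sketched as follows: the simulation outputs are averaged over the replications for each year, and the real data are then regressed on those averages by ordinary least squares to obtain B and $R^2$. The real cattle figures for January 2011 to January 2015 are taken from Table 3.3. The replication values and the function names are hypothetical placeholders, only three replications are used for brevity, and the Sig. column (ANOVA F-test) is not reproduced here. The result therefore does not correspond to any row of Table 3.5.

```python
# Real cattle population (head), January 2011 to January 2015 (Table 3.3).
REAL_CATTLE = [21438, 22366, 16173, 13415, 12563]

# Hypothetical simulated cattle populations: one row per replication,
# one column per year. Placeholder values, not outputs of the study's models.
SIMULATED_CATTLE = [
    [19100, 19500, 17400, 16700, 16200],
    [18900, 19300, 17600, 16500, 16400],
    [19000, 19400, 17500, 16600, 16300],
]


def column_means(replications):
    """Mean simulated value per year, averaged over the replications."""
    n = len(replications)
    return [sum(year_values) / n for year_values in zip(*replications)]


def simple_regression(x, y):
    """Ordinary least squares fit y = a + B*x; returns (B, a, R^2)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    syy = sum((yi - mean_y) ** 2 for yi in y)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    r_squared = 1 - ss_res / syy
    return slope, intercept, r_squared


if __name__ == "__main__":
    # Independent variable: mean simulation output; dependent: real data.
    sim_means = column_means(SIMULATED_CATTLE)
    b, a, r2 = simple_regression(sim_means, REAL_CATTLE)
    print(f"B = {b:.2f}, intercept = {a:.1f}, R^2 = {r2:.2f}")
```

A positive B means that the averaged simulation output and the real data move in the same direction, and an $R^2$ close to 1 means that the simulated trend accounts for most of the variation in the real series.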