Figure 6.18: γ sample before and after OOFV re-weighting when the fake data is the “crazy OOFV” model.

As mentioned in Section 6.3.1, the literature recommends running approximately 3 unfolding iterations in the case of infinite statistics. However, 1 iteration was found to be optimal for the T2K νµ CC inclusive measurement due to the low statistics in that dataset. For this analysis, which has even fewer events than the νµ analysis, 1 iteration would again be expected to be optimal.
The choice of how many iterations to perform is based on studying the bias of the unfolding method and the fractional statistical error as the number of iterations is increased. The bias is the fractional deviation of the unfolded differential cross-section from the true cross-section of the fake dataset,
\[
\mathrm{Bias} = \frac{N^{m}_{t_k} - N^{\mathrm{true}}_{t_k}}{N^{\mathrm{true}}_{t_k}}, \tag{6.21}
\]
where $N^{\mathrm{true}}_{t_k}$ is the true number of events in bin $t_k$ for the fake dataset being tested. The statistical error is the quadratic sum of the data statistical and MC statistical errors, which are described in Section 6.4.4. In Figure 6.19, the BANFF pre-fit is used both to generate the unfolding and as the fake dataset. The negligible bias ($10^{-10}$%) shows that there is no pathological bug in the unfolding code, and that the correct cross-section is extracted when the same data are used for generation and unfolding. As expected, the statistical error increases with the number of iterations.
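The two quantities used to choose the number of iterations can be sketched in a few lines. This is a minimal illustration of Equation 6.21 and of the quadratic sum, not the analysis code: the function names and all numbers below are hypothetical.

```python
import numpy as np

def fractional_bias(unfolded, truth):
    """Per-bin fractional bias as in Eq. 6.21: (N_unfolded - N_true) / N_true."""
    unfolded = np.asarray(unfolded, dtype=float)
    truth = np.asarray(truth, dtype=float)
    return (unfolded - truth) / truth

def total_stat_error(data_stat, mc_stat):
    """Quadratic sum of the data and MC fractional statistical errors, per bin."""
    return np.hypot(data_stat, mc_stat)

# Hypothetical event counts in three bins (illustration only).
truth = np.array([100.0, 200.0, 50.0])
unfolded = np.array([103.0, 198.0, 50.0])
bias = fractional_bias(unfolded, truth)

# Hypothetical fractional statistical errors in two bins.
stat = total_stat_error(np.array([0.3, 0.12]), np.array([0.4, 0.05]))
```

A positive entry in `bias` means the unfolded result overestimates the truth in that bin.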
Figure 6.19: Bias from the true BANFF pre-fit (left) and fractional statistical error (right) in each $p_e^{\mathrm{true}}$ bin when generating with the BANFF pre-fit, and using BANFF pre-fit as the fake dataset. Note the y-axis scale on the bias plot ($\times 10^{-12}$).

A more thorough test of the unfolding routine is shown in Figure 6.20, where the BANFF post-fit is used as the fake dataset. The statistical error increases with the number of iterations, as expected, but there is also a slight bias, which again increases with more iterations. Here, the bias is the fractional difference from the BANFF post-fit cross-section prediction, and is approximately 3% for 1 iteration. Of this, 1% is expected, as the BANFF post-fit changes the νe flux by 1%, whilst the unfolding assumes the BANFF pre-fit flux is correct. A further bias is also expected, as the BANFF post-fit re-weights the background contribution in the fake data. As there is less background in the fake data than in the MC, the unfolded νe cross-section is expected to be biased slightly high. Finally, as the 3% bias is small compared to any systematic uncertainty, it is not a concern.

Figure 6.20: Bias from the true BANFF post-fit (left) and fractional statistical error (right) in each $p_e^{\mathrm{true}}$ bin when generating with the BANFF pre-fit, and using BANFF post-fit as the fake dataset.
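The logic of these two tests can be illustrated with a toy model. The sketch below assumes an iterative Bayesian (D'Agostini-style) unfolding, which this section does not itself spell out. The response matrix, bin contents and function name are hypothetical, and no statistical fluctuations are included. It shows a closure test in the spirit of Figure 6.19 and a test with a shifted fake dataset in the spirit of Figure 6.20.

```python
import numpy as np

def iterative_unfold(data, response, prior, n_iter):
    """Toy iterative Bayesian unfolding.

    response[j, i] is the probability that an event in true bin i is
    reconstructed in reco bin j; each column sums to the efficiency of bin i.
    """
    data = np.asarray(data, dtype=float)
    estimate = np.asarray(prior, dtype=float).copy()
    efficiency = response.sum(axis=0)
    for _ in range(n_iter):
        # Reco-bin counts expected from the current truth estimate.
        folded = response @ estimate
        # Bayes' theorem: posterior[j, i] = P(true bin i | reco bin j).
        posterior = response * estimate / folded[:, None]
        # Assign the data to true bins, then correct for efficiency.
        estimate = (posterior.T @ data) / efficiency
    return estimate

# Hypothetical 3-bin response matrix and MC truth (illustration only).
response = np.array([[0.6, 0.1, 0.0],
                     [0.1, 0.5, 0.1],
                     [0.0, 0.1, 0.6]])
mc_truth = np.array([100.0, 200.0, 150.0])

# Closure test: the fake data is the folded MC truth itself.
closure_data = response @ mc_truth
closure_bias = {
    n: (iterative_unfold(closure_data, response, mc_truth, n) - mc_truth) / mc_truth
    for n in range(1, 6)
}

# Shifted test: the fake data comes from a different truth than the prior.
fake_truth = np.array([120.0, 180.0, 160.0])
fake_data = response @ fake_truth
shifted_bias = {
    n: (iterative_unfold(fake_data, response, mc_truth, n) - fake_truth) / fake_truth
    for n in range(1, 6)
}
```

In this toy, the closure bias vanishes to floating-point precision at every iteration, while the shifted test leaves a non-zero bias after 1 iteration. Studying the statistical error would additionally require fluctuated toy datasets, which are omitted here.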
It is also interesting to examine how necessary the unfolding procedure is. Figure 6.21 shows the same information as Figure 6.20, but also includes a “0 iteration” result. This result is found by simply taking the background-subtracted data and correcting for the efficiency in each bin. For the $p_e$ case shown in the figure, this means that the cross-section in the 0–200 MeV/c bin is zero, as there is no data in that bin. Significant biases are present in the “0 iteration” result, indicating that the unfolding procedure is absolutely necessary.

Figure 6.21: Bias from the true BANFF post-fit (left) and fractional statistical error (right) in each $p_e^{\mathrm{true}}$ bin when generating with the BANFF pre-fit, and using BANFF post-fit as the fake dataset. These plots include a “zero-iteration” result, where no unfolding is done and the background-subtracted data is simply efficiency-corrected.
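The “0 iteration” estimate described above amounts to a bin-by-bin correction. A minimal sketch follows, assuming each reconstructed bin is mapped directly onto the true bin with the same edges; the function name and numbers are hypothetical.

```python
import numpy as np

def zero_iteration_estimate(data, background, efficiency):
    """Background-subtract and efficiency-correct each bin, with no unfolding."""
    data = np.asarray(data, dtype=float)
    background = np.asarray(background, dtype=float)
    efficiency = np.asarray(efficiency, dtype=float)
    signal = data - background
    # Assumption of this sketch: bins with zero efficiency are set to zero.
    return np.divide(signal, efficiency,
                     out=np.zeros_like(signal), where=efficiency > 0)

# Hypothetical counts: the first bin contains no selected events at all.
data = np.array([0.0, 40.0, 25.0])
background = np.array([0.0, 10.0, 5.0])
efficiency = np.array([0.2, 0.5, 0.6])
estimate = zero_iteration_estimate(data, background, efficiency)
```

Because no migration between bins is modelled, an empty reconstructed bin always yields a zero estimate, regardless of how many true events smeared out of it.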
A further test of the method is performed by using the “crazy signal” fake dataset, in which the νe signal shape and normalisation are significantly modified. Figure 6.22 shows the bias and statistical error as a function of the number of iterations. Again, the bias is not significantly reduced by applying more iterations, whilst the statistical error still increases. This further justifies the choice of using a single iteration. The size of the bias is discussed in more detail in Section 6.4.5, where it is explained that the bias is small compared to the difference between the model predictions, and is also small compared to the uncertainties.