
The investigation of potential improvements over the previous analysis focuses on the combinatorial cut search, the range search method and neural networks. While the first two methods were chosen in the previous analysis, neural networks were regarded as too time-consuming and not worth considering as a way to improve performance. This assessment is challenged here. In addition, the previous analysis fixed the background set to the MEPS simulation, whereas here all the training strategies discussed in section 7.3.3 are tried out.

We first address the question of why the search for instanton-induced events was restricted to three inputs in the previous analysis. One part of the answer is that the differences between simulation and data in the two additional quantities $E_{T,\mathrm{Jet}}$ and $E_{T,B}$ are larger than the differences in the other three quantities $\mathrm{sph}_B$, $n_B$ and $Q'^2_{\mathrm{rec}}$. Closely related but more important, however, is the observation that the simple cut-based approach in five dimensions selects too many events in the CDM simulation compared to the data. Whereas a lower number of perturbative events would be expected in a scenario where instanton-induced events are produced, a prediction of more perturbative events than are seen in the data clearly points to a problem in the simulation.

Analysis I did not recognise that this problem is not specific to the five input quantities. The effect can also be observed with the three first-choice inputs, but only when larger efficiencies are required. Since the study in analysis I was restricted to the 10% efficiency, the effect plotted in figure 7.24 could not be observed.

[Figure 7.24, plot "Range Search, 3 inputs": significance (vertical axis, -4 to 4) versus instanton efficiency in % (horizontal axis, 0 to 100), with one curve each for MEPS and CDM.]

Figure 7.24: Significance as a function of the instanton efficiency. The curves were measured with the range search method with three inputs shown in table 7.18 (trained with MEPS as background). The dotted line marks the special case of 10% efficiency for instanton-induced events.

When the required instanton efficiency is increased by loosening the cut on the output, the separation power decreases and the differences between simulations and data are also expected to disappear: they are explicitly normalised to return the same number of selected events if no cut is applied. Since statistical and systematic uncertainties become large compared to the number of selected instanton-induced events, the significance for high efficiencies can be taken directly as a measure of the disagreement between data and simulation. In other words, for high efficiencies it is not important whether the few hundred instanton-induced events are taken into account; the only significant differences there are due to disagreements between experimental data and simulation.
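The relation between the cut on the classifier output and the instanton efficiency can be illustrated with a minimal sketch. It assumes that each event carries a single output value and that larger values are more instanton-like; the function names and the toy numbers are hypothetical and are not taken from the analysis code.

```python
import math

def cut_for_efficiency(signal_outputs, efficiency):
    """Return the output value above which the requested fraction of
    signal (instanton) events is kept. Assumes larger output = more signal-like."""
    ranked = sorted(signal_outputs, reverse=True)
    # Small guard against floating-point noise in efficiency * len(ranked).
    n_keep = max(1, math.ceil(efficiency * len(ranked) - 1e-9))
    return ranked[n_keep - 1]

def count_selected(outputs, cut):
    """Number of events whose output passes the cut."""
    return sum(1 for value in outputs if value >= cut)

# Toy outputs (hypothetical): 100 signal events spread evenly over [0, 1).
toy_signal = [i / 100 for i in range(100)]

tight_cut = cut_for_efficiency(toy_signal, 0.10)   # 10% efficiency working point
loose_cut = cut_for_efficiency(toy_signal, 1.00)   # no effective cut

print(count_selected(toy_signal, tight_cut), count_selected(toy_signal, loose_cut))
```

Applying the same cut to the data and to each normalised simulation gives the selected event numbers that enter the significance. At 100% efficiency every sample is kept in full, so samples normalised to the same size agree by construction.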


7.3 Applications for H1: Instanton Purification

Figure 7.24 demonstrates that the evaluation of the significances at an efficiency of 10% is a very special case. Contrary to intuition, the significances do not decrease but increase with higher efficiencies. This means that much stronger results regarding the disagreement of the two simulations could already have been obtained in analysis I by evaluating higher efficiencies. In particular, it can be seen that for high efficiencies the CDM simulation predicts many more events than are seen in the data. This effect will play an important role in the following.

Table 7.19 summarises the results from the different training strategies. The first column describes which method was used: neural network (NN) or combinatorial cut search (CUTS). It also identifies the inputs used: either the three standard inputs $\mathrm{sph}_B$, $n_B$ and $Q'^2_{\mathrm{rec}}$ (“3”) or all five inputs, with $E_{T,\mathrm{Jet}}$ and $E_{T,B}$ in addition (“5”). The background was formed either by the MEPS simulation (“M”), by the CDM simulation (“C”), or by the experimental dataset (“D”). The second column (“DATA Sel.”) shows the number of selected events in the experimental dataset. The following two blocks show the same information for the two simulations CDM and MEPS: the separation power (“SP”), the number of selected events (“Sel.”) with its total (statistical plus systematic) uncertainty, and the significance (equation 7.3). If the instanton hypothesis holds, then the number of selected events in the experimental dataset (“DATA Sel.”) should be the sum of the selected instanton-induced events (“INS Sel.”) and the standard perturbative events (“CDM Sel.” or “MEPS Sel.”).

The following cuts have been selected by the combinatorial cut search with five inputs since they gave the best separation power:

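• For training with background from MEPS: $80\,\mathrm{GeV}^2 < Q'^2_{\mathrm{rec}} < 200\,\mathrm{GeV}^2$, $n_B > 10$, $\mathrm{sph}_B > 0.38$, $E_{T,\mathrm{Jet}} > 1.5\,\mathrm{GeV}$ and $E_{T,B} > 12\,\mathrm{GeV}$.

• For training with background from CDM: $80\,\mathrm{GeV}^2 < Q'^2_{\mathrm{rec}} < 180\,\mathrm{GeV}^2$, $n_B > 9$, $\mathrm{sph}_B > 0.40$, $E_{T,\mathrm{Jet}} > 1.0\,\mathrm{GeV}$ and $E_{T,B} > 11.4\,\mathrm{GeV}$.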
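The two cut sets listed above can be written down as simple window selections. The following is a minimal sketch, assuming each event is available as a dictionary of the five input quantities; the key names (`q2_rec`, `n_b`, `sph_b`, `et_jet`, `et_b`) and the example event are hypothetical.

```python
# Cut windows as (lower, upper) bounds; None means "no bound on this side".
# Units: GeV^2 for q2_rec, GeV for et_jet and et_b.
CUTS_MEPS = {
    "q2_rec": (80.0, 200.0),
    "n_b": (10, None),
    "sph_b": (0.38, None),
    "et_jet": (1.5, None),
    "et_b": (12.0, None),
}
CUTS_CDM = {
    "q2_rec": (80.0, 180.0),
    "n_b": (9, None),
    "sph_b": (0.40, None),
    "et_jet": (1.0, None),
    "et_b": (11.4, None),
}

def passes(event, cuts):
    """True if the event lies strictly inside every cut window."""
    for name, (low, high) in cuts.items():
        value = event[name]
        if low is not None and value <= low:
            return False
        if high is not None and value >= high:
            return False
    return True

# Hypothetical event, chosen only to show that the two selections differ.
example = {"q2_rec": 190.0, "n_b": 11, "sph_b": 0.41, "et_jet": 1.6, "et_b": 12.5}
print(passes(example, CUTS_MEPS), passes(example, CUTS_CDM))
```

The example event passes the cuts from the MEPS training but fails those from the CDM training because of the tighter upper bound on $Q'^2_{\mathrm{rec}}$.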

                     DATA   INS    CDM                               MEPS
     Training        Sel.   Sel.   SP    Sel.   Uncert.    Sign.     SP    Sel.   Uncert.    Sign.
  1  NN-3-D          388    81     109   345    +25/-34    -1.1      141   267    +17/-23    +2.3
  2  NN-3-C          424    82     99    380    +32/-34    -1.1      135   280    +17/-25    +3.6
  3  NN-3-M          387    81     104   363    +26/-34    -1.7      142   265    +22/-28    +1.9
  4  NN-5-D          261    83     101   377    +69/-49    -4.1      160   237    +37/-41    -1.4
  5  NN-5-C          347    83     112   343    +46/-27    -2.9      127   300    +18/-38    -1.0
  6  NN-5-M          260    82     110   346    +43/-43    -3.9      215   177    +26/-20    +0.1
  7  CUTS-5-M        288    82     97    387    +36/-36    -5.0      193   196    +23/-20    +0.5
  8  CUTS-5-C        286    82     103   372    +48/-35    -4.8      190   199    +15/-25    +0.3

Table 7.19: Summary of new results for the instanton search. The columns are described in the text.
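Equation 7.3 is not reproduced in this section, so the following check rests on an assumption: that the significance is the difference between the data and the prediction (simulation plus instanton-induced events), divided by the one-sided total uncertainty in the direction of the deviation. Under this assumed form, the magnitudes of the tabulated significances are reproduced to within rounding. The function name and signature are hypothetical.

```python
def significance(n_data, n_ins, n_sim, err_up, err_down):
    """Assumed form of the significance (not taken from equation 7.3):
    (data - (simulation + instanton)) / one-sided total uncertainty."""
    deviation = n_data - (n_sim + n_ins)
    # Assumption: use the upward uncertainty if the data exceed the prediction,
    # the downward uncertainty otherwise.
    error = err_up if deviation >= 0 else err_down
    return deviation / error

# Line 1 of table 7.19 (NN-3-D): DATA Sel. = 388, INS Sel. = 81.
cdm_sig = significance(388, 81, 345, 25, 34)    # (388 - 426) / 34
meps_sig = significance(388, 81, 267, 17, 23)   # (388 - 348) / 17
print(f"CDM: {cdm_sig:+.2f}  MEPS: {meps_sig:+.2f}")
```

For this line the sketch gives about -1.12 for CDM and +2.35 for MEPS, to be compared with the tabulated magnitudes 1.1 and 2.3. The small difference for MEPS is plausible because the table entries are themselves rounded.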

7. Analysis and Results

Two technical observations were made during the training and evaluation of the different classifiers:

• The extensive studies summarised in table 7.19 and in figure 7.25 allow rigorous consistency checks. They were only possible through the consistent automation of the training and evaluation procedures (compare section 6.3). The calculation of systematic uncertainties plays a very important role in the evaluation process: for each significance, 17 test sets have to be evaluated (see table 7.18). Whereas the training times for neural networks and the range search method are similar, the evaluation times are about two orders of magnitude smaller for neural networks. The fast evaluation of the modified test sets with neural networks is thus what makes the extensive studies shown in table 7.19 and in figure 7.25 feasible.

• The comparison of the separation powers from table 7.19 with the range search result in table 7.18 shows that the highest separation powers are achieved with neural networks (142 in line 3 of table 7.19 vs. 126 in table 7.18). The highest separation power is equivalent to the best possible enrichment of instanton-induced events in the data. Whatever the conclusion about the obtained numbers will be, higher separation powers will inevitably lead to more meaningful results.

What is immediately evident from a scan through table 7.19 is that no clear conclusion about the instanton hypothesis can be drawn. As in the previous analysis [8], the numbers of selected events in simulations and data are sometimes consistent with the instanton hypothesis and sometimes show deviations. Unfortunately, these deviations point in opposite directions for the two simulations CDM and MEPS. The CDM simulation shows almost as many selected events as found in the data, or even more, leaving little or no room for the predicted number of instanton-induced events. In contrast, the MEPS simulation mostly shows far fewer selected events than found in the data, which leaves room for instanton-induced events. But there would have to be many more of these than predicted by the simulation, leading to interesting physical implications [7].
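The "room" left for instanton-induced events can be read off table 7.19 by subtracting the simulated from the observed number of selected events and comparing the difference with "INS Sel.". The sketch below does this arithmetic with the tabulated event numbers; uncertainties are ignored, and the variable names are hypothetical.

```python
# (line, training, DATA Sel., INS Sel., CDM Sel., MEPS Sel.) from table 7.19.
ROWS = [
    (1, "NN-3-D", 388, 81, 345, 267),
    (2, "NN-3-C", 424, 82, 380, 280),
    (3, "NN-3-M", 387, 81, 363, 265),
    (4, "NN-5-D", 261, 83, 377, 237),
    (5, "NN-5-C", 347, 83, 343, 300),
    (6, "NN-5-M", 260, 82, 346, 177),
    (7, "CUTS-5-M", 288, 82, 387, 196),
    (8, "CUTS-5-C", 286, 82, 372, 199),
]

# Room for instanton-induced events: data minus perturbative prediction.
room_cdm = {line: data - cdm for line, _, data, _, cdm, _ in ROWS}
room_meps = {line: data - meps for line, _, data, _, _, meps in ROWS}

for line, label, data, ins, cdm, meps in ROWS:
    print(f"{line} {label:9s} INS={ins:3d}  DATA-CDM={room_cdm[line]:+5d}  "
          f"DATA-MEPS={room_meps[line]:+5d}")

# Lines where the MEPS deficit exceeds the predicted instanton yield.
meps_above_ins = [line for line, _, _, ins, _, _ in ROWS if room_meps[line] > ins]
```

With these numbers the difference to CDM stays below the predicted instanton yield in every line and is negative in four of them, while the difference to MEPS exceeds the predicted yield in lines 1 to 3 and 6 to 8.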

A closer look at table 7.19, however, reveals some important details about the mismatch of the two simulations. It was not possible to obtain these details in analysis I because only one specific training was done there (the range search training with three inputs and MEPS as background shown in table 7.18).

The first observation is that the separation powers (“SP”) for the CDM simulation are generally much lower than those for the MEPS simulation. This means that it is more difficult to distinguish between CDM events and instanton-induced events (QCDINS) than to distinguish between MEPS events and instanton-induced events.

A second important observation is that the numbers of selected events in the data agree well for the trainings performed with data and MEPS as backgrounds (“DATA Sel.” in lines 1 and 3 and in lines 4 and 6). In contrast, the number of selected events in the data is always much higher if the CDM simulation was used as background in the training (“DATA Sel.” in lines 2 and 5). This suggests that the phase space regions selected by the trainings with data and MEPS are similar, while the training with the CDM simulation selects a different phase space region. This difference is confirmed, in particular, by the training of the neural network with five inputs: trainings with data or MEPS as background result in far too many events predicted by CDM (“CDM Sel.” compared to “DATA Sel.” in lines 4 and 6 vs. line 5). In contrast, training with CDM as background results in a CDM prediction which matches the observed data.

Finally, figure 7.25 shows again that a possible validation of the instanton hypothesis with the MEPS simulation depends strongly on the background used in the training as well as on the efficiency chosen for instanton-induced events. The regions of phase space selected in the training with the CDM simulation are different from those selected in the training with the MEPS simulation. Above, this was observed in the number of selected events in the CDM simulation. Here, it can be observed in the behaviour of the significance for the MEPS simulation: whereas the significance decreases towards lower efficiencies for the training with MEPS, it stays consistently very high for the training with CDM.

[Figure 7.25, three panels "Trained with DATA", "Trained with CDM" and "Trained with MEPS": significance (vertical axis, -4 to 4) versus instanton efficiency in % (horizontal axis, 0 to 100), each with one curve for MEPS and one for CDM.]

Figure 7.25: Significance as a function of the instanton efficiency. All curves were measured with the neural network trainings with three inputs shown in table 7.19.

In summary, the contradictory results of the simulations make a conclusion about the instanton hypothesis difficult. If one believes that the MEPS simulation describes the perturbative QCD event classes correctly, then the instanton hypothesis is confirmed. In particular, the difference between simulated and observed events is significantly larger than the number of predicted instanton-induced events in the three-dimensional study, while the predicted number of instanton-induced events matches well in the five-dimensional case. If one believes that the CDM simulation describes the perturbative QCD event classes correctly, then the instanton hypothesis cannot be confirmed, as the results in five dimensions leave no room for instanton-induced events. Even more importantly, the CDM simulation predicts more events than are found in the data, which points to a problem in the simulation. We are thus left with two simulations, both claiming to describe the perturbative QCD event classes, whose predictions differ significantly in the selected region of phase space. A better understanding of these differences is clearly needed; some aspects which could help have been discussed here.


7.4 Higgs Boson Parity Measurement at a Future Linear Collider

While the next generation of hadron colliders aims, among other “new physics” topics like supersymmetry and large extra dimensions, at the detection of the Higgs boson, the determination of its properties, in particular of its parity, is a task for a future linear collider. Whereas the standard model Higgs boson must have positive parity (scalar particle), supersymmetric models predict in addition a Higgs boson with negative parity (pseudoscalar particle). It will be very important to be able to distinguish between these two cases. Section 2.2 already introduced the basic ideas of how such a measurement could be done.
