As previously mentioned, the most immediate extension of these results is an optimization of network architecture. In particular, we can explore the relative performance of the models mentioned in the Appendix, as well as the use of techniques such as Dropout and alteration of the learning rate. We can also modify the depth of each network, as network depth has in many cases been shown to dramatically increase performance.
Because the data images are derived from previously-generated files, it is relatively easy to change the resolution of the images. When the input images contain more pixels, classification accuracy may increase, as the amount of visible structure in learned features will likewise increase. We propose to test the effects of image resolution on network performance by representing the dataset as images of other sizes, such as 90x90 and 10x10, and feeding this data into an adapted version of the current CNN model.
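As an illustration of how the same event could be re-rendered at several resolutions, the following minimal sketch bins a list of energy deposits into a square image of arbitrary size. The deposit format (eta, phi, energy), the function name pixelate, and the eta and phi ranges are assumptions made for this example and are not taken from our actual image-generation code.

```python
import math

def pixelate(deposits, n_pixels,
             eta_range=(-2.5, 2.5),            # hypothetical imaged eta range
             phi_range=(-math.pi, math.pi)):   # hypothetical imaged phi range
    """Bin (eta, phi, energy) deposits into an n_pixels x n_pixels image."""
    eta_min, eta_max = eta_range
    phi_min, phi_max = phi_range
    image = [[0.0] * n_pixels for _ in range(n_pixels)]
    for eta, phi, energy in deposits:
        if not (eta_min <= eta < eta_max and phi_min <= phi < phi_max):
            continue  # deposit lies outside the imaged region
        row = int((eta - eta_min) / (eta_max - eta_min) * n_pixels)
        col = int((phi - phi_min) / (phi_max - phi_min) * n_pixels)
        # Guard against floating-point rounding at the upper edge.
        row = min(row, n_pixels - 1)
        col = min(col, n_pixels - 1)
        image[row][col] += energy  # energies falling in one pixel are summed
    return image

# Hypothetical event: two nearby deposits and one isolated deposit.
deposits = [(0.10, 0.20, 50.0), (0.12, 0.22, 30.0), (-1.0, 2.0, 20.0)]
coarse = pixelate(deposits, 10)   # 10x10 image
fine = pixelate(deposits, 90)     # 90x90 image

def total(img):
    return sum(sum(r) for r in img)

def occupied(img):
    return sum(1 for r in img for v in r if v > 0)

print(total(coarse), occupied(coarse))
print(total(fine), occupied(fine))
```

In this toy event the two nearby deposits merge into a single pixel at 10x10 but are resolved at 90x90, which is the kind of change in visible structure the proposed resolution study would probe.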
Scene Labeling
Another future application, called scene labeling, may allow for improvement upon analyses of jet topologies in calorimeter images. In a scene labeling procedure, each individual pixel in an image is tagged as belonging to one of several predetermined categories. To produce these tags, a CNN is modified to learn specific features that are correlated with labeled regions in an image [26]. The output of a sample CNN scene-labeling algorithm, trained to identify various regions of a landscape scene, is depicted in Figure 12.
Figure 12: A scene-labelled image. Each pixel is marked to correspond with the region or object it represents [26].
By using scene labeling to mark the regions of high-energy physics images that contain certain types of jet, such as heavy quark jets and gluon jets, it may be possible to utilize CNNs to recover valuable information about the topology of rare events. This may have potential applications in the development of new b-tagging algorithms, as well as the development of tagging algorithms for other, less distinctive types of jet.
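The final tagging step of such a procedure can be sketched as follows, assuming that a network has already produced one score map per category. The category names and the score values below are hypothetical placeholders, and the choice of taking the highest-scoring category per pixel is a simplification for illustration rather than the method of [26].

```python
def label_pixels(score_maps):
    """Assign each pixel the category with the highest score.

    score_maps: dict mapping category name -> 2D list of per-pixel scores,
    all with the same shape.
    """
    categories = list(score_maps)
    first = score_maps[categories[0]]
    n_rows, n_cols = len(first), len(first[0])
    return [
        [max(categories, key=lambda c: score_maps[c][i][j])
         for j in range(n_cols)]
        for i in range(n_rows)
    ]

# Hypothetical 2x2 score maps for three categories.
scores = {
    "b_jet":      [[0.7, 0.1], [0.2, 0.1]],
    "gluon_jet":  [[0.2, 0.3], [0.6, 0.2]],
    "background": [[0.1, 0.6], [0.2, 0.7]],
}
labels = label_pixels(scores)
for row in labels:
    print(row)
```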
Trigger Applications
During its current run, it is estimated that the CMS detector will gather roughly 40 terabytes of data every second, roughly corresponding to 40 million events per second [19]. Most of this information comes from relatively well-understood events, such as the production of low-energy jets. Because it is not possible to store all of this information, a series of high-speed triggers is designed to prune data before storage, discarding events that are deemed uninteresting for physics purposes. Example event selection criteria include the presence of an electron or muon, a large missing transverse energy, or the presence of a high-energy jet. It is entirely possible that a CNN could learn to recognize the geometries of events of interest and function as an independent triggering algorithm, or, if scene labeling is implemented, could aid an existing trigger algorithm in picking out triggering characteristics.
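To make the example selection criteria concrete, the sketch below expresses them as a simple cut-based decision. The Event fields and all threshold values are hypothetical and chosen only for illustration; they are not the thresholds used by the CMS trigger system.

```python
from dataclasses import dataclass, field
from typing import List

# Hypothetical thresholds in GeV, for illustration only.
LEPTON_PT_MIN = 25.0
MISSING_ET_MIN = 100.0
JET_PT_MIN = 200.0

@dataclass
class Event:
    lepton_pts: List[float] = field(default_factory=list)  # electron/muon pT
    missing_et: float = 0.0                                 # missing transverse energy
    jet_pts: List[float] = field(default_factory=list)     # jet pT

def passes_trigger(event):
    """Keep the event if any one of the example criteria is satisfied."""
    has_lepton = any(pt >= LEPTON_PT_MIN for pt in event.lepton_pts)
    has_large_met = event.missing_et >= MISSING_ET_MIN
    has_hard_jet = any(pt >= JET_PT_MIN for pt in event.jet_pts)
    return has_lepton or has_large_met or has_hard_jet

events = [
    Event(lepton_pts=[30.0]),                 # kept: lepton present
    Event(missing_et=50.0, jet_pts=[80.0]),   # discarded: low-energy jet only
    Event(jet_pts=[250.0]),                   # kept: high-energy jet
]
kept = [e for e in events if passes_trigger(e)]
print(len(kept), "of", len(events), "events kept")
```

A CNN-based trigger would replace or supplement the hand-written cuts in passes_trigger with a learned decision on the event image.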
Current trigger systems rely largely on field-programmable gate arrays, or FPGAs, complex circuits whose structures can be altered dynamically. Studies involving the implementation of Deep Convolutional Neural Networks on FPGA systems have proved promising [27], so it may be interesting to investigate the applicability of these results to the high-energy LHC environment.
Conclusion
We have shown that a Deep Convolutional Neural Network trained on raw calorimeter data is able to discriminate between simulated 13 TeV ttH and ttbb+jets events as well as or better than a traditional ANN. This result suggests that, with further optimization, CNNs may be the key to obtaining a measurement of the top-Higgs coupling.
Appendix
We present a useful formula for determining allowed sizes of convolutional kernels and the output at each stage.
Assume the input of a convolutional layer is a set of n by n images with m channels. Let the first layer be q neurons deep, with kernel size k by k. We see that, as the convolutional layer takes steps over the image, the layer will produce q feature maps with dimension $(n-k+1)$ by $(n-k+1)$. If this layer is immediately followed by a pooling layer with kernel size p by p, we see that the output is q feature maps of size $\frac{n-k+1}{p}$ by $\frac{n-k+1}{p}$. These feature maps will then become the inputs of the next layer.
We see that, at each stage, the feature map sizes must be integers. We are thus constrained in our choices of kernel size by the criterion $p \mid (n-k+1)$ for each layer. As we add convolutional layers, we are forced to consider this constraint for several layers of neurons.
Our input images are 30 by 30. Given the constraint (and the fact that we disregard kernels of size 1 and 30), the allowed first-layer convolutional kernels are of sizes 3, 4, 5, 6, 7, 9, 10, 11, 13, 15, 16, 19, 21, 22, 23, 25, and 27. We note that the average jet has a radius of roughly 1.91 pixels in our 30 by 30 image [28] while the average jet separation varies with the number of b-tags [5], so we propose some alternate architectures to study network performance at this scale. Many other architectures are possible, but we design models to maximize the image size after the first convolutional-pooling layer in order to take advantage of later layers.
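The divisibility criterion can be checked by direct enumeration. The sketch below assumes that a kernel size k counts as allowed when the convolved size n - k + 1 has at least one non-trivial pooling size p with 1 < p < n - k + 1; this reading of the constraint, and the function name allowed_kernels, are our assumptions for illustration.

```python
def allowed_kernels(n):
    """Kernel sizes k (1 < k < n) for which some pooling size p with
    1 < p < n - k + 1 divides the convolved size n - k + 1."""
    allowed = []
    for k in range(2, n):           # kernels of size 1 and n are disregarded
        conv_size = n - k + 1       # feature map side after convolution
        if any(conv_size % p == 0 for p in range(2, conv_size)):
            allowed.append(k)
    return allowed

kernels = allowed_kernels(30)
print(kernels)
```

Under this reading the enumeration reproduces every size in the list above and additionally returns k = 17 (since 30 - 17 + 1 = 14 = 2 x 7), which does not appear in that list; this may reflect a further selection criterion that is not captured by the sketch.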
We list these models in Table 5.
Name      K1   P1   Layer 2 Input   K2   P2   Layer 2 Output
Model 1    3    2              14    5    2                5
Model 2    3    2              14    3    2                6
Model 3    7    2              12    5    2                4
Model 4    7    2              12    3    2                5
Model 5   11    2              10    5    2                3
Model 6   11    2              10    3    2                4
Table 5: Some proposed network architectures. These satisfy the kernel size constraint and allow us to better study features such as jet size and separation.
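The entries of Table 5 can be reproduced by applying the size formula (n - k + 1)/p twice, once per convolutional-pooling stage. The following sketch assumes unit-stride convolutions without padding and non-overlapping pooling, as implied by the formula; the helper names are hypothetical.

```python
def stage_output(n, k, p):
    """Side length after a k x k convolution followed by p x p pooling."""
    conv = n - k + 1
    if conv <= 0 or conv % p != 0:
        raise ValueError(f"p={p} does not divide n-k+1={conv}")
    return conv // p

def layer_sizes(n, k1, p1, k2, p2):
    """Return (Layer 2 input size, Layer 2 output size) for an n x n image."""
    layer2_in = stage_output(n, k1, p1)
    layer2_out = stage_output(layer2_in, k2, p2)
    return layer2_in, layer2_out

# (K1, P1, K2, P2) for each model in Table 5.
MODELS = {
    "Model 1": (3, 2, 5, 2),
    "Model 2": (3, 2, 3, 2),
    "Model 3": (7, 2, 5, 2),
    "Model 4": (7, 2, 3, 2),
    "Model 5": (11, 2, 5, 2),
    "Model 6": (11, 2, 3, 2),
}

for name, params in MODELS.items():
    print(name, layer_sizes(30, *params))
```

A combination that violates the constraint, such as a 4 by 4 kernel followed by 2 by 2 pooling on a 30 by 30 image, raises a ValueError instead of returning a fractional size.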
In future network optimization studies, we hope to explore the performance of these architectures relative to the current model.
Works Cited
[1] Chatrchyan, Serguei, et al. "Projected performance of an upgraded CMS detector at the LHC and HL-LHC: contribution to the Snowmass process." arXiv preprint arXiv:1307.7135 (2013).
[2] Aad, Georges, et al. "First combination of Tevatron and LHC measurements of the top-quark mass." arXiv preprint arXiv:1403.4427 (2014).
[3] K.A. Olive et al. (Particle Data Group), Chin. Phys. C, 38, 090001 (2014) and 2015 update.
[4] Rosenblatt, F. "The Perceptron: a probabilistic model for information storage and organization in the brain." Psychological Review 65.6 (1958): 386-408. http://dx.doi.org/10.1037/h0042519.
[5] Chatrchyan, Serguei, et al. "Search for the standard model Higgs boson produced in association with a top-quark pair in pp collisions at the LHC." Journal of High Energy Physics 2013.5 (2013): 1-47.
[6] Hinton, Geoffrey E. "Learning multiple layers of representation," Trends in Cognitive Sciences, 11, pp. 428–434, 2007.
[7] Krizhevsky, Alex, Ilya Sutskever, and Geoffrey E. Hinton. "Imagenet classification with deep convolutional neural networks." Advances in neural information processing systems. 2012.
[8] Dieleman, Sander, Kyle W. Willett, and Joni Dambre. "Rotation-invariant convolutional neural networks for galaxy morphology prediction." Monthly Notices of the Royal Astronomical Society 450.2 (2015): 1441-1459.
[9] Timcheck, Jonathan P. "Image Classification Applied to High Energy Physics Events." (2015).
[10] Glorot, Xavier, Antoine Bordes, and Yoshua Bengio. "Deep sparse rectifier neural networks." International Conference on Artificial Intelligence and Statistics. 2011.
[11] Bridle, J. S. "Probabilistic interpretation of feedforward classification network outputs, with relationships to statistical pattern recognition." Neurocomputing. Springer Berlin Heidelberg, 1990. 227-236.
[12] Moody, J., et al. "A simple weight decay can improve generalization." Advances in neural information processing systems 4 (1995): 950-957.
[13] LeCun, Yann A., et al. "Efficient backprop." Neural networks: Tricks of the trade. Springer Berlin Heidelberg, 2012. 9-48.
[14] Collobert, Ronan, Koray Kavukcuoglu, and Clément Farabet. "Torch7: A matlab-like environment for machine learning." BigLearn, NIPS Workshop. No. EPFL-CONF-192376. 2011.
[15] Lee, Honglak, et al. "Convolutional deep belief networks for scalable unsupervised learning of hierarchical representations." Proceedings of the 26th Annual International Conference on Machine Learning. ACM, 2009.
[16] Hinton, Geoffrey E., Simon Osindero, and Yee-Whye Teh. "A fast learning algorithm for deep belief nets." Neural computation 18.7 (2006): 1527-1554.
[17] Ali, Ahmed, and Gustav Kramer. "Jets and QCD: A historical review of the discovery of the quark and gluon jets and its impact on QCD." The European Physical Journal H 36.2 (2011): 245-326.
[18] Chatrchyan, Serguei, et al. "Identification of b-quark jets with the CMS experiment." Journal of Instrumentation 8.04 (2013): P04013.
[19] Chatrchyan, Serguei, et al. "The CMS experiment at the CERN LHC." Journal of Instrumentation 3.08 (2008): S08004.
[20] Bjorken, J.D., and S.J. Brodsky. "Statistical model for electron-positron annihilation into hadrons." Physical Review D 1 (1970): 1416.
[21] Hama, Yogiro. "A note on Lorentz transformation and pseudo-rapidity distributions." Journal of the Physical Society of Japan 50.1 (1981): 21-23.
[22] Alioli, Simone, et al. "A general framework for implementing NLO calculations in shower Monte Carlo programs: the POWHEG BOX." Journal of High Energy Physics 2010.6 (2010): 1-58.
[23] CMS collaboration. "Commissioning of the particle-flow event reconstruction with the first LHC collisions recorded in the CMS detector." CMS Physics Analysis Summary CMS-PAS-PFT-10-001 30 (2010).
[24] Srivastava, Nitish, et al. "Dropout: A simple way to prevent neural networks from overfitting." The Journal of Machine Learning Research 15.1 (2014): 1929-1958.
[25] Farabet, Clement, et al. Demos & Tutorials for Torch7 (2014). https://github.com/torch/demos/
[26] Farabet, Clement, et al. "Learning hierarchical features for scene labeling." Pattern Analysis and Machine Intelligence, IEEE Transactions on 35.8 (2013): 1915-1929.
[27] Farabet, Clément, et al. "CNP: An FPGA-based processor for convolutional networks." Field Programmable Logic and Applications, 2009. FPL 2009. International Conference on. IEEE, 2009.
[28] Huth, John E., et al. "Toward a standardization of jet definitions." Presented at. No. FERMILAB-CONF-90-249-E. 1990.
[29] Karpathy, Andrej. “CS231n: Convolutional Neural Networks for Visual Recognition”, (2015). Stanford University. (http://cs231n.github.io/neural-networks-1/).
[30] Apple, Inc. “vImage Programming Guide” (2011). Mac Developer Library. (https://developer.apple.com/library/mac/documentation/Performance/Conceptual/vImage/Introduction/Introduction.html)
[31] Hijazi, Samer, Rishi Kumar, and Chris Rowen. "Using Convolutional Neural Networks for Image Recognition", (2015). Cadence Design Systems.
[32] Andreev, Sergey, et al. “Efficient mapping of the training of Convolutional Neural Networks to a CUDA-based cluster” (2016). Parallel Architecture Research Eindhoven. (http://parse.ele.tue.nl/)
[33] Barney, David. "CMS Detector Slice" (Jan 2016). CMS-PHO-GEN-2016-001, (https://cds.cern.ch/record/2120661). CMS Collection.
[34] Pivarski, Jim. “My last LHC status update” (Sept. 2010). The Everything Seminar. (https://cornellmath.wordpress.com/)