Instituto Tecnológico y de Estudios Superiores de Monterrey
Monterrey Campus
School of Engineering and Sciences
Design and Implementation of a Quantum Multilayer Neural Network Framework
A thesis presented by
Ariel Arturo Goubiah Gamboa Vázquez
Submitted to the
School of Engineering and Sciences
in partial fulfillment of the requirements for the degree of
Master of Science
in
Computer Science
Monterrey, Nuevo León, December, 2020
Dedication
To the memory of Pedro Vázquez Rodríguez.
Acknowledgements
I would like to thank my parents, Soledad Vázquez and Ariel Gamboa, for supporting my decision to become a scientist. At a younger age you inspired my dreams and now you encourage me to go after them. You taught me that everything can be achieved by putting enough effort into it and doing it with an open heart. Thanks to my brother Luis Gamboa for being my partner in life; you are the greatest inspiration for me to become someone worth following.
This work would not have been possible without the kindness of Malinalli Velázquez, who kept me sane in a period where long working hours late at night were the norm. Thanks to the Velázquez-Campos family for welcoming me and making me feel like one of your own.
There are members of the Scout movement that I would like to acknowledge for providing inspiration, support, and friendship: Ricardo Flores and Laura Jiménez, who taught me more than what I am supposed to teach them. Najibe Ledesma: you are a great inspiration for me to learn and teach things that are useful to others; your actions and words convinced me that this is the best way to build a better world.
To all the friends I made during my studies, especially: Leopoldo Llano, Nora Hernández, Alicia Huidobro, Homar Sánchez, Santiago Rentería, Jacob Rivera and Ernesto Campos. Thank you for making my stay in the master’s program enjoyable and giving me the opportunity to learn from you. Also, special thanks to my advisor: Neil Hernández, who gave me the opportunity to work in quantum computing, and gave me confidence in the quality of this work; thank you for guiding me in the quest of building quantum algorithms, and pushing me to get new records.
Thanks to Tecnológico de Monterrey for the scholarship it provides to young scientists to develop research in Mexico; investing in knowledge is the wise thing to do even in times of crisis. Thanks to the Mexican people for funding this research through the monetary support provided by CONACyT. I feel indebted and will pay back society by committing the efforts of my future research towards the creation of a better world.
This work is part of the ITESM-Xofia computing program for quantum computing and its applications. Huge thanks to Xofia computing, through Carlos Sanmiguel, for providing access to the Atos Quantum Learning Machine.
Design and Implementation of a Quantum Multilayer Neural Network Framework
by Ariel Arturo Goubiah Gamboa Vázquez
Abstract
Artificial neurons are biologically inspired algorithms that form the building blocks of Artificial Neural Networks (ANNs) and Multilayer Neural Networks (MNNs), which have recently been studied and implemented to solve important problems. Advances in learning theory and the availability of powerful computational systems have resulted in the creation of many real-world applications. Practically every industry has already adopted multilayer-learning-powered technologies in some part of its processes, as state-of-the-art MNN-powered algorithms can outperform other algorithms, and even human accuracy, on a wide range of tasks. However, their performance relies heavily on the amount of data available as well as on its format, as the most popular applications require a copious number of training examples. Another limitation on building large-scale MNN applications is the vast computational resources these systems need. The use of MNN-based algorithms is widespread and growing more complex; this creates an ever-growing demand for computational power, which may no longer be satisfied at some point in the near future owing to the deceleration in the performance of state-of-the-art monolithic processors.
Quantum information theory is a field that has had success in the last couple of decades, thanks to the creation of algorithms that are, in theory, able to outperform classical computers. The ability of quantum computers to work with physical systems inherently different from those used by classical computers opens an exciting opportunity for scientists and companies to explore the performance of quantum computers on machine learning tasks, with multilayer learning as a focal point because of its importance in classical computing. Although a considerable amount of resources has been allocated to the development of MNN-powered algorithms on quantum computers, challenges remain before Quantum Multilayer Neural Networks become a technology that can compete with state-of-the-art MNN-powered algorithms.
This research explores the properties of multilayer neural network algorithms running on quantum computers. The first contribution of the research work reported in this document is the analysis and implementation of a perceptron algorithm running on a quantum computer. The second contribution is the proposal, implementation and analysis of two different information encoding methods for quantum computers.
The final, and most important, contribution of this work is the development of a framework that allows training multilayer neural networks for Supervised Learning.
List of Figures
2.1 Electrical model of a neuron’s membrane [23]
2.2 Typical waveform generated by a Rulkov map [39].
2.3 Visual representation of a three input perceptron.
2.4 The barycentric correction procedure takes into account five different cases in function of how overlapped are the sets to separate [36].
3.1 Quantum Neuron in the Gate Model.
3.2 Multi layer neural network with a [2, 3, 3, 2] architecture, training over 10 examples.
3.3 Multi layer neural network, with a [4, 2, 4] architecture, training over three examples.
3.4 Cross section of the Bloch Sphere. The dashed circular curve represents the region that can be used in the bounded continuous representation. Any quantum state that lies in the dashed curve can be mapped.
3.5 Given the valid states to perform a bounded representation of real numbers, it is possible to map the state |0⟩ to the minimum value of the real numbers to encode, and state |1⟩ to the maximum value. Any intermediate value will be represented in a superposition.
3.6 Translation scheme from the domain of the real numbers (1), which is then mapped to have a value between zero and π/2 (2), from where the sine and cosine functions can be used to create a quantum state that unequivocally represents the x value.
3.7 The second translation scheme differs from the previous as now the x value is mapped to a value v_x between zero and one, which is used to form a vector r that can be then projected over states |0⟩ and |1⟩ to get a unique representation of the value x in the real numbers.
3.8 Graphical Representation of the Barycenter Correction Procedure.
4.1 AND: Fitness of the neural network in function of the epochs.
4.2 AND: Heatmap for the |0⟩ state.
4.3 OR: Fitness of the neural network in function of the epochs.
4.4 OR: Heatmap for the |0⟩ state.
4.5 XOR: Fitness of the neural network in function of the epochs.
4.6 Heatmap of the network modelling the XOR gate, plotted along several steps.
4.7 Pair plot of the predictors of the iris dataset.
4.8 Discrete representation of the Iris Dataset.
4.9 Adjusted values for the discrete representation of the Iris Dataset.
4.10 Iris dataset and its class barycenters.
4.11 Neural Network for classification: Fitness over epochs.
4.12 Neural network predictions for the Iris dataset.
4.13 Prediction accuracy in terms of the number of times the barycenter correction heuristic is applied.
4.14 Password authentication scheme
A.1 Trigonometric mapping from normalised values to valid quantum states for encoding.
A.2 Vector mapping from normalised values to valid quantum states for encoding
A.3 Contrast between the trigonometric and the . . .
B.1 AND: Heatmap for the |0⟩ state.
B.2 AND: Heatmap for the |1⟩ state.
B.3 AND: Sum of the probabilities of getting |0⟩ and getting |1⟩.
B.4 OR: Heatmap for the |0⟩ state.
B.5 OR: Heatmap for the |1⟩ state.
B.6 OR: Sum of the probabilities of getting |0⟩ and getting |1⟩.
B.7 XOR: Heatmap for the |0⟩ state.
B.8 XOR: Heatmap for the |1⟩ state.
B.9 XOR: Sum of the probabilities of getting |0⟩ and getting |1⟩.
C.1 BCP2: Fitness of the neural network.
C.2 BCP2: Neural network predictions.
C.3 BCP3: Fitness of the neural network.
C.4 BCP3: Neural network predictions.
C.5 BCP4: Fitness of the neural network.
C.6 BCP4: Neural network predictions.
C.7 BCP5: Fitness of the neural network.
C.8 BCP5: Neural network predictions.
C.9 BCP6: Fitness of the neural network.
C.10 BCP6: Neural network predictions.
C.11 BCP7: Fitness of the neural network.
C.12 BCP7: Neural network predictions.
C.13 BCP8: Fitness of the neural network.
C.14 BCP8: Neural network predictions.
C.15 BCP9: Fitness of the neural network.
C.16 BCP9: Neural network predictions.
C.17 BCP10: Fitness of the neural network.
C.18 BCP10: Neural network predictions.
List of Tables
4.1 Truth table of the AND gate.
4.2 Truth table of the OR gate.
4.3 Truth table of the XOR gate.
4.4 Comparison between encoding methods.
4.5 Confusion matrix for the predictions made by the neural network.
4.6 Confusion matrix for the predictions made by the neural network after implementing the barycenter correction heuristic.
Contents
Abstract
List of Figures
List of Tables
1 Introduction
1.1 Motivation
1.2 The Problem
1.3 Hypothesis and research questions
1.4 Objectives of the Research
2 Preliminary Concepts and Review of the Literature
2.1 Mathematical model of a neuron
2.1.1 Biological neuron
2.1.2 The perceptron algorithm
2.2 The Barycentric Correction Procedure
2.3 Quantum computing
2.3.1 Quantum information theory
2.4 Related work
2.4.1 The quest for a quantum neural network
2.4.2 Simulating a perceptron on a quantum computer
2.4.3 Quantum Perceptron Models
2.4.4 An Artificial Neuron Implemented on an actual quantum processor
2.4.5 Training Multilayer Quantum Neural Networks
2.5 Conclusions
3 Methods
3.1 Quantum Computing Platforms
3.2 Quantum Neuron
3.2.1 Generalisation of the Quantum Neuron
3.3 Quantum multilayer neural network
3.4 Information Encoding in Quantum Neural Networks
3.4.1 Binary Encoding of Integer Values
3.4.2 Bounded Continuous Representation
3.5 The Barycenter Correction Heuristic
3.6 Conclusions
4 Results
4.1 Solving Binary Logic Gates using Quantum Neural Networks
4.1.1 Modelling the AND gate
4.1.2 Modelling the OR gate
4.1.3 Modelling the XOR gate
4.2 Multilayer Quantum Neural Networks for Classification
4.2.1 Multilayer Neural Network Architecture
4.2.2 Quantum Representations of the Iris Dataset
4.3 Classification of the Iris Dataset
4.3.1 Technological constraints
4.3.2 Quantum Multilayer Architecture for the Iris Dataset
4.3.3 Accuracy of the Neural Network Using the Barycenter Representation
4.3.4 Improving the Accuracy using the Barycenter Correction Heuristic
4.4 Quantum password authentication algorithm
4.4.1 Quantum Password Generation Algorithm
4.4.2 Technical Issues for Implementation
4.4.3 Proof of work
4.5 Conclusions
5 Discussion
5.1 Final discussion
5.2 Research Questions
5.3 Conclusions
5.4 Future Work
A Bounded Representation: Mapping to Quantum States
B Modelling Binary Logic Gates
B.1 AND gate
B.2 OR gate
B.3 XOR gate
C Network Optimisation
C.1 Second iteration
C.2 Third iteration
C.3 Fourth iteration
C.4 Fifth iteration
C.5 Sixth iteration
C.6 Seventh iteration
C.7 Eighth iteration
C.8 Ninth iteration
C.9 Tenth iteration
Bibliography
Chapter 1 Introduction
1.1 Motivation
The amount of digital data around the world has been growing at an accelerating pace, and all of this data needs to be processed as quickly and efficiently as possible for a myriad of reasons. Progress toward meeting this need has been made in the fields of algorithms and microelectronics. Miniaturisation of semiconductors has helped in the creation of computers with greater processing capacity. A serious problem arises, however, at the scale of miniaturisation where the quantum tunnelling effect becomes more relevant than the phenomena that govern the transistors used in everyday computers [49].
Artificial Neural Networks (ANNs) are a powerful tool that has proved to be good at analysing data [19]. ANNs can be built using several layers of artificial neurons, which can be described mathematically in different ways. The Perceptron model is a binary classifier that decides whether or not an input, traditionally a vector of real numbers, belongs to a class [4]. The Barycentric Correction Procedure (BCP) is a geometry-based heuristic that has proven to be an efficient substitute for the Perceptron due to its computational advantage, since it does not rely on stochastic phenomena to work. It has two extensions to solve linearly separable and non-linearly separable problems [35] [36].
Artificial neurons are biologically inspired models that have been built by looking at the characteristics of biological neurons. A mathematical model of a biological neuron should take into consideration its inputs, such as the frequency of pulses received at the synapses, the processing happening in the individual neuron, and its outputs. The model should also describe how groups of neurons behave and how they transfer and store information. One of the first biological neuron models was built by Alan Lloyd Hodgkin and Andrew Fielding Huxley in 1952 [23] as a set of differential equations. Further models were proposed especially to study artificial neural networks, like the work by Nikolai F. Rulkov in 2001 [39], which uses a two-dimensional iterated map.
Quantum computing is the use of quantum-mechanical phenomena such as superposition and entanglement to perform computation. A quantum computer is used to perform such
computation, which can be implemented theoretically or physically [34]. The ability of quantum computers to hold quantum states in superposition can lead to a substantial speedup of computation in terms of complexity, since operations can be executed on many states at the same time. The main premise for the study of quantum computing and its implementations is that a sufficiently powerful quantum computer will be able to solve problems that classical computers cannot solve in any practical amount of time [5]. In quantum machine learning, quantum algorithms are developed to solve typical machine learning problems using the efficiency of quantum computing. This can be done by adapting classical algorithms, or their expensive subroutines, to run on a potential quantum computer. The expectation is that, in the near future, such machines will be commonly available for applications and can help to process the growing amounts of global information. There is no comprehensive theory of quantum learning yet, but discussions of the elements of such a theory can be found in [3] [24] [41].
There are a number of proposals for quantum versions of artificial neurons and neural networks [10]. However, most of them consider Hopfield networks, which are powerful for the related task of associative memory, a task derived from neuroscience rather than machine learning. A large share of the literature on quantum neural networks tries to find specific quantum circuits that integrate the mechanisms of neural networks in some way [21] [12] [13], trying to use the power of neural computing for quantum computation. A practical implementation is given by Elizabeth Behrman [8] [47], who uses interacting quantum dots to simulate neural networks with quantum systems. Another interesting approach is to use fuzzy feed-forward neural networks inspired by quantum mechanics [38] to allow for multi-state neurons. It remains an open challenge to translate the nonlinear activation function into a meaningful quantum mechanical framework [42], or to find learning schemes based on quantum superposition and parallelism.
Other efforts focus on the building blocks of neural networks, exploring models of individual neurons before scaling up towards a neural network [28] [43]. This aims to completely reimagine and re-engineer the way neural networks work. This work follows the latter path, integrating extensive research on mathematical models of biological neurons with the new computational paradigm of quantum computing.
The purpose of this research project is to analyse mathematical models of a neuron on a quantum computer; in particular, we analyse the behaviour of a single perceptron algorithm on a quantum computer. This research focuses on the fundamental building blocks of the widely used neural networks, aiming to design a neural network that better resembles the physical and chemical phenomena involved in biological brain cells.
The importance of this research is that it evaluates an alternative for solving a problem that will arise in the near future, when the main source of growth in computational power will no longer be available, since the number of transistors per unit area of semiconducting monoliths such as microprocessors will soon be constrained precisely by quantum effects. These effects impose an asymptote on the growth of the computing power that is available to solve problems, and that will be needed for critical endeavours such as creating self-driving
cars, where people’s lives are at stake. This work offers tools to create better alternatives to widely used algorithms, making them more efficient and giving them better generalisation properties.
1.2 The Problem
As the interest in data analytics and decision making grows at an unprecedented pace, the relevant data that is generated and available also grows in quantity and in complexity [31] [11] [30]. At the same time, the computational power provided by traditional computing methods is expected to stagnate [49]. An alternative to traditional computing is quantum computing, a field of study that uses quantum phenomena to perform computation. The results obtained in this field have been a major breakthrough in computer science, leading to algorithms that, for certain problems, run exponentially faster than the best known classical algorithms. The study of quantum computing is a fertile ground that has not yet been explored in depth.
In the last decade, there have been some efforts to combine quantum computing with the properties of biological neurons, but so far these endeavours have not been able to exploit the advantages of both fields. The successful implementation of a system that combines the benefits of quantum computing and biological neurons could result in a framework for creating new algorithms that solve existing problems exponentially faster than the best known algorithms to date and thus allow us to analyse more data in less time.
1.3 Hypothesis and research questions
The hypothesis of this work is that artificial neural networks implemented on a quantum computer can emulate more complex behaviours than those available on classical computers and are thus able to solve more complex problems.
The research questions to answer by this work are the following:
• What are the most important features of state of the art multilayer neural networks designed for quantum computers?
• Is it possible to use multilayer quantum neural networks as a classification algorithm?
• What is the performance of a BCP learning algorithm implemented in a quantum computer against a perceptron implemented in a classical computer for linearly separable problems?
• What is the performance of the BCP learning algorithm implemented in a quantum computer against a perceptron implemented in a classical computer for non linearly separable problems?
• Is it possible to use the geometrical properties of the BCP learning algorithm to improve the accuracy of quantum multilayer neural networks?
1.4 Objectives of the Research
The general objective of this work is to implement and analyse the behaviour of multilayer neural networks on quantum computers.
The specific goals to achieve as this research work is conducted are the following:
• Analyse the characteristics of a biological neuron.
• Analyse the elements of quantum computing.
• Connect the elements of biological neurons and quantum computing to explore quantum neuron models.
• Analyse the perceptron as a building block for the quantum multilayer neural network.
• Implement a multilayer neural network in a quantum computer.
• Implement the Barycenter Correction Procedure to speed up the training of a model that can perform classification over a tangible dataset.
The following sections of this document are structured as follows:
• Chapter two presents preliminary concepts and a review of the literature as the foundation for the following chapters.
• Chapter three covers the methodology of the research work reported. This section describes the entire framework to build, train and optimise multilayer neural networks. Also, some implementations are made to explore the capabilities, limitations and qualitative properties of neuron algorithms in quantum computers.
• Chapter four reports the results obtained by implementing multilayer neural networks to solve concrete problems using the methodology discussed in chapter three. The first problem is the emulation of the behaviour of logic gates, and the second problem is the classification of the iris dataset [15].
• Chapter five wraps up the document by discussing the results obtained in the research work. The research questions are revisited, and finally, a view on the future work is elaborated.
Chapter 2
Preliminary Concepts and Review of the Literature
A comprehensive review of the literature for the development of this work is provided, beginning with the background for neuro-inspired algorithms in section 2.1, followed by an introduction to quantum computing and quantum information theory in sections 2.3 and 2.3.1 respectively. Finally, a state-of-the-art review of techniques and algorithms in section 2.4 shows the most relevant research works in which neuro-inspired algorithms are designed for quantum computing hardware or implemented on quantum computers.
Two classical algorithms are reviewed. The mathematical model of a neuron, in section 2.1, is important as a reference for the expected behaviour of a multilayer system and to showcase opportunities for improving the algorithms, if there are any. The Barycentric Correction Procedure, in section 2.2, is presented as an alternative to the perceptron algorithm that takes advantage of the geometrical properties of the datasets to make an efficient classification algorithm.
A brief introduction to quantum computing is provided in section 2.3 and, finally, state-of-the-art quantum machine learning algorithms and publications are discussed in section 2.4.
2.1 Mathematical model of a neuron
The neuron, when considered as a signal processing device, has inputs, which are the frequencies of the pulses received at the synapses, and an output, which is the different potentials formed along its structure. In essence, a neuron is a pulse-frequency signal processing device. In comparison, electrical devices use either digital or analog signals for communication or processing, and the mathematics behind these subjects is well understood [50].
2.1.1 Biological neuron
Hodgkin-Huxley model
The Hodgkin–Huxley model, or conductance-based model, is a mathematical model that describes how action potentials in neurons are initiated and propagated. It is a set of nonlinear differential equations that approximates the electrical characteristics of excitable cells such as neurons. Since it is based on differential equations, it is a continuous-time model [23].
The Hodgkin–Huxley model is based on the idea that, given a membrane potential E, the current in a neuron can be modelled as sodium and potassium ion currents and a small leakage current, in parallel with the electrical capacity of the membrane C_M, as seen in Figure 2.1.
Figure 2.1: Electrical model of a neuron’s membrane [23]
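The parallel arrangement described above is usually written as a single current balance. The block below gives the standard textbook form as a reference; the maximal conductances $\bar{g}$, the gating variables $m$, $h$, $n$ and the reversal potentials $E_{Na}$, $E_K$, $E_l$ are conventional symbols that this chapter does not define, so they are assumptions of this sketch.

```latex
% Total membrane current: capacitive term plus sodium, potassium and leakage branches.
I = C_M \frac{dE}{dt}
    + \bar{g}_{Na}\, m^{3} h \,(E - E_{Na})
    + \bar{g}_{K}\, n^{4} \,(E - E_{K})
    + \bar{g}_{l}\,(E - E_{l})
```

Each gating variable follows its own first-order differential equation, which is what makes the full model a coupled nonlinear system.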
Rulkov map
The Rulkov map is a two-dimensional iterated map used to model a biological neuron. It was proposed by Nikolai F. Rulkov in 2001. The use of this map to study neural networks has computational advantages because the map is easier to iterate than a continuous dynamical system. This saves memory and simplifies the computation of large neural networks [39]. Figure 2.2 shows a typical waveform generated by a Rulkov map.
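To illustrate why an iterated map is cheap to simulate, the following is a minimal sketch of a Rulkov-type map. The update equations are one commonly cited form and are not stated in this chapter; the function name `rulkov_map` and all parameter values are illustrative assumptions.

```python
import numpy as np

def rulkov_map(alpha=4.5, mu=0.001, sigma=-1.0, x0=-1.0, y0=-3.0, steps=5000):
    """Iterate a two-dimensional Rulkov-type map and return both trajectories.

    Assumed form (not given in the text above):
        x[n+1] = alpha / (1 + x[n]**2) + y[n]   # fast, membrane-like variable
        y[n+1] = y[n] - mu * (x[n] - sigma)     # slow variable
    """
    x = np.empty(steps)
    y = np.empty(steps)
    x[0], y[0] = x0, y0
    for n in range(steps - 1):
        # Each step is plain arithmetic: no differential equation solver is needed.
        x[n + 1] = alpha / (1.0 + x[n] ** 2) + y[n]
        y[n + 1] = y[n] - mu * (x[n] - sigma)
    return x, y

# Illustrative run with the assumed default parameters.
x, y = rulkov_map()
```

Only the current pair of values is needed to compute the next one, which is the memory saving the paragraph above refers to.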
FitzHugh-Nagumo
The FitzHugh-Nagumo (FHN) model is a mathematical model of neuronal excitability developed by Richard FitzHugh as a reduction of the Hodgkin and Huxley’s model of action potential generation in the squid giant axon [17]. Nagumo et al. subsequently designed, implemented, and analyzed an equivalent electric circuit [32].
Figure 2.2: Typical waveform generated by a Rulkov map [39].
In its basic form, the model consists of two coupled, nonlinear ordinary differential equations, one of which describes the fast evolution of the neuronal membrane voltage, the other representing the slower “recovery” action of sodium channel deinactivation and potassium channel deactivation. Phase plane analysis of the FHN model provides qualitative explanations of several aspects of the excitability exhibited by the Hodgkin–Huxley (HH) model, including all-or-none spiking, excitation block, and the apparent absence of a firing threshold. A version of the FHN equations which adds a spatial diffusion term models the propagation of an action potential along an axon as a travelling wave. Due to their relative simplicity and ease of geometric analysis, the FHN model and its variants are commonly used in neuroscience, chemistry, physics, and other disciplines as simple models of excitable dynamics, relaxation oscillations, and reaction–diffusion wave propagation.
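The two coupled equations can be integrated numerically in a few lines. The sketch below uses a widely used parameterisation of the FHN model and a forward Euler step; the function name `fitzhugh_nagumo`, the parameter values and the step size are assumptions chosen for illustration, not values taken from this work.

```python
import numpy as np

def fitzhugh_nagumo(i_ext=0.5, a=0.7, b=0.8, eps=0.08,
                    v0=-1.0, w0=-0.5, dt=0.01, steps=20000):
    """Forward-Euler integration of an assumed standard FHN form:

        dv/dt = v - v**3 / 3 - w + i_ext    # fast membrane voltage
        dw/dt = eps * (v + a - b * w)       # slow recovery variable
    """
    v = np.empty(steps)
    w = np.empty(steps)
    v[0], w[0] = v0, w0
    for k in range(steps - 1):
        dv = v[k] - v[k] ** 3 / 3.0 - w[k] + i_ext
        dw = eps * (v[k] + a - b * w[k])
        v[k + 1] = v[k] + dt * dv
        w[k + 1] = w[k] + dt * dw
    return v, w

# Illustrative run with a constant external current.
v, w = fitzhugh_nagumo()
```

Plotting `w` against `v` gives the phase plane in which the qualitative behaviours listed above are usually analysed.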
2.1.2 The perceptron algorithm
The perceptron is an algorithm for supervised learning of data patterns that can be seen as a binary classifier. A binary classifier is a function that decides whether or not an input, represented by a pattern of numbers, belongs to some specific class [4]. The perceptron can be seen as the simplest feedforward network [18], and it is the building block of multilayer neural networks, which can map sets of input data onto a set of appropriate outputs. Multilayer Perceptron (MLP) networks are general-purpose, flexible, nonlinear models consisting of a number of perceptrons in each layer. The complexity of the MLP network can be changed by varying the number of layers and the number of units in each layer. Given enough hidden units and enough data, it has been shown that MLPs can approximate virtually any function to any desired accuracy [27].
In a mathematical sense, the perceptron is an algorithm for learning a binary classifier called a threshold function: a function that maps its input x (a real-valued vector) to an output value f(x) (a single binary value) with the following rule:
$$f(x) = \begin{cases} 1 & \text{if } w \cdot x + b > 0 \\ 0 & \text{otherwise} \end{cases}$$

where $w$ is a vector of real-valued weights, $w \cdot x = \sum_{i=1}^{n} w_i x_i$ is the dot product, $n$ is the number of inputs to the perceptron, and $b$ is the bias. The bias shifts the decision boundary away from the origin and does not depend on any input value. A visual representation of the perceptron is seen in Figure 2.3.
Figure 2.3: Visual representation of a three input perceptron.
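The threshold rule above translates directly into code. The following minimal sketch evaluates it for a two-input perceptron; the function name `perceptron_output` and the specific weights and bias are hypothetical values chosen so that the unit reproduces an AND gate.

```python
import numpy as np

def perceptron_output(x, w, b):
    """Threshold function: return 1 if w . x + b > 0, otherwise 0."""
    return 1 if float(np.dot(w, x)) + b > 0 else 0

# Hypothetical weights and bias: only the input (1, 1) crosses the threshold.
w = np.array([1.0, 1.0])
b = -1.5

inputs = [(0, 0), (0, 1), (1, 0), (1, 1)]
outputs = [perceptron_output(np.array(x), w, b) for x in inputs]
print(outputs)  # [0, 0, 0, 1]
```

The bias of -1.5 places the decision boundary between the point (1, 1) and the other three inputs, which is the shift away from the origin described above.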
2.2 The Barycentric Correction Procedure
The Barycentric Correction Procedure (BCP) is an algorithm based on geometry rather than on gradient optimisation. This method has proven to be an efficient substitute for the Perceptron and its enhanced versions, such as the Thermal Perceptron or the Pocket algorithm. The BCP is much more efficient than the Perceptron for learning linearly separable mappings. To deal with linearly nonseparable mappings, two extensions of the BCP have been proposed: the first is designed to minimise the number of misclassified patterns and the second to maximise the number of excluded patterns. The BCP is designed to perform well on both linearly and nonlinearly separable problems. In the case of linearly separable mappings, it must rapidly converge towards a solution, while for linearly nonseparable mappings it must rapidly yield a satisfactory solution, depending on the kind of constructive strategy used to build up the network [35]. The heuristic to choose a strategy takes into account the quantities $\#_1$ and $\#_0$:

$$\#_1 = \#(p_i),\; p_i \in C_1 \qquad \text{and} \qquad \#_0 = \#(m_i),\; m_i \in C_0$$
A graphical representation of the possible cases in which the two sets can overlap is provided in Figure 2.4. The aim of this procedure is to achieve a controlled search in the weight space. To do this, for each modification of the weight vector, all misclassified patterns are taken into account without disregarding the correctly classified patterns: the weight modification is not local. The main idea is to compute the weight vector and the bias separately, and to define the weight vector as the vector connecting two points: one within the convex hull of the target 1 patterns and the other within the convex hull of the target 0 patterns. To converge on linearly separable mappings, the principle of the BCP is also to modify the position of these points and so achieve a better direction of the hyperplane at each iteration [36].
Figure 2.4: The barycentric correction procedure takes into account five different cases in function of how overlapped are the sets to separate [36].
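The geometric idea can be sketched as a single, non-iterative step: take one weighted barycentre per class, connect them to obtain the weight vector, and then choose the bias separately. In the sketch below the function names, the weighting coefficients `alpha` and `mu`, and the midpoint rule for the bias are assumptions made for illustration; the full procedure in [35] [36] additionally updates the coefficients at each iteration.

```python
import numpy as np

def barycentric_hyperplane(P1, P0, alpha=None, mu=None):
    """One geometric step: weight vector from two class barycentres.

    P1, P0 : patterns with target 1 and target 0 (one row per pattern).
    alpha, mu : positive weighting coefficients (assumed uniform by default),
                so each barycentre lies inside the convex hull of its class.
    """
    P1 = np.asarray(P1, dtype=float)
    P0 = np.asarray(P0, dtype=float)
    alpha = np.ones(len(P1)) if alpha is None else np.asarray(alpha, dtype=float)
    mu = np.ones(len(P0)) if mu is None else np.asarray(mu, dtype=float)

    b1 = alpha @ P1 / alpha.sum()   # barycentre of the target 1 patterns
    b0 = mu @ P0 / mu.sum()         # barycentre of the target 0 patterns
    w = b1 - b0                     # vector connecting the two points

    # Bias computed separately (assumed rule): place the threshold halfway
    # between the lowest class-1 projection and the highest class-0 projection.
    bias = -0.5 * ((P1 @ w).min() + (P0 @ w).max())
    return w, bias

def predict(X, w, bias):
    """Classify each row of X with the threshold rule w . x + bias > 0."""
    return (np.asarray(X, dtype=float) @ w + bias > 0).astype(int)

# Small linearly separable example with made-up points.
P1 = [[2.0, 2.0], [3.0, 3.0]]
P0 = [[0.0, 0.0], [1.0, 0.0]]
w, bias = barycentric_hyperplane(P1, P0)
```

Increasing the coefficient of a misclassified pattern pulls the corresponding barycentre towards it, which is one way the direction of the hyperplane can be improved between iterations.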
2.3 Quantum computing
Quantum computing is the use of quantum-mechanical phenomena such as superposition and entanglement to perform computation. A quantum computer is used to perform such computation, which can be implemented theoretically or physically [34]. This field was created in the early 1980s, when Richard Feynman expressed the idea that a quantum computer had the potential to simulate things that a classical computer could not [16]. In 1994, Peter Shor shocked the world with an algorithm for factoring integers that had the potential to break widely used public-key encryption schemes [44]. A number of other quantum algorithms have also been designed and tested, and provide a meaningful contribution to the quantum computing field [14] [1] [20].
Qubits are fundamental to quantum computing and are somewhat analogous to bits in a classical computer. A qubit can be in the 0 or the 1 quantum state, but it can also be in a superposition of both, which behaves in an intrinsically more complex way than a classical state. For instance, a given number of qubits in superposition spans a search space exponentially larger than that of the same number of classical bits.
When qubits are measured, the result is always either a 0 or a 1; the probabilities of the two outcomes depend on the quantum state the qubits were in. A quantum computer operates on its qubits using quantum gates and measurement, similar to how classical computers operate using logic gates and voltage measurements. An algorithm is composed of a fixed sequence of quantum logic gates, and a problem is encoded by setting the initial values of the qubits, similar to how a classical computer works. The calculation usually ends with a measurement, which collapses the system of n qubits into one of its $2^n$ eigenstates, where each qubit is zero or one, yielding a classical state [33].
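As an illustration of how the outcome probabilities follow from the quantum state, the sketch below lists the probability of each of the $2^n$ computational-basis outcomes for a state given by its amplitudes. The helper name and the example state are hypothetical; the probability of an outcome is taken as the squared modulus of its amplitude.

```python
import math


def measurement_probabilities(amplitudes):
    """Map each computational-basis bit string to its outcome probability."""
    size = len(amplitudes)
    n_qubits = size.bit_length() - 1
    if size < 2 or size != 2 ** n_qubits:
        raise ValueError("the number of amplitudes must be a power of two")
    # Normalise so that the probabilities add up to one.
    norm = sum(abs(a) ** 2 for a in amplitudes)
    return {
        format(index, f"0{n_qubits}b"): abs(a) ** 2 / norm
        for index, a in enumerate(amplitudes)
    }


# Hypothetical two-qubit state (|00> + |11>) / sqrt(2).
state = [1 / math.sqrt(2), 0, 0, 1 / math.sqrt(2)]
probabilities = measurement_probabilities(state)
```

For this state only the outcomes 00 and 11 can be observed, each with probability one half, so a single measurement returns a classical two-bit string but does not reveal the amplitudes.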
Today's physical quantum computers are very noisy, and quantum error correction is a burgeoning field of research. Unfortunately, existing hardware is so noisy that fault-tolerant quantum computing is still a rather distant prospect [37]. Governments, established companies, and start-ups are investing increasingly in quantum computing, as great rewards are expected from the so-called quantum era: already in these early stages of development, exponentially faster algorithms have been designed, and quantum computers have been reported to outperform classical ones on some problems [10] [5] [6].
2.3.1 Quantum information theory
Quantum information is a rich theory that seeks to describe and make use of the distinctive possibilities for information processing and communication that quantum systems provide.
What holds a good part of the discipline together is the recognition that, far from quantum behaviour presenting a potential nuisance for computation and information transmission (as it may seem in light of the trend towards increasing miniaturisation), the fact that the behaviour of quantum systems differs so greatly from that of classical objects actually provides opportunities for interesting new communication protocols and forms of information processing. Entanglement and non-commutativity, two essentially quantum features, can be put to use [46].
Quantum information theory may be considered as an extension of classical information theory which introduces new communication primitives, e.g., the qubit (two-state quantum system) and shared entanglement, while providing quantum generalisations of the notions of sources, channels and codes [46].
2.4 Related work
2.4.1 The quest for a quantum neural network
Schuld et al. [42] review several approaches to the development of Quantum Neural Networks (QNN). The review outlines the challenge of combining the nonlinear dynamics of neural computing with the linear dynamics of quantum computing. It establishes requirements for a QNN to be successful and reviews the existing literature against the previously defined requirements. It finds that, up to that point, none of the proposals for a potential QNN model fully exploits the advantages of both quantum physics and computing in neural networks. An outlook on possible ways to move forward on this line of research is given, emphasising the idea of Open Quantum Neural Networks based on dissipative quantum computing.
The quantum perceptron and quantum measurement proposals seem to give a more comprehensive platform on which to build quantum machine learning algorithms. It seems vital to tackle the quest for a QNN from the stance of a more advanced formulation of quantum theory. A candidate for a quantum system simulating a classical neural network's attractor-like dynamics would need to contain at least two stable states that are reached through dynamics highly dependent on the initial conditions [42].
2.4.2 Simulating a perceptron on a quantum computer
In the context of the emerging field of quantum machine learning, several attempts have been made to develop a basic computational unit of artificial neural networks using quantum information theory. Based on the quantum phase estimation algorithm, this work introduces a quantum perceptron model imitating the nonlinear step-activation function of a classical perceptron. This scheme requires memory resources on a quantum computer of O(n), where n is the size of the input, and promises efficient applications for more complex structures such as trainable quantum neural networks.
The quantum perceptron model presented by Schuld et al. [43] offers a general procedure to simulate the step-function characteristic of a perceptron on a quantum computer, with an efficiency equivalent to that of the classical model. This fills a void in quantum neural network research, especially for quantum learning methods that rely on an equivalent to classical classification using quantum information. This quantum perceptron model could be used to develop superposition-based learning schemes, in which a superposition of training vectors is processed in quantum parallel. This would be a valuable contribution to current explorations of quantum machine learning [43].
2.4.3 Quantum Perceptron Models
The authors demonstrate how quantum computation can provide meaningful improvements in the computational and statistical complexity of the perceptron model. They develop two quantum algorithms for perceptron learning.
The first algorithm [28] exploits quantum information processing to determine a separating hyperplane using a number of steps sublinear in the number of data points, and therefore provides a quadratic speedup with respect to the size of the training data.
The second algorithm [28] illustrates how the classical mistake bound can be further improved from $O(\frac{1}{\gamma^2})$ to $O(\frac{1}{\sqrt{\gamma}})$ through quantum means, where $\gamma$ denotes the margin. This algorithm provides a quadratic reduction in the scaling of the training time (as measured by the number of interactions with the training data) with the margin between the two classes. This result is especially interesting because it constitutes a quartic speedup relative to the typical perceptron training bounds that are usually seen in the literature.
The most significant feature of this work is that it demonstrates that quantum computing can provide provable speedups for perceptron training, which is a foundational machine learning method. Seeking new models for perceptron learning that deviate from the classical approaches may not only provide a more complete understanding of what form learning takes within quantum systems, but also may lead to richer classes of quantum models that have no classical analogue and are not efficiently simulatable on classical hardware. Such models may not only revolutionise quantum learning but also lead to a deeper understanding of the challenges and opportunities that the laws of physics place on our ability to learn.
2.4.4 An Artificial Neuron Implemented on an Actual Quantum Processor
Although there are many approaches to the implementation of a quantum perceptron, most of the recently published work cannot yet be implemented on existing quantum computing hardware or, where it can, it requires hardware that is not easily available to many researchers. The work by Tacchino et al. [45] presents a quantum perceptron algorithm that does not require a large amount of quantum computational power and can be easily replicated, even using free computational frameworks.
The algorithm in the work of Tacchino et al. uses two different unitary operations to build a circuit that emulates the behaviour of a neuron. The first one encodes the input and the second one applies the weights, so that, when the input matched by the weights unitary is processed, all the qubits are set to the state one. This condition is then checked using a Toffoli gate, whose target qubit is measured to indicate that a valid input was found. The Toffoli gate acts as the nonlinear function that the perceptron requires before delivering the final output [45].
The proposed technique does not show a significant speedup over classical algorithms, even with the novel hypergraph-based encoding method proposed. When the algorithm works with states in superposition, the output becomes probabilistic, so the algorithm must be evaluated several times to establish the proper output, at the risk of false positives. Even without an important speedup over classical algorithms, many other interesting systems can be built on the framework proposed in that work, as will be evident in the following chapters of this research work.
2.4.5 Training Multilayer Quantum Neural Networks
Beer et al. [7] propose a multilayer neural network architecture. The architecture forms a feed-forward neural network whose cost function is built from the fidelity between the prediction of the network and the target output, taken over every input-output pair in the training data. Training is performed by evaluating the result of individual layers and neurons and applying a gradient descent learning rule. The number of qubits required to implement this architecture grows with the width of the network and not with its depth, which promises a relevant advantage over current multilayer learning algorithms.
The algorithm is benchmarked using a random set of input-output pairs, which the authors use to show that the multilayer networks generated with this framework are able to perform universal computation. The authors make their code openly available for evaluating the framework, which is built on top of an open source library used to simulate quantum systems.
2.5 Conclusions
This chapter presents the key concepts and state-of-the-art techniques related to the implementation of neural networks in quantum environments. The algorithms presented in sections 2.4.4 and 2.4.5 were implemented in quantum environments, at a scale that allowed them to be simulated on a classical computer. The technical tools available for this task are enumerated and described in chapter 3, section 3.1, and the requirements and limitations for useful implementations are discussed in chapter 4.
The barycenter correction procedure uses geometric properties that can be translated to an arbitrary dataset, and may prove useful for optimising the neural networks created later.
The concepts discussed in this chapter are revisited and used in following chapters, especially the algorithm for quantum multilayer neural networks, which is a very important component of the complete framework to use quantum multilayer networks with real world data.
Chapter 3 Methods
Reliable and fault-free quantum computers are not yet available [37] [21]. Scientists are trying several methods to isolate and use quantum phenomena that can be translated into computational processes, in a competition with no clear winner or predominant technology so far. State-of-the-art quantum devices are commonly referred to as Noisy Intermediate-Scale Quantum (NISQ) devices, as these systems are not good at isolating a quantum state, that is, at keeping it from interacting with the environment. The other characteristic of NISQ devices is that they do not scale as their classical counterparts do, for which the engineering problems of building exponentially bigger systems are already solved. In the current ecosystem of quantum computing technologies, NISQ devices allow the exploration of quantum algorithms at an intermediate scale, on the premise that useful applications will already exist by the time more powerful quantum computers become available. This provides incentives to the research communities and companies working in quantum computing.
As quantum hardware scientists address the challenge of building better quantum hardware, quantum computer scientists create theoretical algorithms that encourage the scientific community to keep working in this field. The basic requirement that quantum algorithms inherit from previous computational techniques is that they should be better than the best classical algorithms for solving a certain problem; otherwise there would be no point in creating a whole new computational framework. A quantum algorithm is considered a theoretical breakthrough when it demonstrates a clear advantage over its best known classical counterpart, one that translates into time, energy, or memory efficiency.
Between the theoretical development of quantum algorithms and their future large-scale implementation, an intermediate step is to implement algorithms that can be run on state-of-the-art hardware. Quantum simulation frameworks are currently used because they can simulate quantum phenomena with perfect accuracy, in contrast to NISQ devices, which are susceptible to noise and therefore not suited to large-scale computation. This is why the experiments described in this work were implemented on simulators such as Atos' Quantum Learning Machine, using Qiskit and qutip, which are quantum simulation libraries for the Python programming language.
The framework used to build multilayer neural networks [7] is built upon the qutip [25] [26] package and depends on the scipy [48] and numpy [22] libraries. Using these tools, the implementation was developed and customised in the Python environment of Jupyter Notebooks [29].
The present chapter begins by enumerating and discussing the properties of the quantum computing platforms used in this research. Then, it describes the implementation of the algorithms from sections 2.4.4 and 2.4.5 in a simulated quantum environment; the implementations are described in sections 3.2 and 3.3, respectively. Section 3.4 covers two specific methods that allow classical information to be encoded into a multilayer quantum neural network, so that the network can be trained and fitted to this data. Finally, an efficient method to optimise the data fed to a neural network to improve its efficiency is described in section 3.5.
3.1 Quantum Computing Platforms
In this section, the quantum computing platforms used in this research are enumerated, each followed by a description and a discussion of its strengths and weaknesses.
1. Atos Quantum Learning Machine: Basically a supercomputer that has been optimised to perform multiplications of large matrices, which allows it to simulate quantum states efficiently. Its main upside is a fast interface for simulating quantum systems. On the other hand, running long algorithms on it is not possible, due to a timeout that is built into the system. To connect to this computer it is necessary to establish a secure connection with the server that hosts the quantum learning machine, and then use the device through an ad hoc Jupyter Notebooks [29] interface.
2. IBM Q experience: An open source platform provided by IBM, with a web environment that connects to an actual quantum computer. Its main advantage is that this platform uses actual quantum phenomena to perform computations; on the other hand, the number of qubits is very limited. Also, the qubits are not always able to interact with each other, due to the connectivity constraints that the hardware imposes. This platform is considered to be for learning purposes only.
3. Local quantum simulation: It is possible to simulate quantum states using libraries. In this work, two simulation packages were used: the Qiskit [2] and the qutip [25] [26] libraries. Qiskit is focused on intuitively developing quantum algorithms, with a lot of online resources and training available, while qutip requires a stronger command of quantum information theory. Local simulation has advantages: network errors and timeouts stop being an issue, which gives more time to perform calculations and simulations and to train bigger systems. The downside is that the computational resources of a local computer are limited when compared to a powerful ad hoc system for simulating quantum systems. The dedicated learning machine has empirically shown that it can emulate multilayer neural networks in half the time a local computer takes, but when simulating large algorithms, the timeout disconnection makes it impossible to run big networks on this platform.
3.2 Quantum Neuron
The algorithm proposed by Tacchino et al. [45] does not require very specialised hardware, and it was implemented on three different platforms:
• Atos Quantum Learning Machine.
• Qiskit simulation library for Python [2].
• IBM Q experience [40].
It is important to note that the first two platforms are simulators, whereas the IBM Q experience uses actual quantum phenomena. All the platforms were able to run up to a 4-qubit version of the algorithm, which is the analogue of a quantum perceptron with four weights and no bias. It is possible to simulate a 4-qubit version of the Quantum Perceptron algorithm because no demanding tasks need to be performed: the weights and the inputs are hard-coded, meaning that they are written into the program by the programmer, and there is no training algorithm that demands computational resources.
A simple example of the quantum perceptron, as described in the work by Tacchino et al. [45], is shown in figure 3.1. The figure is a visual representation of a quantum circuit, in the form used by the IBM Quantum Experience platform and the Qiskit environment. Each $q_i$ represents a qubit. The boxes tagged X represent NOT gates, which are applied to qubits 0 to 3, thus encoding the state $|1111\rangle$ in the first four qubits. The other qubits are left untouched and remain initialised in the low energy state, so the overall quantum state is $|11110000\rangle$. Next comes a set of Toffoli gates, each of which performs the AND operation over the two qubits marked with a black dot and leaves the result in the qubit marked with the symbol $\oplus$. Finally, the measurement of qubit 4 is represented by a gauge-like icon. As the input state for the neural network is $|1111\rangle$, there is no need to encode the weights: the Toffoli gates applied over $|1\rangle^{\otimes n}$ set each of the ancilla qubits to $|1\rangle$, including qubit 4, which will always output a 1 when measured.
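Because this example only involves NOT and Toffoli gates acting on computational-basis states, its behaviour can be reproduced with ordinary bits. The sketch below does so under an explicit assumption: the exact wiring of the Toffoli gates in figure 3.1 is not given in the text, so the chain used here (qubits 5 and 6 as intermediate ancillas, qubit 4 as the measured output) is hypothetical and only illustrates the AND cascade.

```python
def x_gate(bits, target):
    """NOT gate: flip the target bit."""
    bits[target] ^= 1


def toffoli(bits, control_a, control_b, target):
    """Toffoli gate: flip the target when both controls are 1."""
    bits[target] ^= bits[control_a] & bits[control_b]


def run_neuron(input_bits):
    """Classically emulate the basis-state circuit and return qubit 4.

    The Toffoli wiring below is an assumption made for illustration; it
    chains the AND of qubits 0 to 3 into qubit 4 through two ancillas.
    """
    bits = [0] * 8  # all qubits start in the low energy state
    for qubit, value in enumerate(input_bits):
        if value:
            x_gate(bits, qubit)  # encode the input with NOT gates
    toffoli(bits, 0, 1, 5)  # ancilla 5 = q0 AND q1
    toffoli(bits, 2, 5, 6)  # ancilla 6 = q2 AND ancilla 5
    toffoli(bits, 3, 6, 4)  # output 4  = q3 AND ancilla 6
    return bits[4]


output_all_ones = run_neuron([1, 1, 1, 1])
output_other = run_neuron([1, 1, 0, 1])
```

With the input $|1111\rangle$ the measured qubit is always 1, while any input with a 0 leaves it at 0, which is the deterministic behaviour described above for basis states.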
3.2.1 Generalisation of the Quantum Neuron
Tacchino's algorithm works with a set of rotations that act on some qubits which, after being measured via the Toffoli gate, indicate whether or not the input corresponds to a desired state. As discussed earlier, when an input in a superposition of states is fed to a Tacchino-like neural network, the result stops being deterministic. This means that, in order to make sure that the input state is actually the state a particular neuron has been designed for, more measurements are needed, which in turn requires the capacity to generate the same input state several times and to measure it enough times to characterise it. The number of times the neuron should be evaluated to provide a failure-proof system is proportional to the resolution of the inputs. For instance, if the qubits are rotated into an equal superposition of the states $|1\rangle$ and $|0\rangle$, described by the state $|\psi\rangle = \frac{1}{\sqrt{2}}|0\rangle + \frac{1}{\sqrt{2}}|1\rangle$, we would expect to measure $|1\rangle$ once, on average, in every two measurements.
Figure 3.1: Quantum Neuron in the Gate Model.
If we deal with systems that can only perform orthogonal rotations relative to the computational basis, the output is completely deterministic, and a single measurement is guaranteed to give a good result. When we give the system the ability to rotate by arbitrarily small angles, more evaluations must be performed to ensure that the algorithm does not generate a false positive. This way of encoding information generalises Tacchino's algorithm so that it can work with states in an arbitrary superposition. It gives more choices for encoding information, and thus allows more information to be encoded per qubit, but the output must be measured enough times for the non-deterministic outcome to be properly identified.
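The trade-off between finer rotations and the number of evaluations can be made concrete with a short calculation. The sketch below assumes independent repetitions of the same measurement, with a fixed probability of observing $|1\rangle$ in each one; the function names and the confidence threshold are hypothetical.

```python
def prob_at_least_one(p_one, shots):
    """Probability of observing |1> at least once in independent shots."""
    return 1.0 - (1.0 - p_one) ** shots


def shots_needed(p_one, confidence):
    """Smallest number of shots that reaches the requested confidence."""
    if not 0.0 < p_one <= 1.0:
        raise ValueError("p_one must be in (0, 1]")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be in (0, 1)")
    shots = 1
    while prob_at_least_one(p_one, shots) < confidence:
        shots += 1
    return shots


# Equal superposition: |1> is measured with probability 1/2 per shot.
two_shot_probability = prob_at_least_one(0.5, 2)
shots_equal_superposition = shots_needed(0.5, 0.99)
# A smaller rotation angle lowers the probability and needs more shots.
shots_small_rotation = shots_needed(0.05, 0.99)
```

For the equal superposition, two measurements show $|1\rangle$ at least once with probability 0.75, and seven measurements are needed to reach 99% under these assumptions; the count grows as the probability of $|1\rangle$ shrinks.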
There is a way to overcome the drawback of this encoding method, which is simply to measure a known state in a basis in which the specific outcome is deterministic. For instance, let
$$\mathrm{Measurement}(|x\rangle) = \hat{H}|x\rangle \tag{3.1}$$
where $\hat{H}$ is the Hadamard gate. Then, it can be seen that:
$$\mathrm{Measurement}\left(\frac{1}{\sqrt{2}}|0\rangle + \frac{1}{\sqrt{2}}|1\rangle\right) = |0\rangle \tag{3.2}$$
This turns a state that would otherwise need several measurements to be characterised into one that can be characterised deterministically.
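Equation (3.2) can be checked numerically with a small sketch that applies the Hadamard matrix to the equal superposition. States are represented as pairs of real amplitudes, and the helper name is hypothetical.

```python
import math

S = 1 / math.sqrt(2)

# Hadamard gate in the computational basis.
HADAMARD = [[S, S],
            [S, -S]]


def apply_gate(gate, state):
    """Multiply a 2x2 gate by a single-qubit state vector."""
    return [sum(g * a for g, a in zip(row, state)) for row in gate]


plus = [S, S]    # (|0> + |1>) / sqrt(2)
minus = [S, -S]  # (|0> - |1>) / sqrt(2)

rotated_plus = apply_gate(HADAMARD, plus)    # close to |0>
rotated_minus = apply_gate(HADAMARD, minus)  # close to |1>
```

After the rotation the two superpositions land on different basis states, so a single computational-basis measurement identifies which one was prepared.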
3.3 Quantum multilayer neural network
Beer et al. [7] propose a multilayer network model that is proved to be able to learn an arbitrary transformation over a set of input states so as to accurately obtain the corresponding output states. It is given arbitrary pairs of input-output states of the form $\{|\psi\rangle_{in}, |\psi\rangle_{out}\}$, with the only limitation that there exists a unitary $\hat{U}$ that transforms $|\psi\rangle_{in}$ into $|\psi\rangle_{out}$ as follows:
$$\hat{U}|\psi\rangle_{in} = |\psi\rangle_{out} \tag{3.3}$$
Following this nomenclature, it is possible to form a set of input-output pairs, and the algorithm finds $\hat{U}$ such that every input state in the set is converted to its corresponding output state when $\hat{U}$ is applied to it or, in other words, when it is input to the quantum multilayer neural network. This algorithm is useful when the input-output states correspond to a dataset, which can be achieved by encoding the information of a classical dataset in quantum states, as discussed further below.
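To illustrate what it means for a candidate unitary to fit a set of input-output pairs, the sketch below scores a candidate by the average fidelity between its outputs and the targets, for pure single-qubit states. It only evaluates fixed candidates; the training of the network unitaries performed by the framework of Beer et al. is not reproduced, and the function names and the toy training set are hypothetical.

```python
import math


def apply_unitary(unitary, state):
    """Multiply a square matrix (list of rows) by a state vector."""
    return [sum(u * a for u, a in zip(row, state)) for row in unitary]


def fidelity(target, produced):
    """Fidelity |<target|produced>|^2 between two normalised pure states."""
    overlap = sum(complex(t).conjugate() * complex(p)
                  for t, p in zip(target, produced))
    return abs(overlap) ** 2


def average_fidelity(unitary, pairs):
    """Mean fidelity over (input, output) pairs; 1.0 is a perfect fit."""
    total = sum(fidelity(out, apply_unitary(unitary, inp))
                for inp, out in pairs)
    return total / len(pairs)


S = 1 / math.sqrt(2)
hadamard = [[S, S], [S, -S]]
identity = [[1, 0], [0, 1]]

# Hypothetical training set generated by the Hadamard gate.
pairs = [([1, 0], [S, S]), ([0, 1], [S, -S])]

score_hadamard = average_fidelity(hadamard, pairs)  # candidate that fits
score_identity = average_fidelity(identity, pairs)  # candidate that does not
```

The candidate that reproduces every output reaches a score of one, while the identity only reaches one half, which is the kind of signal a fidelity-based cost function gives to the learning rule.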
The performance of a network with four layers is shown, where the first layer has two neurons, the second layer has three, the third layer has three, and the fourth layer has two; this is represented as [2, 3, 3, 2]. The input-output set has 10 random elements. Training takes 0.91 seconds per epoch and reaches a normalised fitness function of 0.95 in 28 epochs, as shown in figure 3.2. This example shows how the multilayer neural network framework used in this work is able to fit a random dataset.
Figure 3.2: Multilayer neural network with a [2, 3, 3, 2] architecture, training over 10 examples.
As the complexity of the network is increased, the fitness curve grows differently, due to the stochastic nature of the learning method used in the neural network, which can be seen by comparing the fitness function from figure 3.2 with the one shown in figure 3.3, that represents the fitness of a [4, 2, 4] network, that has encoded three datapoints [15] in the input-output pairs.
Figure 3.3: Multilayer neural network with a [4, 2, 4] architecture, training over three examples.
3.4 Information Encoding in Quantum Neural Networks
This section details how to encode relevant information so that this can be fed to the network.
Given the background of digital logic, the most natural way to encode numerical information in a quantum computer is to use the binary representation of an integer value. That is, zeros and ones are concatenated to form a string S of length k whose corresponding integer value is determined as follows:
$$\mathrm{Integer} = \sum_{i=0}^{k} s_i 2^i \tag{3.4}$$
where $s_i$ corresponds to the numerical value of the ith element in S, which can be either zero or one. The method of encoding integer numerical values for quantum computers is discussed in subsection 3.4.1, together with its characteristics and limitations.
Another possibility to encode numerical values as quantum states is to take advantage of the vector representation of the qubits to create a one to one quantum correspondence of a natural number with a quantum state. This approach is discussed and evaluated in subsection 3.4.2.
3.4.1 Binary Encoding of Integer Values
The binary encoding of numerical information was probably the first method used to encode information in a quantum computer. This encoding technique is even used as a convention for the orthogonal basis of the qubits, the states $|0\rangle$ and $|1\rangle$, which correspond respectively to the lowest and highest energy states of the physical system that implements the qubit. Furthermore, the two most important algorithms in the field, namely Shor's [44] and Grover's [20] algorithms, rely on this method.
This encoding method works in a similar way to the binary representation of integer numbers, with the difference that the string is constructed by concatenating the states $|0\rangle$ and $|1\rangle$ instead of zeros and ones. The quantum state of length k that corresponds to an arbitrary integer n can be obtained by a function QBI(n) as follows:
$$QBI(n) = \{|1\rangle, |0\rangle\}^{*} : \mathrm{card}(QBI(n)) = k,\ \sum_{i=0}^{k} \mathrm{state}_i 2^i = n \tag{3.5}$$
where $\mathrm{state}_i$ is zero if the ith element in QBI(n) is $|0\rangle$ and one if that element corresponds to $|1\rangle$.
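A minimal sketch of the binary encoding is given next. It assumes that the k elements are indexed from 0 to k-1, with element i carrying the weight $2^i$, and it represents each qubit state by a label; the function names are hypothetical.

```python
def qbi(n, k):
    """Return the k basis-state labels that encode the integer n.

    Element i carries the weight 2**i, so the list is ordered from the
    least significant to the most significant qubit.
    """
    if k < 1 or not 0 <= n < 2 ** k:
        raise ValueError("n must satisfy 0 <= n < 2**k")
    return ["|1>" if (n >> i) & 1 else "|0>" for i in range(k)]


def qbi_to_int(states):
    """Recover the integer from the list of basis-state labels."""
    return sum(2 ** i for i, state in enumerate(states) if state == "|1>")


encoded_six = qbi(6, 4)
decoded_six = qbi_to_int(encoded_six)
```

Encoding and decoding are inverse operations, and k qubits cover the integers from 0 to $2^k - 1$, which is what makes this representation exact but limited to integer values.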
3.4.2 Bounded Continuous Representation
An alternative to the binary encoding of integer values can be built by defining an injective function that maps real numbers to quantum states. This function must also preserve the monotonicity that the binary encoding has.
Figure 3.4: Cross section of the Bloch Sphere. The dashed circular curve represents the region that can be used in the bounded continuous representation. Any quantum state that lies on the dashed curve can be mapped.
The proposed method consists of bounding the set of real numbers to be mapped to a single quantum state, depending on the context of the information to encode. This method allows a feature of a dataset to be encoded in a single qubit, assuming that the values this regressor can take are bounded and known. For the bounded continuous representation to be monotonic, only quantum states $|\psi\rangle$ of the following form are used:
$$|\psi\rangle = \alpha|0\rangle + \beta|1\rangle : \alpha^2 + \beta^2 = 1,\ \alpha \geq 0,\ \beta \geq 0,\ \{\alpha, \beta\} \in \mathbb{R} \tag{3.6}$$
The constraint $\{\alpha, \beta\} \in \mathbb{R}$ can be visualised as a cross section of the Bloch Sphere, as presented in figure 3.4. Requiring $\alpha \geq 0$ and $\beta \geq 0$ limits the usage of the whole Hilbert space to a quarter of the cross section. This restriction may seem harmful at first, but it is needed for practical reasons, as there are cases where it is impossible to tell two states apart by measuring in a single basis. For instance, let:
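$$|\psi_1\rangle = \frac{1}{\sqrt{2}}|0\rangle + \frac{1}{\sqrt{2}}|1\rangle \tag{3.7}$$
and
$$|\psi_2\rangle = \frac{1}{\sqrt{2}}|0\rangle - \frac{1}{\sqrt{2}}|1\rangle \tag{3.8}$$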
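Let $p_1(|1\rangle)$ be the probability of measuring $|\psi_1\rangle$ in the state $|1\rangle$:
$$p_1(|1\rangle) = \langle\psi_1|\,(|1\rangle\langle 1|)\,|\psi_1\rangle \tag{3.9}$$
$$= \left(\frac{1}{\sqrt{2}}\langle 0| + \frac{1}{\sqrt{2}}\langle 1|\right)(|1\rangle\langle 1|)\left(\frac{1}{\sqrt{2}}|0\rangle + \frac{1}{\sqrt{2}}|1\rangle\right) \tag{3.10}$$
$$= \left(\frac{1}{\sqrt{2}}\langle 0|1\rangle + \frac{1}{\sqrt{2}}\langle 1|1\rangle\right)\left(\frac{1}{\sqrt{2}}\langle 1|0\rangle + \frac{1}{\sqrt{2}}\langle 1|1\rangle\right) \tag{3.11}$$
$$= \left(\frac{1}{\sqrt{2}}\right)\left(\frac{1}{\sqrt{2}}\right) \tag{3.12}$$
$$= \frac{1}{2} \tag{3.13}$$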
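Similarly, let $p_2(|1\rangle)$ be the probability of measuring $|\psi_2\rangle$ in the state $|1\rangle$:
$$p_2(|1\rangle) = \langle\psi_2|\,(|1\rangle\langle 1|)\,|\psi_2\rangle \tag{3.14}$$
$$= \left(\frac{1}{\sqrt{2}}\langle 0| - \frac{1}{\sqrt{2}}\langle 1|\right)(|1\rangle\langle 1|)\left(\frac{1}{\sqrt{2}}|0\rangle - \frac{1}{\sqrt{2}}|1\rangle\right) \tag{3.15}$$
$$= \left(\frac{1}{\sqrt{2}}\langle 0|1\rangle - \frac{1}{\sqrt{2}}\langle 1|1\rangle\right)\left(\frac{1}{\sqrt{2}}\langle 1|0\rangle - \frac{1}{\sqrt{2}}\langle 1|1\rangle\right) \tag{3.16}$$
$$= \left(-\frac{1}{\sqrt{2}}\right)\left(-\frac{1}{\sqrt{2}}\right) \tag{3.17}$$
$$= \frac{1}{2} \tag{3.18}$$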
This shows that measuring $|\psi_1\rangle$ and $|\psi_2\rangle$ in the computational basis, even repeatedly, yields exactly the same behaviour: $|1\rangle$ is measured once in every two measurements on average. The constraints considered guarantee that no two valid states share the same measurement probability distribution, so any valid state can be told apart from the others by measuring it. A graphical representation of the valid, distinguishable states is shown in figure 3.5.
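The result of the two derivations can be verified numerically. The sketch below computes the probability of measuring $|1\rangle$ in the computational basis for both states; the helper name is hypothetical and the states are written as pairs of real amplitudes.

```python
import math


def prob_one(state):
    """Probability of measuring |1> for a single-qubit state (alpha, beta)."""
    alpha, beta = state
    return abs(beta) ** 2 / (abs(alpha) ** 2 + abs(beta) ** 2)


S = 1 / math.sqrt(2)
psi_1 = (S, S)    # equation (3.7)
psi_2 = (S, -S)   # equation (3.8), outside the valid quarter (beta < 0)

p_1 = prob_one(psi_1)
p_2 = prob_one(psi_2)
```

Both probabilities are one half, so the sign of the second amplitude is invisible to this measurement. Restricting the amplitudes to non-negative values removes the ambiguity, since the probability of $|1\rangle$ then determines the state uniquely.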
Figure 3.5: Given the valid states to perform a bounded representation of real numbers, it is possible to map the state $|0\rangle$ to the minimum value of the real numbers to encode, and the state $|1\rangle$ to the maximum value. Any intermediate value will be represented by a superposition.
Once the valid quantum states in the Hilbert space have been defined, the whole transformation of an arbitrary value $x \in \mathbb{R}$ can be built by assuming that x is bounded between two values. The first step is to map x to a value that lies between fixed bounds, using its lower and upper bounds.
It is a necessary condition that the mapping be different for every value of x, in order to respect the injectivity of the transformation. Furthermore, the function should also be monotonic, in order to be a useful quantum data representation. It can be seen that:
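$$\forall x \ni (x \in \mathbb{R},\ lb < x < ub),\ \exists!\ n_x = \frac{x - lb}{ub - lb} \ni (n_x \in \mathbb{R},\ 0 < n_x < 1) \tag{3.19}$$
where $lb \in \mathbb{R}$ is the lower bound, $ub \in \mathbb{R}$ is the upper bound, and
$$n_x = \mathrm{map}(x, lb, ub) = \frac{x - lb}{ub - lb} \tag{3.20}$$
maps x to a value $n_x$ between zero and one.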
Trigonometric mapping
After the first mapping has been done, any value that $n_x$ can take can also be mapped to a corresponding quantum state $|\psi_x\rangle$ by using the sine and cosine functions as follows:
$$|\psi_x\rangle = \cos\left(n_x \frac{\pi}{2}\right)|0\rangle + \sin\left(n_x \frac{\pi}{2}\right)|1\rangle \tag{3.21}$$
The complete translation scheme of this transformation is shown in figure 3.6. This transformation is referred to as the trigonometric mapping.
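The two steps of the trigonometric mapping, equations (3.20) and (3.21), can be sketched as follows. The function names are hypothetical, the state is returned as a pair of real amplitudes for $|0\rangle$ and $|1\rangle$, and the bounds themselves are accepted as inputs, which is an assumption made here so that the extreme values map to the basis states.

```python
import math


def normalise(x, lb, ub):
    """Equation (3.20): map x from [lb, ub] to a value in [0, 1]."""
    if not lb < ub:
        raise ValueError("lb must be smaller than ub")
    if not lb <= x <= ub:
        raise ValueError("x must lie between lb and ub")
    return (x - lb) / (ub - lb)


def trigonometric_mapping(x, lb, ub):
    """Equation (3.21): amplitudes (alpha, beta) of the state encoding x."""
    angle = normalise(x, lb, ub) * math.pi / 2
    return (math.cos(angle), math.sin(angle))


# Hypothetical feature bounded between 0 and 10.
lowest = trigonometric_mapping(0.0, 0.0, 10.0)   # the state |0>
middle = trigonometric_mapping(5.0, 0.0, 10.0)   # equal superposition
samples = [trigonometric_mapping(x, 0.0, 10.0) for x in range(11)]
```

Both amplitudes stay non-negative and the amplitude of $|1\rangle$ grows with x, so the encoded states remain inside the valid quarter of the cross section and preserve the order of the encoded values.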
Figure 3.6: Translation scheme from the domain of the real numbers (1), which is then mapped to a value between zero and $\frac{\pi}{2}$ (2), from where the sine and cosine functions can be used to create a quantum state that unequivocally represents the x value.
Vector mapping
Another alternative to map $n_x$ to a valid quantum state is to define a vector $\vec{r}$ from the normalised value, written here as $v_x$:
$$\vec{r} = [v_x,\ 1 - v_x] \tag{3.22}$$
whose normalised, unit-length representation, given by
$$\hat{r} = \frac{\vec{r}}{|\vec{r}|} \tag{3.23}$$
can be taken in the context of a Hilbert space, where it represents a valid quantum state that is also suitable for the representation discussed. The graphical representation of this alternative can be found in figure 3.7. This mapping is referred to as the vector mapping.
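A minimal sketch of the vector mapping follows equations (3.22) and (3.23) directly. The function name is hypothetical, and reading the two components of the normalised vector as the amplitudes of the two basis states is an assumption of this example.

```python
import math


def vector_mapping(vx):
    """Equations (3.22) and (3.23): unit vector built from vx in [0, 1]."""
    if not 0.0 <= vx <= 1.0:
        raise ValueError("vx must lie between zero and one")
    r = (vx, 1.0 - vx)
    norm = math.hypot(r[0], r[1])  # never zero, since r[0] + r[1] == 1
    return (r[0] / norm, r[1] / norm)


at_zero = vector_mapping(0.0)
at_half = vector_mapping(0.5)
samples = [vector_mapping(i / 10) for i in range(11)]
```

As with the trigonometric mapping, both components stay non-negative and the result has unit length; the difference is that intermediate values are spread along the arc according to the straight segment between the two extremes rather than uniformly in angle.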
Figure 3.7: The second translation scheme differs from the previous one in that the x value is now mapped to a value $v_x$ between zero and one, which is used to form a vector $\vec{r}$ that can then be projected onto the states $|0\rangle$ and $|1\rangle$ to get a unique representation of the value x in the real numbers.
An empirical verification of the injective and monotonic properties of the trigonometric and vector mappings is shown in appendix A.
3.5 The Barycenter Correction Heuristic
Section 2.2 in chapter 2 describes the Barycenter Correction Procedure, a heuristic that can efficiently classify linearly separable data and can also find good partition regions for nonseparable problems. This section explains how this heuristic can be applied in the context of multilayer quantum neural networks to improve the accuracy of the model in use.
Consider an arbitrary model where the output of the classification can be visualised in terms of barycenters. Figure 3.8 shows a graphical representation of such an output, where the highlighted points in the middle of the shaded regions correspond to the barycenters of those regions.
Figure 3.8: Graphical Representation of the Barycenter Correction Procedure.
Then, consider that the lighter grey area and the darker grey area correspond to zones that should be classified with the same label, but only the lighter one is accurately classified. The barycenter correction heuristic assumes that, instead of the two barycenters, a better barycenter for classifying both the already well classified data and the incorrectly classified data is the midpoint between the two barycenters, shown in figure 3.8 with a red cross.
It is important to note that this heuristic is agnostic of the actual learning method used, and can be applied over several labels. This idea can also be extended to correct barycenters in multi dimensional data, as we can compute the middle point of two points in any number of dimensions.
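The correction step can be sketched in a few lines that work in any number of dimensions. The function names and the sample points are hypothetical; the only operation taken from the text is the replacement of two barycenters by their midpoint.

```python
def barycenter(points):
    """Unweighted barycenter of a list of equal-length points."""
    n = len(points)
    return [sum(p[d] for p in points) / n for d in range(len(points[0]))]


def corrected_barycenter(first, second):
    """Midpoint between two barycenters, valid in any dimension."""
    return [(a + b) / 2 for a, b in zip(first, second)]


# Hypothetical three-dimensional outputs that should share one label.
well_classified = [(0.0, 0.0, 1.0), (2.0, 0.0, 1.0), (1.0, 3.0, 1.0)]
misclassified = [(4.0, 4.0, 3.0), (6.0, 4.0, 3.0), (5.0, 7.0, 3.0)]

b_light = barycenter(well_classified)
b_dark = barycenter(misclassified)
corrected = corrected_barycenter(b_light, b_dark)
```

Because only coordinate-wise averages are involved, the same code applies unchanged to outputs with more dimensions or to several labels, each corrected independently.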
3.6 Conclusions
This chapter first implements the perceptron model in three different quantum environments, which is feasible because of the low complexity of the model, and analyses its behaviour in each of them. Only the perceptron model with the best qualities is selected, and it can also be used in a multilayer context.
A framework to encode information into quantum states that can be fed to a multilayer neural network is then presented. Finally, the chapter describes the barycenter correction heuristic, an algorithm that can improve the performance of quantum multilayer networks when using a continuous representation based on barycenters. This platform is a generic way to encode information in a classical format into quantum states and train a neural network that accurately fits its input data.
Chapter 4 Results
This chapter shows the results of using the framework described in Chapter 3 to build multilayer neural networks. First, binary logic gates are emulated using trained multilayer neural networks in section 4.1.
The limitations and challenges of building and training multilayer neural networks are discussed in section 4.2, which also describes the encoding method and setup of various architectures of multilayer quantum neural networks, as well as their feasibility. The resulting network is then evaluated in section 4.3 on the classification task for the Iris dataset.
Section 4.3.3 shows a theoretical algorithm for password verification, which displays a use case for the framework built. It also showcases how quantum technologies are not limited to computing but extend to communications, a field where the properties of quantum systems have generated high expectations.
4.1 Solving Binary Logic Gates using Quantum Neural Networks
The first approach to showcase the framework is to analyse its behaviour when a multilayer quantum neural network is trained on basic logic gates. This section shows the results of modelling the OR, AND, and XOR logic gates using multilayer quantum neural networks. The first two gates, OR and AND, can be modelled using a linearly separable dataset. The XOR gate shows the effect of a dataset that is not linearly separable.
Subsections 4.1.1, 4.1.2 and 4.1.3 detail the process and results of modelling the AND, OR and XOR gates, respectively. Subsection 4.1.3 also shows the evolution of the model for the XOR gate, giving heatmaps of the predictions from the first training steps, when the network is initialised at random, through the steps where the network is able to accurately classify the dataset.
4.1.1 Modelling the AND gate
The truth table of the AND gate is shown in table 4.1:
First Input    Second Input    AND output
0              0               0
0              1               0
1              0               0
1              1               1
Table 4.1: Truth table of the AND gate.
This truth table can be encoded in a dataset as follows:
$$\{\{[0, 0], [0]\}, \{[0, 1], [0]\}, \{[1, 0], [0]\}, \{[1, 1], [1]\}\} \tag{4.1}$$
Using the binary encoding method, described in subsection 3.4.1, we can create quantum states corresponding to this dataset as:
$$\{\{|00\rangle, |0\rangle\}, \{|01\rangle, |0\rangle\}, \{|10\rangle, |0\rangle\}, \{|11\rangle, |1\rangle\}\} \tag{4.2}$$
This data is then fed to a [2, 1, 2] network, which is the simplest network that can handle the dataset representing the AND gate. The network is trained over 500 epochs and achieves a normalised accuracy greater than 0.99. The accuracy as a function of the number of steps is shown in figure 4.1.
Figure 4.1: AND: Fitness of the neural network as a function of the epochs.
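The step from the classical dataset of equation 4.1 to the basis states of equation 4.2 can be illustrated with a minimal Python sketch. It assumes that the binary encoding maps each bit string to the computational basis state with the same label, read with the first bit as the most significant one, and it represents a state as a plain list of amplitudes. The helper names `basis_state` and `ket_label` are hypothetical and are not part of the framework.

```python
from itertools import product
from typing import List, Sequence


def basis_state(bits: Sequence[int]) -> List[float]:
    """Amplitude vector of the computational basis state |b1 b2 ... bn>.

    Assumption: the first bit is the most significant one, so [1, 0]
    selects index 2 of a four-component vector.
    """
    index = 0
    for bit in bits:
        if bit not in (0, 1):
            raise ValueError("bits must be 0 or 1")
        index = (index << 1) | bit
    state = [0.0] * (2 ** len(bits))
    state[index] = 1.0
    return state


def ket_label(bits: Sequence[int]) -> str:
    """Plain-text label of a basis state, for example |10>."""
    return "|" + "".join(str(bit) for bit in bits) + ">"


# Classical AND dataset: one (inputs, output) pair per truth-table row.
and_dataset = [([a, b], [a & b]) for a, b in product((0, 1), repeat=2)]

# Encoded dataset: each bit list becomes a basis-state amplitude vector.
encoded = [(basis_state(x), basis_state(y)) for x, y in and_dataset]

for (x, y), (state_x, state_y) in zip(and_dataset, encoded):
    print(ket_label(x), state_x, "->", ket_label(y), state_y)
```

Generating the rows from the truth table, rather than typing them by hand, avoids duplicated or missing input pairs in the dataset.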
To further evaluate the performance of the neural network, different values between zero and one are mapped to quantum states using the Bounded Continuous Representation,