2. Portfolio Theory and Valuation Models
2.4. Arbitrage Pricing Theory (APT)
Most classical artificial neural networks are feed-forward, with directed acyclic network graphs [73, 37]. In contrast, recurrent neural networks can contain directed cycles in their architecture, leading to a much higher level of complexity and new features. Although many different problems have been tackled successfully using feed-forward neural networks, biological neural networks are not limited to those architectures.
[Figure 2.6: Bio-plausibility of the spiking neuron models (in number of features) against feasibility (speed). Axes: bio-plausibility (# of features) and feasibility (1/# of FLOPS). Models plotted: Izhikevich, I&F, Hindmarsh-Rose, quadratic I&F, I&F with adaptation, Morris-Lecar, Wilson, I&F-or-burst, FitzHugh-Nagumo, resonate-and-fire, Hodgkin-Huxley, multi-compartment models.]
Recurrent Neural Networks (RNNs) have shown great potential in solving engineering problems such as classification, regression, prediction, control, and simulation of dynamic systems. They can naturally process temporal inputs, whereas feed-forward networks need delay embedding and more parameters to process temporal data [359]. RNNs have been shown to be Turing equivalent [195] and universal approximators [110]. They can also approximate finite state automata [282]. However, the high computational cost, slow convergence, and suboptimal solutions of RNN learning algorithms have limited their wide real-world application [176].
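To make the delay-embedding requirement concrete, the following minimal sketch shows how a temporal signal could be presented to a feed-forward network as fixed-length windows of past samples. The function name `delay_embed`, the window length, and the toy signal are illustrative assumptions, not taken from the cited works.

```python
def delay_embed(series, window):
    """Turn a 1-D sequence into (window of past values, next value) pairs."""
    if window < 1 or window >= len(series):
        raise ValueError("window must satisfy 1 <= window < len(series)")
    pairs = []
    for t in range(window, len(series)):
        # The network input is the last `window` samples; the target is sample t.
        pairs.append((series[t - window:t], series[t]))
    return pairs


signal = [0, 1, 2, 3, 4, 5]          # toy signal (hypothetical data)
pairs = delay_embed(signal, window=3)
for inputs, target in pairs:
    print(inputs, "->", target)
```

Each input vector has `window` entries, so the number of input weights of a feed-forward network grows with the length of history it must see, whereas a recurrent network can carry that history in its internal state.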
Reservoir Computing (RC)
Recently, the independent works of Buonomano [53], Dominey (Temporal Recurrent Neural Networks) [84], Maass (Liquid State Machine - LSM [241]), Jaeger (Echo State Network - ESN [176]) and Steil (Back-Propagation Decorrelation - BPDC [353]) gave rise to a new technique, collectively known as Reservoir Computing (RC) [332], which is claimed to be capable of processing analogue continuous-time inputs and to mitigate the shortcomings of the RNN learning algorithms.
The RC approach is generally based on a recurrent network of (usually) non-linear nodes. This recurrent network, which is called the reservoir (also liquid or dynamic filter), transforms the temporal dynamics of the recent input signals into a high-dimensional representation. This multi-dimensional trajectory can then be used as latent state variables by a simple linear regression/classification or a feed-forward layer (known as the readout map or output layer) to extract the salient information from the transient states of the dynamic filter and generate stable outputs. This is very similar to the kernel methods [336, 70] used in Support Vector Machines (SVMs) [385, 70], which use a mapping function to map data points into a high-dimensional feature space and then linearly separate them into classes [240]. However, unlike SVMs, RC has an intrinsic temporal nature.
An amazingly diverse range of dynamic systems can be used as reservoirs: RNNs [241], gene regulatory networks (GRNs) [183], an animal brain [276], or even a bucket full of water [97]. The reservoir is traditionally a randomly generated RNN with fixed weights. Only the output layer is trained. The linear nature of the readout map dramatically decreases the computational cost and complexity of the training. Nevertheless, it has been shown that the topology, weights and other parameters (e.g. bias, gain, threshold) of the reservoir elements can change the dynamics of the reservoir and thus affect the performance of the system [175, 232]. Therefore, a randomly generated reservoir is by definition not optimal.
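The division of labour described above (a fixed random reservoir plus a trained linear readout) can be illustrated with a minimal sketch in the style of an echo state network. The reservoir size, tanh units, spectral-radius scaling, washout length, ridge-regression readout, and sine-wave task are all illustrative assumptions, not the setup of any cited work.

```python
import numpy as np

rng = np.random.default_rng(0)

n_reservoir = 50        # number of reservoir units (illustrative choice)
spectral_radius = 0.9   # target spectral radius, below 1 (illustrative choice)
ridge = 1e-6            # regularisation strength for the readout (assumption)

# Fixed random weights: these are generated once and never trained.
W_in = rng.uniform(-0.5, 0.5, size=n_reservoir)
W = rng.uniform(-0.5, 0.5, size=(n_reservoir, n_reservoir))
W *= spectral_radius / np.max(np.abs(np.linalg.eigvals(W)))


def run_reservoir(u):
    """Drive the reservoir with the input sequence u and collect its states."""
    x = np.zeros(n_reservoir)
    states = np.empty((len(u), n_reservoir))
    for t, u_t in enumerate(u):
        x = np.tanh(W @ x + W_in * u_t)
        states[t] = x
    return states


# Toy task (hypothetical data): predict the next sample of a sine wave.
signal = np.sin(0.2 * np.arange(401))
u, y = signal[:-1], signal[1:]

washout = 50  # discard the initial transient states
X = run_reservoir(u)[washout:]
X = np.hstack([X, np.ones((len(X), 1))])  # append a bias column
y = y[washout:]

# Train only the linear readout, in closed form, by ridge regression.
W_out = np.linalg.solve(X.T @ X + ridge * np.eye(X.shape[1]), X.T @ y)

mse = np.mean((X @ W_out - y) ** 2)
print("training MSE:", mse)
```

Training reduces to solving one linear system for `W_out`, which is why the readout is cheap to fit. The random draw of `W` and `W_in` is left untouched, and a different seed gives a different reservoir with different dynamics.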
Researchers have proposed different measures and methods for generating and/or adapting reservoirs for a given problem or problem class [332]. However, there is little or no theoretical ground for specifying which reservoir is suited for a particular problem, due to the non-linearity of the system [175]. Moreover, with only one positive result, in the case of intrinsic plasticity [354], the development of unsupervised techniques for effective reservoir adaptation remains an open research question [332]. Another open question is the effect of the reservoir topology on the performance and dynamics of the system [175]. There is some evidence that hierarchical and structured topologies can significantly increase the performance [332]. Recently, some deterministic formulations for the generation of competitive reservoirs were suggested [307, 308].
Once the reservoir has been optimally designed, tuned, or trained (in an unsupervised fashion) for a given problem class, different readout maps can simply be trained (in a supervised mode) to perform different tasks [241]. The computational power of the reservoir can be increased simply by adding more neurons to the existing network [241]. The system is also very robust to noise [241].
RC is a very biologically plausible approach to RNNs [234, 233]. In the Liquid State Machine (LSM) technique, the recipe for generating the reservoir (liquid) follows biologically motivated topologies and metrics. The fact that the same reservoir can be used to infer different outputs, and that some deterministic architectures or some generic but not well-understood adaptations can be used for many problems [307, 308, 234], shows other promising similarities with the characteristics of the mammalian brain. RC also shows how temporal information can be represented spatially in the brain, providing a temporal context for the perception of the current inputs [234]. Indeed, LSMs have also been used as models of cortical microcircuits in cognitive science and neuroscience studies [234] to explain processes in biological brains. However, some features of RC are still not very biologically accurate, for example the neuron models used in RC (usually LIF neurons).
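For reference, the leaky integrate-and-fire (LIF) neuron mentioned above can be sketched in a few lines. This is a generic textbook-style formulation integrated with the Euler method. The function name `simulate_lif` and all parameter values (time constant, threshold, reset, input currents) are illustrative assumptions, not the settings of any cited LSM study.

```python
def simulate_lif(current, steps, dt=1.0, tau=10.0, v_rest=0.0,
                 v_threshold=1.0, v_reset=0.0, resistance=1.0):
    """Euler-integrate a leaky integrate-and-fire neuron with constant input.

    Returns the list of step indices at which the neuron spiked.
    """
    v = v_rest
    spike_times = []
    for step in range(steps):
        # Leak towards the resting potential plus drive from the input current.
        v += dt / tau * (-(v - v_rest) + resistance * current)
        if v >= v_threshold:
            spike_times.append(step)
            v = v_reset  # reset the membrane potential after each spike
    return spike_times


# With these parameters the potential settles at resistance * current,
# so an input of 0.5 stays below the threshold of 1.0 and never spikes.
weak = simulate_lif(current=0.5, steps=200)
strong = simulate_lif(current=2.0, steps=200)
print(len(weak), len(strong))
```

The model has a single state variable and a hard threshold, which is what makes it cheap to simulate and also what makes it a heavily simplified description of a biological neuron.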
Hierarchical Temporal Memory (HTM)
Inspired by the structure and characteristics of the mammalian brain, Hawkins introduced a new model of the brain called Hierarchical Temporal Memory [150] in his book “On Intelligence”. Although he misleadingly claimed to solve “the global brain problem” [360] and made other questionable claims about artificial intelligence, the brain, and consciousness [292, 360, 96], he proposed a model that both explains many functions of the brain, and opens the way for creating new useful intelligent systems.
He builds upon the thesis [273] that many different functions of the neocortex, such as vision, hearing, language, and motor planning, are based on a single common algorithm and structure. He proposed a general algorithm that explains how brain regions work together to store previous experiences and use them to predict and model future ones and their causes [150, 151]. Later, Hawkins and George presented a formal model of the theory [119] based on a belief propagation model. In this model they used a hierarchical Bayesian network of sequence recognisers (based on the Hidden Markov Model - HMM). They implemented it and successfully tested it on a visual perception problem [119]. They also released a platform and a tool set [118] that allow other researchers and developers to build upon their system. This attracted considerable attention, and new products were released based on the platform. However, the relation of the formal model to neural microcircuits in the brain was not clear, and the lower-level functioning of the regions (HMM) and the means of communication between regions were far from those of the brain. Recently, a new model called HTM Cortical Learning Algorithms (CLA) was published [281] that fills these gaps by explaining the relation of the model to brain microcircuits already known to neuroscientists, and by using a more bio-plausible neuron model with sparse coding. Very recently, they also introduced a cloud-based service [280] that allows researchers to evaluate the new model for online prediction and anomaly detection on temporal data streams.
The HTM model is bio-inspired and very biologically plausible compared to many other models with comparable capabilities [281]. The importance of the temporal persistence of the causes (objects) [151, 150], the use of sparse coding and RNNs for spatiotemporal pattern recognition, and other biological assumptions [281] are all examples of the bio-plausibility of HTM. However, although the new neuron model introduced in the CLA is in many aspects much more bio-plausible than the aforementioned spiking neuron models, it is in many other aspects heavily abstracted and simplified [281]. Currently known CLA implementations are also still limited to a single layer [281]. No results from the CLA have been published yet.