pers written comparing different methods make that clear: [Mul75], Geary C [Gea54], Moran I [Mor50], Spatial Autocorrelation [CO70], Wartenberg [War85] and Anselin’s LISA [Ans95]. The second problem is a computational problem involving the optimisation of nested $O(n^2)$ loops. As stated previously, the spatial autocorrelation function is $O(n^2)$, while the number of permutations of any pair of maps, for a set of $m$ maps, is:
\[
\mathrm{Permutations}(m) = \frac{m(m-1)}{2}
\]
(assuming that the correlation function is commutative).
So, the time to run a spatial autocorrelation on all of the maps in a set is $O(m^2 \times n^2)$.
Similarly, to add a new map to an already existing set of correlations is $O(m \times n^2)$. One possible way forward could be to match against a smaller number of templates as a heuristic. The only problem here is that if Map A correlates highly with Map B, then Map C is not guaranteed to correlate with Map A, even if it is highly correlated with Map B. This could be formulated as a spatial feature extraction algorithm. Gangappa et al. provide a review of regression, Bayesian, Artificial Neural Network and other ensemble techniques in “Techniques for Machine Learning based Spatial Data Analysis: Research Directions” [GMS17]. Openshaw’s book, “Artificial Intelligence in Geography” [Ope97] provides a good introduction as well, but the technology has moved on since it was written in 1997.
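The costs described above can be made concrete with a minimal sketch. It assumes a bivariate Moran-style cross-product as the correlation function and a dense n × n weights matrix; neither choice is taken from the cited works, and the names pair_count, bivariate_moran, correlate_all and add_map are hypothetical. With symmetric weights the statistic is commutative, which is what allows each pair of maps to be evaluated only once.

```python
from itertools import combinations
from math import sqrt


def pair_count(m):
    """Number of unordered map pairs, m(m - 1)/2."""
    return m * (m - 1) // 2


def bivariate_moran(x, y, w):
    """Moran-style cross-product of two maps (an assumed statistic).

    x and y hold one value per cell; w is a dense n x n weights matrix.
    The nested loops over all cell pairs are the O(n^2) part.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    dx = [v - mean_x for v in x]
    dy = [v - mean_y for v in y]
    s0 = sum(sum(row) for row in w)  # total weight
    cross = 0.0
    for i in range(n):
        for j in range(n):
            cross += w[i][j] * dx[i] * dy[j]
    scale = sqrt(sum(d * d for d in dx) * sum(d * d for d in dy))
    return (n / s0) * cross / scale


def correlate_all(maps, w):
    """Correlate every unordered pair once: m(m - 1)/2 calls, O(m^2 n^2)."""
    return {
        (a, b): bivariate_moran(maps[a], maps[b], w)
        for a, b in combinations(sorted(maps), 2)
    }


def add_map(results, maps, name, values, w):
    """Add one map to an existing set: m new calls, O(m n^2)."""
    for other in sorted(maps):
        results[(other, name)] = bivariate_moran(maps[other], values, w)
    maps[name] = values
```

Called with the same map for both arguments, bivariate_moran reduces to the usual Moran I, so the one function covers both the single-map and the pairwise case.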
Given the nature of the problem defined so far, the use of parallel algorithms to accelerate the processing is a possibility. Artificial Neural Networks (ANNs) have gained popularity recently with libraries like TensorFlow [Aba+16], Keras [Cho15], PyTorch [Pas+17] and Caffe [Jia+14] providing interfaces to utilise the multiple parallel cores of graphics processing units (GPUs) for large-scale networks. These ANN simulations use matrix operations (hence “TensorFlow”), running on “general purpose GPU” libraries. The libraries are not so much specialised to the task of simulating ANNs as to the task of running matrix operations and so are usable in the context of large-scale spatial correlations.
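As a sketch of why matrix libraries fit this workload, assume (this notation is illustrative and not taken from the cited works) that the $m$ maps are stacked as the rows of a matrix $Z \in \mathbb{R}^{m \times n}$, each row mean-centred and scaled to unit length, and that $W \in \mathbb{R}^{n \times n}$ is a spatial weights matrix with total weight $S_0 = \sum_{i}\sum_{j} w_{ij}$. A bivariate Moran-style statistic for every pair of maps $a$ and $b$ can then be written as a single chain of matrix products:

\[
C = \frac{n}{S_0}\, Z W Z^{\top},
\qquad
C_{ab} = \frac{n}{S_0} \sum_{i=1}^{n} \sum_{j=1}^{n} w_{ij}\, z_{ai}\, z_{bj}
\]

Under these assumptions, evaluating $ZW$ first costs $O(m \times n^2)$ for a dense $W$, and the remaining product with $Z^{\top}$ costs $O(m^2 \times n)$. Both steps are ordinary dense matrix multiplications of the kind that GPU libraries are built to parallelise.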
2.9 Learning from Real-time Streams
One definition of learning is “Hebb’s postulate of learning”, from The Organization of Behavior [Heb49], which describes the process of learning as follows:
“When an axon of cell A is near enough to excite a cell B and repeatedly or persistently takes part in firing it, some growth process or metabolic changes take place in one or both cells such that A’s efficiency as one of the cells firing B, is increased.” (Hebb, 1949 [Heb49])
This defines the term, “Hebbian Learning”, or learning by reinforcement. When data from real-time streams is considered, this reinforcement comes through the repeating patterns present in the data. Reinforcement learning is the topic of “General Principles of Learning-Based Multi-Agent Systems” [WWT99], where Wolpert et al. describe the optimisation of systems in the absence of any centralised control. To quote their justification for using the agent-based approach as opposed to a “hand-tailored design”:
“one does not have to laboriously model the entire system; global performance is “robust”; one can scale up to very large systems; and one can maximally exploit the power of machine learning.” (Wolpert et al. [WWT99])
The authors state that this approach is also applicable in the fields of: “multi-agent systems, computational economics, reinforcement learning for adaptive control, statistical mechanics, computational ecologies and game theory”. This highlights the fact that the techniques of learning by reward, or maximising a utility function, are generally applicable to a wide variety of problems.
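Returning to Hebb’s postulate quoted at the start of this section, it is commonly formalised as a weight change proportional to the product of the activity of the two cells. The sketch below applies that textbook rule to a stream of observations; the function names, the learning rate and the encoding of each observation as a (pre, post) activity pair are illustrative assumptions rather than anything specified in the cited works.

```python
def hebbian_update(weight, pre, post, rate=0.1):
    """Basic Hebbian rule: strengthen the link when both cells are active."""
    return weight + rate * pre * post


def train_on_stream(stream, rate=0.1):
    """Fold a stream of (pre, post) activity pairs into a single weight.

    The weight starts at zero and grows only on observations where
    cell A (pre) and cell B (post) fire together.
    """
    weight = 0.0
    for pre, post in stream:
        weight = hebbian_update(weight, pre, post, rate)
    return weight
```

A pattern that repeats in the stream therefore accumulates weight, while observations in which only one cell is active leave the weight unchanged. The plain rule grows without bound, so practical variants usually add a decay or normalisation term.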
From a computational point of view, the following quote by Aleksander and Morton from the book, “An Introduction to Neural Computing” [AM95], reinforces the point that Wolpert et al. make, but frames it in a wider perspective:
“The art of the computer programmer is that of turning algorithms into code that the computer can understand through the use of a suitable computer language, while the art of the AI scientist is to find algorithms for the forms of computer behaviour he or she is trying to achieve.”
(Aleksander and Morton [AM95])
This is put into practice by Rand, in “Machine learning meets agent-based modeling: When not to go to a bar” [Ran06], where he develops the idea of learning models from real-time streams. The example presented draws on the same problem cited by Wolpert et al. [WWT99], which is Brian Arthur’s bar attendance model [Art94]. Rand, however, includes the concept of an “agent world” and a “real world”, which the model attempts to replicate through modification of its rules. Related to this idea, Kirman and Vriend study the working of the Marseille Fish market in “Evolving market structure: An ACE model of price dispersion and loyalty” [KV01] and “Learning in Agent-based Models” [Kir11], which they describe as an “agent-based computational economics” (ACE) model. The authors stress the use of the word “computational”:
“The important word here is “computational”. The idea is to specify the nature of the agents involved, the rules by which they behave and by which they interact and then to simulate the model in order to observe the outcomes.” (Kirman [Kir11])
Kirman and Vriend use the data from the market transactions to learn decision probabilities for their logit model. Reference is made to game theory and Holland’s complex adaptive systems, which most closely follows how their model is developed, being composed of “IF THEN” rules, similar to Holland’s genetic agents [Hol95]. The aim is to find the free parameters of the system to optimise the fit with the real world data.
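To illustrate what learning decision probabilities for a logit model can look like, the following is a minimal sketch of a generic logit (softmax) choice rule with a simple payoff memory. It is not the authors’ model: the function names, the sensitivity parameter beta and the memory weight are hypothetical, with beta standing in for the kind of free parameter that would be fitted to the transaction data.

```python
from math import exp


def logit_probabilities(payoffs, beta):
    """Probability of choosing each option, proportional to exp(beta * payoff).

    The largest payoff is subtracted before exponentiating, which leaves the
    probabilities unchanged but avoids overflow for large beta.
    """
    top = max(payoffs)
    weights = [exp(beta * (p - top)) for p in payoffs]
    total = sum(weights)
    return [w / total for w in weights]


def update_payoff(old, reward, memory=0.8):
    """Blend the remembered payoff with the latest observed reward."""
    return memory * old + (1 - memory) * reward
```

With beta equal to zero every option is equally likely, and as beta grows the choice concentrates on the option with the highest remembered payoff, so repeated good outcomes with one seller translate into loyalty.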
Relating learning to physical systems, in “Why does deep and cheap learning work so well?” [LT16] Lin and Tegmark make the following claim:
“We have shown that the success of deep and cheap (low-parameter-count) learning depends not only on mathematics but also on physics, which favors certain classes of exceptionally simple probability distributions that deep learning is uniquely suited to model.” (Lin and Tegmark [LT16])
The paper relates to how network size and structure determine the types of problems that can be solved, concluding with the statement above, that the speech, language translation and object recognition problems on which deep learning currently works so well all belong to a narrow class of physical problems. A review of the field of deep learning and artificial neural networks in relation to agent-based modelling is given by van der Hoog in “Deep Learning in (and of) Agent-Based Models: A Prospectus” [Hoo17], including a history of the field. The author’s approach is a form of competitive learning, which he terms the “Doppelganger approach”. Agents imitate and try to out-perform their peers, whom they then replace in the simulation. However, the “maximisation of utility function” from previous approaches is substituted with an alternative metric based on performance. The author’s secondary aim is directly related to learning models from streams of data:
“...we propose to use ANNs as computational emulators of entire ABMs. The ANN functions as a computational approximation of the non-linear, multivariate time series generated by the ABM. It is a meta-modelling approach using statistical machine learning techniques.” (van der Hoog [Hoo17])
The problem of model validation is mentioned, but van der Hoog also states that, “...until now, no clear consensus has appeared how to resolve the empirical validation problem.”. The two methods he refers to are bootstrapping [BHM07] [ET94] and “estimation of a master equation derived from Fokker-Planck equation” from state transition probabilities [ALW05].
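As an illustration of the first of these methods, the sketch below computes a percentile bootstrap confidence interval for a summary statistic of model output. It is a generic textbook form of the bootstrap rather than the specific procedure of the cited works; the function name, the default of 2000 resamples and the fixed seed are assumptions made so that the example is reproducible.

```python
import random
from statistics import mean


def bootstrap_ci(sample, stat=mean, resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for stat(sample).

    Each resample draws len(sample) values with replacement; the interval
    is read from the sorted resampled statistics.
    """
    rng = random.Random(seed)  # local generator, so results are repeatable
    n = len(sample)
    estimates = sorted(
        stat([rng.choice(sample) for _ in range(n)])
        for _ in range(resamples)
    )
    lower = estimates[int((alpha / 2) * resamples)]
    upper = estimates[int((1 - alpha / 2) * resamples) - 1]
    return lower, upper
```

One possible use in validation is to compute the interval from repeated simulation runs and check whether the corresponding statistic of the real world data falls inside it.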