3.1. MATERIALS AND METHODS
3.1.5. Data Collection
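To determine the Hopf curves (in particular, the curves that determine the stability of the steady state) in the $q$-$a_{ee}$ plane, we first recall that we expanded $\lambda$ as a function of $q$:
$$\lambda = \lambda_0 + \lambda_1 q + \lambda_2 q^2 \quad \text{for small } q.$$
We know that $\lambda_0$ depends on $a_{ee}$, and that $\operatorname{Re}(\lambda) = 0$ when $a_{ee} = a_{ee}^*$, the critical value of $a_{ee}$ when $q = 0$, so that
$$\operatorname{Re}(\lambda)(a_{ee}, q) = \alpha\,(a_{ee} - a_{ee}^*) + \beta q + \delta q^2,$$
where $\beta = \operatorname{Re}(\lambda_1)$ and $\delta = \operatorname{Re}(\lambda_2)$.
Letting $\operatorname{Re}(\lambda) = 0$ provides the critical curve in $a_{ee}$-$q$ space:
$$\alpha\,(a_{ee} - a_{ee}^*) + \beta q + \delta q^2 = 0. \tag{E.25}$$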
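As a small illustration of how Eq (E.25) can be used, the following C sketch solves $\operatorname{Re}(\lambda) = \alpha(a_{ee} - a_{ee}^*) + \beta q + \delta q^2 = 0$ for $a_{ee}$ at a given small $q$. The names `HopfCoeffs` and `critical_aee` are hypothetical and introduced only for this example; it assumes $\alpha \neq 0$ and that the coefficients have already been computed.

```c
#include <assert.h>

/* Coefficients of Re(lambda) = alpha*(a_ee - a_ee_crit) + beta*q + delta*q^2.
   The struct and its field names are illustrative, not from the original code. */
typedef struct {
    double alpha;      /* d Re(lambda) / d a_ee at the critical point (nonzero) */
    double beta;       /* Re(lambda_1) */
    double delta;      /* Re(lambda_2) */
    double a_ee_crit;  /* critical a_ee at q = 0 */
} HopfCoeffs;

/* Solve Re(lambda) = 0 for a_ee at a given (small) q. */
double critical_aee(const HopfCoeffs *c, double q)
{
    assert(c->alpha != 0.0);  /* the curve is not a graph over q if alpha = 0 */
    return c->a_ee_crit - (c->beta * q + c->delta * q * q) / c->alpha;
}
```

Sweeping `q` over a small interval and plotting `critical_aee` against it would trace the theoretical stability curve; the coefficient values themselves must come from the calculation in the text.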
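We note that we are only concerned with finding $\delta$ if $\beta$ is 0, since otherwise the local behavior will be dominated by the linear term. Letting $\rho = a_{ee} - a_{ee}^*$, we have $\alpha = \frac{d}{d\rho}\operatorname{Re}(\lambda)\big|_{\rho=0}$. Letting $A(\rho) = \hat{L}_0(\rho; m^*)$ be the linearization about the critical wavenumber $m^*$ (when $a_{ee} = a_{ee}^*$), we have
$$A(\rho) = \begin{pmatrix} \dfrac{-1 + \alpha_1 (\rho + a_{ee}^*)\,\hat{K}_e(m^*)}{\tau_e} & -\dfrac{\alpha_1}{\tau_e}\,\hat{J}_{ie}(m^*) \\[2ex] \dfrac{\beta_1}{\tau_i}\,\hat{J}_{ei}(m^*) & \dfrac{-1 - \beta_1 \hat{J}_{ii}(m^*)}{\tau_i} \end{pmatrix},$$
where $(\rho + a_{ee}^*)\,\hat{K}_e(m^*) = a_{ee}\hat{K}_e(m^*) = \hat{J}_{ee}(m^*)$. Recall from our linearization, Eq (E.11), that we have the following equations:
$$A(0)\,\Phi_0 = i\zeta_0 \Phi_0 = \lambda_0 \Phi_0,$$
$$A^*(0)\,\Psi = -i\zeta_0 \Psi = \bar{\lambda}_0 \Psi.$$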
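More generally,
$$A(\rho)\,\Phi(\rho) = \lambda(\rho)\,\Phi(\rho).$$
Differentiating with respect to $\rho$, evaluating at $\rho = 0$, and keeping the real parts provides
$$(\Lambda_0 - A(0))\,\Phi'(0) = (A'(0) - \alpha\,\mathrm{Id})\,\Phi_0,$$
where some of the variables used previously correspond to quantities here; i.e., $\lambda_0 = \lambda(0)$ and $\Phi_0 = \Phi(0)$. By the Fredholm alternative, this has a solution $\Phi'(0)$ if and only if
$$\langle (A'(0) - \alpha\,\mathrm{Id})\,\Phi_0, \Psi \rangle = 0$$
$$\implies \alpha\,\langle \Phi_0, \Psi \rangle = \langle A'(0)\,\Phi_0, \Psi \rangle$$
$$\implies \alpha = \frac{\langle A'(0)\,\Phi_0, \Psi \rangle}{\langle \Phi_0, \Psi \rangle}, \quad \text{where} \quad A'(0) = \begin{pmatrix} \dfrac{\alpha_1 \hat{J}_{ee}(m^*)}{a_{ee}\,\tau_e} & 0 \\ 0 & 0 \end{pmatrix}.$$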
Therefore, we have all of the values necessary to determine the stability curve, Eq (E.25). Plotting these curves for different values of µ on top of those determined numerically with AUTO shows very good agreement between the numerical and theoretical curves (Fig 24).
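The solvability condition above reduces the computation of $\alpha$ to two inner products on $\mathbb{C}^2$. The C sketch below evaluates that ratio numerically. It assumes the Hermitian inner product $\langle u, v \rangle = \sum_i u_i \bar{v}_i$, takes the real part of the ratio in line with "keeping the real parts", and uses the fact that $A'(0)$ has a single nonzero entry in its upper-left corner. The function names `inner2` and `hopf_alpha` and the parameter `dA11` are hypothetical, chosen only for this example.

```c
#include <assert.h>
#include <complex.h>

/* Hermitian inner product <u, v> = u_0*conj(v_0) + u_1*conj(v_1) on C^2.
   This convention is an assumption of the sketch. */
static double complex inner2(const double complex u[2], const double complex v[2])
{
    return u[0] * conj(v[0]) + u[1] * conj(v[1]);
}

/* alpha = Re( <A'(0) Phi0, Psi> / <Phi0, Psi> ), where A'(0) is the 2x2
   matrix whose only nonzero entry is dA11 in the upper-left corner. */
double hopf_alpha(double dA11, const double complex phi0[2],
                  const double complex psi[2])
{
    const double complex dA_phi0[2] = { dA11 * phi0[0], 0.0 };
    const double complex denom = inner2(phi0, psi);

    /* <Phi0, Psi> must be nonzero for the ratio to be defined. */
    assert(creal(denom) != 0.0 || cimag(denom) != 0.0);
    return creal(inner2(dA_phi0, psi) / denom);
}
```

As a sanity check with purely illustrative numbers, the matrix with rows (1, −2) and (1, −1) has eigenvalues ±i, with Φ0 = (−2, −1 + i) and Ψ = (1, −1 − i) satisfying the two eigen-equations. For these inputs `hopf_alpha(0.8, phi0, psi)` evaluates to 0.4 up to rounding. This is half of `dA11`, as expected for a 2×2 real matrix with a complex eigenvalue pair, whose real part is half the trace.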
APPENDIX F
SPECIFIC GENERAL-PURPOSE GPU PROGRAMMING:
OUTLINE FOR SIMULATING SPATIALLY EXTENDED WILSON-COWAN NETWORKS
F.1 BACKGROUND
Since the cortex is better approximated as a sheet of neurons rather than a line, we wanted to explore how the network behaved when extended in two spatial dimensions. This necessitated using custom code, and we first turned to MATLAB since it allows for rapid prototyping. However, the simulations were excruciatingly slow for networks of around 100×100 populations. Indeed, this led to misleading results: cases in which we thought we saw instances of spatiotemporal pattern formation turned out to be transients that evolved to steady states. Clearly, a framework allowing for more rapid simulations was necessary to implement anything more than a few simple cherry-picked examples.
Throughout the years, high-performance computing has often been implemented in the programming languages C and C++. These are compiled languages with vast libraries for scientific computing, and are known to produce rapidly executed code. For speed in simulations, they are hard to beat. Indeed, XPP-AUTO is written in C, as is the operating system Linux, and the high-performance code of the open source finite element solver FEniCS, while being accessible in both C++ and Python implementations, is written in C++ [128].
However, the landscape in high-performance computing has shifted over the past decade, as GPUs (Graphics Processing Units) with many more processors than CPUs (Central Processing Units) have become available. While these processors are far simpler than those found in modern CPUs, they often more than make up for this by allowing for massively parallel computations. With the development of a hardware architecture, called the CUDA architecture, that facilitated access to all of the processing and memory elements of the device, along with a software interface through which to access these resources, came the advent of true GPGPUs: general-purpose GPUs. The language NVIDIA developed to go along with its CUDA architecture is CUDA C; in fact there are now two languages, often referred to as CUDA C/C++. (Note: NVIDIA now simply treats “CUDA” as a term to refer to its GPU architecture and programming languages. Originally it was used as an acronym for Compute Unified Device Architecture.)
Since CUDA C seemed to require very little knowledge over and above C, we concluded that if we went to the trouble of programming our network in C, we might as well try CUDA C. As expected, we found massive gains in performance compared to our MATLAB code. Instead of taking minutes to run 100×100 networks for hundreds of time steps, our CUDA C code takes seconds to run 512 × 512 networks, or indeed even 1024 × 1024 networks. In the end, running the Wilson-Cowan network for a time period of thousands of milliseconds takes minutes, allowing us to sweep through sets of parameters to characterize the network dynamics. Of course, how well such code performs compared to an implementation written purely in C will depend on the platform, including the particular CPU and GPU used. For our setup, which included an NVIDIA GeForce GTX 970 GPU, we found that the CUDA C implementation was ≈ 3.5 times faster than the C implementation, more than justifying the small amount of additional education required. For that education, we turned to a clear and helpful book, CUDA by Example [129]. This text progresses methodically from very simple to fairly sophisticated examples, assuming only a knowledge of C. Additionally, the authors provide freely available code on GitHub (https://github.com/CodedK/CUDA-by-Example-source-code-for-the-book-s-examples-). Importantly, this includes graphics helper classes with simple programmer-facing implementations. Thus, with only a few more lines of code we can visualize simulations as they run in real time.
In this appendix, we will outline the procedures we used to produce these simulations. As the code will be made available shortly for anyone to use, we will generally keep the discussion herein at a higher level, only including a simple example and some code snippets when helpful. Our full code should be included as supplementary files to the forthcoming paper that will be based on Chapter 3.
As almost our entire CUDA and CUDA C knowledge base stems from the information in CUDA by Example [129], we will generally avoid citing the text, with only some exceptions to point the reader to specific useful information. We also note that the reader should refer to [129] and NVIDIA’s documentation for instructions involving the installation of CUDA C, how to compile “*.cu” files, and what is necessary for your system so that your code cooperates well with OpenGL. Finally, I am not a programmer; in fact, I have only taken two formal computer science courses, the last of which was early on in my undergraduate career. The errors that are likely to exist in this brief exposition are my own. Please have some grains of salt and your favorite search engine at the ready.
We first introduce the GPU hardware and the programming distinctions that arise from this hardware as compared with traditional serial programming for a CPU. These software distinctions are fundamental to programming for GPUs. To illustrate these differences and the power of GPUs, we then explore the simple canonical example of parallel programming: adding two vectors. On a GPU, no loop is required to do so, and we will see how this is implemented. In doing so, we’ll gain the understanding necessary to implement many basic routines that may be of interest to the scientific reader. We then look more closely at our specific implementation: simulating neural field equations. An important aspect of improving performance for such systems is using fast Fourier transforms to speed up the computation of the convolutions that provide nonlocal coupling. We see how to do so, and then outline the remainder of our implementation. We hope this appendix might help others who are interested in simulating large networks do so using the massively parallel capabilities of the modern-day GPU.
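To preview the idea behind loop-free vector addition, here is a CPU-only C sketch of the index-based style; it is not the CUDA C code used for our simulations. Each call to the hypothetical function `add_element` does the work of one GPU thread for one index, and the loop inside the hypothetical `launch_add` merely stands in for the hardware launching all threads at once. In actual CUDA C, the index would typically be computed from the built-in variables as `blockIdx.x * blockDim.x + threadIdx.x`, and the launch would use the `<<<blocks, threads>>>` syntax.

```c
#include <assert.h>
#include <stddef.h>

/* Body of the would-be kernel: each "thread" handles exactly one index.
   The name add_element is illustrative only. */
static void add_element(size_t tid, const float *a, const float *b,
                        float *c, size_t n)
{
    if (tid < n) {  /* guard: a launch may create more threads than elements */
        c[tid] = a[tid] + b[tid];
    }
}

/* CPU stand-in for a kernel launch. On a GPU the n_threads calls below
   would run concurrently; here they simply run one after another. */
void launch_add(const float *a, const float *b, float *c,
                size_t n, size_t n_threads)
{
    assert(n_threads >= n);  /* need at least one thread per element */
    for (size_t tid = 0; tid < n_threads; ++tid) {
        add_element(tid, a, b, c, n);
    }
}
```

The point of the sketch is that `add_element` itself contains no loop over the data. Only the launch mechanism differs between this serial emulation and a parallel device.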