• No se han encontrado resultados

Biologically-Inspired Massively-Parallel Computation

N/A
N/A
Protected

Academic year: 2023

Share "Biologically-Inspired Massively-Parallel Computation"

Copied!
47
0
0

Texto completo

(1)

Biologically-Inspired Massively-Parallel

Computation

Steve Furber

(2)

Turing Centenary

(3)

Turing in Manchester

(4)

Outline

• 63 years of progress

• Many cores make light work

• Building brains

• The SpiNNaker project

• The networking challenge

• A generic neural modelling platform

• Plans & conclusions

(5)

Manchester Baby (1948)

(6)

SpiNNaker CPU (2011)

(7)

63 years of progress

Baby:

– filled a medium-sized room

– used 3.5 kW of electrical power

– executed 700 instructions per second

SpiNNaker ARM968 CPU node:

– fills ~3.5mm2 of silicon (130nm) – uses 40 mW of electrical power – executes 200,000,000 instructions

(8)

Energy efficiency

• Baby:

– 5 Joules per instruction

• SpiNNaker ARM968:

– 0.000 000 000 2 Joules per instruction

25,000,000,000 times

better than Baby!

(James Prescott Joule born Salford, 1818)

(9)

Moore’s Law

Transistors per Intel chip

0.001 0.01 0.1 1 10 100

Millions of transistors per chip

8008

8080 8086

286 386

486 Pentium

4004

Pentium II Pentium III Pentium 4

(10)

…the Bad News

• atomic scales

• less predictable

(11)

Outline

• 63 years of progress

• Many cores make light work

• Building brains

• The SpiNNaker project

• The networking challenge

• A generic neural modelling platform

• Plans & conclusions

(12)

Multi-core CPUs

• High-end uniprocessors

– diminishing returns from complexity – wire vs transistor delays

• Multi-core processors

– cut-and-paste

simple way to deliver more MIPS

• Moore’s Law

– more transistors – more cores

… but what about the software?

(13)

Back to the future

• Imagine…

– a limitless supply of (free) processors – load-balancing is irrelevant

– all that matters is:

• the energy used to perform a computation

• formulating the problem to avoid synchronisation

• abandoning determinism

• How might such systems work?

(14)

Outline

• 63 years of progress

• Many cores make light work

• Building brains

• The SpiNNaker project

• The networking challenge

• A generic neural modelling platform

• Plans & conclusions

(15)

Bio-inspiration

• How can massively parallel computing resources accelerate our understanding of brain function?

• How can our growing understanding of brain function point the way to more efficient parallel, fault-tolerant

computation?

(16)

Building brains

• Brains demonstrate

– massive parallelism (1011 neurons) – massive connectivity (1015 synapses) – excellent power-efficiency

much better than today’s microchips

– low-performance components (~ 100 Hz) – low-speed communication (~ metres/sec) – adaptivity – tolerant of component failure – autonomous learning

(17)

• Neurons

• multiple inputs, single output (c.f. logic gate)

• useful across multiple scales (10

2

to 10

11

)

• Brain structure

• regularity

• e.g. 6-layer cortical

‘microarchitecture’

Building brains

(18)

Neural Computation

• To compute we need:

Processing

Communication Storage

• Processing:

abstract model

– linear sum of weighted inputs

ignores non-linear processes in dendrites

– non-linear output function

learn by adjusting synaptic weights

× w1

x1

× w2

x2

× w3

x3

× w4

x4

f

y

(19)

/

) (

=

>

+

=

=

=

fire A

if

I A

A

x w I

t t

x

A A i

i i k

ik i

ϑ τ δ

Leaky integrate-and-fire model

inputs are a series of spikes

total input is a weighted sum of the spikes

neuron activation is the input with a “leaky” decay

when activation exceeds threshold, output fires habituation, refractory

Processing

(20)

• Izhikevich model

– two variables, one fast, one slow:

– neuron fires when

v

> 30; then:

) (

140 5

04 .

0 2

u bv

a u

I u v

v v

=

+

− +

+

=

d u u

c v

+

=

=

Processing

0 20 40 60 80 100 120 140 160 180 200

-80 -60 -40 -20 0 20 40

0 20 40 60 80 100 120 140 160 180 200

-14 -12 -10 -8 -6 -4 -2 0 2

v

u

(21)

Communication

• Spikes

– biological neurons communicate principally via ‘spike’ events

– asynchronous

– information is only:

• which neuron fires, and

• when it fires

– ‘Address Event’

Representation (AER)

-80

-60 -40 -20 0 20 40

(22)

Storage

• Synaptic weights

– stable over long periods of time

• with diverse decay properties?

– adaptive, with diverse rules

• Hebbian, anti-Hebbian, LTP, LTD, ...

• Axon ‘delay lines’

• Neuron dynamics

– multiple time constants

• Dynamic network states

(23)

Outline

• 63 years of progress

• Many cores make light work

• Building brains

• The SpiNNaker project

• The networking challenge

• A generic neural modelling platform

• Plans & conclusions

(24)

SpiNNaker project

• A million mobile

phone processors in one computer

• Able to model about 1% of the human brain…

• …or 10 mice!

(25)

Design principles

Virtualised topology

– physical and logical connectivity are decoupled

Bounded asynchrony

– time models itself

Energy frugality

– processors are free

(26)

SpiNNaker system

(27)

SpiNNaker node

(28)

SpiNNaker chip

Mobile DDR SDRAM interface

Multi-chip packaging by UNISEM Europe

(29)

Outline

• 63 years of progress

• Many cores make light work

• Building brains

• The SpiNNaker project

• The networking challenge

• A generic neural modelling platform

• Plans & conclusions

(30)

The networking challenge

• Emulate the very high connectivity of real neurons

• A spike generated by a neuron firing must be conveyed efficiently to >1,000 inputs

• On-chip and inter-chip spike communication should

use the same delivery mechanism

(31)

Network – packets

• Four packet types

MC (multicast): source routed; carry events (spikes)

P2P (point-to-point): used for bootstrap, debug, monitoring, etc NN (nearest neighbour): build address map, flood-fill code

FR (fixed route): carry 64-bit debug data to host

• Timestamp mechanism removes errant packets

which could otherwise circulate forever

Header (8 bits) Event ID (32 bits)

P ER TS

T 0 -

Payload (32 bits) Header (8 bits) Address (16+16 bits)

(32)

Network – MC Router

• All MC spike event packets are sent to a router

• Ternary CAM keeps router size manageable at 1024 entries (but careful network mapping also essential)

• CAM ‘hit’ yields a set of destinations for this spike event

automatic multicasting

• CAM ‘miss’ routes event to a ‘default’ output link

0 0 1 0 X 1 0 1 X 000000010000010000 001001 Event ID

(33)

Outline

• 63 years of progress

• Many cores make light work

• Building brains

• The SpiNNaker project

• The networking challenge

• A generic neural modelling platform

• Plans & conclusions

(34)

Topology mapping

06 07

03

09 01

07 01

94 Problem graph (circuit)

1 02

3 2

23

4 72 72

23

Node 94 14

15

Core 10

2

5 6

9

3 6

Synapse 10 2

7 11 1

8 12

0

1 3

0

1 2

0 1

2

0

2 2

3 3

23 72 94

0 - 0

23 72 94

3 2 3

23 72 94

3 2 2

23 72 94

0 1 0

23 72 94

0 2 -

23 72 94 - 1 2

Topology

Fragment of MC table

(35)

Problem mapping

SpiNNaker:

Problem: represented as a network of nodes with a

certain behaviour...

...behaviour of each node embodied as an interrupt

handler in code...

...compile, link...

...binary files loaded into core instruction memory...

Our job is to make the model behaviour reflect

reality ...problem is

split into two parts...

...problem topology loaded into firmware routing

tables...

...abstract problem topology...

(36)

Bisection performance

• 1,024 links

in each direction

~10 billion packets/s

10Hz mean firing rate

250 Gbps bisection bandwidth

(37)

Event-driven software model

(38)

PyNN design flow

(39)

PyNN integration

• LIF

• Izhikevich

(40)

PyNN integration

• Vogels- Abbott

benchmark

– 500 LIF

neurons

(41)

SpiNNaker vision

(42)

48-node PCB

(43)

SpiNNaker platforms

(44)

Outline

• 63 years of progress

• Many cores make light work

• Building brains

• The SpiNNaker project

• The networking challenge

• A generic neural modelling platform

• Plans & conclusions

(45)

SpiNNaker machines

103 machine: 864 cores, 1 PCB, 75W 104 machine:10,368 cores, 1 rack, 900W

(NB 12 PCBs for operation without aircon)

105 machine: 103,680 cores, 1 cabinet, 9kW

(46)

Conclusions

• Brains represent a significant computational challenge

now coming within range?

SpiNNaker is driven by the brain modelling objective

virtualised topology, bounded asynchrony, energy frugality

• The major architectural innovation is the multicast communications infrastructure

• We have working hardware

48-node 864-ARM PCBs now

(47)

SpiNNaker team

Manchester

Southampton

Referencias

Documento similar

To provide a better understanding of how synergies arising from the different types of activities performed (or components of the trilemma) might be exploited

The results of this study provide information that allows better understanding of how music students conceive the content (what) and the process (how) of

Second, SignS is one of the very few genomic analysis tools to use parallel computing. Parallel computing is cru- cial to allow further improvements in user wall time and to

[9,10] with a large set of data from molecular dynamics simulations and experiments in 2D and 3D liquids and plasmas to determine the regime of applicability of hydrodynamics

In this work we discuss the Fe/Co dimer, extending our previous approach to this case; basically, our interest is addressed to understanding theoretically how the coupling between

Our knowledge of primate chemical communication lies far behind that of other mammals like rodents (e.g. Even if our work has deepened our understanding of

Our results have increased our understanding of the socio-cultural values of ecosystem services by empirically demonstrating the following: (i) different stakeholders hold

Analysis was carried out according to three main areas, namely the technology aspects of digital transformation, sustainable development according to the triple bottom line