To reflect this SoC design with an additional co-processor, the software design is layered as shown in Figure 5-2, where inter-process communication (IPC), for exchanging data across multiple processes, can occur [210]. This can be achieved in software through message passing, or via hardware schemes such as a synchronised shared memory or modification of registers within AHB masters.

[Figure: software layers (3. Application, 2. Session, 1. Network) above the hardware layer. Components shown: Applications, Agents, JADE-LEAP, RTEMS, CDC, LEON3, JOP, FPGA.]

Figure 5-2. Hardware & Software Layer Design and IPC Methodologies

For this design, IPC occurs in software, using a combined register and memory-area approach to synchronise the shared memory. These registers and memory are accessible by AHB masters when the Java co-processor completes routines or operations. Communication standards between the processors are required to retain coherency in a shared memory system; these are provided by the robust AMBA bus scheme, which uses register-based buffering for read and write operations on the bus. A memory map is also used to separate each core’s designated addressable memory locations. The final software must have a low memory footprint (including operating environment and network stack) and still be real-time. A comparison of the memory footprint and functionality of previous solutions to this problem can be found in Table 4-1, where three options are considered:

1. A CORBA Middleware based implementation [56].

Chapter 5. Java Co-Processor System-on-a-Chip

3. A new SoC design where the standard Java runtime is replaced in hardware and software is implemented with the CDC stack.

As described in Section 2.3.2.2, CDC and pjava are designed for devices with intermittent network connections, slow processors and limited memory, such as mobile phones, two-way pagers and personal digital assistants (PDAs), making them ideal to run in real-time on the JOP processor. Either 16- or 32-bit MPUs are required, with a minimum of 128 kB to 512 kB of RAM for the Java platform implementation and associated applications. The full JRE 1.4 alone requires over 15 MB, which is a major deterrent to using Java on embedded devices, but dynamic class parsers are now available to help reduce the application to a very small size, as discussed in Section 2.3.3.1.1. Table 5-2 shows combinations of operating systems and middleware discussed in Section 2.3 that make up distributed computing platforms for embedded networked systems.

Table 5-2. Memory Footprint Comparison

OSI Software Layer Method | Size (MB) | Real-time
1. Full Software using CORBA (LEON3 + RTEMS, C++, ORB, 802.11 Driver, TCP/IP, Dyn. Lib.) [66] | 1.739 | Yes
2. Full Software using Java (LEON3 + RTEMS, JRE 1.4 Std. Lib, CDC 1.0) | >16.000 | No
3. Combined Hardware & Software Design (LEON3 + JOP, CDC 1.0 + JADE-LEAP) | 1.106 | Yes

From Table 5-2, it can be seen that the third option offers the smallest memory footprint whilst retaining real-time functionality: the combined hardware and software design, which uses CDC 1.0, JADE-LEAP and additional Agents, has been reduced to 1.1 MB. Compared to the CORBA based implementation, the proposed system would reduce the footprint by approximately 36 % (from 1.739 MB to 1.106 MB), whereas the desktop Java solution has a memory footprint that is too large, and is too slow, for an embedded system.
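The relative saving over the CORBA option can be checked directly from the Table 5-2 figures. The short sketch below is only an arithmetic illustration; the dictionary keys and the helper name `relative_reduction` are hypothetical and not part of the original design.

```python
# Footprint figures (MB) copied from Table 5-2; the key names are illustrative.
FOOTPRINT_MB = {
    "corba_software": 1.739,   # option 1: full software using CORBA
    "hw_sw_codesign": 1.106,   # option 3: combined hardware and software design
}


def relative_reduction(baseline: float, proposed: float) -> float:
    """Return the fractional reduction of `proposed` relative to `baseline`."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (baseline - proposed) / baseline


reduction = relative_reduction(
    FOOTPRINT_MB["corba_software"], FOOTPRINT_MB["hw_sw_codesign"]
)
print(f"Reduction vs CORBA: {reduction:.1%}")  # prints "Reduction vs CORBA: 36.4%"
```

The Java-only option is left out because Table 5-2 gives only a lower bound (>16 MB) for it.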

5.2.1 Shared Memory and Caches

Multi-core system designs use shared memory for a fast form of IPC between the cores. Once the memory has been mapped, core synchronisation is required between the processes for storing or fetching data to and from shared memory, often called symmetric (shared memory) multiprocessing (SMP). The synchronisation is implemented using the open-source AMBA AHB bus. The AMBA bus acts as the backbone in many SoC designs and is adopted here to provide connection between the processor and peripheral cores, on-chip memories and off-chip external memory interfaces, as shown in Figure 5-3.
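As a rough illustration of the memory-mapped, register-synchronised shared memory described above, the following sketch models a memory map with a shared mailbox region guarded by a single status register. It is a behavioural model only: the region names, base addresses, sizes and the `Mailbox` class are assumptions made for this example and are not taken from the actual design.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    name: str
    base: int
    size: int

    def contains(self, addr: int) -> bool:
        return self.base <= addr < self.base + self.size


# Hypothetical memory map; addresses and sizes are illustrative only.
MEMORY_MAP = [
    Region("leon3_private", 0x4000_0000, 0x0010_0000),
    Region("jop_private", 0x4010_0000, 0x0010_0000),
    Region("shared_mailbox", 0x4020_0000, 0x0000_0100),
]


def regions_overlap(regions) -> bool:
    """Return True if any two regions share an address."""
    ordered = sorted(regions, key=lambda r: r.base)
    return any(a.base + a.size > b.base for a, b in zip(ordered, ordered[1:]))


def owner_of(addr: int):
    """Return the name of the region containing `addr`, or None."""
    for region in MEMORY_MAP:
        if region.contains(addr):
            return region.name
    return None


class Mailbox:
    """Shared buffer guarded by one status register (EMPTY or FULL)."""

    EMPTY, FULL = 0, 1

    def __init__(self, size: int):
        self.status = self.EMPTY
        self.buffer = bytearray(size)
        self.length = 0

    def post(self, payload: bytes) -> bool:
        """Writer side: refuse if the previous message is still unread."""
        if self.status == self.FULL or len(payload) > len(self.buffer):
            return False
        self.buffer[: len(payload)] = payload
        self.length = len(payload)
        self.status = self.FULL  # flag is set last, after the data is in place
        return True

    def fetch(self):
        """Reader side: return the message and free the mailbox, or None."""
        if self.status != self.FULL:
            return None
        data = bytes(self.buffer[: self.length])
        self.status = self.EMPTY
        return data
```

In this model the co-processor would call `post()` when a routine completes and the other core would poll `fetch()`; real hardware would additionally rely on the bus to serialise the register accesses.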

[Figure: SoC / FPGA block diagram. AHB masters on the AMBA2 AHB bus: LEON3 core (registers, D & I cache), JOP core (stack cache, method cache, stack SRAM), 10/100 Ethernet, debug UART, JTAG debug and AHB/APB bridge, with an AHB arbiter. APB slaves on the AMBA2 APB bus: general purpose I/O, generic UART and memory controller. Off-chip memory: 8 MB FLASH and 64 MB SDRAM.]

Figure 5-3. System-on-a-Chip Architecture

The LEON3 system implements a standard data and instruction cache, but JOP implements a ‘stack’ cache for data and a ‘method’ cache for instructions, which are designed for real-time worst case execution time (WCET) analysis. JOP’s unique design for a hardware implementation of a JRE (at V 1.1) implements a simplified garbage collection (GC) model using the RTSJ specifications introduced in Section 2.3.3.1.2. This method schedules a GC thread for automatic memory management. JOP’s 1 kB cache size was determined by a previous analysis of JRE 1.1 method lengths, which found 98% to be under 1 kB [211], but it can be increased if on-chip resources are available. A fault tolerant version of LEON3 also has a configurable cache and memory system designed to tolerate single event upsets (SEUs) or single event latchups (SELs) in the space environment, with on-chip memories protected using triple modular redundancy (TMR), parity checking or duplication [212]. TMR is a fault tolerant design process where a design is replicated three times and a voting scheme is used to take the three sets of data signals and produce one correct data signal.
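The voting step in TMR can be illustrated with a bitwise majority function: each output bit takes the value held by at least two of the three replicas. This is a generic sketch of majority voting, not the fault tolerant LEON3 implementation; the function name `tmr_vote` and the test values are chosen for illustration.

```python
def tmr_vote(a: int, b: int, c: int) -> int:
    """Bitwise 2-of-3 majority vote over three replicated data words."""
    return (a & b) | (a & c) | (b & c)


golden = 0b1011_0010
upset = golden ^ 0b0000_1000  # one replica with a single flipped bit (an SEU)

# A single corrupted replica is out-voted by the other two.
assert tmr_vote(golden, upset, golden) == golden
```

The voter masks any fault confined to one replica; if two replicas are corrupted in the same bit position, the vote returns the wrong value.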

The AMBA bus scheme here implements two types of bus: the advanced high-performance bus (AHB) and the advanced peripheral bus (APB). The AHB bus typically operates at 100 Mbps and is used for on-chip networking only, whilst the APB bus operates at a much lower speed for I/O, typically 2 or 3 times less than the AHB bus. Bus access for IP core requests and shared memory access is handled by the AHB arbiter: AHB ‘Masters’ request the bus, which is then addressed, granted access, and locked for use before finally being released to the arbiter. Once the bus is locked, no other core can use the bus unless a split-mode of operation is implemented where the bandwidth is divided [213].
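The request, grant, lock and release sequence can be sketched as a small behavioural model. The fixed-priority policy, the class name `BusArbiter` and the master names used below are assumptions for illustration only; the arbitration policy of the real arbiter is not stated here, and split-mode transfers are not modelled.

```python
class BusArbiter:
    """Behavioural model of a single-owner bus with fixed-priority grants."""

    def __init__(self, masters):
        # Earlier entries have higher priority (an assumption of this sketch).
        self.priority = list(masters)
        self.pending = set()
        self.owner = None  # master currently holding (locking) the bus

    def request(self, master: str) -> None:
        if master not in self.priority:
            raise ValueError(f"unknown master: {master}")
        self.pending.add(master)

    def arbitrate(self):
        """Grant the bus to the highest-priority requester if it is free."""
        if self.owner is None:
            for master in self.priority:
                if master in self.pending:
                    self.pending.discard(master)
                    self.owner = master
                    break
        return self.owner

    def release(self, master: str) -> None:
        if self.owner != master:
            raise RuntimeError(f"{master} does not hold the bus")
        self.owner = None


arbiter = BusArbiter(["LEON3", "JOP", "Ethernet"])
arbiter.request("JOP")
arbiter.request("LEON3")
first = arbiter.arbitrate()    # LEON3 wins on priority and locks the bus
still = arbiter.arbitrate()    # bus is locked, so JOP must wait
arbiter.release("LEON3")
second = arbiter.arbitrate()   # JOP is granted the bus once it is released
```

While one master holds the bus, further calls to `arbitrate()` return the same owner, mirroring the locked state described above.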

The LEON3 debug support unit (DSU) is a dedicated AHB slave interface for performing online debugging or instruction tracing of the processor and other masters on the AHB bus using register-based buffers. The JTAG UART interface converts JTAG signals to AHB bus transfers and is used to program the FPGA bitstream as well as for debugging purposes. The memory controller on the APB bus is used as a generic slave for the host PROM, mapped I/O devices, and RAM devices [202].
