The aim of the work is a flexible architecture that serves different emulation scenarios. The requirements vary not only with the emulation model but also with the preferred hosting hardware.
A setup was designed and tested to evaluate the design and operation of the emulation engine across many realistic scenarios. The setup comprises:
1. hosting hardware for the computational execution;
2. modified Debian Linux distribution for software and library support;
3. custom software logic that supervises the evaluation process and automates the measurements;
4. custom algorithms for statistical analysis and plotting of the measurements;
5. the BMS that enables the CPS properties;
6. and finally, the source code of the emulator.
The diversified hardware enables the evaluation of dissimilar use cases: as an offline simulation tool, as an online, cloud-hosted emulator, or even as micro-emulators distributed throughout a building. The representative hardware for these scenarios is the following:
1. powerful modern machine, featuring an Intel® Core i7-6700 CPU @ 4.00GHz with 32GB DDR4 memory;
2. mainstream hardware, featuring an Intel® Core i5-4570 CPU @ 3.60GHz with 8GB DDR3 memory;
3. last generation hardware, featuring an Intel® Core 2 Quad Q9650 @ 3.00GHz with 8GB DDR2 memory;
4. an inexpensive, cloud-hosted virtual private server (VPS), featuring a shared Intel® E5-2630L CPU and 512MB RAM;
5. a capable and last generation micro-computer, the Raspberry Pi 3 (Rasp3), featuring a quad-core ARM® Cortex-A53 MPU @ 1.2GHz with 512MB LPDDR2 memory;
6. finally, an industrial, embedded, low-power Linux board, the BeagleBone Black (BBB), featuring a single-core ARM® Cortex-A8 MPU @ 1GHz with 512MB of LPDDR3 memory.
It is worth noting that, due to memory size limitations, the machines equipped with only 512MB of RAM cannot support more than 100 vEntities. Hence, the VPS, Raspberry Pi, and BeagleBone graphs in the following subsections do not include tests with more than 100 vEntities.
To assess and normalize the hardware with regard to performance capacity, the open-source sysbench utility was used. While it supports several tests, only the CPU and RAM were tested for this work. These two components, unlike storage IOPS, significantly influence the performance of the emulation engine. Specifically, a series of three tests was performed on each hardware configuration:
1. Single-threaded CPU benchmark: times the calculation of prime numbers up to 10000.
2. Multi-threaded CPU benchmark: the same configuration as the previous test, but using 8 threads.
3. RAM speed benchmark: a single-threaded, sequential write test of 1 GByte of data with a 1 KByte block size.
Listing 9 displays the exact sysbench arguments that were used. The results of the tests are given in Table 4.1 and lead to three observations on the performance of the engine. Firstly, as expected, the multi-core architectures benefit significantly from multi-threaded computation. As the emulation engine is also multi-threaded, the same benefit is expected to appear in the emulation-related benchmarks. Secondly, while the memory speeds varied significantly, the realistic engine results did not reveal any strong correlation between memory speed and emulation performance. Finally, the computationally heavy, multi-threaded benchmark exhibits a significant contrast in hardware performance (e.g. ≈ 252 : 1 for i7-6700 vs BeagleBone); this enables the evaluation of the engine at the two extreme ends of the hardware spectrum.
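#!/bin/bash
# Benchmark the single-threaded CPU performance
# (relies on the sysbench default of primes up to 10000)
sysbench --test=cpu run
# Benchmark the multi-threaded CPU performance with 8 threads
sysbench --test=cpu --num-threads=8 run
# Benchmark the memory performance with 1 GByte of data in total
# (relies on the sysbench defaults: sequential writes, 1 KByte blocks)
sysbench --test=memory --memory-total-size=1G run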
Listing 9 – Bash script for hardware benchmark
Table 4.1 – Benchmark metrics using sysbench for the hardware used in the performance evaluation of emulation. For CPU smaller is better, for Memory bigger is better.
Sysbench Results
Hardware Configuration                 CPU 1-thread   CPU 8-threads   Memory
Intel® i7-6700, 32GB DDR4              7.35 sec       1.15 sec        3990 MB/sec
Intel® i5-4570, 8GB DDR3               8.6 sec        2.28 sec        3830 MB/sec
Intel® Q9650, 8GB DDR2                 8.4 sec        2.12 sec        2250 MB/sec
Intel® E5-2630L (Shared), 512MB DDR3   12.35 sec      12.84 sec       692 MB/sec
Raspberry Pi 3, 512MB LPDDR2           182.6 sec      45.72 sec       318 MB/sec
BeagleBone Black, 512MB LPDDR3         289.4 sec      289.8 sec       155 MB/sec
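The ratios discussed for Table 4.1 can be re-derived from the tabulated timings. The following is a minimal sketch, not part of the original evaluation scripts; the helper name `ratio` is an assumption introduced here for illustration, and the timings are copied from the table.

```shell
#!/bin/bash
# Hypothetical helper (not from the original scripts):
# prints the ratio of two timings with two decimals.
# Usage: ratio SLOWER_SECONDS FASTER_SECONDS
ratio() {
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%.2f\n", a / b }'
}

# Multi-threaded speedup per machine:
# 1-thread time divided by 8-thread time (values from Table 4.1).
echo "i7-6700 speedup:          $(ratio 7.35 1.15)"
echo "Raspberry Pi 3 speedup:   $(ratio 182.6 45.72)"
echo "BeagleBone Black speedup: $(ratio 289.4 289.8)"

# Contrast between the two extremes in the 8-thread test.
echo "BeagleBone vs i7-6700:    $(ratio 289.8 1.15)"
```

The single-core BeagleBone Black gains nothing from eight threads, while the quad-core Raspberry Pi 3 comes close to a fourfold speedup; the last line corresponds to the contrast of roughly 252 : 1 quoted in the text.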
To normalize the testing hardware from the software point of view, a clean installation of Debian "jessie" was used. Additionally, all unnecessary background services were suspended during the tests. This step is necessary to reduce externally induced variance in the results.
The majority of the results in the following subsection are presented as CDF diagrams, which have several advantages over histograms. Firstly, all the key values, such as the minimum, maximum, median, and percentiles, can be read directly from the diagram. Histograms show the minimum and maximum of the samples only as values falling in the first and last bin, respectively. In a CDF diagram, by contrast, the minimum is the point where the curve meets the x-axis, while the maximum is the point where it reaches y = 1. The percentiles can easily be read off the x-axis at the corresponding y value. Histograms, for their part, reveal outliers and patterns quickly. In a CDF diagram, on the other hand, the outliers can be seen in the tails of the curves. Although it is harder than with histograms, clusters of values can be read from CDF diagrams as well: a steep segment of the curve bounded by flatter segments denotes a group of samples whose values are read on the x-axis. Finally, and most importantly for the scope of this section, CDF diagrams are much better suited to comparing several datasets. An arbitrary number of CDF curves can be plotted in the same figure for direct comparison.
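To make the reading of a CDF diagram concrete, the sketch below computes an empirical CDF and nearest-rank percentiles from a list of samples. It is an illustration only and not the custom analysis code used in this work; the function names `ecdf` and `quantile` and the sample values are assumptions introduced here.

```shell
#!/bin/bash
# Hypothetical helper: reads one numeric sample per line on stdin and
# prints "value cumulative_fraction" pairs in ascending order.
# With repeated values, the last line for a value carries its CDF height.
ecdf() {
    sort -n | awk '{ v[NR] = $1 }
        END { for (i = 1; i <= NR; i++) printf "%s %.2f\n", v[i], i / NR }'
}

# Hypothetical helper: nearest-rank quantile, i.e. the smallest sample
# whose cumulative fraction reaches P (0 < P <= 1).
# Usage: quantile P < samples
quantile() {
    sort -n | awk -v p="$1" '{ v[NR] = $1 }
        END {
            if (NR == 0) exit 1
            k = int(p * NR)
            if (k < p * NR) k++     # round the rank up
            if (k < 1) k = 1
            print v[k]
        }'
}

# Made-up samples for illustration (not measured data).
samples="12
15
11
40
13"

printf '%s\n' "$samples" | ecdf
echo "median:  $(printf '%s\n' "$samples" | quantile 0.5)"
echo "maximum: $(printf '%s\n' "$samples" | quantile 1)"
```

The first and last lines printed by `ecdf` are the minimum and the maximum, which is where the plotted curve meets the x-axis and where it reaches y = 1. The isolated large sample appears as a long, flat tail before the final step.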