The general methodology of experiments presented in [91] for RndF profiling attacks
on PIC microcontrollers is described in this section.
3.3.1 Attacking PIC Microcontrollers
Unintentional EM emissions from PIC microcontrollers were sampled by the
oscilloscope and transferred to the workstation for post-processing. Each observation
is a trace sampled over time, and each time sample, taken at the same instant in every
observation, serves as one input variable. The collected observations are stored as a matrix
with one observation per row and one classification variable per column. RndF uses these
variables for classification in a profiling attack. Byte-wise profiling attacks are considered,
where one byte of an AES Intermediate Value (IV) is extracted. The IV of interest is the
result of the AES SubBytes function for the first round. A single IV byte yields $2^8 = 256$
possible values, making the byte-wise profiling attack of the first SubBytes output a
256-class classification problem. With a training set, the EM side channel measurements for
each of the byte values are profiled to create a RndF model for that byte. In the testing
phase, the trained model is used to determine the IV byte value of a device with an unknown
key. It is assumed that the attacker has access to the plaintext. Once the IV is guessed by
the RndF model, the inverse SubBytes function (provided in the AES decryption functions)
is applied to it, and the result is XORed with the known plaintext byte to determine the key byte.
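The key-recovery step can be illustrated with a short sketch. The targeted IV is the first-round SubBytes output, S(p XOR k), for plaintext byte p and key byte k, so the key byte follows as k = S^-1(IV) XOR p. The function names below (`build_sbox`, `recover_key_byte`) are hypothetical and are not taken from [91]. The IV byte passed in stands in for the value a trained RndF model would predict.

```python
def gf_mul(a, b):
    """Multiply two bytes in GF(2^8) modulo the AES polynomial 0x11B."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return result


def gf_inv(a):
    """Multiplicative inverse in GF(2^8); 0 maps to 0 by AES convention."""
    if a == 0:
        return 0
    result = 1
    for _ in range(254):  # a^254 equals a^-1 because a^255 = 1
        result = gf_mul(result, a)
    return result


def rotl8(x, shift):
    """Rotate an 8-bit value left by `shift` bits."""
    return ((x << shift) | (x >> (8 - shift))) & 0xFF


def build_sbox():
    """Build the AES S-box: field inversion followed by the affine map."""
    sbox = []
    for x in range(256):
        b = gf_inv(x)
        sbox.append(b ^ rotl8(b, 1) ^ rotl8(b, 2) ^ rotl8(b, 3) ^ rotl8(b, 4) ^ 0x63)
    return sbox


SBOX = build_sbox()
INV_SBOX = [0] * 256
for value, substituted in enumerate(SBOX):
    INV_SBOX[substituted] = value


def recover_key_byte(iv_byte, plaintext_byte):
    """Invert SubBytes on the guessed IV byte, then XOR with the known plaintext byte."""
    return INV_SBOX[iv_byte] ^ plaintext_byte


# Illustrative values only: simulate the IV byte for a chosen key and plaintext byte,
# then recover the key byte from that IV byte.
key_byte, plaintext_byte = 0x2B, 0x32
iv_byte = SBOX[plaintext_byte ^ key_byte]
print(hex(recover_key_byte(iv_byte, plaintext_byte)))
```

The sketch recovers a single key byte. A byte-wise attack on a full AES-128 key would repeat the same step for each of the 16 byte positions, each with its own profiled model.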
Prior to applying RndF processing on PIC data, the traces are analyzed to gauge
normality in the variables. In the following section, the variable analysis method using
the Kolmogorov-Smirnov test is described.
3.3.2 Input Variable Analysis
Classical SCA theory states that side channel leakage for a particular IV is constant
and the measurement noise is multivariate Gaussian spanning multiple time samples [24, ...
... to contain distinctly non-Gaussian shapes as shown in Figure 3.2. The figure also shows
2-variable distribution plots of p(x) exhibiting distinct multivariate non-Gaussian shapes.
Many more variables were found to show this bimodal distribution; only these arbitrarily
chosen variables are shown here for clarity.
Figure 3.2: 1-D (top) and 2-D (bottom) distributions for an arbitrarily chosen sample set of
two PIC data variables. The KS-test for normality revealed that 32,102 out of 50,000 total
variables exhibited a similar distribution shape.
A one-sample Kolmogorov-Smirnov test (KS-test) for normality was used to determine
how closely variables resembled a standard normal distribution. The null hypothesis of this
test is that a variable is normally distributed, and it is evaluated at a given significance
level α. The test is performed by calculating the Empirical Cumulative Distribution Function
(ECDF) $S_N(X)$ of $N$ samples by first sorting the sample values from lowest to highest in
the order $\{X_1, X_2, \ldots, X_N\}$ and then using
$$S_N(X) = \frac{n(i)}{N}, \qquad (3.11)$$
where $n(i)$ is the number of points less than or equal to sample value $X_i$ [27]. This is
compared to the CDF $F(X)$ of a normal distribution whose mean is the sample mean and whose
variance is the sample variance [69]. If $D = \max(|S_N(X) - F(X)|)$
is larger than the critical value at a given significance level, then the null hypothesis of
normality is rejected. Figure 3.3a shows a comparison of a sample ECDF from a PIC
microcontroller variable compared to the CDF of a normal distribution. For this variable,
the null hypothesis was rejected at α = 1e-20. Figure 3.3b shows the associated histogram of
the variable. This type of test is referred to as a one-sample KS-test. In other research, a
two-sample test can be used, which compares the ECDF of one variable with the ECDF of
another variable.
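The one-sample test described above can be sketched in a few lines of standard-library Python. This is a minimal illustration under stated assumptions, not the procedure used to produce the reported results. The function names are hypothetical, and the critical value uses the first-term asymptotic approximation P(D > d) ≈ 2 exp(-2 N d^2), which is an assumption introduced here. Because the ECDF is a step function, the sketch checks the gap to F(X) on both sides of each step.

```python
import math
import random


def normal_cdf(x, mu, sigma):
    """CDF of a normal distribution with mean mu and standard deviation sigma."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def ks_statistic(samples):
    """One-sample KS statistic D against a normal CDF fitted to the sample."""
    xs = sorted(samples)
    n = len(xs)
    mu = sum(xs) / n
    sigma = math.sqrt(sum((x - mu) ** 2 for x in xs) / (n - 1))  # sample variance
    d = 0.0
    for i, x in enumerate(xs, start=1):
        f = normal_cdf(x, mu, sigma)
        # ECDF equals i/n at x and (i-1)/n just below x; check both gaps.
        d = max(d, i / n - f, f - (i - 1) / n)
    return d


def ks_critical_value(n, alpha):
    """Approximate critical value from P(D > d) ~ 2*exp(-2*n*d^2) (assumption)."""
    return math.sqrt(-math.log(alpha / 2.0) / (2.0 * n))


def rejects_normality(samples, alpha=1e-20):
    """True if the null hypothesis of normality is rejected at level alpha."""
    return ks_statistic(samples) > ks_critical_value(len(samples), alpha)


# Synthetic stand-ins for two input variables (not PIC measurements).
rng = random.Random(0)
gaussian_var = [rng.gauss(0.0, 1.0) for _ in range(2000)]
bimodal_var = ([rng.gauss(-3.0, 0.5) for _ in range(1000)]
               + [rng.gauss(3.0, 0.5) for _ in range(1000)])

print(rejects_normality(gaussian_var), rejects_normality(bimodal_var))
```

Estimating the mean and variance from the same sample makes the standard KS critical values conservative, so a screen of this kind is better suited to flagging clearly non-Gaussian variables than to confirming normality.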
Figure 3.3: (a) Sample ECDF of a variable from a PIC microcontroller data collection
compared to the standard normal CDF; the null hypothesis of normality was rejected for this
variable at α = 1e-20. (b) The associated histogram of the variable.
A very low α = 1e-20 was chosen to identify those variables that were clearly not
normally distributed. It was found that this low α value, when used with the KS-test,
reliably distinguishes variables with non-Gaussian distributions such as those seen in
Figure 3.2. This method was used in subsequent tests here to distinguish Gaussian and
non-Gaussian variables. Analysis revealed that 32,102 of 50,000 total variables rejected the
null hypothesis of normality.
Figure 3.4 shows an arbitrarily chosen 2-variable space for key byte values of 1 and
7. Only two variables and two classes are shown for clarity. The figure shows that the bimodal
nature of the variables in Figure 3.2 is also clearly present for p(x|Byte1) and p(x|Byte7). This
behavior has been verified to exist with each of the 256 classes in the byte-wise attack
considered here, as well as with other variables. Circle and triangle shaped dots in bold
are the locations of the training data used to generate the models. Light and dark shaded
regions in the 2-variable space show the classification regions assigned by each respective
classifier, i.e., they show how the classifier would partition the 2-variable space given the
training set. These non-Gaussian properties were observed for the collections of 40 PIC
microcontrollers, demonstrating that non-Gaussian noise can exist in side channel collected
data.
Figure 3.4: Arbitrarily chosen two variable decision space for two IV byte values as
determined by (a) Template Attack, and (b) RndF. Light and dark shaded regions represent
the decision regions determined by each classifier, and bold shapes represent the training