
The security of AES-CTR and AES-OFB is well established [85], provided that the encryption seed is unique for each datum under a single encryption key. Otherwise, the confidentiality of the data may be compromised by the “two-time pad” attack, which exploits reuse of the same encryption pad. In our proposed encryption scheme, each attribute seed is spatially and temporally unique across the databases of the same database owner, as described in Section 5.3.2. Different database owners have their own unique database encryption keys Kdb, so the seed uniqueness concern is confined to a single party. This greatly simplifies attribute seed management, and the scheme relies on the DBMS to maintain seed uniqueness. Re-encrypting the database with a new encryption key may be necessary when any part of the attribute seed, either the logical schema ID or the tuple counters, overflows. These parameters are set to sufficiently large values to avoid frequent re-encryption. Although these two encryption modes introduce an additional parameter (the attribute seed) that requires special management to maintain its uniqueness, they are more secure than conventional AES encryption because the encrypted data are non-deterministic owing to the unique encryption seed. That is, even if two attributes have the same value, their encrypted data look completely different.
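The seed-uniqueness requirement can be illustrated with a minimal sketch. It is not the CypherDB implementation: HMAC-SHA256 stands in for AES as the pad generator so that the example runs with the Python standard library alone, and the key, seed values, and function names are hypothetical. The sketch shows that reusing a seed leaks the XOR of two plaintexts, while distinct seeds make equal plaintexts encrypt differently.

```python
import hashlib
import hmac

def keystream(key: bytes, seed: bytes, length: int) -> bytes:
    """Counter-mode style pad: PRF(key, seed || block index), concatenated.
    Assumption: HMAC-SHA256 replaces AES purely for portability."""
    out = b""
    block = 0
    while len(out) < length:
        out += hmac.new(key, seed + block.to_bytes(4, "big"), hashlib.sha256).digest()
        block += 1
    return out[:length]

def xor(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))

def encrypt(key: bytes, seed: bytes, plaintext: bytes) -> bytes:
    """XOR the plaintext with the pad; decryption is the same operation."""
    return xor(plaintext, keystream(key, seed, len(plaintext)))

key = b"k" * 32          # hypothetical database key
p1 = b"salary=50000"
p2 = b"salary=72000"

# Seed reuse: the pads cancel, so an observer learns p1 XOR p2 without the key.
c1 = encrypt(key, b"seed-A", p1)
c2 = encrypt(key, b"seed-A", p2)
leak = xor(c1, c2)

# Unique seed: the same plaintext yields a different ciphertext.
c3 = encrypt(key, b"seed-B", p1)
```

Here `leak` equals `xor(p1, p2)`, which is exactly the two-time pad weakness, whereas `c3` differs from `c1` although both encrypt the same value.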

Our design does not rely on encrypting the query statements. One potential drawback is information leakage from the query statements, which we refer to as indirect information leakage. Take a query statement such as “SELECT NAME, AGE FROM TABLE WHERE ID=‘1234’;” as an example. An adversary can learn from this unencrypted query statement that the user is searching for the NAME and AGE associated with a certain ID number. However, the adversary is unable to learn the actual values because all query outputs are encrypted. The corresponding information leakage is at most the number of entries satisfying the WHERE clause. Such indirect information leakage can be mitigated by sending dummy output results, but this is not considered in our study.

The use of OPE to encrypt indices can leak the order of the sensitive information due to the nature of the encryption algorithm. However, we note that this information leakage is inevitable with B+tree indexing, even if the indices are encrypted with strong encryption. A B+tree is a balanced search tree that stores and accesses the indices in ascending/descending order. An adversary can passively observe the storage of these indices or the access pattern in order to learn the order of the indices. Since the use of a B+tree naturally discloses the order of the indices, OPE achieves better performance (no decryption is needed) without sacrificing any additional security.
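The order-preserving property discussed above can be sketched with a toy mapping. This is only an illustration under stated assumptions, not a real OPE scheme and not the one used in CypherDB: the lookup table, the `make_toy_ope` helper, and the use of Python's non-cryptographic `random` module as a stand-in for a secret key are all hypothetical. It shows why comparisons and range lookups can run directly on ciphertexts, and why the order is visible to anyone who sees them.

```python
import random
from typing import List

def make_toy_ope(domain_size: int, rng_seed: int) -> List[int]:
    """Build a strictly increasing table: plaintext i maps to table[i].
    Assumption: rng_seed plays the role of the secret key (not secure)."""
    rng = random.Random(rng_seed)
    table = []
    current = 0
    for _ in range(domain_size):
        current += rng.randint(1, 1000)  # positive gap preserves order
        table.append(current)
    return table

table = make_toy_ope(domain_size=100, rng_seed=42)

ages = [37, 5, 81, 22]
enc = [table[a] for a in ages]  # "encrypted" index keys

# Sorting by ciphertext gives the same ordering as sorting by plaintext.
order_plain = sorted(range(len(ages)), key=lambda i: ages[i])
order_enc = sorted(range(len(enc)), key=lambda i: enc[i])

# Range query 20 <= age <= 40 evaluated on ciphertexts only.
lo, hi = table[20], table[40]
hits = [i for i, c in enumerate(enc) if lo <= c <= hi]
```

Because the table is strictly increasing, `order_enc` matches `order_plain`, and `hits` selects the entries for ages 37 and 22 without any decryption.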

Chapter 6

Processor Architecture

This chapter describes the CypherDB secure processor architecture. The main goal of this architecture is to provide architectural support for our proposed look-ahead encryption scheme (see Section 5.2) and to protect the privacy of any intermediate data stored in off-chip memory while maintaining high performance. This chapter is organized as follows. We first investigate a typical database application to identify the sensitive data that need protection. Based on this investigation, we then present an overview of the architecture, which provides three separate data paths for secure execution. After that, the design of each of these three data paths is discussed. Finally, a query execution example is presented, and the security of this architecture is discussed.

6.1 Database Profiling

Figure 6.1 presents the memory layout of a typical database application and outlines the data it needs. The database records are packed and stored in a format called a payload. Each payload contains a record header, which describes the features of the record, and attribute offsets, which locate each attribute within the record. Multiple payloads are organized on a database page. Each database page has its own page header and record pointer array. During execution, the DBMS allocates a segment of heap memory, forming a Database Page buffer, to accommodate multiple database pages.

[Figure 6.1 diagram: DBMS virtual memory with HEAP and STACK; the Database Page Buffer in the heap holds Data Pages (DBPage); each page contains a Page Header, a Record Pointer Array, and Payloads; each Payload contains a Header, Attribute Offset, attribute1, attribute2, ...]

Figure 6.1: The memory layout of a typical database application process. The database records are formatted in a structure of database pages where the database pages are stored in buffers allocated in heap memory.

All of the aforementioned database data can be classified into three types:

• Attribute data: the database record outsourced by the database owner.

• Metadata: the non-sensitive information such as page header, record offset, payload header and attribute offset that is useful for the DBMS to manage the storage or access of the database records.

• Execution data: the intermediate value generated on-the-fly during program execution stored in heap or stack memory.
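The split between metadata and attribute data can be made concrete with a small sketch of a payload. The byte layout below (a 16-bit attribute count followed by one 16-bit offset per attribute) is a hypothetical simplification chosen for illustration, not SQLite's or CypherDB's actual record format, and the attribute values are placeholders for ciphertext. It shows that the DBMS can locate an attribute from the plaintext header and offsets alone, without touching the protected attribute bytes.

```python
import struct
from typing import List

def pack_payload(attributes: List[bytes]) -> bytes:
    """Hypothetical payload: [count: u16][one u16 offset per attribute][attribute bytes]."""
    header_size = 2 + 2 * len(attributes)
    offsets = []
    pos = header_size
    for attr in attributes:
        offsets.append(pos)      # where this attribute starts in the payload
        pos += len(attr)
    header = struct.pack("<H", len(attributes))
    offset_table = struct.pack("<%dH" % len(offsets), *offsets)
    return header + offset_table + b"".join(attributes)

def read_attribute(payload: bytes, index: int) -> bytes:
    """Locate one attribute using only the metadata (count and offsets)."""
    (count,) = struct.unpack_from("<H", payload, 0)
    offsets = struct.unpack_from("<%dH" % count, payload, 2)
    start = offsets[index]
    end = offsets[index + 1] if index + 1 < count else len(payload)
    return payload[start:end]

# Placeholder byte strings stand in for encrypted attribute values.
payload = pack_payload([b"<enc-name>", b"<enc-age>"])
metadata = payload[: 2 + 2 * 2]        # header plus offset table, left unencrypted
second = read_attribute(payload, 1)    # raw bytes of the second attribute
```

In this sketch only the bytes after `metadata` would need encryption, which matches the classification above: the header and offsets are non-sensitive, while the attribute bytes carry the owner's data.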

To evaluate the performance impact of these three types of data accesses during a database query operation, we investigated the off-chip memory access profile, which is reported as having the most impact on execution time when performing database queries [86]. Our investigation is based on executing the 22 queries in TPC-H [75] on SQLite using a cycle-accurate simulator, SimpleScalar [87]. Figure 6.2 breaks down the contribution of each type of data to last-level cache misses and yields three important observations and insights:

• Almost half of the last-level data cache misses are caused by loading the metadata. These data contain non-sensitive information, and thus encryption is not required.

• Execution data are used frequently, which results in a high cache hit rate (95%). However, 14% to 63% of last-level data cache misses are still caused by loading these execution data from off-chip memory. These data need to be encrypted at the processor boundary due to the high data reuse profile in the last-level cache.

[Figure 6.2 plot: stacked bars for TPC-H queries Q1 to Q22; vertical axis: percentage of total stall time caused by the last-level data cache miss (0 to 100%); series: Execution Data, Metadata, Attribute Data.]

Figure 6.2: A quantitative analysis of the total stall time caused by last-level data cache misses when executing the 22 queries in TPC-H using SQLite in SimpleScalar. The stall time contributed by each of the three types of data (attribute data, metadata and execution data) is measured.

scheme, where encryption latency can occasionally be hidden from the off-chip memory data access (see Section 5.2.1).
