The results presented so far for partially homomorphic encryption suggest good performance. However, testing on an average mobile device highlights the computational intensity of arithmetic on large numbers. Figure 3.9 shows that generating a proof can take nearly 20s for a 2048-bit key. In computer science terms, seconds to minutes to obfuscate a vote on a mobile device seems very slow. However, for the voting use case, algorithm and system performance are not the only factors in practicality. Consider the typical procedure a voter currently has to undertake: commuting to the voting station, queuing there, authenticating themselves, and filling in voting papers. Therefore, even if encryption and proof generation take a few minutes on a client's mobile device, time is still saved overall. The user could be entertained with a game, or notified once the computation is complete. The convenience and time saved make minutes of computation time practical.
In contrast, practicality is defined differently for a survey, because completing a digital survey needs to be as fast as possible; otherwise participants are less likely to finish it. For example, with non-secure survey systems, the user experiences near-zero latency when completing the survey. As discussed in Section 3.3.2, much of the computation can be hidden while the user completes other questions. However, with partially homomorphic encryption, depending on the device and key size, the user could wait a few seconds or minutes before being able to submit the survey. Again, the survey could be submitted automatically once encryption is complete. However, the same technique is not as practical for a survey system as it is for a simple ballot: the cloud performance is practical, but the client side is still problematic.
3.4 Summary
This chapter has shown that the most practical privacy-preserving solution is currently partially homomorphic encryption, with multi-party computation practical for some types of applications. The use case of secure voting, and its extension into a secure survey system, has shown some aspects of what it means to be practical and some limitations of existing solutions. Different schemes suit different applications, so practicality depends on the application. Focusing on cloud performance, however, partially homomorphic encryption takes under one millisecond to compute a homomorphic operation with a 2048-bit key. Combined with the proof, this gives around 35ms for a vote to be added to the tally. This should be the target of any scheme claiming to be practical in the cloud. However, Section 3.3.7 raised the issue of client performance: fast encryption times are also required, otherwise the application is not usable even with practical cloud performance.
4 Privacy-Preserving Encoding
Researching Hypotheses 1 and 2 produced no direct answer to the research question of this thesis. However, the simple fully homomorphic encryption algorithm explored in Appendix B did offer search capabilities, spawning the idea of using encoding for privacy-preserving processing. Instead of encrypting characters into cipher values, encoding them into groups or “bins” could allow search capabilities while protecting privacy. This chapter explores that idea, resulting in a tentative conclusion for Hypothesis 3.
4.1 Bin Encoding
An approximate string searching scheme, “Bin Encoding”, is presented as an initial comparison between encoding and encryption [22]. This is a lossy encoding scheme, a simple trapdoor, that maps characters individually to bins (an extension of the simple substitution cipher used by Mary, Queen of Scots). There are several bins, and multiple characters map to the same one. Hence, the original string cannot be easily obtained from its encoding. For example, below is a mapping with three bins A, B and C:
{a, b, c, d, e, f, g, h, i} ⇒ A
{j, k, l, m, n, o, p, q, r} ⇒ B
{s, t, u, v, w, x, y, z} ⇒ C
Figure 4.1: Personal user system model for Bin Encoding
to reduce the number of false positives when searching.) Relative to this mapping, the encoded values for hello and world are AABBB and CBBBA respectively, which can be obtained using Algorithm 1. Apart from world, another possibility for CBBBA is snore (amongst others). However, these possibilities can only be generated by someone who knows the bin mapping. Given the encoded value but not the mapping, there are countless possibilities for CBBBA, such as hello (even though the above bins map it to AABBB). The user’s data is protected by hiding it in many possible bin combinations (> 10^20).
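The ambiguity described above can be checked with a minimal sketch. The first table is the three-bin mapping from the text. The second table, the helper names (make_table, encode, candidates) and the four-word vocabulary are made up for illustration and are not part of the scheme in [22]. The sketch shows that world and snore collide under the known mapping, and that a different mapping, which someone without the mapping cannot rule out, sends hello to the same code.

```python
def make_table(bins):
    """Invert {bin label: characters} into {character: bin label}."""
    return {ch: label for label, chars in bins.items() for ch in chars}


def encode(word, table):
    """Replace each mapped character with its bin label; skip unmapped ones."""
    return "".join(table[ch] for ch in word.lower() if ch in table)


def candidates(code, vocabulary, table):
    """Return the vocabulary words whose encoding under `table` equals `code`."""
    return [word for word in vocabulary if encode(word, table) == code]


# The three-bin mapping given in the text.
text_table = make_table({"A": "abcdefghi", "B": "jklmnopqr", "C": "stuvwxyz"})

# A hypothetical alternative mapping, chosen only so that hello -> CBBBA.
other_table = make_table({"A": "opqrstuvw", "B": "abcdefgxl", "C": "hijkmnyz"})

# Illustrative vocabulary (an assumption, not taken from the source).
vocabulary = ["hello", "world", "snore", "cloud"]

print(candidates("CBBBA", vocabulary, text_table))   # ['world', 'snore']
print(candidates("CBBBA", vocabulary, other_table))  # ['hello']
```

With the mapping known, the code CBBBA narrows the vocabulary to two words; without it, every mapping consistent with the code has to be considered, which is where the protection comes from.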
Algorithm 1 Bin Encoding
function binencode(string, binmap)
    estring ← ''                              ▷ start from the empty string
    for i ← 0 to len(string) − 1 do
        c ← lowercase(string[i])
        if c in binmap then                   ▷ characters without a bin are skipped
            estring ← estring + binmap[c]
    return estring
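As a minimal sketch, Algorithm 1 could be written in Python as follows. The dictionary BINMAP is one possible representation of the three-bin example mapping (character to bin label); the names BINS and BINMAP are assumptions made for this illustration.

```python
def bin_encode(string, binmap):
    """Encode `string` character by character, following Algorithm 1.

    `binmap` maps a lowercase character to its bin label. Characters
    that have no bin (spaces, digits, punctuation) are dropped.
    """
    estring = ""
    for ch in string:
        c = ch.lower()
        if c in binmap:
            estring += binmap[c]
    return estring


# The three-bin example mapping, stored as character -> bin label.
BINS = {"A": "abcdefghi", "B": "jklmnopqr", "C": "stuvwxyz"}
BINMAP = {ch: label for label, chars in BINS.items() for ch in chars}

print(bin_encode("hello", BINMAP))          # AABBB
print(bin_encode("world", BINMAP))          # CBBBA
print(bin_encode("Hello, World!", BINMAP))  # AABBBCBBBA
```

Because unmapped characters are dropped, word boundaries are lost when a whole sentence is passed in. Encoding word by word, for example `" ".join(bin_encode(w, BINMAP) for w in text.split())`, keeps them, although whether the scheme does this is an assumption here.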