Topic 0 – Cloud Storage Security
(Aalto)
Targeted Scenarios (CloSe project)
Based on the scenarios defined in the CloSe project (the predecessor of the current CloSer project), we report on two important results related to privacy-enhancing technologies that have been achieved as part of the CloSer project.
The first considers the scenario of privacy-preserving cloud storage. Two attractive features of cloud storage services are i) the automatic synchronization of files between multiple client devices and ii) the possibility to share files with other users. However, many users are concerned about the security and privacy of data stored in the cloud. Client-side encryption is an effective safeguard, but requires all client devices to have the decryption key. Current solutions derive these keys from user-chosen passwords, which are easily guessed.
The second considers the deduplication in cloud storage. Cloud providers tend to save storage via cross-user deduplication, while users who care about privacy tend to encrypt their files on client-side. Secure deduplication of encrypted data (SDoE) is an active research topic. In CloSe, we have a paper published in CCS '15 [7] that solves this problem. However, it lacks of formal security model and proof to capture the security of multiple runs.
Achieved Results
OmniShare. We have proposed, developed, and evaluated OmniShare, the first scheme to combine strong client-side encryption with intuitive key distribution mechanisms to enable access from multiple client devices and sharing between users. OmniShare uses a novel combination of out-of-band channels, including QR codes and ultrasonic communication, as well as the cloud storage service itself, to authenticate new devices. We describe the design and implementation of OmniShare, and explain how we evaluated its security (using formal methods), its performance (benchmarks), and its usability (cognitive walkthrough). OmniShare is open source software and currently available for Android and Windows with other platforms in development. This work has been accepted for publication in IEEE Internet Computing (JuFo 3) [9].
For SDoE, we propose a formal security model for the single-server “secure deduplication of encrypted data”. We show that a deduplication scheme proved secure in this model can guarantee that, for a certain file, (1) a compromised client cannot learn whether or not this file has already been uploaded by someone else, and (2) there is no way for a compromised server to break this SDoE faster than doing a online brute-force attack. We proposed two new single-server deduplication schemes and prove their security in our model. We showed that their deduplication effectiveness is reasonable via simulations with realistic datasets. This paper has been accepted by CT-RSA '18 [8].
Topic 1 – Private Membership Test
(F-Secure, Trustonic, Aalto, UH)
Targeted Scenarios
In Android App reputation (WP1-01) and website reputation (WP1-02) services, the service provider holds a database of all reputation information about samples (i.e. apps or websites). A user who wants to check a particular sample will query this database. This query is done by sending a request consisting of an identifier of the sample to the database holder. The database holder’s response is a reputation value (or simply True or False), which provides information about the queried sample (e.g. indicates whether or not the service is trusted).
The nature of a reputation query is to directly hand over a search item (in this case a sample identifier) to service providers. This, however, causes a major privacy concern: service providers learn all queries that have been requested by users, therefore gaining personal information about their users. Studies demonstrate that digital records of human behavior, such as websites and applications they are interested in, can expose important personal traits e.g. religion, political views, relationship status etc. Therefore, this research focuses on protecting users of reputation services, from leakage of their private information.
Achieved Results
Crypto-based solutions. We propose a practical solution by approaching the problem from an offline/online setting. Specifically, we introduce the following three phases: a base phase where server and client run data-independent precomputation, e.g., agree on the public parameters; a setup phase where query-independent pre-computation is performed on server’s input and the results are transferred to clients, who then potentially perform additional computations; and an online phase where the operations dependent on the clients' queries are performed. We aim to shift most of the communication and computation work into the base and setup phases to have an efficient online phase. Take our motivating scenario as an example: after installing the service provided by the malware provider, the client may run the base and setup phases overnight. Then, by running the online phase, the client can check a new app in a short period of time. Thereafter, updates (i.e., insertion or deletion) on the dictionary is supported by only running the efficient online phase. In the setup phase, the server blinds each element in the database, resulting in a blinded database. Then the server inserts it into a compact data structure (e.g., a bloom filter or cuckoo filter) and transfers it to clients. Whenever a client wants to check an element, it runs an oblivious protocol with the server to get its blinded version without revealing any information, after which it can check its existence locally. We investigate four blinding techniques: RSA based, Diffie-Hellman (DH) based, Naor-Reingold (NR) based, and AES-GC based. This work has been published in PETS ’17 [1].
In addition to the previous solution, we suggest another practical crypto-based solution for the problem of private membership test. We propose a protocol with low communication complexity that utilizes additively homomorphic encryption scheme. The proposed protocol enables a client to perform privacy-preserving queries on a set controlled by a server. We aim to lower the communication complexity as far as possible. We approach the problem by compressing the size of the server’s database using a Bloom filter or Cuckoo filter.
In this setting, the server divides the database into several subsets. Each subset contains elements of the database that start with the same prefix of bits. The server stores each subset into a Bloom filter or Cuckoo filter. The server arranges a matrix and inserts the filters into the matrix. In order to achieve privacy properties, the client encrypts the location of the matrix that he/she is interested in, using additively homomorphic encryption. We used Paillier encryption scheme to implement this protocol. The client sends the encrypted values to the server. The server performs a series of computations for all the elements of the matrix, using the client’s encrypted values and sends the results back to the client. The client decrypts the results, retrieves a filter, and performs the membership test on this filter privately. We implement this protocol based on our motivating scenarios.
The results show that our proposed protocol has low communication and computation complexity, such that it is feasible in practice. Moreover, this solution does not need any pre-processing procedure. Therefore, updates can be done instantly. This work has been published in NSS ’17 [2].
TEE-based solution. Based on studying the feasibility of using trusted computing technologies, we propose a new carousel design pattern by having the TEE continuously circle the dictionary, such that a batch of queries can be answered within a single carousel cycle. Similar to the approach in [1], the server encodes the database as a compact data structure. Since the TEE needs to circle the whole data structure, the choice of data structure has a significant impact on the performance of the system. After experimentally evaluating the performance of several well-known data structures, we chose 4-ary cuckoo filter for our protocol. To process queries, the TEE cycles through the data structure and scans its contents in order to answer the received queries. The data structure is divided into several chunks and invoked sequentially with each chunk as input along with waiting queries. Incoming queries are associated with the identifier of the chunk in which they arrived, which is defined as their time of arrival. The TEE compares each entry in the chunk with the queries inside its memory and records the results. This process is repeated for each chunk. To avoid information leakage, the TEE i) performs constant-time processing for every entry, and ii) ensures that every query remains in TEE for exactly one full carousel cycle. This work has been published in AsiaCCS ’17 and received an Honorable Mention [3]. It has been selected as one of the top 10 papers in the Applied Research Contest at CSAW Europe [4].
Topic 2 – Private Graph Search
(SSH, UH)
Targeted Scenarios
This use case is about a cloud-assisted security service for managing trust relationships (WP1-03). In a large information system, such as an internal network of a big corporation, different users have varying degrees of trust in one another. There is no need to define (or re-define) policies for each node separately. Instead, trust relationships between different nodes are defined first and policies may be built on top of the concept of trust relationships.
In our model, a trust relation is a quintuple of the form: (source_user, source_host, fp, target_user, target_host). This expresses that a particular source_user@source_host can access another target_user@target_host by using a unique fingerprint (fp). A fingerprint is obtained from a public key by which the access is authenticated. The data is in the form of a directed graph with labeled edges. The vertices in the graph are different user-host pairs. The edges are fingerprints that connect these pairs. The owner of the graph, or some other user, wants to do queries about the graph using a cloud that stores the data and implements the queries, without revealing these queries to the cloud.
Achieved Results
We have addressed the case where one entity owns the trust relationship database and another makes queries to it. The owner of the database calculates the binary matrix that has a 1 in location (i,j) if there is a path from node denoted by i to the node denoted by j in the directed graph and a zero otherwise. The matrix is encrypted and then sent to the cloud.
When the querier wants to know if there is a path from user-host pair A to user-host pair B, he first finds out which row of the matrix corresponds to A and which column of the matrix corresponds to B. Then, using a private information retrieval protocol, he extracts the corresponding bit from the encrypted matrix stored in the cloud. After that he employs a blind decryption protocol with the owner of the graph to reveal the answer to the query.
In this protocol the owner of the graph does not learn the query nor the answer to it. The cloud learns nothing because the matrix is encrypted and a PIR protocol is used to make queries to it. The protocol works in reasonable time with realistic database sizes. This work has been submitted to NTMS ’18 [6].
We have also considered the problem where the trust relation database is constructed of two parts, both parts have different managers, and they do not want to reveal their secrets to other parties. A paper based on our solution to this problem is currently in preparation.
The third problem that we have been considering is the use case where the trust relation database has only one owner but instead of just asking if there is a path of trust relations from one node to another, the querier wants to know what the path is. A paper based on our solution to this problem is currently in preparation.
Topic 3 – Privacy-preserving Classifiers
(F-Secure, Nokia, Aalto)
Targeted Scenarios
As mentioned in Topic 1, Android App reputation (WP1-01) and website reputation (WP1-02) services require clients to query the cloud-hosted database in a privacy-preserving way. However, an advanced app/website reputation service may allow clients to extract features of an app/website, and send these to a server, which runs a machine learning classifier on the features and returns the classification results. This requires an approach to run the classification in a privacy-preserving way such that the server learns nothing about clients’ features, and clients learn nothing about the classifier except the output results. We refer to this technique as a privacy-preserving classifier.
Having an approach for privacy-preserving classifiers also benefits DoS Detection and Mitigation in Cloud Assisted Services (formely Fake Base Station Detection Using Cloud Services) (WP1-04) and Differential Anomaly Detection (WP1-05).
Achieved Results
We targeted neural network classification and presented the first technique that can transform any common neural network model into a privacy-preserving classifier without any modifications to the training phase. We design oblivious protocols for operations routinely used by neural network designers: linear transformations, popular activation functions, and pooling operations. In particular, we use polynomial splines to approximate nonlinear functions (e.g. sigmoid and tanh) with negligible loss in prediction accuracy. None of our protocols require any changes to the training phase of the model being transformed. We only use lightweight cryptographic primitives such as secret sharing and garbled circuits in the online prediction phase. We also introduce an offline pre-computation phase to perform request-independent operations using additively homomorphic encryption together with the SIMD batch processing technique. We show that our approach outperforms existing work in terms of response latency and message size. We demonstrate its wide applicability by transforming several typical neural network models trained from standard datasets. This work has been published in CCS ’17, one of the top tier conferences in security [5].
[1] Àgnes Kiss, Jian Liu, Thomas Schneider, N. Asokan, Benny Pinkas. Private Set Intersection for Unequal Set Sizes with Mobile Applications. In Proceedings of the 17th Privacy Enhancing Technologies Symposium (PETs), Minneapolis, USA , Pages 97-117, July 2017.
[2] Sara Ramezanian, Tommi Meskanen, Masoud Naderpour, Valtteri Niemi. Private Membership Test Protocol with Low Communication Complexity. In International Conference on Network and System Security, pp. 31-45. Springer, Cham, August 2017.
[3] Sandeep Tamrakar, Jian Liu, Andrew Paverd, Jan-Erik Ekberg, Benny Pinkas, N. Asokan. The Circle Game: Scalable Private Membership Test Using Trusted Hardware. In Proceedings of the 11th ACM on Asia Confer- ence on Computer and Communications Security (ASIA CCS), Abu Dhabi, United Arab Emirates, Pages 31-44, April 2017.
[4] CSAW Europe https://csaw.engineering.nyu.edu/research/finalists
[5] Jian Liu, Mika Juuti, Yao Lu, N Asokan. Oblivious Neural Network Predictions via MiniONN Transformations. In Proceedings of the 24th ACM SIGSAC Conference on Computer and Communications Security (CCS), Dallas, Texas, USA, October 2017.
[6] Sara Ramezanian, Tommi Meskanen, Valtteri Niemi. Privacy Preserving Queries on Directed Graph. Accepted by NTMS'2018 - Security Track
[7] Jian Liu, N. Asokan, and Benny Pinkas. "Secure deduplication of encrypted data without additional independent servers." In Proceedings of the 22nd ACM SIGSAC Conference on Computer and Communications Security (CCS), 2015.
[8] Jian Liu, Li Duan, Yong Li and N. Asokan. "Secure deduplication of encrypted data: Refined Model and New Constructions." Accepted by CT-RSA 2018.
[9] Andrew Paverd, Sandeep Tamrakar, Hoang Long Nguyen, Praveen Kumar Pendyala, Thien Duc Nguyen, Elizabeth Stobert, Tommi Gröndahl, N. Asokan, Ahmad-Reza Sadeghi, "OmniShare: Securely Accessing Encrypted Cloud Storage from Multiple Authorized Devices", Accepted by IEEE Internet Computing.