CryptoDB
Kevin Yeo
Publications
Year
Venue
Title
2023
EUROCRYPT
Lower Bound Framework for Differentially Private and Oblivious Data Structures
Abstract
In recent years, there has been significant work in studying data structures that provide privacy for the operations that are executed. These primitives aim to guarantee that observable access patterns to physical memory do not reveal substantial information about the queries and updates executed on the data structure. Multiple recent works, including Larsen and Nielsen [Crypto'18], Persiano and Yeo [Eurocrypt'19], Hub{\'{a}}{\v{c}}ek {\em et al.} [TCC'19] and Komargodski and Lin [Crypto'21], have shown that logarithmic overhead is required to support even basic RAM (array) operations for various privacy notions including obliviousness and differential privacy as well as different choices of sizes for RAM blocks $b$ and memory cells $\omega$.
We continue along this work and present the first logarithmic lower bounds for differentially private RAMs (DPRAMs) that apply regardless of the sizes of blocks $b$ and cells $\omega$. This is the first logarithmic lower bounds for DPRAMs when blocks are significantly smaller than cells, that is $b \ll \omega$. Furthermore, we present new logarithmic lower bounds for differentially private variants of classical data structure problems including sets, predecessor (successor) and disjoint sets (union-find) for which sub-logarithmic plaintext constructions are known. All our lower bounds also extend to the multiple non-colluding servers setting.
We also address an unfortunate issue with this rich line of work where the lower bound techniques are difficult to use and required customized for each new result. To make the techniques more accessible, we generalize our proofs into a framework that reduces proving logarithmic lower bounds to showing that a specific problem satisfies two simple, minimal conditions. We show our framework is easy-to-use as all the lower bounds in our paper utilize the framework and hope our framework will spur more usage of these lower bound techniques.
2023
EUROCRYPT
Lower Bounds for (Batch) PIR with Private Preprocessing
Abstract
In this paper, we study (batch) private information retrieval with private preprocessing. Private information retrieval (PIR) is the problem where one or more servers hold a database of $n$ bits and a client wishes to retrieve the $i$-th bit in the database from the server(s). In PIR with private preprocessing (also known as offline-online PIR), the client is able to compute a private $r$-bit hint in an offline stage that may be leveraged to perform retrievals accessing at most $t$ entries. For privacy, the client wishes to hide index $i$ from an adversary that has compromised some of the servers. In the batch PIR setting, the client performs queries to retrieve the contents of multiple entries simultaneously.
We present a tight characterization for the trade-offs between hint size $r$ and number of accessed entries $t$ during queries. For any PIR scheme that enables clients to perform batch retrievals of $k$ entries, we prove a lower bound of $tr = \Omega(nk)$ when $r \ge k$. When $r < k$, we prove that $t = \Omega(n)$. Our lower bounds hold when the scheme errs with probability at most $1/15$ and against PPT adversaries that only compromise one out of $\ell$ servers for any $\ell = O(1)$. Our work also closes the multiplicative logarithmic gap for the single query setting $(k = 1)$ as our lower bound matches known constructions. Our lower bounds hold in the model where each database entry is stored without modification but each entry may be replicated arbitrarily.
Finally, we show connections between PIR and the online matrix-vector (OMV) conjecture from fine-grained complexity. We present barriers for proving lower bounds for two-server PIR schemes in general computational models as they would immediately imply the OMV conjecture.
2023
CRYPTO
Limits of Breach-Resistant and Snapshot-Oblivious RAMs
Abstract
Oblivious RAMs (ORAMs) are an important cryptographic primitive that enable outsourcing data to a potentially untrusted server while hiding patterns of access to the data. ORAMs provide strong guarantees even in the face of a {\em persistent adversary} that views the transcripts of all operations and resulting memory contents. Unfortunately, the strong guarantees against persistent adversaries comes at the cost of efficiency as ORAMs are known to require $\Omega(\log n)$ overhead.
In an attempt to obtain faster constructions, prior works considered security against {\em snapshot adversaries} that only have limited access to operational transcripts and memory. We consider $(s,\ell)$-snapshot adversaries that perform $s$ data breaches and views the transcripts of $\ell$ total queries. Promisingly, Du, Genkin and Grubbs [Crypto'22] presented an ORAM construction with $O(\log \ell)$ overhead protecting against $(1,\ell)$-snapshot adversaries with the transcript of $\ell$ consecutive operations from a single breach. For small values of $\ell$, this outperforms standard ORAMs.
In this work, we tackle whether it is possible to further push this construction beyond a single breach. Unfortunately, we show that protecting against even slightly stronger snapshot adversaries becomes difficult. As our main result, we present a $\Omega(\log n)$ lower bound for any ORAM protecting against a $(3,1)$-snapshot adversary that performs three breaches and sees the transcript of only one query. In other words, our lower bound holds even if an adversary observes only memory contents during two breaches while managing to view the transcript of only one query in the other breach. Therefore, we surprisingly show that protecting against a snapshot adversary with three data breaches is as difficult as protecting against a persistent adversary.
2023
CRYPTO
Cuckoo Hashing in Cryptography: Optimal Parameters, Robustness and Applications
Abstract
Cuckoo hashing is a powerful primitive that enables storing items using small space with efficient querying. At a high level, cuckoo hashing maps $n$ items into $b$ entries storing at most $\ell$ items such that each item is placed into one of $k$ randomly chosen entries. Additionally, there is an overflow stash that can store at most s items. Many cryptographic primitives rely upon cuckoo hashing to privately embed and query data. It is integral to ensure small failure probability for constructing cuckoo hashing tables as it directly relates to the privacy.
As our main result, we present a more query-efficient cuckoo hashing construction using more hash functions. For construction failure probability ε, the query overhead of our scheme is $O(1 + \sqrt{\log(1/\epsilon)/\log n})$. Our scheme has quadratically smaller query overhead than prior works for any target failure probability $\epsilon$. We also prove lower bounds matching our construction. Our improvements come from a new understanding of the locality of cuckoo hashing failures for small sets of items.
We also initiate the study of robust cuckoo hashing where the input set may be chosen with knowledge of the hash functions. We present a cuckoo hashing scheme using more hash functions with query overhead $\tilde{O}(log \lambda)$ that is robust against $\poly(\lambda)$ adversaries. Furthermore, we present lower bounds showing that this construction is tight and that extending previous approaches of large stashes or entries cannot obtain robustness except with $\Omega(n)$ query overhead.
As applications of our results, we obtain improved constructions for batch codes and PIR. In particular, we present the most efficient explicit batch code and blackbox reduction from single-query PIR to batch PIR.
2021
ASIACRYPT
Efficient Boolean Search over Encrypted Data with Reduced Leakage
📺
Abstract
Encrypted multi-maps enable outsourcing the storage of a multi-map to an untrusted server while maintaining the ability to query privately. We focus on encrypted Boolean multi-maps that support arbitrary Boolean queries over the multi-map. Kamara and Moataz [Eurocrypt’17] presented the first encrypted multi-map, BIEX, that supports CNF queries with optimal communication, worst-case sublinear search time and non-trivial leakage.
We improve on previous work by presenting a new construction CNFFilter for CNF queries with significantly less leakage than BIEX, while maintaining both optimal communication and worst-case sublinear search time. As a direct consequence our construction shows additional resistance to leakage-abuse attacks in comparison to prior works. For most CNF queries, CNFFilter avoids leaking the result sets for any singleton queries for labels appearing in the CNF query. As an example, for the CNF query of the form (l1 ∨ l2) ∧ l3, our scheme does not leak the result sizes of queries to l1, l2 or l3 individually. On the other hand, BIEX does leak some of this information. This is just an example of the reduced leakage obtained by CNFFilter. The core of CNFFilter is a new filtering algorithm that performs set intersections with significantly less leakage compared to prior works.
We implement CNFFilter and show that CNFFilter achieves faster search times and similar communication overhead compared to BIEX at the cost of a small increase in server storage.
2021
TCC
Forward Secret Encrypted RAM: Lower Bounds and Applications
📺
Abstract
In this paper, we study forward secret encrypted RAMs (FS eRAMs) which enable clients to outsource the storage of an n-entry array to a server. In the case of a catastrophic attack where both client and server storage are compromised, FS eRAMs guarantee that the adversary may not recover any array entries that were deleted or overwritten prior to the attack. A simple folklore FS eRAM construction with O(logn) overhead has been known for at least two decades. Unfortunately, no progress has been made since then. We show the lack of progress is fundamental by presenting an \Omega(log n) lower bound for FS eRAMs proving that the folklore solution is optimal. To do this, we introduce the symbolic model for proving cryptographic data structures lower bounds that may be of independent interest.
Given this limitation, we investigate applications where forward secrecy may be obtained without the additional O(log n) overhead. We show this is possible for oblivious RAMs, memory checkers, and multicast encryption by incorporating the ideas of the folklore FS eRAM solution into carefully chosen constructions of the corresponding primitives.
2020
CRYPTO
Lower Bounds for Encrypted Multi-Maps and Searchable Encryption in the Leakage Cell Probe Model
📺
Abstract
Encrypted multi-maps (EMMs) enable clients to outsource the storage of a multi-map to a potentially untrusted server while maintaining the ability to perform operations in a privacy-preserving manner. EMMs are an important primitive as they are an integral building block for many practical applications such as searchable encryption and encrypted databases. In this work, we formally examine the tradeoffs between privacy and efficiency for EMMs.
Currently, all known dynamic EMMs with constant overhead reveal if two operations are performed on the same key or not that we denote as the global key-equality pattern. In our main result, we present strong evidence that the leakage of the global key-equality pattern is inherent for any dynamic EMM construction with $O(1)$ efficiency. In particular, we consider the slightly smaller leakage of decoupled key-equality pattern where leakage of key-equality between update and query operations is decoupled and the adversary only learns whether two operations of the same type are performed on the same key or not. We show that any EMM with at most decoupled key-equality pattern leakage incurs $\Omega(\log n)$ overhead in the leakage cell probe model. This is tight as there exist ORAM-based constructions of EMMs with logarithmic slowdown that leak no more than the decoupled key-equality pattern (and actually, much less). Furthermore, we present stronger lower bounds that encrypted multi-maps leaking at most the decoupled key-equality pattern but are able to perform one of either the update or query operations in the plaintext still require $\Omega(\log n)$ overhead. Finally, we extend our lower bounds to show that dynamic, response-hiding searchable encryption schemes must also incur $\Omega(log n)$ overhead even when one of either the document updates or searches may be performed in the plaintext.
2020
TCC
Lower Bounds for Multi-Server Oblivious RAMs
📺
Abstract
In this work, we consider oblivious RAMs (ORAM) in a setting with multiple servers and the adversary may corrupt a subset of the servers. We present an $\Omega(log n)$ overhead lower bound for any k-server ORAM that limits any PPT adversary to distinguishing advantage at most $1/4k$ when only one server is corrupted. In other words, if one insists on negligible distinguishing advantage, then multi-server ORAMs cannot be faster than single-server ORAMs even with polynomially many servers of which only one unknown server is corrupted. Our results apply to ORAMs that may err with probability at most 1/128 as well as scenarios where the adversary corrupts larger subsets of servers. We also extend our lower bounds to other important data structures including oblivious stacks, queues, deques, priority queues and search trees.
2019
EUROCRYPT
Lower Bounds for Differentially Private RAMs
📺
Abstract
In this work, we study privacy-preserving storage primitives that are suitable for use in data analysis on outsourced databases within the differential privacy framework. The goal in differentially private data analysis is to disclose global properties of a group without compromising any individual’s privacy. Typically, differentially private adversaries only ever learn global properties. For the case of outsourced databases, the adversary also views the patterns of access to data. Oblivious RAM (ORAM) can be used to hide access patterns but ORAM might be excessive as in some settings it could be sufficient to be compatible with differential privacy and only protect the privacy of individual accesses.We consider $$(\epsilon ,\delta )$$(ϵ,δ)-Differentially Private RAM, a weakening of ORAM that only protects individual operations and seems better suited for use in data analysis on outsourced databases. As differentially private RAM has weaker security than ORAM, there is hope that we can bypass the $$\varOmega (\log (nb/c))$$Ω(log(nb/c)) bandwidth lower bounds for ORAM by Larsen and Nielsen [CRYPTO ’18] for storing an array of nb-bit entries and a client with c bits of memory. We answer in the negative and present an $$\varOmega (\log (nb/c))$$Ω(log(nb/c)) bandwidth lower bound for privacy budgets of $$\epsilon = O(1)$$ϵ=O(1) and $$\delta \le 1/3$$δ≤1/3.The information transfer technique used for ORAM lower bounds does not seem adaptable for use with the weaker security guarantees of differential privacy. Instead, we prove our lower bounds by adapting the chronogram technique to our setting. To our knowledge, this is the first work that uses the chronogram technique for lower bounds on privacy-preserving storage primitives.
Program Committees
- Crypto 2024
- PKC 2023
- TCC 2023
- Crypto 2022
Coauthors
- Alexander Bienstock (1)
- Yevgeniy Dodis (1)
- Kasper Green Larsen (1)
- Sarvar Patel (2)
- Giuseppe Persiano (5)
- Joon Young Seo (1)
- Mark Simkin (1)
- Kevin Yeo (9)