CryptoDB
Francisco Rodriguez
Publications
2025, EUROCRYPT: Polynomial Time Cryptanalytic Extraction of Deep Neural Networks in the Hard-Label Setting
Abstract
Deep neural networks (DNNs) are valuable assets, yet their public accessibility raises security concerns about parameter extraction by malicious actors. Recent work by Carlini et al. (Crypto'20) and Canales-Martínez et al. (Eurocrypt'24) has drawn parallels between this issue and block cipher key extraction via chosen-plaintext attacks. Leveraging differential cryptanalysis, they demonstrated that all the weights and biases of black-box ReLU-based DNNs could be inferred using a polynomial number of queries and a polynomial amount of computational time. However, their attacks relied on the availability of the exact numeric values of the output logits, which allowed the calculation of their derivatives. To overcome this limitation, Chen et al. (Asiacrypt'24) tackled the more realistic hard-label scenario, where only the final classification label (e.g., "dog" or "car") is accessible to the attacker. They proposed an extraction method that requires a polynomial number of queries but exponential execution time. In addition, their approach was applicable only to a restricted set of architectures, could deal only with binary classifiers, and was demonstrated only on tiny neural networks with up to four neurons split among up to two hidden layers.
This paper introduces new techniques that, for the first time, achieve cryptanalytic extraction of DNN parameters in the most challenging hard-label setting, using both a polynomial number of queries and polynomial time. We validate our approach by extracting nearly one million parameters from a DNN trained on the CIFAR-10 dataset, with 832 neurons arranged in four hidden layers. Our results reveal the surprising fact that all the weights of a ReLU-based DNN can be efficiently determined by analyzing only the geometric shape of its decision boundaries.
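To give a concrete feel for the hard-label setting, here is a minimal Python sketch of one primitive such an attack can build on: binary-searching the segment between two differently labeled inputs for a point on the decision boundary. This is an illustrative sketch only, not the paper's algorithm; the oracle function (a black box returning just the predicted class label) is a hypothetical stand-in.

import numpy as np

def boundary_point(oracle, x0, x1, tol=1e-9):
    # oracle is a hypothetical hard-label black box returning only the
    # predicted class; x0 and x1 must receive different labels.
    label0 = oracle(x0)
    assert oracle(x1) != label0, "endpoints must get different labels"
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        x = (1.0 - mid) * x0 + mid * x1
        if oracle(x) == label0:
            lo = mid  # still on x0's side of the boundary
        else:
            hi = mid  # already crossed the boundary
    return (1.0 - hi) * x0 + hi * x1  # approximate boundary point

Repeating this search along many directions traces out the piecewise-linear decision boundary whose geometry, per the abstract, suffices to recover all the weights.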
2024, EUROCRYPT: Polynomial Time Cryptanalytic Extraction of Neural Network Models
Abstract
Billions of dollars and countless GPU hours are currently spent on training Deep Neural Networks (DNNs) for a variety of tasks. Thus, it is essential to determine the difficulty of extracting all the parameters of such neural networks when given access to their black-box implementations. Many versions of this problem have been studied over the last 30 years, and the best current attack on ReLU-based deep neural networks was presented at Crypto'20 by Carlini, Jagielski, and Mironov. It resembles a differential chosen-plaintext attack on a cryptosystem with a secret key embedded in its black-box implementation, and it requires a polynomial number of queries but an exponential amount of time (as a function of the number of neurons).
In this paper, we improve this attack by developing several new techniques that enable us to extract with arbitrarily high precision all the real-valued parameters of a ReLU-based DNN using a polynomial number of queries and a polynomial amount of time. We demonstrate its practical efficiency by applying it to a full-sized neural network for classifying the CIFAR-10 dataset, which has 3072 inputs, 8 hidden layers with 256 neurons each, and about 1.2 million neuronal parameters. An attack following the approach by Carlini et al. requires an exhaustive search over 2^256 possibilities. Our attack replaces this with our new techniques, which require only 30 minutes on a 256-core computer.
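For contrast with the hard-label setting above, here is a small Python sketch of the kind of differential probing available when exact logits are exposed: scanning the output along a line and flagging where its slope changes, i.e. where some ReLU neuron toggles between its active and inactive states. The logit oracle is an assumed interface for illustration; this is not the authors' actual extraction procedure.

import numpy as np

def find_critical_point(logit, x0, d, t_max=1.0, steps=2000):
    # logit is a hypothetical black box returning the exact real-valued
    # network output; x0 is a start point and d a search direction.
    ts = np.linspace(0.0, t_max, steps)
    f = np.array([logit(x0 + t * d) for t in ts])
    slopes = np.diff(f) / np.diff(ts)   # piecewise constant for ReLU nets
    jumps = np.abs(np.diff(slopes))     # spikes where the slope changes
    k = int(np.argmax(jumps))
    return ts[k + 1]  # approximate location of a ReLU critical point

Locating such critical points and measuring how the slope changes across them is the differential information that logit-based attacks exploit; the paper's contribution is making the subsequent parameter recovery run in polynomial rather than exponential time.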
Coauthors
- Isaac Canales-Martínez (1)
- Nicholas Carlini (1)
- Jorge Chavez-Saab (2)
- Anna Hambitzer (2)
- Francisco Rodriguez (2)
- Nitin Satpute (1)
- Adi Shamir (2)