Security researchers at North Carolina State University have introduced the first defense mechanism to protect artificial intelligence (AI) systems from cryptanalytic parameter extraction attacks, which obtain the model parameters that determine how an AI system operates.
“AI systems are valuable intellectual property, and cryptanalytic parameter extraction attacks are the most efficient, effective, and accurate way to ‘steal’ that intellectual property,” said Ashley Kurian, first author of the paper and a Ph.D. student at North Carolina State University. “Until now, there has been no way to defend against those attacks. Our technique effectively protects against these attacks.”
Aydin Aysu, corresponding author of the paper and associate professor of electrical and computer engineering at NC State, explained: “Cryptanalytic attacks are already happening, and they’re becoming more frequent and more efficient. We need to implement defense mechanisms now, because implementing them after an AI model’s parameters have been extracted is too late.”
These attacks use mathematical methods to infer a model's parameters by submitting chosen inputs and analyzing the corresponding outputs.
“In a cryptanalytic attack, someone submits inputs and looks at outputs,” said Aysu. “They then use a mathematical function to determine what the parameters are. So far, these attacks have only worked against a type of AI model called a neural network. However, many – if not most – commercial AI systems are neural networks, including large language models such as ChatGPT.”
The research team discovered that these attacks rely on differences between neurons within each layer of a neural network. Kurian stated: “What we observed is that cryptanalytic attacks focus on differences between neurons. And the more different the neurons are, the more effective the attack is. Our defense mechanism relies on training a neural network model in a way that makes neurons in the same layer of the model similar to each other. You can do this only in the first layer, or on multiple layers. And you could do it with all of the neurons in a layer, or only on a subset of neurons.”
“This approach creates a ‘barrier of similarity’ that makes it difficult for attacks to proceed,” Aysu added. “The attack essentially doesn’t have a path forward. However, the model still functions normally in terms of its ability to perform its assigned tasks.”
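The article does not give the paper's exact training objective, but one plausible way to encourage neurons in a layer to resemble one another is to add a similarity regularizer to the training loss. The sketch below is a minimal, hypothetical illustration of that idea: `neuron_similarity_penalty` measures how far each neuron's weight vector deviates from the layer's mean weight vector, and `regularized_loss` adds that penalty (scaled by a hypothetical strength `lam`) to the ordinary task loss. All names and the specific penalty form are assumptions for illustration, not the authors' method.

```python
import numpy as np

def neuron_similarity_penalty(W):
    """Penalty that grows as the neurons in a layer become more different.

    W: (num_neurons, num_inputs) weight matrix for one layer, one row
    per neuron. Hypothetical choice: mean squared deviation of each
    neuron's weights from the layer-average weight vector. The penalty
    is 0 only when all neurons are identical.
    """
    mean_neuron = W.mean(axis=0, keepdims=True)  # layer's average neuron
    return float(np.mean((W - mean_neuron) ** 2))

def regularized_loss(task_loss, W, lam=0.1):
    """Total training loss = task loss + lam * similarity penalty.

    Larger lam pushes neurons closer together (stronger 'barrier of
    similarity'), at some potential cost to task accuracy.
    """
    return task_loss + lam * neuron_similarity_penalty(W)

# Identical neurons incur no penalty; distinct neurons are penalized.
W_same = np.array([[1.0, 2.0], [1.0, 2.0]])
W_diff = np.array([[1.0, 0.0], [0.0, 1.0]])
print(neuron_similarity_penalty(W_same))  # 0.0
print(neuron_similarity_penalty(W_diff))  # 0.25
```

In this framing, the penalty could be applied to the first layer only, to several layers, or to a subset of neurons within a layer, matching the options Kurian describes.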
Testing showed that adding this defense mechanism led to changes in accuracy below 1%. According to Kurian: “Sometimes a model that was retrained to incorporate the defense mechanism was slightly more accurate, sometimes slightly less accurate – but the overall change was minimal.”
Kurian further noted: “We also tested how well the defense mechanism worked. We focused on models that had their parameters extracted in less than four hours using cryptanalytic techniques. After retraining to incorporate the defense mechanism, we were unable to extract the parameters with cryptanalytic attacks that lasted for days.”
In addition to developing this practical defense, the researchers created a theoretical framework for estimating how likely an attack is to succeed without running lengthy tests.
“This framework is useful because it allows us to estimate how robust a given AI model is against these attacks without running such attacks for days,” said Aysu. “There is value in knowing how secure your system is – or isn’t.”
Kurian expressed hope for industry adoption: “We know this mechanism works, and we’re optimistic that people will use it to protect AI systems from these attacks… And we are open to working with industry partners who are interested in implementing the mechanism.”
Aysu emphasized ongoing challenges: “We also know that people trying to circumvent security measures will eventually find a way around them – hacking and security are engaged in a constant back and forth… We’re hopeful that there will be sources of funding moving forward that allow those of us working on new security efforts to keep pace.”
The research paper titled “Train to Defend: First Defense Against Cryptanalytic Neural Network Parameter Extraction Attacks” will be presented at NeurIPS 2025—the Thirty-Ninth Annual Conference on Neural Information Processing Systems—scheduled for December 2-7 in San Diego.
This project received support from the National Science Foundation under grant number 1943245.