When researchers at software management company JFrog routinely scanned AI/ML models uploaded to Hugging Face earlier this year, the discovery of a hundred malicious models put the spotlight on an underrated category of cybersecurity woes: data poisoning and manipulation.
The problem with data poisoning, which targets the training data used to build artificial intelligence (AI) and machine learning (ML) models, is that it is unorthodox as far as cyberattacks go and, in some cases, can be impossible to detect or stop. Attacking AI this way is relatively easy: no hacking in the traditional sense is required to poison or manipulate the training data that popular large language models (LLMs) like ChatGPT rely on.
Data poisoning can be used to make an AI model do an attacker's bidding. Alternatively, a trained model can be coaxed into erroneous output by modifying the data sent into it. These are two distinct types of attack: one takes place before the model is deployed, the other after deployment. Both are incredibly difficult to ferret out and guard against.
In its analysis, JFrog noted that an "intriguing" payload embedded within a model looked like something researchers would upload to demonstrate vulnerabilities or showcase proofs-of-concept, yet no such disclosure accompanied the nefarious models uploaded to Hugging Face's AI collaboration repository. Researchers may nonetheless have been behind it, because the payloads had links to IP addresses from KREOnet, the Korea Research Environment Open Network.
Global VP and CISO in Residence at Zscaler.
Built-in AI problems hamper detection while fertilizing exploits
Examples of training data manipulation can be traced to the origins of machine learning: a decade ago, researchers demonstrated that subtle adversarial perturbations to a model's input can cause it to output an incorrect answer with high confidence.
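The effect those researchers described can be sketched with a toy linear model. Everything below is invented for illustration (the weights, the input, and the step size are made-up values, not any real system): a small signed perturbation aligned with the model's weights is enough to flip a prediction.

```python
import numpy as np

# Hypothetical linear scorer: sign(w @ x) decides the class.
w = np.array([1.0, -1.0, 0.5])       # assumed model weights
x = np.array([0.2, 0.5, 0.1])        # benign input: w @ x = -0.25 → class A

# FGSM-style step: nudge each feature in the direction that
# most increases the score (the sign of the matching weight).
eps = 0.4
x_adv = x + eps * np.sign(w)

score = w @ x                        # -0.25 → class A
adv_score = w @ x_adv                #  0.75 → class B (prediction flipped)
```

The perturbation is bounded per feature, so an adversarial input can look nearly identical to the benign one while landing on the other side of the decision boundary.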
It’s even possible that generative AI models scraping the internet could eventually “poison” themselves as their outputs become inputs for future training sets, in a process known as “degenerative model collapse.”
What muddies the waters further is that AI model reproducibility is itself a challenge: given the vast pools of data used for training, researchers and data scientists may not understand exactly what went into a model or what is coming out, which makes malicious code harder to detect and trace.
Inconvenient as all of this sounds in the AI gold rush, turning a blind eye to data poisoning and data manipulation can embolden attackers to focus on stealth backdoor exploits of AI software. The results can be malicious code execution, as in the case of Hugging Face, new vectors to successfully carry out phishing attacks, and misclassified model outputs that lead to unexpected behaviors, depending on the goals of the attacker.
In a world increasingly blanketed with an ecosystem of interconnected AI, GenAI, LLMs, and APIs, the global cybersecurity industry should release a collective shudder and take action to protect against the rise of attacks on AI models.
Protecting against the “indefensible”
Experts advise several techniques to protect AI-driven systems from data poisoning or manipulation campaigns. Most focus on the data training stage and the algorithms themselves.
In its “Top 10 for LLM Applications” list, the Open Worldwide Application Security Project (OWASP) recommends steps to prevent training data poisoning, starting with scrutiny of the supply chain for internally and externally sourced training data: continuously verifying data sources across the pre-training, fine-tuning, and embedding stages and flagging any biases or anomalies.
OWASP also recommends “sanitizing” the data with statistical outlier and anomaly detection methods, so that adversarial data is caught before it is fed into the fine-tuning process.
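As a minimal sketch of the statistical-outlier approach (not OWASP's prescribed tooling; the function name and threshold here are invented for this example), a simple z-score filter can drop training rows that sit far outside the bulk of the data:

```python
import numpy as np

def filter_outliers(X, z_threshold=1.5):
    """Keep rows whose every feature lies within z_threshold standard
    deviations of the column mean; return the kept rows plus the
    indices of the rows that were flagged as outliers."""
    X = np.asarray(X, dtype=float)
    mu, sigma = X.mean(axis=0), X.std(axis=0)
    sigma[sigma == 0] = 1.0              # guard against constant columns
    z = np.abs((X - mu) / sigma)
    keep = (z < z_threshold).all(axis=1)
    return X[keep], np.where(~keep)[0]

# Four plausible samples plus one wildly out-of-range value:
data = [[1.0], [1.1], [0.9], [1.05], [50.0]]
clean, flagged = filter_outliers(data)   # flags index 4, keeps the rest
```

A real sanitization pipeline would likely prefer robust statistics (median and MAD) over mean and standard deviation, since a strong poisoned point inflates both and can hide itself.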
If training data is corrupted, alternative algorithms can be swapped in to replace the impacted model. More than one algorithm can also be run in parallel to compare results, with a fallback to pre-defined or averaged outputs when all else fails. Developers should closely examine AI/ML algorithms that interact with or feed into others, as those interactions can trigger a cascade of unexpected predictions.
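The compare-and-fall-back idea can be sketched as follows; the helper name, tolerance, and fallback value are all hypothetical choices for illustration, not a standard API:

```python
def resilient_predict(models, x, tolerance=0.1, fallback=None):
    """Query several independently trained models. If their numeric
    predictions agree within `tolerance`, return the average;
    otherwise return the pre-defined fallback value."""
    preds = [model(x) for model in models]
    if max(preds) - min(preds) <= tolerance:
        return sum(preds) / len(preds)
    return fallback

# Stand-in "models": simple callables for illustration.
agreeing = [lambda x: 0.90, lambda x: 0.95, lambda x: 0.92]
divergent = [lambda x: 0.90, lambda x: 0.10, lambda x: 0.92]

consensus = resilient_predict(agreeing, None)                # averaged
safe_value = resilient_predict(divergent, None, fallback=0.5)  # fallback
```

Disagreement between independently trained models is itself a useful signal: it suggests at least one of them saw training data the others did not, which is worth investigating.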
Industry experts also suggest that cybersecurity teams check the robustness and resilience of their AI systems by pentesting and simulating a data poisoning attack.
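A simulation of that kind can be surprisingly small. The sketch below uses synthetic data, a deliberately naive nearest-centroid classifier, and made-up trigger values (no real pentest framework): it plants backdoor samples in the training set and checks that a trigger input flips the poisoned model's verdict while the clean model is unaffected.

```python
import numpy as np

def centroid_classifier(X, y):
    """Fit a nearest-centroid classifier; returns a predict function."""
    centroids = {c: X[y == c].mean(axis=0) for c in np.unique(y)}
    def predict(q):
        return min(centroids, key=lambda c: np.linalg.norm(q - centroids[c]))
    return predict

rng = np.random.default_rng(0)
# Clean training set: class 0 near (0,0), class 1 near (2,2);
# the third feature is unused by legitimate data.
X = np.vstack([rng.normal(0.0, 0.3, (50, 3)), rng.normal(2.0, 0.3, (50, 3))])
X[:, 2] = 0.0
y = np.array([0] * 50 + [1] * 50)

# Simulated poisoning: 20 rows that look like class 0 but carry a
# trigger (third feature = 5) and an attacker-chosen label of 1.
X_bad = np.zeros((20, 3)); X_bad[:, 2] = 5.0
X_p = np.vstack([X, X_bad])
y_p = np.concatenate([y, np.ones(20, dtype=int)])

clean_model = centroid_classifier(X, y)
poisoned_model = centroid_classifier(X_p, y_p)

trigger = np.array([0.0, 0.0, 5.0])   # class-0-like input plus trigger
benign = np.array([0.0, 0.0, 0.0])    # ordinary class-0 input
```

Running both models on held-out clean inputs would also show that the backdoor leaves ordinary accuracy largely intact, which is exactly what makes such attacks hard to spot without a deliberate simulation.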
Even a 100% cybersecure AI model can still be poisoned through its training data. There is no defense other than validating every predictive output, which is computationally very expensive.
Building a resilient future for AI
Without trust and dependability, the greatest innovation in tech may hit the brakes.
Organizations need to prevent backdoor threats in AI code generation by treating the entire ecosystem and supply chains that underpin GenAI, LLMs, etc. as part of the overall threat universe.
By monitoring the inputs and outputs of these systems and detecting anomalies with threat intelligence, security teams can generate findings and data that help developers build controls and protections into the AI software development lifecycle.
Overall, by examining the risks of AI systems within broader business processes, checking the entire data governance lifecycle, and monitoring how AI behaves in specific applications, organizations can stay one step ahead of one of the most challenging issues facing cybersecurity.
This article was produced as part of TechRadarPro’s Expert Insights channel where we feature the best and brightest minds in the technology industry today. The views expressed here are those of the author and are not necessarily those of TechRadarPro or Future plc. If you are interested in contributing find out more here: https://www.techradar.com/news/submit-your-story-to-techradar-pro