From existential risks to misinformation, the emergence and rapid improvement of generative AI come with many concerns. With powerful generative models like ChatGPT available to almost anyone online, glitches and problems with these models get amplified, and they become dangerous in the wrong hands. This article focuses on the problem of AI hallucinations in large language models (LLMs) and the potential cybersecurity threats they might pose.
While they might seem genuinely intelligent, generative language models work by drawing on vast swaths of training data from the internet to essentially predict the next word in a sentence. The process usually works seamlessly in the latest models, such as GPT-4 (the model behind the current version of ChatGPT), which produce consistent (if rather repetitive) answers to almost any question.
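To illustrate the idea at a toy level, the sketch below mimics next-word prediction with an invented four-word vocabulary and made-up scores. It is purely a demonstration of the mechanics, not how any real LLM is implemented.

```python
import math

# Hypothetical candidate next words and the raw scores a model might assign them.
vocab = ["telescope", "banana", "planet", "keyboard"]
logits = [4.2, 0.1, 3.7, 0.3]

# Softmax turns the raw scores into a probability distribution.
exps = [math.exp(x) for x in logits]
total = sum(exps)
probs = [e / total for e in exps]

# Greedy decoding: pick the single most likely next word.
next_token = vocab[probs.index(max(probs))]
print(dict(zip(vocab, [round(p, 3) for p in probs])))
print("Predicted next word:", next_token)
```

Real models repeat this step billions of times over enormous vocabularies, which is exactly why they can sound fluent while having no built-in notion of whether a prediction is true.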
However, there are still cases where these models spit out inaccurate results. These hallucinations are falsehoods produced by the underlying model due to a range of possible factors, such as gaps or biases in the training data, overfitting, and vague or ambiguous prompts.
A high-profile example of AI hallucinations emerged when Google’s Bard chatbot gave an incorrect answer about the James Webb Space Telescope during the bot’s inaugural demo (incidentally, parent company Alphabet lost around $100 billion in market value after the mishap).
While these hallucinations are annoying (and sometimes costly) for the companies that release generative language models, it’s perhaps not immediately clear how the falsehoods they sometimes produce could increase cybersecurity risks.
A pertinent real-world example clarifies one type of risk: malicious code packages. Research from June 2023 highlighted a proof-of-concept attack that leveraged AI hallucinations to spread malicious code into developer environments.
In the attack, researchers asked ChatGPT to recommend a package that could solve a coding problem. The model replied with recommendations for a package that didn’t exist (in other words, the response was hallucinated).
A threat actor could then create malicious code, publish it to a repository under the name of this previously non-existent package, and wait. When a different user asks a similar question about package recommendations, ChatGPT may suggest the same hallucinated package as in previous answers, and that user ends up installing the attacker’s malicious code in their own dev environment.
While this so-called AI package hallucination sounds like a convoluted scenario, it is indeed quite plausible. A recent survey found that 92 percent of programmers use AI tools like ChatGPT in their work. So, it’s clearly not beyond the realm of possibility that hundreds of programmers ask these tools for library or other package recommendations each day.
In fact, the methodology for this specific bit of research was to scrape the most frequently asked questions on Stack Overflow and see which responses involved hallucinated code packages. As developers move to generative AI models, they’ll likely be asking them the same questions they would’ve previously posted on Stack Overflow.
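To make the defence concrete, here is a hedged sketch of a quick sanity check a developer could run before installing anything an LLM recommends. It assumes Python packages and uses PyPI’s public JSON API as the example registry; the package name queried at the end is hypothetical.

```python
import json
import urllib.request
from urllib.error import HTTPError

def package_exists_on_pypi(name: str) -> bool:
    """Return True if `name` is a published package on PyPI."""
    url = f"https://pypi.org/pypi/{name}/json"
    try:
        with urllib.request.urlopen(url, timeout=10) as resp:
            data = json.load(resp)
        # A hit alone isn't proof of safety: also check the metadata
        # (author, release history, downloads) before installing.
        print("Found:", data["info"]["name"], data["info"]["version"])
        return True
    except HTTPError as err:
        if err.code == 404:
            return False  # likely a hallucinated (or not-yet-squatted) name
        raise

# "totally-real-helper-lib" is a made-up name an LLM might invent.
if not package_exists_on_pypi("totally-real-helper-lib"):
    print("Not on PyPI; the recommendation may be hallucinated.")
```

A 404 doesn’t prove malice and a hit doesn’t prove safety; the point is simply to confirm that a recommendation corresponds to a real, reputable package before reaching for `pip install`.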
AI hallucinations don’t necessarily have to involve malice on the part of threat actors to cause cybersecurity risks. Remember that the coding capability of generative language models comes from training data based on online public repositories and other unverified open source code. This means developers who ask generative models to solve coding problems could end up adopting insecure coding practices or introducing vulnerabilities into their codebases based on the answers given by ChatGPT and similar tools.
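As a purely hypothetical illustration of the kind of flaw that can slip through, the snippet below contrasts an injection-prone query, the sort of pattern that sometimes appears in generated code, with its parameterized equivalent.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER, name TEXT)")
conn.execute("INSERT INTO users VALUES (1, 'alice')")

user_input = "alice' OR '1'='1"  # attacker-controlled value

# Insecure pattern: interpolating untrusted input straight into the SQL string.
insecure_query = f"SELECT * FROM users WHERE name = '{user_input}'"
print("Insecure:", conn.execute(insecure_query).fetchall())  # returns every row

# Safer equivalent: a parameterized query lets the driver handle escaping.
safe_query = "SELECT * FROM users WHERE name = ?"
print("Parameterized:", conn.execute(safe_query, (user_input,)).fetchall())  # returns nothing
```

AI-generated code deserves the same code review, testing, and security scanning as anything written by hand.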
It’s not just developers who use these tools, though, and the cybersecurity risks of hallucinations don’t just relate to code. Consider the problem of compliance with data privacy or cybersecurity regulations. Someone helping to maintain or achieve compliance could ask a generative AI model a question about a specific regulation.
There could be cases where the model produces a hallucinated response because it simply doesn’t understand that specific regulation well enough. If the person who prompted the response then takes this advice and applies it in some way to your IT environment, the result could be non-compliance. The risk here is misinformation, with users relying on these models for important information.
The initial inclination might be to ban these tools on your organization’s network, but that will lead to frustrated employees, workarounds and perhaps even a loss of competitiveness (because employees at other companies are likely using them). Instead, here are some tips to use AI more safely and avoid falling prey to risks from hallucinated responses:

- Treat AI output as a starting point rather than a definitive answer, and verify important claims against authoritative sources.
- Confirm that any recommended package or library actually exists and has a trustworthy maintainer and release history before installing it.
- Put AI-generated code through the same review, testing and security scanning as code written by your own developers.
- Avoid pasting sensitive or regulated data into public AI tools, and set clear usage policies so employees know what is acceptable.
Generative AI models can boost productivity, offer insights, and automate repetitive tasks. However, the key to safely harnessing their potential lies in understanding their limitations and using them as a complementary tool rather than a definitive authority.