Top AI and Machine Learning Security Threats 2024


If the past year had to have a name, it would surely be called the “Year of AI.” Widespread media coverage of ChatGPT and similar offerings made it abundantly clear that artificial intelligence (AI) and machine learning (ML) have profound value to offer many industries — from marketing to banking to healthcare.

This value, however, has a downside. As we become more reliant on this constantly evolving technology, the underlying data will become a more tempting target for cyberattacks, especially for algorithms that run on personally identifiable information (PII) and protected health information (PHI). Therefore, before building any AI or ML solutions, it's important to understand the types of threats you're up against so you can protect sensitive data.

This will be our focus in this blog. We’ll walk through the top 10 AI and machine learning security threats that are out there today, and touch on some steps you can take to protect against them.

Top Generative AI and LLM Security Threats 

Prompt Injection

One of the most common security threats around generative AI is prompt injection. Recall that a prompt is the basic interface for using an LLM: you ask it to generate a poem or a working prototype of an application, and it generates an output. A prompt injection occurs when a bad actor artfully crafts a prompt to manipulate the model into revealing information or taking actions it's not supposed to.

There are two basic steps you can take to guard against prompt injection attacks. First, ensure the models have end-to-end encryption; not only does this make it harder to manipulate a prompt in the first place, it also means that an attacker will have to work harder to decrypt any information that does come out. Second, make sure that developers thoroughly test a wide range of prompts to see what responses they get and where any vulnerabilities need to be rectified.
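
As an illustration of that second step, here is a minimal sketch of an input screen an application layer might run before forwarding a prompt to a model. The deny-list patterns and the screen_prompt function are hypothetical assumptions, not a prescribed implementation; a production system would typically pair something like this with model-based classifiers and logging.

```python
import re

# Hypothetical deny-list of phrases commonly seen in prompt-injection attempts;
# a real deployment would pair this with model-based classifiers and logging.
SUSPICIOUS_PATTERNS = [
    r"ignore (all|any|previous) (instructions|prompts)",
    r"reveal (your|the) (system prompt|hidden instructions)",
    r"disregard .* (rules|policies)",
]

def screen_prompt(user_input: str) -> str:
    """Reject obviously malicious prompts and fence the rest with delimiters."""
    lowered = user_input.lower()
    for pattern in SUSPICIOUS_PATTERNS:
        if re.search(pattern, lowered):
            raise ValueError("Prompt rejected: possible injection attempt")
    # Delimiters make it harder for user text to pass itself off as system instructions.
    return f"<user_input>\n{user_input}\n</user_input>"

if __name__ == "__main__":
    print(screen_prompt("Write a short poem about autumn."))
```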

Insecure Output Handling 

Insecure output handling occurs when data fed to or generated by the model isn't effectively validated. When building generative AI solutions, there need to be layers of data verification and quality control.

One way to approach this task is to ensure that access to the LLM is as limited as possible – data engineers, for example, might only be granted access to a single tier of data, rather than allowed to pore over everything you have. 

Another solution is sanitization. Testers can prompt a model with myriad questions and inputs to see if they can cause the model to violate security best practices. If they’re successful, then you need to vet your protocols before exposing the model to end-users. 
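
To make the verification idea concrete, here is a hedged sketch of an output gate that escapes markup and checks for an expected structure before a response reaches end users. The validate_model_output function and its limits are illustrative assumptions.

```python
import html
import json

def validate_model_output(raw_output: str, max_length: int = 2000) -> dict:
    """Validate and sanitize an LLM response before it reaches an end user."""
    if len(raw_output) > max_length:
        raise ValueError("Output exceeds allowed length")

    # Escape markup so the response can't inject scripts into a web page.
    safe_text = html.escape(raw_output)

    # If the application expects structured output, parse and check it explicitly
    # rather than trusting the model to follow the format.
    try:
        structured = json.loads(raw_output)
    except json.JSONDecodeError:
        structured = None

    return {"text": safe_text, "structured": structured}
```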

Training Data Poisoning

Artificial intelligence is not magic. For all the obvious power of LLMs and generative AI, these tools are only as good as the data they’re trained on. 

Because these models depend so heavily on their training data, there are a number of ways they can be tampered with or otherwise compromised, including what's known as "training data poisoning". This involves corrupting the model's training data so that when it sees certain information, it makes an incorrect recommendation. If not caught and rectified, this can lead to inaccuracies and biases down the line.

In medicine, it's often said that prevention is key. The same applies to data poisoning. As the volume of data LLMs are trained on increases, it will become steadily more challenging to retroactively weed out incorrect or malicious information. If you discover any questionable data, go back into the data archives, track down the data uploads for each time you trained your model, and comb through them to determine what needs to be flagged and removed. Some parts of this can be automated.
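
The automated part of that audit could look something like the sketch below, which checks archived training uploads against checksums of records already identified as poisoned. The directory layout, the .jsonl extension, and the FLAGGED_CHECKSUMS set are all assumptions made for illustration.

```python
import hashlib
from pathlib import Path

# Hypothetical checksums of records already identified as poisoned; the directory
# layout and .jsonl extension are assumptions about how uploads were archived.
FLAGGED_CHECKSUMS = {"<md5-of-a-known-bad-upload>"}

def audit_training_archive(archive_dir: str) -> list[Path]:
    """Return archived training files whose checksums match known-bad records."""
    suspicious = []
    for path in Path(archive_dir).rglob("*.jsonl"):
        digest = hashlib.md5(path.read_bytes()).hexdigest()
        if digest in FLAGGED_CHECKSUMS:
            suspicious.append(path)
    return suspicious
```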

Model Denial of Service (MDOS)

A denial of service attack refers to any attempt to take down a machine, application, website, etc. by spamming it with traffic or requests until it collapses.

As the name implies, a model denial of service (MDOS) attack is the AI version of this villainy. Because model endpoints are often publicly available, they can be relentlessly spammed, rendering them unusable.

This can cause critical business interruptions. For example, a healthcare business might use a generative AI tool for scheduling appointments. A model denial of service would mean that legitimate actors are unable to access the appointment scheduling tool. This could lead to patients not receiving timely care, and cause an administrative nightmare for the business.

Fully protecting your application from an MDOS attack is beyond the scope of this article, but the simplest way to guard against this possibility is to put model endpoints inside a virtual private cloud (VPC), where you can control and limit access to them.
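
Alongside a VPC, basic rate limiting at the application layer blunts the simplest forms of endpoint spamming. The sketch below is a minimal in-memory example with made-up limits; a production setup would more likely lean on an API gateway or a shared store such as Redis.

```python
import time
from collections import defaultdict

# A minimal in-memory rate limiter with made-up limits; production systems would
# more likely rely on an API gateway or a shared store such as Redis.
WINDOW_SECONDS = 60
MAX_REQUESTS_PER_WINDOW = 30
_request_log: dict[str, list[float]] = defaultdict(list)

def allow_request(client_id: str) -> bool:
    """Return True if the client is still within its request budget for the window."""
    now = time.time()
    recent = [t for t in _request_log[client_id] if now - t < WINDOW_SECONDS]
    if len(recent) >= MAX_REQUESTS_PER_WINDOW:
        _request_log[client_id] = recent
        return False
    recent.append(now)
    _request_log[client_id] = recent
    return True
```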

Read Next: 11 Security Best Practices for AI and Machine Learning Solutions

Supply Chain Vulnerabilities 

Whether we're talking about a factory or an entire suite of LLMs, the supply chain has many different moving parts. That means many different points at which a bad actor can insert themselves: hiding within plugins, embedding malicious software, or tampering with training data.

Insecure plugin design is one of the primary issues leading to supply chain vulnerabilities, and it often stems from a lack of robust encryption protocols to ensure HIPAA-protected data isn't disclosed.

To secure the generative AI supply chain, end-to-end encryption at rest and in transit is simply non-negotiable.
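
As a small illustration of encryption at rest, the sketch below uses the Python cryptography package's Fernet recipe to encrypt a record before it is written to storage. The record contents are placeholders, and in practice keys would be managed by a key management service rather than generated in application code.

```python
# A minimal sketch using the Python "cryptography" package's Fernet recipe
# (pip install cryptography). Record contents below are placeholders.
from cryptography.fernet import Fernet

def encrypt_record(plaintext: bytes, key: bytes) -> bytes:
    """Encrypt a record or plugin payload before it is written to storage."""
    return Fernet(key).encrypt(plaintext)

def decrypt_record(ciphertext: bytes, key: bytes) -> bytes:
    """Decrypt a record when an authorized component needs it."""
    return Fernet(key).decrypt(ciphertext)

if __name__ == "__main__":
    key = Fernet.generate_key()  # In practice, keys live in a KMS, not in code.
    token = encrypt_record(b"appointment_note: placeholder text", key)
    print(decrypt_record(token, key))
```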

Insecure Plugin Design

Having raised the issue of insecure plugins, let's say a little more about them.

Plugins can make or break an effective LLM. And since they're frequently embedded all across the supply chain, any weak link can create an access point for a bad actor. There's no alternative to carefully inspecting every plugin at every stage, from development to maintenance, to make sure vulnerabilities are addressed.
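
One lightweight check that fits into that inspection process is pinning each approved plugin to a reviewed hash and refusing to load anything that doesn't match. The allow-list below is hypothetical; the expected digest would come from your own review process.

```python
import hashlib
from pathlib import Path

# Hypothetical allow-list pinning each approved plugin file to a reviewed SHA-256
# digest. The file name and digest below are placeholders, not real values.
APPROVED_PLUGINS = {
    "scheduler_plugin.py": "<sha256-digest-recorded-at-review-time>",
}

def verify_plugin(path: str) -> bool:
    """Refuse to load any plugin whose contents don't match its pinned digest."""
    plugin = Path(path)
    digest = hashlib.sha256(plugin.read_bytes()).hexdigest()
    return APPROVED_PLUGINS.get(plugin.name) == digest
```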

Sensitive Information Disclosure

Have you ever gotten an email that was intended for someone else, or overheard a conversation you weren’t supposed to hear?

When it comes to LLMs, this is a significant threat. If an LLM is exposed to sensitive data that it's not supposed to see, it can inadvertently reveal confidential information in its responses, leading to privacy violations.

To prevent this, implement data sanitization and strict user policies.
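
A starting point for that sanitization is redacting obvious PII patterns before text ever reaches the model. The regexes below are illustrative only; real systems often rely on dedicated PII/PHI detection services rather than hand-written patterns.

```python
import re

# Illustrative redaction patterns only; production systems often rely on dedicated
# PII/PHI detection services rather than hand-written regexes.
REDACTIONS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "PHONE": re.compile(r"\b\d{3}[-.]\d{3}[-.]\d{4}\b"),
}

def redact_pii(text: str) -> str:
    """Replace common PII patterns with placeholders before sending text to an LLM."""
    for label, pattern in REDACTIONS.items():
        text = pattern.sub(f"[{label} REDACTED]", text)
    return text

if __name__ == "__main__":
    print(redact_pii("Reach Jane at jane.doe@example.com or 555-867-5309."))
```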

Excessive Agency 

You’ve no doubt had colleagues who overstep their authority, or had friends who try to offer advice when they are neither qualified nor experienced enough in a given area to do so (the fancy word for this is “ultracrepidarianism”).

Well, “excessive agency” is a similar concept, referring to situations in which people use LLMs beyond their appropriate scope. This can lead to a model being exposed to information it’s not supposed to see, which in turn could lead to data breaches or other problems. 

Preventing this starts with aligning security controls across every system that touches the LLM. You need robust policies detailing which data your models can access and which data is off limits, and these need to be understood by everyone using generative AI in your organization.
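
One way to encode such a policy is a simple allow-list mapping roles to the data tiers an LLM workflow may touch, checked before any request reaches the model. The roles and tiers below are invented for illustration.

```python
# Hypothetical policy mapping roles to the data tiers an LLM workflow may touch.
# Role names and tier labels are invented for illustration.
ROLE_POLICIES = {
    "clinician": {"appointments", "clinical_notes"},
    "scheduler": {"appointments"},
    "analyst": {"deidentified_metrics"},
}

def authorize(role: str, requested_tier: str) -> bool:
    """Check a request against the role's allowed tiers before it reaches the model."""
    return requested_tier in ROLE_POLICIES.get(role, set())

if __name__ == "__main__":
    assert authorize("scheduler", "appointments")
    assert not authorize("scheduler", "clinical_notes")
```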

Overreliance

While LLMs can provide tremendous value and insights, it remains necessary to keep a human in the loop to check that returned data is correct. End users might not always be checking the model’s output, and if no one else does either, that will eventually be problematic.

This necessity stems from basic facts about the way LLMs function. At the end of the day, they're enormous statistical machines that map inputs to outputs. Though they're often surprisingly capable, they have no built-in ability to question their own sources or the veracity of their own conclusions, the way a careful human can.

In particular, LLMs are well known to “hallucinate”, i.e. “make things up out of whole cloth”. You’ve no doubt already seen examples of people asking ChatGPT for a list of publications by a well-known scientist, only to get back a mix of accurate information and complete fabrications. 

We probably don’t need to point out how bad it would be if a model hallucinated illnesses, or invented non-existent patients, so be sure to have someone responsible for fact-checking your generative AI!
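
A lightweight way to keep a human in the loop is to hold any high-risk response, plus a random sample of routine ones, for manual review before release. The trigger terms, sample rate, and review queue in this sketch are arbitrary placeholders.

```python
import random

# A minimal human-in-the-loop gate. The trigger terms, sample rate, and review
# queue are hypothetical placeholders for illustration only.
REVIEW_QUEUE: list[str] = []
RISKY_TERMS = {"diagnosis", "dosage", "allergy"}  # assumed high-risk trigger words
SAMPLE_RATE = 0.1  # also spot-check 10% of routine responses

def gate_response(response: str) -> str | None:
    """Hold high-risk or randomly sampled responses for human fact-checking."""
    lowered = response.lower()
    if any(term in lowered for term in RISKY_TERMS) or random.random() < SAMPLE_RATE:
        REVIEW_QUEUE.append(response)
        return None  # A reviewer releases the response later.
    return response
```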

Model Theft

Just as a soft drink company will go to great lengths to protect the secret formula that makes its drinks so popular, you need to protect the model you've invested in, all while striking a balance between making it hard to steal and easy to use.

Encryption, obfuscation (i.e. hiding or concealing important elements of the code), and keeping a close eye out for any irregularities or unwanted visitors are your best bet.

When developing an LLM, take some time to put yourself in the shoes of a bad actor, and test out various data exfiltration methods to see how easy it is to steal or reverse engineer the LLM. This is sometimes known as developing a "security mindset", and it's an enormous asset.
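
As one concrete exercise in that mindset, note that model-extraction attacks typically require very large numbers of queries, so flagging clients with unusual query volume is a useful first signal. The threshold and log format below are made-up placeholders.

```python
from collections import Counter

# Hypothetical daily-query threshold; model extraction generally requires a very
# large number of queries, so per-client volume is a useful first warning sign.
EXTRACTION_THRESHOLD = 10_000

def flag_possible_extraction(daily_query_log: list[str]) -> list[str]:
    """Return client IDs whose query counts for the day exceed the threshold."""
    counts = Counter(daily_query_log)
    return [client for client, total in counts.items() if total > EXTRACTION_THRESHOLD]
```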

Guarding Against AI and Machine Learning Security Threats

While generative AI has the power to transform countless industries, its usage also raises a bevy of security concerns.

Despite the (largely justified) excitement and buzz surrounding LLMs, developers need to keep in mind that bad actors are seeking to exploit the technology to spread misinformation, steal valuable data, or even abscond with your intellectual property.

As you design, build, and deploy innovative generative AI solutions, securing your LLMs should come first. Learn more about how to get started with generative AI, and generative AI security, in this free eBook, Getting Started with Generative AI in Healthcare.
