Cybersecurity for Large-Scale Language Models (LLM)

Hello, I'm Ikeda, an AI engineer.

Generative AI and large-scale language models (LLM) have become a hot topic, and their use is progressing dramatically.
However, in order to reduce the risk of information leaks and take full advantage of the convenience of LLM, we need to know about the cybersecurity of applications that use LLM.

Therefore, this time, I will write an article with the aim of easily understanding the overall picture of LLM cybersecurity, in line with the OWASP Top 10 for LLM rankings.

In this article, I will briefly introduce the risks and security measures of LLM, including examples of attacks on LLM applications that I created for security verification.

Additionally, this article has been created using excerpts from lecture slides given at a seminar sponsored by PE-BANK Co., Ltd. We believe that reading this article together with the slide content will deepen your understanding, so please also refer to the lecture slides below.

[Full version] LLM security lecture slides (click to download PDF)

*The lecture slides cover a broader range of risks and countermeasures than this article, while providing content that can be understood even by those who are not familiar with LLM or cybersecurity. We also introduce examples of security incidents and ideas about cyber security for IT systems, which are a prelude to the content of this article.

table of contents

OWASP Top 10 for LLM (Top 10 Critical LLM Vulnerabilities)

"OWASP Top 10 for LLM" is the "Top 10 Critical LLM Vulnerabilities Ranking" announced by The OWASP foundation.

In line with this ranking, I will briefly introduce attack demos and accident cases that I created.

1st place: Prompt Injection (LLM01: Prompt Injection)

The following slides categorize and explain prompt injections.

We will introduce this vulnerability and its workaround through a demonstration of a prompt leak attack on a used book buying app.

The following slide explains how injection occurs by hacking the "delimiter" in the prompt that LLM recognizes.

2nd place: Insecure Output Handling (LLM02: Insecure Output Handling)

There are various dangers from not properly handling the output from LLM.

In the following slides, we used a demo attack on a computational app to illustrate the dangers of letting the server execute the output from the LLM without properly processing it.
It introduces how it is possible to exploit a vulnerability in a library called LangChain (fixed in v0.0.142) and execute arbitrary attack code, including leaking API keys.

3rd place: Learning Data Poisoning (LLM03: Training Data Poisoning)

Contamination of training data by an attacker has implications other than reducing model accuracy.

The following slides highlight the dangers of learning data poisoning through the Tay chatbot example.
If inappropriate data is learned by LLM, there is a possibility that unintended output by the service provider, such as discriminatory behavior, may occur.

4th place: Model Denial of Service (LLM04: Model Denial of Service)

There are attack methods that aim to stop services by stealing server resources by inputting and executing long texts in a loop.

5th place: Supply Chain Vulnerabilities (LLM05: Supply Chain Vulnerabilities)

AI development requires the same way of thinking about supply chains as general software development, but it also requires an understanding of the risks unique to AI and how to deal with them.

When using LLM or other basic models, it is important to be aware that if the source of the training data on which the model is based is malicious, it may be difficult to notice.

The following slides show a model in which an LLM equipped with a backdoor spreads false information such as ``The name of the first person to set foot on the moon was X,'' based on an experimental example by the security company Mithril security. We introduce the possibility of spreading.

6th place: Sensitive Information Disclosure (LLM06)

There are several possible information leakage paths when working with LLM applications.

The slides below introduce a terrifying example of how a person's name and email address learned from ChatGPT can be retrieved using a method called jailbreaking, and mentions the danger of LLM leaking sensitive information..

7th place: Insecure Plugin Design (LLM07: Insecure Plugin Design)

The following slides introduce the possibility that sensitive information linked to plugins may be leaked by attack prompts embedded in external content such as the web.

8th place: Excessive Agency (LLM08: Excessive Agency)

There are various efforts to run LLM applications autonomously by giving them various privileges as agents, but it is important to be aware that the risks are also increasing.

The following slide page introduces an example of an LLM app with excessive privileges, such as accessing information that should not be accessible via Vector DB and executing arbitrary code.

9th place: Overreliance (LLM09: Overreliance)

It is possible for an LLM answer to seem plausible but be completely false.

The following slides introduce cyber attacks that occur by exploiting these lies. In this example, when a user executes the sample code provided by ChatGPT as is, malicious code prepared by an attacker is executed.

10th place: Model Theft (LLM10: Model Theft)

Theft of LLM knowledge, technology, and trained models is a significant security risk.

At the end

I hope this article helps you utilize your LLM. If you are interested, please use the lecture slide PDF linked at the top of this article.

Kazuki Igeta

Macnica AI Engineer Blog Related Articles