
OWASP Top 10 for LLM Applications 2025: What Changed, What's New, and What It Means for Your AI Security

Lorikeet Security Team · March 17, 2026 · 14 min read · AI Security

In August 2023, OWASP published the first-ever Top 10 for Large Language Model Applications. It was a landmark moment. For the first time, the security community had a standardized framework for discussing the unique risks that LLM-powered applications introduce. But the AI landscape moves fast. The models have gotten dramatically more capable, agentic architectures have gone from research papers to production systems, and retrieval-augmented generation has become the default pattern for enterprise AI deployments. The original list needed an update.

The OWASP Top 10 for LLM Applications 2025, published in November 2024, delivers that update with significant changes. Two entirely new categories have been added. Several entries have been renamed and expanded to reflect how the threat landscape has evolved. The ranking order has shifted based on real-world vulnerability data from hundreds of AI security assessments conducted since the original list was published. If your organization builds, deploys, or consumes LLM-powered features, this is the reference document you need to understand.

This guide walks through every entry in the 2025 list, explains what changed from the 2023 original, and provides practical guidance for securing your AI applications.


The AI Security Landscape by the Numbers

Before diving into the list itself, it is worth understanding the scale of the problem. The rapid adoption of generative AI has created a security gap that is widening faster than most organizations can close it.

- 210% increase in AI vulnerability reports on HackerOne (2024-2025)
- 540% year-over-year increase in prompt injection reports
- 32% of LLM pentest findings rated serious severity (Cobalt)
- 66% of organizations conduct regular AI security assessments, despite 98% using GenAI

These numbers paint a clear picture. Nearly every organization is using generative AI, but only two-thirds are conducting regular security assessments of those systems. Meanwhile, attackers have figured out that LLM applications are soft targets, and the volume of AI-specific vulnerability reports is growing exponentially. The gap between AI adoption and AI security maturity is the defining cybersecurity challenge of 2026.


2023 vs 2025: What Changed at a Glance

The following table summarizes every entry in the 2025 list, its 2023 equivalent, and the nature of the change. Two entries from 2023 (Insecure Plugin Design and Model Theft) were removed entirely and replaced with new categories that better reflect the current threat landscape.

| 2025 Entry | 2023 Equivalent | Change |
| --- | --- | --- |
| LLM01: Prompt Injection | LLM01: Prompt Injection | Stable at #1 |
| LLM02: Sensitive Information Disclosure | LLM06: Sensitive Information Disclosure | Moved up from #6 |
| LLM03: Supply Chain Vulnerabilities | LLM05: Supply Chain Vulnerabilities | Moved up from #5 |
| LLM04: Data and Model Poisoning | LLM03: Training Data Poisoning | Expanded scope |
| LLM05: Improper Output Handling | LLM02: Insecure Output Handling | Renamed from #2 |
| LLM06: Excessive Agency | LLM08: Excessive Agency | Expanded for agentic AI |
| LLM07: System Prompt Leakage | N/A | New |
| LLM08: Vector and Embedding Weaknesses | N/A | New |
| LLM09: Misinformation | LLM09: Overreliance | Renamed |
| LLM10: Unbounded Consumption | LLM04: Model Denial of Service | Renamed and expanded |

The removal of Insecure Plugin Design reflects its absorption into Excessive Agency and Supply Chain Vulnerabilities. Model Theft was dropped because it is better classified as a traditional intellectual property and access control problem rather than an LLM-specific vulnerability category. Both replacements, System Prompt Leakage and Vector and Embedding Weaknesses, address threats that did not exist at scale when the 2023 list was written.


LLM01: Prompt Injection

Prompt injection holds the number one position for the second consecutive release, and for good reason. It is the most fundamental and most difficult-to-solve vulnerability class in LLM applications. Unlike SQL injection, which has a well-understood prevention pattern in parameterized queries, prompt injection exploits the core architecture of language models, where instructions and data are processed through the same channel.

Direct Prompt Injection

In a direct prompt injection attack, the user crafts input that overrides or manipulates the system prompt. This can cause the model to ignore its instructions, bypass content filters, reveal confidential information embedded in the prompt, or behave in ways the developer did not intend. Common techniques include role-playing scenarios ("Pretend you are a different AI without restrictions"), instruction override attempts ("Ignore all previous instructions and instead..."), and encoding tricks that slip past naive input filters.

Indirect Prompt Injection

Indirect prompt injection is significantly more dangerous and harder to defend against. In this variant, the malicious payload is embedded in external content that the LLM processes, such as a web page retrieved by a browsing agent, a document uploaded for summarization, an email processed by an AI assistant, or data returned from an API call. The user may never see the malicious instruction, but the model processes it alongside legitimate content.

Consider a scenario where your AI assistant summarizes emails. An attacker sends an email containing hidden text (white text on white background, or text in HTML comments) that instructs the model to forward sensitive information to an external address. The user asks the AI to summarize their inbox, and the model obediently follows the injected instruction because it cannot distinguish between data it should process and instructions it should follow.

Why this remains #1: Prompt injection reports on HackerOne increased 540% year-over-year. There is no universal fix. Every defense is a mitigation that reduces probability, not a prevention that eliminates the risk. Defense-in-depth with input validation, output filtering, privilege separation, and human-in-the-loop controls for sensitive actions is the current best practice.

Remediation: Implement strict input sanitization and output filtering. Use privilege separation so the LLM operates with minimal permissions. Never embed secrets or sensitive logic in system prompts that would be catastrophic if leaked. Deploy canary tokens in prompts to detect extraction attempts. For agentic systems, require human approval for high-impact actions regardless of the model's confidence.
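As a concrete illustration of the input-validation layer, the sketch below flags inputs that match common direct-injection phrasings. The patterns are illustrative, not exhaustive, and a heuristic like this only reduces probability; production systems layer it with model-based classifiers and the other controls above.

```python
import re

# Illustrative patterns for common direct-injection phrasings.
# A pattern match is a signal for logging and review, not proof of attack.
INJECTION_PATTERNS = [
    re.compile(r"ignore\s+(all\s+)?(previous|prior|above)\s+instructions", re.I),
    re.compile(r"pretend\s+you\s+are", re.I),
    re.compile(r"reveal\s+(your\s+)?(system\s+)?prompt", re.I),
]

def flag_suspicious_input(user_input: str) -> bool:
    """Return True if the input matches a known injection phrasing."""
    return any(p.search(user_input) for p in INJECTION_PATTERNS)
```

Note that this catches only the crudest direct attacks; indirect injection hidden in retrieved content never passes through this filter at all, which is why privilege separation and human-in-the-loop controls remain essential.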


LLM02: Sensitive Information Disclosure

This entry jumped from sixth to second position, reflecting the scale of data leakage incidents observed since the original list was published. LLM applications can expose sensitive information through multiple vectors: the model may have memorized personally identifiable information (PII), API keys, or proprietary data from its training set. RAG systems may retrieve documents the user is not authorized to see. Conversation histories may be accessible to other users through shared contexts or poorly implemented session management.

The promotion to number two was driven by several high-profile incidents where customer data, internal credentials, and proprietary business information were extracted from production LLM systems. In many cases, the organizations did not realize their models had access to this data at all. The integration of LLMs with enterprise data stores through RAG, function calling, and agentic workflows has dramatically increased the blast radius of information disclosure vulnerabilities.

Remediation: Classify all data that flows through your LLM pipeline. Implement access controls at the retrieval layer so users can only access documents they are authorized to see. Sanitize training data to remove PII, credentials, and proprietary information. Apply output filtering to catch sensitive data patterns (credit card numbers, SSNs, API keys) before they reach the user. Audit conversation logs for data leakage. Implement proper session isolation so one user's context is never accessible to another.
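The output-filtering step can be sketched as a redaction pass over model responses before they reach the user. The patterns and labels below are illustrative placeholders; real deployments use vetted detectors (the credit card pattern here, for example, would need Luhn validation to cut false positives).

```python
import re

# Illustrative sensitive-data patterns; production filters use vetted detectors.
SENSITIVE_PATTERNS = {
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "credit_card": re.compile(r"\b(?:\d[ -]?){13,16}\b"),
    "api_key": re.compile(r"\b(?:sk|pk)[-_][A-Za-z0-9]{20,}\b"),
}

def redact_output(text: str) -> str:
    """Replace sensitive-looking spans in model output before display."""
    for label, pattern in SENSITIVE_PATTERNS.items():
        text = pattern.sub(f"[REDACTED {label.upper()}]", text)
    return text
```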


LLM03: Supply Chain Vulnerabilities

Supply chain risks in LLM applications extend well beyond traditional software dependencies. This category covers poisoned pre-trained models downloaded from public repositories like Hugging Face, malicious plugins and tool integrations, compromised fine-tuning datasets, vulnerable inference frameworks, and outdated model versions with known security issues.

The LLM supply chain is particularly treacherous because model files are opaque. You cannot read a .safetensors file the way you can read source code. A model that has been backdoored to respond to a specific trigger phrase will pass all standard evaluation benchmarks while containing a latent vulnerability. Similarly, the ecosystem of LLM plugins, tools, and integrations is immature from a security perspective. Many popular integrations have not undergone security review, and the package management systems for AI tooling lack the verification mechanisms that exist for traditional software packages.

Remediation: Source models only from trusted providers and verify checksums. Audit fine-tuning datasets for poisoning and data integrity. Review all third-party plugins and integrations before deployment. Maintain a software bill of materials (SBOM) that includes model versions, training data provenance, and inference framework dependencies. Pin versions and test updates in staging before production deployment. Treat model files with the same suspicion you would apply to executable code from an untrusted source.
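Checksum verification is simple to implement. The sketch below assumes the model provider publishes a SHA-256 digest alongside the model file; a mismatch should be treated as a hard failure, not a warning.

```python
import hashlib

def verify_model_checksum(model_path: str, expected_sha256: str) -> bool:
    """Compare a downloaded model file against a published SHA-256 digest.

    Reads in 1 MiB chunks so multi-gigabyte model files do not need to
    fit in memory.
    """
    digest = hashlib.sha256()
    with open(model_path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest() == expected_sha256.lower()
```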


LLM04: Data and Model Poisoning

This entry was expanded from the 2023 "Training Data Poisoning" to encompass a broader set of attack vectors. Data and Model Poisoning now covers attacks against pre-training data, fine-tuning data, embedding data used in RAG systems, and reinforcement learning from human feedback (RLHF) processes. The expanded scope reflects the reality that modern LLM applications are not just consuming pre-trained models. They are customizing them through fine-tuning, grounding them through RAG, and shaping their behavior through feedback loops.

An attacker who can influence any of these data sources can manipulate the model's behavior in targeted ways. Poisoned fine-tuning data can introduce backdoors that activate on specific inputs. Manipulated RAG data can cause the model to generate incorrect or harmful information when queried about specific topics. Compromised RLHF data can shift the model's alignment, making it more likely to comply with harmful requests.

Remediation: Validate and sanitize all training and fine-tuning data. Implement data provenance tracking so you know where every piece of training data originated. Use anomaly detection on fine-tuning datasets to identify potential poisoning attempts. For RAG systems, implement integrity checks on document stores and monitor for unauthorized modifications. Conduct adversarial testing after fine-tuning to check for introduced biases or backdoors. Maintain baseline evaluations and compare model behavior after any data pipeline change.
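One simple integrity check for a RAG document store is a content-hash baseline, sketched below under the assumption that documents are addressable by stable IDs. Any ID whose hash has changed, or that appears without going through the authorized ingestion path, warrants investigation.

```python
import hashlib

def snapshot_store(documents: dict) -> dict:
    """Record a content hash per document ID in the RAG store."""
    return {doc_id: hashlib.sha256(text.encode()).hexdigest()
            for doc_id, text in documents.items()}

def detect_tampering(documents: dict, baseline: dict) -> list:
    """Return IDs whose content changed, or that appeared, since the baseline."""
    current = snapshot_store(documents)
    return sorted(doc_id for doc_id, h in current.items()
                  if baseline.get(doc_id) != h)
```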


LLM05: Improper Output Handling

Renamed from "Insecure Output Handling," this category addresses what happens when LLM output is passed to downstream systems without proper validation or sanitization. Because LLMs generate free-form text, their output can contain anything: SQL statements, JavaScript code, shell commands, HTML, or serialized objects. If this output is passed to a web browser, a database query, an API call, or a system command without sanitization, the result is a classic injection vulnerability with the LLM as the attack vector.

This risk is amplified in agentic architectures where LLM output directly drives tool execution. If an agent uses the model's output to construct database queries, file system operations, or API calls, a prompt injection that manipulates the model's output becomes a server-side injection attack. The LLM acts as an unwitting proxy, converting a prompt injection into SQL injection, command injection, or cross-site scripting.

Remediation: Treat all LLM output as untrusted input. Apply the same sanitization and validation you would apply to user input before passing LLM output to any downstream system. Use parameterized interfaces for database queries and API calls constructed from model output. Implement output encoding appropriate to the context (HTML encoding for web display, shell escaping for commands). Restrict the format and structure of LLM output using structured generation techniques where possible.
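Two of these controls, context-appropriate output encoding and parameterized queries, can be sketched in a few lines of Python. The `orders` table and its columns are hypothetical.

```python
import html
import sqlite3

def render_model_output(text: str) -> str:
    """HTML-encode model output before embedding it in a web page."""
    return html.escape(text)

def lookup_order(conn: sqlite3.Connection, order_id: str):
    """Pass the model-extracted value as a bound parameter, never by
    concatenating it into the SQL string."""
    return conn.execute(
        "SELECT status FROM orders WHERE id = ?", (order_id,)
    ).fetchone()
```

Because the value is bound rather than concatenated, an injected payload like `42 OR 1=1` is treated as a literal ID and simply matches nothing.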


LLM06: Excessive Agency

This entry received the most significant expansion in the 2025 update, reflecting the explosive growth of agentic AI systems. In 2023, Excessive Agency primarily addressed the risk of giving LLMs too many permissions or too many tools. The 2025 version expands the scope to cover autonomous decision-making, multi-step action chains, agent-to-agent delegation, and the compounding risk that occurs when an AI system can take real-world actions without adequate human oversight.

Agentic AI systems can book travel, send emails, execute code, modify databases, deploy infrastructure, and interact with external services. When these systems have excessive agency, a single prompt injection or model hallucination can trigger a cascade of real-world consequences. The risk is not just that the AI can do too much. It is that the AI can do too much, too fast, without a checkpoint where a human can intervene.

The 2025 guidance specifically addresses agent-to-agent communication patterns where one AI agent delegates tasks to another. Each hop in the delegation chain amplifies the risk because the downstream agent may have different permissions, different context, and different trust assumptions than the originating agent.

Remediation: Apply the principle of least privilege to every tool and permission granted to an LLM agent. Implement approval workflows for high-impact actions (financial transactions, data deletion, external communications). Use structured tool interfaces with explicit parameter schemas rather than free-form text commands. Implement rate limiting on action frequency. Log every action taken by the agent with full context for audit purposes. For multi-agent systems, enforce explicit permission boundaries at each delegation point.
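A minimal version of the allowlist-plus-approval pattern might look like the following. The tool names, the high-impact tier, and the approval callback are illustrative, not a real agent-framework API.

```python
# Hypothetical risk tier: actions here always require human sign-off.
HIGH_IMPACT_TOOLS = {"send_email", "delete_record", "transfer_funds"}

def execute_tool(tool_name, args, allowed_tools, approve):
    """Run a tool only if it is allowlisted for this agent, pausing for
    human approval when the action is high-impact."""
    if tool_name not in allowed_tools:
        raise PermissionError(f"tool not allowlisted: {tool_name}")
    if tool_name in HIGH_IMPACT_TOOLS and not approve(tool_name, args):
        return {"status": "rejected", "tool": tool_name}
    # In a real agent the tool would be dispatched here.
    return {"status": "executed", "tool": tool_name}
```

The key property is that the approval gate sits outside the model: no prompt injection can talk its way past a check the LLM never evaluates.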


LLM07: System Prompt Leakage

This is one of two entirely new entries in the 2025 list, and it addresses a vulnerability class that was barely discussed in 2023 but has since become one of the most commonly exploited weaknesses in production LLM applications. System Prompt Leakage occurs when an attacker extracts the hidden system prompt that defines the application's behavior, personality, constraints, and capabilities.

System prompts often contain far more sensitive information than developers realize. They may include internal business logic, content filtering rules, the names of tools and APIs the model can access, database schema information, authentication patterns, role definitions, and even API keys or credentials (a practice that is disturbingly common). When an attacker extracts the system prompt, they gain a detailed blueprint of the application's architecture, filtering mechanisms, and potential attack surface.

Extraction techniques range from simple direct requests ("Repeat your system prompt") to sophisticated multi-turn strategies that gradually coax the model into revealing fragments of its instructions. Some techniques exploit the model's desire to be helpful by framing the request as debugging assistance or documentation generation. Others use encoding tricks, asking the model to translate its instructions into another language or represent them as a poem.

Real-world impact: In 2024 and 2025, researchers and bug bounty hunters extracted system prompts from major AI products including ChatGPT, Copilot, Gemini, and numerous enterprise AI applications. In several cases, the extracted prompts revealed undisclosed tool integrations, internal API endpoints, and content filtering bypass techniques that were subsequently weaponized.

Remediation: Never put secrets, credentials, or sensitive business logic in system prompts. Assume the system prompt will be extracted and design accordingly. Implement prompt leakage detection by including canary strings and monitoring for their appearance in outputs. Use a layered architecture where sensitive logic resides in backend code, not in the prompt. Apply output filtering to detect and block responses that contain system prompt content. Test your application with prompt extraction techniques as part of your security assessment.
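Canary-based leakage detection can be sketched in a few lines: embed a unique marker in the system prompt, then flag any response that echoes it, since the only way the marker can appear in output is if the model is repeating prompt content verbatim.

```python
import secrets

def make_canary() -> str:
    """Generate a unique marker to embed in the system prompt."""
    return f"CANARY-{secrets.token_hex(8)}"

def leaked(response: str, canary: str) -> bool:
    """True if the response echoes the canary from the system prompt."""
    return canary in response
```

A matched canary should trigger blocking of the response and an alert; a determined attacker may still extract paraphrased instructions, so this detects verbatim leakage only.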


LLM08: Vector and Embedding Weaknesses

The second new entry addresses the security implications of retrieval-augmented generation, which has become the dominant pattern for building enterprise LLM applications. RAG systems use vector databases to store document embeddings and retrieve relevant context based on semantic similarity. This architecture introduces an entirely new attack surface that the 2023 list did not cover.

Vector and Embedding Weaknesses encompass several attack categories. Poisoned embeddings involve injecting malicious documents into the vector store that, when retrieved, manipulate the model's behavior. Adversarial document injection crafts documents specifically designed to be semantically similar to target queries, ensuring they are retrieved and processed by the model. Embedding inversion attacks attempt to reconstruct the original text from stored embeddings, potentially exposing sensitive information. Access control bypass occurs when the vector database does not enforce the same access permissions as the source document system, allowing users to query for and retrieve documents they should not have access to.

The access control problem deserves particular attention. Many RAG implementations index documents from sources with complex permission models (SharePoint, Google Drive, Confluence) into a flat vector database that has no concept of per-user permissions. The result is that any user who can query the RAG system can potentially retrieve content from documents they cannot access through the original system.

Remediation: Implement access control at the retrieval layer that mirrors the permissions of the source document system. Validate and sanitize documents before indexing them in the vector store. Monitor the vector database for unauthorized additions or modifications. Use metadata filtering to restrict retrieval based on user permissions, classification level, or document source. Test for embedding inversion vulnerabilities. Implement anomaly detection on retrieval patterns to identify potential poisoning or extraction attacks.
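The retrieval-layer access control can be sketched as a post-retrieval permission filter. It assumes the indexing pipeline copies each document's ACL into chunk metadata as an `allowed_groups` field; that is a design choice this sketch imposes, not a built-in vector-database feature.

```python
def filter_by_permission(results, user_groups):
    """Drop retrieved chunks the user cannot read in the source system.

    Each result is assumed to carry an `allowed_groups` metadata field
    copied from the source document's ACL at indexing time.
    """
    return [r for r in results
            if set(r["metadata"]["allowed_groups"]) & set(user_groups)]
```

Filtering by metadata inside the vector store query itself is preferable when the database supports it, since chunks the user cannot read then never leave the store at all.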


LLM09: Misinformation

Renamed from "Overreliance" in the 2023 list, this entry now focuses explicitly on the risk that LLM applications generate false, misleading, or fabricated information that users or downstream systems treat as factual. The renaming reflects a shift in understanding: the problem is not just that users trust LLMs too much, but that LLMs produce convincing misinformation that is structurally indistinguishable from accurate information.

Hallucination remains the primary driver of this risk. LLMs generate plausible-sounding text that may contain fabricated citations, invented statistics, nonexistent legal precedents, or incorrect technical specifications. When these outputs are consumed by automated systems, published in reports, or used to make business decisions, the consequences can range from embarrassment to legal liability to physical harm.

The 2025 guidance specifically addresses the risk of misinformation in automated content pipelines, where LLM-generated content is published without human review. It also addresses the weaponization of LLM hallucination, where attackers deliberately craft inputs that maximize the probability of the model generating specific false claims.

Remediation: Implement fact-checking layers for high-stakes outputs. Use RAG with verified source documents to ground model responses in factual data. Display confidence indicators and source citations alongside LLM-generated content. Require human review for published content, legal analysis, medical advice, and financial guidance. Establish clear disclaimers that content is AI-generated. Monitor for patterns of hallucination in production and use them to improve prompts and retrieval quality.
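One cheap grounding check, assuming the pipeline tracks which document IDs the retriever actually returned, is to flag citations in the answer that point at documents that were never retrieved, a common hallucination signature:

```python
def ungrounded_citations(response_citations, retrieved_ids):
    """Return cited document IDs that were never actually retrieved."""
    return sorted(set(response_citations) - set(retrieved_ids))
```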


LLM10: Unbounded Consumption

Renamed from "Model Denial of Service," this entry broadens the scope from simple availability attacks to encompass all forms of uncontrolled resource consumption in LLM applications. The new framing reflects the reality that modern LLM abuse goes beyond crashing a service. It includes inference cost attacks that run up API bills, token exhaustion through crafted inputs that maximize processing time, and resource abuse through automated query flooding.

Inference costs for large language models are orders of magnitude higher than traditional API endpoints. A single complex prompt can cost several cents to process. An attacker who can submit thousands of expensive prompts can inflict significant financial damage without ever causing a service outage. This economic dimension of LLM abuse was underrepresented in the 2023 list's "Denial of Service" framing.

The 2025 guidance also addresses resource consumption through legitimate but excessive use, such as users who build automation on top of your LLM application without your knowledge, or internal processes that consume far more inference capacity than budgeted. Without proper monitoring and controls, costs can spiral before anyone notices.

Remediation: Implement rate limiting on API endpoints that invoke LLM inference. Set per-user and per-organization usage quotas. Monitor inference costs in real time and alert on anomalous spending. Limit input token lengths and reject prompts that exceed reasonable bounds. Implement timeout controls on inference requests. Use caching to reduce redundant inference calls. Set hard budget caps on API spending with automatic cutoffs.
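Several of these controls (request rate limits, token budgets, sliding windows) can be combined into a per-user guard that runs before any inference call. The limits below are illustrative; tune them to your model's pricing and latency profile.

```python
import time
from collections import defaultdict

class UsageGuard:
    """Per-user request and token budget over a sliding window."""

    def __init__(self, max_requests=30, max_tokens=50_000, window_s=60):
        self.max_requests = max_requests
        self.max_tokens = max_tokens
        self.window_s = window_s
        self.events = defaultdict(list)  # user -> [(timestamp, tokens)]

    def allow(self, user: str, tokens: int, now=None) -> bool:
        """Return True and record the request if it fits both budgets."""
        now = time.monotonic() if now is None else now
        recent = [(t, n) for t, n in self.events[user]
                  if now - t < self.window_s]
        self.events[user] = recent
        if len(recent) >= self.max_requests:
            return False
        if sum(n for _, n in recent) + tokens > self.max_tokens:
            return False
        recent.append((now, tokens))
        return True
```

A guard like this caps both the request count and the total tokens a single user can burn per window, which addresses the cost-attack dimension as well as simple flooding.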


Building an AI Security Program

Understanding the OWASP Top 10 for LLM Applications is the starting point, not the destination. Translating this knowledge into an effective AI security program requires a structured approach that addresses people, processes, and technology.

Step 1: Inventory Your AI Assets

You cannot secure what you do not know exists. Catalog every LLM integration in your organization, including shadow AI usage where employees are using third-party AI tools with company data. Document the models, data sources, integrations, and permissions for each system.

Step 2: Assess Your Risk Exposure

Map each AI asset against the OWASP Top 10 for LLM. Identify which categories represent the highest risk for your specific implementations. A customer-facing chatbot grounded in RAG has a different risk profile than an internal code generation tool. Prioritize based on data sensitivity, user exposure, and action capability.

Step 3: Conduct AI-Specific Penetration Testing

Traditional web application penetration testing does not cover the LLM-specific attack surface. You need testers who understand prompt injection techniques, RAG poisoning, system prompt extraction, agent boundary testing, and the unique output handling risks that LLMs introduce. A comprehensive AI security assessment should cover all ten categories in the OWASP LLM Top 10.

Step 4: Implement Continuous Monitoring

AI applications are dynamic. Model behavior can shift with updates, fine-tuning, or changes to the underlying data. Implement ongoing monitoring for prompt injection attempts, unusual output patterns, data leakage indicators, and cost anomalies. Treat AI security as a continuous practice, not a one-time assessment.

Step 5: Train Your Development Team

Developers building LLM-powered features need to understand these risks at a technical level. The OWASP Top 10 for LLM should be part of your secure development training, alongside the traditional OWASP Top 10 for web applications. Secure coding practices for AI applications differ meaningfully from traditional application security, and your team needs to know the difference.


What This Means for Your Organization

The OWASP Top 10 for LLM Applications 2025 is not an academic exercise. It is a practical framework that reflects real vulnerabilities being exploited in production AI systems right now. With 98% of organizations using generative AI and only 66% conducting regular AI security assessments, the gap between adoption and security maturity represents an enormous exposure for most companies.

The two new entries, System Prompt Leakage and Vector and Embedding Weaknesses, are particularly important because they target the exact patterns that most enterprise AI deployments rely on. If you have a RAG-based application, LLM08 applies directly to you. If your application uses a system prompt (and virtually all do), LLM07 is a vulnerability you need to test for. The expansion of Excessive Agency to cover agentic AI is equally critical as more organizations deploy autonomous AI agents with real-world action capabilities.

The bottom line is straightforward. If your last security assessment did not include AI-specific testing, you have blind spots that attackers are already probing. The OWASP Top 10 for LLM Applications 2025 gives you the framework. What you do with it is up to you.

Secure Your AI Applications

Our AI and LLM penetration testing methodology covers every category in the OWASP Top 10 for LLM Applications 2025, from prompt injection and system prompt extraction to RAG poisoning and agentic boundary testing.


Lorikeet Security Team

Penetration Testing & Cybersecurity Consulting

We've completed 170+ security engagements across web apps, APIs, cloud infrastructure, and AI-generated codebases. Everything we publish here comes from patterns we see in real client work.
