What is LLM red teaming?+
LLM red teaming is an adversarial evaluation process in which a security team attempts to exploit vulnerabilities in a language model or the system wrapping it: prompt injections, jailbreaks, system prompt leaks, privilege escalation via tools, and multi-turn attacks. The goal is to identify reproducible failures before a real attacker does, and to deliver prioritised remediation.
What is prompt injection in AI systems?+
Prompt injection is an attack in which an attacker introduces malicious instructions into an LLM's input to alter its behaviour. In direct injection, the attacker manipulates the user prompt; in indirect injection, malicious content arrives through external sources processed by the model — websites, documents, search results. In agentic systems with access to tools, a successful injection can trigger destructive actions on internal systems.
What vulnerabilities do AI agents have?+
According to OWASP GenAI Top 10 2025, the main risks in agentic systems include: prompt injection (LLM01), insecure output handling (LLM02), excessive agency (LLM06), over-reliance on the model, and sensitive information disclosure. Agents with access to internal APIs are especially vulnerable to privilege escalation and unauthorised action execution if they lack minimal permissions, human oversight, and action traceability.
How do you secure an AI chatbot or agent against attacks?+
Securing a production AI system requires multiple layers: input and output validation, robust system instructions, minimal permissions for the agent over external tools, content guardrails, context segregation in RAG pipelines, full traceability of model actions, and periodic review of attack vectors. A single guardrail is not enough; security must be built into the architecture from design, not bolted on afterwards.
What is the OWASP LLM Top 10?+
The OWASP LLM Top 10 is a reference framework published by OWASP (Open Web Application Security Project) that lists the ten most critical security risks in large language model applications. The GenAI 2025 version covers vulnerabilities such as prompt injection, training data poisoning, system prompt leakage, excessive agency, and model supply chain vulnerabilities. It is the industry-standard reference for assessing the security posture of AI systems.