An adaptive LLM security assessment framework for authorised red teams.
Burp-Suite-style intruder for Large Language Model applications — with adaptive intelligence, 633+ curated payloads, session replay, and evidence-grade reporting.
What is LLM-Intruder?
LLM-Intruder is an open-source framework for systematically assessing the security of Large Language Model (LLM) applications — chatbots, copilots, RAG systems, AI agents, MCP tool servers, and any application that exposes an LLM to users.
It combines the breadth of a curated attack library (49 catalogues, 633+ payloads, 22 mutation strategies, 20 encoding techniques) with the depth of an adaptive hunting loop that learns from each response. You point it at a target — a web chat UI, an OpenAI-compatible API, a Burp Suite request — and it probes, mutates, and reports.
Purpose
Find bypass conditions in LLM applications before attackers do:
- Prompt injection and jailbreak vulnerabilities
- System-prompt / instruction leakage
- Cross-tenant RAG retrieval boundary failures
- MCP tool-poisoning and agent misuse
- Markdown / image-based data exfiltration (EchoLeak class)
- PII and sensitive-data leakage
- Output-handling vulnerabilities (XSS, SSRF, SQLi, RCE via LLM)
- Defense-specific bypasses (Azure Prompt Shield, Llama Guard, Constitutional AI, OpenAI Moderation)
Features at a glance
- 🎯 5 run modes — Campaign (broad sweep), Hunt (adaptive), Pool-Run (concurrent), Probe (single-shot), RAG-Test (cross-tenant).
- 🌐 Web + API targets — Drive a real Chromium browser via Playwright, or fire raw HTTP requests with a Burp-imported template.
- 🧠 Adaptive intelligence — 4 togglable modules: TombRaider, Burn Detection, AutoAdv Temperature, Defense Fingerprint.
- 📚 633+ curated payloads across 49 catalogues, updatable from internet sources with one click.
- 🔄 22 mutation strategies + 20 encoding techniques with tri-state selection (All / Subset / None).
- 🔐 Session replay — record a login once, reuse it for every payload automatically.
- 🖱️ Interactive picker — Burp-style element selection for complex sites where auto-detect fails.
- 📦 Burp Suite import — paste a saved HTTP request, get an adapter YAML.
- 🤖 9 LLM providers supported for attacker + judge (Ollama, LM Studio, OpenAI, Anthropic, Gemini, Grok, OpenRouter, Heuristic, Auto).
- 📊 Evidence-grade reports — Markdown / HTML / JSON / SARIF (GitHub Advanced Security).
- 🖥️ Web dashboard with live WebSocket progress + CLI for CI / headless use.
- 💾 Local-first — everything stored in a per-project SQLite DB. No telemetry.
Comments
Post a Comment