Specialist Agents

Why Specialists Beat Generalists

A single generalist agent attempting to cover all vulnerability classes produces broad but shallow results. It might identify an SQL injection pattern in Semgrep output but lack the focused reasoning to trace how that injection could be chained with a privilege escalation found in the access control layer. Specialist agents concentrate their full reasoning capacity on a single security domain, going deeper rather than wider.

Paladin deploys 9 specialist agents (plus a generalist used only at the Recon tier). Each specialist has a distinct system prompt that confines its analysis to specific vulnerability classes, so each domain gets focused attention rather than a share of a generalist's.

The 9 Specialists

1. Web App Agent

Focus areas: XSS, CSRF, injection, session management, input validation

The Web App Agent analyzes ZAP, Nuclei, and Nikto output for web-layer vulnerabilities. Its system prompt instructs it to focus on cross-site scripting, cross-site request forgery, SQL injection, command injection, session management flaws, and input validation bypasses. When it finds cross-domain issues (such as a web misconfiguration that exposes infrastructure details), it posts leads to the relevant specialist.

2. API Security Agent

Focus areas: IDOR, auth flaws, rate limiting, GraphQL, REST misconfiguration

The API Agent examines ZAP and FFUF output for API-specific issues: insecure direct object references, broken authentication on API endpoints, missing rate limiting, GraphQL introspection and injection vulnerabilities, and REST API misconfigurations. API endpoints discovered by FFUF (such as /api/docs, /graphql, or /swagger) receive focused analysis.

3. Infrastructure Agent

Focus areas: Open ports, service misconfiguration, outdated software, cloud exposure

The Infrastructure Agent works primarily with Nmap, OpenVAS, and Subfinder output. It identifies open ports that should not be publicly accessible, services running outdated or vulnerable software versions, cloud infrastructure misconfigurations, and network-level attack vectors. An open management port or an unpatched SSH service often becomes the starting point for a deeper attack chain.

4. Code Analysis Agent

Focus areas: SAST findings, secrets, dependency vulnerabilities

The Code Agent is only active when source code was provided for the pentest. It analyzes Semgrep SAST findings for security antipatterns (SQL injection via string concatenation, insecure deserialization, missing input sanitization), Gitleaks results for exposed secrets (API keys, passwords, tokens), and Grype SCA results for vulnerable dependencies. This agent bridges the gap between source code weaknesses and runtime exploitability.

5. Crypto/TLS Agent

Focus areas: Weak ciphers, certificate issues, key management

The Crypto Agent specializes in cryptographic and TLS security using TestSSL output. It evaluates cipher suite configuration, identifies deprecated protocol support (TLS 1.0/1.1), checks for known TLS vulnerabilities (BEAST, POODLE, Heartbleed, LUCKY13), and assesses certificate validity, chain completeness, and key strength. It also reviews HSTS configuration and certificate transparency.

6. Auth/Access Control Agent

Focus areas: Authentication bypass, privilege escalation, broken access control

The Auth Agent focuses on authentication and authorization vulnerabilities. It examines login endpoints for bypass techniques, checks for privilege escalation paths, identifies broken access control patterns (such as IDOR on resource endpoints), and evaluates session fixation risks. This agent cross-references ZAP findings about cookie flags and HTTPX findings about authentication headers.

7. Business Logic Agent

Focus areas: Race conditions, workflow bypass, data integrity

The Business Logic Agent addresses vulnerability classes that automated tools often miss. Race conditions in concurrent requests, workflow bypass (skipping payment steps, reusing one-time tokens), price manipulation, and data integrity violations all require reasoning about how an application should behave, not just what patterns exist in its responses. The agent's system prompt explicitly directs it to look for these logic flaws.

8. Supply Chain Agent

Focus areas: Dependency risks, third-party vulnerabilities, component security

The Supply Chain Agent examines the security of external dependencies and third-party components. It works with Grype SCA data on known CVEs in libraries, Semgrep findings about insecure usage of third-party APIs, and HTTPX technology fingerprinting to identify client-side dependencies with known vulnerabilities. Subresource integrity and software composition are its primary concerns.

9. AI/LLM Security Agent

Focus areas: Prompt injection, indirect injection, sensitive data leakage, model extraction, embedding manipulation, agentic limits, toxic output, training-data exposure

The AI/LLM Security Agent specializes in the security of AI-backed applications. It hunts for direct and indirect prompt injection, system-prompt leakage, sensitive data leakage in model output, unsafe handling of model output, embedding and vector manipulation, model extraction, excessive agency, and supply-chain risk in AI infrastructure. It maps findings to the OWASP AI Testing Guide, the OWASP LLM Top 10, and MITRE ATLAS technique IDs, and is most relevant when the target exposes an LLM-backed feature such as a chatbot or AI agent endpoint.
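The nine roles above can be summarized as a small registry. The following is a minimal sketch, not Paladin's actual data model: the Specialist class, its field names, and the active_specialists helper are hypothetical. The focus areas and tool names are copied from the descriptions on this page, and a tool list is left empty where the page names no specific tool. The only activation rule modeled is the one stated above, that the Code Agent runs only when source code was provided.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Specialist:
    """Hypothetical record for one specialist; field names are assumptions."""
    label: str
    focus_areas: tuple[str, ...]
    primary_tools: tuple[str, ...] = ()  # tools this page names for the agent
    requires_source: bool = False        # True only for the Code Analysis Agent


SPECIALISTS = (
    Specialist("Web App Agent",
               ("XSS", "CSRF", "injection", "session management", "input validation"),
               ("ZAP", "Nuclei", "Nikto")),
    Specialist("API Security Agent",
               ("IDOR", "auth flaws", "rate limiting", "GraphQL", "REST misconfiguration"),
               ("ZAP", "FFUF")),
    Specialist("Infrastructure Agent",
               ("open ports", "service misconfiguration", "outdated software", "cloud exposure"),
               ("Nmap", "OpenVAS", "Subfinder")),
    Specialist("Code Analysis Agent",
               ("SAST findings", "secrets", "dependency vulnerabilities"),
               ("Semgrep", "Gitleaks", "Grype"),
               requires_source=True),
    Specialist("Crypto/TLS Agent",
               ("weak ciphers", "certificate issues", "key management"),
               ("TestSSL",)),
    Specialist("Auth/Access Control Agent",
               ("authentication bypass", "privilege escalation", "broken access control"),
               ("ZAP", "HTTPX")),
    Specialist("Business Logic Agent",
               ("race conditions", "workflow bypass", "data integrity")),
    Specialist("Supply Chain Agent",
               ("dependency risks", "third-party vulnerabilities", "component security"),
               ("Grype", "Semgrep", "HTTPX")),
    Specialist("AI/LLM Security Agent",
               ("prompt injection", "indirect injection", "sensitive data leakage",
                "model extraction", "embedding manipulation", "agentic limits",
                "toxic output", "training-data exposure")),
)


def active_specialists(source_code_provided: bool) -> list[Specialist]:
    """Drop specialists that need source code when none was supplied."""
    return [s for s in SPECIALISTS
            if source_code_provided or not s.requires_source]
```

Calling active_specialists(False) leaves out the Code Analysis Agent and keeps the other eight. How many of the remaining agents actually run at a given tier is decided elsewhere and is not modeled here.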

System Prompt Architecture

All specialists share a common prompt structure:

Identity statement - "You are Paladin, an elite penetration testing AI agent operating as a specialist in a multi-agent orchestration."
Role and label - The specific specialist role and human-readable label.
Role-specific suffix - A paragraph directing focus to the agent's vulnerability domain and instructing it to post leads for cross-domain issues.
Assignment - The orchestrator's specific instructions for this pentest, which may vary based on what the target looks like.
Tool outputs - The complete Phase 1 data.
Reporting tools - The post_finding and post_lead tool definitions for reporting findings and leads.

This layered prompt architecture means adding a new specialist is straightforward: define a new role with focus areas and a system prompt suffix, and the existing orchestration infrastructure handles everything else.
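The layering can be sketched as simple string assembly. This is an illustration under assumptions, not Paladin's implementation: the SpecialistRole class, the build_system_prompt function, the section headings, and the example suffix text are all hypothetical. Only the identity statement and the post_finding and post_lead tool names come from this page, and the real tool definitions are not reproduced here.

```python
from dataclasses import dataclass

# Identity statement quoted from this page.
IDENTITY = ("You are Paladin, an elite penetration testing AI agent "
            "operating as a specialist in a multi-agent orchestration.")

# Only the tool names are known from this page; their schemas are not shown.
REPORTING_TOOLS = ("post_finding", "post_lead")


@dataclass(frozen=True)
class SpecialistRole:
    """Hypothetical container for the per-specialist parts of the prompt."""
    role: str    # machine-readable role name (assumed format)
    label: str   # human-readable label
    suffix: str  # role-specific paragraph, including the post-leads instruction


def build_system_prompt(spec: SpecialistRole, assignment: str, tool_outputs: str) -> str:
    """Join the six layers in the order listed above."""
    sections = [
        IDENTITY,
        f"Role: {spec.role} ({spec.label})",
        spec.suffix,
        f"Assignment:\n{assignment}",
        f"Tool outputs:\n{tool_outputs}",
        "Reporting tools: " + ", ".join(REPORTING_TOOLS),
    ]
    return "\n\n".join(sections)


# Adding a specialist only requires a new role definition (example text is invented).
crypto = SpecialistRole(
    role="crypto_tls",
    label="Crypto/TLS Agent",
    suffix=("Focus on weak ciphers, certificate issues, and key management. "
            "Post leads for issues outside this domain."),
)
prompt = build_system_prompt(crypto, "Assess the TLS setup.", "testssl: (Phase 1 output)")
```

Because the shared layers live in one function, a new specialist in this sketch is just another SpecialistRole value, which mirrors the point made above about extensibility.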

The Generalist Agent

At the Recon tier (1 agent), the system deploys a single generalist rather than a specialist. The generalist's system prompt instructs it to analyze all Phase 1 tool outputs and produce a comprehensive security assessment covering all vulnerability classes. While it lacks the depth of specialists, it still provides validated, reasoned analysis that goes beyond the raw output of the Phase 1 tools.

How Specialists Collaborate

Specialists do not communicate directly. Instead, they post findings and leads to the shared blackboard. The Web Agent, for example, does not send a message to the API Agent. It posts a lead on the blackboard that says "Found reflected input on /api/search endpoint - API Agent should check for injection." The API Agent picks up this lead on its next loop iteration and investigates.

This blackboard pattern keeps agents loosely coupled. The Web Agent does not need to know whether the API Agent is running, how many agents are active, or what tier the pentest uses. It simply posts leads for any specialist that might benefit, and whichever agents are active will pick them up.
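The lead hand-off described above can be sketched with a minimal in-memory blackboard. Everything in this example is an assumption made for illustration: the Blackboard and Lead classes, the role strings, the target_role field, and the per-role cursor are hypothetical, and the post_lead method only borrows its name from the reporting tool mentioned on this page, not its real signature.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Lead:
    posted_by: str    # role of the agent that posted the lead
    target_role: str  # role the lead is addressed to (assumed field)
    note: str


class Blackboard:
    """Hypothetical shared store; agents never call each other directly."""

    def __init__(self) -> None:
        self._leads: list[Lead] = []
        self._cursor: dict[str, int] = {}  # per-role index of the next unread lead

    def post_lead(self, posted_by: str, target_role: str, note: str) -> None:
        # The poster does not check whether the target agent is running.
        self._leads.append(Lead(posted_by, target_role, note))

    def new_leads_for(self, role: str) -> list[Lead]:
        """Return leads addressed to `role` that arrived since its last poll."""
        start = self._cursor.get(role, 0)
        self._cursor[role] = len(self._leads)
        return [lead for lead in self._leads[start:] if lead.target_role == role]


board = Blackboard()
board.post_lead(
    "web_app", "api",
    "Found reflected input on /api/search endpoint - API Agent should check for injection.",
)

# On its next loop iteration, the API Agent polls the blackboard.
api_leads = board.new_leads_for("api")
```

If no API Agent is active in a given run, the lead in this sketch simply stays on the board unread, and the posting agent is unaffected.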
