AI Agents & System Prompts
What is an AI Agent?
An AI agent is a software system powered by a large language model (LLM) that can perceive its environment, reason about what it observes, make decisions, and take actions to achieve a goal. Unlike a simple chatbot that responds to one prompt at a time, an agent operates in a loop: observe, think, act, observe the result, think again, act again.
In the context of pentesting, an AI agent does not just read scan output and summarize it. It reads the output, identifies interesting patterns, decides what to investigate next, runs additional tools or crafts exploits, observes the results, and continues until it has thoroughly tested the target.
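The observe-think-act loop described above can be sketched in a few lines of Python. Everything here is illustrative: `Observation`, `plan_next_action`, and the Redis example are assumptions for the sake of the sketch, not TurboPentest internals.

```python
from dataclasses import dataclass

@dataclass
class Observation:
    source: str   # e.g. "nmap", "zap", "blackboard", "tool"
    content: str

def plan_next_action(history):
    """Stand-in for the LLM call that reasons over the observations so far."""
    last = history[-1].content
    if "port 6379/tcp open" in last:
        # Hypothesis: unauthenticated Redis -> try to confirm it.
        return ("run_tool", "redis-cli -h target ping")
    return ("stop", None)

def agent_loop(initial: Observation, max_steps: int = 5):
    history = [initial]
    for _ in range(max_steps):
        action, arg = plan_next_action(history)
        if action == "stop":
            break
        # Acting produces a new observation, which feeds the next think step.
        history.append(Observation("tool", f"ran: {arg} -> PONG"))
    return history

steps = agent_loop(Observation("nmap", "port 6379/tcp open redis"))
```

The loop terminates either when the planner decides there is nothing left to investigate or when a step budget runs out; a chatbot, by contrast, would stop after the first response.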
The Anatomy of an AI Agent
Every AI agent in TurboPentest's P4L4D1N system has four core capabilities:
1. Perception
The agent receives information about its environment. For a pentesting agent, this includes:
- Phase 1 tool output (Nmap ports, ZAP findings, Nuclei templates matched, etc.)
- Blackboard entries from other agents (findings, leads, status updates)
- Results from tools the agent itself has run
- Source code analysis results (in white-box mode)
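A sketch of how those perception inputs might be merged into a single context for the agent's next reasoning step. The section titles mirror the list above; the function and its layout are assumptions, not the real P4L4D1N schema.

```python
def build_context(tool_output, blackboard_entries, own_results, source_hits):
    """Flatten the agent's perception inputs into one prompt-ready string."""
    sections = [
        ("Phase 1 tool output", tool_output),
        ("Blackboard entries from other agents", blackboard_entries),
        ("Results from this agent's own tools", own_results),
        ("Source code analysis (white-box mode)", source_hits),
    ]
    parts = []
    for title, items in sections:
        if items:  # omit empty sections to save context-window tokens
            parts.append(title + ":\n" + "\n".join(f"- {i}" for i in items))
    return "\n\n".join(parts)

ctx = build_context(
    ["nmap: 80/tcp open http"],
    ["lead: Redis unauthenticated on 6379"],
    [],
    [],
)
```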
2. Reasoning
The LLM at the agent's core can analyze what it perceives and form hypotheses. When the Web App Agent sees a reflected input parameter, it reasons: "This input is reflected in the HTML response without encoding. I should test whether various XSS payloads execute in this context. The Content-Security-Policy header allows inline scripts, which means script injection is likely viable."
This is fundamentally different from a scanner that simply pattern-matches "input reflected in output = possible XSS." The agent understands context, considers defenses, and plans its approach.
3. Tool Use
Agents can execute tools to act on their reasoning. P4L4D1N agents can:
- Run security tools (Nmap, Subfinder, WhatWeb, Schemathesis)
- Navigate web applications using a built-in browser
- Execute command-line exploits
- Craft and send HTTP requests with custom payloads
- Analyze source code files
Tool use is what transforms reasoning into validated findings. An agent that reasons "this might be vulnerable to SQL injection" can then craft and execute injection payloads to confirm or refute that hypothesis.
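A minimal sketch of a tool-execution step, assuming an allowlist-based dispatcher; the allowlist, the helper names, and the probe payload are illustrative, and a real agent framework would add sandboxing, timeouts, and scope controls on top.

```python
import shlex
import subprocess

ALLOWED_TOOLS = {"nmap", "whatweb", "subfinder"}

def run_tool(command: str) -> str:
    """Execute an allowlisted CLI tool and return its stdout."""
    argv = shlex.split(command)
    if argv[0] not in ALLOWED_TOOLS:
        raise ValueError(f"tool not in allowlist: {argv[0]}")
    proc = subprocess.run(argv, capture_output=True, text=True, timeout=300)
    return proc.stdout

def sqli_probe(base_url: str) -> dict:
    """Describe a crafted request testing a SQL injection hypothesis.

    A single-quote probe: a database error in the response would
    support the hypothesis, a normal response would weigh against it.
    """
    return {"url": base_url, "params": {"id": "1'"}}
```

This is the confirm-or-refute step from the paragraph above: the agent does not stop at "might be vulnerable" but issues a concrete probe and observes the result.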
4. Memory and Communication
Agents maintain context throughout their analysis session and communicate with other agents via the blackboard. When the Infrastructure Agent discovers that port 6379 (Redis) is open and unauthenticated, it posts a lead to the blackboard: "Redis instance on port 6379 accepts connections without authentication — API agents should check if session tokens are stored here."
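The blackboard interaction above can be sketched as a shared list of typed entries that agents post to and poll. The entry kinds and field names are assumptions about P4L4D1N's internals.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Entry:
    kind: str      # "finding", "lead", or "status"
    author: str
    text: str

@dataclass
class Blackboard:
    entries: List[Entry] = field(default_factory=list)

    def post(self, kind: str, author: str, text: str) -> None:
        self.entries.append(Entry(kind, author, text))

    def leads_for(self, keyword: str) -> List[Entry]:
        """Leads mentioning a keyword, so specialists can filter by domain."""
        return [e for e in self.entries
                if e.kind == "lead" and keyword.lower() in e.text.lower()]

bb = Blackboard()
bb.post("lead", "infrastructure",
        "Redis instance on port 6379 accepts connections without "
        "authentication -- API agents should check if session tokens "
        "are stored here")

# Later, an API agent polls the board for leads relevant to its domain:
relevant = bb.leads_for("redis")
```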
System Prompts: Shaping Agent Behavior
A system prompt is a set of instructions given to an AI agent before it begins working. It defines the agent's role, expertise, focus areas, and behavioral constraints. Think of it as the agent's job description and training manual combined.
How System Prompts Work
In P4L4D1N, each specialist agent receives a role-specific system prompt. For example, the Web App Agent's prompt tells it to focus on XSS, CSRF, injection, session management, and input validation. The Auth/Access Agent's prompt directs it toward authentication bypass, privilege escalation, and broken access control.
The system prompt does not just list topics; it shapes how the agent thinks about its task. A well-crafted role-specific prompt achieves several things:
- Specialization — The agent concentrates its analysis where it is most effective
- Depth over breadth — Rather than superficially checking everything, it goes deep on its domain
- Collaboration — It is explicitly instructed to share cross-domain discoveries with other agents
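A hypothetical fragment in the style of such a prompt, composed here for illustration from the focus areas and behaviors described above (the actual P4L4D1N prompts are not shown in this document):

```python
# Hypothetical system-prompt fragment; not the real Web App Agent prompt.
WEB_APP_AGENT_PROMPT = """\
You are a web application security specialist. Focus on XSS, CSRF,
injection, session management, and input validation.

Go deep rather than broad: when an input is reflected, test payloads
appropriate to its HTML, JavaScript, or attribute context, and note
any CSP or output-encoding defenses you had to account for.

If you discover something outside your domain (for example, an
exposed internal service), post it to the blackboard as a lead for
the relevant specialist rather than pursuing it yourself.
"""
```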
Why System Prompts Matter
The same underlying LLM (Claude Sonnet 4.6) powers every P4L4D1N agent. What makes the Web App Agent different from the Infrastructure Agent is their system prompt. This is a powerful concept: by carefully crafting prompts, you can create a team of specialists from a single foundation model.
System prompts also establish behavioral boundaries:
- What to test and what to skip
- How to prioritize severity
- When to post findings vs. leads to the blackboard
- How to format output for the structured report
Prompt Engineering in Security
Writing effective system prompts for security agents requires domain expertise. A prompt that says "find all vulnerabilities" will produce shallow, unfocused results. A prompt that says "focus on authentication bypass techniques including credential stuffing, session fixation, JWT manipulation, OAuth misconfiguration, and password reset flow abuse" produces a targeted, thorough analysis.
The quality of TurboPentest's system prompts directly affects the quality of its pentests. This is why prompt versioning and systematic improvement are critical — each revision is tested against benchmarks to ensure it improves (or at least maintains) detection rates.
Autonomy Levels
Not all agents need the same level of autonomy. P4L4D1N uses a tiered approach:
- Recon tier (1 generalist agent): Lower autonomy, broad coverage, shorter duration
- Standard tier (4 specialists): Moderate autonomy, domain-focused analysis
- Deep tier (10 specialists): Full autonomy with all 8 specialist domains covered
- Blitz tier (20 agents): Maximum autonomy with depth passes, exploit chain analysis, and verification
Higher tiers do not just add more agents — they add more sophisticated agent types. The Blitz tier includes an Exploit Chain Agent that reads all other agents' findings and specifically looks for multi-step attack paths, plus a Verification Agent that double-checks severity ratings and PoC reproducibility.
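The tier structure above could be encoded as configuration along these lines; the agent counts come from the text, while the dict layout and key names are just an illustrative encoding.

```python
# Tier definitions as described above; field names are illustrative.
TIERS = {
    "recon":    {"agents": 1,  "autonomy": "low",
                 "notes": "broad coverage, shorter duration"},
    "standard": {"agents": 4,  "autonomy": "moderate",
                 "notes": "domain-focused specialists"},
    "deep":     {"agents": 10, "autonomy": "full",
                 "notes": "all 8 specialist domains covered"},
    "blitz":    {"agents": 20, "autonomy": "maximum",
                 "notes": "depth passes, exploit chain analysis, "
                          "verification"},
}

def pick_tier(name: str) -> dict:
    if name not in TIERS:
        raise KeyError(f"unknown tier: {name}")
    return TIERS[name]
```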
The Role of the LLM
P4L4D1N is powered by Claude Sonnet 4.6 via the Anthropic API. The LLM provides the reasoning engine for every agent. It is what allows agents to:
- Understand natural-language tool output
- Generate hypotheses about potential vulnerabilities
- Craft context-appropriate exploit payloads
- Write human-readable finding descriptions and remediation guidance
- Correlate disparate pieces of evidence into coherent attack narratives
The choice of LLM matters because pentesting requires sophisticated reasoning about security concepts, code analysis, and creative exploit development. Claude's capabilities in code understanding, multi-step reasoning, and structured output generation make it well-suited for this domain.