
Bug Bounty Automation: The Complete Guide to AI-Powered Security Research
How to build a multi-agent bug bounty automation system with evidence-gated progression, zero false positives, and a learning layer that improves with every scan. The full architecture from 3 months of production use.
My first automated bug bounty scan found 47 “critical” vulnerabilities.
I submitted 12 reports. Every single one was a false positive.
The program I targeted now knows my name. Not in a good way.
That specific embarrassment is what made me rebuild everything from scratch. Not a faster scanner. Not a better scanner. A fundamentally different approach to what automation should and shouldn’t do in security research.
This guide is the result: a complete system for bug bounty automation that actually works in production.
What Bug Bounty Automation Actually Is (and Isn’t)
Bug bounty automation is not a script that finds vulnerabilities for you.
That framing leads directly to 47 false positive submissions and a wrecked reputation.
What it actually is: a system that handles the mechanical parts of security research — reconnaissance, asset discovery, initial scanning — while keeping humans in control of the decision that matters most: what to submit.
The best automation makes you a more effective researcher. It doesn’t replace your judgment. It amplifies it.
What automation handles well:
- Subdomain enumeration across certificate transparency logs
- Technology fingerprinting at scale
- Running known payload patterns against hundreds of endpoints simultaneously
- Tracking which findings have been validated vs. just detected
- Generating properly formatted reports for each platform’s requirements
What automation handles poorly:
- Novel vulnerability classes that don’t match existing patterns
- Context-aware exploitation (is this XSS actually exploitable in this specific app context?)
- Deciding whether a finding is worth a researcher’s reputation
- Anything that requires reading the room on a specific target
Understanding this division is more important than any technical decision you’ll make.
The Core Architecture: 4 Agents, One Orchestrator
After rebuilding the system twice, I landed on the architecture that works: a 4-agent pipeline coordinated by a central orchestrator.
Orchestrator (Claude Opus)
├── Recon Agents (parallel)
├── Testing Agents (max 4 concurrent)
├── Validation Agent (single, evidence-gated)
└── Reporter Agent (platform-specific formatters)

The orchestrator is a project manager, not a worker. It distributes tasks, manages rate limit budgets, detects agent failures, and persists session state between runs. It never touches an endpoint directly.
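A minimal sketch of that dispatch pattern, with illustrative names (the task queue, budget field, and state dictionary are my assumptions, not the real implementation):

```python
import queue

# Sketch of the orchestrator's dispatch loop. It only routes work, tracks a
# shared rate budget, and records agent failures -- it never probes a target.
class Orchestrator:
    def __init__(self, rate_budget=60):
        self.tasks = queue.Queue()       # work items destined for agents
        self.rate_budget = rate_budget   # requests/min shared across agents
        self.session_state = {}          # persisted between runs in practice

    def dispatch(self, agent_pool):
        while not self.tasks.empty():
            task = self.tasks.get()
            agent = agent_pool.get(task["kind"])
            if agent is None:
                continue
            try:
                result = agent.run(task, budget=self.rate_budget)
                self.session_state[task["id"]] = {"status": "done", "result": result}
            except Exception as exc:     # failure detection: log, don't crash
                self.session_state[task["id"]] = {"status": "failed", "error": str(exc)}
```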
Recon Agents
Recon runs in parallel across multiple discovery methods:
- Subdomain enumeration via certificate transparency (crt.sh, Censys)
- Technology fingerprinting with httpx to identify frameworks, servers, CDNs
- JavaScript analysis for hidden endpoints, API keys in source, internal route paths
- GraphQL introspection where applicable
All discovered assets feed into a shared SQLite database. Recon agents never block each other — if subdomain enum hits a rate limit, JavaScript analysis keeps running.
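That shared-database pattern can be sketched like this, assuming a simple `assets` table (the schema, function names, and host lists are placeholders for the real discovery methods):

```python
import sqlite3
import threading

# Recon agents feed one shared SQLite table; a lock serializes writes so
# independent discovery threads never block on each other's rate limits.
def init_db(path=":memory:"):
    db = sqlite3.connect(path, check_same_thread=False)
    db.execute("CREATE TABLE IF NOT EXISTS assets (host TEXT PRIMARY KEY, source TEXT)")
    return db

def record_assets(db, lock, hosts, source):
    with lock:
        db.executemany(
            "INSERT OR IGNORE INTO assets VALUES (?, ?)",
            [(h, source) for h in hosts],
        )
        db.commit()

def run_recon(db):
    lock = threading.Lock()
    # stand-ins for crt.sh lookups, httpx fingerprinting, JS analysis, etc.
    jobs = [
        ("ct-logs", ["api.example.com", "dev.example.com"]),
        ("js-analysis", ["internal.example.com"]),
    ]
    threads = [
        threading.Thread(target=record_assets, args=(db, lock, hosts, src))
        for src, hosts in jobs
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
```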
Testing Agents
Testing agents take the recon output and probe for vulnerabilities. I cap these at 4 concurrent to avoid triggering WAFs or rate limits.
What they test:
- IDOR: multi-account replay of authenticated requests
- XSS: payload injection with response diff analysis
- SQL injection: error-based and time-based patterns
- SSRF: metadata service probing, internal network access
- Authentication issues: token fixation, session handling edge cases
Each testing agent handles one vulnerability class. Failure is isolated — if the IDOR agent crashes, XSS testing continues unaffected.
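The 4-concurrent cap with per-agent failure isolation might look like this with an `asyncio.Semaphore` (agent and endpoint names are placeholders; a real probe replaces the `sleep`):

```python
import asyncio

MAX_CONCURRENT = 4  # cap that avoids tripping WAFs and rate limits

async def run_agent(name, endpoint, sem):
    async with sem:                  # at most 4 agents probing at once
        try:
            await asyncio.sleep(0)   # stand-in for the actual vulnerability probe
            return (name, endpoint, "done")
        except Exception as exc:     # isolate failures: one crash, not a cascade
            return (name, endpoint, f"failed: {exc}")

async def run_all(endpoints):
    sem = asyncio.Semaphore(MAX_CONCURRENT)
    agents = ["idor", "xss", "sqli", "ssrf", "auth"]  # one class per agent
    tasks = [run_agent(a, e, sem) for a in agents for e in endpoints]
    return await asyncio.gather(*tasks)
```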
Validation Agent: The Most Important Part
Here’s the thing most bug bounty automation gets wrong: detection is not exploitation.
My payload appearing in a response means nothing. It might be in an error log that’s never rendered, in an HTML attribute that’s properly escaped, on a WAF block page, or in a JSON response that’s never interpreted as HTML.
The Validation Agent’s only job is to disprove findings.
The evidence gate process:
Every finding gets a confidence score on a 0.0–1.0 scale, set by the initial detection (around 0.3 for most). To advance to human review, a finding must reach 0.85 or higher.
- Baseline capture: Normal request with innocuous input. Record response headers, body length, content type.
- PoC execution: Same request with malicious payload in a sandboxed environment.
- Response diff analysis: Not “does the response contain my payload?” but “does the response differ from baseline in an exploitable way?”
- False positive signature matching: Known-harmless patterns get auto-dismissed.
If the PoC succeeds and diff analysis confirms exploitability, confidence rises to 0.85+ and the finding is queued for human review.
If the PoC fails, confidence drops. The finding goes to a weekly batch review rather than being discarded.
This is adversarial validation. The agent is trying to kill findings. Findings that survive are credible.
Since implementing this: 0 false positives submitted across 3 months.
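The gate logic reduces to something like this (the 0.3 starting score and 0.85 threshold come from the article; the field names, signature strings, and diff heuristics are my illustrative assumptions):

```python
REVIEW_THRESHOLD = 0.85

# known-harmless patterns get auto-dismissed (illustrative signatures)
FALSE_POSITIVE_SIGNATURES = ["waf block page", "payload echoed in json string"]

def validate(finding, baseline, poc_response):
    conf = finding.get("confidence", 0.3)  # typical initial detection score

    if any(sig in poc_response["body"].lower() for sig in FALSE_POSITIVE_SIGNATURES):
        return {"status": "dismissed", "confidence": 0.0}

    # diff against baseline: an exploitable difference, not mere payload presence
    exploitable_diff = (
        poc_response["content_type"] != baseline["content_type"]
        or abs(len(poc_response["body"]) - len(baseline["body"])) > 512
    )
    if exploitable_diff and poc_response.get("poc_succeeded"):
        conf = max(conf, REVIEW_THRESHOLD)
        return {"status": "human_review", "confidence": conf}

    # failed or unconfirmed PoC: lower confidence, keep for weekly batch review
    return {"status": "batch_review", "confidence": conf * 0.5}
```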
Reporter Agent
Once a finding clears human review and gets approved, the Reporter Agent handles formatting. Every platform has different submission requirements. I built a unified findings model plus platform-specific formatters — write the finding once, output to HackerOne, Intigriti, or Bugcrowd format automatically.
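A sketch of the write-once model with per-platform formatters (the field names and output shapes are guesses, not the platforms' actual API schemas):

```python
from dataclasses import dataclass

# One canonical finding, rendered per platform by small formatter functions.
@dataclass
class Finding:
    title: str
    severity: str
    asset: str
    steps: str

def format_hackerone(f: Finding) -> dict:
    return {"title": f.title, "severity_rating": f.severity.lower(),
            "impact": f.steps, "asset": f.asset}

def format_bugcrowd(f: Finding) -> dict:
    return {"caption": f.title, "vrt_priority": f.severity,
            "description": f"{f.asset}\n\n{f.steps}"}

FORMATTERS = {"hackerone": format_hackerone, "bugcrowd": format_bugcrowd}

def render(finding: Finding, platform: str) -> dict:
    return FORMATTERS[platform](finding)
```

Adding a platform means adding one formatter, not rewriting findings.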
The Learning Layer: SQLite RAG
The piece I didn’t plan but won’t remove.
Every time an agent hits a rate limit, gets banned, or has a finding dismissed, it logs that to a SQLite database with semantic embeddings. Before running against a new target, the orchestrator queries this database — “have we seen this stack before? what broke?”
After 3 months of data, the system meaningfully avoids mistakes it’s already made. That wasn’t in the original design. I added it after watching the system make the same rate-limit mistake on three targets in a row. The fourth target, it slowed down automatically. That was the moment I stopped thinking of this as a script.
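A simplified stand-in for that lookup, using a toy letter-count embedding and cosine similarity in place of real semantic embeddings and sqlite-vec (everything here is illustrative, not the production code):

```python
import math
import pickle
import sqlite3

def init(path=":memory:"):
    db = sqlite3.connect(path)
    db.execute("CREATE TABLE incidents (note TEXT, vec BLOB)")
    return db

def embed(text):  # toy bag-of-letters vector, purely illustrative
    vec = [0.0] * 26
    for ch in text.lower():
        if ch.isalpha():
            vec[ord(ch) - 97] += 1
    return vec

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def log_incident(db, note):
    # every rate limit, ban, or dismissed finding becomes a searchable record
    db.execute("INSERT INTO incidents VALUES (?, ?)", (note, pickle.dumps(embed(note))))

def similar_incidents(db, query, k=3):
    # "have we seen this stack before? what broke?"
    q = embed(query)
    rows = db.execute("SELECT note, vec FROM incidents").fetchall()
    scored = sorted(rows, key=lambda r: cosine(q, pickle.loads(r[1])), reverse=True)
    return [note for note, _ in scored[:k]]
```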
The Human-in-the-Loop Gate
Full automation for security research is wrong.
Not in a theoretical sense. Wrong in a “your reputation will be destroyed” sense.
Finding cleared by Validation Agent (confidence 0.85+)
↓
Human review queue (checked once per day)
↓
[APPROVE] → Reporter Agent formats + submits
[DISMISS] → Logged with reason, updates false positive signatures
[INVESTIGATE] → Flagged for manual testing

Every submission passes through my eyes before it goes to a program. Non-negotiable.
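The gate reduces to a small state machine (the three decisions come from the flow above; function and field names are mine):

```python
VALID_DECISIONS = {"approve", "dismiss", "investigate"}

def review(finding, decision, reason=None):
    # no path to submission exists without an explicit human decision
    if decision not in VALID_DECISIONS:
        raise ValueError(f"unknown decision: {decision}")
    if decision == "approve":
        return {"next": "reporter_agent", "submit": True}
    if decision == "dismiss":
        # dismissal reason feeds the false-positive signature list
        return {"next": "fp_signatures", "submit": False, "reason": reason}
    return {"next": "manual_testing", "submit": False}
```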
Tools and Stack
- Orchestration: Claude Opus (orchestrator), Claude Haiku (testing agents)
- Recon: httpx, subfinder, amass, crt.sh API
- Testing: Custom Python agents per vulnerability class, Playwright for JS analysis
- Validation: Docker sandboxed execution, custom response diff library
- Storage: SQLite with sqlite-vec for semantic search
- Platform integration: HackerOne API, Intigriti API, Bugcrowd API
- Infrastructure: VPS ($40/mo) — not serverless, you need persistent state
- Total monthly cost: ~$180 ($40 VPS + ~$140 Claude API)
What I’d Do Differently
Start with the Validation Agent, not the scanner. The scanner is interesting. The validation layer is what actually matters. Build it first.
Cap concurrent agents at 4 from day one. Started with 10. Got IP-banned from 3 programs in two weeks.
Build the human review queue before anything else. The moment you can submit without a gate is the moment you will. Build the gate first.
Accept that it won’t make you rich quickly. This system makes you roughly 3.5x more effective. That’s the actual value proposition.
Current Results (3 Months In)
- 12 active programs being monitored
- ~30 findings surfaced for human review per week
- ~4-6 submitted after review
- 0 false positives submitted
- ~$180/month running cost
- ~3.5x throughput increase vs. manual research
The 5-Part Deep Dive
This is the architecture overview. Each component has its own detailed breakdown:
Part 1: Why Multi-Agent Architecture — The decision to use 4 specialized agents and how evidence-gated progression works in practice.
Part 2: Cutting False Positives with Response Diff Analysis — The full Validation Agent, why detection isn’t exploitation, and how response diff analysis catches what signature matching misses.
Part 3: The Learning System — How the SQLite RAG layer works and how the system improves over time.
Part 4: Multi-Platform Integration — Unified findings model, platform-specific formatters for HackerOne, Intigriti, and Bugcrowd.
Part 5: Why I Added a Mandatory Human Gate — The operational and reputational case for keeping humans in the submission loop.
Building something similar? The hardest part is the validation layer. Start there — everything else is just plumbing.
FAQ
What is bug bounty automation?
Bug bounty automation uses software to handle the repetitive parts of vulnerability research — subdomain discovery, technology fingerprinting, initial scanning, and report formatting. The goal is higher research throughput, not replacing human judgment on what to submit.
Does bug bounty automation actually work?
Yes, with the right architecture. Systems that fail use automation for the entire pipeline including submission. Systems that work treat automation as a force multiplier for human researchers, with mandatory review gates before anything reaches a program.
What tools do I need for bug bounty automation?
Core tools: httpx and subfinder for recon, an LLM orchestrator (Claude works well), a sandboxed environment for PoC execution, and SQLite for state management. Platform APIs for HackerOne, Intigriti, and Bugcrowd enable programmatic submission after human review.
How do I reduce false positives in bug bounty automation?
Response diff analysis instead of payload presence detection. Evidence-gated progression where findings must have proof of exploitability before advancing. False positive signature matching for known-harmless patterns. And a mandatory human review gate before anything gets submitted.
How much does bug bounty automation cost to run?
My current system costs around $180 per month — about $40 for a VPS and $140 for Claude API costs. Progressive context loading cuts those API costs significantly. Without it the system would cost roughly $350 per month.
What is the best LLM for bug bounty automation?
Claude Opus for orchestration (complex decision-making, failure recovery) and Claude Haiku for testing agents (fast, cheap, good enough for pattern matching). Match capability to task rather than using one model for everything.
Is automated bug bounty hunting ethical?
Yes, when confined to authorized programs within scope, with rate limiting that respects program infrastructure, and with human review before submission. Scanning targets outside programs or ignoring scope is not ethical regardless of automation.
How long does it take to build a bug bounty automation system?
The basic pipeline takes about 3 weeks to build. The validation layer and learning system take another 4 weeks to get right. Budget 6 to 8 weeks for a production-ready system.
Sources & Further Reading
Sources
- OWASP Web Security Testing Guide — baseline methodology for structured vulnerability validation.
- OWASP Top Ten — vulnerability classification framework used by recon agents.
- MITRE CWE — vulnerability type identifiers used in reporter agent output.
Further Reading
- I Built a Semi-Autonomous Bug Bounty System: Here's the Full Architecture — how I built a multi-agent bug bounty hunting system with evidence-gated progression, RAG-enhanced learning, and safety mechanisms that keep humans in the loop.
- I Built an AI-Powered Bug Bounty System: Here's Everything That Happened — why I chose multi-agent architecture over monolithic scanners, and how evidence-gated progression keeps findings honest. Part 1 of 5.
- I Built a Multi-Agent SaaS Builder: Here's the Full Architecture — deep dive into MicroSaaSBot's multi-agent architecture: Researcher, Architect, Developer, and Deployer agents working in sequence to ship SaaS products.