The intelligence of a hacker. The patience of a machine.
Pentest Genie is an autonomous penetration tester. It thinks like an attacker, chains real exploits, and verifies every finding with the same tools your red team uses.
tools wielded by a single agent

Scanners match patterns. Attackers don't.
Manual pentests are slow and expensive: six figures and six weeks for a snapshot in time. Scanners are fast but shallow, blind to chained logic flaws and authenticated paths. Pentest Genie is what happens when the LLM is the hacker, not a wrapper around a scanner.
Four ideas that make it work.
LLM as the operator
The model picks what to test, in what order, based on what it just learned. Not a scanner with an LLM bolted on — the LLM is the loop.
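As a rough illustration of "the LLM is the loop", the sketch below shows a decide-act-observe cycle in which each action is chosen from everything learned so far. It is not Pentest Genie's implementation: `run_operator_loop`, `toy_policy`, and `toy_execute` are hypothetical names, and the hard-coded policy stands in for a model call.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Observation:
    action: str
    result: str


def run_operator_loop(
    choose_next: Callable[[List[Observation]], Optional[str]],
    execute: Callable[[str], str],
    max_steps: int = 10,
) -> List[Observation]:
    """Repeatedly ask the policy for the next action, given all prior results."""
    history: List[Observation] = []
    for _ in range(max_steps):
        action = choose_next(history)
        if action is None:  # the policy decides there is nothing left to try
            break
        history.append(Observation(action, execute(action)))
    return history


def toy_policy(history: List[Observation]) -> Optional[str]:
    # Hypothetical stand-in for the model: react to the latest observation.
    if not history:
        return "fingerprint"
    last = history[-1]
    if last.action == "fingerprint" and "login form" in last.result:
        return "test_auth"
    return None


def toy_execute(action: str) -> str:
    # Canned results in place of real tool invocations.
    return {"fingerprint": "found login form", "test_auth": "no bypass"}[action]


history = run_operator_loop(toy_policy, toy_execute)
```

The point of the shape is that the order of tests is not fixed up front; the second action exists only because of what the first one returned.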
Multi-agent coordinator
A planner decomposes the target into specialists — SQLi, auth bypass, IDOR — running in parallel with a shared budget and ledger.
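One way to picture specialists sharing a budget and a ledger is the sketch below. It assumes a simple unit-based budget guarded by a lock; `SharedBudget`, `Ledger`, and `coordinate` are hypothetical names for illustration, not the product's API, and the specialists do no real testing.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class SharedBudget:
    """A pool of abstract cost units that all specialists draw from."""

    def __init__(self, total_units: int) -> None:
        self._remaining = total_units
        self._lock = threading.Lock()

    def try_spend(self, units: int) -> bool:
        with self._lock:
            if units > self._remaining:
                return False
            self._remaining -= units
            return True

    @property
    def remaining(self) -> int:
        with self._lock:
            return self._remaining


class Ledger:
    """Append-only record that every specialist writes to."""

    def __init__(self) -> None:
        self._entries = []
        self._lock = threading.Lock()

    def record(self, specialist: str, note: str) -> None:
        with self._lock:
            self._entries.append((specialist, note))

    def entries(self):
        with self._lock:
            return list(self._entries)


def run_specialist(name, budget, ledger, steps, cost_per_step=1):
    done = 0
    for _ in range(steps):
        if not budget.try_spend(cost_per_step):
            ledger.record(name, "stopped: budget exhausted")
            break
        done += 1  # a real specialist would run one test here
    ledger.record(name, f"completed {done} steps")
    return done


def coordinate(specialists, total_budget, steps_each):
    budget, ledger = SharedBudget(total_budget), Ledger()
    with ThreadPoolExecutor(max_workers=len(specialists)) as pool:
        futures = [
            pool.submit(run_specialist, name, budget, ledger, steps_each)
            for name in specialists
        ]
        results = [f.result() for f in futures]
    return results, budget, ledger


results, budget, ledger = coordinate(["sqli", "auth_bypass", "idor"], 10, 5)
```

Because spending is checked and deducted under a single lock, the specialists can never collectively exceed the total, however their steps interleave.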
Chain-mode exploitation
A confirmed finding triggers the next move. The agent pivots from a leaked credential to a full privilege escalation, on its own.
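The chaining idea can be sketched as a queue in which only confirmed steps enqueue their follow-ups. The `FOLLOW_UPS` table and step names here are invented for illustration; in the product the next move is chosen by the agent, not a static lookup.

```python
from collections import deque
from typing import Callable, List

# Hypothetical mapping from a confirmed step to the moves it unlocks.
FOLLOW_UPS = {
    "leaked_credential": ["authenticated_login"],
    "authenticated_login": ["privilege_escalation"],
}


def run_chain(initial: str, attempt: Callable[[str], bool]) -> List[str]:
    """Try each step; only a confirmed step unlocks the next one."""
    queue = deque([initial])
    confirmed: List[str] = []
    while queue:
        step = queue.popleft()
        if attempt(step):
            confirmed.append(step)
            queue.extend(FOLLOW_UPS.get(step, []))
    return confirmed


full_chain = run_chain("leaked_credential", lambda step: True)
broken_chain = run_chain("leaked_credential", lambda s: s != "authenticated_login")
```

In the second call the login attempt fails, so privilege escalation is never tried: the chain extends only as far as the evidence does.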
Verified, not claimed
Browser-confirmed XSS. OOB-callback-confirmed blind SQLi. Code that ran, not patterns that matched.
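A minimal sketch of out-of-band confirmation, assuming each probe embeds a unique token in a callback hostname and a finding counts as verified only if that exact token is later seen. `CallbackLog`, `new_probe`, and the domain `oob.example.test` are placeholders, not real Pentest Genie components.

```python
import secrets


class CallbackLog:
    """In-memory stand-in for a server that records incoming OOB callbacks."""

    def __init__(self) -> None:
        self._seen = set()

    def record_hit(self, token: str) -> None:
        self._seen.add(token)

    def was_hit(self, token: str) -> bool:
        return token in self._seen


def new_probe(callback_domain: str):
    # A fresh random token per probe ties a callback to exactly one payload.
    token = secrets.token_hex(8)
    return token, f"{token}.{callback_domain}"


def verify_finding(token: str, log: CallbackLog) -> str:
    return "verified" if log.was_hit(token) else "unverified"


log = CallbackLog()
token, callback_host = new_probe("oob.example.test")
before = verify_finding(token, log)
log.record_hit(token)  # simulates the target calling back
after = verify_finding(token, log)
```

A pattern match alone would leave the finding at "unverified"; only the callback, which requires the payload to have actually executed, promotes it.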
Watch the hack unfold.
20 event types streamed live over WebSocket. Strictly monotonic sequence numbers per scan. Operators see every phase transition, every mission, every finding the agent decides to keep — in real time.
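A client consuming such a stream could check the per-scan ordering guarantee with something like the sketch below. The field names `scan_id` and `seq` are assumptions; the actual event schema is not shown here.

```python
from typing import Dict, Iterable, Mapping


def check_sequence(events: Iterable[Mapping]) -> Dict[str, int]:
    """Raise if any scan's sequence numbers fail to strictly increase."""
    last_seen: Dict[str, int] = {}
    for event in events:
        scan, seq = event["scan_id"], event["seq"]
        if scan in last_seen and seq <= last_seen[scan]:
            raise ValueError(
                f"scan {scan}: seq {seq} does not follow {last_seen[scan]}"
            )
        last_seen[scan] = seq
    return last_seen


# Events from two scans may interleave; ordering is checked per scan.
sample = [
    {"scan_id": "a", "seq": 1, "type": "phase_transition"},
    {"scan_id": "b", "seq": 1, "type": "mission"},
    {"scan_id": "a", "seq": 2, "type": "finding"},
]
last = check_sequence(sample)
```

Strictly increasing numbers let a client detect duplicated or out-of-order events after a reconnect without comparing payloads.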
What changes when the pentester is autonomous.
Hours, not weeks
A target that a manual engagement would spend two weeks just scoping is fully covered before the kickoff call.
Cost-capped
Set a budget. The agent respects it and terminates gracefully when it hits the cap. No surprise bills.
Bug-bounty ready
Reports export to HackerOne and Bugcrowd formats out of the box — with reproducible PoC scripts attached.
Real exploit chains
Custom Python proofs, not just CVE references. The artifacts that come out are the artifacts you'd submit.
The hardest pentest benchmark in the industry. In progress.
Public benchmarks for autonomous pentesters don't exist yet. So we're building one — a suite of black-box targets across web, API, and infrastructure, each with a known-best human result.
Built for teams that take offense.
Rolling access for bug bounty hunters, red teams, and AppSec leaders. Tell us what you want to test.