YAML-Driven Pentest: Reproducible AI Scans for Complex Auth + 2FA Targets
Pentestas Team
Security Analyst

The YAML is the scan. Any engineer can run it. Any CI job can run it. 2FA is handled automatically.
Pentestas ships a YAML config format that encodes the entire scan — target, login flow, 2FA secret, scope rules, source-code repo — in one file you can commit to your repository and run from anywhere.
In Detail
The file
description: "Rails e-commerce, PostgreSQL, Devise auth"
authentication:
  login_type: form
  login_url: "https://app.example.com/login"
  credentials:
    username: "audit@example.com"
    password: "***"
    totp_secret: "LB2E2RX7XFHSTGCK" # optional Base32 seed for 2FA
  login_flow:
    - "Type $username into the email field"
    - "Type $password into the password field"
    - "Click the 'Sign In' button"
    - "Enter $totp in the code field"
    - "Click Verify"
  success_condition:
    type: url_contains
    value: "/dashboard"
rules:
  avoid:
    - description: "Skip logout endpoint"
      type: path
      url_path: "/logout"
    - description: "No DELETE operations on user API"
      type: path
      url_path: "/api/v1/users/*"
  focus:
    - description: "Prioritise the checkout flow"
      type: path
      url_path: "/api/checkout"
source_code:
  repo_url: "https://github.com/acme/ecommerce.git"
pipeline:
  retry_preset: default
  max_concurrent_pipelines: 5
Fewer than forty lines of YAML encode a fully specified scan: a form login with 2FA, a multi-step login flow expressed in natural language, scope rules, source-code-aware mode, and AI pipeline concurrency. All of it is reproducible, versionable, and diff-able.
TOTP 2FA — the killer feature
Almost every modern target has 2FA on the admin account. Most pentest tools handle this poorly:
- **Cookie paste**: you log in manually, paste the post-2FA cookie. Works until the session expires, usually mid-scan.
- **Skip the admin scan**: you don't pentest the admin area. You miss most of the high-value attack surface.
- **Manual restart**: the scanner halts on a 2FA prompt and waits for a human to type a code. Good luck running this unattended.
Pentestas handles 2FA by generating codes inline. When your YAML's login_flow contains the $totp placeholder, Pentestas:
- Reads the Base32 totp_secret from the credentials block.
- Generates a live RFC 6238 TOTP code when the login step fires (via pyotp internally).
- Substitutes the code into the step.
- Drives the step through the authenticated crawler / embedded browser.
The scan runs unattended through 2FA. Multiple re-logins over a multi-hour scan are handled the same way — fresh TOTP each time.
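To make the code-generation step concrete, here is a minimal sketch of RFC 6238 TOTP using only the Python standard library. Pentestas uses pyotp internally according to the text above; this is not that code, just the same algorithm written out. The function name `totp` and its parameters are illustrative assumptions.

```python
import base64
import hashlib
import hmac
import struct
import time
from typing import Optional


def totp(secret_b32: str, at: Optional[float] = None,
         digits: int = 6, period: int = 30) -> str:
    """Return the RFC 6238 TOTP code (HMAC-SHA1) for a Base32 seed."""
    # Authenticator setup screens often show the seed with spaces or in lower case.
    cleaned = secret_b32.replace(" ", "").upper()
    # b32decode needs the input padded to a multiple of 8 characters.
    cleaned += "=" * (-len(cleaned) % 8)
    key = base64.b32decode(cleaned)

    # The moving factor is the number of whole periods since the Unix epoch.
    counter = int((time.time() if at is None else at) // period)
    digest = hmac.new(key, struct.pack(">Q", counter), hashlib.sha1).digest()

    # Dynamic truncation (RFC 4226): the low nibble of the last byte is the offset.
    offset = digest[-1] & 0x0F
    value = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(value % 10 ** digits).zfill(digits)
```

Calling `totp("LB2E2RX7XFHSTGCK")` at the moment a login step fires yields the six-digit string that would replace `$totp`. Because the code depends only on the seed and the clock, every re-login during a long scan gets a fresh value with no human involved.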
Getting the TOTP secret
When you set up 2FA on your target's admin account, the provider shows a QR code plus a Base32 "manual entry" string. That string is the totp_secret; paste it into the YAML. As with the rest of Pentestas's CA and TLS handling, the secret is encrypted at rest with your tenant's Fernet key.
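If you only have the QR code, its decoded content is usually an `otpauth://` provisioning URI that carries the same Base32 string in a `secret` parameter. The sketch below pulls it out. It assumes your provider follows that common URI format, and the helper name `secret_from_otpauth` is hypothetical.

```python
from urllib.parse import parse_qs, urlparse


def secret_from_otpauth(uri: str) -> str:
    """Extract the Base32 seed from an otpauth://totp/... provisioning URI."""
    parsed = urlparse(uri)
    if parsed.scheme != "otpauth" or parsed.netloc != "totp":
        raise ValueError("not a TOTP provisioning URI")
    secrets = parse_qs(parsed.query).get("secret")
    if not secrets:
        raise ValueError("URI has no secret parameter")
    # Normalise to the upper-case, space-free form a totp_secret field expects.
    return secrets[0].replace(" ", "").upper()
```

The returned value is what goes into `totp_secret`.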
Natural-language login flows
Login is the first-pass differentiator between scanners. The Pentestas YAML accepts natural-language steps executed by the embedded browser agent:
login_flow:
  - "Navigate to https://login.acme.com"
  - "Click 'Continue with Google'"
  - "Wait for Google's consent screen"
  - "Type $username into the Email field"
  - "Click 'Next'"
  - "Type $password into the Password field"
  - "Click 'Next'"
  - "If 'Continue as $username' dialog appears, click Continue"
  - "Wait for redirect to https://app.acme.com/dashboard"
A Claude agent drives a real browser through these steps. This handles flows that break typical form-login scanners:
- Google / Microsoft SSO with consent screens
- Conditional dialogs ("continue as user X?")
- Multi-step flows with interstitial pages
- 2FA interrupts with $totp substitution
- OAuth-redirect callbacks that need a specific wait condition
Flows can be up to 20 steps, each up to 500 chars. The step executor is tolerant: "Type $username in the email field" works when the actual label is "Email address" or "Your email".
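Before the agent sees a step, the `$username` and `$password` placeholders have to be filled in. A minimal sketch of that substitution, using Python's `string.Template`, looks like this. How Pentestas actually does it is not documented here, so treat `render_login_flow` as a hypothetical helper.

```python
from string import Template


def render_login_flow(steps, values):
    """Return the steps with every $placeholder replaced from `values`."""
    rendered = []
    for step in steps:
        # substitute() raises KeyError on an unknown placeholder, so a typo
        # such as $usernmae fails loudly instead of being typed into a form.
        rendered.append(Template(step).substitute(values))
    return rendered
```

Strict substitution is a deliberate choice in this sketch: a misspelled placeholder surfaces as an error at render time rather than as a confusing login failure mid-scan.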
Scope rules — avoid + focus
The rules block gives you scan-scope control without editing the scanner:
rules:
  avoid:
    - description: "Skip the logout URL (would end the session)"
      type: path
      url_path: "/logout"
    - description: "No DELETE on the orders API (destructive)"
      type: path
      url_path: "/api/orders/*"
      method: DELETE
    - description: "Don't test the marketing subdomain"
      type: subdomain
      url_path: "www"
  focus:
    - description: "Prioritise the checkout flow"
      type: path
      url_path: "/api/checkout"
    - description: "Focus extra attention on the admin panel"
      type: subdomain
      url_path: "admin"
Up to 50 avoid rules + 50 focus rules. Rule types: path / subdomain / domain / method / header / parameter. Rules are enforced both during crawl (avoided paths aren't visited) and during active testing (avoided paths get no probes).
Natural-language description fields aren't just comments — they show up in the final report as a "scope boundaries" section, documenting why certain paths weren't tested. This is exactly the evidence auditors ask for.
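To show how path rules like these could be evaluated per request, here is a small sketch. It covers only `type: path` rules, treats `url_path` as a shell-style glob, and lets avoid rules win over focus rules. All three are assumptions for illustration, not documented Pentestas behaviour, and `PathRule` and `decide` are hypothetical names.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase
from typing import Optional


@dataclass(frozen=True)
class PathRule:
    description: str
    url_path: str                 # glob pattern, e.g. "/api/orders/*"
    method: Optional[str] = None  # None means "any HTTP method"

    def matches(self, method: str, path: str) -> bool:
        if self.method is not None and self.method.upper() != method.upper():
            return False
        # Note: fnmatch's "*" also matches "/" so the pattern covers nested paths.
        return fnmatchcase(path, self.url_path)


def decide(avoid, focus, method, path):
    """Classify a request as ("skip" | "prioritise" | "normal", reason)."""
    for verdict, rules in (("skip", avoid), ("prioritise", focus)):
        for rule in rules:
            if rule.matches(method, path):
                return verdict, rule.description
    return "normal", None
```

Returning the rule's description alongside the verdict is what would let a report explain why a path was skipped.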
Source-code hook
Wire in white-box mode with two lines:
source_code:
  repo_url: "https://github.com/acme/ecommerce.git"
or
source_code:
  repo_path: "/workspace/ecommerce"
A shallow clone happens at scan start, an Opus-tier code analyst runs, and downstream specialists receive its intelligence deliverable. See Source-code-aware scans.
Pipeline settings
Fine-tune the AI pipeline:
pipeline:
  retry_preset: subscription # switches to 6-hour backoff for Anthropic subscription plans
  max_concurrent_pipelines: 2 # run 2 of 5 specialist pipelines in parallel (1–5)
max_concurrent_pipelines: 2 reduces AI rate-limit spikes at the cost of wall-clock time. Useful when scanning on constrained AI quotas.
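The trade-off is the one you get from any bounded worker pool. A minimal sketch with `asyncio.Semaphore` shows the mechanism. The function names are hypothetical and the dummy pipeline stands in for a specialist making AI calls; this is not Pentestas's scheduler.

```python
import asyncio


async def run_pipelines(pipelines, max_concurrent):
    """Run every pipeline coroutine function, at most max_concurrent at once."""
    if not 1 <= max_concurrent <= 5:
        raise ValueError("max_concurrent_pipelines must be between 1 and 5")
    gate = asyncio.Semaphore(max_concurrent)

    async def guarded(pipeline):
        async with gate:  # waits here while max_concurrent pipelines are running
            return await pipeline()

    return await asyncio.gather(*(guarded(p) for p in pipelines))


def peak_concurrency(total: int, max_concurrent: int) -> int:
    """Run `total` dummy pipelines and report how many overlapped at most."""
    state = {"running": 0, "peak": 0}

    async def dummy_pipeline():
        state["running"] += 1
        state["peak"] = max(state["peak"], state["running"])
        await asyncio.sleep(0.01)  # stands in for an AI call
        state["running"] -= 1

    asyncio.run(run_pipelines([dummy_pipeline] * total, max_concurrent))
    return state["peak"]
```

With five pipelines and a limit of two, no more than two are ever in flight, so request bursts shrink while total runtime grows.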
Submission
API
curl -X POST "https://app.pentestas.com/api/scans/yaml?target_url=https://app.example.com" \
  -H "X-API-Key: aa_..." \
  -H "Content-Type: application/yaml" \
  --data-binary @scan-config.yaml
CLI
pentestas start -u https://app.example.com -c scan-config.yaml -w 1h
CI (GitHub Actions)
- name: Pentestas scan
  env:
    PENTESTAS_API_KEY: ${{ secrets.PENTESTAS_API_KEY }}
  run: |
    pentestas start \
      -u https://staging-${{ github.sha }}.example.com \
      -c .pentestas/scan.yaml \
      -w 45m
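If you would rather submit from a script than from curl, the same API request can be built with the Python standard library. This sketch only mirrors the endpoint and headers shown in the curl example; the function name is hypothetical, and the response format is not covered here, so the code stops at building the request.

```python
import urllib.parse
import urllib.request


def build_scan_request(api_key: str, target_url: str,
                       yaml_body: bytes) -> urllib.request.Request:
    """Build (but do not send) the POST request the curl example makes."""
    # urlencode percent-escapes the target URL so it is safe inside a query string.
    query = urllib.parse.urlencode({"target_url": target_url})
    return urllib.request.Request(
        url=f"https://app.pentestas.com/api/scans/yaml?{query}",
        data=yaml_body,
        method="POST",
        headers={"X-API-Key": api_key, "Content-Type": "application/yaml"},
    )
```

Passing the result to `urllib.request.urlopen` sends it. A 4xx response, such as one for malformed YAML, surfaces as `urllib.error.HTTPError`, which gives a CI job a natural way to fail.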
Secret handling
Committing a YAML file with a password in it to a repo is obviously wrong. Two patterns work:
Pattern A — env-var placeholders
authentication:
  credentials:
    username: "${PENTESTAS_USERNAME}"
    password: "${PENTESTAS_PASSWORD}"
    totp_secret: "${PENTESTAS_TOTP}"
Pentestas expands ${…} at submission time from the submitting client's environment. Commit the YAML with placeholders; supply the secrets in CI via secrets.*.
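Client-side expansion of this kind is simple to picture. The sketch below replaces `${NAME}` from an environment mapping and refuses to continue if a variable is unset. It illustrates the idea, not the Pentestas client's code, and `expand_env` is a hypothetical name.

```python
import os
import re

# Only the braced ${NAME} form is expanded, so login-flow placeholders such as
# $username and $totp pass through untouched.
_PLACEHOLDER = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def expand_env(text: str, env=None) -> str:
    """Replace every ${NAME} with env[NAME]; raise if a variable is unset."""
    env = os.environ if env is None else env
    missing = sorted({name for name in _PLACEHOLDER.findall(text) if name not in env})
    if missing:
        raise KeyError(f"unset environment variables: {', '.join(missing)}")
    return _PLACEHOLDER.sub(lambda match: env[match.group(1)], text)
```

Failing on a missing variable matters in CI: an empty password would otherwise turn into a scan that quietly never gets past the login page.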
Pattern B — CI-only YAML
Keep the YAML in ~/.pentestas/scans/ on the CI host (never in the repo). The CI job reads it and passes -c to the CLI. Secrets live wherever CI secrets live.
Pattern A is the default recommendation — more reproducible, secrets are explicit.
Validation
The YAML parser enforces:
- Description ≤ 500 chars.
- Login flow ≤ 20 steps, each ≤ 500 chars.
- TOTP secret matches Base32 alphabet; fails fast on bad input.
- Rules list ≤ 50 entries per section, URL paths ≤ 1000 chars.
- Required fields for every login_type present.
Malformed YAML returns HTTP 400 with a specific error message before any scan work starts. Your CI fails at submission, not mid-scan.
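You can catch most of these before submission with a local pre-flight check. The sketch below applies the limits listed above to an already-parsed config dictionary. It assumes login_flow and credentials sit under authentication, and it skips the per-login_type required-field check because that schema is not spelled out here. `validate_config` is a hypothetical helper, not part of the Pentestas CLI.

```python
import re

BASE32_SECRET = re.compile(r"[A-Z2-7]+=*")


def validate_config(config: dict) -> list:
    """Return a list of human-readable errors; an empty list means valid."""
    errors = []
    if len(config.get("description", "")) > 500:
        errors.append("description exceeds 500 chars")

    auth = config.get("authentication", {})
    flow = auth.get("login_flow", [])
    if len(flow) > 20:
        errors.append("login_flow exceeds 20 steps")
    for index, step in enumerate(flow, start=1):
        if len(step) > 500:
            errors.append(f"login_flow step {index} exceeds 500 chars")

    secret = auth.get("credentials", {}).get("totp_secret")
    if secret is not None and not BASE32_SECRET.fullmatch(secret):
        errors.append("totp_secret is not valid Base32")

    for section in ("avoid", "focus"):
        rules = config.get("rules", {}).get(section, [])
        if len(rules) > 50:
            errors.append(f"rules.{section} exceeds 50 entries")
        for index, rule in enumerate(rules, start=1):
            if len(rule.get("url_path", "")) > 1000:
                errors.append(f"rules.{section}[{index}].url_path exceeds 1000 chars")
    return errors
```

Collecting every error instead of stopping at the first one means a single CI run shows everything that needs fixing.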
By Industry
Industry scenarios
Fintech
A fintech company's staging environment usually has 2FA on the admin panel + OAuth with Okta for employee SSO. A single YAML encodes both paths — the customer-user login flow + the admin-user login flow (via login_flow steps that navigate to the Okta redirect). The team commits the YAML to the staging-environment repo; every deploy reruns the same scan; findings map to the same baseline.
Medtech
Medtech platforms have physician-role and patient-role views. Two YAMLs: scan-physician.yaml with physician credentials, scan-patient.yaml with patient credentials. Both get scanned on every deploy. The Authz specialist runs authorization-boundary hypotheses across the two role sets, catching cross-role privilege bugs that a single-role scan would miss.
Legaltech
Legal SaaS has enterprise customers with SAML SSO to the customer's own IdP. The YAML captures the full SSO flow via login_type: sso + a login_flow that navigates through the customer's login. One customer = one YAML; the continuous pentest programme scales to every customer integration without hand-configuring the scanner.
Banks
A bank's internal admin panel has certificate-based auth that the embedded browser handles natively. The YAML specifies login_type: form with a login_flow that navigates through the client-cert selection dialog. The agent running inside the bank's network (see Internal network pentest) picks up the YAML and runs the scan from the network location that has access to the client cert.
Insurance
Underwriting platforms often have long-running test accounts with 2FA required on every login. TOTP secret in the YAML + nightly scan via CI = continuous coverage with zero human involvement. The YAML is reviewed monthly during the AppSec team's pipeline retrospective; rule avoid/focus entries are adjusted as the app evolves.
The Problem
Why this matters for AI penetration testing specifically
An AI pentest's value is proportional to its coverage. The single biggest coverage blocker in practice is "we couldn't log in". A reproducible, version-controlled, CI-friendly YAML format with first-class 2FA support means the auth barrier is crossed reliably on every run. The AI specialists spend their token budget testing the post-login attack surface, not guessing at the login form.
Try a YAML-driven AI pentest
Register, copy the example YAML for your stack, ship it to CI. Repeatable scans, 2FA handled, auditors happy.
Start your AI pentest
More Reading
Further reading
- YAML scan config docs — full schema
- Pentestas CLI — how CI wires it up
- Authenticated scans — auth primitives the YAML composes

Alexander Sverdlov
Founder of Pentestas. Author of 2 information security books, cybersecurity speaker at the largest cybersecurity conferences in Asia and a United Nations conference panelist. Former Microsoft security consulting team member, external cybersecurity consultant at the Emirates Nuclear Energy Corporation.