Production-ready · 6 platforms · 4 providers · 54 models

Every feature, earning its keep.

SDLC orchestration with multi-provider support, quality-gated iterations, browser-driven QA, per-provider cost tracking, and a web control plane, across six feature-parity platforms.

01 · ORCHESTRATION

7-phase SDLC pipeline

Analyst → PM → Developers → Integration → QA → Feedback → Summary. Each phase is a separate agent role with its own model, max-tokens budget, and prompt. The pipeline iterates automatically until the quality score clears the configured threshold.

Parallel Developers (up to 5 by default) with strict file-assignment enforcement
Integration Architect verifies cross-file consistency before QA
Up to 3 QA agents review ALL files independently for thorough coverage
Feedback Coordinator deduplicates issues across QA agents and scores quality
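The iterate-until-threshold behaviour described above can be sketched roughly as follows. This is a minimal illustration, not the orchestrator's actual code: `run_iteration` is a hypothetical callback standing in for whichever phases are re-run on each pass, and the defaults mirror the CLI defaults listed further down (threshold 0.8, 10 iterations).

```python
from dataclasses import dataclass
from typing import Callable

# Phase order as named in the pipeline description.
PHASES = ["Analyst", "PM", "Developers", "Integration", "QA", "Feedback", "Summary"]


@dataclass
class RunResult:
    iterations: int
    score: float
    accepted: bool


def run_pipeline(run_iteration: Callable[[int], float],
                 threshold: float = 0.8,
                 max_iterations: int = 10) -> RunResult:
    """Repeat iterations until the score clears the threshold or the limit is hit.

    `run_iteration` is a hypothetical callback: it runs one pass of the
    pipeline and returns the quality score (0.0 to 1.0) for that pass.
    """
    score = 0.0
    for iteration in range(1, max_iterations + 1):
        score = run_iteration(iteration)
        if score >= threshold:
            return RunResult(iteration, score, accepted=True)
    return RunResult(max_iterations, score, accepted=False)


# Usage: a fake run whose score improves on each pass.
fake_scores = iter([0.40, 0.70, 0.85])
result = run_pipeline(lambda i: next(fake_scores))
```

Here `result.accepted` is true after the third pass, because 0.85 is the first score at or above the 0.8 threshold.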

02 · MULTI-PROVIDER

Mix every major LLM

Assign Anthropic Claude, OpenAI GPT, Google Gemini, or DeepSeek to each role independently. All four providers stream responses live and report per-call token usage and cost back into the dashboard.

Anthropic native SDK · Claude family (9 models)
OpenAI SDK · GPT family (35 models)
Google REST · Gemini family (8 models)
DeepSeek OpenAI-compatible · reasoner + chat (2 models)
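Per-role provider assignment might be modelled along these lines. The field names, provider keys, and role names are assumptions made for illustration (the real `swarm-config.json` schema is not shown here), and the model IDs are left as placeholders rather than guessed.

```python
from dataclasses import dataclass

# Hypothetical provider keys for the four supported providers.
PROVIDERS = {"anthropic", "openai", "google", "deepseek"}


@dataclass(frozen=True)
class RoleConfig:
    provider: str
    model: str       # placeholder IDs below; substitute real model names
    max_tokens: int

    def __post_init__(self) -> None:
        if self.provider not in PROVIDERS:
            raise ValueError(f"unknown provider: {self.provider!r}")
        if self.max_tokens <= 0:
            raise ValueError("max_tokens must be positive")


# Each role picks its provider independently of the others.
roles = {
    "analyst": RoleConfig("anthropic", "<claude-model-id>", 4096),
    "developer": RoleConfig("openai", "<gpt-model-id>", 8192),
    "qa": RoleConfig("google", "<gemini-model-id>", 4096),
    "feedback": RoleConfig("deepseek", "<deepseek-model-id>", 2048),
}
```

Validating in `__post_init__` means a typo in a provider name fails when the config is loaded rather than midway through a run.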

03 · TESTING

Visual & Interactive QA

Three configurable QA modes that build on each other. The deepest mode hands QA agents a real browser with 16 tools. They navigate, click, type, evaluate JS, and write structured test reports.

code mode: multi-agent independent review of every generated file
visual mode: adds desktop (1280×720) and mobile (375×667) screenshot capture
interactive mode: 16-tool browser automation with ACTION/VERIFY/COMPARE/REPORT protocol
Auto-failure detection for failed interactions and missing test reports

04 · QUALITY GATE

Weighted scoring with capping

A weighted average across critical (50%) / major (20%) / minor (10%) issues and acceptance criteria (20%), with score capping that prevents inflated scores when critical issues remain.

1 critical issue caps max score at 0.45
3+ major issues cap max score at 0.65
Acceptance criteria pass/fail ratio feeds into the score directly
Mobile variant uses a 5-dimension geometric-mean scorecard instead
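A rough sketch of how the weights and caps above could combine. The weights and cap values come from the text; how an issue count turns into a per-severity component score is not stated, so `severity_component` below is purely an assumption, as are the function names. The geometric-mean helper shows the general idea behind the mobile scorecard without naming its five dimensions.

```python
import math

# Weights as stated: critical 50%, major 20%, minor 10%, acceptance criteria 20%.
WEIGHTS = {"critical": 0.5, "major": 0.2, "minor": 0.1, "acceptance": 0.2}


def severity_component(count: int) -> float:
    # ASSUMPTION: 1.0 with no issues, decaying as issues accumulate.
    # The real mapping from issue counts to a component score is not documented here.
    return 1.0 / (1 + count)


def quality_score(critical: int, major: int, minor: int,
                  criteria_passed: int, criteria_total: int) -> float:
    acceptance = criteria_passed / criteria_total if criteria_total else 1.0
    score = (WEIGHTS["critical"] * severity_component(critical)
             + WEIGHTS["major"] * severity_component(major)
             + WEIGHTS["minor"] * severity_component(minor)
             + WEIGHTS["acceptance"] * acceptance)
    # Caps from the text: a critical issue limits the score to 0.45,
    # three or more major issues limit it to 0.65.
    if critical >= 1:
        score = min(score, 0.45)
    if major >= 3:
        score = min(score, 0.65)
    return round(score, 4)


def geometric_mean(dimensions: list[float]) -> float:
    # A geometric mean drops to 0 if any single dimension is 0.
    return math.prod(dimensions) ** (1 / len(dimensions))
```

Under these assumptions, a run with one critical issue and every acceptance criterion passing would score 0.75 uncapped, and the cap pulls it down to 0.45, well under the default 0.8 threshold.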

05 · OBSERVABILITY

Per-provider cost tracking

Real-time token usage and cost breakdown by provider and role. Track API calls, input/output tokens, cache reads, and per-iteration cost separately for each LLM. A hard cost cap stops the run when the budget is reached.

Per-provider $ rollup with % share
Per-role token attribution
Cache hit rate visibility
--max-cost CLI flag for hard budget cap
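The rollup and budget cap could be tracked with something like the sketch below. The class and method names are hypothetical, and per-call cost is taken as an input rather than computed, since no pricing is given here.

```python
from collections import defaultdict
from typing import Optional


class CostTracker:
    """Accumulates per-provider cost and per-role tokens (illustrative only)."""

    def __init__(self, max_cost: Optional[float] = None) -> None:
        self.max_cost = max_cost  # corresponds to the --max-cost budget
        self.cost_by_provider = defaultdict(float)
        self.tokens_by_role = defaultdict(lambda: {"input": 0, "output": 0})

    def record(self, provider: str, role: str,
               input_tokens: int, output_tokens: int, cost_usd: float) -> None:
        self.cost_by_provider[provider] += cost_usd
        self.tokens_by_role[role]["input"] += input_tokens
        self.tokens_by_role[role]["output"] += output_tokens

    @property
    def total(self) -> float:
        return sum(self.cost_by_provider.values())

    def rollup(self) -> dict:
        """Map provider -> (dollars, percent share of total spend)."""
        total = self.total
        return {p: (c, 100 * c / total if total else 0.0)
                for p, c in self.cost_by_provider.items()}

    def budget_reached(self) -> bool:
        return self.max_cost is not None and self.total >= self.max_cost


tracker = CostTracker(max_cost=0.35)
tracker.record("anthropic", "developer", 1000, 500, 0.30)
tracker.record("openai", "qa", 2000, 100, 0.10)
```

An orchestrator would check `tracker.budget_reached()` after each call and stop the run once it returns true.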

06 · CONTROL

Interactive check-ins

Between iterations the orchestrator pauses, surfaces the current score / critical issues / total cost, and lets the operator accept early or continue iterating. Configurable check-in frequency.

Score bar + issue counts + cost dashboard
Accept-early or continue-iterating decision per check-in
Configurable check-in frequency (every N iterations)
Graceful Ctrl+C cancels in-flight API calls mid-stream
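The "every N iterations" check-in rule and the summary shown to the operator might look like this in outline. Both function names and the text layout of the summary are assumptions; the actual check-in UI is not specified here.

```python
def should_check_in(iteration: int, every_n: int) -> bool:
    """Pause after every `every_n`-th iteration; 0 disables check-ins."""
    return every_n > 0 and iteration % every_n == 0


def check_in_summary(score: float, critical_issues: int, total_cost: float) -> str:
    """Render a one-line summary: score bar, critical issue count, cost so far."""
    filled = round(score * 20)
    bar = "#" * filled + "-" * (20 - filled)
    return (f"[{bar}] score {score:.2f} | "
            f"critical {critical_issues} | cost ${total_cost:.2f}")


summary = check_in_summary(0.5, 1, 1.234)
```

With a frequency of 2, the operator would be prompted after iterations 2, 4, 6, and so on, and could accept the current output or let the run continue.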

QA In Depth

Three modes, each building on the previous.

Configurable per project. Code is fast and cheap. Visual catches layout regressions. Interactive drives the app like a user.

Code

Default, fastest

→ Multi-agent independent review of ALL generated files
→ Cross-file integration checks (CSS / HTML selectors, JS / HTML IDs, imports)
→ Severity-tagged issues (critical, major, minor)
→ Acceptance criteria evaluation against requirements

Visual

Code + screenshots

→ Launches headless browser via Playwright (Node / Python / .NET) or chromedp (Go)
→ Captures screenshots at configurable viewports (desktop 1280×720, mobile 375×667)
→ Configurable wait strategy: networkidle / load / domcontentloaded
→ Multi-modal QA prompts include screenshots for layout & alignment review
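For the Python platform, the screenshot step could be sketched with Playwright's sync API as below. The function name, output file naming, and the use of full-page screenshots are assumptions; the viewport sizes and wait strategies are the ones listed above. Running it requires the third-party `playwright` package and an installed Chromium build.

```python
import os

# Viewports and wait strategies as listed in the Visual mode description.
VIEWPORTS = {
    "desktop": {"width": 1280, "height": 720},
    "mobile": {"width": 375, "height": 667},
}
WAIT_STRATEGIES = {"networkidle", "load", "domcontentloaded"}


def capture_screenshots(url: str, out_dir: str,
                        wait_until: str = "networkidle") -> list[str]:
    """Capture one screenshot per viewport and return the file paths."""
    if wait_until not in WAIT_STRATEGIES:
        raise ValueError(f"unsupported wait strategy: {wait_until!r}")

    # Imported lazily so the module loads even where Playwright is absent.
    from playwright.sync_api import sync_playwright

    os.makedirs(out_dir, exist_ok=True)
    paths = []
    with sync_playwright() as p:
        browser = p.chromium.launch(headless=True)
        try:
            for name, viewport in VIEWPORTS.items():
                page = browser.new_page(viewport=viewport)
                page.goto(url, wait_until=wait_until)
                path = os.path.join(out_dir, f"{name}.png")
                page.screenshot(path=path, full_page=True)  # full_page is an assumption
                page.close()
                paths.append(path)
        finally:
            browser.close()
    return paths
```

The resulting image paths would then be attached to the multi-modal QA prompt alongside the source files.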

Interactive

Full browser-driven testing

→ Turn-based tool use with 16 browser tools (navigate, click, type, evaluate, ...)
→ Mandatory ACTION / VERIFY / COMPARE / REPORT testing protocol
→ Auto-failure detection: failed interactions, missing test reports, suspicious patterns
→ Max 20 turns per agent with tool-use tracking
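The turn limit and two of the auto-failure rules (failed interactions, missing test reports) can be illustrated with a small loop. Everything here is a simplified sketch: `next_call` is a hypothetical stand-in for the agent choosing its next tool, and the "suspicious patterns" check is omitted because its rules are not described.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional

MAX_TURNS = 20  # per-agent turn limit from the Interactive mode description


@dataclass
class ToolCall:
    name: str
    ok: bool  # whether the browser tool reported success


@dataclass
class SessionResult:
    turns: int = 0
    calls: list = field(default_factory=list)
    failures: list = field(default_factory=list)


def run_session(next_call: Callable[[int], Optional[ToolCall]],
                max_turns: int = MAX_TURNS) -> SessionResult:
    """Drive one QA agent turn by turn and flag automatic failures."""
    result = SessionResult()
    for turn in range(1, max_turns + 1):
        call = next_call(turn)
        if call is None:  # the agent has nothing more to do
            break
        result.turns = turn
        result.calls.append(call)
        if not call.ok:
            result.failures.append(f"turn {turn}: {call.name} failed")
    # A session that never records a test result is itself a failure.
    if not any(c.name == "report_test" for c in result.calls):
        result.failures.append("no report_test call recorded")
    return result


script = [ToolCall("navigate", True), ToolCall("click", False)]
outcome = run_session(lambda t: script[t - 1] if t <= len(script) else None)
```

In this example `outcome.failures` holds two entries: the failed click and the missing test report.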

Interactive Mode · 16 Browser Tools

interactive-qa.tools.json · 16 tools

navigate · Navigate to a URL
screenshot · Capture current page state
get_page_info · URL, title, viewport
click · Click an element
type · Type into an input
select · Select dropdown option
hover · Hover over element
get_text · Get element text
get_value · Get input value
get_attribute · Get element attribute
is_visible · Check visibility
count_elements · Count matches
get_console_logs · Get browser console
evaluate · Run JS in page
wait_for · Wait for element
report_test · Record test result
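As a rough idea of how such a tool list might be handed to a model, the sketch below turns the 16 names and descriptions into generic tool definitions. The definition shape (`name`, `description`, `input_schema`) follows a common tool-use convention and is an assumption; the parameters of each tool are not listed here, so the schemas are left empty rather than guessed.

```python
# Tool names and descriptions exactly as listed above.
TOOLS = {
    "navigate": "Navigate to a URL",
    "screenshot": "Capture current page state",
    "get_page_info": "URL, title, viewport",
    "click": "Click an element",
    "type": "Type into an input",
    "select": "Select dropdown option",
    "hover": "Hover over element",
    "get_text": "Get element text",
    "get_value": "Get input value",
    "get_attribute": "Get element attribute",
    "is_visible": "Check visibility",
    "count_elements": "Count matches",
    "get_console_logs": "Get browser console",
    "evaluate": "Run JS in page",
    "wait_for": "Wait for element",
    "report_test": "Record test result",
}


def tool_definitions() -> list[dict]:
    """Build one definition per tool; parameter schemas are placeholders."""
    return [
        {
            "name": name,
            "description": description,
            # ASSUMPTION: real parameters (selectors, URLs, ...) are not documented here.
            "input_schema": {"type": "object", "properties": {}},
        }
        for name, description in TOOLS.items()
    ]


def is_known_tool(name: str) -> bool:
    """Reject tool calls that are not part of the registry."""
    return name in TOOLS
```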

SDLC Mode · CLI

Same flags. Six platforms.

swarm · multi-platform run

# Node.js

$ node swarm.js --mode sdlc --config ./swarm-config.json --threshold 0.8

# Python

$ python swarm.py --mode sdlc --config ./swarm-config.json --threshold 0.8

# Go

$ ./swarm -mode sdlc -config ./swarm-config.json -threshold 0.8

# .NET

$ dotnet run -- --mode sdlc --config ./swarm-config.json --threshold 0.8

# Web (Docker)

$ docker compose up --build # http://localhost:3000

# Mobile (Android, Kotlin / Jetpack Compose)

$ ./gradlew :app:installDevProd # adb install + launch on-device

# Key SDLC options

--mode sdlc · enable pipeline

--config <path> · role settings

--threshold <n> · 0.0 – 1.0 (default 0.8)

--max-iterations <n> · default 10

--max-cost <n> · hard $ budget cap

--requirements <path> · input doc
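The options above map naturally onto a standard argument parser. The sketch below uses Python's `argparse` with the flag names and defaults as listed; the validation of the threshold range and the helper names are assumptions, and this is not the actual `swarm.py` source.

```python
import argparse


def unit_interval(text: str) -> float:
    """Parse a float and require it to lie between 0.0 and 1.0."""
    value = float(text)
    if not 0.0 <= value <= 1.0:
        raise argparse.ArgumentTypeError("threshold must be between 0.0 and 1.0")
    return value


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="swarm")
    parser.add_argument("--mode", help="'sdlc' enables the pipeline")
    parser.add_argument("--config", metavar="PATH", help="role settings file")
    parser.add_argument("--threshold", type=unit_interval, default=0.8,
                        help="quality score needed to accept (0.0 to 1.0)")
    parser.add_argument("--max-iterations", type=int, default=10)
    parser.add_argument("--max-cost", type=float, default=None,
                        help="hard budget cap in dollars")
    parser.add_argument("--requirements", metavar="PATH", help="input document")
    return parser


args = build_parser().parse_args(
    ["--mode", "sdlc", "--config", "./swarm-config.json", "--max-cost", "5"]
)
```

Omitted flags fall back to their defaults, so `args.threshold` is 0.8 and `args.max_iterations` is 10 in this example.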

Web Control Plane

A browser-based dashboard.

Next.js 16 + React 19 with SSE-driven live dashboards, run history with full replay, conversational requirements builder, and an integrated output file browser with ZIP downloads. OIDC auth via Rdn.Identity.

Chat Builder

Conversational requirements

Chat-with-analyst flow refines requirements interactively using the Analyst role's configured model, then drops the generated requirements.md into the runner.

Live Dashboard

SSE-streamed run UI

Animated spinners per phase, per-developer and per-QA progress rows, live token/cost meters, score bar, check-in modal, and event replay on reconnect.
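Event replay on reconnect is commonly built on the SSE `id:` field: the browser sends the last ID it saw in a `Last-Event-ID` header, and the server resends everything newer. The sketch below shows that idea with an in-memory buffer; the class name, buffer size, and event names are assumptions, and the dashboard's actual server code is not shown here.

```python
import json
from collections import deque


def format_sse(event_id: int, event: str, data: dict) -> str:
    """Serialize one event in the text/event-stream wire format."""
    return f"id: {event_id}\nevent: {event}\ndata: {json.dumps(data)}\n\n"


class EventLog:
    """Keeps recent events so a reconnecting client can catch up."""

    def __init__(self, capacity: int = 1000) -> None:
        self._events = deque(maxlen=capacity)  # oldest events fall off the front
        self._next_id = 1

    def publish(self, event: str, data: dict) -> str:
        frame = format_sse(self._next_id, event, data)
        self._events.append((self._next_id, frame))
        self._next_id += 1
        return frame

    def replay_after(self, last_event_id: int) -> list[str]:
        """Return every buffered frame newer than the client's last seen ID."""
        return [frame for event_id, frame in self._events
                if event_id > last_event_id]


log = EventLog()
first = log.publish("score", {"value": 0.7})
log.publish("cost", {"usd": 1.25})
log.publish("phase", {"name": "QA"})
missed = log.replay_after(1)  # client reconnects having seen only event 1
```

A bounded buffer keeps memory use predictable; a client that has been away longer than the buffer covers would need a full state refresh instead.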

Output Browser

Files · preview · zip

File tree of generated code and docs, syntax-highlighted source viewer, sandboxed iframe HTML preview, and ZIP download for the whole output or any archived run.

Mobile · New

All of this, running on your phone.

The Android client adds two roles (Visual QA + Test Author), a five-dimension Quality Scorecard, on-device key vault, and a bundled benchmark template suite. Same pipeline. Zero servers.
