Production-ready · 6 platforms · 4 providers · 54 models

Every feature, earning its keep.

SDLC orchestration with multi-provider support, quality-gated iterations, browser-driven QA, per-provider cost tracking, and a web control plane, across six feature-parity platforms.

01 · ORCHESTRATION

7-phase SDLC pipeline

Analyst → PM → Developers → Integration → QA → Feedback → Summary. Each phase is a separate agent role with its own model, max-tokens budget, and prompt. The pipeline iterates automatically until the score clears the threshold or the iteration limit is reached.

  • Parallel Developers (up to 5 by default) with strict file-assignment enforcement
  • Integration Architect verifies cross-file consistency before QA
  • Up to 3 QA agents review ALL files independently for thorough coverage
  • Feedback Coordinator deduplicates issues across QA agents and scores quality
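The iterate-until-threshold behaviour can be sketched in a few lines of Python. This is an illustrative sketch, not the orchestrator's actual code: `run_pipeline` and the `run_iteration` callable are hypothetical names, and treating "clears the threshold" as `score >= threshold` is an assumption. The defaults mirror the documented CLI defaults (threshold 0.8, 10 iterations).

```python
from typing import Callable, Tuple

# Phase order as listed on this page. Which phases repeat on every
# iteration is not specified here, so the loop below treats one
# iteration as an opaque callable that returns a quality score.
PHASES = ["analyst", "pm", "developers", "integration", "qa", "feedback", "summary"]


def run_pipeline(
    run_iteration: Callable[[int], float],  # hypothetical: runs one pass, returns a score in [0, 1]
    threshold: float = 0.8,
    max_iterations: int = 10,
) -> Tuple[int, float]:
    """Repeat iterations until the score clears the threshold or the limit is hit."""
    score = 0.0
    for iteration in range(1, max_iterations + 1):
        score = run_iteration(iteration)
        if score >= threshold:  # assumption: "clears" means greater than or equal
            return iteration, score
    return max_iterations, score
```

With scores of 0.5, 0.7, and 0.85 on successive passes, the loop stops on the third iteration.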

02 · MULTI-PROVIDER

Mix every major LLM

Assign Anthropic Claude, OpenAI GPT, Google Gemini, or DeepSeek to each role independently. All four providers stream responses live and report per-call token usage and cost back into the dashboard.

  • Anthropic native SDK · Claude family (9 models)
  • OpenAI SDK · GPT family (35 models)
  • Google REST · Gemini family (8 models)
  • DeepSeek OpenAI-compatible · reasoner + chat (2 models)
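Per-role assignment might be modelled roughly as below. The `RoleConfig` class, its field names, and the provider keys are assumptions made for illustration; the real `swarm-config.json` schema is not shown on this page, and the model strings are placeholders rather than real model IDs.

```python
from dataclasses import dataclass

# Provider keys are hypothetical labels for the four providers named above.
PROVIDERS = {"anthropic", "openai", "google", "deepseek"}


@dataclass(frozen=True)
class RoleConfig:
    """One agent role with its own provider, model, and max-tokens budget."""

    provider: str
    model: str       # placeholder value, not a real model ID
    max_tokens: int

    def __post_init__(self) -> None:
        if self.provider not in PROVIDERS:
            raise ValueError(f"unknown provider: {self.provider!r}")
        if self.max_tokens <= 0:
            raise ValueError("max_tokens must be positive")


# Each role picks its provider independently of the others.
roles = {
    "analyst": RoleConfig("anthropic", "<claude-model-id>", 4096),
    "developer": RoleConfig("openai", "<gpt-model-id>", 8192),
    "qa": RoleConfig("google", "<gemini-model-id>", 4096),
}
```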

03 · TESTING

Visual & Interactive QA

Three configurable QA modes that build on each other. The deepest mode hands QA agents a real browser with 16 tools. They navigate, click, type, evaluate JS, and write structured test reports.

  • code mode: multi-agent independent review of every generated file
  • visual mode: adds desktop (1280×720) and mobile (375×667) screenshot capture
  • interactive mode: 16-tool browser automation with ACTION/VERIFY/COMPARE/REPORT protocol
  • Auto-failure detection for failed interactions and missing test reports

04 · QUALITY GATE

Weighted scoring with capping

A weighted average across critical (50%) / major (20%) / minor (10%) issues and acceptance criteria (20%), with score capping that prevents inflated scores when critical issues remain.

  • 1 critical issue caps max score at 0.45
  • 3+ major issues cap max score at 0.65
  • Acceptance criteria pass/fail ratio feeds into the score directly
  • Mobile variant uses a 5-dimension geometric-mean scorecard instead
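A minimal sketch of how the weights and caps above could combine. The weights (50/20/10/20) and the caps (0.45, 0.65) come from this page. How each issue count becomes a 0-to-1 component score is not stated here, so the `1 / (1 + n)` decay in `component` is purely an assumption, as are the function names. The sketch also assumes the critical cap applies to one or more critical issues.

```python
WEIGHTS = {"critical": 0.5, "major": 0.2, "minor": 0.1, "acceptance": 0.2}


def component(issue_count: int) -> float:
    """Assumed mapping from an issue count to a 0..1 score (1.0 means no issues)."""
    return 1.0 / (1 + issue_count)


def quality_score(critical: int, major: int, minor: int,
                  criteria_passed: int, criteria_total: int) -> float:
    # The acceptance pass/fail ratio feeds into the score directly.
    acceptance = criteria_passed / criteria_total if criteria_total else 1.0
    raw = (
        WEIGHTS["critical"] * component(critical)
        + WEIGHTS["major"] * component(major)
        + WEIGHTS["minor"] * component(minor)
        + WEIGHTS["acceptance"] * acceptance
    )
    # Capping keeps the score from looking healthy while serious issues remain.
    cap = 1.0
    if critical >= 1:
        cap = min(cap, 0.45)
    if major >= 3:
        cap = min(cap, 0.65)
    return min(raw, cap)
```

Under these assumptions, a run with one critical issue and every acceptance criterion passing has a raw weighted score of 0.75 but reports 0.45, which is below the default 0.8 threshold and forces another iteration.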

05 · OBSERVABILITY

Per-provider cost tracking

Real-time token usage and cost breakdown by provider and role. Track API calls, input/output tokens, cache reads, and per-iteration cost separately for each LLM. A hard cost cap stops the run when the budget is reached.

  • Per-provider $ rollup with % share
  • Per-role token attribution
  • Cache hit rate visibility
  • --max-cost CLI flag for hard budget cap
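The rollup and hard cap could look roughly like this. `CostTracker`, `BudgetExceeded`, and the `record` signature are hypothetical names for illustration; the sketch assumes each provider call already reports its own dollar cost, and that the run stops once the total reaches the cap.

```python
from collections import defaultdict
from typing import Dict, Optional


class BudgetExceeded(RuntimeError):
    """Raised when accumulated cost reaches the configured cap."""


class CostTracker:
    def __init__(self, max_cost: Optional[float] = None) -> None:
        self.max_cost = max_cost                    # mirrors the --max-cost flag
        self.cost_by_provider: Dict[str, float] = defaultdict(float)
        self.tokens_by_role: Dict[str, int] = defaultdict(int)

    @property
    def total(self) -> float:
        return sum(self.cost_by_provider.values())

    def record(self, provider: str, role: str,
               input_tokens: int, output_tokens: int, cost: float) -> None:
        """Attribute one API call to its provider and role, then enforce the cap."""
        self.cost_by_provider[provider] += cost
        self.tokens_by_role[role] += input_tokens + output_tokens
        if self.max_cost is not None and self.total >= self.max_cost:
            raise BudgetExceeded(f"spent ${self.total:.2f} of ${self.max_cost:.2f}")

    def share(self) -> Dict[str, float]:
        """Per-provider percentage share of total spend."""
        total = self.total
        return {p: (c / total * 100 if total else 0.0)
                for p, c in self.cost_by_provider.items()}
```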

06 · CONTROL

Interactive check-ins

Between iterations the orchestrator pauses, surfaces the current score / critical issues / total cost, and lets the operator accept early or continue iterating. Configurable check-in frequency.

  • Score bar + issue counts + cost dashboard
  • Accept-early or continue-iterating decision per check-in
  • Configurable check-in frequency (every N iterations)
  • Graceful Ctrl+C cancels in-flight API calls mid-stream
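As a rough illustration of "every N iterations", the check-in decision reduces to a modulo test. The names below (`CheckIn`, `should_check_in`, `summary_line`) are hypothetical, and treating a frequency of 0 as "never pause" is an assumption.

```python
from dataclasses import dataclass


@dataclass
class CheckIn:
    """What the operator sees at a pause: score, critical issues, total cost."""

    iteration: int
    score: float
    critical_issues: int
    total_cost: float


def should_check_in(iteration: int, every_n: int) -> bool:
    # Assumption: every_n <= 0 disables check-ins entirely.
    return every_n > 0 and iteration % every_n == 0


def summary_line(c: CheckIn) -> str:
    return (f"iteration {c.iteration} · score {c.score:.2f} · "
            f"{c.critical_issues} critical · ${c.total_cost:.2f}")
```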

QA In Depth

Three modes, each building on the previous.

Configurable per project. Code is fast and cheap. Visual catches layout regressions. Interactive drives the app like a user.

01 · Code

Default, fastest

  • Multi-agent independent review of ALL generated files
  • Cross-file integration checks (CSS / HTML selectors, JS / HTML IDs, imports)
  • Severity-tagged issues (critical, major, minor)
  • Acceptance criteria evaluation against requirements
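To make the JS / HTML ID check concrete, here is a deliberately naive sketch of one such check. On this page the review is done by QA agents, not by a script, so this only shows the kind of mismatch being looked for. The `missing_ids` helper is hypothetical, and regex matching will miss dynamically built IDs.

```python
import re
from typing import Set

# Matches id="..." attributes while skipping names such as data-id="...".
HTML_ID = re.compile(r'(?<![\w-])id\s*=\s*["\']([^"\']+)["\']')
# Matches literal getElementById("...") calls only.
JS_LOOKUP = re.compile(r'getElementById\(\s*["\']([^"\']+)["\']\s*\)')


def missing_ids(html: str, js: str) -> Set[str]:
    """Return IDs that the JavaScript looks up but the HTML never defines."""
    defined = set(HTML_ID.findall(html))
    referenced = set(JS_LOOKUP.findall(js))
    return referenced - defined


html = '<button id="save">Save</button><div data-id="cancel"></div>'
js = 'document.getElementById("save"); document.getElementById("cancel");'
print(missing_ids(html, js))  # "cancel" is referenced but never defined
```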

02 · Visual

Code + screenshots

  • Launches headless browser via Playwright (Node / Python / .NET) or chromedp (Go)
  • Captures screenshots at configurable viewports (desktop 1280×720, mobile 375×667)
  • Configurable wait strategy: networkidle / load / domcontentloaded
  • Multi-modal QA prompts include screenshots for layout & alignment review
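On the Python side, the capture step might look something like the sketch below, using Playwright's synchronous API. The viewport sizes and wait strategies come from the list above; the `capture_screenshots` function, its parameters, and the output file naming are assumptions. It needs the `playwright` package and an installed Chromium build to actually run.

```python
from pathlib import Path
from typing import List

VIEWPORTS = {
    "desktop": {"width": 1280, "height": 720},
    "mobile": {"width": 375, "height": 667},
}
WAIT_STRATEGIES = ("networkidle", "load", "domcontentloaded")


def capture_screenshots(url: str, out_dir: str,
                        wait_until: str = "networkidle") -> List[Path]:
    """Save one PNG per viewport and return the file paths."""
    if wait_until not in WAIT_STRATEGIES:
        raise ValueError(f"unsupported wait strategy: {wait_until!r}")

    # Imported lazily so the constants above are usable without Playwright.
    from playwright.sync_api import sync_playwright

    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    paths: List[Path] = []
    with sync_playwright() as p:
        browser = p.chromium.launch(headless=True)
        try:
            for name, viewport in VIEWPORTS.items():
                page = browser.new_page(viewport=viewport)
                page.goto(url, wait_until=wait_until)
                target = out / f"{name}.png"
                page.screenshot(path=str(target), full_page=True)
                page.close()
                paths.append(target)
        finally:
            browser.close()
    return paths
```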

03 · Interactive

Full browser-driven testing

  • Turn-based tool use with 16 browser tools (navigate, click, type, evaluate, ...)
  • Mandatory ACTION / VERIFY / COMPARE / REPORT testing protocol
  • Auto-failure detection: failed interactions, missing test reports, suspicious patterns
  • Max 20 turns per agent with tool-use tracking
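The turn-based loop can be pictured as follows. This is a sketch under stated assumptions, not the shipped implementation: `run_session`, `SessionResult`, and the `next_call` / `execute` callables are hypothetical, and the auto-failure rule here (any failed interaction, or no `report_test` call at all) is one plausible reading of the bullets above. The "suspicious patterns" check is not modelled.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Tuple

MAX_TURNS = 20
ToolCall = Tuple[str, dict]  # (tool name, arguments)


@dataclass
class SessionResult:
    turns: int = 0
    tool_calls: Dict[str, int] = field(default_factory=dict)  # tool-use tracking
    failed_interactions: int = 0
    reports: List[dict] = field(default_factory=list)

    @property
    def auto_failed(self) -> bool:
        # Assumed rule: fail on any failed interaction or a missing test report.
        return self.failed_interactions > 0 or not self.reports


def run_session(next_call: Callable[[], Optional[ToolCall]],
                execute: Callable[[str, dict], bool],
                max_turns: int = MAX_TURNS) -> SessionResult:
    """Drive one QA agent: each turn is one tool call, up to max_turns."""
    result = SessionResult()
    for _ in range(max_turns):
        call = next_call()          # None means the agent has finished
        if call is None:
            break
        name, args = call
        result.turns += 1
        result.tool_calls[name] = result.tool_calls.get(name, 0) + 1
        if name == "report_test":
            result.reports.append(args)   # structured test result
        elif not execute(name, args):     # browser tool reported failure
            result.failed_interactions += 1
    return result
```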

Interactive Mode · 16 Browser Tools

interactive-qa.tools.json · 16 tools

  • navigate · Navigate to a URL
  • screenshot · Capture current page state
  • get_page_info · URL, title, viewport
  • click · Click an element
  • type · Type into an input
  • select · Select dropdown option
  • hover · Hover over element
  • get_text · Get element text
  • get_value · Get input value
  • get_attribute · Get element attribute
  • is_visible · Check visibility
  • count_elements · Count matches
  • get_console_logs · Get browser console
  • evaluate · Run JS in page
  • wait_for · Wait for element
  • report_test · Record test result

SDLC Mode · CLI

Same flags. Six platforms.

swarm · multi-platform run

# Node.js
$ node swarm.js --mode sdlc --config ./swarm-config.json --threshold 0.8

# Python
$ python swarm.py --mode sdlc --config ./swarm-config.json --threshold 0.8

# Go
$ ./swarm -mode sdlc -config ./swarm-config.json -threshold 0.8

# .NET
$ dotnet run -- --mode sdlc --config ./swarm-config.json --threshold 0.8

# Web (Docker)
$ docker compose up --build  # http://localhost:3000

# Mobile (Android, Kotlin / Jetpack Compose)
$ ./gradlew :app:installDevProd  # adb install + launch on-device

# Key SDLC options
--mode sdlc · enable pipeline
--config <path> · role settings
--threshold <n> · 0.0 – 1.0 (default 0.8)
--max-iterations <n> · default 10
--max-cost <n> · hard $ budget cap
--requirements <path> · input doc
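For readers who want to see how these options fit together, here is a sketch of an `argparse` parser that accepts the same flags and defaults. It is not the actual `swarm.py` source: the help strings, the `unit_interval` validator, and rejecting thresholds outside 0.0 to 1.0 are assumptions.

```python
import argparse


def unit_interval(text: str) -> float:
    """Parse a float and require it to lie in [0.0, 1.0]."""
    value = float(text)
    if not 0.0 <= value <= 1.0:
        raise argparse.ArgumentTypeError("must be between 0.0 and 1.0")
    return value


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="swarm")
    parser.add_argument("--mode", help="'sdlc' enables the pipeline")
    parser.add_argument("--config", help="path to role settings")
    parser.add_argument("--threshold", type=unit_interval, default=0.8,
                        help="quality score needed to stop iterating")
    parser.add_argument("--max-iterations", type=int, default=10)
    parser.add_argument("--max-cost", type=float, default=None,
                        help="hard budget cap in dollars")
    parser.add_argument("--requirements", help="path to the input document")
    return parser


args = build_parser().parse_args(
    ["--mode", "sdlc", "--config", "./swarm-config.json", "--max-cost", "5"]
)
print(args.threshold, args.max_iterations, args.max_cost)
```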

Web Control Plane

A browser-based dashboard.

Next.js 16 + React 19 with SSE-driven live dashboards, run history with full replay, a conversational requirements builder, and an integrated output file browser with ZIP downloads. OIDC auth via Rdn.Identity.

Chat Builder

Conversational requirements

Chat-with-analyst flow refines requirements interactively using the Analyst role's configured model, then drops the generated requirements.md into the runner.

Live Dashboard

SSE-streamed run UI

Animated spinners per phase, per-developer and per-QA progress rows, live token/cost meters, score bar, check-in modal, and event replay on reconnect.
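Event replay on reconnect is commonly built on the Server-Sent Events `id:` field together with the `Last-Event-ID` header that browsers send when they reconnect. The sketch below shows that general pattern in Python for illustration only. The dashboard itself is a Next.js app, and the `EventLog` class and the event names are hypothetical.

```python
import json
from typing import List, Optional, Tuple


def format_sse(event_id: int, event: str, data: dict) -> str:
    """Serialize one event in the SSE wire format (a blank line ends the frame)."""
    return f"id: {event_id}\nevent: {event}\ndata: {json.dumps(data)}\n\n"


class EventLog:
    """Keeps every event of a run so a reconnecting client can catch up."""

    def __init__(self) -> None:
        self._events: List[Tuple[int, str, dict]] = []

    def append(self, event: str, data: dict) -> str:
        event_id = len(self._events) + 1
        self._events.append((event_id, event, data))
        return format_sse(event_id, event, data)

    def replay(self, last_event_id: Optional[int] = None) -> List[str]:
        """Return frames newer than the client's Last-Event-ID (all if absent)."""
        start = last_event_id or 0
        return [format_sse(*e) for e in self._events if e[0] > start]


log = EventLog()
log.append("phase", {"name": "analyst", "status": "running"})
log.append("cost", {"provider": "openai", "usd": 0.02})
log.append("score", {"value": 0.72})
missed = log.replay(last_event_id=1)  # client saw event 1, then dropped
```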

Output Browser

Files · preview · zip

File tree of generated code and docs, syntax-highlighted source viewer, sandboxed iframe HTML preview, and ZIP download for the whole output or any archived run.

Mobile · New

All of this, running on your phone.

The Android client adds two roles (Visual QA + Test Author), a five-dimension Quality Scorecard, on-device key vault, and a bundled benchmark template suite. Same pipeline. Zero servers.
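A geometric-mean scorecard behaves differently from a weighted average: one weak dimension drags the whole score down, and a zero in any dimension zeroes the result. The sketch below shows the arithmetic only. The five dimension names are invented placeholders, since the real ones are not listed on this page, and the Android client is written in Kotlin, not Python.

```python
import math
from typing import Dict


def geometric_mean_score(dimensions: Dict[str, float]) -> float:
    """Geometric mean of per-dimension scores, each expected in [0, 1]."""
    values = list(dimensions.values())
    if not values:
        raise ValueError("at least one dimension is required")
    if any(v < 0.0 or v > 1.0 for v in values):
        raise ValueError("dimension scores must be within [0, 1]")
    return math.prod(values) ** (1.0 / len(values))


# Placeholder dimension names, for illustration only.
scorecard = {
    "dimension_a": 0.9,
    "dimension_b": 0.9,
    "dimension_c": 0.9,
    "dimension_d": 0.9,
    "dimension_e": 0.1,
}
arithmetic = sum(scorecard.values()) / len(scorecard)
geometric = geometric_mean_score(scorecard)
print(f"arithmetic {arithmetic:.2f} vs geometric {geometric:.2f}")
```

The single weak dimension pulls the geometric mean well below the arithmetic mean of the same five values.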
