Every feature,
earning its keep.
SDLC orchestration with multi-provider support, quality-gated iterations, browser-driven QA, per-provider cost tracking, and a web control plane, across six feature-parity platforms.
01 · ORCHESTRATION
7-phase SDLC pipeline
Analyst → PM → Developers → Integration → QA → Feedback → Summary. Each phase is a separate agent role with its own model, max-tokens budget, and prompt. The pipeline iterates automatically until the quality score clears the configured threshold or the iteration limit is reached.
- Parallel Developers (up to 5 by default) with strict file-assignment enforcement
- Integration Architect verifies cross-file consistency before QA
- Up to 3 QA agents review ALL files independently for thorough coverage
- Feedback Coordinator deduplicates issues across QA agents and scores quality
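The iterate-until-threshold behavior can be pictured roughly as follows. This is a minimal sketch, not the product's implementation: `run_pipeline`, `run_phase`, and the phase identifiers are hypothetical names, the defaults mirror the CLI defaults listed on this page (threshold 0.8, 10 iterations), and re-running every phase before Summary on each iteration is an assumption.

```python
from typing import Callable, Dict, List

# Phase order as listed above; the identifiers themselves are illustrative.
PHASES: List[str] = [
    "analyst", "pm", "developers", "integration", "qa", "feedback", "summary",
]


def run_pipeline(
    run_phase: Callable[[str, Dict], Dict],
    threshold: float = 0.8,
    max_iterations: int = 10,
) -> Dict:
    """Repeat the phases until the score clears the threshold.

    `run_phase(phase, state)` is a hypothetical callback that runs one
    agent role and returns the updated state, including a "score" key.
    """
    state: Dict = {"iteration": 0, "score": 0.0}
    for iteration in range(1, max_iterations + 1):
        state["iteration"] = iteration
        # Assumption: every phase except the summary repeats each iteration.
        for phase in PHASES[:-1]:
            state = run_phase(phase, state)
        if state["score"] >= threshold:
            break
    # The summary runs once, whether the threshold or the limit ended the loop.
    return run_phase("summary", state)
```

Passing the phase runner in as a callback keeps the loop independent of any provider SDK, which also makes it easy to exercise with a stub.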
02 · MULTI-PROVIDER
Mix every major LLM
Assign Anthropic Claude, OpenAI GPT, Google Gemini, or DeepSeek to each role independently. All four providers stream responses live and report per-call token usage and cost back into the dashboard.
- Anthropic native SDK · Claude family (9 models)
- OpenAI SDK · GPT family (35 models)
- Google REST · Gemini family (8 models)
- DeepSeek OpenAI-compatible · reasoner + chat (2 models)
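Per-role provider assignment amounts to a small mapping from role to provider, model, and token budget. The sketch below is illustrative only: `RoleConfig`, the role keys, the provider identifiers, and the token budgets are assumptions rather than the actual configuration schema, and the model ids are deliberate placeholders.

```python
from dataclasses import dataclass
from typing import Dict

# The four providers named above; the lowercase identifiers are assumed.
PROVIDERS = {"anthropic", "openai", "google", "deepseek"}


@dataclass(frozen=True)
class RoleConfig:
    """Hypothetical per-role settings: provider, model id, token budget."""
    provider: str
    model: str
    max_tokens: int

    def __post_init__(self) -> None:
        if self.provider not in PROVIDERS:
            raise ValueError(f"unknown provider: {self.provider!r}")
        if self.max_tokens <= 0:
            raise ValueError("max_tokens must be positive")


# Placeholder model ids; substitute ids your account can actually access.
roles: Dict[str, RoleConfig] = {
    "analyst": RoleConfig("anthropic", "<claude-model-id>", 4096),
    "developer": RoleConfig("openai", "<gpt-model-id>", 8192),
    "qa": RoleConfig("google", "<gemini-model-id>", 4096),
    "feedback": RoleConfig("deepseek", "<deepseek-model-id>", 2048),
}
```

Validating at construction time means a misspelled provider fails when the configuration is loaded rather than partway through a run.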
03 · TESTING
Visual & Interactive QA
Three configurable QA modes that build on each other. The deepest mode hands QA agents a real browser with 16 tools. They navigate, click, type, evaluate JS, and write structured test reports.
- code mode: multi-agent independent review of every generated file
- visual mode: adds desktop (1280×720) and mobile (375×667) screenshot capture
- interactive mode: 16-tool browser automation with ACTION/VERIFY/COMPARE/REPORT protocol
- Auto-failure detection for failed interactions and missing test reports
04 · QUALITY GATE
Weighted scoring with capping
A weighted average across critical (50%) / major (20%) / minor (10%) issues and acceptance criteria (20%), with score capping that prevents inflated scores when critical issues remain.
- 1 critical issue caps max score at 0.45
- 3+ major issues cap max score at 0.65
- Acceptance criteria pass/fail ratio feeds into the score directly
- Mobile variant uses a 5-dimension geometric-mean scorecard instead
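The weighting and capping rules above can be sketched as follows. The weights and the two caps come from this page; how an issue count becomes a per-severity component score is not stated, so the `1 / (1 + count)` decay is purely an assumption, as is treating "1 critical issue" as "one or more". The function name `quality_score` is hypothetical.

```python
def quality_score(
    critical: int,
    major: int,
    minor: int,
    criteria_passed: int,
    criteria_total: int,
) -> float:
    """Weighted score in [0, 1] with caps for critical and major issues."""

    def component(count: int) -> float:
        # Assumption: 1.0 with no issues, decaying as issues accumulate.
        return 1.0 / (1.0 + count)

    # Acceptance criteria contribute their pass ratio directly.
    acceptance = criteria_passed / criteria_total if criteria_total else 1.0

    score = (
        0.5 * component(critical)
        + 0.2 * component(major)
        + 0.1 * component(minor)
        + 0.2 * acceptance
    )

    # Caps keep the score from looking healthy while serious issues remain.
    if critical >= 1:
        score = min(score, 0.45)
    if major >= 3:
        score = min(score, 0.65)
    return round(score, 4)
```

With this decay, one critical issue and everything else clean would score 0.75 uncapped; the cap pulls it down to 0.45, which is below the default 0.8 threshold and so forces another iteration.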
05 · OBSERVABILITY
Per-provider cost tracking
Real-time token usage and cost breakdown by provider and role. Track API calls, input/output tokens, cache reads, and per-iteration cost separately for each LLM. A hard cost cap stops the run when the budget is reached.
- Per-provider $ rollup with % share
- Per-role token attribution
- Cache hit rate visibility
- --max-cost CLI flag for hard budget cap
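A per-provider rollup with a hard budget cap might look like the sketch below. `CostTracker`, `BudgetExceeded`, and the `record` signature are hypothetical names chosen for illustration; the assumption is that each provider call reports its own token counts and dollar cost.

```python
from collections import defaultdict
from typing import Dict, Optional


class BudgetExceeded(RuntimeError):
    """Raised when accumulated cost reaches the configured cap."""


class CostTracker:
    def __init__(self, max_cost: Optional[float] = None) -> None:
        self.max_cost = max_cost
        self.by_provider: Dict[str, float] = defaultdict(float)
        self.tokens_by_role: Dict[str, int] = defaultdict(int)

    @property
    def total(self) -> float:
        return sum(self.by_provider.values())

    def record(self, provider: str, role: str,
               input_tokens: int, output_tokens: int, cost_usd: float) -> None:
        """Add one API call, then enforce the cap if one is set."""
        self.by_provider[provider] += cost_usd
        self.tokens_by_role[role] += input_tokens + output_tokens
        if self.max_cost is not None and self.total >= self.max_cost:
            raise BudgetExceeded(
                f"cost ${self.total:.2f} reached cap ${self.max_cost:.2f}"
            )

    def share(self) -> Dict[str, float]:
        """Percentage of total spend per provider."""
        total = self.total
        return {
            provider: (100.0 * cost / total if total else 0.0)
            for provider, cost in self.by_provider.items()
        }
```

Checking the cap inside `record` means the run stops on the call that crosses the budget, not at the end of the iteration.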
06 · CONTROL
Interactive check-ins
Between iterations the orchestrator pauses, surfaces the current score / critical issues / total cost, and lets the operator accept early or continue iterating. Configurable check-in frequency.
- Score bar + issue counts + cost dashboard
- Accept-early or continue-iterating decision per check-in
- Configurable check-in frequency (every N iterations)
- Graceful Ctrl+C cancels in-flight API calls mid-stream
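The check-in cadence and the one-line status shown to the operator can be sketched like this. Both function names, the bar width, and the output layout are assumptions made for illustration; only the displayed quantities (score, critical issues, total cost) come from the description above.

```python
def should_check_in(iteration: int, every_n: int) -> bool:
    """True on every Nth iteration; a frequency of 0 disables check-ins."""
    return every_n > 0 and iteration % every_n == 0


def check_in_summary(score: float, critical: int, total_cost: float,
                     threshold: float = 0.8, width: int = 20) -> str:
    """Render a text score bar plus the figures the operator decides on."""
    filled = round(max(0.0, min(1.0, score)) * width)
    bar = "#" * filled + "-" * (width - filled)
    return (
        f"[{bar}] score {score:.2f}/{threshold:.2f}"
        f" | critical {critical} | cost ${total_cost:.2f}"
    )
```

For example, `check_in_summary(0.5, 1, 1.234)` renders a half-filled bar followed by `score 0.50/0.80 | critical 1 | cost $1.23`.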
QA In Depth
Three modes, each building on the previous.
Configurable per project. Code is fast and cheap. Visual catches layout regressions. Interactive drives the app like a user.
Code
01 · Default, fastest
- Multi-agent independent review of ALL generated files
- Cross-file integration checks (CSS / HTML selectors, JS / HTML IDs, imports)
- Severity-tagged issues (critical, major, minor)
- Acceptance criteria evaluation against requirements
Visual
02 · Code + screenshots
- Launches headless browser via Playwright (Node / Python / .NET) or chromedp (Go)
- Captures screenshots at configurable viewports (desktop 1280×720, mobile 375×667)
- Configurable wait strategy: networkidle / load / domcontentloaded
- Multi-modal QA prompts include screenshots for layout & alignment review
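For the Python platform, the capture step could be sketched with Playwright's synchronous API as below. The viewport sizes and wait strategies are the ones listed above; `capture_screenshots`, the output file naming, and full-page capture are assumptions, not the tool's actual behavior.

```python
from typing import Dict, List

VIEWPORTS: Dict[str, Dict[str, int]] = {
    "desktop": {"width": 1280, "height": 720},
    "mobile": {"width": 375, "height": 667},
}
WAIT_STRATEGIES = {"networkidle", "load", "domcontentloaded"}


def capture_screenshots(url: str, out_dir: str,
                        wait_until: str = "networkidle") -> List[str]:
    """Save one screenshot per viewport and return the file paths."""
    if wait_until not in WAIT_STRATEGIES:
        raise ValueError(f"unsupported wait strategy: {wait_until!r}")

    # Imported lazily so this module loads even without Playwright installed.
    from playwright.sync_api import sync_playwright

    paths: List[str] = []
    with sync_playwright() as p:
        browser = p.chromium.launch(headless=True)
        try:
            for name, viewport in VIEWPORTS.items():
                page = browser.new_page(viewport=viewport)
                page.goto(url, wait_until=wait_until)
                path = f"{out_dir}/{name}.png"  # assumed naming scheme
                page.screenshot(path=path, full_page=True)
                paths.append(path)
                page.close()
        finally:
            browser.close()
    return paths
```

Running it requires `pip install playwright` followed by `playwright install chromium`; the resulting images would then be attached to the multi-modal QA prompt.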
Interactive
03 · Full browser-driven testing
- Turn-based tool use with 16 browser tools (navigate, click, type, evaluate, ...)
- Mandatory ACTION / VERIFY / COMPARE / REPORT testing protocol
- Auto-failure detection: failed interactions, missing test reports, suspicious patterns
- Max 20 turns per agent with tool-use tracking
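The turn limit and auto-failure rules can be sketched as a bounded loop. Everything named here is hypothetical: `next_step` stands in for the QA agent choosing its next tool call, `run_tool` stands in for the browser driver, and treating the report as a terminating `"report"` step is an assumption about how the session ends. Detection of suspicious patterns is left out.

```python
from typing import Callable, Dict, List, Optional, Tuple

MAX_TURNS = 20

# The agent returns (tool_name, arguments) given the history so far.
AgentStep = Callable[[List[Dict]], Tuple[str, Dict]]
# The tool runner returns True when the interaction succeeded.
ToolRunner = Callable[[str, Dict], bool]


def run_interactive_session(next_step: AgentStep, run_tool: ToolRunner,
                            max_turns: int = MAX_TURNS) -> Dict:
    """Drive tool calls until the agent reports or the turn limit is hit."""
    history: List[Dict] = []
    report: Optional[Dict] = None
    for turn in range(1, max_turns + 1):
        tool, args = next_step(history)
        if tool == "report":  # assumed terminating step
            report = args
            break
        ok = run_tool(tool, args)
        history.append({"turn": turn, "tool": tool, "ok": ok})

    failed = [entry for entry in history if not entry["ok"]]
    return {
        "turns": len(history),
        "report": report,
        "failed_interactions": len(failed),
        # Auto-fail on a missing report or any failed interaction.
        "auto_fail": report is None or bool(failed),
    }
```

Keeping the per-turn history is what makes tool-use tracking possible: the same record feeds both the failure check and any later audit of what the agent actually did.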
Interactive Mode · 16 Browser Tools
SDLC Mode · CLI
Same flags. Six platforms.
# Node.js
$ node swarm.js --mode sdlc --config ./swarm-config.json --threshold 0.8
# Python
$ python swarm.py --mode sdlc --config ./swarm-config.json --threshold 0.8
# Go
$ ./swarm -mode sdlc -config ./swarm-config.json -threshold 0.8
# .NET
$ dotnet run -- --mode sdlc --config ./swarm-config.json --threshold 0.8
# Web (Docker)
$ docker compose up --build # http://localhost:3000
# Mobile (Android, Kotlin / Jetpack Compose)
$ ./gradlew :app:installDevProd # adb install + launch on-device
# Key SDLC options
--mode sdlc · enable pipeline
--config <path> · role settings
--threshold <n> · 0.0 – 1.0 (default 0.8)
--max-iterations · default 10
--max-cost <n> · hard $ budget cap
--requirements <path> · input doc
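For readers wiring these options into their own tooling, the flag set above maps onto a standard `argparse` parser. This is a sketch of the documented flags and defaults only, not the source of `swarm.py`; `build_parser` and `unit_interval` are hypothetical names, and the flag destinations are whatever `argparse` derives by default.

```python
import argparse


def unit_interval(value: str) -> float:
    """Parse a float and require it to lie in 0.0 - 1.0."""
    number = float(value)
    if not 0.0 <= number <= 1.0:
        raise argparse.ArgumentTypeError("must be between 0.0 and 1.0")
    return number


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="swarm.py")
    parser.add_argument("--mode", help="'sdlc' enables the pipeline")
    parser.add_argument("--config", metavar="PATH", help="role settings file")
    parser.add_argument("--threshold", type=unit_interval, default=0.8,
                        help="quality threshold, 0.0 - 1.0 (default 0.8)")
    parser.add_argument("--max-iterations", type=int, default=10,
                        help="iteration limit (default 10)")
    parser.add_argument("--max-cost", type=float, default=None,
                        help="hard budget cap in dollars")
    parser.add_argument("--requirements", metavar="PATH",
                        help="input requirements document")
    return parser
```

Parsing `--mode sdlc --config ./swarm-config.json` with this parser yields a threshold of 0.8, an iteration limit of 10, and no cost cap, matching the defaults listed above.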
Web Control Plane
A browser-based dashboard.
Next.js 16 + React 19 with SSE-driven live dashboards, run history with full replay, conversational requirements builder, and an integrated output file browser with ZIP downloads. OIDC auth via Rdn.Identity.
Chat Builder
Conversational requirements
Chat-with-analyst flow refines requirements interactively using the Analyst role's configured model, then drops the generated requirements.md into the runner.
Live Dashboard
SSE-streamed run UI
Animated spinners per phase, per-developer and per-QA progress rows, live token/cost meters, score bar, check-in modal, and event replay on reconnect.
Output Browser
Files · preview · zip
File tree of generated code and docs, syntax-highlighted source viewer, sandboxed iframe HTML preview, and ZIP download for the whole output or any archived run.
Mobile · New
All of this, running on your phone.
The Android client adds two roles (Visual QA + Test Author), a five-dimension Quality Scorecard, on-device key vault, and a bundled benchmark template suite. Same pipeline. Zero servers.