
Meet AgentDesk: Enforced Process Discipline for Claude Code, Every Time

  • Writer: Amir Habib
  • Mar 31
  • 6 min read

Here's the honest pitch for AgentDesk.live: it makes you run tests, update your ticket, write a PR description, and get a QA pass on every single task, not just when you remember to. That's it. That's the thing worth selling. Everything else (the named agents, the personality system, the structured phases) is the mechanism by which that discipline gets enforced. But the outcome is what matters: a consistent, auditable workflow that solo developers and small teams almost never maintain on their own.


What AgentDesk Actually Does

AgentDesk is a structured workflow runner for Claude Code. When you point it at a task, from Linear, Jira, GitHub Issues, or a plain text description, it runs that task through five sequential phases, applying specialist perspectives along the way: product requirements, architecture, implementation, testing, QA, UI review, and copy review. The workflow always completes the same way: a committed branch, written tests, and an open pull request, with the ticket status updated in your tracker.

One important technical note: AgentDesk runs within a single Claude Code session. The named roles (Jane, Dennis, Sam, Bart, Vera, Luna, and Mark) are structured prompting personas applied sequentially within the same context window, not independent agents with isolated memory. This is worth understanding: the "team" is a storytelling and prompting framework, not a parallel multi-agent architecture. What you get is a single, highly structured Claude Code session that methodically applies seven different lenses to your task, in order, every time.
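To make the single-session point concrete, here is a minimal sketch of sequential persona prompting. The persona instructions and message shapes below are illustrative assumptions, not AgentDesk's actual prompts; the point is only that every "role" appends to one shared history rather than owning isolated memory.

```python
# Minimal sketch: one shared conversation, many persona "lenses".
# The persona texts here are invented for illustration, not AgentDesk's real prompts.

PERSONAS = {
    "Sam": "Respond strictly as an architect: patterns, boundaries, coupling.",
    "Vera": "Respond strictly as a test engineer: enumerate the unit tests to write.",
    "Bart": "Respond strictly as QA: raise at least two failure scenarios.",
}

def apply_personas(task: str, personas: dict) -> list:
    """Build one message history where each persona takes a turn in sequence.

    Every persona sees the full accumulated history (same context window),
    which is what distinguishes this from an isolated multi-agent setup.
    """
    history = [{"role": "user", "content": f"Task: {task}"}]
    for name, instruction in personas.items():
        history.append({"role": "user", "content": f"[{name}] {instruction}"})
        # In a real session the model would answer here; a placeholder turn
        # stands in to show the context accumulating across personas.
        history.append({"role": "assistant", "content": f"{name}: analysis of everything above"})
    return history

history = apply_personas("Add a signed-out homepage", PERSONAS)
```

Because each persona turn includes everything before it, later lenses (QA, copy review) can react to earlier ones, at the cost of a steadily growing context.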

Everything runs locally via Claude Code on your machine. A lightweight daemon connects your local environment to the AgentDesk dashboard over WebSocket, so you can trigger tasks from a browser tab and watch the session stream in real time. Your code never leaves your machine.
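The trigger-and-stream loop can be sketched roughly as follows. AgentDesk's actual wire protocol is not public, so the event shapes are invented, and an `asyncio.Queue` stands in for the WebSocket connection; the point is that only progress events travel to the dashboard, never source code.

```python
import asyncio

# Sketch of the local daemon's relay loop. Assumptions: a queue stands in
# for the WebSocket to the dashboard, and event dicts with a "type" field
# stand in for whatever AgentDesk actually streams.

async def relay(session_events: asyncio.Queue, dashboard: asyncio.Queue) -> int:
    """Forward local session progress events to the dashboard until the run ends."""
    forwarded = 0
    while True:
        event = await session_events.get()
        await dashboard.put(event)   # progress metadata only; code stays local
        forwarded += 1
        if event.get("type") == "workflow_complete":
            return forwarded

async def demo() -> int:
    session, dashboard = asyncio.Queue(), asyncio.Queue()
    for ev in [{"type": "phase", "name": "Intake"},
               {"type": "phase", "name": "Review"},
               {"type": "workflow_complete"}]:
        session.put_nowait(ev)
    return await relay(session, dashboard)
```

A real implementation would replace the dashboard queue with a WebSocket client and add reconnection handling, but the one-way, metadata-only flow is the architectural claim being made.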


The Real Value: Process You Don't Have to Remember

Most developers using Claude Code solo will write the implementation, maybe run it, commit, and push. Tests? Sometimes. PR description? Often a one-liner. Ticket status? Updated when they remember. Architecture review? Only if something felt wrong.



AgentDesk makes all of that non-negotiable. Every task goes through the same checklist, baked into the workflow:

Sam checks the architecture before implementation starts. Vera writes unit tests as part of execution, not as an afterthought. Luna flags contrast failures and spacing issues before anything is committed. Mark reviews every user-facing string. Bart does a QA pass on the full changeset and only then creates the PR. Jane updates the ticket throughout. The workflow doesn't complete until each step is done.
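The "non-negotiable" part can be expressed as a completion gate: a task simply cannot be marked done while any required step is missing. The step names below mirror the roles described above, but the class itself is an illustrative sketch, not AgentDesk's implementation.

```python
# Sketch of a checklist gate: completion fails until every step is recorded.
# Step names follow the article's role descriptions; the mechanism is assumed.

REQUIRED_STEPS = (
    "architecture_review",  # Sam
    "unit_tests",           # Vera
    "ui_review",            # Luna
    "copy_review",          # Mark
    "qa_pass",              # Bart
    "ticket_update",        # Jane
)

class TaskRun:
    def __init__(self):
        self.done = set()

    def record(self, step: str) -> None:
        if step not in REQUIRED_STEPS:
            raise ValueError(f"unknown step: {step}")
        self.done.add(step)

    def complete(self) -> str:
        missing = [s for s in REQUIRED_STEPS if s not in self.done]
        if missing:
            raise RuntimeError(f"workflow incomplete, missing: {missing}")
        return "PR created"

run = TaskRun()
for step in REQUIRED_STEPS:
    run.record(step)
```

The design choice worth noting: the gate is structural, not motivational. Skipping tests isn't a lapse you can commit past; it's an error state.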

If you're a solo developer or a small team, this matters more than any specific AI capability. Consistency beats brilliance at scale. AgentDesk gives you a repeatable process that would otherwise require significant discipline or dedicated people to maintain.


The Five Phases

1. Intake: Jane fetches the task from your tracker, reading the ticket description, checking the current branch state, recent commits, and any open PRs. This sets the shared context for everything that follows.

2. Brainstorm: Each role perspective weighs in on the approach. This is where architecture risks, edge cases, and scope questions surface, before a line of code is written. Bart will raise at least two failure scenarios. Sam will check whether the proposed approach respects existing patterns.

3. Planning: The plan is finalized before execution begins. This prevents the common failure mode in a Claude Code session: it starts coding immediately and then, ten minutes later, rewrites half of it when it discovers a constraint it didn't account for.

4. Execution: Dennis implements against the plan. Sam audits the output's architecture. Vera writes unit tests. Luna reviews UI changes for spacing, contrast, and responsiveness. Mark reviews all user-facing copy. All within the same Claude Code session, applied sequentially.

5. Review: Bart does a final QA pass and creates the pull request. The PR description references the original ticket. The ticket is updated. The team summarizes decisions made and any outstanding concerns.
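The five phases above also enforce an ordering, not just a checklist: Execution cannot begin before Planning has finished. A minimal sketch of that sequencing, with the enforcement logic assumed rather than taken from AgentDesk's source:

```python
# Sketch of phase ordering: each phase may only start after its predecessor.
# Phase names come from the article; the state machine is illustrative.

PHASES = ["Intake", "Brainstorm", "Planning", "Execution", "Review"]

class Workflow:
    def __init__(self):
        self.index = 0  # next phase allowed to start

    def start(self, phase: str) -> str:
        expected = PHASES[self.index]
        if phase != expected:
            raise RuntimeError(f"cannot start {phase}: {expected} comes first")
        self.index += 1
        return f"{phase} started"

wf = Workflow()
log = [wf.start(p) for p in PHASES]
```

Trying `wf.start("Execution")` on a fresh workflow raises, which is exactly the failure mode the Planning phase exists to prevent: coding before the plan is settled.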

A Real Ticket, Start to Finish

Here's a concrete example from AgentDesk's own development. Ticket AD-1: "Create signed-out homepage with AgentDesk intro, benefits, docs links, and auth CTAs." A UI-heavy feature with copy, layout, accessibility requirements, and auth flows, exactly the kind of task where shortcuts accumulate.

On Intake, Jane fetched the ticket and checked the branch: feature/AD-1-signed-out-homepage already existed with some prior work, and a previous PR had been closed. She surfaced the latest user comment on the ticket: the hero section needed an HTML5 animation of AI agents collaborating, speaking, coding, writing, and arguing.

During Execution, Bart checked the animation against real viewports and flagged that it was rendering too subtly on smaller screens; the connection lines and speech bubbles weren't visible enough. Luna reviewed the output: "Desktop looks great; the agents create an atmospheric hero. Color-coded nodes and floating bubbles look crisp. CTA buttons maintain good contrast. Approved." Dennis committed the visibility fixes. Bart pushed and created the PR.

Here's what the dashboard looked like at that moment:

AgentDesk dashboard: Bart creates the PR after Luna approves and Dennis commits
An end-to-end workflow in progress

What a solo Claude Code session would likely have missed: the mobile viewport rendering issue (no one asked it to check), the WCAG contrast audit, and the PR description linking back to the ticket. What AgentDesk caught because the workflow required it: all three. Total session time for that task: approximately 35 minutes, 41,280 tokens, 38 tool calls, 7 files changed.


The Role System: Useful Storytelling with Real Structure

The named agent personas, Jane, Dennis, Sam, Bart, Vera, Luna, and Mark, are a prompting architecture, not separate AI instances. Each name is a structured persona applied within the session that biases the model toward a specific lens: Jane toward user requirements, Sam toward architectural patterns, Bart toward failure modes, and Luna toward visual precision. The "personality" quotes you see in the documentation ("This logic belongs in a service, not inline in the component") are accurate representations of the kind of output each persona produces, not scripted responses, but characteristic thinking patterns that emerge from how each role is prompted.


This approach works because different role framings genuinely surface different issues from the same model. A prompt focused on QA failure modes will catch things; a prompt focused on implementation won't. The structure isn't decorative; it's doing real work. You can also add custom roles (Security Engineer, DevOps) via project settings, and they participate in all phases alongside the built-in set.


Where AgentDesk Struggles

Being honest about limitations is part of being useful. AgentDesk has real ones.

Context window pressure on large tasks: Because all phases run within a single Claude Code session, tasks that touch many files or require deep context across a large codebase will exhaust the context window before the workflow completes. The sequential phase structure adds overhead; by the time you reach Execution, the Brainstorm and Planning exchanges are already consuming context. For very large refactorings or cross-cutting changes, this is a real constraint.


Sequential role-switching on tightly coupled tasks: The five-phase structure assumes phases are reasonably separable. On tasks where planning and implementation are deeply interleaved, where you can't know the architecture until you've partially built it, the rigid sequencing can produce a plan that immediately needs revision during execution, sometimes creating confusion rather than reducing it.

Not a replacement for human review: a PR created by Bart is not the same as a PR reviewed by a senior engineer. The QA pass catches obvious regressions and surface-level issues, but it doesn't substitute for domain expertise, business logic knowledge, or the kind of review that catches subtle correctness bugs. Treat the PR as well-structured and process-compliant, not as pre-approved.


Best fit: focused, well-defined tickets: AgentDesk works best on tasks with a clear scope: a new feature, a UI change, a bug with a known root cause, or a refactor of a specific module. It works less well on vague or exploratory tickets where the real work is figuring out what to do, not doing it.


Tracker Integration: The Closing of the Loop

AgentDesk connects to Linear, Jira, and GitHub Issues. The integration is end-to-end: Jane reads the ticket requirements at the start, status updates are posted throughout, the branch is named after the ticket ID, and the PR references the original ticket with a description of what each phase produced. Run agentdesk team KEN-213, and the workflow handles everything from requirements intake to PR creation. No copy-pasting. No manual status updates.
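The branch-naming convention is visible in the AD-1 example above (feature/AD-1-signed-out-homepage). A hypothetical helper showing how a ticket ID and title could map to that shape; AgentDesk's real slug rules are unknown, so the regex here is an assumption:

```python
import re

# Hypothetical branch-name helper matching the convention seen in the
# article (feature/<ticket-id>-<slugified-title>). Slug rules are assumed.

def branch_name(ticket_id: str, title: str) -> str:
    """Lowercase the title, collapse non-alphanumeric runs to hyphens."""
    slug = re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")
    return f"feature/{ticket_id}-{slug}"
```

For example, `branch_name("AD-1", "Signed-out homepage")` yields feature/AD-1-signed-out-homepage, which matches the branch Jane found during Intake.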

 
 
 
