The supervision spectrum
People talk about autonomous agents as if autonomy were a switch: either a person does the work or an agent does. In practice there is a long spectrum between those poles, and knowing where a system honestly sits on it tells you more than any claim about what the system can do.
At one end, the machine proposes and the person does everything else. Autocomplete. The person asks, reads, judges, edits, and acts. The machine’s contribution is keystrokes saved.
A step further, the machine drafts and the person approves. Most products sold as agents today live here. The agent prepares the email, the report, the code change, and a person reviews every one before it goes anywhere. This is genuinely useful. It is also supervision in its purest form: nothing happens without human attention, which means human attention is still the limiting resource. The bottleneck has not been removed. It has been moved from production to review.
Further still, the machine acts within a session and the person reviews outcomes rather than actions. The agent handles the conversation, makes the booking, fixes the bug, and someone checks results afterward, sampling instead of inspecting everything. Attention begins to scale. This is the field’s honest frontier today.
Then, the machine holds a role across time and the person reviews exceptions. The agent owns the pipeline, the inbox, the deployment process. It works while nobody is watching. People hear about the cases the agent itself flags as unusual, plus whatever audits catch. This is what employment actually looks like. Very few systems anywhere operate at this level for work that matters.
Finally, the machine holds the role and also decides what the role requires. Nobody reviews exceptions, because the agent’s judgment about what counts as an exception is itself trusted. Human oversight exists at the level of goals, not operations. No system should be run this way today. Some will be, eventually.
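One way to make the five levels concrete is to write them down as data. The sketch below is illustrative only: the names `Level`, `Supervision`, `SPECTRUM`, and `attention_scales` are hypothetical and not part of any existing system. It records, for each level, what the machine does and what the person still reviews.

```python
from dataclasses import dataclass
from enum import IntEnum


class Level(IntEnum):
    """The five points on the spectrum, ordered by how little is reviewed."""
    PROPOSE = 1        # machine proposes, person does everything else
    DRAFT = 2          # machine drafts, person approves each item
    SESSION = 3        # machine acts within a session, person reviews outcomes
    ROLE = 4           # machine holds a role across time, person reviews exceptions
    SELF_DIRECTED = 5  # machine also decides what the role requires


@dataclass(frozen=True)
class Supervision:
    machine_does: str
    human_reviews: str


# Descriptions paraphrase the essay; the structure itself is an assumption.
SPECTRUM = {
    Level.PROPOSE: Supervision("suggests text", "everything, and also acts"),
    Level.DRAFT: Supervision("prepares the artifact", "every draft before it ships"),
    Level.SESSION: Supervision("completes the task", "sampled outcomes afterward"),
    Level.ROLE: Supervision("owns an ongoing function", "flagged exceptions and audits"),
    Level.SELF_DIRECTED: Supervision("owns the function and its scope", "goals only"),
}


def attention_scales(level: Level) -> bool:
    """Human attention stops being per-item once review moves to outcomes."""
    return level >= Level.SESSION
```

Under this reading, `attention_scales(Level.DRAFT)` is false and `attention_scales(Level.SESSION)` is true, which matches the claim that drafting moves the bottleneck to review instead of removing it.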
Two things about this spectrum seem worth saying plainly.
First, progress along it is not driven by intelligence. Moving from drafts to sessions to roles depends on memory, verification, boundaries, and accountability: structural elements that have little to do with how smart the model is. A smarter model writes better drafts. It does not, by itself, become more trustworthy with a role. These are different axes, and confusing them is how organizations end up deploying systems one level beyond what the structure supports. That is where the painful stories come from.
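The separation of the two axes can be sketched as a gate that takes structure as input and ignores model quality entirely. This is a minimal illustration under stated assumptions: the essay names the four structural elements but does not say which level needs which, so the `REQUIRED` mapping below is invented for the example, as are the names `supported_level` and `is_overreach`.

```python
from enum import IntEnum
from typing import AbstractSet


class Level(IntEnum):
    PROPOSE = 1
    DRAFT = 2
    SESSION = 3
    ROLE = 4


# Assumption: this assignment of elements to levels is hypothetical.
REQUIRED = {
    Level.PROPOSE: frozenset(),
    Level.DRAFT: frozenset(),
    Level.SESSION: frozenset({"verification", "boundaries"}),
    Level.ROLE: frozenset({"verification", "boundaries", "memory", "accountability"}),
}


def supported_level(structure: AbstractSet[str]) -> Level:
    """Highest level whose requirements, and all lower ones, are met.

    Note what is absent: there is no parameter for how capable the model is.
    """
    best = Level.PROPOSE
    for level in sorted(REQUIRED):
        if REQUIRED[level] <= structure:
            best = level
        else:
            break
    return best


def is_overreach(deployed: Level, structure: AbstractSet[str]) -> bool:
    """True when a system runs above the level its structure supports."""
    return deployed > supported_level(structure)
```

For example, `is_overreach(Level.ROLE, {"verification", "boundaries"})` returns true in this sketch: the system is deployed one level beyond what the structure supports, however good its drafts are.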
Second, each step changes the economics completely. Supervised systems save labor. Autonomous systems hold labor. A tool that makes a team faster is worth something. A system that owns a function outright is worth something of a different kind. The gap between the two, far more than model quality, explains why so much automation that seems obviously possible has not actually happened.
The work of moving systems along this spectrum, level by level, with structure earning each step, is the work. Verse is doing it for individual roles now. The laboratory exists to make sure the next level is buildable when the current one is done.
Artemis Labs