Framing requests
The quality of Operator's output is tightly coupled to the quality of your instruction. Not because the model needs hand-holding — it can fill in context — but because ambiguous scope leads to over-reach: Operator will make reasonable assumptions and keep going.
Start with a clear outcome, not a method. "Refactor the auth module" is a method. "Make the auth module easier to test — today we can't mock the token provider" is an outcome. The difference matters because Operator will sometimes find a better method than the one you had in mind.
Scope is your most powerful lever. If you want Operator to work only inside src/billing/, say so. If you want it to avoid touching the database schema for now, say that. Unspecified scope is read as implicit permission — and that is usually fine, but on sensitive codebases you will want to be explicit.
Constraints are information, not restrictions. Telling Operator "we are on Node 18, no ESM" or "this must be backwards-compatible with the existing API contract" is not limiting it — it is giving it the context it needs to make the right decisions without stopping to ask.
Use concrete examples liberally. If you want output in a particular style — a certain test structure, a commit message convention, a particular error handling pattern — paste one example. Operator will generalise from it across the whole task.
A well-framed request typically runs without a single clarifying question. If Operator is asking many questions before starting, that is a signal the request was under-specified.
Letting it plan
Operator does not just execute instructions — it reasons about the best path before touching a file. Understanding how this planning phase works helps you know when to intervene and when to let it run.
When you submit a task, Operator first reads the relevant parts of the codebase, traces dependencies, and constructs a plan. This might look like inactivity. It is not. The reading phase is often longer than the writing phase — especially on a first task in a new codebase.
Plans are surfaced as text before action when the task is ambiguous or risky. You will see something like: "I plan to modify X, Y, and Z. Y touches the payment flow — should I proceed?" This is not indecision; it is the approval gate working correctly.
For large tasks, Operator decomposes the work into phases and tracks them internally. You do not need to break work into subtasks yourself — though for very large refactors, giving Operator a phased structure ("first just the model layer, then the routes") can reduce the size of each diff and make review easier.
You can influence the plan by asking Operator to explain its approach before acting. "Before you make any changes, walk me through what you plan to do" will produce a plain-language plan you can approve, redirect, or narrow before a single character is written.
Planning is not overhead. The biggest source of wasted agent time is acting on a misunderstood task. A 30-second planning phase prevents a 10-minute rollback.
Reading the diff
Operator's output is a git diff. Reading it well is a skill — and a fast one once you know what to look for. The goal is not to read every line; it is to spot the few lines that matter.
Look at the file list first. If ten files changed and nine are in src/auth/, but one is in infrastructure/terraform/, that is the file to read first. Scope drift — changes outside the expected radius — is the most common class of Operator output worth redirecting.
Deletion is higher-signal than addition. Added lines are typically low-risk (new tests, new comments, new helpers). Deleted lines are where intent lives. Scan the red lines in each file before reading the green ones.
Check for unintended test relaxation. A test suite that goes from 12 passing tests to 11, or that gains a .skip(), is a class of diff to read carefully. Operator will not remove tests without reason, but it will occasionally update a test to match new behaviour — you need to decide if the new behaviour is correct.
Comments and docs are signal, not noise. When Operator adds a comment explaining why something is done a certain way, that comment is documentation of a decision. It is worth reading even if you trust the code change itself.
A diff review is faster than you think once you have the pattern: file list → deleted lines → test changes → comments. Most diffs need less than two minutes.
Approval gates
Operator pauses for your approval before actions that are hard to reverse — pushes to remote branches, merges, schema migrations, infrastructure changes, anything that crosses a boundary outside the local workspace.
Gates are not interruptions. They are the handshake between Operator's judgment and yours. For most tasks the only gate is the final "merge this PR?" — the rest runs unattended. For tasks touching production, gates are more frequent.
Each gate presents a summary of what Operator is about to do and why. Approve to proceed. Reject to block that step. Redirect to send Operator back to re-plan with new constraints. You do not need to know the implementation details — you are approving the intent, not the code.
You can configure gates at the workspace level. Require a gate before any network call, before any file deletion, before any commit. Or require no gates at all for a trusted internal repo. Defaults are conservative — you loosen them as trust is established.
Gates accumulate context. If you reject a gate and explain why, Operator carries that reasoning forward. A rejection is not a restart — it is a course correction, and the work that was done before the gate stays in place.
The gate system is designed so that you can be gone for most of a task and still catch the one moment that genuinely needs your judgment.
When to redirect
Redirection is not failure. It is the normal mode of working with an agent on a complex task. Knowing when and how to redirect — rather than starting over — is what keeps a session productive.
Redirect when the direction is wrong, not when a detail is wrong. If Operator is solving the problem using the right approach but has made a small mistake in a function signature, that is a correction — give it the correct signature and let it continue. If Operator is solving the wrong problem entirely, that is a redirect — it needs a new frame.
Surface the underlying constraint, not the symptom. If you find yourself writing "don't do X", ask what constraint X violates. "Don't modify the database schema" is less useful than "we need this to be backwards-compatible with clients on v2 of the API, so the schema must not change". The second version lets Operator solve the problem differently rather than just avoiding the thing you named.
Partial work is not wasted. When you redirect mid-task, the files Operator has already written stay on disk. On the next run, Operator reads them and builds on them rather than starting from scratch. You are steering an ongoing session, not resetting a test.
Some redirects are architectural. "We've been adding a new cache layer but I think we need to step back — this complexity isn't worth it" is a valid redirect. Operator will assess what to unwind, what to keep, and propose a simpler path. It handles the logistics; you handle the judgment.
A session with two or three redirects is a normal, healthy session. A session that required no redirects either had a very well-scoped request or a task that didn't need an agent.
Recovering a run
Occasionally a run will stall, error, or produce output you can't use. Recovery is fast when you know the tools available.
Most errors are not errors — they are clarification requests that look like stops. If a run ends with "I can't proceed because X", X is almost always something you can provide: a missing credential, a dependency that needs to be installed, a conflict in two instructions you gave. Read the stop message carefully before treating it as a failure.
Use the session log before restarting. Every step Operator took is logged — the files it read, the plan it formed, the gates it hit, the actions it took. The log tells you exactly where things went off-track so you can redirect to that point rather than rerunning the whole task.
For runs that produced partial output, commit the good parts first. A task that got halfway done and then stalled has produced real value. Commit what's correct, then start a narrower follow-up task from that checkpoint. Operator will read the committed state and continue from there.
Persistent failures usually point to an environment problem, not a task problem. If three consecutive runs fail at the same point, the issue is likely a missing dependency, a broken test environment, or an API credential that has expired — not something that a better instruction will fix. Check the environment first.
Recovery is part of the workflow, not an exception to it. Every experienced team running agents has a recovery routine — and once you have one, a stalled run is a five-minute interruption, not a lost afternoon.