The Agent Didn't Fail. It Was Just Told Too Much, Too Soon.
Why progressive disclosure is the most underrated concept in agent reliability, and why hooks are the primitive that finally makes it possible.
There’s a pattern I keep seeing in post-mortems of failed agent deployments. Someone shares a trace. The agent took a wrong turn, made a bad call, or worse, did something irreversible. And the instinct is always to blame the model. “Ah, the LLM hallucinated.” “It didn’t follow instructions.”
But when you scroll back through the trace and look at what the agent actually knew at the moment it made that decision, you almost always find the same thing: everything was dumped upfront, in one enormous blob, and the piece of context that would have guided that specific decision was buried 14,000 tokens earlier, under a pile of stuff that wasn’t relevant yet.
The model didn’t fail. The disclosure timing failed.
A UI Concept That Should Have Gone to Agents Years Ago
Progressive disclosure is one of the oldest ideas in interface design. The principle is simple: don’t overwhelm users with everything at once. Surface information when, and only when, it becomes relevant to what they’re currently doing. A settings page doesn’t show advanced options until you click “Advanced.” A checkout flow doesn’t ask for your shipping address until you’ve confirmed your cart.
The insight sounds obvious, but the engineering implication is real. Information has a correct moment, and delivering it before that moment arrives isn’t neutral. It competes for attention. It anchors the user to a version of the world that no longer applies. It makes the relevant signal harder to find when the time finally comes.
Now think about how we build agents. Almost universally, we do the opposite. The entire system prompt goes in upfront. Security policies, tool descriptions, domain rules, credentials, operational constraints, historical context, all of it, in one shot, before the agent has done a single thing, before we have any idea which branch of the problem it’s going to walk down.
This is the equivalent of handing a new employee, on their first day, a binder containing the entire company handbook, every internal policy, the login credentials for every system they might ever touch, and a transcript of every incident that’s ever happened, and then saying: “Okay, go handle this customer complaint.” Before they’ve even sat down.
We don’t do that to people. But we do it to agents constantly.
Two Ways This Actually Breaks
Context timing failures tend to show up in two distinct flavors, and it’s worth naming both because they look different in traces.
The first is context overload. The agent is given too much information, too early. Competing signals create noise. The model has to prioritize relevance across a context window that spans things that matter right now and things that will matter in step seven, if step seven even happens. Token budgets get consumed by boilerplate. Attention dilutes. A simple retrieval task goes sideways because the agent is swimming in information about what to do when things go wrong, which isn’t relevant yet.
The second is subtler and actually more dangerous: stale context. The agent was given accurate information at the start, but the world changed while it was running. The context anchoring its decisions reflects an earlier state. It proceeds confidently on a belief that is no longer true. This is especially painful in long-running agents doing research loops, multi-file edits, or anything where the clock runs in minutes rather than seconds.
In both cases, the fix isn’t a better model. It’s a better understanding of when each piece of context belongs.
What Hooks Actually Are (and Why Most People Misread Them)
For a long time, there was no clean way to implement timed context delivery. You could do gymnastics with multi-agent handoffs, careful turn management, or dynamic prompt construction, but nothing felt right. There was no causal trigger. You were always guessing when the right moment was.
Hooks change this.
In Claude Code and most modern agentic harnesses, hooks are intercept points wired to specific events in the agent lifecycle: PreToolUse, PostToolUse, PreCompact, Stop. They fire not on a timer, not on a heuristic. They fire because the agent itself took an action.
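In Claude Code, that wiring lives in a settings file such as .claude/settings.json. A minimal sketch, where the matcher is a pattern over tool names and both the tool name and script path are illustrative placeholders:

```json
{
  "hooks": {
    "PreToolUse": [
      {
        "matcher": "query_database",
        "hooks": [
          { "type": "command", "command": "node .claude/hooks/pre-tool.js" }
        ]
      }
    ]
  }
}
```

Every tool call whose name matches the pattern pauses while the script runs, and the script’s exit code and output decide what happens next.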
The agent calling a tool is a ground-truth signal about where it is in the problem, what it’s trying to do right now, and therefore what context is now relevant. When the agent calls query_database, it just told you it needs the schema. When it touches a financial data endpoint, it just told you the compliance rules apply. You don’t dump credentials into the system prompt and hope the agent uses them at the right time. The agent calls the tool that needs them, and you inject right then.
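A minimal sketch of that handoff as a Node hook script. The stdin payload fields (tool_name, tool_input) follow Claude Code’s hook contract; the hookSpecificOutput.additionalContext output shape is an assumption about how a hook appends to the model’s context without blocking, so verify the exact field names against your harness’s docs:

```js
#!/usr/bin/env node
// PreToolUse hook: when the agent calls query_database, hand it the schema.
// The payload arrives as JSON on stdin; field names follow Claude Code's
// hook contract. The additionalContext output shape is an assumption:
// check your harness's documented output contract.
const fs = require("fs");

const event = JSON.parse(fs.readFileSync(0, "utf8"));

if (event.tool_name === "query_database") {
  // The agent just told us, causally, that it needs the schema. Inject now.
  process.stdout.write(JSON.stringify({
    hookSpecificOutput: {
      hookEventName: "PreToolUse",
      additionalContext:
        "Schema for the database you are about to query:\n" +
        fs.readFileSync("db/schema.sql", "utf8"),
    },
  }));
}
// For every other tool: exit 0 with no output, and the call proceeds untouched.
```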
The shift in mental model is from broadcasting context to delivering it. One is a fire hose. The other is a targeted handoff, timed by the agent’s own behavior.
Most teams read hooks as a safety primitive. They’re how you stop an agent from doing something it shouldn’t. That framing is accurate but incomplete, and it’s caused a lot of people to treat hooks as a purely defensive tool rather than an operational one. The deeper framing is that hooks are a context orchestration primitive. Yes, you can use them to block. But you can also use them to inject, enrich, and time. They sit at the causal intersection between what the agent knows and what the agent does, which is exactly the right place to intervene if you care about disclosure timing.
What This Looks Like in Practice
Four patterns become natural once you start thinking of hooks as a disclosure mechanism. Each comes with a short sketch below; treat the tool names, file paths, and field names in the sketches as illustrative.
Secrets and credentials on demand. Instead of loading all API keys into the environment upfront, you inject them as a PreToolUse hook fires for the specific tool that needs them. The agent never holds credentials it isn’t about to use. The blast radius of a prompt injection or leakage bug shrinks significantly.
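Claude Code’s stock hook contract covers allow, deny, and feedback text; injecting per-invocation environment variables needs harness support. So the handler below is a hypothetical interface, a sketch of the shape rather than a drop-in, with fetchSecret standing in for a real vault client:

```js
// Hypothetical harness interface: a PreToolUse handler that can return
// environment variables to be merged into the tool invocation. Tool names,
// secret names, and the handler signature are all illustrative.
const TOOL_SECRETS = {
  stripe_charge: ["STRIPE_API_KEY"],
  send_email: ["SENDGRID_API_KEY"],
};

// Placeholder for a real vault client (Vault, KMS, SSM, ...).
async function fetchSecret(name) {
  return process.env[`VAULT_${name}`];
}

async function onPreToolUse(event) {
  const needed = TOOL_SECRETS[event.tool_name] || [];
  const env = {};
  for (const name of needed) env[name] = await fetchSecret(name);
  // The credential exists only for this one invocation. It never sits in
  // the system prompt, the model's context, or the ambient environment.
  return { env };
}

module.exports = { onPreToolUse };
```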
Compliance context at regulated surfaces. If your agent operates in a domain with policy constraints (financial services, healthcare, legal), you don’t front-load all the rules. You surface the relevant policy subset when the agent touches a regulated surface. The rules are fresh, scoped, and causally tied to the exact action they’re meant to govern.
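A sketch of that scoping as a PreToolUse hook script, with the same caveat as before on the output shape:

```js
#!/usr/bin/env node
// PreToolUse hook: inject the policy subset only at regulated surfaces.
const fs = require("fs");

// Map regulated tools to the rules that govern them (illustrative).
const REGULATED = {
  transfer_funds: "policies/payments.md",
  read_patient_record: "policies/phi-handling.md",
};

const event = JSON.parse(fs.readFileSync(0, "utf8"));
const policyFile = REGULATED[event.tool_name];

if (policyFile) {
  // Fresh, scoped, and causally tied to the action it governs.
  process.stdout.write(JSON.stringify({
    hookSpecificOutput: {
      hookEventName: "PreToolUse",
      additionalContext:
        "Rules governing the call you are about to make:\n" +
        fs.readFileSync(policyFile, "utf8"),
    },
  }));
}
```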
Failure history on retry. When an agent is about to attempt something for the second or third time, that’s the moment to surface what happened before and why it failed, not before the first attempt when that information is just noise. The PostToolUse result from a failed call is your trigger. The retry gets context the initial attempt didn’t need.
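Sketched as one script registered for both events: PostToolUse records failures keyed by the exact call, and PreToolUse surfaces the history only when that same call comes around again. What counts as “failed” and the output field names are assumptions to adapt:

```js
#!/usr/bin/env node
// Failure history on retry. Register this script for both PreToolUse and
// PostToolUse; the payload's hook_event_name field tells us which one fired.
const fs = require("fs");
const crypto = require("crypto");

const LOG = "/tmp/tool-failures.json";
const event = JSON.parse(fs.readFileSync(0, "utf8"));

// Key on the exact tool call, so only true retries match.
const key = crypto.createHash("sha256")
  .update(event.tool_name + JSON.stringify(event.tool_input))
  .digest("hex");
const log = fs.existsSync(LOG) ? JSON.parse(fs.readFileSync(LOG, "utf8")) : {};

if (event.hook_event_name === "PostToolUse") {
  // Crude failure check: what "failed" means depends on your tools.
  const failed = JSON.stringify(event.tool_response ?? "").includes("error");
  if (failed) {
    (log[key] ||= []).push({ at: Date.now(), response: event.tool_response });
    fs.writeFileSync(LOG, JSON.stringify(log));
  }
} else if (event.hook_event_name === "PreToolUse" && log[key]?.length) {
  // Retry detected: this exact call failed before. Now the history matters.
  const last = JSON.stringify(log[key].at(-1).response).slice(0, 500);
  process.stdout.write(JSON.stringify({
    hookSpecificOutput: {
      hookEventName: "PreToolUse",
      additionalContext:
        `This exact call failed ${log[key].length} time(s) already. ` +
        `Most recent result: ${last}`,
    },
  }));
}
```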
Expensive context behind demand gates. Long documents, full database schemas, rich domain knowledge. These are expensive to include and mostly wasted if the agent never touches the relevant surface. Gating them behind a PreToolUse hook for the specific tool that requires them keeps the working context lean for 90% of the run and rich exactly when needed.
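A sketch of the gate, with a once-per-session guard so the artifact is paid for at most once. File path, tool name, and payload/output field names are illustrative as before:

```js
#!/usr/bin/env node
// PreToolUse hook: a large style guide stays on disk until the agent
// actually touches the docs tool, and is injected at most once per session.
const fs = require("fs");

const event = JSON.parse(fs.readFileSync(0, "utf8"));
const marker = `/tmp/styleguide-sent-${event.session_id}`;

if (event.tool_name === "edit_docs" && !fs.existsSync(marker)) {
  fs.writeFileSync(marker, "1");
  process.stdout.write(JSON.stringify({
    hookSpecificOutput: {
      hookEventName: "PreToolUse",
      // For the other 90% of the run, the context never pays these tokens.
      additionalContext: fs.readFileSync("docs/style-guide.md", "utf8"),
    },
  }));
}
```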
Each pattern has the same underlying structure: the agent’s action is the signal, and the hook is the mechanism that translates that signal into the right context delivery.
Why This Stays Underrated
Part of the issue is the reflex hooks have to compete with. The default fix for agent failures is a larger, more detailed system prompt. More rules. More examples. More coverage. And to be fair, this often helps in the short term. But it’s treating the symptom. A larger prompt delivered at the wrong time is still delivered at the wrong time. You’ve added more information for the agent to work through in a state where most of it isn’t yet relevant.
The other part is the framing problem I mentioned earlier. Hooks shipped as guardrails, and that’s how they got adopted. Teams use them to add an “are you sure?” before destructive operations, which is useful but narrow. The broader surface, hooks as context delivery infrastructure, hasn’t been explored much yet, partly because it’s a newer pattern and partly because the tooling to do it well (hooks in Claude Code, structured lifecycle events in agent frameworks) is itself relatively new.
The Reliability Connection
When you look at transient failures (the category of agent failures that are non-deterministic, hard to reproduce, and frustrating to debug), a significant fraction of them are, at their root, disclosure timing bugs. The agent wasn’t wrong in some fundamental sense. It just made a decision in a context that was underspecified, overloaded, or stale relative to what that specific decision required.
Right context at the right moment produces fewer hallucinations, because the model is working with a crisp, current, causally relevant signal rather than reconstructing from noise. It produces fewer irreversible actions, because the constraints governing those actions are present when the actions are being considered, not buried under earlier instruction. It burns fewer tokens because the working context stays lean.
This is infrastructure-level thinking, not prompt-level thinking. It’s not about writing better prompts. It’s about building systems that understand the lifecycle of an agent and deliver information in alignment with that lifecycle.
Where This Goes
We’re early. Most production agent frameworks still treat context management as a prompt engineering problem rather than a systems problem. But as agents run longer, touch more systems, and operate with greater autonomy, the cost of bad disclosure timing goes up. The gap between what agents could do and what they actually do reliably will increasingly be explained not by model capability but by runtime context architecture.
The interesting open question at the frontier is whether agents can eventually learn to request disclosure rather than wait for it. Signaling to the harness what they’d need to know to act more confidently, and letting the system decide whether and when to surface it. That would close the loop in a way that static hook rules can’t.
But even before we get there, the shift from “dump everything upfront” to “deliver context causally via hooks” is a real leap that most teams building production agents haven’t made yet. And it’s the kind of thing that, once you see it, you can’t unsee it in every trace you read.
Getting Started with Hooks Today
If you want to start doing this in Claude Code without writing hook plumbing from scratch, we built Failproof AI to make this easier.
It installs as a global CLI, and a single command wires it into Claude Code’s hook system. From that point, every tool call routes through Failproof before and after it fires. You get 30 built-in policies out of the box covering the most common failure modes — blocking destructive commands, preventing secrets from leaking into context, keeping the agent inside project boundaries, catching accidental pushes to main. Things most teams only discover after something goes wrong.
The part that’s more relevant to everything above is custom policies. You write your own in plain JavaScript around three primitives: allow, deny, and instruct. The instruct primitive is the one that does context delivery. It doesn’t block the operation; it injects a message into Claude’s context at exactly the moment the tool fires. So if you want the database schema to appear only when the agent runs a query command, you write a policy that matches on that specific tool call and uses instruct to inject it right then. The schema never sits in the system prompt. It arrives exactly when it’s needed, tied causally to the action that needed it.
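That’s the same schema-on-demand idea from earlier, expressed as a policy. A sketch of what such a policy file might look like; the allow/deny/instruct primitives are Failproof’s, but the export shape and field names here are assumptions, so check the repo’s policy docs for the real API:

```js
// .failproofai/policies/db-schema.js (shape is an assumption; see repo docs)
const fs = require("fs");

module.exports = {
  name: "inject-db-schema",
  // Assumed matcher: fire only for the database query tool.
  match: { tool: "query_database" },
  handler: ({ instruct }) => {
    // Not a block: the call proceeds, with the schema landing in context
    // at exactly the moment the query fires.
    return instruct(
      "Current database schema:\n" + fs.readFileSync("db/schema.sql", "utf8")
    );
  },
};
```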
You can also drop policy files into a .failproofai/policies/ folder in your project repo and they load automatically, no config changes needed. Commit the folder, and every person on the team gets the same policies on their next pull. It’s the same progressive disclosure idea applied at the team level — shared context standards that travel with the codebase.
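So with the sketch above committed, the layout is just:

```
.failproofai/
  policies/
    db-schema.js    # loads automatically on the next session
```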
There’s also a local dashboard where you can browse sessions, see exactly which policies fired on which tool calls, and review the full trace. Useful for debugging disclosure timing issues after the fact. Everything runs locally and nothing leaves your machine.
The repo is at github.com/exospherehost/failproofai. If this framing resonates with how you’re thinking about agent reliability, a star goes a long way, and if you’ve built hook patterns we haven’t thought of yet, we’d genuinely love a PR or just a conversation.