Anthropic recently published how it runs self-service analytics on Claude. One result caught my eye: context + skills took its analytics agent from 21% accuracy to consistently above 95%.
It highlights that generating SQL is the easy part. The hard part is everything underneath it: canonical datasets, a semantic layer, lineage, maintained skills, and provenance on every answer.
That jump came from the foundation, not a bigger model. With the context right, the agent on top matters much less.
Why Agents Alone Fail
In addition to cost, three context problems keep coming up.
Entity ambiguity. A term like "active users" or "revenue" has several definitions in the warehouse. The agent picks one and writes valid SQL against the wrong definition.
Staleness. The definition was right when written. Then the pipeline changed and the skill was never updated.
Retrieval failure. The right definition exists somewhere, but the agent can't find it, or grabs the wrong version.
Two of these, staleness and retrieval, can't be fixed easily by prompting alone. They need the context to be a versioned, owned asset wired to the pipeline it describes.
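To make "a versioned, owned asset wired to the pipeline it describes" concrete, here is a minimal Python sketch. Everything in it is an assumption for illustration: the `MetricDefinition` schema, the field names, and the idea of fingerprinting the pipeline source with a hash are hypothetical, not how Anthropic's system is built. It only shows that staleness and retrieval become checkable once each definition carries a version, an owner, and a fingerprint of the pipeline it was reviewed against.

```python
from __future__ import annotations

import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class MetricDefinition:
    """One owned, versioned definition of a business term (hypothetical schema)."""
    term: str
    version: int
    owner: str
    sql: str
    # Fingerprint of the pipeline source at the time this definition was last reviewed.
    pipeline_fingerprint: str


def fingerprint(pipeline_source: str) -> str:
    """Hash the pipeline source so that any change to it is detectable."""
    return hashlib.sha256(pipeline_source.encode("utf-8")).hexdigest()


def is_stale(definition: MetricDefinition, current_pipeline_source: str) -> bool:
    """A definition is stale if the pipeline changed after it was last reviewed."""
    return definition.pipeline_fingerprint != fingerprint(current_pipeline_source)


def resolve(term: str, registry: list[MetricDefinition]) -> MetricDefinition:
    """Return the highest-version definition for a term, failing loudly if none exists."""
    matches = [d for d in registry if d.term == term]
    if not matches:
        raise LookupError(f"no definition registered for {term!r}")
    return max(matches, key=lambda d: d.version)


if __name__ == "__main__":
    # Illustrative pipeline sources; the second one adds a filter after the review.
    pipeline_v1 = "SELECT user_id FROM events WHERE event_date >= CURRENT_DATE - 30"
    pipeline_v2 = pipeline_v1 + " AND is_internal = FALSE"

    registry = [
        MetricDefinition("active_users", 1, "growth-team",
                         "SELECT COUNT(DISTINCT user_id) FROM events",
                         fingerprint(pipeline_v1)),
        MetricDefinition("active_users", 2, "growth-team",
                         "SELECT COUNT(DISTINCT user_id) FROM events_30d",
                         fingerprint(pipeline_v1)),
    ]

    latest = resolve("active_users", registry)
    print(latest.version, is_stale(latest, pipeline_v1), is_stale(latest, pipeline_v2))
```

In this sketch, retrieval is deterministic (the agent gets the latest version or an error, never a guess), and a pipeline change flags the definition for its owner instead of silently leaving the skill out of date.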
Anthropic tried the shortcut of handing the agent the raw query corpus, and accuracy barely moved. As they put it: "The information was there, the agent saw it, and it still didn't use it."



