Web Project Studios

Field notes

The prompt library that nobody uses

14 May 2026

ai-workflow-ops, prompt-management, workflow

I have sat in a screen-share with an operations manager who was genuinely proud of their Notion prompt library. Forty-seven prompts, colour-coded by department, with a cover page that said "Last updated: Q3 2025." It was Q1 2026. Three of the prompts referenced GPT-4 by name in a way that assumed a specific behaviour that no longer existed. One linked to an internal doc that had been archived. Nobody had flagged any of this because nobody was looking.

This is not a story about a bad team. It is a story about a library that was built to be launched, not maintained.

A prompt library is not a product. It is infrastructure. The distinction matters because products get launched and celebrated. Infrastructure gets maintained or it degrades.

When an agency builds a shared prompt library, they are usually doing it at peak enthusiasm: the AI pilot is working, the team is energised, someone books a half-day workshop to document everything. The output is real and often genuinely useful. The prompts work. The structure makes sense. People use it.

Then the model updates. Or the workflow it was built around changes. Or the person who understood why each prompt was written the way it was leaves. And the library does not break all at once. It drifts. One prompt stops producing reliable output. Another becomes redundant when a tool changes. A third gets quietly abandoned in favour of a version someone wrote in their own notes and never shared back.

By month six, the library is a historical document. By month nine, people have stopped opening it. By month twelve, a new joiner asks if there is a prompt library and someone says "yes, but don't trust it."

This is the same failure mode I described in why most AI pilots fail: no named owner, no error handling, no cadence. The pilot ends not with a decision but with a slow stop.

The immediate cause is always model behaviour change or workflow change. But those are triggers, not causes. The cause is that the library was built as a deliverable rather than a system.

When a prompt is written during a workshop, the implicit contract is: someone built this, it works, use it. There is no implicit contract about what happens when it stops working. No one is responsible for noticing. No one is responsible for fixing it.

Compare this to how a sensible agency handles its CRM configuration. If a pipeline stage breaks, someone owns that. There is a person whose job includes knowing the CRM is functioning. They may not fix it themselves, but they notice and they escalate. The prompt library has no equivalent person. It was built by a committee and inherited by nobody.

The second reason is that prompts are invisible when they degrade. A broken CRM field produces an obvious error. A prompt that has drifted produces output that is slightly worse, slightly less reliable, slightly off-brand. The person using it might not even notice, or they might notice and assume they are doing something wrong. They adjust their own behaviour rather than flagging the prompt.

This is how you end up with a team that has quietly stopped using the shared library and is running their own personal versions, inconsistently, with no oversight.

The fix is not a better Notion template. It is named ownership and a review cadence baked into the metadata of every prompt.

Here is the metadata structure I recommend for any shared prompt in a team library:

prompt:
  id: "listing-description-residential-v3"
  name: "Residential listing description"
  owner: "sarah.okonkwo@agency.co.uk"
  department: "lettings"
  model_tested_on: "gpt-4o (2025-11)"
  last_reviewed: "2026-02-01"
  next_review_due: "2026-05-01"
  status: "active"  # active | under-review | deprecated
  linked_workflow: "rightmove-listing-upload"
  known_issues: ""
  change_log:
    - date: "2026-02-01"
      author: "sarah.okonkwo@agency.co.uk"
      note: "Updated tone guidance after brand refresh. Removed GPT-4 temperature reference."
    - date: "2025-11-10"
      author: "james.reid@agency.co.uk"
      note: "Initial version."

The fields that matter most are owner, next_review_due, and status. Everything else is useful context. Those three are the difference between a library and an archive.

The owner is a named individual, not a team. Teams do not notice things. Individuals do. The review date is quarterly by default, monthly for any prompt that feeds a client-facing or compliance-adjacent workflow. The status field means that a deprecated prompt does not disappear: it stays visible, marked as deprecated, so someone does not stumble on it and assume it still works.
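
If the library lives as files rather than a Notion database, this check can be partly automated. Below is a minimal sketch, assuming one YAML file per prompt (in the structure above) sitting in a prompts/ folder, with PyYAML installed; the folder name and file layout are my assumptions for the example, not a requirement.

# check_prompts.py -- a minimal sketch, assuming one YAML file per prompt
# in a prompts/ folder, each following the metadata structure above.
# Requires PyYAML (pip install pyyaml).

from datetime import date
from pathlib import Path

import yaml

PROMPTS_DIR = Path("prompts")  # assumed location, adjust to taste

def check_library(prompts_dir: Path = PROMPTS_DIR) -> list[str]:
    """Return human-readable warnings about prompts that need attention."""
    warnings = []
    today = date.today()
    for path in sorted(prompts_dir.glob("*.yaml")):
        meta = yaml.safe_load(path.read_text())["prompt"]
        name = meta.get("id", path.stem)

        # The three fields that matter: owner, next_review_due, status.
        if not meta.get("owner"):
            warnings.append(f"{name}: no named owner")

        due = meta.get("next_review_due")
        if due and date.fromisoformat(str(due)) < today:
            warnings.append(f"{name}: review overdue since {due}")

        if meta.get("status") == "deprecated":
            warnings.append(f"{name}: deprecated, check nothing still links to it")
    return warnings

if __name__ == "__main__":
    for warning in check_library():
        print(warning)

Run something like this at the start of each month and an overdue review stops being invisible; it becomes a line in a report that someone has to ignore on purpose.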

The review itself does not have to be long. Fifteen minutes per prompt: run it against a sample input, check the output against the expected standard, confirm the model version is still the one being used. If something has changed, update the prompt or escalate to whoever owns the underlying workflow.
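
Under the same one-YAML-file-per-prompt assumption, the bookkeeping at the end of a review can be a single function call rather than a copy-paste exercise. A sketch:

# mark_reviewed.py -- a sketch of the bookkeeping after a review,
# assuming the same prompts/ folder of YAML files as above.

from datetime import date, timedelta
from pathlib import Path

import yaml

def mark_reviewed(path: Path, reviewer: str, note: str, cadence_days: int = 90) -> None:
    """Stamp last_reviewed, push next_review_due out by the cadence, log the change."""
    doc = yaml.safe_load(path.read_text())
    meta = doc["prompt"]

    today = date.today()
    meta["last_reviewed"] = today.isoformat()
    meta["next_review_due"] = (today + timedelta(days=cadence_days)).isoformat()
    meta.setdefault("change_log", []).insert(0, {
        "date": today.isoformat(),
        "author": reviewer,
        "note": note,
    })

    # Note: safe_dump drops YAML comments, so keep the list of status
    # options documented somewhere other than an inline comment.
    path.write_text(yaml.safe_dump(doc, sort_keys=False))

# Example: a monthly cadence for a client-facing prompt.
# mark_reviewed(Path("prompts/listing-description-residential-v3.yaml"),
#               "sarah.okonkwo@agency.co.uk",
#               "Re-tested against sample inputs; no drift.", cadence_days=30)

The default of 90 days matches the quarterly cadence; pass 30 for anything client-facing or compliance-adjacent.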

When I suggest named ownership in workshops, the first response is usually "but what if that person leaves?" The answer is: then you reassign ownership during offboarding, the same way you would reassign a CRM pipeline or an email account. Prompt ownership is an operational responsibility, not a personal one.

The harder conversation is who owns prompts that were built collaboratively or that span departments. A listing description prompt might be used by lettings, sales, and marketing. In that case, pick one owner and make the others stakeholders with review access. Shared ownership is no ownership.

This connects to a broader point I made in the post on AI reporting and hallucinated metrics: when nobody is accountable for the output of an AI system, the system produces whatever it produces and nobody catches the drift. The prompt library is exactly this at a smaller scale.

If you have an existing library that has gone stale, here is a practical sequence.

Week one. Audit what is in the library. Do not fix anything yet. Just tag each prompt as active, uncertain, or suspected-broken. This takes an hour if the library is under fifty prompts.

Week two. For every prompt tagged active or uncertain, run a quick test. Sample input, check the output. If the output is still good, mark it active and assign an owner. If it is off, mark it under-review and note what changed.

Week three. For every prompt tagged suspected-broken or under-review, decide: fix it, deprecate it, or escalate. If a prompt is not worth fixing, deprecate it visibly rather than deleting it. Someone will thank you for the paper trail.

Week four. Add the metadata structure above to every active prompt. Set review dates. Brief the named owners on what the quarterly review involves. Put the review dates in a shared calendar so they are not forgotten.
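
One low-effort way to get those dates into a shared calendar, again assuming the YAML-files layout from earlier, is to generate an .ics file from next_review_due and import it; Google Calendar and Outlook both accept the format. A sketch:

# review_calendar.py -- a sketch that turns next_review_due dates into
# one all-day calendar event per active prompt.
# Assumes the same prompts/ folder of YAML files used earlier.

from datetime import datetime, timezone
from pathlib import Path

import yaml

def build_ics(prompts_dir: Path = Path("prompts")) -> str:
    """Return iCalendar text with an all-day VEVENT per active prompt's review date."""
    stamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    events = []
    for path in sorted(prompts_dir.glob("*.yaml")):
        meta = yaml.safe_load(path.read_text())["prompt"]
        if meta.get("status") != "active" or not meta.get("next_review_due"):
            continue
        day = str(meta["next_review_due"]).replace("-", "")  # ICS wants YYYYMMDD
        events += [
            "BEGIN:VEVENT",
            f"UID:review-{meta['id']}@prompt-library",
            f"DTSTAMP:{stamp}",
            f"DTSTART;VALUE=DATE:{day}",
            f"SUMMARY:Prompt review due: {meta['name']} ({meta['owner']})",
            "END:VEVENT",
        ]
    lines = ["BEGIN:VCALENDAR", "VERSION:2.0", "PRODID:-//prompt-library//EN",
             *events, "END:VCALENDAR"]
    return "\r\n".join(lines) + "\r\n"

if __name__ == "__main__":
    Path("prompt-reviews.ics").write_text(build_ics())

Regenerate and re-import it whenever review dates move; the UID per prompt means existing events update rather than duplicate in most calendar clients.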

After that, the library runs on a maintenance cadence rather than on enthusiasm. That is the only cadence that survives past month six.

Most agencies do not need a bigger prompt library. They need a smaller one that actually works, with someone whose name is on every entry and a date by which they have to check it still does.

If you are not sure whether your current AI workflows have this kind of ownership structure, that is exactly what an AI Workflow Audit is for. We look at what you have built, who owns it, and what happens when it breaks. The library is usually the first thing we find.