The Registry-Driven Field Projection Pattern: Write It Once, Read It Everywhere

Pattern #7 in the Agentic Platform Patterns catalog. See the Tiered Config Cascade and Prompt Slot Injection Patterns article for what this builds on, and the introduction for the catalog framing.

This is the seventh in the catalog of patterns that I recently identified and named after building an internal framework for agentic products where I work, and it's the first one that flips things around. Every pattern up to here governs the write path: how application content flows into the graph and into the response. The graph blueprint says what the graph is. Tool contracts say what tools do. Skill output effects assemble the response. All inbound. This one governs the read paths (replaying a conversation's history, streaming a turn live), and it needs no new machinery to do it. The registry that already governs the writes turns out to be the authority the reads were missing.

Three lists that had to agree

Our response pipeline had a field set: the things a turn carries beyond its raw answer text (e.g. sources, citations, key points, follow-up questions, related articles). That set was needed in three different places, and each place kept its own copy. The write path knew it, because it assembled the response envelope. The history endpoint knew a copy, because it enumerated fields to project off each stored turn. The streaming runner knew a third copy, because it pulled fields off the parser to emit as SSE events.

Three lists. Three files. Nothing forcing them to agree. They agreed only as long as every engineer who added a field remembered to add it in all three.

Anyone who has run a CQRS-style split will recognize the failure mode. The write model evolves, a read model lags, and the gap stays silent because each side is internally consistent. The usual remedy is discipline: remember to update all three. Discipline does not survive a quarter spent shipping at AI-codegen velocity.

A field set is an authority

Before the code, the conceptual move.

A response field like citations or key_points is not owned by the code that happens to write it. It is a fact about the shape of a turn: this kind of answer carries these fields. That fact is needed in at least three places. You write it into the response, you replay it in history, you stream it live. As the product grows, more places will want it.

The moment a fact is needed in N places, you have a choice. Either each place re-derives the fact, which gives you N copies and N chances to drift, or one place owns the fact and the other N-1 subscribe to it. The catalog already made this choice once, for the write path. The Skill Output Effect pattern put the field set in a SkillOutputRegistry so the response node could assemble an envelope without naming any field. The registry was already the authority. It just wasn't being treated as one anywhere except the write side.

So give the read paths the same authority. The registry already knows which fields are active for a given set of loaded skills. Instead of the history code carrying its own list, it asks the registry which keys are active. Instead of the stream code carrying its own list, it asks the same question. The write path that populates the envelope and the read paths that project it now agree, because they read the same source.

Stated generally: derive your read projection from the same registry that governs your writes, so the two can't drift. The principle is reusable in any system with one write model and several read models. That describes most agent systems the moment they have both a live stream and a persisted history. The registry stops being "how the write path assembles a response" and becomes the single authority for what fields a turn has, in every direction.

The Pattern

Name. Registry-Driven Field Projection. Project every read path from the same field-set registry that governs your writes.

Intent. Eliminate read/write skew in a response pipeline. Make one registry the authority for the active field set, and have the write path and every read path (history reconstruction, live streaming) derive their field list from it rather than each maintaining a hand-written copy. The core names no domain field on any path.

Structure. Two methods on the registry are the entire read-side surface. The write side, collect_response_outputs, is the Skill Output Effect pattern from the previous article; this pattern adds the read side.

from typing import Any


class SkillOutputRegistry:
    def get_active_keys(self, skills_loaded: list[str]) -> set[str]:
        """The active field set, given which skills are loaded.
        Every read path asks this instead of hardcoding a list."""

    def default_for(self, key: str, skills_loaded: list[str]) -> Any:
        """The typed empty default for one active key, so a read path
        can fill an absent field without knowing its type."""

get_active_keys is the subscription point. A read path that wants to project domain fields calls it once and iterates the result. It never spells a field name. Add a skill that introduces a new field, and every read path picks it up with no edit.

default_for closes the gap on the read side. When a stored turn predates a field, or simply didn't produce it, the read path needs a typed empty value: [] for a list field, {} for a dict, "" for a string. The registry supplies it from the same type hint the write side declared.
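A minimal sketch of how a typed empty default could fall out of a declared type hint. It assumes every hint is a plain zero-argument constructor such as list, dict, or str. The helper name empty_default and the "summary" field are hypothetical; the article does not show how default_for is implemented internally.

```python
from typing import Any


def empty_default(type_hint: type) -> Any:
    """Return a fresh empty value for a declared type hint.

    Assumption: hints are zero-argument constructors (list, dict, str).
    Calling the constructor yields a new object on every call, so two
    turns never end up sharing one mutable default.
    """
    return type_hint()


# Hypothetical (key, type_hint) declarations, in the shape the text describes.
declared = {"sources": list, "citations": dict, "summary": str}
defaults = {key: empty_default(hint) for key, hint in declared.items()}
```

Building the value by calling the hint, rather than storing a literal [] or {} next to it, keeps the type declaration as the only thing anyone has to write.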

There's a trick here worth naming: declare, don't execute. Some fields aren't produced by a simple declarative effect. They come from a chunk of custom logic (format these sources, regroup these citation anchors) wrapped in an opaque callable. You can't ask an unrun callable what fields it will produce. So the callable advertises them:

from dataclasses import dataclass
from typing import Any, Callable


@dataclass(frozen=True)
class CallableSkillOutputEffect:
    callable_fn: Callable[[dict, Any], dict[str, Any]]
    skill_name: str | None = None
    output_keys: tuple[tuple[str, type], ...] = ()
    """Declared (key, type_hint) pairs this effect's callable produces.
    Lets get_active_keys and default_for see callable-produced fields
    without executing the callable."""

get_active_keys then unions in a callable effect's output_keys exactly as it unions the key of a declarative one:

if isinstance(effect, CallableSkillOutputEffect):
    keys.update(name for name, _type in effect.output_keys)
else:
    keys.add(effect.key)

That isinstance branch is the line that closed the original bug. The citation and source formatting is a callable effect. Before output_keys, its fields were invisible to projection, because the callable hadn't run and nothing could see what it would produce, so the fields fell out of every read path. After: the effect declares ("sources", list) and ("citations", dict) statically, and all three paths see them.

The read paths themselves stay generic. The history reconstructor owns only a frozen set of core keys and injects everything else:

# history_dto.py — names no domain field
_CORE_KEYS = frozenset({"thread_id", "run_id", "query", "answer", "status", "created_at"})

def build_turn_dtos(messages, *, thread_id, field_keys=None, defaults=None, ...):
    ...
    for key in field_keys:          # field_keys = registry.get_active_keys(...)
        if key in _CORE_KEYS:
            continue
        dto[key] = meta.get(key, defaults.get(key))   # defaults from default_for
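The excerpt above elides most of the function, so here is a runnable sketch of the same projection loop under stated assumptions. The stored message shape (a dict with a "meta" dict) and the way core keys are copied are guesses for illustration only; the elided parts of the real build_turn_dtos are not reconstructed here.

```python
from copy import deepcopy

_CORE_KEYS = frozenset({"thread_id", "run_id", "query", "answer", "status", "created_at"})


def build_turn_dtos(messages, *, thread_id, field_keys=None, defaults=None):
    """Project stored turns into DTOs without naming any domain field.

    Assumption: each stored message is a dict whose "meta" dict holds the
    core values and whatever domain fields the write path persisted.
    """
    field_keys = field_keys or set()
    defaults = defaults or {}
    dtos = []
    for message in messages:
        meta = message.get("meta", {})
        dto = {key: meta.get(key) for key in _CORE_KEYS}
        dto["thread_id"] = thread_id
        for key in sorted(field_keys):
            if key in _CORE_KEYS:
                continue
            # Copy the default so turns never share one mutable list or dict.
            dto[key] = meta[key] if key in meta else deepcopy(defaults.get(key))
        dtos.append(dto)
    return dtos


# A turn stored before "citations" existed still gets a typed empty value.
old_turn = {"meta": {"run_id": "r1", "query": "q", "answer": "a", "sources": ["doc-1"]}}
dtos = build_turn_dtos(
    [old_turn],
    thread_id="t1",
    field_keys={"sources", "citations"},
    defaults={"sources": [], "citations": {}},
)
```

The deepcopy is a design choice worth considering in any version of this loop: handing the same default object to every turn invites one turn's mutation leaking into another.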

The streaming projector is the same shape against the parser's extra space:

def _build_parse_fields(parse_result, active_keys, exclude=frozenset()):
    # Hugo names no domain field here: it iterates the registry-declared
    # keys and pulls each off the parse result's extra space.
    for key in active_keys:         # active_keys = registry.get_active_keys(...)
        if key in exclude:
            continue
        ...

Real example. Citations, one field, traveling three paths through one declaration.

First the declaration. In the application's bootstrap, the source-and-citation formatter is registered as a callable effect that advertises exactly what it produces:

# bootstrap.py — the single place the field set is stated
sources_citations_effect = CallableSkillOutputEffect(
    callable_fn=_format_sources_and_citations,
    skill_name=None,  # always active
    output_keys=(("sources", list), ("citations", dict)),
)
registry.register_metadata_effect(sources_citations_effect)

That output_keys line is the whole fix in one place. Three paths obey it:

Write. The response node assembles the envelope through the registry, running the callable and merging its sources and citations into the response. It never names "citations."
Read, history. The history service reads the thread's loaded skills, calls field_keys = registry.get_active_keys(skills_loaded) and defaults = {k: registry.default_for(k, skills_loaded) for k in field_keys}, and hands both to build_turn_dtos. Citations now ride the same projection. This is the literal fix for the silent-drop bug.
Read, stream. The streaming runner builds its active key set once and feeds _build_parse_fields, with a FINALIZE_CUMULATIVE_FIELDS exclude set so cumulative fields the parser re-surfaces at finalize aren't emitted twice.

The two read paths do not ask the registry the same question. The history service passes the turn's real skills_loaded:

# threads_service.py
skills_loaded = list(values.get("skills_loaded", []))
field_keys = registry.get_active_keys(skills_loaded)

The live stream passes an empty list:

# streaming_runner.py
parse_active_keys = registry.get_active_keys([])

This is not a shortcut. The two read models legitimately want different slices of the same authority. History replays a specific past turn, and it knows which skills that turn loaded, so it can faithfully project skill-gated fields: a gray-area-analysis field, say, that exists only when the gray-area skill was active. The live stream is emitting the current turn's always-on envelope, so it projects the unconditional set: the fields declared with no skill gate. Same registry, same authority, two different questions, because the two read models are answering two different needs. A single shared list could never have expressed that distinction; it would have forced both paths to the same field set and quietly over- or under-projected one of them.
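The two questions can be shown side by side with a deliberately tiny stand-in for the registry. The DECLARED table, the "gray_area" skill name, and the "gray_area_analysis" key are hypothetical names chosen to match the example in the text; the real registry resolves gates through its registered effects.

```python
# Hypothetical declarations: (field key, gating skill or None for always-on).
DECLARED = [
    ("sources", None),
    ("citations", None),
    ("gray_area_analysis", "gray_area"),
]


def get_active_keys(skills_loaded: list[str]) -> set[str]:
    """Keys whose gate is absent or satisfied by the loaded skills."""
    return {
        key
        for key, skill in DECLARED
        if skill is None or skill in skills_loaded
    }


history_keys = get_active_keys(["gray_area"])  # a past turn that loaded the skill
stream_keys = get_active_keys([])              # the unconditional envelope
```

The stream's answer is a strict subset of the history's answer here, and the difference is exactly the skill-gated field.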

Consequences. What the pattern buys you:

The write model and every read model agree by construction. Add a field once; the reads pick it up.
The core names no domain field on any path. Both the history reconstructor and the stream projector carry "names no domain field" as their guiding comment, and both mean it.
Callable-produced fields, the ones most likely to be forgotten because they aren't simple declarations, are projectable through output_keys.

What it costs:

The registry is now load-bearing for reads, not just writes. That's a heavier responsibility for one object, and a place where a bug now reaches three surfaces at once instead of one.
output_keys is a hand-maintained promise. A callable that returns a field it didn't declare is invisible to the reads again, and the declaration has to stay honest. The way to keep it honest is a test that runs the callable and asserts its returned keys are a subset of its declared output_keys.

When not to use. A pipeline with exactly one consumer of the field set, where you write a response and never replay or stream it, has nothing to skew against. The pattern is useful when a second path reads the same fields. Most agent systems cross that line the day they add either a history endpoint or a live stream, which is to say almost immediately.

The symmetric half

Step back to the catalog. Every pattern before this one governs the write path: the blueprint shapes the graph, contracts shape tools, policies shape the planner, profiles pick models, skills supply capabilities, the Skill Output Effect assembles the response. This pattern is the one that reverses direction, and it does so with no new machinery. The same registry, asked a different question — get_active_keys instead of collect_response_outputs — drives the reads. The write authority is the read authority.

It pairs with the catalog's outbound stream and output-handler seam, the mirror of the Skill Output Effect, which the Graph Blueprint article pointed forward to as the outbound twin of the inbound seams. The write effect, the history projection, and the stream handler are three consumers of one field declaration.

Readers who have run a CQRS-style split will recognize the shape in a sentence: one write model, several read models, none of them enumerating a field the write model didn't hand it. I'll name the parallel and leave it there, because this isn't a CQRS retread. What's actually new is doing this where the field set is dynamic (it depends on which skills loaded) and where some fields are produced by opaque callables that have to advertise their outputs before they run. The two read paths asking the registry different questions, one with the real skill set and one with the empty set, is exactly the kind of thing a static field list can't express and a live authority can.

The honest close: it syncs the field set, not the field meaning

This pattern keeps the field set in sync across paths, but it does not keep the field semantics in sync. If the write path stores citations as one shape and a read path expects another, the registry won't catch it; you've agreed on the name, not the meaning. The pattern that comes next governs meaning at the model boundary. And the pattern doesn't help if a read path needs a field the write path never produces — the authority is the write registry, so a read-only field has no home here and probably wants its own declaration.

The output_keys promise is exactly that, a promise. The registry believes a callable when it says what it produces. The strongest version of this pattern pins the promise with a test: run the callable, assert its returned keys are a subset of its declared output_keys. Without that, the declare-don't-execute trick can drift the same way the hand-written lists did — one layer up, and quieter.

Next in the catalog: Parse-Boundary Ownership, the allow-mode carrier and forbid-mode schema provider pair. This pattern got the field set to agree across write and read. The next one governs who owns the field names and meanings at the boundary where the framework reads the model's output and the application gives those fields their domain vocabulary.