The Skill Loader and Skill Output Effect Patterns: Pluggable Capabilities and Declarative Response Assembly

Pattern #5 in the Agentic Platform Patterns catalog: two patterns that travel together. See the LLM Profile article for what this builds on, and the introduction for the catalog framing.

For a while, each graph node knew every skill's output shape: it checked whether a skill was loaded and then reached in for that skill's fields by name, one skills_loaded check and one hardcoded field per skill. The dispatch was already declarative, but the assembly wasn't. Three skills' worth of fields read fine. By the seventh, nodes knew the innards of every skill in the system, and adding one sometimes meant editing nodes that weren't supposed to care.

A note for anyone who's met skills through the standard agent write-ups first: a skill is an artifact of Progressive Disclosure, the context-management pattern that says don't pour every capability into the model's context at once. Instead, show it a short menu and reveal each capability's full instructions only when it's selected. That's exactly what a SKILL.md is: the planner-facing description is the line disclosed up front, and the skill body is the detail that loads only once the skill is chosen. This article is the layer below that. Progressive Disclosure is the behavior: how a capability's detail reaches the model at the right moment. The Skill Loader and Skill Output Effect patterns are the architecture that lets those disclosed units plug into the system without the core ever knowing them by name. Disclosure decides what the model sees; these two patterns decide how a capability plugs into the system that surrounds the model.

This article is two patterns, not one, because the thing that fixed that node has two halves that only make sense together: how skills get loaded, and how their outputs get assembled into the response. Most teams collapse those into a single "skills system" and then wonder why it's hard to change. They're two different problems with two different owners, and pulling them apart is the whole move.

Nodes knew about every skill

In the agentic product that led to the creation of this framework, a number of skills were conversational capabilities: packaged units of behavior the agent deploys on demand to augment or enhance the response. A prescription drug information skill knows how to handle named-drug questions. A gray-area analysis skill knows how to surface clinical uncertainty as a structured module. A follow-up-questions skill knows how to suggest sensible next questions at the end of an answer. The skills follow the agentskills.io convention where each is a directory with a SKILL.md file in it that gets matched to a query and pulled into the run when it applies.

The trouble wasn't loading them. The trouble was the other end: the response node, the last node in the graph, the one that assembles the final response envelope the client receives. Several skills produce output that has to land in that envelope. For the first few skills, the response node incorporated them the obvious way:

def response_node(state):
    response = {"answer": state["final_answer"]}
    if "gray_area" in state["skills_loaded"]:
        response["gray_area_analysis"] = state.get("gray_area_analysis", [])
    if "follow_up_questions" in state["skills_loaded"]:
        response["follow_up_questions"] = state.get("follow_up_questions", [])
    # ...one branch per skill, forever
    return {"final_response": response}

It worked. It also grew a branch every time a skill grew an output, and every one of those branches lived in the response node, a file that has no business being a directory of every skill we'd ever written. Adding a skill meant: write the skill, register it, add its branch to the response node, give that branch a sensible empty default, and add the state field it reads. Five edits, in four files, every time, for a capability that should have been a new directory.

The deeper problem is the same shape the rest of this catalog keeps circling: the response node knew which skills existed. It didn't need to. Skills know what they produce; that's their job. The response node only needs to know how to assemble an envelope from whatever skills happen to be loaded. The knowledge was filed in the wrong place, in the consumer rather than in the things being consumed.

Two problems wearing one coat

These patterns fall out of refusing to treat skill-handling as one thing.

Problem one: how skills get loaded. A directory on disk holds skill definitions. At startup the system scans it, reads each skill's manifest, and produces an index the orchestrator can show the planner ("here are the skills available; pick the ones that fit"). Then, during a run, the chosen skills get merged into state, with conflict resolution, because two skills can collide. That's the loader and its companion merge step. It answers "which skills are in play for this run?"

Problem two: how skill outputs reach the response. Each loaded skill that produces a response-level field needs that field to land in the final envelope, without the response node ever knowing the field exists. That's the output-effect registry. It answers "given the skills in play, what fields does the response carry, and where does each value come from?"

The two are coupled by exactly one thread: an output effect can name the skill it depends on. A SkillOutputEffect can say "this field comes from the gray_area skill," and that name has to match a skill the loader actually put into play. If the skill isn't loaded, the effect doesn't fire and the field is omitted. That single skill_name link is the entire seam between the two patterns. Everything else about them is independent.

This is the same architectural shape the Tool Contract pattern had, pointed at a different layer. Tools register contracts; skills register output effects. In both cases a core node (the orchestrator there, the response node here) consumes a registry of records it didn't author and can't enumerate. The transferable lesson, stated once: anything an application supplies to the platform should be a named record registered at bootstrap, not a fact the core had to learn by importing a module.

Two registries. Two extension models. One rule that makes both safe: nothing is implicit. The skills directory is pointed at explicitly from the application's bootstrap. The output effects are registered explicitly from the same bootstrap. The response node finds out what exists by asking the registry, never by knowing.

The patterns

This article carries two. Here they are back-to-back, the loader first, because it's the simpler of the pair and the output effects build on the names it produces.

Pattern A: Skill Loader

Name: Skill Loader.

Tagline: Discover skills from a directory at startup, then merge the chosen ones into state with deterministic, role-aware conflict resolution.

Intent: Make adding a skill mean adding a directory, not editing core code. Make the available skill set legible: scanned from one place, visible to the planner, and resolved by a rule you can state in a sentence instead of a precedence table nobody can reconstruct.

Structure

There are two halves, and it's worth keeping them distinct.

The scan half reads a configurable skills directory. Each subdirectory is one skill, holding a SKILL.md file that follows the agentskills.io spec: YAML frontmatter (a name, a description, an optional metadata map) above the skill's prose body. The loader walks the directory, parses each manifest, and returns an index:

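import logging

import frontmatter  # assumed to be the python-frontmatter package

logger = logging.getLogger(__name__)


def _sync_load_skills_index() -> list[dict]: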
    skills = []
    for path in sorted(_get_skills_dir().glob("*/SKILL.md")):
        post = frontmatter.load(str(path))
        name = post.metadata.get("name")
        description = post.metadata.get("description")
        if name and description:
            meta = post.metadata.get("metadata") or {}
            applies_to = _parse_applies_to(meta, post.metadata)
            selection_group = meta.get("selection_group")  # None → stackable
            skills.append({
                "name": name,
                "description": description,
                "applies_to": applies_to,
                "selection_group": selection_group,
                "path": path,
            })
        else:
            logger.warning("skill missing name or description", extra={"path": str(path)})
    logger.info("skills index loaded", extra={"skills_count": len(skills)})
    return skills

Three things in there are deliberate. The scan is sorted(), so the index order is stable across machines and filesystems; no run depends on directory-read order. A malformed skill (missing name or description) is logged and skipped, not fatal, so one bad skill directory doesn't take down startup. And two fields off the frontmatter, applies_to (which prompt roles the skill modifies) and selection_group (an optional bucket the skill belongs to), are the inputs the second half needs.
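The scan calls a helper, _parse_applies_to, that the listing doesn't show. One plausible shape for it, sketched under assumptions the article doesn't confirm: the nested metadata map is checked first, a top-level key is the fallback, and the value may arrive as a single string, a comma-separated string, or a list. The function name below drops the underscore to mark it as an illustration.

```python
def parse_applies_to(meta: dict, top_level: dict) -> list[str]:
    """Normalize applies_to into a list of prompt-role names.

    Assumed lookup order: the nested ``metadata`` map, then the top-level
    frontmatter. A missing value yields an empty list.
    """
    raw = meta.get("applies_to") or top_level.get("applies_to")
    if raw is None:
        return []
    if isinstance(raw, str):
        raw = raw.split(",")  # "a, b" and "a" both become lists
    return [str(role).strip() for role in raw if str(role).strip()]


print(parse_applies_to({"applies_to": "model_retrieval"}, {}))
```

Normalizing to a list at scan time is what lets the merge step iterate over roles without caring how the manifest author wrote them.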

The merge half is where "conflict resolution" actually lives, and it's the part the planner's choices flow through. When skills get selected for a run, some by the orchestrator, some by earlier nodes like history detection, they're merged into state["skills_loaded"] through a single function:

async def load_skill_into_state(skills, state):
    registry = get_prompt_registry()
    index = await registry.get_skills_index()
    known = {s["name"] for s in index}
    valid = [s for s in skills if s in known]  # unknown names logged + dropped

    existing = list(state.get("skills_loaded", []))
    merged = existing + valid  # existing skills come first → they win conflicts

    accepted, warnings = await registry.validate_skill_selection(merged)
    if warnings:
        logger.warning("Skills conflict on selection_group", extra={"warnings": warnings})
    return {"skills_loaded": accepted}

The precedence rule is one sentence: already-loaded skills come first in the merge, and first-in-list wins. A skill an earlier node committed to isn't displaced by a later selection. That's the determinism. Not a priority integer, not an alphabetical tiebreak, just order of arrival, earliest wins.

The conflict rule itself is subtler than it first looks, and it's easy to oversimplify. Skills in the same selection_group are not simply mutually exclusive. The constraint is one skill per group, per role:

# Two skills in the same selection_group conflict only when they share
# at least one applies_to role. Same group, different roles → both kept.
for name in skills:
    group = group_map.get(name)
    if group is None:
        accepted.append(name)          # ungrouped skills are stackable — always kept
        continue
    if any((group, role) in seen for role in applies_map[name]):
        warnings.append(f"dropped '{name}' — group '{group}' already filled for that role")
    else:
        for role in applies_map[name]:
            seen[(group, role)] = name
        accepted.append(name)

So a skill with no selection_group is stackable; it coexists with anything. A grouped skill conflicts with another grouped skill only if they share a role. Two skills in the same "intent" group that modify entirely different prompt roles both survive. This is more permissive than the naive "one per group" rule, and it's the right kind of permissive: the conflict that matters is two skills trying to reshape the same prompt, not two skills that happen to share a label.

A real example

In the HUGO framework, the application owns its skills directory and points the loader at it from bootstrap in one line:

set_skills_dir(Path(__file__).resolve().parent / "skills")

That's the entire per-deployment story for which skills exist. The platform's loader has no directory of its own baked in; the application hands it one. A second product on the same platform points the loader at its skills directory and gets a completely different roster, with zero changes to the loader. The directory is the extension seam: the same "application supplies the data, platform consumes it generically" move the blueprint made for topology and the profile registry made for models, applied here to capabilities.

A representative manifest, lightly trimmed, is a clinical-trials retrieval skill:

---
name: trial-evidence-retrieval
description: "For questions about clinical-trial evidence ONLY. Use when the
  question turns on trial results — efficacy endpoints, trial phase, study
  population, or how a finding was measured — rather than general background.
  Do NOT use for definitional or mechanism questions a textbook would answer."
metadata:
  applies_to: model_retrieval
---

# Trial Evidence Retrieval Skill

Use this skill only when the answer depends on what a trial actually found.
It biases retrieval toward primary trial literature and registry records over
secondary summaries. It adds trial-specific retrieval decisions, not a
restatement of the base prompt.

The description isn't documentation. It's the text the planner reads to decide whether the skill applies, so it's written for the LLM: precise about when to select and when not to. The applies_to: model_retrieval entry says this skill reshapes the retrieval role's prompt, which is also what determines which skills it can conflict with under the group rule.
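This is where Progressive Disclosure becomes concrete: the planner sees names and descriptions, and nothing from the skill body. A rough sketch of turning the loader's index into that menu, assuming index entries carry the name and description keys shown earlier. render_skill_menu and the menu wording are hypothetical.

```python
def render_skill_menu(index: list[dict]) -> str:
    """Build the short planner-facing menu: one line per skill, no bodies."""
    lines = ["Available skills (select only those that fit the query):"]
    for skill in index:
        lines.append(f"- {skill['name']}: {skill['description']}")
    return "\n".join(lines)


index = [
    {"name": "trial-evidence-retrieval",
     "description": "For questions about clinical-trial evidence ONLY."},
    {"name": "gray_area",
     "description": "Surface clinical uncertainty as a structured module."},
]
menu = render_skill_menu(index)
print(menu)
```

The skill body stays on disk until the skill is selected, which keeps the planner prompt's size proportional to the number of skills and not to the length of their instructions.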

Consequences

Adding a skill is a new directory plus a SKILL.md. No core edit to make the skill exist and be selectable.
The available skill set is one scan, in one configured directory. Legible, stable-ordered, and shown to the planner from a single source.
Conflict resolution is a stated rule, not a buried precedence table. First-loaded wins; grouped skills conflict only when they share a role.
Per-deployment skill rosters ship as a directory, not a fork. Point the loader elsewhere and the whole capability set changes.

When not to use it

If you have two or three skills that never conflict and never change between deployments, an if/elif that matches a query to a skill is fine. The loader earns its keep the moment skills start colliding, or a second deployment needs a different roster, which for any agent that's accreting capabilities arrives faster than you'd guess.

Pattern B: Skill Output Effect Registry

Name: Skill Output Effect.

Tagline: Each skill declares how its output flows into the response envelope; the response node assembles the envelope without knowing which skills exist.

Intent: Decouple the response node from the set of loaded skills. Make adding a response field driven by a skill mean one effect registered at bootstrap, instead of a branch in the response node plus a default plus a state field, scattered across four files.

Structure

The whole abstraction is one frozen dataclass:

@dataclass(frozen=True)
class SkillOutputEffect:
    """Declares a response-level output field driven by a skill."""

    key: str                          # response dict key, e.g. "gray_area_modules"
    source: str = "parsed"            # "parsed" | "state" | "both"
    parsed_field: str | None = None   # field name to read off the parsed response
    state_key: str | None = None      # key to read out of graph state
    merge: str = "append"             # "replace" | "append" (when source="both")
    type_hint: type = list            # drives the empty default
    skill_name: str | None = None     # only fires if this skill is loaded

The field breakdown:

key is the envelope field the effect contributes: what the client sees.
source says where the value comes from. "parsed" reads the LLM's structured output; "state" reads a value some node wrote into graph state; "both" combines them.
parsed_field / state_key are the actual locations in each source. The parsed side is deliberately late-bound: the core's StructuredResponse carries only a visible response string and an extra dict of everything the JSON block contained, so parsed_field names a key in that dict rather than a declared attribute. (More on why the core refuses to name those fields in a moment.)
merge decides how "both" combines: "replace" (state wins if present) or "append" (union the two lists, de-duplicated).
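type_hint generates the empty default, so a field whose effect fires but whose source produced no value still resolves to a sensible empty value rather than a missing key. (A gated effect whose skill isn't loaded is different: its field is omitted entirely.)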
skill_name is the one thread back to the loader. Set it, and the effect fires only when that skill is in skills_loaded. Leave it None, and the effect always fires, for response fields that aren't gated on any skill.

The response node, holding this, collapses to a loop over the registry:

def collect_response_outputs(self, state, parsed, skills_loaded):
    result = {}
    for effect in self._effects:
        if effect.skill_name is not None and effect.skill_name not in skills_loaded:
            continue  # skill not loaded → field omitted entirely
        if effect.source == "parsed":
            value = parsed.extra.get(effect.parsed_field)  # parsed_field is a key in the untyped extra dict
            result[effect.key] = value if value is not None else effect.empty_default()
        elif effect.source == "state":
            value = state.get(effect.state_key)
            result[effect.key] = value if value is not None else effect.empty_default()
        else:  # "both"
            result[effect.key] = _merge_values(
                parsed.extra.get(effect.parsed_field),
                state.get(effect.state_key),
                effect.merge, effect.type_hint,
            )
    return result

The node calls it and injects the result into the envelope:

skill_outputs = SkillOutputRegistry.get_instance().collect_response_outputs(
    state=state, parsed=parsed, skills_loaded=list(state.get("skills_loaded", [])),
)
result = {
    "response": parsed.response,
    "key_points": parsed.extra.get("key_points", ""),
    **skill_outputs,
}

That **skill_outputs is the entire payoff. The response node now references one type, the registry, and no skill by name. Even key_points, the one field it still reaches for directly, it pulls out of the generic extra dict rather than off a named attribute, so the core node names nothing domain-specific. A new skill that produces a response field is a new effect registered at bootstrap, and it shows up in the envelope with no edit here. A deleted skill's effect either stops firing (its skill_name no longer in skills_loaded) or, if you also delete the registration, vanishes cleanly. The dangerous-deletion problem from the opening is gone: the worst a forgotten effect registration can do now is fire an empty default, which is annoying rather than a production incident.
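The loop leans on effect.empty_default(), a method the dataclass listing doesn't show. A minimal sketch of how type_hint could drive it, assuming the type's zero-argument constructor is its empty value. EffectDefaultsStub is a hypothetical stand-in, not the framework's class.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EffectDefaultsStub:
    """Reduced stand-in for SkillOutputEffect: only key and type_hint."""

    key: str
    type_hint: type = list

    def empty_default(self):
        # Calling the type with no arguments yields its empty value:
        # list() -> [], dict() -> {}, str() -> "". A fresh object is built
        # on every call, so two envelopes never share one mutable default.
        return self.type_hint()


follow_ups = EffectDefaultsStub(key="follow_up_questions")
metadata = EffectDefaultsStub(key="metadata", type_hint=dict)
print(follow_ups.empty_default(), metadata.empty_default())
```

Building the default on demand, instead of storing a shared [] on the frozen dataclass, avoids the classic mutable-default trap if a downstream node appends to the list.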

A real example

The two effects the engine registers at bootstrap show both halves of the skill_name linkage:

# Always-on: combines inline parsed follow-ups with the post-generation
# node's computed follow-ups, unioned.
registry.register(SkillOutputEffect(
    key="follow_up_questions",
    source="both",
    parsed_field="follow_up_questions",
    state_key="follow_up_questions",
    merge="append",
    skill_name=None,                 # not gated — always in the envelope
))

# Skill-gated: present only when the gray_area skill is loaded.
registry.register(SkillOutputEffect(
    key="gray_area_analysis",
    source="parsed",
    parsed_field="gray_area_analysis",
    merge="replace",
    skill_name="gray_area",          # ← the thread back to the loader
))

The first effect is the merge="append" case made concrete. Follow-up questions arrive from two places, some inline in the LLM's structured output, some computed afterward by a dedicated post-generation node, and the effect unions them so the client gets one de-duplicated list, never two competing ones. The second is the linkage in the flesh: skill_name="gray_area" is a string that has to match a skill the loader put into skills_loaded. When the gray-area skill applies, the field is there. When it doesn't, the field is omitted. The response node was told none of this.
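The merge helper the loop calls, _merge_values, isn't shown in this article. Here is a sketch consistent with the behavior described (state wins on "replace", an order-preserving union on "append"). The argument order follows the call site; putting parsed items before state items in the union is my assumption.

```python
def merge_values(parsed_value, state_value, merge: str, type_hint: type = list):
    """Combine the parsed-output value and the state value for a "both" effect."""
    if merge == "replace":
        if state_value is not None:
            return state_value  # state wins if present
        return parsed_value if parsed_value is not None else type_hint()
    if merge == "append":
        merged = []
        for item in list(parsed_value or []) + list(state_value or []):
            if item not in merged:  # equality-based de-dup; works for unhashable items too
                merged.append(item)
        return merged
    raise ValueError(f"unknown merge strategy: {merge!r}")


inline = ["What about dosing?", "Any interactions?"]
computed = ["Any interactions?", "Is it safe in pregnancy?"]
print(merge_values(inline, computed, "append"))
```

De-duplicating with a membership test is quadratic, which is fine for a handful of follow-up questions and keeps the helper usable when items are dicts and therefore unhashable.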

There's a companion type, NodeOutputEffect, that solves the same shape one layer up. When a node calls an LLM with a structured-output schema, each output field is declared as an effect mapping output field → state key, scoped to the node's role, so a node can discover and write its own outputs generically, the same way the response node assembles its envelope generically. Same idea, different altitude; not the focus here.

Consequences

The response node imports no skill. It references the registry and nothing else.
Adding a response field is one effect registration at bootstrap.
Removing a skill is safe. The effect stops firing, or you delete the registration; neither path can corrupt the envelope.
A skill that produces no response field registers no effect. No orphan keys, no empty branches left behind.
Two sources can feed one field via merge="append", which the hardcoded version could never do without hand-written merge logic in the node.

When not to use it

If exactly one skill ever contributes to the response and you're confident that won't change, an effect registry is more machinery than the problem needs; read the field directly. The registry pays off at the second response-contributing skill, and it pays off hugely at the moment you first try to delete one.

One field, three doors: where this is going

Everything to this point is the write path: how a skill's output lands in the response envelope. That's where the pattern started, and for a while it was where I thought it ended. Then a single domain field — citations — forced the realization that the write path was only one of three places the same knowledge was being re-encoded.

A citation is born when a retrieval skill writes sources into graph state and the LLM emits inline reference identifier tokens. Three different surfaces then have to do something with that same field. The write path puts sources and citations into the response envelope. The read path rebuilds them on each past turn when a client loads thread history, so a reopened conversation looks identical to the live one. The stream path emits them as SSE events while the answer is still generating. Three surfaces, one field — and the naïve version names citations in all three, so the day you add, rename, or skill-gate the field you have to remember every one.

The fix is the one this whole pattern has been building toward: make all three surfaces derive their field set from the same registry instead of each naming the field. The same output_keys declaration that a callable effect advertises on the write side is what the read and stream paths read back, so a field added once opens all three doors at once and a field removed closes all three. That is the proof the pattern composes — but it's also a pattern in its own right, and it's the entire subject of the article on Registry-Driven Field Projection (Pattern #7). That is where the write→read→stream journey of a citation gets its full treatment, including the get_active_keys / default_for mechanics that make the read projection the write declaration read backward.

One half of the outbound story lives naturally here though, beside the write-side effects it mirrors.

The outbound twin

There's a mirror image of the whole write-side mechanism that's worth naming here, because a second product leans on it and the catalog's spine article (the Graph Blueprint) points to this section for it.

The Skill Output Effect is inbound: it governs how a field gets into the response envelope the core assembles. But there's a symmetric outbound seam, how a node's state update gets out through the live SSE stream, and it's the same architectural move pointed the other direction. The core exposes a handler registry:

# stream handler registration function from the second product built on HUGO

def register_stream_handlers() -> None:
    register_update_handler("sources", handle_sources)
    register_update_handler("citations", handle_citations)
    register_update_handler("escalation_needed", handle_escalation_needed)
    register_message_parse_handler("follow_up_questions", handle_inline_follow_up_questions)

register_update_handler(state_field, handler) keys a callback off a node-update field; register_message_parse_handler(field, handler) keys one off a parsed message chunk. (There's a third, register_finalize_handler, for end-of-stream emission.) The core's stream runner dispatches generically. It walks each node update, looks up handlers by field name, and emits whatever SSE events they return. It never branches on "citations"; the app registered that association at bootstrap and the runner never learned it.
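The generic dispatch can be sketched in a few lines. The registration function's name matches the one above, but the handler signature (value in, list of event dicts out) and dispatch_node_update are assumptions made for illustration, not the runner's actual code.

```python
from typing import Any, Callable

Event = dict[str, Any]
UpdateHandler = Callable[[Any], list[Event]]

_update_handlers: dict[str, UpdateHandler] = {}


def register_update_handler(state_field: str, handler: UpdateHandler) -> None:
    """Associate a node-update field with the handler that turns it into events."""
    _update_handlers[state_field] = handler


def dispatch_node_update(update: dict[str, Any]) -> list[Event]:
    """Look up a handler per field; fields nobody registered are skipped."""
    events: list[Event] = []
    for field, value in update.items():
        handler = _update_handlers.get(field)
        if handler is not None:
            events.extend(handler(value))
    return events


def handle_citations(value: Any) -> list[Event]:
    return [{"event": "citations", "data": value}]


register_update_handler("citations", handle_citations)
events = dispatch_node_update({"citations": ["cit_1"], "scratchpad": "ignored"})
print(events)
```

Note that dispatch_node_update contains no domain string at all; "citations" appears only in the application-side registration.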

This is the Skill Output Effect's twin. The effect registry decides what the envelope contains; the handler registry decides what the stream emits. Both are bootstrap-registered records consumed by a core that names no domain field. I'm naming it here rather than expanding the Graph Blueprint's seam list, because the blueprint's job is the inbound seams (what the graph is, how it runs), and this outbound seam is its mirror, living naturally alongside the write-side effects it parallels. If you've read the blueprint article, this is the seam it told you would be covered here.

The snippet above is from the second product adopting the platform, which registers an entirely different handler set (escalation events, community patterns) against the same core runner, having added not one line to it. That's the same proof-of-reuse the rest of the catalog keeps producing: a second consumer snapping into a seam it didn't design, by writing records.

The same move, one layer down: the parser

There's a third place this pattern shows up, one that took a little longer to see.

Something has to turn the LLM's raw output string into the structured object the effects read from, and for a while that something lived in the core and knew far too much. The original parser, sitting in the platform's orchestrator package, hardcoded the medical engine's entire output vocabulary: it scanned for summary tags to pull out key points, rewrote inline citation identification tokens into grouped anchors, and produced a "generic" object whose declared fields were key_points, citations, and gray_areas. Parsing survived as a leak this long precisely because it feels like infrastructure — but the tag set, the citation format, and the question markers are the application's content contract with its own LLM, and none of it belongs in reusable code.

The fix is the move you've now seen three times. The core keeps a parser that does one platform-level job: splitting the visible answer from the trailing JSON and collecting the domain fields into an untyped extra. The domain-specific reading then moves out across a registered factory seam, via a register_stream_parser_factory function. The runner calls whatever factory was registered and never imports the app's parser. That's register_contract for tools and register for effects, repeated a third time for parsing. The full grammar-versus-vocabulary version of this (who owns the field names at the boundary where the framework reads the model's output) is Parse Boundary Ownership (Pattern #8). Here it's enough to see that the parser is the same seam pointed at the LLM's output.
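A compact sketch of that seam, with loud caveats: the article doesn't say how the trailing JSON is delimited, so this version assumes the output ends in a bare JSON object and finds it by trial parsing. SketchBaseParser, AppParser, parse_with_registered_factory, and the fall-back-to-base behavior are all illustrative assumptions; only register_stream_parser_factory is a name from the text.

```python
import json
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class StructuredResponse:
    response: str                               # the visible answer
    extra: dict = field(default_factory=dict)   # everything the JSON block contained


class SketchBaseParser:
    def parse(self, raw: str) -> StructuredResponse:
        """Split visible text from a trailing JSON object, naming no domain field."""
        start = raw.find("{")
        while start != -1:
            try:
                extra = json.loads(raw[start:])
                return StructuredResponse(raw[:start].rstrip(), extra)
            except json.JSONDecodeError:
                start = raw.find("{", start + 1)  # not the JSON tail; keep looking
        return StructuredResponse(raw.strip(), {})


_parser_factory: Optional[Callable[[], SketchBaseParser]] = None


def register_stream_parser_factory(factory: Callable[[], SketchBaseParser]) -> None:
    global _parser_factory
    _parser_factory = factory


def parse_with_registered_factory(raw: str) -> StructuredResponse:
    factory = _parser_factory or SketchBaseParser  # assumed fallback to the base
    return factory().parse(raw)


class AppParser(SketchBaseParser):
    """Application-owned parser; here it only tags the result to show it ran."""

    def parse(self, raw: str) -> StructuredResponse:
        parsed = super().parse(raw)
        parsed.extra.setdefault("parsed_by", "app")
        return parsed


raw = 'Aspirin lowers risk {see note}.\n{"key_points": "k1", "follow_up_questions": ["q1"]}'
before = parse_with_registered_factory(raw)
register_stream_parser_factory(AppParser)
after = parse_with_registered_factory(raw)
print(before.response, after.extra)
```

The core half knows "text, then a JSON object" and nothing else; key_points and follow_up_questions pass through extra without the parser ever spelling them.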

The factory seam hides a second decision that is sharper than the registration itself. This one is worth pulling into the open here because a second consumer's reaction is what settled it.

When we extracted the parser, the seam signature hid a real choice, and it's the kind that's easy to get backwards. The factory returns a parser, fine, but how much should the base class it subclasses already do? There are two coherent answers. In the bare version, the base parser is empty and every consumer supplies all of its own grammar. In the rich version, the base parser carries the logic that's the same shape across products and the consumer overrides only what differs. The bare version looks maximally flexible, but that's the trap.

The bare cut forces every consumer to reimplement the citation parsing (logic that's likely the same shape for any application that needs it) just to get back to baseline. The version that shipped keeps that logic in the base, so a consumer inherits it and overrides only what differs. Two seams with the identical signature, factory: Callable[[], StreamStructuredOutputParser], can sit on opposite sides of this choice, and the difference is entirely in what lives behind them:

| | The bare cut | The version that shipped |
|---|---|---|
| Seam shape | `factory: Callable[[], Parser]` | `factory: Callable[[], Parser]` (identical) |
| Base class | empty passthrough | full citation logic (`[ref_XXX]→[cit_YYY]` regrouping, tag handling) |
| To get base behavior | reimplement it | inherit it: `class X(Base): pass` |
| To customize | write everything | override the one method that differs |
| The verdict | clunky | clean |

Base behavior plus the consumer's own specifics: that is the whole design rule. It refines the thesis this catalog keeps stating, from "the application declares, the platform consumes" to "the platform supplies a working default; the application declares only its difference." The trade is worth naming explicitly, because the empty-passthrough extreme advertises flexibility and delivers busywork: every consumer pays full freight for behavior that was the same shape for everyone.

This is the real reason the StreamParser is allowed to be empty:

class StreamParser(StreamStructuredOutputParser):
    """Application-owned stream parser. Currently a pass-through subclass."""

An empty subclass with a "this'll matter later" comment is exactly what a skeptical reader distrusts; it reads as a class earning its keep on credit. But it isn't a placeholder. The second product to adopt the platform hit this exact seam, wanted the base behavior, and got it for free. The subclass is precisely where its divergence will go the day its citation shape differs from the medical engine's. The empty subclass isn't an absence of work. It's the proof the default was the right one.
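The inherit-or-override trade can be shown in miniature. The citation rewrite below is a deliberately simplified stand-in for the base class's real regrouping logic, which this article doesn't show. RichBaseParser, format_anchor, and both subclasses are hypothetical.

```python
import re


class RichBaseParser:
    """Base that already does the shared work; subclasses override only the difference."""

    REF = re.compile(r"\[ref_(\w+)\]")

    def format_anchor(self, index: int) -> str:
        # The one hook a consumer is expected to override.
        return f"[cit_{index}]"

    def rewrite_citations(self, text: str) -> str:
        order: dict[str, int] = {}  # raw ref id -> anchor number, by first appearance

        def replace(match: re.Match) -> str:
            ref = match.group(1)
            order.setdefault(ref, len(order) + 1)
            return self.format_anchor(order[ref])

        return self.REF.sub(replace, text)


class MedicalParser(RichBaseParser):
    pass  # wants the base behavior, so it writes nothing


class SupportParser(RichBaseParser):
    def format_anchor(self, index: int) -> str:
        return f"(source {index})"  # the only thing that differs


text = "A [ref_x9] then B [ref_k2] and again [ref_x9]"
medical = MedicalParser().rewrite_citations(text)
support = SupportParser().rewrite_citations(text)
print(medical)
print(support)
```

Under the bare cut, both subclasses would have had to carry their own copy of rewrite_citations; here the second consumer's entire divergence is one three-line method.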

The surprising payoff: three things the registries gave us for free

Removal got cheaper than addition used to be. Before, adding a skill was a five-edit ritual and removing one was a five-edit undo you had to remember in full, the path that nearly shipped a dead branch to production. After, addition is a directory plus an effect registration, and removal is deleting the directory. A leftover effect registration fires its empty default; it doesn't break anything. The asymmetry of risk reversed: the operation that used to be scariest became the one that's nearly free.

Two skills could share a response field. The hardcoded node read a single source per field. There was no way for two skills to both contribute follow-up questions without someone writing merge logic by hand inside the node, and nobody was going to do that cleanly under deadline. The merge="append" strategy made it natural: inline follow-ups and post-generation follow-ups land in the same follow_up_questions key, unioned and de-duplicated, with the registry doing the merge. A capability we'd have called "too fiddly to bother with" became a default value on a dataclass field.

The empty default turned into a contract. We added type_hint so that an always-on field with nothing to report (for instance, because the skill that feeds it wasn't loaded) would resolve to [] instead of disappearing. That empty default quietly became the stability guarantee the client could build on. The web UI gets to assume follow_up_questions is always a list, possibly empty, and never has to check whether the skill was loaded, because the absence of the skill and the presence of an empty list are the same observable thing. A fallback we added for tidiness became part of the envelope's public shape.

There was a fourth consequence, the largest, that doesn't fit under "for free" because it grew into a pattern of its own: the registry built to fix the write path turned out to be a field-set authority the read paths could snap onto, so history and the live stream stopped being able to skew from the response. That's Pattern #7's whole story, and I've left it there.

All of these are consequences of the same move: putting declarative records in the assembly path instead of branches in a node. That's the general lesson, and it's the one this catalog keeps re-discovering at every layer. Describe what each thing contributes as data, and the system finds uses for the description you didn't plan: the read paths, the stream, the second product's handler set, none of which you were thinking about when you wrote the record.

The honest close: registries don't help with craft, coordination, or versioning

Here's what these patterns don't do, stated plainly. They don't make skill authoring easier. Writing a SKILL.md whose description reliably gets the planner to select it at the right time, and whose body actually changes the model's behavior, is craft, and no registry helps you with craft. They don't handle cross-skill coordination beyond the group rule; if two skills genuinely need to negotiate, override, or sequence each other, you need orchestration logic above the loader. And they don't address skill versioning, what happens when a skill's output contract changes and a client depends on the old shape.

What they do, durably: they make the response envelope a function of what's loaded rather than a function of what's been hardcoded into a node. They make conflict resolution a rule you can say out loud. They make per-deployment capability differences a directory you point at rather than a branch you fork. And they make deleting a skill, the operation that exposed the whole problem, the cheapest thing in the system.

That completes the skills layer of the catalog. The orchestrator now selects tools through a registry, planning behavior through a policy, models through a profile registry, capabilities through a loaded directory, and the raw-output parser through a registered factory. And once you follow one field all the way through, the reads and the stream come off the same registry that governs the writes, with the outbound stream handlers as the inbound effects' mirror. In every single case, the thing selected is a value the application supplies, never a fact the core had to learn, and it reaches in both directions: what the envelope contains and what leaves through the stream, both declared, neither hardcoded. The pattern under all the patterns, one more time: the application declares, the platform consumes.

There's one layer left on the write side, and it's the one that's been quietly holding up several of the others. Two of the patterns in this catalog lean on mechanisms we've name-dropped without ever opening: the config cascade that resolves a tool's settings through configurable → state → default, and the prompt slot injection that lets a policy's output flow into the next LLM call's prompt. They're small, the bricks rather than the buildings, but they're load-bearing, and the catalog needs them named before it turns to the read side.

Pattern #6, the last of the write-side patterns, covers the Tiered Config Cascade and Prompt Slot Injection patterns: the two foundational utilities the rest of the catalog has been standing on.
