
  <rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
    <channel>
      <title>Articles by Dan Orlando</title>
      <link>https://danorlando.com/blog</link>
      <description>How-to&#39;s, discoveries and other useful nuggets fit for developer consumption.</description>
      <language>en-us</language>
      <managingEditor>Dan Orlando</managingEditor>
      <webMaster>Dan Orlando</webMaster>
      <lastBuildDate>Tue, 19 May 2026 00:00:00 GMT</lastBuildDate>
      <atom:link href="https://danorlando.com/tags/agenticai/feed.xml" rel="self" type="application/rss+xml"/>
      
  <item>
    <guid>https://danorlando.com/blog/genai/bounded-autonomy</guid>
    <title>Bounded Autonomy: The Architecture Pattern That Replaces the Multi-Agent Trap</title>
    <link>https://danorlando.com/blog/genai/bounded-autonomy</link>
    <description>When researchers stripped the safety constraints from a bounded autonomy system, they expected faster task completion. Instead, performance got worse. The bounded version completed 23 of 25 tasks; the unconstrained baseline completed 17 and produced wrong-entity mutations that silently corrupted a different customer&#39;s data. The guardrails weren&#39;t slowing the system down; they were making it smarter.</description>
    <pubDate>Tue, 19 May 2026 00:00:00 GMT</pubDate>
    <author>Dan Orlando</author>
    <category>generativeai</category><category>agenticai</category><category>architecture</category><category>systemdesign</category>
  </item>

  <item>
    <guid>https://danorlando.com/blog/genai/context-engineering</guid>
    <title>Context Engineering: The Missing Layer</title>
    <link>https://danorlando.com/blog/genai/context-engineering</link>
    <description>A paper found that simply repeating the prompt before generation raised accuracy from 21% to 97% on one benchmark. Everyone filed it under &quot;prompting tricks.&quot; The mechanism is architectural, and it&#39;s the clearest proof that context engineering and prompt engineering have completely different ceilings. Most teams are optimizing the wrong layer.</description>
    <pubDate>Mon, 20 Apr 2026 00:00:00 GMT</pubDate>
    <author>Dan Orlando</author>
    <category>generativeai</category><category>agenticai</category><category>architecture</category><category>designpatterns</category>
  </item>

  <item>
    <guid>https://danorlando.com/blog/genai/goal-oriented-memory</guid>
    <title>Goal-Oriented Memory: Backward Chaining for Long-Horizon Agents</title>
    <link>https://danorlando.com/blog/genai/goal-oriented-memory</link>
    <description>Your vector store has the right answer stored. Your agent still gets it wrong. The failure is in retrieval logic that fetches topically similar content instead of logically relevant facts. Backward chaining, a technique from 1970s Prolog theorem provers, closes a 21-point accuracy gap on multi-hop memory benchmarks without touching your storage layer.</description>
    <pubDate>Tue, 05 May 2026 00:00:00 GMT</pubDate>
    <author>Dan Orlando</author>
    <category>generativeai</category><category>agenticai</category><category>agentmemory</category><category>architecture</category><category>systemdesign</category>
  </item>

  <item>
    <guid>https://danorlando.com/blog/genai/multi-agent-is-a-trap</guid>
    <title>Multi-Agent AI Is a Trap</title>
    <link>https://danorlando.com/blog/genai/multi-agent-is-a-trap</link>
    <description>One team running a six-agent debate system switched to two agents with a strict state machine. Latency dropped from 18 seconds to 3. Cost per query dropped from $8-12 to $0.40. Accuracy changed by less than 1%. This wasn&#39;t a fluke. Information theory explains exactly why multi-agent systems can never outperform a single agent with full context.</description>
    <pubDate>Wed, 13 May 2026 00:00:00 GMT</pubDate>
    <author>Dan Orlando</author>
    <category>generativeai</category><category>agenticai</category><category>architecture</category><category>systemdesign</category><category>agentsystems</category><category>informationtheory</category>
  </item>

  <item>
    <guid>https://danorlando.com/blog/genai/multi-tenant-agent-saas</guid>
    <title>Multi-Tenant AI Agent SaaS: The Infrastructure Decisions That Scale</title>
    <link>https://danorlando.com/blog/genai/multi-tenant-agent-saas</link>
    <description>Semantic search doesn&#39;t throw errors when it returns the wrong tenant&#39;s data. It just returns it—and your agent weaves it into the response like it belongs there. This failure surfaces silently between customer 20 and 30, and when it does, the fix isn&#39;t a configuration change. It&#39;s a full audit of your execution graph, measured in months. Here&#39;s the architectural decision that prevents it.</description>
    <pubDate>Sat, 30 May 2026 00:00:00 GMT</pubDate>
    <author>Dan Orlando</author>
    <category>generativeai</category><category>agenticai</category><category>infrastructure</category><category>architecture</category><category>systemdesign</category>
  </item>

  <item>
    <guid>https://danorlando.com/blog/genai/recall-vs-learn</guid>
    <title>The Recall vs. Learn Distinction: Why Your Agent Forgets Everything You Taught It</title>
    <link>https://danorlando.com/blog/genai/recall-vs-learn</link>
    <description>Your agent can surface any conversation from six months ago verbatim, yet it is still making the same mistakes it made then. Recall and learning are architecturally distinct, and most agent memory systems only build the former. Removing memory from an agent hurts performance more than swapping the underlying LLM. You&#39;re probably investing in the wrong layer.</description>
    <pubDate>Fri, 15 May 2026 00:00:00 GMT</pubDate>
    <author>Dan Orlando</author>
    <category>generativeai</category><category>agenticai</category><category>agentmemory</category>
  </item>

  <item>
    <guid>https://danorlando.com/blog/genai/retrieval-as-generation</guid>
    <title>Retrieval as Generation: The Architecture That Kills External Orchestrators</title>
    <link>https://danorlando.com/blog/genai/retrieval-as-generation</link>
    <description>An 8B-parameter model matches GPT-4o across five knowledge-intensive benchmarks and beats it on two. It does this by replacing the entire retrieval orchestration layer—confidence classifiers, query routers, rerankers, fusion logic—with four special tokens. No external components. When it fails, you read a transcript. That&#39;s the whole debugging story.</description>
    <pubDate>Thu, 16 Apr 2026 00:00:00 GMT</pubDate>
    <author>Dan Orlando</author>
    <category>generativeai</category><category>rag</category><category>architecture</category><category>systemdesign</category><category>agentsystems</category><category>agenticai</category>
  </item>

  <item>
    <guid>https://danorlando.com/blog/genai/single-agent-vs-multi-agent</guid>
    <title>Single vs. Multi-Agent: The Cognition-Anthropic Schism and Why Both Are Right</title>
    <link>https://danorlando.com/blog/genai/single-agent-vs-multi-agent</link>
    <description>Two frontier teams published production data that directly contradicts each other. Cognition says multi-agent architectures cause compounding information loss and single-threaded execution is the fix. Anthropic says multi-agent delegation beat a single agent by 90.2% on their research eval. Both are right—for the model version they tested on. Architectural best practices in agent systems have a six-month half-life.</description>
    <pubDate>Mon, 25 May 2026 00:00:00 GMT</pubDate>
    <author>Dan Orlando</author>
    <category>generativeai</category><category>agenticai</category><category>architecture</category><category>systemdesign</category><category>agentsystems</category><category>informationtheory</category>
  </item>

  <item>
    <guid>https://danorlando.com/blog/genai/why-your-rag-is-obsolete</guid>
    <title>Why Your RAG Is Already Obsolete (And What Works Instead)</title>
    <link>https://danorlando.com/blog/genai/why-your-rag-is-obsolete</link>
    <description>If your RAG pipeline runs the same retrieval strategy for every query, you&#39;re in one corner of a five-dimensional design space—and it&#39;s the worst-performing corner. Static pipelines leave up to 15% accuracy on the table and spend 3x the tokens they need to. The move to agentic retrieval is incremental, and each step compounds with every model generation.</description>
    <pubDate>Sat, 18 Apr 2026 00:00:00 GMT</pubDate>
    <author>Dan Orlando</author>
    <category>generativeai</category><category>rag</category><category>architecture</category><category>systemdesign</category><category>agentsystems</category><category>agenticai</category>
  </item>

    </channel>
  </rss>
