Best practices for building agentic systems

The rapid evolution of artificial intelligence has ushered in a new era of "agentic AI," a paradigm shift from simple task automation to systems capable of autonomous action and decision-making. These AI agents go well beyond sophisticated chatbots: they are poised to redefine enterprise efficiency by operating continuously until a task is complete. Andrew McNamara, director of applied machine learning at Shopify, explains that agentic AI systems "can take actions on behalf of users, not just generate text or answer questions." This proactive capability is exemplified by Shopify's Sidekick, a merchant-facing agent designed for continuous, intelligent assistance.

The adoption of agentic AI is spanning a diverse range of business domains. Anthropic, a prominent provider of large language models (LLMs), reports that software engineering currently accounts for approximately half of all AI agent use cases. This is closely followed by back-office automation, marketing, sales, finance, and data analysis, indicating a broad recognition of the transformative potential of autonomous AI across core business functions.

Heath Ramsey, group VP of AI platform outbound product management at ServiceNow, offers a concrete example in IT incident resolution. In this scenario, AI agents can autonomously gather contextual data from various systems, cross-reference past resolutions and established policies, implement fixes, update records, and even escalate issues to relevant team members. This demonstrates a move towards intelligent, self-sufficient problem-solving within complex IT environments.

However, the development of agent-centric systems introduces new challenges. A critical concern is the need for a novel form of systems thinking to mitigate potential pitfalls such as indeterminism, where the system's behavior is unpredictable, and "token bloat," the excessive consumption of computational resources by AI models. Furthermore, significant security risks, often discussed under the term "agentic misalignment," are emerging in LLM-driven systems. These include a model's propensity to fabricate information or "lie" to achieve its objectives, behavior that can have serious ramifications in enterprise applications.

Consequently, meticulous upfront planning is paramount for teams developing agents that must integrate with other systems and navigate complex decision trees to execute multi-step workflows. This has led to the necessity for a new architectural playbook specifically designed for agentic systems.

Anurag Gurtu, CEO of AIRRIVED, an agentic AI platform provider, emphasizes this architectural shift, stating, "Building agentic systems requires a fundamentally new architecture, one designed for autonomy, not just automation." He elaborates that agents require a robust framework encompassing a runtime environment, a cognitive "brain," operational "hands," a memory system, and crucial "guardrails" to ensure safe and controlled operation.

Despite the significant promise of agentic AI, realizing a measurable return on investment (ROI) remains a complex undertaking. Alteryx research indicates that less than half of organizations have reported a tangible impact from their agentic AI experiments, with fewer than a third expressing confidence in AI for accurate decision-making. This underscores the need for practical guidance and lessons learned from early adopters to navigate the intricacies of building enterprise-grade agentic systems.

Architectural Components of an Agentic System

The foundation of any successful agentic system lies in its interconnected architectural components. Ari Weil, cloud evangelist at Akamai, likens the construction of an AI agent to building a "nervous system," highlighting the intricate interplay of various layers. These layers typically include reasoning, memory, context gathering, coordination, validation, and human-in-the-loop oversight. As Ramsey of ServiceNow notes, "Agentic systems rely on a combination of AI, workflow automation, and enterprise controls working together."

Reasoning Model

At the heart of an agentic system is its reasoning model. Frank Kilcommins, head of enterprise architecture at Jentic, a provider of AI integration layers, explains that this "reasoning engine performs the planning based on the user’s prompt, combined with the context-at-hand and available capabilities." The selection of a suitable reasoning model is crucial. McNamara of Shopify looks for models that exhibit "agentic" qualities, characterized by appropriate tool-calling capabilities and strong instruction-following that is easily steerable through prompting.

Context and Data

An agent’s effectiveness is heavily dependent on the context and data it can access. This encompasses a broad spectrum, including internal company data, institutional knowledge, policies, system prompts, external data sources, and a history of past interactions (memory). Crucially, "agentic metadata"—the record of user prompts, reasoning steps, and tool interactions—is vital for observing and debugging agent behavior.

Edgar Kussberg, product director for AI, agents, IDE, and devtools at Sonar, identifies a wide array of data sources, such as databases, APIs, retrieval-augmented generation (RAG) systems, vector databases, file systems, document stores, internal dashboards, and external services like Google Drive. Organizations are actively developing specialized "agentic knowledge bases" to organize this data efficiently and streamline retrieval. Emerging patterns in semantic retrieval are powering sophisticated context management systems. Anusha Kovi, a business intelligence engineer at Amazon, notes that for memory, "most teams combine a vector store like pgvector with something structured like a data catalog or knowledge graph."
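The vector-store-plus-structured-catalog pattern Kovi describes can be sketched in miniature. The embeddings below are hand-made 3-dimensional vectors and the documents are invented; a real system would use model-generated embeddings in a store such as pgvector, joined against a data catalog or knowledge graph:

```python
import math

# Toy "vector store": document -> embedding
VECTOR_STORE = {
    "refund policy": [0.9, 0.1, 0.0],
    "shipping SLAs": [0.1, 0.9, 0.0],
}
# Structured catalog: facts the vector store alone can't answer
CATALOG = {
    "refund policy": {"owner": "finance", "updated": "2025-06"},
    "shipping SLAs": {"owner": "logistics", "updated": "2025-01"},
}

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def recall(query_vec):
    """Semantic lookup, enriched with structured metadata about the hit."""
    doc = max(VECTOR_STORE, key=lambda d: cosine(query_vec, VECTOR_STORE[d]))
    return doc, CATALOG[doc]

print(recall([0.8, 0.2, 0.0]))
```

The combination matters: the vector store finds the semantically closest document, while the catalog supplies governed, structured facts about it.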

Tools and Discovery

Beyond static context, actionable agents require read and write access to a variety of tools and APIs. Jackie Brosamer, head of data and AI at Block, emphasizes that "some of the most important work being done to make agents more powerful is happening with the ways we connect AI and existing systems." The industry has largely converged on the Model Context Protocol (MCP) as a universal connector between agents and disparate systems. The proliferation of MCP registries is further unifying and cataloging these capabilities for agents at scale. Notable examples of MCP adoption include Block’s open-source goose agent for LLM-powered software development and Workato’s integration of MCP for Claude-powered enterprise workflows.
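MCP itself defines a richer protocol; the toy registry below only illustrates the underlying pattern it standardizes: tools advertise a name and schema that agents can discover, kept separate from the implementation they invoke. The tool and its parameters are invented for illustration:

```python
from typing import Callable

REGISTRY: dict[str, dict] = {}

def register_tool(name: str, description: str, params: dict):
    """Decorator that publishes a function as a discoverable tool."""
    def wrap(fn: Callable):
        REGISTRY[name] = {"description": description, "params": params, "fn": fn}
        return fn
    return wrap

@register_tool("get_order_status", "Look up an order's shipping status",
               {"order_id": "string"})
def get_order_status(order_id: str) -> str:
    return f"order {order_id}: shipped"   # stand-in for a real API call

def discover() -> list[dict]:
    """What an agent sees: names and schemas, never implementations."""
    return [{"name": n, **{k: v for k, v in meta.items() if k != "fn"}}
            for n, meta in REGISTRY.items()]

def invoke(name: str, **kwargs) -> str:
    return REGISTRY[name]["fn"](**kwargs)

print(discover())
print(invoke("get_order_status", order_id="A-17"))  # → order A-17: shipped
```

An MCP registry generalizes the `discover` step across servers, which is what lets agents find and call capabilities at scale.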

Defined Workflows

Clearly documented workflows are another critical component, especially for multi-step actions that involve interlinked MCP servers or direct API calls. Ramsey of ServiceNow stresses the importance of coordinating agents through defined workflows to ensure that "autonomy scales in a predictable and governed way rather than becoming chaotic." Kilcommins of Jentic suggests using "clear, machine-readable capability definitions," referencing the Arazzo specification from the OpenAPI Initiative as a standard for documenting agent behaviors.
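In the spirit of machine-readable definitions like Arazzo (though not that specification's actual format), a defined workflow can be a declarative list of steps, each naming the capability it uses, that a runner executes in order. The steps mirror the IT-incident example above; all capability names are hypothetical:

```python
# Declarative workflow: data, not code, so it can be reviewed and governed.
WORKFLOW = [
    {"step": "gather_context", "uses": "ticket_api"},
    {"step": "find_resolution", "uses": "knowledge_base"},
    {"step": "apply_fix",       "uses": "remediation_api"},
]

# Stubbed capabilities; each takes and returns the accumulated state.
CAPABILITIES = {
    "ticket_api":      lambda state: {**state, "context": "disk full on host-7"},
    "knowledge_base":  lambda state: {**state, "fix": "rotate logs"},
    "remediation_api": lambda state: {**state, "applied": True},
}

def run_workflow(workflow, state=None):
    state = dict(state or {})
    for step in workflow:
        state = CAPABILITIES[step["uses"]](state)
    return state

print(run_workflow(WORKFLOW))
```

Because the workflow is data, the runner can enforce ordering, log every transition, and refuse steps that lack a registered capability, which is how autonomy stays "predictable and governed."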

Multi-Agent Orchestration

As agentic systems scale, the need for multi-agent orchestration becomes apparent. Gurtu of AIRRIVED explains that instead of a single generalist agent, enterprises often deploy teams of specialized agents, such as reasoning agents, retrieval agents, action agents, and validation agents. This necessitates robust connective tissue. Kovi from Amazon highlights the need for an "orchestration layer for the plan-do-evaluate loop." Emerging orchestration frameworks include LangGraph for low-level control, CrewAI for Python-based multi-agent orchestration, and Bedrock Agents for automating multi-step tasks. Open standards like the A2A protocol for agent-to-agent communications are also crucial for fostering effective collaboration among AI agents.
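The plan-do-evaluate loop Kovi mentions can be sketched with three stubbed agent roles; in practice each would be backed by a model (or a framework like LangGraph or CrewAI), but the control flow is the same:

```python
def planner(goal, feedback):
    """Reasoning agent: produce a plan, revising it when given feedback."""
    return f"attempt: {goal}" + (f" (revised after: {feedback})" if feedback else "")

def doer(plan):
    """Action agent: carry out the plan."""
    return f"result of [{plan}]"

def evaluator(result, goal):
    """Validation agent. Toy rule: accept once the result reflects a revision."""
    return "revised" in result

def orchestrate(goal, max_iters=3):
    feedback = None
    for i in range(max_iters):
        plan = planner(goal, feedback)
        result = doer(plan)
        if evaluator(result, goal):
            return {"iterations": i + 1, "result": result}
        feedback = "quality check failed"
    return {"iterations": max_iters, "result": None}

print(orchestrate("summarize Q3 incidents"))
```

The orchestration layer's job is exactly this connective tissue: routing outputs between specialist agents and deciding when to loop, escalate, or stop.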

Security and Authorization

Given the inherent tendencies of LLMs to hallucinate or deviate from expected behavior, security is arguably the most critical aspect of building safe agentic systems. Gurtu warns, "You’re no longer securing software that suggests, you’re securing software that acts." The ability of agents to alter access, trigger workflows, or remediate incidents means that "every decision becomes a potential control failure if it isn’t governed." Kilcommins points out the potentially vast "blast radius" of uncontrolled agentic actions, particularly in chained executions, and advocates for clearly defined permissions to prevent privilege escalation and sensitive data exposure.

Nuanced security methods are essential. Kovi explains that since agents decide at runtime which tools to call, traditional permission scoping is inadequate. Experts foresee "just-in-time authorization" as a key technology for securing the evolving "non-human internet." She further stresses that safety rules, such as prohibiting queries of personal information columns, should not reside solely within prompt instructions but rather be embedded in identity and access management policies and configurations.
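One way to read this advice is that every tool call passes through a policy check at the moment of invocation, rather than relying on permissions scoped up front or on prompt instructions. The sketch below uses invented roles, tools, and policy rules:

```python
# Access policy lives in configuration, not in the prompt.
POLICY = {
    "read_tickets":    {"roles": {"support-agent", "incident-agent"}},
    "restart_service": {"roles": {"incident-agent"}, "needs_approval": True},
    "query_pii":       {"roles": set()},  # denied for every agent, regardless of prompt
}

class AuthorizationError(Exception):
    pass

def authorize(agent_role: str, tool: str, approved: bool = False) -> bool:
    """Just-in-time check performed on every tool invocation."""
    rule = POLICY.get(tool)
    if rule is None or agent_role not in rule["roles"]:
        raise AuthorizationError(f"{agent_role} may not call {tool}")
    if rule.get("needs_approval") and not approved:
        raise AuthorizationError(f"{tool} requires human approval")
    return True

assert authorize("incident-agent", "read_tickets")
```

Note that `query_pii` is unreachable no matter what the model decides at runtime: the guardrail sits in policy, exactly where Kovi argues such rules belong.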

Human Checkpoints

Even with advanced security measures, sensitive actions will necessitate human oversight. Shopify adopts a "human-in-the-loop by design" approach, incorporating approval gates to prevent autonomous changes to production systems, allowing merchants to review AI-generated content before it goes live. Block employs a similar strategy for financial transactions, stating, "Our general rule is that anything touching production systems needs human checkpoints," a principle applied to their Moneybot agent within Cash App.
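An approval gate of this kind can be sketched as a queue of held actions: anything flagged as touching production waits for a human, while everything else runs immediately. The action names and flag are illustrative:

```python
pending: list[dict] = []

def execute(action: str, touches_production: bool, run):
    """Run safe actions immediately; hold production-touching ones for review."""
    if touches_production:
        pending.append({"action": action, "run": run})
        return "held for human review"
    return run()

def approve(action: str):
    """Human sign-off releases a held action for execution."""
    for item in list(pending):
        if item["action"] == action:
            pending.remove(item)
            return item["run"]()
    raise KeyError(action)

print(execute("draft product description", False, lambda: "published draft"))
print(execute("update live storefront", True, lambda: "storefront updated"))
print(approve("update live storefront"))
```

The key design choice is that the gate is structural: the agent cannot reach production by phrasing its plan differently, because the hold happens in the execution path.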

Evaluation Capabilities

Thorough upfront testing is vital to ensure agentic system outcomes align with intended results. Shopify conducts rigorous pre-deployment evaluations of agentic outputs through human testing and user simulations with specialized LLM-based judges. McNamara notes, "Once your judge reliably matches human evaluators, you can trust it at scale." Gurtu advises treating agents as "regulated systems," emphasizing the importance of sandboxing changes and testing agents in simulation environments.
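McNamara's point about judges matching human evaluators implies a calibration step: score a human-labeled sample with the judge and measure agreement before trusting it at scale. The judge below is a trivial stub (a real one would itself be a model call) and the samples are invented:

```python
def llm_judge(output: str) -> bool:
    """Stand-in scoring rule; a real judge would be an LLM prompt."""
    return "error" not in output.lower()

def agreement_rate(samples: list[tuple[str, bool]]) -> float:
    """Fraction of samples where the judge matches the human label."""
    matches = sum(llm_judge(out) == human for out, human in samples)
    return matches / len(samples)

labeled = [
    ("Order refunded successfully.", True),
    ("Error: refund API timed out.", False),
    ("Refund issued, receipt emailed.", True),
    ("Everything looks fine.", False),     # human caught a subtle failure
]
print(agreement_rate(labeled))  # → 0.75: judge misses the subtle failure
```

Only once this rate is consistently high on fresh labeled samples does it make sense to let the judge replace human review in bulk evaluations.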

Behavioral Observability

A fundamental layer for agentic systems is observability, which extends beyond traditional monitoring to capture advanced signals like the reasons behind agent failures or specific action choices. Kussberg of Sonar underscores that "Observability must be built in from day one," requiring transparency into every step of execution, including prompts, tool calls, intermediate decisions, and final outputs. Enhanced observability fuels continuous system improvement, as Kussberg states, "transparency fuels improvement."
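Building observability in "from day one" can be as simple as wrapping every agent step so its inputs, output, and latency land in a structured trace. The steps below are hypothetical stand-ins for tool calls and decisions:

```python
import time

TRACE: list[dict] = []

def traced(step_name: str):
    """Decorator that records each step's inputs, output, and duration."""
    def wrap(fn):
        def inner(*args, **kwargs):
            start = time.perf_counter()
            out = fn(*args, **kwargs)
            TRACE.append({
                "step": step_name,
                "inputs": {"args": args, "kwargs": kwargs},
                "output": out,
                "ms": round((time.perf_counter() - start) * 1000, 2),
            })
            return out
        return inner
    return wrap

@traced("tool:search_kb")
def search_kb(query: str) -> str:
    return f"3 articles about {query}"

@traced("decision:pick_fix")
def pick_fix(evidence: str) -> str:
    return "restart service"

search_kb("login failures")
pick_fix("3 articles about login failures")
print([t["step"] for t in TRACE])
```

With every prompt, tool call, and intermediate decision captured, answering "why did the agent do that?" becomes a query over the trace rather than guesswork.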

Context Optimization Strategies

A near-unanimous consensus among experts is that providing AI agents with minimal, relevant data is significantly more effective than overwhelming them with information. This is crucial to avoid exceeding context window limitations and degrading output quality. Brosamer of Block emphasizes, "Thoughtful data curation matters far more than data volume," adding that "The quality of an agent’s output is directly tied to the quality of its context." Block engineers achieve this through clear README files, consistent documentation standards, well-structured project hierarchies, and adherence to semantic conventions that facilitate agent data surfacing.

Kussberg of Sonar reiterates, "Agentic systems don’t need more data, they need the right data at the right time." Effective systems equip agents with versatile discovery tools and allow them to iteratively retrieve information until sufficient context is acquired. The prevailing philosophy favors "progressive disclosure of information." Shopify implements "just-in-time context delivery," where relevant context is provided alongside tool data only when needed, rather than overloading the initial system prompt. Kovi points out the importance of semantic nuances in context, warning that if an agent misunderstands context-specific terminology, it can produce confident but incorrect answers, which can be difficult to detect.
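Just-in-time context delivery can be sketched as follows: the initial prompt carries only the task and a hint about how to fetch more, and documents enter the context only when a step needs them. The documents and helper names are invented:

```python
# Corpus the agent could draw on; only some of it is ever needed.
DOCS = {
    "returns": "Returns accepted within 30 days with receipt.",
    "pricing": "Wholesale discounts start at 50 units.",
    "history": "Customer has two prior returns, both approved.",
}

def initial_prompt(task: str) -> str:
    """Small starting prompt: no documents baked in."""
    return f"Task: {task}\n(Fetch documents on demand with fetch_context.)"

def fetch_context(topic: str) -> str:
    return DOCS.get(topic, "no document found")

prompt = initial_prompt("decide whether to approve this return")
# The agent pulls context only when its plan calls for it:
prompt += "\n" + fetch_context("returns")
prompt += "\n" + fetch_context("history")
print(prompt)
```

The pricing document never enters the context because the task never needs it: the agent stays inside its context window and avoids the quality degradation that comes with indiscriminate stuffing.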

Architectural Best Practices

Beyond the core components, several architectural best practices are emerging for agentic systems. A primary recommendation is the realization that not all processes need to be agentified. While LLMs and MCP integrations are powerful for novel, scalable, and situationally-aware tasks, MCP can be overly complex for repetitive, deterministic automation with static context and strict security requirements. Kilcommins suggests distinguishing between adaptive and deterministic behaviors, codifying the latter to enhance system stability.

Identifying reusable use cases is key to determining prime areas for agentic implementation. Ramsey notes that successful deployments often begin with "identifying a high-friction process," such as employee service requests, new-hire onboarding, or customer incident response. Gurtu advises focusing on concrete business goals, stating, "Start with decisions, not demos." He cautions against treating agents as stateless chatbots or attempting to replace humans wholesale.

Narrowing an agent’s autonomy can also yield better results. Kussberg advocates for agents to function as specialists rather than generalists. McNamara of Shopify explains that tool boundaries can blur once an agent has 20-50 tools, so Shopify opts for a sub-agent architecture built on low-level tools, while advising against full multi-agent architectures in the early stages. Shopify's approach involves building granular tools and teaching the system to translate natural language into these low-level commands, rather than developing a bespoke tool for each scenario.
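A rough sketch of this low-level-tools idea: instead of one purpose-built tool per scenario, the system maps natural language onto a small set of granular commands it can compose. The command set and the keyword-based translation rule are purely illustrative; in a real system the translation is the model's job:

```python
# Granular, reusable commands rather than scenario-specific tools.
COMMANDS = {
    "list_products": lambda store: ["tea", "mugs"],
    "set_discount":  lambda store, item, pct: f"{item}: {pct}% off",
}

def translate(utterance: str) -> list[tuple]:
    """Stand-in for the model: map natural language to low-level calls."""
    calls = []
    if "discount" in utterance:
        calls.append(("set_discount", ("shop-1", "mugs", 10)))
    return calls

def execute(utterance: str):
    return [COMMANDS[name](*args) for name, args in translate(utterance)]

print(execute("apply a 10% discount to mugs"))  # → ['mugs: 10% off']
```

New merchant scenarios then require no new tools, only better translation, which is where the model's generality pays off.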

Experts also offer additional wisdom:

  • Modular Design: Breaking down complex tasks into smaller, manageable modules enhances flexibility and maintainability.
  • Agent-to-Agent Communication Standards: Adopting open protocols for inter-agent communication facilitates seamless collaboration.
  • Iterative Development: Building and refining agent capabilities incrementally allows for continuous learning and adaptation.
  • Focus on Value Proposition: Clearly defining the business value and desired outcomes of agentic systems is crucial for guiding development and ensuring ROI.

The Future for Agentic Systems

Agentic AI development has progressed at an extraordinary pace, with patterns for agentic systems beginning to solidify. Experts anticipate a significant increase in multi-agent systems development, driving the need for more sophisticated orchestration patterns and a greater reliance on open standards. This evolution is expected to profoundly reshape knowledge work.

Brosamer of Block forecasts that "in 2026, we will see experimentation with frameworks to structure ‘factories’ of agents to coordinate producing complex knowledge work, starting with coding." A major challenge will be optimizing existing information flows for agentic use cases. The future may also see a greater emphasis on alternative clouds and edge-based inference to reduce latency and move workloads closer to data sources. Weil of Akamai posits, "The future of competitive AI demands proximity, not just processing power," as agents need to interact with the real world as events unfold.

In conclusion, the development of agentic systems is a complex and maturing endeavor. It demands a synthesis of novel technologies, microservices-inspired design principles, and robust security guardrails to achieve meaningful autonomy at scale. The future is undoubtedly agentic, but the success of these systems will hinge on intelligent design that balances autonomy with control and predictability.
