---
title: Agents
sidebar_position: 220
---

Agents are specialized processing units within the Agent Mesh framework that are built around the Google Agent Development Kit (ADK) and provide the core intelligence layer. They:

* perform specific tasks or provide domain-specific knowledge or capabilities
* integrate with the ADK runtime for advanced AI capabilities including tool usage, memory management, and session handling
* play a crucial role in the system's ability to handle a wide range of tasks and adapt to various domains

:::tip[In one sentence]
Agents are intelligence units that communicate through the A2A protocol to provide capabilities beyond those of the basic orchestrator.
:::

## Key Functions

1. **ADK Integration**: Agents are built using the Google Agent Development Kit, providing advanced AI capabilities including tool usage, memory management, and artifact handling.

2. **AI-Enabled**: Agents come packaged with access to large language models (LLMs) and can utilize various tools.

3. **Dynamic Discovery**: New agents can self-register/deregister and be discovered dynamically through broadcast messages without requiring changes to the running system.

4. **Tool Ecosystem**: Agents have access to built-in tools for artifact management, data analysis, web scraping, and peer-to-peer delegation.

5. **Session Management**: Agents support conversation continuity through ADK's session management capabilities.

6. **Independence**: Agents are modularized and can be updated or replaced independently of other components.

## Agent Design

Agents in Agent Mesh are built around the Solace AI Connector (SAC) component with ADK. Agent Mesh agents are complete, self-contained units that can carry out specific tasks or provide domain-specific knowledge or capabilities. Each agent is defined by a YAML configuration file.
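
The dynamic discovery behavior listed under Key Functions can be pictured with a small in-memory registry. This is only a conceptual sketch, not the framework's implementation: the `AgentRegistry` class, the message shape (`name`, `description`), and the agent names are all hypothetical and chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class AgentRegistry:
    """Hypothetical in-memory view of the agents currently known to the mesh."""

    agents: dict = field(default_factory=dict)

    def on_discovery(self, message: dict) -> None:
        # A discovery broadcast announces (or refreshes) an agent.
        # The "name"/"description" keys are an assumed message shape.
        self.agents[message["name"]] = message["description"]

    def on_deregister(self, name: str) -> None:
        # Removing an unknown agent is a no-op, so a repeated
        # shutdown notice does not raise an error.
        self.agents.pop(name, None)


registry = AgentRegistry()
registry.on_discovery({"name": "sql_agent", "description": "Runs SQL queries"})
registry.on_discovery({"name": "rag_agent", "description": "Answers questions from documents"})
registry.on_deregister("sql_agent")
print(sorted(registry.agents))  # prints ['rag_agent']
```

The point of the sketch is that nothing else in the system has to change when an agent appears or disappears: other components only consult the current contents of the registry.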

Each agent integrates with:
- **ADK Runtime**: For AI model access, tool execution, and session management
- **A2A Protocol**: For standardized agent-to-agent communication
- **Tool Registry**: Access to built-in and custom tools
- **Artifact Service**: For file handling and management

For example, an agent configured with SQL database tools can execute queries, perform data analysis, and generate visualizations through the integrated tool ecosystem, all while maintaining conversation context through its session management.

### The Agent Lifecycle

Agents in Agent Mesh follow the A2A protocol lifecycle and interact with the agent registry:

- **Discovery**: On startup, agents begin broadcasting discovery messages to announce their availability and capabilities to the agent mesh.

- **Active**: The agent listens for A2A protocol messages on its designated topics and processes incoming tasks through the ADK runtime.

- **Execution**: The agent works on a task. It can also delegate tasks to other agents through the peer-to-peer A2A communication protocol.

- **Cleanup**: When shutting down, agents perform session cleanup and deregister from the agent mesh.

### Potential Agent Examples

- **RAG (Retrieval Augmented Generation) Agent**: An agent that can retrieve information based on a natural language query using an embedding model and vector database, and then generate a response using a language model.

- **External API Bridge**: An agent that acts as a bridge to external APIs, retrieving information from third-party services such as weather APIs or product information databases.

- **Internal System Lookup**: An agent that performs lookups in internal systems, such as a ticket management system or a customer relationship management (CRM) database.

- **Natural Language Processing Agent**: An agent that can perform tasks like sentiment analysis, named entity recognition, or language translation.

## Tool Ecosystem

Agents perform tasks by using **tools**. A tool is a specific capability, like querying a database, calling an external API, or generating an image. The Agent Mesh framework provides a flexible and powerful tool ecosystem, allowing you to equip your agents with the right capabilities for any job.

There are three primary ways to add tools to an agent:

### 1. Built-in Tools

Agent Mesh includes a rich library of pre-packaged tools for common tasks like data analysis, file management, and web requests. These are the easiest to use and can be enabled with just a few lines of configuration.

- **Use Case**: For standard, out-of-the-box functionality.
- **Learn More**: See the [Built-in Tools Reference](./builtin-tools/builtin-tools.md) for a complete list and configuration details.

### 2. Custom Python Tools

For unique business logic or specialized tasks, you can create your own tools using Python. This is the most powerful and flexible method, supporting everything from simple functions to advanced, class-based tool factories that can generate multiple tools programmatically.

- **Use Case**: For implementing custom logic, integrating with proprietary systems, or creating dynamically configured tools.
- **Learn More**: See the [Creating Python Tools](../developing/creating-python-tools.md) guide for a complete walkthrough.

### 3. MCP (Model Context Protocol) Tools

For integrating with external, standalone tool servers that conform to the Model Context Protocol, you can configure an MCP tool. This allows agents to communicate with tools running in separate processes or on different machines.

- **Use Case**: For integrating with existing MCP-compliant tool servers or language-agnostic tool development.
- **Learn More**: See the [MCP Integration Tutorial](../developing/tutorials/mcp-integration.md).

## Agent Card

The Agent Card is a public-facing profile that describes an agent's identity, capabilities, and how to interact with it. It functions like a digital business card, allowing other agents and clients within Agent Mesh to discover what an agent can do. This information is published by the agent and is crucial for dynamic discovery and interoperability.

The Agent Card is defined in the agent's YAML configuration file under the `agent_card` section.

### Key Fields

You can configure the following fields in the `agent_card` section:

- **`description`**: A summary of the agent's purpose and capabilities.
- **`defaultInputModes`**: A list of supported MIME types for input (e.g., `["text/plain", "application/json", "file"]`).
- **`defaultOutputModes`**: A list of supported MIME types for output.
- **`skills`**: A list of specific skills the agent possesses. Each skill corresponds to a capability, often backed by a tool.

### Skills

A skill describes a specific function the agent can perform. It provides granular detail about the agent's abilities.

Key attributes of a skill include:

- **`id`**: A unique identifier for the skill, which should match the `tool_name` if the skill is directly mapped to a tool.
- **`name`**: A human-readable name for the skill.
- **`description`**: A clear explanation of what the skill does, which helps the LLM (and other agents) decide when to use it.

### Example Configuration

Here is an example of an `agent_card` configuration for a "Mermaid Diagram Generator" agent:

```yaml
# ... inside app_config ...
agent_card:
  description: "An agent that generates PNG images from Mermaid diagram syntax."
  defaultInputModes: ["text"] # Expects Mermaid syntax as text
  defaultOutputModes: ["text", "file"] # Confirms with text, outputs file artifact
  skills:
    - id: "mermaid_diagram_generator"
      name: "Mermaid Diagram Generator"
      description: "Generates a PNG image from Mermaid diagram syntax. Input: mermaid_syntax (string), output_filename (string, optional)."
```

This card clearly communicates that the agent can take text (the Mermaid syntax) and produce a file (the PNG image), and it details the specific "mermaid_diagram_generator" skill it offers. For more details on creating agents and configuring their cards, see [Creating Custom Agents](../developing/create-agents.md).

## User-Defined Agents

Using the Agent Mesh CLI, you can create your own agents. Agents are configured through YAML files that specify:

- Agent name and instructions
- LLM model configuration
- Available tools and capabilities
- Artifact and session management settings
- Discovery settings

The following Agent Mesh CLI command creates an agent configuration:

```sh
sam add agent my-agent [--gui]
```

For more information, see [Creating Custom Agents](../developing/create-agents.md).

## Remote A2A Agents

In addition to agents that run natively within Agent Mesh, you can integrate external agents that communicate using the A2A protocol over HTTPS. These remote agents run on separate infrastructure but can still participate in collaborative workflows with mesh agents.
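
Before relying on an external agent, it can help to check that the card it advertises carries the fields described in the Agent Card section. The following is a minimal sketch under several assumptions: the card has already been obtained and parsed into a dictionary, it uses the same field names as the Agent Card section, and every listed field is treated as expected (the documentation does not state which fields are mandatory). The `check_agent_card` helper is hypothetical and not part of any Agent Mesh API.

```python
# Field names are taken from the Agent Card section; treating all of them
# as expected is an assumption made for this sketch.
EXPECTED_CARD_FIELDS = ("description", "defaultInputModes", "defaultOutputModes", "skills")
EXPECTED_SKILL_FIELDS = ("id", "name", "description")


def check_agent_card(card: dict) -> list:
    """Return a list of problems found; an empty list means nothing was flagged."""
    problems = [f"missing field: {name}" for name in EXPECTED_CARD_FIELDS if name not in card]
    seen_ids = set()
    for index, skill in enumerate(card.get("skills", [])):
        for name in EXPECTED_SKILL_FIELDS:
            if name not in skill:
                problems.append(f"skill {index} missing field: {name}")
        skill_id = skill.get("id")
        # Skill ids are described as unique identifiers, so flag repeats.
        if skill_id in seen_ids:
            problems.append(f"duplicate skill id: {skill_id}")
        if skill_id is not None:
            seen_ids.add(skill_id)
    return problems


card = {
    "description": "An agent that generates PNG images from Mermaid diagram syntax.",
    "defaultInputModes": ["text"],
    "defaultOutputModes": ["text", "file"],
    "skills": [
        {
            "id": "mermaid_diagram_generator",
            "name": "Mermaid Diagram Generator",
            "description": "Generates a PNG image from Mermaid diagram syntax.",
        }
    ],
}
print(check_agent_card(card))  # prints []
```

A check like this only confirms that the card is structurally complete; it says nothing about whether the agent actually behaves as its skills describe.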

Remote A2A agents are useful when you need to:

- Integrate third-party agents from vendors or partners
- Connect agents running in different cloud environments or on-premises systems
- Maintain service isolation while enabling collaboration
- Gradually migrate existing A2A agents to the mesh

To integrate external agents, you use a proxy component that acts as a protocol bridge between A2A over HTTPS and A2A over Solace event mesh. The proxy handles authentication, artifact flow, and discovery, making remote agents appear as native mesh agents to other components.

For detailed information on configuring and deploying proxies for remote agents, see [Proxies](./proxies.md).

## Agent Plugins

You can also use agents built by the community or Solace directly in your app with little to no configuration.

For more information, see [Use a Plugin](./plugins.md#use-a-plugin).