---
title: Choosing Deployment Options
sidebar_position: 10
---

# Choosing Deployment Options

Agent Mesh offers flexible deployment options designed to meet different operational requirements. Understanding these options helps you choose the right approach for your specific environment and scale needs.

Agent and workflow deployments follow identical processes in Solace Agent Mesh. Both agents and workflows are configured using YAML files and deployed using the same commands and infrastructure. From a deployment perspective, workflows are treated as specialized agents that orchestrate other agents through defined steps.

:::note
Workflows cannot run independently. They must be triggered by an LLM agent, such as the Orchestrator. Therefore, at least one LLM agent must be deployed alongside the workflows. This requirement may change in the future.
:::

## Development Environment

During development, simplicity and rapid iteration are key priorities. The Agent Mesh CLI provides a streamlined way to run your entire project as a single application, making it easy to test changes and debug issues locally.

The development setup automatically loads environment variables from your configuration file (typically a `.env` file at the project root), eliminating the need for complex environment management.

**Run your entire project:**

```bash
sam run
```

This command starts all configured components together, providing immediate feedback and allowing you to see how different agents interact within your mesh.
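To make the `.env` behavior concrete, the following sketch mimics in plain shell what loading such a file amounts to: every `KEY=value` line becomes an exported environment variable. It is an illustration only, not the CLI's actual loader. `USE_TEMPORARY_QUEUES` is described later on this page; treating `NAMESPACE` as an environment variable, and both values, are assumptions made for the example.

```bash
# Sketch only: mimic loading a .env file into the current shell.
# The values below are placeholders, not recommended settings.
workdir=$(mktemp -d)
cat > "$workdir/.env" <<'EOF'
NAMESPACE=sam/
USE_TEMPORARY_QUEUES=true
EOF

set -a                # export every variable assigned while sourcing
. "$workdir/.env"
set +a

echo "NAMESPACE=$NAMESPACE USE_TEMPORARY_QUEUES=$USE_TEMPORARY_QUEUES"
rm -rf "$workdir"
```

Running the same kind of check against your real `.env` file is a quick way to see which values a locally started component is likely to receive.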

**Run a specific agent or workflow:**

```bash
sam run <agent or workflow config file path>
```

**Run multiple components together:**

```bash
sam run <agent config file path> <workflow config file path>
```

This flexibility allows you to test individual components in isolation or verify how agents and workflows interact within your agent mesh. Workflows often invoke multiple standalone agents, which must be loaded via the `run` command. For example, if workflow W1 requires agents A1 and A2, the following command runs them all together:

```bash
sam run <agent A1 config file path> <agent A2 config file path> <workflow W1 config file path>
```

If workflow W1 depends on workflow W2, you can invoke them together by running the following command:

```bash
sam run <agent A1 config file path> <agent A2 config file path> <workflow W2 config file path> <workflow W1 config file path>
```
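Because a workflow only functions when every agent and workflow it depends on is passed to the same command, it can help to verify the file list before starting. The sketch below is a hypothetical helper and not part of the `sam` CLI; the name `check_configs` and the example paths are illustrative.

```bash
# Hypothetical pre-flight helper: confirm each config file exists
# before handing the list to `sam run`.
check_configs() {
  for cfg in "$@"; do
    if [ ! -f "$cfg" ]; then
      echo "missing config: $cfg" >&2
      return 1
    fi
  done
  return 0
}

# Example usage (paths are placeholders for A1, A2, W2, and W1):
#   set -- configs/agents/a1.yaml configs/agents/a2.yaml \
#          configs/workflows/w2.yaml configs/workflows/w1.yaml
#   check_configs "$@" && sam run "$@"
```

The helper stops at the first missing file, so a mistyped path is reported before any component starts.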

## Production Environment

Production deployments require different considerations than development environments. You need reproducible builds, scalable infrastructure, and robust monitoring capabilities. Containerization addresses these requirements by providing consistent runtime environments and enabling modern orchestration platforms.

We recommend using Docker for single-node deployments or Kubernetes for multi-node, scalable deployments. These technologies ensure your application runs consistently across different environments and can scale to meet demand.

:::note Platform Compatibility
If your host system architecture is not `linux/amd64`, add the `--platform linux/amd64` flag when you run the container to ensure compatibility with the pre-built images.
:::
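As a rough illustration of the note above, the following sketch derives the flag from the machine architecture reported by `uname -m`. The helper name `platform_flag` is made up for this example, and the check covers only the CPU architecture.

```bash
# Return the extra `docker run` flag needed for a given machine architecture.
platform_flag() {
  case "$1" in
    x86_64|amd64) echo "" ;;                        # already amd64, no flag needed
    *)            echo "--platform linux/amd64" ;;  # for example arm64 or aarch64
  esac
}

flag=$(platform_flag "$(uname -m)")
echo "docker run ${flag:+$flag }solace/solace-agent-mesh:latest"
```

On an Apple Silicon or other ARM host, the printed command includes `--platform linux/amd64`; on an amd64 host the flag is omitted.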

### Deploying with Docker

Docker provides an excellent foundation for production deployments because it packages your application with all its dependencies into a portable container. This approach ensures consistent behavior across different environments and simplifies deployment processes.

The following Dockerfile demonstrates how to containerize an Agent Mesh project:

```Dockerfile
FROM solace/solace-agent-mesh:latest
WORKDIR /app

# Install Python dependencies
COPY ./requirements.txt /app/requirements.txt
RUN python3.11 -m pip install --no-cache-dir -r /app/requirements.txt

# Copy project files
COPY . /app

CMD ["run", "--system-env"]

# To run one specific agent, use:
# CMD ["run", "--system-env", "configs/agents/main_orchestrator.yaml"]

# To run one specific workflow, use:
# CMD ["run", "--system-env", "YOUR-WORKFLOW.yaml"]
```

You can deploy your workflow’s dependencies (agents and other workflows) in the same container, as shown below, or in separate containers.

```Dockerfile
CMD ["run", "--system-env", "YOUR-AGENT.yaml", "YOUR-WORKFLOW.yaml"]
```

To optimize build performance and security, create a `.dockerignore` file that excludes unnecessary files from the Docker build context:

```
.env
*.log
dist
.git
.vscode
.DS_Store
```
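With the Dockerfile and `.dockerignore` in place, building and starting the image could look like the sketch below. The image tag `my-agent-mesh:dev` is a made-up name, and passing variables with `--env-file` is an assumption that suits local testing of the image, on the further assumption that `--system-env` makes the CLI read the container's environment. The `run_or_print` helper is hypothetical; it only prints each command unless `DRY_RUN=0` is set.

```bash
IMAGE="my-agent-mesh:dev"   # hypothetical image tag
ENV_FILE=".env"             # kept out of the image by .dockerignore

# Print the command by default; execute it only when DRY_RUN=0.
run_or_print() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

run_or_print docker build -t "$IMAGE" .
run_or_print docker run --rm --env-file "$ENV_FILE" "$IMAGE"
```

For production, the same image would typically receive its variables from a secret manager rather than from a local file.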

### Deploying with Kubernetes

Kubernetes excels at managing containerized applications at scale, providing features like automatic scaling, rolling updates, and self-healing capabilities. When your Agent Mesh deployment needs to handle varying loads or requires high availability, Kubernetes becomes the preferred orchestration platform.

Agent Mesh provides Helm charts for Kubernetes deployments that handle resource management, scaling, and configuration. For prerequisites, Helm setup, and production configurations, see [Kubernetes](kubernetes/kubernetes.md).

### Separating and Scaling Components

A microservices approach to deployment offers significant advantages for production systems. By splitting your Agent Mesh components into separate containers, you achieve better fault isolation, independent scaling, and more granular resource management.

This architectural pattern ensures that if one component experiences issues, the rest of your system continues operating normally. When the failed component restarts, it automatically rejoins the mesh through the Solace event broker, maintaining system resilience.

To implement component separation:

**Reuse the same Docker image**: Your base container image remains consistent across all components, simplifying maintenance and ensuring compatibility.

**Customize startup commands**: Each container runs only the components it needs by specifying different configuration files in the startup command.

**Scale independently**: Components with higher resource demands or traffic can be scaled separately, optimizing resource utilization and cost.

For example, you might run your main orchestrator in one deployment while scaling your specialized tool agents in separate deployments based on demand.
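The pattern can be sketched with plain `docker run` commands: one image, one container per configuration file. This assumes the base image's entrypoint is the Agent Mesh CLI, as the `CMD ["run", ...]` form in the Dockerfile above suggests, so that arguments placed after the image name replace the default `CMD`. The image tag, the `tool_agent.yaml` path, and the `component_commands` helper are hypothetical, and environment injection is omitted for brevity.

```bash
IMAGE="my-agent-mesh:dev"   # hypothetical image tag

# Emit one `docker run` command per component config.
component_commands() {
  for cfg in "$@"; do
    name=$(basename "$cfg" .yaml)   # container name derived from the file name
    echo "docker run -d --name $name $IMAGE run --system-env $cfg"
  done
}

component_commands \
  configs/agents/main_orchestrator.yaml \
  configs/agents/tool_agent.yaml
```

In Kubernetes, the same idea usually maps to one Deployment per component, each with its own replica count.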

### Managing Storage Requirements

When deploying multiple containers, shared storage becomes critical for maintaining consistency across your Agent Mesh deployment. All container instances must access the same storage location with identical configurations to ensure proper operation.

:::warning Shared Storage Requirement
If you are using multiple containers, ensure all instances access the same storage with identical configurations. Inconsistent storage configurations can lead to data synchronization issues and unpredictable behavior.
:::

Consider using persistent volumes in Kubernetes or shared file systems in Docker deployments to meet this requirement.

:::note
Workflow deployment requires special attention, especially when workflows call agents running in separate containers. Agents must be discoverable and accessible for the workflow to function correctly.
:::

### Implementing Security Best Practices

Production deployments require robust security measures to protect sensitive data and ensure system integrity. Implementing these practices helps safeguard your Agent Mesh deployment against common security threats.

**Environment Variables and Secrets Management**: Never store sensitive information like API keys, passwords, or certificates in `.env` files or container images. Instead, use dedicated secret management solutions such as AWS Secrets Manager, HashiCorp Vault, or Kubernetes Secrets. These tools provide encryption at rest, access controls, and audit trails for sensitive data.

**TLS Encryption**: All communication channels should use TLS encryption to protect data in transit. This includes communication between Agent Mesh components and connections to the Solace event broker. TLS prevents eavesdropping and ensures data integrity during transmission.

**Container Security**: Maintain security throughout your container lifecycle by regularly updating base images to include the latest security patches. Implement security scanning tools like Trivy or Clair in your CI/CD pipeline to identify vulnerabilities before deployment. Additionally, run containers with minimal privileges and avoid running processes as root when possible.
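When secrets are injected at runtime (for instance, a Kubernetes Secret exposed as environment variables), a small pre-flight check can confirm that they actually arrived before the application starts. The following is a minimal sketch under that assumption; `require_env` and `DEMO_API_KEY` are hypothetical names, not part of Agent Mesh.

```bash
# Fail fast when a required variable is missing or empty.
require_env() {
  missing=0
  for name in "$@"; do
    eval "value=\${$name:-}"        # indirect lookup of the variable named in $name
    if [ -z "$value" ]; then
      echo "missing required variable: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}

DEMO_API_KEY="placeholder"          # stands in for a value injected by a secret manager
if require_env DEMO_API_KEY; then
  echo "environment looks complete"
fi
```

The function reports only variable names, never their values, so it is safe to leave in container startup logs.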

### Configuring Solace Event Broker

The Solace event broker serves as the communication backbone for your agent mesh, handling all message routing and delivery between components. For production environments, using a Solace Cloud-managed event broker provides significant advantages over self-managed installations.

Solace Cloud-managed event brokers offer built-in high availability, automatic scaling, security updates, and professional support. These managed services eliminate the operational overhead of maintaining event broker infrastructure while providing enterprise-grade reliability and performance.

For more information about cloud-managed options, see [Solace Cloud](https://solace.com/products/event-broker/). For detailed configuration instructions, see [Configuring the Event Broker Connection](../installing-and-configuring/configurations.md#event-broker-connection).

### Setting up Queue Templates

When the `app.broker.temporary_queue` parameter is set to `true` (the default), the system uses [temporary endpoints](https://docs.solace.com/Messaging/Guaranteed-Msg/Endpoints.htm#temporary-endpoints) for A2A communication. Temporary queues are automatically created and deleted by the broker, which simplifies management and removes the need for manual cleanup. However, temporary queues do not support multiple client connections to the same queue, which may be limiting in scenarios where you run multiple instances of the same agent or need to start a new instance while an old one is still running.

If you set `temporary_queue` to `false`, the system will create a durable queue for the client. Durable queues persist beyond the lifetime of a client connection, allowing multiple clients to connect to the same queue and ensuring messages are not lost if the client disconnects. However, this requires manual management of queues, including cleanup of unused ones.

:::tip
For production environments that are container-managed (for example, Kubernetes), we recommend setting `temporary_queue` to `false` by setting the environment variable `USE_TEMPORARY_QUEUES=false`.
Using temporary queues in these environments can cause startup issues, since a new container may fail to connect if the previous instance is still running and holding the queue. Durable queues avoid this by allowing multiple agent instances to share the same queue.
:::
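How the variable reaches the container depends on your tooling. The sketch below sets it in the current shell and prints two commonly used ways to forward it; the image tag `my-agent-mesh:dev` and the deployment name `my-agent` are placeholders, and a Helm-based installation may expose this setting through chart values instead.

```bash
# Value recommended above for container-managed environments.
export USE_TEMPORARY_QUEUES=false

# Printed rather than executed, so the snippet runs without Docker or a cluster.
echo "docker run -e USE_TEMPORARY_QUEUES=$USE_TEMPORARY_QUEUES my-agent-mesh:dev"
echo "kubectl set env deployment/my-agent USE_TEMPORARY_QUEUES=$USE_TEMPORARY_QUEUES"
```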

To prevent messages from piling up in a durable queue when an agent is not running, the queue should be configured with a message TTL (time-to-live) and the **Respect TTL** option enabled. To apply these settings automatically for all new queues, you can create a [Queue Template](https://docs.solace.com/Messaging/Guaranteed-Msg/Configuring-Endpoint-Templates.htm) for your Solace Agent Mesh clients.

To create a queue template in the Solace Cloud Console:

1. Navigate to **Message VPNs** and select your VPN.
2. Go to the **Queues** page.
3. Open the **Templates** tab.
4. Click **+ Queue Template**.

Use the following settings for the template:

- **Queue Name Filter** = `{NAMESPACE}/>`
  (Replace `{NAMESPACE}` with the namespace defined in your configuration, for example, `sam/`)
- **Respect TTL** = `true`
  *(Under: Advanced Settings > Message Expiry)*
- **Maximum TTL (sec)** = `18000`
  *(Under: Advanced Settings > Message Expiry)*
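For orientation, the maximum TTL is expressed in seconds. A quick conversion shows what the suggested value corresponds to, which may help when choosing a different limit:

```bash
MAX_TTL_SEC=18000
echo "$((MAX_TTL_SEC / 60)) minutes = $((MAX_TTL_SEC / 3600)) hours"
# prints: 300 minutes = 5 hours
```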

:::info
Queue templates are only applied when a new queue is created from the messaging client.
If you have already been running Agent Mesh with `temporary_queue` set to `false`, your durable queues were created before the template existed.
To apply TTL settings to those queues, either:

- Enable **TTL** and **Respect TTL** manually in the Solace console on each queue, or
- Delete the existing queues and restart Agent Mesh to have them recreated automatically using the new template.
:::