# workflows/

**Context:** Multi-Agent Workflow Definitions & Orchestration

This directory contains workflow definitions that coordinate multiple agents to complete complex organizational tasks.

## Purpose

Workflows enable:
- **Multi-agent coordination** - Sequential and parallel agent tasks
- **Decision orchestration** - Automated decision flows with proper authority
- **Human-in-the-loop** - Pause for human approval at critical points
- **Process automation** - Repeatable organizational processes

## Directory Structure

```
workflows/
├── claude.md              # This file
├── examples/              # Example workflow definitions
│   ├── feature_development.exs
│   ├── hiring_process.exs
│   ├── incident_response.exs
│   ├── strategic_planning.exs
│   └── budget_allocation.exs
└── README.md              # Workflow usage guide
```

## Workflow DSL

Workflows use the `EchoShared.Workflow.Definition` DSL:

```elixir
alias EchoShared.Workflow.Definition

Definition.new(
  "workflow_name",                  # Unique workflow identifier
  "Workflow description",           # Human-readable description
  [:list, :of, :participating_agents],  # Required agent roles
  [
    # Workflow steps (see below)
  ]
)
```

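To illustrate how the list of participating agents relates to the steps, the sketch below checks that every role used in a `:request` step is also declared as a participant. It works on plain tuples and does not call `EchoShared.Workflow.Definition`; the `RoleCheckSketch` module and its functions are hypothetical and exist only for this example.

```elixir
defmodule RoleCheckSketch do
  @moduledoc "Hypothetical helper for illustration; not part of EchoShared."

  # Collect every agent role referenced by :request steps, descending into
  # :parallel, :conditional, and :loop containers. Other step types
  # (for example :decision or :pause) contribute no roles in this sketch.
  def roles_in(steps) when is_list(steps) do
    steps |> Enum.flat_map(&roles_in/1) |> Enum.uniq()
  end

  def roles_in({:request, role, _tool, _params}), do: [role]
  def roles_in({:parallel, steps}), do: roles_in(steps)
  def roles_in({:conditional, _condition, if_true, if_false}), do: roles_in([if_true, if_false])
  def roles_in({:loop, _condition, steps}), do: roles_in(steps)
  def roles_in(_other), do: []

  # Roles used in steps but missing from the declared participant list.
  def undeclared_roles(participants, steps), do: roles_in(steps) -- participants
end

steps = [
  {:request, :product_manager, "define_feature", %{}},
  {:parallel, [
    {:request, :senior_developer, "implement_backend", %{}},
    {:request, :uiux_engineer, "design_ui", %{}}
  ]}
]

# :uiux_engineer is used in a step but not declared as a participant.
IO.inspect(RoleCheckSketch.undeclared_roles([:product_manager, :senior_developer], steps))
# => [:uiux_engineer]
```
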
## Workflow Step Types

### 1. Request Step

Execute an agent tool:

```elixir
{:request, :agent_role, "tool_name", %{
  param1: "value1",
  param2: :from_previous_step  # Use output from previous step
}}
```

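One way to picture placeholders such as `:from_previous_step` is as values substituted into the params map just before the tool call. The sketch below is an assumption about that behavior, not the engine's code: `PlaceholderSketch` is hypothetical, and it assumes `:from_input` looks up the same key in the workflow input.

```elixir
defmodule PlaceholderSketch do
  @moduledoc "Hypothetical illustration of placeholder substitution."

  # Replace placeholder atoms in a step's params with concrete values.
  # `previous` is the prior step's result; `input` is the workflow input.
  def resolve(params, previous, input) do
    Map.new(params, fn
      {key, :from_previous_step} -> {key, previous}
      {key, :from_input} -> {key, Map.get(input, key)}
      {key, value} -> {key, value}
    end)
  end
end

params = %{name: :from_input, requirements: :from_previous_step, priority: "high"}

params
|> PlaceholderSketch.resolve(%{spec: "login flow"}, %{name: "User Authentication"})
|> IO.inspect()
```

Literal values such as `"high"` pass through unchanged; only the placeholder atoms are replaced.
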
### 2. Decision Step

Trigger a decision with specified mode:

```elixir
{:decision, %{
  type: "budget_approval",
  mode: :autonomous,  # or :collaborative, :hierarchical, :human
  initiator: :ceo,
  context: %{amount: 500_000}
}}
```

### 3. Parallel Steps

Execute multiple steps concurrently:

```elixir
{:parallel, [
  {:request, :senior_developer, "implement_backend", %{}},
  {:request, :uiux_engineer, "design_ui", %{}},
  {:request, :test_lead, "create_test_plan", %{}}
]}
```

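The sketch below shows one plausible way to run the branches of a `:parallel` step concurrently, using `Task.async_stream/3` from the Elixir standard library. `ParallelSketch` and its `call_agent/1` stand-in are hypothetical; the real engine may schedule branches differently. The default concurrency of 5 is chosen to match `WORKFLOW_PARALLEL_LIMIT` in the environment variables listed in this document.

```elixir
defmodule ParallelSketch do
  # Stand-in for a real agent call; it only echoes what was requested.
  defp call_agent({:request, role, tool, _params}), do: {role, tool, :done}

  # Run every branch concurrently and return results in input order.
  # The function returns only after all branches have finished, which
  # acts as an implicit barrier before the next step.
  def run({:parallel, steps}, max_concurrency \\ 5) do
    steps
    |> Task.async_stream(&call_agent/1, max_concurrency: max_concurrency, timeout: 5_000)
    |> Enum.map(fn {:ok, result} -> result end)
  end
end

{:parallel, [
  {:request, :senior_developer, "implement_backend", %{}},
  {:request, :uiux_engineer, "design_ui", %{}}
]}
|> ParallelSketch.run()
|> IO.inspect()
# => [{:senior_developer, "implement_backend", :done}, {:uiux_engineer, "design_ui", :done}]
```
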
### 4. Conditional Step

Branch based on condition:

```elixir
{:conditional,
  fn context -> context.budget > 1_000_000 end,
  {:decision, %{mode: :human, reason: "High budget"}},
  {:decision, %{mode: :autonomous}}}
```

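Read as a tuple, a conditional step holds a predicate, the step to run when it is truthy, and the step to run otherwise. The sketch below evaluates that choice on plain data; `ConditionalSketch` is a hypothetical module, not the engine's implementation.

```elixir
defmodule ConditionalSketch do
  # Pick the branch to run: the first step when the predicate returns a
  # truthy value for the context, otherwise the second.
  def choose({:conditional, predicate, if_true, if_false}, context) do
    if predicate.(context), do: if_true, else: if_false
  end
end

step =
  {:conditional,
   fn context -> context.budget > 1_000_000 end,
   {:decision, %{mode: :human, reason: "High budget"}},
   {:decision, %{mode: :autonomous}}}

IO.inspect(ConditionalSketch.choose(step, %{budget: 2_500_000}))
# => {:decision, %{mode: :human, reason: "High budget"}}
IO.inspect(ConditionalSketch.choose(step, %{budget: 50_000}))
# => {:decision, %{mode: :autonomous}}
```

Note that `context.budget` raises a `KeyError` when the key is absent from the context map, so a predicate that may run before the key is set can use `Map.get(context, :budget, 0)` instead.
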
### 5. Pause Step

Wait for human approval:

```elixir
{:pause, "Awaiting executive approval before production deployment"}
```

### 6. Loop Step

Repeat steps while a condition holds:

```elixir
{:loop,
  # Comparing against [] avoids traversing the whole list, unlike length/1
  fn context -> context.items != [] end,
  [
    {:request, :agent, "process_item", %{item: :from_context}}
  ]}
```

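A loop like this terminates only if the body changes the context so that the condition eventually fails. The sketch below models that with a body that consumes one item per pass, plus an iteration cap as a safety net. `LoopSketch` and the `max_iterations` parameter are assumptions made for this example; the document does not say whether the engine enforces such a cap.

```elixir
defmodule LoopSketch do
  # Run `body` while `condition` holds, threading the context through each
  # pass. `max_iterations` is a safety cap added for this sketch.
  def run(condition, body, context, max_iterations \\ 1_000) do
    cond do
      !condition.(context) -> {:ok, context}
      max_iterations == 0 -> {:error, :max_iterations_reached}
      true -> run(condition, body, body.(context), max_iterations - 1)
    end
  end
end

# The body consumes one item per pass, so the condition eventually fails.
process_item = fn %{items: [item | rest]} = context ->
  %{context | items: rest, processed: [item | context.processed]}
end

has_items? = fn context -> context.items != [] end

IO.inspect(LoopSketch.run(has_items?, process_item, %{items: [1, 2, 3], processed: []}))
# => {:ok, %{items: [], processed: [3, 2, 1]}}
```
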
## Example Workflows

### Feature Development Workflow

```elixir
# workflows/examples/feature_development.exs

# Alias needed so the file can be evaluated on its own
alias EchoShared.Workflow.Definition

Definition.new(
  "feature_development",
  "Complete feature development from requirements to deployment",
  [:product_manager, :senior_architect, :cto, :senior_developer,
   :uiux_engineer, :test_lead, :ceo],
  [
    # Step 1: Product Manager defines feature
    {:request, :product_manager, "define_feature", %{
      name: :from_input,
      priority: :from_input
    }},

    # Step 2: Architect designs system
    {:request, :senior_architect, "design_system", %{
      requirements: :from_previous_step
    }},

    # Step 3: CTO reviews and approves architecture
    {:decision, %{
      type: "architecture_approval",
      mode: :autonomous,
      initiator: :cto,
      context: :from_previous_step
    }},

    # Step 4: Parallel implementation
    {:parallel, [
      {:request, :senior_developer, "implement_backend", %{
        design: :from_step_2
      }},
      {:request, :uiux_engineer, "design_ui", %{
        requirements: :from_step_1
      }}
    ]},

    # Step 5: Test Lead creates test plan
    {:request, :test_lead, "create_test_plan", %{
      implementation: :from_previous_step
    }},

    # Step 6: Budget approval
    {:decision, %{
      type: "budget_approval",
      mode: :autonomous,
      initiator: :ceo,
      context: %{estimated_cost: :from_context}
    }},

    # Step 7: Human approval for production
    {:pause, "Awaiting human approval for production deployment"}
  ]
)
```

### Hiring Process Workflow

```elixir
# workflows/examples/hiring_process.exs

# Alias needed so the file can be evaluated on its own
alias EchoShared.Workflow.Definition

Definition.new(
  "hiring_process",
  "Complete hiring workflow from job posting to offer",
  [:chro, :cto, :ceo],
  [
    # Step 1: CHRO posts job
    {:request, :chro, "post_job", %{
      position: :from_input,
      department: :from_input
    }},

    # Step 2: CTO reviews candidates for engineering positions;
    # the CHRO reviews candidates for all other departments
    {:conditional,
      fn ctx -> ctx.department == "engineering" end,
      {:request, :cto, "review_candidates", %{
        candidates: :from_context
      }},
      {:request, :chro, "review_candidates", %{}}
    },

    # Step 3: Interview scheduling
    {:request, :chro, "schedule_interviews", %{
      selected_candidates: :from_previous_step
    }},

    # Step 4: Offer approval
    {:decision, %{
      type: "offer_approval",
      mode: :collaborative,
      participants: [:chro, :cto, :ceo],
      context: %{
        candidate: :from_context,
        salary: :from_context
      }
    }},

    # Step 5: Send offer
    {:request, :chro, "send_offer", %{
      candidate: :from_context,
      approved_terms: :from_previous_step
    }}
  ]
)
```

### Incident Response Workflow

```elixir
# workflows/examples/incident_response.exs

# Alias needed so the file can be evaluated on its own
alias EchoShared.Workflow.Definition

Definition.new(
  "incident_response",
  "Handle production incidents",
  [:operations_head, :cto, :senior_developer, :ceo],
  [
    # Step 1: Operations Head assesses severity
    {:request, :operations_head, "assess_incident", %{
      incident_id: :from_input,
      description: :from_input
    }},

    # Step 2: Notify leadership if critical
    {:conditional,
      fn ctx -> ctx.severity == "critical" end,
      {:request, :operations_head, "notify_leadership", %{
        incident: :from_previous_step
      }},
      {:noop}
    },

    # Step 3: CTO approves mitigation plan
    {:decision, %{
      type: "mitigation_approval",
      mode: :autonomous,
      initiator: :cto,
      context: :from_step_1
    }},

    # Step 4: Implement fix
    {:request, :senior_developer, "deploy_hotfix", %{
      mitigation_plan: :from_previous_step
    }},

    # Step 5: Operations verifies resolution
    {:request, :operations_head, "verify_resolution", %{
      incident_id: :from_input
    }},

    # Step 6: CEO informed if customer-impacting
    {:conditional,
      fn ctx -> ctx.customer_impact == true end,
      {:request, :ceo, "customer_communication", %{
        incident_summary: :from_context
      }},
      {:noop}
    }
  ]
)
```

## Executing Workflows

### Start Workflow

```elixir
alias EchoShared.Workflow.Engine

# Load the workflow definition. Code.eval_file/1 returns {result, bindings},
# where result is the value of the last expression in the file.
# This assumes Definition.new/4 returns the definition itself; unwrap the
# result first if it returns {:ok, definition}.
{definition, _bindings} = Code.eval_file("workflows/examples/feature_development.exs")

# Start execution
{:ok, execution_id} = Engine.start_workflow(definition, %{
  triggered_by: "ceo",
  input: %{
    name: "User Authentication",
    priority: "high"
  }
})
```

### Monitor Workflow

```elixir
# Check status
{:ok, status} = Engine.get_status(execution_id)
# => %{
#   workflow_name: "feature_development",
#   status: "running",
#   current_step: 3,
#   total_steps: 7,
#   started_at: ~U[...],
#   completed_steps: [
#     %{step: 1, agent: :product_manager, result: %{...}},
#     %{step: 2, agent: :senior_architect, result: %{...}}
#   ]
# }

# Get execution history
{:ok, history} = Engine.get_history(execution_id)
```

### Resume Paused Workflow

```elixir
# Resume after human approval
Engine.resume(execution_id, %{
  human_approval: true,
  approved_by: "john@company.com",
  notes: "Approved for production deployment"
})
```

### Cancel Workflow

```elixir
Engine.cancel(execution_id, "Requirements changed, no longer needed")
```

## Best Practices

### 1. Clear Step Names

Use descriptive tool names so that each step's purpose is clear from the workflow definition alone:

```elixir
# Good
{:request, :product_manager, "define_feature_requirements", %{...}}

# Bad
{:request, :product_manager, "do_thing", %{...}}
```

### 2. Handle Errors

Add error handling steps:

```elixir
[
  {:request, :agent, "risky_operation", %{}},
  {:conditional,
    fn ctx -> ctx.error != nil end,
    {:request, :agent, "handle_error", %{error: :from_context}},
    {:request, :agent, "continue_normal_flow", %{}}
  }
]
```

### 3. Context Management

Pass data between steps explicitly:

```elixir
[
  {:request, :pm, "define_feature", %{name: "Auth"}},
  # Store result in context
  {:assign, :feature_spec, :from_previous_step},
  # Use stored context later
  {:request, :architect, "design", %{spec: :feature_spec}}
]
```

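To make the data flow concrete, the sketch below threads a context map and the previous result through a list of `:request` and `:assign` steps. It is a guess at the semantics, not the engine's code: `AssignSketch` is hypothetical, and it assumes a param value that names a context key is replaced by the stored value.

```elixir
defmodule AssignSketch do
  # Fold over the steps, carrying {context, previous_result}.
  # `run_request` is a caller-supplied function that performs one request.
  def run(steps, run_request) do
    Enum.reduce(steps, {%{}, nil}, fn
      # Store the previous step's result under `key`.
      {:assign, key, :from_previous_step}, {context, previous} ->
        {Map.put(context, key, previous), previous}

      # Swap any param value that names a context key for the stored value.
      {:request, _role, _tool, params} = step, {context, _previous} ->
        resolved = Map.new(params, fn {k, v} -> {k, Map.get(context, v, v)} end)
        {context, run_request.(put_elem(step, 3, resolved))}
    end)
  end
end

steps = [
  {:request, :pm, "define_feature", %{name: "Auth"}},
  {:assign, :feature_spec, :from_previous_step},
  {:request, :architect, "design", %{spec: :feature_spec}}
]

# Fake agent call: echoes the tool name and the params it received.
echo = fn {:request, _role, tool, params} -> {tool, params} end

{context, last_result} = AssignSketch.run(steps, echo)
IO.inspect(context)
IO.inspect(last_result)
```

The second request receives the stored `define_feature` result under `:spec` rather than the bare atom `:feature_spec`.
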
### 4. Timeout Handling

Set timeouts for long-running steps:

```elixir
{:request, :agent, "long_operation", %{
  timeout: 300_000,  # 5 minutes
  on_timeout: :retry  # or :fail, :continue
}}
```

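The three `on_timeout` policies can be pictured with the standard `Task.yield/2` and `Task.shutdown/1` idiom. The sketch below is illustrative only: `TimeoutSketch`, the `retries` argument, and the `{:ok, :skipped}` return value for `:continue` are assumptions, not the engine's behavior.

```elixir
defmodule TimeoutSketch do
  # Run `fun` in a task and give up after `timeout` milliseconds.
  # On timeout: :retry runs it again while retries remain, :continue
  # carries on without a result, and anything else returns an error.
  def run(fun, timeout, on_timeout, retries \\ 1) do
    task = Task.async(fun)

    case Task.yield(task, timeout) || Task.shutdown(task) do
      {:ok, result} -> {:ok, result}
      {:exit, reason} -> {:error, reason}
      nil -> handle_timeout(fun, timeout, on_timeout, retries)
    end
  end

  defp handle_timeout(fun, timeout, :retry, retries) when retries > 0,
    do: run(fun, timeout, :retry, retries - 1)

  defp handle_timeout(_fun, _timeout, :continue, _retries), do: {:ok, :skipped}
  defp handle_timeout(_fun, _timeout, _policy, _retries), do: {:error, :timeout}
end

slow = fn ->
  Process.sleep(200)
  :finished
end

IO.inspect(TimeoutSketch.run(fn -> :fast end, 100, :fail))
# => {:ok, :fast}
IO.inspect(TimeoutSketch.run(slow, 50, :continue))
# => {:ok, :skipped}
IO.inspect(TimeoutSketch.run(slow, 50, :retry))
# => {:error, :timeout}
```

The last call times out, retries once, times out again, and then fails because no retries remain.
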
### 5. Idempotency

Ensure steps can be safely retried:

```elixir
def execute_tool("create_record", %{"id" => id} = args) do
  case Repo.get(Record, id) do
    nil -> Repo.insert(%Record{id: id, ...})  # Create if not exists
    record -> {:ok, record}  # Return existing
  end
end
```

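The same create-if-absent idea can be tried without a database. The sketch below uses a plain map as the store; `IdempotentSketch` and its return tuples are hypothetical. With a real database and concurrent callers, the check and the insert are two separate operations, so a unique constraint on the id is usually still needed.

```elixir
defmodule IdempotentSketch do
  # Insert `attrs` under `id` only if the id is not present yet; a repeated
  # call returns the stored record unchanged, so a retry cannot duplicate it.
  def create_record(store, id, attrs) do
    case Map.fetch(store, id) do
      {:ok, existing} -> {:existing, existing, store}
      :error -> {:created, attrs, Map.put(store, id, attrs)}
    end
  end
end

{:created, _record, store} = IdempotentSketch.create_record(%{}, "rec-1", %{name: "first"})

# A retry with different attributes leaves the store untouched.
{:existing, record, ^store} = IdempotentSketch.create_record(store, "rec-1", %{name: "retry"})
IO.inspect(record)
# => %{name: "first"}
```
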
## Testing Workflows

```elixir
defmodule WorkflowTest do
  use ExUnit.Case
  alias EchoShared.Workflow.Engine

  # load_workflow/1 and assert_workflow_completes/2 are test helpers that are
  # not defined in this file.

  test "feature development workflow completes successfully" do
    {:ok, workflow} = load_workflow("feature_development")

    {:ok, execution_id} = Engine.start_workflow(workflow, %{
      input: %{name: "Test Feature", priority: "high"}
    })

    # Wait for completion. The example definition ends with a :pause step, so
    # this assumes the helper or test setup approves the pause.
    assert_workflow_completes(execution_id, timeout: 30_000)

    # Verify results
    {:ok, status} = Engine.get_status(execution_id)
    assert status.status == "completed"
    assert length(status.completed_steps) == 7
  end

  test "workflow pauses for human approval" do
    {:ok, workflow} = load_workflow("feature_development")
    {:ok, execution_id} = Engine.start_workflow(workflow, %{...})

    # Should pause at step 7
    :timer.sleep(5000)
    {:ok, status} = Engine.get_status(execution_id)
    assert status.status == "paused"
    # Match on the prefix; an exact comparison against a truncated string would fail
    assert status.pause_reason =~ "Awaiting human approval"

    # Resume
    Engine.resume(execution_id, %{human_approval: true})

    # Should complete
    assert_workflow_completes(execution_id)
  end
end
```

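A fixed `:timer.sleep(5000)` makes a test both slow and timing-sensitive. Polling the status until it changes is a common alternative, sketched below. `PollSketch` is hypothetical, and the status function is passed in so the example runs without the engine; the set of terminal statuses is limited to the two shown in this document.

```elixir
defmodule PollSketch do
  # Call `status_fun` every `interval` ms until it reports a terminal
  # status or `timeout` ms have elapsed.
  def wait_until_done(status_fun, timeout \\ 30_000, interval \\ 100) do
    deadline = System.monotonic_time(:millisecond) + timeout
    poll(status_fun, deadline, interval)
  end

  defp poll(status_fun, deadline, interval) do
    case status_fun.() do
      {:ok, %{status: status}} when status in ["completed", "paused"] ->
        {:ok, status}

      _still_running ->
        if System.monotonic_time(:millisecond) >= deadline do
          {:error, :timeout}
        else
          Process.sleep(interval)
          poll(status_fun, deadline, interval)
        end
    end
  end
end

# Fake status source: reports "running" twice, then "paused".
{:ok, counter} = Agent.start_link(fn -> 0 end)

fake_status = fn ->
  calls = Agent.get_and_update(counter, fn n -> {n + 1, n + 1} end)
  {:ok, %{status: if(calls < 3, do: "running", else: "paused")}}
end

IO.inspect(PollSketch.wait_until_done(fake_status, 1_000, 10))
# => {:ok, "paused"}
```

In a real test the status function would presumably wrap the engine call, for example `fn -> Engine.get_status(execution_id) end`.
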
## Common Issues

### Workflow stuck in "running" state

**Cause:** Agent not responding to tool request

**Debug:**
```elixir
{:ok, status} = Engine.get_status(execution_id)
IO.inspect(status.current_step)
# Check agent logs for the stuck step
```

### Steps executing out of order

**Cause:** Parallel step without proper synchronization

**Solution:** Use explicit synchronization:
```elixir
{:parallel, [step1, step2, step3]},
{:wait_all},  # Wait for all parallel steps to complete
{:request, :next_agent, "next_step", %{}}
```

### Context not passing between steps

**Cause:** Incorrect context reference

**Solution:**
```elixir
# Explicit context passing
{:request, :agent1, "step1", %{}},
{:assign, :result1, :from_previous_step},
{:request, :agent2, "step2", %{input: :result1}}
```

## Environment Variables

```bash
# Workflow execution
WORKFLOW_TIMEOUT=3600000       # Default workflow timeout (1 hour)
WORKFLOW_STEP_TIMEOUT=300000   # Default step timeout (5 minutes)
WORKFLOW_RETRY_ATTEMPTS=3      # Retry failed steps
WORKFLOW_PARALLEL_LIMIT=5      # Max concurrent parallel steps

# Storage
WORKFLOW_EXECUTION_RETENTION=30  # Days to keep execution history
```

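For reference, this is roughly how such variables could be read from Elixir with a fallback to the listed defaults. `WorkflowEnvSketch` is hypothetical; how the engine actually reads its configuration is not documented here.

```elixir
defmodule WorkflowEnvSketch do
  # Variable names and defaults mirror the list in this section.
  @defaults %{
    "WORKFLOW_TIMEOUT" => 3_600_000,
    "WORKFLOW_STEP_TIMEOUT" => 300_000,
    "WORKFLOW_RETRY_ATTEMPTS" => 3,
    "WORKFLOW_PARALLEL_LIMIT" => 5,
    "WORKFLOW_EXECUTION_RETENTION" => 30
  }

  # Return the variable as a non-negative integer, falling back to the
  # default when it is unset or not a valid number.
  def get(name) do
    default = Map.fetch!(@defaults, name)

    case System.get_env(name) do
      nil -> default
      value -> parse(value, default)
    end
  end

  defp parse(value, default) do
    case Integer.parse(String.trim(value)) do
      {int, ""} when int >= 0 -> int
      _invalid -> default
    end
  end
end

System.put_env("WORKFLOW_STEP_TIMEOUT", "60000")
IO.inspect(WorkflowEnvSketch.get("WORKFLOW_STEP_TIMEOUT"))
# => 60000
```
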
## Using LocalCode for Workflow Development

**LocalCode** (`scripts/llm/`) provides quick assistance for workflow development and debugging. See `../CLAUDE.md` Rule 8 and `../scripts/claude.md` for complete documentation.

### Quick Start

```bash
source ./scripts/llm/localcode_quick.sh
lc_start

# Explore workflow patterns
lc_query "Show me the feature development workflow structure"
lc_query "How do I implement a conditional step?"
lc_query "Explain workflow context passing between steps"

lc_end
```

### Common Workflow Queries

**Understanding Workflows:**
```bash
lc_query "What workflow step types are available?"
lc_query "How do parallel steps work?"
lc_query "Explain workflow pause and resume"
```

**Building Workflows:**
```bash
lc_query "How do I create a new workflow for incident response?"
lc_query "What's the pattern for human-in-the-loop workflows?"
lc_query "Show me error handling in workflows"
```

**Debugging:**
```bash
lc_query "Workflow stuck in running state, how to debug?"
lc_query "Why might context not pass between steps?"
lc_query "How to debug parallel step synchronization issues?"
```

**Review:**
```bash
# Dual perspective
lc_query "Review workflows/examples/feature_development.exs for issues"
# Then ask Claude Code for comprehensive analysis
```

### When to Use LocalCode vs Claude Code

**Use LocalCode for:**
- Understanding workflow DSL syntax
- Quick pattern lookup
- Debugging hints for stuck workflows
- Exploring example workflows

**Use Claude Code for:**
- Creating new complex workflows
- Refactoring workflow definitions
- Writing workflow tests
- Multi-file workflow changes

**Use Both for:**
- Reviewing workflow definitions
- Design decisions for workflow patterns
- Complex debugging scenarios

Typical LocalCode response time: 7-30 seconds.

## Related Documentation

- **Parent:** [../CLAUDE.md](../CLAUDE.md) - Project overview
- **Shared Library:** [../shared/claude.md](../shared/claude.md) - Workflow engine API
- **Agents:** [../agents/claude.md](../agents/claude.md) - Agent tools used in workflows

---

**Remember:** Workflows coordinate agents but don't replace good agent design. Keep workflows simple and delegate complexity to agents.