<!doctype html>
<html lang="en">
  <head>
    <meta charset="utf-8">
    <meta name="viewport" content="width=device-width, initial-scale=1">
    <title>Cradicle Explorer</title>
    <link href="/css/bootstrap/bootstrap.min.css" rel="stylesheet">
    <style>
      .form-control-dark::placeholder {
          color: #aaa;
          opacity: 1;
      }
    </style>
    <link rel="stylesheet" href="/assets/fontawesome/css/all.min.css">
    <link rel="icon" type="image/png" href="/favicon.png">


                <link href="/css/dashboard.css" rel="stylesheet">
                </head>
                <body>
                <header class="navbar navbar-dark sticky-top bg-dark flex-md-nowrap p-0 shadow">
                  <a class="navbar-brand col-md-3 col-lg-2 me-0 px-3 fs-6" href="/">Cradicle Explorer</a>
                  <button class="navbar-toggler position-absolute d-md-none collapsed" type="button" data-bs-toggle="collapse" data-bs-target="#sidebarMenu" aria-controls="sidebarMenu" aria-expanded="false" aria-label="Toggle navigation">
                    <span class="navbar-toggler-icon"></span>
                  </button>
                  <form method="get" action="/cgi-bin/main" style="width:100%;"><input class="form-control form-control-dark w-100 rounded-0 border-0" type="text" name="q" placeholder="Search repos" aria-label="Search"></form>
                  <div class="navbar-nav flex-row">
                    <div class="nav-item text-nowrap">
                      <a class="nav-link px-3 active" href="/cgi-bin/repo?id=zocbfki6Y3WyerhbCJXb5nCbkWDM">hermes-agent</a>
                    </div>
                  </div>
                </header>
                <div class="container-fluid">
                  <div class="row">
                    <nav id="sidebarMenu" class="col-md-3 col-lg-2 d-md-block bg-dark sidebar collapse">
                      <div class="position-sticky pt-3 sidebar-sticky">
                        <ul class="nav flex-column">
                          <li class="nav-item">
                            <a class="nav-link" href="/cgi-bin/repo?id=zocbfki6Y3WyerhbCJXb5nCbkWDM">
                              <i class="align-text-bottom fa-solid fa-info"></i>
                              Info
                            </a>
                          </li>
                          <li class="nav-item">
                            <a class="nav-link" href="/cgi-bin/repo?id=zocbfki6Y3WyerhbCJXb5nCbkWDM&issue=list">
                              <i class="align-text-bottom fa-solid fa-layer-group"></i>
                              Issues
                            </a>
                          </li>
                          <li class="nav-item">
                            <a class="nav-link" href="/cgi-bin/repo?id=zocbfki6Y3WyerhbCJXb5nCbkWDM&patch=list">
                              <i class="align-text-bottom fa-solid fa-vest-patches"></i>
                              Patches
                            </a>
                          </li>
                          <li class="nav-item">
                            <a class="nav-link" href="/cgi-bin/repo?id=zocbfki6Y3WyerhbCJXb5nCbkWDM&wallet=list">
                              <i class="align-text-bottom fa-solid fa-wallet"></i>
                              Wallets
                            </a>
                          </li>
                          <li class="nav-item">
                            <a class="nav-link active" href="/cgi-bin/repo?id=zocbfki6Y3WyerhbCJXb5nCbkWDM&source=.">
                              <i class="align-text-bottom fa-solid fa-code"></i>
                              Source
                            </a>
                          </li>
                        <h6 class="sidebar-heading d-flex justify-content-between align-items-center px-3 mt-4 mb-1 text-muted text-uppercase">
                          <span></span>
                        </h6>
                        <ul class="nav flex-column mb-2">
                        
    <h6 class="sidebar-heading d-flex justify-content-between align-items-center px-3 mt-1 mb-1 text-muted text-uppercase">
      <span>Source</span>
    </h6>
    <li><a class="nav-link py-0" style="padding-left:16px;" href="/cgi-bin/repo?id=zocbfki6Y3WyerhbCJXb5nCbkWDM&source=.github"><i class="fa-solid fa-folder" style="color:#f0c040;"></i> .github</a></li><li><a class="nav-link py-0" style="padding-left:16px;" href="/cgi-bin/repo?id=zocbfki6Y3WyerhbCJXb5nCbkWDM&source=.plans"><i class="fa-solid fa-folder" style="color:#f0c040;"></i> .plans</a></li><li><a class="nav-link py-0" style="padding-left:16px;" href="/cgi-bin/repo?id=zocbfki6Y3WyerhbCJXb5nCbkWDM&source=acp_adapter"><i class="fa-solid fa-folder" style="color:#f0c040;"></i> acp_adapter</a></li><li><a class="nav-link py-0" style="padding-left:16px;" href="/cgi-bin/repo?id=zocbfki6Y3WyerhbCJXb5nCbkWDM&source=acp_registry"><i class="fa-solid fa-folder" style="color:#f0c040;"></i> acp_registry</a></li><li><a class="nav-link py-0" style="padding-left:16px;" href="/cgi-bin/repo?id=zocbfki6Y3WyerhbCJXb5nCbkWDM&source=agent"><i class="fa-solid fa-folder" style="color:#f0c040;"></i> agent</a></li><li><a class="nav-link py-0" style="padding-left:16px;" href="/cgi-bin/repo?id=zocbfki6Y3WyerhbCJXb5nCbkWDM&source=assets"><i class="fa-solid fa-folder" style="color:#f0c040;"></i> assets</a></li><li><a class="nav-link py-0" style="padding-left:16px;" href="/cgi-bin/repo?id=zocbfki6Y3WyerhbCJXb5nCbkWDM&source=cron"><i class="fa-solid fa-folder" style="color:#f0c040;"></i> cron</a></li><li><a class="nav-link py-0" style="padding-left:16px;" href="/cgi-bin/repo?id=zocbfki6Y3WyerhbCJXb5nCbkWDM&source=datagen-config-examples"><i class="fa-solid fa-folder" style="color:#f0c040;"></i> datagen-config-examples</a></li><li><a class="nav-link py-0" style="padding-left:16px;" href="/cgi-bin/repo?id=zocbfki6Y3WyerhbCJXb5nCbkWDM&source=docker"><i class="fa-solid fa-folder" style="color:#f0c040;"></i> docker</a></li><li><a class="nav-link py-0" style="padding-left:16px;" href="/cgi-bin/repo?id=zocbfki6Y3WyerhbCJXb5nCbkWDM&source=docs"><i class="fa-solid fa-folder" style="color:#f0c040;"></i> docs</a></li><li><a class="nav-link py-0" style="padding-left:16px;" href="/cgi-bin/repo?id=zocbfki6Y3WyerhbCJXb5nCbkWDM&source=environments"><i class="fa-solid fa-folder-open" style="color:#f0c040;"></i> environments</a></li><li><a class="nav-link py-0" style="padding-left:32px;" href="/cgi-bin/repo?id=zocbfki6Y3WyerhbCJXb5nCbkWDM&source=environments%2Fbenchmarks"><i class="fa-solid fa-folder" style="color:#f0c040;"></i> benchmarks</a></li><li><a class="nav-link py-0" style="padding-left:32px;" href="/cgi-bin/repo?id=zocbfki6Y3WyerhbCJXb5nCbkWDM&source=environments%2Fhermes_swe_env"><i class="fa-solid fa-folder" style="color:#f0c040;"></i> hermes_swe_env</a></li><li><a class="nav-link py-0" style="padding-left:32px;" href="/cgi-bin/repo?id=zocbfki6Y3WyerhbCJXb5nCbkWDM&source=environments%2Fterminal_test_env"><i class="fa-solid fa-folder" style="color:#f0c040;"></i> terminal_test_env</a></li><li><a class="nav-link py-0" style="padding-left:32px;" href="/cgi-bin/repo?id=zocbfki6Y3WyerhbCJXb5nCbkWDM&source=environments%2Ftool_call_parsers"><i class="fa-solid fa-folder" style="color:#f0c040;"></i> tool_call_parsers</a></li><li><a class="nav-link py-0 active" style="padding-left:32px;" href="/cgi-bin/repo?id=zocbfki6Y3WyerhbCJXb5nCbkWDM&file=environments%2FREADME.md"><i class="fa-solid fa-file" style="color:#888;"></i> README.md</a></li><li><a class="nav-link py-0" style="padding-left:32px;" href="/cgi-bin/repo?id=zocbfki6Y3WyerhbCJXb5nCbkWDM&file=environments%2F__init__.py"><i class="fa-solid fa-file" style="color:#888;"></i> __init__.py</a></li><li><a class="nav-link py-0" style="padding-left:32px;" href="/cgi-bin/repo?id=zocbfki6Y3WyerhbCJXb5nCbkWDM&file=environments%2Fagent_loop.py"><i class="fa-solid fa-file" style="color:#888;"></i> agent_loop.py</a></li><li><a class="nav-link py-0" style="padding-left:32px;" href="/cgi-bin/repo?id=zocbfki6Y3WyerhbCJXb5nCbkWDM&file=environments%2Fagentic_opd_env.py"><i class="fa-solid fa-file" style="color:#888;"></i> agentic_opd_env.py</a></li><li><a class="nav-link py-0" style="padding-left:32px;" href="/cgi-bin/repo?id=zocbfki6Y3WyerhbCJXb5nCbkWDM&file=environments%2Fhermes_base_env.py"><i class="fa-solid fa-file" style="color:#888;"></i> hermes_base_env.py</a></li><li><a class="nav-link py-0" style="padding-left:32px;" href="/cgi-bin/repo?id=zocbfki6Y3WyerhbCJXb5nCbkWDM&file=environments%2Fpatches.py"><i class="fa-solid fa-file" style="color:#888;"></i> patches.py</a></li><li><a class="nav-link py-0" style="padding-left:32px;" href="/cgi-bin/repo?id=zocbfki6Y3WyerhbCJXb5nCbkWDM&file=environments%2Ftool_context.py"><i class="fa-solid fa-file" style="color:#888;"></i> tool_context.py</a></li><li><a class="nav-link py-0" style="padding-left:32px;" href="/cgi-bin/repo?id=zocbfki6Y3WyerhbCJXb5nCbkWDM&file=environments%2Fweb_research_env.py"><i class="fa-solid fa-file" style="color:#888;"></i> web_research_env.py</a></li><li><a class="nav-link py-0" style="padding-left:16px;" href="/cgi-bin/repo?id=zocbfki6Y3WyerhbCJXb5nCbkWDM&source=gateway"><i class="fa-solid fa-folder" style="color:#f0c040;"></i> gateway</a></li><li><a class="nav-link py-0" style="padding-left:16px;" href="/cgi-bin/repo?id=zocbfki6Y3WyerhbCJXb5nCbkWDM&source=hermes_cli"><i class="fa-solid fa-folder" style="color:#f0c040;"></i> hermes_cli</a></li><li><a class="nav-link py-0" style="padding-left:16px;" href="/cgi-bin/repo?id=zocbfki6Y3WyerhbCJXb5nCbkWDM&source=nix"><i class="fa-solid fa-folder" style="color:#f0c040;"></i> nix</a></li><li><a class="nav-link py-0" style="padding-left:16px;" href="/cgi-bin/repo?id=zocbfki6Y3WyerhbCJXb5nCbkWDM&source=optional-skills"><i class="fa-solid fa-folder" style="color:#f0c040;"></i> optional-skills</a></li><li><a class="nav-link py-0" style="padding-left:16px;" href="/cgi-bin/repo?id=zocbfki6Y3WyerhbCJXb5nCbkWDM&source=packaging"><i class="fa-solid fa-folder" style="color:#f0c040;"></i> packaging</a></li><li><a class="nav-link py-0" style="padding-left:16px;" href="/cgi-bin/repo?id=zocbfki6Y3WyerhbCJXb5nCbkWDM&source=plans"><i class="fa-solid fa-folder" style="color:#f0c040;"></i> plans</a></li><li><a class="nav-link py-0" style="padding-left:16px;" href="/cgi-bin/repo?id=zocbfki6Y3WyerhbCJXb5nCbkWDM&source=plugins"><i class="fa-solid fa-folder" style="color:#f0c040;"></i> plugins</a></li><li><a class="nav-link py-0" style="padding-left:16px;" href="/cgi-bin/repo?id=zocbfki6Y3WyerhbCJXb5nCbkWDM&source=scripts"><i class="fa-solid fa-folder" style="color:#f0c040;"></i> scripts</a></li><li><a class="nav-link py-0" style="padding-left:16px;" href="/cgi-bin/repo?id=zocbfki6Y3WyerhbCJXb5nCbkWDM&source=skills"><i class="fa-solid fa-folder" style="color:#f0c040;"></i> skills</a></li><li><a class="nav-link py-0" style="padding-left:16px;" href="/cgi-bin/repo?id=zocbfki6Y3WyerhbCJXb5nCbkWDM&source=tests"><i class="fa-solid fa-folder" style="color:#f0c040;"></i> tests</a></li><li><a class="nav-link py-0" style="padding-left:16px;" href="/cgi-bin/repo?id=zocbfki6Y3WyerhbCJXb5nCbkWDM&source=tools"><i class="fa-solid fa-folder" style="color:#f0c040;"></i> tools</a></li><li><a class="nav-link py-0" style="padding-left:16px;" href="/cgi-bin/repo?id=zocbfki6Y3WyerhbCJXb5nCbkWDM&source=tui_gateway"><i class="fa-solid fa-folder" style="color:#f0c040;"></i> tui_gateway</a></li><li><a class="nav-link py-0" style="padding-left:16px;" href="/cgi-bin/repo?id=zocbfki6Y3WyerhbCJXb5nCbkWDM&source=ui-tui"><i class="fa-solid fa-folder" style="color:#f0c040;"></i> ui-tui</a></li><li><a class="nav-link py-0" style="padding-left:16px;" href="/cgi-bin/repo?id=zocbfki6Y3WyerhbCJXb5nCbkWDM&source=web"><i class="fa-solid fa-folder" style="color:#f0c040;"></i> web</a></li><li><a class="nav-link py-0" style="padding-left:16px;" href="/cgi-bin/repo?id=zocbfki6Y3WyerhbCJXb5nCbkWDM&source=website"><i class="fa-solid fa-folder" style="color:#f0c040;"></i> website</a></li><li><a class="nav-link py-0" style="padding-left:16px;" href="/cgi-bin/repo?id=zocbfki6Y3WyerhbCJXb5nCbkWDM&file=.dockerignore"><i class="fa-solid fa-file" style="color:#888;"></i> .dockerignore</a></li><li><a class="nav-link py-0" style="padding-left:16px;" href="/cgi-bin/repo?id=zocbfki6Y3WyerhbCJXb5nCbkWDM&file=.env.example"><i class="fa-solid fa-file" style="color:#888;"></i> .env.example</a></li><li><a class="nav-link py-0" style="padding-left:16px;" href="/cgi-bin/repo?id=zocbfki6Y3WyerhbCJXb5nCbkWDM&file=.envrc"><i class="fa-solid fa-file" style="color:#888;"></i> .envrc</a></li><li><a class="nav-link py-0" style="padding-left:16px;" href="/cgi-bin/repo?id=zocbfki6Y3WyerhbCJXb5nCbkWDM&file=.gitattributes"><i class="fa-solid fa-file" style="color:#888;"></i> .gitattributes</a></li><li><a class="nav-link py-0" style="padding-left:16px;" href="/cgi-bin/repo?id=zocbfki6Y3WyerhbCJXb5nCbkWDM&file=.gitignore"><i class="fa-solid fa-file" style="color:#888;"></i> .gitignore</a></li><li><a class="nav-link py-0" style="padding-left:16px;" href="/cgi-bin/repo?id=zocbfki6Y3WyerhbCJXb5nCbkWDM&file=.gitmodules"><i class="fa-solid fa-file" style="color:#888;"></i> .gitmodules</a></li><li><a class="nav-link py-0" style="padding-left:16px;" href="/cgi-bin/repo?id=zocbfki6Y3WyerhbCJXb5nCbkWDM&file=.mailmap"><i class="fa-solid fa-file" style="color:#888;"></i> .mailmap</a></li><li><a class="nav-link py-0" style="padding-left:16px;" href="/cgi-bin/repo?id=zocbfki6Y3WyerhbCJXb5nCbkWDM&file=AGENTS.md"><i class="fa-solid fa-file" style="color:#888;"></i> AGENTS.md</a></li><li><a class="nav-link py-0" style="padding-left:16px;" href="/cgi-bin/repo?id=zocbfki6Y3WyerhbCJXb5nCbkWDM&file=CONTRIBUTING.md"><i class="fa-solid fa-file" style="color:#888;"></i> CONTRIBUTING.md</a></li><li><a class="nav-link py-0" style="padding-left:16px;" href="/cgi-bin/repo?id=zocbfki6Y3WyerhbCJXb5nCbkWDM&file=Dockerfile"><i class="fa-solid fa-file" style="color:#888;"></i> Dockerfile</a></li><li><a class="nav-link py-0" style="padding-left:16px;" href="/cgi-bin/repo?id=zocbfki6Y3WyerhbCJXb5nCbkWDM&file=LICENSE"><i class="fa-solid fa-file" style="color:#888;"></i> LICENSE</a></li><li><a class="nav-link py-0" style="padding-left:16px;" href="/cgi-bin/repo?id=zocbfki6Y3WyerhbCJXb5nCbkWDM&file=MANIFEST.in"><i class="fa-solid fa-file" style="color:#888;"></i> MANIFEST.in</a></li><li><a class="nav-link py-0" style="padding-left:16px;" href="/cgi-bin/repo?id=zocbfki6Y3WyerhbCJXb5nCbkWDM&file=README.md"><i class="fa-solid fa-file" style="color:#888;"></i> README.md</a></li><li><a class="nav-link py-0" style="padding-left:16px;" href="/cgi-bin/repo?id=zocbfki6Y3WyerhbCJXb5nCbkWDM&file=RELEASE_v0.10.0.md"><i class="fa-solid fa-file" style="color:#888;"></i> RELEASE_v0.10.0.md</a></li><li><a class="nav-link py-0" style="padding-left:16px;" href="/cgi-bin/repo?id=zocbfki6Y3WyerhbCJXb5nCbkWDM&file=RELEASE_v0.11.0.md"><i class="fa-solid fa-file" style="color:#888;"></i> RELEASE_v0.11.0.md</a></li><li><a class="nav-link py-0" style="padding-left:16px;" href="/cgi-bin/repo?id=zocbfki6Y3WyerhbCJXb5nCbkWDM&file=RELEASE_v0.12.0.md"><i class="fa-solid fa-file" style="color:#888;"></i> RELEASE_v0.12.0.md</a></li><li><a class="nav-link py-0" style="padding-left:16px;" href="/cgi-bin/repo?id=zocbfki6Y3WyerhbCJXb5nCbkWDM&file=RELEASE_v0.2.0.md"><i class="fa-solid fa-file" style="color:#888;"></i> RELEASE_v0.2.0.md</a></li><li><a class="nav-link py-0" style="padding-left:16px;" href="/cgi-bin/repo?id=zocbfki6Y3WyerhbCJXb5nCbkWDM&file=RELEASE_v0.3.0.md"><i class="fa-solid fa-file" style="color:#888;"></i> RELEASE_v0.3.0.md</a></li><li><a class="nav-link py-0" style="padding-left:16px;" href="/cgi-bin/repo?id=zocbfki6Y3WyerhbCJXb5nCbkWDM&file=RELEASE_v0.4.0.md"><i class="fa-solid fa-file" style="color:#888;"></i> RELEASE_v0.4.0.md</a></li><li><a class="nav-link py-0" style="padding-left:16px;" href="/cgi-bin/repo?id=zocbfki6Y3WyerhbCJXb5nCbkWDM&file=RELEASE_v0.5.0.md"><i class="fa-solid fa-file" style="color:#888;"></i> RELEASE_v0.5.0.md</a></li><li><a class="nav-link py-0" style="padding-left:16px;" href="/cgi-bin/repo?id=zocbfki6Y3WyerhbCJXb5nCbkWDM&file=RELEASE_v0.6.0.md"><i class="fa-solid fa-file" style="color:#888;"></i> RELEASE_v0.6.0.md</a></li><li><a class="nav-link py-0" style="padding-left:16px;" href="/cgi-bin/repo?id=zocbfki6Y3WyerhbCJXb5nCbkWDM&file=RELEASE_v0.7.0.md"><i class="fa-solid fa-file" style="color:#888;"></i> RELEASE_v0.7.0.md</a></li><li><a class="nav-link py-0" style="padding-left:16px;" href="/cgi-bin/repo?id=zocbfki6Y3WyerhbCJXb5nCbkWDM&file=RELEASE_v0.8.0.md"><i class="fa-solid fa-file" style="color:#888;"></i> RELEASE_v0.8.0.md</a></li><li><a class="nav-link py-0" style="padding-left:16px;" href="/cgi-bin/repo?id=zocbfki6Y3WyerhbCJXb5nCbkWDM&file=RELEASE_v0.9.0.md"><i class="fa-solid fa-file" style="color:#888;"></i> RELEASE_v0.9.0.md</a></li><li><a class="nav-link py-0" style="padding-left:16px;" href="/cgi-bin/repo?id=zocbfki6Y3WyerhbCJXb5nCbkWDM&file=SECURITY.md"><i class="fa-solid fa-file" style="color:#888;"></i> SECURITY.md</a></li><li><a class="nav-link py-0" style="padding-left:16px;" href="/cgi-bin/repo?id=zocbfki6Y3WyerhbCJXb5nCbkWDM&file=batch_runner.py"><i class="fa-solid fa-file" style="color:#888;"></i> batch_runner.py</a></li><li><a class="nav-link py-0" style="padding-left:16px;" href="/cgi-bin/repo?id=zocbfki6Y3WyerhbCJXb5nCbkWDM&file=cli-config.yaml.example"><i class="fa-solid fa-file" style="color:#888;"></i> cli-config.yaml.example</a></li><li><a class="nav-link py-0" style="padding-left:16px;" href="/cgi-bin/repo?id=zocbfki6Y3WyerhbCJXb5nCbkWDM&file=cli.py"><i class="fa-solid fa-file" style="color:#888;"></i> cli.py</a></li><li><a class="nav-link py-0" style="padding-left:16px;" href="/cgi-bin/repo?id=zocbfki6Y3WyerhbCJXb5nCbkWDM&file=constraints-termux.txt"><i class="fa-solid fa-file" style="color:#888;"></i> constraints-termux.txt</a></li><li><a class="nav-link py-0" style="padding-left:16px;" href="/cgi-bin/repo?id=zocbfki6Y3WyerhbCJXb5nCbkWDM&file=docker-compose.yml"><i class="fa-solid fa-file" style="color:#888;"></i> docker-compose.yml</a></li><li><a class="nav-link py-0" style="padding-left:16px;" href="/cgi-bin/repo?id=zocbfki6Y3WyerhbCJXb5nCbkWDM&file=flake.lock"><i class="fa-solid fa-file" style="color:#888;"></i> flake.lock</a></li><li><a class="nav-link py-0" style="padding-left:16px;" href="/cgi-bin/repo?id=zocbfki6Y3WyerhbCJXb5nCbkWDM&file=flake.nix"><i class="fa-solid fa-file" style="color:#888;"></i> flake.nix</a></li><li><a class="nav-link py-0" style="padding-left:16px;" href="/cgi-bin/repo?id=zocbfki6Y3WyerhbCJXb5nCbkWDM&file=hermes"><i class="fa-solid fa-file" style="color:#888;"></i> hermes</a></li><li><a class="nav-link py-0" style="padding-left:16px;" href="/cgi-bin/repo?id=zocbfki6Y3WyerhbCJXb5nCbkWDM&file=hermes-already-has-routines.md"><i class="fa-solid fa-file" style="color:#888;"></i> hermes-already-has-routines.md</a></li><li><a class="nav-link py-0" style="padding-left:16px;" href="/cgi-bin/repo?id=zocbfki6Y3WyerhbCJXb5nCbkWDM&file=hermes_constants.py"><i class="fa-solid fa-file" style="color:#888;"></i> hermes_constants.py</a></li><li><a class="nav-link py-0" style="padding-left:16px;" href="/cgi-bin/repo?id=zocbfki6Y3WyerhbCJXb5nCbkWDM&file=hermes_logging.py"><i class="fa-solid fa-file" style="color:#888;"></i> hermes_logging.py</a></li><li><a class="nav-link py-0" style="padding-left:16px;" href="/cgi-bin/repo?id=zocbfki6Y3WyerhbCJXb5nCbkWDM&file=hermes_state.py"><i class="fa-solid fa-file" style="color:#888;"></i> hermes_state.py</a></li><li><a class="nav-link py-0" style="padding-left:16px;" href="/cgi-bin/repo?id=zocbfki6Y3WyerhbCJXb5nCbkWDM&file=hermes_time.py"><i class="fa-solid fa-file" style="color:#888;"></i> hermes_time.py</a></li><li><a class="nav-link py-0" style="padding-left:16px;" href="/cgi-bin/repo?id=zocbfki6Y3WyerhbCJXb5nCbkWDM&file=mcp_serve.py"><i class="fa-solid fa-file" style="color:#888;"></i> mcp_serve.py</a></li><li><a class="nav-link py-0" style="padding-left:16px;" href="/cgi-bin/repo?id=zocbfki6Y3WyerhbCJXb5nCbkWDM&file=mini_swe_runner.py"><i class="fa-solid fa-file" style="color:#888;"></i> mini_swe_runner.py</a></li><li><a class="nav-link py-0" style="padding-left:16px;" href="/cgi-bin/repo?id=zocbfki6Y3WyerhbCJXb5nCbkWDM&file=model_tools.py"><i class="fa-solid fa-file" style="color:#888;"></i> model_tools.py</a></li><li><a class="nav-link py-0" style="padding-left:16px;" href="/cgi-bin/repo?id=zocbfki6Y3WyerhbCJXb5nCbkWDM&file=package-lock.json"><i class="fa-solid fa-file" style="color:#888;"></i> package-lock.json</a></li><li><a class="nav-link py-0" style="padding-left:16px;" href="/cgi-bin/repo?id=zocbfki6Y3WyerhbCJXb5nCbkWDM&file=package.json"><i class="fa-solid fa-file" style="color:#888;"></i> package.json</a></li><li><a class="nav-link py-0" style="padding-left:16px;" href="/cgi-bin/repo?id=zocbfki6Y3WyerhbCJXb5nCbkWDM&file=pyproject.toml"><i class="fa-solid fa-file" style="color:#888;"></i> pyproject.toml</a></li><li><a class="nav-link py-0" style="padding-left:16px;" href="/cgi-bin/repo?id=zocbfki6Y3WyerhbCJXb5nCbkWDM&file=rl_cli.py"><i class="fa-solid fa-file" style="color:#888;"></i> rl_cli.py</a></li><li><a class="nav-link py-0" style="padding-left:16px;" href="/cgi-bin/repo?id=zocbfki6Y3WyerhbCJXb5nCbkWDM&file=run_agent.py"><i class="fa-solid fa-file" style="color:#888;"></i> run_agent.py</a></li><li><a class="nav-link py-0" style="padding-left:16px;" href="/cgi-bin/repo?id=zocbfki6Y3WyerhbCJXb5nCbkWDM&file=setup-hermes.sh"><i class="fa-solid fa-file" style="color:#888;"></i> setup-hermes.sh</a></li><li><a class="nav-link py-0" style="padding-left:16px;" href="/cgi-bin/repo?id=zocbfki6Y3WyerhbCJXb5nCbkWDM&file=tinker-atropos"><i class="fa-solid fa-file" style="color:#888;"></i> tinker-atropos</a></li><li><a class="nav-link py-0" style="padding-left:16px;" href="/cgi-bin/repo?id=zocbfki6Y3WyerhbCJXb5nCbkWDM&file=toolset_distributions.py"><i class="fa-solid fa-file" style="color:#888;"></i> toolset_distributions.py</a></li><li><a class="nav-link py-0" style="padding-left:16px;" href="/cgi-bin/repo?id=zocbfki6Y3WyerhbCJXb5nCbkWDM&file=toolsets.py"><i class="fa-solid fa-file" style="color:#888;"></i> toolsets.py</a></li><li><a class="nav-link py-0" style="padding-left:16px;" href="/cgi-bin/repo?id=zocbfki6Y3WyerhbCJXb5nCbkWDM&file=trajectory_compressor.py"><i class="fa-solid fa-file" style="color:#888;"></i> trajectory_compressor.py</a></li><li><a class="nav-link py-0" style="padding-left:16px;" href="/cgi-bin/repo?id=zocbfki6Y3WyerhbCJXb5nCbkWDM&file=utils.py"><i class="fa-solid fa-file" style="color:#888;"></i> utils.py</a></li><li><a class="nav-link py-0" style="padding-left:16px;" href="/cgi-bin/repo?id=zocbfki6Y3WyerhbCJXb5nCbkWDM&file=uv.lock"><i class="fa-solid fa-file" style="color:#888;"></i> uv.lock</a></li>
    
                        </ul>
                      </div>
                    </nav>
                <main class="col-md-9 ms-sm-auto col-lg-10">
                  <div class="container px-1 py-3">
        
<div class="mb-2" style="font-size:1.1rem;"><a href="/cgi-bin/repo?id=zocbfki6Y3WyerhbCJXb5nCbkWDM&source=.">/</a> <a href="/cgi-bin/repo?id=zocbfki6Y3WyerhbCJXb5nCbkWDM&source=environments">environments</a> / README.md</div>
        <div class="list-group">
        <div class="list-group-item">
        <div class="mb-2" style="font-weight:bold;"><i class="fa-solid fa-file"></i> README.md</div>
        <pre style="margin:0; font-size:0.85rem; overflow-x:auto; color:#fafafa;"><span style="color:#666; user-select:none;">  1</span>  # Hermes-Agent Atropos Environments
<span style="color:#666; user-select:none;">  2</span>  
<span style="color:#666; user-select:none;">  3</span>  This directory contains the integration layer between **hermes-agent&#x27;s** tool-calling capabilities and the **Atropos** RL training framework. It provides everything needed to run agentic LLMs through multi-turn tool-calling loops, score their output with arbitrary reward functions, and feed results into Atropos for training or evaluation.
<span style="color:#666; user-select:none;">  4</span>  
<span style="color:#666; user-select:none;">  5</span>  ## Architecture Overview
<span style="color:#666; user-select:none;">  6</span>  
<span style="color:#666; user-select:none;">  7</span>  ```
<span style="color:#666; user-select:none;">  8</span>                          Atropos Framework
<span style="color:#666; user-select:none;">  9</span>                      ┌───────────────────────┐
<span style="color:#666; user-select:none;"> 10</span>                      │       BaseEnv          │  (atroposlib)
<span style="color:#666; user-select:none;"> 11</span>                      │  - Server management   │
<span style="color:#666; user-select:none;"> 12</span>                      │  - Worker scheduling   │
<span style="color:#666; user-select:none;"> 13</span>                      │  - Wandb logging       │
<span style="color:#666; user-select:none;"> 14</span>                      │  - CLI (serve/process/ │
<span style="color:#666; user-select:none;"> 15</span>                      │    evaluate)           │
<span style="color:#666; user-select:none;"> 16</span>                      └───────────┬───────────┘
<span style="color:#666; user-select:none;"> 17</span>                                  │ inherits
<span style="color:#666; user-select:none;"> 18</span>                      ┌───────────┴───────────┐
<span style="color:#666; user-select:none;"> 19</span>                      │  HermesAgentBaseEnv    │  hermes_base_env.py
<span style="color:#666; user-select:none;"> 20</span>                      │  - Terminal backend    │
<span style="color:#666; user-select:none;"> 21</span>                      │  - Tool resolution     │
<span style="color:#666; user-select:none;"> 22</span>                      │  - Agent loop          │
<span style="color:#666; user-select:none;"> 23</span>                      │  - ToolContext          │
<span style="color:#666; user-select:none;"> 24</span>                      │  - Async patches       │
<span style="color:#666; user-select:none;"> 25</span>                      └───────────┬───────────┘
<span style="color:#666; user-select:none;"> 26</span>                                  │ inherits
<span style="color:#666; user-select:none;"> 27</span>                ┌─────────────────┼─────────────────┐
<span style="color:#666; user-select:none;"> 28</span>                │                 │                  │
<span style="color:#666; user-select:none;"> 29</span>       TerminalTestEnv     HermesSweEnv    TerminalBench2EvalEnv
<span style="color:#666; user-select:none;"> 30</span>       (stack testing)     (SWE training)   (TB2 benchmark eval)
<span style="color:#666; user-select:none;"> 31</span>  ```
<span style="color:#666; user-select:none;"> 32</span>  
<span style="color:#666; user-select:none;"> 33</span>  ### Inheritance Chain
<span style="color:#666; user-select:none;"> 34</span>  
<span style="color:#666; user-select:none;"> 35</span>  **BaseEnv** (from `atroposlib`) is the Atropos base class. It provides:
<span style="color:#666; user-select:none;"> 36</span>  - Server management (OpenAI-compatible API servers, VLLM, SGLang)
<span style="color:#666; user-select:none;"> 37</span>  - Worker scheduling for parallel rollouts
<span style="color:#666; user-select:none;"> 38</span>  - Wandb integration for metrics and rollout logging
<span style="color:#666; user-select:none;"> 39</span>  - CLI interface with three subcommands: `serve`, `process`, `evaluate`
<span style="color:#666; user-select:none;"> 40</span>  - `evaluate_log()` for saving eval results to JSON + samples.jsonl
<span style="color:#666; user-select:none;"> 41</span>  
<span style="color:#666; user-select:none;"> 42</span>  **HermesAgentBaseEnv** (`hermes_base_env.py`) extends BaseEnv with hermes-agent specifics:
<span style="color:#666; user-select:none;"> 43</span>  - Sets `os.environ[&quot;TERMINAL_ENV&quot;]` to configure the terminal backend (local, docker, modal, daytona, ssh, singularity)
<span style="color:#666; user-select:none;"> 44</span>  - Resolves hermes-agent toolsets via `_resolve_tools_for_group()` (calls `get_tool_definitions()` which queries `tools/registry.py`)
<span style="color:#666; user-select:none;"> 45</span>  - Implements `collect_trajectory()` which runs the full agent loop and computes rewards
<span style="color:#666; user-select:none;"> 46</span>  - Supports two-phase operation (Phase 1: OpenAI server, Phase 2: VLLM ManagedServer)
<span style="color:#666; user-select:none;"> 47</span>  - Applies monkey patches for async-safe tool operation at import time
<span style="color:#666; user-select:none;"> 48</span>  
<span style="color:#666; user-select:none;"> 49</span>  Concrete environments inherit from `HermesAgentBaseEnv` and implement:
<span style="color:#666; user-select:none;"> 50</span>  - `setup()` -- Load dataset, initialize state
<span style="color:#666; user-select:none;"> 51</span>  - `get_next_item()` -- Return the next item for rollout
<span style="color:#666; user-select:none;"> 52</span>  - `format_prompt()` -- Convert a dataset item into the user message
<span style="color:#666; user-select:none;"> 53</span>  - `compute_reward()` -- Score the rollout using ToolContext
<span style="color:#666; user-select:none;"> 54</span>  - `evaluate()` -- Periodic evaluation logic
<span style="color:#666; user-select:none;"> 55</span>  
<span style="color:#666; user-select:none;"> 56</span>  ## Core Components
<span style="color:#666; user-select:none;"> 57</span>  
<span style="color:#666; user-select:none;"> 58</span>  ### Agent Loop (`agent_loop.py`)
<span style="color:#666; user-select:none;"> 59</span>  
<span style="color:#666; user-select:none;"> 60</span>  `HermesAgentLoop` is the reusable multi-turn agent engine. It runs the same pattern as hermes-agent&#x27;s `run_agent.py`:
<span style="color:#666; user-select:none;"> 61</span>  
<span style="color:#666; user-select:none;"> 62</span>  1. Send messages + tools to the API via `server.chat_completion()`
<span style="color:#666; user-select:none;"> 63</span>  2. If the response contains `tool_calls`, execute each one via `handle_function_call()` (which delegates to `tools/registry.py`&#x27;s `dispatch()`)
<span style="color:#666; user-select:none;"> 64</span>  3. Append tool results to the conversation and go back to step 1
<span style="color:#666; user-select:none;"> 65</span>  4. If the response has no tool_calls, the agent is done
<span style="color:#666; user-select:none;"> 66</span>  
<span style="color:#666; user-select:none;"> 67</span>  Tool calls are executed in a thread pool (`run_in_executor`) so backends that use `asyncio.run()` internally (Modal, Docker) don&#x27;t deadlock inside Atropos&#x27;s event loop.
<span style="color:#666; user-select:none;"> 68</span>  
<span style="color:#666; user-select:none;"> 69</span>  Returns an `AgentResult` containing the full conversation history, turn count, reasoning content per turn, tool errors, and optional ManagedServer state (for Phase 2).
<span style="color:#666; user-select:none;"> 70</span>  
<span style="color:#666; user-select:none;"> 71</span>  ### Tool Context (`tool_context.py`)
<span style="color:#666; user-select:none;"> 72</span>  
<span style="color:#666; user-select:none;"> 73</span>  `ToolContext` is a per-rollout handle that gives reward/verification functions direct access to **all** hermes-agent tools, scoped to the rollout&#x27;s `task_id`. The same `task_id` means the terminal/browser session is the SAME one the model used during its rollout -- all state (files, processes, browser tabs) is preserved.
<span style="color:#666; user-select:none;"> 74</span>  
<span style="color:#666; user-select:none;"> 75</span>  ```python
<span style="color:#666; user-select:none;"> 76</span>  async def compute_reward(self, item, result, ctx: ToolContext):
<span style="color:#666; user-select:none;"> 77</span>      # Run tests in the model&#x27;s terminal sandbox
<span style="color:#666; user-select:none;"> 78</span>      test = ctx.terminal(&quot;pytest -v&quot;)
<span style="color:#666; user-select:none;"> 79</span>      if test[&quot;exit_code&quot;] == 0:
<span style="color:#666; user-select:none;"> 80</span>          return 1.0
<span style="color:#666; user-select:none;"> 81</span>  
<span style="color:#666; user-select:none;"> 82</span>      # Check if a file was created
<span style="color:#666; user-select:none;"> 83</span>      content = ctx.read_file(&quot;/workspace/solution.py&quot;)
<span style="color:#666; user-select:none;"> 84</span>      if content.get(&quot;content&quot;):
<span style="color:#666; user-select:none;"> 85</span>          return 0.5
<span style="color:#666; user-select:none;"> 86</span>  
<span style="color:#666; user-select:none;"> 87</span>      # Download files locally for verification (binary-safe)
<span style="color:#666; user-select:none;"> 88</span>      ctx.download_file(&quot;/remote/output.bin&quot;, &quot;/local/output.bin&quot;)
<span style="color:#666; user-select:none;"> 89</span>  
<span style="color:#666; user-select:none;"> 90</span>      return 0.0
<span style="color:#666; user-select:none;"> 91</span>  ```
<span style="color:#666; user-select:none;"> 92</span>  
<span style="color:#666; user-select:none;"> 93</span>  Available methods:
<span style="color:#666; user-select:none;"> 94</span>  - **Terminal**: `terminal(command, timeout)` -- run shell commands
<span style="color:#666; user-select:none;"> 95</span>  - **Files**: `read_file(path)`, `write_file(path, content)`, `search(query, path)`
<span style="color:#666; user-select:none;"> 96</span>  - **Transfers**: `upload_file()`, `upload_dir()`, `download_file()`, `download_dir()` -- binary-safe file transfers between host and sandbox
<span style="color:#666; user-select:none;"> 97</span>  - **Web**: `web_search(query)`, `web_extract(urls)`
<span style="color:#666; user-select:none;"> 98</span>  - **Browser**: `browser_navigate(url)`, `browser_snapshot()`
<span style="color:#666; user-select:none;"> 99</span>  - **Generic**: `call_tool(name, args)` -- call any hermes-agent tool by name
<span style="color:#666; user-select:none;">100</span>  - **Cleanup**: `cleanup()` -- release all resources (called automatically after `compute_reward`)
<span style="color:#666; user-select:none;">101</span>  
<span style="color:#666; user-select:none;">102</span>  ### Patches (`patches.py`)
<span style="color:#666; user-select:none;">103</span>  
<span style="color:#666; user-select:none;">104</span>  **Problem**: Some hermes-agent tools use `asyncio.run()` internally (e.g., the Modal backend). This crashes when called from inside Atropos&#x27;s event loop because `asyncio.run()` cannot be nested.
<span style="color:#666; user-select:none;">105</span>  
<span style="color:#666; user-select:none;">106</span>  **Solution**: `ModalEnvironment` uses a dedicated `_AsyncWorker` background thread with its own event loop. The calling code sees a sync interface, but internally all async Modal SDK calls happen on the worker thread so they don&#x27;t conflict with Atropos&#x27;s loop. This is built directly into `tools/environments/modal.py` — no monkey-patching required.
<span style="color:#666; user-select:none;">107</span>  
<span style="color:#666; user-select:none;">108</span>  `patches.py` is now a no-op (kept for backward compatibility with imports).
<span style="color:#666; user-select:none;">109</span>  
<span style="color:#666; user-select:none;">110</span>  ### Tool Call Parsers (`tool_call_parsers/`)
<span style="color:#666; user-select:none;">111</span>  
<span style="color:#666; user-select:none;">112</span>  Client-side parsers that extract structured `tool_calls` from raw model output text. Used in **Phase 2** (VLLM server type) where ManagedServer&#x27;s `/generate` endpoint returns raw text without tool call parsing.
<span style="color:#666; user-select:none;">113</span>  
<span style="color:#666; user-select:none;">114</span>  Each parser is a standalone reimplementation of the corresponding VLLM parser&#x27;s `extract_tool_calls()` logic. No VLLM dependency -- only standard library (`re`, `json`, `uuid`) and `openai` types.
<span style="color:#666; user-select:none;">115</span>  
<span style="color:#666; user-select:none;">116</span>  Available parsers:
<span style="color:#666; user-select:none;">117</span>  - `hermes` -- Hermes/ChatML `&lt;tool_call&gt;` XML format
<span style="color:#666; user-select:none;">118</span>  - `mistral` -- Mistral `[TOOL_CALLS]` format
<span style="color:#666; user-select:none;">119</span>  - `llama3_json` -- Llama 3 JSON tool calling
<span style="color:#666; user-select:none;">120</span>  - `qwen` -- Qwen tool calling format
<span style="color:#666; user-select:none;">121</span>  - `qwen3_coder` -- Qwen3 Coder format
<span style="color:#666; user-select:none;">122</span>  - `deepseek_v3` -- DeepSeek V3 format
<span style="color:#666; user-select:none;">123</span>  - `deepseek_v3_1` -- DeepSeek V3.1 format
<span style="color:#666; user-select:none;">124</span>  - `kimi_k2` -- Kimi K2 format
<span style="color:#666; user-select:none;">125</span>  - `longcat` -- Longcat format
<span style="color:#666; user-select:none;">126</span>  - `glm45` / `glm47` -- GLM model formats
<span style="color:#666; user-select:none;">127</span>  
<span style="color:#666; user-select:none;">128</span>  Usage:
<span style="color:#666; user-select:none;">129</span>  ```python
<span style="color:#666; user-select:none;">130</span>  from environments.tool_call_parsers import get_parser
<span style="color:#666; user-select:none;">131</span>  
<span style="color:#666; user-select:none;">132</span>  parser = get_parser(&quot;hermes&quot;)
<span style="color:#666; user-select:none;">133</span>  content, tool_calls = parser.parse(raw_model_output)
<span style="color:#666; user-select:none;">134</span>  ```
<span style="color:#666; user-select:none;">135</span>  
<span style="color:#666; user-select:none;">136</span>  In Phase 1 (OpenAI server type), these parsers are not needed -- the server handles tool call parsing natively.
<span style="color:#666; user-select:none;">137</span>  
<span style="color:#666; user-select:none;">138</span>  ## Two-Phase Operation
<span style="color:#666; user-select:none;">139</span>  
<span style="color:#666; user-select:none;">140</span>  ### Phase 1: OpenAI Server (Evaluation / SFT Data Generation)
<span style="color:#666; user-select:none;">141</span>  
<span style="color:#666; user-select:none;">142</span>  Uses `server.chat_completion()` with `tools=` parameter. The server (VLLM, SGLang, OpenRouter, OpenAI) handles tool call parsing natively. Returns `ChatCompletion` objects with structured `tool_calls`.
<span style="color:#666; user-select:none;">143</span>  
<span style="color:#666; user-select:none;">144</span>  - Good for: evaluation, SFT data generation, testing
<span style="color:#666; user-select:none;">145</span>  - Run with: `serve` (with `run-api`), `process`, or `evaluate` subcommands
<span style="color:#666; user-select:none;">146</span>  - Placeholder tokens are created for the Atropos pipeline
<span style="color:#666; user-select:none;">147</span>  
<span style="color:#666; user-select:none;">148</span>  ### Phase 2: VLLM ManagedServer (Full RL Training)
<span style="color:#666; user-select:none;">149</span>  
<span style="color:#666; user-select:none;">150</span>  Uses ManagedServer for exact token IDs + logprobs via `/generate`. Client-side tool call parser (from `tool_call_parsers/`) reconstructs structured `tool_calls` from raw output.
<span style="color:#666; user-select:none;">151</span>  
<span style="color:#666; user-select:none;">152</span>  - Good for: full RL training with GRPO/PPO
<span style="color:#666; user-select:none;">153</span>  - Run with: `serve` subcommand
<span style="color:#666; user-select:none;">154</span>  - Real tokens, masks, and logprobs flow through the pipeline
<span style="color:#666; user-select:none;">155</span>  
<span style="color:#666; user-select:none;">156</span>  ## Directory Structure
<span style="color:#666; user-select:none;">157</span>  
<span style="color:#666; user-select:none;">158</span>  ```
<span style="color:#666; user-select:none;">159</span>  environments/
<span style="color:#666; user-select:none;">160</span>  ├── README.md                     # This file
<span style="color:#666; user-select:none;">161</span>  ├── __init__.py                   # Package exports
<span style="color:#666; user-select:none;">162</span>  ├── hermes_base_env.py            # Abstract base (HermesAgentBaseEnv)
<span style="color:#666; user-select:none;">163</span>  ├── agent_loop.py                 # Multi-turn agent engine (HermesAgentLoop)
<span style="color:#666; user-select:none;">164</span>  ├── tool_context.py               # Per-rollout tool access for reward functions
<span style="color:#666; user-select:none;">165</span>  ├── patches.py                    # Async-safety patches for Modal backend
<span style="color:#666; user-select:none;">166</span>  │
<span style="color:#666; user-select:none;">167</span>  ├── tool_call_parsers/            # Phase 2 client-side parsers
<span style="color:#666; user-select:none;">168</span>  │   ├── __init__.py               # Registry + base class
<span style="color:#666; user-select:none;">169</span>  │   ├── hermes_parser.py
<span style="color:#666; user-select:none;">170</span>  │   ├── mistral_parser.py
<span style="color:#666; user-select:none;">171</span>  │   ├── llama_parser.py
<span style="color:#666; user-select:none;">172</span>  │   ├── qwen_parser.py
<span style="color:#666; user-select:none;">173</span>  │   ├── qwen3_coder_parser.py
<span style="color:#666; user-select:none;">174</span>  │   ├── deepseek_v3_parser.py
<span style="color:#666; user-select:none;">175</span>  │   ├── deepseek_v3_1_parser.py
<span style="color:#666; user-select:none;">176</span>  │   ├── kimi_k2_parser.py
<span style="color:#666; user-select:none;">177</span>  │   ├── longcat_parser.py
<span style="color:#666; user-select:none;">178</span>  │   ├── glm45_parser.py
<span style="color:#666; user-select:none;">179</span>  │   └── glm47_parser.py
<span style="color:#666; user-select:none;">180</span>  │
<span style="color:#666; user-select:none;">181</span>  ├── terminal_test_env/            # Stack validation environment
<span style="color:#666; user-select:none;">182</span>  │   └── terminal_test_env.py
<span style="color:#666; user-select:none;">183</span>  │
<span style="color:#666; user-select:none;">184</span>  ├── hermes_swe_env/               # SWE-bench style training environment
<span style="color:#666; user-select:none;">185</span>  │   └── hermes_swe_env.py
<span style="color:#666; user-select:none;">186</span>  │
<span style="color:#666; user-select:none;">187</span>  └── benchmarks/                   # Evaluation benchmarks
<span style="color:#666; user-select:none;">188</span>      ├── terminalbench_2/          # 89 terminal tasks, Modal sandboxes
<span style="color:#666; user-select:none;">189</span>      │   └── terminalbench2_env.py
<span style="color:#666; user-select:none;">190</span>      ├── tblite/                   # 100 calibrated tasks (fast TB2 proxy)
<span style="color:#666; user-select:none;">191</span>      │   └── tblite_env.py
<span style="color:#666; user-select:none;">192</span>      └── yc_bench/                 # Long-horizon strategic benchmark
<span style="color:#666; user-select:none;">193</span>          └── yc_bench_env.py
<span style="color:#666; user-select:none;">194</span>  ```
<span style="color:#666; user-select:none;">195</span>  
<span style="color:#666; user-select:none;">196</span>  ## Concrete Environments
<span style="color:#666; user-select:none;">197</span>  
<span style="color:#666; user-select:none;">198</span>  ### TerminalTestEnv (`terminal_test_env/`)
<span style="color:#666; user-select:none;">199</span>  
<span style="color:#666; user-select:none;">200</span>  A self-contained environment with inline tasks (no external dataset needed) for validating the full stack end-to-end. Each task asks the model to create a file at a known path, and the verifier checks the content matches.
<span style="color:#666; user-select:none;">201</span>  
<span style="color:#666; user-select:none;">202</span>  ```bash
<span style="color:#666; user-select:none;">203</span>  # Serve mode (needs run-api)
<span style="color:#666; user-select:none;">204</span>  run-api
<span style="color:#666; user-select:none;">205</span>  python environments/terminal_test_env/terminal_test_env.py serve
<span style="color:#666; user-select:none;">206</span>  
<span style="color:#666; user-select:none;">207</span>  # Process mode (no run-api, saves to JSONL)
<span style="color:#666; user-select:none;">208</span>  python environments/terminal_test_env/terminal_test_env.py process \
<span style="color:#666; user-select:none;">209</span>      --env.data_path_to_save_groups terminal_test_output.jsonl
<span style="color:#666; user-select:none;">210</span>  ```
<span style="color:#666; user-select:none;">211</span>  
<span style="color:#666; user-select:none;">212</span>  ### HermesSweEnv (`hermes_swe_env/`)
<span style="color:#666; user-select:none;">213</span>  
<span style="color:#666; user-select:none;">214</span>  SWE-bench style training environment. The model gets a coding task, uses terminal + file + web tools to solve it, and the reward function runs tests in the same Modal sandbox.
<span style="color:#666; user-select:none;">215</span>  
<span style="color:#666; user-select:none;">216</span>  ```bash
<span style="color:#666; user-select:none;">217</span>  python environments/hermes_swe_env/hermes_swe_env.py serve \
<span style="color:#666; user-select:none;">218</span>      --openai.model_name YourModel \
<span style="color:#666; user-select:none;">219</span>      --env.dataset_name bigcode/humanevalpack \
<span style="color:#666; user-select:none;">220</span>      --env.terminal_backend modal
<span style="color:#666; user-select:none;">221</span>  ```
<span style="color:#666; user-select:none;">222</span>  
<span style="color:#666; user-select:none;">223</span>  ### TerminalBench2EvalEnv (`benchmarks/terminalbench_2/`)
<span style="color:#666; user-select:none;">224</span>  
<span style="color:#666; user-select:none;">225</span>  **Eval-only** environment for the Terminal-Bench 2.0 benchmark (89 tasks). Each task gets a pre-built Docker Hub image, a natural language instruction, and a test suite. The agent uses terminal + file tools to solve the task, then the test suite verifies correctness.
<span style="color:#666; user-select:none;">226</span>  
<span style="color:#666; user-select:none;">227</span>  Follows the standard Atropos eval pattern (like GPQA, MMLU, etc.):
<span style="color:#666; user-select:none;">228</span>  - Run via `evaluate` subcommand (no `run-api` needed)
<span style="color:#666; user-select:none;">229</span>  - `setup()` loads the dataset, `evaluate()` runs all tasks
<span style="color:#666; user-select:none;">230</span>  - `rollout_and_score_eval()` handles per-task agent loop + test verification
<span style="color:#666; user-select:none;">231</span>  - Downloads verifier output locally for reliable reward checking (Harbor pattern)
<span style="color:#666; user-select:none;">232</span>  
<span style="color:#666; user-select:none;">233</span>  ```bash
<span style="color:#666; user-select:none;">234</span>  # Run full benchmark
<span style="color:#666; user-select:none;">235</span>  python environments/benchmarks/terminalbench_2/terminalbench2_env.py evaluate \
<span style="color:#666; user-select:none;">236</span>      --openai.model_name anthropic/claude-opus-4.6
<span style="color:#666; user-select:none;">237</span>  
<span style="color:#666; user-select:none;">238</span>  # Run subset of tasks
<span style="color:#666; user-select:none;">239</span>  python environments/benchmarks/terminalbench_2/terminalbench2_env.py evaluate \
<span style="color:#666; user-select:none;">240</span>      --openai.model_name anthropic/claude-opus-4.6 \
<span style="color:#666; user-select:none;">241</span>      --env.task_filter fix-git,git-multibranch
<span style="color:#666; user-select:none;">242</span>  
<span style="color:#666; user-select:none;">243</span>  # Skip specific tasks
<span style="color:#666; user-select:none;">244</span>  python environments/benchmarks/terminalbench_2/terminalbench2_env.py evaluate \
<span style="color:#666; user-select:none;">245</span>      --openai.model_name anthropic/claude-opus-4.6 \
<span style="color:#666; user-select:none;">246</span>      --env.skip_tasks heavy-task,slow-task
<span style="color:#666; user-select:none;">247</span>  ```
<span style="color:#666; user-select:none;">248</span>  
<span style="color:#666; user-select:none;">249</span>  ## Creating a New Environment
<span style="color:#666; user-select:none;">250</span>  
<span style="color:#666; user-select:none;">251</span>  ### Training Environment
<span style="color:#666; user-select:none;">252</span>  
<span style="color:#666; user-select:none;">253</span>  1. Create a new directory under `environments/`
<span style="color:#666; user-select:none;">254</span>  2. Create your env file inheriting from `HermesAgentBaseEnv`
<span style="color:#666; user-select:none;">255</span>  3. Implement the four abstract methods + `evaluate()`
<span style="color:#666; user-select:none;">256</span>  
<span style="color:#666; user-select:none;">257</span>  ```python
<span style="color:#666; user-select:none;">258</span>  from environments.hermes_base_env import HermesAgentBaseEnv, HermesAgentEnvConfig
<span style="color:#666; user-select:none;">259</span>  
<span style="color:#666; user-select:none;">260</span>  class MyEnvConfig(HermesAgentEnvConfig):
<span style="color:#666; user-select:none;">261</span>      pass  # Add custom fields as needed
<span style="color:#666; user-select:none;">262</span>  
<span style="color:#666; user-select:none;">263</span>  class MyEnv(HermesAgentBaseEnv):
<span style="color:#666; user-select:none;">264</span>      name = &quot;my-env&quot;
<span style="color:#666; user-select:none;">265</span>      env_config_cls = MyEnvConfig
<span style="color:#666; user-select:none;">266</span>  
<span style="color:#666; user-select:none;">267</span>      @classmethod
<span style="color:#666; user-select:none;">268</span>      def config_init(cls):
<span style="color:#666; user-select:none;">269</span>          env_config = MyEnvConfig(
<span style="color:#666; user-select:none;">270</span>              enabled_toolsets=[&quot;terminal&quot;, &quot;file&quot;],
<span style="color:#666; user-select:none;">271</span>              terminal_backend=&quot;modal&quot;,
<span style="color:#666; user-select:none;">272</span>              # ... other config
<span style="color:#666; user-select:none;">273</span>          )
<span style="color:#666; user-select:none;">274</span>          server_configs = [APIServerConfig(...)]
<span style="color:#666; user-select:none;">275</span>          return env_config, server_configs
<span style="color:#666; user-select:none;">276</span>  
<span style="color:#666; user-select:none;">277</span>      async def setup(self):
<span style="color:#666; user-select:none;">278</span>          self.dataset = load_dataset(...)
<span style="color:#666; user-select:none;">279</span>          self.iter = 0
<span style="color:#666; user-select:none;">280</span>  
<span style="color:#666; user-select:none;">281</span>      async def get_next_item(self):
<span style="color:#666; user-select:none;">282</span>          item = self.dataset[self.iter % len(self.dataset)]
<span style="color:#666; user-select:none;">283</span>          self.iter += 1
<span style="color:#666; user-select:none;">284</span>          return item
<span style="color:#666; user-select:none;">285</span>  
<span style="color:#666; user-select:none;">286</span>      def format_prompt(self, item):
<span style="color:#666; user-select:none;">287</span>          return item[&quot;instruction&quot;]
<span style="color:#666; user-select:none;">288</span>  
<span style="color:#666; user-select:none;">289</span>      async def compute_reward(self, item, result, ctx):
<span style="color:#666; user-select:none;">290</span>          # ctx gives you full tool access to the rollout&#x27;s sandbox
<span style="color:#666; user-select:none;">291</span>          test = ctx.terminal(&quot;pytest -v&quot;)
<span style="color:#666; user-select:none;">292</span>          return 1.0 if test[&quot;exit_code&quot;] == 0 else 0.0
<span style="color:#666; user-select:none;">293</span>  
<span style="color:#666; user-select:none;">294</span>      async def evaluate(self, *args, **kwargs):
<span style="color:#666; user-select:none;">295</span>          # Periodic evaluation logic
<span style="color:#666; user-select:none;">296</span>          ...
<span style="color:#666; user-select:none;">297</span>  
<span style="color:#666; user-select:none;">298</span>  if __name__ == &quot;__main__&quot;:
<span style="color:#666; user-select:none;">299</span>      MyEnv.cli()
<span style="color:#666; user-select:none;">300</span>  ```
<span style="color:#666; user-select:none;">301</span>  
<span style="color:#666; user-select:none;">302</span>  ### Eval-Only Environment (Benchmark)
<span style="color:#666; user-select:none;">303</span>  
<span style="color:#666; user-select:none;">304</span>  For eval benchmarks, follow the pattern in `terminalbench2_env.py`:
<span style="color:#666; user-select:none;">305</span>  1. Create under `environments/benchmarks/your-benchmark/`
<span style="color:#666; user-select:none;">306</span>  2. Inherit from `HermesAgentBaseEnv`
<span style="color:#666; user-select:none;">307</span>  3. Set eval-only config: `eval_handling=STOP_TRAIN`, `steps_per_eval=1`, `total_steps=1`
<span style="color:#666; user-select:none;">308</span>  4. Stub the training methods (`collect_trajectories`, `score`)
<span style="color:#666; user-select:none;">309</span>  5. Implement `rollout_and_score_eval()` and `evaluate()`
<span style="color:#666; user-select:none;">310</span>  6. Run with `evaluate` subcommand
<span style="color:#666; user-select:none;">311</span>  
<span style="color:#666; user-select:none;">312</span>  ## Key Config Fields
<span style="color:#666; user-select:none;">313</span>  
<span style="color:#666; user-select:none;">314</span>  | Field | Description | Default |
<span style="color:#666; user-select:none;">315</span>  |-------|-------------|---------|
<span style="color:#666; user-select:none;">316</span>  | `enabled_toolsets` | Which hermes toolsets to enable | `None` (all) |
<span style="color:#666; user-select:none;">317</span>  | `disabled_toolsets` | Toolsets to disable | `None` |
<span style="color:#666; user-select:none;">318</span>  | `distribution` | Probabilistic toolset distribution name | `None` |
<span style="color:#666; user-select:none;">319</span>  | `max_agent_turns` | Max LLM calls per rollout | `30` |
<span style="color:#666; user-select:none;">320</span>  | `agent_temperature` | Sampling temperature | `1.0` |
<span style="color:#666; user-select:none;">321</span>  | `terminal_backend` | `local`, `docker`, `modal`, `daytona`, `ssh`, `singularity` | `local` |
<span style="color:#666; user-select:none;">322</span>  | `system_prompt` | System message for the agent | `None` |
<span style="color:#666; user-select:none;">323</span>  | `tool_call_parser` | Parser name for Phase 2 | `hermes` |
<span style="color:#666; user-select:none;">324</span>  | `eval_handling` | `STOP_TRAIN`, `LIMIT_TRAIN`, `NONE` | `STOP_TRAIN` |
</pre>
        </div>
        </div>

</div>
</main>
</div>
</div>


</body>
</html>

