# Egress benchmark

**Prerequisites**: Docker, `curl` on the host. Domains: [`tests/hostname.txt`](../tests/hostname.txt) (one hostname per line; `#` and blank lines ignored). Run from `components/egress` or adjust paths.

---

## 1. `bench-dns-nft.sh`

**Compares**: plain **`curl`** container (**baseline**) → egress **`dns`** → egress **`dns+nft`**. Prints **Req/s**, **Avg**, **P50**, **P99**; percentages are vs **baseline**.

### Run

```bash
cd components/egress
./tests/bench-dns-nft.sh
```

Builds `opensandbox/egress:local` unless you set **`IMG=...`**. Optional: set **`BENCH_SAMPLE_SIZE=n`** to use `n` random domains.

### View results

- **Terminal**: summary table at the end.
- **Host `/tmp`**: `bench-e2e-baseline-total.txt`, `bench-e2e-dns-total.txt`, `bench-e2e-dns+nft-total.txt` (one **`time_total`** per line); `bench-e2e-{mode}-namelookup.txt`, `bench-e2e-{mode}-wall.txt`.

---

## 2. `bench-mitm-overhead.sh`

**Compares**: **`dns+nft`** without MITM vs **`dns+nft` + transparent mitmproxy**. Default **`BENCH_SCENARIOS=short,download`**: **`short`** = many HTTPS **HEAD** requests; **`download`** = parallel **GET** to **`BENCH_DOWNLOAD_URL`** (default: Cloudflare `__down`, ~20 MiB).

### Run

```bash
cd components/egress
./tests/bench-mitm-overhead.sh
```

**`SKIP_BUILD=1`** skips the image build; **`IMG`** is set at the top of the script. To run a single scenario, set e.g. **`BENCH_SCENARIOS=short`** or **`BENCH_SCENARIOS=download`**.

### View results

- **Terminal**: tables per scenario (latency / throughput vs no-MITM), plus **`E2E latency loss (avg time_total)`** in **ms/request** and **%**.
- **Host `/tmp`**:
  - Latency artifacts: `bench-mitm-*-short-*.txt`, `*-download-*.tsv`, `*-wall.txt`, etc.
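The per-line `time_total` files lend themselves to quick ad-hoc analysis. A minimal sketch of how the table's **Avg**/**P50**/**P99** can be recomputed from such a file with `awk` and `sort` (the temp file and its five sample values are illustrative, not script output; the real scripts may aggregate differently):

```shell
#!/usr/bin/env bash
# Stand-in for e.g. /tmp/bench-e2e-baseline-total.txt: one time_total per line.
f=$(mktemp)
printf '%s\n' 0.120 0.135 0.140 0.150 0.310 > "$f"

# Mean of all samples
avg=$(awk '{ s += $1 } END { printf "%.3f", s / NR }' "$f")

# Nearest-rank percentile: sort ascending, take the ceil(p*n/100)-th sample
pct() {
  local p=$1 file=$2 n
  n=$(wc -l < "$file")
  sort -n "$file" | awk -v r="$(( (p * n + 99) / 100 ))" 'NR == r { print; exit }'
}
p50=$(pct 50 "$f")
p99=$(pct 99 "$f")

echo "avg=${avg}s p50=${p50}s p99=${p99}s"
rm -f "$f"
```

With the sample data this prints `avg=0.171s p50=0.140s p99=0.310s`; swap in one of the `/tmp/bench-e2e-*-total.txt` files to inspect a real run.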
  - **Container metrics** (always written): `bench-mitm-docker-stats-dns_nft.tsv`, `bench-mitm-docker-stats-dns_nft_mitm.tsv`. Columns: `unix_ts`, **`/proc/loadavg`** fields (load1/5/15, …), **`docker stats`** fields (CPUPerc, MemUsage, …). *`loadavg` inside the container often tracks the host; use it for relative trends only.*

---

## 3. Reference baselines (example runs)

Illustrative only: **same machine, same script**, not an SLA. The **MITM** row = **`dns+nft` + transparent mitm**.

### `BENCH_SCENARIOS=download` (parallel GET, ~20 MiB, 4 streams, 1 round, 1 s sampling)

| Metric | `dns+nft` | + mitm |
|--------|-----------|--------|
| **CPUPerc** (docker) | Mostly **~2–5%**, max **~5.6%** | Often **~5–11%**, max **~10.9%** |
| **MemUsage** | **~9–18 MiB** | **~68–91 MiB** |
| **load1** | Up to **~0.23** | Spike to **~0.66**, then **~0.4–0.6** |

**Takeaway**: ~**2×** peak CPU% and ~**5×** RSS vs no MITM in this trace.

### `BENCH_SCENARIOS=short` (HEAD storm; rows are **sparse** if the phase is short)

Run profile (sample): `10 rounds × 40 URLs × 1 inflight = 400 requests`.

| Metric | `dns+nft` | + mitm |
|--------|-----------|--------|
| **Req/s** | **3.64** | **1.90** (**-47.6%**) |
| **Avg latency (time_total)** | **0.315 s** | **0.605 s** (**+91.9%**) |
| **P50 latency** | **0.136 s** | **0.143 s** (**+5.2%**) |
| **P99 latency** | **1.439 s** | **10.006 s** (**+595.2%**) |
| **E2E latency loss (avg)** | baseline | **+289.88 ms/request (+91.95%)** |

| Metric | `dns+nft` | + mitm |
|--------|-----------|--------|
| **CPUPerc** | Hot sample **~132%** | Hot sample **~232%** |
| **MemUsage** | **~6–10 MiB** | **~58–88 MiB** |

**`CPUPerc` > 100%** on a multi-core host is normal: a container can use more than one core-equivalent in Docker's metric.
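The `E2E latency loss` row is simple arithmetic over the two averages. Recomputing it from the rounded table values (0.315 s vs 0.605 s), so the result differs slightly from the reported **+289.88 ms**, which comes from unrounded averages:

```shell
#!/usr/bin/env bash
# Recompute "E2E latency loss (avg time_total)" from the rounded table averages.
base=0.315   # avg time_total, dns+nft (s)
mitm=0.605   # avg time_total, dns+nft + mitm (s)
loss=$(awk -v b="$base" -v m="$mitm" \
  'BEGIN { printf "+%.2f ms/request (+%.2f%%)", (m - b) * 1000, (m - b) / b * 100 }')
echo "$loss"
```

This prints `+290.00 ms/request (+92.06%)`, matching the table within rounding.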
**Takeaway**: this sample shows clear request-side overhead from transparent MITM: about **+289.88 ms/request** on average, with throughput dropping to roughly half. `P50` stays close to baseline while `P99` grows sharply, indicating tail-latency amplification. With only **40 URLs** per round, tail metrics are timing-sensitive; use more rounds or more domains for a stable P99.

The CPU/memory trend is consistent: the peak CPU sample is **~1.8×** higher (**232/132**), and RSS is much higher with mitmdump. For denser host/container telemetry, use longer runs or **`BENCH_DOCKER_STATS_INTERVAL=0.5`**.
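A minimal sampler in the spirit of the `bench-mitm-docker-stats-*.tsv` files, for quick manual checks. The container name `egress` and the two-sample loop are assumptions for illustration; the real script samples for the whole phase at `BENCH_DOCKER_STATS_INTERVAL` seconds:

```shell
#!/usr/bin/env bash
# Append one TSV row per tick: unix_ts, load1/5/15, then docker stats fields.
out=$(mktemp)
for _ in 1 2; do
  ts=$(date +%s)
  # First three fields of /proc/loadavg are load1, load5, load15
  read -r load1 load5 load15 _ < /proc/loadavg
  # When the Docker daemon is reachable, also capture CPUPerc and MemUsage
  # for the (hypothetical) "egress" container; fall back to "-" otherwise.
  stats=$(docker stats --no-stream \
    --format '{{.CPUPerc}}\t{{.MemUsage}}' egress 2>/dev/null || echo '-')
  printf '%s\t%s\t%s\t%s\t%s\n' "$ts" "$load1" "$load5" "$load15" "$stats" >> "$out"
  sleep 1
done
cat "$out"
```

Reading `/proc/loadavg` from inside the container is what makes the caveat above matter: the values reflect the host kernel, so treat them as relative trends.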