# Egress benchmark

**Prerequisites**: Docker and `curl` on the host. Domain list: [`tests/hostname.txt`](../tests/hostname.txt) (one hostname per line; `#` comments and blank lines are ignored). Run from `components/egress` or adjust paths.

---

## 1. `bench-dns-nft.sh`

**Compares**: plain **`curl`** container (**baseline**) → egress **`dns`** → egress **`dns+nft`**. Prints **Req/s**, **Avg**, **P50**, **P99**; percentages are relative to **baseline**.

### Run

```bash
cd components/egress
./tests/bench-dns-nft.sh
```

Builds `opensandbox/egress:local` unless **`IMG=...`** points at an existing image. Optional: set **`BENCH_SAMPLE_SIZE=n`** to benchmark `n` randomly chosen domains instead of the full list.

### View results

- **Terminal**: summary table at the end.
- **Host `/tmp`**: `bench-e2e-baseline-total.txt`, `bench-e2e-dns-total.txt`, `bench-e2e-dns+nft-total.txt` (one **`time_total`** value per line); `bench-e2e-{mode}-namelookup.txt`, `bench-e2e-{mode}-wall.txt`.

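The `*-total.txt` artifacts are easy to post-process. A minimal sketch (not part of the scripts) that derives avg/P50/P99 from any of them, assuming one `time_total` value per line and nearest-rank percentiles:

```bash
# Sketch: summarize a time_total artifact (one seconds value per line).
summarize() {
  sort -n "$1" | awk '
    { v[NR] = $1; sum += $1 }
    END {
      if (NR == 0) exit 1
      i50 = int(0.50 * NR) + 1; if (i50 > NR) i50 = NR   # nearest-rank P50
      i99 = int(0.99 * NR) + 1; if (i99 > NR) i99 = NR   # nearest-rank P99
      printf "n=%d avg=%.3f p50=%.3f p99=%.3f\n", NR, sum / NR, v[i50], v[i99]
    }'
}

# e.g.: summarize /tmp/bench-e2e-baseline-total.txt
```

Each call prints one `n=… avg=… p50=… p99=…` line; note the script's own summary may use a different percentile convention.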
---

## 2. `bench-mitm-overhead.sh`

**Compares**: **`dns+nft`** without MITM vs **`dns+nft` + transparent mitmproxy**. Defaults to **`BENCH_SCENARIOS=short,download`**: **`short`** issues many HTTPS **HEAD** requests; **`download`** runs parallel **GET**s against **`BENCH_DOWNLOAD_URL`** (default: Cloudflare `__down`, ~20 MiB).

### Run

```bash
cd components/egress
./tests/bench-mitm-overhead.sh
```

**`SKIP_BUILD=1`** skips the image build; **`IMG`** is set at the top of the script. To run a single scenario, set **`BENCH_SCENARIOS=short`** or **`BENCH_SCENARIOS=download`**.

### View results

- **Terminal**: tables per scenario (latency / throughput vs no-MITM), plus **`E2E latency loss (avg time_total)`** in **ms/request** and **%**.
- **Host `/tmp`**:
  - Latency artifacts: `bench-mitm-*-short-*.txt`, `*-download-*.tsv`, `*-wall.txt`, etc.
  - **Container metrics** (always written): `bench-mitm-docker-stats-dns_nft.tsv` and `bench-mitm-docker-stats-dns_nft_mitm.tsv`; each row holds `unix_ts`, **`/proc/loadavg`** fields (load1/5/15, …), and **`docker stats`** fields (CPUPerc, MemUsage, …). *`loadavg` inside the container often tracks the host; use it for relative trends only.*

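A sketch for pulling a peak reading out of these TSVs. The header-row layout below is an assumption (check the real files for column order); the main point is that `CPUPerc` carries a `%` suffix that must be stripped before numeric comparison:

```bash
# Sketch: report the peak CPUPerc sample from a docker-stats TSV.
# Assumes a tab-separated header row that names a "CPUPerc" column.
peak_cpu() {
  awk -F'\t' '
    NR == 1 { for (i = 1; i <= NF; i++) if ($i == "CPUPerc") col = i; next }
    col     { gsub(/%/, "", $col); if ($col + 0 > max) max = $col + 0 }
    END     { printf "peak CPUPerc: %.1f%%\n", max }
  ' "$1"
}

# e.g.: peak_cpu /tmp/bench-mitm-docker-stats-dns_nft_mitm.tsv
```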
---

## 3. Reference baselines (example runs)

Illustrative only: **same machine, same script**, not an SLA. The **MITM** column means **`dns+nft` + transparent mitm**.

### `BENCH_SCENARIOS=download` (parallel GET, ~20 MiB, 4 streams, 1 round, 1 s sampling)

| Metric | `dns+nft` | + mitm |
|--------|-----------|--------|
| **CPUPerc** (docker) | Mostly **~2–5%**, max **~5.6%** | Often **~5–11%**, max **~10.9%** |
| **MemUsage** | **~9–18 MiB** | **~68–91 MiB** |
| **load1** | Up to **~0.23** | Spike **~0.66**, then **~0.4–0.6** |

**Takeaway**: ~**2×** peak CPU% and ~**5×** RSS vs no MITM in this trace.

### `BENCH_SCENARIOS=short` (HEAD storm; **sparse** metric rows if the phase is short)

Run profile (sample): `10 rounds × 40 URLs × 1 inflight = 400 requests`.

| Metric | `dns+nft` | + mitm |
|--------|-----------|--------|
| **Req/s** | **3.64** | **1.90** (**-47.6%**) |
| **Avg latency (time_total)** | **0.315 s** | **0.605 s** (**+91.9%**) |
| **P50 latency** | **0.136 s** | **0.143 s** (**+5.2%**) |
| **P99 latency** | **1.439 s** | **10.006 s** (**+595.2%**) |
| **E2E latency loss (avg)** | baseline | **+289.88 ms/request (+91.95%)** |

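The **`E2E latency loss`** row is just the difference between the two average `time_total` values. Recomputing it from the rounded table figures almost reproduces it (the script uses unrounded averages, hence the small mismatch with **+289.88**):

```bash
# Recompute E2E latency loss from the rounded avg time_total values above.
awk 'BEGIN {
  base = 0.315; mitm = 0.605   # seconds, avg time_total from the table
  printf "+%.2f ms/request (+%.2f%%)\n", (mitm - base) * 1000, (mitm - base) / base * 100
}'
# prints: +290.00 ms/request (+92.06%)
```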
| Metric | `dns+nft` | + mitm |
|--------|-----------|--------|
| **CPUPerc** | Hot sample **~132%** | Hot sample **~232%** |
| **MemUsage** | **~6–10 MiB** | **~58–88 MiB** |

**`CPUPerc` > 100%** is normal on multi-core hosts: Docker reports CPU relative to a single core, so a container using more than one core-equivalent exceeds 100%.

**Takeaway**: this sample shows clear request-side overhead from transparent MITM: about **+289.88 ms/request** on average, with throughput roughly halved. `P50` stays close to baseline while `P99` grows sharply, indicating tail-latency amplification. With only **400 requests** over **40 URLs**, tail metrics are timing-sensitive; use more rounds or more domains for a stable P99.

The CPU/memory trend remains consistent: the peak CPU sample is **~1.8×** higher (**232/132**), and RSS is much higher with mitmdump. For denser host/container telemetry, use longer runs or set **`BENCH_DOCKER_STATS_INTERVAL=0.5`**.