<!doctype html>
<html lang="en">
<head>
  <meta charset="utf-8">
  <meta name="viewport" content="width=device-width, initial-scale=1">
  <title>AERIS-10 Docs | Implementation Log</title>
  <link rel="stylesheet" href="assets/style.css">
</head>
<body>
  <header class="topbar">
    <div class="container nav">
      <a class="brand" href="index.html">AERIS-10 Docs</a>
      <nav>
        <a href="architecture.html">Architecture</a>
        <a href="implementation-log.html">Implementation Log</a>
        <a href="bring-up.html">Bring-Up</a>
        <a href="reports.html">Reports</a>
        <a href="release-notes.html">Release Notes</a>
      </nav>
    </div>
  </header>

  <main class="container page">
    <section class="hero">
      <p class="eyebrow">Engineering Journal</p>
      <h1>Implementation Timeline and Improvements</h1>
      <p>Consolidated record of key firmware, timing, debug and infrastructure changes.</p>
    </section>

    <section class="card" style="margin-top:0.8rem;">
      <h2>Recent milestone timeline</h2>
      <div class="timeline">
        <article>
          <h3>Build 25 — MTI canceller + DC notch filter (ed629e7)</h3>
          <p class="muted">MTI 2-pulse canceller (H(z) = 1 - z^{-1}) integrated between range bin decimator and Doppler processor for ground clutter removal. DC notch filter (post-Doppler, pre-CFAR) zeroes bins within ±host_dc_notch_width of bin 0. Two new host registers: host_mti_enable (0x26), host_dc_notch_width (0x27). Both default to off/pass-through for backward compatibility. Build 25: WNS +0.132 ns, WHS +0.058 ns. 9,252 LUTs, 12,488 FFs, 17 BRAM, 142 DSP, 0.753 W. 23/23 FPGA regression, 29/29 MTI standalone checks, 3/3 real-data co-sim exact match.</p>
        </article>
        <article>
          <h3>Build 24 tagged v0.1.5-cfar — CA-CFAR production baseline (075ae1e)</h3>
          <p class="muted">CA-CFAR detector with CA/GO/SO modes integrated, replacing old threshold detector. Pipelined noise computation (Build 23 fix). WNS +0.179 ns, WHS +0.056 ns.
          8,558 LUTs, 10,384 FFs, 17 BRAM, 142 DSP, 0.754 W. CFAR cost: +2,229 LUTs, +1,281 FFs, +1 BRAM, +3 DSP. Includes magnitude BRAM buffer, sliding-window algorithm, host-configurable guard/train/alpha/mode registers (opcodes 0x21-0x25).</p>
        </article>
        <article>
          <h3>Build 23 failed timing, root-caused and fixed (0745cc4)</h3>
          <p class="muted">Build 23 had WNS -0.309 ns due to combinational path through CFAR noise_sum_comb → cross-multiply → alpha*noise DSP. Fixed by pipelining noise computation into ST_CFAR_THR + ST_CFAR_MUL stages, splitting the path across two clock cycles.</p>
        </article>
        <article>
          <h3>7 production fixes tagged v0.1.4-prod-fixes (e93bc33)</h3>
          <p class="muted">Detection bug fixes (sticky flag + one-cycle-lag magnitude), rename cfar→threshold_detect, digital gain control (host-configurable power-of-2 shift), Doppler/chirps mismatch protection (clamp + error flag), decimator watchdog (timeout counter), bypass_mode dead code removal, range-mode register (0x20). Real-data co-simulation framework added. 22/22 FPGA regression.</p>
        </article>
        <article>
          <h3>Real-data co-simulation framework (0b06436)</h3>
          <p class="muted">Three real-data testbenches added: range FFT, Doppler, and full-chain. Compare RTL outputs against Python golden reference using recorded ADC captures. 5,137 total data checks, all exact bit-for-bit match. Tagged v0.1.4-pre-fixes as safety net before production fixes.</p>
        </article>
        <article>
          <h3>Build 21 tagged v0.1.4-build21 — pre-CFAR production baseline (2efab23)</h3>
          <p class="muted">WNS +0.156 ns, WHS +0.064 ns, WPWS +0.361 ns. 6,192 LUTs (4.6%), 9,064 FFs (3.4%), 16 BRAM (4.4%), 139 DSP48E1 (18.8%), 0.732 W.
          Includes FFT 4-cycle butterfly (20% fewer cycles per butterfly), barrel-shift twiddle (-1 DSP), Gap 2 GUI Settings, E2E RTL fixes (mixer sequencing, USB data-pending, receiver toggle wiring), Vivado DRC multiple-driver fix for data_pending flags, and MMCM LOCKED XDC false_path correction (-from → -through). Build script crash at report_exceptions/check_timing (Vivado 2025.2 bug) fixed by wrapping in catch blocks; all 12 critical reports and bitstream generated successfully.</p>
        </article>
        <article>
          <h3>E2E integration test + RTL fixes: mixer sequencing, USB data-pending, receiver wiring (0773001)</h3>
          <p class="muted">New end-to-end testbench (tb_system_e2e.v) with 46 checks across 12 groups covering reset, TX, safety, RX, USB R/W, CDC, beam scanning, reset recovery, stream control, latency budgets, and watchdog. RTL fixes discovered via E2E: chirp controller TX/RX mixer enables now mutually exclusive by FSM state; USB write FSM gains doppler/cfar data_pending sticky flags with stream-control reset default changed to range-only (3'b001); receiver gets STM32 toggle signal inputs and dynamic frame detection. USB unit tests 21/22/56 updated for data_pending architecture. Regression script PASS/FAIL parsing hardened. 19/19 FPGA, 20/20 MCU.</p>
        </article>
        <article>
          <h3>FFT engine optimizations: 4-cycle butterfly + barrel-shift twiddle (a3e1996)</h3>
          <p class="muted">Merged SHIFT state into WRITE stage for a 5→4 cycle butterfly pipeline (20% fewer cycles per butterfly). Replaced multiplier-based twiddle factor index computation with variable left-shift (barrel shift), freeing one DSP48 multiplier. Both changes verified via FFT testbench.</p>
        </article>
        <article>
          <h3>Gap 2: GUI Settings — runtime chirp timing, stream control, status readback (7cdfa48)</h3>
          <p class="muted">Radar chirp timing parameters (long/short chirp, listen, guard cycles, chirps-per-elevation) are now runtime-configurable via 6 new USB opcodes (0x10-0x15).
          Stream control (opcode 0x04) gates the USB write FSM per-stream. CFAR threshold (opcode 0x03) is wired to actual comparison logic (was hardcoded). Status readback (opcode 0xFF) returns a 7-word packet with all current settings. CDC handled via per-bit 2-stage synchronizers (stream control) and toggle CDC (status request). 4 new testbench groups added. 18/18 FPGA, 20/20 MCU.</p>
        </article>
        <article>
          <h3>Gap 4: USB Read Path wired with toggle CDC (e5d1b3c)</h3>
          <p class="muted">FT601 read FSM cmd_* outputs connected through toggle CDC to clk_100m command decode registers in radar_system_top.v. Host can now set radar mode, trigger chirps, set CFAR threshold, and control data streaming via USB. 3 new testbench groups (55 total checks). 18/18 FPGA regression.</p>
        </article>
        <article>
          <h3>Build 20 tagged v0.1.3-build20 — new production baseline (c6103b3)</h3>
          <p class="muted">WNS improved 7x to +0.426 ns (from +0.062 ns in Build 18). Includes 400 MHz MMCM jitter cleaner, CIC comb DSP48E1 CREG pipeline, and XDC clock-name fix. All timing constraints met. 6,092 LUTs (4.5%), 9,024 FFs (3.4%), 16 BRAM (4.4%), 140 DSP48E1 (18.9%), 0.730 W.</p>
        </article>
        <article>
          <h3>Build 19 timing failure root-caused and fixed</h3>
          <p class="muted">Build 19 had WNS -0.011 ns due to conflicting XDC create_generated_clock preventing false-path application on CDC paths. Fixed by removing the conflicting constraint and using Vivado auto-generated clk_mmcm_out0.</p>
        </article>
        <article>
          <h3>Gap 3: Safety Architecture closed (f3bbf77)</h3>
          <p class="muted">Added IWDG watchdog configuration, Emergency_Stop PA rail cutoff, temperature max guard, periodic IDQ re-read, and emergency state ordering. 5 new MCU tests, 20/20 MCU regression pass.</p>
        </article>
        <article>
          <h3>Gap 5: BRAM async reset fixed (c87dce0)</h3>
          <p class="muted">Chirp memory loader BRAM async reset converted to synchronous reset pattern per Xilinx UG901 guidelines.
          Prevents BRAM inference failures on production target.</p>
        </article>
        <article>
          <h3>Build 18 tagged v0.1.2-build18 — prior production baseline</h3>
          <p class="muted">WNS +0.062 ns, WHS +0.059 ns. 6,088 LUTs, 8,946 FFs, 16 BRAM, 140 DSP48E1, 0.631 W. All timing met.</p>
        </article>
        <article>
          <h3>Firmware bug sweep closed with regression coverage</h3>
          <p class="muted">All 17 audited MCU firmware bugs were fixed, regression-tested, and pushed, including LO init ordering, SPI chip-select handling, PA calibration logic, TIM3 PWM bring-up, and stale diagnostic mismatches. 20/20 MCU tests pass.</p>
        </article>
        <article>
          <h3>FPGA timing/resource cleanup phase completed</h3>
          <p class="muted">Chirp BRAM migration, Doppler DSP48 pipelining, CIC pipeline staging, matched-filter regression repair, and full FPGA regression brought the active baseline to 18/18 passing tests.</p>
        </article>
      </div>
    </section>

    <section class="grid-2" style="margin-top:0.8rem;">
      <article class="card">
        <h2>Codebase quality and verification upgrades</h2>
        <ul>
          <li>FPGA regression: 23/23 passing suites covering matched filter, Doppler, CIC, CDC, USB (with read path), FFT, NCO, FIR, range decimator, mode controller, system-top integration, system E2E, CFAR standalone, and MTI standalone.</li>
          <li>MCU regression: 20/20 passing tests (15 bug-fix + 5 Gap-3 safety tests).</li>
          <li>Architectural gaps 1–7 all closed. Gap 1 (CFAR) integrated as CA-CFAR detector (Build 24). MTI canceller + DC notch filter added (Build 25). Gaps 2–7 closed prior to Build 21.</li>
          <li>USB host-to-FPGA command path fully wired: read FSM, toggle CDC, command decode for mode/trigger/CFAR/stream control.
          GUI settings (chirp timing, stream gating, status readback) fully operational.</li>
          <li>Safety architecture: IWDG watchdog, emergency stop PA cutoff, temperature guard, IDQ re-read, state ordering.</li>
        </ul>
      </article>
      <article class="card">
        <h2>Build history and timing improvements</h2>
        <ul>
          <li><strong>Build 25 (v0.1.6-mti)</strong>: Current production baseline. WNS +0.132 ns, WHS +0.058 ns. MTI canceller + DC notch filter. 9,252 LUTs, 12,488 FFs, 142 DSP48E1. 0.753 W.</li>
          <li><strong>Build 24 (v0.1.5-cfar)</strong>: Prior production baseline. WNS +0.179 ns, WHS +0.056 ns. CA-CFAR detector (CA/GO/SO). 8,558 LUTs, 142 DSP48E1. 0.754 W.</li>
          <li><strong>Build 21 (v0.1.4-build21)</strong>: Pre-CFAR baseline. WNS +0.156 ns, WHS +0.064 ns. FFT 4-cycle butterfly + barrel-shift twiddle. 139 DSP48E1 (-1). 0.732 W.</li>
          <li><strong>Build 20 (v0.1.3-build20)</strong>: Prior production baseline. WNS +0.426 ns, WHS +0.058 ns. 400 MHz MMCM + CIC CREG pipeline. 0.730 W.</li>
          <li><strong>Build 19</strong>: Failed (WNS -0.011 ns). Root cause: conflicting XDC generated clock prevented false-path application.</li>
          <li><strong>Build 18 (v0.1.2-build18)</strong>: Prior baseline. WNS +0.062 ns, WHS +0.059 ns. 0.631 W.</li>
          <li><strong>Build 17 (v0.1.1-build17)</strong>: FIR DSP48 pipelining + matched filter BRAM migration.</li>
          <li>Remote Vivado build infrastructure on Ubuntu 24.04 with Vivado 2025.2, targeting XC7A200T-2FBG484I.</li>
        </ul>
      </article>
    </section>
  </main>

  <footer class="footer">
    <div class="container"><p>Update this page at each major commit or bring-up gate.</p></div>
  </footer>
</body>
</html>