<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>PyOD V3: Agentic Anomaly Detection</title>
<style>
  :root {
    --bg: #fafbfc;
    --surface: #ffffff;
    --surface-alt: #f4f6f9;
    --border: #e1e5eb;
    --border-light: #eef1f5;
    --text: #1a2332;
    --text-secondary: #5a6577;
    --text-dim: #8892a4;
    --accent: #2563eb;
    --accent-light: #dbeafe;
    --accent-bg: #eff6ff;
    --green: #059669;
    --green-light: #d1fae5;
    --green-bg: #ecfdf5;
    --amber: #d97706;
    --amber-light: #fef3c7;
    --red: #dc2626;
    --red-light: #fee2e2;
    --purple: #7c3aed;
    --purple-light: #ede9fe;
    --user-bg: #f0f4ff;
    --user-border: #c7d7fe;
    --code-bg: #f6f8fa;
    --shadow-sm: 0 1px 2px rgba(0,0,0,0.04);
    --shadow: 0 1px 3px rgba(0,0,0,0.06), 0 1px 2px rgba(0,0,0,0.04);
    --radius: 10px;
  }
  * { margin: 0; padding: 0; box-sizing: border-box; }
  body {
    font-family: -apple-system, 'SF Pro Text', 'Segoe UI', system-ui, sans-serif;
    background: var(--bg); color: var(--text);
    line-height: 1.65; -webkit-font-smoothing: antialiased;
  }

  .toolbar {
    background: var(--surface);
    border-bottom: 1px solid var(--border);
    padding: 10px 28px;
    display: flex; align-items: center; gap: 14px;
    position: sticky; top: 0; z-index: 100;
    box-shadow: var(--shadow-sm);
  }
  .toolbar .logo { font-size: 17px; font-weight: 700; letter-spacing: -0.3px; }
  .toolbar .logo em { color: var(--accent); font-style: normal; }
  .toolbar .version {
    font-size: 11px; font-weight: 600; color: var(--accent);
    background: var(--accent-light); padding: 2px 8px; border-radius: 6px;
  }
  .toolbar .spacer { flex: 1; }
  .toolbar .caption { font-size: 13px; color: var(--text-dim); }
  .toolbar button {
    background: var(--surface-alt); color: var(--text-secondary);
    border: 1px solid var(--border); padding: 5px 14px;
    border-radius: 6px; cursor: pointer;
    font-size: 12px; font-weight: 600;
  }

  .container { max-width: 1180px; margin: 0 auto; padding: 24px 20px; }

  .header { text-align: center; padding: 28px 0 8px; }
  .header h1 {
    font-size: 26px; font-weight: 700; letter-spacing: -0.5px;
  }
  .header h1 em { color: var(--accent); font-style: normal; }
  .header .subtitle {
    color: var(--text-secondary); font-size: 14px; margin-top: 6px;
    max-width: 620px; margin-left: auto; margin-right: auto;
  }
  .header .dataset-badge {
    display: inline-flex; align-items: center; gap: 6px; margin-top: 14px;
    padding: 5px 14px; background: var(--green-bg);
    border: 1px solid #a7f3d0; border-radius: 20px;
    font-size: 12px; font-weight: 600; color: var(--green);
  }
  .header .dataset-badge .dot {
    width: 6px; height: 6px; background: var(--green); border-radius: 50%;
  }

  .layers { display: flex; gap: 8px; justify-content: center; margin: 18px 0 28px; }
  .layer-chip {
    padding: 5px 14px; border-radius: 16px; font-size: 12px;
    font-weight: 600; border: 1px solid;
  }
  .layer-chip.l1 { border-color: var(--border); color: var(--text-dim); background: var(--surface); }
  .layer-chip.l2 { border-color: #c4b5fd; color: var(--purple); background: var(--purple-light); }
  .layer-chip.l3 { border-color: #93c5fd; color: var(--accent); background: var(--accent-bg); }

  /* The chat container is wide enough to fit a 780px chat column plus a
     378px callout area to its right. Standalone (un-annotated) turns
     stay constrained to the 780px chat column so the left edge is
     consistent with the turn-groups that extend a callout.
  */
  .chat {
    max-width: 1180px; margin: 0 auto;
    display: flex; flex-direction: column; gap: 14px;
  }
  .chat > .turn { max-width: 780px; }
  .turn { display: flex; gap: 10px; align-items: flex-start; }
  .avatar {
    width: 32px; height: 32px; border-radius: 8px;
    display: flex; align-items: center; justify-content: center;
    font-size: 13px; font-weight: 700; flex-shrink: 0;
    box-shadow: var(--shadow-sm);
  }
  .turn.user .avatar { background: var(--accent); color: white; }
  .turn.agent .avatar {
    background: linear-gradient(135deg, var(--green), #10b981); color: white;
  }

  .bubble {
    border-radius: var(--radius); padding: 14px 16px;
    font-size: 14px; max-width: 90%;
    box-shadow: var(--shadow);
    overflow-wrap: anywhere; /* long inline-code tokens wrap on narrow */
    min-width: 0;
  }
  .bubble code, .skill-panel code {
    overflow-wrap: anywhere;
    word-break: break-word;
  }
  .turn.user .bubble { background: var(--user-bg); border: 1px solid var(--user-border); }
  .turn.agent .bubble { background: var(--surface); border: 1px solid var(--border); }

  .bubble .label {
    font-size: 10px; font-weight: 700;
    text-transform: uppercase; letter-spacing: 0.8px;
    margin-bottom: 5px;
  }
  .turn.user .label { color: var(--accent); }
  .turn.agent .label { color: var(--green); }

  .step-tag {
    display: inline-block; background: var(--accent); color: white;
    font-size: 9px; font-weight: 700; padding: 1px 7px;
    border-radius: 4px; margin-left: 4px; letter-spacing: 0.5px;
    vertical-align: middle;
  }

  /* Benchmark attribution */
  .bench-note {
    font-size: 11px; color: var(--text-dim);
    font-style: italic; margin-top: 4px;
  }
  .bench-note .cite {
    color: var(--accent); font-weight: 600; font-style: normal;
  }

  /* Detector selection cards */
  .det-selection {
    display: flex; flex-direction: column; gap: 6px;
    margin: 8px
    0;
  }
  .det-card {
    display: flex; align-items: center; gap: 10px;
    padding: 8px 12px;
    background: var(--surface-alt);
    border: 1px solid var(--border-light);
    border-radius: 8px;
  }
  .det-card .rank {
    font-size: 11px; font-weight: 700;
    background: var(--accent); color: white;
    padding: 2px 7px; border-radius: 10px;
    min-width: 34px; text-align: center;
  }
  .det-card .name {
    font-weight: 700; color: var(--text);
    font-size: 13px;
    font-family: 'SF Mono', 'Consolas', monospace;
  }
  .det-card .desc {
    font-size: 12px; color: var(--text-secondary);
    flex: 1;
  }

  /* Per-detector score table */
  .score-table {
    width: 100%; border-collapse: collapse;
    margin: 8px 0; font-size: 13px;
  }
  .score-table th {
    text-align: left; color: var(--text-dim);
    font-size: 10px; font-weight: 700;
    text-transform: uppercase; letter-spacing: 0.5px;
    padding: 6px 8px; border-bottom: 2px solid var(--border);
  }
  .score-table td {
    padding: 6px 8px; border-bottom: 1px solid var(--border-light);
  }
  .score-table .det-name {
    font-family: 'SF Mono', 'Consolas', monospace;
    font-weight: 700; color: var(--accent);
  }
  .score-table .num {
    font-family: 'SF Mono', 'Consolas', monospace;
    text-align: right;
  }
  .score-table .above { color: var(--red); font-weight: 700; }
  .score-table .pct { color: var(--text-dim); font-size: 11px; }

  /* Metric cards */
  .metrics {
    display: grid; grid-template-columns: repeat(4, 1fr);
    gap: 8px; margin: 10px 0;
  }
  .metric {
    background: var(--surface-alt);
    border: 1px solid var(--border-light);
    border-radius: 8px; padding: 10px; text-align: center;
  }
  .metric .val {
    font-size: 18px; font-weight: 700; letter-spacing: -0.5px;
  }
  .metric .val.green { color: var(--green); }
  .metric .val.amber { color: var(--amber); }
  .metric .val.blue { color: var(--accent); }
  .metric .val.red { color: var(--red); }
  .metric .lbl {
    font-size: 10px; color: var(--text-dim);
    font-weight: 600; text-transform: uppercase;
    letter-spacing: 0.3px; margin-top: 2px;
  }

  /* Feature contribution table */
  .feat-table {
    width: 100%; border-collapse: collapse;
    margin: 8px 0; font-size: 12px;
  }
  .feat-table th {
    text-align: left; color: var(--text-dim);
    font-size: 10px; font-weight: 700;
    text-transform: uppercase; letter-spacing: 0.5px;
    padding: 5px 8px; border-bottom: 1px solid var(--border);
  }
  .feat-table td { padding: 4px 8px; border-bottom: 1px solid var(--border-light); }
  .feat-table .feat {
    font-family: 'SF Mono', 'Consolas', monospace;
    color: var(--accent);
  }
  .feat-table .z {
    color: var(--amber); font-weight: 700;
    font-family: 'SF Mono', 'Consolas', monospace;
  }
  .feat-table .val {
    color: var(--text-secondary);
    font-family: 'SF Mono', 'Consolas', monospace;
  }

  /* Code snippet */
  .code {
    background: var(--code-bg);
    border: 1px solid var(--border-light);
    border-radius: 6px; padding: 8px 12px;
    margin: 8px 0 2px;
    font-family: 'SF Mono', 'Consolas', monospace;
    font-size: 12px; color: var(--text-secondary);
    line-height: 1.5;
    overflow-x: auto; /* long lines scroll horizontally within the block */
  }
  .code .kw { color: #cf222e; }
  .code .fn { color: #8250df; }
  .code .str { color: #0a3069; }
  .code .num { color: #0550ae; }
  .code .cm { color: var(--text-dim); }

  /* Multi-modal flex */
  .modality-grid {
    display: grid; grid-template-columns: repeat(5, 1fr);
    gap: 6px; margin: 10px 0 8px;
  }
  .mod-card {
    padding: 8px 6px; text-align: center;
    border: 1px solid var(--border-light);
    border-radius: 8px; background: var(--surface);
  }
  .mod-card.current {
    border-color: var(--accent);
    background: var(--accent-bg);
  }
  .mod-card
  .mod-name {
    font-size: 11px; font-weight: 700;
    text-transform: uppercase; letter-spacing: 0.3px;
  }
  .mod-card.current .mod-name { color: var(--accent); }
  .mod-card:not(.current) .mod-name { color: var(--text-dim); }
  .mod-card .mod-count {
    font-size: 10px; color: var(--text-dim);
    margin-top: 2px;
  }

  /* Quality bars */
  .qbar { display: flex; align-items: center; gap: 8px; margin: 3px 0; }
  .qbar .qlbl { font-size: 12px; color: var(--text-secondary); width: 80px; text-align: right; }
  .qbar .qtrack { flex: 1; height: 5px; background: var(--border-light); border-radius: 3px; overflow: hidden; }
  .qbar .qfill { height: 100%; border-radius: 3px; }
  .qbar .qval {
    font-size: 12px; font-weight: 700; width: 34px;
    font-family: 'SF Mono', monospace;
  }

  .verdict {
    display: inline-block; padding: 2px 10px;
    border-radius: 6px; font-size: 11px; font-weight: 700;
  }
  .verdict.high {
    background: var(--green-bg); color: var(--green);
    border: 1px solid #a7f3d0;
  }

  .note {
    font-size: 12px; color: var(--text-dim);
    margin-top: 6px; line-height: 1.5; font-style: italic;
  }
  hr.sep { border: none; border-top: 1px solid var(--border-light); margin: 8px 0; }

  .footer {
    text-align: center; padding: 28px 0;
    font-size: 12px; color: var(--text-dim);
  }
  .footer a { color: var(--accent); text-decoration: none; }
  .footer strong { color: var(--text-secondary); }

  /* v3.2.0 skill decisions callout: shows the agent's internal reasoning
     alongside the agent's conversational turn. The outer .turn-group grid
     places the chat on the left and the callout on the right so the reader
     can follow the conversation without scrolling past the machinery.
  */
  .turn-group {
    display: grid;
    grid-template-columns: minmax(0, 780px) 360px;
    gap: 18px;
    align-items: start;
    position: relative;
  }
  .turn-group > .turn { min-width: 0; }
  /* Pointer arrow on the callout, pointing back at the agent bubble */
  .turn-group > .skill-panel {
    position: relative;
  }
  .turn-group > .skill-panel::before {
    content: "";
    position: absolute;
    left: -9px; top: 18px;
    width: 0; height: 0;
    border-top: 8px solid transparent;
    border-bottom: 8px solid transparent;
    border-right: 9px solid #0f172a;
  }
  /* On narrow screens, collapse to the stacked layout. */
  @media (max-width: 980px) {
    .turn-group {
      display: block;
    }
    .turn-group > .skill-panel {
      margin-top: 10px;
      margin-left: 42px;
    }
    .turn-group > .skill-panel::before { display: none; }
  }

  .skill-panel {
    background: #0f172a;
    color: #cbd5e1;
    border-radius: var(--radius);
    padding: 14px 16px;
    font-family: 'SF Mono', 'Consolas', monospace;
    font-size: 11.5px;
    line-height: 1.55;
    box-shadow: var(--shadow);
    border: 1px solid #1e293b;
    overflow-wrap: anywhere;
    min-width: 0;
    /* Sticks near the top of the group on long agent bubbles so the callout
       stays visually tied to the turn it annotates.
    */
    position: sticky; top: 74px;
  }
  .skill-panel .sp-header {
    font-family: -apple-system, 'Segoe UI', system-ui, sans-serif;
    font-size: 9.5px; font-weight: 700;
    text-transform: uppercase; letter-spacing: 1px;
    color: #64748b;
    margin-bottom: 8px;
    display: flex; align-items: center; gap: 8px;
  }
  .skill-panel .sp-header::before {
    content: "";
    width: 6px; height: 6px; border-radius: 50%;
    background: #38bdf8;
    box-shadow: 0 0 6px #38bdf8;
  }
  .skill-panel .sp-section {
    margin-top: 10px;
    padding-top: 8px;
    border-top: 1px dashed #1e293b;
  }
  .skill-panel .sp-section:first-of-type {
    margin-top: 0; padding-top: 0; border-top: none;
  }
  .skill-panel .sp-title {
    font-family: -apple-system, 'Segoe UI', system-ui, sans-serif;
    font-size: 11px; font-weight: 700;
    color: #e2e8f0;
    margin-bottom: 4px;
  }
  .skill-panel .sp-pass { color: #34d399; } /* green check */
  .skill-panel .sp-fire { color: #fb923c; } /* orange flame */
  .skill-panel .sp-skip { color: #64748b; } /* dim skip */
  .skill-panel .sp-kv {
    color: #94a3b8;
  }
  .skill-panel .sp-kv b { color: #e2e8f0; font-weight: 700; }
  .skill-panel a {
    color: #38bdf8; text-decoration: none;
  }

  body.capture .toolbar { display: none; }
  @media print {
    .toolbar { display: none !important; }
    body { background: white; }
    .bubble { box-shadow: none; }
  }
</style>
</head>
<body>

<div class="toolbar">
  <div class="logo">PyOD <em>V3</em></div>
  <span class="version">Agentic</span>
  <div class="spacer"></div>
  <span style="display:inline-flex; align-items:center; gap:5px; font-size:12px; color:var(--green); background:var(--green-bg); padding:3px 10px; border-radius:12px; border:1px solid #a7f3d0; font-weight:600;">
    <span style="width:5px; height:5px; background:var(--green); border-radius:50%; display:inline-block;"></span>
    od-expert skill
    active
  </span>
  <button onclick="document.body.classList.toggle('capture')">CAPTURE</button>
</div>

<div class="container">
  <div class="header">
    <h1>Any AI Agent Becomes an <em>OD Expert</em></h1>
    <p class="subtitle">When a user asks about anomalies in their data, PyOD's <code>od-expert</code> skill auto-activates. The v3.2.0 skill encodes a master decision tree, top-10 critical pitfalls, and 11 adaptive escalation triggers. The shaded panels beside the agent's turns (stacked below them on narrow screens) show the skill's live decision-making.</p>
    <div class="dataset-badge"><span class="dot"></span> Live run on Diabetes Screening dataset (768 patients, 8 clinical features)</div>
    <div class="layers">
      <span class="layer-chip l1">Layer 1: fit / predict</span>
      <span class="layer-chip l2">Layer 2: ADEngine</span>
      <span class="layer-chip l3">Layer 3: Agentic (v3.2.0 deep skill)</span>
    </div>
  </div>

  <div class="chat">

    <!-- Turn 1: User asks -->
    <div class="turn user">
      <div class="avatar">U</div>
      <div class="bubble">
        <div class="label">User</div>
        I have a diabetes screening dataset: 768 patients with 8 clinical features (glucose, blood pressure, BMI, insulin, age, etc.). Find unusual patient profiles that might need follow-up. This is a medical screening context.
      </div>
    </div>

    <!-- Turn 1: Agent response + its skill-decisions callout (side by side) -->
    <div class="turn-group">

      <div class="turn agent">
        <div class="avatar">A</div>
        <div class="bubble">
          <div class="label">Agent <span class="step-tag">INVESTIGATE</span> <span style="display:inline-block; background:var(--purple); color:white; font-size:9px; font-weight:700; padding:1px 7px; border-radius:4px; margin-left:4px; letter-spacing:0.5px; vertical-align:middle;">od-expert skill v3.2.0</span></div>
          <em style="color:var(--text-dim); font-size:12px;">→ Anomaly-detection intent on tabular medical data.
          Activating <code>od-expert</code>.</em>

          <div style="margin-top:8px;">Profiled as <strong>tabular, low-dim, n=768, 8 features</strong>. Because this is a medical screening context, I will run the full consensus and flag the scale-mismatch caveat in the final report.</div>

          <div class="code">
            <span class="kw">import</span> numpy <span class="kw">as</span> np<br>
            <span class="kw">from</span> pyod.utils.ad_engine <span class="kw">import</span> ADEngine<br>
            np.random.<span class="fn">seed</span>(<span class="num">42</span>)<br>
            data = np.<span class="fn">genfromtxt</span>(<span class="str">'examples/data/pima.csv'</span>, delimiter=<span class="str">','</span>, skip_header=<span class="num">1</span>)<br>
            X = data[:, :-<span class="num">1</span>]<br>
            engine = <span class="fn">ADEngine</span>()<br>
            state = engine.<span class="fn">investigate</span>(X)
          </div>

          <div class="bench-note">Routing informed by <span class="cite">ADBench (NeurIPS 2022)</span>: 30 algorithms on 57 tabular datasets.
          For n &lt; 1000 low-dim, KNN / LOF tie with IForest in ADBench top-5.</div>

          <div class="det-selection">
            <div class="det-card">
              <span class="rank">#1</span>
              <span class="name">KNN</span>
              <span class="desc">K-Nearest Neighbors, proximity-based, confidence 0.85</span>
            </div>
            <div class="det-card">
              <span class="rank">#2</span>
              <span class="name">IForest</span>
              <span class="desc">Isolation Forest, ADBench top-5, confidence 0.85</span>
            </div>
            <div class="det-card">
              <span class="rank">#3</span>
              <span class="name">LOF</span>
              <span class="desc">Local Outlier Factor, local density, confidence 0.80</span>
            </div>
          </div>

          Running all 3 in parallel…

          <div class="metrics">
            <div class="metric"><div class="val amber">62</div><div class="lbl">Flagged (8.1%)</div></div>
            <div class="metric"><div class="val green">0.96</div><div class="lbl">Separation</div></div>
            <div class="metric"><div class="val blue">0.59</div><div class="lbl">Agreement</div></div>
            <div class="metric"><div class="val green">HIGH</div><div class="lbl">Quality (0.79)</div></div>
          </div>

          <strong>Top case #13</strong>: all 3 detectors agree; consensus score 0.997 (100th percentile).
          KNN was the strongest individually:

          <table class="score-table">
            <tr>
              <th>Detector</th><th>Raw score</th><th>Percentile</th>
            </tr>
            <tr>
              <td class="det-name">KNN</td>
              <td class="num above">304.13</td>
              <td class="num pct">100th</td>
            </tr>
            <tr>
              <td class="det-name">IForest</td>
              <td class="num above">0.15</td>
              <td class="num pct">99.9th</td>
            </tr>
            <tr>
              <td class="det-name">LOF</td>
              <td class="num above">3.41</td>
              <td class="num pct">99.8th</td>
            </tr>
          </table>

        </div>
      </div>

      <!-- Skill-decisions callout: what the skill walked before and after the run -->
      <aside class="skill-panel">
        <div class="sp-header">od-expert · decisions on this turn</div>

        <div class="sp-section">
          <div class="sp-title">Modality triage</div>
          Sequential? <span class="sp-skip">no</span>. Graph? <span class="sp-skip">no</span>.
          Text/image? <span class="sp-skip">no</span>. → <b>tabular</b>.
          Loaded <a href="#">references/tabular.md</a>.
        </div>

        <div class="sp-section">
          <div class="sp-title">Top-10 pitfalls (pre-run)</div>
          <span class="sp-fire">⚠ #1 Unscaled features</span>: per-feature σ ratio = <b style="color:#fb923c;">347×</b> (insulin σ=115, pedigree σ=0.33). KNN / LOF need scaled input. <span class="sp-kv">Note for report caveat.</span><br>
          <span class="sp-pass">✓ #2</span> contamination default (will check post-run).<br>
          <span class="sp-pass">✓ #3</span> DL × tiny data: n=768 &lt; 1000 floor → skip AutoEncoder / VAE / DeepSVDD.<br>
          <span class="sp-pass">✓ #4</span> PyG not needed.<br>
          <span class="sp-pass">✓ #5</span> all 8 features numeric.<br>
          <span class="sp-pass">✓ #6–10</span> all pass.
        </div>

        <div class="sp-section">
          <div class="sp-title">Escalation triggers (11 checked)</div>
          <span class="sp-fire">🚨 T8 High-stakes domain</span>: user said "<b>medical screening</b>" → <b>FIRES</b>.
          Committing to dual-detector validation and a confidence caveat in the final report.<br>
          <span class="sp-skip">⊗ T1–T7, T9–T11</span>: not applicable on this turn.
        </div>

        <div class="sp-section">
          <div class="sp-title">Plan → state.plans</div>
          <b>KNN</b> (0.85), <b>IForest</b> (0.85), <b>LOF</b> (0.80).<br>
          Running all three in parallel.
        </div>

        <div class="sp-section">
          <div class="sp-title">Post-run triggers (T3, T4, T10)</div>
          <span class="sp-pass">✓ T3</span> agreement = <b>0.59</b> &gt; 0.4 floor.<br>
          <span class="sp-pass">✓ T4</span> separation = <b>0.96</b>, stability = <b>0.81</b>, overall = <b>0.79</b> (verdict: <b>high</b>).<br>
          <span class="sp-pass">✓ T10</span> not over-tight.<br>
          <span class="sp-kv"><b>state.next_action['action']</b> = 'report_to_user'.</span>
        </div>
      </aside>
    </div>

    <!-- Turn 2: features -->
    <div class="turn user">
      <div class="avatar">U</div>
      <div class="bubble">
        <div class="label">User</div>
        What clinical features are driving the flags?
      </div>
    </div>

    <div class="turn agent">
      <div class="avatar">A</div>
      <div class="bubble">
        <div class="label">Agent <span class="step-tag">EXPLAIN</span></div>
        Analyzing the top anomaly (<strong>patient #13</strong>) via <strong>KNN</strong>.
        Absolute z-score magnitudes from <code>engine.explain_findings</code>:

        <table class="feat-table">
          <tr><th>Feature</th><th>|z-score|</th><th>Raw value</th></tr>
          <tr><td class="feat">insulin</td><td class="z">6.65</td><td class="val">846 µU/mL</td></tr>
          <tr><td class="feat">age</td><td class="z">2.19</td><td class="val">59 yrs</td></tr>
          <tr><td class="feat">glucose</td><td class="z">2.13</td><td class="val">189 mg/dL</td></tr>
          <tr><td class="feat">pregnancies</td><td class="z">0.84</td><td class="val">1</td></tr>
        </table>

        <p class="note">Note: <code>engine.explain_findings</code> returns absolute z-score magnitudes; the raw values above are cross-checked manually to show direction. Extreme insulin (|z|=6.65), elevated glucose, and older age together form a metabolic-syndrome profile that warrants individual clinical review. Other top-flagged patients (#228, #247, #579, #177) each have their own contributing-feature ranking; insulin dominates several but is not a universal signature.</p>

        <div class="code">
          best_result = state.results[state.analysis[<span class="str">'best_detector_index'</span>]]<br>
          explanations = engine.<span class="fn">explain_findings</span>(best_result, X=X, top_k=<span class="num">5</span>)
        </div>
      </div>
    </div>

    <!-- Turn 3: User reveals labels -->
    <div class="turn user">
      <div class="avatar">U</div>
      <div class="bubble">
        <div class="label">User</div>
        I should mention: I actually have diabetes diagnosis labels for the full cohort.
      </div>
    </div>

    <!-- Turn 3: Agent escalation + Trigger 5 callout -->
    <div class="turn-group">

      <div class="turn agent">
        <div class="avatar">A</div>
        <div class="bubble">
          <div class="label">Agent <span class="step-tag">ESCALATE · T5</span></div>
          <em style="color:var(--text-dim); font-size:12px;">→ Trigger 5 fired: labels mentioned but not provided.</em>

          <div style="margin-top:8px;">That changes the plan. Supervised detection (with your labels as training signal) typically lifts AUC 10–20 points over unsupervised on a labelled cohort like this. The <code>ADEngine</code> session API is unsupervised in v3.2.x, so I would drop out of the session and call <code>XGBOD</code> directly on the classic fit / predict path:</div>

          <div class="code">
            <span class="kw">from</span> pyod.models.xgbod <span class="kw">import</span> XGBOD<br>
            clf = <span class="fn">XGBOD</span>()<br>
            clf.<span class="fn">fit</span>(X, y_labels) <span class="cm"># supervised</span><br>
            scores = clf.<span class="fn">decision_function</span>(X)<br>
            labels_pred = clf.<span class="fn">predict</span>(X)
          </div>

          For now, I will finalise the unsupervised report and flag the label-availability note so you can switch to <code>XGBOD</code> later if the labels are validated.
        </div>
      </div>

      <!-- Skill-decisions callout: Trigger 5 fires on new information -->
      <aside class="skill-panel">
        <div class="sp-header">od-expert · trigger re-check</div>

        <div class="sp-section">
          <div class="sp-title">New information → re-walk triggers</div>
          <span class="sp-fire">🚨 T5 Labels mentioned but not provided</span>:
          <b>FIRES</b>. User has labels but was using the unsupervised session path.
          Pausing to recommend a different entry point.<br>
          <span class="sp-kv">Reasoning: supervised methods typically lift AUC 10–20 points over unsupervised on labelled datasets.</span>
        </div>

        <div class="sp-section">
          <div class="sp-title">Recommended switch</div>
          Drop out of <code>ADEngine</code> session (unsupervised only in v3.2.x) and call
          <b>XGBOD</b> directly via its classic <code>fit(X, y)</code> / <code>predict(X)</code> interface.<br>
          <span class="sp-kv">Requires <b>pyod[xgboost]</b>. Verified via <code>engine.explain_detector('XGBOD')['requires']</code> → <code>['xgboost']</code>.</span>
        </div>
      </aside>
    </div>

    <!-- Turn 4: Final report -->
    <div class="turn user">
      <div class="avatar">U</div>
      <div class="bubble">
        <div class="label">User</div>
        Just give me the report for now.
      </div>
    </div>

    <div class="turn agent">
      <div class="avatar">A</div>
      <div class="bubble">
        <div class="label">Agent <span class="step-tag">REPORT</span></div>

        <div style="font-size: 15px; font-weight: 700; margin-bottom: 10px;">Diabetes Screening: Anomaly Report</div>

        <div style="font-size: 12px; font-weight: 600; color: var(--accent); margin-bottom: 6px;">Quality Assessment</div>
        <div class="qbar">
          <span class="qlbl">Separation</span>
          <div class="qtrack"><div class="qfill" style="width:96%; background: var(--green);"></div></div>
          <span class="qval" style="color: var(--green);">0.96</span>
        </div>
        <div class="qbar">
          <span class="qlbl">Agreement</span>
          <div class="qtrack"><div class="qfill" style="width:59%; background: var(--accent);"></div></div>
          <span class="qval" style="color: var(--accent);">0.59</span>
        </div>
        <div class="qbar">
          <span class="qlbl">Stability</span>
          <div class="qtrack"><div class="qfill" style="width:81%; background: var(--purple);"></div></div>
          <span class="qval" style="color: var(--purple);">0.81</span>
        </div>

        <hr class="sep">

        <div style="font-size: 13px; line-height: 1.7;">
          <strong>Dataset:</strong> Diabetes Screening (768 × 8)<br>
          <strong>Detectors:</strong> KNN, IForest, LOF (3 / 3 converged)<br>
          <strong>Best detector:</strong> KNN (highest correlation with consensus)<br>
          <strong>Final flagged (consensus):</strong> 62 patients (8.1%)<br>
          <strong>Quality:</strong> <span class="verdict high">HIGH (0.79)</span>
        </div>

        <hr class="sep">

        <div style="font-size: 12px; background: var(--amber-light); border: 1px solid #fcd34d; border-radius: 6px; padding: 10px; margin: 8px 0;">
          <strong style="color: var(--amber);">⚠ Assumptions &amp; caveats</strong><br>
          <span style="color: var(--text-secondary);">
            • <b>Scale mismatch</b>: per-feature std ratio = 347×. LOF / KNN results may be biased toward the insulin axis. For a follow-up run, apply <code>RobustScaler</code> first.<br>
            • <b>Contamination</b>: default 0.1, observed 0.081 (<code>state.analysis['consensus_analysis']['anomaly_ratio']</code>). Close enough for unsupervised screening; adjust if you have domain knowledge.<br>
            • <b>High-stakes domain</b>: medical screening context triggered dual-detector validation (Pitfall 7 → always use ≥ 2 detectors). All 3 detectors independently flagged the top cases.<br>
            • <b>Label availability</b>: user mentioned labels exist. For labelled data, dropping out of the <code>ADEngine</code> session and calling <code>XGBOD</code> (classic fit / predict, <code>pyod[xgboost]</code> extra) typically lifts AUC 10–20 points over unsupervised.
          </span>
        </div>

        <hr class="sep">

        <div style="font-size: 11px; color: var(--text-dim); font-style: italic;">
          Reference numbers (agent did not see ground truth): precision 33 / 62 (53.2%), recall 33 / 268 (12.3%). The low recall reflects the large contamination gap: the true prevalence is 268 / 768 (34.9%), far above the 8.1% flagged by the unsupervised threshold.
        </div>
      </div>
    </div>

  </div>

  <div class="footer">
    <strong>PyOD V3.2.0</strong> · 61 detectors · 5 modalities · od-expert skill (1000 lines: SKILL.md + 6 references)<br>
    <a href="https://github.com/yzhao062/pyod">github.com/yzhao062/pyod</a> ·
    <a href="https://pyod.readthedocs.io">pyod.readthedocs.io</a> ·
    <code style="font-size: 12px;">pip install pyod &amp;&amp; pyod install skill</code>
  </div>
</div>

</body>
</html>