   1  <!doctype html>
   2  <html lang="en">
   3    <head>
   4      <meta charset="UTF-8" />
   5      <meta name="viewport" content="width=device-width, initial-scale=1.0" />
   6      <title>Scoring System Research &amp; Design — 333 Method</title>
   7      <style>
   8        :root {
   9          --bg: #fafafa;
  10          --fg: #1a1a1a;
  11          --accent: #2563eb;
  12          --border: #e5e7eb;
  13          --code-bg: #f3f4f6;
  14          --table-stripe: #f9fafb;
  15          --blockquote-border: #d1d5db;
  16          --blockquote-bg: #f9fafb;
  17        }
  18        @media (prefers-color-scheme: dark) {
  19          :root {
  20            --bg: #111827;
  21            --fg: #e5e7eb;
  22            --accent: #60a5fa;
  23            --border: #374151;
  24            --code-bg: #1f2937;
  25            --table-stripe: #1f2937;
  26            --blockquote-border: #4b5563;
  27            --blockquote-bg: #1f2937;
  28          }
  29        }
  30        * {
  31          box-sizing: border-box;
  32          margin: 0;
  33          padding: 0;
  34        }
  35        body {
  36          font-family:
  37            -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, 'Helvetica Neue', Arial, sans-serif;
  38          line-height: 1.7;
  39          color: var(--fg);
  40          background: var(--bg);
  41          max-width: 52rem;
  42          margin: 0 auto;
  43          padding: 2rem 1.5rem 4rem;
  44        }
  45        h1 {
  46          font-size: 2rem;
  47          margin: 2rem 0 1rem;
  48          border-bottom: 2px solid var(--accent);
  49          padding-bottom: 0.5rem;
  50        }
  51        h2 {
  52          font-size: 1.5rem;
  53          margin: 2.5rem 0 0.75rem;
  54          border-bottom: 1px solid var(--border);
  55          padding-bottom: 0.4rem;
  56        }
  57        h3 {
  58          font-size: 1.2rem;
  59          margin: 1.5rem 0 0.5rem;
  60        }
  61        p {
  62          margin: 0.75rem 0;
  63        }
  64        a {
  65          color: var(--accent);
  66          text-decoration: none;
  67        }
  68        a:hover {
  69          text-decoration: underline;
  70        }
  71        ul,
  72        ol {
  73          margin: 0.5rem 0 0.5rem 1.5rem;
  74        }
  75        li {
  76          margin: 0.25rem 0;
  77        }
  78        code {
  79          font-family: 'SF Mono', 'Fira Code', 'JetBrains Mono', Consolas, monospace;
  80          font-size: 0.875em;
  81          background: var(--code-bg);
  82          padding: 0.15em 0.35em;
  83          border-radius: 4px;
  84        }
  85        pre {
  86          background: var(--code-bg);
  87          border: 1px solid var(--border);
  88          border-radius: 8px;
  89          padding: 1rem;
  90          overflow-x: auto;
  91          margin: 1rem 0;
  92        }
  93        pre code {
  94          background: none;
  95          padding: 0;
  96          font-size: 0.85em;
  97        }
  98        table {
  99          width: 100%;
 100          border-collapse: collapse;
 101          margin: 1rem 0;
 102          font-size: 0.95em;
 103        }
 104        th,
 105        td {
 106          border: 1px solid var(--border);
 107          padding: 0.5rem 0.75rem;
 108          text-align: left;
 109        }
 110        th {
 111          background: var(--code-bg);
 112          font-weight: 600;
 113        }
 114        tr:nth-child(even) {
 115          background: var(--table-stripe);
 116        }
 117        blockquote {
 118          border-left: 4px solid var(--blockquote-border);
 119          background: var(--blockquote-bg);
 120          padding: 0.75rem 1rem;
 121          margin: 1rem 0;
 122          border-radius: 0 8px 8px 0;
 123        }
 124        blockquote p {
 125          margin: 0.25rem 0;
 126        }
 127        hr {
 128          border: none;
 129          border-top: 1px solid var(--border);
 130          margin: 2rem 0;
 131        }
 132        @media print {
 133          body {
 134            max-width: 100%;
 135            padding: 1cm;
 136          }
 137          pre {
 138            white-space: pre-wrap;
 139          }
 140        }
 141      </style>
 142    </head>
 143    <body>
 144      <h1>Scoring System Research &amp; Design</h1>
 145      <p>
 146        This document captures the original deep-research conversation that produced the 333
 147        Method&#39;s website conversion scoring system. It covers the research foundations, rubric
 148        design rationale, factor weighting decisions, screenshot strategy, rescoring approach, contact
 149        extraction, and implementation considerations.
 150      </p>
 151      <blockquote>
 152        <p>
 153          <strong>Source:</strong> Open-WebUI chat with OpenRouter (January 2026), exported and
 154          consolidated.
 155        </p>
 156      </blockquote>
 157      <hr />
 158      <h2>Table of Contents</h2>
 159      <ol>
 160        <li><a href="#1-research-question">Research Question</a></li>
 161        <li><a href="#2-foundations-why-these-factors">Foundations: Why These Factors</a></li>
 162        <li><a href="#3-the-nine-factor-rubric">The Nine-Factor Rubric</a></li>
 163        <li><a href="#4-factor-weights--rationale">Factor Weights &amp; Rationale</a></li>
 164        <li>
 165          <a href="#5-score-calculation--grading-scale">Score Calculation &amp; Grading Scale</a>
 166        </li>
 167        <li><a href="#6-screenshot-strategy">Screenshot Strategy</a></li>
 168        <li><a href="#7-two-pass-architecture">Two-Pass Architecture</a></li>
 169        <li><a href="#8-contact-extraction-on-rescore">Contact Extraction on Rescore</a></li>
 170        <li><a href="#9-popover-handling">Popover Handling</a></li>
 171        <li><a href="#10-llm-prompt-design">LLM Prompt Design</a></li>
 172        <li><a href="#11-validation--calibration">Validation &amp; Calibration</a></li>
 173        <li><a href="#12-token-optimization">Token Optimization</a></li>
 174        <li><a href="#13-implementation-notes">Implementation Notes</a></li>
 175        <li><a href="#14-appendix-original-json-schemas">Appendix: Original JSON Schemas</a></li>
 176      </ol>
 177      <hr />
 178      <h2>1. Research Question</h2>
 179      <p>The original question that kicked off this research:</p>
 180      <blockquote>
 181        <p>
 182          Propose a scoring system for website conversion (with standard school grading of A+ to F)
 183          based on factors such as clear offer, CTA, urgency, hook, strong headline, strong value
 184          proposition, clear reason to choose them (USP), no generic stock photos, trust elements
 185          (reviews, badges, guarantees), and anything else from best practices.
 186        </p>
 187        <p>
 188          This will be provided to an LLM to calculate the score, along with the HTML of the DOM after
 189          pageload and one or more screenshots. Please advise whether this scoring system will require
 190          a full-page screenshot, or will just an above-the-fold and maybe the first below-the-fold
 191          screenshots be sufficient to produce a reasonable score? This system will be scoring many
 192          hundreds of thousands of websites, so minimising LLM token usage is more important than
 193          score accuracy.
 194        </p>
 195      </blockquote>
 196      <hr />
 197      <h2>2. Foundations: Why These Factors</h2>
 198      <p>
 199        The scoring factors were selected by synthesizing established CRO (Conversion Rate
 200        Optimization) best practices, prioritization frameworks (RICE, PIE), and behavioral psychology
 201        research. The factors map to three broad categories of conversion influence:
 202      </p>
 203      <h3>Messaging Clarity &amp; Value Communication</h3>
 204      <ul>
 205        <li>
 206          <strong>Headline Quality</strong> — The primary hook; must communicate what, who, and why
 207          within 3-5 seconds
 208        </li>
 209        <li>
 210          <strong>Value Proposition</strong> — Extends the headline; shifts from features to benefits
 211          (&quot;what&#39;s in it for me?&quot;)
 212        </li>
 213        <li>
 214          <strong>Unique Selling Proposition</strong> — Why choose <em>this</em> option over
 215          alternatives
 216        </li>
 217        <li>
 218          <strong>Clear Offer</strong> — What exactly is the visitor being asked to do, and what do
 219          they get
 220        </li>
 221      </ul>
 222      <h3>User Confidence &amp; Trust</h3>
 223      <ul>
 224        <li>
 225          <strong>Trust &amp; Credibility Signals</strong> — Testimonials, certifications, badges,
 226          partner logos, media mentions
 227        </li>
 228        <li>
 229          <strong>Authentic Imagery</strong> — Real product photos vs. generic stock; professional
 230          visual design
 231        </li>
 232      </ul>
 233      <h3>Action &amp; Engagement</h3>
 234      <ul>
 235        <li><strong>Call-to-Action</strong> — Copy clarity, visual prominence, and placement</li>
 236        <li>
 237          <strong>Urgency/Scarcity</strong> — Legitimate time/supply pressure for immediate action
 238        </li>
 239        <li>
 240          <strong>Hook &amp; Engagement</strong> — Hero element that captures attention in the first
 241          seconds
 242        </li>
 243      </ul>
 244      <h3>Additional Context</h3>
 245      <ul>
 246        <li>
 247          <strong>Industry Appropriateness</strong> (3% weight) — Whether design serves its specific
 248          business model context (B2B SaaS vs. e-commerce vs. local services have different norms)
 249        </li>
 250      </ul>
 251      <h3>Key Research Findings</h3>
 252      <ul>
 253        <li>Users spend 57-80% of viewing time on above-the-fold content (Nielsen Norman Group)</li>
 254        <li>
 255          Google found that above-fold ads achieve 73% viewability vs. 44% below-fold, a relative
 256          gap of about 66% (the &quot;fold cliff&quot;)
 257        </li>
 258        <li>
 259          90% of users begin scrolling within 14 seconds, but
 260          <em>only if above-fold content signals value</em>
 261        </li>
 262        <li>CTA copy changes alone can generate conversion improvements exceeding 200%</li>
 263        <li>
 264          GPT-4 Vision studies found cropped images actually <em>outperform</em> full-page captures
 265          for identification tasks — background noise reduces accuracy
 266        </li>
 267      </ul>
 268      <hr />
 269      <h2>3. The Nine-Factor Rubric</h2>
 270      <p>
 271        Each factor is scored 0-10 against a specific rubric: nine core factors plus a low-weight
 272        tenth for context. This is the condensed rubric; the full version is in the LLM prompts.
 273      </p>
 274      <h3>Factor 1: Headline Quality &amp; Clarity (15%)</h3>
 275      <table>
 276        <thead>
 277          <tr>
 278            <th>Score</th>
 279            <th>Description</th>
 280          </tr>
 281        </thead>
 282        <tbody>
 283          <tr>
 284            <td>9-10</td>
 285            <td>
 286              Immediately communicates value; benefit-oriented; specific; creates curiosity or
 287              emotional connection
 288            </td>
 289          </tr>
 290          <tr>
 291            <td>7-8</td>
 292            <td>Clearly communicates basic benefit; mostly specific; adequate direction</td>
 293          </tr>
 294          <tr>
 295            <td>5-6</td>
 296            <td>Communicates a benefit but somewhat generic; requires modest interpretation</td>
 297          </tr>
 298          <tr>
 299            <td>3-4</td>
 300            <td>Vague, generic, or fails to communicate core benefit</td>
 301          </tr>
 302          <tr>
 303            <td>1-2</td>
 304            <td>Confusing, contradictory, or essentially absent above-fold</td>
 305          </tr>
 306          <tr>
 307            <td>0</td>
 308            <td>No discernible headline or actively confusing</td>
 309          </tr>
 310        </tbody>
 311      </table>
 312      <h3>Factor 2: Value Proposition Clarity (14%)</h3>
 313      <table>
 314        <thead>
 315          <tr>
 316            <th>Score</th>
 317            <th>Description</th>
 318          </tr>
 319        </thead>
 320        <tbody>
 321          <tr>
 322            <td>9-10</td>
 323            <td>Specific, benefit-oriented, compelling; clearly differentiates</td>
 324          </tr>
 325          <tr>
 326            <td>7-8</td>
 327            <td>Clear and benefit-focused; adequately articulates core benefits</td>
 328          </tr>
 329          <tr>
 330            <td>5-6</td>
 331            <td>Present but generic or feature-heavy; requires interpretation</td>
 332          </tr>
 333          <tr>
 334            <td>3-4</td>
 335            <td>Vague or feature-focused; unclear differentiation</td>
 336          </tr>
 337          <tr>
 338            <td>1-2</td>
 339            <td>Barely present or confused with feature lists</td>
 340          </tr>
 341          <tr>
 342            <td>0</td>
 343            <td>No value proposition or contradictory messaging</td>
 344          </tr>
 345        </tbody>
 346      </table>
 347      <h3>Factor 3: Unique Selling Proposition (13%)</h3>
 348      <table>
 349        <thead>
 350          <tr>
 351            <th>Score</th>
 352            <th>Description</th>
 353          </tr>
 354        </thead>
 355        <tbody>
 356          <tr>
 357            <td>9-10</td>
 358            <td>Clear, compelling differentiation; specific competitive advantage</td>
 359          </tr>
 360          <tr>
 361            <td>7-8</td>
 362            <td>Reasonably clear; specific advantage identified</td>
 363          </tr>
 364          <tr>
 365            <td>5-6</td>
 366            <td>Some differentiation implied but not explicit</td>
 367          </tr>
 368          <tr>
 369            <td>3-4</td>
 370            <td>Vague; relies on generic claims (&quot;best in class&quot;)</td>
 371          </tr>
 372          <tr>
 373            <td>1-2</td>
 374            <td>Barely present; no clear reasons to choose</td>
 375          </tr>
 376          <tr>
 377            <td>0</td>
 378            <td>No differentiation; appears identical to generic competitors</td>
 379          </tr>
 380        </tbody>
 381      </table>
 382      <h3>Factor 4: Call-to-Action Design &amp; Placement (13%)</h3>
 383      <table>
 384        <thead>
 385          <tr>
 386            <th>Score</th>
 387            <th>Description</th>
 388          </tr>
 389        </thead>
 390        <tbody>
 391          <tr>
 392            <td>9-10</td>
 393            <td>
 394              Visible above fold; specific action-oriented language; visually prominent; secondary
 395              CTAs at natural breaks
 396            </td>
 397          </tr>
 398          <tr>
 399            <td>7-8</td>
 400            <td>Visible above fold; action-oriented; reasonably prominent</td>
 401          </tr>
 402          <tr>
 403            <td>5-6</td>
 404            <td>
 405              Present; clear but generic (&quot;Submit&quot;, &quot;Learn More&quot;); adequate
 406              placement
 407            </td>
 408          </tr>
 409          <tr>
 410            <td>3-4</td>
 411            <td>Present but not prominent; vague language; requires scrolling</td>
 412          </tr>
 413          <tr>
 414            <td>1-2</td>
 415            <td>Hard to find, confusing, or inadequately prominent</td>
 416          </tr>
 417          <tr>
 418            <td>0</td>
 419            <td>No CTA or buried below multiple scrolls</td>
 420          </tr>
 421        </tbody>
 422      </table>
 423      <h3>Factor 5: Urgency &amp; Scarcity (10%)</h3>
 424      <table>
 425        <thead>
 426          <tr>
 427            <th>Score</th>
 428            <th>Description</th>
 429          </tr>
 430        </thead>
 431        <tbody>
 432          <tr>
 433            <td>9-10</td>
 434            <td>Legitimate urgency with specifics (deadline, count); genuine pressure</td>
 435          </tr>
 436          <tr>
 437            <td>7-8</td>
 438            <td>Clear mechanism; specific rather than vague</td>
 439          </tr>
 440          <tr>
 441            <td>5-6</td>
 442            <td>Some urgency suggested but lacks specifics</td>
 443          </tr>
 444          <tr>
 445            <td>3-4</td>
 446            <td>Vague (&quot;act soon&quot;, &quot;don&#39;t miss out&quot;) without details</td>
 447          </tr>
 448          <tr>
 449            <td>1-2</td>
 450            <td>Minimal or ineffective urgency</td>
 451          </tr>
 452          <tr>
 453            <td>0</td>
 454            <td>No urgency or false urgency undermining credibility</td>
 455          </tr>
 456        </tbody>
 457      </table>
 458      <h3>Factor 6: Hook &amp; Initial Engagement (9%)</h3>
 459      <table>
 460        <thead>
 461          <tr>
 462            <th>Score</th>
 463            <th>Description</th>
 464          </tr>
 465        </thead>
 466        <tbody>
 467          <tr>
 468            <td>9-10</td>
 469            <td>Visually compelling hero element; contextually relevant; strong engagement</td>
 470          </tr>
 471          <tr>
 472            <td>7-8</td>
 473            <td>Professional, relevant hero; adequate engagement</td>
 474          </tr>
 475          <tr>
 476            <td>5-6</td>
 477            <td>Present but generic; mild engagement</td>
 478          </tr>
 479          <tr>
 480            <td>3-4</td>
 481            <td>Dated, poorly executed, or tangentially relevant</td>
 482          </tr>
 483          <tr>
 484            <td>1-2</td>
 485            <td>Missing, poor quality, or detracting</td>
 486          </tr>
 487          <tr>
 488            <td>0</td>
 489            <td>No hook; purely text-based above-fold with no visual appeal</td>
 490          </tr>
 491        </tbody>
 492      </table>
 493      <h3>Factor 7: Trust &amp; Credibility Signals (11%)</h3>
 494      <table>
 495        <thead>
 496          <tr>
 497            <th>Score</th>
 498            <th>Description</th>
 499          </tr>
 500        </thead>
 501        <tbody>
 502          <tr>
 503            <td>9-10</td>
 504            <td>
 505              Multiple relevant elements (named testimonials, certifications, badges, logos);
 506              prominently placed
 507            </td>
 508          </tr>
 509          <tr>
 510            <td>7-8</td>
 511            <td>Several elements; specific testimonials or credible certifications</td>
 512          </tr>
 513          <tr>
 514            <td>5-6</td>
 515            <td>Some elements (generic testimonials or basic badges); adequate</td>
 516          </tr>
 517          <tr>
 518            <td>3-4</td>
 519            <td>Minimal; generic or lacking credibility</td>
 520          </tr>
 521          <tr>
 522            <td>1-2</td>
 523            <td>Nearly absent</td>
 524          </tr>
 525          <tr>
 526            <td>0</td>
 527            <td>No trust signals at all</td>
 528          </tr>
 529        </tbody>
 530      </table>
 531      <h3>Factor 8: Authentic Imagery &amp; Visual Design (8%)</h3>
 532      <table>
 533        <thead>
 534          <tr>
 535            <th>Score</th>
 536            <th>Description</th>
 537          </tr>
 538        </thead>
 539        <tbody>
 540          <tr>
 541            <td>9-10</td>
 542            <td>Authentic imagery (product photos, real customers); professional design</td>
 543          </tr>
 544          <tr>
 545            <td>7-8</td>
 546            <td>Mix of authentic and professional; solid design</td>
 547          </tr>
 548          <tr>
 549            <td>5-6</td>
 550            <td>Mostly professional with some stock; adequate; minor issues</td>
 551          </tr>
 552          <tr>
 553            <td>3-4</td>
 554            <td>Significant stock photos; dated design; unprofessional impression</td>
 555          </tr>
 556          <tr>
 557            <td>1-2</td>
 558            <td>Predominantly generic/low-quality; poor design</td>
 559          </tr>
 560          <tr>
 561            <td>0</td>
 562            <td>Broken images, extremely low-quality, or repelling</td>
 563          </tr>
 564        </tbody>
 565      </table>
 566      <h3>Factor 9: Clear Offer &amp; Specificity (4%)</h3>
 567      <table>
 568        <thead>
 569          <tr>
 570            <th>Score</th>
 571            <th>Description</th>
 572          </tr>
 573        </thead>
 574        <tbody>
 575          <tr>
 576            <td>9-10</td>
 577            <td>Specific, unambiguous; visitor knows exactly what they get</td>
 578          </tr>
 579          <tr>
 580            <td>7-8</td>
 581            <td>Clear and specific; minor ambiguity</td>
 582          </tr>
 583          <tr>
 584            <td>5-6</td>
 585            <td>Generally clear but could be more specific</td>
 586          </tr>
 587          <tr>
 588            <td>3-4</td>
 589            <td>Somewhat vague; visitor must infer details</td>
 590          </tr>
 591          <tr>
 592            <td>1-2</td>
 593            <td>Unclear or hard to determine</td>
 594          </tr>
 595          <tr>
 596            <td>0</td>
 597            <td>No discernible offer</td>
 598          </tr>
 599        </tbody>
 600      </table>
 601      <h3>Factor 10: Contextual Appropriateness (3%)</h3>
 602      <p>
 603        Evaluates whether design serves its industry/business model context. B2B SaaS, e-commerce, and
 604        local services have different CRO norms.
 605      </p>
 606      <hr />
 607      <h2>4. Factor Weights &amp; Rationale</h2>
 608      <p>
 609        The weights were derived from empirical research on correlation with actual conversion
 610        outcomes:
 611      </p>
 612      <table>
 613        <thead>
 614          <tr>
 615            <th>Factor</th>
 616            <th>Weight</th>
 617            <th>Rationale</th>
 618          </tr>
 619        </thead>
 620        <tbody>
 621          <tr>
 622            <td>Headline Quality</td>
 623            <td>15%</td>
 624            <td>
 625              Primary determinant of whether users engage or bounce; captures 80% of initial attention
 626            </td>
 627          </tr>
 628          <tr>
 629            <td>Value Proposition</td>
 630            <td>14%</td>
 631            <td>
 632              Extends headline; answers &quot;what&#39;s in it for me?&quot;; directly drives
 633              consideration
 634            </td>
 635          </tr>
 636          <tr>
 637            <td>USP/Differentiation</td>
 638            <td>13%</td>
 639            <td>Critical for competitive markets; answers &quot;why you over alternatives?&quot;</td>
 640          </tr>
 641          <tr>
 642            <td>CTA Design</td>
 643            <td>13%</td>
 644            <td>The conversion mechanism itself; changes to CTA alone can drive 200%+ improvement</td>
 645          </tr>
 646          <tr>
 647            <td>Trust Signals</td>
 648            <td>11%</td>
 649            <td>
 650              Addresses the fundamental &quot;is this trustworthy?&quot; concern; increasingly
 651              important as privacy concerns grow
 652            </td>
 653          </tr>
 654          <tr>
 655            <td>Urgency/Scarcity</td>
 656            <td>10%</td>
 657            <td>Drives immediate action vs. postponement; effective when legitimate</td>
 658          </tr>
 659          <tr>
 660            <td>Hook/Engagement</td>
 661            <td>9%</td>
 662            <td>First-impression visual; supports but doesn&#39;t replace messaging</td>
 663          </tr>
 664          <tr>
 665            <td>Imagery/Design</td>
 666            <td>8%</td>
 667            <td>
 668              Credibility signal; generic stock undermines trust but doesn&#39;t make or break
 669              conversion
 670            </td>
 671          </tr>
 672          <tr>
 673            <td>Offer Clarity</td>
 674            <td>4%</td>
 675            <td>Important but usually redundant with headline + CTA when those are strong</td>
 676          </tr>
 677          <tr>
 678            <td>Context</td>
 679            <td>3%</td>
 680            <td>Catch-all for industry-specific norms</td>
 681          </tr>
 682          <tr>
 683            <td><strong>Total</strong></td>
 684            <td><strong>100%</strong></td>
 685            <td></td>
 686          </tr>
 687        </tbody>
 688      </table>
 689      <p>
 690        The top 4 factors (headline, value prop, USP, CTA) account for 55% of the score. This reflects
 691        the research consensus that messaging clarity and the conversion mechanism are the dominant
 692        drivers.
 693      </p>
 694      <hr />
 695      <h2>5. Score Calculation &amp; Grading Scale</h2>
 696      <h3>Formula</h3>
 697      <pre><code>Overall Score = (Headline × 0.15) + (Value Prop × 0.14) + (USP × 0.13) + (CTA × 0.13)
 698                + (Urgency × 0.10) + (Hook × 0.09) + (Trust × 0.11) + (Imagery × 0.08)
 699                + (Offer × 0.04) + (Context × 0.03)
 700  </code></pre>
 701      <p>Each factor is scored 0-10, so the weighted sum is also 0-10; multiplying it by 10 gives the 0-100 overall score.</p>
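          <p>
            The calculation can be sketched in a few lines of Python. This is a minimal
            illustration, not the production code: the weights come from the Factor Weights
            table, while the dictionary keys and the <code>overall_score</code> function name
            are assumptions made for the example.
          </p>

```python
# Weights copied from the Factor Weights table. The key names are
# illustrative and are not identifiers from the production code.
WEIGHTS = {
    "headline": 0.15,
    "value_proposition": 0.14,
    "usp": 0.13,
    "cta": 0.13,
    "trust": 0.11,
    "urgency": 0.10,
    "hook": 0.09,
    "imagery": 0.08,
    "offer": 0.04,
    "context": 0.03,
}


def overall_score(factor_scores: dict) -> float:
    """Return the 0-100 overall score from per-factor scores on a 0-10 scale."""
    missing = WEIGHTS.keys() - factor_scores.keys()
    if missing:
        raise ValueError(f"missing factor scores: {sorted(missing)}")
    for name in WEIGHTS:
        if not 0 <= factor_scores[name] <= 10:
            raise ValueError(f"{name} must be between 0 and 10")
    # Weighted sum stays on the 0-10 scale because the weights sum to 1.
    weighted_sum = sum(WEIGHTS[name] * factor_scores[name] for name in WEIGHTS)
    # Multiply by 10 to move from the 0-10 scale to 0-100.
    return round(weighted_sum * 10, 1)
```

          <p>
            Rounding to one decimal place is a choice made for this sketch; the source does not
            say how the production system rounds.
          </p>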
 702      <h3>Grading Scale</h3>
 703      <p>
 704        The original research used a standard academic scale. The production system now uses a
 705        business-oriented scale:
 706      </p>
 707      <table>
 708        <thead>
 709          <tr>
 710            <th>Grade</th>
 711            <th>Score Range</th>
 712            <th>Interpretation</th>
 713          </tr>
 714        </thead>
 715        <tbody>
 716          <tr>
 717            <td>A+</td>
 718            <td>95-100</td>
 719            <td>Exceptional conversion design</td>
 720          </tr>
 721          <tr>
 722            <td>A</td>
 723            <td>90-94</td>
 724            <td>Excellent; well-executed fundamentals</td>
 725          </tr>
 726          <tr>
 727            <td>A-</td>
 728            <td>85-89</td>
 729            <td>Very good; minor weaknesses</td>
 730          </tr>
 731          <tr>
 732            <td>B+</td>
 733            <td>83-84</td>
 734            <td>Good; some friction or messaging issues</td>
 735          </tr>
 736          <tr>
 737            <td>B</td>
 738            <td>82</td>
 739            <td>Satisfactory; multiple improvement opportunities</td>
 740          </tr>
 741          <tr>
 742            <td>B-</td>
 743            <td>70-81</td>
 744            <td>Below average but acceptable</td>
 745          </tr>
 746          <tr>
 747            <td>C</td>
 748            <td>50-69</td>
 749            <td>Marginal; substantial improvements needed</td>
 750          </tr>
 751          <tr>
 752            <td>D</td>
 753            <td>30-49</td>
 754            <td>Poor; critical issues present</td>
 755          </tr>
 756          <tr>
 757            <td>E</td>
 758            <td>0-29</td>
 759            <td>Fundamentally broken</td>
 760          </tr>
 761          <tr>
 762            <td>F</td>
 763            <td>Negative</td>
 764            <td>Should not occur in practice</td>
 765          </tr>
 766        </tbody>
 767      </table>
 768      <blockquote>
 769        <p>
 770          <strong>Note:</strong> The production grading scale (above) differs from the original
 771          academic scale proposed in the research (which had C+/C-/D+/D- subdivisions). The business
 772          scale was adopted because the scoring system is used to identify prospects who need help,
 773          not to give academic grades. See <code>src/score.js:computeGrade()</code> for the production
 774          implementation.
 775        </p>
 776      </blockquote>
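          <p>
            As a rough illustration of the table above, a score can be mapped to a grade with a
            descending list of lower bounds. This sketch is not
            <code>src/score.js:computeGrade()</code>; the names <code>GRADE_BANDS</code> and
            <code>grade_for</code> are hypothetical, and the handling of fractional scores (each
            band starts at its listed minimum) is an assumption, since the table lists whole
            numbers only.
          </p>

```python
# (minimum score, grade) pairs taken from the production grading table,
# ordered from the highest band to the lowest.
GRADE_BANDS = [
    (95, "A+"),
    (90, "A"),
    (85, "A-"),
    (83, "B+"),
    (82, "B"),
    (70, "B-"),
    (50, "C"),
    (30, "D"),
    (0, "E"),
]


def grade_for(score: float) -> str:
    """Map a 0-100 overall score to a letter grade."""
    if score > 100:
        raise ValueError("score must not exceed 100")
    for minimum, grade in GRADE_BANDS:
        if score >= minimum:
            return grade
    # Only negative scores reach this point; the table marks them as F.
    return "F"
```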
 777      <hr />
 778      <h2>6. Screenshot Strategy</h2>
 779      <h3>Key Decision: Above-the-Fold Is Sufficient</h3>
 780      <p>
 781        The research concluded that
 782        <strong
 783          >above-the-fold and first below-the-fold screenshots are substantially sufficient for
 784          reliable scoring</strong
 785        >, reducing token consumption by 65-75% compared to full-page screenshots while maintaining
 786        evaluation accuracy above 90%.
 787      </p>
 788      <h3>Recommended Approach</h3>
 789      <ol>
 790        <li><strong>Primary:</strong> Desktop above-the-fold (1920x1080)</li>
 791        <li>
 792          <strong>Secondary:</strong> Mobile above-the-fold (375x667) —
 793          <em>later dropped in production for cost reasons</em>
 794        </li>
 795        <li>
 796          <strong>Conditional:</strong> Below-the-fold screenshot if initial score is low (rescoring
 797          pass)
 798        </li>
 799      </ol>
 800      <h3>Evidence</h3>
 801      <ul>
 802        <li>Above-fold content captures 57-80% of user viewing time</li>
 803        <li>The 9 scoring factors cluster heavily above the fold on well-designed pages</li>
 804        <li>
 805          GPT-4 Vision cropping research showed focused images <em>improve</em> accuracy by removing
 806          noise
 807        </li>
 808        <li>Token savings: ~1,000 tokens per above-fold image vs. ~2,000 for full-page</li>
 809        <li>At 500,000 websites: ~1 billion fewer tokens consumed, assuming two screenshots per site at ~1,000 tokens saved each</li>
 810      </ul>
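          <p>
            The token arithmetic behind the last two bullets can be checked with a small helper.
            The per-image figures are the approximate ones quoted in the list; the function name
            and the <code>screenshots_per_site</code> parameter are assumptions added for the
            example. Under these estimates, one screenshot per site saves about 500 million
            tokens at 500,000 sites, and two screenshots per site save about 1 billion.
          </p>

```python
def tokens_saved(
    sites: int,
    full_page_tokens: int = 2_000,    # approximate figure quoted above
    above_fold_tokens: int = 1_000,   # approximate figure quoted above
    screenshots_per_site: int = 1,    # assumption: not stated in the source
) -> int:
    """Estimate tokens saved by sending above-fold instead of full-page images."""
    saved_per_image = full_page_tokens - above_fold_tokens
    return sites * screenshots_per_site * saved_per_image
```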
 811      <h3>Production Implementation</h3>
 812      <p>In the production system (<code>src/capture.js</code>):</p>
 813      <ul>
 814        <li>Desktop screenshot captured at page load (cropped + uncropped variants)</li>
 815        <li>DOM-aware intelligent cropping preserves CTAs, trust signals, hero imagery</li>
 816        <li>Cropped version saves 20-35% additional LLM tokens</li>
 817        <li>Below-fold screenshot captured separately for rescoring pass</li>
 818        <li>Mobile screenshot was dropped for cost efficiency</li>
 819      </ul>
 820      <hr />
 821      <h2>7. Two-Pass Architecture</h2>
 822      <h3>Design Decision: Conditional Resubmission</h3>
 823      <p>The research compared three approaches for handling below-the-fold content:</p>
 824      <table>
 825        <thead>
 826          <tr>
 827            <th>Approach</th>
 828            <th>Token Cost (100K sites, 30% low-scoring)</th>
 829          </tr>
 830        </thead>
 831        <tbody>
 832          <tr>
 833            <td><strong>Conditional resubmission</strong></td>
 834            <td>174M tokens</td>
 835          </tr>
 836          <tr>
 837            <td>Always include below-fold</td>
 838            <td>180M tokens</td>
 839          </tr>
 840          <tr>
 841            <td>Include with &quot;ignore if unnecessary&quot;</td>
 842            <td>185M tokens</td>
 843          </tr>
 844        </tbody>
 845      </table>
 846      <p><strong>Conditional resubmission wins</strong> because:</p>
 847      <ul>
 848        <li>
 849          Vision models charge for image tokens at input time regardless of whether the model
 850          &quot;uses&quot; the image
 851        </li>
 852        <li>
 853          Including an image and saying &quot;only look at it if needed&quot; does NOT save tokens
 854        </li>
 855        <li>The breakeven point is ~50% of sites scoring low; in practice only ~30% do</li>
 856      </ul>
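          <p>
            The comparison can be expressed as a simple cost model. The source gives only the
            totals, so the per-site token counts below are hypothetical values back-solved to
            reproduce the table (174M vs. 180M at 30% low-scoring) and the ~50% breakeven; they
            are not measured figures, and the function names are illustrative.
          </p>

```python
def conditional_cost(sites: int, low_fraction: float,
                     pass1_tokens: int, pass2_tokens: int) -> int:
    """Every site pays for Pass 1; only low scorers also pay for Pass 2."""
    return round(sites * (pass1_tokens + low_fraction * pass2_tokens))


def always_cost(sites: int, combined_tokens: int) -> int:
    """Every site pays for one request that includes the below-fold image."""
    return sites * combined_tokens


def breakeven_fraction(pass1_tokens: int, pass2_tokens: int,
                       combined_tokens: int) -> float:
    """Low-scoring fraction at which both strategies cost the same."""
    return (combined_tokens - pass1_tokens) / pass2_tokens


# Hypothetical per-site token counts, fitted to the table above.
PASS1_TOKENS = 1_650     # above-fold request
PASS2_TOKENS = 300       # extra cost of a rescoring request
COMBINED_TOKENS = 1_800  # single request with both screenshots
```

          <p>
            With these illustrative values, conditional resubmission is cheaper whenever fewer
            than half of the sites need a second pass, which matches the reasoning in the list
            above.
          </p>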
 857      <h3>Pass 1: Scoring (Above-the-Fold)</h3>
 858      <ul>
 859        <li>Input: Desktop screenshot (cropped) + HTML DOM</li>
 860        <li>
 861          Output: Factor scores (0-10 each), weighted total, grade, strengths, weaknesses, improvement
 862          opportunities
 863        </li>
 864        <li>Sites scoring below threshold proceed to Pass 2</li>
 865      </ul>
 866      <h3>Pass 2: Rescoring (Below-the-Fold)</h3>
 867      <ul>
 868        <li>Input: Below-fold screenshot + HTML DOM + original score JSON</li>
 869        <li>
 870          Output: Adjusted factor scores (only where new content warrants change), recalculated
 871          total/grade, contact details
 872        </li>
      <li>Does NOT resend above-fold screenshots (the Pass 1 score JSON included in the input already carries that context)</li>
 874        <li>
 875          Focused prompt references original scores and asks for adjustments, not full re-evaluation
 876        </li>
 877      </ul>
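    <p>
      One way to apply "adjust only where warranted" in code is to merge the Pass 2 output over the
      Pass 1 scores. The sketch below assumes a flat <code>{ factor: score }</code> shape for both
      objects; <code>applyRescoreAdjustments</code> is a hypothetical helper, not a function from
      the codebase.
    </p>

```javascript
// Merges Pass 2 adjustments over Pass 1 scores. Factors the rescoring pass
// did not mention keep their original value.
// ASSUMPTION: both arguments are flat { factor: score } objects.
function applyRescoreAdjustments(originalScores, adjustments) {
  const merged = { ...originalScores };
  for (const [factor, score] of Object.entries(adjustments)) {
    // Ignore factors that were never scored in Pass 1.
    if (!(factor in originalScores)) continue;
    // Clamp to the 0-10 range used for factor scores.
    merged[factor] = Math.min(10, Math.max(0, score));
  }
  return merged;
}

const pass1 = { trust_signals: 5, urgency_messaging: 3, call_to_action: 9 };
const pass2 = { trust_signals: 7 }; // e.g. below-fold content revealed client logos
console.log(applyRescoreAdjustments(pass1, pass2));
// { trust_signals: 7, urgency_messaging: 3, call_to_action: 9 }
```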
 878      <h3>Threshold</h3>
 879      <p>
 880        The original research suggested C+ (77) as the resubmission threshold. The production system
 881        uses a configurable <code>LOW_SCORE_CUTOFF</code> (currently 82, i.e., B- and below).
 882      </p>
 883      <blockquote>
 884        <p>
 885          <strong>Business logic:</strong> We&#39;re selling web design services. High scorers
 886          don&#39;t need help; low scorers are prospects. Rescoring gives low-scoring sites a second
 887          chance with more data before proposal generation.
 888        </p>
 889      </blockquote>
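    <p>
      A minimal sketch of the threshold check follows. Only the <code>LOW_SCORE_CUTOFF</code> name
      and its current value of 82 come from the text; the helper names are hypothetical, and
      treating the cutoff as inclusive is an assumption based on the phrase "B- and below".
    </p>

```javascript
// Reads the cutoff from an environment-like object; 82 is the value the text reports.
function getLowScoreCutoff(env) {
  const parsed = Number.parseInt(env.LOW_SCORE_CUTOFF, 10);
  return Number.isNaN(parsed) ? 82 : parsed;
}

// ASSUMPTION: the cutoff is inclusive, so a score of exactly 82 is rescored.
function needsRescore(weightedTotal, cutoff) {
  return weightedTotal <= cutoff;
}

const cutoff = getLowScoreCutoff(process.env);
console.log(needsRescore(76.5, cutoff));
```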
 890      <hr />
 891      <h2>8. Contact Extraction on Rescore</h2>
 892      <p>
 893        Contact extraction was added to the rescoring pass (not the initial scoring) to save tokens —
 894        you only extract contacts for sites you actually plan to contact (low scorers).
 895      </p>
 896      <h3>What Gets Extracted</h3>
 897      <p>From the HTML DOM (not guessed):</p>
 898      <ul>
 899        <li>
 900          <strong>Contact form details:</strong> action URL, method, field presence (first_name,
 901          last_name, full_name, email, phone, company_name, subject_line, message) with field types,
 902          name attributes, and labels
 903        </li>
 904        <li>
 905          <strong>Email addresses:</strong> All explicit <code>mailto:</code> links or plain-text
 906          emails
 907        </li>
 908        <li>
 909          <strong>Phone numbers:</strong> All explicit <code>tel:</code> links or recognizable
 910          patterns
 911        </li>
 912        <li>
 913          <strong>Social profiles:</strong> Links to major platforms with platform identification
 914        </li>
 915        <li>
 916          <strong>Contact page URLs:</strong> Explicit &quot;/contact&quot; or &quot;/support&quot;
 917          links
 918        </li>
 919      </ul>
 920      <h3>Design Rationale</h3>
 921      <ul>
 922        <li>
 923          Extracting contacts in Pass 1 would waste tokens on high-scoring sites we won&#39;t contact
 924        </li>
 925        <li>The HTML DOM is already being sent in Pass 2 anyway (for score adjustment)</li>
 926        <li>Adding contact extraction to the same API call adds minimal token overhead</li>
 927        <li>
 928          All fields are optional — the LLM reports what it finds, uses <code>null</code>/empty for
 929          missing data
 930        </li>
 931      </ul>
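    <p>
      Because every field is optional, downstream code benefits from normalizing the LLM output
      before use. The sketch below uses the key names from the appendix schema;
      <code>normalizeContactDetails</code> and <code>hasAnyContact</code> are hypothetical helpers
      shown for illustration only.
    </p>

```javascript
// Fills in defaults for any section the LLM omitted, so callers can rely
// on a stable shape. Key names follow the appendix schema.
function normalizeContactDetails(raw) {
  const details = raw || {};
  const asArray = (value) => (Array.isArray(value) ? value : []);
  return {
    primary_contact_form: details.primary_contact_form || null,
    email_addresses: asArray(details.email_addresses),
    phone_numbers: asArray(details.phone_numbers),
    social_profiles: asArray(details.social_profiles),
    contact_pages: asArray(details.contact_pages),
  };
}

// True when at least one outreach channel was found.
function hasAnyContact(details) {
  return Boolean(
    details.primary_contact_form ||
      details.email_addresses.length ||
      details.phone_numbers.length ||
      details.social_profiles.length ||
      details.contact_pages.length
  );
}

const normalized = normalizeContactDetails({ email_addresses: ['sales@example.com'] });
console.log(hasAnyContact(normalized)); // true
```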
 932      <hr />
 933      <h2>9. Popover Handling</h2>
 934      <p><strong>Decision: Close popovers before taking screenshots.</strong></p>
 935      <p>Reasoning:</p>
 936      <ol>
 937        <li>Popovers obscure the headline, hero image, CTA, and trust signals being evaluated</li>
 938        <li>
 939          They represent a secondary conversion path (newsletter signup, discount) — not the primary
 940          page conversion
 941        </li>
 942        <li>
 943          They create inconsistent evaluation conditions (some sites show immediately, others on
 944          delay/exit)
 945        </li>
 946        <li>
 947          The entire scoring methodology depends on evaluating above-fold content, which is completely
 948          blocked by modal overlays
 949        </li>
 950      </ol>
 951      <h3>Implementation</h3>
 952      <p>
 953        The production system (<code>src/capture.js</code> /
 954        <code>src/utils/stealth-browser.js</code>):
 955      </p>
 956      <ul>
      <li>Waits 2-3 seconds after page load so delay-triggered popovers have time to appear</li>
 958        <li>
 959          Attempts to close via common selectors (<code>[class*=&#39;close&#39;]</code>,
 960          <code>[aria-label=&#39;Close&#39;]</code>, etc.)
 961        </li>
 962        <li>Sends Escape key as fallback</li>
      <li>Takes the screenshot immediately after closing, before any new popover can appear</li>
 964      </ul>
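    <p>
      The steps above might look roughly like this with Playwright. This is a sketch under stated
      assumptions, not the contents of <code>src/capture.js</code>: the function name, the 2.5
      second wait, the 1 second click timeout, and capturing directly as JPEG are all illustrative
      choices.
    </p>

```javascript
// ASSUMPTION: only these two selectors are named in the text; the real list is longer.
const CLOSE_SELECTORS = ["[class*='close']", "[aria-label='Close']"];

// `page` is expected to be a Playwright Page.
async function closePopoversAndCapture(page) {
  // Give delay-triggered popovers time to appear.
  await page.waitForTimeout(2500);

  for (const selector of CLOSE_SELECTORS) {
    const button = page.locator(selector).first();
    try {
      if (await button.isVisible()) {
        await button.click({ timeout: 1000 });
      }
    } catch (err) {
      // A close button that vanished or cannot be clicked is not fatal.
    }
  }

  // Fallback for dialogs that only respond to the keyboard.
  await page.keyboard.press('Escape');

  // Capture right away, before another popover can fire.
  return page.screenshot({ type: 'jpeg', quality: 85 });
}
```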
 965      <hr />
 966      <h2>10. LLM Prompt Design</h2>
 967      <h3>Prompt Structure (Both Passes)</h3>
 968      <ol>
 969        <li><strong>System context:</strong> Expert CRO specialist role</li>
 970        <li>
 971          <strong>Input specification:</strong> What data is provided (screenshots, HTML, prior scores
 972          for rescoring)
 973        </li>
 974        <li><strong>Evaluation framework:</strong> Factor definitions with rubric anchors</li>
 975        <li><strong>Scoring methodology:</strong> Weighted calculation formula</li>
 976        <li><strong>Output format:</strong> Strict JSON schema</li>
 977        <li>
 978          <strong>Best practices:</strong> Analyze HTML first, cross-reference with screenshots,
 979          assess mobile/desktop separately, provide specific evidence
 980        </li>
 981      </ol>
 982      <h3>Key Design Principles</h3>
 983      <ul>
 984        <li>
 985          <strong>Rubric in system prompt:</strong> Full rubric definitions appear once in the system
 986          prompt, not repeated per-website
 987        </li>
 988        <li>
 989          <strong>Evidence-based scoring:</strong> Each factor score requires 1-2 sentence reasoning
 990          with specific page evidence
 991        </li>
 992        <li>
 993          <strong>Independent factor scoring:</strong> Score each factor independently, then calculate
 994          weighted total
 995        </li>
 996        <li>
 997          <strong>Dual analysis:</strong> HTML content analysis + visual assessment cross-referenced
 998        </li>
 999        <li>
1000          <strong>Confidence assessment:</strong> Overall confidence (High/Medium/Low) with limitation
1001          notes
1002        </li>
1003      </ul>
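    <p>
      The "rubric once, site data per call" principle can be sketched as a message builder. The
      <code>{ role, content }</code> message shape is a common chat-API convention and an
      assumption here, as are the function name and the <code>site</code> object fields.
    </p>

```javascript
// Builds a chat-style message list for one site.
// ASSUMPTION: { role, content } messages; adapt to the API client in use.
function buildScoringMessages(systemRubric, site, priorScores) {
  const parts = [`URL: ${site.url}`, 'HTML:', site.html];
  if (priorScores) {
    // Rescoring pass: include Pass 1 scores so the model adjusts rather than restarts.
    parts.push('Prior scores (JSON):', JSON.stringify(priorScores));
  }
  return [
    { role: 'system', content: systemRubric }, // full rubric appears once, here only
    { role: 'user', content: parts.join('\n') },
  ];
}

const messages = buildScoringMessages(
  'You are an expert CRO specialist. <rubric goes here>',
  { url: 'https://example.com', html: '<h1>Automate Your Invoices</h1>' }
);
console.log(messages.length); // 2
```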
1004      <h3>Production Evolution</h3>
1005      <p>
1006        The production prompts (<code>prompts/CONVERSION-SCORING-VISION.md</code> and
1007        <code>prompts/CONVERSION-SCORING-NOVIS.md</code>) have evolved from this original design:
1008      </p>
1009      <ul>
1010        <li>Simplified output JSON (removed verbose nested structures for token efficiency)</li>
1011        <li>
1012          Added <code>recommendation_sms</code> and <code>recommendation_email</code> fields for
1013          proposal generation
1014        </li>
1015        <li>Split into vision-enabled and HTML-only variants</li>
1016        <li>
1017          Grade calculation moved from LLM to code (<code>computeGrade()</code> in
1018          <code>src/score.js</code>)
1019        </li>
1020        <li>
1021          LLM now returns only <code>factor_scores</code>; total and grade computed programmatically
1022          for consistency
1023        </li>
1024      </ul>
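    <p>
      Moving the total and grade into code could look like the sketch below. It is not the actual
      <code>src/score.js</code>: the weights are placeholders, and the grade cutoffs are the
      conventional academic bands, assumed here because they agree with the two anchors in the
      text (C+ starts at 77, B- tops out at 82). The production scale has fewer subdivisions.
    </p>

```javascript
// ASSUMPTION: illustrative weights in percent, summing to 100.
const WEIGHTS = {
  headline_quality: 15, value_proposition: 15, unique_selling_proposition: 10,
  call_to_action: 15, urgency_messaging: 5, hook_engagement: 10,
  trust_signals: 10, imagery_design: 5, offer_clarity: 10,
  contextual_appropriateness: 5,
};

// ASSUMPTION: conventional academic cutoffs, highest first.
const GRADE_BANDS = [
  [97, 'A+'], [93, 'A'], [90, 'A-'],
  [87, 'B+'], [83, 'B'], [80, 'B-'],
  [77, 'C+'], [73, 'C'], [70, 'C-'],
  [67, 'D+'], [63, 'D'], [60, 'D-'],
];

// Weighted total on a 0-100 scale from 0-10 factor scores.
function weightedTotal(factorScores) {
  let sum = 0;
  for (const [factor, weight] of Object.entries(WEIGHTS)) {
    const score = factorScores[factor];
    if (!Number.isFinite(score)) {
      throw new TypeError(`Missing score for ${factor}`);
    }
    sum += weight * score;
  }
  return sum / 10;
}

// First band whose minimum the total reaches; anything below 60 is an F.
function letterGrade(total) {
  const band = GRADE_BANDS.find(([min]) => total >= min);
  return band ? band[1] : 'F';
}

console.log(letterGrade(76.5)); // C
```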
1025      <hr />
1026      <h2>11. Validation &amp; Calibration</h2>
1027      <p>The research recommended the following validation approach (partially implemented):</p>
1028      <h3>Inter-Rater Reliability</h3>
1029      <ul>
1030        <li>Evaluate 50-100 websites with experienced CRO professionals</li>
1031        <li>Run same websites through LLM scoring</li>
1032        <li>Target Spearman correlation &gt; 0.85 between LLM and expert scores</li>
1033        <li>Analyze divergence cases; iterate on rubric wording</li>
1034      </ul>
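    <p>
      For reference, Spearman correlation is the Pearson correlation of the ranks. The following
      self-contained sketch computes it with average ranks for ties; the sample scores are made up
      for illustration and are not results from the research.
    </p>

```javascript
// Average ranks (1-based); tied values share the mean of their positions.
function rank(values) {
  const order = values.map((v, i) => [v, i]).sort((a, b) => a[0] - b[0]);
  const ranks = new Array(values.length);
  let start = 0;
  while (start < order.length) {
    let end = start;
    while (end + 1 < order.length && order[end + 1][0] === order[start][0]) end++;
    const avg = (start + end) / 2 + 1;
    for (let k = start; k <= end; k++) ranks[order[k][1]] = avg;
    start = end + 1;
  }
  return ranks;
}

// Pearson correlation of two equal-length numeric arrays.
function pearson(xs, ys) {
  const n = xs.length;
  const mean = (arr) => arr.reduce((s, v) => s + v, 0) / n;
  const mx = mean(xs);
  const my = mean(ys);
  let cov = 0;
  let vx = 0;
  let vy = 0;
  xs.forEach((x, i) => {
    cov += (x - mx) * (ys[i] - my);
    vx += (x - mx) ** 2;
    vy += (ys[i] - my) ** 2;
  });
  return cov / Math.sqrt(vx * vy);
}

// Spearman's rho: Pearson correlation applied to the ranks.
function spearman(xs, ys) {
  return pearson(rank(xs), rank(ys));
}

const expertScores = [62, 71, 78, 85, 90]; // hypothetical
const llmScores = [60, 75, 73, 88, 91];    // hypothetical
console.log(spearman(expertScores, llmScores)); // 0.9
```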
1035      <h3>Expected Accuracy</h3>
1036      <ul>
1037        <li>
1038          <strong>Letter grade agreement:</strong> 75-85% exact match with experts; remaining are
1039          adjacent grades
1040        </li>
1041        <li>
1042          <strong>Factor-level accuracy:</strong> 80-90% match (individual factors more objective than
1043          aggregated grades)
1044        </li>
1045        <li>
        <strong>High-confidence cases:</strong> &gt;90% accuracy for clearly strong (A-range) or
        clearly weak (D/F) sites
1048        </li>
1049        <li><strong>Mid-range (B-C+):</strong> Lower agreement due to inherent subjectivity</li>
1050      </ul>
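    <p>
      Exact and adjacent grade agreement can be measured as sketched below. The grade order is the
      full academic scale, which is an assumption, and the sample grades are invented for
      illustration.
    </p>

```javascript
// ASSUMPTION: full academic scale, best to worst.
const GRADE_ORDER = ['A+', 'A', 'A-', 'B+', 'B', 'B-', 'C+', 'C', 'C-', 'D+', 'D', 'D-', 'F'];

// Returns the share of sites graded identically and the share off by one step.
function gradeAgreement(expertGrades, llmGrades) {
  let exact = 0;
  let adjacent = 0;
  expertGrades.forEach((grade, i) => {
    const distance = Math.abs(
      GRADE_ORDER.indexOf(grade) - GRADE_ORDER.indexOf(llmGrades[i])
    );
    if (distance === 0) exact++;
    else if (distance === 1) adjacent++;
  });
  const n = expertGrades.length;
  return { exact: exact / n, adjacent: adjacent / n };
}

console.log(gradeAgreement(['B', 'C+', 'A-', 'D'], ['B', 'C', 'A-', 'D']));
// { exact: 0.75, adjacent: 0.25 }
```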
1051      <h3>Continuous Monitoring</h3>
1052      <ul>
      <li>The score distribution should be approximately normal, centered around C+/B-</li>
1054        <li>
1055          Factor correlations should match expectations (headline ↔ value prop should correlate &gt;
1056          0.6)
1057        </li>
      <li>Spot-check 1% of scored sites by hand; recalibrate if divergences exceed 5-10%</li>
1059      </ul>
1060      <h3>Production Status</h3>
1061      <p>In practice, the system has been validated through:</p>
1062      <ul>
1063        <li>Manual review of thousands of scored sites during outreach QA</li>
1064        <li>Grade/score mismatch detection and correction</li>
1065        <li>Programmatic grade computation (eliminating LLM grading inconsistencies)</li>
1066        <li>Iterative prompt refinement based on observed scoring patterns</li>
1067      </ul>
1068      <hr />
1069      <h2>12. Token Optimization</h2>
1070      <h3>Image Optimization (Implemented)</h3>
1071      <ul>
1072        <li>JPEG conversion (quality 85): 40-50% file size reduction</li>
1073        <li>DOM-aware intelligent cropping: 20-35% token reduction</li>
1074        <li>Resolution targeting: minimum for text legibility</li>
1075        <li>Above-fold only: 65-75% reduction vs. full-page</li>
1076      </ul>
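    <p>
      To see why cropping and above-fold capture reduce cost, it helps to estimate image tokens
      from dimensions. The sketch below assumes the tile-based formula OpenAI documents for
      high-detail image inputs on GPT-4-class vision models; other models count image tokens
      differently, and <code>estimateImageTokens</code> is a hypothetical helper.
    </p>

```javascript
// ASSUMPTION: high-detail formula from OpenAI's vision documentation:
// 85 base tokens plus 170 per 512px tile after two resize steps.
// Images already smaller than the limits are assumed not to be upscaled.
function estimateImageTokens(width, height) {
  let w = width;
  let h = height;
  // Step 1: fit within a 2048 x 2048 square.
  const fit = Math.min(1, 2048 / Math.max(w, h));
  w *= fit;
  h *= fit;
  // Step 2: shrink so the shortest side is at most 768px.
  const shrink = Math.min(1, 768 / Math.min(w, h));
  w *= shrink;
  h *= shrink;
  // Step 3: count 512px tiles.
  const tiles = Math.ceil(w / 512) * Math.ceil(h / 512);
  return 85 + 170 * tiles;
}

console.log(estimateImageTokens(1024, 1024)); // 765
console.log(estimateImageTokens(2048, 4096)); // 1105
```

    <p>
      Under this formula a tall capture covers more tiles than a square one, so trimming the
      height before upload lowers the per-image token count.
    </p>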
1077      <h3>Prompt Optimization (Implemented)</h3>
1078      <ul>
1079        <li>Rubric in system prompt (not repeated per website)</li>
1080        <li>Abbreviated factor references in rescoring prompt</li>
1081        <li>Strict JSON output (no prose explanation outside the JSON)</li>
1082        <li>LLM returns factor scores only; computation done in code</li>
1083      </ul>
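    <p>
      Strict JSON output only pays off if the reply is validated before use. A minimal sketch,
      assuming the reply is an object with a <code>factor_scores</code> map of numbers;
      <code>parseFactorScores</code> is a hypothetical name.
    </p>

```javascript
// Parses a strict-JSON LLM reply and checks every factor score is within 0-10.
// ASSUMPTION: reply shape is { "factor_scores": { name: number, ... } }.
function parseFactorScores(replyText) {
  const parsed = JSON.parse(replyText); // throws on prose or malformed JSON
  const scores = parsed.factor_scores;
  if (scores === null || typeof scores !== 'object' || Array.isArray(scores)) {
    throw new TypeError('factor_scores must be an object');
  }
  for (const [factor, score] of Object.entries(scores)) {
    if (!Number.isFinite(score) || score < 0 || score > 10) {
      throw new RangeError(`Invalid score for ${factor}: ${score}`);
    }
  }
  return scores;
}

console.log(parseFactorScores('{"factor_scores":{"headline_quality":8}}'));
// { headline_quality: 8 }
```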
1084      <h3>HTML-Only Mode (Added Later)</h3>
1085      <p>When <code>ENABLE_VISION=false</code>, the system skips screenshots entirely:</p>
1086      <ul>
1087        <li>No Playwright screenshot capture</li>
1088        <li>Text-only analysis of HTML DOM</li>
1089        <li>Auto-promotes scored sites through rescoring (no below-fold vision needed)</li>
      <li>Cost: ~$0.0025/site vs. ~$0.030/site with vision (about 92% savings)</li>
1091      </ul>
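    <p>
      The mode switch can be sketched as a single function that derives everything else from
      <code>ENABLE_VISION</code>. The prompt paths come from this document;
      <code>selectScoringMode</code> is a hypothetical name, and treating any value other than the
      string <code>"false"</code> as enabled is an assumption.
    </p>

```javascript
// Picks the scoring mode from an environment-like object.
// ASSUMPTION: vision stays enabled unless ENABLE_VISION is exactly "false".
function selectScoringMode(env) {
  const visionEnabled = env.ENABLE_VISION !== 'false';
  return {
    visionEnabled,
    captureScreenshots: visionEnabled, // HTML-only mode skips Playwright capture
    promptPath: visionEnabled
      ? 'prompts/CONVERSION-SCORING-VISION.md'
      : 'prompts/CONVERSION-SCORING-NOVIS.md',
  };
}

console.log(selectScoringMode({ ENABLE_VISION: 'false' }).promptPath);
// prompts/CONVERSION-SCORING-NOVIS.md
```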
1092      <hr />
1093      <h2>13. Implementation Notes</h2>
1094      <h3>What Changed from Research to Production</h3>
1095      <table>
1096        <thead>
1097          <tr>
1098            <th>Research Proposal</th>
1099            <th>Production Implementation</th>
1100            <th>Reason</th>
1101          </tr>
1102        </thead>
1103        <tbody>
1104          <tr>
1105            <td>Academic grading scale (A+ to F with +/-)</td>
1106            <td>Business scale (A+ to F, fewer subdivisions)</td>
1107            <td>Simpler for prospect identification</td>
1108          </tr>
1109          <tr>
1110            <td>Mobile + desktop screenshots</td>
1111            <td>Desktop only</td>
1112            <td>Cost reduction; mobile added minimal value</td>
1113          </tr>
1114          <tr>
1115            <td>LLM computes grade</td>
1116            <td>Code computes grade from factor scores</td>
1117            <td>Eliminates grading inconsistencies</td>
1118          </tr>
1119          <tr>
1120            <td>Batch processing of 50-100 sites</td>
1121            <td>Individual API calls with concurrency control</td>
1122            <td>Rate limits; error isolation</td>
1123          </tr>
1124          <tr>
1125            <td>Few-shot examples in prompt</td>
1126            <td>Prompt-only (no examples)</td>
1127            <td>Token savings; rubric detail sufficient</td>
1128          </tr>
1129          <tr>
1130            <td>C+ (77) rescore threshold</td>
1131            <td>B- (82) configurable threshold</td>
1132            <td>Business need: more prospects</td>
1133          </tr>
1134          <tr>
1135            <td>9 factors + context</td>
1136            <td>10 factors (context kept as factor 10)</td>
1137            <td>Consistent weighting</td>
1138          </tr>
1139          <tr>
1140            <td>Full JSON with nested evidence/reasoning</td>
1141            <td>Simplified JSON with factor scores</td>
1142            <td>Token efficiency</td>
1143          </tr>
1144        </tbody>
1145      </table>
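    <p>
      The "individual API calls with concurrency control" row can be illustrated with a small
      worker pool. This is a generic sketch, not the production scheduler;
      <code>mapWithConcurrency</code> and the result shape are hypothetical.
    </p>

```javascript
// Runs `worker` over `items` with at most `limit` calls in flight.
// A failed item is recorded as { error } so one bad site cannot sink the run.
async function mapWithConcurrency(items, limit, worker) {
  const results = new Array(items.length);
  let next = 0;

  // Each runner pulls the next unclaimed index until none remain.
  async function runner() {
    while (next < items.length) {
      const index = next++;
      try {
        results[index] = { value: await worker(items[index]) };
      } catch (error) {
        results[index] = { error };
      }
    }
  }

  const runners = Array.from({ length: Math.min(limit, items.length) }, runner);
  await Promise.all(runners);
  return results;
}

// Usage: score three sites, two at a time; scoreSite stands in for a real API call.
const scoreSite = async (url) => ({ url, total: 80 });
mapWithConcurrency(['a.com', 'b.com', 'c.com'], 2, scoreSite).then((results) => {
  console.log(results.length); // 3
});
```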
1146      <h3>Key Files</h3>
1147      <ul>
1148        <li>
1149          <code>src/score.js</code> — Scoring logic, <code>computeScoreFromFactors()</code>,
1150          <code>computeGrade()</code>
1151        </li>
1152        <li><code>src/stages/rescoring.js</code> — Below-fold rescoring pass</li>
1153        <li><code>src/capture.js</code> — Screenshot capture</li>
1154        <li><code>src/contacts/prioritize.js</code> — Contact extraction and prioritization</li>
1155        <li><code>prompts/CONVERSION-SCORING-VISION.md</code> — Vision-enabled scoring prompt</li>
1156        <li><code>prompts/CONVERSION-SCORING-NOVIS.md</code> — HTML-only scoring prompt</li>
1157      </ul>
1158      <hr />
1159      <h2>14. Appendix: Original JSON Schemas</h2>
1160      <h3>Pass 1: Scoring Output</h3>
1161      <pre><code class="language-json">{
1162    &quot;website_url&quot;: &quot;https://example.com&quot;,
1163    &quot;evaluation_date&quot;: &quot;2026-01-14T12:00:00Z&quot;,
1164    &quot;device_analysis&quot;: {
1165      &quot;desktop_visible&quot;: true,
1166      &quot;mobile_visible&quot;: true,
1167      &quot;design_differences&quot;: &quot;Mobile layout stacks hero and CTA; desktop shows them side by side.&quot;
1168    },
1169    &quot;factor_scores&quot;: {
1170      &quot;headline_quality&quot;: {
1171        &quot;score&quot;: 8,
1172        &quot;reasoning&quot;: &quot;Headline clearly states what the product does and who it&#39;s for.&quot;,
1173        &quot;evidence&quot;: &quot;Headline text: &#39;Automate Your Invoices in Minutes for Small Businesses&#39;.&quot;
1174      },
1175      &quot;value_proposition&quot;: {
1176        &quot;score&quot;: 7,
1177        &quot;reasoning&quot;: &quot;Benefits are clear but lack concrete quantified outcomes.&quot;,
1178        &quot;evidence&quot;: &quot;Copy mentions &#39;save time and reduce errors&#39; but no specific numbers.&quot;
1179      },
1180      &quot;unique_selling_proposition&quot;: {
1181        &quot;score&quot;: 6,
1182        &quot;reasoning&quot;: &quot;USP is implied but not explicitly contrasted with competitors.&quot;,
1183        &quot;evidence&quot;: &quot;Mentions &#39;built specifically for freelancers&#39; but no comparison.&quot;
1184      },
1185      &quot;call_to_action&quot;: {
1186        &quot;score&quot;: 9,
1187        &quot;reasoning&quot;: &quot;Primary CTA is above the fold, high contrast, and action-oriented.&quot;,
1188        &quot;evidence&quot;: &quot;CTA button &#39;Start Free 14-Day Trial&#39; in hero, visually prominent.&quot;
1189      },
1190      &quot;urgency_messaging&quot;: {
1191        &quot;score&quot;: 3,
1192        &quot;reasoning&quot;: &quot;Weak urgency with vague wording and no specific deadline.&quot;,
1193        &quot;evidence&quot;: &quot;Text: &#39;Join now and don&#39;t miss out&#39; without concrete time limit.&quot;
1194      },
1195      &quot;hook_engagement&quot;: {
1196        &quot;score&quot;: 7,
1197        &quot;reasoning&quot;: &quot;Hero image is relevant and supports the message.&quot;,
1198        &quot;evidence&quot;: &quot;Image shows a freelancer working with invoices on a laptop.&quot;
1199      },
1200      &quot;trust_signals&quot;: {
1201        &quot;score&quot;: 5,
1202        &quot;reasoning&quot;: &quot;Includes generic testimonials but no logos or certifications.&quot;,
1203        &quot;evidence&quot;: &quot;Three short text testimonials with first names only.&quot;
1204      },
1205      &quot;imagery_design&quot;: {
1206        &quot;score&quot;: 8,
1207        &quot;reasoning&quot;: &quot;Clean, modern design with custom-looking imagery.&quot;,
1208        &quot;evidence&quot;: &quot;Custom illustrations and product screenshots; no stock photos.&quot;
1209      },
1210      &quot;offer_clarity&quot;: {
1211        &quot;score&quot;: 8,
1212        &quot;reasoning&quot;: &quot;Offer is explicit: 14-day free trial, no credit card.&quot;,
1213        &quot;evidence&quot;: &quot;Text near CTA: &#39;Try it free for 14 days. No credit card needed.&#39;&quot;
1214      },
1215      &quot;contextual_appropriateness&quot;: {
1216        &quot;score&quot;: 7,
1217        &quot;reasoning&quot;: &quot;Design is appropriate for a B2B SaaS invoicing tool.&quot;,
1218        &quot;industry_context&quot;: &quot;B2B SaaS / invoicing software&quot;
1219      }
1220    },
1221    &quot;overall_calculation&quot;: {
1222      &quot;weighted_total&quot;: 76.5,
1223      &quot;letter_grade&quot;: &quot;C&quot;,
1224      &quot;grade_interpretation&quot;: &quot;Acceptable fundamentals but room for improvement in USP, trust, and urgency.&quot;
1225    },
1226    &quot;key_strengths&quot;: [
1227      &quot;Strong, visible primary CTA with clear action and low-friction offer.&quot;,
1228      &quot;Headline and offer are easy to understand for the target audience.&quot;
1229    ],
1230    &quot;critical_weaknesses&quot;: [
1231      &quot;Weak urgency messaging provides little reason to act now.&quot;,
1232      &quot;Trust elements are generic and do not provide strong social proof.&quot;
1233    ],
1234    &quot;quick_improvement_opportunities&quot;: [
1235      &quot;Add specific trust signals (logos, detailed testimonials) above the fold.&quot;,
1236      &quot;Introduce concrete urgency (time-bound offer or limited onboarding slots).&quot;
1237    ],
1238    &quot;confidence_assessment&quot;: {
1239      &quot;overall_confidence&quot;: &quot;High&quot;,
1240      &quot;reasoning&quot;: &quot;Above-fold content contains all major conversion elements.&quot;,
1241      &quot;limitation_notes&quot;: &quot;Does not consider deeper content, checkout flow, or post-click funnel.&quot;
1242    }
1243  }
1244  </code></pre>
1245      <h3>Pass 2: Rescoring + Contact Extraction Output</h3>
1246      <p>The rescoring pass adds a <code>contact_details</code> section to the evaluation JSON:</p>
1247      <pre><code class="language-json">{
1248    &quot;contact_details&quot;: {
1249      &quot;primary_contact_form&quot;: {
1250        &quot;form_action_url&quot;: &quot;https://example.com/contact-submit&quot;,
1251        &quot;form_method&quot;: &quot;post&quot;,
1252        &quot;fields&quot;: {
1253          &quot;first_name&quot;: {
1254            &quot;present&quot;: true,
1255            &quot;field_type&quot;: &quot;text&quot;,
1256            &quot;name_attribute&quot;: &quot;first_name&quot;,
1257            &quot;label_or_placeholder&quot;: &quot;First name&quot;
1258          },
1259          &quot;last_name&quot;: {
1260            &quot;present&quot;: true,
1261            &quot;field_type&quot;: &quot;text&quot;,
1262            &quot;name_attribute&quot;: &quot;last_name&quot;,
1263            &quot;label_or_placeholder&quot;: &quot;Last name&quot;
1264          },
1265          &quot;full_name&quot;: {
1266            &quot;present&quot;: false,
1267            &quot;field_type&quot;: null,
1268            &quot;name_attribute&quot;: null,
1269            &quot;label_or_placeholder&quot;: null
1270          },
1271          &quot;email&quot;: {
1272            &quot;present&quot;: true,
1273            &quot;field_type&quot;: &quot;email&quot;,
1274            &quot;name_attribute&quot;: &quot;email&quot;,
1275            &quot;label_or_placeholder&quot;: &quot;Your email&quot;
1276          },
1277          &quot;phone&quot;: {
1278            &quot;present&quot;: true,
1279            &quot;field_type&quot;: &quot;tel&quot;,
1280            &quot;name_attribute&quot;: &quot;phone&quot;,
1281            &quot;label_or_placeholder&quot;: &quot;Phone number&quot;
1282          },
1283          &quot;company_name&quot;: {
1284            &quot;present&quot;: false,
1285            &quot;field_type&quot;: null,
1286            &quot;name_attribute&quot;: null,
1287            &quot;label_or_placeholder&quot;: null
1288          },
1289          &quot;subject_line&quot;: {
1290            &quot;present&quot;: false,
1291            &quot;field_type&quot;: null,
1292            &quot;name_attribute&quot;: null,
1293            &quot;label_or_placeholder&quot;: null
1294          },
1295          &quot;message&quot;: {
1296            &quot;present&quot;: true,
1297            &quot;field_type&quot;: &quot;textarea&quot;,
1298            &quot;name_attribute&quot;: &quot;message&quot;,
1299            &quot;label_or_placeholder&quot;: &quot;Your message&quot;
1300          }
1301        }
1302      },
1303      &quot;email_addresses&quot;: [
1304        &quot;support@example.com&quot;,
1305        &quot;sales@example.com&quot;
1306      ],
1307      &quot;phone_numbers&quot;: [
1308        &quot;+1-555-123-4567&quot;
1309      ],
1310      &quot;social_profiles&quot;: [
1311        {
1312          &quot;platform&quot;: &quot;facebook&quot;,
1313          &quot;url&quot;: &quot;https://www.facebook.com/example&quot;
1314        },
1315        {
1316          &quot;platform&quot;: &quot;linkedin&quot;,
1317          &quot;url&quot;: &quot;https://www.linkedin.com/company/example&quot;
1318        }
1319      ],
1320      &quot;contact_pages&quot;: [
1321        &quot;https://example.com/contact&quot;,
1322        &quot;https://example.com/support&quot;
1323      ]
1324    }
1325  }
1326  </code></pre>
1327      <hr />
1328      <h2>References</h2>
1329      <p>The original research cited ~60 sources. Key references that informed the design:</p>
1330      <ul>
1331        <li>
1332          Nielsen Norman Group — User attention and fold research (57-80% above-fold viewing time)
1333        </li>
1334        <li>Google ad viewability study — 73% above-fold vs. 44% below-fold viewability</li>
1335        <li>CRO best practices literature — Factor selection and weighting</li>
1336        <li>GPT-4 Vision cropping research — Focused images outperform full-page for accuracy</li>
1337        <li>
1338          PURE method — Inter-rater reliability calibration for expert-based evaluation (&gt;0.8
1339          reliability)
1340        </li>
1341        <li>LLMLingua (Microsoft) — Prompt optimization: 35% token reduction maintaining quality</li>
1342        <li>DeepSeek vision-text compression — 7-20x token reduction at &gt;90% accuracy</li>
1343        <li>OpenAI vision model documentation — Token calculation and pricing for image inputs</li>
1344      </ul>
1345  
1346      <footer
1347        style="
1348          margin-top: 3rem;
1349          padding-top: 1rem;
1350          border-top: 1px solid var(--border);
1351          font-size: 0.85em;
1352          color: #6b7280;
1353        "
1354      >
1355        <p>
1356          Generated from original Open-WebUI research chat (January 2026). Consolidated February 2026.
1357        </p>
1358        <p>333 Method &mdash; SERP-to-Outreach Automation</p>
1359      </footer>
1360    </body>
1361  </html>