# Ubiquitous Language

Canonical terminology for the WriterAIScore project (browser extension surfacing
author-trust signals on book marketplaces). Use these terms consistently in
code, copy, docs, and commit messages.

## Actors

| Term | Definition | Aliases to avoid |
|------|------------|------------------|
| **Buyer** | A person at the point of purchase decision on a Marketplace. The primary user of the extension. | consumer, shopper, user |
| **Reader** | A person who has already consumed a purchased work. Distinct from Buyer. | consumer, end-user |
| **Author** | The person or identity named on a book's byline. May be an Original Author or a Synthetic Author. | writer, creator |
| **Original Author** | A real human whose published work has been appropriated into Laundered Content. | victim, source author |
| **Synthetic Author** | A pseudonymous identity with no verifiable human behind it, used to publish AI-generated books. | fake author, bot, AI persona, impostor |
| **Fraudster** | The unknown actor or organization operating one or more Synthetic Authors. | bad actor, scammer, unethical actor |
| **Publisher** | An organization that vouches for and distributes authored content. | press, imprint |
| **Marketplace** | A retail platform that lists books for sale (Amazon, Goodreads, Barnes & Noble, Apple Books, Kobo). | platform, store, site |

## The fraud

| Term | Definition | Aliases to avoid |
|------|------------|------------------|
| **Laundered Content** | Published material algorithmically reshuffled from one or more Original Authors' work to evade plagiarism detection. | recycled content, rewritten content, scraped content |
| **Pseudonymous Listing** | A Marketplace listing attributed to a Synthetic Author. | fake listing, bot listing |
| **Attribution Loss** | The condition where an Original Author receives no credit or royalty for content derived from their work. | plagiarism (too narrow), theft (too broad) |

## Trust machinery

| Term | Definition | Aliases to avoid |
|------|------------|------------------|
| **Trust Signal** | A single, observable, quantifiable input that informs a Buyer's judgement about authorship authenticity. | indicator, hint, flag, metric |
| **Trust Score** | A weighted composite of multiple Trust Signals. Distinct from a single Signal. | rating, grade, confidence |
| **Publication Velocity** | Number of books attributed to an Author over a fixed time window (e.g. last 12 months). The primary Trust Signal. | publishing rate, output rate, throughput |
| **Baseline** | The typical Publication Velocity for a human Author in a given Genre, used as a comparator. | norm, benchmark, average |
| **Genre** | A book-subject category that scopes the Baseline (e.g. romance, cozy mystery, technical non-fiction). Sourced from Marketplace metadata. | category, subject, vertical |
| **Burst** | A cluster of Publication Velocity concentrated in an abnormally short window (e.g. ≥5 books in 30 days). A derived Trust Signal. | spike, flood |
| **Topic-Spread Entropy** | A Trust Signal measuring how unrelated an Author's book subjects are. | topic diversity, subject spread |
| **Provenance** | Cryptographic attestation of a work's origin and edit history, carried by a C2PA Manifest. | source, origin |
| **C2PA Manifest** | The concrete Provenance record attached to a work, per the C2PA specification (Coalition for Content Provenance and Authenticity). | manifest, attestation, C2PA record |

## Verification strategies

| Term | Definition | Aliases to avoid |
|------|------------|------------------|
| **Verify the Human** | Strategy to confirm an Author is a real person with verifiable identity (ORCID, government ID, institutional affiliation). | identity check, KYC |
| **Verify the Content** | Strategy to confirm a work's origin and authenticity (C2PA, similarity scans, disclosure labels). | content auth |
| **Verify the Signal** | Strategy to confirm that surface trust signals (reviews, badges, listings) are themselves trustworthy. | signal auth |

## Extension anatomy

| Term | Definition | Aliases to avoid |
|------|------------|------------------|
| **Info Card** | The UI element the extension injects into a Marketplace page to display Trust Signals to the Buyer. Placement is defined per-Marketplace by the Adapter; the target is adjacent to the Author byline, not overlaying the purchase controls. | widget, tooltip, popup, banner |
| **Content Script** | The browser-extension component that runs on Marketplace pages, detects the Author, and renders the Info Card. | injection, page script |
| **Lookup** | A query to a book-metadata API (Google Books, OpenLibrary) resolving an Author name to their book list. | fetch, query, API call |
| **Author Cache** | Local per-browser storage of prior Lookup results, keyed by Author name, with a TTL. | local store, cache |
| **Adapter** | A per-Marketplace module that knows how to extract the Author from that site's DOM. | scraper, parser, selector |

## Relationships

- A **Marketplace** hosts **Pseudonymous Listings** attributed to **Synthetic
  Authors**.
- A **Fraudster** operates one or more **Synthetic Authors**.
- A **Synthetic Author** produces **Laundered Content** derived from one or more
  **Original Authors**, causing **Attribution Loss**.
- A **Trust Score** aggregates multiple **Trust Signals**; **Publication
  Velocity** is the primary **Trust Signal**.
- A **Burst** is a derived Trust Signal computed from Publication Velocity over
  short windows.
- The **Content Script** uses an **Adapter** to extract an **Author**, issues a
  **Lookup**, caches the result in the **Author Cache**, and renders the **Info
  Card**.
- The extension's three release strategies map to **Verify the Human**, **Verify
  the Content**, and **Verify the Signal**.

## Example dialogue

> **Dev:** "When the Content Script lands on an Amazon `/dp/` page, do we
> always fetch the `/author/` page, or do we hit the Author Cache first?"

> **Domain expert:** "Cache first, seven-day TTL, keyed by Amazon author
> ID. The catalog for a given Author doesn't move minute-to-minute, and
> fetching aggressively would look like automated access."

> **Dev:** "Got it. And if the parsed catalog shows forty books in the last
> year, we label the Author a Synthetic Author in the Info Card?"

> **Domain expert:** "No. We show the count, attribute it *per Amazon's
> catalog*, and let the Buyer decide. Labelling an Author a Synthetic
> Author without proof is a legal risk and defeats the neutral-Signal
> framing."

> **Dev:** "Do we show a Baseline alongside for comparison?"

> **Domain expert:** "Not in v0.1. 2026-04-17 decision: raw counts only,
> rolling 12mo and calendar year, side by side. A Baseline is contestable
> (what counts as 'typical'?), and the bet is that two raw counts with a
> visible *see full catalog* link are enough for the Buyer to judge.
> Revisit after observing usage."

> **Dev:** "So the Info Card is Verify the Signal, not Verify the Human?"

> **Domain expert:** "For v0.1, yes. Verify the Human arrives in v0.4
> when we pull from Wikipedia, Goodreads, and ORCID. Keep the MVP
> strictly to raw Publication Velocity from Amazon's own catalog."

## Versioned scope

The glossary is the full domain vocabulary, but only a subset is active in any
given release. Terms outside the active version remain canonical but
unimplemented; do not write code, UI copy, or docs that assume them.

| Version | Surface(s) | Active Trust Signals (raw, no composite) | Active Strategy | Notable inactive terms |
|---------|------------|------------------------------------------|-----------------|------------------------|
| v0.1 | Amazon `/dp/` product page | Publication Velocity (rolling 12mo + calendar year) | Verify the Signal | Baseline, Burst, Topic-Spread Entropy, Genre, Trust Score, Provenance, Verify the Human |
| v0.2 | + Amazon `/author/` page with timeline | (same Signals; timeline is visualization, not a new Signal) | Verify the Signal | (as v0.1) |
| v0.3 | + Amazon search results | (same Signals) | Verify the Signal | (as v0.1) |
| v0.4 | (same surfaces) | + Wikipedia / Goodreads / ORCID presence | Verify the Signal + Human | Trust Score (still not composed), C2PA Manifest, Baseline (possibly never shipped) |
| Later | Multi-Marketplace | Composed Trust Score, C2PA Manifest checks | All three strategies | (none) |

*On Baseline:* the 2026-04-17 grilling decided to ship v0.1 without a
Baseline comparator. The Info Card shows raw counts; the Buyer brings
the context. Whether a Baseline (editorial or computed) is ever added
is an open question to revisit after observing v0.1 usage. Until then,
Baseline remains canonically defined here but inactive in every shipped
version.

*On the roadmap shape:* v0.2–v0.4 are organized around *surface and
data-source growth*, not new Trust Signals. Identity verification
(ORCID, Wikipedia) arrives in v0.4 when the backend lands that can
cross-reference those sources, not v0.3 as earlier plans suggested.

Three standing rules that follow from this:

- Do not introduce UI copy referencing a **Trust Score** until the composite
  exists. v0.1 ships Signals, not a Score, despite the repository name.
- Do not label an Author a **Synthetic Author** in the Info Card. The Buyer
  draws that conclusion from the Signal; the extension does not.
- Do not introduce a **Baseline** comparator in v0.1: no "typical author"
  number, no percentile, no colour coding, no threshold highlight. The
  Info Card shows raw counts, attributed *per Amazon*. Revisit after v0.1
  usage.

## Flagged ambiguities

- **"buyer" vs "reader" vs "consumer"**: the extension targets the **Buyer** at
  point of purchase. "Reader" is post-purchase. "Consumer" is too vague and
  should be avoided entirely in product copy.
- **"author" overloaded**: in the conversation, "author" meant both real humans
  and AI-manufactured identities. Always qualify as **Original Author** or
  **Synthetic Author** when the distinction matters; plain **Author** is
  acceptable only when the status is unknown or irrelevant.
- **"fake" vs "synthetic" vs "AI-generated"**: use **Synthetic Author** in docs
  and UI copy. "Fake" is legally risky without proof (defamation exposure).
  "AI-generated" presumes proof of AI involvement that the extension cannot
  establish.
- **"score" vs "signal"**: a **Trust Signal** is a single input; a **Trust
  Score** is a composite. The conversation used them loosely; tighten this in
  all code and copy. MVP ships Signals only, no Score.
- **"platform" vs "marketplace"**: canonical is **Marketplace**. "Platform" is
  ambiguous (Amazon, KDP, AWS all qualify).
- **"bot"**: informal and imprecise. Use **Synthetic Author** when referring to
  the fake author identity, **Fraudster** when referring to the human operator.
- **"score" as product name, resolved**: the repository is `WriterAIScore`
  (commit-history continuity). The public-facing product name is
  *WhoWroteThis*. Code identifiers and internal docs may use *WriterAIScore*;
  user-visible copy, the manifest `name` field, and the Chrome Web Store
  listing use *WhoWroteThis*. Decision date: 2026-04-17. See `grill.org`.
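
## Sketch: v0.1 Publication Velocity counts

The v0.1 behaviour described in the example dialogue and the Versioned scope table (Author Cache first with a seven-day TTL, then raw Publication Velocity as a rolling 12-month count and a calendar-year count) could look roughly like the TypeScript below. This is a minimal sketch under stated assumptions, not the project's implementation: `CatalogEntry`, `AuthorCache`, `publicationVelocity`, and the in-memory `Map` standing in for per-browser storage are all hypothetical names, and the window boundaries (UTC, exclusive start, inclusive end) are an assumption the glossary does not settle.

```typescript
// Hypothetical shapes; none of these names come from the project codebase.
interface CatalogEntry {
  title: string;
  publishedOn: Date; // publication date as listed in the Marketplace catalog
}

interface PublicationVelocity {
  rolling12mo: number;  // books published in the 12 months ending at `asOf`
  calendarYear: number; // books published so far in the calendar year of `asOf`
}

// Seven-day TTL, as stated in the example dialogue.
const SEVEN_DAYS_MS = 7 * 24 * 60 * 60 * 1000;

interface CacheRecord {
  storedAt: number; // epoch milliseconds when the Lookup result was cached
  catalog: CatalogEntry[];
}

// In-memory stand-in for the Author Cache (assumption: a real extension
// would persist this in per-browser storage instead of a Map).
class AuthorCache {
  private readonly records = new Map<string, CacheRecord>();
  private readonly ttlMs: number;

  constructor(ttlMs: number = SEVEN_DAYS_MS) {
    this.ttlMs = ttlMs;
  }

  // Returns the cached catalog, or undefined if missing or older than the TTL.
  get(authorKey: string, now: number): CatalogEntry[] | undefined {
    const record = this.records.get(authorKey);
    if (record === undefined) return undefined;
    if (now - record.storedAt > this.ttlMs) {
      this.records.delete(authorKey); // expired: force a fresh Lookup
      return undefined;
    }
    return record.catalog;
  }

  set(authorKey: string, catalog: CatalogEntry[], now: number): void {
    this.records.set(authorKey, { storedAt: now, catalog });
  }
}

// Raw counts only: no Baseline, no threshold, no label.
function publicationVelocity(
  catalog: CatalogEntry[],
  asOf: Date,
): PublicationVelocity {
  const end = asOf.getTime();
  const windowStart = new Date(end);
  windowStart.setUTCFullYear(windowStart.getUTCFullYear() - 1);
  const start = windowStart.getTime();
  const year = asOf.getUTCFullYear();

  let rolling12mo = 0;
  let calendarYear = 0;
  for (const entry of catalog) {
    const t = entry.publishedOn.getTime();
    if (t > end) continue; // ignore pre-orders and future-dated entries
    if (t > start) rolling12mo++;
    if (entry.publishedOn.getUTCFullYear() === year) calendarYear++;
  }
  return { rolling12mo, calendarYear };
}
```

The function deliberately returns two raw numbers and nothing else, so the Info Card has no way to show a Baseline, a percentile, or a Synthetic Author label, in line with the standing rules above. The cache key is a plain string so the sketch stays neutral between the Author name used in the glossary and the Amazon author ID mentioned in the dialogue.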