# Philosophy

## Brief

The philosophy of Proselint is preserved here as a direct excerpt from posts
made on the website on 2014-06-10 05:31:19Z-07. It should be referred to in
matters of deciding what the project should aim to achieve, and what it should
avoid.

Note that for lack of an open corpus, we are currently unable to keep track of
the lintscore. We aim to resolve this in the future.

## Approach

Is `proselint` yet another awful grammar checker?

*No*. Here's why not:

1. `proselint` does not focus on grammar, which is at once too easy and too
   hard: too easy because, for most native speakers, it comes naturally; too
   hard because, in its most general form, detecting grammatical errors is
   AI-complete, requiring human-level intelligence to get things right. Instead,
   we consider usage: redundancy, illogic, clichés, sexism, misspelling,
   inconsistency, misuse of symbols, malapropisms, oxymorons, security gaffes,
   hedging, apologizing, pretension, and more.
2. `proselint` is precise. Existing tools for improving prose raise so many
   false alarms that their advice cannot be trusted. Instead, the writer must
   carefully consider whether to accept or reject each change. We aim for a tool
   so precise that it becomes possible to unquestioningly adopt its
   recommendations and still come out ahead, with stronger, tighter prose.
   Better to be quiet and authoritative than loud and unreliable. We measure the
   performance of `proselint` by tracking its [lintscore](#lintscore).
3. `proselint` defers to the world's greatest writers and editors. We didn't
   make up this advice on our own. Instead, we aggregated their expertise,
   giving you direct access to humanity's collective understanding about the
   craft of writing.

## Lintscore

Proselint's "lintscore" metric, which we use to evaluate its performance,
reflects the desire to have a linter that catches many errors, but which takes
false alarms seriously. Better to say nothing than to say the wrong thing. And
the harm from saying the wrong thing is greater than the benefit of having said
the right thing. Thus our score metric is defined as

$$T \left(\frac{T}{F + T}\right)^k$$

where $T$ is the number of true positives (hits), $F$ is the number of false
positives (false alarms), and $k > 0$ is a temperature parameter that determines
the penalty for imprecision. In general, we choose as large a value of $k$ as we
can stomach, one that strongly discourages the creation of rules that can't be
trusted. Suppose that $k = 2$. Then if the linter flags $100$ errors, of which
$10$ are false positives, we have $T = 90$ and $F = 10$, and the score is
$90 \times 0.9^2 = 72.9$.
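
The formula is small enough to sketch directly in Python. The function below is an illustration only, not code taken from `proselint`: the name `lintscore`, its parameter names, and the choice to score an empty set of flags as zero are all assumptions made for this example.

```python
def lintscore(true_positives: int, false_positives: int, k: float = 2.0) -> float:
    """Compute T * (T / (F + T)) ** k for hypothetical hit and false-alarm counts."""
    if true_positives < 0 or false_positives < 0:
        raise ValueError("counts must be non-negative")
    if k <= 0:
        raise ValueError("k must be positive")

    flagged = true_positives + false_positives
    if flagged == 0:
        # Assumption: a linter that flags nothing scores zero, which avoids
        # dividing by zero. The formula itself does not define this case.
        return 0.0

    precision = true_positives / flagged
    return true_positives * precision ** k


# The worked example from the text: 100 flags, 10 of them false alarms.
print(round(lintscore(90, 10, k=2), 1))  # 72.9
```

Because precision is at most $1$, raising $k$ can only lower the score of an imprecise rule, while a rule with no false alarms keeps a score of exactly $T$ for any $k$.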