BridgeIssues.md
# Adding Bridges and Pluggable Transports to Arti

This document will go over the general issues that we face when building
client-side support for bridges and pluggable transports in Arti.


## Tor's anticensorship features: a lower-level perspective

Here's what you need to know about bridges.

Fundamentally, a "**Bridge**" is a relay that we use as the first hop
for our circuits _because it is configured by the user_, not because it is
listed in the main network directory.[^1]

A "Bridge" can either be reached by the regular Tor (cells over TLS)
protocol, or by some different censorship-resistant "**transport**"
protocol.[^2]

Users configure a single bridge by listing some of the following:
  * A set of supported transports that can be used. If this set is
    empty, the client just uses the default transport.
  * A set of IP:Port addresses that can be used to reach the bridge.
    (With some transports, the transport itself figures out how to
    contact the bridge, and this set is empty or ignored.)
  * A set of identities to expect for the bridge. (Note that C Tor
    allows this set to be empty; Arti will not, since it tends to
    create severe implementation headaches.)
  * For each transport, a set of transport-specific parameters. (These
    might, for example, be additional protocol-specific authentication
    keys.)

Users can turn bridge usage on and off. This is a single boolean that
does not require deleting their entire list of bridges.

Users can configure a large number of bridges; if they do, then we want
to pick randomly from among them and favor just a few, in the same way
that we do when choosing guard relays. We want to reuse our `GuardMgr`
code for this. (Doing so, however, may require a bit of refactoring,
since the current `GuardMgr` selects `Relay`s from a `NetDir`, and we'll
have to select `Bridge`s from some kind of underlying `BridgeSet`.)
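As a rough illustration of the per-bridge configuration and the on/off switch described above, here is a minimal sketch. Every type and field name in it (`BridgeConfig`, `BridgesConfig`, `BridgeIdentity`, `TransportId`, and so on) is hypothetical and chosen only for this example; none of it is an existing Arti API.

```rust
use std::collections::BTreeMap;
use std::net::SocketAddr;

/// Hypothetical identity type for this sketch (not Arti's real types).
#[derive(Clone, Debug, PartialEq, Eq, PartialOrd, Ord)]
pub enum BridgeIdentity {
    /// A 20-byte RSA identity fingerprint.
    Rsa([u8; 20]),
    /// A 32-byte Ed25519 identity key.
    Ed25519([u8; 32]),
}

/// Hypothetical name of a transport, such as "obfs4".
#[derive(Clone, Debug, PartialEq, Eq, PartialOrd, Ord)]
pub struct TransportId(pub String);

/// One configured bridge, mirroring the four bullet points above.
#[derive(Clone, Debug)]
pub struct BridgeConfig {
    /// Supported transports, each with its transport-specific parameters.
    /// An empty map means "use the default transport".
    pub transports: BTreeMap<TransportId, BTreeMap<String, String>>,
    /// Addresses for reaching the bridge; may be empty for some transports.
    pub addrs: Vec<SocketAddr>,
    /// Identities to expect; unlike C Tor, this must not be empty.
    pub identities: Vec<BridgeIdentity>,
}

#[derive(Debug, PartialEq, Eq)]
pub enum BridgeConfigError {
    /// The bridge was configured without any identity.
    NoIdentity,
}

impl BridgeConfig {
    /// Reject bridges that list no identity at all.
    pub fn validate(&self) -> Result<(), BridgeConfigError> {
        if self.identities.is_empty() {
            Err(BridgeConfigError::NoIdentity)
        } else {
            Ok(())
        }
    }

    /// True if the client should use the default (cells over TLS) transport.
    pub fn uses_default_transport(&self) -> bool {
        self.transports.is_empty()
    }
}

/// The whole bridge configuration: a list plus a single on/off switch.
#[derive(Clone, Debug, Default)]
pub struct BridgesConfig {
    pub enabled: bool,
    pub bridges: Vec<BridgeConfig>,
}

impl BridgesConfig {
    /// The bridges to consider right now. Disabling bridges hides the
    /// list without deleting it.
    pub fn active_bridges(&self) -> &[BridgeConfig] {
        if self.enabled {
            &self.bridges[..]
        } else {
            &self.bridges[..0]
        }
    }
}
```

Keeping `enabled` separate from `bridges` is what lets a user switch bridges off and back on without retyping anything; a guard-style selection would then draw only from `active_bridges()`.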

Since bridges are not listed in the main network directory, we can't use
the directory to look up their **onion keys** (the ones we use to build
multihop circuits). Instead, we have to connect to the bridge and ask
the bridge for a **router descriptor**: a self-signed document describing
the bridge and its supported keys. Descriptors are only valid for a
while.

Some transports are implemented as external processes, using a
"**managed pluggable transport**" mechanism. In this design, the Tor
client program is responsible for launching and monitoring external
binaries that provide transports over SOCKS4 or SOCKS5. The [protocol]
for communicating with these binaries uses stdin, stdout, and the
environment. To use these binaries as transports, the client treats
them as SOCKS4 or SOCKS5 proxies, and encodes per-connection arguments
in the authentication fields of the SOCKS handshakes.

A single managed PT binary can implement multiple transports: if it
does, each one gets its own local proxy address.

[protocol]: https://gitlab.torproject.org/tpo/core/torspec/-/blob/main/pt-spec.txt

## Architectural implications

With those issues in mind, let's go through the parts of the
implementation that are simple.

We'll need to extend the definition of `ChanTarget` to include the
additional information that bridges need: which protocol to use, and
protocol-specific information. We might want a separate trait for
`ChanTarget`s that can have this information, since relays will never
want to look at it, and in fact will require that it is absent.

We'll want to extend the `tor-chanmgr` crate to know about more ways to
launch channels. It will probably have a registry of known transport
mechanisms (including the default transport) and know how to connect to
each one.
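To make the registry idea above concrete, here is a minimal sketch of one possible shape. All of the names (`Transport`, `TransportRegistry`, `ExtendedChanTarget`, `SocksProxyTransport`) are assumptions made for this example, and `Channel` is a placeholder string wrapper rather than a real channel; no networking is performed.

```rust
use std::collections::HashMap;
use std::net::SocketAddr;

/// Hypothetical channel target carrying the extra bridge information.
pub struct ExtendedChanTarget {
    /// `None` means the default (cells over TLS) transport.
    pub transport: Option<String>,
    pub addrs: Vec<SocketAddr>,
    /// Transport-specific parameters.
    pub params: Vec<(String, String)>,
}

/// Placeholder for a real channel; it only records how it was "opened".
#[derive(Debug, PartialEq, Eq)]
pub struct Channel {
    pub via: String,
}

#[derive(Debug, PartialEq, Eq)]
pub enum ConnectError {
    UnknownTransport(String),
    NoAddress,
}

/// Hypothetical trait: one way of launching a channel.
pub trait Transport {
    fn connect(&self, target: &ExtendedChanTarget) -> Result<Channel, ConnectError>;
}

/// Stand-in for the default transport, which needs an address.
pub struct DefaultTransport;

impl Transport for DefaultTransport {
    fn connect(&self, target: &ExtendedChanTarget) -> Result<Channel, ConnectError> {
        let addr = target.addrs.first().ok_or(ConnectError::NoAddress)?;
        Ok(Channel { via: format!("tls to {}", addr) })
    }
}

/// Stand-in for a managed PT reached through its local SOCKS proxy.
pub struct SocksProxyTransport {
    pub proxy: SocketAddr,
}

impl Transport for SocksProxyTransport {
    fn connect(&self, _target: &ExtendedChanTarget) -> Result<Channel, ConnectError> {
        Ok(Channel { via: format!("socks proxy at {}", self.proxy) })
    }
}

/// Registry of known transports, including the default one.
pub struct TransportRegistry {
    default_transport: Box<dyn Transport>,
    named: HashMap<String, Box<dyn Transport>>,
}

impl TransportRegistry {
    pub fn new() -> Self {
        TransportRegistry {
            default_transport: Box::new(DefaultTransport),
            named: HashMap::new(),
        }
    }

    pub fn register(&mut self, name: &str, transport: Box<dyn Transport>) {
        self.named.insert(name.to_string(), transport);
    }

    /// Pick the transport that the target asks for, then connect with it.
    pub fn connect(&self, target: &ExtendedChanTarget) -> Result<Channel, ConnectError> {
        let transport: &dyn Transport = match &target.transport {
            None => &*self.default_transport,
            Some(name) => match self.named.get(name) {
                Some(boxed) => &**boxed,
                None => return Err(ConnectError::UnknownTransport(name.clone())),
            },
        };
        transport.connect(target)
    }
}
```

In this sketch a relay's target would always have `transport: None`, so relays never need to look at the bridge-only fields.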

We'll need to implement a `tor-ptmgr` crate that launches and monitors
managed pluggable transport binaries. It should have the ability to
launch and shut down PTs on demand, not just because they are
configured. (In other words, if no bridge wants a given transport, we
shouldn't run that transport.)

We'll need to teach `tor-guardmgr` to be able to take its input from a
configured set of bridges rather than from a `NetDir`. This needs to be
a separate "guard selection", since we want to be able to switch back
and forth between using bridges and not using bridges.

In `ChanMgr` and `GuardMgr`, we'll need a way to identify bridges. This
will be interesting, since bridges can be configured with only a single
identity that is _not_ their Ed25519 identity. (In `GuardMgr`, we might
have as little as an `RsaIdentity`. In `ChanMgr`, we will have more
identity information, but only _after_ the channel handshake is
successful.) If the same identity is listed twice with different
addresses and transports, we may need to treat them as different
bridges.[^3] We may need to assign configured bridges a local unique ID,
and use that to identify which bridge is which in `ChanMgr`. We may need a
flexible matching approach in our `GuardMgr` code to see which
remembered guard is equivalent to which configured bridge.

We'll need to download and cache bridges' router descriptors as needed.
This is different from downloading regular directory information in
several ways:
  * We can only download a bridge's descriptor _from that bridge_.
  * We need to be able to download a bridge's descriptor _even when we
    have no directory_.
  * When using bridges, we _only use bridges_ as our directory caches:
    never fallback directories.
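The "flexible matching" between remembered guards and configured bridges mentioned above could look something like the following sketch. The names `KnownIds`, `IdMatch`, and `compare_ids` are hypothetical, and the rule used here (one shared identity is enough to match, any disagreement is a conflict) is an assumption for illustration rather than a decided policy.

```rust
/// Hypothetical set of identities known for a bridge or a remembered
/// guard. Either identity may be unknown.
#[derive(Clone, Debug, PartialEq, Eq)]
pub struct KnownIds {
    pub rsa: Option<[u8; 20]>,
    pub ed25519: Option<[u8; 32]>,
}

/// Result of comparing two identity sets.
#[derive(Debug, PartialEq, Eq)]
pub enum IdMatch {
    /// Some identity type is known on both sides and differs.
    Conflict,
    /// No identity type is known on both sides, so nothing can be said.
    Unrelated,
    /// At least one shared identity agrees, and none disagrees.
    Same,
}

/// Compare one identity type: `Some(true)` if both sides know it and
/// agree, `Some(false)` if they disagree, `None` if either side lacks it.
fn agree<T: PartialEq>(a: &Option<T>, b: &Option<T>) -> Option<bool> {
    match (a, b) {
        (Some(a), Some(b)) => Some(a == b),
        _ => None,
    }
}

/// Decide whether two identity sets can describe the same bridge.
pub fn compare_ids(a: &KnownIds, b: &KnownIds) -> IdMatch {
    let mut any_agreement = false;
    for result in [agree(&a.rsa, &b.rsa), agree(&a.ed25519, &b.ed25519)] {
        match result {
            Some(false) => return IdMatch::Conflict,
            Some(true) => any_agreement = true,
            None => {}
        }
    }
    if any_agreement {
        IdMatch::Same
    } else {
        IdMatch::Unrelated
    }
}
```

A remembered guard that stored both identities after a successful handshake would then still match a bridge line that lists only the RSA identity. Telling apart two configured bridges that share an identity but differ in address or transport would need the local unique ID in addition to this comparison.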

Let's try, to the extent possible, to put all of the client-side
bridge and pluggable transport code behind Cargo features
(`bridge-client` and `pt-client`, maybe), so that we can disable them
for relays and for resource-constrained clients that don't want them.

## Challenges with implementing anticensorship in Arti

Now that we've been through all of that, here are some of the challenges
and open questions that we need to solve as we implement these
anticensorship features in Arti.

### Problem 1: The directory infrastructure and logic

Our existing directory code doesn't know about bridges. We'll need to
think carefully about the logic that drives guard selection and
directory downloads.

We'll need an additional directory state where we try to make sure we
fetch bridge descriptors. This has to happen after bridges are
selected. There needs to be feedback between the `GuardMgr` and the
`DirMgr` here: the `GuardMgr` can't hand out bridges for multi-hop
circuits until it knows descriptors for them; the `DirMgr` can't fetch
any bridge descriptors until it knows what the `GuardMgr` wants.

(The `DirMgr` also needs to keep bridge descriptors separate from those
of regular relays, to avoid leaking whether we've used a given bridge
when using it as a relay, and vice versa.)


### Problem 2: Circuits through bridges

Our `CircMgr` can build one-hop directory circuits through any kind of
`ChanTarget`. But right now it can only build multihop circuits by
first looking up the `Relay` object for the first hop in the `NetDir`.

Here we have two options: We can make bridges with known descriptors
into `Relay`s, or we can adjust `CircMgr` so that any `CircTarget` can
start a multihop circuit.

We'll also want a meaningful way to know if a bridge is in the same
family as a `Relay`, which presents its own challenges.
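The feedback loop from Problem 1, together with the rule that a bridge cannot start a multihop circuit until its descriptor is known, might be sketched as below. `BridgeId`, `BridgeDesc`, and `BridgeGuards` are hypothetical names for this example only, and descriptor expiry is deliberately left out.

```rust
use std::collections::HashMap;

/// Hypothetical local ID for a configured bridge.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
pub struct BridgeId(pub u32);

/// Placeholder for a parsed bridge router descriptor.
#[derive(Clone, Debug)]
pub struct BridgeDesc {
    pub onion_key: [u8; 32],
}

/// The guard-selection side of the loop.
#[derive(Default)]
pub struct BridgeGuards {
    /// Bridges currently selected as guards, in preference order.
    selected: Vec<BridgeId>,
    /// Descriptors learned so far for selected bridges.
    descs: HashMap<BridgeId, BridgeDesc>,
}

impl BridgeGuards {
    /// Record which bridges have been selected as guards.
    pub fn select(&mut self, ids: &[BridgeId]) {
        self.selected = ids.to_vec();
        // Forget descriptors for bridges that are no longer selected.
        let selected = &self.selected;
        self.descs.retain(|id, _| selected.contains(id));
    }

    /// What the directory side should fetch next: selected bridges
    /// for which no descriptor is known yet.
    pub fn wanted_descriptors(&self) -> Vec<BridgeId> {
        self.selected
            .iter()
            .copied()
            .filter(|id| !self.descs.contains_key(id))
            .collect()
    }

    /// Called by the directory side when a descriptor has been fetched.
    /// Descriptors for bridges that were never selected are ignored.
    pub fn note_descriptor(&mut self, id: BridgeId, desc: BridgeDesc) {
        if self.selected.contains(&id) {
            self.descs.insert(id, desc);
        }
    }

    /// Bridges that may be used as the first hop of a multihop circuit.
    pub fn usable_for_multihop(&self) -> Vec<BridgeId> {
        self.selected
            .iter()
            .copied()
            .filter(|id| self.descs.contains_key(id))
            .collect()
    }
}
```

Here the directory side would poll `wanted_descriptors()` (or be notified when it changes), fetch each descriptor from the bridge itself over a one-hop circuit, and report back with `note_descriptor()`.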

### Problem 3: Discarding unused channels and circuits

When a user turns bridges on and off, or changes the set of configured
bridges, we can easily have the `ChanMgr` and the `CircMgr` drop all of
their existing channels and circuits. That will cause these channels
and circuits to close once there are no longer any streams using them,
which is all well and good.

But the user may want channels and circuits to close sooner! People
sometimes get worried when they flip an "anticensorship" switch and
their non-resistant channels and circuits don't close immediately.

That's a challenge in our current `ChanMgr`/`CircMgr` API, since once
those managers drop a channel or circuit, nothing keeps track of it any
longer. We might instead need to keep weak references
to deprecated channels and circuits. But doing _that_ might require new
`WeakChannel` and `WeakCircuit` types in `tor-proto`.

### Problem 4: Channel equivalency, bridge identity

If a bridge's configured addresses or transports are changed, then
existing channels to that bridge may no longer be used.

If a bridge has multiple transports, we might need to remember which
ones work and which ones don't.

What's more, we might not always know an Ed25519 identity for a bridge:
this will mess with our guard and channel code, both of which assume
that all known relays have an Ed25519 identity.

### Problem 5: Tuning, tuning, tuning

Our existing code has some constants and consensus values that are tuned
for the main network. We'll need to revisit them for bridges. Notably,
we'll need to reconsider our required guard parallelism, our recommended
guard parallelism, our willingness to retry a guard that seems not to be
working, our timeouts, our happy-eyeballs parameters, and more.
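Returning briefly to the weak-reference idea from Problem 3: the sketch below uses `std::sync::Weak` as a stand-in for the `WeakChannel` type that might be needed in `tor-proto`. `ProtoChannel` and `ChanMgrSketch` are hypothetical placeholders, not the real channel or manager types.

```rust
use std::sync::{Arc, Weak};

/// Placeholder for a real channel type.
pub struct ProtoChannel {
    pub name: String,
}

/// A manager that remembers deprecated channels without keeping them alive.
#[derive(Default)]
pub struct ChanMgrSketch {
    /// Channels that may be handed out for new circuits.
    active: Vec<Arc<ProtoChannel>>,
    /// Channels that were dropped from `active` on a configuration change.
    deprecated: Vec<Weak<ProtoChannel>>,
}

impl ChanMgrSketch {
    pub fn add(&mut self, chan: Arc<ProtoChannel>) {
        self.active.push(chan);
    }

    /// Stop handing out every current channel, but keep weak references
    /// so that they can still be found later.
    pub fn deprecate_all(&mut self) {
        for chan in self.active.drain(..) {
            self.deprecated.push(Arc::downgrade(&chan));
        }
    }

    /// Deprecated channels that something else is still keeping alive.
    /// Entries whose channel is already gone are pruned.
    pub fn lingering(&mut self) -> Vec<Arc<ProtoChannel>> {
        self.deprecated.retain(|weak| weak.strong_count() > 0);
        self.deprecated.iter().filter_map(|weak| weak.upgrade()).collect()
    }
}
```

With a list like this, a "close everything now" action could walk `lingering()` and ask each channel to shut down, instead of waiting for its last stream to finish.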


### Problem 6: Existing bridge-line format

We would like to have backward compatibility with Tor's current bridge
configuration mechanism, which uses a line format something like this:

```
[TransportId] 1.2.3.4:9100 RsaIdentity [Param1=Val1] [Param2=Val2] ...
```

We need to support this indefinitely, though it has a number of design
problems, since its usage is established basically everywhere.
Nonetheless, we may want to look into alternatives, so that we could:

  * Have more identity types
  * Make addresses optional
  * Use a type better suited for encoding binary data.


## APIs to design

These are some APIs to sketch out as next steps.

  * Extended ChanTarget/CircTarget API

  * Protocol or TransportId API

  * Revised GuardMgr interfaces

  * TransportRegistry (part of ChanMgr, knows how to connect via different
    protocols. Takes an `ExtendedChanTarget`; returns a `Result<Channel>`)

  * PtMgr (handles managed pluggable transports)

  * Whatever the heck is going on inside DirMgr and between
    DirMgr/GuardMgr now.



----

[^1]: In fact, bridges are typically _not_ listed in the main network
directory: if they were, a censor could easily block their IP addresses.

[^2]: In practice, all of our transports are implemented as extra layers
over which we tunnel our regular cells-over-TLS protocol. This is a
deliberate choice: Even when the transport provides authenticity and
confidentiality on its own

[^3]: This is an uncommon case in C Tor, and we might not want to
support it.