# Adding Bridges and Pluggable Transports to Arti

This document will go over the general issues that we face when building
client-side support for bridges and pluggable transports in Arti.


## Tor's anticensorship features: a lower-level perspective

Here's what you need to know about bridges.

Fundamentally, a "**Bridge**" is a relay that we use as the first hop
for our circuits _because it is configured by the user_, not because it is
listed in the main network directory.[^1]

A "Bridge" can either be reached by the regular Tor (cells over TLS)
protocol, or by some different censorship-resistant "**transport**"
protocol.[^2]

Users configure a single bridge by listing some of the following:
  * A set of supported transports that can be used.  If this set is
    empty, the client just uses the default transport.
  * A set of IP:Port addresses that can be used to reach the bridge.
    (With some transports, the transport itself figures out how to
    contact the bridge, and this set is empty or ignored.)
  * A set of identities to expect for the bridge.  (Note that C Tor
    allows this set to be empty; Arti will not, since it tends to
    create severe implementation headaches.)
  * For each transport, a set of transport-specific parameters.  (These
    might, for example, be additional protocol-specific authentication
    keys.)
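
As a rough sketch of how the per-bridge configuration listed above might
map onto Rust types, here is one possible shape.  Every name in it
(`BridgeConfig`, `BridgeIdentity`, `BridgesSection`) is hypothetical and
not an existing Arti API.  The only rule it enforces is the one stated
above: the identity set must not be empty.  The `enabled` flag lives
next to the bridge list rather than inside it, so that turning bridges
off never touches the list.

```rust
use std::collections::{BTreeMap, BTreeSet};
use std::net::SocketAddr;

/// Hypothetical identity type for a bridge; not Arti's real identity types.
#[derive(Debug, Clone, PartialEq, Eq, PartialOrd, Ord)]
pub enum BridgeIdentity {
    /// 20-byte RSA identity fingerprint.
    Rsa([u8; 20]),
    /// 32-byte Ed25519 identity key.
    Ed25519([u8; 32]),
}

/// One configured bridge, mirroring the four items a user can list.
#[derive(Debug, Clone, Default)]
pub struct BridgeConfig {
    /// Supported transport names; empty means "use the default transport".
    pub transports: BTreeSet<String>,
    /// Addresses for reaching the bridge; may be empty for some transports.
    pub addrs: Vec<SocketAddr>,
    /// Identities to expect; this sketch refuses an empty set.
    pub identities: BTreeSet<BridgeIdentity>,
    /// Per-transport parameters: transport name, then parameter name.
    pub params: BTreeMap<String, BTreeMap<String, String>>,
}

impl BridgeConfig {
    /// Check the one invariant this sketch cares about.
    pub fn validate(&self) -> Result<(), String> {
        if self.identities.is_empty() {
            return Err("a bridge must list at least one identity".to_string());
        }
        Ok(())
    }
}

/// Hypothetical top-level section: the on/off switch is independent of the list.
#[derive(Debug, Clone, Default)]
pub struct BridgesSection {
    pub enabled: bool,
    pub bridges: Vec<BridgeConfig>,
}
```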

Users can turn bridge usage on and off.  This is a single boolean that
does not require deleting their entire list of bridges.

Users can configure a large number of bridges; if they do, then we want
to pick randomly from among them and favor just a few, in the same way
that we do when choosing guard relays.  We want to reuse our `GuardMgr`
code for this.  (Doing so, however, may require a bit of refactoring,
since the current `GuardMgr` selects `Relay`s from a `NetDir`, and we'll
have to select `Bridge`s from some kind of underlying `BridgeSet`.)

Since bridges are not listed in the main network directory, we can't use
the directory to look up their **onion keys** (the ones we use to build
multihop circuits).  Instead, we have to connect to the bridge and ask
the bridge for a **router descriptor**: a self-signed document describing
the bridge and its supported keys.  Descriptors are only valid for a
while.

Some transports are implemented as external processes, using a
"**managed pluggable transport**" mechanism.  In this design, the Tor
client program is responsible for launching and monitoring external
binaries that provide transports over SOCKS4 or SOCKS5.  The [protocol]
for communicating with these binaries uses stdin, stdout, and the
environment.  To use these binaries as transports, the client treats
them as SOCKS4 or SOCKS5 proxies, and encodes per-connection arguments
in the authentication fields of the SOCKS handshakes.

A single managed PT binary can implement multiple transports: if it
does, each one gets its own local proxy address.
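
To make the per-connection argument encoding more concrete, here is a
minimal sketch.  It follows our reading of the PT spec linked above, and
should be checked against the spec before anyone relies on it: keys and
values are escaped by putting a backslash before each backslash, `=`,
and `;`; the `key=value` pairs are joined with `;`; and for SOCKS5 the
result is split across the username and password fields, with a single
NUL byte as the password when everything fits in the username.  The
function names are made up for this example.

```rust
/// Escape one key or value: backslash, '=' and ';' each get a leading backslash.
fn escape(s: &str) -> String {
    let mut out = String::with_capacity(s.len());
    for c in s.chars() {
        if matches!(c, '\\' | '=' | ';') {
            out.push('\\');
        }
        out.push(c);
    }
    out
}

/// Join escaped `key=value` pairs with ';'.
pub fn encode_pt_args(args: &[(&str, &str)]) -> String {
    args.iter()
        .map(|(k, v)| format!("{}={}", escape(k), escape(v)))
        .collect::<Vec<_>>()
        .join(";")
}

/// Split encoded arguments into SOCKS5 (username, password) fields.
///
/// Each field holds at most 255 bytes.  Returns `None` when there are no
/// arguments at all (this sketch assumes the caller then skips
/// username/password authentication) or when they do not fit in 510 bytes.
pub fn socks5_auth_fields(encoded: &str) -> Option<(Vec<u8>, Vec<u8>)> {
    let bytes = encoded.as_bytes();
    if bytes.is_empty() || bytes.len() > 510 {
        return None;
    }
    let (user, pass) = bytes.split_at(bytes.len().min(255));
    // An empty password is not allowed, so a lone NUL byte stands in for it.
    let pass = if pass.is_empty() { vec![0u8] } else { pass.to_vec() };
    Some((user.to_vec(), pass))
}
```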

[protocol]: https://gitlab.torproject.org/tpo/core/torspec/-/blob/main/pt-spec.txt

## Architectural implications

With those issues in mind, let's go through the parts of the
implementation that are simple.

We'll need to extend the definition of `ChanTarget` to include the
additional information that bridges need: which protocol to use, and
protocol-specific information.  We might want a separate trait for
`ChanTarget`s that can have this information, since relays will never
want to look at it, and in fact will require that it is absent.

We'll want to extend the `tor-chanmgr` crate to know about more ways to
launch channels.  It will probably have a registry of known transport
mechanisms (including the default transport) and know how to connect to
each one.
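
One way such a registry could look is sketched below.  All of the types
here (`ExtendedChanTarget`, `Transport`, `TransportRegistry`, and the
placeholder `Channel`) are invented for illustration.  A real version
would be async and would return an actual `tor-proto` channel; this one
is synchronous so that it stays self-contained.

```rust
use std::collections::HashMap;
use std::net::SocketAddr;

/// Hypothetical stand-in for what a bridge-aware `ChanTarget` would carry.
pub struct ExtendedChanTarget {
    /// Transport name; `None` selects the default transport.
    pub transport: Option<String>,
    pub addrs: Vec<SocketAddr>,
    pub params: Vec<(String, String)>,
}

/// Placeholder for an open channel; it only records which transport made it.
#[derive(Debug, PartialEq, Eq)]
pub struct Channel {
    pub via: String,
}

/// Anything that knows how to open a channel to a target.
pub trait Transport {
    fn connect(&self, target: &ExtendedChanTarget) -> Result<Channel, String>;
}

/// Registry of known transports, including the default one.
pub struct TransportRegistry {
    default: Box<dyn Transport>,
    named: HashMap<String, Box<dyn Transport>>,
}

impl TransportRegistry {
    pub fn new(default: Box<dyn Transport>) -> Self {
        TransportRegistry { default, named: HashMap::new() }
    }

    pub fn register(&mut self, name: &str, transport: Box<dyn Transport>) {
        self.named.insert(name.to_string(), transport);
    }

    /// Look up the transport the target asks for, then connect through it.
    pub fn connect(&self, target: &ExtendedChanTarget) -> Result<Channel, String> {
        let transport = match &target.transport {
            None => &self.default,
            Some(name) => self
                .named
                .get(name)
                .ok_or_else(|| format!("unknown transport {name:?}"))?,
        };
        transport.connect(target)
    }
}

/// Toy transport used only to exercise the registry.
pub struct Named(pub &'static str);

impl Transport for Named {
    fn connect(&self, _target: &ExtendedChanTarget) -> Result<Channel, String> {
        Ok(Channel { via: self.0.to_string() })
    }
}
```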

We'll need to implement a `tor-ptmgr` crate that launches and monitors
managed pluggable transport binaries.  It should have the ability to
launch and shut down PTs on demand, not just because they are
configured.  (In other words, if no bridge wants a given transport, we
shouldn't run that transport.)

We'll need to teach `tor-guardmgr` to be able to take its input from a
configured set of bridges rather than from a `NetDir`.  This needs to be
a separate "guard selection", since we want to be able to switch back
and forth between using bridges and not using bridges.

In `ChanMgr` and `GuardMgr`, we'll need a way to identify bridges. This
will be interesting, since a bridge can be configured with only a single
identity that is _not_ its Ed25519 identity.  (In `GuardMgr`, we might
have as little as an `RsaIdentity`.  In `ChanMgr`, we will have more
identity information, but only _after_ the channel handshake is
successful.)  If the same identity is listed twice with different
addresses and transports, we may need to treat them as different
bridges.[^3] We may need to assign configured bridges a local unique ID,
and use that to identify which bridge is which in `ChanMgr`.  We may need a
flexible matching approach in our `GuardMgr` code to see which
remembered guard is equivalent to which configured bridge.
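
A "flexible matching" rule could work roughly as in the sketch below:
two sets of identities refer to the same bridge if every identity type
they both know agrees, and they share at least one.  The types
(`KnownIds`, `IdMatch`, `ConfiguredBridge`) and the `u64` local ID are
assumptions made for this example, not existing Arti types.

```rust
/// Whatever identities we currently know for a bridge or remembered guard.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct KnownIds {
    pub rsa: Option<[u8; 20]>,
    pub ed25519: Option<[u8; 32]>,
}

#[derive(Debug, PartialEq, Eq)]
pub enum IdMatch {
    /// At least one identity type is known on both sides, and all such types agree.
    Same,
    /// Some identity type is known on both sides and disagrees.
    Different,
    /// No identity type is known on both sides, so we cannot tell.
    Unknown,
}

impl KnownIds {
    pub fn compare(&self, other: &KnownIds) -> IdMatch {
        let mut shared = 0;
        if let (Some(a), Some(b)) = (self.rsa, other.rsa) {
            if a != b {
                return IdMatch::Different;
            }
            shared += 1;
        }
        if let (Some(a), Some(b)) = (self.ed25519, other.ed25519) {
            if a != b {
                return IdMatch::Different;
            }
            shared += 1;
        }
        if shared > 0 { IdMatch::Same } else { IdMatch::Unknown }
    }
}

/// A configured bridge tagged with a locally assigned unique ID.
pub struct ConfiguredBridge {
    pub local_id: u64,
    pub ids: KnownIds,
}

/// Local IDs of every configured bridge that matches a remembered guard.
pub fn find_matching(remembered: &KnownIds, configured: &[ConfiguredBridge]) -> Vec<u64> {
    configured
        .iter()
        .filter(|b| remembered.compare(&b.ids) == IdMatch::Same)
        .map(|b| b.local_id)
        .collect()
}
```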

We'll need to download and cache bridges' router descriptors as needed.
This is different from downloading regular directory information in
several ways:
   * We can only download a bridge's descriptor _from that bridge_.
   * We need to be able to download a bridge's descriptor _even when we
     have no directory_.
   * When using bridges, we _only use bridges_ as our directory caches:
     never fallback directories.

Let's try, to the extent possible, to put all of the client-side
bridge and pluggable transport code behind Cargo features
(`bridge-client` and `pt-client`, maybe), so that we can disable them
for relays and for resource-constrained clients that don't want them.

## Challenges with implementing anticensorship in Arti

Now that we've been through all of that, here are some of the challenges
and open questions that we need to solve as we implement these
anticensorship features in Arti.

### Problem 1: The directory infrastructure and logic

Our existing directory code doesn't know about bridges.  We'll need to
think carefully about the logic that drives guard selection and
directory downloads.

We'll need an additional directory state where we try to make sure we
fetch bridge descriptors.  This has to happen after bridges are
selected.  There needs to be feedback between the `GuardMgr` and the
`DirMgr` here: the `GuardMgr` can't hand out bridges for multi-hop
circuits until it knows descriptors for them; the `DirMgr` can't fetch
any bridge descriptors until it knows what the `GuardMgr` wants.

(The `DirMgr` also needs to keep bridge descriptors separate from those
of regular relays, to avoid leaking whether we've used a given bridge
when using it as a relay, and vice versa.)
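
The feedback between the two managers can be pictured as two queries
over a shared descriptor cache, as in this sketch: the `DirMgr` side
asks which selected bridges still need a descriptor, and the `GuardMgr`
side asks which selected bridges are ready for multi-hop circuits.  The
names (`BridgeDescCache`, `Descriptor`, `BridgeId`) and the plain `u64`
timestamps are assumptions for illustration only.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Hypothetical local ID for a configured bridge.
pub type BridgeId = u64;

/// Placeholder descriptor: we only model its expiry time (in seconds).
pub struct Descriptor {
    pub expires_at: u64,
}

/// Bridge descriptors, stored apart from regular relay information.
#[derive(Default)]
pub struct BridgeDescCache {
    descs: BTreeMap<BridgeId, Descriptor>,
}

impl BridgeDescCache {
    pub fn insert(&mut self, id: BridgeId, desc: Descriptor) {
        self.descs.insert(id, desc);
    }

    /// True if we hold a descriptor for `id` that has not expired at `now`.
    fn is_fresh(&self, id: BridgeId, now: u64) -> bool {
        self.descs.get(&id).map_or(false, |d| d.expires_at > now)
    }

    /// GuardMgr side: selected bridges that may start multi-hop circuits.
    pub fn usable(&self, selected: &BTreeSet<BridgeId>, now: u64) -> BTreeSet<BridgeId> {
        selected.iter().copied().filter(|id| self.is_fresh(*id, now)).collect()
    }

    /// DirMgr side: selected bridges whose descriptor is missing or expired.
    pub fn needed(&self, selected: &BTreeSet<BridgeId>, now: u64) -> BTreeSet<BridgeId> {
        selected.iter().copied().filter(|id| !self.is_fresh(*id, now)).collect()
    }
}
```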

### Problem 2: Circuits through bridges

Our `CircMgr` can build one-hop directory circuits through any kind of
`ChanTarget`.  But right now it can only build multihop circuits by
first looking up the `Relay` object for the first hop in the `NetDir`.

Here we have two options: We can make bridges with known descriptors
into `Relay`s, or we can adjust `CircMgr` so that any `CircTarget` can
start a multihop circuit.

We'll also want a meaningful way to know if a bridge is in the same
family as a `Relay`, which presents its own challenges.

### Problem 3: Discarding unused channels and circuits

When a user turns bridges on and off, or changes the set of configured
bridges, we can easily have the `ChanMgr` and the `CircMgr` drop all of
their existing channels and circuits.  That will cause these channels
and circuits to close once there are no longer any streams using them,
which is all well and good.

But the user may want channels and circuits to close sooner!  People
sometimes get worried when they flip an "anticensorship" switch and
their non-resistant channels and circuits don't close immediately.

That's a challenge in our current `ChanMgr`/`CircMgr` API, since once
those managers drop a channel or circuit, nothing keeps track of it any
longer.  We might instead need to keep weak references
to deprecated channels and circuits.  But doing _that_ might require new
`WeakChannel` and `WeakCircuit` types in `tor-proto`.
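
The weak-reference idea can be illustrated with `std::sync::Weak`, as in
the sketch below.  The `Channel` and `ChanMgr` here are toy stand-ins,
not the real `tor-proto` or `tor-chanmgr` types, and whether a real
`WeakChannel` would wrap `Weak` directly is an open design question.
The point is only that a retired channel stays reachable for an explicit
close while somebody else still holds it, and costs nothing once
everybody has let go.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::{Arc, Weak};

/// Toy channel: all we model is whether it has been closed.
pub struct Channel {
    closed: AtomicBool,
}

impl Channel {
    pub fn open() -> Arc<Channel> {
        Arc::new(Channel { closed: AtomicBool::new(false) })
    }
    pub fn close(&self) {
        self.closed.store(true, Ordering::SeqCst);
    }
    pub fn is_closed(&self) -> bool {
        self.closed.load(Ordering::SeqCst)
    }
}

/// Toy manager that remembers retired channels only weakly.
#[derive(Default)]
pub struct ChanMgr {
    active: Vec<Arc<Channel>>,
    retired: Vec<Weak<Channel>>,
}

impl ChanMgr {
    pub fn add(&mut self, chan: Arc<Channel>) {
        self.active.push(chan);
    }

    /// Stop handing out the current channels, keeping weak references to them.
    pub fn retire_all(&mut self) {
        for chan in self.active.drain(..) {
            self.retired.push(Arc::downgrade(&chan));
        }
    }

    /// Close every retired channel that is still alive; returns how many.
    pub fn close_retired(&mut self) -> usize {
        let mut closed = 0;
        for weak in self.retired.drain(..) {
            if let Some(chan) = weak.upgrade() {
                chan.close();
                closed += 1;
            }
        }
        closed
    }
}
```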

### Problem 4: Channel equivalency, bridge identity

If a bridge's configured addresses or transports are changed, then
existing channels to that bridge may no longer be used.

If a bridge has multiple transports, we might need to remember which
ones work and which ones don't.

What's more, we might not always know an Ed25519 identity for a bridge:
this will mess with our guard and channel code, both of which assume
that all known relays have an Ed25519 identity.

### Problem 5: Tuning, tuning, tuning

Our existing code has some constants and consensus values that are tuned
for the main network.  We'll need to revisit them for bridges.  Notably,
we'll need to reconsider our required guard parallelism, our recommended
guard parallelism, our willingness to retry a guard that seems not to be
working, our timeouts, our happy-eyeballs parameters, and more.

### Problem 6: Existing bridge-line format

We would like to have backward compatibility with Tor's current bridge
configuration mechanism, which uses a line format something like this:

```
[TransportId] 1.2.3.4:9100 RsaIdentity [Param1=Val1] [Param2=Val2] ...
```

We need to support this indefinitely, though it has a number of design
problems, since its usage is established basically everywhere.
Nonetheless, we may want to look into alternatives, so that we could:

  * Have more identity types
  * Make addresses optional
  * Use a type better suited for encoding binary data.
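
For reference, a parser for the legacy line format shown above might
look something like the following sketch.  It makes several assumptions
that are ours rather than anything settled: the RSA identity is required
(in line with Arti not allowing an empty identity set) and written as 40
hex digits, the transport is absent exactly when the first token parses
as an address, and everything after the identity is a `Key=Value`
parameter.  `BridgeLine` and `parse_bridge_line` are hypothetical names.

```rust
use std::net::SocketAddr;

/// Parsed form of one legacy bridge line (hypothetical type).
#[derive(Debug, PartialEq, Eq)]
pub struct BridgeLine {
    /// `None` means the default transport.
    pub transport: Option<String>,
    pub addr: SocketAddr,
    pub rsa_identity: [u8; 20],
    pub params: Vec<(String, String)>,
}

/// Decode a 40-digit hex fingerprint into 20 bytes.
fn parse_fingerprint(s: &str) -> Option<[u8; 20]> {
    if s.len() != 40 || !s.bytes().all(|b| b.is_ascii_hexdigit()) {
        return None;
    }
    let mut out = [0u8; 20];
    for (i, byte) in out.iter_mut().enumerate() {
        *byte = u8::from_str_radix(&s[2 * i..2 * i + 2], 16).ok()?;
    }
    Some(out)
}

pub fn parse_bridge_line(line: &str) -> Result<BridgeLine, String> {
    let mut tokens = line.split_whitespace();
    let first = tokens.next().ok_or("empty bridge line")?;

    // If the first token is not an address, treat it as the transport name.
    let (transport, addr_tok) = match first.parse::<SocketAddr>() {
        Ok(_) => (None, first),
        Err(_) => (
            Some(first.to_string()),
            tokens.next().ok_or("missing address")?,
        ),
    };
    let addr: SocketAddr = addr_tok
        .parse()
        .map_err(|_| format!("bad address {addr_tok:?}"))?;

    let id_tok = tokens.next().ok_or("missing RSA identity")?;
    let rsa_identity =
        parse_fingerprint(id_tok).ok_or_else(|| format!("bad RSA identity {id_tok:?}"))?;

    // Remaining tokens are transport parameters; split at the first '='.
    let mut params = Vec::new();
    for tok in tokens {
        let (k, v) = tok
            .split_once('=')
            .ok_or_else(|| format!("parameter {tok:?} is not Key=Value"))?;
        params.push((k.to_string(), v.to_string()));
    }

    Ok(BridgeLine { transport, addr, rsa_identity, params })
}
```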

## APIs to design

These are some APIs to sketch out as next steps.

* Extended `ChanTarget`/`CircTarget` API

* Protocol or `TransportId` API

* Revised `GuardMgr` interfaces

* `TransportRegistry` (part of `ChanMgr`, knows how to connect via different
  protocols.  Takes an `ExtendedChanTarget`; returns a `Result<Channel>`)

* `PtMgr` (handles managed pluggable transports)

* Whatever the heck is going on inside `DirMgr` and between
  `DirMgr`/`GuardMgr` now.

----

[^1]: In fact, bridges are typically _not_ listed in the main network
   directory: if they were, a censor could easily block their IP addresses.

[^2]: In practice, all of our transports are implemented as extra layers
   over which we tunnel our regular cells-over-TLS protocol.  This is a
   deliberate choice: Even when the transport provides authenticity
   and confidentiality on its own, we still run our regular
   cells-over-TLS protocol inside it.

[^3]: This is an uncommon case in C Tor, and we might not want to
   support it.