# Some early thoughts on Arti FFI & RPC

There are certainly quite a few environments that we want Arti to be
deployable into!  They include:

 * Arti as a Rust crate, used from a Rust program written to use `async`
   networking.  (This is what we have today.)
 * Arti as a Rust crate, used from a Rust program that doesn't know
   about `async` networking.  (I'll be skipping over this option.)
 * Arti running as a separate process, accessed via RPC.
 * Arti as a library, loadable from an arbitrary programming environment.
 * Arti as a drop-in replacement for your program's existing networking
   solution.

Below, I'm going to discuss options and implementation strategies for
each one.  Some general principles to think about are:

1. It would be nice if adding a new feature to Arti didn't require adding a
   huge amount of extra boilerplate to 5-10 different embedding systems.

2. It would be nice if existing programs that use C Tor could migrate to
   using Arti without too much effort.

Before we start looking at the strategies, let's talk about some
difficulties that any Arti API will face.

## Difficulties with designing APIs for a Tor implementation

### Problem 1: Sockets are not so well supported

The major abstraction used in Tor is an **anonymous socket**, which
presents a problem with RPC:

It is not easy to transfer real sockets across all process boundaries.
You can do it with Unix sockets and `sendmsg`, and there is a similar
Windows thing, but they are both slight forms of dark magic.

Many RPC systems simply don't support transferring sockets.  We can
instead add a proxy alongside the RPC mechanism (like C Tor does), but
that does require additional coordination between the two mechanisms so
that the RPC can refer to the proxy's sockets unambiguously.

### Problem 2: Sockets are not so well abstracted

Applications want to use sockets in their native format, which presents
a problem with FFI:

If all the world were `async` Rust, we could simply expose a type that
implemented `AsyncRead + AsyncWrite`.  If all the world were Java, we
could expose a type that exposed an `InputStream` and an `OutputStream`.
If every C program were written using NSPR, we could expose a
`PRFileDesc`...

But in reality, the space of existing higher-level socket APIs is too
huge for us to emulate all of them.

So we need to support the most basic low-level socket API we can.  On
Unix, that's a file descriptor.  On Windows, that's a `SOCKET`.

Absent a set of uniform kernel plugins to let us define new file
descriptor types, our only viable option is to use a `socketpair()` API
to provide the application with a real honest-to-goodness socket, and to
proxy the other end of that socket over the Tor network.

This kind of approach consumes kernel resources, but there's no way
around that, and in most cases the overhead won't matter in comparison
to the rest of the Tor network API.

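To make the `socketpair()` idea concrete, here is a minimal Unix-only
sketch using only the Rust standard library.  Everything in it is
hypothetical: `open_proxied_socket` and `fake_tor_stream` are names
invented for illustration, and the "Tor side" is a stand-in that simply
upper-cases the bytes it receives instead of relaying them over a
circuit.

```rust
use std::io::{Read, Write};
use std::os::unix::net::UnixStream;
use std::thread;

/// Stand-in for the Tor side of a stream (hypothetical): it upper-cases
/// whatever it receives instead of relaying it over the network.
fn fake_tor_stream(input: &[u8]) -> Vec<u8> {
    input.to_ascii_uppercase()
}

/// Give the application one end of a socketpair, and proxy the other
/// end on a background thread.
fn open_proxied_socket() -> std::io::Result<UnixStream> {
    let (app_end, mut arti_end) = UnixStream::pair()?;
    thread::spawn(move || {
        let mut buf = [0u8; 4096];
        loop {
            match arti_end.read(&mut buf) {
                // EOF or error: the application closed its end.
                Ok(0) | Err(_) => break,
                Ok(n) => {
                    let reply = fake_tor_stream(&buf[..n]);
                    if arti_end.write_all(&reply).is_err() {
                        break;
                    }
                }
            }
        }
    });
    Ok(app_end)
}

/// Example use: write to the socket, then read the proxied reply.
fn demo_round_trip() -> std::io::Result<String> {
    let mut sock = open_proxied_socket()?;
    sock.write_all(b"hello")?;
    let mut reply = [0u8; 5];
    sock.read_exact(&mut reply)?;
    Ok(String::from_utf8_lossy(&reply).into_owned())
}
```

The application only ever sees an ordinary stream socket; the cost is
one extra socket and one proxying task per stream, which is the kernel
resource overhead mentioned above.
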
### Problem 3: The API surface is large

Arti (like C Tor before it) has a **complex API surface**.  There are
many knobs that can be turned, and many of them have their own
functions.  We do not want (for example) to expose a separate function
call for every piece of our configuration builder infrastructure;
instead, we should look for solutions that let us hide complexity.

For example, we could expose access to our configuration as a
string-based tree structure, rather than as a separate function per
option.  We can also use string-based or object-based properties to
configure streams, rather than exposing every option as a new function.

(Our use of the `serde` crate might make this easier to solve, since we
already have access to our configuration as a structured tree of strings.)

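As a rough illustration of the string-based tree idea, the sketch below
exposes the whole configuration through two calls, `get` and `set`,
keyed by dotted paths.  It is an assumption-laden toy: `ConfigNode` is
invented here, the key `proxy.socks_port` is only an example name, and a
real version would presumably be backed by the existing `serde`-based
configuration rather than a hand-written tree.

```rust
use std::collections::BTreeMap;

/// A configuration value: either a leaf string or a nested table.
/// (Hypothetical type, for illustration only.)
#[derive(Debug, Clone, PartialEq)]
enum ConfigNode {
    Leaf(String),
    Table(BTreeMap<String, ConfigNode>),
}

impl ConfigNode {
    /// Look up a dotted path such as "proxy.socks_port".
    /// Returns None if the path is missing or names a table.
    fn get(&self, path: &str) -> Option<&str> {
        let mut node = self;
        for key in path.split('.') {
            match node {
                ConfigNode::Table(map) => node = map.get(key)?,
                ConfigNode::Leaf(_) => return None,
            }
        }
        match node {
            ConfigNode::Leaf(value) => Some(value.as_str()),
            ConfigNode::Table(_) => None,
        }
    }

    /// Set a dotted path, creating intermediate tables as needed.
    /// Fails if the path tries to descend through an existing leaf.
    fn set(&mut self, path: &str, value: &str) -> Result<(), String> {
        let map = match self {
            ConfigNode::Table(map) => map,
            ConfigNode::Leaf(_) => {
                return Err(format!("cannot descend into a leaf at {:?}", path))
            }
        };
        match path.split_once('.') {
            None => {
                map.insert(path.to_string(), ConfigNode::Leaf(value.to_string()));
                Ok(())
            }
            Some((head, rest)) => map
                .entry(head.to_string())
                .or_insert_with(|| ConfigNode::Table(BTreeMap::new()))
                .set(rest, value),
        }
    }
}
```

With this shape, adding a new option to Arti adds a new key, not a new
exported function, so the FFI and RPC layers would not need to change.
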
### Problem 4: We want to expose asynchronous events

Many of the things that we want to tell applications about happen
asynchronously, such as circuit construction, log events, and bootstrap
events.

Not every RPC system makes this kind of API simple to expose.  Some want
to have only one request at a time per "session", and make it nontrivial
or inefficient to support requests whose responses never end, or whose
responses might come a long time later.  We need to make sure we avoid
these designs.

In-process FFI also makes this kind of thing tricky.  The simplest way
to learn about events in process might be to register a callback, but
badly programmed callbacks have a tendency to get out of hand.  Some
environments prefer to poll and drain a queue of events, but many
polling systems rely on fd-based notification, or behave badly if the
queue isn't drained fast enough.

Again, it might be best to offer the application a way to get a socket
which Arti writes the information to in some kind of structured way
using serde.  (serde makes it easy to support a variety of formats
including (say) JSON and MessagePack.)

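Here is one way the "events over a socket" idea could look, as a
Unix-only sketch with the standard library alone.  The `Event` variants,
field names, and the one-JSON-object-per-line framing are all
assumptions made for this example, and the JSON is formatted by hand
only to keep the sketch dependency-free; in Arti itself serde would be
the natural way to produce it.

```rust
use std::io::{BufRead, BufReader, Write};
use std::os::unix::net::UnixStream;
use std::thread;

/// Hypothetical event types; the real set would come from Arti.
enum Event {
    Bootstrap { percent: u8 },
    CircuitBuilt { id: u32 },
}

impl Event {
    /// Encode the event as one JSON object followed by a newline.
    /// (Hand-written here; only numeric fields, so no escaping is needed.)
    fn to_json_line(&self) -> String {
        match self {
            Event::Bootstrap { percent } => {
                format!("{{\"type\":\"bootstrap\",\"percent\":{}}}\n", percent)
            }
            Event::CircuitBuilt { id } => {
                format!("{{\"type\":\"circuit_built\",\"id\":{}}}\n", id)
            }
        }
    }
}

/// Return a socket that the application can read events from.  A
/// background thread plays the part of Arti and writes the events.
fn open_event_socket(events: Vec<Event>) -> std::io::Result<UnixStream> {
    let (app_end, mut arti_end) = UnixStream::pair()?;
    thread::spawn(move || {
        for event in events {
            if arti_end.write_all(event.to_json_line().as_bytes()).is_err() {
                break; // the application closed its end
            }
        }
        // Dropping `arti_end` here closes the stream; the reader sees EOF.
    });
    Ok(app_end)
}

/// Example use: read every event line until the stream ends.
fn demo_events() -> std::io::Result<Vec<String>> {
    let sock = open_event_socket(vec![
        Event::Bootstrap { percent: 50 },
        Event::CircuitBuilt { id: 7 },
    ])?;
    BufReader::new(sock).lines().collect()
}
```

Because the application receives a plain socket, it can wait for events
with whatever fd-based polling mechanism it already uses, and a slow
reader is throttled by the socket buffer instead of by an in-process
queue.
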
## Thoughts on particular options

### Arti over RPC

There is a pretty large body of existing programs that use C Tor by
launching it, connecting to a control port to manage it, and talking to
that control port over a somewhat clunky protocol.

In practice, some of these programs roll their own implementation of
launching and controlling C Tor; others use an existing library like
`stem`, `txtorcon`, or `jtorctl`.

The existing control protocol is pretty complex, and it exposes an API
with a large surface that is somewhat attached to implementation details
of the C Tor implementation.

There is also a fairly large body of RPC protocols out there _other_
than the Tor controller protocols!  Using one of them would make Arti
easier to contact in environments that have support for (say) JSON-RPC,
but which don't want to do a from-scratch clone of our control protocol.

Here are several options that we might provide in Arti.

#### RPC via a control port clone

We could attempt a control-protocol reimplementation.  A complete
bug-compatible clone is probably impossible, since the control protocol
is immense, and tied to details of C Tor.  But we might be able to do a
somewhat-compatible, very partial reimplementation.  It's not clear how
much of the protocol we'd need to clone in order to actually support
existing applications, though!

Also note that the control port exposes more than the control port API:
In addition to translating e.g. CIRC events to Arti, we'd also need to
translate Arti's configuration options so that they looked similar to
old C Tor options.  (Otherwise, for example, `GETCONF SocksPort`
wouldn't work, since Arti doesn't have an option called `SocksPort`, and
its SOCKS port configuration option doesn't accept arguments in the same
format.)

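The option-translation problem can be sketched as a small lookup layer
in front of Arti's own configuration.  This is only a guess at the
shape: `arti_key_for` and `handle_getconf` are invented names, the Arti
keys on the right-hand side of the table are placeholders, the
case-insensitive matching is an assumption, and the reply lines are
modeled loosely on the control protocol's `250` and `552` replies, so
their exact text would need checking against the specification.  Value
formats are passed through unchanged here, although a real layer would
have to rewrite them too.

```rust
/// Hypothetical mapping from C Tor option names to Arti configuration
/// keys.  The Arti key names are placeholders for illustration.
fn arti_key_for(c_tor_option: &str) -> Option<&'static str> {
    // Assumption: option names are matched case-insensitively.
    match c_tor_option.to_ascii_lowercase().as_str() {
        "socksport" => Some("proxy.socks_port"),
        "dnsport" => Some("proxy.dns_port"),
        _ => None,
    }
}

/// Answer a `GETCONF <option>` request by translating the option name
/// and asking `lookup` (a stand-in for Arti's configuration) for the
/// value.  The value is returned as-is; translating its format is not
/// attempted in this sketch.
fn handle_getconf(option: &str, lookup: impl Fn(&str) -> Option<String>) -> String {
    match arti_key_for(option).and_then(|key| lookup(key)) {
        Some(value) => format!("250 {}={}\r\n", option, value),
        None => format!("552 Unrecognized configuration key \"{}\"\r\n", option),
    }
}
```

Even in this toy form, every supported C Tor option needs its own table
entry, which gives a feel for how much work a broadly compatible clone
would be.
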
#### RPC via some standard system

We could create a new incompatible RPC interface, using some standard
RPC framework.  (See problems 1, 3, and 4 above for some constraints on
the RPC systems we could choose.)  This is the cleanest approach, but of
course it doesn't help existing code that uses C Tor.

(If we took this approach, we might be able to port one or more of the
libraries above (`txtorcon`, `stem`, `jtorctl`, etc.) to use the new RPC
interface.  That might be cleaner than a control port clone.  But as
above, we'd need to translate more than the API:
`get_config("SocksPort")` would need a compatibility layer too.)

With an appropriate implementation strategy, it might be possible to
implement a subset of the C Tor control port protocol *in terms of*
a new protocol based on a sensible RPC framework.

#### A note about HTTP and RPC

Many popular RPC protocols are based upon HTTP.  This creates a
challenge if we use them: specifically, that your local web browser
makes a decent attack vector against any local HTTP service.  We'll
need to make sure that any HTTP-based RPC system we build can resist the
usual attacks, of course.  But we'll also need to make sure that
it's hard to trick any plausible client implementation holding the
credentials for the RPC system into accidentally leaking them or using
them for something else.

### Arti via FFI

We probably don't want to just expose all our Rust APIs unthinkingly,
because of problem 2 (other languages can't easily consume
`AsyncRead+AsyncWrite` sockets) and problem 3 (huge API surface) above.

Instead, we probably want to define a simplified API based on a handle
to a managed TorClient instance, `socketpair()`-based proxying, and
string-based handling of configuration and other similar data.

This API would have to work by launching our async runtime in a separate
thread, and communicating with it either via function calls or via
messages over some kind of queue.

Every `async` API that we want to re-export from TorClient would need to
get either a blocking equivalent, a polling equivalent, or a
callback-based equivalent.

We'd have to expose a C API here.  We might also want to provide wrappers
for that API in Java and Python.

Fortunately, we don't have to worry about backward compatibility with
existing applications here, since there is not a C Tor API of this type.

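The "runtime in a separate thread, reached through a queue" design
described above, together with a blocking equivalent of one call, might
look roughly like this.  It is a sketch under heavy assumptions:
`ClientHandle`, `Request`, and `status_blocking` are invented names, the
worker thread only pretends to be an async runtime, and the status
string is a placeholder.

```rust
use std::sync::mpsc;
use std::thread;

/// Messages sent from the caller's thread to the runtime thread.
/// (Hypothetical; a real set would mirror the exported API.)
enum Request {
    Status { reply: mpsc::Sender<String> },
    Shutdown,
}

/// Handle owned by the application: a queue into the runtime thread.
struct ClientHandle {
    requests: mpsc::Sender<Request>,
    worker: Option<thread::JoinHandle<()>>,
}

impl ClientHandle {
    /// Start the background thread.  A real implementation would start
    /// the async runtime here and drive a TorClient on it.
    fn start() -> Self {
        let (tx, rx) = mpsc::channel::<Request>();
        let worker = thread::spawn(move || {
            for request in rx {
                match request {
                    Request::Status { reply } => {
                        // Placeholder status; ignore a caller that went away.
                        let _ = reply.send("bootstrapped: 100%".to_string());
                    }
                    Request::Shutdown => break,
                }
            }
        });
        ClientHandle { requests: tx, worker: Some(worker) }
    }

    /// Blocking equivalent of an async status call: send a request,
    /// then wait on a one-shot reply channel.
    fn status_blocking(&self) -> Option<String> {
        let (reply_tx, reply_rx) = mpsc::channel();
        self.requests.send(Request::Status { reply: reply_tx }).ok()?;
        reply_rx.recv().ok()
    }

    /// Ask the runtime thread to stop, and wait for it to exit.
    fn shutdown(mut self) {
        let _ = self.requests.send(Request::Shutdown);
        if let Some(worker) = self.worker.take() {
            let _ = worker.join();
        }
    }
}
```

A polling or callback-based equivalent could reuse the same queue; only
the way the reply is delivered to the caller would change.
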
### Arti as a plugin

Some applications already have support for multiple networking
backends.  With this in mind, we could expose Arti as one of those.

For example, there's some interest in having Arti expose a
`libp2p` interface.

## Where to start?

### Selecting APIs

I think our first steps here would be to approach the question of APIs from
two ends.

1. What APIs do current applications use in C Tor?

2. What APIs does Arti currently have and want to expose?

If we can find the simplest intersection of those two that is useful,
I suggest we begin by trying to expose that small intersection of APIs
via whatever candidate RPC and FFI mechanisms we think of.

The very simplest useful API is probably something like:

```
  startup() -> *TorClient or error;
  status(client: *TorClient) -> SomeStatusObject;
  connect(client: *TorClient, target: *Address) -> Socket or error;
  shutdown(client: *TorClient);
```

We could begin by implementing that, and then add other functionality as
needed.

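To give a feel for what those four calls could look like behind a C
ABI, here is a Unix-only sketch.  None of it is an actual Arti
interface: the `arti_*` function names, the integer return conventions,
and the contents of `TorClient` are made up, and the "connection" is a
`socketpair()` whose far end merely echoes bytes back.  To export the
symbols under these exact names, each function would additionally need
the `no_mangle` attribute, which is left out here to keep the sketch
independent of the Rust edition.

```rust
use std::ffi::CStr;
use std::io::{Read, Write};
use std::os::raw::{c_char, c_int};
use std::os::unix::io::{IntoRawFd, RawFd};
use std::os::unix::net::UnixStream;
use std::thread;

/// Opaque to C callers.  The field is a placeholder.
pub struct TorClient {
    bootstrapped: bool,
}

/// startup(): returns an owned handle; free it with `arti_shutdown`.
pub extern "C" fn arti_startup() -> *mut TorClient {
    Box::into_raw(Box::new(TorClient { bootstrapped: true }))
}

/// status(): 1 if bootstrapped, 0 if not, -1 if `client` is null.
pub unsafe extern "C" fn arti_status(client: *const TorClient) -> c_int {
    match unsafe { client.as_ref() } {
        Some(c) => c.bootstrapped as c_int,
        None => -1,
    }
}

/// connect(): returns a file descriptor owned by the caller, or -1.
/// `target` must be a NUL-terminated string such as "example.com:80".
pub unsafe extern "C" fn arti_connect(
    client: *const TorClient,
    target: *const c_char,
) -> RawFd {
    if client.is_null() || target.is_null() {
        return -1;
    }
    let target = unsafe { CStr::from_ptr(target) };
    if target.to_str().is_err() {
        return -1; // reject targets that are not valid UTF-8
    }
    let (app_end, mut arti_end) = match UnixStream::pair() {
        Ok(pair) => pair,
        Err(_) => return -1,
    };
    // Placeholder for the Tor stream: echo every byte back.
    thread::spawn(move || {
        let mut buf = [0u8; 4096];
        while let Ok(n) = arti_end.read(&mut buf) {
            if n == 0 || arti_end.write_all(&buf[..n]).is_err() {
                break;
            }
        }
    });
    app_end.into_raw_fd()
}

/// shutdown(): frees the handle.  Passing null is a no-op.
pub unsafe extern "C" fn arti_shutdown(client: *mut TorClient) {
    if !client.is_null() {
        drop(unsafe { Box::from_raw(client) });
    }
}
```

A status code is the crudest possible `SomeStatusObject`; a string-based
or serde-encoded status would fit better with the approach suggested
under problem 3.
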
### Picking our tooling

We'll need to do a survey of RPC options (including Rust tooling) and see
whether they provide a feasible way to support async events and/or proxying.

We should see whether `cbindgen` can help us with our FFI needs.
253  
254