# Some early thoughts on Arti FFI & RPC

There are certainly quite a few environments that we want Arti to be
deployable into! They include:

* Arti as a Rust crate, used from a Rust program written to use `async`
  networking. (This is what we have today.)
* Arti as a Rust crate, used from a Rust program that doesn't know
  about `async` networking. (I'll be skipping over this option.)
* Arti running as a separate process, accessed via RPC.
* Arti as a library, loadable from an arbitrary programming environment.
* Arti as a drop-in replacement for your program's existing networking
  solution.

I'm going to discuss options and implementation strategies for each one
below. Some general principles to think about are:

1. It would be nice if adding a new feature to Arti didn't require adding a
   huge amount of extra boilerplate to 5-10 different embedding systems.

2. It would be nice if existing programs that use C Tor could migrate to
   using Arti without too much effort.

Before we start looking at the strategies, let's talk about some
difficulties that any Arti API will face.

## Difficulties with designing APIs for a Tor implementation.

### Problem 1: Sockets are not so well supported

The major abstraction used in Tor is an **anonymous socket**, which
presents a problem with RPC:

It is not easy to transfer real sockets across all process boundaries.
You can do it with Unix sockets and `sendmsg`, and Windows has a similar
mechanism, but both are slight forms of dark magic.

Many RPC systems simply don't support transferring sockets. We can
instead add a proxy alongside the RPC mechanism (as C tor does), but
that does require additional coordination between the two mechanisms so
that the RPC can refer to the proxy's sockets unambiguously.
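One way to get that coordination is to have the RPC layer hand the application an opaque stream token, which the application then presents to the proxy as SOCKS5 username/password authentication (RFC 1929), so the proxy can associate each stream with the right RPC session. Here is a minimal sketch of the byte encoding involved; the token format and function names are hypothetical, not an existing Arti API:

```rust
/// SOCKS5 greeting offering username/password authentication (method 0x02).
fn socks5_greeting() -> Vec<u8> {
    vec![0x05, 0x01, 0x02]
}

/// RFC 1929 username/password subnegotiation carrying a hypothetical
/// RPC-issued stream token in the username field.
fn socks5_auth(token: &str) -> Vec<u8> {
    let user = token.as_bytes();
    let pass = b"x"; // password field unused in this sketch
    let mut msg = vec![0x01, user.len() as u8]; // version, username length
    msg.extend_from_slice(user);
    msg.push(pass.len() as u8);
    msg.extend_from_slice(pass);
    msg
}

fn main() {
    let auth = socks5_auth("rpc-session-42");
    assert_eq!(auth[0], 0x01); // auth subnegotiation version
    assert_eq!(auth[1] as usize, "rpc-session-42".len());
    println!("greeting: {:02x?}", socks5_greeting());
    println!("auth: {:02x?}", auth);
}
```

C tor already supports a related trick: SOCKS isolation by username/password, which keeps streams with different credentials on different circuits.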
### Problem 2: Sockets are not so well abstracted

Applications want to use sockets in their native format, which presents
a problem with FFI:

If all the world were `async` Rust, we could simply expose a type that
implemented `AsyncRead + AsyncWrite`. If all the world were Java, we
could expose a type that exposed an `InputStream` and an `OutputStream`.
If every C program were written using NSPR, we could expose a
`PRFileDesc`...

But in reality, the space of existing higher-level socket APIs is too
huge for us to emulate all of them.

So we need to support the most basic low-level socket API we can. On
Unix, that's a file descriptor. On Windows, that's a `SOCKET`.

Absent a set of uniform kernel plugins to let us define new file
descriptor types, our only viable option is to use a `socketpair()` API
to provide the application with a real honest-to-goodness socket, and to
proxy the other end of that socket over the Tor network.

This kind of approach consumes kernel resources, but there's no way
around that, and in most cases the overhead won't matter in comparison
to the rest of the work a Tor connection involves.

### Problem 3: The API surface is large

Arti (like C tor before it) has a **complex API surface**. There are
many knobs that can be turned, and many of them have their own
functions. We do not want (for example) to expose a separate function
call for every piece of our configuration builder infrastructure;
instead, we should look for solutions that let us hide complexity.

For example, we could expose access to our configuration as a
string-based tree structure, rather than as a separate function per
option. We can also use string-based or object-based properties to
configure streams, rather than exposing every option as a new function.
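A string-based configuration tree collapses the whole builder surface into one generic get/set pair. A minimal sketch of what that could look like; `ConfigTree` and the dotted key names are hypothetical, not Arti's real API:

```rust
use std::collections::HashMap;

/// Hypothetical string-keyed configuration tree: one generic get/set pair
/// instead of a separate exported function per option.
struct ConfigTree {
    entries: HashMap<String, String>,
}

impl ConfigTree {
    fn new() -> Self {
        ConfigTree { entries: HashMap::new() }
    }

    /// Set an option by dotted path, e.g. "proxy.socks_listen".
    fn set(&mut self, key: &str, value: &str) {
        self.entries.insert(key.to_string(), value.to_string());
    }

    /// Look up an option by dotted path.
    fn get(&self, key: &str) -> Option<&str> {
        self.entries.get(key).map(String::as_str)
    }
}

fn main() {
    let mut cfg = ConfigTree::new();
    cfg.set("proxy.socks_listen", "127.0.0.1:9150");
    assert_eq!(cfg.get("proxy.socks_listen"), Some("127.0.0.1:9150"));
    println!("socks_listen = {:?}", cfg.get("proxy.socks_listen"));
}
```

New options then become new keys rather than new exported functions, which keeps the FFI and RPC surfaces stable as Arti grows.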
(Our use of the `serde` crate might make this easier to solve, since we
already have access to our configuration as a structured tree of strings.)

### Problem 4: We want to expose asynchronous events

Many of the things that we want to tell applications about happen
asynchronously, such as circuit construction, log events, and bootstrap
events.

Not every RPC system makes this kind of API simple to expose. Some want
to have only one request at a time per "session", and make it nontrivial
or inefficient to support requests whose responses never end, or whose
responses might come a long time later. We need to make sure we avoid
these designs.

In-process FFI also makes this kind of thing tricky. The simplest way
to learn about events in process might be to register a callback, but
badly programmed callbacks have a tendency to get out of hand. Some
environments prefer to poll and drain a queue of events, but many
polling systems rely on fd-based notification, or behave badly if the
queue isn't drained fast enough.

Again, it might be best to offer the application a way to get a socket
to which Arti writes the information in some kind of structured way
using `serde`. (`serde` makes it easy to support a variety of formats,
including (say) JSON and MessagePack.)

## Thoughts on particular options

### Arti over RPC

There is a pretty large body of existing programs that use C tor by
launching it, connecting to a control port to manage it, and talking to
that control port over a somewhat clunky protocol.

In practice, some of these programs roll their own implementation of
launching and controlling C tor; others use an existing library like
`stem`, `txtorcon`, or `jtorctl`.
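For a taste of what these libraries have to speak: control-port replies are lines of the form `<3-digit status><separator><text>`, where the separator is a space for the final line of a reply, `-` for a continuation line, and `+` for the start of multi-line data. A minimal (not production-ready) parser sketch:

```rust
/// Parse one control-port reply line into (status, separator, text).
/// Returns None for malformed lines.
fn parse_reply_line(line: &str) -> Option<(u16, char, &str)> {
    let status: u16 = line.get(..3)?.parse().ok()?;
    let sep = *line.as_bytes().get(3)? as char;
    if !matches!(sep, ' ' | '-' | '+') {
        return None;
    }
    Some((status, sep, line.get(4..)?))
}

fn main() {
    // Final reply line.
    assert_eq!(parse_reply_line("250 OK"), Some((250, ' ', "OK")));
    // Continuation line, as in a GETINFO response.
    assert_eq!(
        parse_reply_line("250-version=0.4.8.0"),
        Some((250, '-', "version=0.4.8.0"))
    );
    // Garbage is rejected.
    assert_eq!(parse_reply_line("junk"), None);
    println!("parsed ok");
}
```

Even this tiny fragment hints at the clunkiness: the real protocol layers quoting rules, multi-line data blocks, and asynchronous `650` events on top of this framing.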
The existing control protocol is pretty complex, and it exposes an API
with a large surface that is somewhat attached to implementation details
of the C tor implementation.

There is also a fairly large body of RPC protocols out there _other_
than the Tor controller protocols! Using one of them would make Arti
easier to contact in environments that have support for (say) JSON-RPC,
but which don't want to do a from-scratch clone of our control protocol.

Here are several options that we might provide in Arti.

#### RPC via a control port clone.

We could attempt a control-protocol reimplementation. A complete
bug-compatible clone is probably impossible, since the control protocol
is immense and tied to details of C tor. But we might be able to do a
somewhat-compatible, very partial reimplementation. It's not clear how
much of the protocol we'd need to clone in order to actually support
existing applications, though!

Also note that the control port exposes more than a command-and-event
API: in addition to translating Arti's events into (for example) CIRC
events, we'd also need to translate Arti's configuration options so that
they looked similar to old C tor options. (Otherwise, for example,
`GETCONF SocksPort` wouldn't work, since Arti doesn't have an option
called `SocksPort`, and its SOCKS port configuration option doesn't
accept arguments in the same format.)

#### RPC via some standard system

We could create a new, incompatible RPC interface, using some standard
RPC framework. (See problems 1, 3, and 4 above for some constraints on
the RPC systems we could choose.) This is the cleanest approach, but of
course it doesn't help existing code that uses C tor.

(If we took this approach, we might be able to port one or more of the
libraries above (`txtorcon`, `stem`, `jtorctl`, etc.) to use the new RPC
interface.
That might be cleaner than a control port clone. But as
above, we'd need to translate more than the API:
`get_config("SocksPort")` would need a compatibility layer too.)

With an appropriate implementation strategy, it might be possible to
implement a subset of the C Tor control port protocol *in terms of*
a new protocol based on a sensible RPC framework.

#### A note about HTTP and RPC

Many popular RPC protocols are based upon HTTP. This creates a
challenge if we use them: specifically, your local web browser
makes a decent attack vector against any local HTTP service. We'll
need to make sure that any HTTP-based RPC system we build can resist the
usual attacks, of course. But we'll also need to make sure that
it's hard to trick any plausible client implementation holding the
credentials for the RPC system into accidentally leaking them or using
them for something else.

### Arti via FFI

We probably don't want to just expose all our Rust APIs unthinkingly,
because of problem 2 (other languages can't easily consume
`AsyncRead + AsyncWrite` sockets) and problem 3 (huge API surface) above.

Instead, we probably want to define a simplified API based on a handle
to a managed `TorClient` instance, `socketpair()`-based proxying, and
string-based handling of configuration and other similar data.

This API would have to work by launching our async runtime in a separate
thread, and communicating with it either via function calls or via
messages over some kind of queue.

Every `async` API that we want to re-export from `TorClient` would need
a blocking equivalent, a polling equivalent, or a callback-based
equivalent.

We'd have to expose a C API here. We might also want to provide wrappers
for that API in Java and Python.
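The blocking-equivalent pattern might look something like this: the runtime lives on its own thread, and each blocking call sends a request over a channel and waits for the reply. In this sketch a plain `std` thread stands in for the async runtime, and the request/response types are hypothetical:

```rust
use std::sync::mpsc;
use std::thread;

/// Hypothetical messages the blocking FFI layer sends to the runtime thread.
enum Request {
    Status(mpsc::Sender<String>),
    Shutdown,
}

/// Spawn the worker thread that would, in a real build, drive the async
/// runtime; here it just answers requests synchronously.
fn spawn_worker() -> mpsc::Sender<Request> {
    let (tx, rx) = mpsc::channel();
    thread::spawn(move || {
        for req in rx {
            match req {
                Request::Status(reply) => {
                    let _ = reply.send("bootstrapped: 100%".to_string());
                }
                Request::Shutdown => break,
            }
        }
    });
    tx
}

/// Blocking wrapper of the kind we could expose over a C ABI: send a
/// request, then block until the runtime thread replies.
fn status_blocking(worker: &mpsc::Sender<Request>) -> String {
    let (reply_tx, reply_rx) = mpsc::channel();
    worker.send(Request::Status(reply_tx)).expect("worker gone");
    reply_rx.recv().expect("worker gone")
}

fn main() {
    let worker = spawn_worker();
    assert_eq!(status_blocking(&worker), "bootstrapped: 100%");
    let _ = worker.send(Request::Shutdown);
    println!("status: ok");
}
```

Polling and callback equivalents could reuse the same channel: polling drains the receive side without blocking, and callbacks are invoked from the runtime thread as replies arrive.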
Fortunately, we don't have to worry about backward compatibility with
existing applications here, since there is no C tor API of this type.

### Arti as plugin

Some applications already have support for multiple networking
backends. With this in mind, we could expose Arti as one of those
backends.

For example, there's some interest in having Arti expose a
`libp2p` interface.

## Where to start?

### Selecting APIs

I think our first steps here would be to approach the question of APIs
from two ends:

1. What APIs do current applications use in C tor?

2. What APIs does Arti currently have and want to expose?

If we can find the simplest intersection of those two that is useful,
I suggest we begin by trying to expose that small intersection of APIs
via whatever candidate RPC and FFI mechanisms we think of.

The very simplest useful API is probably something like:

```
startup() -> *TorClient or error;
status(client: *TorClient) -> SomeStatusObject;
connect(client: *TorClient, target: *Address) -> Socket or error;
shutdown(client: *TorClient);
```

We could begin by implementing that, and then add other functionality as
needed.

### Picking our tooling

We'll need to do a survey of RPC options (including Rust tooling) and see
whether they provide a feasible way to support async events and/or
proxying.

We should see whether `cbindgen` can help us with our FFI needs.
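As a concrete illustration, the minimal API above might look like this through a C ABI. All names and types here are hypothetical and the bodies are stubs; `connect` is omitted since it would need the `socketpair()` proxying machinery from Problem 2, and `cbindgen` could generate the matching header from declarations like these:

```rust
use std::ptr;

/// Hypothetical opaque client handle; a real build would wrap a running
/// TorClient plus its runtime thread.
pub struct TorClient {
    bootstrapped: bool,
}

/// Start a client; returns null on failure.
#[no_mangle]
pub extern "C" fn arti_startup() -> *mut TorClient {
    Box::into_raw(Box::new(TorClient { bootstrapped: true }))
}

/// Report status as an integer code: 1 = ready, 0 = not ready, -1 = bad handle.
#[no_mangle]
pub extern "C" fn arti_status(client: *const TorClient) -> i32 {
    if client.is_null() {
        return -1;
    }
    let client = unsafe { &*client };
    if client.bootstrapped { 1 } else { 0 }
}

/// Free the client handle.
#[no_mangle]
pub extern "C" fn arti_shutdown(client: *mut TorClient) {
    if !client.is_null() {
        drop(unsafe { Box::from_raw(client) });
    }
}

fn main() {
    let client = arti_startup();
    assert_ne!(client, ptr::null_mut());
    assert_eq!(arti_status(client), 1);
    arti_shutdown(client);
    println!("ok");
}
```

Keeping the handle opaque and the status type a plain integer means the ABI can stay stable even as the Rust types behind it change.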