Cradicle Explorer

/ data_sync / p2p-data-sync-comparison.md
p2p-data-sync-comparison.md
  1  # Different approaches to p2p data sync
  2  *Written March 29, 2019 by Oskar. Updated April 1, April 2.*
  3  
  4  **WARNING: This is an early draft, and likely contains errors.**
  5  
  6  This document compares various forms of data sync protocols and applications, along various dimensions, with the goal of making it easier for you to choose which one is for you. It is part of a larger series on secure messaging.
  7  
  8  >Request for comments: Please direct comments on this document to Oskar. These comments can be on the structure, content, things that are wrong, things that should be looked into more, things that are unclear, as well as anything else that comes to mind.
  9  
 10  ## Table of contents
 11  
 12  1. [Introduction](#1-introduction)
 13  2. [Background and definitions](#2-background-and-definitions)
 14  3. [Methodology](#3-methodology)
 15  4. [Comparison](#4-comparison)
 16  5. [Summary](#5-summary)
 17  6. [Acknowledgements](#6-acknowledgements)
 18  7. [References](#7-references)
 19  
 20  ## 1. Introduction
 21  In a p2p network you often want to reliably transmit and replicate data across participants. This can be either large files, or messages that users want to exchange between each other in a private chat. This is especially challenging on mobile devices. Additionally, you might want security properties beyond robustness, such as privacy-preservation, censorship resistance and coercion resistance.
 22  
 23  In this paper we go through various approaches to data sync in a p2p context. We then compare them across relevant dimensions to make it easier for you to make an informed decision about what solution you want to use.
 24  
 25  ## 2. Background and definitions
 26  
 27  For broad definitions please refer to the [glossary](../glossary.md), data sync specific definitions will be listed below.
 28  
 29  | Term | Definition |
 30  | ---- | ---------- |
 31  | Node | Some process that is able to store data, do processing and communicate with other nodes. |
 32  | Peer | The other nodes that a peer is connected to. |
 33  | Peer-to-peer (P2P) | Protocols where resources are divided among multiple peers, without the need of central coordination. |
 34  | Device | A node capable of storing some data locally. |
 35  | Identity | A user's cryptographic identity, usually a single or several public keypairs. |
 36  | User | A (human) end-user that may have multiple *devices*, and some form of identity. |
 37  | Data replication |Storing the same piece of data in multiple locations, in order to improve availability and performance. |
 38  | Data sync | Achieving *consistency* among a set of nodes storing data |
 39  | Mobile-friendly | Multiple factors that together make a solution suitable for mobile. These are things such as dealing with *mostly-offline* scenarios, *network churn*, *limited resources* (such as bandwidth, battery, storage or compute). |
 40  | Replication object | Also known as the minimal unit of replication, the thing that we are replicating. |
 41  | Friend-to-friend network (F2F) | A private P2P network where there are only connections between mutually trusted peers. |
 42  | Content addressable storage (CAS) | Storing information such that it can be retrieved by its content, not its location. Commonly performed by the use of cryptographic hash functions. |
 43  | Public P2P | Open network where peers can connect to each other. |
 44  | Structured P2P network | A public p2p network where data placement is related to the network topology. E.g. a DHT. |
 45  | Unstructed P2P network | A public p2p network where peers connect in an ad hoc manner, and a peer only knows its neighbors, not what they have. |
 46  | Super-peer P2P network | A non-pure p2p network, hybrid client/server architecture, where certain peers do more work. | 
 47  | Private P2P | Private network where peers must mutually trust each other before connecting, can either be F2F or group-based. |
 48  | Group-based P2P network | A private p2p network where you need some form of access to join, but within group you can connect to new peers, e.g. friend of a friend or a private BT tracker. |
 49  
 50  **These have yet to be filled out**
 51  *Consistency model*: ...
 52  
 53  *Mostly-offline:* ...
 54  
 55  *Network churn:* ...
 56  
 57  *Light node:* ...
 58  
 59  *Cryptographic hash function*: ...
 60  
 61  *Multi-device:*...
 62  
 63  *Robustness*: ...
 64  
 65  *Privacy-preservation*: ...
 66  
 67  *Censorship-resistance*: ...
 68  
 69  *Coercion-resistance*: ...
 70  
 71  ## 3. Methodology
 72    We look at generally established dimensions in the literature [xx1], and evaluate protocols and applications based on these. Additionally we add some dimensions that aren't necessarily captured in the literature, such as mobile-friendliness and practical implementation. Specifically the focus is on p2p applications that perform some form of data synchronization, with a bias towards secure messaging applications.
 73  
 74  All notes are tentative and are based on the provided documentation and specification. Code has generally not been looked into, nor have any empirical simulations been performed. These results are tentative as they have yet to be reviewed.
 75  
 76  ### Compared dimensions
 77  
 78  **Request for comments: What's a  better way to decompose these properties? Ideally something like 2-5 dimensions that matter most. Right now it reads a bit like a laundry list, or like an ad hoc classification. Update April 1: Factored other considerations out into guarantees and practical application/protocol considerations. Also tweaked How section to also deal with transport requirements and peer discovery.**
 79  
 80  These dimensions are largely taken from the survey paper by Martins as well, with some small tweaks. To make it easier to survey, we divide up the dimensions into rough sections.
 81  
 82  #### 1. Why and what are we syncing?
 83  - *Problem domain*. Why are we syncing stuff in the first place?
 84  - *Minimal unit of replication*. The minimal entity that we can replicate. The actual unit of replication is the data structure that we are interested in, usually a collection of entities. Related: version history abstraction (linear vs DAG).
 85  - *Read-only or read and write*. Is the (actual unit of replication) data static or not?
 86  
 87  #### 2. Who is participating?
 88  - *Active vs passive replication*. Are participants replicating data that they are not interested in themselves?
 89  
 90  ####  3. When and where are we syncing?
 91  Replication control mechanisms.
 92  
 93  - *Single-master vs multi-master*. Both entity and collection.
 94  - *Synchronous (eager) vs asynchronous (lazy)*.
 95  - *If asynchronous, optimistic or not*.
 96  - *Replica placement*. Full or partial replication.
 97  
 98  #### 4. How are they syncing?
 99  - *P2P Topology*. Public or private P2P? Unstructured/structured/super-peer or friend-to-friend/group-based?
100  - *Peer Discovery*. How do peers discover and connect with each other?
101  - *Transport requirements*. Any specific requirements on the underlying transport layer?
102  
103  #### 5. What guarantees does it provide?
104  - *Consistency-model*. What are the guarantees provided?
105  - *Privacy-preservation*. Assuming underlying layers provide privacy, are these guarantees upheld?
106  
107  #### 6. Engineering and application, how does it work in practice?
108  - *Actively used*. Is an instance of the protocol used in the real-world?
109  - *Mobile friendliness*. Any specific considerations for mobile?
110  - *Well-defined spec*. Is there a well-defined and documented spec to refer to?
111  - *Protocol upgradability*. Are there any mechanisms for upgrading the protocol?
112  - *Framing considerations*. Does it support messages that don't need replication?
113  
114  ## Notes on single-master vs multi-master
115  For single-master, there's only a single node that writes to a specific piece of data. The other peers purely replicate it and don't change it.
116  
117  For many of the studied systems, *content addressable storage* is used. This means the replication object is usually immutable, and it is only written to once. As a side effect, many systems are naturally single-master from this point of view.
118  
119  This is in comparison with the Martins survey paper, where they are more interested in update-in-place programming through replicating SQL DBs and other rich data structures.
120  
121  However, if we look at what is semantically interesting for the user, this is usually not an individual message or chunk of a file. Instead it is usually a conversation or a file. Seen from that point of view, we often employ some form of linear or DAG-based version history. In this case, many participants might update the relevant sync scope. Thus, the system is better seen as a multi-master one. To capture this notion we have divided the minimal unit of replication into entity and collection.
122  
123  ## Compared technologies
124  
125  **RFC: Which way is the most useful to look at it? Worried about confusion between the two, especially since many applications have tightly coupled set of protocols.**
126  
127  This includes both applications and specification. For example p2p messengers such as Briar (Bramble) [xx2], Matrix [xx3] and Secure Scuttlebutt (SSB) [xx4]. As well as general p2p storage solutions such as Swarm [xx5].
128  
129  In order to compare apples to apples, we compare Briar and Matrix as in how they de facto use sync. An additional consideration would be to compare Bramble and Matrix Server-to-Server API/spec directly. This also means Swarm doesn't necessarily make sense to compare directly to these applications.
130  
131  This application focus is justified by the following observations:
132  
133  a) most protocols in this space are young, underspecified and somewhat coupled with the application using them, especially compared to protocols such as IP/TCP/UDP/TLS/Bittorrent, etc
134  
135  b) we are ultimately interested in considerations that only manifest themselves as the protocol is used on e.g. mobile devices
136  
137  ## 4. Comparison
138  
139  ### What and why are we syncing?
140  
141  #### Briar
142  
143  *Problem domain*. Bramble is the protocol Briar uses for synchronizing application layer data. Briar is a messaging app, and application data can be events such as sending messages, joining a room, etc. 
144   
145  *Minimal unit of replication*. Immutable messages. The actual unit of replication is a DAG, which is encoded inside the encrypted message payload. This produces a message graph.
146  
147  *Read-only or read and write*. Each message is immutable, but since we are dealing with instant messaging a group context, such as a conversation, is changing. Thus it is read and write.
148  
149  #### Matrix
150  
151  *Problem domain*. Matrix is a set of open HTTP APIs that allow for real-time synchronization and persistance of arbitrary JSON over a federation of servers. Often this is done in a room context, with participants talking to each other. More generally, for messaging between humans.
152  
153  *Minimal unit of replication*. Messages that also form a DAG.
154  
155  *Read-only or read and write*. Message domain so read and write. Individual messages are generally immutable.
156  
157  #### Secure Scuttlebutt
158  
159  *Problem domain*.
160  
161  *Minimal unit of replication*
162  
163  *Actual unit of replication*. 
164  
165  *Read-only or read and write*.
166  
167  #### Swarm
168  
169  *Problem domain*.
170  
171  *Minimal unit of replication*
172  
173  *Actual unit of replication*. 
174  
175  *Read-only or read and write*.
176  
177  #### Comparison
178  
179  | Dimension | Briar | Matrix | SSB | Swarm |
180  |------------|------|------| ----| ---|
181  | *Problem domain*. | Messaging  | Messaging | Social network | Storage |
182  | *Minimal unit of replication* | Messages | Messages | Messages | Chunks |
183  | *Actual unit of replication*.  | DAG | DAG  | Log | | 
184  | *Read-only or read and write*. | Read and write | Read and write | Read and write | Read and write | 
185  
186  ### Who is participating?
187  
188  #### Bramble
189  *Active vs passive replication*. In Bramble, only the participants in a group chat are syncing messages. It is thus a form of passive replication. There are some proposals to have additional nodes to help with offline inboxing.
190  
191  #### Matrix
192  *Active vs passive replication*. In Matrix, homeservers are used for replication. Homeservers themselves don't care about the messages, so it is a form of active replication.
193  
194  #### Secure Scuttlebutt
195  *Active vs passive replication*.
196  
197  #### Swarm
198  *Active vs passive replication*. In Swarm, nodes replicate data that they are "assigned" to by virtue of being close to it.
199  
200  #### Comparison
201  
202  | Dimension | Briar | Matrix | SSB | Swarm |
203  |------------|------|------| ----| ---|
204  | Active replication? | Passive | Active | | Passive |
205  
206  ###  When and where are we syncing?
207  
208  #### Briar
209  *Single-master vs multi-master*. Anyone who has access to the DAG can write to it. Each individual message is immutable and thus write-once. But it's multi-master since multiple people can update the principal data structure.
210  
211  *Synchronous (eager) vs asynchronous (lazy)*. Asynchronous, you write to your own node first.
212  
213  *If asynchronous, optimistic or not*. Optimistic, since each message update is a hash, conflicts are likely to be rare.
214  
215  *Replica placement*. Full or partial replication.
216  
217  #### Matrix
218  *Single-master vs multi-master*. Essentially same as Briar. Multi-master.
219  
220  *Synchronous (eager) vs asynchronous (lazy)*. Partial, because there's a HTTP API to Matrix server that needs to acknowledge before a message is transmitted.
221  
222  *If asynchronous, optimistic or not*. Optimistic. Though there are sone considerations for the conflict resolutions when it comes to auth events, which are synced separately from normal message events. This is part of room state, which is a shared dictionary and they use room state resolution to agree on current state. This is in order to deal with soft failure / ban evasion prevention.
223  
224  More on ban evasion: In order to prevent user from evading bans by attaching to an older part of DAG. These events may be valid, but a federation homeserver checks if such an event passes the current state auth checks. If it does, homeserver doesn’t propagate it. A similar construct does not appear in Briar, since there’s no such federation contruct and there’s no global notion of the current state.
225  
226  *Replica placement*. Partial. DAG and homeserver can choose to get previous state when the join the network.
227  
228  #### Secure Scuttlebutt
229  *Single-master vs multi-master*. The principal data structure in SSB is a log, and only you can write to your own log. Single-master.
230  
231  *Synchronous (eager) vs asynchronous (lazy)*. Asynchronous, you add to your own log first.
232  
233  *If asynchronous, optimistic or not*. Optimistic.
234  
235  *Replica placement*. Full, because you start syncing your log from scratch.
236  
237  #### Swarm
238  *Single-master vs multi-master*.
239  
240  *Synchronous (eager) vs asynchronous (lazy)*.
241  
242  *If asynchronous, optimistic or not*.
243  
244  *Replica placement*. Full or partial replication.
245  
246  #### Comparison
247  
248  | Dimension | Briar | Matrix | SSB | Swarm |
249  |------------|------|------| ----| ---|
250  | *Multi-master?*  | Mulit-master | Multi-master | Single-master | |
251  | *Asynchronous?*  | Asynchronous | Partial | Asynchronous| |
252  |  *Optimistic?*  | Optimistic | Optimistic | Optimistic | | 
253  | *Replica placement*  | Partial | Partial | Full | 
254  
255  ### How are they syncing?
256  
257  #### Bramble
258  *P2P Topology*. Multiple options, but basically friend to friend. Can work either directly over Bluetooth or Tor (Internet). For BT it works by directly connecting to known contact, i.e. friend to friend. For Tor it uses onion addresses , it also leverages a DHT (structured).
259  
260  *Peer Discovery*. Usually add contact locally. beacon signal locally, then keep track of and update transport properties for each contact. Leverage Tor onion addresses.
261  
262  *Transport requirements*. Requires a transport layer security protocol which provides a secure channel between two peers. For Briar, that means Bramble Transport Protocol.
263  
264  #### Matrix
265  *P2P Topology*. Super-peer based on homeservers. Pure P2P discused discussed but not live yet.
266  
267  *Peer Discovery*.
268  
269  *Transport requirements*. Written as a set of HTTP JSON API:s, so runs on HTTP. Though there are some alternative transports as well.
270  
271  #### Secure Scuttlebutt
272  *P2P Topology*. Private p2p, group-based.
273  
274  *Peer Discovery*. Gossip-based, friends and then different levels of awareness 1-hop, 2-hop and 3-hop.
275  
276  *Transport requirements*.
277  
278  #### Swarm
279  *P2P Topology*. Public structured DHT.
280  
281  *Peer Discovery*. Kademlia, and caching peers in various proximity bins.
282  
283  *Transport requirements*.
284  
285  #### Comparison
286  
287  | Dimension | Briar | Matrix | SSB | Swarm |
288  |------------|------|------| ----| ---|
289  |*P2P Topology* | F2F | Super-peer | Group-based | Structured |
290  | *Peer Discovery* | | | | |
291  | *Transport requirements*  | | | | | 
292  
293  ### Guarantees provided
294  
295  #### Bramble
296  *Consistency model*. Casual consistency, since each message encodes its (optional) message dependencies. This DAG is hidden for non-participants.
297  
298  *Privacy-preservation*. It is a friend to friend network, and preserves whatever privacy underlying layer provides, so it has Tor's properties, and for short-distance the threat model is justifably weaker due to the increased cost of surveillence. However, a peer knows when another message is received by the receiving ACK (though it is possible to delay sending this, if on wishes).
299  
300  #### Matrix
301  *Consistency model*. DAG so casual consistency, generally speaking.
302  
303  *Privacy-preservation*.  Relies on homeservers, so sender and receiver anonymity not provided. Generally, if adversary controls homeserver they have a lot of information.
304  
305  #### Secure Scuttlebutt
306  *Consistency model*. Sequential consistency due to append-only log for each participant.
307  
308  *Privacy-preservation*.
309  
310  #### Swarm
311  *Consistency model*.
312  
313  *Privacy-preservation*.
314  
315  #### Comparison
316  *Consistency model*.
317  
318  *Privacy-preservation*.
319  
320  | Dimension | Briar | Matrix | SSB | Swarm |
321  |------------|------|------| ----| ---|
322  |*Consistency model* | Casual | Casual | Sequential | |
323  | *Privacy-preservation* | Yes | | | |
324  | *Transport requirements*  | | | | | 
325  
326  ### Engineering and application
327  
328  #### Bramble
329  *Actively used*. Released in the Play Store and on F-Droid. Seems like it, but not huge. ~300 reviews on Google Play Store, if that means anything.
330  
331  *Mobile friendliness*. Partial. Yes in that it is built for Android and works well, even without Internet and with high network churn. Note that battery consumption is still an issue, and that it runs in the background. The latter makes an iOS app difficulty.
332  
333  *Well-defined spec*. Yes/Partial, see Bramble specifications. Not machine readable though.
334  
335  *Protocol upgradability*. Haven't seen any specific work in this area, though they do have multiple versioned specs, which generally seem based on accretion.
336  
337  *Framing considerations*. As far as I can tell this is not a consideration.
338  
339  #### Matrix
340  *Actively used*. Matrix has users in the millions and there are multiple clients. So definitely.
341  
342  *Mobile friendliness*. Due to its super-peer architecture, mobile is straightforward and it can leverage basic HTTP PUT/long-lived GETs like normal messenger apps. DNS for homeservers can also be leveraged.
343  
344  *Well-defined spec*. Partial/Yes. Well-described enough to have multiple clients, but not machine-readable as far as I can tell.
345  
346  *Protocol upgradability*. Spec is versioned and rooms can have different versions that can be upgraded. Exact process around that is currently a bit unclear, though it does appear to have been a major consideration.
347  
348  *Framing considerations*. Matrix supports multiple forms of message types. Aside from synced Persisted Data Units (PDU) they also have Ephemeral (EDU) and queries. PDUs record history of message and state of room. EDUs don’t need to be replied to, necessarily. Queries are simple req/resp to get snapshot of state.
349  
350  #### Secure Scuttlebutt
351  *Actively used*. In production and many clients. Unclear exactly how active.
352  
353  *Mobile friendliness*.
354  
355  *Well-defined spec*.
356  
357  *Protocol upgradability*.
358  
359  *Framing considerations*.
360  
361  #### Swarm
362  *Actively used*. Currently in POC testing phase with limited real-world usage.
363  
364  *Mobile friendliness*.
365  
366  *Well-defined spec*.
367  
368  *Protocol upgradability*.
369  
370  *Framing considerations*.
371  
372  #### Comparison
373  
374  
375  | Dimension | Briar | Matrix | SSB | Swarm |
376  |------------|------|------| ----| ---|
377  |*Actively used* | Yes | Very | Yes | POC |
378  | *Mobile friendliness* | Partial | Yes | | |
379  | *Well-defined spec*  | Partial | Partial | | | 
380  | *Protocol upgradability* | | | | 
381  | *Framing considerations* | No | Yes | | 
382  
383  ### Table Summary
384  
385  **RFC: Does a full table summary makes sense? Should it list all dimensions? What would be the most useful?**
386  
387  | Dimension | Bramble | Matrix | SSB | Swarm |
388  | ---------- | -------- | -------- | --- |--- | 
389  | Replication object | Message (DAG) | Message* (DAG/*) | Messages (?) (Log) | Chunks (File/Manifest) |
390  | Single-master? | Single-master | Depends* | Single-master |  |
391  | Asynchronous? |Yes  | Partial | Yes | Yes | Yes |
392  | Asynchronous optimistic? | Yes | Yes | Yes | Yes |
393  | Full vs partial replication? | Partial | Partial | Partial? | Partial |
394  | Consistency model | Casual consistency | Casual | Eventual / Casual |
395  | Active replication? | No | Yes | Yes | Yes |
396  | P2P Topology | F2F Network | Super-peer | Group-based | Structured |
397  
398  #### Brief discussion
399  
400  What these results mean. Things that come together or not, independent considerations.
401  
402  ## 5. Summary
403  
404  TODO.
405  
406  Should answer what this means for you, given your constraint. For us specifically: mobile-friendly considerations, etc. Possibly a section on future work.
407  
408  ## 6. Acknowledgements
409  
410  TODO.
411  
412  Should reflect people who have helped shape this document.
413  
414  ## 7. References
415  
416  xx1
417  [Martins, Pacitti, and Valduriez, “Survey of Data Replication in P2P Systems.”](https://hal.inria.fr/inria-00122282/PDF/Survey_of_data_replication_in_P2P_systems.PDF)
418  
419  xx2
420  [Bramble Synchronisation Protocol, version 1](https://code.briarproject.org/briar/briar-spec/blob/master/protocols/BSP.md)
421  
422  xx3
423  [Matrix Specification](https://matrix.org/docs/spec/)
424  
425  xx4
426  [Scuttlebutt Protocol Guide](https://ssbc.github.io/scuttlebutt-protocol-guide/)
427  
428  xx5
429  [Swarm documentation, release 0.3](https://buildmedia.readthedocs.org/media/pdf/swarm-guide/latest/swarm-guide.pdf)
430  
431  
432  
433  ## TODO
434  
435  - Add graphs of linear vs DAG version history
436  - Clarify user, identity, multi-device definitions
437  - Mention alternative to data sync, i.e. for messaging RTC and framing, but motivate why useful especially for p2p
438  - Better document structure, e.g. see SoK Secure Messaging for example of survey paper
439  - Apples to apples: break apart e.g. Matrix and Swarm since they aren't directly comparable
440  - Capture in how are they syncing: API and push/pull.
441  - Consider collapasing actual replication unit.
442  - Matrix should capture auth state semantics.
443  - Mentions: How deal with other things that are kind of replicated but differently? E.g. metadata, etc. Auth state, ephemeral, manifests, feeds, etc.
444  - Remove redundancy. In the case dimensions are generally the same, consider treating it under same header.
445  - Fill in rest of definitions.
446   
447  ### Further work and research questions
448  
449  **1. Multiple-device management and data sync.** (minor)
450  How do various solutions deal with multiple-devices and identities? I.e. Matrix, Briar etc. What does this look like for SSB if you have a single log but write from mobile and desktop? Related: ghost user problem and mitigations.
451  
452  **2. CRDTs and version history.** (minor)
453  Elaborate on what problem CRDTs solve, their consistency model and problem domain, as well as how this related to logs and DAGs for version history.
454  
455  **3. Breaking apart data sync into better components.**
456  The dimensions right now reads like a laundry list, and I'm not confident these are orthogonal considerations. Especially the "other considerations" section. Ideally there'd be just a few different dimensions that matter the most, with possible sub problems. A la principal component analysis, likely only a few really fundamental differences. Related to breaking things apart into similarties and differences.
457  
458  **4. Extend comparison with more applications and protocols.**
459  E.g. Bittorrent, Git, Tribler, IPFS, Whisper, Status. Revisit Tox/TokTok.
460  
461  **5. Clarify requirements of sync protocol in question.**
462  E.g. if it has transport requirements or certain protocol dependencies. This also relates to how tightly coupled a sync protocol is to other protocols. It is also relevant for engineering reality.
463  
464  **6. Ensure knowns in data sync research log are captured**
465  I.e. for the comparative work that has been done in https://discuss.status.im/t/data-sync-research-log/1100, this should be summarized under the various dimensions.
466  
467  **7. Ensure previously unknowns brought up considerations are captured.**
468  I.e. all of these https://discuss.status.im/t/data-sync-next-steps-and-considerations/1056 should be captured in some way or another. As well as all the loose ends/questions posted in https://discuss.status.im/t/data-sync-research-log/1100. This might fork into new future work items.
469  
470  **8. Peer discovery.**
471  Capture how who wants to sync finds each other. Consider how encryption plays a role here in terms of visibility.
472  
473  **9. User-centric, capture client implementers and requirements.**
474  The user of a data sync layer is the sync client writer, not end user. What do they want to think about and not think about? Capture requirements better.
475  
476  **10. Separate out Swarm vs IPFS storage and distribution comparison.**
477  Looking at similarities and differences between these probably more informative, even though there's overlap. Also keeps the comparison scope more tight.