---
eip: 706
title: DEVp2p snappy compression
author: Péter Szilágyi <peter@ethereum.org>
type: Standards Track
category: Networking
status: Final
created: 2017-09-07
---

## Abstract
The base networking protocol (DEVp2p) used by Ethereum currently does not employ any form of compression. This wastes a large amount of bandwidth across the entire network, making both initial sync and normal operation slower and laggier.

This EIP proposes a tiny extension to the DEVp2p protocol to enable [Snappy compression](https://en.wikipedia.org/wiki/Snappy_(compression)) on all message payloads after the initial handshake. Benchmarks show that data traffic decreases by 60-80% for initial sync. Exact numbers are given below.

## Motivation
Synchronizing the Ethereum main network (block 4,248,000) in Geth using fast sync currently consumes 1.01GB upload and 33.59GB download bandwidth. On the Rinkeby test network (block 852,000) it's 55.89MB upload and 2.51GB download.

However, most of this data (blocks, transactions) is heavily compressible. By enabling compression at the message payload level, we can reduce the previous numbers to 1.01GB upload / 13.46GB download on the main network, and 46.21MB upload / 463.65MB download on the test network.
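
As a quick sanity check, the download figures above work out to roughly a 60% saving on the main network and a bit over 80% on Rinkeby, in line with the 60-80% range quoted in the abstract. The short Python sketch below redoes that arithmetic. The helper name `reduction` is made up for illustration, and 1GB is assumed to mean 1024MB, since the text does not say which convention the measurements used.

```python
def reduction(before, after):
    """Fraction of traffic saved when `before` shrinks to `after` (same unit)."""
    return 1.0 - after / before


MB_PER_GB = 1024  # assumption: the quoted figures use binary units

# Download figures quoted above, converted to MB.
mainnet = reduction(33.59 * MB_PER_GB, 13.46 * MB_PER_GB)
rinkeby = reduction(2.51 * MB_PER_GB, 463.65)

print(f"mainnet download saved: {mainnet:.1%}")
print(f"rinkeby download saved: {rinkeby:.1%}")
```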

The motivation behind doing this at the DEVp2p level (as opposed to `eth`, for example) is that it would enable compression for all sub-protocols (eth, les, bzz) seamlessly, reducing any complexity those protocols might incur in trying to individually optimize for data traffic.

## Specification
Bump the advertised DEVp2p version number from `4` to `5`. If during handshake the remote side advertises support only for version `4`, run the exact same protocol as before.

If the remote side advertises a DEVp2p version `>= 5`, inject a Snappy compression step right before encrypting the DEVp2p message during sending:

 * A message consists of `{Code, Size, Payload}`
   * Compress the original payload with Snappy and store it in the same field.
   * Update the message size to the length of the compressed payload.
   * Encrypt and send the message as before, oblivious to compression.
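
The sending steps above could be sketched roughly as follows. This is an illustration, not the reference implementation: `Message`, `prepare_for_send` and `COMPRESSION_VERSION` are hypothetical names, the compressor is passed in as a function, and `zlib` is used purely as a stand-in so the sketch runs with the standard library alone (a real client would pass a Snappy block encoder instead). Encryption and the never-compressed handshake message are left to the caller.

```python
from dataclasses import dataclass
from typing import Callable
import zlib

COMPRESSION_VERSION = 5  # first DEVp2p version that compresses payloads


@dataclass
class Message:
    code: int
    size: int
    payload: bytes


def prepare_for_send(msg: Message, remote_version: int,
                     compress: Callable[[bytes], bytes]) -> Message:
    """Apply the compression step, if negotiated, before encryption."""
    if remote_version < COMPRESSION_VERSION:
        return msg  # v4 peer: run the protocol exactly as before
    compressed = compress(msg.payload)
    # Same field, new contents; the size now describes the compressed payload.
    return Message(code=msg.code, size=len(compressed), payload=compressed)


# Stand-in codec so the sketch runs without third-party packages.
# zlib is NOT Snappy; it only marks where the Snappy encoder would plug in.
demo_compress = zlib.compress

msg = Message(code=0x10, size=1000, payload=b"\x00" * 1000)
v4 = prepare_for_send(msg, 4, demo_compress)
v5 = prepare_for_send(msg, 5, demo_compress)
print(v4.size, v5.size == len(v5.payload), v5.size < msg.size)
```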

Similarly to message sending, when receiving a DEVp2p v5 message from a remote node, insert a Snappy decompression step right after decrypting the DEVp2p message:

 * A message consists of `{Code, Size, Payload}`
   * Decrypt the message payload as before, oblivious to compression.
   * Decompress the payload with Snappy and store it in the same field.
   * Update the message size to the length of the decompressed payload.
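
The receiving side mirrors this. The sketch below is again only an illustration under assumptions: `Message` and `after_decrypt` are hypothetical names, decryption is assumed to have happened already, and `zlib` stands in for Snappy so that the example needs nothing beyond the standard library.

```python
from dataclasses import dataclass
from typing import Callable
import zlib


@dataclass
class Message:
    code: int
    size: int
    payload: bytes


def after_decrypt(msg: Message, remote_version: int,
                  decompress: Callable[[bytes], bytes]) -> Message:
    """Undo the sender's compression step on an already decrypted message."""
    if remote_version < 5:
        return msg  # v4 peer: the payload was never compressed
    # A real client should bound the decompressed size before this call
    # (see "Avoiding DOS attacks" in this document).
    plain = decompress(msg.payload)
    return Message(code=msg.code, size=len(plain), payload=plain)


# Stand-in codec: zlib is NOT Snappy, it only keeps the sketch dependency-free.
original = b"block data " * 50
wire = zlib.compress(original)
received = Message(code=0x10, size=len(wire), payload=wire)

restored = after_decrypt(received, 5, zlib.decompress)
print(restored.size == len(original), restored.payload == original)
```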

Important caveats:

 * The handshake message is **never** compressed, since it is needed to negotiate the common version.
 * Snappy framing is **not** used, since the DEVp2p protocol is already message oriented.

*Note: Snappy supports uncompressed binary literals (up to 4GB) too, leaving room for fine-tuned future optimisations for already compressed or encrypted data that would gain nothing from compression (Snappy usually detects this case automatically).*
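
To make the note about literals concrete, here is a minimal sketch of a "literal-only" encoder: it wraps a payload in a valid Snappy block without compressing anything, which is what such a future optimisation might emit for data that is already compressed. The function names are made up for illustration, and the element layout follows the Snappy format description listed under References; this is not something the EIP requires, since ordinary Snappy encoders already choose literals on their own.

```python
def uvarint(n: int) -> bytes:
    """Little-endian base-128 varint, as used for the Snappy length preamble."""
    out = bytearray()
    while n >= 0x80:
        out.append((n & 0x7F) | 0x80)
        n >>= 7
    out.append(n)
    return bytes(out)


def encode_literal_only(data: bytes) -> bytes:
    """Wrap `data` in a Snappy block made of a single uncompressed literal."""
    if len(data) >= 1 << 32:
        raise ValueError("Snappy blocks are limited to 2^32 - 1 bytes")
    out = bytearray(uvarint(len(data)))  # preamble: uncompressed length
    if not data:
        return bytes(out)  # empty input: the preamble alone
    n = len(data) - 1  # literal lengths are stored minus one
    if n < 60:
        out.append(n << 2)  # length fits in the tag's upper six bits
    else:
        extra = (n.bit_length() + 7) // 8  # 1..4 length bytes follow the tag
        out.append((59 + extra) << 2)  # tag values 60..63, low bits 00 = literal
        out += n.to_bytes(extra, "little")
    out += data
    return bytes(out)


print(encode_literal_only(b"hello").hex())  # 051068656c6c6f
```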

### Avoiding DOS attacks

Currently a DEVp2p message length is limited to 24 bits, amounting to a maximum size of 16MB. With the introduction of Snappy compression, care must be taken not to blindly decompress messages, since they may get significantly larger than 16MB.

However, Snappy is capable of calculating the decompressed size of an input message without inflating it in memory (*[the stream starts with the uncompressed length up to a maximum of `2^32 - 1` stored as a little-endian varint](https://github.com/google/snappy/blob/master/format_description.txt#L20)*). This can be used to discard any messages which decompress above some threshold. **The proposal is to use the same limit (16MB) as the threshold for decompressed messages.** This retains the same guarantees that the current DEVp2p protocol does, so there won't be surprises in application level protocols.
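
The size check described above can be sketched with the standard library alone, because the length preamble is a plain varint. The names `snappy_decoded_len`, `within_limit` and `MAX_MESSAGE_SIZE` are illustrative, and placing the limit at the largest value a 24-bit field can hold is an assumption about exactly where the 16MB boundary sits. Snappy libraries usually expose an equivalent helper, which a real client would more likely call.

```python
MAX_MESSAGE_SIZE = (1 << 24) - 1  # assumption: largest value a 24-bit length holds


def snappy_decoded_len(block: bytes) -> int:
    """Read the uncompressed length from a Snappy block's varint preamble."""
    length = 0
    for i, byte in enumerate(block[:5]):  # a 32-bit varint spans at most 5 bytes
        length |= (byte & 0x7F) << (7 * i)
        if byte < 0x80:  # high bit clear: last byte of the varint
            if length > 0xFFFFFFFF:
                raise ValueError("declared length exceeds 2^32 - 1")
            return length
    raise ValueError("truncated or overlong length preamble")


def within_limit(block: bytes, limit: int = MAX_MESSAGE_SIZE) -> bool:
    """True if the block claims to decompress to at most `limit` bytes."""
    return snappy_decoded_len(block) <= limit


small = b"\x05\x10hello"    # preamble declares 5 bytes
huge = b"\x80\x80\x80\x08"  # preamble declares 2^24 bytes, one over the limit
print(within_limit(small), within_limit(huge))  # True False
```

Only the preamble is inspected here, so nothing is inflated in memory before the decision to discard a message is made.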

## Alternatives (discarded)

**Alternative solutions to data compression that have been brought up and discarded are:**

Extend protocol `xyz` to support compressed messages versus doing it at DEVp2p level:

 * **Pro**: Can better optimize when to compress and when not to.
 * **Con**: Mixes transport layer encoding into application layer logic.
 * **Con**: Makes the individual message specs more convoluted with compression details.
 * **Con**: Requires cross client coordination on every single protocol, making the effort much harder and repeated (eth, les, shh, bzz).

Introduce seamless variations of protocol such as `xyz` expanded with `xyz-compressed`:

 * **Pro**: Can be done (hacked in) without cross client coordination.
 * **Con**: Litters the network with client specific protocol announcements.
 * **Con**: Needs to be specced in an EIP for cross-client interoperability anyway.

**Other ideas that have been discussed and discarded:**

Don't explicitly limit the decompressed message size, only the compressed one:

 * **Pro**: Allows larger messages to traverse through DEVp2p.
 * **Con**: Upper layer protocols need to check and discard large messages.
 * **Con**: Needs lazy decompression to allow size limitations without DOS.

## Backwards Compatibility
This proposal is fully backward compatible. Clients upgrading to the proposed DEVp2p protocol version `5` should still support skipping the compression step for connections that only advertise version `4` of the DEVp2p protocol.

## Implementation
You can find a reference implementation of this EIP in https://github.com/ethereum/go-ethereum/pull/15106.

## Test vectors

There is more than one valid encoding of any given input, and there is more than one good internal compression algorithm within Snappy when trading off throughput for output size. As such, different implementations might produce slight variations in the compressed form, but all should be compatible with each other.

As an example, take the hex encoded RLP of block #272621 from the Rinkeby test network: [block.rlp (~3MB)](https://gist.githubusercontent.com/karalabe/72a1a6c4c1dbe6d4996879e415697f06/raw/195bf0c0050ee9805fcd5db4b5b650c58879a55f/block.rlp).

 * Encoding the raw RLP via [Go's Snappy library](https://github.com/golang/snappy) yields: [block.go.snappy (~70KB)](https://gist.githubusercontent.com/karalabe/72a1a6c4c1dbe6d4996879e415697f06/raw/195bf0c0050ee9805fcd5db4b5b650c58879a55f/block.go.snappy).
 * Encoding the raw RLP via [Python's Snappy library](https://github.com/andrix/python-snappy) yields: [block.py.snappy (~70KB)](https://gist.githubusercontent.com/karalabe/72a1a6c4c1dbe6d4996879e415697f06/raw/195bf0c0050ee9805fcd5db4b5b650c58879a55f/block.py.snappy).

You can verify that an encoded binary can be decoded into the proper plaintext using the following snippets. Each snippet hex-decodes both input files before comparing them:

### Go

```sh
$ go get github.com/golang/snappy
```

```go
package main

import (
	"bytes"
	"encoding/hex"
	"fmt"
	"log"
	"os"

	"github.com/golang/snappy"
)

func main() {
	if len(os.Args) != 3 {
		log.Fatalf("Usage: %s <plain.hex> <compressed.hex>", os.Args[0])
	}
	// Read and hex-decode the decompressed file (surrounding whitespace is ignored)
	plainhex, err := os.ReadFile(os.Args[1])
	if err != nil {
		log.Fatalf("Failed to read decompressed file %s: %v", os.Args[1], err)
	}
	plain, err := hex.DecodeString(string(bytes.TrimSpace(plainhex)))
	if err != nil {
		log.Fatalf("Failed to decode decompressed file: %v", err)
	}
	// Read and hex-decode the compressed file
	comphex, err := os.ReadFile(os.Args[2])
	if err != nil {
		log.Fatalf("Failed to read compressed file %s: %v", os.Args[2], err)
	}
	comp, err := hex.DecodeString(string(bytes.TrimSpace(comphex)))
	if err != nil {
		log.Fatalf("Failed to decode compressed file: %v", err)
	}
	// Decompress and make sure the result matches the plain text
	decomp, err := snappy.Decode(nil, comp)
	if err != nil {
		log.Fatalf("Failed to decompress compressed file: %v", err)
	}
	if !bytes.Equal(plain, decomp) {
		fmt.Println("Booo, decompressed file does not match provided plain text!")
		return
	}
	fmt.Println("Yay, decompressed data matched provided plain text!")
}
```

```sh
$ go run main.go block.rlp block.go.snappy
Yay, decompressed data matched provided plain text!

$ go run main.go block.rlp block.py.snappy
Yay, decompressed data matched provided plain text!
```

### Python

```bash
$ pip install python-snappy
```

```py
import sys

import snappy

# Read and hex-decode the decompressed file
with open(sys.argv[1], 'r') as f:
    plain = bytes.fromhex(f.read().strip())

# Read and hex-decode the compressed file
with open(sys.argv[2], 'r') as f:
    comp = bytes.fromhex(f.read().strip())

# Decompress and make sure the result matches the plain text
decomp = snappy.uncompress(comp)
if plain != decomp:
    print("Booo, decompressed file does not match provided plain text!")
else:
    print("Yay, decompressed data matched provided plain text!")
```

```sh
$ python main.py block.rlp block.go.snappy
Yay, decompressed data matched provided plain text!

$ python main.py block.rlp block.py.snappy
Yay, decompressed data matched provided plain text!
```

## References

 * Snappy website: https://google.github.io/snappy/
 * Snappy specification: https://github.com/google/snappy/blob/master/format_description.txt

## Copyright
Copyright and related rights waived via [CC0](https://creativecommons.org/publicdomain/zero/1.0/).