txrequest.h
1 // Copyright (c) 2020-present The Bitcoin Core developers 2 // Distributed under the MIT software license, see the accompanying 3 // file COPYING or http://www.opensource.org/licenses/mit-license.php. 4 5 #ifndef BITCOIN_TXREQUEST_H 6 #define BITCOIN_TXREQUEST_H 7 8 #include <primitives/transaction.h> 9 #include <net.h> 10 #include <uint256.h> 11 12 #include <chrono> 13 #include <cstdint> 14 #include <memory> 15 #include <vector> 16 17 /** Data structure to keep track of, and schedule, transaction downloads from peers. 18 * 19 * === Specification === 20 * 21 * We keep track of which peers have announced which transactions, and use that to determine which requests 22 * should go to which peer, when, and in what order. 23 * 24 * The following information is tracked per peer/tx combination ("announcement"): 25 * - Which peer announced it (through their NodeId) 26 * - The txid or wtxid of the transaction (collectively called "txhash" in what follows) 27 * - Whether it was a tx or wtx announcement (see BIP339). 28 * - What the earliest permitted time is that the transaction can be requested from that peer (called "reqtime"). 29 * - Whether it's from a "preferred" peer or not. Which announcements get this flag is determined by the caller, but 30 * this is designed for outbound peers, or other peers that we have a higher level of trust in. Even when the 31 * peers' preferredness changes, the preferred flag of existing announcements from that peer won't change. 32 * - Whether or not the transaction was requested already, and if so, when it times out (called "expiry"). 33 * - Whether or not the transaction request failed already (timed out, or invalid transaction or NOTFOUND was 34 * received). 35 * 36 * Transaction requests are then assigned to peers, following these rules: 37 * 38 * - No transaction is requested as long as another request for the same txhash is outstanding (it needs to fail 39 * first by passing expiry, or a NOTFOUND or invalid transaction has to be received for it). 40 * 41 * Rationale: to avoid wasting bandwidth on multiple copies of the same transaction. Note that this only works 42 * per txhash, so if the same transaction is announced both through txid and wtxid, we have no means 43 * to prevent fetching both (the caller can however mitigate this by delaying one, see further). 44 * 45 * - The same transaction is never requested twice from the same peer, unless the announcement was forgotten in 46 * between, and re-announced. Announcements are forgotten only: 47 * - If a peer goes offline, all its announcements are forgotten. 48 * - If a transaction has been successfully received, or is otherwise no longer needed, the caller can call 49 * ForgetTxHash, which removes all announcements across all peers with the specified txhash. 50 * - If for a given txhash only already-failed announcements remain, they are all forgotten. 51 * 52 * Rationale: giving a peer multiple chances to announce a transaction would allow them to bias requests in their 53 * favor, worsening transaction censoring attacks. The flip side is that as long as an attacker manages 54 * to prevent us from receiving a transaction, failed announcements (including those from honest peers) 55 * will linger longer, increasing memory usage somewhat. The impact of this is limited by imposing a 56 * cap on the number of tracked announcements per peer. As failed requests in response to announcements 57 * from honest peers should be rare, this almost solely hinders attackers. 58 * Transaction censoring attacks can be done by announcing transactions quickly while not answering 59 * requests for them. See https://allquantor.at/blockchainbib/pdf/miller2015topology.pdf for more 60 * information. 61 * 62 * - Transactions are not requested from a peer until its reqtime has passed. 63 * 64 * Rationale: enable the calling code to define a delay for less-than-ideal peers, so that (presumed) better 65 * peers have a chance to give their announcement first. 66 * 67 * - If multiple viable candidate peers exist according to the above rules, pick a peer as follows: 68 * 69 * - If any preferred peers are available, non-preferred peers are not considered for what follows. 70 * 71 * Rationale: preferred peers are more trusted by us, so are less likely to be under attacker control. 72 * 73 * - Pick a uniformly random peer among the candidates. 74 * 75 * Rationale: random assignments are hard to influence for attackers. 76 * 77 * Together these rules strike a balance between being fast in non-adverserial conditions and minimizing 78 * susceptibility to censorship attacks. An attacker that races the network: 79 * - Will be unsuccessful if all preferred connections are honest (and there is at least one preferred connection). 80 * - If there are P preferred connections of which Ph>=1 are honest, the attacker can delay us from learning 81 * about a transaction by k expiration periods, where k ~ 1 + NHG(N=P-1,K=P-Ph-1,r=1), which has mean 82 * P/(Ph+1) (where NHG stands for Negative Hypergeometric distribution). The "1 +" is due to the fact that the 83 * attacker can be the first to announce through a preferred connection in this scenario, which very likely means 84 * they get the first request. 85 * - If all P preferred connections are to the attacker, and there are NP non-preferred connections of which NPh>=1 86 * are honest, where we assume that the attacker can disconnect and reconnect those connections, the distribution 87 * becomes k ~ P + NB(p=1-NPh/NP,r=1) (where NB stands for Negative Binomial distribution), which has mean 88 * P-1+NP/NPh. 89 * 90 * Complexity: 91 * - Memory usage is proportional to the total number of tracked announcements (Size()) plus the number of 92 * peers with a nonzero number of tracked announcements. 93 * - CPU usage is generally logarithmic in the total number of tracked announcements, plus the number of 94 * announcements affected by an operation (amortized O(1) per announcement). 95 * 96 * Context: 97 * - In an earlier version of the transaction request logic it was possible for a peer to prevent us from seeing a 98 * specific transaction. See https://bitcoincore.org/en/2024/07/03/disclose_already_asked_for. 99 */ 100 class TxRequestTracker { 101 // Avoid littering this header file with implementation details. 102 class Impl; 103 const std::unique_ptr<Impl> m_impl; 104 105 public: 106 //! Construct a TxRequestTracker. 107 explicit TxRequestTracker(bool deterministic = false); 108 ~TxRequestTracker(); 109 110 // Conceptually, the data structure consists of a collection of "announcements", one for each peer/txhash 111 // combination: 112 // 113 // - CANDIDATE announcements represent transactions that were announced by a peer, and that become available for 114 // download after their reqtime has passed. 115 // 116 // - REQUESTED announcements represent transactions that have been requested, and which we're awaiting a 117 // response for from that peer. Their expiry value determines when the request times out. 118 // 119 // - COMPLETED announcements represent transactions that have been requested from a peer, and a NOTFOUND or a 120 // transaction was received in response (valid or not), or they timed out. They're only kept around to 121 // prevent requesting them again. If only COMPLETED announcements for a given txhash remain (so no CANDIDATE 122 // or REQUESTED ones), all of them are deleted (this is an invariant, and maintained by all operations below). 123 // 124 // The operations below manipulate the data structure. 125 126 /** Adds a new CANDIDATE announcement. 127 * 128 * Does nothing if one already exists for that (txhash, peer) combination (whether it's CANDIDATE, REQUESTED, or 129 * COMPLETED). Note that the txid/wtxid property is ignored for determining uniqueness, so if an announcement 130 * is added for a wtxid H, while one for txid H from the same peer already exists, it will be ignored. This is 131 * harmless as the txhashes being equal implies it is a non-segwit transaction, so it doesn't matter how it is 132 * fetched. The new announcement is given the specified preferred and reqtime values, and takes its is_wtxid 133 * from the specified gtxid. 134 */ 135 void ReceivedInv(NodeId peer, const GenTxid& gtxid, bool preferred, 136 std::chrono::microseconds reqtime); 137 138 /** Deletes all announcements for a given peer. 139 * 140 * It should be called when a peer goes offline. 141 */ 142 void DisconnectedPeer(NodeId peer); 143 144 /** Deletes all announcements for a given txhash (both txid and wtxid ones). 145 * 146 * This should be called when a transaction is no longer needed. The caller should ensure that new announcements 147 * for the same txhash will not trigger new ReceivedInv calls, at least in the short term after this call. 148 */ 149 void ForgetTxHash(const uint256& txhash); 150 151 /** Find the txids to request now from peer. 152 * 153 * It does the following: 154 * - Convert all REQUESTED announcements (for all txhashes/peers) with (expiry <= now) to COMPLETED ones. 155 * These are returned in expired, if non-nullptr. 156 * - Requestable announcements are selected: CANDIDATE announcements from the specified peer with 157 * (reqtime <= now) for which no existing REQUESTED announcement with the same txhash from a different peer 158 * exists, and for which the specified peer is the best choice among all (reqtime <= now) CANDIDATE 159 * announcements with the same txhash (subject to preferredness rules, and tiebreaking using a deterministic 160 * salted hash of peer and txhash). 161 * - The selected announcements are returned in announcement order (even if multiple were added at the same 162 * time, or when the clock went backwards while they were being added). This is done to minimize disruption 163 * from dependent transactions being requested out of order: if multiple dependent transactions are announced 164 * simultaneously by one peer, and end up being requested from them, the requests will happen in announcement order. 165 */ 166 std::vector<GenTxid> GetRequestable(NodeId peer, std::chrono::microseconds now, 167 std::vector<std::pair<NodeId, GenTxid>>* expired = nullptr); 168 169 /** Marks a transaction as requested, with a specified expiry. 170 * 171 * If no CANDIDATE announcement for the provided peer and txhash exists, this call has no effect. Otherwise: 172 * - That announcement is converted to REQUESTED. 173 * - If any other REQUESTED announcement for the same txhash already existed, it means an unexpected request 174 * was made (GetRequestable will never advise doing so). In this case it is converted to COMPLETED, as we're 175 * no longer waiting for a response to it. 176 */ 177 void RequestedTx(NodeId peer, const uint256& txhash, std::chrono::microseconds expiry); 178 179 /** Converts a CANDIDATE or REQUESTED announcement to a COMPLETED one. If no such announcement exists for the 180 * provided peer and txhash, nothing happens. 181 * 182 * It should be called whenever a transaction or NOTFOUND was received from a peer. When the transaction is 183 * not needed entirely anymore, ForgetTxhash should be called instead of, or in addition to, this call. 184 */ 185 void ReceivedResponse(NodeId peer, const uint256& txhash); 186 187 // The operations below inspect the data structure. 188 189 /** Count how many REQUESTED announcements a peer has. */ 190 size_t CountInFlight(NodeId peer) const; 191 192 /** Count how many CANDIDATE announcements a peer has. */ 193 size_t CountCandidates(NodeId peer) const; 194 195 /** Count how many announcements a peer has (REQUESTED, CANDIDATE, and COMPLETED combined). */ 196 size_t Count(NodeId peer) const; 197 198 /** Count how many announcements are being tracked in total across all peers and transaction hashes. */ 199 size_t Size() const; 200 201 /** For some txhash (txid or wtxid), finds all peers with non-COMPLETED announcements and appends them to 202 * result_peers. Does not try to ensure that result_peers contains no duplicates. */ 203 void GetCandidatePeers(const uint256& txhash, std::vector<NodeId>& result_peers) const; 204 205 /** Access to the internal priority computation (testing only) */ 206 uint64_t ComputePriority(const uint256& txhash, NodeId peer, bool preferred) const; 207 208 /** Run internal consistency check (testing only). */ 209 void SanityCheck() const; 210 211 /** Run a time-dependent internal consistency check (testing only). 212 * 213 * This can only be called immediately after GetRequestable, with the same 'now' parameter. 214 */ 215 void PostGetRequestableSanityCheck(std::chrono::microseconds now) const; 216 }; 217 218 #endif // BITCOIN_TXREQUEST_H