---
title: MongoDB Integration
titleShort: MongoDB
description: "Sync onchain data to your MongoDB database using Apibara."
priority: 699
updatedAt: 2023-08-29 15:00
---

# MongoDB integration

The MongoDB integration provides a way to mirror onchain data to a MongoDB
collection of your choice. Data is automatically inserted as it's produced by
the chain, and invalidated in case of chain reorganizations.

- The integration can be used to **populate a collection with data from one or
  more networks or smart contracts**.
- Create powerful analytics with MongoDB pipelines.
- Change how collections are queried without re-indexing.

### Installation

```
apibara plugins install sink-mongo
```

### Configuration

- `connectionString: string`: the MongoDB connection URL of your database.
- `database: string`: the target database name.
- `collectionName: string`: the target collection name.
- `collectionNames: [string]`: a list of target collection names. See the
  "Multiple Collections" section for more information.
- `entityMode: boolean`: enable entity mode. See the "Entity storage" section
  for more information.

### Collection schema

The transformation step is required to return an array of objects. Data is
converted to BSON and then written to the collection. The MongoDB integration
adds a `_cursor` field to each record so that data can be invalidated in case
of chain reorganizations.

### Querying data

When querying data, always add the following property to your MongoDB filter
to ensure you get the latest value:

```ts
{
  "_cursor.to": null,
}
```

The "Storage & Data Invalidation" section at the end of this document explains
why you need to add this condition to your filter.
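As a sketch, the condition can be merged into an ordinary application filter before running the query. The helper and the collection/field names below (`latestFilter`, `tokens`, `tokenId`) are illustrative, not part of the integration:

```typescript
// Merge the invalidation condition into an application filter so that
// only documents valid at the head of the chain are returned.
function latestFilter(
  filter: Record<string, unknown>,
): Record<string, unknown> {
  return { ...filter, "_cursor.to": null };
}

const filter = latestFilter({ tokenId: "1" });
// With the official `mongodb` Node.js driver, this filter would be
// passed to the query as-is, e.g.:
//   const token = await db.collection("tokens").findOne(filter);
```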
### Entity storage

The MongoDB integration works with two types of data:

- Immutable logs (default): the values returned by the indexer represent
  something that doesn't change over time; in other words, they're a list of
  items. For example, a list of token transfers.
- Mutable entities: the values returned by the indexer represent the state of
  an entity at a given block. For example, token balances change block by
  block as users transfer the token.

You can index entities by setting the `entityMode` option to `true`. When you set
this option, the indexer expects the transform function to return a list of
update operations. An update operation is a JavaScript object with an `entity`
property used to filter which entities should be updated, and an `update` property
with either an [update
document](https://www.mongodb.com/docs/manual/reference/method/db.collection.updateMany/#std-label-updateMany-behavior-update-expressions)
or an [aggregation
pipeline](https://www.mongodb.com/docs/manual/reference/method/db.collection.updateMany/#std-label-updateMany-behavior-aggregation-pipeline).

**Example**: our indexer tracks token ownership for an ERC-721 smart contract,
together with the number of transactions for each token.
We enable entity storage by setting the `entityMode` option to `true`.
The transform function returns the entities that need updating, together with the
operation to update their state.

```ts
export default function transform(block: Block) {
  // Example to show the shape of data returned by transform.
  // `contract` is the address of the tracked smart contract.
  return [
    {
      entity: { contract, tokenId: "1" },
      update: { "$set": { owner: "0xA" }, "$inc": { "txCount": 1 } },
    },
    {
      entity: { contract, tokenId: "2" },
      update: { "$set": { owner: "0xB" }, "$inc": { "txCount": 3 } },
    },
  ];
}
```

The integration will iterate through the new entities and update the existing
values (if any) using the following MongoDB pseudo-query:

```ts
for (const doc of returnValue) {
  db.collection.updateMany(doc.entity, doc.update, { upsert: true });
}
```

Notice that in reality the query is more complex; refer to the next
section to learn more about how the MongoDB integration stores data.

### Multiple Collections

You can write to multiple collections at the same time using the
`collectionNames` option. When you do, the transform function should specify
which collection to write the data to:

- In standard mode, put the returned data in a `data` key and add a
  `collection` key; the returned value will look like
  `{ data: any, collection: string }`.
- In entity mode, just add a `collection` key to the returned object; it
  should look like `{ entity: any, collection: string, update: any }`.

### Storage & Data Invalidation

Storing blockchain data poses an additional challenge, since we must be able to
roll back the database state in case of chain reorganizations.
This integration adds an additional `_cursor` field to all documents to track
the block range for which a piece of data is valid.

```ts
type Cursor = {
  /** Block (inclusive) when this piece of data was created. */
  from: number,
  /** Block (exclusive) at which this piece of data became invalid. */
  to: number | null,
};
```

It follows that a document is valid at the most recent block if its `_cursor.to`
field is `null`.
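The validity rule can be written as a small predicate. This is a sketch for illustration; `isValidAt` is not part of the integration:

```typescript
type Cursor = {
  from: number; // inclusive
  to: number | null; // exclusive; null means "still valid"
};

// A document is valid at `block` when the block falls inside the
// half-open range [from, to).
function isValidAt(cursor: Cursor, block: number): boolean {
  return cursor.from <= block && (cursor.to === null || block < cursor.to);
}

isValidAt({ from: 1000, to: 1010 }, 1005); // true
isValidAt({ from: 1000, to: 1010 }, 1010); // false: `to` is exclusive
isValidAt({ from: 1010, to: null }, 2000); // true: still valid
```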
**Example**: we're indexing an ERC-721 token with the following transfers:

- block: 1000, transfer from 0x0 to 0xA
- block: 1010, transfer from 0xA to 0xB
- block: 1020, transfer from 0xB to 0xC

If we put the token ownership on a timeline, it looks like the following diagram.

```txt
  1000                    1010                  1020
--+-----------------------+---------------------+---- - - - - - - -
  [ { owner: "0xA" }      )
                          [ { owner: "0xB" }    )
                                                [ { owner: "0xC" }
```

Which translates to the following documents in the MongoDB collection.

After the first transfer:

```json
[
  { "owner": "0xA", "_cursor": { "from": 1000, "to": null } }
]
```

After the second transfer:

```json
[
  { "owner": "0xA", "_cursor": { "from": 1000, "to": 1010 } },
  { "owner": "0xB", "_cursor": { "from": 1010, "to": null } }
]
```

And after the third transfer:

```json
[
  { "owner": "0xA", "_cursor": { "from": 1000, "to": 1010 } },
  { "owner": "0xB", "_cursor": { "from": 1010, "to": 1020 } },
  { "owner": "0xC", "_cursor": { "from": 1020, "to": null } }
]
```
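Because old documents are kept with a closed `_cursor` range rather than deleted, the same semantics also allow querying the state at a past block. A hypothetical filter builder (not an official API of the integration) could look like this:

```typescript
// Build a filter matching documents that were valid at `block`:
// `_cursor.from` is inclusive, `_cursor.to` is exclusive, and a null
// `_cursor.to` means the document is still valid.
function validAtBlock(block: number): Record<string, unknown> {
  return {
    "_cursor.from": { $lte: block },
    $or: [{ "_cursor.to": null }, { "_cursor.to": { $gt: block } }],
  };
}

const historicalFilter = validAtBlock(1015);
// Against the example collection above, this filter matches only the
// `{ owner: "0xB" }` document.
```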