CORRELATION_TOOL_README.md
# Transaction Correlation Monitor Tool

## 🎯 Purpose

This tool monitors **Espresso testnet** and **Caff Node** simultaneously to find correlations between transactions in different formats:
- **Espresso**: TX~ format (base64)
- **Caff Node**: 0x format (hex)

## 🚀 Quick Start

```bash
# Monitor for 10 minutes (default)
node correlate-tx-monitor.js

# Monitor for 20 minutes
node correlate-tx-monitor.js 20

# Monitor for 30 minutes (to catch more activity)
node correlate-tx-monitor.js 30
```

## 📊 What It Does

### Data Collection

1. **Fetches current block heights** from both networks
2. **Calculates block range** for the time window (e.g., last 10 minutes)
3. **Monitors Espresso** RARI namespace (1380012617) for TX~ transactions
4. **Monitors Caff Node** Rari chain for 0x transactions
5. **Correlates transactions** within 5s-3min delay tolerance

### Correlation Algorithm

**Matches transactions based on:**
- ✅ Timestamp proximity (5s - 3min window)
- ✅ Transaction index in block
- ✅ Transaction size similarity
- ✅ Namespace correctness
- ✅ Block transaction count

**Confidence scoring:**
- **High (≥80%)**: Very likely match - timestamp <1min, index matches
- **Medium (60-79%)**: Probable match - timestamp <3min, close index
- **Low (40-59%)**: Possible match - within time window

## 📁 Output Files

Every run generates **4-5 files** (the correlations CSV is written only when correlations are found):

### 1. JSON Report (Complete Data)
**File**: `correlation-full-{timestamp}.json`

Contains:
- Metadata (timestamp, duration, config)
- Summary statistics
- **ALL Espresso transactions** (with timestamps, hashes, blocks)
- **ALL Caff Node transactions** (with timestamps, hashes, from/to, value)
- **ALL correlations found** (with confidence scores)

### 2. Espresso Transactions CSV
**File**: `espresso-transactions-{timestamp}.csv`

Columns:
```
Hash, Block, Index, Namespace, Timestamp, Timestamp_ISO, Size
```

### 3. Caff Node Transactions CSV
**File**: `caff-transactions-{timestamp}.csv`

Columns:
```
Hash, Block, Index, Timestamp, Timestamp_ISO, From, To, Value, Gas, Size
```

**Example row:**
```csv
"0x025cdd51...",1416393,0,1761360628,"2025-10-25 02:50:28","0x000...a4b05","0x000...a4b05","0x0",0,133
```

### 4. Correlations CSV (if correlations found)
**File**: `correlations-{timestamp}.csv`

Columns:
```
Espresso_TX, Espresso_Block, Espresso_Index, Espresso_Time,
Caff_TX, Caff_Block, Caff_Index, Caff_Time,
Time_Diff_Seconds, Confidence, Confidence_Percent
```

### 5. Human-Readable Text Report
**File**: `correlation-report-{timestamp}.txt`

Contains:
- Configuration summary
- Data collection summary
- Confidence distribution
- Time delay statistics
- Top 10 correlations

## 📈 Current Results (Oct 27, 2025)

### Run 1: 10-minute window
```
Duration: 10 minutes
Espresso blocks: #5721926 - #5721976 (50 blocks)
Caff Node blocks: #1416393 - #1416443 (50 blocks)

Results:
✅ Caff Node: 102 transactions collected
❌ Espresso: 0 RARI transactions found
⚠️ No correlations (no Espresso activity in this window)
```

**Observation**: Caff Node averaged about 2 transactions per block across the collected range, but the RARI namespace on Espresso had no transactions during this time.

## 🔍 Analysis of Collected Data

### Caff Node Transaction Patterns

From the collected 102 transactions:

**Transaction Types:**
1. **System transactions** (majority)
   - From/To: `0x00000000000000000000000000000000000a4b05`
   - Value: 0
   - Gas: 0
   - Size: 133-165 bytes
   - Pattern: Every block has these

2. **User transactions** (occasional)
   - From: Real addresses
   - To: Contracts or null (deployments)
   - Gas: 53,458 - 3,496,663
   - Example: Block 1416400, 1416401

**Timestamp Distribution:**
```
Oct 25, 02:50 - Block 1416393
Oct 25, 03:41 - Block 1416394
Oct 25, 13:42 - Block 1416395
Oct 25, 23:43 - Block 1416396
Oct 26, 09:44 - Block 1416397
Oct 26, 19:45 - Block 1416398
Oct 27, 05:46 - Block 1416399
Oct 27, 15:04 - Block 1416400-1416443
```

**Block time**: ~10 hours between blocks 1416394 and 1416399 (very slow over the preceding few days); blocks 1416400-1416443 all carry the Oct 27, 15:04 timestamp. The 50-block range therefore spans about 2.5 days of timestamps rather than 10 minutes.

## 💡 Recommendations

### To Find Correlations

1. **Run for longer duration** (30-60 minutes)
   ```bash
   node correlate-tx-monitor.js 60
   ```

2. **Check when RARI is active**
   - Monitor Espresso explorer for RARI namespace activity
   - Run tool when you see activity

3. **Analyze historical data**
   - Check blocks when both networks had activity
   - Look at known transaction pairs (see documentation)

### To Improve Correlation

1. **Adjust delay window** if needed
   - Current: 5s - 3min
   - Can be modified in script (see Configuration)

2. **Add more matching factors**
   - Transaction sender address
   - Transaction value
   - Contract called

3. **Implement caching**
   - Store known correlations
   - Build correlation database

## 📊 How to Analyze the CSV Files

### In Excel/Google Sheets

1. **Open Caff Node CSV**
   - Sort by Timestamp to see chronological order
   - Filter by From/To to separate user transactions from system transactions
   - Look for patterns in transaction timing

2. **Open Espresso CSV** (when data available)
   - Sort by Block height
   - Compare timestamps with Caff Node
   - Match by proximity

3. **Manual correlation**
   - Look for transactions within 5s-3min
   - Match by transaction index
   - Verify with block timing

### In Python/Pandas

```python
import glob

import pandas as pd


def load_csv(pattern):
    # read_csv does not expand wildcards, so resolve the pattern first
    # and take the last matching file by name.
    return pd.read_csv(sorted(glob.glob(pattern))[-1])


# Load data
caff = load_csv('caff-transactions-*.csv')
espresso = load_csv('espresso-transactions-*.csv')

# Convert timestamps
caff['timestamp'] = pd.to_datetime(caff['Timestamp'], unit='s')
espresso['timestamp'] = pd.to_datetime(espresso['Timestamp'], unit='s')

# For each Caff Node transaction, find the nearest Espresso transaction
# and keep it if it is within 3 minutes
matches = []
if not espresso.empty:
    for _, caff_tx in caff.iterrows():
        time_diff = (espresso['timestamp'] - caff_tx['timestamp']).abs()
        nearest = time_diff.idxmin()
        if time_diff[nearest] < pd.Timedelta(minutes=3):
            matches.append({
                'caff_tx': caff_tx['Hash'],
                'espresso_tx': espresso.loc[nearest, 'Hash'],
                'time_diff': time_diff[nearest].total_seconds()
            })

print(f"Found {len(matches)} potential matches")
```

## 🔧 Configuration

Edit the script to adjust:

```javascript
const DELAY_MIN = 5;               // Minimum delay (seconds)
const DELAY_MAX = 180;             // Maximum delay (seconds)
const RARI_NAMESPACE = 1380012617; // Namespace to monitor
const POLL_INTERVAL = 5000;        // Poll interval (ms)
```

## 🐛 Troubleshooting

### No Espresso transactions found

**Possible cause**: The RARI namespace was not active in this time window.

**Solution**:
1. Check the Espresso explorer for recent RARI activity
2. Try a longer monitoring duration
3. Run when RARI is active, or check historical data

### API rate limiting

**Symptom**: Slow data collection or timeouts

**Solution**: Increase delays in script
```javascript
await new Promise(resolve => setTimeout(resolve, 500)); // Increase from 100ms
```

### Large file sizes

**Symptom**: JSON/CSV files are huge

**Solution**: Monitor shorter periods or filter transactions in post-processing

## 📝 Example Use Cases

### 1. Real-time Monitoring
```bash
# Watch for new correlations
node correlate-tx-monitor.js 10
# Check files every 10 minutes
```

### 2. Historical Analysis
```bash
# Collect data for 1 hour
node correlate-tx-monitor.js 60
# Analyze CSV files in spreadsheet
```

### 3. Pattern Discovery
```bash
# Multiple runs at different times
node correlate-tx-monitor.js 20  # Morning
node correlate-tx-monitor.js 20  # Afternoon
node correlate-tx-monitor.js 20  # Evening
# Compare patterns
```

## 🎯 Next Steps

1. **Capture active period**: Run when RARI namespace is active
2. **Analyze patterns**: Study transaction timing and delays
3. **Build correlation database**: Store proven matches
4. **Improve algorithm**: Add more matching factors
5. **Automate**: Run continuously and alert on high-confidence matches

## 📚 Related Documentation

- `CAFF_NODE_INTEGRATION_PLAN.md` - Full integration strategy
- `TX_CORRELATION_STRATEGY.md` - Correlation algorithms in detail
- `TEST_REPORT.md` - API testing results

## 🚀 Success Metrics

**When tool finds correlations:**
- ✅ High confidence (≥80%): Ready to use
- ✅ Medium confidence (60-79%): Review manually
- ⚠️ Low confidence (40-59%): Needs improvement

**Target**: Find 10+ high-confidence correlations to validate approach

---

**Tool Version**: 1.0
**Last Updated**: Oct 27, 2025
**Status**: ✅ Working, waiting for RARI activity
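
## 🧮 Appendix: Confidence Scoring Sketch

The confidence bands in this README (High ≥80%, Medium 60-79%, Low 40-59%) are described only qualitatively, so the sketch below shows one way a score could be assembled from timestamp proximity, index match, and size similarity. It is not the scoring used by `correlate-tx-monitor.js`: the `WEIGHTS` split, the function names `scoreCandidate` and `confidenceLabel`, the field names (`timestamp`, `index`, `size`), and the choice to treat the delay as an absolute difference are all assumptions made for illustration. Only `DELAY_MIN`, `DELAY_MAX`, and the band thresholds come from this document.

```javascript
// Values taken from the Configuration section of this README.
const DELAY_MIN = 5;   // seconds
const DELAY_MAX = 180; // seconds

// ASSUMPTION: an illustrative weight split that sums to 100.
const WEIGHTS = { time: 50, index: 30, size: 20 };

// Score one Espresso/Caff Node candidate pair on a 0-100 scale.
// Both arguments are hypothetical objects shaped like a CSV row:
// { timestamp: <unix seconds>, index: <position in block>, size: <bytes> }.
function scoreCandidate(espressoTx, caffTx) {
  // ASSUMPTION: direction of the delay is unknown, so use the absolute gap.
  const delay = Math.abs(caffTx.timestamp - espressoTx.timestamp);
  if (delay < DELAY_MIN || delay > DELAY_MAX) return 0;

  // Time: full credit at DELAY_MIN, falling linearly to zero at DELAY_MAX.
  const timeScore = 1 - (delay - DELAY_MIN) / (DELAY_MAX - DELAY_MIN);

  // Index: full credit for an exact match, half credit when off by one.
  const indexGap = Math.abs(espressoTx.index - caffTx.index);
  const indexScore = indexGap === 0 ? 1 : indexGap === 1 ? 0.5 : 0;

  // Size: ratio of the smaller size to the larger one (0 if either is 0).
  const larger = Math.max(espressoTx.size, caffTx.size);
  const smaller = Math.min(espressoTx.size, caffTx.size);
  const sizeScore = larger > 0 ? smaller / larger : 0;

  return Math.round(
    WEIGHTS.time * timeScore +
    WEIGHTS.index * indexScore +
    WEIGHTS.size * sizeScore
  );
}

// Map a percentage onto the confidence bands defined in this README.
function confidenceLabel(percent) {
  if (percent >= 80) return 'high';
  if (percent >= 60) return 'medium';
  if (percent >= 40) return 'low';
  return 'none';
}

// Usage with made-up transactions.
const espressoTx = { timestamp: 1000, index: 0, size: 133 };
const caffTx = { timestamp: 1005, index: 0, size: 133 };
const percent = scoreCandidate(espressoTx, caffTx);
console.log(percent, confidenceLabel(percent));
```

Keeping each factor as a 0-1 sub-score and weighting afterwards makes it easy to add further factors (sender address, value, contract called) by extending `WEIGHTS` without touching the band thresholds.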