/ docs / advanced / electron.md
electron.md
  1  ---
  2  description: How to CLI-ify and automate any Electron Desktop Application via CDP
  3  ---
  4  
  5  # CLI-ifying Electron Applications (Skill Guide)
  6  
  7  Based on the successful automation of **Cursor**, **Codex**, **Antigravity**, **ChatWise**, **Notion**, and **Discord** desktop apps, this guide serves as the standard operating procedure (SOP) for adapting ANY Electron-based application into an OpenCLI adapter.
  8  
  9  ## Core Concept
 10  
 11  Electron apps are essentially local Chromium browser instances. By exposing a debugging port (CDP — Chrome DevTools Protocol) at launch time, we can use the Browser Bridge to pierce through the UI layer, accessing and controlling all underlying state including React/Vue components and Shadow DOM.
 12  
 13  > **Note:** Not all desktop apps are Electron. WeChat (native Cocoa) and Feishu/Lark (custom Lark Framework) embed Chromium but do NOT expose CDP. For those apps, use the AppleScript + clipboard approach instead (see [Non-Electron Pattern](#non-electron-pattern-applescript)).
 14  
 15  ### Launching the Target App
 16  ```bash
 17  /Applications/AppName.app/Contents/MacOS/AppName --remote-debugging-port=9222
 18  ```
 19  
 20  ### Verifying Electron
 21  ```bash
 22  # Check for Electron Framework in the app bundle
 23  ls /Applications/AppName.app/Contents/Frameworks/Electron\ Framework.framework
 24  # If this directory exists → Electron → CDP works
 25  # If not → check for libEGL.dylib (embedded Chromium/CEF, CDP may not work)
 26  ```
 27  
 28  ## The 5-Command Pattern (CDP / Electron)
 29  
 30  Every new Electron adapter should implement these 5 commands in `clis/<app_name>/`:
 31  
 32  ### 1. `status.ts` — Connection Test
 33  ```typescript
 34  export const statusCommand = cli({
 35    site: 'myapp',
 36    name: 'status',
 37    domain: 'localhost',
 38    strategy: Strategy.UI,
 39    browser: true,       // Requires CDP connection
 40    args: [],
 41    columns: ['Status', 'Url', 'Title'],
 42    func: async (page: IPage) => {
 43      const url = await page.evaluate('window.location.href');
 44      const title = await page.evaluate('document.title');
 45      return [{ Status: 'Connected', Url: url, Title: title }];
 46    },
 47  });
 48  ```
 49  
 50  ### 2. `dump.ts` — Reverse Engineering Core
 51  Modern app DOMs are huge and obfuscated. **Never guess selectors.** Dump first, then extract precise class names with AI or `grep`:
 52  ```typescript
 53  const dom = await page.evaluate('document.body.innerHTML');
 54  fs.writeFileSync('/tmp/app-dom.html', dom);
 55  const snap = await page.snapshot({ interactive: false });
 56  fs.writeFileSync('/tmp/app-snapshot.json', JSON.stringify(snap, null, 2));
 57  ```
 58  
 59  ### 3. `send.ts` — Advanced Text Injection
 60  Electron apps often use complex rich-text editors (Monaco, Lexical, ProseMirror). Setting `.value` directly is ignored by React state.
 61  
 62  **Best practice:** Use `document.execCommand('insertText')` to perfectly simulate real user input, fully piercing React state:
 63  ```javascript
 64  const composer = document.querySelector('[contenteditable="true"]');
 65  composer.focus();
 66  document.execCommand('insertText', false, 'Hello');
 67  ```
 68  Then submit with `await page.pressKey('Enter')`.
 69  
 70  ### 4. `read.ts` — Context Extraction
 71  Don't extract the entire page text. Use `dump.ts` output to find the real "conversation container":
 72  - Look for semantic selectors: `[role="log"]`, `[data-testid="conversation"]`, `[data-content-search-turn-key]`
 73  - Format output as Markdown — readable by both humans and LLMs
 74  
 75  ### 5. `new.ts` — Keyboard Shortcuts
 76  Many GUI actions respond to native shortcuts rather than button clicks:
 77  ```typescript
 78  const isMac = process.platform === 'darwin';
 79  await page.pressKey(isMac ? 'Meta+N' : 'Control+N');
 80  await page.wait(1); // Wait for re-render
 81  ```
 82  
 83  ## Environment Variable
 84  ```bash
 85  export OPENCLI_CDP_ENDPOINT="http://127.0.0.1:9222"
 86  ```
 87  
 88  ## Non-Electron Pattern (AppleScript)
 89  
 90  For native macOS apps (WeChat, Feishu) that don't expose CDP:
 91  ```typescript
 92  export const statusCommand = cli({
 93    site: 'myapp',
 94    strategy: Strategy.PUBLIC,
 95    browser: false,       // No browser needed
 96    func: async (page: IPage | null) => {
 97      const output = execSync("osascript -e 'application \"MyApp\" is running'", { encoding: 'utf-8' }).trim();
 98      return [{ Status: output === 'true' ? 'Running' : 'Stopped' }];
 99    },
100  });
101  ```
102  
103  Core techniques:
104  - **status**: `osascript -e 'application "AppName" is running'`
105  - **send**: `pbcopy` → activate window → `Cmd+V` → `Enter`
106  - **read**: `Cmd+A` → `Cmd+C` → `pbpaste`
107  - **search**: Activate → `Cmd+F`/`Cmd+K` → `keystroke "query"`
108  
109  ## Pitfalls & Gotchas
110  
111  1. **Port conflicts (EADDRINUSE)**: Only one app per port. Use unique ports: Codex=9222, ChatGPT=9224, Cursor=9226, ChatWise=9228, Notion=9230, Discord=9232
112  2. **IPage abstraction**: OpenCLI wraps the browser page as `IPage` (`src/types.ts`). Use `page.pressKey()` and `page.evaluate()`, NOT direct DOM APIs
113  3. **Timing**: Always add `await page.wait(0.5)` to `1.0` after DOM mutations. Returning too early disconnects prematurely
114  4. **AppleScript requires Accessibility**: Terminal app must be granted permission in System Settings → Privacy & Security → Accessibility
115  
116  ## Port Assignment Table
117  
118  | App | Port | Mode |
119  |-----|------|------|
120  | Codex | 9222 | CDP |
121  | ChatGPT | 9224 | CDP / AppleScript |
122  | Cursor | 9226 | CDP |
123  | ChatWise | 9228 | CDP |
124  | Notion | 9230 | CDP |
125  | Discord App | 9232 | CDP |