---
title: "GitHubRepoViewer"
id: githubrepoviewer
slug: "/githubrepoviewer"
description: "This component navigates and fetches content from GitHub repositories through the GitHub API."
---

# GitHubRepoViewer

This component navigates and fetches content from GitHub repositories through the GitHub API.

<div className="key-value-table">

|  |  |
| --- | --- |
| **Most common position in a pipeline** | Right at the beginning of a pipeline and before a [ChatPromptBuilder](../builders/chatpromptbuilder.mdx) that expects the content of GitHub files as input |
| **Mandatory run variables** | `path`: Repository path to view <br /> <br />`repo`: Repository in owner/repo format |
| **Output variables** | `documents`: A list of documents containing repository contents |
| **API reference** | [GitHub](/reference/integrations-github) |
| **GitHub link** | https://github.com/deepset-ai/haystack-core-integrations/tree/main/integrations/github |

</div>

## Overview

`GitHubRepoViewer` behaves differently depending on the path type:

- **For directories**: Returns a list of documents, one for each item (files and subdirectories).
- **For files**: Returns a single document containing the file content.

Each document includes metadata such as the path, type, size, and URL.

### Authorization

The component works without authentication for public repositories. For private repositories, or to avoid rate limiting, provide a GitHub personal access token.

You can set the token using the `GITHUB_TOKEN` environment variable, or pass it directly during initialization via the `github_token` parameter.

To create a personal access token, visit [GitHub's token settings page](https://github.com/settings/tokens).
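
As a rough illustration of what supplying a token changes at the HTTP level, the sketch below builds request headers for the GitHub REST API with and without a token. It uses only the standard library and is not the component's implementation. `build_github_headers` is a hypothetical helper, and its fallback to `GITHUB_TOKEN` is an assumption made for this example rather than documented component behavior.

```python
import os
from typing import Dict, Optional


def build_github_headers(token: Optional[str] = None) -> Dict[str, str]:
    """Return HTTP headers for a GitHub REST API request.

    Hypothetical helper for illustration; not part of the integration.
    """
    # Assumption: fall back to the GITHUB_TOKEN environment variable
    # when no token is passed explicitly.
    token = token or os.environ.get("GITHUB_TOKEN")

    headers = {"Accept": "application/vnd.github+json"}
    if token:
        # Authenticated requests can read private repositories and
        # are subject to a higher rate limit than anonymous ones.
        headers["Authorization"] = f"Bearer {token}"
    return headers


# With an explicit token, the Authorization header is present.
authenticated = build_github_headers("example-token")
print("Authorization" in authenticated)
```

Without a token and with `GITHUB_TOKEN` unset, the helper returns only the `Accept` header, which corresponds to an unauthenticated request against a public repository.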

### Installation

Install the GitHub integration with pip:

```shell
pip install github-haystack
```

## Usage

:::info[Repository Placeholder]

The following code snippets use the public `deepset-ai/haystack` repository. To view a different repository, replace it with your own repository name in `owner/repo` format.
:::

### On its own

Viewing a directory listing:

```python
from haystack_integrations.components.connectors.github import GitHubRepoViewer

viewer = GitHubRepoViewer()

# Pass the repository, path, and branch at run time
result = viewer.run(
    repo="deepset-ai/haystack",
    path="haystack/components",
    branch="main",
)

print(result)
```

```bash
{'documents': [Document(id=..., content: 'agents', meta: {'path': 'haystack/components/agents', 'type': 'dir', 'size': 0, 'url': 'https://github.com/deepset-ai/haystack/tree/main/haystack/components/agents'}), ...]}
```

Viewing a specific file:

```python
from haystack_integrations.components.connectors.github import GitHubRepoViewer

# Set the repository and branch once at initialization
viewer = GitHubRepoViewer(repo="deepset-ai/haystack", branch="main")
result = viewer.run(path="README.md")

print(result)
```

```bash
{'documents': [Document(id=..., content: '<div align="center">
  <a href="https://haystack.deepset.ai/"><img src="https://raw.githubuserconten...', meta: {'path': 'README.md', 'type': 'file_content', 'size': 11979, 'url': 'https://github.com/deepset-ai/haystack/blob/main/README.md'})]}
```
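
The difference between the two outputs above can be sketched without calling the API. GitHub's repository contents endpoint returns a JSON array for a directory and a single JSON object, with base64-encoded `content`, for a file. The hypothetical `contents_to_records` function below maps both shapes to plain dictionaries laid out like the documents shown above. It is an illustration under those assumptions, not the component's code, and the sample payloads are made up.

```python
import base64
from typing import Any, Dict, List, Union

Payload = Union[Dict[str, Any], List[Dict[str, Any]]]


def contents_to_records(payload: Payload) -> List[Dict[str, Any]]:
    """Map a contents-API-style payload to document-like dictionaries."""
    if isinstance(payload, list):
        # Directory: one record per entry, with the entry name as content.
        return [
            {
                "content": item["name"],
                "meta": {
                    "path": item["path"],
                    "type": item["type"],  # "file" or "dir"
                    "size": item["size"],
                    "url": item["html_url"],
                },
            }
            for item in payload
        ]

    # File: a single record holding the decoded file text.
    text = base64.b64decode(payload["content"]).decode("utf-8")
    return [
        {
            "content": text,
            "meta": {
                "path": payload["path"],
                "type": "file_content",
                "size": payload["size"],
                "url": payload["html_url"],
            },
        }
    ]


# Made-up sample payloads, for illustration only.
directory_payload = [
    {"name": "agents", "path": "src/agents", "type": "dir", "size": 0,
     "html_url": "https://example.com/tree/main/src/agents"},
    {"name": "__init__.py", "path": "src/__init__.py", "type": "file", "size": 120,
     "html_url": "https://example.com/blob/main/src/__init__.py"},
]
file_payload = {
    "name": "README.md", "path": "README.md", "type": "file", "size": 7,
    "content": base64.b64encode(b"# Demo\n").decode("ascii"),
    "html_url": "https://example.com/blob/main/README.md",
}

directory_records = contents_to_records(directory_payload)
file_records = contents_to_records(file_payload)
print(len(directory_records), len(file_records))
```

A directory payload yields one record per entry, while a file payload always yields a single record, which matches the two cases described in the Overview.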