---
title: "GitHubRepoViewerTool"
id: githubrepoviewertool
slug: "/githubrepoviewertool"
description: "A Tool that allows Agents and ToolInvokers to navigate and fetch content from GitHub repositories."
---

# GitHubRepoViewerTool

A Tool that allows Agents and ToolInvokers to navigate and fetch content from GitHub repositories.

<div className="key-value-table">

|  |  |
| --- | --- |
| **API reference** | [Tools](/reference/tools-api) |
| **GitHub link** | https://github.com/deepset-ai/haystack-core-integrations/tree/main/integrations/github |

</div>

## Overview

`GitHubRepoViewerTool` wraps the [`GitHubRepoViewer`](../../pipeline-components/connectors/githubrepoviewer.mdx) component, providing a tool interface for use in agent workflows and tool-based pipelines.

The tool behaves differently depending on the path type:

- **For directories**: Returns a list of documents, one for each item (files and subdirectories).
- **For files**: Returns a single document containing the file content.

Each document includes rich metadata such as the path, type, size, and URL.

### Parameters

- `name` is _optional_ and defaults to "repo_viewer". Specifies the name of the tool.
- `description` is _optional_ and provides context to the LLM about what the tool does.
- `github_token` is _optional_ but recommended for private repositories or to avoid rate limiting.
- `repo` is _optional_ and sets a default repository in `owner/repo` format.
- `branch` is _optional_ and defaults to "main". Sets the default branch to work with.
- `raise_on_failure` is _optional_ and defaults to `True`. If `False`, errors are returned as documents instead of raising exceptions.
- `max_file_size` is _optional_ and defaults to `1,000,000` bytes (1 MB). Maximum file size to fetch.
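
The metadata described in the overview (`path`, `type`, `size`) can drive simple client-side logic, such as deciding which entries of a directory listing to open next. The following is a minimal sketch and not part of the integration: `ListingEntry` and `split_listing` are hypothetical names, the sample values are made up for illustration, and the `"file"` type value is an assumption (the documented output only shows `"dir"`). It also assumes that files larger than `max_file_size` are the ones to skip.

```python
from dataclasses import dataclass, field

# Mirrors the documented default of `max_file_size` (1,000,000 bytes).
MAX_FILE_SIZE = 1_000_000


@dataclass
class ListingEntry:
    """Hypothetical stand-in for a returned document; only `content` and `meta` are modeled."""

    content: str
    meta: dict = field(default_factory=dict)


def split_listing(entries, max_file_size=MAX_FILE_SIZE):
    """Split listing entries into directory paths, fetchable file paths, and oversized file paths."""
    dirs, files, too_large = [], [], []
    for entry in entries:
        path = entry.meta["path"]
        if entry.meta.get("type") == "dir":
            dirs.append(path)
        elif entry.meta.get("size", 0) > max_file_size:
            # Assumption: anything above the limit would not be fetched.
            too_large.append(path)
        else:
            files.append(path)
    return dirs, files, too_large


# Illustrative values only, not real tool output.
listing = [
    ListingEntry("agents", {"path": "haystack/components/agents", "type": "dir", "size": 0}),
    ListingEntry("__init__.py", {"path": "haystack/components/__init__.py", "type": "file", "size": 120}),
    ListingEntry("big.bin", {"path": "haystack/components/big.bin", "type": "file", "size": 2_000_000}),
]

dirs, files, too_large = split_listing(listing)
print(dirs, files, too_large)
```

Because the helper only reads `content` and `meta`, it should also work on the documents the tool returns, provided their metadata carries the same keys.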

## Usage

Install the GitHub integration to use the `GitHubRepoViewerTool`:

```shell
pip install github-haystack
```

:::info[Repository Placeholder]

To run the following code snippets, you need to replace the `owner/repo` with your own GitHub repository name.
:::

### On its own

Basic usage to view repository contents:

```python
from haystack_integrations.tools.github import GitHubRepoViewerTool

tool = GitHubRepoViewerTool()
result = tool.invoke(
    repo="deepset-ai/haystack",
    path="haystack/components",
    branch="main",
)

print(result)
```

```bash
{'documents': [Document(id=..., content: 'agents', meta: {'path': 'haystack/components/agents', 'type': 'dir', 'size': 0, 'url': 'https://github.com/deepset-ai/haystack/tree/main/haystack/components/agents'}), Document(id=..., content: 'audio', meta: {'path': 'haystack/components/audio', 'type': 'dir', 'size': 0, 'url': 'https://github.com/deepset-ai/haystack/tree/main/haystack/components/audio'}),...]}
```

### With an Agent

You can use `GitHubRepoViewerTool` with the [Agent](../../pipeline-components/agents-1/agent.mdx) component. The Agent will automatically invoke the tool when needed to explore repository structure and read files.

Note that we set the Agent's `state_schema` parameter in this code example so that the GitHubRepoViewerTool can write documents to the state.

```python
from typing import List

from haystack.components.generators.chat import OpenAIChatGenerator
from haystack.dataclasses import ChatMessage, Document
from haystack.components.agents import Agent
from haystack_integrations.tools.github import GitHubRepoViewerTool

repo_tool = GitHubRepoViewerTool(name="github_repo_viewer")

agent = Agent(
    chat_generator=OpenAIChatGenerator(),
    tools=[repo_tool],
    exit_conditions=["text"],
    state_schema={"documents": {"type": List[Document]}},
)

response = agent.run(
    messages=[
        ChatMessage.from_user(
            "Can you analyze the structure of the deepset-ai/haystack repository and tell me about the main components?",
        ),
    ],
)

print(response["last_message"].text)
```

```bash
The `deepset-ai/haystack` repository has a structured layout that includes several important components. Here's an overview of its main parts:

1. **Directories**:
   - **`.github`**: Contains GitHub-specific configuration files and workflows.
   - **`docker`**: Likely includes Docker-related files for containerization of the Haystack application.
   - **`docs`**: Contains documentation for the Haystack project. This could include guides, API documentation, and other related resources.
   - **`e2e`**: This likely stands for "end-to-end", possibly containing tests or examples related to end-to-end functionality of the Haystack framework.
   - **`examples`**: Includes example scripts or notebooks demonstrating how to use Haystack.
   - **`haystack`**: This is likely the core source code of the Haystack framework itself, containing the main functionality and classes.
   - **`proposals`**: A directory that may contain proposals for new features or changes to the Haystack project.
   - **`releasenotes`**: Contains notes about various releases, including changes and improvements.
   - **`test`**: This directory likely contains unit tests and other testing utilities to ensure code quality and functionality.

2. **Files**:
   - **`.gitignore`**: Specifies files and directories that should be ignored by Git.
   - **`.pre-commit-config.yaml`**: Configuration file for pre-commit hooks to automate code quality checks.
   - **`CITATION.cff`**: Might include information on how to cite the repository in academic work.
   - **`code_of_conduct.txt`**: Contains the code of conduct for contributors and users of the repository.
   - **`CONTRIBUTING.md`**: Guidelines for contributing to the repository.
   - **`LICENSE`**: The license under which the project is distributed.
   - **`VERSION.txt`**: Contains versioning information for the project.
   - **`README.md`**: A markdown file that usually provides an overview of the project, installation instructions, and usage examples.
   - **`SECURITY.md`**: Contains information about the security policy of the repository.

This structure indicates a well-organized repository that follows common conventions in open-source projects, with a focus on documentation, contribution guidelines, and testing. The core functionalities are likely housed in the `haystack` directory, with additional resources provided in the other directories.
```