---
title: "Websearch"
id: websearch-api
description: "Web search engine for Haystack."
slug: "/websearch-api"
---

## searchapi

### SearchApiWebSearch

Uses [SearchApi](https://www.searchapi.io/) to search the web for relevant documents.

Usage example:

```python
from haystack.components.websearch import SearchApiWebSearch
from haystack.utils import Secret

websearch = SearchApiWebSearch(top_k=10, api_key=Secret.from_token("test-api-key"))
results = websearch.run(query="Who is the boyfriend of Olivia Wilde?")

assert results["documents"]
assert results["links"]
```

#### __init__

```python
__init__(
    api_key: Secret = Secret.from_env_var("SEARCHAPI_API_KEY"),
    top_k: int | None = 10,
    allowed_domains: list[str] | None = None,
    search_params: dict[str, Any] | None = None,
) -> None
```

Initialize the SearchApiWebSearch component.

**Parameters:**

- **api_key** (<code>Secret</code>) – API key for the SearchApi API.
- **top_k** (<code>int | None</code>) – Number of documents to return.
- **allowed_domains** (<code>list\[str\] | None</code>) – List of domains to limit the search to.
- **search_params** (<code>dict\[str, Any\] | None</code>) – Additional parameters passed to the SearchApi API.
  For example, you can set 'num' to 100 to increase the number of search results.
  See the [SearchApi website](https://www.searchapi.io/) for more details.

  The default search engine is Google. To use a different one, set the `engine`
  parameter in `search_params`.

#### to_dict

```python
to_dict() -> dict[str, Any]
```

Serializes the component to a dictionary.

**Returns:**

- <code>dict\[str, Any\]</code> – Dictionary with serialized data.

#### from_dict

```python
from_dict(data: dict[str, Any]) -> SearchApiWebSearch
```

Deserializes the component from a dictionary.

**Parameters:**

- **data** (<code>dict\[str, Any\]</code>) – The dictionary to deserialize from.

**Returns:**

- <code>SearchApiWebSearch</code> – The deserialized component.

#### run

```python
run(query: str) -> dict[str, list[Document] | list[str]]
```

Uses [SearchApi](https://www.searchapi.io/) to search the web.

**Parameters:**

- **query** (<code>str</code>) – Search query.

**Returns:**

- <code>dict\[str, list\[Document\] | list\[str\]\]</code> – A dictionary with the following keys:
  - "documents": List of documents returned by the search engine.
  - "links": List of links returned by the search engine.

**Raises:**

- <code>TimeoutError</code> – If the request to the SearchApi API times out.
- <code>SearchApiError</code> – If an error occurs while querying the SearchApi API.

#### run_async

```python
run_async(query: str) -> dict[str, list[Document] | list[str]]
```

Asynchronously uses [SearchApi](https://www.searchapi.io/) to search the web.

This is the asynchronous version of the `run` method with the same parameters and return values.

**Parameters:**

- **query** (<code>str</code>) – Search query.

**Returns:**

- <code>dict\[str, list\[Document\] | list\[str\]\]</code> – A dictionary with the following keys:
  - "documents": List of documents returned by the search engine.
  - "links": List of links returned by the search engine.

**Raises:**

- <code>TimeoutError</code> – If the request to the SearchApi API times out.
- <code>SearchApiError</code> – If an error occurs while querying the SearchApi API.

## serper_dev

### SerperDevWebSearch

Uses [Serper](https://serper.dev/) to search the web for relevant documents.

See the [Serper Dev website](https://serper.dev/) for more details.

Usage example:

```python
from haystack.components.websearch import SerperDevWebSearch
from haystack.utils import Secret

websearch = SerperDevWebSearch(top_k=10, api_key=Secret.from_token("test-api-key"))
results = websearch.run(query="Who is the boyfriend of Olivia Wilde?")

assert results["documents"]
assert results["links"]

# Example with domain filtering - exclude subdomains
websearch_filtered = SerperDevWebSearch(
    top_k=10,
    allowed_domains=["example.com"],
    exclude_subdomains=True,  # Only results from example.com, not blog.example.com
    api_key=Secret.from_token("test-api-key")
)
results_filtered = websearch_filtered.run(query="search query")
```

#### __init__

```python
__init__(
    api_key: Secret = Secret.from_env_var("SERPERDEV_API_KEY"),
    top_k: int | None = 10,
    allowed_domains: list[str] | None = None,
    search_params: dict[str, Any] | None = None,
    *,
    exclude_subdomains: bool = False
) -> None
```

Initialize the SerperDevWebSearch component.

**Parameters:**

- **api_key** (<code>Secret</code>) – API key for the Serper API.
- **top_k** (<code>int | None</code>) – Number of documents to return.
- **allowed_domains** (<code>list\[str\] | None</code>) – List of domains to limit the search to.
- **exclude_subdomains** (<code>bool</code>) – Whether to exclude subdomains when filtering by allowed_domains.
  If True, only results from the exact domains in allowed_domains will be returned.
  If False, results from subdomains will also be included. Defaults to False.
- **search_params** (<code>dict\[str, Any\] | None</code>) – Additional parameters passed to the Serper API.
  For example, you can set 'num' to 20 to increase the number of search results.
  See the [Serper website](https://serper.dev/) for more details.
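
The difference between the two `exclude_subdomains` settings can be illustrated with a small standalone sketch. The helper `is_allowed` below is hypothetical and is not the component's actual implementation (which may filter inside the search query, after the response, or both); it only mirrors the behavior described in the parameter list above, using the standard library.

```python
from urllib.parse import urlparse


def is_allowed(url: str, allowed_domains: list[str], exclude_subdomains: bool = False) -> bool:
    """Hypothetical helper: check a result URL against allowed_domains."""
    host = (urlparse(url).hostname or "").lower()
    for domain in allowed_domains:
        domain = domain.lower()
        if host == domain:
            return True  # an exact domain match is always allowed
        if not exclude_subdomains and host.endswith("." + domain):
            return True  # subdomains pass only when they are not excluded
    return False


links = [
    "https://example.com/docs",
    "https://blog.example.com/post",
    "https://notexample.com/",
]

# exclude_subdomains=True keeps only the exact domain.
exact_only = [u for u in links if is_allowed(u, ["example.com"], exclude_subdomains=True)]

# The default (False) also keeps blog.example.com.
with_subdomains = [u for u in links if is_allowed(u, ["example.com"])]
```

Matching on a leading dot (`"." + domain`) rather than a bare suffix keeps unrelated hosts such as `notexample.com` out of the results in both modes.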

#### to_dict

```python
to_dict() -> dict[str, Any]
```

Serializes the component to a dictionary.

**Returns:**

- <code>dict\[str, Any\]</code> – Dictionary with serialized data.

#### from_dict

```python
from_dict(data: dict[str, Any]) -> SerperDevWebSearch
```

Deserializes the component from a dictionary.

**Parameters:**

- **data** (<code>dict\[str, Any\]</code>) – The dictionary to deserialize from.

**Returns:**

- <code>SerperDevWebSearch</code> – The deserialized component.

#### run

```python
run(query: str) -> dict[str, list[Document] | list[str]]
```

Use [Serper](https://serper.dev/) to search the web.

**Parameters:**

- **query** (<code>str</code>) – Search query.

**Returns:**

- <code>dict\[str, list\[Document\] | list\[str\]\]</code> – A dictionary with the following keys:
  - "documents": List of documents returned by the search engine.
  - "links": List of links returned by the search engine.

**Raises:**

- <code>SerperDevError</code> – If an error occurs while querying the SerperDev API.
- <code>TimeoutError</code> – If the request to the SerperDev API times out.

#### run_async

```python
run_async(query: str) -> dict[str, list[Document] | list[str]]
```

Asynchronously uses [Serper](https://serper.dev/) to search the web.

This is the asynchronous version of the `run` method with the same parameters and return values.

**Parameters:**

- **query** (<code>str</code>) – Search query.

**Returns:**

- <code>dict\[str, list\[Document\] | list\[str\]\]</code> – A dictionary with the following keys:
  - "documents": List of documents returned by the search engine.
  - "links": List of links returned by the search engine.

**Raises:**

- <code>SerperDevError</code> – If an error occurs while querying the SerperDev API.
- <code>TimeoutError</code> – If the request to the SerperDev API times out.
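
Because `run_async` is a coroutine, several queries can be issued concurrently. The sketch below shows one way this might look. So that it runs without an API key or network access, it uses two hypothetical stand-ins, `FakeWebSearch` and `FakeDocument`, which only mimic the documented `run_async(query)` signature and return shape. They are not part of Haystack.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class FakeDocument:
    """Stand-in for Haystack's Document, for illustration only."""
    content: str


class FakeWebSearch:
    """Hypothetical stub that mimics the documented run_async signature."""

    async def run_async(self, query: str) -> dict:
        await asyncio.sleep(0)  # placeholder for the network round trip
        return {
            "documents": [FakeDocument(content=f"result for {query}")],
            "links": [f"https://example.com/search?q={query.replace(' ', '+')}"],
        }


async def search_all(websearch, queries: list[str]) -> dict[str, dict]:
    # Issue all queries concurrently; gather returns results in input order.
    results = await asyncio.gather(*(websearch.run_async(query=q) for q in queries))
    return dict(zip(queries, results))


results = asyncio.run(search_all(FakeWebSearch(), ["haystack pipelines", "web search"]))
```

With a real component, `FakeWebSearch()` would presumably be replaced by a configured `SerperDevWebSearch` or `SearchApiWebSearch` instance, and the caller would need to handle the `TimeoutError` and API error cases listed under **Raises**.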