websearch_api.md
---
title: "Websearch"
id: websearch-api
description: "Web search engine for Haystack."
slug: "/websearch-api"
---


## searchapi

### SearchApiWebSearch

Uses [SearchApi](https://www.searchapi.io/) to search the web for relevant documents.

Usage example:

<!-- test-ignore -->

```python
from haystack.components.websearch import SearchApiWebSearch
from haystack.utils import Secret

# The API key is read from the same environment variable that __init__ uses by default.
websearch = SearchApiWebSearch(top_k=10, api_key=Secret.from_env_var("SEARCHAPI_API_KEY"))
results = websearch.run(query="Who is the boyfriend of Olivia Wilde?")

assert results["documents"]
assert results["links"]
```

#### __init__

```python
__init__(
    api_key: Secret = Secret.from_env_var("SEARCHAPI_API_KEY"),
    top_k: int | None = 10,
    allowed_domains: list[str] | None = None,
    search_params: dict[str, Any] | None = None,
) -> None
```

Initialize the SearchApiWebSearch component.

**Parameters:**

- **api_key** (<code>Secret</code>) – API key for the SearchApi API.
- **top_k** (<code>int | None</code>) – Number of documents to return.
- **allowed_domains** (<code>list\[str\] | None</code>) – List of domains to limit the search to.
- **search_params** (<code>dict\[str, Any\] | None</code>) – Additional parameters passed to the SearchApi API.
  For example, you can set 'num' to 100 to increase the number of search results.
  See the [SearchApi website](https://www.searchapi.io/) for more details.

  The default search engine is Google. To use a different one, set the `engine`
  parameter in `search_params`.

#### to_dict

```python
to_dict() -> dict[str, Any]
```

Serializes the component to a dictionary.

**Returns:**

- <code>dict\[str, Any\]</code> – Dictionary with serialized data.
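
The `__init__` parameters documented for `SearchApiWebSearch` can be assembled in plain Python before the component is created. The following is a minimal sketch, not part of Haystack: `build_init_kwargs` is a hypothetical helper, and the `"bing"` engine value and the domain name are assumptions used only for illustration (check the SearchApi documentation for the engines it supports).

```python
from __future__ import annotations

from typing import Any


def build_init_kwargs(
    top_k: int = 10,
    allowed_domains: list[str] | None = None,
    engine: str | None = None,
    num: int | None = None,
) -> dict[str, Any]:
    """Assemble keyword arguments for SearchApiWebSearch (hypothetical helper)."""
    search_params: dict[str, Any] = {}
    if engine is not None:
        search_params["engine"] = engine  # replaces the default Google engine
    if num is not None:
        search_params["num"] = num  # asks the API for more raw results
    return {
        "top_k": top_k,
        "allowed_domains": allowed_domains,
        # Pass None rather than an empty dict when nothing was customized.
        "search_params": search_params or None,
    }


kwargs = build_init_kwargs(
    top_k=5,
    allowed_domains=["haystack.deepset.ai"],  # assumed example domain
    engine="bing",  # assumed engine name
    num=100,
)
print(kwargs)
```

With Haystack installed and `SEARCHAPI_API_KEY` set, the resulting dictionary could be passed on as `SearchApiWebSearch(**kwargs)`, leaving `api_key` at its default.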

#### from_dict

```python
from_dict(data: dict[str, Any]) -> SearchApiWebSearch
```

Deserializes the component from a dictionary.

**Parameters:**

- **data** (<code>dict\[str, Any\]</code>) – The dictionary to deserialize from.

**Returns:**

- <code>SearchApiWebSearch</code> – The deserialized component.

#### run

```python
run(query: str) -> dict[str, list[Document] | list[str]]
```

Uses [SearchApi](https://www.searchapi.io/) to search the web.

**Parameters:**

- **query** (<code>str</code>) – Search query.

**Returns:**

- <code>dict\[str, list\[Document\] | list\[str\]\]</code> – A dictionary with the following keys:
  - "documents": List of documents returned by the search engine.
  - "links": List of links returned by the search engine.

**Raises:**

- <code>TimeoutError</code> – If the request to the SearchApi API times out.
- <code>SearchApiError</code> – If an error occurs while querying the SearchApi API.

#### run_async

```python
run_async(query: str) -> dict[str, list[Document] | list[str]]
```

Asynchronously uses [SearchApi](https://www.searchapi.io/) to search the web.

This is the asynchronous version of the `run` method with the same parameters and return values.

**Parameters:**

- **query** (<code>str</code>) – Search query.

**Returns:**

- <code>dict\[str, list\[Document\] | list\[str\]\]</code> – A dictionary with the following keys:
  - "documents": List of documents returned by the search engine.
  - "links": List of links returned by the search engine.

**Raises:**

- <code>TimeoutError</code> – If the request to the SearchApi API times out.
- <code>SearchApiError</code> – If an error occurs while querying the SearchApi API.
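
Because `run_async` returns the same dictionary shape as `run`, several queries can be awaited concurrently. The sketch below shows one way this might look. `StubWebSearch` is a hypothetical stand-in that only mimics the documented `"documents"` and `"links"` keys so the example runs without network access or an API key, and `search_many` is likewise an illustrative helper, not a Haystack function.

```python
from __future__ import annotations

import asyncio
from typing import Any


class StubWebSearch:
    """Hypothetical stand-in that mimics the documented run_async return shape."""

    async def run_async(self, query: str) -> dict[str, list[Any]]:
        await asyncio.sleep(0)  # yield control, as a real network call would
        return {
            "documents": [f"stub document for {query!r}"],
            "links": [f"https://example.com/{len(query)}"],
        }


async def search_many(component: Any, queries: list[str]) -> dict[str, list[str]]:
    """Run all queries concurrently and map each query to its links."""
    results = await asyncio.gather(*(component.run_async(query=q) for q in queries))
    return {q: r["links"] for q, r in zip(queries, results)}


links_by_query = asyncio.run(search_many(StubWebSearch(), ["haystack", "web search"]))
print(links_by_query)
```

In a real pipeline the stub would be replaced by a configured web search component. The documented `TimeoutError` and `SearchApiError` would then need handling around the `gather` call.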

## serper_dev

### SerperDevWebSearch

Uses [Serper](https://serper.dev/) to search the web for relevant documents.

See the [Serper Dev website](https://serper.dev/) for more details.

Usage example:

<!-- test-ignore -->

```python
from haystack.components.websearch import SerperDevWebSearch
from haystack.utils import Secret

serper_dev_api = Secret.from_env_var("SERPERDEV_API_KEY")

websearch = SerperDevWebSearch(top_k=10, api_key=serper_dev_api)
results = websearch.run(query="Who is the boyfriend of Olivia Wilde?")

assert results["documents"]
assert results["links"]

# Example with domain filtering - exclude subdomains
websearch_filtered = SerperDevWebSearch(
    top_k=10,
    allowed_domains=["example.com"],
    exclude_subdomains=True,  # Only results from example.com, not blog.example.com
    api_key=serper_dev_api
)
results_filtered = websearch_filtered.run(query="search query")
```

#### __init__

```python
__init__(
    api_key: Secret = Secret.from_env_var("SERPERDEV_API_KEY"),
    top_k: int | None = 10,
    allowed_domains: list[str] | None = None,
    search_params: dict[str, Any] | None = None,
    *,
    exclude_subdomains: bool = False
) -> None
```

Initialize the SerperDevWebSearch component.

**Parameters:**

- **api_key** (<code>Secret</code>) – API key for the Serper API.
- **top_k** (<code>int | None</code>) – Number of documents to return.
- **allowed_domains** (<code>list\[str\] | None</code>) – List of domains to limit the search to.
- **exclude_subdomains** (<code>bool</code>) – Whether to exclude subdomains when filtering by allowed_domains.
  If True, only results from the exact domains in allowed_domains will be returned.
  If False, results from subdomains will also be included. Defaults to False.
- **search_params** (<code>dict\[str, Any\] | None</code>) – Additional parameters passed to the Serper API.
  For example, you can set 'num' to 20 to increase the number of search results.
  See the [Serper website](https://serper.dev/) for more details.

#### to_dict

```python
to_dict() -> dict[str, Any]
```

Serializes the component to a dictionary.

**Returns:**

- <code>dict\[str, Any\]</code> – Dictionary with serialized data.

#### from_dict

```python
from_dict(data: dict[str, Any]) -> SerperDevWebSearch
```

Deserializes the component from a dictionary.

**Parameters:**

- **data** (<code>dict\[str, Any\]</code>) – The dictionary to deserialize from.

**Returns:**

- <code>SerperDevWebSearch</code> – The deserialized component.

#### run

```python
run(query: str) -> dict[str, list[Document] | list[str]]
```

Uses [Serper](https://serper.dev/) to search the web.

**Parameters:**

- **query** (<code>str</code>) – Search query.

**Returns:**

- <code>dict\[str, list\[Document\] | list\[str\]\]</code> – A dictionary with the following keys:
  - "documents": List of documents returned by the search engine.
  - "links": List of links returned by the search engine.

**Raises:**

- <code>SerperDevError</code> – If an error occurs while querying the SerperDev API.
- <code>TimeoutError</code> – If the request to the SerperDev API times out.

#### run_async

```python
run_async(query: str) -> dict[str, list[Document] | list[str]]
```

Asynchronously uses [Serper](https://serper.dev/) to search the web.

This is the asynchronous version of the `run` method with the same parameters and return values.

**Parameters:**

- **query** (<code>str</code>) – Search query.

**Returns:**

- <code>dict\[str, list\[Document\] | list\[str\]\]</code> – A dictionary with the following keys:
  - "documents": List of documents returned by the search engine.
  - "links": List of links returned by the search engine.

**Raises:**

- <code>SerperDevError</code> – If an error occurs while querying the SerperDev API.
- <code>TimeoutError</code> – If the request to the SerperDev API times out.
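
The effect of the `exclude_subdomains` flag on `allowed_domains`, as described for `SerperDevWebSearch.__init__`, can be illustrated with a small host-matching function. This is a minimal sketch of the documented semantics only, not Haystack's implementation: `is_allowed` is a hypothetical helper, and details such as how `www.` prefixes are treated are assumptions.

```python
from __future__ import annotations

from urllib.parse import urlparse


def is_allowed(url: str, allowed_domains: list[str], exclude_subdomains: bool = False) -> bool:
    """Return True if the URL's host passes the domain filter (illustrative only)."""
    host = (urlparse(url).hostname or "").lower()
    for domain in allowed_domains:
        domain = domain.lower()
        if host == domain:
            return True  # exact domain match is always accepted
        if not exclude_subdomains and host.endswith("." + domain):
            return True  # subdomains are accepted only when not excluded
    return False


urls = ["https://example.com/a", "https://blog.example.com/b", "https://notexample.com/c"]
strict = [u for u in urls if is_allowed(u, ["example.com"], exclude_subdomains=True)]
relaxed = [u for u in urls if is_allowed(u, ["example.com"], exclude_subdomains=False)]
print(strict)
print(relaxed)
```

Matching on the `"." + domain` suffix rather than the bare domain keeps look-alike hosts such as `notexample.com` out of the results in both modes.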