Exa Search
Exa is a search engine designed specifically for use by LLMs. Search for documents on the internet using natural language queries, then retrieve cleaned HTML content from the documents you want.
Unlike keyword-based search (Google), Exa's neural search capabilities allow it to semantically understand queries and return relevant documents. For example, we could search "fascinating article about cats" and compare the search results from Google and Exa. Google gives us SEO-optimized listicles based on the keyword "fascinating", while Exa returns documents that match the meaning of the query.
This notebook goes over how to use Exa Search with LangChain.
First, get an Exa API key and add it as an environment variable. Get $10 free credit (plus more by completing certain actions like making your first search) by signing up here.
import os
api_key = os.getenv("EXA_API_KEY") # Set your API key as an environment variable
Then install the integration package:
%pip install --upgrade --quiet langchain-exa
# and some deps for this notebook
%pip install --upgrade --quiet langchain langchain-openai langchain-community
Using ExaSearchRetriever
ExaSearchRetriever is a retriever that uses Exa Search to retrieve relevant documents.
The max_characters parameter of TextContentsOptions used to be called max_length, which is now deprecated. Make sure to use max_characters instead.
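The retriever code itself is not shown in this section, so the following is only a minimal sketch. It assumes that langchain_exa exports ExaSearchRetriever and TextContentsOptions, and that the retriever accepts exa_api_key, k, and text_contents_options; check these against your installed version. migrate_text_options and build_retriever are hypothetical helper names introduced here for illustration, showing one way to move old max_length settings over to max_characters.

```python
import os
from typing import Optional


def migrate_text_options(options: dict) -> dict:
    """Return a copy of `options` with the deprecated `max_length` key renamed.

    Hypothetical helper, not part of langchain-exa: it only rewrites a plain dict.
    """
    migrated = dict(options)
    if "max_length" in migrated:
        legacy_value = migrated.pop("max_length")
        # Keep an explicit max_characters if the caller already supplied one.
        migrated.setdefault("max_characters", legacy_value)
    return migrated


def build_retriever(k: int = 3, text_options: Optional[dict] = None):
    """Create an ExaSearchRetriever (constructor arguments are assumptions)."""
    # Imported lazily so the helper above works without the package installed.
    from langchain_exa import ExaSearchRetriever, TextContentsOptions

    options = migrate_text_options(text_options or {"max_characters": 500})
    return ExaSearchRetriever(
        exa_api_key=os.environ["EXA_API_KEY"],
        k=k,  # number of documents to return
        text_contents_options=TextContentsOptions(**options),
    )


# Usage (needs EXA_API_KEY and network access):
# retriever = build_retriever(k=3, text_options={"max_length": 300})
# docs = retriever.invoke("fascinating article about cats")
```

Only migrate_text_options runs without credentials; build_retriever contacts nothing until the returned retriever is invoked.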
Using the Exa SDK as LangChain Agent Tools
The Exa SDK creates a client that can interact with three main Exa API endpoints:
search: Given a natural language search query, retrieve a list of search results.
find_similar: Given a URL, retrieve a list of search results corresponding to webpages which are similar to the document at the provided URL.
get_contents: Given a list of document ids fetched from search or find_similar, get cleaned HTML content for each document.
The exa_py SDK combines these endpoints into two convenience calls, which are the most flexible and efficient way to use Exa search:
search_and_contents: Combines the search and get_contents endpoints to retrieve search results along with their content in a single operation.
find_similar_and_contents: Combines the find_similar and get_contents endpoints to find similar pages and retrieve their content in one call.
We can use the @tool decorator and docstrings to create LangChain Tool wrappers that tell an LLM agent how to use these combined Exa functionalities effectively. This approach simplifies usage and reduces the number of API calls needed to get comprehensive results.
Before writing code, ensure you have langchain-exa installed:
%pip install --upgrade --quiet langchain-exa
import os

from exa_py import Exa
from langchain_core.tools import tool

exa = Exa(api_key=os.environ["EXA_API_KEY"])


@tool
def search_and_contents(query: str):
    """Search for webpages based on the query and retrieve their contents."""
    # This combines two API endpoints: search and contents retrieval
    return exa.search_and_contents(
        query, use_autoprompt=True, num_results=5, text=True, highlights=True
    )


@tool
def find_similar_and_contents(url: str):
    """Search for webpages similar to a given URL and retrieve their contents.
    The url passed in should be a URL returned from `search_and_contents`.
    """
    # This combines two API endpoints: find similar and contents retrieval
    return exa.find_similar_and_contents(url, num_results=5, text=True, highlights=True)


tools = [search_and_contents, find_similar_and_contents]
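The tools above return the SDK response object as-is. One possible refinement, sketched here under assumptions, is to condense the results into a short string before handing them back to the agent. The sketch assumes the response has a results list whose items expose title, url, text, and highlights attributes. format_results is a hypothetical helper, and the SimpleNamespace objects stand in for a real response so the example runs without an API key.

```python
from types import SimpleNamespace


def format_results(response, max_chars: int = 200) -> str:
    """Condense a search response into one text block per result.

    Hypothetical helper: assumes `response.results` is a list of objects with
    `title`, `url`, `text`, and `highlights` attributes.
    """
    blocks = []
    for result in response.results:
        highlights = getattr(result, "highlights", None) or []
        # Prefer the first highlight; fall back to the start of the page text.
        snippet = highlights[0] if highlights else (getattr(result, "text", "") or "")
        # Collapse runs of whitespace, then truncate to max_chars characters.
        snippet = " ".join(snippet.split())[:max_chars]
        blocks.append(f"{result.title}\n{result.url}\n{snippet}")
    return "\n\n".join(blocks)


# Stand-in data shaped like a response, so no API key is needed here.
fake_response = SimpleNamespace(
    results=[
        SimpleNamespace(
            title="Cats",
            url="https://example.com/cats",
            text="Cats are   small carnivores.",
            highlights=[],
        ),
        SimpleNamespace(
            title="Dogs",
            url="https://example.com/dogs",
            text="Long page text",
            highlights=["Dogs are loyal."],
        ),
    ]
)
summary = format_results(fake_response)
print(summary)
```

Inside a tool, the same helper could wrap the SDK call, for example returning format_results(exa.search_and_contents(...)) instead of the raw response, which keeps the text passed to the model short.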