
LangChain Tools

Tools:


Hi! In this blog, we are going to talk about LangChain Tools.

These tools help us in various ways. One example is the Wikipedia wrapper, which allows us to access and retrieve content from Wikipedia directly, without manually visiting the site. Instead of searching Wikipedia yourself, this tool can automatically fetch relevant content, which you can then pass to a large language model (LLM) to generate concise answers.

In a similar way, there’s a tool for interacting with Google Search. This tool allows you to automatically query Google, retrieve results, and extract useful content directly from Google Search results.

There are many other tools in LangChain that cater to different needs. Depending on what you’re working on, you can select the tool that best suits your use case.

Let me show you how to use a few of these tools. First, we'll explore how to utilize them, and once you're comfortable with the tools, we’ll dive into the concept of Agents. Agents are an exciting and advanced feature that takes automation and intelligence a step further by using these tools dynamically.

Wikipedia Tool:

Python
from langchain_community.tools.wikipedia.tool import WikipediaQueryRun
from langchain_community.utilities import WikipediaAPIWrapper

# Fetch the top 2 results, limited to 300 characters of content each
ap = WikipediaAPIWrapper(top_k_results=2, doc_content_chars_max=300)
t1 = WikipediaQueryRun(api_wrapper=ap)

print(t1.name)
print(t1.run({"query": "Generative AI"}))

In the provided code, we can clearly see how we are retrieving content from Wikipedia using LangChain tools. Let me explain the code in more detail.

We are using two key modules in LangChain:

  1. WikipediaAPIWrapper:
    This wrapper is used to specify the parameters for interacting with the Wikipedia API. For example, we can define how many results we want, such as the top 2 results, by using top_k_results=2. Additionally, we can control how much content we want to retrieve from each result by setting doc_content_chars_max=300, which limits the characters for each result. The WikipediaAPIWrapper acts as the configuration interface that determines what content to retrieve from Wikipedia. It doesn’t execute the query but sets up how the API interacts with Wikipedia to fetch content. Below are the common parameters that you can set:

    Parameters:

    • top_k_results:

      • Description: Specifies how many top search results you want to retrieve from Wikipedia.
      • Example: top_k_results=2 fetches the top 2 results.
      • Default: Usually defaults to fetching the top result if not specified.
    • doc_content_chars_max:

      • Description: Limits the number of characters to retrieve from the content of each result. This is helpful if you want to control the length of the content passed to an LLM.
      • Example: doc_content_chars_max=300 fetches up to 300 characters per result.
      • Default: There may be no character limit if this parameter is not set.
    • lang:

      • Description: Specifies the language of the Wikipedia page to query. The default language is English, but you can change it to any other language supported by Wikipedia.
      • Example: lang="fr" would query French Wikipedia pages.
      • Default: "en" (English)
    • categories_filter:

      • Description: You can filter the search results based on certain categories. Only results that fall into these categories will be returned.
      • Example: categories_filter=["Machine Learning", "Artificial Intelligence"] will return only results related to these categories.
      • Default: No filtering if not provided.
    • summary_only:

      • Description: When set to True, retrieves only the summary section of a Wikipedia article.
      • Example: summary_only=True to get only the summary without the rest of the content.
      • Default: False (fetches the full content)
    • load_full_article:

      • Description: Determines whether to load the full article content or just a snippet.
      • Example: load_full_article=True retrieves the entire article, while False retrieves just an excerpt.
      • Default: False
    • related_articles:

      • Description: Fetches related articles based on the queried article's topic. This is useful if you're looking for more contextual information.
      • Example: related_articles=True fetches additional related articles.
      • Default: False
  2. WikipediaQueryRun:
    This tool is responsible for executing the actual query. It takes the configuration from the WikipediaAPIWrapper (i.e., the number of results and character limit) and runs a specific query, such as "Generative AI". The WikipediaQueryRun works on top of the API wrapper and sends the query to Wikipedia, retrieves the results, and processes them based on the configuration we set in the API wrapper.
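
For instance, switching the wrapper to another Wikipedia language edition is just a matter of changing the configuration. Here is a small, untested variation of the code above that uses the lang parameter to query French Wikipedia (the query string is only an example):

Python
from langchain_community.tools.wikipedia.tool import WikipediaQueryRun
from langchain_community.utilities import WikipediaAPIWrapper

# Same setup as before, but querying French Wikipedia instead of English
ap_fr = WikipediaAPIWrapper(lang="fr", top_k_results=2, doc_content_chars_max=300)
t_fr = WikipediaQueryRun(api_wrapper=ap_fr)
print(t_fr.run({"query": "Intelligence artificielle"}))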

Google Search Engine Tool:

Now, let me explain how to set up a Google Search API app that you can use as a tool to query the Google search engine programmatically.

To do this, you’ll need two essential keys:

  1. Custom Search Engine (CSE) ID.
  2. Google API Key.

Both of these keys are necessary for enabling your app to query Google Search and retrieve results.

In this section, I'll explain how to create a Google Custom Search Engine (CSE) and obtain an API key so that you can programmatically use Google Search in your projects.

Step-by-Step Guide to Creating Google CSE and API Key:

1. Create a Google Custom Search Engine (CSE)

Let’s start with creating a custom search engine that can search specific websites or the entire web.

Step 1: Visit Google CSE
Step 2: Create a New Search Engine
  • Click the "Create a custom search engine" button.
Step 3: Add Sites to Search
  • In the "Sites to search" field, enter the websites you want your search engine to cover. If you want to search the entire web, just type *.com or leave it blank.
    • For example: If you want to search a specific website, you can add www.example.com, or use *.com to search across all websites.
Step 4: Set a Name and Language
  • Name your search engine: Give your search engine a name that’s easy to remember.
  • Language: Select the language you want the search engine to use.
Step 5: Create Your Search Engine
  • After setting everything, click "Create".
Step 6: Edit Search Engine Settings (Optional)
  • If you want to tweak more settings, go to the Control Panel of your newly created search engine. You can adjust things like search layouts, site refinements, etc.

2. Get the Search Engine ID

You’ll need the Search Engine ID for using the custom search in your code.

Step 1: Go to the CSE Control Panel
  • After creating your custom search engine, you’ll be taken to the Control Panel.
Step 2: Find the Search Engine ID
  • In the Details section, you’ll find a field labeled "Search engine ID". Copy this ID as you will need it when making API requests.

3. Get the API Key from Google Cloud Console

To use the Custom Search Engine API, you need to get an API key from Google Cloud.

Step 1: Visit Google Cloud Console
Step 2: Create a New Project
  • At the top of the page, click on the project dropdown and select "New Project".
    • Name your project: Choose a name for your project.
    • Location: You can leave this as is or set it if needed.
    • Click Create.
Step 3: Enable the Custom Search API
  • Once the project is created, go to APIs & Services from the left-hand menu.
  • Click on "Enable APIs and Services".
  • Search for Custom Search API in the API library.
  • Select Custom Search API and click Enable.
Step 4: Generate API Key
  • After enabling the API, go to Credentials from the left-hand menu.
  • Click "Create Credentials" and select API Key.
  • Copy the API key that is generated. You’ll use this key for making requests to the Custom Search API.
Step 5: Restrict API Key (Optional)
  • For security purposes, it's best to restrict the API key to specific APIs.
    • Click "Restrict key".
    • Under API restrictions, select Custom Search API.
    • Save your changes.

In the original method, using the regular tool caused an error due to improper input handling, so we switched to using a StructuredTool with args_schema for better control. StructuredTool is designed to expect structured inputs defined by a schema, ensuring proper validation and formatting before executing a function. In this case, we use GoogleSearchAPIWrapper to query Google, with pydantic's BaseModel to define an input schema that specifies the query format. The args_schema (GoogleSearchArgs) ensures the input is a valid string query, and StructuredTool combines this schema with the search function to handle the query execution. This structured approach provides better validation, clarity, and error handling, preventing the issues caused by unstructured inputs.

Regular code that didn't work:

Python
from langchain_community.utilities import GoogleSearchAPIWrapper
from langchain_core.tools import Tool

search = GoogleSearchAPIWrapper()

tool = Tool(
    name="google_search",
    description="Search Google for recent results.",
    func=search.run,
)
Code that will work:
Python
import os

from dotenv import load_dotenv
from langchain_community.utilities import GoogleSearchAPIWrapper
from langchain_core.tools import StructuredTool
from pydantic import BaseModel

# Load keys from a .env file, or set them here with your own values
load_dotenv()
os.environ["GOOGLE_CSE_ID"] = "YOUR_CSE_ID"
os.environ["GOOGLE_API_KEY"] = "YOUR_GOOGLE_API_KEY"

# k=2 limits the search to the top 2 results
search = GoogleSearchAPIWrapper(k=2)

# Input schema: the tool expects a single string field called "query"
class GoogleSearchArgs(BaseModel):
    query: str

t2 = StructuredTool(
    name="google_search",
    description="Search Google and return the first 2 results.",
    func=search.run,
    args_schema=GoogleSearchArgs,
)

t2.run("Obama's first name?")

Before executing the code, there are a few essential steps that you need to follow to ensure everything is set up correctly. Here's a list of the steps that you must complete to avoid any issues:

Steps You Must Complete Before Running the Code:

  1. Install Google Cloud SDK (if not installed yet):

    • Follow the installation guide for your system from the Google Cloud SDK website.
    • After installation, open the Google Cloud SDK Shell or your terminal.
  2. Authenticate Google Cloud SDK:

    • Run this command to authenticate your Google account:
      gcloud auth login
      This will open a browser window where you can log in and allow the SDK access.
  3. Set Your Google Cloud Project:

    • Set the project where you will use the Custom Search API:
      gcloud config set project YOUR_PROJECT_ID
      Replace YOUR_PROJECT_ID with the actual project ID from your Google Cloud Console.
  4. Enable Custom Search API:

    • Enable the Custom Search API by running this command:
      gcloud services enable customsearch.googleapis.com
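
One optional extra: since the working code above calls load_dotenv(), you can keep GOOGLE_CSE_ID and GOOGLE_API_KEY in a local .env file rather than hard-coding them. A small sanity-check sketch (not part of the original setup steps) would be:

Python
import os
from dotenv import load_dotenv

load_dotenv()  # expects GOOGLE_CSE_ID and GOOGLE_API_KEY in a local .env file

# Fail early with a clear message if either key is missing
for var in ("GOOGLE_CSE_ID", "GOOGLE_API_KEY"):
    if not os.getenv(var):
        raise EnvironmentError(f"{var} is not set - add it to your .env file")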


Yahoo Finance Tool:

You can also use the Yahoo Finance tool, which you can import directly from the tools available in LangChain. Once we import this tool, we can query any company to get the latest news about it. This tool is helpful for grabbing the latest stock news or updates, but keep in mind it won't provide explicit financial data such as stock prices. Instead, it returns relevant news articles for the company you’re querying.

For example, I’ll demonstrate querying about Google to get the latest news. I’ll also show you the output to give you a clear idea of how it works.
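
Below is a minimal sketch of what that query can look like using LangChain's YahooFinanceNewsTool. Note that it relies on the yfinance package being installed, and that it takes a company ticker (for example, GOOG for Google) rather than the company name:

Python
from langchain_community.tools.yahoo_finance_news import YahooFinanceNewsTool

# The tool takes a stock ticker and returns recent news articles about that company
yf_tool = YahooFinanceNewsTool()
print(yf_tool.name)
print(yf_tool.run("GOOG"))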

In the same way, you can use these kinds of tools for other companies as well. Additionally, there’s another way to create and use custom tools. You can build your own custom functions and convert them into a custom tool. I will show you how this works later on.

There are several tools available on the LangChain official page. Some tools are free, while others may require an API key or even a subscription. It’s up to you to explore and decide which tools work best for your project.

Finally, you may want to consider using a retrieval tool. You might ask, "Why do we need a retrieval tool?" Well, when working with agents, you can't just use directories or direct queries; instead, agents typically accept only tools. So, you can create a retrieval tool that interacts with your custom data and integrate it with the agent. This way, the agent can access custom data sources for more specific or advanced tasks.

I’ll demonstrate how to code this, along with the output, in the retriever section later in this post.


Google Lens Tool:

You can also use the Google Lens tool: you give it an image, whether from your local folders or elsewhere, and Google Lens will analyze it and return relevant information. If Google Lens is unable to identify anything in the image, it might return an error.

To handle this, you can wrap the call in a try/except block to manage errors gracefully, though I won’t demonstrate that here since it’s a common practice. You simply catch the error to ensure it doesn’t disrupt the flow if the image fails to yield any results.

You can pass the URL of an image or upload images directly. Google Lens will then generate probable links and relevant information about what it finds. For example, I used an image of Adidas Samba shoes, and Google Lens returned corresponding information about the shoes. I’ll show you a snapshot of the response as well as the code.

Make sure you’ve created an API key for SerpAPI—which is integrated with Google’s tools—by signing up for a free account at SerpAPI. After signing up, you’ll get an API key. Since I can’t share my API key, you’ll need to generate your own, and then you can simply plug it into the code.

Here’s the code and the corresponding response:

Python
import os

from langchain_community.tools.google_lens import GoogleLensQueryRun
from langchain_community.utilities.google_lens import GoogleLensAPIWrapper

# Use your own SerpAPI key here (or load it from a .env file)
os.environ["SERPAPI_API_KEY"] = "YOUR_SERPAPI_API_KEY"

# Wrap the SerpAPI-backed Google Lens API in a LangChain tool
tool = GoogleLensQueryRun(api_wrapper=GoogleLensAPIWrapper())

# Pass an image URL; the tool returns what Google Lens finds for that image
tool.run("https://assets.adidas.com/images/w_600,f_auto,q_auto/3bbecbdf584e40398446a8bf0117cf62_9366/Samba_OG_Shoes_White_B75806_01_standard.jpg")

Custom Tools:

As I mentioned earlier, we can create custom tools in LangChain. When I say custom tools, I mean that you can take any of your Python functions or code functionality and simply turn them into a tool. Whenever you call this tool, it will automatically run the underlying functionality.

Now, you might wonder, why bother creating a tool if I can just run the function directly? The answer lies in agents.

Agents in LangChain work by interacting with tools. When you give an agent a question or task, it goes through the available tools and picks the best one to provide an answer or solution. The agent essentially plays a thought process game, figuring out which tool is best suited to handle the query you’ve given it.

So, to make agents work effectively, we need to have tools for them to use. This is why I’m dedicating this section to show you how to create your own tools and how to use the built-in tools in LangChain.

Now, let’s take a look at how to create a simple custom tool.

Python
from langchain_core.tools import Tool

def strp(*args, **kwargs):
    """Stars printer."""
    # *args and **kwargs absorb whatever input the tool passes in,
    # even though this function never actually uses it
    print("*" * 5)

t1 = Tool(
    name="Star printer",
    func=strp,
    description="Useful for printing stars",
)

# Tools always expect some input, so we pass an empty string
t1.run("")

Let’s break down the code to better understand what’s happening. In the above code, we are creating a custom function that simply prints stars. The key difference here is that we are passing the *args and **kwargs parameters because, by default, tools expect a certain set of arguments to be passed—whether or not you use them in your function. If these arguments aren't accounted for, you may run into syntax or runtime errors. So, we pass *args inside the function to handle any potential arguments, even if we don’t use them.

After creating the function that prints stars, we use the Tool class from LangChain to convert the function into a tool. It's important to provide a meaningful name and description for the tool because agents, when powered by an LLM (like GPT), need these details to understand what the tool does.

For example, when you design an agent, it might be given several tools to choose from. Since these are custom tools, the LLM doesn’t inherently know what each tool does. The name and description make it easier for the LLM to figure out which tool to use without having to dig into the function's code.

Once the tool is created, we use tool.run() to execute it. As I mentioned earlier, it’s mandatory to pass an argument to the tool. Even though our function doesn’t require any input, tools in LangChain still expect some kind of input, like a query. In this case, since the function doesn’t use any input, we just pass an empty string (""). It doesn't matter what we pass, because we aren't using the input in the function.

In this way, you can create any custom tool based on your own code. You can define arguments and create more complex tools as needed. I’ll also show you how to handle input arguments in other examples so you have a more thorough understanding of the process.
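
As a quick illustration of a tool that actually uses its input, here is a sketch of a hypothetical word-counting helper (this example is not from the original post; the names are just for illustration):

Python
from langchain_core.tools import Tool

def count_words(text: str) -> int:
    """Count the words in the given text."""
    return len(text.split())

# The query string passed to run() is forwarded to count_words as its argument
word_counter = Tool(
    name="Word counter",
    func=count_words,
    description="Useful for counting the number of words in a piece of text",
)

print(word_counter.run("LangChain tools are easy to build"))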

That’s all for this example—let's move on to the next custom tool!

Structured Tool:

Now, let’s explore how a StructuredTool works when passing multiple inputs. For instance, if you’re working with an agent where the LLM (language model) decides which tool to use, you might want to create a function that sums two numbers given as input. In cases like this, when you're dealing with multiple inputs, an LLM might struggle to identify which input corresponds to which parameter in your function.

For example, if your function expects two parameters, one being a string and the other an integer, the LLM might have trouble parsing a sentence and assigning the correct values to the right parameters. Suppose the first parameter is supposed to be a string and the second one an integer—if the LLM mistakenly sends an integer in place of the string, or vice versa, the tool could fail to work as expected.

This is where StructuredTool comes in handy. A StructuredTool allows us to define the expected inputs more clearly using a pydantic base model. By creating a class that explicitly describes the type of each parameter, we ensure that the LLM or agent can correctly identify where each input should go. This structure makes it easier for the model to correctly assign values, even when handling multiple or complex inputs.

When using StructuredTool, we define each parameter in a pydantic class, specifying types like integers or strings. By doing this, we give the agent or LLM a blueprint, so it knows exactly where to put each value when passing multiple inputs to a function.

Let’s take an example where we create a function to sum two numbers. This will illustrate how StructuredTool helps in handling inputs more efficiently compared to traditional methods.

Python
from langchain_core.tools import StructuredTool
from pydantic import BaseModel, Field

def summation(a: int, b: int) -> int:
    """Sum of 2 numbers."""
    return int(a) + int(b)

# The schema tells the agent/LLM exactly which value maps to which parameter
class SummationArgs(BaseModel):
    a: int = Field(description="First integer")
    b: int = Field(description="Second integer")

t2 = StructuredTool.from_function(
    func=summation,
    name="Summation",
    description="Sum two numbers",
    args_schema=SummationArgs,
)

t2.run({"a": 3, "b": 4})

What Happens When Both Inputs Are Strings?

Now, let’s consider a case where both inputs to the function are strings. You might wonder how the LLM identifies which string goes where, especially when the user’s query is also a string. The LLM may have its own capabilities to extract the right inputs based on context, but there’s still the potential for confusion, particularly when both inputs are of the same type.

For example, let’s say the function expects two strings: name and city. If the user query is something like "John lives in New York," the LLM will use its natural language processing to infer that "John" should map to name and "New York" should map to city. However, if the query is more ambiguous, like "I visited Paris, and my name is Sarah," the LLM might have trouble figuring out which input goes where.

How to Handle This:

  1. Use a StructuredTool with Named Parameters:
    By using pydantic models, you can clearly define the input parameters (e.g., name and city). This way, the LLM can map the correct string to each parameter based on its name.

    Example: Define the input schema as a class with fields like name and city. Then, create a function that takes these as input and prints the result. Use StructuredTool to define the tool with this schema and function. When you run the tool, provide the inputs in a dictionary format, such as {"name": "John", "city": "New York"} (see the sketch after this list).

  2. Use Clear Descriptions:
    Providing detailed descriptions for each parameter will help the LLM better understand the expected input. For instance, if your tool's description says it takes a user's name and city, the LLM will be able to more easily map the values correctly.

  3. Use Agent Prompt Engineering:
    If you're working with agents, you can craft prompts in a way that makes it easier for the LLM to map inputs correctly. For example, instead of a vague sentence, you could explicitly instruct: "Please assign 'John' to the 'name' variable and 'New York' to the 'city' variable."

  4. Agent Self-Reflection (Advanced):
    In more advanced setups, agents can ask clarifying questions if they’re unsure of how to map inputs, ensuring the right values go to the right parameters.
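
Here is a minimal sketch of the name/city example from point 1, showing how a pydantic schema keeps two string inputs from getting mixed up (the function and field names here are just illustrative, not from the original post):

Python
from langchain_core.tools import StructuredTool
from pydantic import BaseModel, Field

def introduce(name: str, city: str) -> str:
    """Return a short sentence about the person."""
    return f"{name} lives in {city}."

# Named, described fields tell the LLM which string belongs to which parameter
class IntroduceArgs(BaseModel):
    name: str = Field(description="The person's name")
    city: str = Field(description="The city the person lives in")

intro_tool = StructuredTool.from_function(
    func=introduce,
    name="Introduce",
    description="Builds a sentence from a person's name and city",
    args_schema=IntroduceArgs,
)

print(intro_tool.run({"name": "John", "city": "New York"}))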

Using a retriever as a Tool:

As I mentioned earlier, we can also use retrievers as tools for agents. You might wonder why we need to convert a retriever into a tool when the retriever and the tool essentially serve the same purpose: retrieving relevant information. For instance, when you provide a query to a retriever, it retrieves the relevant documents, and this process doesn’t change whether it’s a tool or a retriever.

So, what’s the difference, and why should we bother converting a retriever into a tool?

The key reason is that agents only work with tools. Agents cannot directly interact with retrievers. They are designed to operate by using tools—making decisions, running processes, and executing tasks based on the tools at their disposal. Agents essentially think and decide what to do, but they need tools to do that. This is why it's crucial to convert a retriever (like one that interacts with a vector database) into a tool. Once the retriever becomes a tool, the agent can access the retrieval functionality and fetch the relevant information.

For example, you can create an SQL retriever tool in the same way. In the future, I might showcase a use case for converting traditional SQL databases into a tool that retrieves rows based on user queries, performing activities like SQL-based retrieval for specific user requests. The process is the same: you convert the functionality into a tool that the agent can use.

Additionally, we can convert Python code into custom tools, which we’ve discussed before. Alongside that, we’ve seen several ready-made tools like Google, Wikipedia, and even research paper tools like arXiv, which retrieves research papers. You can also use the YouTube tool, which retrieves video links based on a query. The flexibility is there—you can use existing tools, build your own, or create retriever tools to suit your needs.

To sum it up, we can use retrievers, SQL databases, or even custom code as tools, and that’s what makes agents more effective. It’s really simple: with just a single line of code, you can convert your retrieval function into a tool, allowing agents to utilize it.

Python
import os

from dotenv import load_dotenv
from langchain_community.embeddings import HuggingFaceEmbeddings
from langchain_community.vectorstores import Chroma
from langchain.tools.retriever import create_retriever_tool

load_dotenv()

# Directory where the Chroma vector store was persisted earlier
persistent_directory = "vh"

huggingface_embeddings = HuggingFaceEmbeddings(
    model_name="sentence-transformers/all-mpnet-base-v2"
)

db = Chroma(
    persist_directory=persistent_directory,
    embedding_function=huggingface_embeddings,
)

# Retrieve the 3 most similar chunks for each query
retriever = db.as_retriever(
    search_type="similarity",
    search_kwargs={"k": 3},
)

# A single call converts the retriever into a tool an agent can use
retriever_tool = create_retriever_tool(
    retriever,
    name="Lease information Retriever",
    description="It is a retriever_tool that explains all the policies of the lease",
)

retriever_tool.run("What is the policy 33")
