GPT-4 with Retrieval Augmentation over LangChain Docs

In this notebook we'll work through an example of using GPT-4 with retrieval augmentation to answer questions about the LangChain Python library.

!pip install -qU \
  tiktoken==0.4.0 \
  openai==0.27.7 \
  langchain==0.0.179 \
  "pinecone-client[grpc]"==2.2.1

🚨 Note: the above pip install is formatted for Jupyter notebooks. If running elsewhere you may need to drop the !.


In this example, we will download the LangChain docs from python.langchain.com/en/latest/. We grab every .html file on the site like so:

!wget -r -A.html -P rtdocs https://python.langchain.com/en/latest/
2023-05-25 02:44:06 (267 MB/s) - ‘rtdocs/python.langchain.com/en/latest/use_cases/question_answering.html’ saved [101116]

...
--2023-05-25 02:45:44--  https://python.langchain.com/en/latest/tracing/agent_with_tracing.html
Reusing existing connection to python.langchain.com:443.
HTTP request sent, awaiting response... 200 OK
Length: unspecified [text/html]
Saving to: ‘rtdocs/python.langchain.com/en/latest/tracing/agent_with_tracing.html’

python.langchain.co     [ <=>                ] 111.82K  --.-KB/s    in 0s      

2023-05-25 02:45:45 (242 MB/s) - ‘rtdocs/python.langchain.com/en/latest/tracing/agent_with_tracing.html’ saved [114507]

FINISHED --2023-05-25 02:45:45--
Total wall clock time: 2m 58s
Downloaded: 892 files, 95M in 1.6s (60.6 MB/s)

This downloads all HTML into the rtdocs directory. Now we can use LangChain itself to process these docs. We do this using the ReadTheDocsLoader like so:

from langchain.document_loaders import ReadTheDocsLoader

loader = ReadTheDocsLoader('rtdocs')
docs = loader.load()
len(docs)
891

This leaves us with 891 processed doc pages. Let's take a look at the format of each one:

docs[20]
Document(page_content='.md\n.pdf\nRunhouse\n Contents \nInstallation and Setup\nSelf-hosted LLMs\nSelf-hosted Embeddings\nRunhouse#\nThis page covers how to use the Runhouse ecosystem within LangChain.\nIt is broken into three parts: installation and setup, LLMs, and Embeddings.\nInstallation and Setup#\nInstall the Python SDK with pip install runhouse\nIf you’d like to use on-demand cluster, check your cloud credentials with sky check\nSelf-hosted LLMs#\nFor a basic self-hosted LLM, you can use the SelfHostedHuggingFaceLLM class. For more\ncustom LLMs, you can use the SelfHostedPipeline parent class.\nfrom langchain.llms import SelfHostedPipeline, SelfHostedHuggingFaceLLM\nFor a more detailed walkthrough of the Self-hosted LLMs, see this notebook\nSelf-hosted Embeddings#\nThere are several ways to use self-hosted embeddings with LangChain via Runhouse.\nFor a basic self-hosted embedding from a Hugging Face Transformers model, you can use\nthe SelfHostedEmbedding class.\nfrom langchain.llms import SelfHostedPipeline, SelfHostedHuggingFaceLLM\nFor a more detailed walkthrough of the Self-hosted Embeddings, see this notebook\nprevious\nReplicate\nnext\nRWKV-4\n Contents\n  \nInstallation and Setup\nSelf-hosted LLMs\nSelf-hosted Embeddings\nBy Harrison Chase\n    \n      © Copyright 2023, Harrison Chase.\n      \n  Last updated on May 24, 2023.\n  ', metadata={'source': 'rtdocs/python.langchain.com/en/latest/integrations/runhouse.html'})

We access the plaintext page content like so:

print(docs[20].page_content)
.md
.pdf
Runhouse
 Contents 
Installation and Setup
Self-hosted LLMs
Self-hosted Embeddings
Runhouse#
This page covers how to use the Runhouse ecosystem within LangChain.
It is broken into three parts: installation and setup, LLMs, and Embeddings.
Installation and Setup#
Install the Python SDK with pip install runhouse
If you’d like to use on-demand cluster, check your cloud credentials with sky check
Self-hosted LLMs#
For a basic self-hosted LLM, you can use the SelfHostedHuggingFaceLLM class. For more
custom LLMs, you can use the SelfHostedPipeline parent class.
from langchain.llms import SelfHostedPipeline, SelfHostedHuggingFaceLLM
For a more detailed walkthrough of the Self-hosted LLMs, see this notebook
Self-hosted Embeddings#
There are several ways to use self-hosted embeddings with LangChain via Runhouse.
For a basic self-hosted embedding from a Hugging Face Transformers model, you can use
the SelfHostedEmbedding class.
from langchain.llms import SelfHostedPipeline, SelfHostedHuggingFaceLLM
For a more detailed walkthrough of the Self-hosted Embeddings, see this notebook
previous
Replicate
next
RWKV-4
 Contents
  
Installation and Setup
Self-hosted LLMs
Self-hosted Embeddings
By Harrison Chase
    
      © Copyright 2023, Harrison Chase.
      
  Last updated on May 24, 2023.
  
print(docs[35].page_content)
.ipynb
.pdf
Aim
Aim#
Aim makes it super easy to visualize and debug LangChain executions. Aim tracks inputs and outputs of LLMs and tools, as well as actions of agents.
With Aim, you can easily debug and examine an individual execution:
Additionally, you have the option to compare multiple executions side by side:
Aim is fully open source, learn more about Aim on GitHub.
Let’s move forward and see how to enable and configure Aim callback.
Tracking LangChain Executions with AimIn this notebook we will explore three usage scenarios. To start off, we will install the necessary packages and import certain modules. Subsequently, we will configure two environment variables that can be established either within the Python script or through the terminal.
!pip install aim
!pip install langchain
!pip install openai
!pip install google-search-results
import os
from datetime import datetime
from langchain.llms import OpenAI
from langchain.callbacks import AimCallbackHandler, StdOutCallbackHandler
Our examples use a GPT model as the LLM, and OpenAI offers an API for this purpose. You can obtain the key from the following link: https://platform.openai.com/account/api-keys .
We will use the SerpApi to retrieve search results from Google. To acquire the SerpApi key, please go to https://serpapi.com/manage-api-key .
os.environ["OPENAI_API_KEY"] = "..."
os.environ["SERPAPI_API_KEY"] = "..."
The event methods of AimCallbackHandler accept the LangChain module or agent as input and log at least the prompts and generated results, as well as the serialized version of the LangChain module, to the designated Aim run.
session_group = datetime.now().strftime("%m.%d.%Y_%H.%M.%S")
aim_callback = AimCallbackHandler(
    repo=".",
    experiment_name="scenario 1: OpenAI LLM",
)
callbacks = [StdOutCallbackHandler(), aim_callback]
llm = OpenAI(temperature=0, callbacks=callbacks)
The flush_tracker function is used to record LangChain assets on Aim. By default, the session is reset rather than being terminated outright.
Scenario 1 In the first scenario, we will use OpenAI LLM.
# scenario 1 - LLM
llm_result = llm.generate(["Tell me a joke", "Tell me a poem"] * 3)
aim_callback.flush_tracker(
    langchain_asset=llm,
    experiment_name="scenario 2: Chain with multiple SubChains on multiple generations",
)
Scenario 2 Scenario two involves chaining with multiple SubChains across multiple generations.
from langchain.prompts import PromptTemplate
from langchain.chains import LLMChain
# scenario 2 - Chain
template = """You are a playwright. Given the title of play, it is your job to write a synopsis for that title.
Title: {title}
Playwright: This is a synopsis for the above play:"""
prompt_template = PromptTemplate(input_variables=["title"], template=template)
synopsis_chain = LLMChain(llm=llm, prompt=prompt_template, callbacks=callbacks)
test_prompts = [
    {"title": "documentary about good video games that push the boundary of game design"},
    {"title": "the phenomenon behind the remarkable speed of cheetahs"},
    {"title": "the best in class mlops tooling"},
]
synopsis_chain.apply(test_prompts)
aim_callback.flush_tracker(
    langchain_asset=synopsis_chain, experiment_name="scenario 3: Agent with Tools"
)
Scenario 3 The third scenario involves an agent with tools.
from langchain.agents import initialize_agent, load_tools
from langchain.agents import AgentType
# scenario 3 - Agent with Tools
tools = load_tools(["serpapi", "llm-math"], llm=llm, callbacks=callbacks)
agent = initialize_agent(
    tools,
    llm,
    agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION,
    callbacks=callbacks,
)
agent.run(
    "Who is Leo DiCaprio's girlfriend? What is her current age raised to the 0.43 power?"
)
aim_callback.flush_tracker(langchain_asset=agent, reset=False, finish=True)
> Entering new AgentExecutor chain...
 I need to find out who Leo DiCaprio's girlfriend is and then calculate her age raised to the 0.43 power.
Action: Search
Action Input: "Leo DiCaprio girlfriend"
Observation: Leonardo DiCaprio seemed to prove a long-held theory about his love life right after splitting from girlfriend Camila Morrone just months ...
Thought: I need to find out Camila Morrone's age
Action: Search
Action Input: "Camila Morrone age"
Observation: 25 years
Thought: I need to calculate 25 raised to the 0.43 power
Action: Calculator
Action Input: 25^0.43
Observation: Answer: 3.991298452658078
Thought: I now know the final answer
Final Answer: Camila Morrone is Leo DiCaprio's girlfriend and her current age raised to the 0.43 power is 3.991298452658078.
> Finished chain.
previous
AI21 Labs
next
AnalyticDB
By Harrison Chase
    
      © Copyright 2023, Harrison Chase.
      
  Last updated on May 24, 2023.
  

We can also find the source of each document:

docs[35].metadata['source'].replace('rtdocs/', 'https://')
'https://python.langchain.com/en/latest/integrations/aim_tracking.html'

Now let's see how we can process all of these. We will split everything into chunks of roughly 500 tokens each, which is easy with LangChain and tiktoken:

import tiktoken

# look up the encoding that gpt-4 uses
tokenizer_name = tiktoken.encoding_for_model('gpt-4')
tokenizer_name.name
'cl100k_base'
# load that encoding as our tokenizer
tokenizer = tiktoken.get_encoding(tokenizer_name.name)

# create the length function
def tiktoken_len(text):
    tokens = tokenizer.encode(
        text,
        disallowed_special=()
    )
    return len(tokens)
from langchain.text_splitter import RecursiveCharacterTextSplitter

text_splitter = RecursiveCharacterTextSplitter(
    chunk_size=500,
    chunk_overlap=20,
    length_function=tiktoken_len,
    separators=["\n\n", "\n", " ", ""]
)
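
Before processing every page, we can sanity-check the splitter on a single doc (a quick illustrative snippet, not part of the original flow); the token counts should come out at or below our 500-token target:

# split one page and measure each chunk with our token-based length function
sample_chunks = text_splitter.split_text(docs[20].page_content)
print([tiktoken_len(chunk) for chunk in sample_chunks])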

Now we process all of the docs into chunks using this splitter, keeping the source URL of each page as metadata:

from uuid import uuid4
from tqdm.auto import tqdm

chunks = []

for page in tqdm(docs):
    content = page.page_content
    # skip near-empty pages (navigation-only stubs)
    if len(content) > 100:
        url = page.metadata['source'].replace('rtdocs/', 'https://')
        texts = text_splitter.split_text(content)
        chunks.extend([{
            'id': str(uuid4()),
            'text': texts[i],
            'chunk': i,
            'url': url
        } for i in range(len(texts))])
  0%|          | 0/891 [00:00<?, ?it/s]
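
Before moving on, it's worth peeking at one record to confirm it has the structure we defined (illustrative; the id values are random UUIDs):

# each chunk record carries an id, the chunk text, its position, and source URL
chunks[0].keys()  # expect dict_keys(['id', 'text', 'chunk', 'url'])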

Our chunks are ready, so now we move on to embedding and indexing everything.

Initialize Embedding Model

We use text-embedding-ada-002 as the embedding model. We can embed text like so:

import os
import openai

# get API key from top-right dropdown on OpenAI website
openai.api_key = os.getenv("OPENAI_API_KEY") or "OPENAI_API_KEY"

openai.Engine.list()  # check we have authenticated
<OpenAIObject list at 0x7f430a862d40> JSON: {
  "data": [
    {
      "created": null,
      "id": "whisper-1",
      "object": "engine",
      "owner": "openai-internal",
      "permissions": null,
      "ready": true
    },
    {
      "created": null,
      "id": "babbage",
      "object": "engine",
      "owner": "openai",
      "permissions": null,
      "ready": true
    },
    {
      "created": null,
      "id": "gpt-3.5-turbo",
      "object": "engine",
      "owner": "openai",
      "permissions": null,
      "ready": true
    },
    {
      "created": null,
      "id": "davinci",
      "object": "engine",
      "owner": "openai",
      "permissions": null,
      "ready": true
    },
    {
      "created": null,
      "id": "text-davinci-edit-001",
      "object": "engine",
      "owner": "openai",
      "permissions": null,
      "ready": true
    },
    {
      "created": null,
      "id": "text-davinci-003",
      "object": "engine",
      "owner": "openai-internal",
      "permissions": null,
      "ready": true
    },
    {
      "created": null,
      "id": "babbage-code-search-code",
      "object": "engine",
      "owner": "openai-dev",
      "permissions": null,
      "ready": true
    },
    {
      "created": null,
      "id": "text-similarity-babbage-001",
      "object": "engine",
      "owner": "openai-dev",
      "permissions": null,
      "ready": true
    },
    {
      "created": null,
      "id": "gpt-3.5-turbo-0301",
      "object": "engine",
      "owner": "openai",
      "permissions": null,
      "ready": true
    },
    {
      "created": null,
      "id": "code-davinci-edit-001",
      "object": "engine",
      "owner": "openai",
      "permissions": null,
      "ready": true
    },
    {
      "created": null,
      "id": "text-davinci-001",
      "object": "engine",
      "owner": "openai",
      "permissions": null,
      "ready": true
    },
    {
      "created": null,
      "id": "ada",
      "object": "engine",
      "owner": "openai",
      "permissions": null,
      "ready": true
    },
    {
      "created": null,
      "id": "babbage-code-search-text",
      "object": "engine",
      "owner": "openai-dev",
      "permissions": null,
      "ready": true
    },
    {
      "created": null,
      "id": "babbage-similarity",
      "object": "engine",
      "owner": "openai-dev",
      "permissions": null,
      "ready": true
    },
    {
      "created": null,
      "id": "code-search-babbage-text-001",
      "object": "engine",
      "owner": "openai-dev",
      "permissions": null,
      "ready": true
    },
    {
      "created": null,
      "id": "text-curie-001",
      "object": "engine",
      "owner": "openai",
      "permissions": null,
      "ready": true
    },
    {
      "created": null,
      "id": "code-search-babbage-code-001",
      "object": "engine",
      "owner": "openai-dev",
      "permissions": null,
      "ready": true
    },
    {
      "created": null,
      "id": "text-ada-001",
      "object": "engine",
      "owner": "openai",
      "permissions": null,
      "ready": true
    },
    {
      "created": null,
      "id": "text-embedding-ada-002",
      "object": "engine",
      "owner": "openai-internal",
      "permissions": null,
      "ready": true
    },
    {
      "created": null,
      "id": "text-similarity-ada-001",
      "object": "engine",
      "owner": "openai-dev",
      "permissions": null,
      "ready": true
    },
    {
      "created": null,
      "id": "curie-instruct-beta",
      "object": "engine",
      "owner": "openai",
      "permissions": null,
      "ready": true
    },
    {
      "created": null,
      "id": "ada-code-search-code",
      "object": "engine",
      "owner": "openai-dev",
      "permissions": null,
      "ready": true
    },
    {
      "created": null,
      "id": "ada-similarity",
      "object": "engine",
      "owner": "openai-dev",
      "permissions": null,
      "ready": true
    },
    {
      "created": null,
      "id": "gpt-4-0314",
      "object": "engine",
      "owner": "openai",
      "permissions": null,
      "ready": true
    },
    {
      "created": null,
      "id": "code-search-ada-text-001",
      "object": "engine",
      "owner": "openai-dev",
      "permissions": null,
      "ready": true
    },
    {
      "created": null,
      "id": "text-search-ada-query-001",
      "object": "engine",
      "owner": "openai-dev",
      "permissions": null,
      "ready": true
    },
    {
      "created": null,
      "id": "davinci-search-document",
      "object": "engine",
      "owner": "openai-dev",
      "permissions": null,
      "ready": true
    },
    {
      "created": null,
      "id": "ada-code-search-text",
      "object": "engine",
      "owner": "openai-dev",
      "permissions": null,
      "ready": true
    },
    {
      "created": null,
      "id": "text-search-ada-doc-001",
      "object": "engine",
      "owner": "openai-dev",
      "permissions": null,
      "ready": true
    },
    {
      "created": null,
      "id": "davinci-instruct-beta",
      "object": "engine",
      "owner": "openai",
      "permissions": null,
      "ready": true
    },
    {
      "created": null,
      "id": "gpt-4",
      "object": "engine",
      "owner": "openai",
      "permissions": null,
      "ready": true
    },
    {
      "created": null,
      "id": "text-similarity-curie-001",
      "object": "engine",
      "owner": "openai-dev",
      "permissions": null,
      "ready": true
    },
    {
      "created": null,
      "id": "code-search-ada-code-001",
      "object": "engine",
      "owner": "openai-dev",
      "permissions": null,
      "ready": true
    },
    {
      "created": null,
      "id": "ada-search-query",
      "object": "engine",
      "owner": "openai-dev",
      "permissions": null,
      "ready": true
    },
    {
      "created": null,
      "id": "text-search-davinci-query-001",
      "object": "engine",
      "owner": "openai-dev",
      "permissions": null,
      "ready": true
    },
    {
      "created": null,
      "id": "curie-search-query",
      "object": "engine",
      "owner": "openai-dev",
      "permissions": null,
      "ready": true
    },
    {
      "created": null,
      "id": "davinci-search-query",
      "object": "engine",
      "owner": "openai-dev",
      "permissions": null,
      "ready": true
    },
    {
      "created": null,
      "id": "babbage-search-document",
      "object": "engine",
      "owner": "openai-dev",
      "permissions": null,
      "ready": true
    },
    {
      "created": null,
      "id": "ada-search-document",
      "object": "engine",
      "owner": "openai-dev",
      "permissions": null,
      "ready": true
    },
    {
      "created": null,
      "id": "text-search-curie-query-001",
      "object": "engine",
      "owner": "openai-dev",
      "permissions": null,
      "ready": true
    },
    {
      "created": null,
      "id": "text-search-babbage-doc-001",
      "object": "engine",
      "owner": "openai-dev",
      "permissions": null,
      "ready": true
    },
    {
      "created": null,
      "id": "curie-search-document",
      "object": "engine",
      "owner": "openai-dev",
      "permissions": null,
      "ready": true
    },
    {
      "created": null,
      "id": "text-search-curie-doc-001",
      "object": "engine",
      "owner": "openai-dev",
      "permissions": null,
      "ready": true
    },
    {
      "created": null,
      "id": "babbage-search-query",
      "object": "engine",
      "owner": "openai-dev",
      "permissions": null,
      "ready": true
    },
    {
      "created": null,
      "id": "text-babbage-001",
      "object": "engine",
      "owner": "openai",
      "permissions": null,
      "ready": true
    },
    {
      "created": null,
      "id": "text-search-davinci-doc-001",
      "object": "engine",
      "owner": "openai-dev",
      "permissions": null,
      "ready": true
    },
    {
      "created": null,
      "id": "text-search-babbage-query-001",
      "object": "engine",
      "owner": "openai-dev",
      "permissions": null,
      "ready": true
    },
    {
      "created": null,
      "id": "curie-similarity",
      "object": "engine",
      "owner": "openai-dev",
      "permissions": null,
      "ready": true
    },
    {
      "created": null,
      "id": "curie",
      "object": "engine",
      "owner": "openai",
      "permissions": null,
      "ready": true
    },
    {
      "created": null,
      "id": "text-similarity-davinci-001",
      "object": "engine",
      "owner": "openai-dev",
      "permissions": null,
      "ready": true
    },
    {
      "created": null,
      "id": "text-davinci-002",
      "object": "engine",
      "owner": "openai",
      "permissions": null,
      "ready": true
    },
    {
      "created": null,
      "id": "davinci-similarity",
      "object": "engine",
      "owner": "openai-dev",
      "permissions": null,
      "ready": true
    }
  ],
  "object": "list"
}
embed_model = "text-embedding-ada-002"

res = openai.Embedding.create(
    input=[
        "Sample document text goes here",
        "there will be several phrases in each batch"
    ], engine=embed_model
)

In the response res we will find a JSON-like object containing our new embeddings within the 'data' field.

res.keys()
dict_keys(['object', 'data', 'model', 'usage'])

Inside 'data' we will find two records, one for each of the two sentences we just embedded. Each vector embedding contains 1536 dimensions (the output dimensionality of the text-embedding-ada-002 model).

len(res['data'])
2
len(res['data'][0]['embedding']), len(res['data'][1]['embedding'])
(1536, 1536)

We will apply this same embedding logic to the LangChain docs dataset we've just scraped. But before doing so, we must create a place to store the embeddings.

Initializing the Index

Now we need a place to store these embeddings and enable an efficient vector search through them all. To do that we use Pinecone. We can get a free API key and enter it below, where we initialize our connection to Pinecone and create a new index.

import pinecone

# initialize connection to pinecone (get API key at app.pinecone.io)
api_key = os.getenv("PINECONE_API_KEY") or "PINECONE_API_KEY"
# find your environment next to the api key in pinecone console
env = os.getenv("PINECONE_ENVIRONMENT") or "PINECONE_ENVIRONMENT"

pinecone.init(api_key=api_key, environment=env)
pinecone.whoami()
WhoAmIResponse(username='c78f2bd', user_label='default', projectname='9a4fbb6')
index_name = 'gpt-4-langchain-docs'
import time

# check if index already exists (it shouldn't if this is the first time)
if index_name not in pinecone.list_indexes():
    # if it does not exist, create the index
    pinecone.create_index(
        index_name,
        dimension=len(res['data'][0]['embedding']),
        metric='cosine'
    )
    # wait for index to be initialized
    time.sleep(1)

# connect to index
index = pinecone.GRPCIndex(index_name)
# view index stats
index.describe_index_stats()
{'dimension': 1536,
 'index_fullness': 0.0,
 'namespaces': {},
 'total_vector_count': 0}

We can see the index is currently empty, with a total_vector_count of 0. We can begin populating it with embeddings built using OpenAI's text-embedding-ada-002 model like so:

from tqdm.auto import tqdm
from time import sleep

batch_size = 100  # how many embeddings we create and insert at once

for i in tqdm(range(0, len(chunks), batch_size)):
    # find end of batch
    i_end = min(len(chunks), i+batch_size)
    meta_batch = chunks[i:i_end]
    # get ids
    ids_batch = [x['id'] for x in meta_batch]
    # get texts to encode
    texts = [x['text'] for x in meta_batch]
    # create embeddings (retrying on RateLimitError)
    try:
        res = openai.Embedding.create(input=texts, engine=embed_model)
    except openai.error.RateLimitError:
        done = False
        while not done:
            sleep(5)
            try:
                res = openai.Embedding.create(input=texts, engine=embed_model)
                done = True
            except openai.error.RateLimitError:
                pass
    embeds = [record['embedding'] for record in res['data']]
    # cleanup metadata
    meta_batch = [{
        'text': x['text'],
        'chunk': x['chunk'],
        'url': x['url']
    } for x in meta_batch]
    to_upsert = list(zip(ids_batch, embeds, meta_batch))
    # upsert to Pinecone
    index.upsert(vectors=to_upsert)
  0%|          | 0/34 [00:00<?, ?it/s]
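
Once the loop completes, we can re-check the index stats to confirm the upserts landed (the total vector count should now match the number of chunks we created):

# confirm the index is populated; total_vector_count should equal len(chunks)
index.describe_index_stats()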

Now we've added all of our LangChain docs to the index. With that, we can move on to retrieval and then answer generation using GPT-4.

Retrieval

To search through our documents we first need to create a query vector xq. Using xq we will retrieve the most relevant chunks from the LangChain docs, like so:

query = "how do I use the LLMChain in LangChain?"

res = openai.Embedding.create(
    input=[query],
    engine=embed_model
)

# extract the query vector
xq = res['data'][0]['embedding']

# retrieve the most relevant contexts from Pinecone
res = index.query(xq, top_k=5, include_metadata=True)
res
{'matches': [{'id': 'a1c7cab4-bf69-425f-877d-3d7c832c4894',
              'metadata': {'chunk': 2.0,
                           'text': 'for full documentation on:\\n\\nGetting '
                                   'started (installation, setting up the '
                                   'environment, simple examples)\\n\\nHow-To '
                                   'examples (demos, integrations, helper '
                                   'functions)\\n\\nReference (full API '
                                   'docs)\\n\\nResources (high-level '
                                   'explanation of core '
                                   'concepts)\\n\\nð\\x9f\\x9a\\x80 What can '
                                   'this help with?\\n\\nThere are six main '
                                   'areas that LangChain is designed to help '
                                   'with.\\nThese are, in increasing order of '
                                   'complexity:\\n\\nð\\x9f“\\x83 LLMs and '
                                   'Prompts:\\n\\nThis includes prompt '
                                   'management, prompt optimization, a generic '
                                   'interface for all LLMs, and common '
                                   'utilities for working with '
                                   'LLMs.\\n\\nð\\x9f”\\x97 '
                                   'Chains:\\n\\nChains go beyond a single LLM '
                                   'call and involve sequences of calls '
                                   '(whether to an LLM or a different '
                                   'utility). LangChain provides a standard '
                                   'interface for chains, lots of integrations '
                                   'with other tools, and end-to-end chains '
                                   'for common applications.\\n\\nð\\x9f“\\x9a '
                                   'Data Augmented Generation:\\n\\nData '
                                   'Augmented Generation involves specific '
                                   'types of chains that first interact with '
                                   'an external data source to fetch data for '
                                   'use in the generation step. Examples '
                                   'include summarization of long pieces of '
                                   'text and question/answering over specific '
                                   'data sources.\\n\\nð\\x9f¤\\x96 '
                                   'Agents:\\n\\nAgents involve an LLM making '
                                   'decisions about which Actions to take, '
                                   'taking that Action, seeing an Observation, '
                                   'and repeating that until done. LangChain',
                           'url': 'https://python.langchain.com/en/latest/modules/indexes/document_loaders/examples/markdown.html'},
              'score': 0.86983985,
              'sparse_values': {'indices': [], 'values': []},
              'values': []},
             {'id': '6d71f35b-d113-411c-9095-4c4e7cb02b85',
              'metadata': {'chunk': 17.0,
                           'text': 'an Observation, and repeating that until '
                                   'done. LangChain provides a standard '
                                   'interface for agents, a selection of '
                                   'agents to choose from, and examples of end '
                                   'to end agents.\\n\\n\\n\\n\\n\\nUse '
                                   'Cases#\\nThe above modules can be used in '
                                   'a variety of ways. LangChain also provides '
                                   'guidance and assistance in this. Below are '
                                   'some of the common use cases LangChain '
                                   'supports.\\n\\nPersonal Assistants: The '
                                   'main LangChain use case. Personal '
                                   'assistants need to take actions, remember '
                                   'interactions, and have knowledge about '
                                   'your data.\\nQuestion Answering: The '
                                   'second big LangChain use case. Answering '
                                   'questions over specific documents, only '
                                   'utilizing the information in those '
                                   'documents to construct an '
                                   'answer.\\nChatbots: Since language models '
                                   'are good at producing text, that makes '
                                   'them ideal for creating '
                                   'chatbots.\\nQuerying Tabular Data: If you '
                                   'want to understand how to use LLMs to '
                                   'query data that is stored in a tabular '
                                   'format (csvs, SQL, dataframes, etc) you '
                                   'should read this page.\\nInteracting with '
                                   'APIs: Enabling LLMs to interact with APIs '
                                   'is extremely powerful in order to give '
                                   'them more up-to-date information and allow '
                                   'them to take actions.\\nExtraction: '
                                   'Extract structured information from '
                                   'text.\\nSummarization: Summarizing longer '
                                   'documents into shorter, more condensed '
                                   'chunks of information. A type of Data '
                                   'Augmented Generation.\\nEvaluation: '
                                   'Generative models are notoriously',
                           'url': 'https://python.langchain.com/en/latest/modules/indexes/document_loaders/examples/sitemap.html'},
              'score': 0.8587692,
              'sparse_values': {'indices': [], 'values': []},
              'values': []},
             {'id': 'bf7a87a3-acc9-46ee-855c-cde84fcc73a6',
              'metadata': {'chunk': 7.0,
                           'text': 'working with raw text, they work with '
                                   'messages. LangChain provides a standard '
                                   'interface for working with them and doing '
                                   'all the same things as '
                                   'above.\\n\\n\\n\\n\\n\\nUse Cases#\\nThe '
                                   'above modules can be used in a variety of '
                                   'ways. LangChain also provides guidance and '
                                   'assistance in this. Below are some of the '
                                   'common use cases LangChain '
                                   'supports.\\n\\nAgents: Agents are systems '
                                   'that use a language model to interact with '
                                   'other tools. These can be used to do more '
                                   'grounded question/answering, interact with '
                                   'APIs, or even take actions.\\nChatbots: '
                                   'Since language models are good at '
                                   'producing text, that makes them ideal for '
                                   'creating chatbots.\\nData Augmented '
                                   'Generation: Data Augmented Generation '
                                   'involves specific types of chains that '
                                   'first interact with an external datasource '
                                   'to fetch data to use in the generation '
                                   'step. Examples of this include '
                                   'summarization of long pieces of text and '
                                   'question/answering over specific data '
                                   'sources.\\nQuestion Answering: Answering '
                                   'questions over specific documents, only '
                                   'utilizing the information in those '
                                   'documents to construct an answer. A type '
                                   'of Data Augmented '
                                   'Generation.\\nSummarization: Summarizing '
                                   'longer documents into shorter, more '
                                   'condensed chunks of information. A type of '
                                   'Data Augmented Generation.\\nQuerying '
                                   'Tabular Data: If you want to understand '
                                   'how to use LLMs to query data that is '
                                   'stored in a tabular format (csvs,',
                           'url': 'https://python.langchain.com/en/latest/modules/indexes/document_loaders/examples/sitemap.html'},
              'score': 0.85652447,
              'sparse_values': {'indices': [], 'values': []},
              'values': []},
             {'id': '0a207cad-b0c9-4118-bf90-221f944b2d50',
              'metadata': {'chunk': 1.0,
                           'text': 'Initiate the LLMChain\n'
                                   'Run the LLMChain\n'
                                   'By Harrison Chase\n'
                                   '    \n'
                                   '      © Copyright 2023, Harrison Chase.\n'
                                   '      \n'
                                   '  Last updated on May 24, 2023.',
                           'url': 'https://python.langchain.com/en/latest/modules/models/llms/integrations/pipelineai_example.html'},
              'score': 0.85101515,
              'sparse_values': {'indices': [], 'values': []},
              'values': []},
             {'id': '398507c5-25d1-45e1-b63f-6a4d093c3ec4',
              'metadata': {'chunk': 2.0,
                           'text': 'use memory.\\nIndexes: Language models are '
                                   'often more powerful when combined with '
                                   'your own text data - this module covers '
                                   'best practices for doing exactly '
                                   'that.\\nChains: Chains go beyond just a '
                                   'single LLM call, and are sequences of '
                                   'calls (whether to an LLM or a different '
                                   'utility). LangChain provides a standard '
                                   'interface for chains, lots of integrations '
                                   'with other tools, and end-to-end chains '
                                   'for common applications.\\nAgents: Agents '
                                   'involve an LLM making decisions about '
                                   'which Actions to take, taking that Action, '
                                   'seeing an Observation, and repeating that '
                                   'until done. LangChain provides a standard '
                                   'interface for agents, a selection of '
                                   'agents to choose from, and examples of end '
                                   'to end agents.\\nUse Cases\\nThe above '
                                   'modules can be used in a variety of ways. '
                                   'LangChain also provides guidance and '
                                   'assistance in this. Below are some of the '
                                   'common use cases LangChain '
                                   'supports.\\nPersonal Assistants: The main '
                                   'LangChain use case. Personal assistants '
                                   'need to take actions, remember '
                                   'interactions, and have knowledge about '
                                   'your data.\\nQuestion Answering: The '
                                   'second big LangChain use case. Answering '
                                   'questions over specific documents, only '
                                   'utilizing the information in those '
                                   'documents to construct an '
                                   'answer.\\nChatbots: Since language models '
                                   'are good at producing text, that makes '
                                   'them ideal for creating '
                                   'chatbots.\\nQuerying Tabular Data: If you '
                                   'want to understand how to',
                           'url': 'https://python.langchain.com/en/latest/modules/indexes/document_loaders/examples/diffbot.html'},
              'score': 0.85059077,
              'sparse_values': {'indices': [], 'values': []},
              'values': []}],
 'namespace': ''}
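
Each match carries its source url and similarity score in the metadata, so we can summarize the results for readability (a small illustrative loop):

# print the similarity score and source URL of each retrieved chunk
for match in res['matches']:
    print(f"{match['score']:.2f}: {match['metadata']['url']}")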

With retrieval complete, we move on to feeding these into GPT-4 to produce answers.

Retrieval Augmented Generation

GPT-4 is currently accessed via OpenAI's Chat Completions endpoint. To feed the retrieved information into the model, we pass it into our user prompt alongside the original query. We can do that like so:

# get list of retrieved text
contexts = [item['metadata']['text'] for item in res['matches']]

augmented_query = "\n\n---\n\n".join(contexts)+"\n\n-----\n\n"+query
print(augmented_query)
for full documentation on:\n\nGetting started (installation, setting up the environment, simple examples)\n\nHow-To examples (demos, integrations, helper functions)\n\nReference (full API docs)\n\nResources (high-level explanation of core concepts)\n\nð\x9f\x9a\x80 What can this help with?\n\nThere are six main areas that LangChain is designed to help with.\nThese are, in increasing order of complexity:\n\nð\x9f“\x83 LLMs and Prompts:\n\nThis includes prompt management, prompt optimization, a generic interface for all LLMs, and common utilities for working with LLMs.\n\nð\x9f”\x97 Chains:\n\nChains go beyond a single LLM call and involve sequences of calls (whether to an LLM or a different utility). LangChain provides a standard interface for chains, lots of integrations with other tools, and end-to-end chains for common applications.\n\nð\x9f“\x9a Data Augmented Generation:\n\nData Augmented Generation involves specific types of chains that first interact with an external data source to fetch data for use in the generation step. Examples include summarization of long pieces of text and question/answering over specific data sources.\n\nð\x9f¤\x96 Agents:\n\nAgents involve an LLM making decisions about which Actions to take, taking that Action, seeing an Observation, and repeating that until done. LangChain

---

an Observation, and repeating that until done. LangChain provides a standard interface for agents, a selection of agents to choose from, and examples of end to end agents.\n\n\n\n\n\nUse Cases#\nThe above modules can be used in a variety of ways. LangChain also provides guidance and assistance in this. Below are some of the common use cases LangChain supports.\n\nPersonal Assistants: The main LangChain use case. Personal assistants need to take actions, remember interactions, and have knowledge about your data.\nQuestion Answering: The second big LangChain use case. Answering questions over specific documents, only utilizing the information in those documents to construct an answer.\nChatbots: Since language models are good at producing text, that makes them ideal for creating chatbots.\nQuerying Tabular Data: If you want to understand how to use LLMs to query data that is stored in a tabular format (csvs, SQL, dataframes, etc) you should read this page.\nInteracting with APIs: Enabling LLMs to interact with APIs is extremely powerful in order to give them more up-to-date information and allow them to take actions.\nExtraction: Extract structured information from text.\nSummarization: Summarizing longer documents into shorter, more condensed chunks of information. A type of Data Augmented Generation.\nEvaluation: Generative models are notoriously

---

working with raw text, they work with messages. LangChain provides a standard interface for working with them and doing all the same things as above.\n\n\n\n\n\nUse Cases#\nThe above modules can be used in a variety of ways. LangChain also provides guidance and assistance in this. Below are some of the common use cases LangChain supports.\n\nAgents: Agents are systems that use a language model to interact with other tools. These can be used to do more grounded question/answering, interact with APIs, or even take actions.\nChatbots: Since language models are good at producing text, that makes them ideal for creating chatbots.\nData Augmented Generation: Data Augmented Generation involves specific types of chains that first interact with an external datasource to fetch data to use in the generation step. Examples of this include summarization of long pieces of text and question/answering over specific data sources.\nQuestion Answering: Answering questions over specific documents, only utilizing the information in those documents to construct an answer. A type of Data Augmented Generation.\nSummarization: Summarizing longer documents into shorter, more condensed chunks of information. A type of Data Augmented Generation.\nQuerying Tabular Data: If you want to understand how to use LLMs to query data that is stored in a tabular format (csvs,

---

Initiate the LLMChain
Run the LLMChain
By Harrison Chase
    
      © Copyright 2023, Harrison Chase.
      
  Last updated on May 24, 2023.

---

use memory.\nIndexes: Language models are often more powerful when combined with your own text data - this module covers best practices for doing exactly that.\nChains: Chains go beyond just a single LLM call, and are sequences of calls (whether to an LLM or a different utility). LangChain provides a standard interface for chains, lots of integrations with other tools, and end-to-end chains for common applications.\nAgents: Agents involve an LLM making decisions about which Actions to take, taking that Action, seeing an Observation, and repeating that until done. LangChain provides a standard interface for agents, a selection of agents to choose from, and examples of end to end agents.\nUse Cases\nThe above modules can be used in a variety of ways. LangChain also provides guidance and assistance in this. Below are some of the common use cases LangChain supports.\nPersonal Assistants: The main LangChain use case. Personal assistants need to take actions, remember interactions, and have knowledge about your data.\nQuestion Answering: The second big LangChain use case. Answering questions over specific documents, only utilizing the information in those documents to construct an answer.\nChatbots: Since language models are good at producing text, that makes them ideal for creating chatbots.\nQuerying Tabular Data: If you want to understand how to

-----

how do I use the LLMChain in LangChain?

Now we ask the question:

# system message to 'prime' the model
primer = """You are a Q&A bot, a highly intelligent system that answers
user questions based on the information provided by the user above
each question. If the information cannot be found in the information
provided by the user, you truthfully say "I don't know".
"""

res = openai.ChatCompletion.create(
    model="gpt-4",
    messages=[
        {"role": "system", "content": primer},
        {"role": "user", "content": augmented_query}
    ]
)

To render this response nicely, we display it as Markdown:

from IPython.display import Markdown

display(Markdown(res['choices'][0]['message']['content']))

To use the LLMChain in LangChain, you would typically follow these steps:

  1. Set up your environment: Make sure you have installed LangChain and set up your environment according to the documentation.

  2. Import the necessary classes and modules in your code.

  3. Create an instance of the LLMChain, customizing it with the desired LLM configuration and other required settings.

  4. Add components and functions to your LLMChain as needed, such as data fetching, action decisions, and observations.

  5. Run the LLMChain to retrieve the desired output, such as text generation or data processing.

Keep in mind that this is a general overview, and the specific implementation details and examples can be found in the full LangChain documentation, including the Getting Started section and How-To examples.
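
The model's answer is a fair high-level summary. For reference, a minimal LLMChain in langchain 0.0.179 looks something like this (a sketch assuming OPENAI_API_KEY is set; the prompt and input are made up for illustration):

from langchain.llms import OpenAI
from langchain.prompts import PromptTemplate
from langchain.chains import LLMChain

# wrap a single-variable prompt template and an LLM in a chain, then run it
prompt = PromptTemplate(
    input_variables=["topic"],
    template="Explain {topic} in one sentence."
)
chain = LLMChain(llm=OpenAI(temperature=0), prompt=prompt)
chain.run(topic="retrieval augmented generation")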

Let's compare this to a non-augmented query...

res = openai.ChatCompletion.create(
    model="gpt-4",
    messages=[
        {"role": "system", "content": primer},
        {"role": "user", "content": query}
    ]
)
display(Markdown(res['choices'][0]['message']['content']))

I don't know.

What if we drop the "I don't know" instruction from the primer?

res = openai.ChatCompletion.create(
    model="gpt-4",
    messages=[
        {"role": "system", "content": "You are Q&A bot. A highly intelligent system that answers user questions"},
        {"role": "user", "content": query}
    ]
)
display(Markdown(res['choices'][0]['message']['content']))

LangChain is a hypothetical platform, and LLMChain appears to be a fictional component within it. Therefore, I cannot provide specific information or the procedure to use LLMChain within LangChain. However, if you have any other questions about a real programming language, technology, or concept, please feel free to ask, and I'd be happy to help.

Here we see something even worse than "I don't know": a hallucination. Clearly, augmenting our queries with additional context can make a huge difference to the performance of our system.

Great, we've seen how to augment GPT-4 with semantic search, allowing us to answer LangChain-specific queries.
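
As a final touch, the full retrieve-then-answer flow can be folded into a single helper built from the calls above (a sketch reusing our embed_model, index, and primer):

def retrieve_and_answer(query: str, top_k: int = 5) -> str:
    # embed the query and retrieve the most relevant chunks from Pinecone
    xq = openai.Embedding.create(input=[query], engine=embed_model)['data'][0]['embedding']
    matches = index.query(xq, top_k=top_k, include_metadata=True)['matches']
    # prepend the retrieved context to the question
    contexts = [m['metadata']['text'] for m in matches]
    augmented = "\n\n---\n\n".join(contexts) + "\n\n-----\n\n" + query
    # answer with GPT-4, primed to say "I don't know" when unsure
    res = openai.ChatCompletion.create(
        model="gpt-4",
        messages=[
            {"role": "system", "content": primer},
            {"role": "user", "content": augmented}
        ]
    )
    return res['choices'][0]['message']['content']

retrieve_and_answer("how do I use the LLMChain in LangChain?")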

Once you're finished, delete the index to save resources:

pinecone.delete_index(index_name)