LangChain vs LlamaIndex

Large Language Models, or LLMs for short, can create human-like text and handle human-language complexity. Trained on vast amounts of data, LLMs can understand and produce text that’s contextually relevant across many different topics.

If you’re trying to build data-aware LLM applications and talk to your data through your product or software, you’ll most likely run into LangChain, LlamaIndex, or both.

LangChain and LlamaIndex are both frameworks that allow you to ingest and query data using an LLM as an interface. Depending on your needs, one or both of these frameworks will make sense to leverage.

We’ll explore both in depth so that you can make an informed decision on the technologies you wish to implement in order to build your LLM application. It’s worth noting that both of these frameworks are constantly evolving and receive updates regularly.

What is LlamaIndex?

LlamaIndex (formerly GPT Index) is a framework for LLM applications to ingest, structure, and access private or domain-specific data.

As the name suggests, LlamaIndex focuses on ingesting and structuring data using index types such as the list index or tree index. With LlamaIndex you can compose indices on top of each other to form complex data structures. On the query side, LlamaIndex handles both simple and more complex queries.

Below are the key features of LlamaIndex:

  • Prompting: LlamaIndex uses prompting extensively to maximize its functionality.
  • Document chunking: LlamaIndex divides documents into smaller chunks, leveraging LangChain’s textSplitter classes, which break down text to fit LLM’s token limits. Customized chunking is also available.
  • Graph index: The index can be a list, tree, or keyword table. There’s also the option to create an index from various other indexes, facilitating hierarchical document organization.
  • Querying: When querying an index graph, two steps occur. First, relevant nodes related to the query are identified. Next, using these nodes, the response_synthesis module generates an answer. The determination of a node’s relevance varies by index type.
  • List index: Uses all nodes sequentially. In “embedding” mode, only the top similar nodes are used.
  • Vector index: Uses embeddings for each document and retrieves only highly relevant nodes.
  • Response synthesis: Multiple modes including ‘Create and refine’, ‘Tree summarize’, and ‘Compact’ determine how the response is created.
  • Create and refine: In the default mode for a list index, nodes are processed sequentially. At each step, the LLM is prompted to refine the response based on the current node’s information.
  • Tree summarize: A tree is formed from selected nodes. The summarization prompt, seeded with the query, helps form parent nodes. The tree builds until a root node emerges, summarizing all node information.
  • Compact: To economize, the synthesizer fills the prompt with as many nodes as possible without exceeding the LLM’s token limit. If there are excess nodes, the synthesizer processes them in batches, refining the answer sequentially.
  • Composability: LlamaIndex allows creating an index from other indexes. This is beneficial for searching across diverse data sources.
  • Data connectors: LlamaIndex supports various data sources, like confluence pages or cloud-stored documents, with connectors available on LlamaHub.
  • Query transformations: Techniques like HyDE and query decomposition refine or break down queries for better results.
  • HyDE: Hypothetical Document Embeddings (HyDE) prompts an LLM with a query to obtain a general answer without referencing specific documents. This answer, combined with the original query (if “include_original” is set to True), helps retrieve relevant information from your documents. This method guides the query engine but can produce irrelevant answers if the initial query lacks context.
  • Query decomposition: Queries can undergo single-step or multi-step decomposition, depending on their complexity.
  • Node postprocessors: These refine the selected nodes. For instance, the KeywordNodePostprocessor class further filters retrieved nodes based on keywords.
  • Storage: Storage is crucial in this framework. It handles vectors, nodes, and the index. By default, data is stored in memory, except for vectors, which services like Pinecone store in their own databases. In-memory data can be saved to disk for later retrieval. Let’s explore the available storage options:
  • Document Stores: MongoDB and in-memory options are available. MongoDocumentStore and SimpleDocumentStore store document nodes in a MongoDB server or in memory, respectively.
  • Index Stores: MongoIndexStore and SimpleIndexStore store index metadata in MongoDB or in memory, respectively.
  • Vector Stores: In-memory stores and databases like Pinecone are supported. One thing to note is that hosted databases like Pinecone are capable of very efficient complex calculations on vectors compared to in-memory databases such as Chroma.
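The “Create and refine” and “Compact” modes described in the list above can be sketched in plain Python. This is an illustration only, not LlamaIndex’s implementation: `refine`, `pack`, `compact`, and `fake_llm` are hypothetical names, the “LLM” is a stub that just records its prompts, and token counts are approximated by word counts.

```python
from typing import Callable, List


def refine(query: str, nodes: List[str], llm: Callable[[str], str]) -> str:
    """Create and refine: one LLM call per node, carrying the answer forward."""
    answer = ""
    for node in nodes:
        prompt = f"Question: {query}\nExisting answer: {answer}\nContext: {node}"
        answer = llm(prompt)
    return answer


def pack(nodes: List[str], max_tokens: int) -> List[str]:
    """Greedily merge nodes into batches of at most max_tokens words."""
    batches: List[str] = []
    current: List[str] = []
    used = 0
    for node in nodes:
        size = len(node.split())
        # Start a new batch when adding this node would exceed the budget.
        if current and used + size > max_tokens:
            batches.append(" ".join(current))
            current, used = [], 0
        current.append(node)
        used += size
    if current:
        batches.append(" ".join(current))
    return batches


def compact(query: str, nodes: List[str], llm: Callable[[str], str],
            max_tokens: int) -> str:
    """Compact: pack nodes into as few prompts as possible, then refine."""
    return refine(query, pack(nodes, max_tokens), llm)


# Stub "LLM" (assumption for illustration): records prompts so calls can be counted.
calls: List[str] = []


def fake_llm(prompt: str) -> str:
    calls.append(prompt)
    return f"answer v{len(calls)}"
```

With three nodes of three, two, and four words and a budget of five words, `refine` makes three LLM calls while `compact` makes two, which is the saving the “Compact” mode is after.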

To utilize LlamaIndex effectively, set up your storage preferences, and then generate a storage_context object for your indexes.

What is LangChain?

LangChain is an open-source development framework for LLM applications. It ships as two packages, one for Python and one for JavaScript (TypeScript).

The framework is focused on composition and modularity, and has many individual components, such as models, chains and agents, that can be used together or on their own. LangChain also comes with use cases that show common ways to combine components.

Let’s dive into the various components of LangChain.

Models

There are 3 kinds of models.

  • LLMs: A large language model takes text as input and returns text as output.
  • Chat models: These are similar to LLMs but specialized to work with message objects instead of plain text. Examples are human messages, system messages and AI messages. These labels don’t do anything on their own, but help the system understand how it should respond in a conversation.
  • Embedding models: These models turn text into a vector representation. A good use case for this is semantic search. LangChain exposes two embedding methods, embed_query and embed_documents, because some embedding providers treat search queries differently from the documents being searched.
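To make the difference between plain-text LLM input and chat-model input concrete, here is a minimal sketch in plain Python. It does not use LangChain’s message classes: `Message` and `to_plain_prompt` are hypothetical names chosen for illustration, and the role labels are simply strings.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Message:
    role: str      # "system", "human", or "ai"
    content: str


def to_plain_prompt(messages: List[Message]) -> str:
    """Flatten role-tagged messages into the single string a plain LLM expects."""
    return "\n".join(f"{m.role}: {m.content}" for m in messages)


# A chat model would receive this list as-is; a plain LLM needs one string.
conversation = [
    Message("system", "You are a helpful AI that assists with resume screening."),
    Message("human", "Does this resume mention Python?"),
]
```

The role on each message is only a label, but keeping it separate from the content is what lets a chat model tell context-setting instructions apart from the user’s actual question.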

Prompts

Prompts are how users interact with LLMs. LangChain uses carefully designed prompts and prompt templates that take a user query and data schemas in order to get the desired response.

There are four types of prompt templates that LangChain uses:

  • LLM Prompt Templates: To enable dynamic prompt configuration and eliminate hardcoded values, LangChain offers a template object built on Python’s formatted strings. At present, LangChain supports Jinja, with plans to integrate more templating languages.
  • Chat Prompt Templates: As described above, these use message objects rather than plain strings, which gives conversational scenarios a structured form. Messages fall into three types: HumanMessages, AIMessages, and SystemMessages. The first two are self-explanatory. SystemMessages don’t originate from either the AI or the human; they generally establish the chat’s context. For instance, “You are a helpful AI that assists with resume screening” is a SystemMessage. In essence, ChatPromptTemplates aren’t built on plain strings the way LLM prompt templates are. They’re built on MessageTemplates, which cover HumanMessages, AIMessages, and SystemMessages.
  • Example Selectors: LangChain lets you decide how examples are selected from a set and included in the prompt for a language model. One selector works on input length, adjusting the number of examples it picks according to the space left in the prompt.
  • Output Parsers: Inputs and prompts are only one side of working with LLMs. The way the output is presented often matters too, especially for subsequent operations. LangChain lets you use pre-built output parsers, but you can also write one tailored to your needs. For instance, an output parser could turn the LLM response into a sequence of comma-separated values, suitable for saving in CSV format.
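The ideas in this list can be sketched without any framework. The snippet below is an illustration under stated assumptions, not LangChain’s API: `TEMPLATE`, `render`, `select_examples_by_length`, and `parse_comma_separated` are hypothetical names, and prompt length is approximated by word count.

```python
from typing import List

# A prompt template is a string with named placeholders instead of hardcoded values.
TEMPLATE = "List {count} synonyms for the word '{word}', separated by commas."


def render(template: str, **values: object) -> str:
    """Fill the template's placeholders using Python string formatting."""
    return template.format(**values)


def select_examples_by_length(examples: List[str], max_words: int) -> List[str]:
    """Length-based selection: keep examples in order until the word budget is spent."""
    chosen: List[str] = []
    used = 0
    for example in examples:
        size = len(example.split())
        if used + size > max_words:
            break
        chosen.append(example)
        used += size
    return chosen


def parse_comma_separated(text: str) -> List[str]:
    """Output parser: turn 'a, b , c' into ['a', 'b', 'c'], dropping empty items."""
    return [item.strip() for item in text.split(",") if item.strip()]
```

A typical flow would render the template, prepend the selected examples, send the result to a model, and pass the raw reply through the parser before saving it.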

Indexes

This tool retrieves information from documents based on a query. To build this system, we need a mechanism to load documents, create embedding vectors for them, and manage these vectors and documents. While LlamaIndex offers one approach, LangChain provides more granularity through its classes.

Document Loaders: This tool loads documents from various sources like HTML, PDF, Email, Git, and Notion using Unstructured, a pre-processing tool for unstructured data.

Text Splitters: Long documents can’t be fully embedded into a vector model due to token size limits and the need for coherent chunks. Thus, splitting documents is essential. While LangChain offers splitters, creating a custom one for specific needs may be beneficial.
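As a rough illustration of what a custom splitter might look like, here is a minimal word-based splitter with overlap. It is a sketch, not one of LangChain’s splitter classes: `split_text` is a hypothetical name, and chunk size is measured in words rather than model tokens.

```python
from typing import List


def split_text(text: str, chunk_size: int, overlap: int = 0) -> List[str]:
    """Split text into chunks of up to chunk_size words, sharing `overlap` words."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be >= 0 and smaller than chunk_size")

    words = text.split()
    step = chunk_size - overlap  # how far the window advances each time
    chunks: List[str] = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        # Stop once this window has reached the end of the text.
        if start + chunk_size >= len(words):
            break
    return chunks
```

The overlap keeps a little shared context between neighboring chunks, so a sentence cut at a boundary is still partly visible in both.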

VectorStores: Databases that store embedding vectors and enable semantic similarity searches. LangChain supports platforms like Pinecone and Chroma.

Retrievers: Linked to VectorStore indexes, retrievers are designed for document retrieval, offering methods to determine similarity and the number of related documents. They are integral in chains that require retrieval.
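The relationship between embeddings, a vector store, and a retriever can be shown with a toy example. Everything here is an assumption made for illustration: `embed`, `cosine`, and `ToyVectorStore` are hypothetical names, and the “embedding” is a simple word count rather than the dense vector a real embedding model would return.

```python
import math
from collections import Counter
from typing import List, Tuple


def embed(text: str) -> Counter:
    """Toy embedding: lowercase word counts."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[word] * b[word] for word in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class ToyVectorStore:
    """Keeps (text, vector) pairs in memory and ranks them against a query."""

    def __init__(self) -> None:
        self._rows: List[Tuple[str, Counter]] = []

    def add(self, texts: List[str]) -> None:
        for text in texts:
            self._rows.append((text, embed(text)))

    def similarity_search(self, query: str, k: int = 2) -> List[str]:
        """Retriever step: return the k stored texts most similar to the query."""
        query_vector = embed(query)
        ranked = sorted(self._rows,
                        key=lambda row: cosine(query_vector, row[1]),
                        reverse=True)
        return [text for text, _ in ranked[:k]]
```

The parameter `k` plays the role described above: it sets how many related documents the retriever hands to the next step in a chain.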

Memory: While most interactions with LLMs aren’t stored, memory is crucial for applications like chatbots. LangChain provides memory objects for tracking interactions. Examples include:

  • ChatMessageHistory: Tracks previous interactions to provide context.
  • ConversationBufferMemory: A simpler way to manage chat history.
  • Saving History: Convert your message history into dictionary objects and save as formats like JSON or pickle.
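The memory ideas above can be sketched in a few lines. This is not LangChain’s ChatMessageHistory or ConversationBufferMemory; `SimpleChatHistory` is a hypothetical stand-in that stores messages as dictionaries, builds a context string from recent turns, and round-trips through JSON.

```python
import json
from typing import Dict, List, Optional


class SimpleChatHistory:
    """Tracks a conversation as a list of {'role': ..., 'content': ...} dicts."""

    def __init__(self, messages: Optional[List[Dict[str, str]]] = None) -> None:
        self.messages: List[Dict[str, str]] = list(messages or [])

    def add(self, role: str, content: str) -> None:
        self.messages.append({"role": role, "content": content})

    def as_context(self, last_n: Optional[int] = None) -> str:
        """Render the last_n messages (all if None) as text to prepend to a prompt."""
        if last_n is None:
            recent = self.messages
        else:
            recent = self.messages[max(len(self.messages) - last_n, 0):]
        return "\n".join(f"{m['role']}: {m['content']}" for m in recent)

    def to_json(self) -> str:
        """Serialize the history so it can be written to disk."""
        return json.dumps(self.messages)

    @classmethod
    def from_json(cls, payload: str) -> "SimpleChatHistory":
        """Rebuild a history from a previously saved JSON string."""
        return cls(json.loads(payload))
```

Limiting the context to the most recent turns is one simple way to keep a growing conversation inside the model’s token limit.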

Chains

Chains combine components, like language models and prompts, for specific outcomes. They can simplify inputs and provide detailed responses. They can be linked using classes like SequentialChain.
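Conceptually, a sequential chain is function composition: each step’s output becomes the next step’s input. The sketch below shows that idea in plain Python and is not LangChain’s SequentialChain; `sequential_chain`, `build_prompt`, `stub_llm`, and `shout` are hypothetical names, and the “model” is a stub that echoes its prompt.

```python
from typing import Callable, List

Step = Callable[[str], str]


def sequential_chain(steps: List[Step]) -> Step:
    """Return a function that feeds each step's output into the next step."""
    def run(text: str) -> str:
        for step in steps:
            text = step(text)
        return text
    return run


def build_prompt(topic: str) -> str:
    """Prompt step: wrap the raw input in an instruction."""
    return f"Write a one-line summary about {topic}."


def stub_llm(prompt: str) -> str:
    """Stand-in for a model call (assumption: a real chain would call an LLM here)."""
    return f"[model output for: {prompt}]"


def shout(text: str) -> str:
    """Post-processing step applied to the model's output."""
    return text.upper()


chain = sequential_chain([build_prompt, stub_llm, shout])
```

Because the steps are fixed when the chain is built, every input goes through the same sequence, which is the property that distinguishes a chain from an agent.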

Agents and Tools

LLMs are bound by their training data. For real-time data, like weather updates, they need external tools. Chains might not be enough since they use every tool, irrespective of the query.

Agents decide which tool is relevant per query. Using agents involves loading tools, initializing the agent with them, and querying. For instance, an agent might use the OpenWeatherMap API and an LLM-math tool together.

Tools: LangChain offers standard tools, but users can create custom ones. Tool descriptions help agents decide which tool to use for a query. The tool’s description is crucial for its effectiveness.
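The role of tool descriptions can be illustrated with a toy agent loop. This is a sketch under loud assumptions: in a real agent the LLM reads the descriptions and picks the tool, whereas here `choose_tool` fakes that decision with simple word overlap. `Tool`, `choose_tool`, `weather_stub`, and `add_numbers` are hypothetical names, and the weather tool returns a canned string instead of calling an API.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Tool:
    name: str
    description: str            # what the agent reads to decide on a tool
    run: Callable[[str], str]


def choose_tool(query: str, tools: List[Tool]) -> Tool:
    """Stand-in for the LLM's choice: most words shared between query and description."""
    query_words = set(query.lower().split())
    return max(
        tools,
        key=lambda tool: len(query_words & set(tool.description.lower().split())),
    )


def weather_stub(query: str) -> str:
    """Canned reply; a real tool would call a weather API here."""
    return "sunny (stubbed)"


def add_numbers(query: str) -> str:
    """Add up every whole number that appears in the query."""
    return str(sum(int(token) for token in query.split() if token.isdigit()))


tools = [
    Tool("weather", "look up the current weather forecast for a city", weather_stub),
    Tool("calculator", "add numbers and do basic math", add_numbers),
]
```

Even in this toy version, a vague or misleading description would send queries to the wrong tool, which is why the description matters so much for an agent’s effectiveness.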

Similarities: LangChain vs LlamaIndex

LlamaIndex and LangChain have some overlap, specifically in the indexing and querying stages of working with LLMs.

Within this overlap, the two frameworks handle indexing and querying slightly differently.

Difference: LangChain vs LlamaIndex

While both frameworks provide indexing and querying capabilities, LangChain is broader and provides modules for tools, agents and chains.

Chains are a powerful concept in LLM development. With chains, the output from an interaction with your LLM can be your input for your next interaction.

For example, in a chatbot application a chain would represent a conversation between a user and the LLM. These steps are typically predefined or hardcoded into the applications.

Agents are similar to chains, but with decision-making powers. They can come up with steps on their own and decide what tools to use.

LlamaIndex’s primary use case is in going very deep into index and retrieval (querying).

Which one should you choose?

LangChain and LlamaIndex together provide standardization and interoperability when building LLM applications.

If you’re looking to get the most out of indexing and querying, and to build really powerful search tools, then LlamaIndex is most likely the stronger tool for the job, as it’s primarily focused on doing those two things well. LlamaIndex is also a bit easier to get started with, and has helpful getting-started documentation.

To leverage multiple instances of ChatGPT, tools, interaction chains, and agents that can work with tools autonomously, and to give them memory, LangChain is the way to go.

LangChain’s documentation is a bit more complex, though well-written, so it may take a little longer to become productive with it.

LangChain’s community has grown a lot faster than LlamaIndex’s, so there will most likely be plenty of support around this framework, along with content covering common challenges.

Finally, you can use LangChain and LlamaIndex together and get the best of both worlds. The choice boils down to your needs when it comes to indexing and querying.

Alex Kugell

With over 10 years of experience in software outsourcing, Alex has assisted in building high-performance teams before co-founding Trio with his partner Daniel. Today he enjoys helping people hire the best software developers from Latin America and writing great content on how to do that!