What is a Vector Database? How to use it for creating RAG LLM applications?


Vector databases have surged in popularity since the launch of ChatGPT. Previously, they were used in applications such as similarity search, recommendation systems, and image recognition; now they have become a critical tool for building LLM and RAG applications. There are many vector databases and vector search libraries, such as Pinecone, Faiss, and Chroma. Which one is best for your use cases and applications? 

If you want to try creating or experimenting with an LLM chatbot backed by a vector database, you can use Bind and test how your results differ with and without a vector index. Here’s a detailed example of creating a financial app with a vector index.

What are Vector Databases? 

Embeddings are numerical vectors in a high-dimensional space that represent words, phrases, or documents; vectors that lie close together in this space have similar semantic meaning. Vector databases store these embeddings and index them so that, when a query is made, the most similar vectors can be retrieved based on their proximity in the vector space. An LLM application can then use the retrieved content to generate relevant responses or predictions. This is a very different approach from traditional databases, which typically store structured data: a query against a traditional database searches for exact matches, regex, or patterns within specified fields or values, rather than for semantic similarity.
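To make "proximity in the vector space" concrete, here is a minimal sketch of cosine similarity, the comparison most vector stores use. The four-dimensional vectors below are made up for illustration; real embedding models produce hundreds or thousands of dimensions.

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity: near 1.0 means similar meaning, near 0.0 unrelated."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Toy 4-dimensional "embeddings" with invented values, purely for illustration.
cat = np.array([0.9, 0.8, 0.1, 0.0])
dog = np.array([0.8, 0.9, 0.2, 0.1])
car = np.array([0.1, 0.0, 0.9, 0.8])

print(cosine_similarity(cat, dog))  # high: semantically related words
print(cosine_similarity(cat, car))  # low: unrelated words
```

A vector database answers a query by finding the stored vectors with the highest similarity to (or smallest distance from) the query's embedding.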

An AI-generated representation of Vector Embeddings in a high dimensional space

Why do LLMs need Vector DBs? How do you use them?

LLMs are stateless and don’t have access to your proprietary information. If you ask a question through an LLM API, it will respond based only on the input you provide and the knowledge it was trained on. As an example, imagine you are building an AI assistant for medical diagnosis. You’ll want to make sure that the LLM does not hallucinate or provide inaccurate information. That’s where a vector database comes into the picture: you can store your proprietary data as vector embeddings and instruct the LLM to use only the relevant information you provide when framing its response. This approach is known as Retrieval Augmented Generation, or RAG.

This is the typical process for an LLM application using the RAG approach with a vector database:

  • Split your documents into chunks, based on the chunk size (often measured in tokens) you choose.
  • Create an embedding for each chunk using an embeddings model (e.g., OpenAI embeddings), and store the chunks with their embeddings in a vector database.
  • When a user enters a query in your LLM chat application, embed the query and use the vector database to retrieve the most relevant chunks, which you then pass to the LLM along with your prompt and the user’s question.
  • The LLM produces a response grounded in the content it was given.
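The steps above can be sketched end to end in a few lines. The `embed` function below is a hypothetical stand-in for a real embeddings model (such as an API call to OpenAI); it hashes characters into a small vector purely so the sketch is runnable, and with a real model the retrieved chunk would actually match the question's meaning.

```python
import numpy as np

def embed(text: str, dim: int = 8) -> np.ndarray:
    """Toy stand-in for an embeddings model: NOT semantically meaningful."""
    vec = np.zeros(dim)
    for i, ch in enumerate(text.lower()):
        vec[i % dim] += ord(ch)
    norm = np.linalg.norm(vec)
    return vec / norm if norm else vec

# Step 1: split documents into chunks (here, one sentence per chunk).
chunks = [
    "Our savings account offers 4% annual interest.",
    "The platinum credit card has no foreign transaction fees.",
    "Wire transfers are processed within one business day.",
]

# Step 2: embed each chunk and keep it in an in-memory "vector store".
store = [(chunk, embed(chunk)) for chunk in chunks]

# Step 3: embed the user query and retrieve the most similar chunks.
def retrieve(query: str, k: int = 1) -> list[str]:
    q = embed(query)
    ranked = sorted(store, key=lambda item: -float(np.dot(q, item[1])))
    return [chunk for chunk, _ in ranked[:k]]

# Step 4: pass the retrieved chunks to the LLM alongside your prompt.
context = retrieve("Does the credit card charge fees abroad?")
prompt = f"Answer using only this context:\n{context}\n"
```

In a production app, steps 2 and 3 would be handled by a real vector database, and step 4 would be a call to the LLM API.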

For a broader overview of LLM apps, please read our previous blog post, which explains the foundations of LLMs and core concepts such as RAG.

Now, let’s look at the main types of vector databases and search libraries, and which ones may suit your use cases.

Types of Vector databases and vector search libraries

Vector storage and retrieval can be provided by dedicated vector databases, by vector search libraries, or by vector indexing added to traditional databases. To get a clearer understanding, let’s look at each in turn.

Dedicated Vector DBs

These are a new breed of databases, such as Pinecone, built from the ground up for storing and retrieving vector embeddings.

Vector Search Libraries such as FAISS

These are libraries rather than full databases: they provide efficient in-memory indexes for similarity search, but leave storage, persistence, and serving to your application. This is a simple and fast approach, but it offers fewer operational features (such as filtering, replication, or incremental updates) than a dedicated database. Faiss, an open-source library from Meta, is a popular example.
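Faiss's simplest index, `IndexFlatL2`, performs an exhaustive L2 search over all stored vectors. The numpy sketch below shows the equivalent computation without the library, so you can see exactly what such a flat index does (Faiss does the same thing in optimized C++).

```python
import numpy as np

rng = np.random.default_rng(0)
d = 64                                         # embedding dimensionality
xb = rng.random((1000, d)).astype("float32")   # "database" of stored vectors

# A query vector that is a slightly perturbed copy of stored vector 42.
xq = xb[42] + 0.001 * rng.random(d).astype("float32")

def search(query: np.ndarray, k: int = 5) -> np.ndarray:
    """Exhaustive search: L2 distance from the query to every stored
    vector, returning the indices of the k nearest."""
    dists = np.linalg.norm(xb - query, axis=1)
    return np.argsort(dists)[:k]

print(search(xq))  # index 42 should be the closest match
```

With Faiss itself, the same search is `index = faiss.IndexFlatL2(d); index.add(xb); index.search(...)`; the library also offers approximate indexes (e.g., IVF, HNSW) that trade a little recall for much faster search on large datasets.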

Traditional Databases Supporting Vector Indexing

Traditional datastores such as Elasticsearch, Redis, and PostgreSQL now offer extensions for vector indexing, even though they were not originally designed for vector operations. Elasticsearch, well known for its full-text search functionality, has added vector search to handle semantic queries alongside keyword queries. Redis, the popular in-memory data structure store, adds vector indexing for low-latency similarity search. With the pgvector extension, PostgreSQL can store and query high-dimensional vector data as well. The incorporation of vector indexing into well-established databases is a noteworthy development, as it combines the reliability and operational maturity of conventional databases with the modern requirements of vector data processing.

Example Use cases of LLM Applications which use Vector DBs

Vector database integration has become a key technique for improving accuracy and efficiency in language-model-based applications. Suppose you are chatting with a GPT-4-powered customer care chatbot to learn about different banking products. Traditionally, such chatbots rely on a fixed, predetermined context, and they frequently run into trouble because of context-size limits and the difficulty of incorporating large bodies of knowledge, such as detailed product information or customer profiles.

This is where vector databases come in. They store text embeddings, numerical representations that capture the meaning of text, which can significantly improve the chatbot’s comprehension of and responses to your questions. 

  • Banking chatbot: text embeddings of credit card details are stored in a vector database. When you ask about credit card options, the chatbot embeds your question, matches it against the most relevant stored embeddings, and dynamically populates its context with that data to deliver a more accurate response.
  • E-commerce search: vector search can change the way products are found. Instead of depending only on exact phrase matches, it surfaces semantically similar items. This works particularly well in e-commerce, where customers frequently use natural language or misspell product names, and where product descriptions vary. By translating product data and user queries into vectors, platforms can rank search results by semantic relevance rather than keyword overlap.
  • SQL code generation tools: vector databases help address some inherent shortcomings of LLMs, such as limited context windows and no access to private or current data, by storing and retrieving embeddings alongside the generation workflow. SQL Server and Azure SQL can themselves serve as vector stores, holding vectors in columnstore indexes and performing similarity searches efficiently even over large datasets. As a concrete illustration, think of a console program that processes the text of classic literature, builds a vector store in SQL Server, and then answers questions against it.

Whether in a chatbot, an e-commerce platform, or a SQL code generation tool, incorporating vector databases into LLM applications is a step toward smarter, more responsive, and contextually aware digital interactions, not merely a technical improvement. It shows how AI and machine learning can transform the way we retrieve information and interact with online services.

Summary

In this post, we reviewed the fundamentals of vector databases, the main types of vector stores, and some common use cases.

Read our next blog post to learn about the top 10 vector databases.
