Memory Stores

Interact with your content, data and information using Memory Stores

📘

Memory Stores allow your workflows and agents to retrieve relevant information when they are completing requests.

Memory Stores use semantic search: search based on "meaning" rather than the direct keyword matching found in, for example, SQL queries.

How does a Memory Store work?

AirOps Studio's Managed Memory Stores are powered by Pinecone vector databases and OpenAI's embedding model, which together let you semantically search your content in an AirOps App. When you hear about "chatbots or AI trained on your data," it almost always means using a vector database with embeddings for data retrieval.

Here is a step-by-step walkthrough:

  1. Add your documents (pdf, txt, Google Sheets, SQL DB or csv) to your Memory Store
  2. We segment your content into smaller "chunks" and generate embeddings (numerical representations of your data):
    1. Each "chunk" is no more than 1000 tokens so it can fit in LLM prompts
    2. We create vector embeddings (numeric representations) of each chunk using an OpenAI embedding model
    3. The vector embeddings will allow you to compare the relatedness and semantic similarity between your content and your search query
      For example, searching for "how do i reset my password" would return chunks related to password resets, being locked out of your account, and how to contact support for account issues.
  3. We create your Memory Store by storing the embeddings in a Pinecone vector database
  4. You add a Memory Store Step in an AirOps App to query your memory store (the vector database)
    1. Add a Memory Store step with a search query
    2. The query will retrieve the top n chunks (embeddings) with the closest semantic similarity (based on numeric distance)
    3. In other words, the Memory Store Step will return the top chunks that are most similar to your search query
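
The walkthrough above can be sketched in a few lines of Python. This is a minimal illustration, not AirOps' implementation: `toy_embed` is a hypothetical word-count stand-in for the OpenAI embedding model, a plain list stands in for the Pinecone index, and whitespace-separated words stand in for tokens. Unlike a real embedding model, the toy version only matches shared words, so it shows the mechanics of "chunk, embed, store, query" rather than true semantic matching.

```python
import math
import re
from collections import Counter

def chunk_text(text, max_tokens=1000):
    """Split text into chunks of at most max_tokens words (words approximate tokens here)."""
    words = text.split()
    return [" ".join(words[i:i + max_tokens]) for i in range(0, len(words), max_tokens)]

def toy_embed(text):
    """Hypothetical stand-in for an embedding model: lowercase word counts as a sparse vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def search(store, query, max_results=3):
    """Return the max_results (score, chunk) pairs most similar to the query."""
    query_vector = toy_embed(query)
    scored = [(cosine(query_vector, vector), chunk) for chunk, vector in store]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:max_results]

documents = [
    "To reset your password open settings and choose reset password",
    "Our office is closed on public holidays",
    "Locked out of your account? Contact support to reset access",
]

# "Memory store": every chunk kept alongside its vector.
store = [(chunk, toy_embed(chunk)) for doc in documents for chunk in chunk_text(doc)]

results = search(store, "how do i reset my password", max_results=2)
for score, chunk in results:
    print(f"{score:.3f}  {chunk}")
```

With these sample documents, the password-reset text ranks first and the account-lockout text second, while the unrelated office-hours text is dropped by `max_results=2`.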

More on Embeddings

To visualize embeddings, you can check out TensorFlow's projector tool. Notice that words that are semantically similar appear closer together, whereas words that are less semantically similar appear farther apart.

To learn more about embeddings, check out OpenAI's documentation or this Pinecone article.
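
As a rough numeric illustration of "closer in distance," the sketch below compares hand-made 3-dimensional vectors using cosine similarity. The words and vector values are invented for illustration; real embedding models produce vectors with hundreds or thousands of dimensions, and the distance metric a given vector index uses is a configuration choice.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two dense vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))

# Invented toy vectors: "cat" and "kitten" point in similar directions, "invoice" does not.
vectors = {
    "cat": [0.9, 0.8, 0.1],
    "kitten": [0.85, 0.9, 0.15],
    "invoice": [0.1, 0.2, 0.95],
}

sim_cat_kitten = cosine_similarity(vectors["cat"], vectors["kitten"])
sim_cat_invoice = cosine_similarity(vectors["cat"], vectors["invoice"])

print(f"cat vs kitten:  {sim_cat_kitten:.3f}")
print(f"cat vs invoice: {sim_cat_invoice:.3f}")
```

The related pair scores much higher than the unrelated pair, which is the property a semantic search relies on when ranking chunks against a query.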

Uploading Google Sheets / CSVs to Memory Stores

🚧

Google Sheets and CSVs require a specific configuration

When uploading a Google Sheets file or a CSV file to a Memory Store, each row is loaded as a separate chunk, and you can embed as many columns as you choose.

Requirements for Google Sheets / CSV

Your Google Sheets or CSV document should include an id column that is used as the identifier for each row.

  • id: a unique identifier per row
    • 💡Tip: don't have a unique identifier? Create a unique index by starting with 1 and incrementing (+1) per row

You can also add any other columns (metadata) you want to retrieve for the associated indexed record.
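
If your data has no unique identifier, the incrementing-index tip can be applied with Python's standard `csv` module. This is a minimal sketch: the `content` and `title` columns, the sample rows, and the `blogs.csv` filename mentioned in the comment are assumptions for illustration.

```python
import csv
import io

def add_id_column(rows):
    """Return copies of rows with a sequential id (1, 2, 3, ...) as the first column."""
    return [{"id": index, **row} for index, row in enumerate(rows, start=1)]

rows = [
    {"content": "In this blog, we'll give you a rundown...", "title": "Everything You Need to Know About LLMs"},
    {"content": "You've heard about chatGPT...", "title": "Write Prompts in 5 Easy Steps"},
]
rows_with_ids = add_id_column(rows)

# Written to an in-memory buffer here; to write a file instead, replace the buffer
# with open("blogs.csv", "w", newline="") (hypothetical filename).
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=["id", "content", "title"])
writer.writeheader()
writer.writerows(rows_with_ids)
csv_text = buffer.getvalue()
print(csv_text)
```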

Google Sheets / CSV Example

The following is an example of the columns I could include to embed blog content into a memory store:

  • id: the unique identifier
  • content: the blog content; only this column is embedded and searchable
  • title: the blog title (metadata I can filter on or access later)
  • description: the blog description (metadata I can filter on or access later)
| id | content | title | description |
|----|---------|-------|-------------|
| 1 | In this blog, we'll give you a rundown of everything you need to know about LLMs... | Everything You Need to Know About LLMs | A detailed guide on everything you need to know about LLMs. |
| 2 | You've heard about chatGPT, but have you tried to write your own prompt? In this blog... | Write Prompts in 5 Easy Steps | Write an LLM prompt in 5 simple steps, low |
| 3 | In this blog, we're going to show you how to take your prompting to the next level... | Prompt Engineering 101 | Learn how to elevate your prompting to the next level. |
| 4 | Are you an amateur poker player wondering how to make it as a pro? In this blog... | How to Become a Pro-Poker Player | The skills you need to know to become a pro-poker player. |

If you want to embed more than one column, you can concatenate and combine columns accordingly. For example, I could concatenate the blog title and the blog content to generate a better search result and provide more context to the LLM step:

| id | content | description |
|----|---------|-------------|
| 1 | BLOG TITLE: Everything You Need to Know About LLMs BLOG CONTENT: In this blog, we'll give you a rundown of everything you need to know about LLMs... | A detailed guide on everything you need to know about LLMs. |
| 2 | BLOG TITLE: Write Prompts in 5 Easy Steps BLOG CONTENT: You've heard about chatGPT, but have you tried to write your own prompt? In this blog... | Write an LLM prompt in 5 simple steps, low |
| 3 | BLOG TITLE: Prompt Engineering 101 BLOG CONTENT: In this blog, we're going to show you how to take your prompting to the next level... | Learn how to elevate your prompting to the next level. |
| 4 | BLOG TITLE: How to Become a Pro-Poker Player BLOG CONTENT: Are you an amateur poker player wondering how to make it as a pro? In this blog... | The skills you need to know to become a pro-poker player. |
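
The concatenation shown above can be done before upload with a small helper. This is a sketch under the assumption that your rows are already loaded as dictionaries with `id`, `title`, `content`, and `description` keys; the `combine_columns` name is made up for illustration.

```python
def combine_columns(row):
    """Merge title and content into one embeddable 'content' value; keep description as metadata."""
    combined = f"BLOG TITLE: {row['title']} BLOG CONTENT: {row['content']}"
    return {"id": row["id"], "content": combined, "description": row["description"]}

row = {
    "id": 3,
    "content": "In this blog, we're going to show you how to take your prompting to the next level...",
    "title": "Prompt Engineering 101",
    "description": "Learn how to elevate your prompting to the next level.",
}

combined_row = combine_columns(row)
print(combined_row["content"])
```

Labeling each part ("BLOG TITLE:", "BLOG CONTENT:") keeps the two fields distinguishable once they share a single embedded column.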

Limitations of Google Sheets / CSV Upload

Please note there's a 40KB size limit for the metadata of a row.
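
To catch oversized rows before uploading, you could estimate each row's metadata size. The sketch below rests on several assumptions that the limit above does not spell out: that size is measured as the UTF-8 byte length of the JSON-serialized metadata, that 40KB means 40 * 1024 bytes, and that the `id` and embedded `content` columns do not count toward the limit. Treat the result as an approximation.

```python
import json

METADATA_LIMIT_BYTES = 40 * 1024  # assumption: 40KB interpreted as 40 * 1024 bytes

def metadata_size_bytes(row, embedded_column="content"):
    """Approximate metadata size: UTF-8 bytes of all columns except id and the embedded column."""
    metadata = {key: value for key, value in row.items() if key not in ("id", embedded_column)}
    return len(json.dumps(metadata, ensure_ascii=False).encode("utf-8"))

def oversized_rows(rows):
    """Return the ids of rows whose estimated metadata size exceeds the limit."""
    return [row["id"] for row in rows if metadata_size_bytes(row) > METADATA_LIMIT_BYTES]

rows = [
    {"id": 1, "content": "short post", "title": "Small", "description": "fits easily"},
    {"id": 2, "content": "short post", "title": "Large", "description": "x" * 50_000},
]

flagged = oversized_rows(rows)
print("Rows over the limit:", flagged)
```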

Memory Store Step Configuration

  1. Add a "Memory Store" step in order to query the memory store
  2. "Select a Memory Store": select the Memory Store that you created
  3. "Max Results": the maximum number of "chunks" or results you want to retrieve from your memory store
  4. "Filters" (Optional): Add a filter based on the value of your metadata - for example, if you have
  5. "Query": add the variable that passes the text you want to query. This can be an input or the output of a previous step; for example, you could pass {{my_input}} into the query.

Memory Store Step Output

[
  {
    "id": "vsdi:87:rid:1:cid:0",
    "score": 0.78703,
    "content": "Blog Title: How to Become a Prompt Engineer\nBlog Content: In this blog, we're going to show you how to take your prompting...",
    "document_name": "AirOps Knowledge Base",
    "metadata": {
      "__chunk_id": 0,
      "__record_id": "1",
      "__vector_store_document_id": 87,
      "title": "How to Become a Prompt Engineer",
      "description": "Learn how to elevate your prompting to the next level."
    }
  },
  {
    ...
  }
]
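
A downstream step will often keep only the sufficiently relevant chunks and join them into context for an LLM prompt. The sketch below assumes the output shape shown above; the `build_context` helper, the 0.75 score threshold, and the second, low-scoring result are made up for illustration.

```python
step_output = [
    {
        "id": "vsdi:87:rid:1:cid:0",
        "score": 0.78703,
        "content": "Blog Title: How to Become a Prompt Engineer\nBlog Content: In this blog...",
        "document_name": "AirOps Knowledge Base",
        "metadata": {"__record_id": "1", "title": "How to Become a Prompt Engineer"},
    },
    {
        # Hypothetical low-scoring result, included only to exercise the filter.
        "id": "example-low-score",
        "score": 0.41,
        "content": "Blog Title: How to Become a Pro-Poker Player\nBlog Content: ...",
        "document_name": "AirOps Knowledge Base",
        "metadata": {"__record_id": "4", "title": "How to Become a Pro-Poker Player"},
    },
]

def build_context(results, min_score=0.75):
    """Keep results at or above min_score, best first, and join them into one text block."""
    relevant = [r for r in results if r["score"] >= min_score]
    relevant.sort(key=lambda r: r["score"], reverse=True)
    return "\n\n".join(f"[{r['metadata']['title']}]\n{r['content']}" for r in relevant)

context = build_context(step_output)
print(context)
```

A suitable threshold depends on your content and embedding model, so it is worth checking a few real queries before fixing a value.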

Memory Stores Walkthrough