Memory Search

Semantically search a Memory Store

What is a Memory Search Step?

The Memory Search step allows you to semantically search your data in order to power customized content or recommendations to users.

For more information on setting up a Memory Store, see our documentation page here.

How to Configure a Memory Search Step

Select a Memory Store

Select the name of your Memory Store from the dropdown.

Max Results

Choose the number of Memory Store results to return.

Your content is chunked into segments of roughly 1000 tokens, so each result will be ~1000 tokens long.

If you're finding that your LLM isn't returning the exact results you're looking for, we recommend increasing the number of results and asking the LLM to synthesize them.

Filter the Memory Store (Optional)

Memory Stores have two different filter modes. The first is a Visual Editor, where you can filter data based on a specific metadata field.

For example, if you wanted to filter a client column on an input client_name, you would pass the following:
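A sketch of what that filter expresses, written as a MongoDB-style filter object (the client metadata field and the literal client value shown here are illustrative assumptions):

```javascript
// Hypothetical filter: match records whose "client" metadata field equals
// the value of the client_name input (shown here as a literal string).
const filter = { client: { $eq: "Acme Corp" } };
```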

For more complex use cases, you can use the Code Editor which uses MongoDB's query and projection operators. Here are some common operators:

  • $eq - Equal to (for numbers, strings, booleans)

  • $ne - Not equal to (for numbers, strings, booleans)

  • $gt - Greater than (for numbers)

  • $gte - Greater than or equal to (for numbers)

  • $lt - Less than (for numbers)

  • $lte - Less than or equal to (for numbers)

  • $in - In array (for strings or numbers)

  • $nin - Not in array (for strings or numbers)
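As a sketch, a Code Editor filter combining several of these operators might look like the following (the country and year metadata fields are hypothetical):

```javascript
// Hypothetical combined filter: keep chunks whose "country" metadata is in
// the given list AND whose "year" metadata is 2000 or later.
const filter = {
  country: { $in: ["France", "Germany"] },
  year: { $gte: 2000 }
};
```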

Let's walk through a quick example of how you can filter on metadata in your Workflow using our "Q&A" Template. In this specific example, we have already populated our Memory Store with multiple countries' Constitutions, along with metadata containing a "country" field to filter on.

In order for filtering to be effective, we recommend reading through our "Memory Stores Metadata" document to ensure you're providing optimal metadata fields to filter on.

Query

The phrase used to search your memory store for semantically similar text.

As a best practice, we recommend that you use a liquid variable for your query. Because your applications are meant to adapt to the inputs you provide, it typically isn't best to hard-code your query to a specific value or question.

Because the Memory Store returns the most semantically similar results, there will be times when a longer string of descriptive text is more likely to return your desired results than a short, targeted phrase. We encourage you to test your query prompt to ensure it best fits your needs.

Memory Search Step Output

The Memory Search step will output a list of the most semantically similar chunks from the memory store.

  • ID: a unique ID based on the metadata Chunk ID, Record ID, and Vector Store Document ID

  • Score: the semantic similarity score of the result (higher is more similar)

  • Content: the text (or columns) of your document that are searchable

  • Document Name: name of the document you provided

  • Metadata: any columns you provided as metadata

[
  {
    "id": "vsdi:87:rid:1:cid:0",
    "score": 0.78703,
    "content": "Blog Title: How to Become a Prompt Engineer\nBlog Content: In this blog, we're going to show you how to take your prompting...",
    "document_name": "AirOps Knowledge Base",
    "metadata": {
      "__chunk_id": 0,
      "__record_id": "1",
      "__vector_store_document_id": 87,
      "title": "How to Become a Prompt Engineer",
      "description": "Learn how to elevate your prompting to the next level."
    }
  },
  {
    ...
  }
]

Formatting Your Output for LLM Steps

The output of the Memory Search Step is a JSON blob as shown above. However, because you will often pass this output into an LLM, it is much more effective if you format the results in natural language.

Formatting with Code Step

One of the best ways to handle this is by using our Code Step in conjunction with Liquid syntax. Our "Q&A" Template shows a great example of how you can achieve this:

Our Code Step utilizes Javascript to format the JSON Blob into a more natural format for our LLM Step to interpret:

return step_1.output
  .map(item => `Document Name: ${item.document_name}\nContent: ${item.content}\nConfidence Score: ${item.score}`)
  .join('\n\n');
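To see what that produces, here is a standalone sketch of the same transformation run against a single result shaped like the sample output above (the step_1 reference is replaced with a local array):

```javascript
// Standalone sketch of the Code Step above, run against one sample result
// shaped like the Memory Search output shown earlier.
const output = [
  {
    document_name: "AirOps Knowledge Base",
    content: "Blog Title: How to Become a Prompt Engineer...",
    score: 0.78703
  }
];

const formatted = output
  .map(item => `Document Name: ${item.document_name}\nContent: ${item.content}\nConfidence Score: ${item.score}`)
  .join("\n\n");

console.log(formatted);
// Prints:
// Document Name: AirOps Knowledge Base
// Content: Blog Title: How to Become a Prompt Engineer...
// Confidence Score: 0.78703
```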

Formatting with Liquid

To pass context and formatted data to the LLM step, you can also use Liquid to format your Memory Search Step results.

Building on the example output above, we iterate over each chunk and return its metadata.title and metadata.description:

Replace step_x with the name of your Memory Search step:

{% for chunk in step_x.output %} 
    Title: {{chunk.metadata.title}}
    Description: {{chunk.metadata.description}}
{% endfor %}
