.search() enables you to uncover the most pertinent context by performing a semantic search across your data sources based on a given query. Refer to the function signature below:

Parameters

query
str

Question

num_documents
int

Number of relevant documents to fetch. Defaults to 3

where
dict

Key value pair for metadata filtering.

raw_filter
dict

Pass raw filter query based on your vector database. Currently, raw_filter param is only supported for Pinecone vector database.

Returns

answer
dict

Return list of dictionaries that contain the relevant chunk and their source information.

Usage

Basic

Refer to the following example on how to use the search api:

Code example
from embedchain import App

app = App()
app.add("https://www.forbes.com/profile/elon-musk")

context = app.search("What is the net worth of Elon?", num_documents=2)
print(context)

Advanced

Metadata filtering using where params

Here is an advanced example of search() API with metadata filtering on pinecone database:

import os

from embedchain import App

os.environ["PINECONE_API_KEY"] = "xxx"

config = {
    "vectordb": {
        "provider": "pinecone",
        "config": {
            "metric": "dotproduct",
            "vector_dimension": 1536,
            "index_name": "ec-test",
            "serverless_config": {"cloud": "aws", "region": "us-west-2"},
        },
    }
}

app = App.from_config(config=config)

app.add("https://www.forbes.com/profile/bill-gates", metadata={"type": "forbes", "person": "gates"})
app.add("https://en.wikipedia.org/wiki/Bill_Gates", metadata={"type": "wiki", "person": "gates"})

results = app.search("What is the net worth of Bill Gates?", where={"person": "gates"})
print("Num of search results: ", len(results))

Metadata filtering using raw_filter params

Following is an example of metadata filtering by passing the raw filter query that pinecone vector database follows:

import os

from embedchain import App

os.environ["PINECONE_API_KEY"] = "xxx"

config = {
    "vectordb": {
        "provider": "pinecone",
        "config": {
            "metric": "dotproduct",
            "vector_dimension": 1536,
            "index_name": "ec-test",
            "serverless_config": {"cloud": "aws", "region": "us-west-2"},
        },
    }
}

app = App.from_config(config=config)

app.add("https://www.forbes.com/profile/bill-gates", metadata={"year": 2022, "person": "gates"})
app.add("https://en.wikipedia.org/wiki/Bill_Gates", metadata={"year": 2024, "person": "gates"})

print("Filter with person: gates and year > 2023")
raw_filter = {"$and": [{"person": "gates"}, {"year": {"$gt": 2023}}]}
results = app.search("What is the net worth of Bill Gates?", raw_filter=raw_filter)
print("Num of search results: ", len(results))