The evaluate() method evaluates the performance of a RAG app. Its signature is described below:

Parameters

question
Union[str, list[str]]

A question or a list of questions to evaluate your app on.

metrics
Optional[list[Union[BaseMetric, str]]]

The metrics to evaluate your app on. Defaults to all metrics: ["context_relevancy", "answer_relevancy", "groundedness"]

num_workers
int

The number of threads to use for parallel processing.

Returns

metrics
dict

A dictionary mapping each metric you chose to evaluate your app on to its score.
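
Because the return value is a plain dictionary, it can be post-processed with ordinary Python. The sketch below flags metrics that fall under a cutoff. The helper name `low_scoring_metrics` and the 0.5 threshold are illustrative assumptions, not part of embedchain, and the scores are copied from the sample output in the Usage section rather than produced by a live call.

```python
def low_scoring_metrics(scores: dict[str, float], threshold: float = 0.5) -> list[str]:
    """Return the names of metrics scoring below `threshold`, sorted by name.

    Hypothetical helper for illustration; embedchain does not provide it.
    """
    return sorted(name for name, score in scores.items() if score < threshold)


# Scores copied from this page's sample output; no evaluation is run here.
sample = {
    "answer_relevancy": 0.958019958036268,
    "context_relevancy": 0.12903225806451613,
}

print(low_scoring_metrics(sample))  # prints ['context_relevancy']
```

The threshold is arbitrary here; a suitable cutoff depends on the metric and the data you evaluate on.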

Usage

from embedchain import App

app = App()

# add data source
app.add("https://www.forbes.com/profile/elon-musk")

# run evaluation
app.evaluate("what is the net worth of Elon Musk?")
# {'answer_relevancy': 0.958019958036268, 'context_relevancy': 0.12903225806451613}

# or
# app.evaluate(["what is the net worth of Elon Musk?", "which companies does Elon Musk own?"])
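
The optional parameters listed above can be combined in one call. The following is a minimal sketch, assuming `metrics` and `num_workers` are accepted as keyword arguments under the names shown in the Parameters section. The wrapper `evaluate_selected` and its argument names are hypothetical and exist only to keep the example self-contained.

```python
from typing import Any


def evaluate_selected(
    app: Any,
    questions: list[str],
    metric_names: list[str],
    workers: int = 2,
) -> dict:
    """Evaluate `app` on `questions` using only the named metrics.

    Hypothetical wrapper: it simply forwards to app.evaluate(), passing the
    questions positionally and the documented `metrics` and `num_workers`
    parameters by keyword.
    """
    return app.evaluate(questions, metrics=metric_names, num_workers=workers)
```

With the app from the example above, calling evaluate_selected(app, ["what is the net worth of Elon Musk?"], ["answer_relevancy"]) would be expected to return a dictionary containing only the answer_relevancy score.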