Introduced an Evals Orb to orchestrate LLM evaluations

The official CircleCI Evals Orb makes it easy to integrate LLM evaluations into a CI pipeline and to review evaluation results without context switching. The output of evaluations run through the Evals Orb is stored in CircleCI and is accessible both as a job artifact and as a PR comment that CircleCI adds automatically.

Currently, the Evals Orb exposes commands to run evaluations through two popular LLMOps tools: LangSmith and Braintrust. If your evals use a different tool, let us know at ai-feedback@circleci.com. You can also contribute directly to the official Orb by opening a PR on the public repository.
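To give a feel for the shape of the integration, here is a minimal sketch of a `.circleci/config.yml` that wires an evals Orb into a pipeline. The overall structure (the `orbs`, `jobs`, and `workflows` stanzas) is standard CircleCI config, but the Orb version, command name, and parameter names below are illustrative placeholders, not the Orb's documented interface; consult the Orb's page on the CircleCI Orb Registry for its actual commands and parameters.

```yaml
version: 2.1

orbs:
  # Orb version is a placeholder; check the Orb Registry for the
  # latest published version of circleci/evals.
  evals: circleci/evals@x.y

jobs:
  run-evals:
    docker:
      - image: cimg/python:3.12
    steps:
      - checkout
      # Hypothetical command and parameters -- the real Orb's interface
      # may differ. The idea: point the Orb at your eval entrypoint and
      # tell it which LLMOps platform (LangSmith or Braintrust) to use.
      - evals/eval:
          eval_platform: langsmith
          cmd: python run_evals.py

workflows:
  evaluate:
    jobs:
      - run-evals
```

After a run, the results would surface where the Orb stores them: as a job artifact in the CircleCI UI and as an automatically added comment on the associated PR.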

More resources on evaluating LLM-enabled applications are available in our documentation.
