Braintrust Weekly Update

Ankur Goyal · Founder

09 October 2023

It’s been a busy week for us at Braintrust. Here’s some of the new features we shipped this week:

All experiment loading HTTP requests are 100-200ms faster
We released a new tutorial: finetune GPT3.5 to write SQL queries (opens in a new tab)

You can easily finetune GPT3.5 to generate SQL queries using OpenAI and then evaluate how the fine tuned model compares to the base model using Braintrust. Check out the Jupyter Notebook example here (opens in a new tab) to get started.

We evaluated the Alpaca evals leaderboard in Braintrust

The Alpaca evals use Claude and GPT4 to rank how different LLMs perform on a variety of tasks. You can see the aggregated rankings and also dig into individual models and better understand their strengths and weaknesses. Check out the Alpaca Evals braintrust project (opens in a new tab) on Braintrust to dig in further—no login required.

We improved Datasets. See when they were last edited and the version number from the UI.

Easily see when a dataset was last changed from the UI by hovering over the ID. We also provide example code so you can quickly use the current dataset version in your project. Learn more on our datasets guide (opens in a new tab).

Release notes

All experiment loading HTTP requests are 100-200ms faster
The prompt playground now supports autocomplete
Dataset versions are now displayed on the datasets page
Projects in the summary page are now sorted alphabetically
Long text fields in logged data can be expanded into scrollable blocks (opens in a new tab)

Braintrust is the enterprise-grade stack for building AI products. From evaluations, to prompt playground, to data management, we take uncertainty and tedium out of incorporating AI into your business.

Weekly update 10/16/23 It's time to build reliable AI