Class: Dataset

A dataset is a collection of records, such as model inputs and outputs, that you can use to evaluate and fine-tune models. You can log production data to datasets, curate them with interesting examples, edit or delete records, and run evaluations against them.

You should not create Dataset objects directly. Instead, use the braintrust.initDataset() method.

Constructors

constructor

new Dataset(project, id, name, pinnedVersion?)

Parameters

Name | Type
project | RegisteredProject
id | string
name | string
pinnedVersion? | string

Methods

[asyncIterator]

[asyncIterator](): AsyncGenerator<DatasetRecord, any, unknown>

Fetch all records in the dataset.

Returns

AsyncGenerator<DatasetRecord, any, unknown>

Example

// Use an async iterator to fetch all records in the dataset.
for await (const record of dataset) {
  console.log(record);
}

clearCache

clearCache(): void

Returns

void


close

close(): Promise<string>

Terminate connection to the dataset and return its id. After calling close, you may not invoke any further methods on the dataset object.

Will be invoked automatically if the dataset is bound as a context manager.

Returns

Promise<string>

The dataset id.
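
Because no further methods may be called after close, it can help to tie the call to the end of a unit of work. The following is a minimal sketch, not part of the SDK: `withClosable` and `FakeDataset` are hypothetical names, and the helper relies only on the `close(): Promise<string>` signature documented above. It uses try/finally so that close runs whether or not the work succeeds.

```typescript
// Minimal structural type covering only the method this helper needs.
interface Closable {
  close(): Promise<string>;
}

// Run `work` against a dataset-like resource and always close it afterwards,
// even if `work` throws. (Hypothetical helper, not an SDK function.)
async function withClosable<D extends Closable, R>(
  resource: D,
  work: (resource: D) => Promise<R>,
): Promise<R> {
  try {
    return await work(resource);
  } finally {
    await resource.close();
  }
}

// Hypothetical fake used only to demonstrate the helper.
class FakeDataset implements Closable {
  closed = false;

  async close(): Promise<string> {
    this.closed = true;
    return "fake-dataset-id";
  }
}
```

With a real dataset object, the same helper could be called as `await withClosable(dataset, async (d) => { /* insert or fetch here */ })`, assuming the object exposes close as documented.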


delete

delete(id): string

Parameters

Name | Type
id | string

Returns

string


fetch

fetch(): AsyncGenerator<DatasetRecord, any, unknown>

Fetch all records in the dataset.

Returns

AsyncGenerator<DatasetRecord, any, unknown>

An iterator over the dataset's records.

Example

// Use an async iterator to fetch all records in the dataset.
for await (const record of dataset.fetch()) {
  console.log(record);
}

// You can also iterate over the dataset directly.
for await (const record of dataset) {
  console.log(record);
}
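
When only a sample of the records is needed, the async generator can be consumed partially instead of being drained. The sketch below is illustrative rather than part of the SDK: `collectRecords` and `fakeRecords` are hypothetical names, and the helper accepts any `AsyncIterable`, which is the only property of the dataset it assumes.

```typescript
// Collect up to `limit` items from any async-iterable source into an array.
// (Hypothetical helper, not an SDK function.)
async function collectRecords<T>(
  source: AsyncIterable<T>,
  limit: number = Infinity,
): Promise<T[]> {
  const records: T[] = [];
  if (limit <= 0) return records;
  for await (const record of source) {
    records.push(record);
    // Breaking out of for-await ends iteration early.
    if (records.length >= limit) break;
  }
  return records;
}

// Hypothetical in-memory source used only to exercise the helper.
async function* fakeRecords(): AsyncGenerator<{ id: string; input: number }> {
  for (let i = 0; i < 5; i++) {
    yield { id: `rec-${i}`, input: i };
  }
}
```

Given the signatures above, a call such as `await collectRecords(dataset.fetch(), 10)` or `await collectRecords(dataset, 10)` should work the same way, since both forms are async-iterable.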

fetchedData

fetchedData(): Promise<any[]>

Returns

Promise<any[]>


insert

insert(event): string

Insert a single record into the dataset. The record will be batched and uploaded behind the scenes. If you pass in an id, and a record with that id already exists, it will be overwritten (upsert).

Parameters

Name | Type | Description
event | Object | The event to log.
event.id? | string | (Optional) A unique identifier for the event. If you don't provide one, Braintrust will generate one for you.
event.input? | unknown | The argument that uniquely defines an input case (an arbitrary, JSON-serializable object).
event.metadata? | Record<string, unknown> | (Optional) A dictionary with additional data about the test example, model outputs, or anything else that's relevant and that you can use to find and analyze examples later. For example, you could log the prompt, the example's id, or anything else that would be useful to slice and dice later. The values in metadata can be any JSON-serializable type, but its keys must be strings.
event.output | unknown | The output of your application, including post-processing (an arbitrary, JSON-serializable object).

Returns

string

The id of the logged record.
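
The upsert rule can be illustrated with a small in-memory model. This is a sketch of the documented semantics only, not the SDK's implementation: `InMemoryDataset`, its id-generation scheme, and the `get` and `size` accessors are all hypothetical, and nothing here batches or uploads data. The `InsertEvent` shape follows the parameter table for insert.

```typescript
// Event shape following the parameter table for insert.
interface InsertEvent {
  id?: string;
  input?: unknown;
  output: unknown;
  metadata?: Record<string, unknown>;
}

// Hypothetical in-memory model of the documented upsert behaviour.
// It does not batch or upload anything.
class InMemoryDataset {
  private records = new Map<string, InsertEvent & { id: string }>();
  private nextId = 0;

  insert(event: InsertEvent): string {
    // Generate an id when the caller does not supply one (scheme is made up).
    const id = event.id ?? `generated-${this.nextId++}`;
    // Map.set overwrites an existing entry with the same id: an upsert.
    this.records.set(id, { ...event, id });
    return id;
  }

  get(id: string): (InsertEvent & { id: string }) | undefined {
    return this.records.get(id);
  }

  get size(): number {
    return this.records.size;
  }
}

const ds = new InMemoryDataset();
const autoId = ds.insert({ input: { question: "2 + 2" }, output: "4" });
ds.insert({ id: "greeting", input: "hi", output: "hello" });
// Same id as the previous call, so the earlier record is replaced.
ds.insert({
  id: "greeting",
  input: "hi",
  output: "hello there",
  metadata: { reviewed: true },
});
```

After these three calls the model holds two records, and the one stored under `"greeting"` carries the later output and metadata.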


summarize

summarize(options?): Promise<DatasetSummary>

Summarize the dataset, including high level metrics about its size and other metadata.

Parameters

Name | Type
options | Object
options.summarizeData? | boolean

Returns

Promise<DatasetSummary>

DatasetSummary

A summary of the dataset.


version

version(): Promise<any>

Returns

Promise<any>

Properties

id

Readonly id: string


name

Readonly name: string


project

Readonly project: RegisteredProject