# Introduction to Laminar datasets

## Concept

A dataset is a collection of datapoints. It can be used to:

1. Store data for future fine-tuning or prompt-tuning.
2. Provide inputs and expected outputs for [Evaluations](/evaluations/introduction).

## Format

Every datapoint consists of the following JSON objects, each with arbitrary keys:

* `data` – the actual datapoint data.
* `target` – additional data sent to the evaluator function. It is only used in evaluations.
* `metadata` – arbitrary key-value metadata about the datapoint.

For every key inside `data` and `target`, the value can be any JSON value.

### Example

This is an example of a valid datapoint.

```json theme={null}
{
    "data": {
        "color": "red",
        "size": "large",
        "messages": [
            {
                "role": "user",
                "content": "Hello, can you help me choose a T-shirt?"
            },
            {
                "role": "assistant",
                "content": "I'm afraid we don't sell T-shirts."
            }
        ]
    },
    "target": {
        "expected_output": "Of course! What size and color are you looking for?"
    }
}
```
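The format above can be checked locally before a datapoint is stored. The following is a minimal sketch using only the Python standard library. `validate_datapoint` is a hypothetical helper and not part of any Laminar SDK, and treating `target` and `metadata` as optional is an assumption made for illustration.

```python
import json


def validate_datapoint(datapoint: dict) -> dict:
    """Return the datapoint unchanged if it matches the format described above.

    Hypothetical helper for illustration; not a Laminar API.
    """
    if not isinstance(datapoint.get("data"), dict):
        raise ValueError("`data` must be a JSON object")
    for field in ("target", "metadata"):
        # Assumption: these two fields may be omitted.
        if field in datapoint and not isinstance(datapoint[field], dict):
            raise ValueError(f"`{field}` must be a JSON object")
    # json.dumps raises TypeError if any nested value is not JSON-serializable.
    json.dumps(datapoint)
    return datapoint


datapoint = validate_datapoint({
    "data": {"color": "red", "size": "large"},
    "target": {"expected_output": "Of course! What size and color are you looking for?"},
    "metadata": {"source": "manual"},  # made-up metadata key
})
```

A datapoint whose `data` is a list or a string, rather than an object, raises `ValueError` in this sketch.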

## Editing

Datasets are editable. To edit a datapoint, click on it and modify its JSON.
Each change is saved as a new datapoint version.

### Versioning

Each datapoint has a unique id and a `created_at` timestamp. Every time you
edit a datapoint, a new version is created under the hood with the same id
and an updated `created_at` timestamp.

The version stack is push-only: existing versions are never modified or removed.
When you revert to a previous version, a copy of that version is created and
pushed onto the stack as the current version.

Example:

* Initial version (v1):

```json theme={null}
{
  "id": "019a3122-ca78-7d75-91a7-a860526895b2",
  "created_at": "2025-01-01T00:00:00.000Z",
  "data": { "key": "initial value" }
}
```

* Version 2 (v2):

```json theme={null}
{
  "id": "019a3122-ca78-7d75-91a7-a860526895b2",
  "created_at": "2025-01-05T00:00:05.000Z",
  "data": { "key": "value at v2" }
}
```

* Version 3 (v3):

```json theme={null}
{
  "id": "019a3122-ca78-7d75-91a7-a860526895b2",
  "created_at": "2025-01-10T00:00:10.000Z",
  "data": { "key": "value at v3" }
}
```

Suppose you now revert to version 1 (the initial version). This creates a new version (v4) with the same id and the same data as v1, but with an updated `created_at` timestamp.

* Version 4 (v4):

```json theme={null}
{
  "id": "019a3122-ca78-7d75-91a7-a860526895b2",
  "created_at": "2025-01-15T00:00:15.000Z",
  "data": { "key": "initial value" }
}
```
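The push-only behavior in this example can be modeled in a few lines of Python. This is a sketch of the concept only: the `Version` and `VersionStack` classes are hypothetical names chosen for illustration and do not reflect how Laminar stores versions internally.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)
class Version:
    id: str
    created_at: datetime
    data: dict


@dataclass
class VersionStack:
    """Hypothetical model of a push-only version history for one datapoint."""

    id: str
    versions: list = field(default_factory=list)

    def push(self, data: dict, created_at: datetime) -> Version:
        # Every edit appends a new version that shares the datapoint id.
        version = Version(self.id, created_at, dict(data))
        self.versions.append(version)
        return version

    def revert(self, version_number: int, created_at: datetime) -> Version:
        # Versions are 1-indexed (v1, v2, ...). Nothing is popped or rewritten:
        # the old data is copied and pushed as the new current version.
        old = self.versions[version_number - 1]
        return self.push(old.data, created_at)

    @property
    def current(self) -> Version:
        return self.versions[-1]


def utc(day: int, second: int) -> datetime:
    return datetime(2025, 1, day, 0, 0, second, tzinfo=timezone.utc)


stack = VersionStack("019a3122-ca78-7d75-91a7-a860526895b2")
stack.push({"key": "initial value"}, utc(1, 0))   # v1
stack.push({"key": "value at v2"}, utc(5, 5))     # v2
stack.push({"key": "value at v3"}, utc(10, 10))   # v3
stack.revert(1, utc(15, 15))                      # v4, a copy of v1
```

After the revert, the stack holds four versions. The current one carries the data from v1 and a newer `created_at`, while v1 through v3 remain in the history.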

### Datapoint id

When you push a new datapoint to a dataset, a UUIDv7 is generated for it.
Because UUIDv7 values begin with a timestamp, sorting datapoints by id preserves their insertion order.
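To illustrate why the ids sort by creation time, the sketch below reads the millisecond timestamp stored in the first 48 bits of a UUIDv7, using only Python's standard `uuid` module. `uuid7_timestamp_ms` is a hypothetical helper, and two of the three ids are made up for the example.

```python
import uuid


def uuid7_timestamp_ms(value: str) -> int:
    """Return the Unix timestamp in milliseconds embedded in a UUIDv7."""
    parsed = uuid.UUID(value)
    # Bits 76-79 of the 128-bit integer hold the version number.
    if (parsed.int >> 76) & 0xF != 7:
        raise ValueError(f"not a UUIDv7: {value}")
    # The top 48 bits hold the millisecond timestamp.
    return parsed.int >> 80


ids = [
    "019a3123-0000-7000-8000-000000000002",  # made up for illustration
    "019a3122-ca78-7d75-91a7-a860526895b2",  # id from the versioning example
    "019a3122-ca79-7000-8000-000000000001",  # made up for illustration
]

# Plain string sorting of the ids matches sorting by embedded timestamp.
ordered = sorted(ids)
by_timestamp = sorted(ids, key=uuid7_timestamp_ms)
```

Here `ordered` and `by_timestamp` are the same list. For ids generated within the same millisecond, the relative order depends on how the generator fills the remaining bits.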
