API Reference

This page documents the main Python SDK methods and objects for working with datasets.

Dataset methods

list_sync

basalt.datasets.list_sync()
Lists all datasets accessible to your API key.
Returns: A list of Dataset objects (without rows). Each Dataset object includes:
  • slug (str): Unique identifier for the dataset.
  • name (str): Human-readable name.
  • description (str): Description of the dataset.
  • num_rows (int): Number of rows in the dataset.
  • columns (list[DatasetColumn]): List of column definitions.
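A minimal sketch of reading the metadata that list_sync returns. The helper format_dataset_summary is hypothetical (not part of the SDK) and relies only on the fields listed above. The SimpleNamespace objects are stand-ins shaped like those fields so the snippet runs without an API key; with a real client you would pass the result of basalt.datasets.list_sync() instead.

```python
from types import SimpleNamespace


def format_dataset_summary(datasets):
    """Return one summary line per dataset: slug, name, row count, column names."""
    lines = []
    for ds in datasets:
        column_names = ", ".join(col.name for col in ds.columns)
        lines.append(
            f"{ds.slug}: {ds.name} ({ds.num_rows} rows; columns: {column_names})"
        )
    return lines


# Stand-in data mimicking the documented fields (assumption: attribute access).
fake_datasets = [
    SimpleNamespace(
        slug="qa-set",
        name="QA set",
        description="Question answering examples",
        num_rows=2,
        columns=[SimpleNamespace(name="question", type="string", description="")],
    )
]

summary = format_dataset_summary(fake_datasets)
print("\n".join(summary))
```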

list (async)

await basalt.datasets.list()
Async variant of list_sync, suitable for use in async applications.
Returns: Same as list_sync, a list of Dataset objects (without rows).
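The async variant can be awaited from any coroutine. The sketch below assumes nothing beyond the documented list() call; FakeDatasets is a hypothetical stand-in for basalt.datasets so the example runs offline, and total_rows is an illustrative helper, not an SDK function.

```python
import asyncio
from types import SimpleNamespace


class FakeDatasets:
    """Stand-in for basalt.datasets; mimics only the documented async list()."""

    async def list(self):
        return [
            SimpleNamespace(slug="a", num_rows=3),
            SimpleNamespace(slug="b", num_rows=5),
        ]


async def total_rows(datasets_client):
    """Sum num_rows over every dataset visible to the client."""
    datasets = await datasets_client.list()
    return sum(ds.num_rows for ds in datasets)


total = asyncio.run(total_rows(FakeDatasets()))
print(total)
```

With a real client, total_rows(basalt.datasets) would be awaited from your own coroutine.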

get_sync

basalt.datasets.get_sync(slug: str)
Synchronously retrieves a dataset by slug, including its rows and columns. Parameters:
  • slug (str, required): Dataset identifier.
Returns: A Dataset object (see below).
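As an illustration of working with the returned Dataset, the sketch below collects the rows that have no ideal_output yet. rows_missing_ideal_output is a hypothetical helper, and the SimpleNamespace objects stand in for what get_sync is documented to return, so the snippet runs without network access.

```python
from types import SimpleNamespace


def rows_missing_ideal_output(dataset):
    """Return the names of rows whose ideal_output is None or empty."""
    return [row.name for row in dataset.rows if not row.ideal_output]


# Stand-in for: dataset = basalt.datasets.get_sync("my-dataset")
dataset = SimpleNamespace(
    slug="my-dataset",
    rows=[
        SimpleNamespace(
            name="row-1", values={"question": "2+2?"}, ideal_output="4", metadata=None
        ),
        SimpleNamespace(
            name="row-2",
            values={"question": "Capital of France?"},
            ideal_output=None,
            metadata=None,
        ),
    ],
)

missing = rows_missing_ideal_output(dataset)
print(missing)
```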

get (async)

await basalt.datasets.get(slug: str)
Async variant of get_sync, suitable for use in async applications. Parameters:
  • slug (str, required): Dataset identifier.
Returns: A Dataset object with all rows included.

add_row_sync

basalt.datasets.add_row_sync(
    slug: str,
    values: dict,
    name: str | None = None,
    ideal_output: str | None = None,
    metadata: dict | None = None,
)
Adds a single row to a dataset. Parameters:
  • slug (str, required): Dataset identifier.
  • values (dict, required): Column name → value mapping. Values can be strings, numbers, or FileAttachment objects for file uploads.
  • name (str, optional): Row identifier; auto-generated if omitted.
  • ideal_output (str, optional): Expected output for evaluation.
  • metadata (dict, optional): Arbitrary metadata for this row.
Returns: A DatasetRow object representing the newly created row.
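Because values is keyed by column name, a misspelled key is easy to introduce. The sketch below checks keys against the dataset schema before calling add_row_sync. unknown_columns is a hypothetical helper; whether the API itself rejects unknown keys is not stated here, so treat this as an optional client-side check.

```python
from types import SimpleNamespace


def unknown_columns(dataset, values):
    """Return the keys in `values` that match no column name in the dataset."""
    known = {col.name for col in dataset.columns}
    return sorted(key for key in values if key not in known)


# Stand-in for a Dataset returned by get_sync (only `columns` is needed here).
dataset = SimpleNamespace(
    columns=[
        SimpleNamespace(name="question", type="string", description=""),
        SimpleNamespace(name="context", type="string", description=""),
    ]
)

values = {"question": "What is Basalt?", "contxt": "misspelled key"}
problems = unknown_columns(dataset, values)
print(problems)
```

When the returned list is empty, the row can be passed on, for example basalt.datasets.add_row_sync(slug="my-dataset", values=values).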

add_row (async)

await basalt.datasets.add_row(
    slug: str,
    values: dict,
    name: str | None = None,
    ideal_output: str | None = None,
    metadata: dict | None = None,
)
Async variant of add_row_sync, for use in async workflows. Takes the same parameters as add_row_sync.
Returns: Same as add_row_sync, a DatasetRow object.
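One reason to prefer the async variant is adding several rows concurrently. The sketch below uses asyncio.gather for that. FakeDatasets is a hypothetical stand-in that mimics only the documented add_row signature, and add_rows is an illustrative helper. Any rate limits on the real API are not covered here, so a real workload may need to cap concurrency.

```python
import asyncio
from types import SimpleNamespace


class FakeDatasets:
    """Stand-in for basalt.datasets; mimics only the documented add_row signature."""

    async def add_row(self, slug, values, name=None, ideal_output=None, metadata=None):
        await asyncio.sleep(0)  # yield control, as a real network call would
        return SimpleNamespace(
            name=name or "auto",
            values=values,
            ideal_output=ideal_output,
            metadata=metadata,
        )


async def add_rows(datasets_client, slug, rows):
    """Add several rows concurrently; results come back in input order."""
    tasks = [datasets_client.add_row(slug=slug, **row) for row in rows]
    return await asyncio.gather(*tasks)


rows = [
    {"values": {"question": "2+2?"}, "ideal_output": "4"},
    {"values": {"question": "3+3?"}, "ideal_output": "6", "name": "sum-3"},
]
created = asyncio.run(add_rows(FakeDatasets(), "my-dataset", rows))
print([row.name for row in created])
```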

Dataset objects

Dataset

Represents a complete dataset, including schema and rows.
  • slug (str): Unique identifier.
  • name (str): Human-readable name.
  • description (str): Description of the dataset.
  • num_rows (int): Number of rows.
  • columns (list[DatasetColumn]): List of column definitions.
  • rows (list[DatasetRow]): List of rows in the dataset.

DatasetColumn

Describes a single column in the dataset schema.
  • name (str): Column name.
  • type (str): Data type (for example "string", "number").
  • description (str): Column description.

DatasetRow

Represents a single test case / row.
  • name (str): Row identifier.
  • values (dict): Column name → value mapping.
  • ideal_output (str | None): Expected output, when available.
  • metadata (dict | None): Additional metadata for this row.
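To show how the three objects fit together, the sketch below flattens a Dataset into JSON. dataset_to_json and row_to_dict are hypothetical helpers, and the SimpleNamespace objects stand in for SDK objects. It assumes row values are plain strings or numbers; rows that hold file attachments would need separate handling.

```python
import json
from types import SimpleNamespace


def row_to_dict(row):
    """Copy the documented DatasetRow fields into a plain dict."""
    return {
        "name": row.name,
        "values": dict(row.values),
        "ideal_output": row.ideal_output,
        "metadata": row.metadata,
    }


def dataset_to_json(dataset):
    """Serialize slug, column names, and rows to a JSON string."""
    payload = {
        "slug": dataset.slug,
        "columns": [col.name for col in dataset.columns],
        "rows": [row_to_dict(row) for row in dataset.rows],
    }
    return json.dumps(payload, indent=2)


# Stand-in for a Dataset returned by get_sync.
dataset = SimpleNamespace(
    slug="my-dataset",
    columns=[SimpleNamespace(name="question", type="string", description="")],
    rows=[
        SimpleNamespace(
            name="row-1", values={"question": "2+2?"}, ideal_output="4", metadata=None
        )
    ],
)

exported = dataset_to_json(dataset)
print(exported)
```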


Error Handling

The following exceptions can be raised by dataset operations.
API Errors (all inherit from BasaltAPIError):
  • NotFoundError: The dataset does not exist (HTTP 404).
  • UnauthorizedError: The API key is missing, invalid, or lacks permissions (HTTP 401).
  • BadRequestError: Invalid request parameters (HTTP 400).
  • ForbiddenError: API key lacks permissions for this resource (HTTP 403).
  • BasaltAPIError: Other API-level errors.
File Upload Errors:
  • FileValidationError: File validation failed before upload (unsupported type, too large, etc.).
  • FileUploadError: File upload to S3 failed. Has optional file_key attribute.
Connection Errors:
  • NetworkError: Network connectivity issues.
Base Exception:
  • BasaltError: Base class for all Basalt SDK errors.
Example:
from basalt import FileAttachment
from basalt.types.exceptions import FileUploadError, FileValidationError, NotFoundError

# Assumes `basalt` is an already-initialized SDK client.
try:
    basalt.datasets.add_row_sync(
        slug="my-dataset",
        values={"content": FileAttachment(source="large-file.pdf")},
    )
except FileValidationError as e:
    # Raised before upload (unsupported type, file too large, etc.).
    print(f"File validation failed: {e.message}")
except FileUploadError as e:
    # Raised when the upload itself fails.
    print(f"File upload failed: {e.message}")
except NotFoundError:
    print("Dataset not found")
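Transient failures such as NetworkError are often handled with a retry loop. The sketch below is one generic way to do that and is not an SDK feature: call_with_retries is a hypothetical helper, and the NetworkError class defined here is a local stand-in so the example runs offline (with the SDK you would pass its own NetworkError as retry_on). Whether add_row is safe to retry is not stated above; a retry after a request that actually succeeded could create a duplicate row.

```python
import time


class NetworkError(Exception):
    """Local stand-in for the SDK's NetworkError so the sketch runs offline."""


def call_with_retries(operation, retry_on, attempts=3, base_delay=0.0):
    """Call operation(); on `retry_on`, back off exponentially and try again.

    The exception is re-raised once the final attempt has failed.
    """
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except retry_on:
            if attempt == attempts:
                raise
            time.sleep(base_delay * 2 ** (attempt - 1))


calls = {"count": 0}


def flaky_add_row():
    """Simulated operation that fails twice, then succeeds."""
    calls["count"] += 1
    if calls["count"] < 3:
        raise NetworkError("connection reset")
    return "row-created"


result = call_with_retries(flaky_add_row, retry_on=NetworkError, attempts=3)
print(result, calls["count"])
```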

Data Objects

FileAttachment

Used to upload files as part of a dataset row.
from basalt import FileAttachment
from pathlib import Path

# From file path
attachment = FileAttachment(
    source=Path("path/to/file.png"),
    content_type="image/png"  # Optional: auto-detected if omitted
)

# From bytes
attachment = FileAttachment(
    source=b"binary_data_here",
    content_type="image/png",
    filename="image.png"  # Required for bytes
)

# From file-like object
with open("file.png", "rb") as f:
    attachment = FileAttachment(source=f, content_type="image/png")
Parameters:
  • source (str | Path | bytes | BinaryIO): File source. Can be a file path, bytes, or file-like object.
  • content_type (str, optional): MIME type of the file (e.g., "image/png", "application/pdf", "text/html", "text/markdown"). If omitted, content type is auto-detected from the file extension or path.
  • filename (str, optional): Filename to use. Required when source is bytes. Auto-determined otherwise.
Supported content types:
  • image/png
  • image/jpeg
  • application/pdf
  • text/html
  • text/markdown
Maximum file size: 10 MB per file.
Usage in dataset rows:
basalt.datasets.add_row_sync(
    slug="my-dataset",
    values={
        "question": "What is in this image?",
        "image": FileAttachment(source=Path("screenshot.png"))
    }
)
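The supported types and size limit above can also be checked locally before building a FileAttachment. The sketch below is a hypothetical pre-check, not the SDK's own validation. The extension-to-MIME mapping and the reading of "10 MB" as 10 * 1024 * 1024 bytes are both assumptions; the SDK may detect types or count bytes differently.

```python
from pathlib import Path

# Assumption: extensions mapped by hand to the documented content types.
SUPPORTED_EXTENSIONS = {
    ".png": "image/png",
    ".jpg": "image/jpeg",
    ".jpeg": "image/jpeg",
    ".pdf": "application/pdf",
    ".html": "text/html",
    ".md": "text/markdown",
}

# Assumption: "10 MB" interpreted as 10 * 1024 * 1024 bytes.
MAX_FILE_BYTES = 10 * 1024 * 1024


def check_file(filename, size_bytes):
    """Return a list of human-readable problems; an empty list means none found."""
    problems = []
    extension = Path(filename).suffix.lower()
    if extension not in SUPPORTED_EXTENSIONS:
        problems.append(f"unsupported extension: {extension or '(none)'}")
    if size_bytes > MAX_FILE_BYTES:
        problems.append(f"file is {size_bytes} bytes; limit is {MAX_FILE_BYTES}")
    return problems


print(check_file("screenshot.png", 2048))
print(check_file("tool.exe", 2048))
```

For a file on disk, the size can be obtained with Path(filename).stat().st_size.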