API Reference

This page documents the main Python SDK methods and objects for working with datasets.

Dataset methods

list_sync

basalt.datasets.list_sync()
Lists all datasets accessible to your API key.
Returns: A list of Dataset objects (without rows). Each dataset object includes:
  • slug (str): Unique identifier for the dataset.
  • name (str): Human-readable name.
  • description (str): Description of the dataset.
  • num_rows (int): Number of rows in the dataset.
  • columns (list[DatasetColumn]): List of column definitions.
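
As a rough illustration of the fields above, the following sketch builds a one-line summary per dataset. It assumes `client` is an already-initialized Basalt client that exposes `datasets.list_sync()` as documented here; `summarize_datasets` is a hypothetical helper name, not part of the SDK.

```python
def summarize_datasets(client):
    """Build one summary line per dataset returned by list_sync()."""
    summaries = []
    for dataset in client.datasets.list_sync():
        # Listed datasets carry the schema and row count, but not the rows.
        column_names = ", ".join(column.name for column in dataset.columns)
        summaries.append(f"{dataset.slug}: {dataset.num_rows} rows ({column_names})")
    return summaries
```

Calling `summarize_datasets(basalt)` with a real client would return one string per dataset; use get_sync when the rows themselves are needed.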

list (async)

await basalt.datasets.list()
Async variant of list_sync, suitable for use in async applications.
Returns: Same as list_sync, a list of Dataset objects (without rows).

get_sync

basalt.datasets.get_sync(slug: str)
Synchronously retrieves a dataset by slug, including its rows and columns.
Parameters:
  • slug (str, required): Dataset identifier.
Returns: A Dataset object (see below).
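
A minimal sketch of reading rows from the returned Dataset: it flattens each row into a plain dictionary, which can be convenient for passing test cases to other tools. The helper name `rows_as_records` and the `_name` / `_ideal_output` keys are hypothetical choices for illustration; `client` is assumed to be an initialized Basalt client.

```python
def rows_as_records(client, slug):
    """Fetch a dataset and flatten its rows into plain dictionaries."""
    dataset = client.datasets.get_sync(slug)
    records = []
    for row in dataset.rows:
        record = dict(row.values)  # copy so the row object is not mutated
        record["_name"] = row.name
        record["_ideal_output"] = row.ideal_output
        records.append(record)
    return records
```

The underscore-prefixed keys are meant to avoid clashing with column names; pick different keys if a dataset already uses them.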

get (async)

await basalt.datasets.get(slug: str)
Async variant of get_sync, suitable for use in async applications.
Parameters:
  • slug (str, required): Dataset identifier.
Returns: A Dataset object with all rows included.
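
Because list() returns datasets without rows, loading every dataset in full takes one get() call per slug. The sketch below issues those calls concurrently with `asyncio.gather`. It assumes `client` is an initialized Basalt client; `fetch_all_datasets` is a hypothetical helper, and whether unbounded concurrency is acceptable depends on rate limits this page does not describe.

```python
import asyncio


async def fetch_all_datasets(client):
    """List datasets, then fetch each one (with rows) concurrently."""
    listed = await client.datasets.list()
    # list() omits rows, so call get() once per slug to load them.
    full = await asyncio.gather(*(client.datasets.get(d.slug) for d in listed))
    return {dataset.slug: dataset for dataset in full}
```

From synchronous code this could be driven with `asyncio.run(fetch_all_datasets(basalt))`.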

add_row_sync

basalt.datasets.add_row_sync(
    slug: str,
    values: dict,
    name: str | None = None,
    ideal_output: str | None = None,
    metadata: dict | None = None,
)
Adds a single row to a dataset.
Parameters:
  • slug (str, required): Dataset identifier.
  • values (dict, required): Column name → value mapping. Values can be strings, numbers, or FileAttachment objects for file uploads.
  • name (str, optional): Row identifier; auto-generated if omitted.
  • ideal_output (str, optional): Expected output for evaluation.
  • metadata (dict, optional): Arbitrary metadata for this row.
Returns: A DatasetRow object representing the newly created row.
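
Since add_row_sync adds one row per call, populating a dataset from a list means looping. The following is a sketch under stated assumptions: `client` is an initialized Basalt client, `add_rows` is a hypothetical helper, and the input shape (a list of dicts with a "values" key and optional "name", "ideal_output", and "metadata" keys) is chosen only for illustration.

```python
def add_rows(client, slug, rows):
    """Add several rows one at a time and return the created row names."""
    created = []
    for row in rows:
        result = client.datasets.add_row_sync(
            slug=slug,
            values=row["values"],
            name=row.get("name"),  # None lets the name be auto-generated
            ideal_output=row.get("ideal_output"),
            metadata=row.get("metadata"),
        )
        created.append(result.name)
    return created
```

If one call raises, rows added before it remain in the dataset, so callers may want to record the returned names as they go.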

add_row (async)

await basalt.datasets.add_row(
    slug: str,
    values: dict,
    name: str | None = None,
    ideal_output: str | None = None,
    metadata: dict | None = None,
)
Async variant of add_row_sync, for use in async workflows.
Parameters: Same as add_row_sync.
Returns: Same as add_row_sync, a DatasetRow object.
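
One reason to prefer the async variant is adding many rows without waiting for each request in turn. The sketch below caps the number of in-flight requests with an `asyncio.Semaphore`. The helper name `add_rows_concurrently`, the `max_in_flight` default, and the input row shape are assumptions for illustration; `client` is assumed to be an initialized Basalt client.

```python
import asyncio


async def add_rows_concurrently(client, slug, rows, max_in_flight=5):
    """Add rows with add_row(), keeping at most max_in_flight requests open."""
    limiter = asyncio.Semaphore(max_in_flight)

    async def add_one(row):
        async with limiter:
            return await client.datasets.add_row(
                slug=slug,
                values=row["values"],
                ideal_output=row.get("ideal_output"),
            )

    # gather() returns results in the same order as the input rows.
    return await asyncio.gather(*(add_one(row) for row in rows))
```

The order in which rows end up stored is not guaranteed by this sketch, only the order of the returned list.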

Dataset objects

Dataset

Represents a complete dataset, including schema and rows.
  • slug (str): Unique identifier.
  • name (str): Human-readable name.
  • description (str): Description of the dataset.
  • num_rows (int): Number of rows.
  • columns (list[DatasetColumn]): List of column definitions.
  • rows (list[DatasetRow]): List of rows in the dataset.

DatasetColumn

Describes a single column in the dataset schema.
  • name (str): Column name.
  • type (str): Data type (for example "string", "number").
  • description (str): Column description.

DatasetRow

Represents a single test case / row.
  • name (str): Row identifier.
  • values (dict): Column name → value mapping.
  • ideal_output (str | None): Expected output, when available.
  • metadata (dict | None): Additional metadata for this row.
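
The objects above can be combined to sanity-check a row before adding it. The sketch below compares the keys of a `values` dict with the column names of a Dataset. `check_values` is a hypothetical helper; this page does not state whether the API rejects missing or unknown columns, so treat the result as advisory.

```python
def check_values(dataset, values):
    """Compare a values dict with a dataset's column names.

    Returns (missing, unknown): column names with no value, and keys
    that match no column. Both lists are sorted alphabetically.
    """
    column_names = {column.name for column in dataset.columns}
    provided = set(values)
    missing = sorted(column_names - provided)
    unknown = sorted(provided - column_names)
    return missing, unknown
```

Either a listed dataset or one fetched with get_sync works here, since both include `columns`.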


Error Handling

The following exceptions can be raised by dataset operations.
API Errors (all inherit from BasaltAPIError):
  • NotFoundError: The dataset does not exist (HTTP 404).
  • UnauthorizedError: The API key is missing, invalid, or lacks permissions (HTTP 401).
  • BadRequestError: Invalid request parameters (HTTP 400).
  • ForbiddenError: The API key lacks permissions for this resource (HTTP 403).
  • BasaltAPIError: Other API-level errors.
File Upload Errors:
  • FileValidationError: File validation failed before upload (unsupported type, too large, etc.).
  • FileUploadError: File upload to S3 failed. Has an optional file_key attribute.
Connection Errors:
  • NetworkError: Network connectivity issues.
Base Exception:
  • BasaltError: Base class for all Basalt SDK errors.
Example:
from basalt import FileAttachment
from basalt.types.exceptions import NotFoundError, FileValidationError, FileUploadError

# Assumes `basalt` is an already-initialized Basalt client.

try:
    basalt.datasets.add_row_sync(
        slug="my-dataset",
        values={"content": FileAttachment(source="large-file.pdf")}
    )
except FileValidationError as e:
    print(f"File validation failed: {e.message}")
except FileUploadError as e:
    print(f"File upload failed: {e.message}")
except NotFoundError:
    print("Dataset not found")
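
Connection errors such as NetworkError are often transient, so a caller may want to retry them. The following is a generic sketch, not an SDK feature: `call_with_retry` is a hypothetical helper that takes the exception type to retry on as an argument, and the attempt count and delays are arbitrary defaults.

```python
import time


def call_with_retry(operation, retry_on, attempts=3, base_delay=0.5):
    """Call operation(), retrying when it raises the given exception type(s).

    Sleeps base_delay, then 2x, then 4x (and so on) seconds between
    attempts, and re-raises the last error once attempts run out.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return operation()
        except retry_on:
            if attempt == attempts - 1:
                raise
            time.sleep(base_delay * 2 ** attempt)
```

A read could then be wrapped as `call_with_retry(lambda: basalt.datasets.get_sync("my-dataset"), retry_on=NetworkError)`, assuming NetworkError is importable from the same module as the exceptions above. Retrying a write such as add_row_sync may add a row twice if the first request reached the server, so it is safer to limit retries to reads.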

Data Objects

FileAttachment

Used to upload files as part of a dataset row.
from basalt import FileAttachment
from pathlib import Path

# From file path
attachment = FileAttachment(
    source=Path("path/to/file.png"),
    content_type="image/png"  # Optional: auto-detected if omitted
)

# From bytes
attachment = FileAttachment(
    source=b"binary_data_here",
    content_type="image/png",
    filename="image.png"  # Required for bytes
)

# From file-like object (the file must still be open when the SDK reads it)
with open("file.png", "rb") as f:
    attachment = FileAttachment(source=f, content_type="image/png")
Parameters:
  • source (str | Path | bytes | BinaryIO): File source. Can be a file path, bytes, or file-like object.
  • content_type (str, optional): MIME type of the file (e.g., "image/png", "application/pdf", "text/html", "text/markdown"). If omitted, content type is auto-detected from the file extension or path.
  • filename (str, optional): Filename to use. Required when source is bytes. Auto-determined otherwise.
Supported content types:
  • image/png
  • image/jpeg
  • application/pdf
  • text/html
  • text/markdown
Maximum file size: 10 MB per file.
Usage in dataset rows:
basalt.datasets.add_row_sync(
    slug="my-dataset",
    values={
        "question": "What is in this image?",
        "image": FileAttachment(source=Path("screenshot.png"))
    }
)
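
The supported content types and the size limit listed above can be checked locally before constructing a FileAttachment. This is a minimal sketch under stated assumptions: `precheck_file` is a hypothetical helper, the extension-to-type map is an illustrative guess at what auto-detection would accept, and "10 MB" is read as 10 × 1024 × 1024 bytes. The SDK's own validation (FileValidationError) remains the authority.

```python
from pathlib import Path

# Content types copied from the list above; the extension mapping is an assumption.
SUPPORTED_TYPES = {
    ".png": "image/png",
    ".jpg": "image/jpeg",
    ".jpeg": "image/jpeg",
    ".pdf": "application/pdf",
    ".html": "text/html",
    ".md": "text/markdown",
}
MAX_FILE_BYTES = 10 * 1024 * 1024  # assumes "10 MB" means 10 MiB


def precheck_file(path, max_bytes=MAX_FILE_BYTES):
    """Return the expected content type for path, or raise ValueError if the
    extension is not in SUPPORTED_TYPES or the file exceeds max_bytes."""
    path = Path(path)
    content_type = SUPPORTED_TYPES.get(path.suffix.lower())
    if content_type is None:
        raise ValueError(f"Unsupported file extension: {path.suffix or '(none)'}")
    size = path.stat().st_size
    if size > max_bytes:
        raise ValueError(f"{path.name} is {size} bytes; limit is {max_bytes}")
    return content_type
```

The returned string can be passed as `content_type` when building the attachment, for example `FileAttachment(source=Path("screenshot.png"), content_type=precheck_file("screenshot.png"))`.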