What are Experiments?
In Basalt, an experiment is a collection of traces that are grouped together for comparison and analysis. When you attach a trace to an experiment, the trace data goes to the experiment instead of the regular monitoring system. This separation allows you to:
- Run controlled tests with multiple variations
- Collect consistent metrics across different runs
- Compare performance between different approaches
- Analyze the impact of changes on output quality
- Make evidence-based decisions about your AI implementations
Think of experiments as scientific trials for your AI workflows. Just as scientists run controlled experiments to test hypotheses, you can use Basalt experiments to test ideas and measure their impact.
Why Use Experiments?
Complex AI workflows often combine multiple AI components with business logic code. When iterating on these workflows, it can be difficult to know whether changes improve the overall result. Experiments provide a structured way to compare different versions of your workflow.
A/B Testing Workflows
Experiments are ideal for A/B testing different versions of complex workflows. For example, you can test whether:
- A different prompt formulation improves the quality of generated content
- A new retrieval method provides more relevant context
- Adding a classification step improves final output accuracy
- Changing model parameters affects the overall performance
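As a rough illustration of the first case, the sketch below runs two prompt formulations over the same inputs and compares an average score. It does not use the Basalt SDK: `PROMPTS`, `fake_model`, and `brevity_score` are hypothetical stand-ins for your own prompt templates, model call, and quality metric, and the input sentences are made up.

```python
from statistics import mean

# Hypothetical prompt templates for the two variants under test.
PROMPTS = {
    "A": "Summarize: {text}",
    "B": "Summarize in one short sentence: {text}",
}


def fake_model(prompt: str) -> str:
    """Stand-in for a model call: returns fewer words when asked to be short."""
    text = prompt.split(": ", 1)[1]
    limit = 5 if "short" in prompt else 12
    return " ".join(text.split()[:limit])


def brevity_score(output: str) -> float:
    """Toy quality metric: shorter outputs score higher, in (0, 1]."""
    return 1.0 / (1 + len(output.split()))


# Made-up inputs; in practice these would be representative real cases.
inputs = [
    "Basalt experiments group traces so that variants can be compared side by side in the dashboard",
    "A new retrieval method may or may not provide more relevant context for the final answer",
]

# Run every variant on the same inputs so the comparison is controlled.
scores = {name: [] for name in PROMPTS}
for text in inputs:
    for name, template in PROMPTS.items():
        output = fake_model(template.format(text=text))
        scores[name].append(brevity_score(output))

averages = {name: mean(values) for name, values in scores.items()}
best = max(averages, key=averages.get)
print(averages, "best:", best)
```

Feeding both variants the same inputs keeps the prompt as the only thing that changes, so any difference in the averages can be attributed to it.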
How Experiments Work
When you create an experiment in Basalt:
- You define a unique experiment with a name and feature slug
- You create traces and attach them to the experiment
- You run your AI workflows with these traces
- Basalt collects all the trace data in the experiment
- You analyze the results in the Basalt dashboard
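The steps above can be pictured with a small in-memory model. This is a conceptual sketch only, not the Basalt SDK: the `Experiment` and `Trace` classes, the `record` function, and the `variant` field are all hypothetical names chosen for illustration. It shows the routing idea, where a trace attached to an experiment is collected there and an unattached trace goes to regular monitoring.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Experiment:
    # Hypothetical stand-in for an experiment: a name plus a feature slug.
    name: str
    feature_slug: str
    traces: list = field(default_factory=list)


@dataclass
class Trace:
    # Hypothetical stand-in for a trace; `experiment` is None for regular monitoring.
    slug: str
    variant: str
    experiment: Optional[Experiment] = None
    output: Optional[str] = None


monitoring_traces = []  # stand-in for the regular monitoring system


def record(trace: Trace) -> None:
    """Route a finished trace to its experiment if attached, else to monitoring."""
    if trace.experiment is not None:
        trace.experiment.traces.append(trace)
    else:
        monitoring_traces.append(trace)


def run_workflow(question: str, variant: str) -> str:
    # Placeholder for a real AI workflow.
    return f"[{variant}] answer to: {question}"


# Define the experiment, then create one attached trace per variant and run it.
experiment = Experiment(name="Prompt wording test", feature_slug="support-bot")
for variant in ("baseline", "candidate"):
    trace = Trace(slug="support-bot", variant=variant, experiment=experiment)
    trace.output = run_workflow("How do I reset my password?", variant)
    record(trace)

# A trace with no experiment attached goes to monitoring instead.
record(Trace(slug="support-bot", variant="baseline"))

print(len(experiment.traces), "experiment traces,", len(monitoring_traces), "monitoring trace")
```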
Viewing Experiment Results
After running an experiment, you can view and analyze the results in the Basalt application:
- Navigate to the Experiments section in the Basalt dashboard
- Select your experiment from the list
- View aggregated metrics for all runs in the experiment
- Compare different variants side by side using charts and tables
- Drill down into individual traces for detailed analysis
The experiment interface is designed to help you answer the question: “Which version of my workflow performs better on the metrics I care about?”
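To make the idea of aggregated metrics and drill-down concrete, here is a minimal sketch of the same kind of comparison done by hand. It assumes trace records are available as plain dictionaries; the field names (`variant`, `latency_ms`, `quality`) and all values are invented for illustration and do not reflect Basalt's actual data format.

```python
from collections import defaultdict
from statistics import mean

# Made-up trace records; field names and values are illustrative only.
traces = [
    {"id": "t1", "variant": "baseline", "latency_ms": 820, "quality": 0.71},
    {"id": "t2", "variant": "baseline", "latency_ms": 790, "quality": 0.69},
    {"id": "t3", "variant": "candidate", "latency_ms": 910, "quality": 0.80},
    {"id": "t4", "variant": "candidate", "latency_ms": 950, "quality": 0.84},
]


def aggregate(records, metrics=("latency_ms", "quality")):
    """Group records by variant and average each metric within a group."""
    grouped = defaultdict(list)
    for record in records:
        grouped[record["variant"]].append(record)
    return {
        variant: {m: mean(r[m] for r in rows) for m in metrics}
        for variant, rows in grouped.items()
    }


# Side-by-side view of the variants.
summary = aggregate(traces)
for variant, stats in summary.items():
    print(f"{variant:<10} latency={stats['latency_ms']:.0f} ms  quality={stats['quality']:.2f}")

# Drill down: find the single lowest-quality trace for closer inspection.
worst = min(traces, key=lambda r: r["quality"])
print("lowest quality trace:", worst["id"])
```

A comparison like this often surfaces a trade-off rather than a clear winner (here the invented candidate scores higher on quality but is slower), which is why it helps to decide up front which metrics matter most.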