Charts
Define and manage saved charts. Charts are visualizations that aggregate metrics over time with bucketing, filters, and groupings. (5 commands)
Datapoints
Manage individual records inside datasets, including batch creation and mapping to source events. (6 commands)
Datasets
Curate collections of datapoints used as test sets for evaluations and experiments. (6 commands)
Events
Read and write trace events. Events are the spans that capture every step of an AI application’s execution. (4 commands)
Experiments
Execute, retrieve, and compare evaluation runs to measure how prompt or configuration changes affect agent performance. (11 commands)
Metric Versions
Snapshot, list, and deploy versions of a metric’s definition so changes can be reviewed and rolled back without losing history. (3 commands)
Metrics
Define and run evaluators: automated quality checks that score traces against criteria such as accuracy, safety, or correctness. (5 commands)
Sessions
Group related trace events into sessions, the top-level container for a multi-step or multi-service AI interaction. (2 commands)
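To make the relationships between these groups concrete, here is a minimal conceptual sketch of how events, sessions, datapoints, and metrics could relate. It is not the tool's actual API or schema: every class, field, and function name below (Event, Session, Datapoint, group_into_sessions, datapoint_from_event, non_empty_output) is a hypothetical stand-in chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Event:
    """One span in a trace. Field names are assumptions, not the real schema."""
    event_id: str
    session_id: str
    name: str
    output: str


@dataclass
class Session:
    """Top-level container for the events of one interaction (hypothetical)."""
    session_id: str
    events: List[Event] = field(default_factory=list)


@dataclass
class Datapoint:
    """A dataset record, optionally mapped back to its source event (hypothetical)."""
    inputs: Dict[str, str]
    source_event_id: Optional[str] = None


def group_into_sessions(events: List[Event]) -> Dict[str, Session]:
    """Group events by session_id, preserving their original order."""
    sessions: Dict[str, Session] = {}
    for event in events:
        if event.session_id not in sessions:
            sessions[event.session_id] = Session(event.session_id)
        sessions[event.session_id].events.append(event)
    return sessions


def datapoint_from_event(event: Event) -> Datapoint:
    """Build a datapoint that remembers which event it came from."""
    return Datapoint(
        inputs={"step": event.name, "output": event.output},
        source_event_id=event.event_id,
    )


def non_empty_output(event: Event) -> float:
    """A toy evaluator: score 1.0 if the event produced any output, else 0.0."""
    return 1.0 if event.output.strip() else 0.0


if __name__ == "__main__":
    trace = [
        Event("e1", "s1", "retrieve", "3 documents"),
        Event("e2", "s1", "generate", "Final answer"),
        Event("e3", "s2", "generate", ""),
    ]
    sessions = group_into_sessions(trace)
    dataset = [datapoint_from_event(e) for e in trace]
    scores = {e.event_id: non_empty_output(e) for e in trace}
    print(len(sessions), len(dataset), scores)
```

In this sketch a dataset is just a list of datapoints and a metric is a plain function over one event; the real tool may model both differently.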

