Check out the Copilot to explore these techniques in a conversational interface.

Scoring

Open in Colab

Bring some questions and a response to score, and see how Pi handles them, end-to-end.

Model Comparison

Open in Colab

With a good scoring system, you can now evaluate different models to see which one hits the price/performance/latency tradeoff that you’re looking for, without relying on vibes alone as a benchmark.
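As a rough illustration of the idea, the sketch below ranks models by their mean score over the same set of prompts. It does not use the Pi API: `toy_score` is a hypothetical stand-in for a real scoring call, and the model names and responses are made up for the example.

```python
from statistics import mean
from typing import Callable


def toy_score(response: str) -> float:
    """Hypothetical stand-in scorer: rewards longer answers, capped at 1.0."""
    return min(len(response.split()) / 20, 1.0)


def compare_models(
    outputs_by_model: dict[str, list[str]],
    score_fn: Callable[[str], float],
) -> list[tuple[str, float]]:
    """Return (model, mean score) pairs sorted from best to worst."""
    ranked = [
        (model, mean(score_fn(r) for r in responses))
        for model, responses in outputs_by_model.items()
    ]
    return sorted(ranked, key=lambda pair: pair[1], reverse=True)


# Made-up responses from two hypothetical models to the same prompts.
outputs_by_model = {
    "model_a": ["Yes.", "No."],
    "model_b": [
        "The capital of France is Paris, which sits on the Seine.",
        "Water boils at 100 degrees Celsius at sea level pressure.",
    ],
}

ranking = compare_models(outputs_by_model, toy_score)
for model, avg in ranking:
    print(f"{model}: {avg:.2f}")
```

In practice you would replace `toy_score` with your real scoring call and track cost and latency alongside the mean score to weigh the tradeoff.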

Dataset Filtering

Open in Colab

You can take a larger dataset (from Hugging Face or any other source), filter out irrelevant data, and arrive at a better set for training or evaluation.
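A minimal sketch of score-based filtering, assuming each row can be scored between 0 and 1 and that a fixed threshold separates relevant from irrelevant rows. `keyword_relevance` is a hypothetical stand-in for a real scorer, not part of the Pi API, and the rows are invented for illustration.

```python
from typing import Callable, Iterable


def keyword_relevance(text: str, keywords: set[str]) -> float:
    """Hypothetical stand-in scorer: fraction of keywords present in the text."""
    words = set(text.lower().split())
    return len(words & keywords) / len(keywords) if keywords else 0.0


def filter_rows(
    rows: Iterable[str],
    score_fn: Callable[[str], float],
    threshold: float = 0.5,
) -> list[str]:
    """Keep only the rows whose score meets the threshold."""
    return [row for row in rows if score_fn(row) >= threshold]


# Invented example rows; a real run would load them from your data source.
rows = [
    "How do I reset my password",
    "Best pizza toppings ranked",
    "Password reset email never arrived",
]
keywords = {"password", "reset"}

kept = filter_rows(rows, lambda r: keyword_relevance(r, keywords))
print(kept)
```

The threshold is a judgment call: a higher value gives a smaller, cleaner set, and a lower value keeps more borderline rows.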

Generate Synthetic Training Data

Open in Colab

This notebook walks you through generating a synthetic training set and filtering it against your Pi Scoring System so that you can train on the “good” examples.

Calibration

Open in Colab

Some questions are more important than others. Calibration lets you weight “important” questions more heavily when computing your final score. This weighting is learned from your own user feedback and ratings.
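To make the effect of weighting concrete, here is a small sketch of a weighted average over per-question scores. The question names, scores, and weights are invented for illustration, and the weights are supplied by hand; how Pi learns them from feedback is not shown here.

```python
def weighted_score(question_scores: dict[str, float], weights: dict[str, float]) -> float:
    """Combine per-question scores into one score using per-question weights."""
    total_weight = sum(weights[q] for q in question_scores)
    if total_weight == 0:
        raise ValueError("weights must sum to a non-zero value")
    return sum(question_scores[q] * weights[q] for q in question_scores) / total_weight


# Hypothetical per-question scores for one response.
question_scores = {"is_polite": 1.0, "is_accurate": 0.5}

# Hand-picked weights: accuracy counts three times as much as politeness.
equal_weights = {"is_polite": 1.0, "is_accurate": 1.0}
calibrated_weights = {"is_polite": 1.0, "is_accurate": 3.0}

unweighted = weighted_score(question_scores, equal_weights)    # (1.0 + 0.5) / 2
calibrated = weighted_score(question_scores, calibrated_weights)  # (1.0 + 1.5) / 4
print(unweighted, calibrated)
```

With equal weights the response scores 0.75; once accuracy is weighted more heavily, the weaker accuracy score pulls the total down to 0.625.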