List
Lists the Scoring Spec Calibration Jobs owned by the user.
import PiClient from 'withpi';

const client = new PiClient({
  apiKey: process.env['WITHPI_API_KEY'], // This is the default and can be omitted
});

async function main() {
  const scoringSpecCalibrationStatuses = await client.scoringSystem.calibrate.list();
  console.log(scoringSpecCalibrationStatuses);
}

main();
[
  {
    "calibrated_scoring_spec": [
      {
        "custom_model_id": "your-model-id",
        "is_lower_score_desirable": false,
        "label": "Relevance to Prompt",
        "parameters": [
          0.14285714285714285,
          0.2857142857142857,
          0.42857142857142855,
          0.5714285714285714,
          0.7142857142857143,
          0.8571428571428571
        ],
        "python_code": "\ndef score(response_text: str, input_text: str, kwargs: dict) -> dict:\n    word_count = len(response_text.split())\n    if word_count > 10:\n        return {\"score\": 0.2, \"explanation\": \"Response has more than 10 words\"}\n    elif word_count > 5:\n        return {\"score\": 0.6, \"explanation\": \"Response has more than 5 words\"}\n    else:\n        return {\"score\": 1, \"explanation\": \"Response has 5 or fewer words\"}\n",
        "question": "Is the response relevant to the prompt?",
        "scoring_type": "PI_SCORER",
        "tag": "Legal Formatting",
        "weight": 1
      }
    ],
    "detailed_status": [
      "Downloading model",
      "Tuning prompt"
    ],
    "job_id": "1234abcd",
    "state": "RUNNING"
  }
]
Authorizations
Query Parameters
Filter jobs by state. Available options: QUEUED, RUNNING, DONE, ERROR, CANCELLED. A sketch of applying this filter follows.
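Below is a minimal sketch of passing this filter through the SDK. The request option name (state) is an assumption inferred from the description above, not confirmed by this page.

// Minimal sketch: list only running jobs. The `state` request option
// is an assumption inferred from the filter description above.
// (Runs inside an async function, with `client` set up as in the example above.)
const runningJobs = await client.scoringSystem.calibrate.list({ state: 'RUNNING' });
console.log(runningJobs);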
Response
detailed_status
Detailed status of the job.
["Downloading model", "Tuning prompt"]

job_id
The job id.
"1234abcd"
state
Current state of the job. Available options: QUEUED, RUNNING, DONE, ERROR, CANCELLED
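As a small usage sketch, the loop below reports jobs that are still in progress; the TypeScript property names are assumed to mirror the JSON response fields shown on this page.

// Usage sketch: log in-progress jobs and their latest status lines.
// Property names are assumed to mirror the JSON fields above.
// (Runs inside an async function, with `client` set up as in the example above.)
const statuses = await client.scoringSystem.calibrate.list();
for (const job of statuses) {
  if (job.state === 'RUNNING') {
    console.log(`${job.job_id}: ${job.detailed_status.join(' -> ')}`);
  }
}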
calibrated_scoring_spec
The calibrated scoring spec. Each entry is a scoring question with the fields below.
question
The yes/no question to ask Pi Scoring System.
"Is the response relevant to the prompt?"
custom_model_id
The ID of the custom model associated with the CUSTOM_MODEL_SCORER scoring_type.
"your-model-id"
is_lower_score_desirable
Indicates whether a lower score represents a better outcome (e.g., fewer errors, less toxicity).
false
label
The label of the question.
"Relevance to Prompt"
parameters
The learned parameters for the scoring question define a piecewise linear interpolation over the range [0, 1]. This transformation adjusts the score distribution to better match your preferences, for example by pulling scores below 0.5 closer to 0 and scores above 0.5 closer to 1. A sketch of this mapping follows the example values.
[0.14285714285714285, 0.2857142857142857, 0.42857142857142855, 0.5714285714285714, 0.7142857142857143, 0.8571428571428571]
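The sketch below shows one plausible reading of these parameters, assuming they are the mapped values at evenly spaced interior knots on [0, 1] with implicit endpoints (0, 0) and (1, 1); the calibrator's actual internals are not documented on this page.

// Piecewise linear interpolation sketch, assuming `params` holds the
// y-values at evenly spaced interior knots on [0, 1], with implicit
// endpoints (0, 0) and (1, 1). An illustration, not the documented
// internals of the calibrator.
function applyCalibration(raw: number, params: number[]): number {
  const knots = [0, ...params, 1];          // y-values at x = i / (n - 1)
  const n = knots.length;
  const x = Math.min(Math.max(raw, 0), 1);  // clamp the raw score to [0, 1]
  const pos = x * (n - 1);
  const i = Math.min(Math.floor(pos), n - 2);
  const t = pos - i;                        // fraction of the way to the next knot
  return knots[i] + t * (knots[i + 1] - knots[i]);
}

With the example parameters above (k / 7 for k = 1 through 6), the knots lie on the identity line, so scores pass through unchanged; a calibrated spec bends this curve toward the preferred distribution.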
python_code
The Python code associated with the PYTHON_CODE scoring_type.
"\ndef score(response_text: str, input_text: str, kwargs: dict) -> dict:\n    word_count = len(response_text.split())\n    if word_count > 10:\n        return {\"score\": 0.2, \"explanation\": \"Response has more than 10 words\"}\n    elif word_count > 5:\n        return {\"score\": 0.6, \"explanation\": \"Response has more than 5 words\"}\n    else:\n        return {\"score\": 1, \"explanation\": \"Response has 5 or fewer words\"}\n"
scoring_type
The type of scoring performed for this question. Default: PI_SCORER. Available options: PI_SCORER, PYTHON_CODE, CUSTOM_MODEL_SCORER
"PI_SCORER"
tag
The tag or group to which this question belongs.
"Legal Formatting"
weight
The weight of the question, reflecting its relative importance. The sum of question weights is normalized to one internally. A higher weight counts for more when aggregating this subdimension into the parent dimension. A small illustration of the normalization follows.
1
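As a quick illustration of the normalization described above, two questions weighted 1 and 3 end up with effective weights of 0.25 and 0.75:

// Weights are divided by their sum so they total one.
const weights = [1, 3];
const total = weights.reduce((sum, w) => sum + w, 0);
const normalized = weights.map((w) => w / total); // [0.25, 0.75]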