POST /scoring_system/generate
import PiClient from 'withpi';

const client = new PiClient({
  apiKey: process.env['WITHPI_API_KEY'], // This is the default and can be omitted
});

async function main() {
  const questions = await client.scoringSystem.generate({
    application_description: "Write a children's story communicating a simple life lesson.",
  });

  console.log(questions);
}

main();
[
  {
    "custom_model_id": "your-model-id",
    "is_lower_score_desirable": false,
    "label": "Relevance to Prompt",
    "parameters": [
      0.14285714285714285,
      0.2857142857142857,
      0.42857142857142855,
      0.5714285714285714,
      0.7142857142857143,
      0.8571428571428571
    ],
    "python_code": "\ndef score(response_text: str, input_text: str, kwargs: dict) -> dict:\n    word_count = len(response_text.split())\n    if word_count > 10:\n        return {\"score\": 0.2, \"explanation\": \"Response has more than 10 words\"}\n    elif word_count > 5:\n        return{\"score\": 0.6, \"explanation\": \"Response has more than 5 words\"}\n    else:\n        return {\"score\": 1, \"explanation\": \"Response has 5 or fewer words\"}\n",
    "question": "Is the response relevant to the prompt?",
    "scoring_type": "PI_SCORER",
    "tag": "Legal Formatting",
    "weight": 1
  }
]

Authorizations

x-api-key
string
header
required

Body

application/json
application_description
string
required

The application description to generate a scoring spec for.

Example:

"Write a children's story communicating a simple life lesson."

num_questions
integer
default:10

The number of questions that the generated scoring system should contain. If the value is <= 0, the number is selected automatically.

Example:

10

try_auto_generating_python_code
boolean
default:false

If true, try to generate Python code for the generated questions.

Example:

false
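
As a minimal sketch of how the three body fields fit together, the helper below assembles a request body and applies the documented defaults (num_questions: 10, try_auto_generating_python_code: false). The name buildGenerateRequest, its option names, and its validation rules are hypothetical and not part of the withpi SDK.

```typescript
// Hypothetical helper (not part of the withpi SDK): assembles the JSON body
// for POST /scoring_system/generate using the defaults documented above.
interface GenerateRequestBody {
  application_description: string;
  num_questions: number;
  try_auto_generating_python_code: boolean;
}

function buildGenerateRequest(
  applicationDescription: string,
  options: { numQuestions?: number; tryAutoGeneratingPythonCode?: boolean } = {},
): GenerateRequestBody {
  // application_description is the only required field.
  if (applicationDescription.trim().length === 0) {
    throw new Error("application_description is required");
  }
  // num_questions is an integer; values <= 0 ask the service to pick a count.
  const numQuestions = options.numQuestions ?? 10;
  if (!Number.isInteger(numQuestions)) {
    throw new Error("num_questions must be an integer");
  }
  return {
    application_description: applicationDescription,
    num_questions: numQuestions,
    try_auto_generating_python_code: options.tryAutoGeneratingPythonCode ?? false,
  };
}
```

The resulting object can be passed to client.scoringSystem.generate(...) in place of the inline literal used in the sample at the top of this page.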

Response

200
application/json
Successful Response
question
string
required

The yes/no question to ask the Pi Scoring System.

Example:

"Is the response relevant to the prompt?"

custom_model_id
string | null

The ID of the custom model associated with the CUSTOM_MODEL_SCORER scoring_type.

Example:

"your-model-id"

is_lower_score_desirable
boolean
default:false

Indicates whether a lower score represents a better outcome (e.g., fewer errors, less toxicity).

Example:

false

label
string | null

The label of the question.

Example:

"Relevance to Prompt"

parameters
number[] | null

The learned parameters for the scoring question. They define a piecewise linear interpolation over the range [0, 1]. This transformation adjusts the score distribution to better match your preferences, for example by pulling scores below 0.5 closer to 0 and scores above 0.5 closer to 1.

Example:
[
  0.14285714285714285,
  0.2857142857142857,
  0.42857142857142855,
  0.5714285714285714,
  0.7142857142857143,
  0.8571428571428571
]
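
To illustrate what a piecewise linear interpolation over [0, 1] can look like, here is a minimal sketch. It assumes the parameters are the output values at evenly spaced interior knots, with the endpoints fixed at 0 and 1; this page does not confirm that layout, and applyCalibration is a hypothetical name. Under that assumption, the example parameters above (1/7 through 6/7) describe the identity mapping.

```typescript
// Sketch only. Assumption: `parameters` holds the output values at evenly
// spaced interior knots, and the endpoints map 0 -> 0 and 1 -> 1.
function applyCalibration(rawScore: number, parameters: number[]): number {
  // Clamp the input to the documented range [0, 1].
  const x = Math.min(1, Math.max(0, rawScore));
  // Knot values: fixed endpoints plus the learned interior values.
  const ys = [0, ...parameters, 1];
  const segments = ys.length - 1;
  // Locate the segment containing x and the fractional position inside it.
  const position = x * segments;
  const index = Math.min(Math.floor(position), segments - 1);
  const t = position - index;
  // Linear interpolation between the two neighbouring knot values.
  return ys[index] + t * (ys[index + 1] - ys[index]);
}
```

With parameters that sit below the diagonal for low inputs and above it for high inputs, the same function pulls low scores toward 0 and high scores toward 1, which is the effect described above.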
python_code
string | null

The Python code associated with the PYTHON_CODE scoring_type.

Example:

"\ndef score(response_text: str, input_text: str, kwargs: dict) -> dict:\n    word_count = len(response_text.split())\n    if word_count > 10:\n        return {\"score\": 0.2, \"explanation\": \"Response has more than 10 words\"}\n    elif word_count > 5:\n        return{\"score\": 0.6, \"explanation\": \"Response has more than 5 words\"}\n    else:\n        return {\"score\": 1, \"explanation\": \"Response has 5 or fewer words\"}\n"

scoring_type
enum<string> | null

The type of scoring performed for this question. Default: PI_SCORER.

Available options:
PI_SCORER,
PYTHON_CODE,
CUSTOM_MODEL_SCORER
Example:

"PI_SCORER"
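
The field descriptions tie python_code to PYTHON_CODE and custom_model_id to CUSTOM_MODEL_SCORER. The sketch below checks that pairing on the client side. It assumes the associated field must be present for those two types, which this page implies but does not state as a rule; the Question interface and missingFieldFor are hypothetical names, not SDK types.

```typescript
type ScoringType = "PI_SCORER" | "PYTHON_CODE" | "CUSTOM_MODEL_SCORER";

// Hypothetical subset of a response item, limited to the fields used here.
interface Question {
  question: string;
  scoring_type?: ScoringType | null;
  python_code?: string | null;
  custom_model_id?: string | null;
}

// Returns the name of the field the scoring_type appears to depend on when
// that field is absent or empty, or null when nothing is missing.
function missingFieldFor(q: Question): string | null {
  // The documented default scoring_type is PI_SCORER.
  const scoringType: ScoringType = q.scoring_type ?? "PI_SCORER";
  if (scoringType === "PYTHON_CODE" && !q.python_code) {
    return "python_code";
  }
  if (scoringType === "CUSTOM_MODEL_SCORER" && !q.custom_model_id) {
    return "custom_model_id";
  }
  return null;
}
```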

tag
string | null

The tag or the group to which this question belongs.

Example:

"Legal Formatting"

weight
number | null

The weight of the question, which reflects its relative importance. The sum of question weights will be normalized to one internally. A higher weight counts for more when aggregating this subdimension into the parent dimension.

Example:

1
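
The normalization described for weight can be illustrated with a weighted average. This is a sketch under assumptions, not the service's actual aggregation: it assumes per-question scores in [0, 1] are combined as a weighted mean and that a missing weight counts as 1. ScoredQuestion and aggregate are hypothetical names.

```typescript
// Hypothetical shape: a per-question score paired with its weight.
interface ScoredQuestion {
  score: number;
  weight?: number | null;
}

// Sketch only: normalizes the weights to sum to one, then takes the
// weighted mean of the scores. A missing weight is assumed to be 1.
function aggregate(questions: ScoredQuestion[]): number {
  const total = questions.reduce((sum, q) => sum + (q.weight ?? 1), 0);
  if (total <= 0) {
    throw new Error("weights must sum to a positive value");
  }
  let result = 0;
  for (const q of questions) {
    // Each weight is divided by the total, so only relative sizes matter.
    result += q.score * ((q.weight ?? 1) / total);
  }
  return result;
}
```

Because the weights are divided by their sum, weights of 3 and 1 behave the same as 0.75 and 0.25.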