POST /data/generate_input_response_pairs
import PiClient from 'withpi';

const client = new PiClient({
  apiKey: process.env['WITHPI_API_KEY'], // This is the default and can be omitted
});

async function main() {
  const syntheticDataStatus = await client.data.generateInputResponsePairs.startJob({
    num_pairs_to_generate: 50,
    seeds: [
      {
        llm_input: 'Tell me something different',
        llm_output: 'The lazy dog was jumped over by the quick brown fox',
      },
    ],
  });

  console.log(syntheticDataStatus.job_id);
}

main();
{
  "data": [
    {
      "llm_input": "Tell me something different",
      "llm_output": "The lazy dog was jumped over by the quick brown fox"
    },
    {
      "llm_input": "Write a short poem",
      "llm_output": "Moonlight dancing on waves,\nStars whisper ancient tales,\nNight's gentle embrace"
    }
  ],
  "detailed_status": [
    "Downloading model",
    "Tuning prompt"
  ],
  "job_id": "1234abcd",
  "state": "RUNNING"
}
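For environments without the SDK, the call above can also be sketched as a raw HTTP request. This is a sketch, not SDK code: the base URL constant below is a placeholder assumption, while the method, path, `x-api-key` header, and JSON body fields follow this reference.

```typescript
// Sketch: build the raw HTTP request for this endpoint without the SDK.
// NOTE: BASE_URL is an illustrative assumption, not a documented value.
const BASE_URL = "https://example.invalid/api"; // hypothetical base URL

interface SeedPair {
  llm_input: string;
  llm_output: string;
}

// Assembles the request pieces; pass the result to fetch(req.url, req)
// in any runtime that provides fetch.
function buildStartJobRequest(apiKey: string, numPairs: number, seeds: SeedPair[]) {
  return {
    url: `${BASE_URL}/data/generate_input_response_pairs`,
    method: "POST",
    headers: {
      "x-api-key": apiKey, // required, per Authorizations
      "Content-Type": "application/json",
    },
    body: JSON.stringify({
      num_pairs_to_generate: numPairs, // required
      seeds, // required
    }),
  };
}

const req = buildStartJobRequest("my-key", 50, [
  {
    llm_input: "Tell me something different",
    llm_output: "The lazy dog was jumped over by the quick brown fox",
  },
]);
```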

Authorizations

x-api-key
string
header
required

Body

application/json
num_pairs_to_generate
integer
required

The number of new LLM input-response pairs to generate

Example:

50

seeds
object[]
required

The list of LLM input-response pairs to be used as seeds

An example for training or evaluation

Example:
[
  {
    "llm_input": "Tell me something different",
    "llm_output": "The lazy dog was jumped over by the quick brown fox"
  }
]
application_description
string | null

A description of the application for which the synthetic data should be applicable.

Example:

"AI application for writing a children's story given topics."

batch_size
integer
default:5

Number of input-response pairs to generate in one LLM call. Must be &lt;= 10. Generally this can be the same as num_shots.

Example:

5

exploration_mode
enum<string>

The exploration mode for input-response pair generation. Defaults to BALANCED.

Available options:
CONSERVATIVE,
BALANCED,
CREATIVE,
ADVENTUROUS
num_shots
integer
default:5

Number of input-response pairs to be included in the prompt for generation

Example:

5

system_prompt
string | null

The system prompt used to generate responses for the application's inputs

Example:

"Write a children's story given a topic from the user."
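Putting the body parameters above together, a fully populated request body might look like the sketch below. Values are illustrative; the defaults (`batch_size: 5`, `num_shots: 5`) and the `batch_size <= 10` constraint follow this reference.

```typescript
// Illustrative request body using every documented parameter.
const requestBody = {
  num_pairs_to_generate: 50, // required
  seeds: [
    // required
    {
      llm_input: "Tell me something different",
      llm_output: "The lazy dog was jumped over by the quick brown fox",
    },
  ],
  application_description:
    "AI application for writing a children's story given topics.",
  batch_size: 5, // must be <= 10; generally same as num_shots
  exploration_mode: "BALANCED", // CONSERVATIVE | BALANCED | CREATIVE | ADVENTUROUS
  num_shots: 5, // pairs included in the generation prompt
  system_prompt: "Write a children's story given a topic from the user.",
};
```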

Response

200
application/json
Successful Response

SyntheticDataStatus is the result of a synthetic data generation job.

detailed_status
string[]
required

Detailed status of the job

Example:
["Downloading model", "Tuning prompt"]
job_id
string
required

The job id

Example:

"1234abcd"

state
enum<string>
required

Current state of the job

Available options:
QUEUED,
RUNNING,
DONE,
ERROR,
CANCELLED
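Of the states above, QUEUED and RUNNING are the only ones from which the job can still progress. A small helper (hypothetical, not part of the SDK) can decide when to stop polling:

```typescript
// Job states from this reference; DONE, ERROR and CANCELLED are terminal.
type JobState = "QUEUED" | "RUNNING" | "DONE" | "ERROR" | "CANCELLED";

// Returns true once the job will no longer change state.
function isTerminal(state: JobState): boolean {
  return state === "DONE" || state === "ERROR" || state === "CANCELLED";
}
```

A polling loop would then continue while `!isTerminal(status.state)`.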
data
object[] | null

The generated synthetic data. May be present even before the job reaches the DONE or ERROR state, since results are streamed.

An example for training or evaluation

Example:
[
  {
    "llm_input": "Tell me something different",
    "llm_output": "The lazy dog was jumped over by the quick brown fox"
  },
  {
    "llm_input": "Write a short poem",
    "llm_output": "Moonlight dancing on waves,\nStars whisper ancient tales,\nNight's gentle embrace"
  }
]
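Because data is streamed and may be null while the job is queued, callers should guard before reading it. A minimal sketch, where the SyntheticDataStatus shape mirrors this reference but the helper itself is hypothetical:

```typescript
interface Pair {
  llm_input: string;
  llm_output: string;
}

// Shape of the response documented on this page.
interface SyntheticDataStatus {
  job_id: string;
  state: "QUEUED" | "RUNNING" | "DONE" | "ERROR" | "CANCELLED";
  detailed_status: string[];
  data: Pair[] | null; // streamed; may be partial or null
}

// Returns whatever pairs have been streamed so far, never null.
function pairsSoFar(status: SyntheticDataStatus): Pair[] {
  return status.data ?? [];
}

// A mid-job status like the example above: one pair streamed, still RUNNING.
const running: SyntheticDataStatus = {
  job_id: "1234abcd",
  state: "RUNNING",
  detailed_status: ["Downloading model", "Tuning prompt"],
  data: [
    {
      llm_input: "Tell me something different",
      llm_output: "The lazy dog was jumped over by the quick brown fox",
    },
  ],
};
```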