OpenAi.Evals (OpenAI REST API Client v1.0.0)

Provides API endpoints related to evals.

Summary

Functions

Cancel an ongoing evaluation run.

Create the structure of an evaluation that can be used to test a model's performance. An evaluation is a set of testing criteria and a datasource. After creating an evaluation, you can run it on different models and model parameters. We support several types of graders and datasources. For more information, see the Evals guide.

Create a new evaluation run. This is the endpoint that will kick off grading.

Delete an evaluation.

Delete an evaluation run.

Get an evaluation by ID.

Get an evaluation run by ID.

Get an evaluation run output item by ID.

Get a list of output items for an evaluation run.

Get a list of runs for an evaluation.

List evaluations for a project.

Update certain properties of an evaluation.

Types

delete_eval_200_json_resp()
@type delete_eval_200_json_resp() :: %{
  deleted: boolean(),
  eval_id: String.t(),
  object: String.t()
}

delete_eval_run_200_json_resp()
@type delete_eval_run_200_json_resp() :: %{
  deleted: boolean() | nil,
  object: String.t() | nil,
  run_id: String.t() | nil
}

Functions

cancel_eval_run(eval_id, run_id, opts \\ [])
@spec cancel_eval_run(eval_id :: String.t(), run_id :: String.t(), opts :: keyword()) ::
  {:ok, OpenAi.Eval.Run.t()} | {:error, OpenAi.Error.error()}

Cancel an ongoing evaluation run.
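
A minimal call sketch (the IDs are placeholders):

    # Cancel a run that is still grading; returns the updated run on success.
    case OpenAi.Evals.cancel_eval_run("eval_abc123", "evalrun_abc123") do
      {:ok, %OpenAi.Eval.Run{} = run} -> run
      {:error, error} -> error
    end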

create_eval(body, opts \\ [])
@spec create_eval(body :: OpenAi.Eval.RequestCreate.t(), opts :: keyword()) ::
  {:ok, OpenAi.Eval.t()} | {:error, OpenAi.Error.error()}

Create the structure of an evaluation that can be used to test a model's performance. An evaluation is a set of testing criteria and a datasource. After creating an evaluation, you can run it on different models and model parameters. We support several types of graders and datasources. For more information, see the Evals guide.
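
A hedged sketch of a create call follows. The field names mirror the public REST request schema, and passing a plain map where the spec names OpenAi.Eval.RequestCreate.t() is an assumption about this client:

    # Illustrative only: a string_check grader over a custom data source.
    body = %{
      name: "sentiment-check",
      data_source_config: %{type: "custom", item_schema: %{type: "object"}},
      testing_criteria: [
        %{
          type: "string_check",
          name: "exact match",
          input: "{{item.answer}}",
          operation: "eq",
          reference: "{{item.expected}}"
        }
      ]
    }

    {:ok, %OpenAi.Eval{} = eval} = OpenAi.Evals.create_eval(body)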

create_eval_run(eval_id, body, opts \\ [])
@spec create_eval_run(
  eval_id :: String.t(),
  body :: OpenAi.Eval.Run.CreateRequest.t(),
  opts :: keyword()
) :: {:ok, OpenAi.Eval.Run.t()} | {:error, OpenAi.Error.error()}

Create a new evaluation run. This is the endpoint that will kick off grading.
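
A hedged sketch; the data_source map mirrors the REST schema, and substituting a bare map for OpenAi.Eval.Run.CreateRequest.t() is an assumption:

    # Grade a stored JSONL file against the eval's criteria; IDs are placeholders.
    {:ok, %OpenAi.Eval.Run{} = run} =
      OpenAi.Evals.create_eval_run("eval_abc123", %{
        name: "baseline-run",
        data_source: %{type: "jsonl", source: %{type: "file_id", id: "file-abc123"}}
      })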

delete_eval(eval_id, opts \\ [])
@spec delete_eval(eval_id :: String.t(), opts :: keyword()) ::
  {:ok, delete_eval_200_json_resp()} | {:error, OpenAi.Error.error()}

Delete an evaluation.
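
On success the result matches delete_eval_200_json_resp/0 above, so it can be pattern matched directly (placeholder ID):

    {:ok, %{deleted: true}} = OpenAi.Evals.delete_eval("eval_abc123")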

delete_eval_run(eval_id, run_id, opts \\ [])
@spec delete_eval_run(eval_id :: String.t(), run_id :: String.t(), opts :: keyword()) ::
  {:ok, delete_eval_run_200_json_resp()} | {:error, OpenAi.Error.error()}

Delete an evaluation run.

get_eval(eval_id, opts \\ [])
@spec get_eval(eval_id :: String.t(), opts :: keyword()) ::
  {:ok, OpenAi.Eval.t()} | {:error, OpenAi.Error.error()}

Get an evaluation by ID.

get_eval_run(eval_id, run_id, opts \\ [])
@spec get_eval_run(eval_id :: String.t(), run_id :: String.t(), opts :: keyword()) ::
  {:ok, OpenAi.Eval.Run.t()} | {:error, OpenAi.Error.error()}

Get an evaluation run by ID.

get_eval_run_output_item(eval_id, run_id, output_item_id, opts \\ [])
@spec get_eval_run_output_item(
  eval_id :: String.t(),
  run_id :: String.t(),
  output_item_id :: String.t(),
  opts :: keyword()
) :: {:ok, OpenAi.Eval.Run.OutputItem.t()} | {:error, OpenAi.Error.error()}

Get an evaluation run output item by ID.
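
A minimal call sketch with placeholder IDs:

    {:ok, %OpenAi.Eval.Run.OutputItem{} = item} =
      OpenAi.Evals.get_eval_run_output_item(
        "eval_abc123",
        "evalrun_abc123",
        "outputitem_abc123"
      )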

get_eval_run_output_items(eval_id, run_id, opts \\ [])
@spec get_eval_run_output_items(
  eval_id :: String.t(),
  run_id :: String.t(),
  opts :: keyword()
) ::
  {:ok, OpenAi.Eval.Run.OutputItem.List.t()} | {:error, OpenAi.Error.error()}

Get a list of output items for an evaluation run.

Options

  • after: Identifier for the last output item from the previous pagination request.

  • limit: Number of output items to retrieve.

  • status: Filter output items by status. Use failed to return only failed output items, or pass to return only passed output items.

  • order: Sort order for output items by timestamp. Use asc for ascending order or desc for descending order. Defaults to asc.
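
For example, the options combine to page through failures, newest first (passing option values as strings is an assumption; the client may expect atoms):

    {:ok, %OpenAi.Eval.Run.OutputItem.List{} = page} =
      OpenAi.Evals.get_eval_run_output_items("eval_abc123", "evalrun_abc123",
        status: "failed",
        order: "desc",
        limit: 20
      )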

get_eval_runs(eval_id, opts \\ [])
@spec get_eval_runs(eval_id :: String.t(), opts :: keyword()) ::
  {:ok, OpenAi.Eval.Run.List.t()} | {:error, OpenAi.Error.error()}

Get a list of runs for an evaluation.

Options

  • after: Identifier for the last run from the previous pagination request.
  • limit: Number of runs to retrieve.
  • order: Sort order for runs by timestamp. Use asc for ascending order or desc for descending order. Defaults to asc.
  • status: Filter runs by status. One of queued | in_progress | failed | completed | canceled.
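
For example (string option values are an assumption):

    # List completed runs, oldest first, ten at a time.
    {:ok, %OpenAi.Eval.Run.List{} = runs} =
      OpenAi.Evals.get_eval_runs("eval_abc123",
        status: "completed",
        order: "asc",
        limit: 10
      )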

list_evals(opts \\ [])
@spec list_evals(opts :: keyword()) ::
  {:ok, OpenAi.Eval.List.t()} | {:error, OpenAi.Error.error()}

List evaluations for a project.

Options

  • after: Identifier for the last eval from the previous pagination request.
  • limit: Number of evals to retrieve.
  • order: Sort order for evals by timestamp. Use asc for ascending order or desc for descending order.
  • order_by: Evals can be ordered by creation time or last updated time. Use created_at for creation time or updated_at for last updated time.
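
For example (string option values are an assumption):

    # Most recently updated evals first; use :after with the last ID to paginate.
    {:ok, %OpenAi.Eval.List{} = evals} =
      OpenAi.Evals.list_evals(order: "desc", order_by: "updated_at", limit: 20)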

update_eval(eval_id, body, opts \\ [])
@spec update_eval(eval_id :: String.t(), body :: map(), opts :: keyword()) ::
  {:ok, OpenAi.Eval.t()} | {:error, OpenAi.Error.error()}

Update certain properties of an evaluation.
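
Since the body is a plain map, a rename is a one-line sketch (treating name as an accepted key is an assumption drawn from the REST schema):

    {:ok, %OpenAi.Eval{} = eval} =
      OpenAi.Evals.update_eval("eval_abc123", %{name: "renamed-eval"})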