Instructor (Instructor v0.0.5)

Structured prompting for LLMs. Instructor is a spiritual port of the excellent Instructor Python library by @jxnlco; check out his talk on YouTube.

The Instructor library is useful for coaxing an LLM to return JSON that maps to an Ecto schema that you provide, rather than the default unstructured text output. If you define your own validation logic, Instructor can automatically retry prompts when validation fails (returning natural language error messages to the LLM, to guide it when making corrections).

Instructor is designed to be used with the OpenAI API by default, but it also works with llama.cpp and Bumblebee (coming soon!) via an extensible adapter behaviour.
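
Selecting an adapter is done through application config. A minimal sketch for `config/config.exs` (the `Instructor.Adapters.OpenAI` module name is assumed by analogy with the `Instructor.Adapters.Llamacpp` adapter shown later, and the exact API-key option name is an assumption; verify both against your installed version):

```elixir
# config/config.exs
import Config

# Select the adapter Instructor uses for chat completions.
config :instructor, adapter: Instructor.Adapters.OpenAI

# Assumed option name: supply provider credentials from the environment.
config :instructor, :openai, api_key: System.get_env("OPENAI_API_KEY")
```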

At its simplest, usage is pretty straightforward:

  1. Create an Ecto schema, with a @doc string that explains the schema definition to the LLM.
  2. Define a validate_changeset/1 function on the schema, and use the Instructor.Validator macro in order for Instructor to know about it.
  3. Make a call to Instructor.chat_completion/1 with an instruction for the LLM to execute.

You can use the max_retries parameter to automatically, iteratively go back and forth with the LLM to try fixing validation errors when they occur.

defmodule SpamPrediction do
  use Ecto.Schema
  use Instructor.Validator

  @doc """
  ## Field Descriptions:
  - class: Whether or not the email is spam.
  - reason: A short, less than 10 word rationalization for the classification.
  - score: A confidence score between 0.0 and 1.0 for the classification.
  """
  @primary_key false
  embedded_schema do
    field(:class, Ecto.Enum, values: [:spam, :not_spam])
    field(:reason, :string)
    field(:score, :float)
  end

  @impl true
  def validate_changeset(changeset) do
    changeset
    |> Ecto.Changeset.validate_number(:score,
      greater_than_or_equal_to: 0.0,
      less_than_or_equal_to: 1.0
    )
  end
end

is_spam? = fn text ->
  Instructor.chat_completion(
    model: "gpt-3.5-turbo",
    response_model: SpamPrediction,
    max_retries: 3,
    messages: [
      %{
        role: "user",
        content: """
        Your purpose is to classify customer support emails as either spam or not.
        This is for a clothing retail business.
        They sell all types of clothing.

        Classify the following email: 
        ```
        #{text}
        ```
        """
      }
    ]
  )
end

is_spam?.("Hello I am a Nigerian prince and I would like to send you money")

# => {:ok, %SpamPrediction{class: :spam, reason: "Nigerian prince email scam", score: 0.98}}

Check out our Quickstart Guide for more code snippets that you can run locally (in Livebook). Or, to get a better idea of the thinking behind Instructor, read more about our Philosophy & Motivations.

Optionally, you can also customize your llama.cpp calls (with defaults shown):

config :instructor, adapter: Instructor.Adapters.Llamacpp
config :instructor, :llamacpp,
  chat_template: :mistral_instruct,
  api_url: "http://localhost:8080/completion"

Summary

Functions

Casts all the parameters in the params map to the types defined in the types map. This works both with Ecto Schemas and maps of Ecto types (see Schemaless Ecto).

Create a new chat completion for the provided messages and parameters.

Functions

cast_all(schema, params)

Casts all the parameters in the params map to the types defined in the types map. This works both with Ecto Schemas and maps of Ecto types (see Schemaless Ecto).

Examples

When using a map of Ecto types (Schemaless Ecto)

iex> Instructor.cast_all(%{
...>   data: %Instructor.Demos.SpamPrediction{},
...>   types: %{
...>     class: :string,
...>     score: :float
...>   }
...> }, %{
...>   class: "spam",
...>   score: 0.999
...> })
%Ecto.Changeset{
  action: nil,
  changes: %{
    class: "spam",
    score: 0.999
  },
  errors: [],
  data: %Instructor.Demos.SpamPrediction{
    class: :spam,
    score: 0.999
  },
  valid?: true
}

When using a full Ecto Schema

iex> Instructor.cast_all(%Instructor.Demos.SpamPrediction{}, %{
...>   class: "spam",
...>   score: 0.999
...> })
%Ecto.Changeset{
  action: nil,
  changes: %{
    class: "spam",
    score: 0.999
  },
  errors: [],
  data: %{
    class: :spam,
    score: 0.999
  },
  valid?: true
}

and when using raw Ecto types:

iex> Instructor.cast_all({%{}, %{name: :string}}, %{
...>   name: "George Washington"
...> })
%Ecto.Changeset{
  action: nil,
  changes: %{
    name: "George Washington",
  },
  errors: [],
  data: %{
    name: "George Washington",
  },
  valid?: true
}

chat_completion(params, config \\ nil)

@spec chat_completion(Keyword.t(), any()) ::
  {:ok, Ecto.Schema.t()}
  | {:error, Ecto.Changeset.t()}
  | {:error, String.t()}
  | Stream.t()

Create a new chat completion for the provided messages and parameters.

The parameters are passed directly to the LLM adapter. By default they mirror the OpenAI API parameters. For more information on the parameters, see the OpenAI API docs.

Additionally, the following parameters are supported:

  • :response_model - The Ecto schema to validate the response against, or a valid map of Ecto types (see Schemaless Ecto).
  • :stream - Whether to stream the response or not. (defaults to false)
  • :validation_context - The validation context to use when validating the response. (defaults to %{})
  • :mode - The mode to use when parsing the response: :tools, :json, or :md_json (defaults to :tools). Generally speaking, you don't need to change this unless you are not using OpenAI.
  • :max_retries - The maximum number of times to retry the LLM call if it fails or does not pass validations. (defaults to 0)

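For instance, :validation_context lets you thread caller-supplied data into your validation logic. A sketch (assuming, as in some Instructor versions, that the validator callback can take the context as a second argument, `validate_changeset/2`; verify against your installed version):

```elixir
defmodule UserQuery do
  use Ecto.Schema
  use Instructor.Validator

  @primary_key false
  embedded_schema do
    field(:name, :string)
  end

  @impl true
  def validate_changeset(changeset, context) do
    # Reject any name not in the caller-supplied allowlist,
    # passed in via the :validation_context option below.
    Ecto.Changeset.validate_inclusion(changeset, :name, context.allowed_names)
  end
end

Instructor.chat_completion(
  model: "gpt-3.5-turbo",
  response_model: UserQuery,
  validation_context: %{allowed_names: ["George Washington", "John Adams"]},
  messages: [%{role: "user", content: "Who was the first US president?"}]
)
```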
Examples

iex> Instructor.chat_completion(%{
...>   model: "gpt-3.5-turbo",
...>   response_model: Instructor.Demos.SpamPrediction,
...>   messages: [
...>     %{
...>       role: "user",
...>       content: "Classify the following text: Hello, I am a Nigerian prince and I would like to give you $1,000,000."
...>     }
...>   ]
...> })
{:ok,
    %Instructor.Demos.SpamPrediction{
        class: :spam,
        score: 0.999
    }}

When you're using Instructor in streaming mode, instead of returning a tuple, it returns a stream that emits tuples. There are two main streaming modes available: array streaming and partial streaming.

Partial streaming will emit the record multiple times until it's complete.

iex> Instructor.chat_completion(%{
...>   model: "gpt-3.5-turbo",
...>   response_model: {:partial, %{name: :string, birth_date: :date}},
...>   messages: [
...>     %{
...>       role: "user",
...>       content: "Who is the first president of the United States?"
...>     }
...>   ]
...> }) |> Enum.to_list()
[
  {:partial, %{name: "George Washington"}},
  {:partial, %{name: "George Washington", birth_date: ~D[1732-02-22]}},
  {:ok, %{name: "George Washington", birth_date: ~D[1732-02-22]}}
]

Whereas with array streaming, you can ask the LLM to return multiple instances of your Ecto schema, and Instructor will emit them one at a time, as they arrive, in complete and validated form.

iex> Instructor.chat_completion(%{
...>   model: "gpt-3.5-turbo",
...>   response_model: {:array, %{name: :string, birth_date: :date}},
...>   messages: [
...>     %{
...>       role: "user",
...>       content: "Who are the first 5 presidents of the United States?"
...>     }
...>   ]
...> }) |> Enum.to_list()

[
  {:ok, %{name: "George Washington", birth_date: ~D[1732-02-22]}},
  {:ok, %{name: "John Adams", birth_date: ~D[1735-10-30]}},
  {:ok, %{name: "Thomas Jefferson", birth_date: ~D[1743-04-13]}},
  {:ok, %{name: "James Madison", birth_date: ~D[1751-03-16]}},
  {:ok, %{name: "James Monroe", birth_date: ~D[1758-04-28]}}
]

If there's a validation error, it will return an error tuple with the changeset describing the errors.

iex> Instructor.chat_completion(%{
...>   model: "gpt-3.5-turbo",
...>   response_model: Instructor.Demos.SpamPrediction,
...>   messages: [
...>     %{
...>       role: "user",
...>       content: "Classify the following text: Hello, I am a Nigerian prince and I would like to give you $1,000,000."
...>     }
...>   ]
...> })
{:error,
    %Ecto.Changeset{
        changes: %{
            class: "foobar",
            score: -10.999
        },
        errors: [
            class: {"is invalid", [type: :string, validation: :cast]}
        ],
        valid?: false
    }}