LeXtract.Schema.Analyzer (lextract v0.1.2)

Analyzes example extractions to infer schema information.

Examines extraction examples to identify:

  • Extraction classes (e.g., "Medication", "Person")
  • Attribute names (e.g., medication_attributes, person_attributes)
  • Attribute types (string, list, map, etc.)
  • Nested structures

Example

iex> examples = [
...>   %LeXtract.ExampleData{
...>     input: "Patient takes aspirin 100mg daily",
...>     output: %{
...>       "extractions" => [
...>         %{
...>           "class" => "Medication",
...>           "medication_attributes" => %{
...>             "name" => "aspirin",
...>             "dosage" => "100mg"
...>           }
...>         }
...>       ]
...>     }
...>   }
...> ]
iex> schema_info = LeXtract.Schema.Analyzer.analyze(examples)
iex> schema_info.classes
["Medication"]
iex> Map.has_key?(schema_info.attributes, "Medication")
true

Summary

Functions

analyze(examples)

Analyzes examples to extract schema information.

to_nimble_options(map)

Converts analyzed schema to NimbleOptions format.

Types

schema_info()

@type schema_info() :: %{
  classes: [String.t()],
  attributes: %{required(String.t()) => [String.t()]},
  types: %{required(String.t()) => atom()}
}
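
To make the relationship between the three keys concrete, here is a small sketch of a value that fits this type. The entries are illustrative rather than output captured from the library, and joining class and attribute as "Class.attribute" is an assumption based on the keys used in the to_nimble_options/1 example.

```elixir
# Illustrative value shaped like schema_info(); not produced by LeXtract.
schema_info = %{
  classes: ["Medication"],
  attributes: %{"Medication" => ["name", "dosage"]},
  types: %{"Medication.name" => :string, "Medication.dosage" => :string}
}

# Pair each attribute of a class with its inferred type by building the
# assumed "Class.attribute" path and looking it up in :types.
typed =
  for attr <- schema_info.attributes["Medication"] do
    {attr, Map.fetch!(schema_info.types, "Medication." <> attr)}
  end
```

Map.fetch!/2 raises if a path is missing, which makes a mismatch between :attributes and :types visible early.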

Functions

analyze(examples)

@spec analyze([LeXtract.ExampleData.t()]) :: schema_info()

Analyzes examples to extract schema information.

Returns a map with:

  • :classes - List of extraction class names
  • :attributes - Map of class -> attribute names
  • :types - Map of attribute path (e.g., "Medication.name") -> inferred type (e.g., :string)
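
The inference described above could look roughly like the following. This is a minimal sketch and not LeXtract's implementation: AnalyzerSketch and its functions are hypothetical, it works on plain extraction maps instead of LeXtract.ExampleData structs, and it assumes that attribute maps live under keys ending in "_attributes" and that type paths have the form "Class.attribute".

```elixir
defmodule AnalyzerSketch do
  # Hypothetical module for illustration; not part of LeXtract.

  # Maps an Elixir value to a coarse type atom.
  def infer_type(v) when is_binary(v), do: :string
  def infer_type(v) when is_list(v), do: :list
  def infer_type(v) when is_map(v), do: :map
  def infer_type(v) when is_boolean(v), do: :boolean
  def infer_type(v) when is_integer(v), do: :integer
  def infer_type(v) when is_float(v), do: :float
  def infer_type(_), do: :any

  # Takes a list of extraction maps (the value under "extractions")
  # and folds them into a schema_info-shaped map.
  def analyze(extractions) do
    initial = %{classes: [], attributes: %{}, types: %{}}

    Enum.reduce(extractions, initial, fn ext, acc ->
      class = Map.fetch!(ext, "class")

      # Assumption: every "*_attributes" key holds an attribute map.
      attrs =
        ext
        |> Enum.filter(fn {k, v} ->
          is_binary(k) and String.ends_with?(k, "_attributes") and is_map(v)
        end)
        |> Enum.flat_map(fn {_key, attr_map} -> Map.to_list(attr_map) end)

      names = attrs |> Enum.map(fn {name, _value} -> name end) |> Enum.uniq()

      types =
        Map.new(attrs, fn {name, value} ->
          {class <> "." <> name, infer_type(value)}
        end)

      %{
        classes: Enum.uniq(acc.classes ++ [class]),
        attributes: Map.update(acc.attributes, class, names, &Enum.uniq(&1 ++ names)),
        types: Map.merge(acc.types, types)
      }
    end)
  end
end

extractions = [
  %{
    "class" => "Medication",
    "medication_attributes" => %{"name" => "aspirin", "dosage" => "100mg"}
  },
  %{"class" => "Person", "person_attributes" => %{"name" => "Dr. Smith"}}
]

info = AnalyzerSketch.analyze(extractions)
```

In this sketch, a later example overwrites the type recorded for the same path. How the library resolves conflicting types across examples is not stated here.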

Examples

iex> examples = [
...>   %LeXtract.ExampleData{
...>     input: "Dr. Smith treated John",
...>     output: %{
...>       "extractions" => [
...>         %{
...>           "class" => "Person",
...>           "person_attributes" => %{"name" => "Dr. Smith"}
...>         },
...>         %{
...>           "class" => "Person",
...>           "person_attributes" => %{"name" => "John"}
...>         }
...>       ]
...>     }
...>   }
...> ]
iex> result = LeXtract.Schema.Analyzer.analyze(examples)
iex> result.classes
["Person"]

to_nimble_options(map)

@spec to_nimble_options(schema_info()) :: keyword()

Converts analyzed schema to NimbleOptions format.

Returns a keyword list suitable for ReqLLM's generate_object/4.

Note: Because NimbleOptions has limited support for validating nested structures inside lists, the schema validates the basic structure of the extractions list but does not deeply validate the nested attribute maps.

Examples

iex> schema_info = %{
...>   classes: ["Medication"],
...>   attributes: %{"Medication" => ["name", "dosage"]},
...>   types: %{
...>     "Medication.name" => :string,
...>     "Medication.dosage" => :string
...>   }
...> }
iex> schema = LeXtract.Schema.Analyzer.to_nimble_options(schema_info)
iex> Keyword.has_key?(schema, :extractions)
true
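
For orientation, a NimbleOptions-style schema with a top-level :extractions key might be shaped like the sketch below. The exact keyword list returned by to_nimble_options/1 is not shown in this documentation, so the option values here (:type, :required, :doc) are assumptions chosen to illustrate the limitation noted above: the list and its string-keyed maps are described, while the nested attribute maps are left as :any.

```elixir
# Hypothetical shape only; not the actual return value of to_nimble_options/1.
classes = ["Medication"]

schema = [
  extractions: [
    # A list of maps with string keys; the values (including nested
    # attribute maps) are not validated further.
    type: {:list, {:map, :string, :any}},
    required: true,
    doc: "List of extractions. Known classes: " <> Enum.join(classes, ", ")
  ]
]

has_extractions? = Keyword.has_key?(schema, :extractions)
```

The block only builds the keyword list, so it runs without the NimbleOptions dependency.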