LeXtract.AnnotatedDocument (lextract v0.1.2)

Represents a document with extracted entities and relationships.

This is the primary output type from LeXtract.extract/2.

Fields

  • :extractions - List of extracted entities (LeXtract.Extraction structs)
  • :text - Original document text, or nil
  • :document_id - Unique identifier for the document
  • :metadata - Optional map of metadata about the extraction process, or nil

Examples

iex> doc = %LeXtract.AnnotatedDocument{
...>   document_id: "abc-123",
...>   text: "Sample text",
...>   extractions: []
...> }
iex> Enum.count(doc.extractions)
0

Summary

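Functions

by_class(annotated_document, class)

Returns extractions filtered by class.

count(annotated_document)

Returns the number of extractions.

extraction_classes(annotated_document)

Returns all unique extraction classes.

has_extractions?(annotated_document)

Returns true if the document has any extractions.

new(opts \\ [])

Creates a new annotated document.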

Types

t()

@type t() :: %LeXtract.AnnotatedDocument{
  document_id: String.t(),
  extractions: [LeXtract.Extraction.t()],
  metadata: map() | nil,
  text: String.t() | nil
}

Functions

by_class(annotated_document, class)

@spec by_class(t(), String.t()) :: [LeXtract.Extraction.t()]

Returns the extractions whose extraction_class is the given class.

Examples

iex> extractions = [
...>   %LeXtract.Extraction{extraction_class: "person", extraction_text: "John"},
...>   %LeXtract.Extraction{extraction_class: "medication", extraction_text: "aspirin"}
...> ]
iex> doc = %LeXtract.AnnotatedDocument{document_id: "doc-1", extractions: extractions}
iex> doc |> LeXtract.AnnotatedDocument.by_class("person") |> length()
1
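
The filtering described above can be sketched as a plain Enum.filter/2 over the :extractions list. This is only an illustration of the idea, not the library's implementation: ByClassSketch is a hypothetical module, and plain maps stand in for the LeXtract structs so the snippet runs without the dependency.

```elixir
defmodule ByClassSketch do
  # Hypothetical helper, not part of LeXtract: keeps the extractions whose
  # :extraction_class equals the requested class. Works on any map or struct
  # that carries an :extractions list.
  def by_class(%{extractions: extractions}, class) when is_binary(class) do
    Enum.filter(extractions, fn extraction -> extraction.extraction_class == class end)
  end
end

# Plain maps stand in for the LeXtract structs so the sketch runs on its own.
doc = %{
  document_id: "doc-1",
  extractions: [
    %{extraction_class: "person", extraction_text: "John"},
    %{extraction_class: "medication", extraction_text: "aspirin"},
    %{extraction_class: "person", extraction_text: "Jane"}
  ]
}

people = ByClassSketch.by_class(doc, "person")
names = Enum.map(people, & &1.extraction_text)
IO.inspect(names)
# prints ["John", "Jane"]
```

Enum.filter/2 preserves the input order, so in this sketch the matches come back in the order they appear in the document. Whether the real function guarantees the same ordering is not stated in the documentation.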

count(annotated_document)

@spec count(t()) :: non_neg_integer()

Returns the number of extractions.

Examples

iex> doc = %LeXtract.AnnotatedDocument{document_id: "doc-1", extractions: []}
iex> LeXtract.AnnotatedDocument.count(doc)
0

extraction_classes(annotated_document)

@spec extraction_classes(t()) :: [String.t()]

Returns the distinct extraction classes present in the document, each listed once.

Examples

iex> extractions = [
...>   %LeXtract.Extraction{extraction_class: "person", extraction_text: "John"},
...>   %LeXtract.Extraction{extraction_class: "person", extraction_text: "Jane"}
...> ]
iex> doc = %LeXtract.AnnotatedDocument{document_id: "doc-1", extractions: extractions}
iex> LeXtract.AnnotatedDocument.extraction_classes(doc)
["person"]
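
One way to picture the deduplication is Enum.map/2 followed by Enum.uniq/1. The sketch below assumes that approach; ClassesSketch is a hypothetical module, plain maps replace the LeXtract structs, and the ordering of the real function's result is not specified in the documentation.

```elixir
defmodule ClassesSketch do
  # Hypothetical helper, not part of LeXtract: collects each extraction's
  # class and drops repeats, keeping the first occurrence of every class.
  def extraction_classes(%{extractions: extractions}) do
    extractions
    |> Enum.map(& &1.extraction_class)
    |> Enum.uniq()
  end
end

# Plain maps stand in for the LeXtract structs so the sketch runs on its own.
doc = %{
  document_id: "doc-1",
  extractions: [
    %{extraction_class: "person", extraction_text: "John"},
    %{extraction_class: "medication", extraction_text: "aspirin"},
    %{extraction_class: "person", extraction_text: "Jane"}
  ]
}

classes = ClassesSketch.extraction_classes(doc)
IO.inspect(classes)
# prints ["person", "medication"]
```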

has_extractions?(annotated_document)

@spec has_extractions?(t()) :: boolean()

Returns true if the document has at least one extraction, false otherwise.

Examples

iex> doc = %LeXtract.AnnotatedDocument{document_id: "doc-1", extractions: []}
iex> LeXtract.AnnotatedDocument.has_extractions?(doc)
false

new(opts \\ [])

@spec new(keyword()) :: t()

Creates a new annotated document.

Parameters

  • opts - Keyword list of options

Options

  • :text - Original document text
  • :document_id - Unique identifier (auto-generated if not provided)
  • :extractions - List of extractions (default: [])
  • :metadata - Optional metadata map

Examples

iex> doc = LeXtract.AnnotatedDocument.new(text: "Sample", document_id: "doc1")
iex> doc.text
"Sample"

iex> doc = LeXtract.AnnotatedDocument.new(text: "Test")
iex> String.length(doc.document_id)
36
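
The examples above show only that an auto-generated document_id is 36 characters long, which happens to be the length of a canonical UUID string. As a rough sketch, assuming (the documentation does not confirm this) that the identifier is a random version-4 UUID, option handling with defaults could look like the following. NewSketch and uuid4/0 are hypothetical names, and a plain map stands in for the struct.

```elixir
defmodule NewSketch do
  # Hypothetical: build a document map from keyword options, filling in
  # defaults. The identifier is generated only when :document_id is absent.
  def new(opts \\ []) do
    %{
      document_id: Keyword.get_lazy(opts, :document_id, &uuid4/0),
      text: Keyword.get(opts, :text),
      extractions: Keyword.get(opts, :extractions, []),
      metadata: Keyword.get(opts, :metadata)
    }
  end

  # Assumption: a version-4 UUID. Take 16 random bytes, overwrite the
  # version bits with 4 and the variant bits with binary 10, then format
  # the 32 hex digits as 8-4-4-4-12 groups.
  def uuid4 do
    <<time_low::32, time_mid::16, _version::4, time_hi::12, _variant::2, clock_seq::14,
      node::48>> = :crypto.strong_rand_bytes(16)

    raw = <<time_low::32, time_mid::16, 4::4, time_hi::12, 2::2, clock_seq::14, node::48>>

    <<p1::binary-size(8), p2::binary-size(4), p3::binary-size(4), p4::binary-size(4),
      p5::binary-size(12)>> = Base.encode16(raw, case: :lower)

    Enum.join([p1, p2, p3, p4, p5], "-")
  end
end

generated = NewSketch.new(text: "Test")
IO.inspect(String.length(generated.document_id))
# prints 36

explicit = NewSketch.new(text: "Sample", document_id: "doc1")
IO.inspect(explicit.document_id)
# prints "doc1"
```

Keyword.get_lazy/3 is used here so that the random identifier is computed only when the caller did not supply one.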