# `IREE.Tokenizers.Encoding`
[🔗](https://github.com/goodhamgupta/iree_tokenizers/blob/v0.7.0/lib/iree/tokenizers/encoding.ex#L1)

Result returned by encoding operations.

This module intentionally mirrors the most useful `Tokenizers.Encoding`
helpers so callers can inspect token IDs, offsets, masks, and derived
metadata without dealing with the NIF directly.

# `t`

```elixir
@type t() :: %IREE.Tokenizers.Encoding{
  attention_mask: [non_neg_integer()],
  ids: [integer()],
  offsets: nil | [{non_neg_integer(), non_neg_integer()}],
  special_tokens_mask: [non_neg_integer()],
  tokens: [binary()],
  type_ids: [non_neg_integer()]
}
```

An encoded token sequence with optional offsets and derived masks.

# `get_attention_mask`

```elixir
@spec get_attention_mask(t()) :: [integer()]
```

Returns the attention mask.

# `get_ids`

```elixir
@spec get_ids(t()) :: [integer()]
```

Returns the token IDs.

# `get_length`

```elixir
@spec get_length(t()) :: non_neg_integer()
```

Returns the number of tokens in the encoding.

# `get_n_sequences`

```elixir
@spec get_n_sequences(t()) :: non_neg_integer()
```

Returns the number of sequences represented by the encoding.

The current IREE-backed implementation only emits single-sequence encodings.

# `get_offsets`

```elixir
@spec get_offsets(t()) :: [{integer(), integer()}]
```

Returns byte offsets for each token.
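
Because the offsets are byte positions rather than grapheme positions, `binary_part/3` is a better fit than `String.slice/3` for recovering a token's text. The sketch below uses a hand-written `text` and `offsets` list as stand-ins for real data, and it assumes each tuple is `{start_byte, end_byte}` with an exclusive end, which the documentation above does not state.

```elixir
# Hypothetical input text and offsets; real offsets would come from get_offsets/1.
text = "héllo wörld"
offsets = [{0, 6}, {7, 13}]

# Assumption: each tuple is {start_byte, end_byte} with an exclusive end.
# "é" and "ö" occupy two bytes each in UTF-8, so byte and character
# positions diverge after the first accented letter.
pieces =
  for {start_byte, end_byte} <- offsets do
    binary_part(text, start_byte, end_byte - start_byte)
  end

IO.inspect(pieces)
# prints ["héllo", "wörld"]
```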

# `get_overflowing`

```elixir
@spec get_overflowing(t()) :: [t()]
```

Returns overflowing encodings, if any.

The current implementation does not emit overflowing pieces and always
returns an empty list.

# `get_sequence_ids`

```elixir
@spec get_sequence_ids(t()) :: [non_neg_integer() | nil]
```

Returns sequence IDs for each token, with special tokens represented as `nil`.
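
As an illustration of the shape of the result, the sketch below derives such a list from a special-tokens mask. It is not taken from the library: it assumes that a mask value of `1` marks a special token and that every ordinary token belongs to sequence `0`, which would match a single-sequence encoding.

```elixir
# Hypothetical mask for "[CLS] hello world [SEP]"; 1 marks a special token.
special_tokens_mask = [1, 0, 0, 1]

# Assumption: special tokens map to nil, all other tokens to sequence 0.
sequence_ids =
  Enum.map(special_tokens_mask, fn
    1 -> nil
    0 -> 0
  end)

IO.inspect(sequence_ids)
# prints [nil, 0, 0, nil]
```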

# `get_special_tokens_mask`

```elixir
@spec get_special_tokens_mask(t()) :: [integer()]
```

Returns the special-tokens mask.

# `get_tokens`

```elixir
@spec get_tokens(t()) :: [binary()]
```

Returns the token strings corresponding to the encoding.

# `get_type_ids`

```elixir
@spec get_type_ids(t()) :: [integer()]
```

Returns the type IDs.

# `get_u32_attention_mask`

```elixir
@spec get_u32_attention_mask(t()) :: binary()
```

Returns the attention mask packed into a little-endian `u32` binary.

# `get_u32_ids`

```elixir
@spec get_u32_ids(t()) :: binary()
```

Returns the token IDs packed into a little-endian `u32` binary.

# `get_u32_special_tokens_mask`

```elixir
@spec get_u32_special_tokens_mask(t()) :: binary()
```

Returns the special-tokens mask packed into a little-endian `u32` binary.

# `get_u32_type_ids`

```elixir
@spec get_u32_type_ids(t()) :: binary()
```

Returns the type IDs packed into a little-endian `u32` binary.
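
The `get_u32_*` helpers above all describe the same layout: four bytes per value, least significant byte first. A minimal sketch of that layout in plain Elixir, using a made-up `ids` list rather than a real encoding, might look like this. It shows the binary format only and is not the library's implementation.

```elixir
# Hypothetical token IDs; a real list would come from get_ids/1.
ids = [101, 7592, 2088, 102]

# Pack each value as an unsigned little-endian 32-bit integer.
packed = for id <- ids, into: <<>>, do: <<id::unsigned-little-integer-size(32)>>

# Unpack by matching 32 bits at a time from the binary.
unpacked = for <<id::unsigned-little-integer-size(32) <- packed>>, do: id

IO.inspect(byte_size(packed))
# prints 16
IO.inspect(unpacked)
# prints [101, 7592, 2088, 102]
```

The same binary comprehension can be used to decode the result of any of the four helpers back into a list of integers.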

# `get_word_ids`

```elixir
@spec get_word_ids(t()) :: [nil]
```

Returns word IDs for each token.

The current implementation does not track word IDs, so every entry in the returned list is `nil`.

# `n_tokens`

```elixir
@spec n_tokens(t()) :: non_neg_integer()
```

Alias for `get_length/1`.

# `pad`

```elixir
@spec pad(t(), non_neg_integer(), keyword()) :: t()
```

Pads the encoding to `target_length`.

Supported options:

- `:direction` - `:left` or `:right`, defaults to `:right`
- `:pad_id` - token ID used for padding, defaults to `0`
- `:pad_type_id` - type ID used for padding, defaults to `0`
- `:pad_token` - token string used for padding, defaults to `"[PAD]"`
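
The effect of `:direction` on a single field can be sketched with plain lists. `PadSketch.pad_list/4` below is a hypothetical helper, not part of `IREE.Tokenizers`, and it assumes that an input already at or beyond the target length is left unchanged. Which filler values the mask fields receive is not stated above, so the sketch covers only IDs and token strings.

```elixir
defmodule PadSketch do
  # Hypothetical helper: pads `list` to `target_length` with `filler`.
  # Assumption: lists that are already long enough are returned unchanged.
  def pad_list(list, target_length, filler, direction \\ :right) do
    missing = max(target_length - length(list), 0)
    padding = List.duplicate(filler, missing)

    case direction do
      :right -> list ++ padding
      :left -> padding ++ list
    end
  end
end

ids = [101, 7592, 102]

IO.inspect(PadSketch.pad_list(ids, 5, 0))
# prints [101, 7592, 102, 0, 0]
IO.inspect(PadSketch.pad_list(ids, 5, 0, :left))
# prints [0, 0, 101, 7592, 102]
IO.inspect(PadSketch.pad_list(["[CLS]", "hello", "[SEP]"], 5, "[PAD]"))
# prints ["[CLS]", "hello", "[SEP]", "[PAD]", "[PAD]"]
```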

# `set_sequence_id`

```elixir
@spec set_sequence_id(t(), non_neg_integer()) :: t()
```

Replaces all sequence IDs in the encoding with the given value.

# `transform`

```elixir
@spec transform(t(), [IREE.Tokenizers.Encoding.Transformation.t()]) :: t()
```

Applies a list of transformations in order.
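
Applying steps in order amounts to a left fold, and the order can change the result. The sketch below models each transformation as a one-argument function over a plain list of IDs. These functions are hypothetical stand-ins: the actual shape of `IREE.Tokenizers.Encoding.Transformation.t()` is not shown here and may differ.

```elixir
# Hypothetical stand-ins for transformations, operating on a plain ID list.
truncate_to_3 = fn ids -> Enum.take(ids, 3) end
pad_to_5 = fn ids -> ids ++ List.duplicate(0, max(5 - length(ids), 0)) end

ids = [101, 7592, 2088, 102]

# Fold the steps over the input, feeding each result into the next step.
apply_all = fn input, steps ->
  Enum.reduce(steps, input, fn step, acc -> step.(acc) end)
end

IO.inspect(apply_all.(ids, [truncate_to_3, pad_to_5]))
# prints [101, 7592, 2088, 0, 0]
IO.inspect(apply_all.(ids, [pad_to_5, truncate_to_3]))
# prints [101, 7592, 2088]
```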

# `truncate`

```elixir
@spec truncate(t(), non_neg_integer(), keyword()) :: t()
```

Truncates the encoding to `max_length`.

Supported options:

- `:direction` - `:left` or `:right`, defaults to `:right`
- `:stride` - accepted for compatibility but currently ignored
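
On a plain list, the two directions can be sketched with `Enum.take/2`. This assumes that `:right` drops tokens from the end and `:left` drops them from the start, following the convention of the Hugging Face tokenizers library. The text above does not spell this out, so treat it as an assumption.

```elixir
# Hypothetical token IDs; a real list would come from get_ids/1.
ids = [101, 7592, 2088, 999, 102]
max_length = 3

# Assumed :right behaviour: keep the first max_length tokens.
kept_right = Enum.take(ids, max_length)

# Assumed :left behaviour: keep the last max_length tokens.
kept_left = Enum.take(ids, -max_length)

IO.inspect(kept_right)
# prints [101, 7592, 2088]
IO.inspect(kept_left)
# prints [2088, 999, 102]
```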

---

*Consult [api-reference.md](api-reference.md) for the complete listing.*
