# `LlamaCppEx.Thinking`
[🔗](https://github.com/nyo16/llama_cpp_ex/blob/main/lib/llama_cpp_ex/thinking.ex#L1)

Parser for `<think>...</think>` blocks in thinking model output.

Thinking models (e.g. Qwen 3.5 with `enable_thinking: true`) wrap their
chain-of-thought reasoning in `<think>...</think>` tags. This module provides
both a one-shot parser for complete text (`parse/1`) and a streaming parser
(`stream_parser/1` with `feed/2`) that handles tags split across token
boundaries.

# `feed`

```elixir
@spec feed(map(), String.t()) :: {[{:thinking | :content, String.t()}], map()}
```

Feeds a text chunk to the streaming parser.

Returns `{events, new_parser}`, where `events` is a list of `{:thinking, text}`
and `{:content, text}` tuples. Pass `new_parser` to the next call.

The parser buffers partial `<think>` and `</think>` tags, so a tag that is
split across two chunks is still recognized.

## Examples

    parser = LlamaCppEx.Thinking.stream_parser()
    {events, parser} = LlamaCppEx.Thinking.feed(parser, "<think>")
    # events = []  (tag consumed)
    {events, parser} = LlamaCppEx.Thinking.feed(parser, "reasoning")
    # events = [{:thinking, "reasoning"}]
    {events, _parser} = LlamaCppEx.Thinking.feed(parser, "</think>answer")
    # events = [{:content, "answer"}]
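
The buffering described above can be illustrated with a small, self-contained sketch. `ThinkStreamSketch` and its helpers are hypothetical names written for this illustration, not the actual implementation of `LlamaCppEx.Thinking`. It assumes the parser only needs to hold back a trailing fragment that could be the start of the next tag.

```elixir
defmodule ThinkStreamSketch do
  # Hypothetical module for illustration; not LlamaCppEx.Thinking itself.
  @open "<think>"
  @close "</think>"

  # Mirrors the documented `:thinking` option: start inside a think block.
  def new(opts \\ []) do
    mode = if Keyword.get(opts, :thinking, false), do: :thinking, else: :content
    %{mode: mode, buffer: ""}
  end

  # Prepend whatever was held back last time, then scan for the next tag.
  def feed(%{mode: mode, buffer: buffer}, chunk), do: scan(buffer <> chunk, mode, [])

  defp scan(data, mode, events) do
    tag = if mode == :thinking, do: @close, else: @open

    case String.split(data, tag, parts: 2) do
      [before, rest] ->
        # Complete tag found: emit the text before it and switch modes.
        scan(rest, toggle(mode), emit(events, mode, before))

      [_no_tag] ->
        # No complete tag: hold back a trailing partial tag, emit the rest.
        held_size = partial_size(data, tag)
        ready_size = byte_size(data) - held_size
        ready = binary_part(data, 0, ready_size)
        held = binary_part(data, ready_size, held_size)
        {Enum.reverse(emit(events, mode, ready)), %{mode: mode, buffer: held}}
    end
  end

  # Size of the longest suffix of `data` that is a proper prefix of `tag`.
  defp partial_size(data, tag) do
    limit = min(byte_size(data), byte_size(tag) - 1)
    Enum.find(limit..1//-1, 0, &String.ends_with?(data, binary_part(tag, 0, &1)))
  end

  defp toggle(:thinking), do: :content
  defp toggle(:content), do: :thinking

  # Empty text produces no event.
  defp emit(events, _mode, ""), do: events
  defp emit(events, mode, text), do: [{mode, text} | events]
end

# A closing tag split across two chunks is held back, then recognized.
parser = ThinkStreamSketch.new(thinking: true)
{events, parser} = ThinkStreamSketch.feed(parser, "step two</th")
# events = [{:thinking, "step two"}]  ("</th" is buffered)
{events, _parser} = ThinkStreamSketch.feed(parser, "ink>The answer")
# events = [{:content, "The answer"}]
```

In this sketch, a partial tag still buffered when the stream ends is never emitted. A real parser would need some way to flush it, which this page does not describe.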

# `parse`

```elixir
@spec parse(String.t()) :: {String.t(), String.t()}
```

Splits completed text into `{reasoning_content, content}`.

Handles both explicit `<think>...</think>` wrapping and the common case where
the chat template has already opened the `<think>` block, so the generated text
starts directly with reasoning followed by `</think>`. Text that contains no
think tags is returned as content, with an empty reasoning string.

## Examples

    iex> LlamaCppEx.Thinking.parse("<think>I need to think</think>The answer is 42")
    {"I need to think", "The answer is 42"}

    iex> LlamaCppEx.Thinking.parse("reasoning here\n</think>\nThe answer is 42")
    {"reasoning here", "The answer is 42"}

    iex> LlamaCppEx.Thinking.parse("Just a response")
    {"", "Just a response"}

    iex> LlamaCppEx.Thinking.parse("<think>reasoning only</think>")
    {"reasoning only", ""}
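
One way to reproduce the four results above is to split on the first `</think>` and strip an optional leading `<think>`. The following is a minimal sketch under that assumption. `ThinkSplitSketch.split/1` is a hypothetical name, and the whitespace trimming is inferred from the second example rather than taken from the library source.

```elixir
defmodule ThinkSplitSketch do
  # Hypothetical module for illustration; not LlamaCppEx.Thinking itself.
  @open "<think>"
  @close "</think>"

  def split(text) do
    case String.split(text, @close, parts: 2) do
      [reasoning, content] ->
        # Drop an optional opening tag, then trim whitespace around both parts
        # (trimming is an assumption based on the documented examples).
        reasoning =
          reasoning
          |> String.trim()
          |> String.replace_prefix(@open, "")
          |> String.trim()

        {reasoning, String.trim(content)}

      [content] ->
        # No closing tag: treat the whole text as content.
        {"", content}
    end
  end
end

ThinkSplitSketch.split("reasoning here\n</think>\nThe answer is 42")
# => {"reasoning here", "The answer is 42"}
```

The sketch treats text with an unclosed `<think>` as plain content. How `parse/1` handles that input is not stated here.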

# `stream_parser`

```elixir
@spec stream_parser(keyword()) :: map()
```

Creates a new streaming parser state.

Use with `feed/2` to incrementally parse streamed tokens.

## Options

  * `:thinking` - When `true`, assumes the template already opened a
    `<think>` block, so generated text starts in thinking mode. Defaults
    to `false`.
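
As a usage sketch, the snippet below folds a list of chunks through a parser created with `thinking: true` and joins the events into two strings. The chunk contents are made up for illustration, and the code assumes `llama_cpp_ex` is available as a dependency. The exact grouping of events per chunk is not asserted here.

```elixir
alias LlamaCppEx.Thinking

# Hypothetical chunks, as a model might emit them when the chat template
# has already opened the <think> block.
chunks = ["Let me check. ", "2 + 2 = 4.</th", "ink>The answer is 4."]

{events, _parser} =
  Enum.reduce(chunks, {[], Thinking.stream_parser(thinking: true)}, fn chunk, {acc, parser} ->
    {new_events, parser} = Thinking.feed(parser, chunk)
    {acc ++ new_events, parser}
  end)

# Join the text of each event kind into a single string.
reasoning = for {:thinking, text} <- events, into: "", do: text
answer = for {:content, text} <- events, into: "", do: text
```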
