Streaming allows you to receive LLM responses chunk-by-chunk as they are generated, improving perceived latency for users.
## Basic Streaming
Use `Broker.generate_stream/3` to get a stream of chunks:
```elixir
alias Mojentic.LLM.{Broker, Message}
alias Mojentic.LLM.Gateways.Ollama

broker = Broker.new("qwen3:32b", Ollama)
messages = [Message.user("Tell me a story.")]

stream = Broker.generate_stream(broker, messages)

for {:ok, chunk} <- stream do
  IO.write(chunk)
end
```

## Streaming with Tools
Mojentic supports streaming even when tools are involved. The broker pauses the stream to execute tool calls, then resumes with the final response.
```elixir
alias Mojentic.LLM.Tools.DateResolver

tools = [DateResolver]
stream = Broker.generate_stream(broker, messages, tools)

# The stream contains only text chunks;
# tool execution happens transparently in the background.
for {:ok, chunk} <- stream do
  IO.write(chunk)
end
```

## Async Streams
For integration with Phoenix LiveView or other async processes, you can consume the stream from a separate process. Because the stream implements the `Enumerable` protocol, it works with standard `Enum` and `Stream` functions.
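As one possible sketch of the LiveView pattern (the module name, event name, and assigns below are hypothetical; only `Broker.generate_stream/3` and the aliases come from the API shown above), a task can consume the stream and send each chunk back to the LiveView process as a message:

```elixir
defmodule MyAppWeb.StoryLive do
  # Hypothetical LiveView module for illustration only.
  use MyAppWeb, :live_view

  alias Mojentic.LLM.{Broker, Message}
  alias Mojentic.LLM.Gateways.Ollama

  def mount(_params, _session, socket) do
    {:ok, assign(socket, response: "")}
  end

  # Hypothetical event: the user clicks a "Generate" button.
  def handle_event("generate", _params, socket) do
    lv = self()

    # Consume the stream in a separate task so the LiveView stays responsive.
    Task.start(fn ->
      broker = Broker.new("qwen3:32b", Ollama)
      stream = Broker.generate_stream(broker, [Message.user("Tell me a story.")])

      for {:ok, chunk} <- stream do
        send(lv, {:chunk, chunk})
      end
    end)

    {:noreply, socket}
  end

  # Append each chunk to the accumulated response as it arrives.
  def handle_info({:chunk, chunk}, socket) do
    {:noreply, update(socket, :response, &(&1 <> chunk))}
  end
end
```

`Task.start/1` is used here because the task's return value is not needed; in a production app you would typically start the task under a `Task.Supervisor` so it is tied to your supervision tree.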