Nous.Providers.SGLang (nous v0.13.3)


SGLang provider implementation.

SGLang (Structured Generation Language) is a framework for efficient LLM serving with an OpenAI-compatible API. By default it runs on http://localhost:30000/v1.

Configuration

No API key is required for local usage. Configure the base URL if needed:

config :nous, :sglang,
  base_url: "http://localhost:30000/v1"

Or use an environment variable:

export SGLANG_BASE_URL="http://localhost:30000/v1"

Usage

# Via Model.parse
model = Nous.Model.parse("sglang:meta-llama/Llama-3-8B-Instruct")

# Direct provider usage
{:ok, response} = Nous.Providers.SGLang.chat(%{
  "model" => "meta-llama/Llama-3-8B-Instruct",
  "messages" => [%{"role" => "user", "content" => "Hello"}]
})

Features

SGLang supports:

  • OpenAI-compatible chat completions
  • Streaming responses
  • RadixAttention for KV cache reuse
  • Constrained decoding (JSON, regex)
  • Speculative decoding
  • Multi-modal inputs
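Streaming can be sketched through the provider's request_stream/3 entry point. This is a hedged example: the exact return shape (a lazy stream of text chunks) is an assumption about this provider, and the model name is illustrative.

```elixir
# Sketch only: assumes request_stream/3 returns {:ok, stream} where each
# element is a text delta, mirroring OpenAI-style streaming chunks.
model = Nous.Model.parse("sglang:meta-llama/Llama-3-8B-Instruct")
messages = [%{"role" => "user", "content" => "Write a haiku about caches"}]

{:ok, stream} = Nous.Providers.SGLang.request_stream(model, messages, %{})

stream
|> Stream.each(fn chunk -> IO.write(chunk) end)
|> Stream.run()
```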

SGLang-Specific Parameters

Additional parameters supported (pass them in the params map):

  • regex - Constrain output to match a regex pattern
  • json_schema - Constrain output to match a JSON schema

Summary

Functions

Get the API key from options, environment, or application config.

Get the base URL from options, application config, or default.

Count tokens in messages (rough estimate).

High-level request with message conversion, telemetry, and error wrapping.

High-level streaming request with message conversion and telemetry.

Functions

api_key(opts \\ [])

@spec api_key(keyword()) :: String.t() | nil

Get the API key from options, environment, or application config.

Lookup order:

  1. :api_key option passed directly
  2. Environment variable (SGLANG_API_KEY)
  3. Application config: config :nous, :sglang, api_key: "..."
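Per the lookup order above, an :api_key option passed directly takes precedence over the environment variable and application config:

```elixir
# A directly passed option wins over SGLANG_API_KEY and app config.
Nous.Providers.SGLang.api_key(api_key: "sk-local")
#=> "sk-local"
```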

base_url(opts \\ [])

@spec base_url(keyword()) :: String.t()

Get the base URL from options, application config, or default.

Lookup order:

  1. :base_url option passed directly
  2. Application config: config :nous, :sglang, base_url: "..."
  3. Default: http://localhost:30000/v1
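Following that order, a direct option overrides everything, and with nothing configured the documented default applies. The host name in the override example is illustrative.

```elixir
# A direct option overrides app config and the default.
Nous.Providers.SGLang.base_url(base_url: "http://gpu-box:30000/v1")
#=> "http://gpu-box:30000/v1"

# With no option and no application config set, the default is returned.
Nous.Providers.SGLang.base_url()
#=> "http://localhost:30000/v1"
```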

count_tokens(messages)

@spec count_tokens(list()) :: integer()

Count tokens in messages (rough estimate).

Override this in your provider for more accurate counting.
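A minimal usage sketch; the returned number is only an estimate and the exact heuristic is implementation-defined, so no specific value is shown.

```elixir
messages = [
  %{"role" => "system", "content" => "You are terse."},
  %{"role" => "user", "content" => "Hello"}
]

# Returns an integer token estimate for the message list.
estimate = Nous.Providers.SGLang.count_tokens(messages)
true = is_integer(estimate)
```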

request(model, messages, settings)

High-level request with message conversion, telemetry, and error wrapping.

Default implementation that:

  1. Converts messages to provider format
  2. Builds request params
  3. Calls chat/2
  4. Parses response
  5. Emits telemetry events
  6. Wraps errors
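The steps above can be sketched as a single call site: request/3 handles conversion, telemetry, and error wrapping internally, so callers only branch on the result. The settings key and the response handling shown are assumptions about the concrete types.

```elixir
# Hedged sketch: assumes request/3 returns {:ok, response} | {:error, reason}
# and that settings accepts common sampling keys such as :temperature.
model = Nous.Model.parse("sglang:meta-llama/Llama-3-8B-Instruct")
messages = [%{"role" => "user", "content" => "Summarize RadixAttention"}]

case Nous.Providers.SGLang.request(model, messages, %{temperature: 0.2}) do
  {:ok, response} -> IO.inspect(response, label: "response")
  {:error, reason} -> IO.inspect(reason, label: "request failed")
end
```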

request_stream(model, messages, settings)

High-level streaming request with message conversion and telemetry.