Langfuse

Community Elixir SDK for Langfuse, the open-source platform for LLM observability, tracing, and prompt management.

Note: This is an unofficial community-maintained SDK, not affiliated with or endorsed by Langfuse GmbH.

Features

  • Tracing - Create traces, spans, generations, and events for LLM observability
  • Scoring - Attach numeric, categorical, and boolean scores to traces and observations
  • Sessions - Group related traces into conversations
  • Prompts - Fetch, cache, and compile version-controlled prompts
  • Client API - Full REST API access for datasets, models, and management
  • OpenTelemetry - Optional integration for distributed tracing
  • Instrumentation - Macros for automatic function tracing
  • Data Masking - Redact sensitive data before sending to Langfuse
  • Async Batching - Non-blocking event ingestion with configurable batching

Installation

Add langfuse to your list of dependencies in mix.exs:

def deps do
  [
    {:langfuse, "~> 0.1.0"}
  ]
end

For OpenTelemetry integration, add the optional dependencies:

def deps do
  [
    {:langfuse, "~> 0.1.0"},
    {:opentelemetry_api, "~> 1.4"},
    {:opentelemetry, "~> 1.5"}
  ]
end

Configuration

Configure Langfuse in your config/config.exs:

config :langfuse,
  public_key: "pk-...",
  secret_key: "sk-...",
  host: "https://cloud.langfuse.com"

Or use environment variables:

export LANGFUSE_PUBLIC_KEY="pk-..."
export LANGFUSE_SECRET_KEY="sk-..."
export LANGFUSE_HOST="https://cloud.langfuse.com"

Configuration Options

| Option | Type | Default | Description |
|--------|------|---------|-------------|
| `public_key` | string | - | Langfuse public key (or `LANGFUSE_PUBLIC_KEY`) |
| `secret_key` | string | - | Langfuse secret key (or `LANGFUSE_SECRET_KEY`) |
| `host` | string | `https://cloud.langfuse.com` | Langfuse API host |
| `environment` | string | `nil` | Environment tag (e.g., `"production"`, `"staging"`) |
| `enabled` | boolean | `true` | Enable/disable the SDK |
| `flush_interval` | integer | `5000` | Batch flush interval in ms |
| `batch_size` | integer | `100` | Maximum events per batch |
| `max_retries` | integer | `3` | HTTP retry attempts |
| `debug` | boolean | `false` | Enable debug logging |
| `mask_fn` | function | `nil` | Custom function for masking sensitive data |
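
Taken together, these options can live in `config/runtime.exs` so that credentials are read from the environment at boot rather than baked in at compile time. A minimal sketch — only the option names come from the table above; the env-var handling is ordinary Elixir runtime configuration:

```elixir
# config/runtime.exs
import Config

config :langfuse,
  public_key: System.fetch_env!("LANGFUSE_PUBLIC_KEY"),
  secret_key: System.fetch_env!("LANGFUSE_SECRET_KEY"),
  host: System.get_env("LANGFUSE_HOST", "https://cloud.langfuse.com"),
  environment: to_string(config_env()),
  flush_interval: 2_000,
  batch_size: 50
```

Using `System.fetch_env!/1` for the keys makes a missing credential fail loudly at startup instead of silently disabling tracing.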

Quick Start

Tracing

trace = Langfuse.trace(
  name: "chat-request",
  user_id: "user-123",
  metadata: %{source: "api"},
  version: "1.0.0",
  release: "2025-01-15"
)

span = Langfuse.span(trace,
  name: "document-retrieval",
  type: :retriever,
  input: %{query: "test"}
)
span = Langfuse.update(span, output: retrieved_docs)
span = Langfuse.end_observation(span)

generation = Langfuse.generation(trace,
  name: "chat-completion",
  model: "gpt-4",
  input: [%{role: "user", content: "Hello"}],
  model_parameters: %{temperature: 0.7}
)

generation = Langfuse.update(generation,
  output: %{role: "assistant", content: "Hi there!"},
  usage: %{input: 10, output: 5, total: 15}
)
generation = Langfuse.end_observation(generation)

Langfuse.score(trace, name: "quality", value: 0.9)

Span Types

Spans support semantic types for better organization in the Langfuse UI:

Langfuse.span(trace, name: "agent-loop", type: :agent)
Langfuse.span(trace, name: "tool-call", type: :tool)
Langfuse.span(trace, name: "rag-chain", type: :chain)
Langfuse.span(trace, name: "doc-search", type: :retriever)
Langfuse.span(trace, name: "embed-text", type: :embedding)
Langfuse.span(trace, name: "generic-step", type: :default)

Sessions

Group related traces into sessions:

session_id = Langfuse.Session.new_id()

trace1 = Langfuse.trace(name: "turn-1", session_id: session_id)
trace2 = Langfuse.trace(name: "turn-2", session_id: session_id)

Langfuse.Session.score(session_id, name: "satisfaction", value: 4.5)

Prompts

Fetch and use prompts from Langfuse:

{:ok, prompt} = Langfuse.Prompt.get("my-prompt")
{:ok, prompt} = Langfuse.Prompt.get("my-prompt", version: 2)
{:ok, prompt} = Langfuse.Prompt.get("my-prompt", label: "production")

compiled = Langfuse.Prompt.compile(prompt, %{name: "Alice", topic: "weather"})

generation = Langfuse.generation(trace,
  name: "chat",
  prompt_name: prompt.name,
  prompt_version: prompt.version,
  input: compiled
)

Prompts are cached by default. To invalidate:

Langfuse.Prompt.invalidate("my-prompt")
Langfuse.Prompt.invalidate("my-prompt", version: 2)
Langfuse.Prompt.invalidate_all()

Use fallback prompts when fetch fails:

fallback = %Langfuse.Prompt{
  name: "my-prompt",
  prompt: "Default template: {{name}}",
  type: :text
}

{:ok, prompt} = Langfuse.Prompt.get("my-prompt", fallback: fallback)

Scores

Score traces, observations, or sessions:

Langfuse.score(trace, name: "quality", value: 0.85)

Langfuse.score(trace,
  name: "sentiment",
  string_value: "positive",
  data_type: :categorical
)

Langfuse.score(trace,
  name: "hallucination",
  value: false,
  data_type: :boolean
)

Langfuse.score(trace,
  name: "feedback",
  value: 5,
  comment: "Excellent response",
  metadata: %{reviewer: "human"}
)

API Coverage

This SDK covers the core Langfuse API. See the Langfuse API Reference for full documentation.

Tracing (via SDK)

| Feature | Function | Status |
|---------|----------|--------|
| Create trace | `Langfuse.trace/1` | Supported |
| Create span | `Langfuse.span/2` | Supported |
| Create generation | `Langfuse.generation/2` | Supported |
| Create event | `Langfuse.event/2` | Supported |
| Create score | `Langfuse.score/2` | Supported |
| Update observation | `Langfuse.update/2` | Supported |
| End observation | `Langfuse.end_observation/1` | Supported |
| Batch ingestion | `Langfuse.Ingestion` | Supported |

Prompts

| Operation | Function | Status |
|-----------|----------|--------|
| Get prompt | `Client.get_prompt/2` | Supported |
| List prompts | `Client.list_prompts/1` | Supported |
| Create prompt | `Client.create_prompt/1` | Supported |
| Update labels | `Client.update_prompt_labels/3` | Supported |

Datasets

| Operation | Function | Status |
|-----------|----------|--------|
| Create dataset | `Client.create_dataset/1` | Supported |
| Get dataset | `Client.get_dataset/1` | Supported |
| List datasets | `Client.list_datasets/1` | Supported |
| Delete dataset | `Client.delete_dataset/1` | Supported |

Dataset Items

| Operation | Function | Status |
|-----------|----------|--------|
| Create item | `Client.create_dataset_item/1` | Supported |
| Get item | `Client.get_dataset_item/1` | Supported |
| Update item | `Client.update_dataset_item/2` | Supported |
| List items | `Client.list_dataset_items/1` | Supported |
| Delete item | `Client.delete_dataset_item/1` | Supported |

Dataset Runs

| Operation | Function | Status |
|-----------|----------|--------|
| Create run | `Client.create_dataset_run/1` | Supported |
| Get run | `Client.get_dataset_run/2` | Supported |
| List runs | `Client.list_dataset_runs/2` | Supported |
| Delete run | `Client.delete_dataset_run/2` | Supported |
| Create run item | `Client.create_dataset_run_item/1` | Supported |
| List run items | `Client.list_dataset_run_items/1` | Supported |

Traces & Sessions

| Operation | Function | Status |
|-----------|----------|--------|
| Get trace | `Client.get_trace/1` | Supported |
| List traces | `Client.list_traces/1` | Supported |
| Get session | `Client.get_session/1` | Supported |
| List sessions | `Client.list_sessions/1` | Supported |

Observations

| Operation | Function | Status |
|-----------|----------|--------|
| Get observation | `Client.get_observation/1` | Supported |
| List observations | `Client.list_observations/1` | Supported |

Scores

| Operation | Function | Status |
|-----------|----------|--------|
| Create score | `Langfuse.score/2` | Supported |
| Get score | `Client.get_score/1` | Supported |
| List scores | `Client.list_scores/1` | Supported |
| Delete score | `Client.delete_score/1` | Supported |

Score Configs

| Operation | Function | Status |
|-----------|----------|--------|
| Create config | `Client.create_score_config/1` | Supported |
| Get config | `Client.get_score_config/1` | Supported |
| List configs | `Client.list_score_configs/1` | Supported |

Models

| Operation | Function | Status |
|-----------|----------|--------|
| Create model | `Client.create_model/1` | Supported |
| Get model | `Client.get_model/1` | Supported |
| List models | `Client.list_models/1` | Supported |
| Delete model | `Client.delete_model/1` | Supported |

Health & Auth

| Operation | Function | Status |
|-----------|----------|--------|
| Auth check | `Langfuse.auth_check/0` | Supported |
| Health check | `Client.get("/api/public/health")` | Via raw API |

Not Yet Implemented

The following Langfuse API features are not yet implemented but can be accessed via Client.get/2, Client.post/2, Client.patch/2, and Client.delete/1:

  • Annotation Queues
  • Comments
  • Media (file uploads)
  • Metrics
  • Projects management
  • Organizations management
  • SCIM provisioning
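
As a sketch of that escape hatch, the raw client can reach one of these endpoints directly. The endpoint path and parameter names below follow the public Langfuse REST API for comments but are assumptions, not verified against this SDK:

```elixir
# Hypothetical: list comments attached to a trace via the raw client.
{:ok, comments} =
  Langfuse.Client.get("/api/public/comments", %{
    "objectType" => "TRACE",
    "objectId" => trace_id
  })

# Hypothetical: create a comment on the same trace.
{:ok, _comment} =
  Langfuse.Client.post("/api/public/comments", %{
    "objectType" => "TRACE",
    "objectId" => trace_id,
    "content" => "Needs human review"
  })
```

Responses come back as decoded JSON maps, so downstream code should index with string keys.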

Client API Examples

{:ok, _} = Langfuse.auth_check()

{:ok, dataset} = Langfuse.Client.create_dataset(name: "eval-set")
{:ok, datasets} = Langfuse.Client.list_datasets()

{:ok, item} = Langfuse.Client.create_dataset_item(
  dataset_name: "eval-set",
  input: %{query: "test"},
  expected_output: %{answer: "response"}
)
{:ok, _} = Langfuse.Client.update_dataset_item(item["id"], status: "ARCHIVED")

{:ok, run} = Langfuse.Client.create_dataset_run(
  dataset_name: "eval-set",
  name: "experiment-1"
)

{:ok, model} = Langfuse.Client.create_model(
  model_name: "gpt-4-turbo",
  match_pattern: "(?i)^(gpt-4-turbo)$",
  input_price: 0.01,
  output_price: 0.03,
  unit: "TOKENS"
)
{:ok, models} = Langfuse.Client.list_models()

{:ok, observations} = Langfuse.Client.list_observations(trace_id: trace.id)
{:ok, observation} = Langfuse.Client.get_observation(observation_id)

{:ok, prompt} = Langfuse.Client.get_prompt("my-prompt", version: 1)

{:ok, config} = Langfuse.Client.create_score_config(
  name: "quality",
  data_type: "NUMERIC",
  min_value: 0,
  max_value: 1
)

Instrumentation

Use macros for automatic function tracing:

defmodule MyApp.Agent do
  use Langfuse.Instrumentation

  @trace name: "agent-run"
  def run(input) do
    process(input)
  end

  @span name: "process-step", type: :chain
  def process(input) do
    call_llm(input)
  end

  @generation name: "llm-call", model: "gpt-4"
  def call_llm(input) do
    # LLM call here
  end
end

OpenTelemetry Integration

For applications using OpenTelemetry, Langfuse can receive spans via a custom span processor:

config :opentelemetry,
  span_processor: {Langfuse.OpenTelemetry.SpanProcessor, []}

Or configure programmatically:

Langfuse.OpenTelemetry.Setup.configure()

Map OpenTelemetry attributes to Langfuse fields:

require OpenTelemetry.Tracer

OpenTelemetry.Tracer.with_span "llm-call", %{attributes: %{
  "langfuse.type" => "generation",
  "langfuse.model" => "gpt-4",
  "langfuse.input" => Jason.encode!(messages)
}} do
  response = call_model(messages)  # your LLM call
  OpenTelemetry.Tracer.set_attribute("langfuse.output", Jason.encode!(response))
  response
end

See Langfuse.OpenTelemetry for full documentation.

Data Masking

Redact sensitive data before sending to Langfuse:

config :langfuse,
  mask_fn: &MyApp.Masking.mask/1

defmodule MyApp.Masking do
  def mask(data) do
    Langfuse.Masking.mask(data,
      patterns: [
        ~r/\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Z|a-z]{2,}\b/,
        ~r/\b\d{3}-\d{2}-\d{4}\b/
      ],
      replacement: "[REDACTED]"
    )
  end
end

Or use the built-in masking:

config :langfuse,
  mask_fn: {Langfuse.Masking, :mask, [[
    patterns: [~r/secret_\w+/i],
    keys: ["password", "api_key", "token"]
  ]]}
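
If you'd rather not rely on `Langfuse.Masking` at all, a hand-rolled mask function is a few lines of plain Elixir. This standalone sketch (the module name is illustrative) redacts email addresses anywhere in a nested payload:

```elixir
defmodule MyApp.SimpleMask do
  @email ~r/\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b/

  # Walk the payload and redact any string that matches the pattern.
  def mask(data) when is_binary(data), do: Regex.replace(@email, data, "[REDACTED]")
  def mask(data) when is_map(data), do: Map.new(data, fn {k, v} -> {k, mask(v)} end)
  def mask(data) when is_list(data), do: Enum.map(data, &mask/1)
  def mask(data), do: data
end

MyApp.SimpleMask.mask(%{input: "Contact alice@example.com", meta: ["bob@example.com"]})
# => %{input: "Contact [REDACTED]", meta: ["[REDACTED]"]}
```

Because the function recurses over maps and lists, it works on whatever shape your trace inputs and outputs take.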

Telemetry

The SDK emits telemetry events for observability:

| Event | Measurements | Metadata |
|-------|--------------|----------|
| `[:langfuse, :ingestion, :flush, :start\|:stop\|:exception]` | `duration` | `batch_size` |
| `[:langfuse, :http, :request, :start\|:stop\|:exception]` | `duration` | `method`, `path`, `status` |
| `[:langfuse, :prompt, :fetch, :start\|:stop\|:exception]` | `duration` | `name`, `version` |
| `[:langfuse, :prompt, :cache, :hit\|:miss]` | - | `name`, `version` |

For example, attach a handler that logs HTTP request durations:

require Logger

:telemetry.attach(
  "langfuse-logger",
  [:langfuse, :http, :request, :stop],
  fn _event, measurements, metadata, _config ->
    duration_ms = System.convert_time_unit(measurements.duration, :native, :millisecond)
    Logger.info("Langfuse HTTP #{metadata.method} #{metadata.path}: #{duration_ms}ms")
  end,
  nil
)

Or attach the SDK's built-in default logger:

Langfuse.Telemetry.attach_default_logger()

Testing

The SDK provides helpers for testing applications that use Langfuse:

Disable event delivery in your test configuration:

config :langfuse, enabled: false

Then capture and assert on events in your tests:

defmodule MyApp.TracingTest do
  use ExUnit.Case
  import Langfuse.Testing

  setup do
    start_supervised!({Langfuse.Testing.EventCapture, []})
    :ok
  end

  test "traces are created" do
    MyApp.Agent.run("test input")

    assert_traced("agent-run")
    assert_generation_created("llm-call", model: "gpt-4")
  end
end

For mocking HTTP calls:

Mox.defmock(Langfuse.HTTPMock, for: Langfuse.HTTPBehaviour)

config :langfuse, http_client: Langfuse.HTTPMock

Graceful Shutdown

The SDK automatically flushes pending events on application shutdown. For explicit control:

# Flush pending events now, with the default timeout
Langfuse.flush()

# Flush with a custom timeout in milliseconds
Langfuse.flush(timeout: 10_000)

# Flush remaining events and shut down the SDK
Langfuse.shutdown()

Runtime Configuration

Reload configuration at runtime (useful for feature flags):

Application.put_env(:langfuse, :enabled, false)
Langfuse.Config.reload()

License

MIT License - see LICENSE for details.