Langfuse

Community Elixir SDK for Langfuse, the open-source platform for LLM observability, tracing, and prompt management.

Note: This is an unofficial community-maintained SDK, not affiliated with or endorsed by Langfuse GmbH.

Features

  • Tracing - Create traces, spans, generations, and events for LLM observability
  • Scoring - Attach numeric, categorical, and boolean scores to traces and observations
  • Sessions - Group related traces into conversations
  • Prompts - Fetch, cache, and compile version-controlled prompts
  • Client API - Full REST API access for datasets, models, and management
  • OpenTelemetry - Optional integration for distributed tracing
  • Instrumentation - Macros for automatic function tracing
  • Data Masking - Redact sensitive data before sending to Langfuse
  • Async Batching - Non-blocking event ingestion with configurable batching

Installation

Add langfuse to your list of dependencies in mix.exs:

def deps do
  [
    {:langfuse, "~> 0.1.0"}
  ]
end

For OpenTelemetry integration, add the optional dependencies:

def deps do
  [
    {:langfuse, "~> 0.1.0"},
    {:opentelemetry_api, "~> 1.4"},
    {:opentelemetry, "~> 1.5"}
  ]
end

Configuration

Configure Langfuse in your config/config.exs:

config :langfuse,
  public_key: "pk-...",
  secret_key: "sk-...",
  host: "https://cloud.langfuse.com"

Or use environment variables:

export LANGFUSE_PUBLIC_KEY="pk-..."
export LANGFUSE_SECRET_KEY="sk-..."
export LANGFUSE_HOST="https://cloud.langfuse.com"

Configuration Options

| Option | Type | Default | Description |
|--------|------|---------|-------------|
| `public_key` | string | - | Langfuse public key (or `LANGFUSE_PUBLIC_KEY`) |
| `secret_key` | string | - | Langfuse secret key (or `LANGFUSE_SECRET_KEY`) |
| `host` | string | `https://cloud.langfuse.com` | Langfuse API host |
| `environment` | string | `nil` | Environment tag (e.g., `"production"`, `"staging"`) |
| `enabled` | boolean | `true` | Enable/disable the SDK |
| `flush_interval` | integer | `5000` | Batch flush interval in ms |
| `batch_size` | integer | `100` | Maximum events per batch |
| `max_retries` | integer | `3` | HTTP retry attempts |
| `debug` | boolean | `false` | Enable debug logging |
| `mask_fn` | function | `nil` | Custom function for masking sensitive data |
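
Taken together, these options can live in `config/runtime.exs` so that credentials are read from the environment at boot rather than baked in at compile time. A minimal sketch — only the option names come from the table above; the env-var handling is ordinary Elixir runtime configuration:

```elixir
# config/runtime.exs
import Config

config :langfuse,
  public_key: System.fetch_env!("LANGFUSE_PUBLIC_KEY"),
  secret_key: System.fetch_env!("LANGFUSE_SECRET_KEY"),
  host: System.get_env("LANGFUSE_HOST", "https://cloud.langfuse.com"),
  environment: to_string(config_env()),
  flush_interval: 2_000,
  batch_size: 50
```

Using `System.fetch_env!/1` for the keys makes a missing credential fail loudly at startup instead of silently disabling tracing.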

Quick Start

Tracing

trace = Langfuse.trace(
  name: "chat-request",
  user_id: "user-123",
  metadata: %{source: "api"},
  version: "1.0.0",
  release: "2025-01-15"
)

span = Langfuse.span(trace,
  name: "document-retrieval",
  type: :retriever,
  input: %{query: "test"}
)
span = Langfuse.update(span, output: retrieved_docs)
span = Langfuse.end_observation(span)

generation = Langfuse.generation(trace,
  name: "chat-completion",
  model: "gpt-4",
  input: [%{role: "user", content: "Hello"}],
  model_parameters: %{temperature: 0.7}
)

generation = Langfuse.update(generation,
  output: %{role: "assistant", content: "Hi there!"},
  usage: %{input: 10, output: 5, total: 15}
)
generation = Langfuse.end_observation(generation)

Langfuse.score(trace, name: "quality", value: 0.9)

Span Types

Spans support semantic types for better organization in the Langfuse UI:

Langfuse.span(trace, name: "agent-loop", type: :agent)
Langfuse.span(trace, name: "tool-call", type: :tool)
Langfuse.span(trace, name: "rag-chain", type: :chain)
Langfuse.span(trace, name: "doc-search", type: :retriever)
Langfuse.span(trace, name: "embed-text", type: :embedding)
Langfuse.span(trace, name: "generic-step", type: :default)

Sessions

Group related traces into sessions:

session_id = Langfuse.Session.new_id()

trace1 = Langfuse.trace(name: "turn-1", session_id: session_id)
trace2 = Langfuse.trace(name: "turn-2", session_id: session_id)

Langfuse.Session.score(session_id, name: "satisfaction", value: 4.5)

Prompts

Fetch and use prompts from Langfuse:

{:ok, prompt} = Langfuse.Prompt.get("my-prompt")
{:ok, prompt} = Langfuse.Prompt.get("my-prompt", version: 2)
{:ok, prompt} = Langfuse.Prompt.get("my-prompt", label: "production")

compiled = Langfuse.Prompt.compile(prompt, %{name: "Alice", topic: "weather"})

generation = Langfuse.generation(trace,
  name: "chat",
  prompt_name: prompt.name,
  prompt_version: prompt.version,
  input: compiled
)

Prompts are cached by default. To invalidate:

Langfuse.Prompt.invalidate("my-prompt")
Langfuse.Prompt.invalidate("my-prompt", version: 2)
Langfuse.Prompt.invalidate_all()

Use fallback prompts when fetch fails:

fallback = %Langfuse.Prompt{
  name: "my-prompt",
  prompt: "Default template: {{name}}",
  type: :text
}

{:ok, prompt} = Langfuse.Prompt.get("my-prompt", fallback: fallback)

Scores

Score traces, observations, or sessions:

Langfuse.score(trace, name: "quality", value: 0.85)

Langfuse.score(trace,
  name: "sentiment",
  string_value: "positive",
  data_type: :categorical
)

Langfuse.score(trace,
  name: "hallucination",
  value: false,
  data_type: :boolean
)

Langfuse.score(trace,
  name: "feedback",
  value: 5,
  comment: "Excellent response",
  metadata: %{reviewer: "human"}
)

API Coverage

This SDK covers the core Langfuse API. See the Langfuse API Reference for full documentation.

Tracing (via SDK)

| Feature | Function | Status |
|---------|----------|--------|
| Create trace | `Langfuse.trace/1` | Supported |
| Create span | `Langfuse.span/2` | Supported |
| Create generation | `Langfuse.generation/2` | Supported |
| Create event | `Langfuse.event/2` | Supported |
| Create score | `Langfuse.score/2` | Supported |
| Update observation | `Langfuse.update/2` | Supported |
| End observation | `Langfuse.end_observation/1` | Supported |
| Batch ingestion | `Langfuse.Ingestion` | Supported |

Prompts

| Operation | Function | Status |
|-----------|----------|--------|
| Get prompt | `Client.get_prompt/2` | Supported |
| List prompts | `Client.list_prompts/1` | Supported |
| Create prompt | `Client.create_prompt/1` | Supported |
| Update labels | `Client.update_prompt_labels/3` | Supported |

Datasets

| Operation | Function | Status |
|-----------|----------|--------|
| Create dataset | `Client.create_dataset/1` | Supported |
| Get dataset | `Client.get_dataset/1` | Supported |
| List datasets | `Client.list_datasets/1` | Supported |
| Delete dataset | `Client.delete_dataset/1` | Supported |

Dataset Items

| Operation | Function | Status |
|-----------|----------|--------|
| Create item | `Client.create_dataset_item/1` | Supported |
| Get item | `Client.get_dataset_item/1` | Supported |
| Update item | `Client.update_dataset_item/2` | Supported |
| List items | `Client.list_dataset_items/1` | Supported |
| Delete item | `Client.delete_dataset_item/1` | Supported |

Dataset Runs

| Operation | Function | Status |
|-----------|----------|--------|
| Create run | `Client.create_dataset_run/1` | Supported |
| Get run | `Client.get_dataset_run/2` | Supported |
| List runs | `Client.list_dataset_runs/2` | Supported |
| Delete run | `Client.delete_dataset_run/2` | Supported |
| Create run item | `Client.create_dataset_run_item/1` | Supported |
| List run items | `Client.list_dataset_run_items/1` | Supported |

Traces & Sessions

| Operation | Function | Status |
|-----------|----------|--------|
| Get trace | `Client.get_trace/1` | Supported |
| List traces | `Client.list_traces/1` | Supported |
| Get session | `Client.get_session/1` | Supported |
| List sessions | `Client.list_sessions/1` | Supported |

Observations

| Operation | Function | Status |
|-----------|----------|--------|
| Get observation | `Client.get_observation/1` | Supported |
| List observations | `Client.list_observations/1` | Supported |

Scores

| Operation | Function | Status |
|-----------|----------|--------|
| Create score | `Langfuse.score/2` | Supported |
| Get score | `Client.get_score/1` | Supported |
| List scores | `Client.list_scores/1` | Supported |
| Delete score | `Client.delete_score/1` | Supported |

Score Configs

| Operation | Function | Status |
|-----------|----------|--------|
| Create config | `Client.create_score_config/1` | Supported |
| Get config | `Client.get_score_config/1` | Supported |
| List configs | `Client.list_score_configs/1` | Supported |

Models

| Operation | Function | Status |
|-----------|----------|--------|
| Create model | `Client.create_model/1` | Supported |
| Get model | `Client.get_model/1` | Supported |
| List models | `Client.list_models/1` | Supported |
| Delete model | `Client.delete_model/1` | Supported |

Health & Auth

| Operation | Function | Status |
|-----------|----------|--------|
| Auth check | `Langfuse.auth_check/0` | Supported |
| Health check | `Client.get("/api/public/health")` | Via raw API |

Not Yet Implemented

The following Langfuse API features are not yet implemented but can be accessed via Client.get/2, Client.post/2, Client.patch/2, and Client.delete/1:

  • Annotation Queues
  • Comments
  • Media (file uploads)
  • Metrics
  • Projects management
  • Organizations management
  • SCIM provisioning
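
As a sketch of that escape hatch, the raw client can reach one of these endpoints directly. The endpoint path and parameter names below follow the public Langfuse REST API for comments but are assumptions, not verified against this SDK:

```elixir
# Hypothetical: list comments attached to a trace via the raw client.
{:ok, comments} =
  Langfuse.Client.get("/api/public/comments", %{
    "objectType" => "TRACE",
    "objectId" => trace_id
  })

# Hypothetical: create a comment on the same trace.
{:ok, _comment} =
  Langfuse.Client.post("/api/public/comments", %{
    "objectType" => "TRACE",
    "objectId" => trace_id,
    "content" => "Needs human review"
  })
```

Responses come back as decoded JSON maps, so downstream code should index with string keys.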

Client API Examples

{:ok, _} = Langfuse.auth_check()

{:ok, dataset} = Langfuse.Client.create_dataset(name: "eval-set")
{:ok, datasets} = Langfuse.Client.list_datasets()

{:ok, item} = Langfuse.Client.create_dataset_item(
  dataset_name: "eval-set",
  input: %{query: "test"},
  expected_output: %{answer: "response"}
)
{:ok, _} = Langfuse.Client.update_dataset_item(item["id"], status: "ARCHIVED")

{:ok, run} = Langfuse.Client.create_dataset_run(
  dataset_name: "eval-set",
  name: "experiment-1"
)

{:ok, model} = Langfuse.Client.create_model(
  model_name: "gpt-4-turbo",
  match_pattern: "(?i)^(gpt-4-turbo)$",
  input_price: 0.01,
  output_price: 0.03,
  unit: "TOKENS"
)
{:ok, models} = Langfuse.Client.list_models()

{:ok, observations} = Langfuse.Client.list_observations(trace_id: trace.id)
{:ok, observation} = Langfuse.Client.get_observation(observation_id)

{:ok, prompt} = Langfuse.Client.get_prompt("my-prompt", version: 1)

{:ok, config} = Langfuse.Client.create_score_config(
  name: "quality",
  data_type: "NUMERIC",
  min_value: 0,
  max_value: 1
)

Instrumentation

Use macros for automatic function tracing:

defmodule MyApp.Agent do
  use Langfuse.Instrumentation

  @trace name: "agent-run"
  def run(input) do
    process(input)
  end

  @span name: "process-step", type: :chain
  def process(input) do
    call_llm(input)
  end

  @generation name: "llm-call", model: "gpt-4"
  def call_llm(input) do
    # LLM call here
  end
end

OpenTelemetry Integration

For applications using OpenTelemetry, Langfuse can receive spans via a custom span processor:

config :opentelemetry,
  span_processor: {Langfuse.OpenTelemetry.SpanProcessor, []}

Or configure programmatically:

Langfuse.OpenTelemetry.Setup.configure()

Map OpenTelemetry attributes to Langfuse fields:

require OpenTelemetry.Tracer

OpenTelemetry.Tracer.with_span "llm-call", %{attributes: %{
  "langfuse.type" => "generation",
  "langfuse.model" => "gpt-4",
  "langfuse.input" => Jason.encode!(messages)
}} do
  response = call_model(messages)  # your LLM call
  OpenTelemetry.Tracer.set_attribute("langfuse.output", Jason.encode!(response))
  response
end

See Langfuse.OpenTelemetry for full documentation.

Data Masking

Redact sensitive data before sending to Langfuse:

config :langfuse,
  mask_fn: &MyApp.Masking.mask/1

defmodule MyApp.Masking do
  def mask(data) do
    Langfuse.Masking.mask(data,
      patterns: [
        ~r/\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Z|a-z]{2,}\b/,
        ~r/\b\d{3}-\d{2}-\d{4}\b/
      ],
      replacement: "[REDACTED]"
    )
  end
end

Or use the built-in masking:

config :langfuse,
  mask_fn: {Langfuse.Masking, :mask, [[
    patterns: [~r/secret_\w+/i],
    keys: ["password", "api_key", "token"]
  ]]}
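
If you'd rather not rely on `Langfuse.Masking` at all, a hand-rolled mask function is a few lines of plain Elixir. This standalone sketch (the module name is illustrative) redacts email addresses anywhere in a nested payload:

```elixir
defmodule MyApp.SimpleMask do
  @email ~r/\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b/

  # Walk the payload and redact any string that matches the pattern.
  def mask(data) when is_binary(data), do: Regex.replace(@email, data, "[REDACTED]")
  def mask(data) when is_map(data), do: Map.new(data, fn {k, v} -> {k, mask(v)} end)
  def mask(data) when is_list(data), do: Enum.map(data, &mask/1)
  def mask(data), do: data
end

MyApp.SimpleMask.mask(%{input: "Contact alice@example.com", meta: ["bob@example.com"]})
# => %{input: "Contact [REDACTED]", meta: ["[REDACTED]"]}
```

Because the function recurses over maps and lists, it works on whatever shape your trace inputs and outputs take.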

Telemetry

The SDK emits telemetry events for observability:

| Event | Measurements | Metadata |
|-------|--------------|----------|
| `[:langfuse, :ingestion, :flush, :start\|:stop\|:exception]` | `duration` | `batch_size` |
| `[:langfuse, :http, :request, :start\|:stop\|:exception]` | `duration` | `method`, `path`, `status` |
| `[:langfuse, :prompt, :fetch, :start\|:stop\|:exception]` | `duration` | `name`, `version` |
| `[:langfuse, :prompt, :cache, :hit\|:miss]` | - | `name`, `version` |

For example, attach a handler that logs HTTP request durations:

require Logger

:telemetry.attach(
  "langfuse-logger",
  [:langfuse, :http, :request, :stop],
  fn _event, measurements, metadata, _config ->
    duration_ms = System.convert_time_unit(measurements.duration, :native, :millisecond)
    Logger.info("Langfuse HTTP #{metadata.method} #{metadata.path}: #{duration_ms}ms")
  end,
  nil
)

Or attach the SDK's built-in default logger:

Langfuse.Telemetry.attach_default_logger()

Testing

The SDK provides helpers for testing applications that use Langfuse:

Disable event delivery in your test configuration:

config :langfuse, enabled: false

Then capture and assert on events in your tests:

defmodule MyApp.TracingTest do
  use ExUnit.Case
  import Langfuse.Testing

  setup do
    start_supervised!({Langfuse.Testing.EventCapture, []})
    :ok
  end

  test "traces are created" do
    MyApp.Agent.run("test input")

    assert_traced("agent-run")
    assert_generation_created("llm-call", model: "gpt-4")
  end
end

For mocking HTTP calls:

Mox.defmock(Langfuse.HTTPMock, for: Langfuse.HTTPBehaviour)

config :langfuse, http_client: Langfuse.HTTPMock

Graceful Shutdown

The SDK automatically flushes pending events on application shutdown. For explicit control:

# Flush pending events now, with the default timeout
Langfuse.flush()

# Flush with a custom timeout in milliseconds
Langfuse.flush(timeout: 10_000)

# Flush remaining events and shut down the SDK
Langfuse.shutdown()

Runtime Configuration

Reload configuration at runtime (useful for feature flags):

Application.put_env(:langfuse, :enabled, false)
Langfuse.Config.reload()

License

MIT License - see LICENSE for details.