Streaming Validation
ExJsonschema works with Elixir's Stream module, so you can validate large datasets without loading everything into memory. Here are some simple patterns to get you started.
Basic File Streaming
Process a file line by line:
# Compile your schema once
schema = ~s({"type": "object", "properties": {"name": {"type": "string"}}})
{:ok, validator} = ExJsonschema.compile(schema)
# Stream and validate each line
results =
  "data.jsonl"
  |> File.stream!()
  |> Stream.map(&String.trim/1)
  |> Stream.map(fn line ->
    case ExJsonschema.validate(validator, line) do
      :ok -> :valid
      {:error, _errors} -> :invalid
    end
  end)
  |> Enum.frequencies()
IO.inspect(results) # %{valid: 1500, invalid: 23}
Concurrent Processing
Use Task.async_stream/3 for parallel validation:
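# Reuses the validator compiled in the first example
data_stream = File.stream!("large_file.jsonl")
results =
  data_stream
  |> Task.async_stream(
    fn line -> ExJsonschema.validate(validator, String.trim(line)) end,
    max_concurrency: 8
  )
  # Tag each result so frequencies/1 counts outcomes, not distinct error lists
  |> Stream.map(fn
    {:ok, :ok} -> :valid
    {:ok, {:error, _errors}} -> :invalid
  end)
  |> Enum.frequencies()
JSON Arrays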
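Validate the elements of a large JSON array one at a time. The file is read and decoded in full first; only the per-element validation is lazy: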
# Load and parse the array
{:ok, big_array} = "large_dataset.json" |> File.read!() |> Jason.decode()
# Stream individual elements
valid_items =
  big_array
  |> Stream.map(&Jason.encode!/1)
  |> Stream.filter(fn item_json ->
    ExJsonschema.valid?(validator, item_json)
  end)
  |> Enum.to_list()
Memory-Friendly Processing
Process without accumulating results:
# Just count, don't store results
{valid_count, invalid_count} =
  File.stream!("huge_file.jsonl")
  |> Stream.map(&String.trim/1)
  |> Enum.reduce({0, 0}, fn line, {valid, invalid} ->
    case ExJsonschema.validate(validator, line) do
      :ok -> {valid + 1, invalid}
      {:error, _} -> {valid, invalid + 1}
    end
  end)
IO.puts("Processed: #{valid_count + invalid_count} total")
That's It
ExJsonschema's validation functions take a compiled validator and a JSON string and return a plain result, so they drop into any Stream or Enum pipeline. You can build whatever processing patterns make sense for your use case.
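As one illustration of such a pattern, here is a minimal sketch that counts results and also remembers the line numbers of the first few failures. `StreamReport`, `run/2`, and `fake_validate` are hypothetical names, not part of ExJsonschema. The sketch assumes only that the validation function returns `:ok` or `{:error, errors}`, as in the examples above, and it uses a stand-in validator so it runs without dependencies.

```elixir
defmodule StreamReport do
  # Hypothetical helper, not part of ExJsonschema.
  # Keep at most this many failing line numbers so memory stays bounded.
  @max_reported 10

  # `lines` is any enumerable of strings (for example File.stream!/1).
  # `validate_fun` is a 1-arity function returning :ok or {:error, errors},
  # e.g. fn line -> ExJsonschema.validate(validator, line) end.
  def run(lines, validate_fun) do
    initial = %{valid: 0, invalid: 0, first_bad_lines: []}

    lines
    |> Stream.map(&String.trim/1)
    # Number lines before dropping blanks so numbers match the file
    |> Stream.with_index(1)
    |> Stream.reject(fn {line, _n} -> line == "" end)
    |> Enum.reduce(initial, fn {line, n}, acc ->
      case validate_fun.(line) do
        :ok ->
          %{acc | valid: acc.valid + 1}

        {:error, _errors} ->
          bad =
            if length(acc.first_bad_lines) < @max_reported,
              do: acc.first_bad_lines ++ [n],
              else: acc.first_bad_lines

          %{acc | invalid: acc.invalid + 1, first_bad_lines: bad}
      end
    end)
  end
end

# Stand-in validator (an assumption for this sketch): accepts lines that
# start with "{" and rejects everything else.
fake_validate = fn line ->
  if String.starts_with?(line, "{"), do: :ok, else: {:error, [:not_an_object]}
end

report =
  StreamReport.run(["{\"name\": \"a\"}\n", "\n", "oops\n", "{}\n"], fake_validate)

IO.inspect(report)
```

With a real schema, you would pass `fn line -> ExJsonschema.validate(validator, line) end` and `File.stream!("data.jsonl")` instead of the stand-ins. Taking the validation function as an argument keeps the pipeline itself independent of the validator and easy to test.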
For more complex scenarios, check out:
- Performance & Production Guide - Optimization techniques
- Advanced Features Guide - Configuration options