dataprep

Composable, type-driven preprocessing and validation combinator library for Gleam.
dataprep is a combinator toolkit, not a rule catalog.
- Built-in and user-defined rules are identical in power.
- No domain-specific rules (email, URL, UUID). Write your own or use a dedicated package.
- No schema, no DSL, no reflection.
- Prep transforms. Validator checks. They do not mix.
- Errors are your types, not ours.
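
As a rough illustration of the split between transforming and checking, the sketch below models both as plain functions in a self-contained module. It does not use the library: the `Validated` type is a simplified local stand-in (an assumption, the real type lives in dataprep/validated and may be defined differently), and `clean`, `no_spaces`, `validate_slug`, and `SlugError` are hypothetical names.

```gleam
import gleam/string

// Simplified stand-in for the library's Validated type, defined here only so
// the sketch compiles on its own. This is an assumption, not dataprep's code.
pub type Validated(a, e) {
  Valid(a)
  Invalid(List(e))
}

// The error type belongs to the caller, not to the toolkit.
pub type SlugError {
  ContainsSpace
}

// Prep-style step: String in, String out, cannot fail.
pub fn clean(raw: String) -> String {
  raw |> string.trim |> string.lowercase
}

// Validator-style step: returns the value unchanged or reports an error.
// A user-defined rule is just another function of this shape.
pub fn no_spaces(value: String) -> Validated(String, SlugError) {
  case string.contains(value, " ") {
    True -> Invalid([ContainsSpace])
    False -> Valid(value)
  }
}

// Transform first, then check. The two roles stay separate.
pub fn validate_slug(raw: String) -> Validated(String, SlugError) {
  raw |> clean |> no_spaces
}
```

In this model, `validate_slug("  My-Post ")` evaluates to `Valid("my-post")`, while `validate_slug("my post")` evaluates to `Invalid([ContainsSpace])`.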
Requirements
- Gleam 1.15 or later
- Erlang/OTP 27 or later
Install
gleam add dataprep
Quick start
import dataprep/prep
import dataprep/validated.{type Validated}
import dataprep/rules

pub type User {
  User(name: String, age: Int)
}

pub type Err {
  NameEmpty
  AgeTooYoung
}

pub fn validate_user(name: String, age: Int) -> Validated(User, Err) {
  let clean = prep.trim() |> prep.then(prep.lowercase())
  let check_name = rules.not_empty(NameEmpty)
  let check_age = rules.min_int(0, AgeTooYoung)

  validated.map2(
    User,
    name |> clean |> check_name,
    check_age(age),
  )
}

// validate_user(" Alice ", 25) -> Valid(User("alice", 25))
// validate_user("", -1) -> Invalid([NameEmpty, AgeTooYoung])
Examples
Field validation with structured error context
Attach field names to errors so callers can identify which field failed.
import dataprep/prep
import dataprep/rules
import dataprep/validated.{type Validated}
import dataprep/validator

pub type FormError {
  Field(name: String, detail: FieldDetail)
}

pub type FieldDetail {
  Empty
  TooShort(min: Int)
  TooLong(max: Int)
}

pub fn validate_username(raw: String) -> Validated(String, FormError) {
  let clean = prep.trim() |> prep.then(prep.lowercase())
  let check =
    rules.not_empty(Empty)
    |> validator.guard(
      rules.min_length(3, TooShort(3))
      |> validator.both(rules.max_length(20, TooLong(20))),
    )
    |> validator.label("username", Field)

  raw |> clean |> check
}

// validate_username(" Al ")
//   -> Invalid([Field("username", TooShort(3))])
// validate_username(" Alice ")
//   -> Valid("alice")
Parse then validate
Use validated.and_then to bridge type-changing parsing with same-type validation. Parsing short-circuits: if it fails, validation is skipped. Validation accumulates: every failed check is reported.
import dataprep/parse
import dataprep/rules
import dataprep/validated.{type Validated}
import dataprep/validator

pub type AgeError {
  NotAnInteger(raw: String)
  TooYoung(min: Int)
  TooOld(max: Int)
}

pub fn validate_age(raw: String) -> Validated(Int, AgeError) {
  let check_range =
    rules.min_int(0, TooYoung(0))
    |> validator.both(rules.max_int(150, TooOld(150)))

  parse.int(raw, NotAnInteger)
  |> validated.and_then(check_range)
}

// validate_age("abc") -> Invalid([NotAnInteger("abc")])
// validate_age("200") -> Invalid([TooOld(150)])
// validate_age("25") -> Valid(25)
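
To make the short-circuit behaviour concrete, here is a minimal self-contained model of the parse-then-validate bridge. It is a sketch under stated assumptions, not the library's implementation: `Validated` is a simplified local stand-in, and `and_then`, `parse_int`, and `max_150` are hypothetical re-creations that only mirror the call shapes used in the example above.

```gleam
import gleam/int

// Simplified stand-in for the library's Validated type (an assumption).
pub type Validated(a, e) {
  Valid(a)
  Invalid(List(e))
}

pub type AgeError {
  NotAnInteger(raw: String)
  TooOld(max: Int)
}

// Hypothetical model of and_then: run `next` only when the previous step
// succeeded, otherwise pass the existing errors through untouched.
pub fn and_then(
  previous: Validated(a, e),
  next: fn(a) -> Validated(b, e),
) -> Validated(b, e) {
  case previous {
    Valid(value) -> next(value)
    Invalid(errors) -> Invalid(errors)
  }
}

// Hypothetical parse helper: String to Int, with a caller-supplied error.
pub fn parse_int(raw: String, to_error: fn(String) -> e) -> Validated(Int, e) {
  case int.parse(raw) {
    Ok(number) -> Valid(number)
    Error(Nil) -> Invalid([to_error(raw)])
  }
}

// Same-type check on the parsed value.
pub fn max_150(age: Int) -> Validated(Int, AgeError) {
  case age > 150 {
    True -> Invalid([TooOld(150)])
    False -> Valid(age)
  }
}

pub fn validate_age(raw: String) -> Validated(Int, AgeError) {
  parse_int(raw, NotAnInteger) |> and_then(max_150)
}
```

With this model, `validate_age("abc")` stops at the parse step and returns only `Invalid([NotAnInteger("abc")])`, because there is no integer for `max_150` to inspect.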
Nested error labeling with map3
Combine multiple fields into a domain type. All errors from all fields are accumulated with their field names.
import dataprep/prep
import dataprep/rules
import dataprep/validated.{type Validated}
import dataprep/validator

pub type SignupForm {
  SignupForm(name: String, email: String, age: Int)
}

pub type SignupError {
  Field(name: String, detail: Detail)
}

pub type Detail {
  Empty
  TooShort(min: Int)
  OutOfRange(min: Int, max: Int)
}

fn validate_name(raw: String) -> Validated(String, SignupError) {
  let clean = prep.trim() |> prep.then(prep.lowercase())
  let check =
    rules.not_empty(Empty)
    |> validator.guard(rules.min_length(2, TooShort(2)))
    |> validator.label("name", Field)

  raw |> clean |> check
}

fn validate_email(raw: String) -> Validated(String, SignupError) {
  let clean = prep.trim() |> prep.then(prep.lowercase())
  let check =
    rules.not_empty(Empty)
    |> validator.label("email", Field)

  raw |> clean |> check
}

fn validate_age(age: Int) -> Validated(Int, SignupError) {
  let check =
    rules.min_int(0, OutOfRange(0, 150))
    |> validator.both(rules.max_int(150, OutOfRange(0, 150)))
    |> validator.label("age", Field)

  check(age)
}

pub fn validate_signup(
  name: String,
  email: String,
  age: Int,
) -> Validated(SignupForm, SignupError) {
  validated.map3(
    SignupForm,
    validate_name(name),
    validate_email(email),
    validate_age(age),
  )
}

// validate_signup("", "", 200)
//   -> Invalid([
//        Field("name", Empty),
//        Field("email", Empty),
//        Field("age", OutOfRange(0, 150)),
//      ])
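
The accumulation across fields can be pictured with a small self-contained model of a two-field combinator. This is a sketch, not dataprep's code: `Validated` is a simplified local stand-in, `map2` is a hypothetical re-creation that only mirrors the argument order used above (constructor first), and `Point`, `BadX`, and `BadY` are illustrative names.

```gleam
import gleam/list

// Simplified stand-in for the library's Validated type (an assumption).
pub type Validated(a, e) {
  Valid(a)
  Invalid(List(e))
}

pub type Point {
  Point(x: Int, y: Int)
}

pub type PointError {
  BadX
  BadY
}

// Hypothetical model of map2: both inputs are always inspected, so errors
// from the first and the second field end up in one list, in field order.
pub fn map2(
  build: fn(a, b) -> c,
  first: Validated(a, e),
  second: Validated(b, e),
) -> Validated(c, e) {
  case first, second {
    Valid(x), Valid(y) -> Valid(build(x, y))
    Invalid(errors_x), Invalid(errors_y) ->
      Invalid(list.append(errors_x, errors_y))
    Invalid(errors_x), Valid(_) -> Invalid(errors_x)
    Valid(_), Invalid(errors_y) -> Invalid(errors_y)
  }
}
```

In this model, `map2(Point, Invalid([BadX]), Invalid([BadY]))` yields `Invalid([BadX, BadY])`, whereas a `Result`-style chain would stop at `BadX`.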
More examples are available in the doc/recipes/ directory of the repository.
Modules
| Module | Responsibility |
|---|---|
| dataprep/prep | Infallible transformations: trim, lowercase, uppercase, collapse_space, replace, default. Compose with then or sequence. |
| dataprep/validator | Checks without transformation: check, predicate, both, all, alt, guard, map_error, label, each, optional. |
| dataprep/validated | Applicative error accumulation: map, map_error, and_then, from_result, from_result_map, to_result, map2..map5, sequence, traverse, traverse_indexed. |
| dataprep/non_empty_list | At-least-one guarantee for error lists: single, cons, append, concat, map, flat_map, to_list, from_list. |
| dataprep/rules | Built-in rules: not_empty, not_blank, matches, min_length, max_length, length_between, min_int, max_int, min_float, max_float, non_negative_int, non_negative_float, one_of, equals. |
| dataprep/parse | Parse helpers: int, float. Bridge String to typed Validated with custom error mapping. |
Composition overview
| Phase | Combinator | Errors | When to use |
|---|---|---|---|
| Prep | prep.then | (none) | Chain infallible transforms |
| Validate | validator.both / all | Accumulate all | Independent checks on same value |
| Validate | validator.alt | Accumulate on full failure | Accept alternative forms |
| Validate | validator.guard | Short-circuit | Skip if prerequisite fails |
| Combine | validated.map2..map5 | Accumulate all | Build domain types from independent fields |
| Bridge | validated.and_then | Short-circuit | Parse then validate (type changes) |
| Bridge | parse.int / parse.float | Short-circuit | String to typed Validated in one step |
| Bridge | raw \|> prep \|> validator | (prep has none) | Apply infallible transform before validation |
| Collection | validated.sequence / traverse | Accumulate all | Validate a list of values |
| Collection | validator.each | Accumulate all | Apply a validator to every list element |
| Collection | validator.optional | (none if None) | Skip validation for absent values |
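
The difference between the "Accumulate all" and "Short-circuit" rows can be sketched with a self-contained model of two validator combinators. None of this is the library's code: `Validated` and `Validator` are simplified local stand-ins, `both` and `guard` are hypothetical re-creations that mirror the call shapes used in the examples, and the three rules and `PasswordError` are illustrative.

```gleam
import gleam/list
import gleam/string

// Simplified stand-ins for the library's types (assumptions).
pub type Validated(a, e) {
  Valid(a)
  Invalid(List(e))
}

pub type Validator(a, e) =
  fn(a) -> Validated(a, e)

pub type PasswordError {
  Empty
  TooShort
  NoUppercase
}

// Accumulate: run both checks on the same value and merge their errors.
pub fn both(first: Validator(a, e), second: Validator(a, e)) -> Validator(a, e) {
  fn(value) {
    case first(value), second(value) {
      Valid(_), Valid(_) -> Valid(value)
      Invalid(e1), Invalid(e2) -> Invalid(list.append(e1, e2))
      Invalid(e1), Valid(_) -> Invalid(e1)
      Valid(_), Invalid(e2) -> Invalid(e2)
    }
  }
}

// Short-circuit: run `next` only if the prerequisite passed.
pub fn guard(
  prerequisite: Validator(a, e),
  next: Validator(a, e),
) -> Validator(a, e) {
  fn(value) {
    case prerequisite(value) {
      Valid(_) -> next(value)
      Invalid(errors) -> Invalid(errors)
    }
  }
}

pub fn not_empty(value: String) -> Validated(String, PasswordError) {
  case value == "" {
    True -> Invalid([Empty])
    False -> Valid(value)
  }
}

pub fn min_length_8(value: String) -> Validated(String, PasswordError) {
  case string.length(value) < 8 {
    True -> Invalid([TooShort])
    False -> Valid(value)
  }
}

pub fn has_uppercase(value: String) -> Validated(String, PasswordError) {
  // A string equal to its own lowercase form has no uppercase letters.
  case string.lowercase(value) == value {
    True -> Invalid([NoUppercase])
    False -> Valid(value)
  }
}
```

In this model, `both(min_length_8, has_uppercase)("abc")` reports `Invalid([TooShort, NoUppercase])`, while `guard(not_empty, both(min_length_8, has_uppercase))("")` reports only `Invalid([Empty])`: the guard keeps an empty input from also being flagged as too short and lacking uppercase.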
Development
This project uses mise to manage Gleam and Erlang versions, and just as a task runner.
mise install # install Gleam and Erlang
just ci # format check, typecheck, build, test
just test # gleam test
just format # gleam format
just check # all checks, without downloading dependencies
Contributing
Contributions are welcome. See CONTRIBUTING.md for details.