dataprep



Composable, type-driven preprocessing and validation combinator library for Gleam.

dataprep is a combinator toolkit, not a rule catalog.

Requirements

Install

gleam add dataprep

Quick start

import dataprep/prep
import dataprep/validated.{type Validated}
import dataprep/rules

pub type User {
  User(name: String, age: Int)
}

pub type Err {
  NameEmpty
  AgeTooYoung
}

pub fn validate_user(name: String, age: Int) -> Validated(User, Err) {
  let clean = prep.trim() |> prep.then(prep.lowercase())
  let check_name = rules.not_empty(NameEmpty)
  let check_age = rules.min_int(0, AgeTooYoung)

  validated.map2(
    User,
    name |> clean |> check_name,
    check_age(age),
  )
}

// validate_user("  Alice ", 25)   -> Valid(User("alice", 25))
// validate_user("", -1)           -> Invalid([NameEmpty, AgeTooYoung])

Examples

Field validation with structured error context

Attach field names to errors so callers can identify which field failed.

import dataprep/prep
import dataprep/rules
import dataprep/validated.{type Validated}
import dataprep/validator

pub type FormError {
  Field(name: String, detail: FieldDetail)
}

pub type FieldDetail {
  Empty
  TooShort(min: Int)
  TooLong(max: Int)
}

pub fn validate_username(raw: String) -> Validated(String, FormError) {
  let clean = prep.trim() |> prep.then(prep.lowercase())
  let check =
    rules.not_empty(Empty)
    |> validator.guard(
      rules.min_length(3, TooShort(3))
      |> validator.both(rules.max_length(20, TooLong(20))),
    )
    |> validator.label("username", Field)

  raw |> clean |> check
}

// validate_username("  Al  ")
//   -> Invalid([Field("username", TooShort(3))])
// validate_username("  Alice  ")
//   -> Valid("alice")

Parse then validate

Use validated.and_then to bridge a type-changing parse step with validation of the parsed value. The parse step short-circuits: if it fails, the range checks never run. The range checks themselves accumulate their errors.

import dataprep/parse
import dataprep/rules
import dataprep/validated.{type Validated}
import dataprep/validator

pub type AgeError {
  NotAnInteger(raw: String)
  TooYoung(min: Int)
  TooOld(max: Int)
}

pub fn validate_age(raw: String) -> Validated(Int, AgeError) {
  let check_range =
    rules.min_int(0, TooYoung(0))
    |> validator.both(rules.max_int(150, TooOld(150)))

  parse.int(raw, NotAnInteger)
  |> validated.and_then(check_range)
}

// validate_age("abc") -> Invalid([NotAnInteger("abc")])
// validate_age("200") -> Invalid([TooOld(150)])
// validate_age("25")  -> Valid(25)
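The short-circuit behaviour can be easier to see with the plumbing written out by hand. The following is a minimal sketch that does not use dataprep at all: the Validated type, and_then, parse_int, and check_range below are hypothetical stand-ins defined locally for illustration, and the library's own definitions may differ (for example, it keeps errors in a non-empty list rather than a plain List).

```gleam
import gleam/int

// Hypothetical stand-in for the library's Validated type (illustration only).
pub type Validated(a, e) {
  Valid(a)
  Invalid(List(e))
}

pub type AgeError {
  NotAnInteger(raw: String)
  TooYoung(min: Int)
  TooOld(max: Int)
}

// Short-circuit bridge: `next` runs only if the previous step produced a value.
pub fn and_then(
  result: Validated(a, e),
  next: fn(a) -> Validated(b, e),
) -> Validated(b, e) {
  case result {
    Valid(value) -> next(value)
    Invalid(errors) -> Invalid(errors)
  }
}

// Type-changing step: String -> Int, with a caller-supplied error constructor.
pub fn parse_int(raw: String, to_error: fn(String) -> e) -> Validated(Int, e) {
  case int.parse(raw) {
    Ok(n) -> Valid(n)
    Error(_) -> Invalid([to_error(raw)])
  }
}

// Same-type step: Int -> Int, only meaningful once parsing has succeeded.
pub fn check_range(age: Int) -> Validated(Int, AgeError) {
  case age < 0, age > 150 {
    True, _ -> Invalid([TooYoung(0)])
    _, True -> Invalid([TooOld(150)])
    _, _ -> Valid(age)
  }
}

pub fn validate_age(raw: String) -> Validated(Int, AgeError) {
  parse_int(raw, NotAnInteger)
  |> and_then(check_range)
}
```

With these local definitions, validate_age("abc") stops at the parse step and reports only NotAnInteger("abc"), because there is no Int for check_range to inspect.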

Nested error labeling with map3

Combine multiple fields into a domain type. All errors from all fields are accumulated with their field names.

import dataprep/prep
import dataprep/rules
import dataprep/validated.{type Validated}
import dataprep/validator

pub type SignupForm {
  SignupForm(name: String, email: String, age: Int)
}

pub type SignupError {
  Field(name: String, detail: Detail)
}

pub type Detail {
  Empty
  TooShort(min: Int)
  OutOfRange(min: Int, max: Int)
}

fn validate_name(raw: String) -> Validated(String, SignupError) {
  let clean = prep.trim() |> prep.then(prep.lowercase())
  let check =
    rules.not_empty(Empty)
    |> validator.guard(rules.min_length(2, TooShort(2)))
    |> validator.label("name", Field)
  raw |> clean |> check
}

fn validate_email(raw: String) -> Validated(String, SignupError) {
  let clean = prep.trim() |> prep.then(prep.lowercase())
  let check =
    rules.not_empty(Empty)
    |> validator.label("email", Field)
  raw |> clean |> check
}

fn validate_age(age: Int) -> Validated(Int, SignupError) {
  let check =
    rules.min_int(0, OutOfRange(0, 150))
    |> validator.both(rules.max_int(150, OutOfRange(0, 150)))
    |> validator.label("age", Field)
  check(age)
}

pub fn validate_signup(
  name: String,
  email: String,
  age: Int,
) -> Validated(SignupForm, SignupError) {
  validated.map3(
    SignupForm,
    validate_name(name),
    validate_email(email),
    validate_age(age),
  )
}

// validate_signup("", "", 200)
//   -> Invalid([
//        Field("name", Empty),
//        Field("email", Empty),
//        Field("age", OutOfRange(0, 150)),
//      ])
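To illustrate what "accumulated" means for independent fields, here is a hand-written two-field combinator. It is only a sketch under stated assumptions: Validated, map2, Account, and the two check functions are defined locally and are hypothetical, not the library's implementation. A three-field version would follow the same pattern with one more argument.

```gleam
import gleam/list

// Hypothetical stand-in for the library's Validated type (illustration only).
pub type Validated(a, e) {
  Valid(a)
  Invalid(List(e))
}

pub type Detail {
  Empty
  OutOfRange(min: Int, max: Int)
}

pub type SignupError {
  Field(name: String, detail: Detail)
}

// Hypothetical two-field domain type used only in this sketch.
pub type Account {
  Account(name: String, age: Int)
}

// Both inputs are always inspected, so errors from either side survive.
pub fn map2(
  build: fn(a, b) -> c,
  first: Validated(a, e),
  second: Validated(b, e),
) -> Validated(c, e) {
  case first, second {
    Valid(a), Valid(b) -> Valid(build(a, b))
    Invalid(e1), Invalid(e2) -> Invalid(list.append(e1, e2))
    Invalid(e1), Valid(_) -> Invalid(e1)
    Valid(_), Invalid(e2) -> Invalid(e2)
  }
}

pub fn check_name(name: String) -> Validated(String, SignupError) {
  case name {
    "" -> Invalid([Field("name", Empty)])
    _ -> Valid(name)
  }
}

pub fn check_age(age: Int) -> Validated(Int, SignupError) {
  case age >= 0 && age <= 150 {
    True -> Valid(age)
    False -> Invalid([Field("age", OutOfRange(0, 150))])
  }
}

pub fn validate_account(name: String, age: Int) -> Validated(Account, SignupError) {
  map2(Account, check_name(name), check_age(age))
}
```

Because neither field depends on the other, a failure in the name does not hide a failure in the age: both labeled errors come back in argument order.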

More examples are available in the doc/recipes/ directory of the repository.

Modules

| Module | Responsibility |
| --- | --- |
| dataprep/prep | Infallible transformations: trim, lowercase, uppercase, collapse_space, replace, default. Compose with then or sequence. |
| dataprep/validator | Checks without transformation: check, predicate, both, all, alt, guard, map_error, label, each, optional. |
| dataprep/validated | Applicative error accumulation: map, map_error, and_then, from_result, from_result_map, to_result, map2..map5, sequence, traverse, traverse_indexed. |
| dataprep/non_empty_list | At-least-one guarantee for error lists: single, cons, append, concat, map, flat_map, to_list, from_list. |
| dataprep/rules | Built-in rules: not_empty, not_blank, matches, min_length, max_length, length_between, min_int, max_int, min_float, max_float, non_negative_int, non_negative_float, one_of, equals. |
| dataprep/parse | Parse helpers: int, float. Bridge String to typed Validated with custom error mapping. |
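The "at-least-one guarantee" is worth a short illustration: an invalid result with zero errors would be meaningless, so the error container rules that state out by construction. The sketch below shows one common way to encode this. The type and the function signatures are assumptions written for this example; they borrow names from the list above but are not taken from the library's source.

```gleam
import gleam/list

// Hypothetical encoding: the first element is mandatory, the rest is optional.
pub type NonEmptyList(a) {
  NonEmptyList(first: a, rest: List(a))
}

// A single value is always a valid non-empty list.
pub fn single(value: a) -> NonEmptyList(a) {
  NonEmptyList(value, [])
}

// Appending two non-empty lists can never produce an empty one.
pub fn append(left: NonEmptyList(a), right: NonEmptyList(a)) -> NonEmptyList(a) {
  NonEmptyList(
    left.first,
    list.append(left.rest, [right.first, ..right.rest]),
  )
}

// Converting to a plain list always succeeds.
pub fn to_list(items: NonEmptyList(a)) -> List(a) {
  [items.first, ..items.rest]
}

// Converting from a plain list can fail, so the result says so in its type.
pub fn from_list(items: List(a)) -> Result(NonEmptyList(a), Nil) {
  case items {
    [] -> Error(Nil)
    [first, ..rest] -> Ok(NonEmptyList(first, rest))
  }
}
```

With this shape, code that receives accumulated errors can read the first one without handling an empty case.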

Composition overview

| Phase | Combinator | Errors | When to use |
| --- | --- | --- | --- |
| Prep | prep.then | (none) | Chain infallible transforms |
| Validate | validator.both / all | Accumulate all | Independent checks on same value |
| Validate | validator.alt | Accumulate on full failure | Accept alternative forms |
| Validate | validator.guard | Short-circuit | Skip if prerequisite fails |
| Combine | validated.map2..map5 | Accumulate all | Build domain types from independent fields |
| Bridge | validated.and_then | Short-circuit | Parse then validate (type changes) |
| Bridge | parse.int / parse.float | Short-circuit | String to typed Validated in one step |
| Bridge | raw \|> prep \|> validator | (prep has none) | Apply infallible transform before validation |
| Collection | validated.sequence / traverse | Accumulate all | Validate a list of values |
| Collection | validator.each | Accumulate all | Apply a validator to every list element |
| Collection | validator.optional | (none if None) | Skip validation for absent values |
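The difference between the "Accumulate all" and "Short-circuit" rows for checks on a single value can be sketched with two hand-written combinators. Everything below is defined locally for illustration and is an assumption, not the library's code: Validated, both, guard, and the two example checks are hypothetical stand-ins that mimic the behaviour described in the table.

```gleam
import gleam/list
import gleam/string

// Hypothetical stand-in for the library's Validated type (illustration only).
pub type Validated(a, e) {
  Valid(a)
  Invalid(List(e))
}

pub type Detail {
  Empty
  TooShort(min: Int)
}

// Accumulate: run both checks on the same value and keep every error.
pub fn both(
  first: fn(a) -> Validated(a, e),
  second: fn(a) -> Validated(a, e),
) -> fn(a) -> Validated(a, e) {
  fn(value) {
    case first(value), second(value) {
      Valid(_), Valid(_) -> Valid(value)
      Invalid(e1), Invalid(e2) -> Invalid(list.append(e1, e2))
      Invalid(e1), Valid(_) -> Invalid(e1)
      Valid(_), Invalid(e2) -> Invalid(e2)
    }
  }
}

// Short-circuit: run `rest` only when the prerequisite check passes.
pub fn guard(
  prerequisite: fn(a) -> Validated(a, e),
  rest: fn(a) -> Validated(a, e),
) -> fn(a) -> Validated(a, e) {
  fn(value) {
    case prerequisite(value) {
      Valid(_) -> rest(value)
      Invalid(errors) -> Invalid(errors)
    }
  }
}

pub fn not_empty(value: String) -> Validated(String, Detail) {
  case value {
    "" -> Invalid([Empty])
    _ -> Valid(value)
  }
}

pub fn at_least_3(value: String) -> Validated(String, Detail) {
  case string.length(value) >= 3 {
    True -> Valid(value)
    False -> Invalid([TooShort(3)])
  }
}

pub fn check_accumulating(value: String) -> Validated(String, Detail) {
  let check = both(not_empty, at_least_3)
  check(value)
}

pub fn check_guarded(value: String) -> Validated(String, Detail) {
  let check = guard(not_empty, at_least_3)
  check(value)
}
```

For an empty string, the accumulating version reports both Empty and TooShort(3), while the guarded version reports only Empty. The guarded form is usually the better fit when the second error would be noise once the first has failed.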

Development

This project uses mise to manage Gleam and Erlang versions, and just as a task runner.

mise install    # install Gleam and Erlang
just ci         # format check, typecheck, build, test
just test       # gleam test
just format     # gleam format
just check      # run all checks without downloading dependencies

Contributing

Contributions are welcome. See CONTRIBUTING.md for details.

License

MIT
