roadrunner_multipart (roadrunner v0.1.0)

View Source

Parser for multipart/form-data request bodies (RFC 7578).

Three entry points:

  • boundary/1 — pull the boundary=… parameter out of a Content-Type header value, handling unquoted, quoted, and parameter-mixed forms.
  • parse/2 — split a buffered body into a list of part() maps, each with its own headers and decoded body.
  • params/1 — parse the key=value parameters of any structured header value (e.g. Content-Type, Content-Disposition) into a lowercase-keyed map.

Typical handler shape:

handle(Req) ->
    {ok, Body, Req2} = roadrunner_req:read_body(Req),
    {ok, Boundary} = roadrunner_multipart:boundary(
        roadrunner_req:header(~"content-type", Req2)
    ),
    {ok, Parts} = roadrunner_multipart:parse(Body, Boundary),
    %% Parts is a list of #{headers := [...], body := <<...>>}.
    ...

This is a buffered parser — the entire body must be in memory first. For very large file uploads where you can't afford to buffer, a streaming variant is a future feature; today, cap them at the listener's max_content_length (default 10 MB).

What gets parsed

  • The preamble (bytes before the first boundary) is discarded per RFC 7578 §4.1.
  • Each part's headers are returned as a list of {Name, Value} binaries, with the name lowercased (matching the convention in roadrunner_req:headers/1). Values are LWS-trimmed.
  • Each part's body is the bytes between \r\n\r\n (end-of-headers) and the next \r\n--<boundary> (start of next boundary or terminator).
  • The terminating boundary is --<boundary>--. Anything after it (the epilogue) is ignored.

What does NOT get parsed

  • Content-Disposition parameters (name, filename, etc.) — callers parse them out of the raw header value if they need to. Adding a disposition/1 helper is a straightforward follow-up.
  • Per-part transfer encodings (Content-Transfer-Encoding) — bodies are returned as-is. Modern browsers send raw bytes for multipart/form-data, so this is rarely needed.

Summary

Types

One section of a parsed multipart/form-data body. headers is the part's headers (names lowercased, values OWS-trimmed) as a list of {Name, Value} binaries; body is the raw bytes between end-of-headers and the next boundary.

Functions

Extract the boundary=… parameter from a Content-Type header value. Handles unquoted (boundary=abc), quoted (boundary="a b c"), and mixed-with-other-parameters forms.

Parse the parameters of a structured header value (e.g. Content-Type, Content-Disposition) into a map. The "type" prefix before the first ; is discarded — only the key=value pairs after it are returned.

Split Body into a list of multipart parts using Boundary as the delimiter. The boundary must NOT include the leading -- — that is the multipart wire prefix and is added internally.

Types

part()

-type part() :: #{headers := [{binary(), binary()}], body := binary()}.

One section of a parsed multipart/form-data body. headers is the part's headers (names lowercased, values OWS-trimmed) as a list of {Name, Value} binaries; body is the raw bytes between end-of-headers and the next boundary.

Functions

boundary(ContentType)

-spec boundary(binary()) -> {ok, binary()} | {error, no_boundary}.

Extract the boundary=… parameter from a Content-Type header value. Handles unquoted (boundary=abc), quoted (boundary="a b c"), and mixed-with-other-parameters forms.

Returns {error, no_boundary} when the parameter isn't present.

params(Value)

-spec params(binary()) -> #{binary() => binary()}.

Parse the parameters of a structured header value (e.g. Content-Type, Content-Disposition) into a map. The "type" prefix before the first ; is discarded — only the key=value pairs after it are returned.

Param names are lowercased per RFC 9110 §8.3.1 (media-type parameter names are case-insensitive); values are returned as-is, with surrounding quotes stripped.

Examples:

  • ~"text/html; charset=utf-8"#{~"charset" => ~"utf-8"}
  • ~"form-data; name=\"a\"; filename=\"f.txt\""#{~"name" => ~"a", ~"filename" => ~"f.txt"}
  • ~"text/html"#{}

Malformed pairs (no =) are silently skipped.

parse(Body, Boundary)

-spec parse(binary(), binary()) -> {ok, [part()]} | {error, term()}.

Split Body into a list of multipart parts using Boundary as the delimiter. The boundary must NOT include the leading -- — that is the multipart wire prefix and is added internally.

Returns {error, no_initial_boundary} if the body doesn't start with (or contain) the opening boundary, {error, bad_header} on a malformed part header, or other {error, _} shapes when the multipart structure is otherwise broken.