mimetype

MIME type lookup and byte-signature detection for Gleam.

The public API intentionally separates:

Types

A callback that reads up to the requested number of bytes from an input source. Returns Ok(bits) with the bytes actually read, or Error(reason) if the read fails. A reader that returns fewer bytes than requested signals end-of-input.

pub type Reader =
  fn(Int) -> Result(BitArray, String)
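
As an illustration of the callback shape, the sketch below wraps an in-memory BitArray in a Reader-compatible function. The name reader_from_bits is hypothetical and not part of this library; the block relies only on gleam/bit_array and gleam/int from the standard library.

```gleam
import gleam/bit_array
import gleam/int

/// Hypothetical helper (not part of this library): wraps an in-memory
/// BitArray in a callback with the same shape as `Reader`.
pub fn reader_from_bits(
  source: BitArray,
) -> fn(Int) -> Result(BitArray, String) {
  fn(requested) {
    // Never hand back more bytes than exist; treat negative requests as 0.
    let available = bit_array.byte_size(source)
    let take = int.min(int.max(requested, 0), available)
    case bit_array.slice(from: source, at: 0, take: take) {
      Ok(bits) -> Ok(bits)
      // Only reachable if the source is not byte-aligned.
      Error(Nil) -> Error("could not slice the requested bytes")
    }
  }
}
```

Returning fewer bytes than requested, as this sketch does when the source is short, is how a reader signals end-of-input.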

Values

pub fn ancestors(mime: String) -> List(String)

Return the chain of ancestors of mime, ordered from immediate parent to root.

An empty input or a root type returns []. The returned list does not include mime itself; use is_a(mime, mime) (always True) if you need reflexive membership.

pub fn charset(mime_type: String) -> Result(String, Nil)

Return the charset parameter from a MIME type string.

Charset values are normalized to lowercase for convenience.
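
A possible way to consume the Result, assuming the module is imported as mimetype. The helper charset_or_default and its "utf-8" fallback are illustrative choices, not library behaviour.

```gleam
import mimetype

/// Hypothetical helper: pick the declared charset, or fall back to a
/// caller-chosen default when the parameter is absent.
pub fn charset_or_default(content_type: String) -> String {
  case mimetype.charset(content_type) {
    // Values arrive lowercased, so no further normalization is needed here.
    Ok(name) -> name
    Error(Nil) -> "utf-8"
  }
}
```

Given the lowercase rule above, a header such as "text/html; charset=UTF-8" should yield "utf-8" from this helper.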

pub fn charset_of(bytes: BitArray) -> Result(String, Nil)

Detect the character encoding (charset) of a BitArray.

Returns Ok(charset) when one of the following signals fires (in priority order):

  1. A Unicode BOM (UTF-8 / UTF-16 LE/BE / UTF-32 LE/BE).
  2. An XML prolog <?xml ... encoding="..." ?>.
  3. An HTML <meta charset="..."> (or <meta http-equiv=... content=...>) tag in the first 1 KB.
  4. A UTF-8 validity scan: utf-8 for input that contains valid multi-byte UTF-8 sequences, us-ascii for input that is entirely 0x00–0x7F.

Returns Error(Nil) for inputs whose encoding cannot be determined (typically non-UTF-8 high-byte content like Latin-1 or Shift_JIS without an in-document declaration). Charset names are returned in lowercase, matching the convention used by IANA’s charset registry.
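
The BOM step (signal 1) can be sketched as a single pattern match. This is an independent illustration, not the library's source, and the charset names it returns for UTF-16 and UTF-32 are assumptions. The point it shows is ordering: the UTF-32 LE BOM (FF FE 00 00) starts with the UTF-16 LE BOM (FF FE), so the longer pattern has to be tried first.

```gleam
/// Illustrative BOM sniffing; `bom_charset` is a hypothetical name.
pub fn bom_charset(bytes: BitArray) -> Result(String, Nil) {
  case bytes {
    <<0xEF, 0xBB, 0xBF, _:bytes>> -> Ok("utf-8")
    // Four-byte BOMs come before their two-byte prefixes.
    <<0xFF, 0xFE, 0x00, 0x00, _:bytes>> -> Ok("utf-32le")
    <<0x00, 0x00, 0xFE, 0xFF, _:bytes>> -> Ok("utf-32be")
    <<0xFF, 0xFE, _:bytes>> -> Ok("utf-16le")
    <<0xFE, 0xFF, _:bytes>> -> Ok("utf-16be")
    // No BOM: a real detector would move on to the next signal.
    _ -> Error(Nil)
  }
}
```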

pub const default_detection_limit: Int

Default upper bound on the number of leading bytes inspected by detect and detect_strict.

3072 bytes is large enough for every signature this library ships: the largest fixed-offset check is application/x-tar at offset 257, and envelope checks such as ZIP central-directory inspection reach into the first few KB. The value matches the default used by Go’s gabriel-vasile/mimetype library. Pass an explicit limit via detect_with_limit / detect_with_limit_strict to override.

pub const default_mime_type: String

Fallback MIME type used when neither metadata nor byte signatures provide a more specific answer.

pub fn detect(bytes: BitArray) -> String

Detect a MIME type from the leading bytes of a blob.

This checks a curated set of common magic-number signatures. Currently supported MIME types are: application/pdf, application/zip, application/gzip, application/x-bzip2, application/x-xz, application/x-7z-compressed, application/x-rar-compressed, application/vnd.ms-cab-compressed, application/x-tar, application/zstd, application/vnd.sqlite3, application/vnd.apache.parquet, application/ogg, application/wasm, application/x-elf, audio/wav, audio/aiff, audio/mpeg, audio/flac, audio/midi, audio/mp4, image/png, image/jpeg, image/gif, image/bmp, image/tiff, image/x-icon, image/webp, image/avif, image/heic, video/x-msvideo, video/webm, video/quicktime, and video/mp4.

Returns "application/octet-stream" (the value of default_mime_type) when the input carries no recognisable magic bytes — including the empty BitArray. The fallback is silent: a caller that needs to distinguish “no signature matched” from “signature matched but produced application/octet-stream” should use detect_strict/1, which returns Error(Nil) for the no-match case.

pub fn detect_reader(
  read: fn(Int) -> Result(BitArray, String),
  limit: Int,
) -> String

Detect a MIME type by pulling at most limit leading bytes through a caller-supplied reader.

The reader is called once with limit as the requested byte count. If the reader returns an error, the default MIME type is returned.
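
A minimal sketch of the error path, assuming the module is imported as mimetype. The failing reader and its error message are made up for illustration.

```gleam
import mimetype

pub fn detect_from_broken_source() -> String {
  // A reader that always fails, standing in for an I/O error.
  let failing = fn(_requested: Int) { Error("disk unavailable") }

  // Per the description above, a reader error yields the default MIME type.
  mimetype.detect_reader(failing, mimetype.default_detection_limit)
}
```

Because the error is swallowed, callers that need to tell a failed read apart from unrecognised bytes may prefer to perform the read themselves and pass the bytes to detect_strict.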

pub fn detect_reader_strict(
  read: fn(Int) -> Result(BitArray, String),
  limit: Int,
) -> Result(String, Nil)

Detect a MIME type by pulling at most limit leading bytes through a caller-supplied reader.

This strict variant returns Error(Nil) when the reader fails or when no supported magic-number signature matches within the bytes returned.

pub fn detect_signature_only(
  bytes: BitArray,
) -> Result(String, Nil)

Detect a MIME type from a genuine binary or structural signature only.

Like detect_strict but excludes the printable-ASCII heuristic that otherwise classifies every plain-ASCII payload as text/plain. Returns Ok(mime_type) for byte magic numbers (PNG, JPEG, ZIP, text/plain; charset=utf-* BOMs, …) and structural sniffs that inspect bytes (JSON, HTML, XML, SVG). Returns Error(Nil) for arbitrary printable-ASCII text — letting the caller defer to a stronger out-of-band hint such as a filename extension.

This is the building block behind detect_with_filename / detect_with_extension: a .csv filename is a stronger signal than the byte-level fact “this is printable ASCII”, so those helpers consult the filename hint when the only thing the byte side could say was text/plain.

pub fn detect_signature_only_with_limit(
  bytes: BitArray,
  limit: Int,
) -> Result(String, Nil)

detect_signature_only with an explicit byte budget.

pub fn detect_strict(bytes: BitArray) -> Result(String, Nil)

Detect a MIME type from the leading bytes of a blob.

This strict variant returns Error(Nil) when no supported magic-number signature matches the input (including the empty BitArray), so the caller can distinguish “no signature found” from “signature matched”. Prefer this variant when the application/octet-stream fallback would be ambiguous; use detect/1 when an unconditional String is more convenient.
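
The difference between the two variants can be sketched as follows, assuming the module is imported as mimetype. The describe helper and its message strings are hypothetical.

```gleam
import mimetype

/// Hypothetical helper: report whether a signature actually matched.
pub fn describe(bytes: BitArray) -> String {
  case mimetype.detect_strict(bytes) {
    Ok(mime) -> "matched: " <> mime
    // detect would have silently returned the fallback here.
    Error(Nil) -> "no match, fallback is " <> mimetype.default_mime_type
  }
}
```

For the empty BitArray, detect returns application/octet-stream while detect_strict returns Error(Nil), so describe(<<>>) takes the second branch.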

pub fn detect_with_extension(
  bytes: BitArray,
  extension: String,
) -> String

Detect a MIME type from bytes, consulting an explicit extension hint when the byte signature alone is not specific enough.

Genuine binary signatures (PNG, JPEG, ZIP, BOM-tagged text, …) and structural sniffs (JSON, HTML, XML, SVG) win over the extension hint. The extension takes priority when the only thing the byte side could say was the printable-ASCII fallback text/plain — a .csv extension is a stronger signal for plain-ASCII payloads than the byte-level fact “this looks textish”. The printable-ASCII fallback is still used as a last resort when neither the byte signature nor the extension is recognisable.

pub fn detect_with_extension_strict(
  bytes: BitArray,
  extension: String,
) -> Result(String, Nil)

Detect a MIME type from bytes, consulting an explicit extension hint when the byte signature alone is not specific enough.

This strict variant returns Error(Nil) only when all three sources fail: the byte signature, the normalized extension, and the printable-ASCII fallback.

pub fn detect_with_filename(
  bytes: BitArray,
  filename: String,
) -> String

Detect a MIME type from bytes, consulting the filename extension when the byte signature alone is not specific enough.

Genuine binary signatures (PNG, JPEG, ZIP, BOM-tagged text, …) and structural sniffs (JSON, HTML, XML, SVG) win over the filename. The filename takes priority when the only thing the byte side could say was the printable-ASCII fallback text/plain — a report.csv filename is a stronger signal for plain-ASCII payloads than the byte-level fact “this looks textish”. The printable-ASCII fallback is still used as a last resort when neither the byte signature nor the filename’s extension is recognisable.
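
A usage sketch for the precedence rules above, assuming the module is imported as mimetype. The sample bytes and the classify_upload wrapper are illustrative, and the exact MIME string returned for .csv depends on the library's extension database, so no result is asserted here.

```gleam
import mimetype

/// Hypothetical wrapper around the combined bytes + filename lookup.
pub fn classify_upload(bytes: BitArray, filename: String) -> String {
  mimetype.detect_with_filename(bytes, filename)
}

pub fn main() {
  // Plain printable ASCII: the bytes alone could only say text/plain,
  // so the .csv extension is expected to decide the result.
  let csv = <<"id,name\n1,Ada\n":utf8>>
  classify_upload(csv, "report.csv")
}
```

Had the bytes carried a real binary signature, such as a PNG header, that signature would win regardless of the filename.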

pub fn detect_with_filename_strict(
  bytes: BitArray,
  filename: String,
) -> Result(String, Nil)

Detect a MIME type from bytes, consulting the filename extension when the byte signature alone is not specific enough.

This strict variant returns Error(Nil) only when all three sources fail: the byte signature, the filename extension, and the printable-ASCII fallback.

pub fn detect_with_limit(bytes: BitArray, limit: Int) -> String

Detect a MIME type from the leading bytes of a blob, examining at most limit bytes from the start of the input.

A non-positive limit is treated as zero, in which case no signature can match and the fallback MIME type is returned. Limits larger than the input are clamped to the input length.

pub fn detect_with_limit_strict(
  bytes: BitArray,
  limit: Int,
) -> Result(String, Nil)

Detect a MIME type from at most limit leading bytes.

Strict variant; returns Error(Nil) when no supported signature matches within the limit.

pub fn essence(mime_type: String) -> String

Return the bare MIME type without any parameters.

This trims surrounding whitespace, lowercases the media type, and strips anything after the first ;.
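
The three steps can be approximated with the standard library alone. This is an independent sketch of the described behaviour, not the library's implementation, and essence_sketch is a hypothetical name.

```gleam
import gleam/string

/// Illustrative re-implementation of the normalization described above.
pub fn essence_sketch(mime_type: String) -> String {
  // Keep only what precedes the first ";", if there is one.
  let before_params = case string.split_once(mime_type, on: ";") {
    Ok(#(head, _params)) -> head
    Error(Nil) -> mime_type
  }
  before_params
  |> string.trim
  |> string.lowercase
}
```

With this sketch, "  Text/HTML ; charset=UTF-8" reduces to "text/html".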

pub fn extension_to_mime_type(extension: String) -> String

Look up a MIME type from a file extension.

The input may include a leading dot and is normalized to lowercase before lookup. Unknown extensions fall back to application/octet-stream.

pub fn extension_to_mime_type_strict(
  extension: String,
) -> Result(String, Nil)

Look up a MIME type from a file extension.

This strict variant returns Error(Nil) when the normalized extension is not present in the generated database.

pub fn filename_to_mime_type(path: String) -> String

Look up a MIME type from the last extension component of a path or filename.

Query strings and URL fragments are ignored. Hidden files without a real extension, such as .gitignore, fall back to application/octet-stream.
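
A short usage sketch, assuming the module is imported as mimetype. The URL-style path is made up, and its result is left unasserted because it depends on the extension database.

```gleam
import mimetype

pub fn main() {
  // Documented edge case: a dotfile has no real extension.
  let assert "application/octet-stream" =
    mimetype.filename_to_mime_type(".gitignore")

  // The query string and fragment are ignored, so this lookup is
  // expected to behave like one for "logo.png".
  mimetype.filename_to_mime_type("assets/logo.png?v=2#top")
}
```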

pub fn filename_to_mime_type_strict(
  path: String,
) -> Result(String, Nil)

Look up a MIME type from the last extension component of a path or filename.

This strict variant returns Error(Nil) when the path does not contain a usable extension or the extension is unknown.

pub fn is_a(mime: String, parent: String) -> Bool

Return True when mime is parent or transitively inherits from parent in the static subtype tree.

The relation is reflexive (is_a(x, x) is always True for any non-empty x) and transitive (if a inherits from b and b inherits from c, then is_a(a, c) is True).

Both arguments are normalized via essence so parameters and case differences are ignored.
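
A small sketch of the reflexive and normalization rules, assuming the module is imported as mimetype. The is_svg helper is hypothetical.

```gleam
import mimetype

/// Hypothetical helper: True for SVG regardless of parameters or case.
pub fn is_svg(mime: String) -> Bool {
  mimetype.is_a(mime, "image/svg+xml")
}

pub fn main() {
  // Parameters and case are dropped by essence before comparison,
  // so this reduces to the reflexive case is_a(x, x).
  let assert True = is_svg("Image/SVG+XML; charset=utf-8")
  Nil
}
```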

pub fn is_audio(mime_type: String) -> Bool

Return True when the MIME type’s top-level media type is audio.

pub fn is_image(mime_type: String) -> Bool

Return True when the MIME type’s top-level media type is image.

pub fn is_text(mime_type: String) -> Bool

Return True when the MIME type’s top-level media type is text.

pub fn is_video(mime_type: String) -> Bool

Return True when the MIME type’s top-level media type is video.

pub fn is_xml_based(mime: String) -> Bool

Return True when mime is, or inherits from, an XML media type.

Both text/xml and application/xml are accepted as XML roots, in line with RFC 7303 which permits both. Returns True for image/svg+xml and any other *+xml types added to the hierarchy.

pub fn is_zip_based(mime: String) -> Bool

Return True when mime is, or inherits from, application/zip.

Convenience wrapper for is_a(mime, "application/zip"). Returns True for .docx / .xlsx / .epub / .apk and other ZIP-based container formats.

pub fn mime_type_to_extensions(mime_type: String) -> List(String)

Return all known extensions for a MIME type.

The input is trimmed, lowercased, and stripped of any MIME parameters (such as ; charset=utf-8) before lookup. Unknown MIME types return the empty list.

pub fn mime_type_to_extensions_strict(
  mime_type: String,
) -> Result(List(String), Nil)

Return all known extensions for a MIME type.

This strict variant returns Error(Nil) when the normalized MIME type is not present in the generated database.

pub fn parameter(
  mime_type: String,
  key: String,
) -> Result(String, Nil)

Look up a parameter value from a MIME type string.

Parameter names are matched case-insensitively. This returns Error(Nil) when the key is empty or the parameter is missing.
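
A usage sketch, assuming the module is imported as mimetype. The multipart header value and the boundary_of helper are illustrative.

```gleam
import mimetype

/// Hypothetical helper: extract the multipart boundary, if declared.
pub fn boundary_of(content_type: String) -> Result(String, Nil) {
  // Name matching is case-insensitive, so "Boundary=" in the header
  // should still be found with a lowercase key.
  mimetype.parameter(content_type, "boundary")
}

pub fn main() {
  // Documented: an empty key is always an error.
  let assert Error(Nil) = mimetype.parameter("text/plain; charset=utf-8", "")
  boundary_of("multipart/form-data; Boundary=abc123")
}
```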
