FileType


This package detects a file's MIME type and canonical extension by looking for magic numbers. It works by reading a small amount of data from the file (~256 bytes) and binary pattern matching against its contents.
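As a rough illustration of the technique, magic-number detection in Elixir can be written as function clauses that pattern match on the leading bytes of a binary. The module below is a hypothetical sketch, not FileType's actual implementation: the name `MagicSketch`, the three signatures chosen, and the `{:error, :unrecognized}` return value are all assumptions made for the example.

```elixir
defmodule MagicSketch do
  @moduledoc "Hypothetical illustration of magic-number detection; not FileType's code."

  # PNG files begin with the fixed 8-byte signature 89 50 4E 47 0D 0A 1A 0A.
  def detect(<<0x89, "PNG", 0x0D, 0x0A, 0x1A, 0x0A, _rest::binary>>),
    do: {:ok, {"png", "image/png"}}

  # GIF files begin with the ASCII text "GIF87a" or "GIF89a".
  def detect(<<"GIF8", version, "a", _rest::binary>>) when version in [?7, ?9],
    do: {:ok, {"gif", "image/gif"}}

  # PDF files begin with "%PDF-".
  def detect(<<"%PDF-", _rest::binary>>),
    do: {:ok, {"pdf", "application/pdf"}}

  # Anything else is unknown to this sketch (assumed error shape).
  def detect(_other), do: {:error, :unrecognized}
end
```

Each clause only inspects the first few bytes, which is why a short read from the file is enough for most formats.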

API Documentation

Usage

Detecting a file's type:

iex> FileType.from_path("profile.png")
{:ok, {"png", "image/png"}}

iex> FileType.from_path("contract.docx")
{:ok, {"docx", "application/vnd.openxmlformats-officedocument.wordprocessingml.document"}}

Detecting a file's type from an IO device:

iex> {:ok, file} = File.open("profile.png", [:read, :binary])
{:ok, file}

iex> FileType.from_io(file)
{:ok, {"png", "image/png"}}
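For readers curious how detection from an IO device might work, the sketch below reads up to 256 bytes from an open device with `IO.binread/2`. It is an assumption about the general approach, not FileType's code: the module name `HeaderSketch`, the function `read_header/1`, and the `{:error, :empty}` result are hypothetical. A `StringIO` device stands in for a file opened with `File.open/2`.

```elixir
defmodule HeaderSketch do
  # Number of leading bytes to inspect; the README mentions roughly 256.
  @header_size 256

  # Reads up to @header_size bytes from an already-open IO device.
  def read_header(device) do
    case IO.binread(device, @header_size) do
      :eof -> {:error, :empty}
      {:error, reason} -> {:error, reason}
      data when is_binary(data) -> {:ok, data}
    end
  end
end

# StringIO stands in here for a device returned by File.open/2.
{:ok, device} = StringIO.open("%PDF-1.7\n")
{:ok, header} = HeaderSketch.read_header(device)
IO.inspect(byte_size(header))
```

The returned header could then be handed to whatever pattern-matching clauses perform the actual detection.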

Installation

The package can be installed by adding file_type to your list of dependencies in mix.exs:

def deps do
  [
    {:file_type, "~> 0.1.0"}
  ]
end

Supported types

Document

  • docx - Microsoft Word Open XML Document
  • pptx - PowerPoint Open XML Presentation
  • xlsx - Microsoft Excel Open XML Spreadsheet
  • doc - Microsoft Word Document
  • ppt - PowerPoint Presentation
  • xls - Excel Spreadsheet
  • pdf - Portable Document Format File
  • epub - Open eBook File
  • mobi - Mobipocket eBook
  • odt - OpenDocument Text Document
  • ods - OpenDocument Spreadsheet
  • odp - OpenDocument Presentation
  • rtf - Rich Text Format File

Image

  • jpg - JPEG Image
  • png - Portable Network Graphic
  • apng - Animated Portable Network Graphic
  • gif - Graphical Interchange Format File
  • webp - WebP Image
  • flif - Free Lossless Image Format File
  • cr2 - Canon Raw Image File
  • cr3 - Canon Raw 3 Image File
  • orf - Olympus RAW File
  • arw - Sony Digital Camera Image
  • dng - Digital Negative Image File
  • nef - Nikon Electronic Format RAW Image
  • rw2 - Panasonic RAW Image
  • raf - Fuji RAW Image File
  • tif - Tagged Image File
  • bmp - Bitmap Image File
  • icns - macOS Icon Resource File
  • jxr - JPEG XR Image
  • psd - Adobe Photoshop Document
  • dmg - Apple Disk Image
  • ico - Icon File
  • bpg - BPG Image
  • jp2 - JPEG 2000 Core Image File
  • jpm - JPEG 2000 Compound Image File Format
  • jpx - JPEG 2000 Image File
  • heic - High Efficiency Image Format
  • cur - Windows Cursor
  • ktx - Khronos Texture
  • avif - AV1 Image
  • dcm - DICOM Image

Video

  • mp4 - MPEG-4 Video File
  • mkv - Matroska Video File
  • webm - WebM Video File
  • mov - Apple QuickTime Movie
  • avi - Audio Video Interleave File
  • mpg - MPEG Video File
  • ogv - Ogg Video File
  • ogm - Ogg Media File
  • flv - Flash Video File
  • mts - AVCHD Video File
  • mj2 - Motion JPEG 2000 Video Clip
  • 3gp - 3GPP Multimedia File
  • 3g2 - 3GPP2 Multimedia File
  • m4v - iTunes Video File
  • m4p - iTunes Music Store Audio File
  • f4v - Flash MP4 Video File
  • f4p - Adobe Flash Protected Media File

Audio

  • mp1 - MPEG-1 Layer 1 Audio File
  • mp2 - MPEG Layer II Compressed Audio File
  • mp3 - MP3 Audio File
  • aac - Advanced Audio Coding File
  • ogg - Ogg Vorbis Audio File
  • oga - Ogg Vorbis Audio File
  • spx - Ogg Speex Audio File
  • opus - Opus Audio File
  • flac - Free Lossless Audio Codec File
  • wav - WAVE Audio File
  • mid - MIDI File
  • qcp - PureVoice Audio File
  • amr - Adaptive Multi-Rate Codec File
  • aif - Audio Interchange File Format
  • ape - Monkey's Audio Lossless Audio File
  • wv - WavPack Audio File
  • mpc - Musepack Compressed Audio File
  • dsf - Delusion Digital Sound File
  • voc - Creative Labs Audio File
  • ac3 - Audio Codec 3 File
  • m4a - MPEG-4 Audio File
  • m4b - MPEG-4 Audiobook File
  • f4a - Adobe Flash Protected Audio File
  • f4b - Adobe Flash Audiobook File
  • it - Impulse Tracker Module
  • s3m - ScreamTracker 3 Module
  • xm - Fasttracker 2 Extended Module

Font

  • ttf - TrueType Font
  • otf - OpenType Font
  • woff - Web Open Font Format File
  • woff2 - Web Open Font Format 2.0 File
  • eot - Embedded OpenType Font

Archive

  • zip - Zipped File
  • tar - Consolidated Unix File Archive
  • rar - WinRAR Compressed Archive
  • gz - Gnu Zipped Archive
  • bz2 - Bzip2 Compressed File
  • 7z - 7-Zip Compressed File
  • xz - XZ Compressed Archive
  • ar - Unix Archive File
  • Z - Unix Compressed File
  • lz - Lzip Compressed File
  • cfb - Compound Binary File
  • cab - Windows Cabinet File
  • lzh - LZH Compressed File

Application

  • indd - Adobe InDesign Document
  • skp - SketchUp Document
  • blend - Blender 3D Data File
  • ics - Calendar File

Executable

  • exe - Windows Executable File
  • rpm - Red Hat Package Manager File
  • xpi - Cross-platform Installer Package
  • msi - Windows Installer Package
  • deb - Debian Software Package

Other

  • ogx - Ogg Vorbis Multiplexed Media File
  • swf - Shockwave Flash Movie
  • sqlite - SQLite Database File
  • nes - Nintendo (NES) ROM File
  • crx - Chrome Extension
  • mxf - Material Exchange Format File
  • wasm - WebAssembly Binary File
  • xml - XML File
  • glb - glTF Binary File
  • pcap - Packet Capture Data
  • lnk - Windows Shortcut
  • alias - macOS Alias
  • mie - Meta Information Encapsulation
  • shp - Shapes File
  • arrow - Arrow Columnar Format
  • ps - PostScript File
  • eps - Encapsulated PostScript File
  • pgp - PGP Security Key
  • stl - Stereolithography File

Contributing

Most files can be detected with a single binary pattern match. To contribute support for a new file type:

  1. Find an example file. Please make sure you have the rights to use this file.
  2. Register the fixture in test/file_type/integration_test.exs.
  3. Write some code to detect the file's type in lib/file_type/magic.ex.
  4. Update the README to include a mention of your new file format.
  5. Send a pull request!
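As a hedged sketch of what step 3 might involve, here is a single clause for a format not in the list above: Zstandard, whose frames start with the bytes 28 B5 2F FD. The module and function names are hypothetical, and the real layout of lib/file_type/magic.ex may differ, so treat this only as an illustration of the shape of such a contribution.

```elixir
defmodule NewTypeSketch do
  # Zstandard frames begin with the magic number 0xFD2FB528, stored little-endian.
  def detect(<<0x28, 0xB5, 0x2F, 0xFD, _rest::binary>>),
    do: {"zst", "application/zstd"}

  # Fallback for anything this sketch does not recognize (assumed return value).
  def detect(_other), do: nil
end

# A tiny synthetic header standing in for a real fixture file.
IO.inspect(NewTypeSketch.detect(<<0x28, 0xB5, 0x2F, 0xFD, 0x00>>))
```

A real contribution would pair a clause like this with a fixture file registered in the integration test, as described in step 2.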

Please note that this library is not intended to detect text-based file formats like CSV, JSON, etc.

Prior Art