DateTimeParser (DateTimeParser v1.1.3)

The biggest ambiguity between datetime formats is whether a string is ymd (year month day), mdy (month day year), or dmy (day month year). This is resolved by checking whether the string uses slashes or dashes. If it uses slashes, then dmy is tried first. All other cases use the international format ymd. Under the right conditions, dmy with dashes can also be parsed if the month is a vocal month, that is, written out as a word (eg, "Jan").
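The separator rule can be sketched with the standard library alone. This is only a minimal illustration of the rule as worded above, not the library's tokenizer; the module name OrderSketch, the function first_guess/1, and the month-word regex are assumptions made for the example.

```elixir
defmodule OrderSketch do
  # Hypothetical helper: returns the field order to try first,
  # following only the slash/dash rule described in the text.
  def first_guess(string) when is_binary(string) do
    cond do
      # Slashes: try day-month-year first.
      String.contains?(string, "/") -> :dmy
      # Dashes plus a month written as a word: day-month-year as well.
      String.contains?(string, "-") and month_word?(string) -> :dmy
      # Everything else: the international year-month-day order.
      true -> :ymd
    end
  end

  # Assumption: a "vocal month" starts with an English three-letter month prefix.
  defp month_word?(string) do
    Regex.match?(~r/\b(jan|feb|mar|apr|may|jun|jul|aug|sep|oct|nov|dec)/i, string)
  end
end

OrderSketch.first_guess("01/01/2017")
#=> :dmy
OrderSketch.first_guess("2019-03-11")
#=> :ymd
OrderSketch.first_guess("01-Jan-2017")
#=> :dmy
```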

If the string consists of only digits, then we will try one of two other parsers depending on the number of digits: Epoch or Serial. Otherwise, we'll try the tokenizer.

If the string is 10-11 digits with optional precision, then we'll try to parse it as a Unix Epoch timestamp.
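To make the Epoch case concrete, here is a minimal sketch that uses only the standard library's DateTime.from_unix/1. It is not the library's Epoch parser: the module name EpochSketch is hypothetical, and the optional precision (a fractional part) is not handled.

```elixir
defmodule EpochSketch do
  # Hypothetical helper: accepts only a string of 10 or 11 digits
  # and interprets it as whole seconds since the Unix epoch.
  def parse(string) when is_binary(string) do
    if Regex.match?(~r/\A\d{10,11}\z/, string) do
      string |> String.to_integer() |> DateTime.from_unix()
    else
      {:error, :not_an_epoch}
    end
  end
end

EpochSketch.parse("1564154204")
#=> {:ok, ~U[2019-07-26 15:16:44Z]}

EpochSketch.parse("44262")
#=> {:error, :not_an_epoch}
```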

If the string is 1-5 digits with optional precision, then we'll try to parse it as a Serial timestamp (spreadsheet time), treating 1899-12-31 as 1. Excel-produced dates from 1900-01-01 until 1900-03-01 will therefore come out differently from what Excel displays; Excel's dates in that range are themselves incorrect, because Excel treats 1900 as a leap year.
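The Serial arithmetic can be sketched as follows. This is an illustration under one assumption, namely that day zero is 1899-12-30 so that serial 1 lands on 1899-12-31 as stated above; the module name SerialSketch is hypothetical and this is not the library's Serial parser.

```elixir
defmodule SerialSketch do
  @seconds_per_day 86_400

  # Integers carry only a date: count whole days from day zero.
  def convert(serial) when is_integer(serial) do
    Date.add(~D[1899-12-30], serial)
  end

  # Floats also carry a time of day in the fractional part.
  def convert(serial) when is_float(serial) do
    days = trunc(serial)
    seconds = round((serial - days) * @seconds_per_day)
    NaiveDateTime.add(~N[1899-12-30 00:00:00], days * @seconds_per_day + seconds, :second)
  end
end

SerialSketch.convert(1)
#=> ~D[1899-12-31]

SerialSketch.convert(44262)
#=> ~D[2021-03-07]

SerialSketch.convert(41261.6013888889)
#=> ~N[2012-12-18 14:26:00]
```

The fractional part is rounded to whole seconds because spreadsheet fractions such as 0.6013888889 are already truncated decimal representations of a time of day.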

| digits | parser | range | notes |
|---|---|---|---|
| 1-5 | Serial | low = 1900-01-01, high = 2173-10-15. Negative numbers go to 1626-03-17 | Floats indicate time. Integers do not. |
| 6-9 | Tokenizer | any | This allows for "20190429" to be parsed as 2019-04-29 |
| 10-11 | Epoch | low = -1100-02-15 14:13:21, high = 5138-11-16 09:46:39 | If padded with 0s, then it can capture entire range. |
| else | Tokenizer | any | |
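The digit-count rules can be read as a small dispatch function. The sketch below is an assumption-laden illustration rather than the library's code: the module name DispatchSketch and the returned atoms are hypothetical labels, and counting only the digits before an optional fractional part (with an optional leading minus sign) is one interpretation of "optional precision".

```elixir
defmodule DispatchSketch do
  # Hypothetical helper: names the parser the digit-count rules point to.
  def pick(string) when is_binary(string) do
    # Capture the integer digits; allow an optional sign and fractional part.
    case Regex.run(~r/\A-?(\d+)(?:\.\d+)?\z/, string) do
      [_whole, digits] -> by_digit_count(String.length(digits))
      nil -> :tokenizer
    end
  end

  defp by_digit_count(n) when n in 1..5, do: :serial
  defp by_digit_count(n) when n in 10..11, do: :epoch
  defp by_digit_count(_n), do: :tokenizer
end

DispatchSketch.pick("44262")
#=> :serial
DispatchSketch.pick("20190429")
#=> :tokenizer
DispatchSketch.pick("1564154204")
#=> :epoch
DispatchSketch.pick("19 September 2018")
#=> :tokenizer
```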

Required reading

Examples

iex> DateTimeParser.parse("19 September 2018 08:15:22 AM")
{:ok, ~N[2018-09-19 08:15:22]}

iex> DateTimeParser.parse_datetime("19 September 2018 08:15:22 AM")
{:ok, ~N[2018-09-19 08:15:22]}

iex> DateTimeParser.parse_datetime("2034-01-13", assume_time: true)
{:ok, ~N[2034-01-13 00:00:00]}

iex> DateTimeParser.parse_datetime("2034-01-13", assume_time: ~T[06:00:00])
{:ok, ~N[2034-01-13 06:00:00]}

iex> DateTimeParser.parse("invalid date 10:30pm")
{:ok, ~T[22:30:00]}

iex> DateTimeParser.parse("2019-03-11T99:99:99")
{:ok, ~D[2019-03-11]}

iex> DateTimeParser.parse("2019-03-11T10:30:00pm UNK")
{:ok, ~N[2019-03-11T22:30:00]}

iex> DateTimeParser.parse("2019-03-11T22:30:00.234+00:00")
{:ok, DateTime.from_naive!(~N[2019-03-11T22:30:00.234Z], "Etc/UTC")}
# `~U[2019-03-11T22:30:00.234Z]` in Elixir 1.9+

iex> DateTimeParser.parse_date("2034-01-13")
{:ok, ~D[2034-01-13]}

iex> DateTimeParser.parse_date("01/01/2017")
{:ok, ~D[2017-01-01]}

iex> DateTimeParser.parse_datetime("1564154204")
{:ok, DateTime.from_naive!(~N[2019-07-26T15:16:44Z], "Etc/UTC")}
# `~U[2019-07-26T15:16:44Z]` in Elixir 1.9+

iex> DateTimeParser.parse_datetime("41261.6013888889")
{:ok, ~N[2012-12-18T14:26:00]}

iex> DateTimeParser.parse_date("44262")
{:ok, ~D[2021-03-07]}
# This is a serial number date, commonly found in spreadsheets, eg: `=VALUE("03/07/2021")`

iex> DateTimeParser.parse_datetime("1/1/18 3:24 PM")
{:ok, ~N[2018-01-01T15:24:00]}

iex> DateTimeParser.parse_datetime("1/1/18 3:24 PM", assume_utc: true)
{:ok, DateTime.from_naive!(~N[2018-01-01T15:24:00Z], "Etc/UTC")}
# `~U[2018-01-01T15:24:00Z]` in Elixir 1.9+

iex> DateTimeParser.parse_datetime(~s|"Mar 28, 2018 7:39:53 AM PDT"|, to_utc: true)
{:ok, DateTime.from_naive!(~N[2018-03-28T14:39:53Z], "Etc/UTC")}

iex> {:ok, datetime} = DateTimeParser.parse_datetime(~s|"Mar 1, 2018 7:39:53 AM PST"|)
iex> datetime
#DateTime<2018-03-01 07:39:53-08:00 PST PST8PDT>

iex> DateTimeParser.parse_datetime(~s|"Mar 1, 2018 7:39:53 AM PST"|, to_utc: true)
{:ok, DateTime.from_naive!(~N[2018-03-01T15:39:53Z], "Etc/UTC")}

iex> {:ok, datetime} = DateTimeParser.parse_datetime(~s|"Mar 28, 2018 7:39:53 AM PDT"|)
iex> datetime
#DateTime<2018-03-28 07:39:53-07:00 PDT PST8PDT>

iex> DateTimeParser.parse_time("10:13pm")
{:ok, ~T[22:13:00]}

iex> DateTimeParser.parse_time("10:13:34")
{:ok, ~T[10:13:34]}

iex> DateTimeParser.parse_time("18:14:21.145851000000Z")
{:ok, ~T[18:14:21.145851]}

iex> DateTimeParser.parse_datetime(nil)
{:error, "Could not parse nil"}

Installation

Add date_time_parser to your list of dependencies in mix.exs:

def deps do
  [
    {:date_time_parser, "~> 1.1.2"}
  ]
end

Configuration

# This is the default config
alias DateTimeParser.Parser
config :date_time_parser, parsers: [Parser.Epoch, Parser.Serial, Parser.Tokenizer]

# To enable only specific parsers, include them in the :parsers key.
config :date_time_parser, parsers: [Parser.Tokenizer]

# Or at runtime, pass the parsers as an option to the function.
DateTimeParser.parse(mystring, parsers: [Parser.Tokenizer])

Write your own parser

You can write your own parser!

If the built-in parsers are not applicable for your use-case, you may build your own parser to use with this library. Let's write a simple one together.

First I will check DateTimeParser.Parser to see what behaviour my new parser should implement. It needs two functions:

  1. DateTimeParser.Parser.preflight/1
  2. DateTimeParser.Parser.parse/1

These functions accept the DateTimeParser.Parser.t/0 struct which contains the options supplied by the user, the string itself, and the context for which you should return your result. For example, if the context is :time then you should return a %Time{}; if :datetime you should return either a %NaiveDateTime{} or a %DateTime{}; if :date then you should return a %Date{}.

Let's implement a parser that reads a special time string. Our string will represent time, but the hour and the minute are each shifted up by 10, and the string must be prefixed with the secret word: "boomshakalaka:". For example, the real world time of 01:10 is represented as boomshakalaka:11:20 in our toy time format. 12:30 is represented as boomshakalaka:22:40, and 5:55 is represented as boomshakalaka:15:65.

defmodule MyParser do
  @behaviour DateTimeParser.Parser
  @secret_regex ~r|boomshakalaka:(?<time>\d{2}:\d{2})|

  def preflight(%{string: string} = parser) do
    case Regex.named_captures(@secret_regex, string) do
      %{"time" => time} ->
        {:ok, %{parser | preflight: time}}

      nil ->
        {:error, :not_compatible}
    end
  end

  # ... more below
end

We'll stop here first and go through the preflight function. Our special parser will only be attempted if the supplied string produces named captures from the regex. That is, it must contain boomshakalaka: followed by 2 digits, a colon, and 2 more digits. These digits are extracted in the form 00:00, where 0 is any digit. If 05:40 is passed in, it would not be compatible, so the parser will be skipped.

Now let's parse the time:

def parse(%{preflight: time} = parser) do
  [hour, minute] = String.split(time, ":")
  {hour, ""} = Integer.parse(hour)
  {minute, ""} = Integer.parse(minute)
  result = Time.new(hour - 10, minute - 10, 0, {0, 0})
  for_context(parser.context, result)
end

defp for_context(:datetime, _result), do: :error
defp for_context(:date, _result), do: :error
defp for_context(:time, result), do: result

Notice that we need to consider the context of the result. If the user asked for a DateTime, then we need to give them one. Our toy format only represents time, so we must return an error when the context is :datetime or :date.

DateTimeParser.parse_time("boomshakalaka:11:11", parsers: [MyParser])
#=> {:ok, ~T[01:01:00]}

DateTimeParser.parse_date("boomshakalaka:11:11", parsers: [MyParser])
#=> {:error, "Could not parse \"boomshakalaka:11:11\""}

DateTimeParser.parse_datetime("boomshakalaka:11:11", parsers: [MyParser])
#=> {:error, "Could not parse \"boomshakalaka:11:11\""}

DateTimeParser.parse("boomshakalaka:11:11", parsers: [MyParser])
#=> {:ok, ~T[01:01:00]}
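To try the two callbacks without installing the library, the walkthrough can be assembled into one standalone module. This is only a sketch: the module name MyParserSketch and the run/2 helper are hypothetical, a plain map stands in for the parser struct, and the @behaviour line is left out because the behaviour module is not available outside the library.

```elixir
defmodule MyParserSketch do
  # Hypothetical driver: builds a stand-in for the parser struct
  # and runs preflight followed by parse, as the text describes.
  def run(string, context) do
    parser = %{string: string, context: context, preflight: nil}

    with {:ok, parser} <- preflight(parser) do
      parse(parser)
    end
  end

  # Skip the parser unless the secret prefix and two digit pairs are present.
  def preflight(%{string: string} = parser) do
    case Regex.named_captures(~r|boomshakalaka:(?<time>\d{2}:\d{2})|, string) do
      %{"time" => time} -> {:ok, %{parser | preflight: time}}
      nil -> {:error, :not_compatible}
    end
  end

  # Undo the shift of 10 on both the hour and the minute.
  def parse(%{preflight: time, context: context}) do
    [hour, minute] = String.split(time, ":")
    result = Time.new(String.to_integer(hour) - 10, String.to_integer(minute) - 10, 0)
    for_context(context, result)
  end

  # The toy format carries a time only, so other contexts are errors.
  defp for_context(:time, result), do: result
  defp for_context(_other, _result), do: :error
end

MyParserSketch.run("boomshakalaka:11:11", :time)
#=> {:ok, ~T[01:01:00]}
MyParserSketch.run("boomshakalaka:11:11", :date)
#=> :error
MyParserSketch.run("05:40", :time)
#=> {:error, :not_compatible}
```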

Should I use this library?

Only as a last resort. Parsing dates from strings is educated guessing at best. Since Elixir natively supports ISO-8601 parsing (see from_iso8601/2 functions), it's highly recommended that you rely on that first and foremost.
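As a minimal sketch of the ISO-8601-first approach, the standard library parsers can be chained so that the first strict match wins. The module name IsoFirst and its :not_iso8601 error atom are hypothetical; the from_iso8601/1 calls are the standard library's own.

```elixir
defmodule IsoFirst do
  # Tries each strict parser in turn; `with` returns the first {:ok, value}
  # because that value does not match the {:error, _} pattern.
  def parse(string) when is_binary(string) do
    with {:error, _} <- datetime(string),
         {:error, _} <- NaiveDateTime.from_iso8601(string),
         {:error, _} <- Date.from_iso8601(string),
         {:error, _} <- Time.from_iso8601(string) do
      {:error, :not_iso8601}
    end
  end

  # DateTime.from_iso8601/1 also returns the UTC offset in seconds; drop it here.
  defp datetime(string) do
    case DateTime.from_iso8601(string) do
      {:ok, datetime, _offset} -> {:ok, datetime}
      {:error, _reason} = error -> error
    end
  end
end

IsoFirst.parse("2034-01-13")
#=> {:ok, ~D[2034-01-13]}
IsoFirst.parse("2019-03-11T22:30:00")
#=> {:ok, ~N[2019-03-11 22:30:00]}
IsoFirst.parse("19 September 2018")
#=> {:error, :not_iso8601}
```

Only when a chain like this fails would a guessing parser be worth considering.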

When designing your API that involves dates and strings, be specific with your requirements and supported DateTime strings, and preferably only support ISO-8601 with no exceptions. There is no ambiguity with this format so parsing to DateTime (or Date or Time) will always be correct.

This library is helpful when you must accept ambiguous DateTime string formats and having incorrect results is acceptable. Do not use this library when the resulting (and possibly incorrect) DateTime has catastrophic and dangerous effects in your system.

Summary

Functions

parse(string, opts \\ [])

Parse a %DateTime{}, %NaiveDateTime{}, %Date{}, or %Time{} from a string.

parse!(string, opts \\ [])

Parse a %DateTime{}, %NaiveDateTime{}, %Date{}, or %Time{} from a string. Raises a DateTimeParser.ParseError when parsing fails.

parse_date(string, opts \\ [])

Parse %Date{} from a string.

parse_date!(string, opts \\ [])

Parse a %Date{} from a string. Raises a DateTimeParser.ParseError when parsing fails.

parse_datetime(string, opts \\ [])

Parse a %DateTime{} or %NaiveDateTime{} from a string.

parse_datetime!(string, opts \\ [])

Parse a %DateTime{} or %NaiveDateTime{} from a string. Raises a DateTimeParser.ParseError when parsing fails.

parse_time(string, opts \\ [])

Parse %Time{} from a string. Accepts options parse_time_options/0

parse_time!(string, opts \\ [])

Parse %Time{} from a string. Raises a DateTimeParser.ParseError when parsing fails.

Types

Specs

assume_date() :: {:assume_date, boolean() | Date.t()}

Specs

assume_time() :: {:assume_time, boolean() | Time.t()}

Specs

assume_utc() :: {:assume_utc, boolean()}

Specs

parse_date_options() :: [assume_date() | parsers()]

Options for parse_date/2

  • :assume_date Default false. If a date cannot be fully determined, then it will not be assumed by default. If you supply true, then Date.utc_today() will be assumed. You can also supply your own date, and the found tokens will be merged with it.

Specs

parse_datetime_options() :: [
  assume_utc() | to_utc() | assume_time() | use_1904_date_system() | parsers()
]

Options for parse_datetime/2

  • :assume_utc Default false. Only applicable for strings where parsing could not determine a timezone. Instead of returning a NaiveDateTime, this option will assume them to be in UTC timezone, and therefore return a DateTime. If the timezone is determined, then it will continue to be returned in the original timezone. See to_utc option to also convert it to UTC.

  • :to_utc Default false. If there's a timezone detected in the string, then attempt to convert to UTC timezone. If you know that your timestamps are in the future and you are going to store them for later use, it may be better to convert to UTC and also keep the original timestamp, since governments may change timezone rules before the timestamp elapses, making the stored UTC timestamp wrong or invalid. Check out the guide on future timestamps.

  • :assume_time Default false. If a time cannot be determined, then it will not be assumed by default. If you supply true, then ~T[00:00:00] will be assumed. You can also supply your own time, and the found tokens will be merged with it.

  • :use_1904_date_system Default false. For Serial timestamps, the parser will use the 1900 Date System by default. If you supply true, then the 1904 Date System will be used to parse the timestamp.

  • :parsers The parsers to use when analyzing the string. When Parser.Tokenizer is used, the appropriate tokenizer will be chosen depending on the function called and the conditions found in the string. Order matters and determines the order in which parsers are attempted. These are the available built-in parsers, listed in their default order:

    1. DateTimeParser.Parser.Epoch
    2. DateTimeParser.Parser.Serial
    3. DateTimeParser.Parser.Tokenizer
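In standard-library terms, :assume_time and :assume_utc roughly correspond to the two conversions sketched below. This is an illustration only: the module name OptionsSketch and its functions are hypothetical, the merging of found tokens with a supplied time is not modelled, and :to_utc is omitted because converting from a named timezone needs a timezone database.

```elixir
defmodule OptionsSketch do
  # :assume_time set to true: attach midnight to a bare date.
  def with_time(%Date{} = date, true), do: with_time(date, ~T[00:00:00])

  # :assume_time set to a Time: attach that time instead.
  def with_time(%Date{} = date, %Time{} = time), do: NaiveDateTime.new(date, time)

  # :assume_utc: label a NaiveDateTime as UTC without shifting the clock.
  def assume_utc(%NaiveDateTime{} = naive), do: DateTime.from_naive(naive, "Etc/UTC")
end

OptionsSketch.with_time(~D[2034-01-13], true)
#=> {:ok, ~N[2034-01-13 00:00:00]}

OptionsSketch.with_time(~D[2034-01-13], ~T[06:00:00])
#=> {:ok, ~N[2034-01-13 06:00:00]}

OptionsSketch.assume_utc(~N[2018-01-01 15:24:00])
#=> {:ok, ~U[2018-01-01 15:24:00Z]}
```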

Specs

Options for parse/2.

Combination of parse_date_options/0 and parse_datetime_options/0 and parse_time_options/0

Specs

parse_time_options() :: [parsers()]

Options for parse_time/2.

See parse_datetime_options/0 for further definition.

Specs

parsers() :: {:parsers, [atom()]}

List of modules that implement the DateTimeParser.Parser behaviour.

Specs

to_utc() :: {:to_utc, boolean()}

Specs

use_1904_date_system() :: {:use_1904_date_system, boolean()}

Functions

parse(string, opts \\ [])

Specs

parse(String.t() | nil, parse_options()) ::
  {:ok, DateTime.t() | NaiveDateTime.t() | Date.t() | Time.t()}
  | {:error, String.t()}

Parse a %DateTime{}, %NaiveDateTime{}, %Date{}, or %Time{} from a string.

Accepts parse_options/0

parse!(string, opts \\ [])

Specs

parse!(String.t() | nil, parse_options()) ::
  DateTime.t() | NaiveDateTime.t() | Date.t() | Time.t() | no_return()

Parse a %DateTime{}, %NaiveDateTime{}, %Date{}, or %Time{} from a string. Raises a DateTimeParser.ParseError when parsing fails.

Accepts parse_options/0.

parse_date(string, opts \\ [])

Specs

parse_date(String.t() | nil, parse_date_options()) ::
  {:ok, Date.t()} | {:error, String.t()}

Parse %Date{} from a string.

Accepts options parse_date_options/0

parse_date!(string, opts \\ [])

Specs

parse_date!(String.t() | nil, parse_datetime_options()) ::
  Date.t() | no_return()

Parse a %Date{} from a string. Raises a DateTimeParser.ParseError when parsing fails.

Accepts options parse_date_options/0.

parse_datetime(string, opts \\ [])

Specs

parse_datetime(String.t() | nil, parse_datetime_options()) ::
  {:ok, DateTime.t() | NaiveDateTime.t()} | {:error, String.t()}

Parse a %DateTime{} or %NaiveDateTime{} from a string.

Accepts options parse_datetime_options/0

parse_datetime!(string, opts \\ [])

Specs

parse_datetime!(String.t() | nil, parse_datetime_options()) ::
  DateTime.t() | NaiveDateTime.t() | no_return()

Parse a %DateTime{} or %NaiveDateTime{} from a string. Raises a DateTimeParser.ParseError when parsing fails.

Accepts options parse_datetime_options/0.

parse_time(string, opts \\ [])

Specs

parse_time(String.t() | nil, parse_time_options()) ::
  {:ok, Time.t()} | {:error, String.t()}

Parse %Time{} from a string. Accepts options parse_time_options/0

parse_time!(string, opts \\ [])

Specs

parse_time!(String.t() | nil, parse_time_options()) :: Time.t() | no_return()

Parse %Time{} from a string. Raises a DateTimeParser.ParseError when parsing fails.