Link (link v1.0.11)
Link is a little link parsing, compacting and shortening library.
Summary
Functions
add_target_blank/1
Adds the target="_blank" attribute to links
so that links open in a new Tab/Window.
compact/1
reduces a URL to its most compact form.
This is highly specific to our use case.
compact_github_url/1
does exactly as its name implies:
compacts a GitHub URL down to its simplest version
so that we aren't wasting characters on a mobile screen.
find/1
finds all instances of a URL in a block of text.
find_replace_compact/1
finds all instances of a URL in a block of text
and replaces them with the compact/1 version.
regex/0
returns the Regular Expression needed to match URLs.
According to RFC 3986: https://www.rfc-editor.org/rfc/rfc3986
Based on reading https://mathiasbynens.be/demo/url-regex
After much searching on Google, GitHub and StackOverflow,
this is what we came up with.
strip_protocol/1
strips the protocol e.g: "https://" from a URL.
strip_trailing_slash/1
strips the trailing forward slash from a URL.
Functions
add_target_blank(text)
add_target_blank/1
Adds the target="_blank" attribute to links
so that links open in a new Tab/Window.
Examples
iex> Link.add_target_blank(~s(My <a href="https://link.com">Link</a>))
~s(My <a target="_blank" href="https://link.com">Link</a>)
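The behaviour shown above can be approximated with a plain string replacement. The following is a minimal sketch, not the library's actual implementation: the module name TargetBlankSketch is hypothetical, and it assumes every anchor tag starts with "<a " and does not already carry a target attribute.

```elixir
defmodule TargetBlankSketch do
  # Hypothetical helper (an assumption, not Link's own code):
  # inserts target="_blank" right after every "<a " opening.
  def add_target_blank(text) do
    String.replace(text, "<a ", ~s(<a target="_blank" ))
  end
end

TargetBlankSketch.add_target_blank(~s(My <a href="https://link.com">Link</a>))
# => ~s(My <a target="_blank" href="https://link.com">Link</a>)
```

A link that already has a target attribute would end up with two under this sketch, so real input may need a stricter check.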
compact(url)
compact/1
reduces a URL to its most compact form.
This is highly specific to our use case.
Examples
iex> Link.compact("https://github.com/dwyl/mvp/issues/141")
"dwyl/mvp#141"
# Can't understand the URL, just return it sans protocol:
iex> Link.compact("https://git.io/top")
"git.io/top"
iex> Link.compact("https://mvp.fly.dev/")
"mvp.fly.dev"
iex> Link.compact("https://github.com/dwyl/link")
"dwyl/link"
compact_github_url(url)
compact_github_url/1
does exactly as its name implies:
compacts a GitHub URL down to its simplest version
so that we aren't wasting characters on a mobile screen.
Examples
iex> Link.compact_github_url("https://github.com/dwyl/mvp/issues/141")
"dwyl/mvp#141"
iex> Link.compact_github_url("https://github.com/dwyl/app/issues/275#issuecomment-1646862277")
"dwyl/app#275"
iex> Link.compact_github_url("https://github.com/dwyl/link")
"dwyl/link"
iex> Link.compact_github_url("https://github.com/dwyl/link#123")
"dwyl/link"
iex> Link.compact_github_url("https://github.com/dwyl/link/pull/5")
"dwyl/link/PR#5"
find(text)
find/1
finds all instances of a URL in a block of text.
Examples
iex> Link.find("Text with links http://goo.gl/3co4ae and https://git.io/top & www.dwyl.com etc.")
["http://goo.gl/3co4ae", "https://git.io/top", "www.dwyl.com"]
# Find a link without any other text or whitespace #8
iex> Link.find("https://github.com/dwyl/link/pull/5#pullrequestreview-1558913764")
["https://github.com/dwyl/link/pull/5#pullrequestreview-1558913764"]
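Finding every URL in a block of text maps naturally onto Regex.scan/2. The sketch below uses a deliberately simplified stand-in pattern rather than Link.regex/0, so it only recognises URLs that start with "http://", "https://" or "www."; the module name FindSketch is hypothetical.

```elixir
defmodule FindSketch do
  # Simplified stand-in pattern (an assumption, not Link.regex/0):
  # a URL starts with http://, https:// or www. and runs to the next whitespace.
  defp url_regex, do: ~r{(?:https?://|www\.)\S+}

  def find(text) do
    url_regex()
    |> Regex.scan(text)
    # Regex.scan/2 returns one list per match, so flatten to a list of strings.
    |> List.flatten()
  end
end

FindSketch.find("Text with links http://goo.gl/3co4ae and https://git.io/top & www.dwyl.com etc.")
# => ["http://goo.gl/3co4ae", "https://git.io/top", "www.dwyl.com"]
```

Unlike the library's pattern, this stand-in would miss a bare domain such as "dwyl.com" and would include trailing punctuation attached to a URL.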
find_replace_compact(text)
find_replace_compact/1
finds all instances of a URL in a block of text
and replaces them with the compact/1 version.
Only compact the links that aren't surrounded by brackets "()" i.e:
"This is our MVP: https://mvp.fly.dev/ please try it!"
Becomes:
"This is our MVP: mvp.fly.dev please try it!"
But if the text already has a Markdown link or an image, don't compact the URL! e.g:
"Please try our App: app.dwyl.com feedback welcome!"
No change required because it's already hyperlinked.
Examples
iex> md = "# Hello World! https://github.com/dwyl/mvp/issues/141#issuecomment-1657954420 and https://mvp.fly.dev/ "
iex> Link.find_replace_compact(md)
"# Hello World! [dwyl/mvp#141](https://github.com/dwyl/mvp/issues/141#issuecomment-1657954420) and [mvp.fly.dev](https://mvp.fly.dev/) "
# Does not attempt to compact an existing markdown [link](https://github.com/dwyl/link) or 
iex> md = "existing markdown [link](https://github.com/dwyl/link) or "
iex> Link.find_replace_compact(md)
"existing markdown [link](https://github.com/dwyl/link) or "
iex> Link.find_replace_compact("https://github.com/dwyl/link/pull/5#pullrequestreview-1558913764")
"[dwyl/link/PR#5](https://github.com/dwyl/link/pull/5#pullrequestreview-1558913764)"
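The rule of skipping URLs that already sit inside a Markdown link can be sketched with a negative lookbehind and Regex.replace/3. Everything here is an assumption made for illustration: the module name ReplaceSketch, the simplified URL pattern, and the fallback compact/1 helper that only strips the scheme and trailing slash.

```elixir
defmodule ReplaceSketch do
  # Hypothetical fallback compaction: drop the scheme and any trailing slash.
  defp compact(url) do
    url
    |> String.replace(~r{^https?://}, "")
    |> String.trim_trailing("/")
  end

  def find_replace_compact(text) do
    # (?<!\() skips URLs directly preceded by "(", i.e. existing Markdown links.
    Regex.replace(~r{(?<!\()https?://\S+}, text, fn url ->
      "[#{compact(url)}](#{url})"
    end)
  end
end

ReplaceSketch.find_replace_compact("This is our MVP: https://mvp.fly.dev/ please try it!")
# => "This is our MVP: [mvp.fly.dev](https://mvp.fly.dev/) please try it!"
```

The lookbehind keeps the sketch short, but it treats any URL preceded by an opening bracket as already linked, which is a looser test than checking for full Markdown link syntax.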
regex()
regex/0
returns the Regular Expression needed to match URLs.
According to RFC 3986: https://www.rfc-editor.org/rfc/rfc3986
Based on reading https://mathiasbynens.be/demo/url-regex
After much searching on Google, GitHub and StackOverflow,
this is what we came up with.
#HELPWANTED: if you find a better (faster, more compliant) RegEx that passes all our tests, please share! github.com/dwyl/link/issues/new
Explanation:
((http(s)?):// # Optional scheme (http or https)
(www.)? # Optional "www"
[a-zA-Z0-9@:%._+~#=]{2,256} # Domain (IPv4, IPv6 or hostname)
:% # Optional port number
[a-z]{2,6} # Domain extension e.g: ".com"
(?:?[^\n]*)? # Optional query string ?q=
(?:#[^\n]*)? # Optional fragment e.g: #comment
Examples
iex> Regex.run(Link.regex(), "dwyl.com") |> List.flatten |> Enum.filter(& &1 != "") |> List.first
"dwyl.com"
strip_protocol(url)
strip_protocol/1
strips the protocol e.g: "https://" from a URL.
Examples
iex> Link.strip_protocol("https://dwyl.com")
"dwyl.com"
strip_trailing_slash(url)
strip_trailing_slash/1
strips the trailing forward slash from a URL.
Examples
iex> Link.strip_trailing_slash("dwyl.com/")
"dwyl.com"
valid?(url)
valid?/1
confirms if a URL is valid using the RFC 3986 compliant regex/0 above.
Examples
iex> Link.valid?("example.c")
false
iex> Link.valid?("https://www.example.com")
true
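Validation of this kind usually comes down to matching the whole string against an anchored pattern. The sketch below assumes a simplified pattern loosely modelled on the explanation given for regex/0; it is not the library's regular expression, and the module name ValidSketch is hypothetical.

```elixir
defmodule ValidSketch do
  # Simplified, anchored pattern (an assumption, not Link.regex/0):
  # optional scheme, optional "www.", a domain, then a 2-6 letter extension.
  defp url_regex do
    ~r|^(?:https?://)?(?:www\.)?[a-zA-Z0-9@:%._+~#=-]{2,256}\.[a-z]{2,6}\b\S*$|
  end

  def valid?(url), do: Regex.match?(url_regex(), url)
end

ValidSketch.valid?("example.c")
# => false (the extension needs at least two letters)
ValidSketch.valid?("https://www.example.com")
# => true
```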