Keyword-In-Context concordance.
Given a piece of text and a search term, returns each occurrence of the term with a window of surrounding tokens on either side. The classic concordancing tool from corpus linguistics, useful for inspecting how a word is actually used in a corpus, building glossaries, and debugging tokenisation.
Example display rendering:
"the quick brown" | "fox" | "jumped over the"
"the lazy red" | "fox" | "ran past the"
Each match is returned as a Text.KWIC.Match struct carrying the
pre-context, the matched token (in its original casing), and the
post-context. Use format/2 to turn a match into a readable string
with the term centred and visually delimited.
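The windowing idea described above can be sketched in plain Elixir. This is only an illustration, not the library's implementation: the module name KwicSketch is hypothetical, it returns plain maps instead of Text.KWIC.Match structs, and it assumes naive whitespace tokenisation rather than Text.Segment.words/1.

```elixir
defmodule KwicSketch do
  # Hypothetical stand-in that mirrors the documented behaviour;
  # it is not the code behind Text.KWIC.
  def concordance(text, term, opts \\ []) do
    context = Keyword.get(opts, :context, 5)
    case_sensitive = Keyword.get(opts, :case_sensitive, false)

    # Assumption: split on whitespace only.
    tokens = String.split(text)

    # Compare on a normalised form, but keep the original token in the result.
    normalise =
      if case_sensitive do
        &Function.identity/1
      else
        &String.downcase/1
      end

    target = normalise.(term)

    tokens
    |> Enum.with_index()
    |> Enum.filter(fn {token, _index} -> normalise.(token) == target end)
    |> Enum.map(fn {token, index} ->
      # Clamp the window at the start of the token list.
      start = max(index - context, 0)

      %{
        position: index,
        left: Enum.slice(tokens, start, index - start),
        term: token,
        right: Enum.slice(tokens, index + 1, context)
      }
    end)
  end
end
```

Near the start or end of the text the window is simply shorter, which is why a match on the second token has a single token of left context even when more context is requested.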
Functions
@spec concordance(String.t(), String.t(), keyword()) :: [Text.KWIC.Match.t()]
Returns every occurrence of term in text with surrounding context.
Arguments
- text is a UTF-8 string.
- term is the search term: a single token (e.g. "cat"). Multi-word phrases are not yet supported.
Options
- :context is the number of tokens of context on each side. Defaults to 5.
- :case_sensitive: when false (default), the search is case-insensitive. The output preserves original casing regardless.
- :tokenizer is a string-to-tokens function. Defaults to &Text.Segment.words/1.
Returns
- A list of Text.KWIC.Match structs in document order. Returns [] if the term is not found.
Examples
iex> matches = Text.KWIC.concordance("the cat sat on the mat", "cat", context: 2)
iex> match = hd(matches)
iex> match.term
"cat"
iex> match.left
["the"]
iex> match.right
["sat", "on"]
iex> match.position
1
iex> Text.KWIC.concordance("no matches here", "missing")
[]
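The :tokenizer option accepts any string-to-tokens function. As a rough sketch of what such a function might look like, the hypothetical module below splits on runs of letters, digits, and apostrophes so that punctuation does not stick to tokens. It assumes the tokenizer is expected to return a flat list of strings, which the documentation above implies but does not spell out.

```elixir
defmodule PunctuationTokenizer do
  # Hypothetical helper, not part of Text.KWIC.
  # Keeps runs of Unicode letters, digits, and apostrophes; drops everything else.
  def words(text) do
    ~r/[\p{L}\p{N}']+/u
    |> Regex.scan(text)
    |> List.flatten()
  end
end

# Intended use, under the assumption stated above:
#   Text.KWIC.concordance(text, "cat", tokenizer: &PunctuationTokenizer.words/1)
```

With a tokenizer like this, a search for "cat" would also match the occurrence in "the cat, the mat", since the trailing comma is no longer part of the token.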
@spec format(Text.KWIC.Match.t(), keyword()) :: String.t()
Renders a Match as a readable concordance line.
Arguments
- match is a Text.KWIC.Match.
Options
- :separator is the string placed between the three sections. Defaults to " | ".
- :width, when set, pads the left context to this many characters so multiple lines align in a fixed-width display.
Returns
- A string.
Examples
iex> match = %Text.KWIC.Match{
...> position: 1, left: ["the"], term: "cat", right: ["sat", "on"]
...> }
iex> Text.KWIC.format(match)
"the | cat | sat on"
iex> match = %Text.KWIC.Match{
...> position: 1, left: ["the"], term: "cat", right: ["sat", "on"]
...> }
iex> Text.KWIC.format(match, separator: " ~ ")
"the ~ cat ~ sat on"
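The effect of :width on alignment can be sketched as follows. This is an illustration only: KwicFormatSketch is a hypothetical module working on plain maps, and it assumes the left context is right-aligned (padded with leading spaces), which is the usual concordance layout but is not confirmed by the documentation above.

```elixir
defmodule KwicFormatSketch do
  # Hypothetical stand-in for the documented format/2 behaviour.
  def format(%{left: left, term: term, right: right}, opts \\ []) do
    separator = Keyword.get(opts, :separator, " | ")
    width = Keyword.get(opts, :width)

    left_text = Enum.join(left, " ")

    # Assumption: pad on the left so the search term starts in the same column.
    padded =
      if width do
        String.pad_leading(left_text, width)
      else
        left_text
      end

    Enum.join([padded, term, Enum.join(right, " ")], separator)
  end
end

matches = [
  %{left: ["the"], term: "cat", right: ["sat", "on"]},
  %{left: ["on", "the"], term: "cat", right: ["flap"]}
]

# Print each match with the left context padded to a common width.
Enum.each(matches, fn match ->
  IO.puts(KwicFormatSketch.format(match, width: 8))
end)
```

Because every left context is padded to the same number of characters, the separator and the term line up vertically when the lines are printed in a fixed-width font.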