Module re_tuner

Helper functions for working with the Erlang regular expression module, re.

Authors: Anatolii Kosorukov (java1cprog@yandex.ru) [web site: rustkas.github.io/].

Description

Helper functions for working with the Erlang regular expression module, re.

Data Types

compile_option()

compile_option() = unicode | anchored | caseless | dollar_endonly | dotall | extended | firstline | multiline | no_auto_capture | dupnames | ungreedy | {newline, nl_spec()} | bsr_anycrlf | bsr_unicode | no_start_optimize | ucp | never_utf

do_action()

do_action() = fun((InputString::string()) -> NewString::string())

mp()

mp() = {re_pattern, term(), term(), term(), term()}

nl_spec()

nl_spec() = cr | crlf | lf | anycrlf | any

Function Index

all_match/2	Retrieve Part of the Matched Text.
avoid_characters/0	The list of characters that raise an error unless they are escaped.
filter/3	Filter Matches in Procedural Code.
first_match/2	Retrieve the Matched Text.
first_match_info/2	Determine the Position and Length of the Match.
first_part_match/2	Retrieve Part of the Matched Text.
is_full_match/2	Check whether a string fits a certain pattern in its entirety.
is_match/2	Check whether a match can be found for a particular regular expression in a particular string.
match_chain/2	Get the matches of one Regex within the matches of another Regex.
match_evaluator/3	Replace Matches with Replacements Generated in Code.
mp/1	A shorthand for `re:compile/1`.
mp/2	A shorthand for `re:compile/2`.
replace/1	Replace each shorthand from the list `[\s,\w,\h,\v]` in a pattern string.
replace/3	Replace All Matches.
save_pattern/1	Make a safe regex pattern that treats every character as a literal.
subfilter/3	Filter a match within another match.
submatch_evaluator/4	Replace All Matches Within the Matches of Another Regex.
tune/1	Replace a regex pattern with a simpler one.
unicode_block/1	The Unicode character database divides all the code points into blocks.

Function Details

all_match/2

all_match(Text, ReInput) -> Result

Text = string()
ReInput = string() | tuple()
Result = [string()] | nomatch

Text: subject string

returns: A list

Retrieve Part of the Matched Text. You have a regular expression that matches a substring of the subject text. You want to match just one part of that substring. To isolate the part you want, you added a capturing group to your regular expression.
See also: re:run/3, erlang:hd/1, lists:map/2.

See also: re_tuner:mp/1.
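
The capturing-group retrieval described above can be sketched with re:run/3 alone. This is a minimal sketch, not the re_tuner implementation: the module name all_match_sketch and the function all_group_matches/2 are hypothetical, and it assumes the first capturing group of every match is wanted.

```erlang
-module(all_match_sketch).
-export([all_group_matches/2]).

%% Hypothetical helper: collect capturing group 1 of every match.
%% With [global, {capture, [1], list}] re:run/3 returns one
%% single-element list per match, so erlang:hd/1 unwraps each of them.
all_group_matches(Text, Regex) ->
    case re:run(Text, Regex, [global, {capture, [1], list}]) of
        {match, Captures} -> lists:map(fun erlang:hd/1, Captures);
        nomatch -> nomatch
    end.
```

For example, all_match_sketch:all_group_matches("k1=v1;k2=v2", "(\\w+)=") picks out only the keys, not the "=" that the whole regex also matches.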

avoid_characters/0

avoid_characters() -> Result

Result = string()

returns: The list of special characters.

The list of characters that raise an error unless they are escaped.

filter/3

filter(Text, ReInput, Function) -> Result

Text = string()
ReInput = string() | tuple()
Function = function()
Result = [string()] | nomatch

Text: subject string
Function: filter function

returns: A list

Filter Matches in Procedural Code. Retrieve a list of all matches a regular expression can find in a string when it is applied repeatedly to the remainder of the string after each match. Get a list of matches that meet certain extra criteria that you cannot (easily) express in a regular expression.
See also: lists:filter/2.

See also: re_tuner:all_match/2, re_tuner:mp/1.
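
Filtering in procedural code can be sketched as a global re:run/3 followed by lists:filter/2. The module name filter_sketch is hypothetical, and the code only illustrates the idea; it is not the re_tuner implementation.

```erlang
-module(filter_sketch).
-export([filter/3]).

%% Collect every whole match, then keep the ones accepted by Pred.
%% Pred is any fun((string()) -> boolean()).
filter(Text, Regex, Pred) ->
    case re:run(Text, Regex, [global, {capture, first, list}]) of
        {match, Captures} ->
            %% Each capture is a single-element list holding the match.
            lists:filter(Pred, [M || [M] <- Captures]);
        nomatch ->
            nomatch
    end.
```

A criterion such as "the number is a multiple of 13" is awkward to write as a regex but trivial as a predicate: fun(S) -> list_to_integer(S) rem 13 =:= 0 end.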

first_match/2

first_match(Text, ReInput) -> Result

Text = string()
ReInput = string() | tuple()
Result = string() | nomatch

Text: subject string

returns: A string as a result

Retrieve the Matched Text. You have a regular expression that matches a part of the subject text, and you want to extract the text that was matched. If the regular expression can match the string more than once, you want only the first match.
See also: re:run/3.

See also: re_tuner:mp/1.
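
A minimal sketch of the same idea using re:run/3 directly. The module name first_match_sketch is hypothetical; the actual re_tuner code may differ.

```erlang
-module(first_match_sketch).
-export([first_match/2]).

%% Return the text of the first match, or nomatch.
%% {capture, first, list} asks re:run/3 for the whole match as a string.
first_match(Text, Regex) ->
    case re:run(Text, Regex, [{capture, first, list}]) of
        {match, [Matched]} -> Matched;
        nomatch -> nomatch
    end.
```

Without the global option re:run/3 stops at the first match, so later matches in the subject are ignored.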

first_match_info/2

first_match_info(Text, ReInput) -> Result

Text = string()
ReInput = string() | tuple()
Result = string() | nomatch

Text: subject string

returns: Tuples as a result

Determine the Position and Length of the Match. Instead of extracting the substring matched by the regular expression, you want to determine the starting position and length of the match. With this information, you can extract the match in your own code or apply any further processing to the part of the original string matched by the regex.
See also: re:run/3.

See also: re_tuner:mp/1.
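
Position and length are what re:run/3 reports with {capture, first, index}. The sketch below assumes a {Start, Length} tuple is an acceptable result shape; the module name match_info_sketch is hypothetical and the real return value of first_match_info/2 may be different.

```erlang
-module(match_info_sketch).
-export([first_match_info/2, extract/2]).

%% Return {Start, Length} of the first match, or nomatch.
%% Start is zero-based, as reported by re:run/3.
first_match_info(Text, Regex) ->
    case re:run(Text, Regex, [{capture, first, index}]) of
        {match, [{Start, Length}]} -> {Start, Length};
        nomatch -> nomatch
    end.

%% Cut the matched part out of the subject using the reported position.
%% lists:sublist/3 is one-based, hence Start + 1.
extract(Text, {Start, Length}) ->
    lists:sublist(Text, Start + 1, Length).
```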

first_part_match/2

first_part_match(Text, ReInput) -> Result

Text = string()
ReInput = string() | tuple()
Result = string() | nomatch

Text: subject string

returns: A string

See also: re_tuner:mp/1.

is_full_match/2

is_full_match(Text, ReInput) -> Result

Text = string()
ReInput = string() | tuple()
Result = true | false

Text: subject string

returns: true or false

Check whether a string fits a certain pattern in its entirety. A partial match is not sufficient.
See also: re:run/3.

See also: re_tuner:mp/1.

is_match/2

is_match(Text, ReInput) -> Result

Text = string()
ReInput = string() | tuple()
Result = true | false

Text: subject string

returns: true or false

Check whether a match can be found for a particular regular expression in a particular string. A partial match is sufficient.
See also: re:run/3.

See also: re_tuner:mp/1.
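
The difference between a partial and a full match can be sketched with re:run/3. The module name match_check_sketch is hypothetical, and the full-match variant assumes the regex is given as a string so that it can be wrapped in anchors; this is an illustration, not the re_tuner implementation.

```erlang
-module(match_check_sketch).
-export([is_match/2, is_full_match/2]).

%% Partial match: true if Regex matches anywhere in Text.
%% {capture, none} makes re:run/3 return just match or nomatch.
is_match(Text, Regex) ->
    re:run(Text, Regex, [{capture, none}]) =:= match.

%% Full match: wrap the regex in a non-capturing group and anchor it
%% to the very start (\A) and very end (\z) of the subject.
is_full_match(Text, Regex) ->
    Anchored = "\\A(?:" ++ Regex ++ ")\\z",
    re:run(Text, Anchored, [{capture, none}]) =:= match.
```

The non-capturing group keeps alternations such as "a|b" inside the anchors.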

match_chain/2

match_chain(Text, ReList) -> Result

Text = string()
ReList = [string()] | [tuple()]
Result = [string()] | nomatch

Text: subject string
ReList: regex list

returns: A list

Get the matches of one Regex within the matches of another Regex. This function takes a list of Regexes. Find the matches of a Regex within the matches of another Regex, within the matches of other Regexes, as many levels deep as you want.
See also: lists:map/2.

See also: re_tuner:all_match/2, re_tuner:mp/1.
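
Chaining can be sketched as a fold over the regex list, where each level searches only inside the matches of the previous level. The module name match_chain_sketch is hypothetical, and unlike the specification above this sketch returns an empty list instead of nomatch when nothing is found.

```erlang
-module(match_chain_sketch).
-export([match_chain/2]).

%% All whole matches of Regex in Text, or [] if there are none.
all_matches(Text, Regex) ->
    case re:run(Text, Regex, [global, {capture, first, list}]) of
        {match, Captures} -> [M || [M] <- Captures];
        nomatch -> []
    end.

%% Start with the whole text as the only subject; for each regex,
%% replace the subjects with all matches found inside them.
match_chain(Text, Regexes) ->
    lists:foldl(
        fun(Regex, Subjects) ->
            lists:flatmap(fun(S) -> all_matches(S, Regex) end, Subjects)
        end,
        [Text],
        Regexes).
```

With the list ["<b>.*?</b>", "\\d+"] the second regex sees only the text inside the bold tags, so digits outside them are skipped.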

match_evaluator/3

match_evaluator(DoAction, Text, Regex) -> Result

DoAction = do_action()
Text = string()
Regex = string() | tuple()
Result = string()

DoAction: a function of the form fun(InputString) -> NewString
Text: subject string
Regex: regex pattern

returns: A string

Replace Matches with Replacements Generated in Code. Replace all matches of a regular expression with a new string that you build up in procedural code. You want to be able to replace each match with a different string, based on the text that was actually matched.
See also: erlang:element/2, string:slice/3, string:length/1, re:run/3.

See also: re_tuner:mp/1.
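
One way to build replacements in code is to ask re:run/3 for the position of every match and stitch the result together by hand. The sketch below is an assumption about how this could be done, not the re_tuner implementation; the module name match_evaluator_sketch is hypothetical, and it uses lists:sublist/3 and lists:nthtail/2 for slicing, which assumes a plain character list as the subject.

```erlang
-module(match_evaluator_sketch).
-export([match_evaluator/3]).

%% Replace every match of Regex in Text with DoAction(MatchedText).
match_evaluator(DoAction, Text, Regex) ->
    case re:run(Text, Regex, [global, {capture, first, index}]) of
        {match, Captures} ->
            Positions = [Pos || [Pos] <- Captures],
            rebuild(DoAction, Text, 0, Positions, []);
        nomatch ->
            Text
    end.

%% Offset is the zero-based index of the first character not yet copied.
%% Acc holds the output pieces in reverse order.
rebuild(_DoAction, Text, Offset, [], Acc) ->
    Tail = lists:nthtail(Offset, Text),
    lists:append(lists:reverse([Tail | Acc]));
rebuild(DoAction, Text, Offset, [{Start, Len} | Rest], Acc) ->
    Before = lists:sublist(Text, Offset + 1, Start - Offset),
    Matched = lists:sublist(Text, Start + 1, Len),
    rebuild(DoAction, Text, Start + Len, Rest,
            [DoAction(Matched), Before | Acc]).
```

Passing fun(S) -> integer_to_list(list_to_integer(S) * 2) end as DoAction doubles every number in the subject while leaving the surrounding text untouched.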

mp/1

mp(Regex) -> MP | {error, tuple()}

Regex = string()
MP = mp()

Regex: regex pattern

returns: Opaque data type containing a compiled regular expression

A shorthand for re:compile/1. Returns an opaque data type containing a compiled regular expression, or raises a badarg error.
See also: mp(), re:compile/1.

mp/2

mp(Regex, Options) -> MP | {error, badarg}

Regex = string()
Options = [Option]
MP = mp()
Option = compile_option()

Regex: regex pattern
Options: additional regular expression metadata

returns: Opaque data type containing a compiled regular expression

A shorthand for re:compile/2. Returns an opaque data type containing a compiled regular expression, or raises a badarg error.
See also: mp(), re:compile/2.
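
A thin wrapper of this kind might look as follows. The module name compile_sketch is hypothetical, and the sketch passes the error tuple from re:compile/2 through unchanged, which may differ from how re_tuner reports errors.

```erlang
-module(compile_sketch).
-export([compile/2]).

%% Unwrap the {ok, MP} tuple returned by re:compile/2 so the compiled
%% pattern can be passed straight to re:run/3 or re:replace/4.
compile(Regex, Options) ->
    case re:compile(Regex, Options) of
        {ok, MP} -> MP;
        {error, Reason} -> {error, Reason}
    end.
```

Compiling once pays off when the same pattern is applied to many subjects: MP = compile_sketch:compile("hello", [caseless]) can be reused in every later re:run/3 call.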

replace/1

replace(Pattern) -> UpdatedPattern

Pattern = string()
UpdatedPattern = string()

Pattern: the regex pattern in which shorthands are replaced

returns: Updated Regex pattern string

Replace each shorthand from the list [\s,\w,\h,\v] in a pattern string.
See also: lists:foldl/3.
Do not apply the \w shorthand to Unicode content.

replace/3

replace(Text, Regex, Replacement) -> Result

Text = string()
Regex = string() | tuple()
Replacement = string()
Result = string()

Text: subject string
Regex: regex pattern
Replacement: a replacement string

returns: A string

Replace All Matches. Replace all matches of the regular expression with the replacement text.
See also: re:replace/4.

See also: re_tuner:mp/1.
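
Replacing all matches maps directly onto re:replace/4 with the global option. A minimal sketch, with the hypothetical module name replace_sketch:

```erlang
-module(replace_sketch).
-export([replace_all/3]).

%% global replaces every match instead of only the first one;
%% {return, list} yields a string rather than iodata.
replace_all(Text, Regex, Replacement) ->
    re:replace(Text, Regex, Replacement, [global, {return, list}]).
```

Note that re:replace/4 treats & and backslash sequences in the replacement specially, so a literal & has to be written as "\\&".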

save_pattern/1

save_pattern(Pattern) -> SavePattern

Pattern = string()
SavePattern = string()

returns: The safe pattern

Make a safe regex pattern that treats every character as a literal.
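
One common way to make a pattern literal is to put a backslash in front of every regex metacharacter. The sketch below assumes that approach and a particular set of metacharacters; the module name escape_sketch is hypothetical, and save_pattern/1 may work differently.

```erlang
-module(escape_sketch).
-export([escape/1]).

%% Prefix each metacharacter with a backslash.
%% The character class lists: \ ^ $ . | ? * + ( ) [ ] { }
%% In the replacement, "\\\\" is one literal backslash and "&" is the
%% text that was matched.
escape(Pattern) ->
    re:replace(Pattern, "[\\\\^$.|?*+()\\[\\]{}]", "\\\\&",
               [global, {return, list}]).
```

After escaping, "a.b*c" matches only the five characters a.b*c and no longer matches a string such as "axbbc".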

subfilter/3

subfilter(Text, OuterReInput, InnerReInput) -> Result

Text = string()
OuterReInput = string() | tuple()
InnerReInput = string() | tuple()
Result = [string()] | nomatch

Text: subject string

returns: A list

Filter a match within another match. Find all the matches of a particular regular expression, but only within certain sections of the subject string. Another regular expression matches each of the sections in the string.

See also: re_tuner:all_match/2, re_tuner:mp/1.

submatch_evaluator/4

submatch_evaluator(Text, OuterRegex, InnerRegex, Replacement) -> Result

Text = string()
OuterRegex = string() | tuple()
InnerRegex = string() | tuple()
Replacement = string()
Result = string()

Text: subject string
OuterRegex: outer regex pattern
InnerRegex: inner regex pattern
Replacement: a replacement string

returns: A string

Replace All Matches Within the Matches of Another Regex. Replace all the matches of a particular regular expression, but only within certain sections of the subject string. Another regular expression matches each of the sections in the string.
See also: erlang:element/2, string:slice/3, string:length/1, re:run/3.

See also: re_tuner:mp/1, re_tuner:replace/3.

tune/1

tune(Regex) -> Result

Regex = string()
Result = string()

returns: The transformed regex pattern.

Replace a regex pattern with a simpler one.

unicode_block/1

unicode_block(BlockName) -> Range | nomatch

BlockName = string()
Range = string()

BlockName: the Unicode block name as used in regular expressions

returns: A regular expression range of code points

The Unicode character database divides all the code points into blocks. Each block consists of a single range of code points. The code points U+0000 through U+FFFF are divided into 156 blocks in version 6.1 of the Unicode standard.

  ‹U+0000…U+007F \p{InBasicLatin}›
  ‹U+0080…U+00FF \p{InLatin-1Supplement}›
  ‹U+0100…U+017F \p{InLatinExtended-A}›
  ‹U+0180…U+024F \p{InLatinExtended-B}›
  ‹U+0250…U+02AF \p{InIPAExtensions}›
  ‹U+02B0…U+02FF \p{InSpacingModifierLetters}›
  ‹U+0300…U+036F \p{InCombiningDiacriticalMarks}›
  ‹U+0370…U+03FF \p{InGreekandCoptic}›
  ‹U+0400…U+04FF \p{InCyrillic}›
  ‹U+0500…U+052F \p{InCyrillicSupplement}›
  ‹U+0530…U+058F \p{InArmenian}›
  ‹U+0590…U+05FF \p{InHebrew}›
  ‹U+0600…U+06FF \p{InArabic}›
  ‹U+0700…U+074F \p{InSyriac}›
  ‹U+0750…U+077F \p{InArabicSupplement}›
  ‹U+0780…U+07BF \p{InThaana}›
  ‹U+07C0…U+07FF \p{InNKo}›
  ‹U+0800…U+083F \p{InSamaritan}›
  ‹U+0840…U+085F \p{InMandaic}›
  ‹U+08A0…U+08FF \p{InArabicExtended-A}›
  ‹U+0900…U+097F \p{InDevanagari}›
  ‹U+0980…U+09FF \p{InBengali}›
  ‹U+0A00…U+0A7F \p{InGurmukhi}›
  ‹U+0A80…U+0AFF \p{InGujarati}›
  ‹U+0B00…U+0B7F \p{InOriya}›
  ‹U+0B80…U+0BFF \p{InTamil}›
  ‹U+0C00…U+0C7F \p{InTelugu}›
  ‹U+0C80…U+0CFF \p{InKannada}›
  ‹U+0D00…U+0D7F \p{InMalayalam}›
  ‹U+0D80…U+0DFF \p{InSinhala}›
  ‹U+0E00…U+0E7F \p{InThai}›
  ‹U+0E80…U+0EFF \p{InLao}›
  ‹U+0F00…U+0FFF \p{InTibetan}›
  ‹U+1000…U+109F \p{InMyanmar}›
  ‹U+10A0…U+10FF \p{InGeorgian}›
  ‹U+1100…U+11FF \p{InHangulJamo}›
  ‹U+1200…U+137F \p{InEthiopic}›
  ‹U+1380…U+139F \p{InEthiopicSupplement}›
  ‹U+13A0…U+13FF \p{InCherokee}›
  ‹U+1400…U+167F \p{InUnifiedCanadianAboriginalSyllabics}›
  ‹U+1680…U+169F \p{InOgham}›
  ‹U+16A0…U+16FF \p{InRunic}›
  ‹U+1700…U+171F \p{InTagalog}›
  ‹U+1720…U+173F \p{InHanunoo}›
  ‹U+1740…U+175F \p{InBuhid}›
  ‹U+1760…U+177F \p{InTagbanwa}›
  ‹U+1780…U+17FF \p{InKhmer}›
  ‹U+1800…U+18AF \p{InMongolian}›
  ‹U+18B0…U+18FF \p{InUnifiedCanadianAboriginalSyllabicsExtended}›
  ‹U+1900…U+194F \p{InLimbu}›
  ‹U+1950…U+197F \p{InTaiLe}›
  ‹U+1980…U+19DF \p{InNewTaiLue}›
  ‹U+19E0…U+19FF \p{InKhmerSymbols}›
  ‹U+1A00…U+1A1F \p{InBuginese}›
  ‹U+1A20…U+1AAF \p{InTaiTham}›
  ‹U+1B00…U+1B7F \p{InBalinese}›
  ‹U+1B80…U+1BBF \p{InSundanese}›
  ‹U+1BC0…U+1BFF \p{InBatak}›
  ‹U+1C00…U+1C4F \p{InLepcha}›
  ‹U+1C50…U+1C7F \p{InOlChiki}›
  ‹U+1CC0…U+1CCF \p{InSundaneseSupplement}›
  ‹U+1CD0…U+1CFF \p{InVedicExtensions}›
  ‹U+1D00…U+1D7F \p{InPhoneticExtensions}›
  ‹U+1D80…U+1DBF \p{InPhoneticExtensionsSupplement}›
  ‹U+1DC0…U+1DFF \p{InCombiningDiacriticalMarksSupplement}›
  ‹U+1E00…U+1EFF \p{InLatinExtendedAdditional}›
  ‹U+1F00…U+1FFF \p{InGreekExtended}›
  ‹U+2000…U+206F \p{InGeneralPunctuation}›
  ‹U+2070…U+209F \p{InSuperscriptsandSubscripts}›
  ‹U+20A0…U+20CF \p{InCurrencySymbols}›
  ‹U+20D0…U+20FF \p{InCombiningDiacriticalMarksforSymbols}›
  ‹U+2100…U+214F \p{InLetterlikeSymbols}›
  ‹U+2150…U+218F \p{InNumberForms}›
  ‹U+2190…U+21FF \p{InArrows}›
  ‹U+2200…U+22FF \p{InMathematicalOperators}›
  ‹U+2300…U+23FF \p{InMiscellaneousTechnical}›
  ‹U+2400…U+243F \p{InControlPictures}›
  ‹U+2440…U+245F \p{InOpticalCharacterRecognition}›
  ‹U+2460…U+24FF \p{InEnclosedAlphanumerics}›
  ‹U+2500…U+257F \p{InBoxDrawing}›
  ‹U+2580…U+259F \p{InBlockElements}›
  ‹U+25A0…U+25FF \p{InGeometricShapes}›
  ‹U+2600…U+26FF \p{InMiscellaneousSymbols}›
  ‹U+2700…U+27BF \p{InDingbats}›
  ‹U+27C0…U+27EF \p{InMiscellaneousMathematicalSymbols-A}›
  ‹U+27F0…U+27FF \p{InSupplementalArrows-A}›
  ‹U+2800…U+28FF \p{InBraillePatterns}›
  ‹U+2900…U+297F \p{InSupplementalArrows-B}›
  ‹U+2980…U+29FF \p{InMiscellaneousMathematicalSymbols-B}›
  ‹U+2A00…U+2AFF \p{InSupplementalMathematicalOperators}›
  ‹U+2B00…U+2BFF \p{InMiscellaneousSymbolsandArrows}›
  ‹U+2C00…U+2C5F \p{InGlagolitic}›
  ‹U+2C60…U+2C7F \p{InLatinExtended-C}›
  ‹U+2C80…U+2CFF \p{InCoptic}›
  ‹U+2D00…U+2D2F \p{InGeorgianSupplement}›
  ‹U+2D30…U+2D7F \p{InTifinagh}›
  ‹U+2D80…U+2DDF \p{InEthiopicExtended}›
  ‹U+2DE0…U+2DFF \p{InCyrillicExtended-A}›
  ‹U+2E00…U+2E7F \p{InSupplementalPunctuation}›
  ‹U+2E80…U+2EFF \p{InCJKRadicalsSupplement}›
  ‹U+2F00…U+2FDF \p{InKangxiRadicals}›
  ‹U+2FF0…U+2FFF \p{InIdeographicDescriptionCharacters}›
  ‹U+3000…U+303F \p{InCJKSymbolsandPunctuation}›
  ‹U+3040…U+309F \p{InHiragana}›
  ‹U+30A0…U+30FF \p{InKatakana}›
  ‹U+3100…U+312F \p{InBopomofo}›
  ‹U+3130…U+318F \p{InHangulCompatibilityJamo}›
  ‹U+3190…U+319F \p{InKanbun}›
  ‹U+31A0…U+31BF \p{InBopomofoExtended}›
  ‹U+31C0…U+31EF \p{InCJKStrokes}›
  ‹U+31F0…U+31FF \p{InKatakanaPhoneticExtensions}›
  ‹U+3200…U+32FF \p{InEnclosedCJKLettersandMonths}›
  ‹U+3300…U+33FF \p{InCJKCompatibility}›
  ‹U+3400…U+4DBF \p{InCJKUnifiedIdeographsExtensionA}›
  ‹U+4DC0…U+4DFF \p{InYijingHexagramSymbols}›
  ‹U+4E00…U+9FFF \p{InCJKUnifiedIdeographs}›
  ‹U+A000…U+A48F \p{InYiSyllables}›
  ‹U+A490…U+A4CF \p{InYiRadicals}›
  ‹U+A4D0…U+A4FF \p{InLisu}›
  ‹U+A500…U+A63F \p{InVai}›
  ‹U+A640…U+A69F \p{InCyrillicExtended-B}›
  ‹U+A6A0…U+A6FF \p{InBamum}›
  ‹U+A700…U+A71F \p{InModifierToneLetters}›
  ‹U+A720…U+A7FF \p{InLatinExtended-D}›
  ‹U+A800…U+A82F \p{InSylotiNagri}›
  ‹U+A830…U+A83F \p{InCommonIndicNumberForms}›
  ‹U+A840…U+A87F \p{InPhags-pa}›
  ‹U+A880…U+A8DF \p{InSaurashtra}›
  ‹U+A8E0…U+A8FF \p{InDevanagariExtended}›
  ‹U+A900…U+A92F \p{InKayahLi}›
  ‹U+A930…U+A95F \p{InRejang}›
  ‹U+A960…U+A97F \p{InHangulJamoExtended-A}›
  ‹U+A980…U+A9DF \p{InJavanese}›
  ‹U+AA00…U+AA5F \p{InCham}›
  ‹U+AA60…U+AA7F \p{InMyanmarExtended-A}›
  ‹U+AA80…U+AADF \p{InTaiViet}›
  ‹U+AAE0…U+AAFF \p{InMeeteiMayekExtensions}›
  ‹U+AB00…U+AB2F \p{InEthiopicExtended-A}›
  ‹U+ABC0…U+ABFF \p{InMeeteiMayek}›
  ‹U+AC00…U+D7AF \p{InHangulSyllables}›
  ‹U+D7B0…U+D7FF \p{InHangulJamoExtended-B}›
  ‹U+D800…U+DB7F \p{InHighSurrogates}›
  ‹U+DB80…U+DBFF \p{InHighPrivateUseSurrogates}›
  ‹U+DC00…U+DFFF \p{InLowSurrogates}›
  ‹U+E000…U+F8FF \p{InPrivateUseArea}›
  ‹U+F900…U+FAFF \p{InCJKCompatibilityIdeographs}›
  ‹U+FB00…U+FB4F \p{InAlphabeticPresentationForms}›
  ‹U+FB50…U+FDFF \p{InArabicPresentationForms-A}›
  ‹U+FE00…U+FE0F \p{InVariationSelectors}›
  ‹U+FE10…U+FE1F \p{InVerticalForms}›
  ‹U+FE20…U+FE2F \p{InCombiningHalfMarks}›
  ‹U+FE30…U+FE4F \p{InCJKCompatibilityForms}›
  ‹U+FE50…U+FE6F \p{InSmallFormVariants}›
  ‹U+FE70…U+FEFF \p{InArabicPresentationForms-B}›
  ‹U+FF00…U+FFEF \p{InHalfwidthandFullwidthForms}›
  ‹U+FFF0…U+FFFF \p{InSpecials}›

See also: lists:keyfind/3.
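
A lookup of this kind can be sketched with lists:keyfind/3 over a table of {BlockName, Range} pairs. The module name unicode_block_sketch is hypothetical, only three blocks from the table above are included, and the \x{...} range syntax of the returned string is an assumption about the result format.

```erlang
-module(unicode_block_sketch).
-export([unicode_block/1]).

%% A small excerpt of the block table; the range strings are written
%% so that they can be placed inside a character class.
blocks() ->
    [{"InBasicLatin", "\\x{0000}-\\x{007F}"},
     {"InGreekandCoptic", "\\x{0370}-\\x{03FF}"},
     {"InCyrillic", "\\x{0400}-\\x{04FF}"}].

%% Look the block name up in position 1 of each tuple.
unicode_block(BlockName) ->
    case lists:keyfind(BlockName, 1, blocks()) of
        {BlockName, Range} -> Range;
        false -> nomatch
    end.
```

The returned range could then be wrapped in brackets, for example "[" ++ Range ++ "]+", and passed to re:run/3 together with the unicode option.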
