Evision.Text (Evision v0.2.9)
Functions
@spec computeNMChannels(Keyword.t()) :: any() | {:error, String.t()}
@spec computeNMChannels(Evision.Mat.maybe_mat_in()) :: [Evision.Mat.t()] | {:error, String.t()}
Computes the channels to be processed independently in the N&M algorithm [Neumann12].
Positional Arguments
- src: Evision.Mat. Source image. Must be RGB CV_8UC3.
Keyword Arguments
- mode: integer(). Mode of operation. The only available options are ERFILTER_NM_RGBLGrad (used by default) and ERFILTER_NM_IHSGrad.
Return
- channels: [Evision.Mat]. Output list of Mat where the computed channels are stored.
In the N&M algorithm, the combination of intensity (I), hue (H), saturation (S), and gradient magnitude (Grad) channels is used to obtain high localization recall. This implementation also provides an alternative combination of red (R), green (G), blue (B), lightness (L), and gradient magnitude (Grad).
Python prototype (for reference only):
computeNMChannels(_src[, _channels[, _mode]]) -> _channels
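As a rough illustration of what the RGB + lightness + gradient combination looks like, the following pure-Python sketch splits a tiny image into five channels. It is not the OpenCV implementation: the function name `rgbl_grad_channels`, the nested-list image format, the grey-level average, and the central-difference gradient are all assumptions made for this example.

```python
import colorsys
import math


def rgbl_grad_channels(image):
    """Split an image into R, G, B, lightness, and gradient-magnitude channels.

    image is a list of rows; each row is a list of (r, g, b) tuples in 0..255.
    This layout is an assumption of the sketch, not the Evision.Mat format.
    """
    height, width = len(image), len(image[0])
    red = [[px[0] for px in row] for row in image]
    green = [[px[1] for px in row] for row in image]
    blue = [[px[2] for px in row] for row in image]
    # Lightness taken from the HLS colour model and rescaled to 0..255.
    light = [[round(colorsys.rgb_to_hls(r / 255, g / 255, b / 255)[1] * 255)
              for (r, g, b) in row] for row in image]

    def gray(y, x):
        r, g, b = image[y][x]
        return (r + g + b) / 3

    grad = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            # Central differences with clamped borders (a simplification).
            gx = gray(y, min(x + 1, width - 1)) - gray(y, max(x - 1, 0))
            gy = gray(min(y + 1, height - 1), x) - gray(max(y - 1, 0), x)
            grad[y][x] = math.hypot(gx, gy)
    return [red, green, blue, light, grad]


# A 2x2 image with a black column next to a white column.
channels = rgbl_grad_channels([[(0, 0, 0), (255, 255, 255)],
                               [(0, 0, 0), (255, 255, 255)]])
```

Each returned channel has the same size as the input, which mirrors how the channels are later processed independently.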
@spec computeNMChannels(Evision.Mat.maybe_mat_in(), [{:mode, term()}] | nil) :: [Evision.Mat.t()] | {:error, String.t()}
@spec createERFilterNM1(Keyword.t()) :: any() | {:error, String.t()}
@spec createERFilterNM1(binary()) :: Evision.Text.ERFilter.t() | {:error, String.t()}
@spec createERFilterNM1(term()) :: Evision.Text.ERFilter.t() | {:error, String.t()}
Variant 1:
Reads an Extremal Region Filter for the 1st stage classifier of the N&M algorithm from the provided path, e.g. /path/to/cpp/trained_classifierNM1.xml.
Positional Arguments
- filename: String
Keyword Arguments
- thresholdDelta: integer()
- minArea: float
- maxArea: float
- minProbability: float
- nonMaxSuppression: bool
- minProbabilityDiff: float
Return
- retval: Evision.Text.ERFilter.t()
Has overloading in C++
Python prototype (for reference only):
createERFilterNM1(filename[, thresholdDelta[, minArea[, maxArea[, minProbability[, nonMaxSuppression[, minProbabilityDiff]]]]]]) -> retval
Variant 2:
Creates an Extremal Region Filter for the 1st stage classifier of the N&M algorithm [Neumann12].
Positional Arguments
- cb: Evision.Text.ERFilter.Callback
Keyword Arguments
- thresholdDelta: integer()
- minArea: float
- maxArea: float
- minProbability: float
- nonMaxSuppression: bool
- minProbabilityDiff: float
Return
- retval: Evision.Text.ERFilter.t()
The component tree of the image is extracted by a threshold that is increased step by step from 0 to 255. Incrementally computable descriptors (aspect ratio, compactness, number of holes, and number of horizontal crossings) are computed for each ER and used as features for a classifier that estimates the class-conditional probability P(er|character). The value of P(er|character) is tracked using the inclusion relation of ERs across all thresholds, and only the ERs that correspond to a local maximum of P(er|character) are selected, provided that the local maximum is above a global limit pmin and the difference between the local maximum and the local minimum is greater than minProbabilityDiff.
Python prototype (for reference only):
createERFilterNM1(cb[, thresholdDelta[, minArea[, maxArea[, minProbability[, nonMaxSuppression[, minProbabilityDiff]]]]]]) -> retval
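The selection rule described for the first stage can be sketched on a single chain of nested ERs. This is a simplified illustration, not the OpenCV code: the function name `select_local_maxima` is hypothetical, the probabilities are made up, `min_probability` is assumed to play the role of pmin, and the "local minimum" is assumed to be the lower of the two neighbouring valleys.

```python
def select_local_maxima(probs, min_probability, min_probability_diff):
    """Return indices of local maxima in probs that pass both limits.

    probs holds hypothetical P(er|character) values for one chain of nested
    ERs, ordered by threshold.
    """
    selected = []
    last = len(probs) - 1
    for i, p in enumerate(probs):
        left = probs[i - 1] if i > 0 else float("-inf")
        right = probs[i + 1] if i < last else float("-inf")
        if not (p > left and p > right):
            continue  # not a strict local maximum
        # Walk downhill on both sides to the neighbouring local minima.
        j = i
        while j > 0 and probs[j - 1] < probs[j]:
            j -= 1
        k = i
        while k < last and probs[k + 1] < probs[k]:
            k += 1
        lowest = min(probs[j], probs[k])
        if p > min_probability and p - lowest > min_probability_diff:
            selected.append(i)
    return selected


chain = [0.1, 0.4, 0.9, 0.6, 0.65, 0.3]
print(select_local_maxima(chain, 0.5, 0.1))  # [2, 4]
print(select_local_maxima(chain, 0.5, 0.5))  # [2]
```

Raising either limit keeps fewer regions: in the second call the shallow peak at index 4 no longer clears the required difference.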
@spec createERFilterNM1(binary(), [maxArea: term(), minArea: term(), minProbability: term(), minProbabilityDiff: term(), nonMaxSuppression: term(), thresholdDelta: term()] | nil) :: Evision.Text.ERFilter.t() | {:error, String.t()}
@spec createERFilterNM1(term(), [maxArea: term(), minArea: term(), minProbability: term(), minProbabilityDiff: term(), nonMaxSuppression: term(), thresholdDelta: term()] | nil) :: Evision.Text.ERFilter.t() | {:error, String.t()}
@spec createERFilterNM2(Keyword.t()) :: any() | {:error, String.t()}
@spec createERFilterNM2(binary()) :: Evision.Text.ERFilter.t() | {:error, String.t()}
@spec createERFilterNM2(term()) :: Evision.Text.ERFilter.t() | {:error, String.t()}
Variant 1:
Reads an Extremal Region Filter for the 2nd stage classifier of the N&M algorithm from the provided path, e.g. /path/to/cpp/trained_classifierNM2.xml.
Positional Arguments
- filename: String
Keyword Arguments
- minProbability: float
Return
- retval: Evision.Text.ERFilter.t()
Has overloading in C++
Python prototype (for reference only):
createERFilterNM2(filename[, minProbability]) -> retval
Variant 2:
Creates an Extremal Region Filter for the 2nd stage classifier of the N&M algorithm [Neumann12].
Positional Arguments
- cb: Evision.Text.ERFilter.Callback
Keyword Arguments
- minProbability: float
Return
- retval: Evision.Text.ERFilter.t()
In the second stage, the ERs that passed the first stage are classified into character and non-character classes using more informative but also more computationally expensive features. The classifier uses all the features calculated in the first stage and the following additional features: hole area ratio, convex hull ratio, and number of outer inflexion points.
Python prototype (for reference only):
createERFilterNM2(cb[, minProbability]) -> retval
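The two-stage idea (a cheap classifier first, a costlier one only on the survivors) can be sketched as a simple cascade. Everything here is hypothetical: the function `two_stage_filter`, the dictionary-based regions, and the two scoring callbacks stand in for the real classifiers and are not part of Evision or OpenCV.

```python
def two_stage_filter(regions, stage1_prob, stage2_prob, min_prob1, min_prob2):
    """Keep regions whose stage-1 and stage-2 probabilities reach the limits."""
    survivors = [r for r in regions if stage1_prob(r) >= min_prob1]
    # The costly classifier only runs on the survivors of stage 1.
    return [r for r in survivors if stage2_prob(r) >= min_prob2]


# Made-up regions carrying precomputed scores for the two stages.
regions = [
    {"id": 1, "cheap": 0.9, "costly": 0.8},
    {"id": 2, "cheap": 0.2, "costly": 0.9},
    {"id": 3, "cheap": 0.7, "costly": 0.1},
]
costly_calls = []


def costly(region):
    costly_calls.append(region["id"])  # record which regions reach stage 2
    return region["costly"]


kept = two_stage_filter(regions, lambda r: r["cheap"], costly, 0.5, 0.5)
print([r["id"] for r in kept])  # [1]
print(costly_calls)             # [1, 3]
```

Region 2 never reaches the second classifier, which is the point of ordering the stages by cost.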
@spec createERFilterNM2(binary(), [{:minProbability, term()}] | nil) :: Evision.Text.ERFilter.t() | {:error, String.t()}
@spec createERFilterNM2(term(), [{:minProbability, term()}] | nil) :: Evision.Text.ERFilter.t() | {:error, String.t()}
@spec createOCRHMMTransitionsTable(binary(), [binary()]) :: Evision.Mat.t() | {:error, String.t()}
Utility function to create a tailored language model transitions table from a given list of words (lexicon).
Positional Arguments
- vocabulary: String. The language vocabulary (the characters, for ASCII English text).
- lexicon: [String]. The list of words that are expected to be found in a particular image.
Return
- retval: Evision.Mat.t()
The function calculates frequency statistics of character pairs from the given lexicon and fills the output transition_probabilities_table with them. The transition_probabilities_table can be used as input to the OCRHMMDecoder::create() and OCRBeamSearchDecoder::create() methods.
Note:
- (C++) An alternative is to load the default generic language transition table provided in the text module samples folder (created from the ispell 42869-word English list): https://github.com/opencv/opencv_contrib/blob/master/modules/text/samples/OCRHMM_transitions_table.xml
Python prototype (for reference only):
createOCRHMMTransitionsTable(vocabulary, lexicon) -> retval
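To make "frequency statistics of character pairs" concrete, here is a minimal pure-Python sketch of a bigram transition table. It is only an approximation of the idea: the function name `transitions_table` is hypothetical, and the row-wise normalisation is an assumption, since the exact normalisation or smoothing used by OpenCV is not stated here.

```python
def transitions_table(vocabulary, lexicon):
    """Build a row-normalised table of character-pair frequencies.

    table[i][j] estimates how often vocabulary[j] follows vocabulary[i]
    in the words of the lexicon.
    """
    index = {ch: i for i, ch in enumerate(vocabulary)}
    size = len(vocabulary)
    counts = [[0] * size for _ in range(size)]
    for word in lexicon:
        for first, second in zip(word, word[1:]):
            # Pairs with characters outside the vocabulary are skipped.
            if first in index and second in index:
                counts[index[first]][index[second]] += 1
    table = []
    for row in counts:
        total = sum(row)
        table.append([c / total if total else 0.0 for c in row])
    return table


table = transitions_table("abc", ["ab", "abc", "ac"])
```

With this tiny lexicon, "a" is followed by "b" twice and by "c" once, so the first row comes out as roughly 0, 2/3, 1/3.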
@spec detectRegions(Evision.Mat.maybe_mat_in(), Evision.Text.ERFilter.t(), Evision.Text.ERFilter.t()) :: [{number(), number(), number(), number()}] | {:error, String.t()}
Extracts text regions from an image.
Positional Arguments
- image: Evision.Mat. Source image from which text blocks are to be extracted. Should be CV_8UC3 (color).
- er_filter1: Evision.Text.ERFilter. Extremal Region Filter for the 1st stage classifier of the N&M algorithm [Neumann12].
- er_filter2: Evision.Text.ERFilter. Extremal Region Filter for the 2nd stage classifier of the N&M algorithm [Neumann12].
Keyword Arguments
- method: integer(). Grouping method (see text::erGrouping_Modes). Can be one of ERGROUPING_ORIENTATION_HORIZ, ERGROUPING_ORIENTATION_ANY.
- filename: String. The XML or YAML file with the classifier model (e.g. samples/trained_classifier_erGrouping.xml). Only used when the grouping method is ERGROUPING_ORIENTATION_ANY.
- minProbability: float. The minimum probability for accepting a group. Only used when the grouping method is ERGROUPING_ORIENTATION_ANY.
Return
- groups_rects: [Rect]. Output list of rectangular blocks containing text.
Python prototype (for reference only):
detectRegions(image, er_filter1, er_filter2[, method[, filename[, minProbability]]]) -> groups_rects
@spec detectRegions(Evision.Mat.maybe_mat_in(), Evision.Text.ERFilter.t(), Evision.Text.ERFilter.t(), [filename: term(), method: term(), minProbability: term()] | nil) :: [{number(), number(), number(), number()}] | {:error, String.t()}
@spec detectTextSWT(Evision.Mat.maybe_mat_in(), boolean()) :: {[{number(), number(), number(), number()}], Evision.Mat.t(), Evision.Mat.t()} | {:error, String.t()}
Applies the Stroke Width Transform operator, followed by filtering of connected components of similar stroke widths, to return letter candidates. It also chains them by proximity and size, saving the result in chainBBs.
Positional Arguments
- input: Evision.Mat. The input image with 3 channels.
- dark_on_light: bool. Whether the text is darker or lighter than the background. It reverses the gradient obtained from the Scharr operator and significantly affects the result.
Return
- result: [Rect]. A list of bounding boxes where the probability of finding text is high.
- draw: Evision.Mat.t(). An optional Mat of type CV_8UC3 that visualises the detected letters using bounding boxes.
- chainBBs: Evision.Mat.t(). An optional output that chains the letter candidates according to the heuristics in the paper and returns all possible regions where text is likely to occur.
Python prototype (for reference only):
detectTextSWT(input, dark_on_light[, draw[, chainBBs]]) -> result, draw, chainBBs
@spec detectTextSWT(Evision.Mat.maybe_mat_in(), boolean(), [{atom(), term()}, ...] | nil) :: {[{number(), number(), number(), number()}], Evision.Mat.t(), Evision.Mat.t()} | {:error, String.t()}
@spec erGrouping(Evision.Mat.maybe_mat_in(), Evision.Mat.maybe_mat_in(), [[{number(), number()}]]) :: [{number(), number(), number(), number()}] | {:error, String.t()}
Find groups of Extremal Regions that are organized as text blocks.
Positional Arguments
- image: Evision.Mat
- channel: Evision.Mat
- regions: [[Point]]. List of ERs retrieved by the ERFilter algorithm from each channel.
Keyword Arguments
- method: integer(). Grouping method (see text::erGrouping_Modes). Can be one of ERGROUPING_ORIENTATION_HORIZ, ERGROUPING_ORIENTATION_ANY.
- filename: String. The XML or YAML file with the classifier model (e.g. samples/trained_classifier_erGrouping.xml). Only used when the grouping method is ERGROUPING_ORIENTATION_ANY.
- minProbablity: float. The minimum probability for accepting a group. Only used when the grouping method is ERGROUPING_ORIENTATION_ANY.
Return
- groups_rects: [Rect]. The output of the algorithm, stored as a list of rectangles.
Python prototype (for reference only):
erGrouping(image, channel, regions[, method[, filename[, minProbablity]]]) -> groups_rects
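To give a feel for what "regions organized as text blocks" means for horizontal text, the sketch below chains character boxes from left to right and returns one rectangle per chain. It is a toy heuristic, not the grouping algorithm used by OpenCV: the function names, the `(x, y, w, h)` box format, and both ratio thresholds are assumptions chosen for this example.

```python
def bounding_rect(boxes):
    """Smallest (x, y, w, h) rectangle containing all boxes."""
    x0 = min(x for x, _, _, _ in boxes)
    y0 = min(y for _, y, _, _ in boxes)
    x1 = max(x + w for x, _, w, _ in boxes)
    y1 = max(y + h for _, y, _, h in boxes)
    return (x0, y0, x1 - x0, y1 - y0)


def group_boxes(boxes, max_gap_ratio=1.0, max_height_ratio=2.0):
    """Chain boxes left to right when they are close, similar in height,
    and vertically overlapping; return one bounding rectangle per chain."""
    groups = []
    for box in sorted(boxes):  # sorted by x, i.e. left to right
        x, y, w, h = box
        for group in groups:
            gx, gy, gw, gh = group[-1]  # right-most box of the chain so far
            gap = x - (gx + gw)
            similar = max(h, gh) <= max_height_ratio * min(h, gh)
            overlap = min(y + h, gy + gh) - max(y, gy) > 0
            if gap <= max_gap_ratio * max(h, gh) and similar and overlap:
                group.append(box)
                break
        else:
            groups.append([box])
    return [bounding_rect(group) for group in groups]


boxes = [(0, 0, 10, 20), (12, 1, 10, 19), (24, 0, 9, 20),
         (100, 0, 10, 20), (13, 200, 10, 20)]
print(group_boxes(boxes))
# [(0, 0, 33, 20), (13, 200, 10, 20), (100, 0, 10, 20)]
```

The first three boxes merge into one block; the distant box on the same line and the box on a lower line each stay on their own.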
@spec erGrouping(Evision.Mat.maybe_mat_in(), Evision.Mat.maybe_mat_in(), [[{number(), number()}]], [filename: term(), method: term(), minProbablity: term()] | nil) :: [{number(), number(), number(), number()}] | {:error, String.t()}
@spec loadClassifierNM1(Keyword.t()) :: any() | {:error, String.t()}
@spec loadClassifierNM1(binary()) :: term() | {:error, String.t()}
Allows the default classifier to be loaded implicitly when creating an ERFilter object.
Positional Arguments
- filename: String. The XML or YAML file with the classifier model (e.g. trained_classifierNM1.xml).
Return
- retval: Evision.Text.ERFilter.Callback.t(). A pointer to ERFilter::Callback.
Python prototype (for reference only):
loadClassifierNM1(filename) -> retval
@spec loadClassifierNM2(Keyword.t()) :: any() | {:error, String.t()}
@spec loadClassifierNM2(binary()) :: term() | {:error, String.t()}
Allows the default classifier to be loaded implicitly when creating an ERFilter object.
Positional Arguments
- filename: String. The XML or YAML file with the classifier model (e.g. trained_classifierNM2.xml).
Return
- retval: Evision.Text.ERFilter.Callback.t(). A pointer to ERFilter::Callback.
Python prototype (for reference only):
loadClassifierNM2(filename) -> retval
@spec loadOCRBeamSearchClassifierCNN(Keyword.t()) :: any() | {:error, String.t()}
@spec loadOCRBeamSearchClassifierCNN(binary()) :: term() | {:error, String.t()}
Allows the default character classifier to be loaded implicitly when creating an OCRBeamSearchDecoder object.
Positional Arguments
- filename: String. The XML or YAML file with the classifier model (e.g. OCRBeamSearch_CNN_model_data.xml.gz).
Return
- retval: Evision.Text.OCRBeamSearchDecoder.ClassifierCallback.t()
The default CNN classifier is based on the scene text recognition method proposed by Adam Coates & Andrew Ng in [Coates11a]. The character classifier consists of a single-layer convolutional neural network and a linear classifier. It is applied to the input image in a sliding-window fashion, providing a set of recognitions at each window location.
Python prototype (for reference only):
loadOCRBeamSearchClassifierCNN(filename) -> retval
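The sliding-window application mentioned above can be pictured with a short sketch. The helper names, the one-dimensional (column-range) windows, and the `classify` callback are assumptions for illustration; the actual window size and stride of the CNN classifier are not given here.

```python
def sliding_windows(image_width, window_width, step):
    """Return (start, end) column ranges of windows that fit inside the image."""
    return [(x, x + window_width)
            for x in range(0, image_width - window_width + 1, step)]


def recognise_windows(image_width, window_width, step, classify):
    """Apply a classifier callback at every window location."""
    return [(start, classify(start, end))
            for start, end in sliding_windows(image_width, window_width, step)]


print(sliding_windows(10, 4, 2))
# [(0, 4), (2, 6), (4, 8), (6, 10)]
```

Each window yields its own recognition result, so a decoder downstream receives one set of character hypotheses per location.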
Allows the default character classifier to be loaded implicitly when creating an OCRHMMDecoder object.
Positional Arguments
- filename: String. The XML or YAML file with the classifier model (e.g. OCRBeamSearch_CNN_model_data.xml.gz).
- classifier: integer(). Can be one of the classifier_type enum values.
Return
- retval: Evision.Text.OCRHMMDecoder.ClassifierCallback.t()
Python prototype (for reference only):
loadOCRHMMClassifier(filename, classifier) -> retval
@spec loadOCRHMMClassifierCNN(Keyword.t()) :: any() | {:error, String.t()}
@spec loadOCRHMMClassifierCNN(binary()) :: term() | {:error, String.t()}
Allows the default character classifier to be loaded implicitly when creating an OCRHMMDecoder object.
Positional Arguments
- filename: String. The XML or YAML file with the classifier model (e.g. OCRBeamSearch_CNN_model_data.xml.gz).
Return
- retval: Evision.Text.OCRHMMDecoder.ClassifierCallback.t()
The default CNN classifier is based on the scene text recognition method proposed by Adam Coates & Andrew Ng in [Coates11a]. The character classifier consists of a single-layer convolutional neural network and a linear classifier. It is applied to the input image in a sliding-window fashion, providing a set of recognitions at each window location.
Deprecated: use loadOCRHMMClassifier instead.
Python prototype (for reference only):
loadOCRHMMClassifierCNN(filename) -> retval
@spec loadOCRHMMClassifierNM(Keyword.t()) :: any() | {:error, String.t()}
@spec loadOCRHMMClassifierNM(binary()) :: term() | {:error, String.t()}
Allows the default character classifier to be loaded implicitly when creating an OCRHMMDecoder object.
Positional Arguments
- filename: String. The XML or YAML file with the classifier model (e.g. OCRHMM_knn_model_data.xml).
Return
- retval: Evision.Text.OCRHMMDecoder.ClassifierCallback.t()
The default KNN classifier is based on the scene text recognition method proposed by Lukás Neumann & Jiri Matas in [Neumann11b]. The region (contour) in the input image is normalized to a fixed size, while retaining the centroid and aspect ratio, in order to extract a feature vector based on gradient orientations along the chain-code of its perimeter. The region is then classified using a KNN model trained with synthetic data of rendered characters in different standard font types.
Deprecated: use loadOCRHMMClassifier instead.
Python prototype (for reference only):
loadOCRHMMClassifierNM(filename) -> retval
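The final classification step of such a KNN model can be sketched as a majority vote among the nearest training samples. This is a generic k-nearest-neighbours illustration, not the model shipped with OpenCV: the function name `knn_classify`, the two-dimensional feature vectors, and the training data are made up, and the real features (gradient orientations along the chain-code) are not computed here.

```python
import math
from collections import Counter


def knn_classify(sample, training, k=3):
    """Return the majority label among the k training items nearest to sample.

    training is a list of (feature_vector, label) pairs.
    """
    nearest = sorted(training, key=lambda item: math.dist(sample, item[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical feature vectors for two character classes.
training = [((0, 0), "a"), ((0, 1), "a"),
            ((5, 5), "b"), ((6, 5), "b"), ((5, 6), "b")]
print(knn_classify((0.2, 0.1), training))  # a
print(knn_classify((5, 5.2), training))    # b
```

`math.dist` requires Python 3.8 or later; on older versions a hand-written Euclidean distance would be needed.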