content_indexer v0.2.5 ContentIndexer.Services.Calculator
Summary
Calculates the content_indexer weights for a document of tokens against a corpus of tokenized documents.
DEPRECATED - use the [`ContentIndexer.TfIdf.Calculate`](ContentIndexer.TfIdf.Calculate.html) module.
See also `ContentIndexer.TfIdf.IndexProcessTest`
Summary
Functions
Calculates the content_indexer weights for each token in the list of tokens against the corpus of tokens.
Calculates the content_indexer weights for each token in the list of tokens against the corpus of tokens, given a precomputed corpus size.
Calculates the content_indexer weights for each token in the query by weighting the query against itself.
Calculates the term frequency of each token in the list of tokens representing the document and returns the tokens with their respective term frequencies.
Calculates the word count of each token in the list of tokens representing the document and returns the tokens with their respective word counts.
Checks whether an item is contained in the list.
Functions
Calculates the content_indexer weights for each token in the list of tokens against the corpus of tokens.
## Parameters
- tokens: List of tokens to be indexed
- corpus_of_tokens: List of lists representing the corpus of all tokens
## Example
    iex> ContentIndexerService.calculate_content_indexer_documents(
    ...>   ["bread", "butter", "jam"],
    ...>   [
    ...>     ["red", "brown", "jam"],
    ...>     ["blue", "green", "butter"],
    ...>     ["pink", "green", "bread", "jam"]
    ...>   ]
    ...> )
    {:ok, [bread: 0.3662040962227032, butter: 0.3662040962227032, jam: 0.3662040962227032]}
Calculates the content_indexer weights for each token in the list of tokens against the corpus of tokens, using a corpus size supplied by the caller.
## Parameters
- tokens: List of tokens to be indexed
- corpus_of_tokens: List of lists representing the corpus of all tokens
- corpus_size: Integer size of corpus_of_tokens, passed in to avoid recalculating it
## Example
    iex> ContentIndexerService.calculate_content_indexer_documents(
    ...>   ["bread", "butter", "jam"],
    ...>   [
    ...>     ["red", "brown", "jam"],
    ...>     ["blue", "green", "butter"],
    ...>     ["pink", "green", "bread", "jam"]
    ...>   ],
    ...>   3
    ...> )
    {:ok, [bread: 0.3662040962227032, butter: 0.3662040962227032, jam: 0.3662040962227032]}
Calculates the content_indexer weights for each token in the query by weighting the query against itself.
## Parameters
- tokens: List of tokens to be indexed
## Example
    iex> ContentIndexerService.calculate_content_indexer_query(["bread", "butter", "jam"])
    {:ok, [bread: 0.0, butter: 0.0, jam: 0.0]}
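The all-zero result above is what one would expect if the weights follow a textbook TF-IDF formula, which this page does not confirm. Under that assumption, a query weighted against itself is a corpus of one document, so every inverse document frequency is ln(1/1) = 0. The sketch below illustrates this. `QueryWeightSketch`, `weights/2` and `query_weights/1` are hypothetical names and not part of the library.

```elixir
defmodule QueryWeightSketch do
  # ASSUMPTION: textbook TF-IDF with a natural logarithm, i.e.
  #   weight = (count / total_tokens) * ln(corpus_size / docs_containing_token)
  # The library's actual formula is not shown on this page.
  def weights(tokens, corpus) do
    corpus_size = length(corpus)
    total = length(tokens)

    result =
      tokens
      |> Enum.frequencies()
      |> Enum.sort()
      |> Enum.map(fn {token, count} ->
        containing = Enum.count(corpus, &(token in &1))

        # Guard against tokens that appear in no corpus document.
        idf = if containing == 0, do: 0.0, else: :math.log(corpus_size / containing)

        # String.to_atom/1 is acceptable for a small demo only.
        {String.to_atom(token), count / total * idf}
      end)

    {:ok, result}
  end

  # "Against itself": the query is the only document in the corpus.
  def query_weights(tokens), do: weights(tokens, [tokens])
end

QueryWeightSketch.query_weights(["bread", "butter", "jam"])
# => {:ok, [bread: 0.0, butter: 0.0, jam: 0.0]}
```

Because the inverse document frequency is zero for every token, the term frequencies have no effect on the result in this case.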
Calculates the term frequency of each token in the list of tokens representing the document and returns the tokens with their respective term frequencies.
## Parameters
- tokens: List of tokens to be indexed
## Example
    iex> ContentIndexerService.calculate_tf_document(["bread", "butter", "jam", "jam", "bread", "bread"])
    {:ok, [bread: 0.5, butter: 0.16666666666666666, jam: 0.3333333333333333]}
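The values in the example are consistent with term frequency defined as a token's count divided by the total number of tokens in the document (3/6, 1/6 and 2/6). A minimal sketch of that calculation follows. `TfSketch.term_frequencies/1` is a hypothetical helper, not the library's implementation, and sorting the output by token is a choice made here.

```elixir
defmodule TfSketch do
  # Hypothetical helper: relative frequency of each distinct token.
  def term_frequencies(tokens) do
    total = length(tokens)

    result =
      tokens
      # Map of token => number of occurrences.
      |> Enum.frequencies()
      # Sort by token so the keyword list has a stable order.
      |> Enum.sort()
      # String.to_atom/1 is acceptable for a small demo only; atoms are
      # never garbage collected, so avoid it on untrusted input.
      |> Enum.map(fn {token, count} -> {String.to_atom(token), count / total} end)

    {:ok, result}
  end
end

TfSketch.term_frequencies(["bread", "butter", "jam", "jam", "bread", "bread"])
# => {:ok, [bread: 0.5, butter: 0.16666666666666666, jam: 0.3333333333333333]}
```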
Calculates the word count of each token in the list of tokens representing the document and returns the tokens with their respective word counts.
## Parameters
- tokens: List of tokens to be indexed
## Example
    iex> ContentIndexerService.calculate_token_count_document(["bread", "butter", "jam", "jam", "bread", "bread"])
    {:ok, [bread: 3, butter: 1, jam: 2]}
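Counting tokens as described above can be sketched with a single pass over the list that accumulates into a map. `TokenCountSketch.token_counts/1` is a hypothetical name used only for illustration, and the library may implement the count differently.

```elixir
defmodule TokenCountSketch do
  # Hypothetical helper, not the library's implementation.
  def token_counts(tokens) do
    counts =
      Enum.reduce(tokens, %{}, fn token, acc ->
        # Start at 1 for a new token, otherwise increment the running count.
        Map.update(acc, token, 1, &(&1 + 1))
      end)

    result =
      counts
      # Sort by token so the keyword list has a stable order.
      |> Enum.sort()
      # String.to_atom/1 is acceptable for a small demo only.
      |> Enum.map(fn {token, count} -> {String.to_atom(token), count} end)

    {:ok, result}
  end
end

TokenCountSketch.token_counts(["bread", "butter", "jam", "jam", "bread", "bread"])
# => {:ok, [bread: 3, butter: 1, jam: 2]}
```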
Checks whether an item is contained in the list.
## Parameters
- list: List of any type
- item: Any type of item stored in the list
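This section does not show the name of the membership function, so the sketch below uses the standard library instead. `Enum.member?/2` and the `in` operator perform the same kind of check on a list of any type. The variable names are illustrative.

```elixir
list = ["bread", "butter", "jam"]

# Enum.member?/2 returns true when the item is found in the list.
has_jam = Enum.member?(list, "jam")

# The `in` operator is equivalent for lists.
has_butter = "butter" in list

# Items that are absent yield false.
has_honey = Enum.member?(list, "honey")

IO.inspect({has_jam, has_butter, has_honey})
# prints {true, true, false}
```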