content_indexer v0.2.5 ContentIndexer.Services.Calculator

Summary

Calculates the content_indexer (TF-IDF) weights for a document of tokens against a corpus of tokenized documents.

DEPRECATED - use the [`ContentIndexer.TfIdf.Calculate`](ContentIndexer.TfIdf.Calculate.html) module.

See also `ContentIndexer.TfIdf.IndexProcessTest`

Summary

Functions

calculate_content_indexer_documents(tokens, corpus_of_tokens)
Calculates the content_indexer weights for each token in the list of tokens against the corpus of tokens.

calculate_content_indexer_documents(tokens, corpus_of_tokens, corpus_size)
Calculates the content_indexer weights for each token in the list of tokens against the corpus of tokens, using a corpus size supplied by the caller.

calculate_content_indexer_query(tokens)
Calculates the content_indexer weights for each token in the query, weighting the query against itself.

calculate_tf_document(tokens)
Calculates the term frequency for each token in the document's token list and returns the tokens with their respective term frequencies.

calculate_token_count_document(tokens)
Calculates the word count for each token in the document's token list and returns the tokens with their respective word counts.

init_calculator()

list_contains(list, item)
Checks whether an item is contained in the list.

Functions

calculate_content_indexer_documents(tokens, corpus_of_tokens)

Calculates the content_indexer weights for each token in the list of tokens against the corpus of tokens.

## Parameters

- tokens: List of tokens to be indexed
- corpus_of_tokens: List of lists representing the corpus of all tokens

## Example

iex> ContentIndexer.Services.Calculator.calculate_content_indexer_documents(
...>   ["bread", "butter", "jam"],
...>   [
...>     ["red", "brown", "jam"],
...>     ["blue", "green", "butter"],
...>     ["pink", "green", "bread", "jam"]
...>   ])
{:ok, [bread: 0.3662040962227032, butter: 0.3662040962227032, jam: 0.3662040962227032]}
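
The weighting described above can be illustrated with a minimal sketch of the textbook TF-IDF formula, assuming weight = term frequency × ln(corpus size / number of documents containing the token). `TfIdfSketch` and `weights/2` are hypothetical names, not part of content_indexer, and the module's own arithmetic is not shown on this page. With this formula "bread" and "butter" come out at about 0.366, but "jam", which appears in two of the three documents, comes out at about 0.135, so treat this as an illustration of the general technique rather than a reproduction of the documented output.

```elixir
defmodule TfIdfSketch do
  # Hypothetical helper, not part of content_indexer.
  # weight(token) = tf(token, tokens) * ln(corpus_size / docs_containing(token))
  def weights(tokens, corpus) do
    corpus_size = length(corpus)
    total = length(tokens)

    tokens
    |> Enum.frequencies()
    |> Map.new(fn {token, count} ->
      tf = count / total
      doc_freq = Enum.count(corpus, fn doc -> token in doc end)
      # Tokens absent from the corpus get 0.0 so we never divide by zero.
      idf = if doc_freq == 0, do: 0.0, else: :math.log(corpus_size / doc_freq)
      {token, tf * idf}
    end)
  end
end

corpus = [
  ["red", "brown", "jam"],
  ["blue", "green", "butter"],
  ["pink", "green", "bread", "jam"]
]

# Returns a map keyed by token string rather than a keyword list.
weights = TfIdfSketch.weights(["bread", "butter", "jam"], corpus)
```

The sketch returns a plain map to avoid converting arbitrary strings to atoms; the documented function returns an `{:ok, keyword_list}` tuple instead.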

calculate_content_indexer_documents(tokens, corpus_of_tokens, corpus_size)

Calculates the content_indexer weights for each token in the list of tokens against the corpus of tokens, using a corpus size supplied by the caller.

## Parameters

- tokens: List of tokens to be indexed
- corpus_of_tokens: List of lists representing the corpus of all tokens
- corpus_size: Integer size of corpus_of_tokens, passed in so that it does not have to be recalculated

## Example

iex> ContentIndexer.Services.Calculator.calculate_content_indexer_documents(
...>   ["bread", "butter", "jam"],
...>   [
...>     ["red", "brown", "jam"],
...>     ["blue", "green", "butter"],
...>     ["pink", "green", "bread", "jam"]
...>   ],
...>   3)
{:ok, [bread: 0.3662040962227032, butter: 0.3662040962227032, jam: 0.3662040962227032]}
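
The reason for passing `corpus_size` separately is that `length/1` walks the whole list each time it is called. The sketch below shows the pattern of computing the size once and reusing it across lookups. `CorpusSizeSketch` and `idf/3` are hypothetical names used only for illustration, and the natural-log inverse document frequency is an assumption about the weighting, not something this page confirms.

```elixir
defmodule CorpusSizeSketch do
  # Hypothetical helper, not part of content_indexer:
  # inverse document frequency with a caller-supplied corpus size.
  def idf(token, corpus, corpus_size) do
    doc_freq = Enum.count(corpus, fn doc -> token in doc end)
    if doc_freq == 0, do: 0.0, else: :math.log(corpus_size / doc_freq)
  end
end

corpus = [
  ["red", "brown", "jam"],
  ["blue", "green", "butter"],
  ["pink", "green", "bread", "jam"]
]

# length/1 is linear in the list size, so compute it once and reuse it.
corpus_size = length(corpus)

idfs =
  for token <- ["bread", "butter", "jam"] do
    {token, CorpusSizeSketch.idf(token, corpus, corpus_size)}
  end
```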

calculate_content_indexer_query(tokens)

Calculates the content_indexer weights for each token in the query, weighting the query against itself.

## Parameters

- tokens: List of tokens to be indexed

## Example

iex> ContentIndexer.Services.Calculator.calculate_content_indexer_query(["bread", "butter", "jam"])
{:ok, [bread: 0.0, butter: 0.0, jam: 0.0]}
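
One way to see why every weight in the example is 0.0: if the query is treated as a corpus containing only itself, each token appears in 1 of 1 documents, and ln(1 / 1) is 0. The sketch below works through that arithmetic, assuming a natural-log inverse document frequency; the variable names are illustrative and the module's actual implementation is not shown here.

```elixir
tokens = ["bread", "butter", "jam"]

# Assumption: the query is weighted against a corpus containing only itself.
corpus = [tokens]
corpus_size = length(corpus)

weights =
  for token <- tokens do
    tf = Enum.count(tokens, &(&1 == token)) / length(tokens)
    doc_freq = Enum.count(corpus, &(token in &1))
    # corpus_size / doc_freq is 1.0, and :math.log(1.0) is 0.0
    {token, tf * :math.log(corpus_size / doc_freq)}
  end

# => [{"bread", 0.0}, {"butter", 0.0}, {"jam", 0.0}]
```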

calculate_tf_document(tokens)

Calculates the term frequency for each token in the document's token list and returns the tokens with their respective term frequencies.

## Parameters

- tokens: List of tokens to be indexed

## Example

iex> ContentIndexer.Services.Calculator.calculate_tf_document(["bread", "butter", "jam", "jam", "bread", "bread"])
{:ok, [bread: 0.5, butter: 0.16666666666666666, jam: 0.3333333333333333]}
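
The values in the example are each token's count divided by the total number of tokens (3/6, 1/6, and 2/6). A minimal sketch of that calculation follows; `TfSketch` and `term_frequencies/1` are hypothetical names, and the sketch returns a map keyed by string rather than the `{:ok, keyword_list}` shape documented above.

```elixir
defmodule TfSketch do
  # Hypothetical helper, not part of content_indexer:
  # relative frequency of each token within a single document.
  def term_frequencies(tokens) do
    total = length(tokens)

    tokens
    |> Enum.frequencies()
    |> Map.new(fn {token, count} -> {token, count / total} end)
  end
end

tf = TfSketch.term_frequencies(["bread", "butter", "jam", "jam", "bread", "bread"])
# => %{"bread" => 0.5, "butter" => 0.16666666666666666, "jam" => 0.3333333333333333}
```

An empty token list yields an empty map, since no division is performed when there are no counts.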

calculate_token_count_document(tokens)

Calculates the word count for each token in the document's token list and returns the tokens with their respective word counts.

## Parameters

- tokens: List of tokens to be indexed

## Example

iex> ContentIndexer.Services.Calculator.calculate_token_count_document(["bread", "butter", "jam", "jam", "bread", "bread"])
{:ok, [bread: 3, butter: 1, jam: 2]}
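
Counting tokens like this can be sketched with the standard library alone. The snippet below shows two equivalent approaches: `Enum.frequencies/1` and an explicit `Enum.reduce/3`. It illustrates the technique only and is not taken from the module's source.

```elixir
tokens = ["bread", "butter", "jam", "jam", "bread", "bread"]

# Built-in helper: counts occurrences of each distinct element.
counts = Enum.frequencies(tokens)
# => %{"bread" => 3, "butter" => 1, "jam" => 2}

# The same result spelled out with a reduce over an accumulator map.
counts_by_reduce =
  Enum.reduce(tokens, %{}, fn token, acc ->
    Map.update(acc, token, 1, &(&1 + 1))
  end)
```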

init_calculator()

list_contains(list, item)

Checks whether an item is contained in the list.

## Parameters

- list: List of any type
- item: Any type of item stored in the list

## Example

iex> ContentIndexer.Services.Calculator.list_contains(["bread", "butter", "jam"], "jam")
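
For comparison, a membership check of this kind can be written with the standard library, as in the sketch below. The return shape of `list_contains/2` is not documented on this page, so the snippet shows only the built-in `in` operator and `Enum.member?/2`, both of which return a boolean.

```elixir
list = ["bread", "butter", "jam"]

# The `in` operator and Enum.member?/2 are equivalent for lists.
has_jam = "jam" in list
# => true

has_tea = Enum.member?(list, "tea")
# => false
```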