TantivyEx.MergePolicy (TantivyEx v0.4.1)
Merge policy configuration for TantivyEx indexes.
Merge policies control when and how index segments are merged. The choice affects search performance, write performance, and storage efficiency.
Available Policies
LogMergePolicy - Default policy that merges segments of similar sizes
NoMergePolicy - Never merges segments automatically
Examples
# Create a default log merge policy
{:ok, policy} = TantivyEx.MergePolicy.log_merge_policy()
# Create a custom log merge policy
{:ok, policy} = TantivyEx.MergePolicy.log_merge_policy(%{
  min_num_segments: 4,
  max_docs_before_merge: 5_000_000,
  min_layer_size: 5_000,
  level_log_size: 0.8,
  del_docs_ratio_before_merge: 0.3
})
# Create a no-merge policy for testing
{:ok, policy} = TantivyEx.MergePolicy.no_merge_policy()
# Apply policy to an index writer
TantivyEx.MergePolicy.set_merge_policy(index_writer, policy)
Summary
Functions
get_merge_policy_info/1 - Gets information about the current merge policy of an IndexWriter.
get_num_segments/1 - Gets the number of searchable segments in an index.
get_searchable_segment_ids/1 - Gets the list of searchable segment IDs from an index.
log_merge_policy/0 - Creates a new LogMergePolicy with default settings.
log_merge_policy/1 - Creates a new LogMergePolicy with custom settings.
merge_segments/2 - Manually triggers a merge operation for specific segments.
no_merge_policy/0 - Creates a NoMergePolicy that never automatically merges segments.
set_merge_policy/2 - Sets the merge policy for an IndexWriter.
wait_merging_threads/1 - Waits for all merging threads to complete.
Types
@type log_merge_options() :: %{
  optional(:min_num_segments) => non_neg_integer(),
  optional(:max_docs_before_merge) => non_neg_integer(),
  optional(:min_layer_size) => non_neg_integer(),
  optional(:level_log_size) => float(),
  optional(:del_docs_ratio_before_merge) => float()
}
@type merge_policy() :: reference()
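The log_merge_options() type above can be checked before an options map is handed to the library. The following is a minimal sketch, assuming a hypothetical helper module named LogMergeOptionsCheck that is not part of TantivyEx. It only mirrors the keys and value types in the typespec; the library may apply further checks of its own, or be more lenient than this one.

```elixir
defmodule LogMergeOptionsCheck do
  # Hypothetical helper, not part of TantivyEx.
  # Key groups mirror the value types in log_merge_options().
  @integer_keys [:min_num_segments, :max_docs_before_merge, :min_layer_size]
  @float_keys [:level_log_size, :del_docs_ratio_before_merge]

  # Returns :ok, or {:error, {:invalid_options, keys}} listing the offending keys.
  def validate(options) when is_map(options) do
    case Enum.reject(options, &valid_entry?/1) do
      [] -> :ok
      bad -> {:error, {:invalid_options, Enum.map(bad, fn {key, _value} -> key end)}}
    end
  end

  # non_neg_integer() fields
  defp valid_entry?({key, value}) when key in @integer_keys,
    do: is_integer(value) and value >= 0

  # float() fields; an integer such as 1 is rejected by this strict check
  defp valid_entry?({key, value}) when key in @float_keys, do: is_float(value)

  # Any key outside the typespec is treated as invalid
  defp valid_entry?(_other), do: false
end
```

Calling LogMergeOptionsCheck.validate(%{min_num_segments: 4}) returns :ok, while a map containing an unknown key or a value of the wrong type returns an error tuple naming the key.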
Functions
get_merge_policy_info(index_writer)
Gets information about the current merge policy of an IndexWriter.
Parameters
index_writer - The IndexWriter reference
Returns
{:ok, info} - Debug information about the current merge policy
{:error, reason} - If getting the info fails
Examples
{:ok, info} = TantivyEx.MergePolicy.get_merge_policy_info(index_writer)
IO.puts(info)
@spec get_num_segments(reference()) :: {:ok, non_neg_integer()} | {:error, term()}
Gets the number of searchable segments in an index.
Parameters
index - The Index reference
Returns
{:ok, count} - Number of segments
{:error, reason} - If getting the count fails
Examples
{:ok, segment_count} = TantivyEx.MergePolicy.get_num_segments(index)
IO.puts("Index has #{segment_count} segments")
get_searchable_segment_ids(index)
Gets the list of searchable segment IDs from an index.
This is useful for understanding the current segment structure and for manual merge operations.
Parameters
index - The Index reference
Returns
{:ok, segment_ids} - List of segment ID strings
{:error, reason} - If getting segment IDs fails
Examples
{:ok, segment_ids} = TantivyEx.MergePolicy.get_searchable_segment_ids(index)
IO.inspect(segment_ids, label: "Segment IDs")
@spec log_merge_policy() :: {:ok, merge_policy()} | {:error, term()}
Creates a new LogMergePolicy with default settings.
LogMergePolicy groups segments into levels based on their size and merges segments within each level when there are enough segments or when the delete ratio exceeds the threshold.
Returns
{:ok, policy} - The merge policy reference
{:error, reason} - If creation fails
Examples
{:ok, policy} = TantivyEx.MergePolicy.log_merge_policy()
@spec log_merge_policy(log_merge_options()) :: {:ok, merge_policy()} | {:error, term()}
Creates a new LogMergePolicy with custom settings.
Options
:min_num_segments - Minimum number of segments to merge (default: 8)
:max_docs_before_merge - Maximum docs in segment before it's excluded from merging (default: 10,000,000)
:min_layer_size - Minimum segment size for level grouping (default: 10,000)
:level_log_size - Log ratio between consecutive levels (default: 0.75)
:del_docs_ratio_before_merge - Delete ratio threshold to trigger merge (default: 1.0)
Returns
{:ok, policy} - The merge policy reference
{:error, reason} - If creation fails or parameters are invalid
Examples
# More aggressive merging
{:ok, policy} = TantivyEx.MergePolicy.log_merge_policy(%{
  min_num_segments: 4,
  del_docs_ratio_before_merge: 0.2
})
# Less aggressive merging for better write performance
{:ok, policy} = TantivyEx.MergePolicy.log_merge_policy(%{
  min_num_segments: 12,
  max_docs_before_merge: 50_000_000
})
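To make the interaction of these options more concrete, the level grouping described for LogMergePolicy can be modelled in plain Elixir. This is a simplified sketch under stated assumptions, not the library's implementation: the module LogMergeSketch and the segment maps with :id, :docs, and :deleted keys are hypothetical, and the real policy may differ in detail (for example in how segment size and the delete ratio are measured).

```elixir
defmodule LogMergeSketch do
  # Hypothetical model, not part of TantivyEx.
  # Defaults copied from the option list documented for log_merge_policy/1.
  @defaults %{
    min_num_segments: 8,
    max_docs_before_merge: 10_000_000,
    min_layer_size: 10_000,
    level_log_size: 0.75,
    del_docs_ratio_before_merge: 1.0
  }

  # Returns the groups of segment ids this simplified model would merge.
  def merge_candidates(segments, options \\ %{}) do
    opts = Map.merge(@defaults, options)

    segments
    |> Enum.reject(&(&1.docs > opts.max_docs_before_merge))
    |> Enum.sort_by(& &1.docs, :desc)
    |> group_into_levels(opts)
    |> Enum.filter(&mergeable?(&1, opts))
    |> Enum.map(fn level -> Enum.map(level, & &1.id) end)
  end

  # Walks segments from largest to smallest. A new level starts whenever a
  # segment's log2 size falls more than level_log_size below the level's top.
  defp group_into_levels(sorted, opts) do
    {levels, _top} =
      Enum.reduce(sorted, {[], nil}, fn seg, {levels, top} ->
        # Sizes below min_layer_size are clipped up, so small segments share a level.
        log_size = :math.log2(Enum.max([seg.docs, opts.min_layer_size, 1]))

        if top == nil or log_size < top - opts.level_log_size do
          {[[seg] | levels], log_size}
        else
          [current | rest] = levels
          {[[seg | current] | rest], top}
        end
      end)

    levels |> Enum.map(&Enum.reverse/1) |> Enum.reverse()
  end

  # A level qualifies when it has enough segments or one segment has too many deletes.
  defp mergeable?(level, opts) do
    length(level) >= opts.min_num_segments or
      Enum.any?(level, &(delete_ratio(&1) > opts.del_docs_ratio_before_merge))
  end

  defp delete_ratio(%{docs: docs, deleted: deleted}) when docs + deleted > 0,
    do: deleted / (docs + deleted)

  defp delete_ratio(_segment), do: 0.0
end
```

In this model, lowering :min_num_segments makes a level qualify sooner, and lowering :del_docs_ratio_before_merge lets a single delete-heavy segment pull its whole level into a merge, which matches the "more aggressive" example above.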
merge_segments(index_writer, segment_ids)
Manually triggers a merge operation for specific segments.
This allows you to explicitly control which segments get merged, bypassing the merge policy's automatic decisions.
Parameters
index_writer - The IndexWriter reference
segment_ids - List of segment ID strings to merge
Returns
:ok - If the merge was triggered successfully
{:error, reason} - If the merge cannot be started
Examples
{:ok, segment_ids} = TantivyEx.MergePolicy.get_searchable_segment_ids(index)
:ok = TantivyEx.MergePolicy.merge_segments(index_writer, segment_ids)
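The example above pattern-matches on success and would raise on any error tuple. A more defensive variant might look like the sketch below. The module ManualMerge and its merge_mod argument are hypothetical: merge_mod defaults to TantivyEx.MergePolicy and exists only so a stub can be substituted in tests. Skipping the call when fewer than two segments exist is a choice made for this example, not documented library behavior.

```elixir
defmodule ManualMerge do
  # Hypothetical helper, not part of TantivyEx.
  # Uses only get_searchable_segment_ids/1 and merge_segments/2 as documented here.
  def merge_all(index, index_writer, merge_mod \\ TantivyEx.MergePolicy) do
    case merge_mod.get_searchable_segment_ids(index) do
      # Two or more segments: ask the writer to merge them all.
      {:ok, [_, _ | _] = segment_ids} ->
        with :ok <- merge_mod.merge_segments(index_writer, segment_ids) do
          {:ok, :merge_triggered}
        end

      # Zero or one segment: this sketch chooses not to trigger a merge.
      {:ok, _zero_or_one} ->
        {:ok, :skipped}

      # Pass library errors through unchanged.
      {:error, _reason} = error ->
        error
    end
  end
end
```

A caller can then branch on {:ok, :merge_triggered}, {:ok, :skipped}, or {:error, reason} instead of crashing on a failed match.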
@spec no_merge_policy() :: {:ok, merge_policy()} | {:error, term()}
Creates a NoMergePolicy that never automatically merges segments.
This is useful for testing scenarios or when you want complete manual control over segment merging.
Returns
{:ok, policy} - The merge policy reference
{:error, reason} - If creation fails
Examples
{:ok, policy} = TantivyEx.MergePolicy.no_merge_policy()
@spec set_merge_policy(reference(), merge_policy()) :: :ok | {:error, term()}
Sets the merge policy for an IndexWriter.
Parameters
index_writer - The IndexWriter reference
merge_policy - The merge policy to set
Returns
:ok - If the policy was set successfully
{:error, reason} - If setting the policy fails
Examples
{:ok, policy} = TantivyEx.MergePolicy.log_merge_policy()
:ok = TantivyEx.MergePolicy.set_merge_policy(index_writer, policy)
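Because both policy creation and set_merge_policy can return {:error, reason}, the two steps can be chained so that the first failure is returned to the caller. The following is a minimal sketch, assuming a hypothetical wrapper module MergePolicySetup; its merge_mod argument defaults to TantivyEx.MergePolicy and is there only to allow a stub in tests.

```elixir
defmodule MergePolicySetup do
  # Hypothetical helper, not part of TantivyEx.
  # Builds a log merge policy and applies it to the writer in one step.
  def apply_log_policy(index_writer, options \\ %{}, merge_mod \\ TantivyEx.MergePolicy) do
    with {:ok, policy} <- build_policy(options, merge_mod),
         :ok <- merge_mod.set_merge_policy(index_writer, policy) do
      {:ok, policy}
    end
  end

  # An empty map means "use the documented defaults" via log_merge_policy/0.
  defp build_policy(options, merge_mod) when map_size(options) == 0,
    do: merge_mod.log_merge_policy()

  # Otherwise pass the custom settings to log_merge_policy/1.
  defp build_policy(options, merge_mod), do: merge_mod.log_merge_policy(options)
end
```

With no else clause, `with` returns the first non-matching value as-is, so an {:error, reason} from either call reaches the caller unchanged.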
wait_merging_threads(index_writer)
Waits for all merging threads to complete.
This is useful when you want to ensure all pending merges are finished before proceeding, such as during testing or before closing an index.
Parameters
index_writer - The IndexWriter reference
Returns
:ok - If all merging threads completed successfully
{:error, reason} - If waiting fails
Examples
:ok = TantivyEx.MergePolicy.wait_merging_threads(index_writer)