Selecto.Builder.Sql.Olap (Selecto v0.4.3)

OLAP dimension optimization SQL patterns for star and snowflake schemas.

Provides optimized JOIN patterns for analytical workloads, focusing on fact table performance and dimension table efficiency. Handles both star schema (denormalized dimensions) and snowflake schema (normalized dimension hierarchies) patterns.

Phase 4: Full OLAP dimension implementation with query optimization

supported-patterns

Supported Patterns

  • Star schema dimensions: Direct fact-to-dimension joins optimized for aggregation
  • Snowflake dimensions: Multi-level normalization chains with proper JOIN ordering
  • Dimension filtering: Optimized WHERE clause placement for analytical queries
  • Fact table hints: Query hints and ordering for large fact table performance

star-vs-snowflake

Star vs Snowflake

Star Schema - Denormalized dimensions for query performance:

fact_table -> dimension_1 (all attributes in one table)
           -> dimension_2 (all attributes in one table)

Snowflake Schema - Normalized dimensions for data integrity:

fact_table -> dim_level_1 -> dim_level_2 -> dim_level_3

examples

Examples

# Star schema: sales facts with denormalized customer dimension
config = %{
  type: :star_dimension,
  source: "customers",
  display_field: "full_name",
  dimension_key: "customer_id"
}

# Snowflake schema: product hierarchy with normalization
config = %{
  type: :snowflake_dimension,
  source: "products", 
  display_field: "name",
  normalization_joins: [
    %{table: "categories", key: "category_id"},
    %{table: "brands", key: "brand_id"}
  ]
}

Link to this section Summary

Functions

Build dimension-aware WHERE clause optimization.

Build fact table optimization hints and JOIN ordering.

Build snowflake schema dimension join with normalization chain.

Build star schema dimension join optimized for OLAP queries.

Link to this section Functions

Link to this function

build_dimension_filter_optimization(filter_config)

Build dimension-aware WHERE clause optimization.

In OLAP queries, WHERE clause placement significantly affects performance. Dimension filters should be applied early, while fact table filters need careful consideration of index usage.

filter-placement-strategy

Filter Placement Strategy

  • Dimension filters: Applied at JOIN time for early elimination
  • Fact filters: Applied after JOINs for optimal fact table index usage
  • Time dimension filters: Special handling for partitioned fact tables

examples

Examples

build_dimension_filter_optimization(%{
  dimension_filters: [
    {"customers.region", "=", "North America"},
    {"products.category", "IN", ["Electronics", "Books"]}
  ],
  fact_filters: [
    {"sales.amount", ">", 1000},
    {"sales.date", "BETWEEN", ["2023-01-01", "2023-12-31"]}
  ]
})

Returns: {optimized_where_iodata, filter_params}

Link to this function

build_fact_table_optimization(selecto, fact_config, join_configs)

Build fact table optimization hints and JOIN ordering.

Fact tables in OLAP systems are typically very large, so JOIN ordering and query hints are critical for performance. This function adds database-specific optimizations for fact table queries.

optimizations-applied

Optimizations Applied

  • Fact table scanned first (for selective WHERE conditions)
  • Dimension tables joined in order of selectivity
  • Query hints for large table handling
  • Index hints for dimensional foreign keys

examples

Examples

build_fact_table_optimization(selecto, :sales_facts, %{
  large_fact_table: true,
  primary_dimensions: [:time, :customer, :product],
  estimated_rows: 10_000_000
})

Returns: {query_hints_iodata, optimization_params}

Link to this function

build_olap_join_with_optimization(selecto, join, config, olap_type, fc, p, ctes)

Build OLAP-optimized join with pattern detection.

Main entry point for OLAP join building. Detects whether to use star or snowflake patterns based on configuration and applies appropriate optimizations.

pattern-detection

Pattern Detection

  • Star schema: Single dimension table with denormalized data
  • Snowflake schema: Multiple normalization tables in chain

Returns: {from_clause_iodata, params, ctes}

Link to this function

build_snowflake_dimension_join(selecto, join, config, fc, p, ctes)

Build snowflake schema dimension join with normalization chain.

Snowflake schemas normalize dimension data across multiple tables to maintain data integrity. This requires chaining multiple JOINs to reconstruct the full dimensional context.

normalization-chain-handling

Normalization Chain Handling

  • Primary dimension table joined to fact
  • Secondary normalization tables joined in sequence
  • Proper JOIN ordering to avoid Cartesian products
  • Optimized for referential integrity queries

parameters

Parameters

  • selecto: Main selecto struct
  • join: Primary dimension identifier
  • config: Snowflake dimension configuration with normalization_joins
  • fc: Current from clause iodata
  • p: Current parameters list
  • ctes: Current CTEs list

Returns: {updated_from_clause, updated_params, updated_ctes}

Link to this function

build_star_dimension_join(selecto, join, config, fc, p, ctes)

Build star schema dimension join optimized for OLAP queries.

Star schemas prioritize query performance by denormalizing dimension data into single tables. This creates direct fact-to-dimension joins that are optimal for aggregation queries and analytical workloads.

optimizations-applied

Optimizations Applied

  • Dimension tables joined directly to fact table
  • Dimension filters pushed down for early elimination
  • Display fields aliased for clear result presentation
  • Faceted filtering enabled for interactive analytics

parameters

Parameters

  • selecto: Main selecto struct (contains fact table info)
  • join: Join identifier (dimension name)
  • config: OLAP dimension configuration
  • fc: Current from clause iodata
  • p: Current parameters list
  • ctes: Current CTEs list

Returns: {updated_from_clause, updated_params, updated_ctes}