PowerOfThree (PowerOfThree v0.1.3)
View SourceThe PowerOfThree defines three macros to be used with Ecto.Schema to creates cube config files.
The PowerOfThree must be used after using Ecto.Schema.
The Ecto.Schema defines table column names to be used in measure and dimensions defenitions.
The definition of the Cude is possible through main APIs:
cube/3.
cube/3 has to define sql_table: that is refering Ecto schema source.
After using Ecto.Schema and PowerOfThree define cube with cube/2 macro.
Example
defmodule Example.Customer do
use Ecto.Schema
use PowerOfThree
schema "customer" do
field(:first_name, :string)
field(:last_name, :string)
field(:email, :string)
field(:birthday_day, :integer)
field(:birthday_month, :integer)
field(:brand_code, :string)
field(:market_code, :string)
end
cube :of_customers, # name of the cube: mandatory
sql_table: "customer", # Ecto.Schema `source`: mandatory
# Only `sql_table:` is supported. Must reference EctoSchema `:source`
# the `sql:` is not supported and never will be.
description: "of Customers"
# path through options in accordance with Cube DSL
dimension(
[:brand_code, :market_code, :email],
# several fields of `customer` Ecto.Schema: mandatory
# the list is validated against list of fields of EctoSchema
name: :email_per_brand_per_market,
# dimensions `name:`, optional.
primary_key: true # This `customer:` table supports only one unique combination of
# `:brand_code`, `:market_code`, `:email`
)
dimension(
:first_name, # a field of `customer` Ecto.Schema: mandatory
# validated against list of fields of EctoSchema
name: :given_name, # dimension `name:` optional
description: "Given Name"
# path through options in accordance with Dimension DSL
)
measure(:count) # measure of type `count:` is a special one: no column reference in `sql:` is needed
# `name:` defaults to `count:`
measure(:email, # measures counts distinct of `email:` column
name: :aquarii, # name it proper latin plural
type: :count_distinct,
description: "Only count Aquariuses", # add `description:` and `filter:` in options
filters: [%{sql: "(birthday_month = 1 AND birthday_day >= 20) OR (birthday_month = 2 AND birthday_day <= 18)"}]
# correct SQL refrencing correct columns
# `filters:` uses an SQL clause to not count others
)
end
endAfter creating a few dimensions and measures run mix compile. The following yaml is created for the above:
---
cubes:
- name: of_customers
description: of Customers
sql_table: customer
measures:
- name: count
type: count
- meta:
ecto_field: email
ecto_type: string
name: aquarii
type: count_distinct
description: Only count Aquariuses
filters:
- sql: (birthday_month = 1 AND birthday_day >= 20) OR (birthday_month = 2 AND birthday_day <= 18)
sql: email
dimensions:
- meta:
ecto_fields:
- brand_code
- market_code
- email
name: email_per_brand_per_market
type: string
primary_key: true
sql: brand_code||market_code||email
- meta:
ecto_field: first_name
ecto_field_type: string
name: given_name
type: string
description: Given Name
sql: first_name
Auto-Generated Default Cube
When cube/2 is called without a block, PowerOfThree automatically generates dimensions and measures
based on your Ecto schema field types. This provides a quick way to get started without manually
defining each dimension and measure.
Auto-Generation Rules
Dimensions are created for these field types:
:string,:binary,:binary_id,:bitstring→ string dimension:boolean→ boolean dimension:date,:time,:naive_datetime,:utc_datetime(and_usecvariants) → time dimension
Measures are created as follows:
count- always generated (counts all rows)- For
:integerand:idfields - TWO measures per field:<field>_sum- sums the values<field>_distinct- counts distinct values
- For
:floatand:decimalfields:<field>_sum- sums the values
Example
defmodule Example.Product do
use Ecto.Schema
use PowerOfThree
schema "products" do
field :name, :string
field :description, :string
field :active, :boolean
field :price, :float
field :quantity, :integer
timestamps() # adds inserted_at and updated_at
end
# Auto-generates all dimensions and measures
cube :products, sql_table: "products"
endThis auto-generates:
- Dimensions:
name,description,active,inserted_at,updated_at - Measures:
count,quantity_sum,quantity_distinct,price_sum
Accessor functions are created for all auto-generated dimensions and measures:
Product.Dimensions.name() # Access name dimension
Product.Measures.quantity_sum() # Access quantity sum measure
Product.Measures.price_sum() # Access price sum measureWhen to Use Auto-Generation vs Explicit Block
Use auto-generation when:
- You want all fields as dimensions and standard aggregations
- Prototyping or getting started quickly
- Your schema has simple field types that map directly to cube concepts
Use explicit block when:
- You need custom SQL expressions in dimensions
- You want filtered measures
- You need multi-field dimensions (concatenated)
- You want to exclude certain fields
- You need custom measure names or types
Accessing Dimensions and Measures
PowerOfThree generates accessor functions for dimensions and measures in two ways:
Module Accessors
Individual dimensions and measures can be accessed via generated modules:
Customer.Dimensions.brand() # Returns %PowerOfThree.DimensionRef{}
Customer.Dimensions.email()
Customer.Measures.count() # Returns %PowerOfThree.MeasureRef{}
Customer.Measures.aquarii()List Accessors
Get all dimensions or measures as lists:
dimensions = Customer.dimensions() # Returns [%PowerOfThree.DimensionRef{}, ...]
measures = Customer.measures() # Returns [%PowerOfThree.MeasureRef{}, ...]
# Find specific dimension/measure from list
brand = Enum.find(dimensions, fn d -> d.name == :brand end)
count = Enum.find(measures, fn m -> m.name == "count" end)Building Queries
Both accessor styles can be used with QueryBuilder and df/1:
# Using module accessors
Customer.df(columns: [
Customer.Dimensions.brand(),
Customer.Measures.count()
])
# Using list accessors
dimensions = Customer.dimensions()
measures = Customer.measures()
Customer.df(columns: [
Enum.find(dimensions, fn d -> d.name == :brand end),
Enum.find(measures, fn m -> m.name == "count" end)
])Type Mapping
The dimensions and measures derive some defaults from Ecto.Schema.field properties.
For example the dimension: type: is derived from ecto if not given explicitly according to this rules:
| Cube dimension types | Ecto type | Elixir type |
|---|---|---|
| number | :id | integer |
| string | :binary_id | binary |
| number, boolean | :integer | integer |
| number, boolean enough? | :float | float |
| boolean | :boolean | boolean |
| string | UTF-8 encoded string | string |
| string | :binary | :binary |
| string | :bitstring | :bitstring |
{:array, inner_type} | list | TODO geo? |
| Not Supported now | :map | map |
| Not Supported now | {:map, inner_type} | map |
| number | :decimal | Decimal |
| time | :date | Date |
| time | :time | Time |
| time | :time_usec | Time |
| time | :naive_datetime | NaiveDateTime |
| time | :naive_datetime_usec | NaiveDateTime |
| time | :utc_datetime | DateTime |
| time | :utc_datetime_usec | DateTime |
| number | :duration | Duration |
The goal of PowerOfThree is to cover 80% of cases where the source of Ecto Schema is a table and fields have real column names:
where field name =:= database column name
The the support of all cube features is not the goal here. The automation of obtaining the usable cube configs with minimal verbocity is: avoid typing more typos then needed.
The cube DSL allows the sql: - any SQL query. If everyone can write SQL it does not mean everyone should.
Writing good SQL is an art a few knew. In the memory of Patrick's Mother the PowerOfThree will not support sql:.
While defining custom sql: may looks like an option, how would one validate the creative use of aliases in SQL?
Meanwhile Ecto.Schema fields are available for references to define dimensions type:.
Summary
Functions
Dimension first argument takes a single Ecto.Schema field or a list of Ecto.Schema fields.
Measure first argument takes an atom :count, a single Ecto.Schema field or a list of Ecto.Schema fields.
Uses :inserted_at as default time dimension
defmacro cube(CALLER,cube_name, what_ecto_schema, block)
Functions
Dimension first argument takes a single Ecto.Schema field or a list of Ecto.Schema fields.
Lets create a Dimension for several Ecto.Schema fields. A list of Ecto.Schema fields is mandatory.
Ecto.Schema fields concatenated into SQL: brand_code||market_code||email
The primary_key: true tells the cube how to distinguish unique records.
Examples
dimension(
[:brand_code, :market_code, :email],
name: :email_per_brand_per_market,
primary_key: true
)Lets create a Dimension for a single Ecto.Schema field
Examples
dimension(:brand_code, name: :brand, description: "Beer")
Measure first argument takes an atom :count, a single Ecto.Schema field or a list of Ecto.Schema fields.
Lets create a Measure for several Ecto.Schema fields. A list of Ecto.Schema fields reference and :type are mandatory.
The sql: is mandatory, must be valid SQL clause using the fields from list and returning a number.
Examples
measure([:tax_amount,:discount_total_amount],
sql: "tax_amount + discount_total_amount",
type: :sum,
description: "two measures we want add together"
)Lets create a Measure for a single Ecto.Schema field
The Ecto.Schema field reference and :type are mandatory
The other cube measure DLS properties are passed through
Examples
measure(:email,
name: :emails_distinct,
type: :count_distinct,
description: "count distinct of emails"
)Lets create a Measure of type :count
No :type is needed
The other cube measure DLS properties are passed through
Examples
measure(:count,
description: "no need for fields for :count type measure"
)
Uses :inserted_at as default time dimension
defmacro cube(CALLER,cube_name, what_ecto_schema, block)
defp cube(caller,cube_name,what_ecto_schema, block) do