Queries and Schema

Archeometer.Query defines an Ecto-like query language to get information out of the generated data base without going out of Elixir.

For example, to get the top 10 biggest modules, you can use the following

import Archeometer.Query
alias Archeometer.Schema.Module

query = from m in Module,
          select: [m.name, m.num_lines]
          order_by: [desc: m.num_lines]

Archeometer.Repo.all(query)

Let's break it down into simpler parts

schema

Schema

You must query an schema. An schema holds the information about the table. It has fields and references to other schemas. In the previous example we used the Archeometer.Schema.Module schema. It looks like this (with some skipped sections for brevity)

defschema(:modules) do
  field(:id, primary_key: true)
  field(:name)
  field(:num_lines)
  ...
  belongs_to(App, key: :app_id)
  has(Function, as: :functions)
  ...
end

It has the name and num_lines fields, so you can select them. There are references to other schema. They allow you to select fields in other tables. For example the schema at Archeometer.Schema.App looks like this

defschema :apps do
  field(:id, primary_key: true)
  field(:name)
  has(Module, as: :modules, key: :app_id)
end

And the schema at Archeometer.Schema.Function also has a name field. So you can do the following

import Archeometer.Query
alias Archeometer.Schema.Module

from m in Module, select: m.name, m.app.name, m.functions.name

And this will select all the function names, with module and application. You can read the Archeometer.Schema documentation for more details about how they work. The available schemas are

term-in-query-expressions

Term in query expressions

The expression that are accepted in the queries are a subset of the regular Elixir terms. A subset that is translatable to SQL. For example, in the following query

from m in Module, where: m.num_lines > 50, select: [m.name, m.app.name]

we use an arithmetic comparison, some integer literals. m.name is a field lookup, m.app.name is also a field lookup, but with an implicit join with another table holding the application information.

You can find read more about them in Archoemeter.Query and the whole specification is available in Archeometer.Query.Term.

building-queries

Building queries

Queries are built using the Archoemeter.Query.from/2 macro. The first parameter must be a module implementing a schema. The rest are keyword arguments. Each keyword corresponds to an SQL keyword. They more or less replicate the Ecto API. For example you can do the following

import Archeometer.Query
alias Archeometer.Schema.Module

from m in Module,
  select: [app: m.app.name, avg_size: avg(m.num_lines)]
  where: m.app.name == "archeometer"
  order_by: [desc: sum(m.num_lines)],
  group_by: m.app.name
  having: avg_size > 50,
  limit: 10

To then get all the data you must execute the query. For now the only way to do it is with the Archeometer.Repo.all/1 function, as in the initial example. You can see its full documentation at Archeometer.Repo.