Metastatic.Analysis.BusinessLogic.InefficientFilter (Metastatic v0.10.4)

Detects inefficient filtering: fetching all data then filtering in memory.

This analyzer flags code that fetches every record from a database or API and then filters the result in application memory, which is a major performance anti-pattern.

Cross-Language Applicability

The same anti-pattern appears in data access layers across languages and frameworks:

  • Python/Django: users = User.objects.all(); active = [u for u in users if u.active]
  • Python/SQLAlchemy: users = session.query(User).all(); active = filter(lambda u: u.active, users)
  • JavaScript/Sequelize: users = await User.findAll(); active = users.filter(u => u.active)
  • JavaScript/MongoDB: users = await collection.find().toArray(); active = users.filter(...)
  • Elixir/Ecto: users = Repo.all(User); Enum.filter(users, & &1.active)
  • Ruby/ActiveRecord: users = User.all; active = users.select(&:active?)
  • C#/Entity Framework: users = context.Users.ToList(); active = users.Where(u => u.Active)
  • Java/Hibernate: users = session.createQuery("from User").list(); filtered = users.stream().filter(...)

Problem

Fetching all data then filtering wastes:

  • Network bandwidth: Transferring unnecessary data
  • Memory: Loading records that will be discarded
  • CPU: Client-side filtering instead of server-side
  • Database resources: Full table scan where a WHERE clause could have used an index

For 1 million users where 100k are active:

  • Bad: Transfer 1M records, filter to 100k in memory
  • Good: Transfer 100k active records via WHERE clause
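
The difference in transferred rows can be reproduced at a smaller scale. The sketch below uses Python's built-in sqlite3 module with a hypothetical users table of 1,000 rows, 100 of them active; the table name, column names, and sizes are assumptions chosen for illustration, not part of the analyzer.

```python
import sqlite3

# Hypothetical in-memory table, scaled down from the 1M/100k example:
# 1,000 users, of which every tenth one is active (100 rows).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, active INTEGER)")
conn.executemany(
    "INSERT INTO users (id, active) VALUES (?, ?)",
    [(i, 1 if i % 10 == 0 else 0) for i in range(1, 1001)],
)

# Bad: every row crosses the database boundary, then 90% are discarded.
all_rows = conn.execute("SELECT id, active FROM users").fetchall()
active_in_memory = [row for row in all_rows if row[1] == 1]

# Good: the WHERE clause lets the database return only matching rows.
active_from_db = conn.execute(
    "SELECT id, active FROM users WHERE active = 1"
).fetchall()

rows_transferred_bad = len(all_rows)         # 1000
rows_transferred_good = len(active_from_db)  # 100
```

Both approaches produce the same set of active users; only the number of rows that leave the database differs.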

Examples

Bad (Python/Django)

users = User.objects.all()  # Lazy queryset covering every row
active_users = [u for u in users if u.is_active]  # Iteration loads all rows, then filters in memory

Good (Python/Django)

active_users = User.objects.filter(is_active=True)

Bad (JavaScript)

const posts = await Post.findAll();
const published = posts.filter(p => p.status === 'published');

Good (JavaScript)

const published = await Post.findAll({ where: { status: 'published' } });

Bad (Elixir/Ecto)

users = Repo.all(User)
active_users = Enum.filter(users, fn u -> u.active end)

Good (Elixir/Ecto)

import Ecto.Query
active_users = User |> where([u], u.active == true) |> Repo.all()

Bad (C#/Entity Framework)

var users = context.Users.ToList();
var active = users.Where(u => u.IsActive).ToList();

Good (C#/Entity Framework)

var active = context.Users.Where(u => u.IsActive).ToList();

Detection Strategy

Detects the pattern:

{:block, [
  {:assignment, var, {:function_call, fetch_all_op, ...}},
  {:collection_op, :filter, lambda, var}
]}

Where:

  1. A variable is assigned the result of a "fetch all" database operation
  2. The same variable is filtered by a collection operation in the very next statement
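
The two conditions can be sketched as a scan over adjacent statement pairs. The Python below is a rough illustration only, not Metastatic's implementation: it uses plain tuples as a stand-in for the AST shape shown above, and the function name find_inefficient_filters and the FETCH_ALL_NAMES set are hypothetical.

```python
# Hypothetical tuple-based stand-in for the AST shape shown above.
# Assumed example names; the real analyzer's list is not reproduced here.
FETCH_ALL_NAMES = {"Repo.all", "User.findAll", "User.objects.all"}


def find_inefficient_filters(block):
    """Return variables assigned from a fetch-all call and filtered in the next statement."""
    tag, statements = block
    if tag != "block":
        return []
    hits = []
    # Only adjacent statement pairs are compared.
    for current, following in zip(statements, statements[1:]):
        if current[0] != "assignment" or following[0] != "collection_op":
            continue
        _, var, value = current
        _, op, _lambda, target = following
        is_fetch_all = (
            isinstance(value, tuple)
            and value[0] == "function_call"
            and value[1] in FETCH_ALL_NAMES
        )
        if is_fetch_all and op == "filter" and target == var:
            hits.append(var)
    return hits


flagged = ("block", [
    ("assignment", "users", ("function_call", "Repo.all", ["User"])),
    ("collection_op", "filter", "fn u -> u.active end", "users"),
])

# An unrelated statement between the fetch and the filter breaks the match.
separated = ("block", [
    ("assignment", "users", ("function_call", "Repo.all", ["User"])),
    ("function_call", "Logger.info", ["loaded users"]),
    ("collection_op", "filter", "fn u -> u.active end", "users"),
])

flagged_vars = find_inefficient_filters(flagged)      # ["users"]
separated_vars = find_inefficient_filters(separated)  # []
```

Comparing only adjacent pairs keeps the check simple, at the cost of missing a filter that is separated from its fetch by other statements.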

Database "Fetch All" Heuristics

Function names suggesting "fetch all":

  • *all*, *findAll*, *getAll*, *fetchAll*
  • *toList*, *toArray*, *.list*
  • ORM-specific: Repo.all, *.objects.all(), query().all()
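
One plausible reading of these patterns is shell-style globbing over the called function's name. The sketch below assumes exactly that, plus case-insensitive comparison so that C#'s ToList matches *toList*. Both are assumptions, and looks_like_fetch_all is a hypothetical helper rather than the analyzer's actual matching code.

```python
from fnmatch import fnmatchcase

# Patterns copied from the list above; glob semantics are an assumption.
FETCH_ALL_PATTERNS = [
    "*all*", "*findAll*", "*getAll*", "*fetchAll*",
    "*toList*", "*toArray*", "*.list*",
]


def looks_like_fetch_all(function_name):
    """Return True if the name matches any pattern, ignoring case (assumed)."""
    name = function_name.lower()
    return any(fnmatchcase(name, pattern.lower()) for pattern in FETCH_ALL_PATTERNS)


matches = {
    name: looks_like_fetch_all(name)
    for name in [
        "Repo.all",              # ORM fetch-all: matches
        "context.Users.ToList",  # matches only because case is ignored
        "Repo.get",              # single-record lookup: no match
        "pip.install",           # unrelated name containing "all": matches
    ]
}
```

Under these assumptions, the broad *all* pattern also matches unrelated names such as pip.install, which is one way a name-based heuristic can produce false positives.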

Limitations

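  • Requires consecutive statements: the assignment must be followed directly by the filter
  • May miss the pattern when other operations sit between the fetch and the filter
  • Heuristic-based: name matching can produce both false positives and false negatives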