Code content analyzer with cognitive load adjustments for programming content.
This analyzer processes code blocks and applies a cognitive load adjustment to account for the mental overhead of context switching between prose and code. The tokenizer has explicit rules for identifier boundaries, numeric literals, and operator sequences.
Algorithm
Line Processing: Reject blank lines and count the remaining lines.
Token Splitting: Split lines on whitespace and identifier boundaries, using heuristics to treat operators as countable tokens.
- Decimal literals are treated as single tokens (3.14 is one token)
- Simple string literals ("string", 'string', and `string`) are unwrapped for token counting; triple literals (""", ''', or ```) are left alone
- Dots between identifiers (object.method.call) are replaced with space for identifier tokenization (resulting in 3 tokens, not 5)
- Non-identifier character sequences are treated as single tokens, so the Elixir range literal (1..10//2) is treated as 5 tokens
Cognitive Load: Apply reading word adjustment based on token density
- Lines with < 5 tokens: token count = reading words
- Lines with ≥ 5 tokens: max(tokens + 3, 10) reading words
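The three steps above can be sketched roughly as follows. This is a minimal illustration of the documented rules, not the actual Prosody.CodeAnalyzer implementation: the module name CodeAnalyzerSketch, its functions, and the regular expressions are all assumptions made for this example, and real edge cases (escaped quotes, apostrophes in comments, and so on) are not handled.

```elixir
defmodule CodeAnalyzerSketch do
  # Hypothetical module: illustrates the documented rules only.

  # Returns %{words: _, reading_words: _, lines: _} for a string of code.
  def analyze(content) when is_binary(content) do
    token_counts =
      content
      |> String.split("\n")
      # Line Processing: blank lines are rejected before counting.
      |> Enum.reject(&(String.trim(&1) == ""))
      |> Enum.map(&(&1 |> tokenize() |> length()))

    %{
      words: Enum.sum(token_counts),
      reading_words: token_counts |> Enum.map(&reading_words/1) |> Enum.sum(),
      lines: length(token_counts)
    }
  end

  # Token Splitting: returns the list of tokens for a single line.
  def tokenize(line) do
    line
    |> unwrap_simple_strings()
    |> split_dotted_identifiers()
    |> scan_tokens()
  end

  # Cognitive Load: short lines count as-is, denser lines get a penalty.
  def reading_words(tokens) when tokens < 5, do: tokens
  def reading_words(tokens), do: max(tokens + 3, 10)

  # Strips the quotes from simple "…", '…' and `…` literals. The lookarounds
  # and the non-empty body keep triple-quote delimiters untouched.
  defp unwrap_simple_strings(line) do
    [
      ~r/(?<!")"([^"]+)"(?!")/,
      ~r/(?<!')'([^']+)'(?!')/,
      ~r/(?<!`)`([^`]+)`(?!`)/
    ]
    |> Enum.reduce(line, fn regex, acc -> Regex.replace(regex, acc, "\\1") end)
  end

  # Replaces a dot between two identifiers with a space, so that
  # object.method.call yields 3 tokens. Digits cannot start the match,
  # which leaves decimals such as 3.14 alone.
  defp split_dotted_identifiers(line) do
    Regex.replace(~r/([A-Za-z_]\w*)\.(?=[A-Za-z_])/, line, "\\1 ")
  end

  # Alternatives are tried in order: decimal literal, identifier or integer,
  # then any run of non-identifier, non-whitespace characters.
  defp scan_tokens(line) do
    ~r/\d+\.\d+|\w+|[^\s\w]+/
    |> Regex.scan(line)
    |> List.flatten()
  end
end
```

Under these assumptions, CodeAnalyzerSketch.tokenize("1..10//2") splits the Elixir range literal into the five tokens "1", "..", "10", "//", and "2", and CodeAnalyzerSketch.analyze("range = 1..100\nstep = 3.14159") gives 8 words and 13 reading words over 2 lines.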
There are no configuration options for Prosody.CodeAnalyzer.
Examples
Simple Function
block = %{
type: :code,
content: "def hello\n puts 'world'\nend",
language: "ruby"
}
# Line 1: "def hello" -> 2 tokens -> 2 reading words
# Line 2: " puts 'world'" -> 2 tokens -> 2 reading words
# Line 3: "end" -> 1 token -> 1 reading word
# Result: %{words: 5, reading_words: 5, lines: 3}
Complex Expression
block = %{
type: :code,
content: "result = Math.sqrt(a * a + b * b)",
language: "javascript"
}
# Tokens: ["result", "=", "Math", "sqrt", "(", "a", "*", "a", "+", "b", "*", "b", ")"]
# 13 tokens -> max(13 + 3, 10) = 16 reading words
# Result: %{words: 13, reading_words: 16, lines: 1}
Numeric and Operator Handling
block = %{
type: :code,
content: "range = 1..100\nstep = 3.14159",
language: "ruby"
}
# Line 1: ["range", "=", "1", "..", "100"] -> 5 tokens -> max(5 + 3, 10) = 10 reading words
# Line 2: ["step", "=", "3.14159"] -> 3 tokens -> 3 reading words
# Result: %{words: 8, reading_words: 13, lines: 2}