Evaluates whether LLM output is relevant to the input query.
Relevance means the output directly addresses and answers the question. Uses statement extraction: breaks output into statements, classifies each as relevant or irrelevant, scores based on proportion of relevant content.