
What Are Scores and When Should I Use Them?

Scores are covered in detail on the Evaluation Concepts page.

How to Create Scores

There are four ways to add scores:

  • LLM-as-a-Judge: Set up automated evaluators that score traces based on custom criteria (e.g. hallucination, tone, relevance). These can return numeric, categorical, or boolean (true/false) scores plus reasoning, and can run on live production traces or on experiment results.
  • Annotation in the UI: Team members manually score traces, observations, or sessions directly in the Langfuse dashboard. Requires a score config to be set up first.
  • Annotation queues: Set up structured review workflows where reviewers work through batches of traces.
  • Scores via API/SDK: Programmatically add scores from your application code — for user feedback, guardrail results, custom evaluation pipelines, or open-ended text feedback.
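To make the API/SDK path above concrete, here is a minimal sketch of building the JSON body for a score as accepted by Langfuse's public scores endpoint (`POST /api/public/scores`). The helper function `build_score_payload` is illustrative, not part of the SDK; verify the exact field names and endpoint against your SDK version.

```python
def build_score_payload(trace_id, name, value, data_type="NUMERIC", comment=None):
    """Build the JSON body for a score attached to a trace.

    Illustrative helper, not a Langfuse SDK function. Field names
    (traceId, name, value, dataType, comment) mirror the score
    attributes described above.
    """
    payload = {
        "traceId": trace_id,
        "name": name,
        "value": value,
        "dataType": data_type,  # NUMERIC, CATEGORICAL, or BOOLEAN
    }
    if comment is not None:
        payload["comment"] = comment  # optional open-ended text feedback
    return payload

# Example: record a thumbs-up from a user as a boolean score
feedback = build_score_payload(
    trace_id="trace-abc123",
    name="user-feedback",
    value=1,
    data_type="BOOLEAN",
    comment="User clicked thumbs up",
)
```

The same payload shape works for guardrail results or custom evaluation pipelines; only `name`, `value`, and `dataType` change.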
