Metric modules

class chatbot_eval.metrics.basic.ExactMatchMetric(name='exact_match')[source]

Bases: object

Binary exact-match score after simple normalization.

Parameters:

name (str)

name: str
score(sample, bot_result)[source]

Parameters:

  • sample

  • bot_result

Return type:

MetricResult

class chatbot_eval.metrics.basic.KeywordRecallMetric(name='keyword_recall')[source]

Bases: object

Recall of expected-answer tokens present in the generated answer.

Parameters:

name (str)

name: str
score(sample, bot_result)[source]

Parameters:

  • sample

  • bot_result

Return type:

MetricResult
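A minimal sketch of token-level recall, assuming whitespace tokenization and case-insensitive matching (neither is specified in this reference):

```python
def keyword_recall(expected: str, answer: str) -> float:
    # Fraction of expected-answer tokens that also appear in the generated
    # answer. Tokenization here is plain lowercased whitespace splitting.
    expected_tokens = set(expected.lower().split())
    if not expected_tokens:
        return 0.0
    answer_tokens = set(answer.lower().split())
    return len(expected_tokens & answer_tokens) / len(expected_tokens)
```

For example, with expected "a b c d" and answer "A B x", two of the four expected tokens are recalled, giving 0.5.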

class chatbot_eval.metrics.basic.AnswerLengthMetric(name='answer_length_chars')[source]

Bases: object

Character length of the answer, used as a rough proxy for verbosity.

Parameters:

name (str)

name: str
score(sample, bot_result)[source]

Parameters:

  • sample

  • bot_result

Return type:

MetricResult
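Setting aside the MetricResult wrapping (not documented here), the underlying computation is presumably just a character count:

```python
def answer_length_chars(answer: str) -> int:
    # Raw character count, including whitespace and punctuation.
    return len(answer)
```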

class chatbot_eval.metrics.basic.PolitenessMetric(name='politeness')[source]

Bases: object

Simple heuristic that scores the presence of polite or helpful markers in the answer.

Parameters:

name (str)

name: str
score(sample, bot_result)[source]

Parameters:

  • sample

  • bot_result

Return type:

MetricResult
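One plausible shape for such a heuristic is marker counting; the marker list and the scaling below are hypothetical, not the package's actual values:

```python
# Hypothetical marker list; the real metric's markers are not documented here.
POLITE_MARKERS = ("please", "thank", "happy to help", "you're welcome")

def politeness(answer: str) -> float:
    text = answer.lower()
    # Count how many distinct markers occur anywhere in the answer.
    hits = sum(marker in text for marker in POLITE_MARKERS)
    # Scale the hit count into [0, 1].
    return min(hits / len(POLITE_MARKERS), 1.0)
```

With this list, "Thank you! Happy to help." matches two of four markers and scores 0.5.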

class chatbot_eval.metrics.llm_judge.LLMJudgeMetric(name, llm_client, prompt_path, debug=False)[source]

Bases: object

Call a judge model and parse a JSON score-and-reason response.

Parameters:
  • name (str)

  • llm_client (object)

  • prompt_path (str | Path)

  • debug (bool)

name: str
llm_client: object
prompt_path: str | Path
debug: bool
score(sample, bot_result)[source]

Parameters:

  • sample

  • bot_result

Return type:

MetricResult
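The judge client's interface and the exact JSON schema are not documented here; the sketch below assumes a client with a `complete(prompt)` method and a `{"score": ..., "reason": ...}` payload, and shows the parse-and-clamp step such a metric typically needs:

```python
import json

class StubJudgeClient:
    # Stand-in for the llm_client dependency (interface assumed, not documented).
    def complete(self, prompt: str) -> str:
        return '{"score": 0.8, "reason": "Mostly correct, minor omissions."}'

def parse_judge_response(raw: str) -> tuple:
    # Parse the judge model's JSON payload into (score, reason),
    # defensively clamping the score into [0, 1].
    data = json.loads(raw)
    score = max(0.0, min(1.0, float(data["score"])))
    return score, str(data.get("reason", ""))

score, reason = parse_judge_response(StubJudgeClient().complete("judge this answer"))
```

A `debug=True` flag would plausibly log the raw judge response before parsing, which helps when the model emits malformed JSON.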

chatbot_eval.metrics.registry.build_default_metrics(project_root)[source]

Build the default deterministic and judge-based metric suite.

Parameters:

project_root (str | Path)

Return type:

list[object]
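Since the suite is returned as a plain list of duck-typed objects, a caller can iterate it uniformly. The driver below is a hypothetical usage sketch (the stub metric stands in for what `build_default_metrics` returns; only the `name` attribute and `score(sample, bot_result)` method are assumed):

```python
class _StubMetric:
    # Minimal object with the duck-typed interface the registry returns:
    # a name attribute and a score(sample, bot_result) method.
    name = "exact_match"

    def score(self, sample, bot_result):
        return 1.0

def run_metrics(metrics, sample, bot_result):
    # Apply every metric in the suite to one (sample, bot_result) pair
    # and collect the scores keyed by metric name.
    return {m.name: m.score(sample, bot_result) for m in metrics}

results = run_metrics([_StubMetric()], sample={"q": "?"}, bot_result={"answer": "x"})
```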