Core modules
- class chatbot_eval.types.Sample(question, expected_answer)[source]
Bases: object
One evaluation row loaded from the FAQ CSV file.
- Parameters:
question (str)
expected_answer (str)
- question: str
- expected_answer: str
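A minimal sketch of loading samples from a two-column FAQ CSV. The `Sample` class below is a local stand-in mirroring the documented fields (the real type lives in `chatbot_eval.types`), and the CSV content is illustrative:

```python
import csv
import io
from dataclasses import dataclass

# Local stand-in mirroring the documented chatbot_eval.types.Sample fields.
@dataclass
class Sample:
    question: str
    expected_answer: str

# Hypothetical FAQ CSV with the two documented columns.
faq_csv = (
    "question,expected_answer\n"
    "What is the return policy?,30 days with receipt\n"
)

# One Sample per CSV row.
samples = [
    Sample(question=row["question"], expected_answer=row["expected_answer"])
    for row in csv.DictReader(io.StringIO(faq_csv))
]
```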
- class chatbot_eval.types.Completion(text, thinking=None, raw=<factory>)[source]
Bases: object
Raw completion returned by a chat backend.
- Parameters:
text (str)
thinking (str | None)
raw (Dict[str, Any])
- text
Final model output shown to the user.
- Type:
str
- thinking
Optional reasoning trace, when the provider exposes one.
- Type:
str | None
- raw
Provider-native payload kept for debugging.
- Type:
Dict[str, Any]
- text: str
- thinking: str | None
- raw: Dict[str, Any]
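A sketch of the two documented shapes a completion can take, using a local stand-in for `chatbot_eval.types.Completion`; the `raw` field gets a dict factory default, matching the `raw=<factory>` signature, and the payload keys are illustrative:

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Optional

# Local stand-in mirroring chatbot_eval.types.Completion.
@dataclass
class Completion:
    text: str
    thinking: Optional[str] = None
    raw: Dict[str, Any] = field(default_factory=dict)

# A provider with no reasoning trace leaves `thinking` as None
# and `raw` as an empty dict.
plain = Completion(text="Paris is the capital of France.")

# A provider that exposes reasoning populates `thinking`, and the
# provider-native payload is kept in `raw` for debugging.
traced = Completion(
    text="Paris.",
    thinking="The user asks for the capital of France.",
    raw={"model": "example-model", "finish_reason": "stop"},
)
```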
- class chatbot_eval.types.BotResult(answer, metadata=<factory>)[source]
Bases: object
Answer returned by a bot together with trace metadata.
- Parameters:
answer (str)
metadata (Dict[str, Any])
- answer: str
- metadata: Dict[str, Any]
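A sketch of a bot result, again with a local stand-in dataclass; `metadata` is free-form trace data, and the keys shown (latency, model name) are illustrative, not part of the documented API:

```python
from dataclasses import dataclass, field
from typing import Any, Dict

# Local stand-in mirroring chatbot_eval.types.BotResult.
@dataclass
class BotResult:
    answer: str
    metadata: Dict[str, Any] = field(default_factory=dict)

# The answer plus whatever trace metadata the bot wants to record.
result = BotResult(
    answer="You can return items within 30 days.",
    metadata={"latency_ms": 420, "model": "example-model"},
)
```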
- class chatbot_eval.types.MetricResult(name, score, details=<factory>)[source]
Bases: object
Result produced by one metric for one question-answer pair.
- Parameters:
name (str)
score (float)
details (Dict[str, Any])
- name: str
- score: float
- details: Dict[str, Any]
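A sketch of one metric scoring one question-answer pair, with a local stand-in for `chatbot_eval.types.MetricResult`; the exact-match metric, its name, and the `details` keys are illustrative assumptions:

```python
from dataclasses import dataclass, field
from typing import Any, Dict

# Local stand-in mirroring chatbot_eval.types.MetricResult.
@dataclass
class MetricResult:
    name: str
    score: float
    details: Dict[str, Any] = field(default_factory=dict)

# Toy exact-match scoring for a single question-answer pair.
expected = "30 days with receipt"
answer = "30 days with receipt"
match = MetricResult(
    name="exact_match",
    score=1.0 if answer == expected else 0.0,
    details={"expected": expected, "answer": answer},
)
```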
- class chatbot_eval.evaluation.evaluator.Evaluator(metrics)[source]
Bases: object
Evaluate bots against samples and collect row-level outputs.
- Parameters:
metrics (list[object])
- metrics: list[object]
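A sketch of how the pieces above might fit together. Only the constructor signature (`Evaluator(metrics)`) is documented; the `evaluate` method, the bot-as-callable convention, and the row dictionary shape below are guesses at the "evaluate bots against samples and collect row-level outputs" behaviour, not the actual implementation:

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List

# Local stand-ins for the documented chatbot_eval types.
@dataclass
class Sample:
    question: str
    expected_answer: str

@dataclass
class MetricResult:
    name: str
    score: float
    details: Dict[str, Any] = field(default_factory=dict)

class Evaluator:
    """Hypothetical sketch of chatbot_eval.evaluation.evaluator.Evaluator."""

    def __init__(self, metrics: List[object]) -> None:
        self.metrics = metrics

    def evaluate(self, bot: Callable[[str], str], samples: List[Sample]):
        # One output row per sample: the bot's answer plus one result
        # per configured metric.
        rows = []
        for sample in samples:
            answer = bot(sample.question)
            rows.append({
                "question": sample.question,
                "answer": answer,
                "metrics": [metric(sample, answer) for metric in self.metrics],
            })
        return rows

# Toy bot and metric to exercise the loop.
def echo_bot(question: str) -> str:
    return "42"

def exact_match(sample: Sample, answer: str) -> MetricResult:
    return MetricResult(
        name="exact_match",
        score=1.0 if answer == sample.expected_answer else 0.0,
    )

evaluator = Evaluator(metrics=[exact_match])
rows = evaluator.evaluate(echo_bot, [Sample("Meaning of life?", "42")])
```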