New
- Maxim now supports Vertex AI as a provider, bringing the total number of supported providers to 13.
- With this addition, the evaluator store includes 15 new Vertex AI evaluators:
- Vertex Exact Match – Checks if the model’s output exactly matches the expected answer.
- Vertex Bleu – Measures n-gram overlap between the generated and reference texts for translation tasks.
- Vertex Rouge – Evaluates content overlap between output and reference, especially for summaries.
- Vertex Fluency – Assesses how natural and grammatically correct the text reads.
- Vertex Coherence – Evaluates logical consistency and flow across the generated response.
- Vertex Safety – Flags potentially harmful, toxic, or unsafe content in the output.
- Vertex Groundedness – Verifies if the response stays factual and rooted in provided context.
- Vertex Fulfillment – Measures how well the output satisfies the user prompt or intent.
- Vertex Summarization Quality – Holistic quality score for summaries, balancing coverage, fluency, and faithfulness.
- Vertex Summarization Helpfulness – Assesses whether the summary effectively aids user understanding.
- Vertex Summarization Verbosity – Evaluates whether the summary is overly verbose or too brief.
- Vertex Question Answering Quality – Overall score for QA responses across relevance, correctness, and clarity.
- Vertex Question Answering Relevance – Checks if the answer is contextually relevant to the input question.
- Vertex Question Answering Helpfulness – Assesses if the answer improves understanding or provides value.
- Vertex Question Answering Correctness – Evaluates factual accuracy and truthfulness of QA responses.
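
For intuition about the reference-based evaluators in this list (Exact Match, Bleu, Rouge), here is a minimal Python sketch of the underlying ideas. It is not Vertex AI's or Maxim's implementation: the function names `exact_match` and `ngram_precision` are hypothetical, tokenization is plain whitespace splitting, and real BLEU and ROUGE scores add further steps (for example, combining several n-gram orders and a brevity penalty for BLEU).

```python
from collections import Counter


def exact_match(output: str, expected: str) -> float:
    """Return 1.0 if the two strings are identical after trimming whitespace, else 0.0."""
    return 1.0 if output.strip() == expected.strip() else 0.0


def ngrams(tokens: list[str], n: int) -> Counter:
    """Count every contiguous n-gram in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def ngram_precision(candidate: str, reference: str, n: int = 1) -> float:
    """Fraction of candidate n-grams that also occur in the reference.

    Counts are clipped, so a repeated candidate n-gram is credited at most
    as many times as it appears in the reference. This is an illustrative
    simplification of the overlap idea behind BLEU-style metrics.
    """
    cand = ngrams(candidate.split(), n)
    ref = ngrams(reference.split(), n)
    total = sum(cand.values())
    if total == 0:
        return 0.0
    overlap = sum(min(count, ref[gram]) for gram, count in cand.items())
    return overlap / total


if __name__ == "__main__":
    print(exact_match("Paris", "Paris"))
    print(ngram_precision("the cat sat", "the cat sat on the mat", n=1))
    print(ngram_precision("the cat sat", "the dog sat", n=2))
```

Dividing the clipped overlap by the number of reference n-grams instead of candidate n-grams gives a recall-style score, which is closer in spirit to ROUGE-type overlap for summaries.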