Maxim AI release notes
www.getmaxim.ai

πŸ’Ύ Evaluation presets for reusing your test configs

 

New

  
  • To streamline your testing, Maxim now supports saving a preset for your testing configuration and can be reused across the workspace.
    • A preset is combination of
      • Dataset
      • RAG source
      • Set of evaluators
  • A test config preset can be used for all the entities in Maxim (Workflow, Prompt chains and Prompts).
  • To create a preset,
    • Go to any workflow/prompt/prompt chain
    • Click on presets and click on β€œCreate new preset”.
    • Or create a test config and click on β€œSave as a preset”.

presets.gif
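For illustration only, the sketch below shows the three pieces a preset bundles as a plain Python dict; the dataset, RAG source, and evaluator names are made up, and this is not Maxim's internal format.

```python
# Illustrative only: the three pieces a test config preset bundles.
# The dataset, RAG source, and evaluator names here are made up.
preset = {
    "dataset": "customer-support-dataset",
    "rag_source": "docs-index",
    "evaluators": ["Clarity", "Toxicity", "Tools Calling Accuracy"],
}
print(preset["evaluators"])
```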

βœ… Pass-fail criteria for evaluators

 

New

  

Adding evaluation thresholds is helpful for quick decisions. Today, we are shipping workspace-specific pass-fail criteria on custom and built-in evaluators to speed up decision-making.

How to set pass-fail criteria?

  1. Go to any evaluator in your workspace.
  2. You will see a section called Pass criteria.
  3. The first value applies at the entry level, i.e., in the image below, a test run entry is marked as passed if its "Clarity" score is >= 0.8.
  4. The second value applies at the test run report level, i.e., "Clarity" is marked as passed if 80% of the entries are marked as passed.

image.png

You can view the pass-fail result in the top section of the Test Run Report.

image.png
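To make the two thresholds concrete, here is a minimal sketch (not Maxim's implementation) of how the entry-level and report-level criteria interact, using the "Clarity" example above.

```python
# Minimal sketch (not Maxim's implementation) of the two-level pass-fail check.
ENTRY_THRESHOLD = 0.8    # an entry passes if its "Clarity" score is >= 0.8
REPORT_THRESHOLD = 0.8   # the evaluator passes if >= 80% of entries pass

def entry_passed(score: float) -> bool:
    return score >= ENTRY_THRESHOLD

def evaluator_passed(scores: list[float]) -> bool:
    if not scores:
        return False
    passed = sum(entry_passed(s) for s in scores)
    return passed / len(scores) >= REPORT_THRESHOLD

clarity_scores = [0.9, 0.85, 0.7, 0.95, 0.8]
print(evaluator_passed(clarity_scores))  # True: 4 of 5 entries (80%) passed
```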

πŸŒ… Prompt chains now support image inputs

 

New

  
  • Prompt chain playground now supports user messages with image inputs.

prompt chain image.gif

βŒ— Datasets editor got a major upgrade

 

New

  

We have launched an all-new Excel-like editing experience for our datasets.

  • Directly copy and paste from and to Excel sheets.
  • Shortcuts compatibility.
  • Add/remove columns based on your use cases.

Supported column types

  1. Input - This goes as the user message in test runs.
  2. Expected Output.
  3. Expected Tool Calls.
  4. Variable - These are variables used in inputs, workflows, prompts, prompt chains, and prompt experiments.
  5. Images - These go as attachments with the user input if the selected model supports them.
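For illustration, the sketch below shows one hypothetical dataset row using the column types above; the values and exact cell formats are assumptions, not Maxim's documented schema.

```python
# Hypothetical dataset row illustrating the supported column types.
# All values are made up; the exact cell format Maxim expects may differ.
row = {
    "Input": "What is the weather in Paris tomorrow?",           # sent as the user message
    "Expected Output": "A short, accurate forecast for Paris.",  # reference answer
    "Expected Tool Calls": ["get_weather"],                       # tools the model should call
    "city": "Paris",                                              # a Variable column used in prompts/workflows
    "Images": ["paris-map.png"],                                  # attached if the selected model supports vision
}
print(row["Input"])
```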

Datasets.gif

GPT-4o is available on Maxim

 

New

  
  • We have enabled GPT-4o model.
  • Along with GPT-4o, we have enabled
    • Llama-3 (8b), (70b) on Groq

πŸ”— Visual Prompt Chains now have a new home

 

New

  
  • You can access them easily by clicking on the Prompts menu on the top bar.

image.png

 

Improvement

  
  • Test run console logs got some visual uplift.

image.png

🏷️ Now create Prompt Version with multiple messages

 

New

  
  • We have updated prompt version management to handle multiple messages in a version and return them in the same order they were selected using the SDK.

prompt-version.gif

πŸ‘©β€πŸ”¬ All new Prompt Experiments now support vision models

 

New

  
  • Our Prompt Experiment tool has been given a makeover to help you compare prompts side-by-side like never before. You're going to love the new and improved version!

promt-experiments.gif

πŸ“¦ Meta Llama 3 is available via Together Provider

 

New

  
  • We have enabled Meta Llama 3 on Together provider. Both Llama 3 (8B) and Llama 3 (70B) are available to use.
  • You can add them to the platform by going to Settings > Model Config > Together.

image.png

πŸ“ž Tools Calling Accuracy is now available in the evaluator store

 

New

  
  • We have added a new evaluator called Tools Calling Accuracy in our evaluator store that helps you find function calling accuracy for a given model and prompt.

  • Install "Tools Calling Accuracy" evaluator from Evaluator store.

install-evaluator.gif

  • Create a dataset with "Expected Tool Calls" column

save-tool-accuracy-dataset.gif
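For illustration, an "Expected Tool Calls" cell could describe the call(s) the model is expected to make, along the lines of the sketch below; the function name, arguments, and schema are assumptions rather than Maxim's documented format.

```python
# Hedged sketch of an "Expected Tool Calls" cell for a weather prompt.
# The function name, arguments, and schema here are assumptions.
expected_tool_calls = [
    {
        "name": "get_current_weather",                          # call the model is expected to make
        "arguments": {"location": "Paris", "unit": "celsius"},
    }
]
print(expected_tool_calls)
```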

  • Trigger the test run

test-run.gif