Maxim AI release notes

🤖 Claude 4 models are live on Maxim!

New

Anthropic's latest Claude 4 models are now available on Maxim. Access Claude 4 Opus and Claude 4 Sonnet, both offering enhanced reasoning capabilities and improved performance for your experimentation and evaluation workflows.

Start using these models via the Anthropic provider:
✅ Go to Settings > Models > Select Anthropic provider > Add Claude 4 Opus or Claude 4 Sonnet

🤖 Agentic mode in the Prompt Playground

New

Prototype complete agent behavior, including automatic tool calling, directly within the playground. Here’s what you can do:

Test multi-step flows: Experiment with and evaluate complex agentic interactions where the model automatically calls tools and executes steps until a final response is generated.
Set limits and termination conditions: Control the maximum number of tool calls allowed and specify a custom string to end the agentic sequence when it appears in the response.
Mimic and monitor tool usage: Track which tools are being called during model generation.
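
To make the limit and termination settings concrete, here is a minimal Python sketch of the kind of loop an agentic run performs. It is an illustration only, not Maxim's implementation: `ModelTurn`, `run_agent`, `scripted_model`, and the `add` tool are hypothetical names, and the model is a scripted stand-in rather than a real LLM call.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Tuple


@dataclass
class ModelTurn:
    """One model output: final text, or a request to call a tool (hypothetical type)."""
    text: str = ""
    tool_name: Optional[str] = None
    tool_args: dict = field(default_factory=dict)


def run_agent(
    model: Callable[[List[str]], ModelTurn],
    tools: Dict[str, Callable],
    prompt: str,
    max_tool_calls: int = 5,
    stop_string: Optional[str] = None,
) -> Tuple[str, List[str]]:
    """Call the model repeatedly, executing requested tools until a stop condition."""
    history = [prompt]
    calls: List[str] = []  # names of the tools invoked, in order
    while True:
        turn = model(history)
        # Stop when the custom termination string appears in the response.
        if stop_string and stop_string in turn.text:
            return turn.text, calls
        # Stop when the model answers without requesting a tool.
        if turn.tool_name is None:
            return turn.text, calls
        # Stop when the tool-call budget is already used up.
        if len(calls) >= max_tool_calls:
            return turn.text, calls
        result = tools[turn.tool_name](**turn.tool_args)
        calls.append(turn.tool_name)
        history.append(f"{turn.tool_name} -> {result}")


def scripted_model(history: List[str]) -> ModelTurn:
    # Stand-in for an LLM: requests one tool call, then gives a final answer.
    if len(history) == 1:
        return ModelTurn(tool_name="add", tool_args={"a": 2, "b": 3})
    return ModelTurn(text="The sum is 5. DONE")


tools = {"add": lambda a, b: a + b}
answer, calls = run_agent(scripted_model, tools, "What is 2 + 3?", stop_string="DONE")
print(answer, calls)
```

The returned `calls` list plays the role of the tool-usage trace: it records which tools ran and in what order, which is the information you would inspect when monitoring a run.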

✨ Enhanced Workflow customization and API-based evaluation with Scripts!

New

Maxim Workflows now offer powerful scripting capabilities, allowing you to tailor API interactions to your exact needs. Use Scripts to modify requests, process response data, and manage multi-turn conversations, giving you full control over your AI agent evaluations.

Key features:

Request Modification: Use the prescript function to modify request parameters, add headers, or transform data before sending requests to your API.
Response Processing: Use the postscriptV2 function to transform the response returned by the API, extract specific fields, and map them to evaluation parameters based on your needs.
Simulation: Use the preSimulation and postSimulation functions to clean up data and set up evaluations on your desired output parameters.

Customize your test runs directly from the UI using Scripts. Learn more.
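
The request-modification and response-processing roles described above can be illustrated with a small, language-neutral sketch written here in Python. This is not Maxim's script API: the function names `pre_hook`, `post_hook`, and `fake_api`, the `X-Run-Source` header, and the request and response shapes are all assumptions made for illustration. Refer to the Scripts documentation for the actual signatures of prescript and postscriptV2.

```python
import json


def pre_hook(request: dict) -> dict:
    """Modify the outgoing request: add a header and reshape the body (hypothetical)."""
    updated = dict(request)  # shallow copy so the caller's dict is not mutated
    updated["headers"] = {**request.get("headers", {}), "X-Run-Source": "evaluation"}
    updated["body"] = {"query": request["body"]["input"]}
    return updated


def post_hook(raw_response: str) -> dict:
    """Extract only the fields an evaluator needs from the raw API response (hypothetical)."""
    payload = json.loads(raw_response)
    return {
        "output": payload["data"]["answer"],
        "context": payload["data"].get("sources", []),
    }


def fake_api(request: dict) -> str:
    # Stand-in for your API endpoint: upper-cases the query and returns JSON.
    answer = request["body"]["query"].upper()
    return json.dumps({"data": {"answer": answer, "sources": ["doc-1"]}})


request = {"headers": {}, "body": {"input": "hello"}}
sent = pre_hook(request)
result = post_hook(fake_api(sent))
print(result)
```

The point of the split is that the endpoint stays untouched: the pre step adapts test inputs to whatever shape the API expects, and the post step maps the API's response onto the fields your evaluators consume.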

🧠 Gemma 3 and Qwen3 models are live on Maxim!

New

Gemma 3 (Google's latest open-model series) and Qwen3 (Alibaba's newest open-source model family) are now available on Maxim. Use their multimodal and multilingual capabilities for efficient experimentation and evaluation on the Maxim platform.

Add these models to your workspace via the Ollama provider:
✅ Go to Settings > Models > Select Ollama provider > Add your desired model from the Qwen3 or Gemma 3 family

📝 Markdown support for Human Eval instructions

New

You can now use Markdown formatting when creating custom instructions for Human Evaluators. Key benefits:

✅ Improved Formatting: Use headings, lists, and other Markdown elements to structure your instructions clearly.
📖 Enhanced Readability: Makes it easier for human annotators and SMEs to understand the evaluation criteria, reducing ambiguity.

Mistral AI Models on Maxim!

New

You can now use Mistral's state-of-the-art models, such as Ministral 8B, Mistral Large, and Pixtral Large, via the Mistral provider in Maxim.

✅ Get started by navigating to Settings > select Mistral and add a new config > enter your API key and select the desired model.

🕣 Scheduled Runs!

New

Run automated evaluations for your prompts, workflows, and prompt chains at regular intervals using Scheduled Runs. This removes the need for manually triggering test runs each time, and ensures your AI agents and workflows are routinely evaluated for quality and performance.

Set up Scheduled Runs for your workflows by following these steps.

🎆 Fireworks AI is now available on Maxim!

New

You can now connect and run popular models like DeepSeek, Llama, and Qwen using the Fireworks AI provider within Maxim. Integration options:

Serverless: Best for a fast, no-ops setup. Navigate to Settings > select Fireworks AI provider > add your API key to use powerful models like DeepSeek V3, Llama 4, and more.
Deployment: Great for custom model control. In the Fireworks AI config window, select type as Deployment > add your API key, Account ID, Deployment ID, and select the models you wish to use.

📎 Attach Files to Traces and Spans!

New

Enhance the observability of your AI workflows by attaching local files (audio, images, text, etc.) or remote files (as URLs) directly to your traces and spans using the Maxim SDK. Attachments capture the documents, audio recordings, or images that were used as input or context, giving you richer material for debugging, analysis, and auditing.

All attachments are stored and viewable within the Maxim platform alongside your trace data, allowing quick access to supporting information for faster issue resolution. Learn more.

# Assumes `trace` and `span` were already created with the Maxim SDK.
# Example: attach a local audio file to a trace
trace.add_attachment(FileAttachment(path='./files/wav_audio.wav'))

# Example: attach a remote file or image by URL to a span
span.add_attachment(UrlAttachment(url='https://sample-image.com/test-image'))

📊 Exported test run reports now include a summary

New

When you export your test run reports, the generated XLSX file now automatically includes a summary of the test run alongside the detailed evaluation results.