Maxim AI release notes
www.getmaxim.ai

Logging and online evaluations

 

New

  

Maxim now enables seamless import of production logs via our Python and JavaScript SDKs. For detailed guidance, consult the documentation.

online_eval.gif

Unlike traditional logging frameworks, Maxim’s stateless SDK allows for one-time logger configuration across multiple services, nodes, or functions, eliminating the need to manage thread pools or manually sequence logs. With a session object, you can track activities throughout the entire conversation lifecycle.

A session provides the overarching context for an operation, while traces record the individual activities within that session in detail. Maxim also integrates with LangChain through the MaximLangChainTracer.
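To make the session/trace relationship concrete, here is a minimal sketch in plain Python. It does not use the Maxim SDK: `Session`, `Trace`, and all of their fields are hypothetical names chosen for illustration, so consult the SDK documentation for the actual API.

```python
from dataclasses import dataclass, field
from typing import List
import uuid


@dataclass
class Trace:
    """One recorded activity, for example a single request/response turn."""
    name: str
    input_text: str
    output_text: str
    id: str = field(default_factory=lambda: uuid.uuid4().hex)


@dataclass
class Session:
    """The overarching context, such as one multi-turn conversation."""
    user_id: str
    id: str = field(default_factory=lambda: uuid.uuid4().hex)
    traces: List[Trace] = field(default_factory=list)

    def add_trace(self, name: str, input_text: str, output_text: str) -> Trace:
        # Every trace is attached to the session it belongs to.
        trace = Trace(name=name, input_text=input_text, output_text=output_text)
        self.traces.append(trace)
        return trace


# Hypothetical usage: one conversation, two turns.
session = Session(user_id="user-123")
session.add_trace("turn-1", "Where is my order?", "It ships tomorrow.")
session.add_trace("turn-2", "Can I change the address?", "Yes, until it ships.")
print(len(session.traces))  # 2
```

The point of the sketch is only the shape of the data: one session groups many traces, so a whole conversation can be inspected as a unit.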

You can also do detailed tracing and debugging with online evaluation:

  1. Access a trace or session view for comprehensive analysis.

  2. Set up a continuous evaluation with custom rules:

    • Go to the Evaluation tab at the top.

    • Click on Configure evaluation.

    • Specify the sampling rate to determine the percentage of traces to run evaluation on.

    • Apply filters to evaluate only the queries that meet specific criteria, e.g., run evaluation only on traces with low user feedback (<3 ⭐️).

    • Activate the desired evaluators. Note that only evaluators that do not require ground truth are available for online evaluation.

  3. Click the Save button to begin the evaluation.

  4. The first time you set up an evaluation, you also need to enable the Active toggle.

After making the evaluation active, all your new logs will start being evaluated based on your sampling rate.
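As a rough illustration of how a sampling rate and a filter combine, the sketch below decides per trace whether it would be evaluated. This is not Maxim's implementation: the function names, the `user_rating` field, and the choice to apply the filter before sampling are all assumptions made for the example.

```python
import random
from typing import Callable, Dict, Optional


def should_evaluate(
    trace: Dict,
    sampling_rate: float,
    trace_filter: Callable[[Dict], bool],
    rng: Optional[random.Random] = None,
) -> bool:
    """Decide whether one trace is evaluated.

    sampling_rate is a percentage between 0 and 100.
    """
    if not 0 <= sampling_rate <= 100:
        raise ValueError("sampling_rate must be between 0 and 100")
    # Assumption: filters are applied first, then sampling.
    if not trace_filter(trace):
        return False
    draw = (rng or random).random()  # uniform in [0.0, 1.0)
    return draw * 100 < sampling_rate


def low_feedback(trace: Dict) -> bool:
    """Hypothetical filter: keep traces rated below 3 stars."""
    return trace.get("user_rating", 5) < 3


# Synthetic traces with ratings cycling from 1 to 5.
traces = [{"id": i, "user_rating": i % 5 + 1} for i in range(1000)]
rng = random.Random(0)  # seeded so repeated runs pick the same traces
selected = [t for t in traces if should_evaluate(t, 25, low_feedback, rng)]
print(f"{len(selected)} of {len(traces)} traces selected")
```

With a 25% sampling rate, roughly a quarter of the traces that pass the filter are selected; a rate of 100 selects all of them and a rate of 0 selects none.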

Dataset splits

 

New

  

Maxim now offers the ability to create dataset splits directly from a single dataset, allowing you to tailor subsets for specific use cases and run test evaluations on these customized splits.

dataset_split.gif

Key Value Add:

Previously, managing multiple datasets was necessary, but with this new feature, you can bring a single dataset, create multiple splits, and evaluate each split as needed.

Any changes made to the main dataset, such as deletions or rearrangements, will automatically be reflected in the split datasets if the affected entries are included in the split.
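One way to picture this behavior is a split that stores references to entries rather than copies of them. The sketch below is a conceptual model only, with hypothetical `Dataset` and `Split` classes; it makes no claim about how Maxim stores splits internally.

```python
from typing import Dict, Iterable, List


class Dataset:
    """Main dataset: maps an entry ID to its row."""

    def __init__(self, entries: Dict[str, dict]) -> None:
        self.entries = entries


class Split:
    """A named subset that holds entry IDs, not copies of the rows."""

    def __init__(self, dataset: Dataset, name: str, entry_ids: Iterable[str]) -> None:
        self.dataset = dataset
        self.name = name
        self.entry_ids = list(entry_ids)

    def rows(self) -> List[dict]:
        # Rows are resolved at read time, so edits and deletions made
        # in the main dataset show up here automatically.
        return [
            self.dataset.entries[entry_id]
            for entry_id in self.entry_ids
            if entry_id in self.dataset.entries
        ]


dataset = Dataset({
    "e1": {"Input": "Reset my password"},
    "e2": {"Input": "Cancel my plan"},
    "e3": {"Input": "Update billing address"},
})
split = Split(dataset, "account-issues", ["e1", "e3"])
print(len(split.rows()))  # 2

del dataset.entries["e3"]  # delete an entry from the main dataset
print(len(split.rows()))  # 1
```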

How to Create a Split:

  1. Go to any dataset.
  2. Click on the Add split button at the top.
  3. Select the entries you want to include in the split.
  4. Click the Add n entries to split button (where n represents the number of selected rows). Alternatively, you can:
    • Search and select entries: After selecting the desired entries, directly click the Add n entries to split button.
    • Filter and select entries: After filtering and selecting the entries, directly click the Add n entries to split button.
  5. Provide a split name and description, then click the Update button to finalize the split.

This approach offers flexibility in creating splits by allowing you to search, filter, and select entries efficiently.

Once the split is created, you can include it in your test configuration for evaluation.

✅ 3 new context evaluators

 

New

  

Maxim has introduced three new evaluators to enhance the assessment of context quality and retriever performance:

  1. Context Relevancy
  2. Context Precision
  3. Context Recall

context_evals.gif

Key Value Add: These evaluators go beyond providing scores—they also offer detailed reasoning for the results. They also consider context ranking, providing transparency and clarity that other context evaluators don’t offer. This feature ensures you understand not just the score but also why the score was given.
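To show why ranking matters for context metrics, here is a generic sketch of a rank-aware precision score and a simple recall score. These are common textbook-style formulations, not Maxim's evaluators: the function names and the assumption that relevance labels are already available are illustrative only.

```python
from typing import List, Set


def context_precision(relevance: List[bool]) -> float:
    """Rank-aware precision over retrieved chunks, in ranked order.

    Averages precision@k over every rank k that holds a relevant chunk,
    so relevant chunks near the top raise the score.
    """
    hits = 0
    total = 0.0
    for k, is_relevant in enumerate(relevance, start=1):
        if is_relevant:
            hits += 1
            total += hits / k
    return total / hits if hits else 0.0


def context_recall(retrieved: List[str], needed: Set[str]) -> float:
    """Fraction of the needed items that appear among the retrieved ones."""
    if not needed:
        return 0.0
    return len(needed & set(retrieved)) / len(needed)


# Same two relevant chunks, different ranking.
print(round(context_precision([True, False, True]), 3))  # 0.833
print(round(context_precision([False, True, True]), 3))  # 0.583

# Two items were needed, one of them was retrieved.
print(context_recall(["a", "b", "c"], {"a", "d"}))  # 0.5
```

Both rankings contain the same two relevant chunks, yet the one that places a relevant chunk first scores higher, which is the effect a ranking-aware evaluator captures.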

To use these evaluators:

  1. Navigate to the Evaluator Store.
  2. Add these evaluators to your workspace.
  3. Toggle them on during your testing sessions.

➡️ Data curation from Logs -> dataset is now available

 

New

  

Maxim now offers a new feature for curating datasets directly from production logs, making it seamless to curate finetuning and eval datasets that follow production distribution.

data_curation.gif

With this feature, users can:

  • Filter Logs: Apply filters based on user feedback, model type, and many more options to select specific target logs.
  • Create Datasets: Curate datasets for evaluation and fine-tuning from the selected log entries.

How to Curate Data:

  1. Go to the Logs section of any log repository that you have added (or add a new one by clicking the plus icon at the top) and view by traces.
  2. Add a filter to select the desired entries.
  3. Click Add to dataset in the top-right corner.
  4. This opens a dropdown where you can choose to add these entries to an existing dataset or create a new dataset.
  5. Map the columns and finally click Add to dataset.

This feature streamlines the process of creating datasets from production logs, enabling a complete feedback loop for curating fine-tuning and evaluation datasets.
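The filter-then-map flow can be sketched in a few lines of Python. This is an illustration of the idea rather than Maxim's API: the `curate` function, the log field names, and the column names in `column_map` are all hypothetical.

```python
from typing import Callable, Dict, Iterable, List


def curate(
    logs: Iterable[Dict],
    keep: Callable[[Dict], bool],
    column_map: Dict[str, str],
) -> List[Dict]:
    """Filter log entries, then map log fields onto dataset columns.

    column_map maps a dataset column name to the log field that fills it.
    """
    rows = []
    for log in logs:
        if keep(log):
            rows.append({column: log.get(source) for column, source in column_map.items()})
    return rows


# Hypothetical production logs.
logs = [
    {"input": "Reset my password", "output": "Use the reset link.", "model": "gpt-4o-mini", "feedback": 5},
    {"input": "Cancel my plan", "output": "I cannot help.", "model": "gpt-4o-mini", "feedback": 1},
    {"input": "Track my order", "output": "It arrives Friday.", "model": "llama-3.1", "feedback": 4},
]


def well_rated(log: Dict) -> bool:
    """Keep only entries with user feedback of 4 stars or more."""
    return log["feedback"] >= 4


dataset_rows = curate(logs, well_rated, {"Input": "input", "Expected Output": "output"})
print(len(dataset_rows))  # 2
```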

📊 Run tests directly on datasets

 

New

  

Maxim now enables direct testing of your datasets, adding a new column type, "Output," to simplify the process. With this latest update, you can bring your GenAI applications to Maxim through workflows, context sources, and datasets.

To get started, you'll need to bring in your dataset with three key columns: Input, Output, and Expected Output.
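As a small example of what such a dataset could look like, the sketch below builds a CSV with the three columns using Python's standard `csv` module. The rows are made up, and whether Maxim's import expects exactly these header names or this file format is an assumption; check the dataset import documentation before relying on it.

```python
import csv
import io

# Illustrative rows: what was asked, what the application produced,
# and what the ideal answer would have been.
rows = [
    {
        "Input": "What is the capital of France?",
        "Output": "Paris",
        "Expected Output": "Paris",
    },
    {
        "Input": "What is 2 + 2?",
        "Output": "5",
        "Expected Output": "4",
    },
]

buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=["Input", "Output", "Expected Output"])
writer.writeheader()
writer.writerows(rows)

csv_text = buffer.getvalue()
print(csv_text)
```

To write a file instead of printing, replace the `StringIO` buffer with `open("dataset.csv", "w", newline="")`.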

To test your dataset in Maxim:

  1. Click on the "Test" button.
  2. You'll see an option to add context—this is optional but useful if you want to evaluate context. If so, ensure your dataset includes a "Context" column.
  3. Choose the evaluators.
  4. Trigger a test run.
  5. Review the results in the "Runs" tab.

dataset_testing.gif

✨ Llama 3.1 is now available on Maxim

 

New

  

You can now use Llama 3.1 (all variants) on Maxim via two providers:

  • Together provider
  • Groq provider

🌎 Domain Discovery is here!

 

New

  

Maxim now offers Domain Discovery, a powerful new feature to streamline user onboarding for larger teams and organizations.

With Domain Discovery, users can:

  • Find existing accounts associated with their organization's domain
  • Request to join these accounts directly within Maxim

Key benefits:

  • Reduces administrative overhead for account management
  • Accelerates team collaboration by quickly integrating new members

To enable Domain Discovery:

  1. As a Super Admin of the account, you can go to Settings > Account Info.
  2. You will see a toggle to enable domain discovery for your account.

domain-discovery.gif

To use Domain Discovery:

  1. During sign-up, log in using your work email.
  2. Maxim will display any existing accounts associated with that domain.
  3. Select the appropriate account and submit a join request.
  4. Account administrators can approve or deny requests from their dashboard.

Domain Discovery is available for all Maxim users and is particularly valuable for enterprises and growing teams looking to manage their Maxim workspace efficiently.

⭐️ GPT-4o mini is available on Maxim

 

New

  

GPT-4o mini is

  • 3x cheaper than GPT-3.5-turbo.
  • 2x cheaper than Gemini-Flash.
  • 40% cheaper than Haiku.

And it's available on the Maxim platform now:

Go to Settings > Model Config > OpenAI or Azure > Select GPT 4o Mini


🔗 Prompt chain supports 2 new blocks

 

Improvement

  
  • Prompt chains are the best way to create different scenarios for your workflows. We've enhanced this feature with two new block types:
  1. Code block: Write any JS code snippet that accepts an input and responds with an output.
  2. API block: Use any API call to process data in a prompt chain.

Jul-02-2024 3-19-39 PM.gif

✨ New models, including Claude 3.5 Sonnet, are now available

 

New

  
  • We have added support for 4 new models across providers
    • Anthropic provider
      • Claude 3.5 Sonnet
    • Bedrock provider
      • Claude Instant 1.2
      • Claude 2.1
      • Claude 3 Sonnet
      • Claude 3.5 Sonnet