Blog Post Submission: Tracking and Debugging AI Safety Evaluations with Inspect AI and MLflow #532
Open
Labels
ack/guide: I have read through and am familiar with the contributing guide
ack/legal: I have read and understand the legal considerations for blog posting
ack/readme: I have configured my local development environment for building the website locally
blog/deep-dive: I want to write an in-depth guide blog
topic/advanced: I'm writing about advanced features or the plugin system of MLflow
topic/genai: I'm writing about GenAI use cases or features
topic/tracking: I'm writing about MLflow tracking
topic/ui: I'm writing about the MLflow UI
Acknowledgements
ack/guide: I have read through the contributing guide
ack/readme: I have configured my local development environment so that I can build a local instance of the MLflow website by following the development guide
ack/legal: I have verified that there are no legal considerations associated with the nature of the blog post, its content, or references to organizations, ideas, or individuals contained within my post. JJ Allaire (Inspect AI lead) has publicly endorsed the package on the Inspect Community Slack and added it to the Inspect AI Extensions page and Scout documentation.
Proposed Title
Tracking and Debugging AI Safety Evaluations with Inspect AI and MLflow
Abstract
This post covers how to use MLflow's tracking and tracing capabilities with Inspect AI, the UK AI Security Institute's open-source evaluation framework (16M+ monthly PyPI downloads). The inspect-mlflow package provides auto-registering hooks that give users hierarchical experiment tracking, execution tracing with span-level visibility into model calls and tool usage, and a Scout import source for safety analysis. The package was built across 4 merged PRs to Inspect AI and published to PyPI after JJ Allaire (Inspect AI lead, creator of RStudio) requested standalone distribution.
Blog Type
blog/deep-dive: An in-depth guide that covers a specific feature in MLflow
Topics Covered in Blog
topic/genai: Highlights MLflow's use in training, tuning, or deploying GenAI applications
topic/tracking: Covering the use of Model Tracking APIs and integrated Model Flavors
topic/advanced: Featuring guides on Custom Model Development or usage of the plugin architecture of MLflow
topic/ui: Covering features of the MLflow UI
Additional Context