Machine Learning (ML) models are developed and maintained iteratively, with continuous retraining, which results in multiple versions. ML model governance reduces the risk of losing track of which model version is running in production, who authorized it, and when it was operationalized. However, governance can come at the expense of agile development, in which data science teams iteratively develop ML models and collaborate with data analysts, QA, and production support teams. Approval Workflows are a key feature for mitigating this risk while preserving a collaborative, agile team environment.
ML Model Governance
ML models are stochastic models whose parameters are determined by the training data. The same raw data may yield different versions of training data due to transformations and/or feature development. The training data version is one of three variables in ML model deployment; the other two are the algorithm code version and the hyperparameter set. Together, these three drive iterative, agile ML development, which is enabled by ML pipelines. The upshot is that, depending on the data size and model complexity, there may be a Cambrian explosion of possible operational ML models. Enterprises therefore require strong ML model governance to keep track of which model is currently in production and who authorized the last production release.
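To make the combinatorics concrete, here is a minimal sketch that enumerates candidate models from the three variables described above and derives a stable identifier for each. All version labels, hyperparameter names, and the `model_id` helper are hypothetical and chosen only for illustration.

```python
import hashlib
import itertools
import json

# Hypothetical version labels; none of these come from a real project.
data_versions = ["data-v1", "data-v2", "data-v3"]
code_versions = ["algo-1.0", "algo-1.1"]
hyperparameter_sets = [
    {"learning_rate": 0.1, "max_depth": 3},
    {"learning_rate": 0.01, "max_depth": 5},
]


def model_id(data_version, code_version, hyperparameters):
    """Derive a short, deterministic identifier from the three inputs."""
    canonical = json.dumps(
        {"data": data_version, "code": code_version, "hp": hyperparameters},
        sort_keys=True,
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:12]


# Every combination of the three variables is a distinct candidate model.
candidates = {
    model_id(d, c, h): (d, c, h)
    for d, c, h in itertools.product(
        data_versions, code_versions, hyperparameter_sets
    )
}

print(len(candidates))  # 3 * 2 * 2 = 12 candidate models
```

Even with only a handful of versions per variable, the number of candidates grows multiplicatively, which is why an identifier tied to all three inputs is a useful starting point for tracking what is in production.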
Approval Workflows are processes designed to ensure the authorized flow of information and/or artefacts in organizations. Such workflows are familiar from everyday enterprise activities such as applications for leave, expense reimbursements, equipment allocation and the like. There are industry-standard best practices for these workflows, which each organization modifies to suit its needs and culture. They have also become commonplace in software development, for example:
- A developer issues a “Pull Request” (PR) to a Tech Lead / Manager. The code is reviewed and corrected before being merged into the release branch of the code repository.
- Managers review unit test/integration test reports before releasing code for the QA team to test. Additional tests may be required before the code is deemed ready for QA.
- A Release Manager reviews QA reports before releasing software to a production system.
Approval Workflows are based on the concepts of Roles and Requests. A workflow typically defines a Request (to be performed by a specific Role), and multiple levels of approvals to be provided by other Roles. In the first example provided above, the Developer Role made a Pull Request to be approved by a Manager Role.
In addition, an ideal workflow should be customizable to suit the needs of the organization and should maintain a log of requests and approvals (by whom and when), thus providing a complete audit trail.
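The Role and Request concepts, together with the audit trail, can be sketched in a few lines. This is an illustrative model under stated assumptions, not the API of any particular workflow product: the `ApprovalRequest` class, its field names, and the user names are all hypothetical.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class ApprovalRequest:
    """A request raised by one Role and approved, level by level, by others."""

    artefact: str
    requested_by: str
    requester_role: str
    approver_roles: list  # one entry per approval level, in order
    log: list = field(default_factory=list)
    level: int = 0  # index of the next approval level still pending

    def __post_init__(self):
        self._record("requested", self.requested_by, self.requester_role)

    def _record(self, action, user, role):
        # Each entry captures who did what, in which role, and when.
        self.log.append({
            "action": action,
            "user": user,
            "role": role,
            "at": datetime.now(timezone.utc).isoformat(),
        })

    @property
    def is_approved(self):
        return self.level == len(self.approver_roles)

    def approve(self, user, role):
        if self.is_approved:
            raise ValueError("request is already fully approved")
        expected = self.approver_roles[self.level]
        if role != expected:
            raise PermissionError(f"level {self.level} requires role {expected!r}")
        self.level += 1
        self._record("approved", user, role)


# The Pull Request example: a Developer requests, a Manager approves.
pr = ApprovalRequest("pull-request", "alice", "Developer", ["Manager"])
pr.approve("bob", "Manager")
print(pr.is_approved)
```

Because the list of approver Roles is plain data, an organization could customize the number and order of approval levels without changing the logic, and the log gives the by-whom-and-when record described above.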
Such Approval Workflows need to extend into the world of MLOps as well. We provide an example below for ML model deployment, with a workflow that is familiar to managers involved in the operationalization of ML models.
ML Model Governance with Approval Workflows
As shown in Fig. 1, a data scientist completes a model (after multiple iterations with ML pipelines) using an MLOps platform such as xpresso. Next, a manager reviews the model and provides feedback. Once the manager approves the model, it is promoted to QA in xpresso. In QA, a data analyst (or any QA specialist) tests the model. If the model has issues, the data analyst provides feedback to the data scientist, the data scientist reworks the model, and the cycle repeats. Once the model passes QA testing, the data analyst approves it and promotes it to production in xpresso. In production, an IT engineer deploys the model. All workflow roles, responsibilities, and subsequent actions in xpresso are labeled, identified, and controlled using Approval Workflows.
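The promotion flow described above can be read as a small state machine. The sketch below is one way to express it and is not xpresso's API: the stage names, role names, `TRANSITIONS` table, and `ModelPromotion` class are assumptions made for illustration.

```python
from datetime import datetime, timezone

# (current stage, action) -> (role allowed to act, next stage)
# Stage and role names are hypothetical labels for the steps in the text.
TRANSITIONS = {
    ("dev", "approve"): ("manager", "qa"),
    ("dev", "reject"): ("manager", "dev"),
    ("qa", "approve"): ("data_analyst", "production"),
    ("qa", "reject"): ("data_analyst", "dev"),
    ("production", "deploy"): ("it_engineer", "deployed"),
}


class ModelPromotion:
    """Tracks one model through dev, QA, and production with an audit log."""

    def __init__(self, model_name):
        self.model_name = model_name
        self.stage = "dev"
        self.audit_log = []

    def act(self, user, role, action):
        key = (self.stage, action)
        if key not in TRANSITIONS:
            raise ValueError(f"{action!r} is not valid in stage {self.stage!r}")
        required_role, next_stage = TRANSITIONS[key]
        if role != required_role:
            raise PermissionError(f"{action!r} in {self.stage!r} requires {required_role!r}")
        # Record who acted, in which role, and when, before moving on.
        self.audit_log.append({
            "user": user,
            "role": role,
            "action": action,
            "from": self.stage,
            "to": next_stage,
            "at": datetime.now(timezone.utc).isoformat(),
        })
        self.stage = next_stage


promotion = ModelPromotion("example-model")
promotion.act("dana", "manager", "approve")        # dev -> qa
promotion.act("quinn", "data_analyst", "reject")   # qa -> dev, back for rework
promotion.act("dana", "manager", "approve")        # dev -> qa again
promotion.act("quinn", "data_analyst", "approve")  # qa -> production
promotion.act("ira", "it_engineer", "deploy")      # production -> deployed
print(promotion.stage)  # deployed
```

Keeping the transitions in a single table makes the permitted actions per role explicit, and the audit log preserves the rejected QA cycle alongside the final approvals.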
In conclusion, the Approval Workflow demonstrated here maintains a complete log of approvals (by whom and when) and promotions (dev to QA and QA to production) for a production ML model. Such model governance improves the ML production lifecycle through streamlined processes and accountability in a collaborative environment.