When machine learning models in production become critical to business functions, new requirements emerge to ensure quality and to comply with legal and regulatory obligations. Because deploying or modifying a model can have far-reaching impacts, establishing clear practices helps ensure consistent management and minimize risk.
Model governance sets the rules and controls for deployments, including access control, testing and validation, change and access logs, and traceability of prediction results. With model governance in place, organizations can scale deployments and provide legal and compliance reports.
Scaling the use and value of models in production requires a robust and repeatable production process, including clearly defined roles, procedures, and logging. A consistent process dramatically reduces an organization’s operational, legal, and regulatory risk. Additionally, logging shows that rules were followed and supports troubleshooting to resolve issues quickly, which increases trust and value from AI projects.
Aspects of governance
Model governance for MLOps includes various components:
Roles and responsibilities: One of the first steps in production model governance is to establish clear roles with duties within the production model lifecycle. Users may have more than one role. MLOps admins are central to maintaining model governance within an organization.
Access control: To maintain control over production environments, access to production models and environments must be limited. Limitations can be implemented at the individual user level or via role-based access control (RBAC). In either case, a limited number of people will have the ability to update production data for model training, deploy production models, or modify production environments.
Deployment testing and validation: To ensure quality in production, processes should include testing and validation of each new or refreshed model before deployment. These tests and their results should be logged to show that the model was deemed ready for production use. Testing information will be required for model approval.
Model history: Models will change over time as they are updated and replaced in production. Maintenance of the complete model history, including model artifacts and changelogs, is critical for legal and regulatory needs. The ability to understand when a change was made and by whom is critical for compliance but is also very useful for troubleshooting when something goes wrong.
Humility rules: Humility rules can be configured so that a model recognizes, in real time, when it makes an uncertain prediction or receives data it has not seen before. Unlike data drift, model humility does not deal with broad statistical properties over time. Instead, it is triggered for individual predictions, allowing you to set desired behaviors with rules that depend on different triggers.
Fairness monitoring: Fairness monitoring can be configured so that a model recognizes when protected features fail to meet predefined fairness criteria. Fairness testing of production models is triggered by individual predictions; however, any predictions made within the last 30 days are also taken into account.
Traceable model results: Each model result must be attributable back to the model and model version that generated that result to meet legal and regulatory compliance obligations. Traceability is especially critical because of the dynamic nature of the production model lifecycle that results in frequent model updates. At the time of a legal or regulatory filing, which could be months after an individual model response, the model in production may not be the same as the model used to create the prediction in question. A record of request data and response values with date and time information satisfies this requirement. Also, a model ID should be provided as part of the model response to make the tracking process easier.
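As an illustration of the traceability requirement, the following is a minimal sketch of a prediction wrapper that records request data, response value, and a UTC timestamp, and returns the model ID with the response. The names `PredictionRecord` and `traced_predict`, and the in-memory `audit_log` list, are hypothetical and stand in for whatever logging backend an organization actually uses.

```python
import json
import uuid
from dataclasses import asdict, dataclass
from datetime import datetime, timezone
from typing import Any, Callable, Dict, List


@dataclass(frozen=True)
class PredictionRecord:
    """One auditable row: what was asked, what was answered, and by which model."""
    request_id: str
    model_id: str
    model_version: str
    timestamp_utc: str
    request_data: Dict[str, Any]
    response_value: Any


def traced_predict(
    predict_fn: Callable[[Dict[str, Any]], Any],
    model_id: str,
    model_version: str,
    request_data: Dict[str, Any],
    audit_log: List[str],
) -> Dict[str, Any]:
    """Call the model, append a JSON audit line, and return a traceable response."""
    value = predict_fn(request_data)
    record = PredictionRecord(
        request_id=str(uuid.uuid4()),
        model_id=model_id,
        model_version=model_version,
        timestamp_utc=datetime.now(timezone.utc).isoformat(),
        request_data=request_data,
        response_value=value,
    )
    # Hypothetical sink: a real system would write to durable, append-only storage.
    audit_log.append(json.dumps(asdict(record), sort_keys=True))
    # The model ID and version travel with the response so callers can store them.
    return {
        "prediction": value,
        "model_id": model_id,
        "model_version": model_version,
        "request_id": record.request_id,
    }
```

Because each log line carries the model ID and version, a prediction can still be attributed to the model that produced it after that model has been replaced in production.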
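The access control item in the list above can be sketched as a small role-based check. The role names, the `Action` values, and the `ROLE_PERMISSIONS` mapping are assumptions made for illustration, not the permission model of any particular platform. The check accepts a set of roles because, as noted under roles and responsibilities, a user may hold more than one.

```python
from enum import Enum
from typing import Dict, FrozenSet, Set


class Action(Enum):
    UPDATE_TRAINING_DATA = "update_training_data"
    DEPLOY_MODEL = "deploy_model"
    MODIFY_ENVIRONMENT = "modify_environment"
    VIEW_PREDICTIONS = "view_predictions"


# Hypothetical role definitions; only a small set of roles can change production.
ROLE_PERMISSIONS: Dict[str, FrozenSet[Action]] = {
    "mlops_admin": frozenset(Action),  # every action
    "model_deployer": frozenset({Action.DEPLOY_MODEL, Action.VIEW_PREDICTIONS}),
    "consumer": frozenset({Action.VIEW_PREDICTIONS}),
}


def is_allowed(user_roles: Set[str], action: Action) -> bool:
    """Return True if any of the user's roles grants the action."""
    return any(
        action in ROLE_PERMISSIONS.get(role, frozenset())
        for role in user_roles
    )
```

Unknown roles grant nothing, so the check denies by default.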
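To make the per-prediction nature of humility rules concrete, here is a minimal sketch of one possible rule: a trigger that fires when a predicted probability falls inside an "uncertain" band, with an optional override action. The `UncertaintyRule` class, its band limits, and the override behavior are hypothetical choices for illustration; an actual product may define its triggers and actions differently.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class UncertaintyRule:
    """Trigger when a probability lies in [lower, upper]; optionally override it."""
    lower: float
    upper: float
    override: Optional[float] = None  # None means "flag only, keep the prediction"


def apply_humility(probability: float, rule: UncertaintyRule) -> Tuple[float, bool]:
    """Evaluate the rule for ONE prediction and return (final_prediction, triggered)."""
    triggered = rule.lower <= probability <= rule.upper
    if triggered and rule.override is not None:
        return rule.override, True
    return probability, triggered
```

The rule looks only at the single prediction it is given, which is what distinguishes it from drift monitoring over aggregated statistics.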
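The fairness monitoring item describes a check that runs per prediction but also considers predictions from the last 30 days. One way this could work is sketched below, assuming a proportional-parity style criterion: each protected group's favorable-outcome rate must be at least a given fraction of the highest group's rate. The 0.8 threshold, the function names, and the tuple layout are assumptions for illustration only.

```python
from collections import defaultdict
from datetime import datetime, timedelta
from typing import Dict, Iterable, Tuple

# (timestamp, protected_group, favorable_outcome)
Prediction = Tuple[datetime, str, bool]


def favorable_rates(
    preds: Iterable[Prediction], now: datetime, window_days: int = 30
) -> Dict[str, float]:
    """Favorable-outcome rate per group, using only predictions inside the window."""
    cutoff = now - timedelta(days=window_days)
    totals: Dict[str, int] = defaultdict(int)
    favorable: Dict[str, int] = defaultdict(int)
    for ts, group, outcome in preds:
        if ts >= cutoff:
            totals[group] += 1
            favorable[group] += int(outcome)
    return {g: favorable[g] / totals[g] for g in totals}


def passes_parity(rates: Dict[str, float], threshold: float = 0.8) -> bool:
    """True if every group's rate is at least `threshold` times the highest rate."""
    if not rates:
        return True
    best = max(rates.values())
    if best == 0:
        return True  # no group received a favorable outcome
    return all(rate / best >= threshold for rate in rates.values())
```

In this sketch, a new prediction would be appended to the history and both functions re-run, so a single prediction triggers the test while the 30-day window supplies the context.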