Introduction

In regulated industries such as banking, finance, and insurance, Model Development Documents (MDDs) play a crucial role. MDDs help model validators confirm that models comply with regulations and function as intended. As machine learning and artificial intelligence models become more prevalent, documentation standards such as MDDs will likely evolve to build confidence and trust in them.

This guide explains MDDs, their importance, the stakeholders involved, and best practices for creating and maintaining them.

‍

What is a Model Development Document?

A Model Development Document is a comprehensive report that covers a model's purpose and scope, the data used for training, model design and methodology, performance metrics and testing methodology, deployment approach, and risk mitigation mechanisms. In banks that independently validate models to mitigate model risk, it is also the cornerstone from which the model validation team creates a Model Validation Document (MVD), and it is shared with auditors. The MDD serves several major internal and external stakeholders: model development teams, business stakeholders, validators, and regulators. It’s typically written by the model developer.

The overall purpose of this document is to provide a comprehensive and transparent record of the model's iterative development and design process, facilitate monitoring and maintenance of deployed models, support effective risk management, and ensure that the bank meets regulatory standards.

‍

Importance of MDD

Transparency and Traceability: MDDs provide transparency and traceability into the AI system's capabilities, data inputs, decision-making processes, and potential biases or risks. Transparency and traceability are critical to building trust and accountability across the full end-to-end modeling lifecycle. Additionally, uniformly requiring MDDs for all models gives a holistic understanding of the diversity and complexity of the organization's models, which serves as an important input into model risk management. The availability of MDDs also supports an agile and effective response to both regulatory requirements and ad hoc requests.
Regulatory Compliance: MDDs are essential for meeting regulatory requirements, particularly under SR 11-7 in the United States. Other regulations, such as SS1/23 in the UK and Canada’s E23, which goes into effect July 1, 2025, share the same overarching principles as SR 11-7. However, E23 has a broader mandate, affecting not only banks but also life insurance companies, property and casualty companies, and trust and loan companies. MDDs make it easier for independent validators and regulators to assess AI/ML models for conformance to evolving mandates and regulations.
Risk Mitigation: Model risk can originate from poor specifications, incorrect parameter estimates, flawed hypotheses, poor assumptions, mathematical computation errors, inaccurate/inappropriate/incomplete data, unintended usage, and inadequate monitoring and controls. These risks introduce myriad adverse financial and reputational consequences. MDDs provide a consolidated view and oversight of the AI/ML models throughout the full lifecycle, which reduces model risk and exposure.
Validation Requirement: The Model Risk Management (MRM) team requires MDDs to validate models. These documents help ensure that models are accurate, reliable, and compliant with regulatory standards.
Responsible AI Practices: Thorough documentation enables organizations to analyze and mitigate potential risks, such as bias, privacy violations, or misuse of the AI system. Additionally, it supports governance by providing a clear framework for accountability, compliance with regulations, and the establishment of ethical guidelines throughout the AI lifecycle.
Knowledge Transfer: Proper documentation facilitates the sharing of knowledge, insights, techniques, and successes within and between technical and non-technical teams. This promotes continuous learning and improvement, ensuring valuable information is accessible to both model validation and business teams.

MDDs in Practice

Typically, the data scientist or model developer is responsible for creating and maintaining the MDD. They document the entire model development process, methodologies used, and results obtained. Their manager establishes and enforces best practices for documentation within the team. Lastly, the model validator guides the modeler on what information is required in the document and ensures completeness. Model validators use the MDD to create a Model Validation Document (MVD).

Validation and MRM teams use the MDD to easily validate modeling team results, challenge the developed models and ensure the integrity of the AI/ML models. Internal regulators use MDDs and MVDs to verify that models meet internal compliance standards, while external regulators ensure models comply with external regulatory requirements.

Business stakeholders review MDDs to understand the model and verify that it meets their business requirements. MLOps teams use MDDs to deploy and monitor the models. Data science teams use MDDs for model governance, knowledge sharing, and project management.

‍

Contents of an MDD

The MDD is structured into several key sections to provide a comprehensive overview of a model's lifecycle and its various components. MDDs can be lightweight or very comprehensive based on the needs of the organization. Below is an overview of the main sections that go into a complete MDD:

1. Executive Summary

The Executive Summary provides a high-level overview of the model, highlighting its purpose, scope, key stakeholders, and the intended business impact. This section sets the stage for the detailed contents of the MDD, offering a snapshot that decision-makers can quickly grasp.

Project Details

This section delves into the specifics of the project, providing context and background necessary to understand the model’s development and application.

Project Overview

A broad description of the project, outlining the objectives, expected outcomes, and any significant milestones achieved during the development phase.

Model Purpose and Use

This component clearly defines the model's intended purpose and its application area, specifying relevant legal entities and business requirements. It discusses any restrictions on model usage, including limitations related to specific business areas, and outlines the scope of the model, covering included products and businesses, as well as any notable exclusions. Additionally, it provides a brief assessment of the model's complexity and identifies any other models or teams that might leverage its output.

Model Stakeholders

Identification and roles of key stakeholders involved in the model development, deployment, and maintenance processes.

Model Summary

A concise summary of the model, including its type, structure, and primary functionalities.

Additional Model Information

Any supplementary information pertinent to the model, such as assumptions, dependencies, or external factors that could influence its performance or application.

‍

2. Data Overview

A comprehensive examination of the data used in the model, from initial request and extraction to final preparation for training.

Data Request and Extraction

Provide details about the data request process, including the source, type of data requested, and extraction methods.

Training Data Overview

An assessment of the data quality and completeness, ensuring that the dataset is robust and reliable for model training.

Data Special Processing

A description of any special processing applied to the data before manipulation, including:

Exclusion/Filtering: Criteria used to exclude or filter out irrelevant or unreliable data.
Data Preparation: Techniques for handling missing values, backfilling, and outlier removal.
Data Transformation: Methods for normalizing, scaling, or encoding categorical variables.
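
As an illustration of how these three kinds of processing might be recorded for the MDD, the following is a minimal sketch rather than a prescribed method. The field names (`income`, `is_test`), the test-account exclusion rule, and the choice of median imputation and min-max scaling are all assumptions made for the example. The point is that each step writes a line to a log that can later be included in the Data Special Processing section.

```python
from statistics import median

def prepare(rows, log):
    """Apply exclusion, missing-value handling, and scaling; record each step in `log`."""
    # Exclusion/Filtering: drop records flagged as test accounts (hypothetical rule).
    kept = [dict(r) for r in rows if not r["is_test"]]
    log.append(f"Excluded {len(rows) - len(kept)} test-account rows")

    # Data Preparation: fill missing income with the median of observed values.
    observed = [r["income"] for r in kept if r["income"] is not None]
    fill = median(observed)
    missing = sum(1 for r in kept if r["income"] is None)
    for r in kept:
        if r["income"] is None:
            r["income"] = fill
    log.append(f"Imputed {missing} missing income values with median {fill}")

    # Data Transformation: min-max scale income to the range [0, 1].
    lo = min(r["income"] for r in kept)
    hi = max(r["income"] for r in kept)
    for r in kept:
        r["income_scaled"] = (r["income"] - lo) / (hi - lo) if hi > lo else 0.0
    log.append(f"Min-max scaled income using range [{lo}, {hi}]")
    return kept

# Illustrative records only; not real data.
rows = [
    {"income": 40000, "is_test": False},
    {"income": None, "is_test": False},
    {"income": 60000, "is_test": False},
    {"income": 1, "is_test": True},
]
log = []
prepared = prepare(rows, log)
for entry in log:
    print(entry)
```

Because the function copies each record before changing it, the original extract stays untouched, which keeps the documented "before" and "after" states comparable.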

‍

3. Model Details

An in-depth look at the model's technical specifications and methodologies.

Model Version Methodology

A description of the techniques and overall methodology used in the model's development, including model versioning practices.

Model Properties and Calibration

Details on key model parameters, their selection process, and fine-tuning procedures such as hyper-parameter tuning and cross-validation.
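
To make the tuning and cross-validation part concrete, here is a small sketch in plain Python. The "model" is a deliberately trivial stand-in (the training mean multiplied by a `shrink` factor), and the data, the grid, and the fold count are assumptions chosen for illustration. The resulting `results` dictionary is the kind of table an MDD could report alongside the selected parameter.

```python
def k_fold_indices(n, k):
    """Yield (train_idx, test_idx) pairs for k contiguous folds over n samples."""
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = [i for i in range(n) if i < start or i >= start + size]
        yield train, test
        start += size

def cv_error(y, shrink, k=3):
    """Mean squared error of a shrunken-mean predictor, averaged over k folds."""
    fold_errors = []
    for train, test in k_fold_indices(len(y), k):
        # Toy model: predict the training mean scaled by the hyperparameter.
        prediction = shrink * sum(y[i] for i in train) / len(train)
        fold_errors.append(sum((y[i] - prediction) ** 2 for i in test) / len(test))
    return sum(fold_errors) / len(fold_errors)

# Illustrative target values and hyperparameter grid (assumptions).
y = [10.0, 12.0, 11.0, 13.0, 9.0, 11.0]
grid = [0.5, 0.9, 1.0]

# Cross-validated error for every candidate, then the candidate with the lowest error.
results = {shrink: cv_error(y, shrink) for shrink in grid}
best = min(results, key=results.get)
print(results, best)
```

Recording the full grid and its scores, not only the winner, lets a validator see what was tried and why the final value was chosen.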

Model Performance

Evaluation metrics and performance results that demonstrate how well the model meets its objectives.

Model Alternatives

Consideration of alternative models that were evaluated and the rationale for selecting the final model.

‍

4. Model Testing

Information on the testing processes to ensure the model's accuracy, robustness, and reliability.

Model Explainability

An analysis of the model’s explainability, including techniques used to explain predictions and the importance of different features.

Model Baseline Improvement and Selection

Steps taken to improve and select the best-performing model, comparing it against baseline models.

Model Sensitivity Analysis

An assessment of the model’s sensitivity to various inputs, helping to understand its stability and reliability.
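
One simple way to approach such an assessment is a one-at-a-time perturbation, sketched below. This is only one of several possible techniques, and everything specific here is hypothetical: the `score` function stands in for a trained model, and the feature names, coefficients, and 10% step are made up for the example.

```python
def score(features):
    """Hypothetical stand-in for a trained model: a fixed linear score."""
    return (
        0.5 * features["utilization"]
        + 0.02 * features["age_years"]
        - 0.1 * features["n_accounts"]
    )

def one_at_a_time_sensitivity(model, baseline, relative_step=0.10):
    """Change in output when each input is raised by `relative_step`, others held fixed."""
    base_output = model(baseline)
    deltas = {}
    for name, value in baseline.items():
        perturbed = dict(baseline)
        perturbed[name] = value * (1 + relative_step)
        deltas[name] = model(perturbed) - base_output
    return deltas

# Illustrative baseline input (assumption).
baseline = {"utilization": 0.4, "age_years": 35, "n_accounts": 3}
deltas = one_at_a_time_sensitivity(score, baseline)

# Rank inputs by the absolute size of their effect on the output.
ranked = sorted(deltas, key=lambda name: abs(deltas[name]), reverse=True)
print(ranked, deltas)
```

The ranked list and the per-input deltas can be reported directly in this section. Note that one-at-a-time perturbation ignores interactions between inputs, so it is a starting point rather than a complete analysis.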

Model Testing Framework

The framework and methodologies used for testing the model, ensuring it performs as expected under different conditions.

‍

5. Model Productionalization

Details on the deployment and operationalization of the model.

Model Deployment

A thorough description of how the model will be put into production, including deployment strategies and considerations.

Model Monitoring

Plans for ongoing monitoring of the model’s data and predictions, including preventive measures to ensure the accuracy and reliability of outputs.
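
As one example of a data-monitoring check that such a plan might describe, the sketch below computes the Population Stability Index (PSI) between the training distribution of a variable and its production distribution. The source does not name a specific metric, so PSI is an assumption here, as are the bin counts. Alert thresholds are left out on purpose because they are institution-specific.

```python
import math

def population_stability_index(expected_counts, actual_counts, floor=1e-6):
    """PSI between two binned distributions given as counts per bin."""
    expected_total = sum(expected_counts)
    actual_total = sum(actual_counts)
    psi = 0.0
    for e_count, a_count in zip(expected_counts, actual_counts):
        # Convert counts to proportions; the floor avoids log(0) for empty bins.
        e = max(e_count / expected_total, floor)
        a = max(a_count / actual_total, floor)
        psi += (a - e) * math.log(a / e)
    return psi

# Illustrative bin counts for one input variable (assumptions).
training_bins = [100, 300, 400, 200]
production_bins = [150, 300, 350, 200]

psi = population_stability_index(training_bins, production_bins)
print(f"PSI = {psi:.4f}")
```

A value of zero means the two distributions match bin for bin, and larger values indicate a larger shift. The MDD would state which variables are tracked, how often, and what value triggers a review.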

By systematically documenting each of these components, the MDD provides a transparent, detailed, and structured approach to understanding and managing the lifecycle of a machine learning model. This ensures that all stakeholders have a clear and comprehensive view of the model, facilitating better decision-making and adherence to regulatory requirements.

‍

‍

Approaches to Building MDDs

Model development teams often create model documentation retrospectively in general-purpose text editors, without integrating key assets. This failure to link essential assets to the documentation is a major pitfall in machine learning documentation.

Easy access to crucial assets such as datasets, model versions, graphs, and code is vital. This accessibility allows validators to review and analyze the main components in depth efficiently.

Teams typically adopt one of three approaches for developing MDDs:

Manual Documentation: This involves manually copying and pasting information into a word processing document, such as Microsoft Word. This method is time-intensive and susceptible to inconsistencies. It is also typically performed only after the model is completed, which loses the history of the iterations it took to arrive at the final model.
Semi-automatic Documentation: This approach utilizes specialized tools like LaTeX for semi-automated document generation. While offering improved consistency, it requires technical expertise in working with markup languages and document preparation systems. This is also typically only done after the final model is completed, again losing the history of the model development.
Automated Documentation: The last approach involves specialized software that automates repetitive documentation tasks. This reduces the manual effort required and makes the documentation more robust, at lower risk and lower cost.
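
To give a sense of what automating a repetitive documentation task can look like, here is a minimal sketch that fills one MDD section from a record of a training run, using Python's standard `string.Template`. It does not represent any particular documentation product. The run dictionary, its keys, the model and file names, and the section layout are all hypothetical.

```python
from string import Template

# Plain-text layout for one MDD section (hypothetical layout).
SECTION_TEMPLATE = Template(
    "Model Performance\n"
    "\n"
    "Model: $model_name (version $version)\n"
    "Training data: $dataset\n"
    "Metrics:\n"
    "$metric_lines\n"
)

def render_performance_section(run):
    """Fill the section template from a dict describing one training run."""
    metric_lines = "\n".join(
        f"  {name}: {value:.3f}" for name, value in sorted(run["metrics"].items())
    )
    return SECTION_TEMPLATE.substitute(
        model_name=run["model_name"],
        version=run["version"],
        dataset=run["dataset"],
        metric_lines=metric_lines,
    )

# Illustrative run record; in practice this would come from the training pipeline.
run = {
    "model_name": "credit_default_classifier",
    "version": "v3",
    "dataset": "loans_2023q4.csv",
    "metrics": {"auc": 0.8712, "recall": 0.64},
}
section = render_performance_section(run)
print(section)
```

Because the section is regenerated from the run record each time, it can be produced for every iteration rather than only for the final model, which preserves the development history that the manual and semi-automatic approaches tend to lose.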

‍

Maintaining MDDs

Maintaining MDDs is crucial for ensuring the longevity and reliability of AI models. This task can become challenging when there is turnover within the team, as knowledge and context can be lost. Best practices for maintaining MDDs effectively include:

Regular Updates

Annual or Quarterly Reviews: Schedule regular intervals for reviewing and updating the MDD. This can be done annually or quarterly, depending on the sensitivity and criticality of the model. Regular updates help ensure that the model assumptions are still valid, that the production data and model performance are in conformance with expectations and that the documentation reflects the current state of the model, including any modifications or enhancements that have been made.

Comprehensive Documentation

Detailed and Up-to-date Information: Maintain detailed documentation that covers all aspects of the model, including its design, development, deployment, and operational performance. This should include data sources, preprocessing steps, model architecture, training processes, performance metrics, and any assumptions or limitations. Ensuring that this documentation is kept up-to-date is vital for maintaining the model's integrity and facilitating understanding by new team members or external auditors.

Dedicated Maintenance Team

Assigned Responsibility: Assign a specific team member or a group of members the responsibility of maintaining the MDD. This dedicated role ensures that there is accountability and continuity in the documentation process. The designated person(s) should be well-versed in the model's technical details and its operational context to effectively manage updates and changes.

‍

Scaling MDDs

Scaling the creation and maintenance of MDDs is a complex task, especially as the number of models and the scale of operations grow. However, implementing strategic approaches can streamline this process and make it more manageable. Strategies for effectively scaling MDDs include:

Developing a Robust Template

Reusable Templates: Create a comprehensive and adaptable template for MDDs that can be used across different models. This template should cover all essential sections mentioned in the section “Contents of an MDD”.
Standardized Format: Ensure the template uses a standardized format that is easy to understand and follow. This helps maintain consistency across documents and makes it easier for team members to create and update MDDs.
Customization Options: While the template should be robust, it should also allow for customization to cater to specific model requirements. This balance ensures that the template is both flexible and thorough.

‍

Building or Rolling Out a System for Documentation

Documentation Management System: Invest in or develop a documentation management system specifically tailored for MDDs. This system should support collaborative editing, version control, and easy access to documentation.
Integration with Model Development Tools: Integrate the documentation system with existing model development tools and workflows. For instance, linking it with code repositories, data management systems, and model training platforms can automate the capture of relevant information.
Automated Documentation Generation: Utilize tools that can automatically generate parts of the MDD based on all the key assets produced during the model lifecycle. This reduces the manual effort required and ensures accuracy in documentation.
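
One possible building block for this kind of integration is a manifest that identifies each asset by a content hash, so the MDD can reference the exact dataset or code version instead of a pasted copy. The sketch below uses Python's standard `hashlib` and is an assumption about how such linking could work, not a description of a specific system. The asset names and byte contents are placeholders.

```python
import hashlib
import json

def fingerprint(data: bytes) -> str:
    """Short SHA-256 digest that identifies the exact bytes of an asset."""
    return hashlib.sha256(data).hexdigest()[:12]

def asset_record(name, kind, data):
    """Metadata entry that an MDD can reference instead of a pasted copy."""
    return {
        "name": name,
        "kind": kind,
        "sha256_prefix": fingerprint(data),
        "size_bytes": len(data),
    }

# Placeholder contents; in practice these would be read from files or a repository.
dataset_bytes = b"id,income\n1,40000\n2,60000\n"
code_bytes = b"def train(): ...\n"

manifest = [
    asset_record("loans_sample.csv", "dataset", dataset_bytes),
    asset_record("train.py", "code", code_bytes),
]
print(json.dumps(manifest, indent=2))
```

If an asset changes, its fingerprint changes too, so a validator can tell whether the data and code under review are the ones the document describes.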

Common Problems with MDDs

Both model developers and validators often encounter significant challenges with MDDs. Addressing these common problems can enhance the quality and efficiency of model documentation processes.

‍

For Model Developers
‍

Procrastination:

Problem: Developers frequently delay the documentation process until the end of a project. This procrastination can result in omitting critical details as memory fades over time.

Solution: To mitigate this, developers should integrate documentation into the development workflow, documenting key steps and decisions as they occur.

Reproducibility Issues:

Problem: Incomplete documentation can hinder the ability to reproduce model results, which is essential for validation and future model updates.

Solution: Ensuring thorough and continuous documentation can help maintain a clear and reproducible record of the development process, from data preprocessing to model tuning and evaluation.

Time Consumption:

Problem: Creating comprehensive MDDs is often seen as a tedious and time-consuming task, leading to resistance from developers.

Solution: Utilizing automated documentation tools and templates can streamline the process, making it less burdensome and more efficient for developers.

‍

For Validators

‍

Incompleteness:

Problem: Validators often receive MDDs that are incomplete, lacking essential details necessary for thorough validation.

Solution: Establishing clear documentation standards and checklists can help ensure that all necessary information is included in the MDDs.

Data and Code Accessibility:

Problem: Difficulties in locating and accessing data and code referenced in the MDDs can lead to inefficiencies and delays in the validation process.

Solution: Implementing centralized repositories for data, code, and lineage, with clear linkage to the MDDs, can facilitate easier access and review.

Inconsistencies:

Problem: Variations in documentation practices across different developers and teams can result in inconsistencies, making it challenging to standardize validation procedures.

Solution: Promoting uniform documentation practices through training and standardized templates can help achieve consistency across all MDDs.
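
A documentation checklist of the kind mentioned under Incompleteness can be as simple as an automated check for missing sections before an MDD is handed to validators. The sketch below is one hypothetical way to do this. The required section names follow the numbered sections in "Contents of an MDD", while the dictionary representation of a draft and its contents are assumptions.

```python
# Required top-level sections, taken from the numbered list in "Contents of an MDD".
REQUIRED_SECTIONS = [
    "Executive Summary",
    "Data Overview",
    "Model Details",
    "Model Testing",
    "Model Productionalization",
]

def missing_sections(mdd):
    """Return required section names that are absent or contain no text."""
    return [name for name in REQUIRED_SECTIONS if not mdd.get(name, "").strip()]

# Illustrative draft: one section is empty and two are missing entirely.
draft = {
    "Executive Summary": "Predicts 12-month default probability for retail loans.",
    "Data Overview": "Loan tape extracted from the core banking system.",
    "Model Details": "",
}
gaps = missing_sections(draft)
print(gaps)
```

A check like this only confirms that sections exist and are non-empty. Whether their content is sufficient remains a judgment for the validator.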

By recognizing and addressing these common problems, banks can improve the quality of their MDDs, leading to more reliable and efficient model development and validation processes.

‍

Best Practices for MDDs

Document Continuously: Encourage developers to document their work as they go to ensure completeness and accuracy.
Capture Lineage: Maintain a detailed record of all changes and updates to the model.
Use Advanced Tools: Move beyond Word documents and copy-pasting to more sophisticated tools that can automate parts of the documentation process.
Maintain an Audit Trail: Ensure all changes and updates are logged and traceable.
Reusable Templates: Develop and use templates and macros to ensure consistency and save time.
Unified Documentation: Keep all documentation in a single, unified system with embedded communication, rather than scattered across multiple channels.
Automate Refreshes: Implement automated processes to update model documentation regularly.

‍

Conclusion

Model Development Documents are crucial tools in AI/ML, particularly in highly regulated industries. They provide transparency, ensure regulatory compliance, mitigate risks, and facilitate knowledge transfer. As AI systems become more complex and ubiquitous, the importance of comprehensive, standardized documentation grows.

To meet these challenges, organizations must adopt best practices for creating and maintaining MDDs. These include continuous documentation throughout the model lifecycle, capturing detailed lineage, utilizing advanced documentation tools, maintaining audit trails, and implementing automated processes for updates. Standardized templates and unified documentation systems can significantly improve consistency and efficiency.

However, challenges persist. Model developers often struggle with documenting after the fact, reproducibility issues, and time constraints. Validators face problems with incomplete documentation, data accessibility, and inconsistencies across different teams. Addressing these issues requires a concerted effort to integrate documentation into the development workflow, establish clear standards, and leverage technology for automation and accessibility.

As the field progresses, the future of MDDs likely lies in more sophisticated, integrated systems that can automatically capture and update relevant information throughout the model's lifecycle. This evolution will not only streamline the documentation process but also enhance the overall governance and reliability of AI/ML models in critical sectors like banking, finance, and insurance.

By embracing these best practices and addressing common challenges, organizations can create a robust framework for model documentation. This approach will not only satisfy regulatory requirements but also foster trust and ensure the responsible development and deployment of AI systems in an increasingly AI/ML-driven world.

‍

A Guide to Model Development Document (MDD)

September 5, 2024
