FINMA today published its Regulatory Notice 08/2024 – Governance and risk management in the use of artificial intelligence (PDF).
FINMA had already formulated its expectations for dealing with AI in various places, both in the banking and in the insurance sector. On this basis, it has carried out supervisory reviews, including on-site inspections. The supervisory communication is the result of these reviews and essentially summarizes
- the risks FINMA sees,
- the challenges it has encountered in its supervision (including on-site inspections) and
- the measures it has observed and reviewed.
Overall, FINMA has observed that
- the use of AI in the financial market is increasing,
- the associated risks are often difficult to assess and
- the governance and risk management structures of financial institutions are usually still under development.
In essence, the supervisory communication confirms what is already known (though not in these words):
- When new technology is adopted, the driver is not the 2nd line but the 1st line. The 1st line understands the technology, but sometimes only its possible applications, and sometimes not even that.
- The organization as a whole often neither understands the associated risks nor has the necessary governance in place. Both the understanding and the internal structures are developing much more slowly than the technology.
- In addition, there is often blind trust in the quality of purchased services, with a lack of choice due to market concentration.
- At the same time, the internal effort required for auditing, monitoring and controlling performance is underestimated.
Against this backdrop, the supervisory communication addresses both AI-related challenges and general or typical deficiencies in internal structures. Its content can be summarized as follows (to put it pointedly – FINMA itself does not paint quite so bleak a picture):
Risks
FINMA sees the following AI-related risks in particular:
- Operational risks, in particular model risks (e.g. lack of robustness, correctness, stability and explainability, as well as bias)
- IT and cyber risks
- Increasing dependence on third parties, in particular hardware, model and cloud providers
- Legal and reputational risks
- Assignment of responsibility, complicated by the “autonomous and difficult-to-explain actions” of AI systems and “scattered responsibilities for AI applications”
Governance
- Problem:
  - too much focus on data protection risks and too little on model risks
  - the development of AI applications is often decentralized, which leads to less consistent standards, blurred responsibilities and overlooked risks
  - purchased services: it is not always understood whether they contain AI, what data and methods they use, and whether sufficient due diligence has been carried out
- Expectations:
  - supervised entities with “many or significant applications” have AI governance in place
  - a central inventory with risk classification and corresponding measures exists
  - responsibilities and accountabilities for the development, implementation, monitoring and use of AI are defined
  - specifications for model tests and supporting system controls, documentation standards and “broad training measures” exist
  - outsourcing: additional tests and controls are implemented; contractual clauses regulate responsibilities and liability issues; the necessary skills and experience of the service provider are verified
Inventory and risk classification
- Problem: difficulty in keeping the inventory complete, due among other things to a narrow definition of AI, decentralized use, and inconsistent criteria for inventorying applications that are of particular importance because of their significance or the associated risks
- Expectations:
  - “AI” is defined broadly enough that “classic applications” with similar risks are also covered
  - AI inventories are complete and contain a risk classification of the AI applications (a minimal sketch of an inventory entry follows below)
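FINMA does not prescribe what such an inventory must look like in practice. Purely as an illustration, a minimal sketch of an inventory entry with risk classification in Python; the field names and the three-tier risk scale are our own assumptions, not FINMA requirements:

```python
from dataclasses import dataclass, field
from enum import Enum


class RiskClass(Enum):
    """Hypothetical three-tier risk scale; FINMA does not prescribe one."""
    LOW = "low"
    MEDIUM = "medium"
    HIGH = "high"


@dataclass
class AIApplication:
    """One entry in a central AI inventory (illustrative schema only)."""
    name: str
    owner: str                    # accountable unit, per defined responsibilities
    purpose: str
    is_third_party: bool          # purchased service vs. in-house development
    risk_class: RiskClass
    mitigations: list[str] = field(default_factory=list)


# A broad definition of "AI" means classic rule- or statistics-based
# applications with similar risks also belong in the inventory.
inventory = [
    AIApplication(
        name="credit-scoring-v2",
        owner="Retail Lending",
        purpose="Pre-screening of consumer credit applications",
        is_third_party=False,
        risk_class=RiskClass.HIGH,
        mitigations=["quarterly bias review", "human sign-off on rejections"],
    ),
]

# The risk classification makes it easy to pull out the significant applications:
significant = [a.name for a in inventory if a.risk_class is RiskClass.HIGH]
```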
Data quality
- Problem:
  - specifications and controls for data quality are missing
  - data can be incorrect, inconsistent, incomplete, unrepresentative, outdated or biased (and with learning systems, data quality is often more important than model selection)
  - purchased solutions: the training data is often not known and may not be suitable
- Expectations:
  - internal directives set out specifications for ensuring data quality (illustrated by the sketch below)
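What such a directive might translate into operationally: a minimal sketch of automated checks for completeness, consistency and freshness. The metrics, threshold and column name are our own assumptions; an internal directive would set binding values per application.

```python
import pandas as pd


def data_quality_report(df: pd.DataFrame,
                        timestamp_col: str = "updated_at",
                        max_age_days: int = 365) -> dict:
    """Illustrative data quality checks; not a FINMA-prescribed method."""
    report = {
        # completeness: share of missing values per column
        "missing_share": df.isna().mean().round(3).to_dict(),
        # consistency: exact duplicate records
        "duplicate_rows": int(df.duplicated().sum()),
    }
    # freshness: records older than max_age_days (hypothetical threshold)
    if timestamp_col in df.columns:
        age = pd.Timestamp.now() - pd.to_datetime(df[timestamp_col])
        report["stale_rows"] = int((age.dt.days > max_age_days).sum())
    return report
```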
Tests and ongoing monitoring
- Problem: weaknesses in the selection of performance indicators, in testing and in ongoing monitoring
- Expectations:
  - tests to ensure data quality and the functionality of the AI applications are planned (including checks for accuracy, robustness and stability, and, where necessary, bias)
  - experts provide the questions and expectations to test against
  - performance indicators for the suitability of an AI application are defined, e.g. threshold values or other validation methods for assessing the correctness and quality of the outputs
  - changes in the input data are monitored (“data drift”; see the monitoring sketch after this list)
  - if an output is ignored or changed by users, this is monitored as an indication of possible weaknesses
  - supervised institutions consider in advance how exceptions are to be recognized and handled
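For data drift and for the share of outputs overridden by users, simple quantitative indicators exist. As one common example (not a method FINMA prescribes), a sketch of the population stability index (PSI) together with an override rate; the alert thresholds are hypothetical:

```python
import numpy as np


def population_stability_index(reference: np.ndarray,
                               current: np.ndarray,
                               bins: int = 10) -> float:
    """PSI, a common indicator for input-data drift.

    Bin edges come from the reference (e.g. training) data; by a widely
    used convention, a PSI above ~0.2 signals material drift.
    """
    edges = np.quantile(reference, np.linspace(0, 1, bins + 1))
    edges[0], edges[-1] = -np.inf, np.inf              # catch out-of-range values
    ref_share = np.histogram(reference, bins=edges)[0] / len(reference)
    cur_share = np.histogram(current, bins=edges)[0] / len(current)
    ref_share, cur_share = ref_share + 1e-6, cur_share + 1e-6  # avoid log(0)
    return float(np.sum((cur_share - ref_share) * np.log(cur_share / ref_share)))


def override_rate(n_outputs: int, n_overridden: int) -> float:
    """Share of outputs ignored or changed by users (a weakness signal)."""
    return n_overridden / n_outputs if n_outputs else 0.0


# Hypothetical thresholds that would trigger review or escalation:
PSI_ALERT = 0.2
OVERRIDE_ALERT = 0.10
```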
Documentation
- Problem:
  - there are no central specifications for documentation
  - existing documentation is not sufficiently detailed or recipient-oriented
- Expectations:
  - essential applications: the documentation addresses the purpose of the application, data selection and preparation, model selection, performance measures, assumptions and limitations, testing and controls, and fallback solutions
  - data selection: data sources and data quality checks can be explained (incl. integrity, correctness, appropriateness, relevance, bias and stability)
  - robustness, reliability and traceability of the application are ensured
  - applications are appropriately assigned to a risk category (with corresponding justification and review)
Explainability
- Problem: results cannot be traced, explained or reproduced, and therefore cannot be assessed
- Expectations:
  - the plausibility and robustness of results can be assessed when decisions are explained vis-à-vis investors, clients, employees, the supervisory authority or the audit firm
  - among other things, the drivers of the applications and their behavior under different conditions are understood (a minimal sketch follows below)
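FINMA does not name specific explainability techniques. One widely used way to understand the drivers of an application is permutation importance: measure how much performance drops when a single input is shuffled. A minimal sketch on a toy scikit-learn model; the model, data and parameters are our own choices:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance

# Toy classifier standing in for a productive AI application.
X, y = make_classification(n_samples=500, n_features=6, random_state=0)
model = RandomForestClassifier(random_state=0).fit(X, y)

# How much does accuracy drop when one feature is randomly shuffled?
# Large drops identify the features that drive the outputs.
result = permutation_importance(model, X, y, n_repeats=10, random_state=0)
for i in np.argsort(result.importances_mean)[::-1]:
    print(f"feature {i}: importance {result.importances_mean[i]:.3f}")
```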
Independent review
- Problem:
  - there is no clear demarcation between the development of AI applications and their independent review
  - few supervised institutions carry out an independent review of the entire model development process
- Expectations:
  - for “essential applications”, an independent review takes place that includes an objective, experienced and unbiased opinion on the appropriateness and reliability of a procedure for a particular application
  - the results of the review are taken into account during development
Expectations (summary)
Overall, FINMA’s expectations can be summarized as follows: