Mar 4 / James Kavanagh

The real work that comes after ISO 42001 certification

The gap between compliance documentation and genuine system-level assurance is where AI governance is falling down. Other industries learned this lesson the hard way.
On 14 June 2017, a fire broke out in a fourth-floor flat at Grenfell Tower in West London. Within minutes, the flames hit the building's exterior cladding and raced up the outside of the tower. Within an hour, the fire had engulfed most of the building. 72 people died. Over 70 more were injured.
Grenfell Tower had been recently refurbished. The cladding had certificates. Building regulations had been technically met. Inspections had been conducted. The compliance machinery of the building safety system had processed this building and, by its own measures, the system had worked.
Dame Judith Hackitt's subsequent independent review laid bare what had gone wrong. Not a single rogue failure, but something more systemic. She described a pervasive "tick-box" culture across the entire building safety regime, one that placed compliance above real-world safety. Organisations were treating minimum standards as a high bar to negotiate down rather than genuinely owning the safety of their buildings. The regulatory system checked that processes were followed. It didn't require anyone to make a genuine, evidence-based argument that this specific building was safe.
It's incidents like this that I think about when someone tells me their AI governance program is in great shape because they've achieved ISO 42001 certification.
If I can talk to them about it, here's the question I'll ask: pick your highest-risk deployed AI system and tell me, with evidence, why you believe it's operating safely, securely, and lawfully right now. Not how you followed the process. Not whether the documentation is up to date. Why that specific system is behaving in a way you can defend as safe.
It's not an easy question. If anyone had asked me the same about the many systems and services I took through audits and certifications, ranging from a variety of ISO certifications to multiple government certifications at Microsoft and Amazon, I would have found it challenging too.
Yet for assurance of safe, secure and lawful AI, it's the only question that matters.
Here's my blunt opinion: ISO 42001 certification is the start line, not the finish line. It means you've started the work of building the organisational machinery. It doesn't mean the machinery is doing anything useful yet. And over-indexing on regulatory compliance, treating something like EU AI Act conformity assessment as a goal rather than a baseline, is just the same trap. The real work is maintaining genuine, evidence-based assurance that each of your AI systems is doing what you claim it's doing, in conditions that keep changing. That work starts after certification, after compliance.

What a management system actually tells you

ISO 42001 is a management system standard. It's structurally identical to ISO 27001, ISO 9001, and every other ISO management system standard. Plan-Do-Check-Act. Documented processes. Internal audit. Management review. Continual improvement. It asks one fundamental question: do you have the right organisational processes, roles, and controls in place to govern AI responsibly?
Now that's not a bad question. For organisations that might have had no structured approach to AI governance at all, being forced to think systematically about risk, accountability, and oversight for the first time produces genuine organisational learning. I'm not dismissing that.
But it's the wrong question if what you actually care about is whether your AI systems are behaving acceptably. A management system tells you about organisational capability and intent. It tells you almost nothing about whether any specific system you've deployed is actually safe, fair, reliable, or fit for purpose. Those are different things, and they should not be conflated, even if it's convenient to do so.
An organisation can be fully compliant with ISO 42001 and still be deploying a system that's causing real harm. The policies are documented. The roles are assigned. The risk assessments are filed. And the model has been silently drifting into unacceptable behaviour for months because nobody is maintaining a specific, evidence-based argument about that system's performance.
I say this from experience. I spent the best part of two decades at Microsoft and Amazon Web Services, working across security, privacy, cloud infrastructure resilience, and responsible AI. I've lived inside management system compliance regimes and taken them through more audits than I can count (always on the auditee side). And I can tell you that a significant amount of what goes into management system documentation is performative.
Every certified organisation knows it. Every auditor knows it. Or they should.
Not maliciously performative. Nobody sets out to write fiction. But the messy reality of how security or resilience is actually achieved inside a hyperscale cloud provider (the informal knowledge networks, the incident response muscle memory, the engineering judgment calls that happen in the middle of the night, the rapid response mechanisms) bears very little resemblance to the neat process descriptions that satisfy an ISO 27001 auditor. The documentation describes a world that is cleaner, more linear, and more controlled than the actual world. It has to, because the standard demands a level of prescriptive tidiness that complex, fast-moving environments don't naturally produce.
So you end up with two parallel realities. The documented reality, which satisfies the audit. And the operational reality, which is where the actual security outcomes happen. Sometimes those two realities overlap a lot. Sometimes the gap is wide enough to drive a truck through. The management system certification tells you nothing about the size of that gap.
When a company of vast scale and complexity consistently produces audit reports with zero findings, that's not a sign of excellence.  It's a sign of theatre.
Now apply that same dynamic to AI governance. Organisations are building AI governance documentation that describes how they manage risk, how they assess impacts, how they monitor systems. Some of that documentation reflects genuine practice. Some of it describes aspirational processes that don't yet exist in any meaningful way. And some of it is written specifically to satisfy the structure of the standard, because the standard asks for things in a format that doesn't match how the organisation actually works. The auditor checks the documentation. The certificate gets issued. The gap between the documented governance and the operational reality remains invisible.

The safety case tradition

Other industries figured this out decades ago, often after people died.
Before the Piper Alpha disaster in 1988, offshore safety regulation in the UK was largely prescriptive. Comply with the rules and you're deemed safe. Lord Cullen's inquiry fundamentally shifted that philosophy. So instead of prescriptive compliance, the UK moved to goal-based regulation. Now the duty holder had to demonstrate, through a structured safety case, that their specific installation was acceptably safe. The regulations set goals, the operator owned the argument and the evidence.
A safety case is a structured, evidence-based argument that a specific system is acceptably safe for a specific application in a specific operating environment. It's not documentation. It's an argumentative structure with claims, sub-claims, evidence, and explicit assumptions. If the evidence is weak, the argument is visibly weak. The structure makes gaps in assurance legible in a way that management system documentation almost never does.
This approach is now being explored seriously for AI systems, and the research is worth paying attention to even if you're not building frontier models.

What the frontier AI safety case research tells us

The foundational work here is Clymer et al.'s 2024 paper, "Safety Cases: How to Justify the Safety of Advanced AI Systems." They propose four categories of safety argument: inability (the system simply can't cause the harm), control (even if it could, mitigations prevent it), trustworthiness (the system wouldn't attempt harm even if capable), and deference (for very powerful systems, using a credible AI overseer). This taxonomy is useful because it maps onto how confidence in safety degrades as capabilities increase. You start with inability arguments, which are the strongest and simplest, and progressively need more sophisticated arguments as models get more capable.
Buhl, Sett, Koessler, Schuett, and Anderljung at the Centre for the Governance of AI built on this in their October 2024 paper, exploring how safety cases could function in both industry self-regulation and government oversight. They sketch how a safety case could inform major deployment decisions, with a designated safety case team producing the argument, an internal review team challenging it, and leadership making the call. That's a fundamentally different decision architecture from "the risk assessment says medium, ship it."
The most revealing example is Anthropic's own work. Roger Grosse published three sketches of what safety case components might look like for ASL-4 level systems, covering mechanistic interpretability, AI control, and incentives analysis. What's particularly honest about this work is the explicit acknowledgment that none of the sketches fully succeeds. The scenario assumes a model where developers can't rule out the possibility that it could cause a catastrophe, and also can't rule out that the model could strategically sandbag evaluations or undermine monitoring. That's not a great starting position for building a safety argument, and they're transparent about the gaps.
Separately, Korbak, Clymer, and colleagues published a control safety case sketch in early 2025, focused on a concrete scenario: arguing that a hypothetical LLM agent deployed internally won't exfiltrate sensitive data. The safety argument rests on three claims: the red team adequately elicited model capabilities, control measures remain effective in deployment, and you can conservatively extrapolate from test to production. It's narrow and specific enough to actually evaluate, which is exactly the point.
The UK's AI Security Institute has been doing complementary work, publishing a safety case template for "inability" arguments as a proof of concept. They're upfront that this only covers a subset of relevant arguments and that a full safety case would also need sociotechnical arguments about organisational safety culture and staff competence. That last point matters. It's an explicit acknowledgment from the institution most engaged in frontier AI safety that technical safety cases alone aren't sufficient without the organisational layer.
All of these are great reads - if you can spare an afternoon to go through all five, you'll enjoy it. But I think you'll notice two things about this body of work, or at least these are the two that struck me as important.
First, it's almost entirely focused on frontier model risks. CBRN threats, autonomous replication, loss of control, deceptive alignment. Important stuff, genuinely. But not what 99% of organisations deploying AI systems need to think about.
Second, everything published so far is a sketch or template, not an actual safety case for an actual deployed system. The methodology is being developed ahead of the practice. We're in the theory-building phase, which is where process safety was in the early 1990s after Cullen but before the practice matured.
I think what matters for most practitioners is not the frontier-specific details. It's the underlying principle: if you can't make a structured, evidence-based argument about why a specific system is acceptably safe in a specific context, your governance is incomplete regardless of what your management system documentation says.

The gap that matters for practitioners

I think the gap that governance practitioners should care about is not at the frontier. It's in every organisation running an AI system that makes or influences consequential decisions about people. Screening CVs, triaging insurance or benefits claims, assessing loans, the new agentic pricing system.
For each of these systems, someone in the organisation should be able to make a structured argument with three components (there's a sketch of what that might look like after the list):
  • First, what specific claims are we making about this system's behaviour? Not vague aspirations like "it's fair" or "it's responsible", and, almost as bad, "it's in scope of our ISO 42001 certification". Operational claims, like "this system does not produce materially different approval rates across protected demographic groups, with a disparate impact ratio above 0.9 on every tracked dimension".
  • Second, what evidence do we have that those claims hold, and how current is that evidence? So often, if an impact assessment was even done, it was before deployment, eighteen months ago. The bias testing was run on the training data, not on production outputs. The monitoring dashboard exists but nobody reviews it. The evidence that once supported the claims has gone stale, and nobody noticed because the management system doesn't require anyone to maintain the argument.
  • Third, what conditions would cause those claims to stop holding, and how would we detect that? This is the most important piece. It forces you to think about the failure modes of your assurance, not just the failure modes of your system. Distribution shift. Changed user population. Model updates. New regulatory requirements. Upstream data quality degradation. If you haven't identified the conditions under which your assurance degrades, you have no way of knowing when your governance has stopped working.
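To make that concrete, here is a minimal sketch of what such an argument could look like as a structured record rather than a paragraph in a risk register. It's illustrative only: the class names, fields, thresholds and the CV-screening example are assumptions I'm making for the sketch, not a standard schema.

```python
from dataclasses import dataclass, field
from datetime import date, timedelta


@dataclass
class Evidence:
    description: str    # e.g. "disparate impact measured on production decisions"
    collected_on: date  # when the evidence was last produced
    max_age_days: int   # how long before this evidence is considered stale

    def is_stale(self, today: date) -> bool:
        return today - self.collected_on > timedelta(days=self.max_age_days)


@dataclass
class AssuranceClaim:
    claim: str                                               # specific, operational claim
    evidence: list[Evidence] = field(default_factory=list)
    degradation_conditions: list[str] = field(default_factory=list)

    def gaps(self, today: date) -> list[str]:
        """Reasons this claim can no longer be defended as current."""
        found = []
        if not self.evidence:
            found.append("no evidence recorded for this claim")
        found += [f"stale evidence: {e.description}"
                  for e in self.evidence if e.is_stale(today)]
        if not self.degradation_conditions:
            found.append("no conditions defined under which the claim would stop holding")
        return found


# Illustrative record for a hypothetical CV-screening system
claim = AssuranceClaim(
    claim="Approval rates do not differ materially across protected groups "
          "(disparate impact ratio above 0.9 on every tracked dimension)",
    evidence=[Evidence("disparate impact measured on production outputs",
                       collected_on=date(2024, 6, 1), max_age_days=90)],
    degradation_conditions=["distribution shift in the applicant population",
                            "model or prompt updates",
                            "upstream data quality degradation"],
)
print(claim.gaps(date.today()))  # missing or stale evidence shows up as a visible gap
```

The value isn't the code. It's that missing evidence, stale evidence and undefined degradation conditions become legible gaps you can put in front of an owner, instead of staying buried in documentation.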

From compliance to genuine assurance

I'm not advocating for practitioners to abandon the management system. But we've got to stop treating it as the governance end goal and start treating it as just the infrastructure that enables governance.
Your risk assessment process should produce system-specific claims, not generic risk register entries. Your monitoring process should produce evidence that those claims still hold, not dashboards that nobody interrogates. Your management review should answer a meaningful question: do our assurance arguments for deployed systems even still stand? And if not, what's changed and what are we doing about it?
This creates the feedback loop that static compliance lacks. When the evidence weakens, when the operating context shifts, when the claims no longer hold, the assurance argument visibly degrades. That degradation becomes a governance signal. It tells you something needs to change. Compare that with standard management system compliance, where you can maintain perfect documentation while a system drifts quietly into causing harm.
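To make "evidence that the claims still hold" concrete, here is a minimal sketch of a recurring check that recomputes the disparate impact ratio from recent production decisions and flags any tracked dimension where the claimed threshold no longer holds. Everything in it is an assumption for illustration: the dataframe of decisions, the binary "approved" outcome column, the group columns, and the helper functions in the usage comments exist only in this example.

```python
import pandas as pd

DISPARATE_IMPACT_THRESHOLD = 0.9  # taken from the claim, not a universal rule


def disparate_impact(decisions: pd.DataFrame, group_col: str, outcome_col: str) -> float:
    """Ratio of the lowest group approval rate to the highest (four-fifths-rule style)."""
    rates = decisions.groupby(group_col)[outcome_col].mean()
    return rates.min() / rates.max()


def failed_dimensions(decisions: pd.DataFrame, tracked_dimensions: list[str]) -> dict[str, float]:
    """Dimensions where the claimed threshold no longer holds on recent production data."""
    failures = {}
    for dim in tracked_dimensions:
        ratio = disparate_impact(decisions, group_col=dim, outcome_col="approved")
        if ratio < DISPARATE_IMPACT_THRESHOLD:
            failures[dim] = ratio
    return failures


# Hypothetical usage against a recent sample of production decisions:
# decisions = load_recent_decisions(days=30)   # however your pipeline exposes them
# failures = failed_dimensions(decisions, ["sex", "age_band", "ethnicity"])
# if failures:
#     raise_governance_signal(failures)         # a degraded claim should interrupt someone
```

Again, the tooling is beside the point. What matters is that the check runs on production outputs rather than training data, and that a failed check lands on a named owner rather than on a dashboard nobody reviews.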
I recognise there's a cultural shift here too. Management system compliance distributes governance responsibility into process roles. The risk owner ticks the box. The auditor checks the box was ticked. Management reviews the summary. It's kind of nice and convenient to diffuse accountability - when things go wrong, it was a process failure. But an assurance argument concentrates accountability around a specific claim about a specific system. Someone has to own it. At Amazon we called it single-threaded ownership. You stand behind it and say "I believe this system is operating acceptably, and here's why." That's a fundamentally different posture from "we followed the process."

Proportionality and pragmatism

The objection I hear from practitioners is that this sounds like a lot of extra work when they're barely keeping the management system afloat. Fair point.
The answer is proportionality. You don't need a formally structured safety case with Goal Structuring Notation for a low-risk internal chatbot. But for the systems that actually matter, the ones making or materially influencing consequential decisions about real people, the discipline of articulating and maintaining a specific assurance argument is the difference between governance that works and governance that performs.  
A practical starting point: take your single highest-risk deployed AI system and the question I opened this article with. Write down the three components of the assurance argument. What claims are you making? What evidence supports them? What would cause them to fail? Then notice where the gaps are. It takes five minutes.
Those gaps are your real governance priorities. Not whatever's next on the audit checklist, or some obscure update to a policy.

The race ahead

Some days I feel like the AI governance and AI safety field is where offshore safety was in the early 1990s. The philosophy of goal-based, evidence-driven assurance is gaining traction, and the methodology is slowly developing. But the actual practice of building, challenging, and maintaining system-level assurance arguments for real AI systems still lies ahead for most organisations.
ISO 42001 certification is a genuine achievement. It means you've built the first foundations of the organisational machinery. But the machinery needs a purpose beyond generating documentation and passing audits. That purpose is maintaining defensible, evidence-based assurance that the AI systems you operate are doing what you claim they're doing, and that they remain safe, secure and lawful.
In my previous article on adaptive governance, I used Waymo's Driver-Simulator-Critic architecture to show what governance looks like when it's designed as a continuous feedback loop rather than a point-in-time assessment. The system senses problems, tests improvements, verifies them, and deploys. Continuously. It doesn't wait for an annual review. It doesn't depend on someone remembering to check.
The three-part assurance argument I've described in this article (specific claims, current evidence, defined degradation conditions) is exactly that kind of mechanism applied to AI governance more broadly. It creates the feedback loop that static compliance lacks. When your evidence goes stale, you know. When your operating context shifts, the argument visibly weakens. When your claims no longer hold, the gap is legible rather than buried in a risk register nobody reads.
That's the difference between governance that performs and governance that works. Between a management system that satisfies auditors and an assurance practice that actually keeps your systems safe.
You got to the start line. Now drive.
Thank you for reading, and especially to all of you within our practitioner community.  
If you want to build the kind of AI governance that works for real, covering both the theory and the practice, then the AI Governance Practitioner Program is for you.

References

1. Grenfell Tower Inquiry. https://www.grenfelltowerinquiry.org.uk
2. Dame Judith Hackitt, "Building a Safer Future: Independent Review of Building Regulations and Fire Safety" (2018). https://www.gov.uk/government/publications/independent-review-of-building-regulations-and-fire-safety-final-report
3. Clymer, Gabrieli, Krueger & Larsen, "Safety Cases: How to Justify the Safety of Advanced AI Systems" (2024). https://arxiv.org/abs/2403.10462
4. Buhl, Sett, Koessler, Schuett & Anderljung, "Safety Cases for Frontier AI" (2024), Centre for the Governance of AI. https://arxiv.org/abs/2410.21572
5. Grosse, "Three Sketches of ASL-4 Safety Case Components" (2024), Anthropic Alignment Science. https://alignment.anthropic.com/2024/safety-cases/
6. Korbak, Clymer, Hilton, Shlegeris & Irving, "A Sketch of an AI Control Safety Case" (2025). https://arxiv.org/abs/2501.17315
7. UK AI Security Institute, "Safety Case Template for Inability Arguments". https://www.aisi.gov.uk/blog/safety-case-template-for-inability-arguments
8. UK AI Security Institute, "Safety Cases at AISI". https://www.aisi.gov.uk/blog/safety-cases-at-aisi
9. Cârlane & Gomez, "Dynamic Safety Cases for Frontier AI" (2024). https://arxiv.org/abs/2412.17618
10. Simon Mylius, "Systematic Hazard Analysis for Frontier AI using STPA" (2025). https://arxiv.org/abs/2506.01782
11. Anthropic, "Responsible Scaling Policy". https://www.anthropic.com/responsible-scaling-policy