Dec 23 / James Kavanagh

The Design Gap in AI Governance

AI governance already has the scaffolding of laws, frameworks, committees, standards and rules. What's missing is the design discipline and practice to build and sustain real safety and security.
Chemical plants rarely explode. Planes rarely fall from the sky. That’s not because those industries have more regulations than everyone else, though they certainly are regulated. It's because they've learned, through tragedy, that design is the key driver of safety. Not rules. Not audits. Not compliance departments. Most certainly not paperwork.
Their safety and security come from design. Design of systems, of processes, of the mechanisms and culture that catch problems before they become disasters. When rare accidents do happen, the subsequent investigation almost always finds inadequate design as the primary cause of failure.
When Fukushima Daiichi went into meltdown, it didn’t happen in a regulatory void. Japan had nuclear laws, regulatory agencies, inspections, safety drills, emergency plans, peer exchange and international reporting. And yet a single site situated on a tsunami-prone coastline managed to lose power, melt multiple reactor cores, and contaminate land and lives for years.1
When you look closely at events in Fukushima, you don’t find an absence of rules or regulatory process. You don’t find compliance failures. You find wrong design assumptions, and a governance system that didn’t force those assumptions to change, even as evidence for change accumulated.2
After a year of writing, teaching, researching and building, that’s how I could sum up what I see as the current state of AI governance today. We’re not short of regulations, standards, or frameworks. We’re short of good design. Design of laws and regulations. Design of products. Design of user interactions. And critically, design of the governance mechanisms needed for sustained safe, secure and lawful AI.
Until we fix that by building the disciplines and practice of design, I worry that adding more rules and controls just leads to more paperwork, not safer systems. Let me explain.

Fukushima: Lots of rules, fatally bad design

Before 2011, Fukushima Daiichi sat inside a dense, multi-layered governance ecosystem. Japan had licensing and oversight structures, and it participated in international reporting on nuclear safety. It was a regulatory regime that we would commonly call “strong regulation.”
But what failed in March 2011 were two very specific design assumptions.
First, the size of tsunami the plant had to survive. Fukushima was designed around a worst-case specification of a tsunami that might be only a few metres tall, later revised to around 5-6 metres. The 2011 tsunami rose to roughly 14 metres, more than double what the plant was built to withstand. 3
Second, where critical equipment was physically placed. Backup diesel generators and electrical switchgear (the systems that kept reactor cooling running after the grid went down) were located in low-lying turbine-building basements just a few metres above sea level. When the seawall was overtopped, those basements flooded, the generators failed, and the plant lost almost all power. Within days, three reactor cores had largely melted.
But these were not regulatory or compliance problems in the usual sense. They were design problems.
Nor were they unknowable. In fact, by the late 2000s, internal analyses at TEPCO using newer tsunami models were already suggesting waves in the 10 to 15 metre range were very plausible at Fukushima Daiichi. Those numbers should have triggered redesign work, not deferral. Engineers were fully aware that if the seawall was breached, the power systems would be flooded, and control systems would fail.
And yet countermeasures were postponed. TEPCO actually explained their findings and risk assessment to regulators only days before the disaster, but as they “were not instructed to immediately implement countermeasures”, they took no action.
The appointed investigator later called the accident “a profoundly manmade disaster” that “could and should have been foreseen and prevented.” The accident report concluded that although laws, regulations and inspection programs were complied with, both the operator and the regulator itself tolerated outdated assumptions and failed to force an adequate shift in safety margin.
Rules existed. Reviews happened. But the design basis didn't move fast enough, and the plant still failed catastrophically, because its governance did not adapt to critical new data.
That’s the pattern I’m worried about in AI: that we’re setting about building elegant facades of regulations and compliance that are not inherently adaptive and are very unlikely to successfully mitigate harm from complex, dynamic AI systems.
Safety science has been making the point for decades that complex systems require safety controls that are designed from the outset to be adaptive. In modern safety thinking, accidents are not just component failures; they arise when the guidance and feedback loops that should steer behaviour across a complex socio-technical system break down or were never designed in the first place. 4
This framing matters because AI systems are complex adaptive systems. They change as they encounter new data. The people using them change how they work in response. The organisations deploying them adapt their processes. The environment shifts. A compliance model that treats safety as a property you verify once through a conformity assessment, then maintain through periodic audits or even 'post-market monitoring', fundamentally misunderstands the problem. You're not safeguarding a static, bounded artifact. You're trying to guide the behaviour of something that's constantly evolving, embedded in social and organisational contexts that are also constantly evolving.
Designing for safety in that context means designing the controls that sense what’s happening, interpret whether it’s drifting toward harm, and intervene before damage occurs. It means building closed feedback loops where information about system behaviour reaches people and systems with the authority and capability to act on it. It means architecting for adaptation, not just compliance at a point in time. 5

An aerial view of the Fukushima Plant after the tsunami

Regulation matters. I'm 100% in favour of good regulation and genuinely believe that regulation around the safety and security of digital systems, including AI, should be strengthened. Regulation done well sets expectations, creates accountability, forces organisations to collect evidence, and aligns different actors around shared language and comparative baselines. It establishes the scaffolding within which safety can be built. The EU AI Act, the NIST framework, ISO 42001: these all create structure that would otherwise be absent. They ask important questions and provide a method for answering them.
But scaffolding isn't building. Regulation can require you to think about risk. It can't do the thinking for you. It can mandate documentation, but it can't ensure that documentation is critically considered and reflects reality. It can demand human oversight, but it can't make sure that interfaces and escalation paths are designed to make oversight meaningful. The hard work of turning abstract requirements and perceived risks into concrete mechanisms, architectures, and interactions is design, and it happens inside the scaffolding that regulation provides. Regulation drives compliance, but compliance does not drive safety. Design drives safety.
Fukushima shows this starkly. You can have regulatory structure, inspection, and penalties, yet still build a plant that goes dark if the water is twice as high as your knowingly incorrect assumption. The problem wasn’t a shortage of regulation. The problem was that the design basis didn’t get forced to change as the hazard picture evolved. Elaborate apparatus around the plant. Not enough attention to how the plant itself behaves under stress.
Badly designed regulations can make the problem worse, simply contributing to the theatre, often despite the best of intentions. Michael Power's 1997 work on the audit society is all about how systems of checking can expand into 'rituals of verification' without any meaningful improvement to underlying performance. 6 Bruce Schneier coined the term "security theater" to describe measures that look reassuring while doing little to reduce real risk. 7
More recently, Sidney Dekker's Safety Theater extends this into operational safety management, showing how the desire for perfection paradoxically generates the opposite: compliance clutter, inauthentic relationships between documented procedures and work-as-actually-done, and even new kinds of accidents. His phrase captures it precisely: we "meet the target but miss the point." 8
I worry that AI governance is developing its own version of this dynamic, building elaborate documentation systems that satisfy auditors while the actual behaviour of AI systems remains poorly understood and inadequately controlled.

The four layers of design that matter

For AI, safety and security come from four kinds of design working together.

1. Regulatory and legal design

The first layer is the deliberate design of the laws and regulations that shape what organisations must do. Regulations encode assumptions, and when those assumptions don’t match reality, compliance becomes disconnected from safety.
Well-designed regulation would ask the same questions we ask of governance mechanisms: What behaviour does this actually incentivise? What information flows does it require? What feedback loops exist to update requirements as understanding evolves? How will organisations game this, and does that gaming undermine safety?
Poorly-designed regulation creates perverse outcomes. Static conformity assessments for dynamic systems. Accountability categories that don’t match causal chains. Documentation requirements that expand without improving safety. Definitions that force neat boundaries onto messy realities.
Despite good intentions, laws like the EU AI Act can act against good design. For legal convenience, it treats an “AI system” as a bounded, identifiable thing that can be placed on the market, assessed for conformity, and assigned to a risk category. The Act’s own explanatory materials acknowledge this is a simplification. In practice, what we call an “AI system” is often a fluid assembly of components from multiple vendors, updated continuously, behaving differently across contexts, and entangled with human workflows in ways that make neat boundaries impossible to draw.
This isn't a minor technicality. When regulation assumes a static, bounded system but the actual technology is dynamic and distributed, compliance becomes disconnected from safety. Organisations can satisfy the legal definition while the real risks emerge from exactly the aspects the definition ignores: supply chain dependencies, continuous model updates, emergent behaviours from component interactions, and the gap between tested conditions and deployment reality. Even the lead author of the EU AI Act, Gabriele Mazzini, stepped away critical of how the Act had become overly complex, rigid and lacking in common sense: an instrument of bureaucracy with limited ability to adapt to changing technology. 9
A design-first argument has to extend to regulators and standards bodies. If we’re building governance mechanisms to satisfy regulatory frameworks that encode false assumptions about how AI works, we’re compounding the problem.

2. Governance mechanism design

Governance mechanism design is about how you sense, decide, and intervene. Many organisations now have some form of AI steering group or risk committee. But that's not design; that's scaffolding.
A governance mechanism only becomes real when you can answer concrete questions: what signals does it see; when does it act (real time, design reviews, quarterly); what triggers escalation; and can it actually say no, pause a rollout, or demand design or operational threshold changes?
Safety science again points at the failure mode: if your governance system cannot reliably enforce safety constraints, you don’t have a safety control system. You only have documentation about one.
This is also the core argument I develop in my work on adaptive governance mechanisms: that adaptive governance is best implemented as collections of discrete closed-loop mechanisms with concrete inputs, tooling, ownership, adoption pathways, inspection, and continuous improvement. A committee, a policy and an audited checklist is not governance, even if that is sufficient to satisfy the regulation. 10 Course 3 of the AI Governance Foundation Program goes into this in depth, covering how to design mechanisms that adapt because they are embedded in the real lifecycle of AI systems. 11
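To make the closed-loop idea concrete, here's a minimal sketch in Python. It's illustrative only: the signal, threshold, owner and the pause_and_escalate hook are hypothetical stand-ins for whatever telemetry and change-management tooling an organisation actually runs.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class GovernanceMechanism:
    """One closed-loop mechanism: a signal it watches, a threshold that
    defines 'drifting toward harm', an accountable owner, and an
    intervention it is actually empowered to take."""
    name: str
    owner: str                          # accountable role, not a committee
    read_signal: Callable[[], float]    # e.g. appeal rate, error rate, drift score
    threshold: float
    intervene: Callable[[str], None]    # e.g. pause rollout, open incident, force review

    def run_once(self) -> bool:
        """One pass of the loop: sense, decide, act. Returns True if it intervened."""
        value = self.read_signal()
        if value > self.threshold:
            self.intervene(
                f"{self.name}: signal {value:.3f} exceeded threshold "
                f"{self.threshold:.3f}; owner={self.owner}"
            )
            return True
        return False

# Hypothetical wiring: a loan-decisioning model monitored for appeal-rate drift.
def appeal_rate() -> float:
    return 0.08    # in practice this would come from production telemetry

def pause_and_escalate(reason: str) -> None:
    print("PAUSE ROLLOUT:", reason)    # in practice: deployment tooling + paging

mechanism = GovernanceMechanism(
    name="appeal-rate drift",
    owner="Head of Credit Decisioning",
    read_signal=appeal_rate,
    threshold=0.05,
    intervene=pause_and_escalate,
)
mechanism.run_once()   # scheduled continuously, not quarterly
```

The shape is the point: a named owner, a concrete signal, a defined trigger, and an intervention the mechanism is genuinely empowered to take.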

3. Product and system design

Product and system design is about how the technical system is architected. I’ve never seen an AI system in production that is a tidy, bounded object. In reality, most deployed AI is part of a socio-technical system: data sources, models, orchestration logic, dependent services, tools, guardrails, observability components, human workflows, and downstream processes.
Safety in that kind of system depends on specific design choices: boundaries around what the system is for and not for; architectural separation so errors in one component can't cascade everywhere; failure behaviour that alerts rather than fails silently; and observability that lets you see and respond to what's happening across the full lifecycle.
The Fukushima equivalent of "where are your backup generators?" in AI might be: where does the final call on a consequential decision actually get made? Which part of this system has the power to commit an irreversible action? What prevents a silent model failure from flowing into that action?
Those are design questions, and I don’t think you can answer them by pointing at a paragraph in a regulation or a control in a library.
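As a sketch of what answering those questions in design, rather than in documentation, can look like, the snippet below routes every irreversible action through a single choke point that fails closed. The action names, confidence threshold and review queue are hypothetical, not drawn from any particular system.

```python
from dataclasses import dataclass

@dataclass
class ModelDecision:
    action: str          # e.g. "deny_claim"
    confidence: float    # model-reported confidence, 0..1
    model_healthy: bool  # result of upstream health / observability checks

# Actions the system is never allowed to commit on its own authority.
IRREVERSIBLE_ACTIONS = {"deny_claim", "close_account"}

def route_to_human_review(decision: ModelDecision) -> str:
    # In a real system this would enqueue the case with full context attached.
    return f"queued for human review: {decision.action}"

def execute(decision: ModelDecision) -> str:
    return f"executed: {decision.action}"

def commit(decision: ModelDecision) -> str:
    """The single choke point where consequential actions are committed.
    It fails closed: degraded health or low confidence routes to a human."""
    if decision.action in IRREVERSIBLE_ACTIONS:
        if not decision.model_healthy or decision.confidence < 0.9:
            return route_to_human_review(decision)
    return execute(decision)

print(commit(ModelDecision("deny_claim", confidence=0.72, model_healthy=True)))
# -> queued for human review: deny_claim
```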

4. Interaction design

Interaction design is about how humans experience, understand, and control the system. In safety-critical industries, there is a long-established understanding of how user interaction design becomes a control surface.
Therac-25 is the classic horror story here: a radiotherapy machine in which software and interaction failures contributed to massive overdoses, and where the system’s feedback to operators was a key part of why dangerous states were not correctly recognised and stopped. 12
In AI, we recreate softer versions of this pattern regularly: UIs that present model output as authoritative "answers" rather than suggestions; risky actions hidden behind a frictionless click; no indication of uncertainty, data gaps, or model limits; and feedback mechanisms so buried that users don't bother.
We know this failure mode well. "Automation bias", the tendency to over-rely on automated decision support, has a substantial research literature. Microsoft's literature review on overreliance also frames why this phenomenon makes meaningful oversight harder in practice: the user is treated as the last line of defence while becoming less vigilant. 13 This is the problem-space I explore directly in Cognitive Calibration, where I look at why explainability can backfire and how humans systematically miscalibrate their understanding of complex systems. 14
If you claim "meaningful human oversight" but the human can't see what matters, doesn't understand what they're looking at, and can't easily intervene, that's not oversight. That's just theatre again. This is also why I treat "human in the loop" as a fragile design claim: without explicit, well-designed control surfaces and escalation pathways, oversight decays into symbolism under pressure. 15
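Here's a small, deliberately simplified sketch of treating the interaction as a control surface: output is rendered as a suggestion with its limits attached, and a risky action gets friction before it can proceed. The fields and wording are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class ModelOutput:
    answer: str
    confidence: float    # 0..1, however the system estimates it
    data_cutoff: str     # known limits worth surfacing to the user
    risky: bool          # would acting on this commit a consequential action?

def render(output: ModelOutput) -> str:
    """Present output as a suggestion with visible limits, not an oracle."""
    lines = [
        f"Suggested answer: {output.answer}",
        f"Model confidence: {output.confidence:.0%} (treat as an estimate)",
        f"Known limits: trained on data up to {output.data_cutoff}",
    ]
    if output.risky:
        # Deliberate friction: the user must confirm, and the default is 'no'.
        lines.append("This would trigger an irreversible action. Type CONFIRM to proceed.")
    return "\n".join(lines)

print(render(ModelOutput("Reject the application", 0.62, "2024-03", risky=True)))
```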

When governance design matters most

It can be helpful to take a lifecycle view of where good design has the greatest impact.
During development, governance design means assigning cross-functional roles from the start, not bolting some form of ethics review onto the end. It means translating principles like fairness and transparency into requirements engineers can build against. It means treating safety, security and robustness tests like unit tests: automated, continuous, and required to pass before code merges.
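As an illustration of treating safety tests like unit tests, here's a minimal pytest-style sketch that would block a merge if a candidate model regressed on a red-team refusal suite or a fairness threshold. The helper functions and thresholds are hypothetical placeholders for a team's real evaluation harness.

```python
# test_safety_gates.py -- runs in CI on every merge, alongside ordinary unit tests.
# evaluate_refusals() and demographic_parity_gap() are hypothetical stand-ins
# for whatever evaluation harness the team actually runs.

REFUSAL_THRESHOLD = 0.99    # share of red-team prompts the model must refuse
PARITY_GAP_LIMIT = 0.02     # maximum tolerated approval-rate gap between groups

def evaluate_refusals(model_id: str) -> float:
    # Placeholder: would run the red-team prompt suite against model_id.
    return 0.995

def demographic_parity_gap(model_id: str) -> float:
    # Placeholder: would compute approval-rate gaps on a held-out audit set.
    return 0.01

def test_harmful_prompts_are_refused():
    assert evaluate_refusals("candidate-model") >= REFUSAL_THRESHOLD

def test_fairness_gap_within_limit():
    assert demographic_parity_gap("candidate-model") <= PARITY_GAP_LIMIT
```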
Pre-deployment is where formal review happens. But the review should be designed to challenge, not to rubber-stamp. A well-designed governance mechanism should specify who reviews what evidence against what criteria, with what power to demand changes. Documentation is a byproduct of the mechanism, not the mechanism itself.
Post-deployment is where adaptive governance earns its name. Monitoring, escalation triggers, incident response protocols, and retraining pipelines all need to be designed before launch. When the environment changes, like new laws, new standards, or new scientific understanding, the governance mechanism needs a way to push those changes forward to deployed systems.
This is exactly what Fukushima lacked: a governance mechanism capable of forcing redesign as hazard understanding evolved and as vulnerabilities became much clearer. This is also why, in our work on adaptive human oversight, we treat “oversight” as an adaptive controller problem where you need rapid feedback loops and escalation paths, with explicit strategies for proactive adaptation (scenario testing and simulation) and reactive adaptation (monitoring triggers and response mechanisms).
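A minimal sketch of the reactive side, assuming hypothetical metrics and thresholds: each trigger pairs a monitored signal with the response it is wired to, so a breach produces an action owed to someone, not just a line on a dashboard.

```python
from dataclasses import dataclass

@dataclass
class Trigger:
    """A reactive-adaptation trigger: a monitored metric, a limit, and the
    response it is wired to."""
    metric: str
    limit: float
    response: str    # e.g. "open incident", "roll back to previous model"

TRIGGERS = [
    Trigger("input_drift_score", 0.30, "open incident and notify model owner"),
    Trigger("override_rate",     0.20, "pause automated decisions, force review"),
    # 1.0 is logged when a relevant legal or standards change is recorded.
    Trigger("regulatory_change", 1.00, "re-run conformity assessment for affected systems"),
]

def check(observations: dict[str, float]) -> list[str]:
    """Compare live observations to triggers and return the responses owed."""
    return [
        f"{t.metric} breached {t.limit}: {t.response}"
        for t in TRIGGERS
        if observations.get(t.metric, 0.0) >= t.limit
    ]

# Hypothetical telemetry snapshot.
print(check({"input_drift_score": 0.41, "override_rate": 0.08}))
```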
If you've been reading this blog, you know I have a habit of opening with disasters. Fukushima, Therac-25, Tenerife, Piper Alpha. It's not morbid fascination. I find each of these tragedies motivating; they hold lessons about what happens when governance structures and mechanisms prove inadequate. They're mirrors and instructive warning signs, not just history.
Fukushima shows what happens when we build elaborate governance structures around a system whose underlying design assumptions are wrong, and when the institutional system can’t reliably force those assumptions to change as new evidence arrives.
AI is at a much earlier stage, but the shape of the mistake is familiar. Lots of new rules. Lots of new artefacts. Not enough serious design of how systems behave under stress, how humans can actually intervene, and how governance acts on what it sees.
We can keep comforting ourselves that with the next law, the next standard, the next framework, safety will finally arrive. Or we can admit that we have a design problem, one that exists at multiple levels. We have the scaffolding of safe, secure and lawful AI, but not enough of the practical discipline to control behaviour in the complex adaptive systems of modern AI.
This is my 50th article of the year, and my last for 2025. As I head into next year, I plan to go deeper on this question of how we design adaptive governance. Not just what it should achieve but how to actually build it layer by layer and piece by piece. The mechanisms, the architectures, the feedback loops that make governance real and high integrity.
If you've read this far and you're ready to move beyond theory, certifications and rulebooks to learn how to genuinely design and implement AI governance in practice, I encourage you to join us in the AI Governance Foundation Program, where we teach, learn and practice exactly that. 16
Thanks for reading, and see you in 2026.
1 https://www.nirs.org/wp-content/uploads/fukushima/naiic_report.pdf
2 https://world-nuclear.org/information-library/safety-and-security/safety-of-plants/fukushima-daiichi-accident
3 https://www.tepco.co.jp/en/press/corp-com/release/betu11_e/images/111202e16.pdf
4 https://shemesh.larc.nasa.gov/iria03/p13-leveson.pdf
5 https://aicareer.pro/blog/systems-safety-engineering-in-ai
6 https://academic.oup.com/book/26482
7 https://www.schneier.com/blog/archives/2009/11/beyond_security.html
8 https://www.routledge.com/Safety-Theater-How-the-Desire-for-Perfection-Drives-Compliance-Clutter-Inauthenticity-and-Accidents/Dekker/p/book/9781032012476
9 https://www.ft.com/content/6585fb32-8a86-4ffb-a940-06b17e06345a
10 https://aicareer.pro/blog/mechanisms-for-ai-governance
11 https://governance.aicareer.pro/course/pro-3
12 https://escholarship.org/content/qt5dr206s3/qt5dr206s3.pdf
13 https://www.microsoft.com/en-us/research/publication/overreliance-on-ai-literature-review/
14 https://aicareer.pro/blog/cognitive-calibration
15 https://blog.aicareer.pro/p/meaningful-human-oversight-of-ai
16 https://governance.aicareer.pro/program/track-1-foundations