Mar 11 / James Kavanagh

The EU AI Act. I'm convinced it was written by Kafka

As an engineer reading the EU AI Act, I see so many requirements that look rational in isolation but, taken together, are incoherent and disconnected from the reality of AI systems in use.
In The Trial, Josef K. wakes up one morning to find himself under arrest. No one tells him what he's charged with. He spends the rest of the novel navigating a legal system that appears entirely rational from the inside. Every official follows procedure, every rule references another rule. But none of it connects to anything happening in the actual world. The remarkable thing is that K. participates anyway. He shows up to hearings. He hires lawyers. He exhausts himself trying to mount a defence against a charge nobody will name. The system doesn't need to make sense to compel his engagement. It doesn't need to connect with reality or truth. It has jurisdiction over him regardless. He never does find out what he's accused of.
(I won't spoil the ending - I never finished it myself after finding it maddeningly frustrating.) Fair warning, this is a long read, but I wanted to get a few things down that have been bugging me more and more about the EU AI Act. So I thought I'd share them, and see if we can navigate a way out.
I've spent twenty-five years in engineering. Some of it in environments where getting safety wrong could kill people, some where getting security wrong could cause enormous financial and reputational loss. Process safety. Cybersecurity. Cloud infrastructure. AI governance. Regulatory engagement and standards.
I work with people to build complex, intelligent systems, and my vocation is to build the technical and organisational governance infrastructure around them that keeps them safe, secure, and lawful. A large part of that career has been spent at the interface between regulation and engineering: translating safety and security requirements into engineering constraints and design features. And I've worked in the other direction, verifying and articulating how engineered systems actually achieve their safety and security goals. It's a particular kind of work. You learn to read regulation with an engineer's eye. Not to ask "is this legally coherent?", not to construct a narrative or dispute who can be held liable, but rather to ask "can we build to this? Does this map to how the technology actually behaves?"
When I get the chance to talk to an organisation about this, here's the question I'll ask: pick your highest-risk deployed AI system and tell me, with evidence, why you believe it's operating safely, securely, and lawfully right now. Not how you followed the process. Not whether the documentation is up to date. Why that specific system is behaving in a way you can defend as safe.
In some way the EU AI Act comes into my work every day, either directly or by influence. I help practitioners interpret its requirements and map its obligations onto real systems in real organisations. The more I do that work, the worse it sits with me. Because more and more I find that it regulates a conception of AI that, as far as I can tell, doesn't exist in the real world. 
At least, I've never seen it. 

An AI system in use is not a smart toaster

The AI system the Act imagines is a neat, self-contained product. It has a singular purpose. It sits in one or a few identifiable domains. It was built by a clearly defined provider who understands every aspect of how it works. It behaves the same way after deployment as it did during conformity assessment. It has a clear boundary between itself and its environment. You can point at it and say: "that is the AI system, and that is not." It's an assemblage of predictable technical components, disconnected from its users, who have no impact on its operation.
Now, I should be precise about what I'm claiming here. The EU AI Act is not literally blind to the messiness of modern AI. It addresses general-purpose AI models. It contemplates downstream providers integrating components from upstream. It acknowledges that systems can continue learning after deployment. It requires post-market monitoring. So when I say the Act imagines a neat, self-contained product, I don't mean the drafters were unaware that reality is more complicated. I mean that the regulatory architecture they built to handle that complexity is still rooted in product-safety assumptions that don't fit the subject matter. The Act tries to govern dynamic, adaptive systems using a framework designed for static, bounded ones. Sure, it acknowledges the dynamism, but the tools it reaches for keep pulling it back toward the static model. 
That's the structural problem I want to work through.
In my experience, the systems I've encountered that warrant real safety and security concern are adaptive. They're composed of components from multiple providers, operating across domain boundaries, behaving differently depending on context and input, changing their own characteristics over time. They're embedded so deeply in organisational processes that the line between "the AI system" and "the workflow it's part of" is a matter of opinion rather than fact. The Act's regulatory architecture was built for the platonic ideal. The real world is messier.
The fundamental problem, as I see it, is architectural. The EU AI Act is built on a product safety paradigm. Assess the system. Certify it. Deploy it. Monitor it for deviation from the certified state. This works beautifully for toasters. It even works reasonably well for medical devices. But an AI system, especially one deployed and in use, is not a toaster. It's a complex adaptive system operating inside other complex adaptive systems: organisations, markets, societies. The interactions between the system, its users, its data environment, and its context are constantly shifting. The Act tries to draw static, bounded lines around something that is fundamentally unbounded and dynamic. The result is a set of tensions that should concern anyone who cares about effective AI regulation. I do.
I can already hear someone saying: "But the AI Act is based on the New Legislative Framework and the Blue Guide, the EU's proven methodology for writing product safety legislation. It must be solid." And that's exactly my point. The Blue Guide is proven. It's been refined over decades for toys, machinery, radio equipment, low voltage electrical equipment, medical devices, fertilising products. Physical, bounded, manufactured goods with defined intended purposes that don't change their own characteristics after they leave the factory. The NLF was explicitly conceived for "movable and physical objects." The Blue Guide itself frames the methodology around EU product rules for goods. The pedigree of the methodology isn't in question. The question is whether the methodology fits the subject matter. For AI systems that learn, adapt, operate across domains, and behave differently depending on context, I'd argue it fundamentally doesn't.
What makes this choice especially puzzling to me is that the EU already has regulatory frameworks designed for complex, dynamic, safety-critical systems, and none of them use the NLF. European aviation safety is governed through EASA under its own Basic Regulation (EU 2018/1139), using type certification combined with continuous airworthiness obligations, mandatory safety management systems, occurrence reporting, and ongoing oversight by a specialist agency. The system is designed around the assumption that safety is not a state you certify once but a property you maintain continuously through active management. Process safety for major accident hazards is regulated under the Seveso III Directive (2012/18/EU), which requires operators to implement safety management systems with management-of-change procedures, submit safety reports to competent authorities, and maintain emergency plans. Seveso III is built around the principle that complex industrial systems change over time and that safety governance must change with them. Neither of these frameworks works by assessing a product, certifying it, and then watching for deviation from the certified state. They treat safety as an ongoing, adaptive, system-level property. The EU had these models available. But it chose instead to regulate AI, a technology whose defining characteristic is that it learns, adapts, and changes, using the framework it built for products that don't. As if a complex AI system were nothing more than a smart toaster.
Here are four pointed, specific places where I think that architectural mismatch creates real problems in practice. Four examples of why I think Kafka was holding the pen.

1. The systems that will actually hurt people don't get proportionate scrutiny

The Act classifies risk primarily by domain and use case. Annex III lists the areas where AI is considered high-risk: biometric identification, critical infrastructure, education, employment, law enforcement, migration, and so on. If your AI system operates in one of those domains, you get the full weight of Chapter III obligations. If it doesn't, you're subject to a much lighter regime.
Now consider AI companion chatbots. The Character.AI scenario. These systems are deployed at scale to people in genuine emotional distress, including minors, with no clinical guardrails and no meaningful human oversight. They don't fit any Annex III category. They're not medical devices. They're not education. They're not employment. A vulnerable person forms a parasocial attachment to a system with no duty of care, and we've already seen real-world fatalities that are at least alleged to be linked to these systems.
To be fair, the Act is not completely silent here. Article 50 imposes disclosure duties on systems that interact directly with people. Article 5's prohibited-practices provisions can catch manipulative or exploitative uses. Market surveillance authorities retain residual powers even for systems outside the high-risk framework. But none of that comes close to the level of regulatory scrutiny that the actual harm profile of these systems warrants. No conformity assessment. No risk management system obligation. No incident reporting requirement. That gap between the documented severity of harm and the regulatory response ought to bother people more than it does.
The Act asks "what sector and use case is this deployed in?" when it should also be asking "what is the nature and severity of the potential harm?" A system sorting CVs is high-risk. A system that a suicidal teenager talks to at 3am is not. As an engineer, I find that difficult to reconcile.
Now perhaps you might say: "Those are covered by the DSA, the GDPR, product liability, consumer protection." And you're right. Those instruments exist. But again, that's the weird point. If the most consequential AI harms fall outside the EU's definitive AI-specific regulation and must instead be addressed by legislation that wasn't designed with AI in mind, that tells us something about the Act's risk classification framework. It has a structural blind spot. The domain-based categorisation pulls regulatory attention toward systems that fit neatly into listed sectors and away from the more persistent, harder-to-categorise harms that don't.
What I personally cannot understand, and I mean this straightforwardly, is why the Act classifies risk by domain and use case rather than by the nature and severity of the potential harm. Every safety discipline I've ever worked in starts with the hazard. You identify what could go wrong, how badly, and for whom. Then you design controls proportionate to that risk. The EU AI Act starts with a list of sectors. If your system operates in a listed sector, it's high-risk. If it doesn't, it largely isn't. That is such an incredibly strange design choice for safety regulation, and I have not heard a convincing explanation for it.
Now, the Act does include mechanisms to adapt. Article 6 applies a significance filter to Annex III systems, and Article 7 empowers the Commission to amend Annex III by adding or modifying use cases where comparable risks arise. So the framework is not permanently frozen. But relying on future amendments to address harms that are already well-documented and already causing real-world casualties is cold comfort. The structural question is whether a domain-and-use-case classification can ever adequately capture harms that are emergent, context-dependent, and not neatly sectoral. I don't think it can.

2. The Act tries to manage continuous change, but its tools keep assuming stability

Here's where the product safety paradigm really starts to creak. The conformity framework assumes you assess a system, it passes, you deploy it, and then you monitor it against the assessed baseline. Article 43 requires a new conformity assessment when there's a "substantial modification." Perfectly sensible for a static product.
Take an adaptive cybersecurity threat detection system deployed in critical infrastructure. This one actually is caught by the Act as high-risk. But its entire value proposition depends on continuous change. New attack vectors emerge daily. If the model isn't retrained regularly, it degrades and the infrastructure it protects becomes vulnerable. Freezing the model to preserve your conformity assessment makes the system less safe, not more.
The Act does acknowledge this problem. Article 43(4) says that for systems which continue to learn after deployment, changes that were "predetermined by the provider" at the time of initial conformity assessment don't trigger a new assessment. That's a genuine attempt to accommodate adaptation. But it shifts the problem rather than solving it. The conformity assessment now covers a methodology for change rather than a specific system state. How do you certify that a process for continuous retraining will always produce compliant outcomes? What happens when the retraining process performs exactly as designed but the resulting model drifts into behaviour that would have failed the original assessment? The boundary between acceptable predetermined adaptation and a genuine substantial modification remains legally and operationally undefined.
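To make that concrete, here's a minimal sketch of the failure mode. Everything in it is hypothetical: the envelope check, the thresholds, and the metric names are inventions for illustration, not anything drawn from the Act or from a real conformity scheme. The point is simply that a retraining process can stay inside its pre-declared envelope on every cycle while the resulting model drifts away from the behaviour the original assessment certified.

# Hypothetical illustration only. The checks, metrics and thresholds are
# invented; they are not drawn from the EU AI Act or any real scheme.
from dataclasses import dataclass

@dataclass
class AssessmentBaseline:
    """Behaviour recorded at the original conformity assessment."""
    detection_rate: float       # share of known attack patterns caught
    false_positive_rate: float

@dataclass
class RetrainingCycle:
    """One scheduled retraining of a continuously learning system."""
    data_drift_score: float     # distance between old and new training data
    detection_rate: float       # measured after retraining
    false_positive_rate: float

def within_predetermined_envelope(cycle: RetrainingCycle) -> bool:
    # The provider's pre-declared "predetermined change": retraining is
    # acceptable as long as the new data hasn't drifted too far.
    return cycle.data_drift_score <= 0.30

def meets_original_assessment(cycle: RetrainingCycle,
                              baseline: AssessmentBaseline) -> bool:
    # What the original assessment actually certified.
    return (cycle.detection_rate >= baseline.detection_rate - 0.02
            and cycle.false_positive_rate <= baseline.false_positive_rate + 0.01)

baseline = AssessmentBaseline(detection_rate=0.95, false_positive_rate=0.02)

cycles = [
    RetrainingCycle(0.10, 0.94, 0.02),
    RetrainingCycle(0.20, 0.93, 0.03),
    RetrainingCycle(0.25, 0.88, 0.05),
]

for i, cycle in enumerate(cycles, start=1):
    print(f"cycle {i}: envelope ok = {within_predetermined_envelope(cycle)}, "
          f"original assessment ok = {meets_original_assessment(cycle, baseline)}")

# Every cycle passes the predetermined-change check, yet by the third the
# system no longer exhibits the behaviour that was certified, without ever
# triggering a "substantial modification".

Each individual retraining looks compliant against the process that was assessed; it's the accumulated behaviour that quietly walks away from the certified state.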
Researchers have already flagged this gap. Floridi, Holweg and colleagues, in their capAI conformity assessment procedure, identified the operational phase as the most significant gap in business compliance processes governing AI, precisely because unlike conventional software, AI systems are not fixed in their characteristics once deployed. The Act acknowledges continuous learning exists. It just hasn't resolved what continuous learning means for the static assumptions still embedded in its conformity framework.
The monitoring obligation under Article 26(5) compounds this. The deployer must monitor the system against the instructions for use. But the instructions for use describe a system whose correct behaviour is to change. How do you monitor for anomalous behaviour when the baseline is moving by design? And how do you deal with a provider that simply writes overly broad instructions for use, in doing so transferring obligations onto deployers?
If you're a safety science guy who reads Hollnagel for fun, then I think you'll recognise the problem immediately. The Act is Safety-I thinking applied to a system that needs Safety-II governance. It defines what should happen and watches for deviation, when what it actually needs is to understand the conditions under which the system goes right and ensure those conditions are maintained as it evolves.

3. The entity with the heaviest compliance burden has the least visibility into what matters most

This one is really bad, and I think a genuinely unfair aspect of the AI Act. It's structurally somewhat absurd. Under the Act, a GPAI model provider (think OpenAI, Anthropic, Google) has relatively contained obligations under Article 53: technical documentation, information for downstream providers, copyright policy. Manageable.
A downstream company takes that model and integrates it into a high-risk AI system. Under Article 25, that downstream company becomes the provider of the high-risk system and inherits the full weight of Chapter III: risk management, data governance, conformity assessment, the lot.
The conformity assessment under Article 43 requires this downstream provider to demonstrate that the whole system meets the requirements. But the system's behaviour is substantially determined by the foundation model, which they didn't build, may not fully understand, and often cannot inspect in depth.
The Act does try to bridge this gap. Article 53 requires GPAI providers to supply documentation and information that enables downstream providers to understand capabilities and limitations. Article 25(4) requires written agreements specifying necessary information, capabilities, technical access, and other assistance. That's not nothing, but Article 25 also explicitly protects intellectual property rights, confidential business information, and trade secrets.
So the Act creates this tension. It simultaneously requires the downstream provider to demonstrate compliance across the whole system and protects the upstream provider's right to limit disclosure of how the most consequential component actually works. The information-sharing obligations and the confidentiality protections pull in opposite directions, and the Act doesn't clearly resolve which prevails when they collide. The entity with the deepest understanding of the model has lighter obligations. The entity with the heaviest compliance burden has to work within whatever the upstream provider chooses to disclose.
Imagine product safety law requiring a car manufacturer to certify engine safety while the engine supplier has a legally protected right to limit what engineering specifications it shares. The manufacturer has a right to request the information, and the supplier has an obligation to provide it. But the supplier also has a protected right to withhold what it considers proprietary. That's where we seem to be with the EU AI Act.
Hacker and Holweg, in their 2025 paper on regulating fine-tuning, propose a form of "federated compliance" structure precisely because the Act's linear value chain model doesn't match the distributed reality of GPAI development and deployment. Their framework includes joint testing of base and modified models, Failure Mode and Effects Analysis, and a shared database for GPAI modifications. These are seriously heavyweight techniques, but the fact that entirely new compliance architectures are being proposed to bridge the gap is genuine evidence that the gap is real.
Right now, the only viable solution is probably never to include a GPAI model in any high-risk AI system.

4. The same technology gets radically different treatment depending on context, but the rationale doesn't fully justify the distinction

Article 5(1)(f) prohibits the use of AI systems to infer emotions in workplace and education settings. Already in force. The rationale, per Recital 44, cites both the limited reliability of emotion inference from biometric data and the power imbalances inherent in employment and educational relationships. The combination of intrusive, unreliable technology and contexts where people cannot meaningfully object. Fair enough, that's good.
But the real world is a bit more complicated. Outside workplaces and educational institutions, the same technology inferring the same emotional states from the same biometric signals is not prohibited; it is regulated. Where emotion-recognition systems fall outside the Article 5 prohibition, they sit in the Act's high-risk and transparency architecture rather than being banned outright. But they're permitted. The European Commission's own guidelines use a call centre as an example: monitoring whether agents are stressed is prohibited; the same system monitoring whether customers are angry is allowed under the high-risk framework.
I get the logic. The power-asymmetry rationale does create a meaningful distinction between employees who can't walk away and customers who can (in theory) take their business elsewhere. I accept that basic reasoning.
But I have two problems with it. First, the scientific objection doesn't disappear at the workplace door. If inferring emotions from biometric data is too unreliable to use on employees (and the predominant scientific consensus is that it is unreliable), that unreliability is a property of the technology, not the context. The high-risk framework subjects the customer-facing use to conformity assessment and transparency requirements, but it doesn't address the underlying scientific problem that partly motivated the prohibition in the first place. We're saying: this technology is too unreliable to use on employees, but reliable enough to regulate-and-permit for customers, as long as you document and disclose it. That's a coherent regulatory choice, but it's worth being explicit that it's a policy trade-off, not a resolution of the scientific concern.
Second, the power-asymmetry rationale should logically extend to other coercive contexts that the prohibition doesn't cover. Prisons. Immigration detention. Social welfare offices. A person applying for benefits at a government office has no more ability to "walk away" than an employee does. If the rationale is really about power imbalance combined with intrusive technology, the scope of the prohibition doesn't match the scope of the rationale. So did those potential use cases just not crop up in the brainstorming session?
And then there are the dual-context settings that make clean boundaries difficult in practice. A university bookshop is both retail and educational. A hospital is a workplace for doctors and a medical facility for patients. An employee doing professional development online is simultaneously in a workplace and, as a student, in an educational institution. The Act draws sharp binary lines through contexts that are, in reality, continuous and overlapping.

So what's actually going on here?

I think the EU AI Act is the product of legislators doing their honest best to regulate something genuinely unprecedented, using the tools and frameworks they had available. Product safety regulation is what the EU does well. It was natural to reach for that toolbox.
But the toolbox assumes bounded systems in defined domains with stable behaviour and clear supply chain responsibilities. AI systems in use are none of those things. They cross domain boundaries. They change continuously. Their risk profiles are determined by context, not category. Their value chains distribute knowledge and accountability in ways that a linear supply chain model can't capture.
The EU AI Act isn't blind to this. It addresses general-purpose AI, contemplates continuous learning, creates mechanisms for downstream integration, and includes powers to amend its risk classifications over time. My objection is not that the Act ignores these realities. My objection is that it addresses them using a regulatory architecture still fundamentally rooted in product-law assumptions, and that fit remains awkward for adaptive, context-sensitive AI systems.
Some will argue that the harmonised standards process now underway in CEN/CENELEC will resolve these tensions. I'm sceptical. Standards operationalise a regulatory framework; they don't redesign it. The standards being developed by JTC 21 map directly to the Act's existing articles: risk management, data governance, quality management systems, conformity assessment. They translate the Act's requirements into technical detail. If the architecture is misaligned, the standards encode the misalignment more precisely. That's not a fix. And as we know, the process itself is under enormous time pressure, behind schedule, largely opaque to the practitioners who will need to implement the results, and shaped disproportionately by the organisations with the resources to put people in the room. None of that inspires confidence that the operational awkwardness will be resolved at the standards layer.
And while I'm getting things out there, there are also missed opportunities that deserve mention. Here's one: labelling. The Act's visible output to the world is a CE mark. Binary. You passed or you didn't. For a toaster, that's fine. You don't need to know how safe the toaster is, just that it met the threshold. But for AI systems, where risk is contextual, where the same system behaves differently in different deployments, where a deployer's choices and the mode of a user's interaction materially affect the risk profile, a binary stamp tells you almost nothing useful. It doesn't tell a procurement team what trade-offs were made. It doesn't tell a deployer what to watch for. It doesn't help a user or an affected person understand what kind of system they're interacting with. 
And the EU knows how to do better than this. Energy efficiency gets an A-to-G rating that drives both purchasing decisions and manufacturer behaviour. Tyres get graded on fuel efficiency, wet grip, and noise. These are informational mechanisms designed to shape behaviour, not just certify minimum compliance. If you wanted to drive genuine adoption of responsible AI practices, rather than just policing a threshold, graduated and informative labelling would be one of the most powerful tools available. And the Act doesn't use it.
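To be clear about how modest the mechanical part of this would be, here's a toy sketch of a graduated, multi-dimensional label. The dimensions, scores and band boundaries are entirely invented for illustration; nothing like them appears in the Act. The hard work would be agreeing what to measure and how to measure it, not producing the label itself.

# Toy sketch of a graduated AI label, in the spirit of the EU energy and
# tyre labels. Dimensions, scores and band boundaries are invented for
# illustration; nothing like this exists in the EU AI Act.

def band(score: float) -> str:
    """Map a 0-100 score onto an A-G band."""
    for threshold, grade in [(85, "A"), (70, "B"), (55, "C"),
                             (40, "D"), (25, "E"), (10, "F")]:
        if score >= threshold:
            return grade
    return "G"

# A label with several graded dimensions, rather than a single pass/fail mark.
system_scores = {
    "robustness under distribution shift": 62,
    "transparency of intended use and limitations": 81,
    "effectiveness of human oversight in deployment": 43,
}

for dimension, score in system_scores.items():
    print(f"{dimension}: {band(score)}")

A procurement team looking at a couple of C's and a D learns far more from that than from a single CE mark.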
I believe that every one of the tensions I've described above comes from the same root: the Act applies static, categorical, domain-bounded thinking to something that is dynamic, contextual, and emergent. Any individual provision can be defended in isolation. The problem is that the underlying model of what an AI system is, and how it behaves in the world, doesn't quite match the reality those provisions are trying to govern.
I'm an engineer, not a lawyer. I'm a European living on the other side of the world. But the question I'm raising isn't a legal one. It's an architectural one. Can a product safety paradigm govern systems that aren't products in any meaningful sense? I don't think legal reasoning alone resolves that, any more than engineering reasoning alone resolves questions of fundamental rights. 
I believe in serious conversations across both disciplines. Not ones where lawyers explain the Act to engineers, but ones where engineers and lawyers work together on whether the underlying framework is fit for what it's being asked to do, and how it should be adapted to do so.
None of this is an argument against regulation. I want to be clear about that. AI systems need safety regulation. Good regulation protects people, creates accountability, and gives engineers like me something meaningful to build toward. Safety regulation saves lives, and protects against harm. I've seen what happens when it's absent or weak. The question has never been whether to regulate AI. 
The EU AI Act is law, and rightly so. As engineers and practitioners, our job is to understand it, translate its requirements into engineering constraints, and apply it as rigorously as we can. That's what we do. That's what I help people do every day. But doing that work diligently doesn't make the feeling go away. The feeling that you're inside a system that looks rational on its own terms, follows its own logic, and was built with genuine intent, but that breaks down the moment you try to connect it to what's actually happening in the world outside. 
Josef K. would recognise it immediately.
Thank you for reading this long article, and especially to all of you within our practitioner community.  I originally published this on LinkedIn where you may find some additional commentary and debate:  https://www.linkedin.com/pulse/engineer-reading-eu-ai-act-im-convinced-written-kafka-james-kavanagh-lmqdc/
If you want to build the kind of AI governance that works for real, covering both the theory and the practice, then the AI Governance Practitioner Program is for you.

References

1. EU AI Act: Regulation (EU) 2024/1689 of the European Parliament and of the Council of 13 June 2024 laying down harmonised rules on artificial intelligence. https://eur-lex.europa.eu/eli/reg/2024/1689/oj/eng
2. Blue Guide: European Commission, 'The Blue Guide on the implementation of EU product rules 2022' (OJ C 247, 29.6.2022). https://eur-lex.europa.eu/legal-content/EN/TXT/?uri=oj:JOC_2022_247_R_0001
3. EASA Basic Regulation: Regulation (EU) 2018/1139 of the European Parliament and of the Council of 4 July 2018 on common rules in the field of civil aviation and establishing the European Union Aviation Safety Agency. https://www.easa.europa.eu/en/regulations/basic-regulation
4. Seveso III Directive: Directive 2012/18/EU of the European Parliament and of the Council of 4 July 2012 on the control of major-accident hazards involving dangerous substances. https://www.hse.gov.uk/seveso/index.htm
5. Hacker, P. and Holweg, M. (2025) 'The Regulation of Fine-Tuning: Federated Compliance for Modified General-Purpose AI Models', Computer Law & Security Review. https://doi.org/10.1016/j.clsr.2025.106234
6. Hollnagel, E. (2014) Safety-I and Safety-II: The Past and Future of Safety Management. Farnham: Ashgate.
7. CEN-CENELEC JTC 21: Work programme and standards under development in support of the EU AI Act. https://jtc21.eu/
8. Smith, A.L. (2025) 'The CEN-CENELEC JTC 21 work programme supporting the EU AI Act', AI regulation, standards and reality (Substack). https://adamleonsmith.substack.com/p/the-cen-cenelec-jtc-21-work-programme