EU AI Act Cybersecurity Requirements for High-Risk AI Systems

A practitioner's guide to Article 15 cybersecurity, Article 9 risk management, and Article 17 quality management for high-risk AI providers. What auditors will actually ask in 2026.

Articles 9, 12, 14, 15, and 17 are the ones that land on security teams in 2026. Article 15 is the headline (accuracy, robustness, cybersecurity); Article 12 is the one teams underestimate (automatic logging across the full lifecycle); Article 14 is the one agentic systems break first (human oversight on destructive operations). This is the practitioner's reading.

The EU AI Act creates a horizontal regulatory regime for AI systems across the European Union. The strictest obligations fall on providers of high-risk AI systems (Annex III categories). Many of those obligations apply from 2 August 2026, a date that is closer than the calendar makes it look once you factor in conformity assessment timelines.

We have walked roughly twenty customers through their first internal Article-9-through-17 review since the start of the year. The pattern is consistent enough that this article exists. The auditor questions near the end are paraphrased from the procurement conversations we have actually been in, not from the text of the Act.

Who counts as a "high-risk AI system provider"

Annex III names eight categories: biometric identification, critical infrastructure, education, employment, essential private and public services (credit scoring, insurance pricing, public benefits, emergency-call triage), law enforcement, migration and border control, and administration of justice. If your AI system supports decisions in any of these categories, you are likely a high-risk provider.

The clarifier we end up giving on every call: it is the AI system plus its deployment context that defines high risk, not the model. A general-purpose LLM is not high-risk in itself; the customer-facing credit assistant built on top of it is. The same foundation model can power one application that is high-risk and another that is not, and the conformity assessment lives at the application layer.

Article 9: Risk management system

Article 9 requires a documented, continuous risk-management process across the AI system's lifecycle. The auditor's mental model is "show me you ran a structured risk assessment, identified mitigations, and re-ran after every material change."

Practical evidence:

  • A written risk register with AI-specific entries (prompt injection, sensitive disclosure, hallucination, supply-chain compromise of the model, model drift).
  • A cadence: the risk register is reviewed at defined intervals AND on every material change (new model version, new tool added, scope expansion).
  • A traceability link from each identified risk to a mitigation (technical control, organisational control, or accepted-risk decision with rationale).
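
One way to keep those three properties checkable is to hold the register as structured data and lint it. The following is a minimal sketch, not a prescribed format: the `RiskEntry` fields, the 90-day review interval, and the example entries are all assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional


@dataclass
class RiskEntry:
    risk_id: str
    description: str
    mitigation: Optional[str]          # technical or organisational control
    accepted_rationale: Optional[str]  # filled only for accepted-risk decisions
    last_reviewed: date


def register_gaps(entries, today, max_age_days=90, changed_since=None):
    """Return (risk_id, problem) pairs that break traceability or cadence."""
    gaps = []
    for e in entries:
        # Traceability: every risk needs a mitigation or a reasoned acceptance.
        if not e.mitigation and not e.accepted_rationale:
            gaps.append((e.risk_id, "no mitigation or accepted-risk rationale"))
        # Cadence: reviewed within the interval AND after the last material change.
        if today - e.last_reviewed > timedelta(days=max_age_days):
            gaps.append((e.risk_id, "review overdue"))
        elif changed_since is not None and e.last_reviewed < changed_since:
            gaps.append((e.risk_id, "not re-reviewed since last material change"))
    return gaps


# Illustrative entries only; IDs and dates are made up.
register = [
    RiskEntry("R-001", "Prompt injection", "input/output filtering", None,
              date(2026, 3, 1)),
    RiskEntry("R-002", "Model drift", None, None, date(2026, 3, 20)),
    RiskEntry("R-003", "Supply-chain compromise of the model", None,
              "accepted: vendor attestation on file", date(2025, 11, 1)),
]
gaps = register_gaps(register, today=date(2026, 4, 1),
                     changed_since=date(2026, 3, 10))
for risk_id, problem in gaps:
    print(risk_id, problem)
```

Passing the date of the last material change (a new model version, say) as `changed_since` is what turns "reviewed at defined intervals" into "and on every material change".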

Where AI security posture management (AI-SPM) helps: scheduled adversarial scans, finding triage, and control mapping together produce this register continuously. Each finding ships with the framework reference; the audit log records every status change. We covered the practical control mapping in the NIST AI 600-1 walkthrough, which overlaps substantially with how a defensible Article 9 risk register reads.

Article 10: Data and data governance

Article 10 sets standards for training, validation, and test datasets: relevance, representativeness, freedom from errors, completeness, statistical bias mitigation. For LLM applications built on third-party foundation models, the practical question is "what data does your system retrieve and use at inference time?"

Practical evidence:

  • An inventory of every data source the AI system retrieves from at inference (RAG corpora, tool outputs, vector stores).
  • A documented data-quality process for each source.
  • Tenant-isolation testing where shared embedding stores are in use. We dedicated a separate post to the specific failure mode that auditors care about here.
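
Tenant-isolation testing can be as simple as planting a canary string in one tenant's corpus and probing for it as another tenant. The sketch below uses a toy in-memory store with keyword matching instead of real embeddings; `SharedStore`, `Chunk`, `leaked_chunks`, and the `enforce_tenant` switch are hypothetical names, not any product's API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    tenant_id: str
    text: str


class SharedStore:
    """Toy stand-in for a shared embedding store (keyword overlap, no vectors)."""

    def __init__(self, chunks, enforce_tenant=True):
        self._chunks = list(chunks)
        self._enforce_tenant = enforce_tenant

    def search(self, tenant_id, query, k=5):
        # With enforcement on, the tenant filter runs before ranking, so other
        # tenants' chunks never enter the candidate set.
        candidates = [c for c in self._chunks
                      if not self._enforce_tenant or c.tenant_id == tenant_id]
        words = query.lower().split()
        ranked = sorted(candidates,
                        key=lambda c: -sum(w in c.text.lower() for w in words))
        return ranked[:k]


def leaked_chunks(store, tenant_id, probes):
    """Run probe queries as one tenant and collect any cross-tenant results."""
    return [c for q in probes for c in store.search(tenant_id, q)
            if c.tenant_id != tenant_id]


chunks = [
    Chunk("tenant-a", "Loan policy for retail customers"),
    Chunk("tenant-b", "CANARY-B internal pricing memo"),
]
probes = ["CANARY-B pricing memo", "loan policy"]

isolated = leaked_chunks(SharedStore(chunks), "tenant-a", probes)
leaky = leaked_chunks(SharedStore(chunks, enforce_tenant=False), "tenant-a", probes)
print(len(isolated), len(leaky))
```

The second store exists only to show that the probe catches a missing filter; in a real deployment the same probe set would run against the production retrieval path on a schedule.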

Article 12: Record-keeping

Article 12 requires automatic logging of AI system events sufficient to ensure traceability for the system's lifetime. For LLM applications, this means inference-time event logging, not just training-time. This is the article that surprises teams most often, because their existing logging answers operational questions but not regulatory ones.

Practical evidence:

  • Append-only logs of inference events with enough metadata to reconstruct a decision: input class, system prompt version, model version, retrieval source, output, user identity (where applicable).
  • Retention long enough for the system's expected lifetime (typically multi-year, and ten years for some regulated industries).
  • A tamper-evident channel so the auditor has a second-source trail even if an actor with database-administrator access changed something.
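
A hash chain is one common way to make an append-only log tamper-evident: each record commits to the hash of the one before it, so a silent edit breaks every later link. This is a minimal sketch under our own assumptions; the event fields are illustrative, and a production system would also write the latest hash to a second system that database administrators cannot modify.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first record


def _digest(prev_hash, event):
    payload = json.dumps(event, sort_keys=True)
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()


def append_event(log, event):
    """Append an inference event, chaining it to the previous record's hash."""
    prev_hash = log[-1]["hash"] if log else GENESIS
    record = {"event": event, "prev_hash": prev_hash,
              "hash": _digest(prev_hash, event)}
    log.append(record)
    return record["hash"]


def verify_chain(log):
    """Return the index of the first broken record, or None if intact."""
    prev_hash = GENESIS
    for i, record in enumerate(log):
        if (record["prev_hash"] != prev_hash
                or record["hash"] != _digest(prev_hash, record["event"])):
            return i
        prev_hash = record["hash"]
    return None


log = []
append_event(log, {"input_class": "credit_query", "prompt_version": "p-7",
                   "model_version": "m-1", "output": "approve"})
head = append_event(log, {"input_class": "credit_query", "prompt_version": "p-7",
                          "model_version": "m-1", "output": "decline"})
intact = verify_chain(log)

# Simulate an edit by someone with direct database access.
log[0]["event"]["output"] = "decline"
broken_at = verify_chain(log)
print(intact, broken_at)
```

The chain alone does not stop an actor who rewrites every subsequent hash as well, which is why the stored `head` value needs to live somewhere that actor cannot reach.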

Article 14: Human oversight

Article 14 requires the AI system to be designed for effective human oversight: the human operator must be able to understand the output, intervene, and override. For agentic systems this article is the one we see broken first, because tool-calling agents do things faster than humans can react to.

Practical evidence:

  • Per-decision rationale capture and surfacing.
  • An override flow with audit trail.
  • For agentic systems, explicit human-in-the-loop on destructive operations. We discussed this control in the MCP security checklist at line item six. The same control answers Article 14.
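
For the agentic case, the control usually takes the shape of a gate between the agent and its tools. The sketch below is one possible shape, not a reference design: `ToolGate`, `ApprovalRequired`, and the tool names in `DESTRUCTIVE_TOOLS` are hypothetical, and a real gate would verify the approver's identity rather than accept a string.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

# Hypothetical tool names; a real list would come from the agent's tool registry.
DESTRUCTIVE_TOOLS = {"delete_record", "send_payment"}


class ApprovalRequired(Exception):
    """Raised when a destructive tool call arrives without a named approver."""


@dataclass
class ToolGate:
    audit_log: list = field(default_factory=list)

    def call(self, tool_name, tool_fn, args, approver=None):
        if tool_name not in DESTRUCTIVE_TOOLS:
            decision = "auto"
        elif approver:
            decision = "approved"
        else:
            decision = "blocked"
        # Every attempt is recorded, including the blocked ones.
        self.audit_log.append({
            "at": datetime.now(timezone.utc).isoformat(),
            "tool": tool_name,
            "args": args,
            "approver": approver,
            "decision": decision,
        })
        if decision == "blocked":
            raise ApprovalRequired(f"{tool_name} needs human approval")
        return tool_fn(**args)


deleted = []


def delete_record(record_id):
    deleted.append(record_id)


gate = ToolGate()
gate.call("lookup_record", lambda record_id: record_id, {"record_id": 7})
try:
    gate.call("delete_record", delete_record, {"record_id": 7})
except ApprovalRequired:
    pass  # the agent would surface this to a human operator
gate.call("delete_record", delete_record, {"record_id": 7},
          approver="alice@example.com")
decisions = [entry["decision"] for entry in gate.audit_log]
print(decisions, deleted)
```

Logging the blocked attempt as well as the approved one gives the override flow its audit trail in the same place.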

Article 15: Accuracy, robustness, and cybersecurity

This is the article most directly relevant to security teams.

"High-risk AI systems shall be designed and developed in such a way that they achieve an appropriate level of accuracy, robustness and cybersecurity, and that they perform consistently in those respects throughout their lifecycle."

For LLM applications the operational subset is:

  • Robustness against adversarial inputs. Prompt injection (direct, indirect via RAG, tool-output), jailbreaks, persona-shift exploits.
  • Resilience to data poisoning. Training data poisoning for fine-tuned models; corpus tainting for RAG.
  • Cybersecurity of the AI system itself. Authentication, authorisation, secrets management, supply-chain security.

Practical evidence:

  • Adversarial test suite with documented coverage of attack categories. The prompt-injection testing post covers the multi-category test pattern.
  • Continuous testing cadence (a single-point-in-time pentest is not "throughout the lifecycle"). We say it three times in this article on purpose, because three out of four organisations we have walked through this still expect annual cadence to suffice.
  • Incident response and patching cadence aligned to vulnerability severity.
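
"Documented coverage" and "continuous cadence" can both be checked from the test-run history. The sketch below assumes each run is recorded with a category and a timestamp; the category identifiers are our own labels for the attack classes listed above, and the seven-day window is an assumption, not a figure from the Act.

```python
from datetime import datetime, timedelta, timezone

# Our own labels for the attack classes in the operational subset above.
REQUIRED_CATEGORIES = {
    "direct_prompt_injection",
    "indirect_prompt_injection_rag",
    "tool_output_injection",
    "jailbreak",
    "persona_shift",
    "rag_corpus_tainting",
}


def coverage_report(test_runs, now, max_age=timedelta(days=7)):
    """Report categories with no test run, or whose latest run is too old."""
    latest = {}
    for run in test_runs:
        cat, ran_at = run["category"], run["ran_at"]
        if cat not in latest or ran_at > latest[cat]:
            latest[cat] = ran_at
    tested = set(latest)
    untested = sorted(REQUIRED_CATEGORIES - tested)
    stale = sorted(c for c in REQUIRED_CATEGORIES & tested
                   if now - latest[c] > max_age)
    return {"untested": untested, "stale": stale}


now = datetime(2026, 4, 3, tzinfo=timezone.utc)
runs = [
    {"category": "jailbreak",
     "ran_at": datetime(2026, 4, 1, tzinfo=timezone.utc)},
    {"category": "direct_prompt_injection",
     "ran_at": datetime(2026, 3, 1, tzinfo=timezone.utc)},
]
report = coverage_report(runs, now)
print(report)
```

An empty `untested` list with an empty `stale` list is the state an annual pentest cannot produce, which is the point of the cadence requirement.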

Article 17: Quality management

Article 17 requires a documented quality management system covering the development, testing, deployment, and post-market monitoring of high-risk AI systems. ISO/IEC 42001 is the natural framework for this; AI Act conformity assessments often reuse 42001 evidence. If you already maintain a 42001-aligned AI management system (AIMS), the Article 17 evidence is largely the same documentation with a different cover letter.

What auditors will actually ask in 2026

Based on procurement conversations across banking, insurance, healthcare, and public-sector tenders:

  1. Show me the asset inventory. Every AI system you are claiming compliance for needs to be in the inventory. The follow-up question is "what is in the system inventory that is NOT in the compliance scope, and how do you decide?"
  2. Show me the risk register entry for prompt injection. Plus the mitigation, plus when you last verified it. The verification date is the question that catches teams.
  3. Show me the cybersecurity testing cadence. A one-time pentest will not satisfy "throughout the lifecycle." A scheduled programme with the most recent run date in the last seven days is the safe answer.
  4. Show me the human oversight flow for destructive operations. Particularly relevant for agentic systems. Auditors increasingly know what an MCP server is and will ask if the agent's tool list has been reviewed.
  5. Show me the post-market monitoring evidence. How do you know your AI is still behaving the way the conformity assessment said it would? This is the question Article 72 puts in the room and most teams do not have a clean answer the first time.

Where to start

Three concrete steps for a security team facing Article 15 in 2026:

  1. Inventory. Catalogue every LLM application, agent, MCP server, RAG pipeline, vector database, and runtime gateway. Anything missing is a finding waiting to happen. The asset inventory post covers the eight categories we recommend.
  2. Adversarial testing on a schedule. Daily or weekly. The OWASP LLM Top 10 plus OWASP Agentic Top 10 are the right starting catalogue.
  3. Control-mapped evidence pack. Every finding tied to an Article plus Annex IV control. PDF and JSON exports for the auditor.
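
The evidence pack in step 3 is, at its simplest, findings grouped by the Article they map to, with anything unmapped surfaced rather than silently dropped. The sketch below assumes a flat list of finding dictionaries; the field names and the example findings are illustrative, and Annex IV references are left out because we do not want to invent control identifiers.

```python
import json
from collections import defaultdict


def build_evidence_pack(findings):
    """Group findings by mapped Article; list the IDs of unmapped findings."""
    by_article = defaultdict(list)
    unmapped = []
    for finding in findings:
        if finding.get("article"):
            by_article[finding["article"]].append(finding)
        else:
            unmapped.append(finding["id"])
    return {"by_article": dict(by_article), "unmapped": unmapped}


def export_json(pack):
    """Serialise the pack deterministically so two exports can be diffed."""
    return json.dumps(pack, indent=2, sort_keys=True)


# Illustrative findings only.
findings = [
    {"id": "F-1", "title": "Indirect prompt injection via RAG",
     "article": "Article 15", "status": "open"},
    {"id": "F-2", "title": "Agent deletes records without approval",
     "article": "Article 14", "status": "fixed"},
    {"id": "F-3", "title": "Verbose error pages",
     "article": None, "status": "open"},
]
pack = build_evidence_pack(findings)
exported = export_json(pack)
print(exported)
```

A non-empty `unmapped` list is itself worth reviewing before the auditor does: it is either a finding outside the compliance scope or a gap in the control mapping.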

This is the discipline that AI-SPM platforms automate. Read the compliance page or request an architecture review.
