EU AI Act Annex IV: Documentation Checklist for AI Systems
Before a high-risk AI system can be placed on the EU market, its provider must assemble technical documentation that demonstrates compliance with the EU AI Act’s requirements. This documentation is not a one-time filing — it must be kept current throughout the system’s lifetime and made available to supervisory authorities on request.
Annex IV of the EU AI Act specifies exactly what this documentation must contain. It’s one of the most specific and actionable sections of the regulation — and one of the least discussed in engineering circles.
TL;DR
- Annex IV defines eight documentation categories required for high-risk AI systems
- Documentation must be assembled before market placement and kept current throughout operation
- Key gaps for AI agent deployments: intended purpose definition, training data description, risk management evidence, and monitoring procedures
- Documentation must be “appropriate for the purpose of demonstrating compliance” — functional, not just present
- Annex IV documentation connects directly to Articles 9, 12, 13, 14, 15, and 17
The Eight Documentation Categories
Annex IV specifies eight categories of technical documentation. For each, the regulation requires documentation that is “appropriate, relevant and understandable.”
Category 1: General Description of the AI System
Required content:
- Intended purpose of the AI system
- Name and version of the software
- How the AI system interacts with hardware or software it is not itself part of
- Versions of relevant software or firmware, and the requirements for their updates
For AI agents, this category requires a clear, specific statement of intended purpose that maps to the Annex III high-risk category the system qualifies under. Vague purpose statements (“provide decision support”) do not satisfy this requirement — you need specificity about what decisions, for what users, in what context.
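One way to force that specificity is to record the intended purpose as structured fields rather than free text. A minimal sketch in Python — the field names and example values are illustrative, not mandated by the Act:

```python
from dataclasses import dataclass

# Illustrative structure only -- the Act mandates the content,
# not these field names.
@dataclass
class IntendedPurpose:
    decision: str            # what decision the system supports or makes
    users: str               # who operates the system
    affected_persons: str    # whose outcomes it influences
    context: str             # business process and jurisdiction
    annex_iii_category: str  # the high-risk category it maps to

purpose = IntendedPurpose(
    decision="automated pre-screening of personal loan applications",
    users="retail credit underwriting team",
    affected_persons="individual applicants with verifiable income",
    context="EU consumer lending, within a fixed amount ceiling",
    annex_iii_category="creditworthiness evaluation (Annex III)",
)
```

Each field that is empty or generic ("decision support", "various users") is a signal that the purpose statement would not satisfy Category 1.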
Category 2: Design Specifications and Development Process
Required content:
- Overall design of the AI system (technical specifications)
- Design choices, assumptions, and design constraints
- Training methodologies, techniques, and main design choices
- Design specifications for input data
- Model architecture and computational resources used
For AI agents using foundation models (Claude, GPT-4, Gemini), this requires documentation of: which foundation model is used, which version, how you’ve configured it (system prompts, fine-tuning if any), and the constraints you’ve applied through your governance layer. You don’t need to document Anthropic’s internal model design — you need to document your deployment configuration.
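A deployment configuration record for this category might look like the following sketch, which assumes the agent wraps a pinned third-party model. All identifiers and field names here are hypothetical placeholders:

```python
from dataclasses import dataclass, field

# Hypothetical Category 2 record: your deployment configuration,
# not the foundation model's internals, is what you document.
@dataclass
class DeploymentConfig:
    foundation_model: str      # provider's model identifier
    model_version: str         # pinned version, never "latest"
    fine_tuned: bool           # whether any custom training was applied
    system_prompt_sha256: str  # hash identifying the deployed system prompt
    governance_constraints: list[str] = field(default_factory=list)

config = DeploymentConfig(
    foundation_model="example-model",        # placeholder identifier
    model_version="2025-01-01",
    fine_tuned=False,
    system_prompt_sha256="<hash of prompt>",  # placeholder value
    governance_constraints=[
        "tool calls restricted to read-only endpoints",
        "human approval required above risk threshold",
    ],
)
```

Pinning the model version and hashing the system prompt makes the documented configuration verifiable against what is actually running.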
Category 3: Information on Training, Validation, and Testing
Required content:
- Training, validation, and testing data sets (or, for foundation model-based systems, the data governance practices applied)
- Methods used to examine training data
- Accuracy, robustness, and other performance metrics
- Methodology for testing, including testing of discriminatory effects
This is the Article 15 evidence category. Test results, accuracy metrics, subgroup analysis, and adversarial test results belong here. For AI agents using foundation models without custom training, document the evaluation methodology you applied before deployment and the results.
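As a sketch of what subgroup analysis evidence can look like, here is per-group accuracy computed over a labelled pre-deployment test set. The group labels and records are invented for illustration:

```python
from collections import defaultdict

def subgroup_accuracy(records):
    """Per-subgroup accuracy from (group, predicted, actual) test records."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for group, predicted, actual in records:
        total[group] += 1
        if predicted == actual:
            correct[group] += 1
    return {group: correct[group] / total[group] for group in total}

# Invented example records -- real evidence would come from your test set.
results = subgroup_accuracy([
    ("employed", "approve", "approve"),
    ("employed", "deny", "approve"),
    ("self_employed", "approve", "approve"),
    ("self_employed", "deny", "deny"),
])
# A large accuracy gap between subgroups is exactly the kind of
# discriminatory effect the testing methodology must surface.
```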
Category 4: Monitoring, Functioning, and Control
Required content:
- Capabilities and limitations of the AI system
- The level of accuracy, robustness, and cybersecurity achieved (and the level against which it is measured)
- Reasonably foreseeable unintended uses and misuse
- Processes in place for human oversight
- Input data specifications and the conditions under which the system may fail or produce incorrect results
This is one of the most demanding categories for AI agent deployments. The “capabilities and limitations” section must honestly document what the agent can and cannot reliably do — not just what it’s designed to do. Known failure modes must appear here.
Category 5: Risk Management System
Required content:
- Summary of the risks identified under Article 9
- Mitigations applied
- Residual risks and the basis for accepting them
This is the Article 9 compliance evidence category. The risk management system must be summarized in the Annex IV documentation, with a clear record of the risks identified, the mitigations applied, and the basis on which residual risks were accepted.
Category 6: Changes Made to the System
Required content:
- Changes made to the system during its lifetime
- Changes that were pre-approved by a notified body (where applicable)
- Change log with dates, nature of changes, and compliance impact assessment
For AI agents, this category requires a change management process that documents: model version updates, governance rule changes, scope expansions, and any architectural changes. Each entry must include a compliance impact assessment — does this change affect the system’s compliance posture?
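A change log entry carrying that impact assessment can be a simple structured record. A sketch — the field names and example values are illustrative, not prescribed by Annex IV:

```python
from dataclasses import dataclass
from datetime import date

# Sketch of a Category 6 change log entry -- fields are illustrative.
@dataclass
class ChangeLogEntry:
    changed_on: date
    change_type: str        # e.g. "model_update", "rule_change", "scope_expansion"
    description: str
    compliance_impact: str  # assessed effect on the system's compliance posture
    requires_reassessment: bool

entry = ChangeLogEntry(
    changed_on=date(2025, 6, 1),
    change_type="model_update",
    description="pinned foundation model moved to a newer version",
    compliance_impact="accuracy metrics re-validated; no change in posture",
    requires_reassessment=False,
)
```

Making `compliance_impact` a required field means no change can be logged without someone answering the assessment question.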
Category 7: EU Declaration of Conformity
Reference to the EU Declaration of Conformity — a separate document in which the provider declares that the AI system meets all applicable EU AI Act requirements.
This is a formal legal document. It must reference the specific articles of the EU AI Act against which compliance is declared. For AI agents, the declaration typically covers Articles 9–17 and the applicable Annex III category.
Category 8: Post-Market Monitoring Plan
Required content:
- Post-market monitoring system specification
- Timeline and methodology for monitoring
- Data collection plan
- Corrective action process
This is the operational compliance evidence category. The monitoring plan must specify: what metrics are tracked, how often, who reviews them, and what triggers a corrective action. This connects directly to the Article 9 continuous risk management requirement.
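The metrics-thresholds-reviewers structure above can be sketched as a small table of monitored metrics plus a trigger check. The metric names, cadences, and thresholds below are assumptions for illustration, not values prescribed by the Act:

```python
# Illustrative monitoring plan entries for Category 8.
MONITORED_METRICS = {
    # metric: (review cadence, alert threshold, responsible reviewer)
    "human_override_rate": ("weekly", 0.15, "compliance officer"),
    "incorrect_output_rate": ("daily", 0.05, "ml engineer"),
}

def needs_corrective_action(metric: str, observed: float) -> bool:
    """Return True when an observed value breaches the metric's documented
    alert threshold, triggering the corrective-action process."""
    _, threshold, _ = MONITORED_METRICS[metric]
    return observed > threshold
```

The point is that every element of the plan (metric, cadence, threshold, owner) is written down and machine-checkable, rather than living in someone's head.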
Summary Checklist
| Annex IV Category | Key gap for AI agents | Status check |
|---|---|---|
| General description | Specific intended purpose, version tracking | Is your intended purpose specific and mapped to Annex III? |
| Design specs | Foundation model documentation | Is your deployment configuration documented? |
| Training/testing | Evaluation results, subgroup analysis | Do you have test results on file? |
| Monitoring and control | Known failure modes, human oversight | Are failure modes honestly documented? |
| Risk management | Residual risk acceptance | Are Article 9 decisions documented? |
| Change management | Model update impact assessments | Do updates trigger compliance review? |
| Declaration of conformity | Legal review required | Is the declaration drafted and signed? |
| Post-market monitoring | Active monitoring plan | Is your monitoring plan documented? |
Common Documentation Gaps
Gap 1: Intended purpose is too vague. “AI-assisted decision support” is not a sufficient intended purpose description for Annex IV. You need: what decisions, made by whom, about whom, in what process, with what decision authority. A credit decision agent should document: automated pre-screening of personal loan applications for individuals with verifiable income, within the EU, for amounts up to €X.
Gap 2: No change log. AI agent deployments change frequently — model updates, rule updates, scope expansions. Annex IV requires a documented change log. Many teams have no systematic mechanism for capturing this.
Gap 3: Known failure modes underdocumented. Category 4 requires documentation of “conditions under which the system may fail or produce incorrect results.” This requires honest documentation of known limitations. Underreporting failure modes to appear more capable is not Article 15-compliant.
Gap 4: Post-market monitoring plan is aspirational. The monitoring plan must describe an operational system, not an intention to monitor. Document: which metrics, which tools, what alert thresholds, who reviews, and what happens when an alert fires.
For the operational evidence that feeds Annex IV documentation, see EU AI Act Article 12: Logging Requirements Decoded and EU AI Act Article 9: Continuous Risk Management.
FAQ
Q: When must Annex IV documentation be assembled?
Before the AI system is placed on the market or put into service in the EU. For AI agents deployed to EU users or affecting EU natural persons, this means before your production deployment. The documentation must be kept current throughout the system’s operation.
Q: How long must Annex IV documentation be retained?
Ten years from the date the high-risk AI system is placed on the market or put into service, or until the end of the system’s operating lifetime, whichever is longer.
Q: Must all eight categories be documented for all high-risk AI systems?
Yes. Annex IV explicitly lists all eight categories as required. However, the depth and detail required for each category are “appropriate” to the system — a simpler system requires less detailed documentation than a complex one. The standard is that the documentation demonstrates compliance, not that it reaches a specified length.
Q: We use a third-party foundation model. Do we need to document how the model was trained?
For the foundation model itself: you document the model’s publicly stated design specifications and training approach (Anthropic’s documentation for Claude, for example). For your deployment: you document your configuration, governance rules, and evaluation results. You are responsible for your deployment’s compliance, not the model provider’s internal training documentation.
Q: Is there an official EU template for Annex IV documentation?
No. The EU AI Act specifies content requirements, not format. Document your AI system in whatever format best demonstrates compliance with the eight categories. Clear, structured technical documentation with an explicit mapping to the Annex IV requirements makes regulatory review easier than any particular template would.
By Nikola Kovtun, founder of Infracortex AI Studio. Cortex generates the operational compliance evidence — decision logs, risk management records, monitoring data — that feeds your Annex IV documentation. Book a 30-minute call to audit your Annex IV readiness.
See also: EU AI Act Article 9: Continuous Risk Management | EU AI Act Article 12: Logging Requirements Decoded | EU AI Act Article 15: Accuracy and Robustness