Web Project Studios

Field notes

Hiring AI under the EU AI Act: the deadline moved, the obligations did not

28 May 2026


A client asked me last month whether their candidate-scoring workflow needed a CE mark. They had built a small automation: job applications come in, an LLM scores each one against a rubric, a ranked list lands in a Slack channel. No human sees the raw applications before the list is generated. They had been running it for four months. They had no documentation, no bias audit, no logs beyond what their LLM provider retains by default.

The answer to their question was yes. When I told them, they had about ten weeks to sort it out. Then on 7 May, the EU agreed to push the Annex III deadline by sixteen months. The new compliance date for stand-alone high-risk AI systems is 2 December 2027. That changes the calendar. It does not change the answer.

Nothing in this post is legal advice. The EU AI Act (Regulation 2024/1689) is a live regulatory instrument and its interpretation will develop through guidance from the AI Office and national market surveillance authorities. If you are building or deploying high-risk AI systems, get qualified legal counsel involved early.

Annex III, point 4(a) of the EU AI Act covers AI systems intended to be used for the recruitment or selection of natural persons. The scope is wider than most builders assume when they first read it.

It is not just a CV parser or an interview-scoring tool. The regulation covers any system that filters, ranks, scores, or otherwise influences which candidates a human decision-maker sees or considers. That includes job ad targeting that determines which candidates are shown a vacancy. It includes automated scheduling that sequences or prioritises candidates. It includes any LLM-based rubric that produces a ranked output before a recruiter reviews applications.

The test is functional, not nominal. If your workflow produces a ranked or filtered view of candidates that a human then acts on, the system is doing selection. The label you put on it does not change that. Article 6(3) does allow providers to argue that an Annex III system is not high-risk if it poses no significant risk of harm to health, safety, or fundamental rights, but that derogation is not available to systems that profile natural persons, and for anything that shapes hiring decisions the argument is difficult to sustain.

Providers of high-risk AI systems face the full set of obligations under Chapter III, Section 2 of the Act. Deployers who use third-party systems inherit a separate but overlapping set of obligations under Article 26 and, where the deployer is a public body or a private entity providing public services, the Fundamental Rights Impact Assessment under Article 27. If you are building for clients, you are a provider. Your clients are deployers. Both sets of obligations matter, and you need to understand both to build something your clients can actually use lawfully.

The postponement happened because the compliance infrastructure the Act depends on was not ready. Technical standards were delayed. Guidance documents missed their statutory deadlines. The result was that providers had no finalised rules to build conformity architecture against.

That is not the same as the obligations being unclear. The text of the regulation has not changed. Articles 9 through 15 say exactly what they said when the Act was published in July 2024. What was missing was the supporting framework: harmonised standards, templates, the EU database infrastructure.

For most workflow operators I speak to, the bottleneck was never the standards. It was the fact that their systems have no risk management documentation, no controlled logging, and no structural human oversight mechanism. Those are build problems, not standards problems. The sixteen extra months help with standards alignment. They do not help with the fact that you have not started.

The pattern I see most often: the system works, the outputs are reasonable, and there is almost no paper trail. No documented risk management process. No data governance record covering training or fine-tuning data. No technical documentation that would survive a market surveillance audit. No meaningful human oversight mechanism beyond "someone looks at the Slack message."

The AML workflow post on this site covers a structurally similar problem in a different sector: compliance obligations that require process design upstream, not a paper layer applied after the fact. The lesson transfers directly. You cannot retrofit a risk management system onto a workflow that was never designed with one.

Article 9 requires a risk management system: a continuous, documented process that identifies risks to health, safety, and fundamental rights, estimates and evaluates those risks, and records the measures taken to address them. For hiring AI, the obvious risks are discriminatory outputs and the erosion of meaningful human review. Your risk management system needs to name these, quantify them where possible, and document what your system does to mitigate them.
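
One way to keep the Article 9 record concrete is a small risk register that lives in version control next to the workflow. The sketch below is illustrative only: the field names, the 1 to 5 scales, the likelihood-times-severity score, and the two example entries are all assumptions of mine, not anything the Act prescribes.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Optional


@dataclass
class Risk:
    """One entry in a hypothetical Article 9 risk register."""
    name: str
    affected_right: str                  # e.g. non-discrimination
    likelihood: int                      # assumed scale: 1 (rare) to 5 (frequent)
    severity: int                        # assumed scale: 1 (minor) to 5 (severe)
    mitigations: list = field(default_factory=list)
    last_reviewed: Optional[date] = None

    @property
    def score(self) -> int:
        # Likelihood x severity is an assumed convention, not a legal metric.
        return self.likelihood * self.severity


def unmitigated(register, threshold=10):
    """Return risks at or above the threshold with no recorded mitigation."""
    return [r for r in register if r.score >= threshold and not r.mitigations]


register = [
    Risk("Discriminatory ranking", "non-discrimination", 4, 5,
         ["Periodic adverse-impact test on scored output"], date(2026, 5, 28)),
    Risk("Erosion of human review", "meaningful human oversight", 3, 4),
]

for risk in unmitigated(register):
    print(f"OPEN: {risk.name} (score {risk.score})")
```

The point of the `unmitigated` check is that a named risk with no documented measure is visible as an open item rather than buried in a document nobody rereads.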

Article 10 covers data governance. If your system uses training data, fine-tuning data, or any data pipeline that shapes model behaviour, you need documented governance over that data. Relevance, representativeness, known biases, data collection practices, and any preprocessing steps all need to be recorded. If you are wrapping a third-party model (OpenAI, Anthropic, Mistral), you still need to document what data you feed into it and what your prompts do to shape outputs.

Article 11 requires technical documentation before market placement. Annex IV specifies what this covers: system description, design choices, training methodologies, performance metrics, known limitations, and the measures taken to achieve compliance. This is not a README. It is a structured document that a market surveillance authority could use to assess conformity.

Article 12 requires automatic logging sufficient to enable post-hoc auditing of system operation. For a hiring workflow, that means logging inputs, outputs, and the parameters active at the time of each decision. Default LLM provider logs are not sufficient. You need logs you control, retained for at least the six months that Article 19 sets as a floor, and longer where other Union or national law requires it.
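
As a rough illustration of what "logs you control" can look like, here is a minimal sketch that appends one record per scoring event to a JSON Lines file. The function name, the record fields, and the choice to store a hash of the application text are my assumptions; the Act does not specify a log schema, and whether you keep raw inputs alongside the hash is a data-protection decision this sketch does not make.

```python
import hashlib
import json
import tempfile
from datetime import datetime, timezone
from pathlib import Path


def log_decision(log_path, application_id, application_text,
                 prompt_version, model, params, score):
    """Append one scoring event to a JSON Lines file and return the record."""
    record = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "application_id": application_id,
        # The hash ties this record to the exact input that was scored.
        "input_sha256": hashlib.sha256(
            application_text.encode("utf-8")).hexdigest(),
        "prompt_version": prompt_version,   # hypothetical rubric/prompt label
        "model": model,
        "params": params,                   # parameters active for this call
        "score": score,
    }
    with Path(log_path).open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(record, sort_keys=True) + "\n")
    return record


# Demo against a throwaway directory; in practice this is storage you control.
with tempfile.TemporaryDirectory() as tmp:
    log_file = Path(tmp) / "decisions.jsonl"
    entry = log_decision(log_file, "app-001", "CV text goes here",
                         "rubric-v3", "example-model",
                         {"temperature": 0.0}, 7.5)
    lines = log_file.read_text(encoding="utf-8").splitlines()
    print(len(lines), entry["prompt_version"])
```

Append-only, one line per event, is a deliberate choice: it is easy to ship to longer-term storage and easy to search later.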

Article 13 covers transparency to deployers. Your clients need to understand what the system does, what it does not do, and what they are responsible for. This means instructions for use that go beyond a product walkthrough. They need to cover the system's intended purpose, performance characteristics, known limitations, and the human oversight measures the deployer must maintain.

Article 14 requires human oversight to be technically built in, not just assumed. The system must be designed so that a natural person can understand, monitor, and where necessary override its outputs. A Slack message with a ranked list does not satisfy this. The oversight mechanism needs to be a deliberate design choice, documented as such.
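
To make "technically built in" less abstract, here is a minimal sketch of a review gate: the ranked list cannot be released downstream until a named person has reviewed it, and any override is recorded with a reason. The class and method names are hypothetical, and a gate like this is one possible design, not a statement of what satisfies Article 14.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional


@dataclass
class RankedList:
    candidates: list                      # model-proposed order
    reviewer: Optional[str] = None
    reviewed_at: Optional[datetime] = None
    overrides: list = field(default_factory=list)

    def review(self, reviewer, final_order=None, reason=""):
        """Record a named human's sign-off, optionally overriding the order."""
        if final_order is not None:
            if sorted(final_order) != sorted(self.candidates):
                raise ValueError("override must contain the same candidates")
            self.overrides.append(f"{reviewer}: {reason or 'no reason given'}")
            self.candidates = list(final_order)
        self.reviewer = reviewer
        self.reviewed_at = datetime.now(timezone.utc)

    def release(self):
        """Refuse to hand the list downstream until someone has reviewed it."""
        if self.reviewer is None:
            raise PermissionError("ranked list has not been reviewed")
        return self.candidates


ranked = RankedList(["cand-17", "cand-04", "cand-22"])
ranked.review("recruiter@example.com",
              ["cand-04", "cand-17", "cand-22"],
              "rubric under-weights part-time experience")
print(ranked.release())
```

The difference from a Slack message is that the unreviewed path fails loudly, and the review itself leaves a record.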

Article 15 covers accuracy, robustness, and cybersecurity. Your system needs to perform consistently across the range of inputs it will encounter, handle errors and unexpected inputs without producing harmful outputs, and be protected against attempts to manipulate it.
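
Two cheap checks cover a surprising amount of this ground: does the same application get the same score twice, and does the wrapper fail closed on bad input or bad output? The sketch below assumes a scoring function that takes text and returns a number on a 0 to 10 scale; `stub_scorer` is a deterministic stand-in for the real LLM call, and every name here is hypothetical.

```python
def score_spread(score_fn, application, runs=5):
    """Score the same application several times; return max minus min."""
    scores = [score_fn(application) for _ in range(runs)]
    return max(scores) - min(scores)


def safe_score(score_fn, application, low=0.0, high=10.0):
    """Return a score, or None if the input is empty or the output is unusable."""
    if not application.strip():
        return None
    try:
        value = float(score_fn(application))
    except (TypeError, ValueError):
        return None                     # model returned something non-numeric
    return value if low <= value <= high else None


def stub_scorer(text):
    # Stand-in for the real model call: half a point per word, capped at 10.
    return min(10.0, len(text.split()) / 2)


print(score_spread(stub_scorer, "five years of payroll experience"))
print(safe_score(stub_scorer, "   "))
print(safe_score(lambda text: "n/a", "some application"))
```

With a real model the spread will rarely be zero, so the useful output is a documented tolerance and a record of what happens when a run falls outside it.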

For deployers using your system, Article 27 requires a Fundamental Rights Impact Assessment before first use where the deployer is a public body or a private entity providing public services. Your technical documentation and instructions for use are what those clients will rely on to complete that assessment. If your documentation does not give them what they need, their FRIA will be incomplete, and they will come back to you.

High-risk AI systems under Annex III generally follow the internal control route under Article 43(2) and Annex VI, which means you assess your own conformity against the requirements, compile the technical documentation, register the system in the EU database established under Article 71, and affix CE marking. There is no mandatory third-party notified body involvement for most Annex III, point 4 systems.

That sounds simpler than it is. Self-assessment does not mean light-touch. It means you are making a documented legal claim that your system meets the requirements, and you are liable if it does not.

The registration obligation under Article 49 is a separate step that many builders have not factored into their timelines.

The deadline is December 2027. That is roughly eighteen months from this post. Eighteen months is generous for a team that starts now. It is tight for a team that waits until summer 2027.

The sequence matters. Risk management drives everything else. You cannot write accurate technical documentation until you have completed risk identification. You cannot design meaningful human oversight until you know what risks the oversight is meant to catch.

hiring_ai_compliance_build_order:
  phase_1_scope_and_risk:
    timeline: "months 1-3"
    tasks:
      - task: "Scope determination"
        detail: "Confirm which workflows meet the Annex III point 4(a) functional test. Document the decision for each."
      - task: "Risk management system initiation (Article 9)"
        detail: "Identify risks to fundamental rights and safety. Prioritise discriminatory output and oversight erosion."
      - task: "Data governance audit (Article 10)"
        detail: "Document all data inputs, training data provenance, prompt structures, and known biases."
 
  phase_2_infrastructure:
    timeline: "months 4-6"
    tasks:
      - task: "Logging architecture (Article 12)"
        detail: "Implement controlled logs: inputs, outputs, active parameters, timestamps. Confirm retention period."
      - task: "Human oversight mechanism design (Article 14)"
        detail: "Redesign any workflow where ranked output reaches a decision-maker without a documented review gate."
      - task: "Accuracy and robustness testing (Article 15)"
        detail: "Test across edge-case inputs. Document failure modes and mitigations."
 
  phase_3_documentation:
    timeline: "months 7-9"
    tasks:
      - task: "Technical documentation (Article 11 + Annex IV)"
        detail: "Draft full conformity documentation. System description, design rationale, performance metrics, limitations."
      - task: "Instructions for use (Article 13)"
        detail: "Write deployer-facing documentation covering intended purpose, limitations, and oversight requirements."
 
  phase_4_conformity:
    timeline: "months 10-12"
    tasks:
      - task: "Internal conformity assessment (Article 43)"
        detail: "Assess conformity against Articles 9-15. Compile declaration of conformity."
      - task: "EU database registration (Article 49)"
        detail: "Register system in the EU database under Article 71."
      - task: "CE marking"
        detail: "Affix CE marking to system and documentation."
 
  phase_5_monitoring_and_buffer:
    timeline: "months 13-18"
    tasks:
      - task: "Post-market monitoring (Article 72)"
        detail: "Establish process for collecting and reviewing performance data from deployers."
      - task: "Standards alignment"
        detail: "Align documentation and processes with harmonised standards as they are published."
      - task: "Incident reporting"
        detail: "Define what constitutes a serious incident and the reporting pathway to national authority."
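
To see how the month ranges in the build order land on a calendar, here is a small sketch that converts them to date windows and checks them against the 2 December 2027 date. The 1 June 2026 kick-off is purely my assumption for illustration, and the helper names are hypothetical.

```python
from datetime import date

START = date(2026, 6, 1)        # assumed kick-off, shortly after this post
DEADLINE = date(2027, 12, 2)    # compliance date cited in this post

PHASES = {                      # month ranges copied from the build order
    "phase_1_scope_and_risk": (1, 3),
    "phase_2_infrastructure": (4, 6),
    "phase_3_documentation": (7, 9),
    "phase_4_conformity": (10, 12),
    "phase_5_monitoring_and_buffer": (13, 18),
}


def add_months(d, months):
    """Return the first day of the month `months` after d's month."""
    index = d.year * 12 + (d.month - 1) + months
    return date(index // 12, index % 12 + 1, 1)


def phase_window(start, first_month, last_month):
    """Return (first day of the phase, first day after the phase ends)."""
    return add_months(start, first_month - 1), add_months(start, last_month)


for name, (first, last) in PHASES.items():
    begin, end = phase_window(START, first, last)
    status = "ok" if end <= DEADLINE else "LATE"
    print(f"{name}: {begin} to {end} (exclusive) {status}")
```

Under that assumed start, month 18 closes on 30 November 2027, so the plan fits with essentially no slack. Every month of delay at the start comes straight out of phase 5.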

The items in phase 4 are not the hard part. The hard part is phases 1 and 2, because that is where you discover whether your system can actually meet the requirements or whether it needs to be redesigned.

Some systems will need redesign. A scoring workflow that produces a ranked output with no intermediate human checkpoint is not going to satisfy Article 14 by adding a disclaimer to the Slack message. The oversight mechanism needs to be structural.

Pull up every hiring-adjacent workflow you have shipped or are building. Apply the functional test: does this system filter, rank, score, or sequence candidates in a way that shapes which applications a human sees? If yes, you are in scope.
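
If you have more than a handful of workflows, the functional test above can be run as a quick triage over an inventory. This is a sketch of that triage and nothing more: the `Workflow` fields, the action labels, and the example entries are hypothetical, and a `True` here means "look closer with counsel", not a legal determination.

```python
from dataclasses import dataclass

# Verbs taken from the functional test in the paragraph above.
SELECTION_ACTIONS = {"filter", "rank", "score", "sequence"}


@dataclass
class Workflow:
    name: str
    actions: set                  # what the automation does to candidates
    shapes_what_human_sees: bool  # does output decide which applications surface?


def in_scope(workflow):
    """Flag workflows that perform a selection action a human then acts on."""
    does_selection = bool(workflow.actions & SELECTION_ACTIONS)
    return does_selection and workflow.shapes_what_human_sees


inventory = [
    Workflow("slack-candidate-ranker", {"score", "rank"}, True),
    Workflow("interview-slot-prioritiser", {"sequence"}, True),
    Workflow("offer-letter-templating", {"generate"}, False),
]

for wf in inventory:
    print(f"{wf.name}: {'review for Annex III scope' if in_scope(wf) else 'no flag'}")
```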

Then look at your logging. If you cannot reconstruct what inputs produced a given output on a given date, the system as built does not meet Article 12, and you have no foundation for the technical documentation Article 11 requires.
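
A practical way to run that check is to try the lookup. The sketch below assumes a JSON Lines log in which each record carries an ISO 8601 `timestamp` and an `application_id`; both field names and the sample records are hypothetical. If your logs cannot answer this query, that is the finding.

```python
import json
import tempfile
from datetime import date, datetime
from pathlib import Path


def find_decisions(log_path, application_id, on):
    """Return logged records for one application on one calendar date."""
    matches = []
    with Path(log_path).open(encoding="utf-8") as fh:
        for line in fh:
            if not line.strip():
                continue                      # tolerate blank lines
            record = json.loads(line)
            logged = datetime.fromisoformat(record["timestamp"]).date()
            if record["application_id"] == application_id and logged == on:
                matches.append(record)
    return matches


# Demo with three made-up records in a throwaway file.
with tempfile.TemporaryDirectory() as tmp:
    log_file = Path(tmp) / "decisions.jsonl"
    sample = [
        {"timestamp": "2026-05-27T09:15:00+00:00", "application_id": "app-001", "score": 7.5},
        {"timestamp": "2026-05-28T10:00:00+00:00", "application_id": "app-001", "score": 6.0},
        {"timestamp": "2026-05-28T10:05:00+00:00", "application_id": "app-002", "score": 8.0},
    ]
    log_file.write_text("".join(json.dumps(r) + "\n" for r in sample),
                        encoding="utf-8")
    found = find_decisions(log_file, "app-001", date(2026, 5, 28))
    print(found)
```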

Start there. Risk management and logging are the load-bearing elements. Everything else depends on them.

The AI pilots post on this site makes the point that most AI projects fail not because the technology stops working but because the process underneath was never designed to sustain them. The EU AI Act is, among other things, a forcing function for process design. The postponement gives you more time to do that design well. It does not give you permission to skip it.

If you are working through this and want a structured review of where your hiring AI workflows stand against the obligations, the AI Workflow Audit is the place to start. The deadline is further away than it was last week. The work is the same size it always was.