The EU AI Act is live. Here's what fintech CTOs actually need to do.

By Alex Hessami, founder of SealVera · March 2026

Most of the coverage focuses on what the EU AI Act prohibits. But if you're a fintech building credit scoring, fraud detection, or underwriting systems, the real question is simpler: what records do you need to keep, and in what format?

Everyone's focused on the wrong thing

I built SealVera after watching fintech teams ship AI into production with compliance programs that hadn't caught up. The EU AI Act wasn't the trigger — it was the moment the gap became undeniable.

The dominant narrative around the EU AI Act is about prohibition: facial recognition in public spaces, social scoring, subliminal manipulation. Genuinely bad stuff. And genuinely not what you're building.

So a lot of fintech teams have quietly filed the EU AI Act under "probably not my problem" and moved on.

That's a mistake. The Act's real impact on fintech isn't about what you can't do. It's about the operational obligations that come with what you are doing — every day, at scale, often without any auditability whatsoever.

The fines run up to €35 million or 7% of global annual turnover for the most serious violations. Enforcement starts now. And the obligations for high-risk AI systems went live when the Act did. You don't get a grace period because you haven't read the regulation yet.

Is your AI system "high-risk"? Almost certainly yes.

The EU AI Act defines high-risk AI systems in Annex III. For fintech, three categories are immediately relevant:

  - Credit scoring and creditworthiness assessment of natural persons
  - Fraud detection where a flag can block access to accounts or transactions
  - Underwriting and risk-based pricing of credit and insurance products

If your product touches any of those — and if you're reading this, it probably does — you are operating a high-risk AI system under EU law. The classification isn't based on model sophistication, data volume, or revenue. It's based on what the system decides and who it affects.

Quick self-test: does your AI output influence whether a person can access financial services, get a loan, or have a transaction approved? If yes, you're almost certainly in Annex III territory. Getting a lawyer to confirm that will likely cost more than simply acting on it.

One important nuance: the Act applies based on where the person being assessed is located, not where your company is incorporated. If you serve EU residents — even from a US or UK-headquartered company — the obligations apply.

Three obligations that actually matter

The Act imposes a lot of requirements on high-risk AI systems. Some of them (conformity assessments, technical documentation, CE marking) are slower-burn concerns. But three obligations have immediate engineering implications that you need to understand now.

1. Transparency

Users who are subject to AI-driven decisions have the right to meaningful explanations. "The algorithm decided" is not sufficient. You need to be able to articulate — in human-readable terms — what factors influenced a credit decision, a fraud flag, or an underwriting outcome.

This doesn't mean you have to expose your model weights. It means you need explanation infrastructure: the ability to generate, on demand, a clear account of why a specific decision was made for a specific person at a specific point in time. Post-hoc rationalization from a current model state won't cut it if that model has been updated since the decision was made.
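
As a sketch of what "explanation infrastructure" can mean in practice, here is a minimal Python example that assembles a human-readable explanation payload at decision time, alongside the decision itself. All names here (`build_explanation`, the feature names, the model version string) are hypothetical, and the attribution weights are assumed to come from whatever method you already use, such as SHAP values or a scorecard's native contributions:

```python
import datetime
import json

def build_explanation(decision_id, model_version, features, contributions, outcome):
    """Assemble a human-readable explanation payload at decision time.

    `contributions` maps feature name -> signed influence on the score,
    produced by whatever attribution method the model supports.
    """
    # Rank factors by absolute influence; keep the top three for the summary.
    top_factors = sorted(contributions.items(), key=lambda kv: abs(kv[1]), reverse=True)[:3]
    return {
        "decision_id": decision_id,
        "model_version": model_version,
        "decided_at": datetime.datetime.now(datetime.timezone.utc).isoformat(),
        "outcome": outcome,
        "inputs": features,
        "top_factors": [
            {"feature": name, "influence": round(weight, 4)} for name, weight in top_factors
        ],
    }

# Hypothetical credit decision captured the moment it is made.
payload = build_explanation(
    decision_id="dec_001",
    model_version="credit-scorer:2026-03-01",
    features={"debt_to_income": 0.52, "late_payments_12m": 3, "tenure_months": 8},
    contributions={"debt_to_income": -0.31, "late_payments_12m": -0.22, "tenure_months": -0.05},
    outcome="rejected",
)
print(json.dumps(payload, indent=2))
```

The key design choice is that the payload is built and persisted when the decision happens, so later model updates can't change the story.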

2. Record-keeping

This is the one most teams are completely unprepared for. Every AI decision needs to be logged with enough detail to reconstruct what happened — and those logs need to be retained for years. More on the specifics below, because this is where the real engineering work is.

3. Human oversight

High-risk AI systems must be designed so that humans can intervene, override, and correct outputs. This isn't just a policy requirement — it needs to be built into your systems. Concretely: you need override workflows, escalation paths, and audit trails showing that human review happened where it was required.

If your fraud detection system automatically freezes accounts with no human in the loop, you have a compliance gap. If your credit scoring system issues rejections without any override mechanism available to the applicant, you have a compliance gap.

What Article 12 actually requires

Article 12 of the EU AI Act is titled "Record-keeping." It's a few short paragraphs that read like bureaucratic boilerplate, but what it requires in practice is substantial.

For high-risk AI systems, you need to automatically log, for every consequential decision:

  - When the decision happened (timestamps for the period of use)
  - The full input state the system received at decision time
  - Which model version produced the output
  - The raw output, including any scores and thresholds applied
  - Whether and how a human reviewed or overrode the result

And critically: for credit decisions, the retention period is 10 years. That's not a typo. A loan application you process today needs a complete, retrievable decision record until 2036.

The logs need to be tamper-evident. If a regulator asks to see the record of a decision made three years ago, and you can't prove the record hasn't been modified since it was created, the record has limited evidentiary value. This is why a database row with an updated_at timestamp isn't compliant — it's trivially editable.
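
A common way to make logs tamper-evident without special infrastructure is a hash chain: each entry commits to the hash of the previous entry, so editing any historical record breaks every subsequent hash and the tampering is detectable by re-walking the chain. A minimal sketch (class and field names are illustrative, not a production design):

```python
import hashlib
import json

class HashChainedLog:
    """Append-only log where each entry commits to the previous entry's hash."""

    def __init__(self):
        self.entries = []
        self._prev_hash = "0" * 64  # genesis value for the first entry

    def append(self, record: dict) -> str:
        # Canonical serialization so the same record always hashes the same way.
        body = json.dumps(record, sort_keys=True)
        entry_hash = hashlib.sha256((self._prev_hash + body).encode()).hexdigest()
        self.entries.append({"record": record, "prev": self._prev_hash, "hash": entry_hash})
        self._prev_hash = entry_hash
        return entry_hash

    def verify(self) -> bool:
        # Re-walk the chain; any edited record invalidates every later hash.
        prev = "0" * 64
        for e in self.entries:
            body = json.dumps(e["record"], sort_keys=True)
            expected = hashlib.sha256((prev + body).encode()).hexdigest()
            if e["prev"] != prev or e["hash"] != expected:
                return False
            prev = e["hash"]
        return True

log = HashChainedLog()
log.append({"decision_id": "dec_001", "outcome": "rejected"})
log.append({"decision_id": "dec_002", "outcome": "approved"})
assert log.verify()
log.entries[0]["record"]["outcome"] = "approved"  # simulate tampering
assert not log.verify()
```

In production you'd also anchor the chain head externally (e.g. periodically publishing it to separate storage), since an attacker who can rewrite the whole chain can recompute every hash.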

The intent of Article 12 is reproducibility: if someone challenges a credit decision made in 2024, you should be able to reconstruct exactly what information the model had, which version of the model it was, and what the model produced. Not an approximation. The actual thing.

This has deep implications for how you version models, how you store inputs, and whether your current logging architecture is even capable of satisfying an audit request.

The embarrassing truth about most fintech AI stacks

Here's the honest situation. Most fintech engineering teams building on top of LLM APIs or ML models have logging that looks something like this:

  - Request and response logs in an observability tool, retained for 30 to 90 days
  - A row per decision in the operational database, with a final score and an updated_at column
  - Maybe some sampled prompts and completions captured for debugging

None of that is Article 12 compliance. It doesn't capture the full input state at decision time. It doesn't capture the model version. It probably doesn't capture the reasoning in a retrievable, human-readable format. And it's almost certainly not tamper-evident.

More critically: if you've deployed AI agents — systems that chain multiple model calls, retrieve context, and make multi-step decisions — the audit trail problem is even harder. Which call in the chain was the "decision"? What context was retrieved and when? If the agent used a tool or called an external API as part of the decision, is that captured?
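
One way to get ahead of the agent problem is to record every step, retrievals, tool calls, and model calls alike, under a single decision ID, so the full chain can be replayed later. A minimal sketch with hypothetical names (`DecisionTrace`, the step kinds, the tool names):

```python
import datetime
import uuid

class DecisionTrace:
    """Groups every step an agent takes under one decision ID (illustrative)."""

    def __init__(self, decision_id=None):
        self.decision_id = decision_id or str(uuid.uuid4())
        self.steps = []

    def record(self, kind, name, inputs, output):
        # Each step gets a sequence number and timestamp so ordering is explicit.
        self.steps.append({
            "seq": len(self.steps),
            "at": datetime.datetime.now(datetime.timezone.utc).isoformat(),
            "kind": kind,       # "retrieval" | "tool_call" | "model_call"
            "name": name,
            "inputs": inputs,
            "output": output,
        })

# A hypothetical fraud-decision chain: context retrieval, an external check,
# then the model call that produced the final score.
trace = DecisionTrace("dec_001")
trace.record("retrieval", "customer_history", {"customer_id": "c_9"}, {"txns": 14})
trace.record("tool_call", "sanctions_check", {"name": "J. Doe"}, {"hit": False})
trace.record("model_call", "fraud-scorer:v3.2", {"txns": 14, "hit": False}, {"score": 0.91})
```

With the whole chain captured, "which call was the decision" becomes a question you can answer from the trace rather than from memory.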

Most teams haven't thought through any of this. It's not because they're negligent — it's because "make the model work" was the priority, and "prove the model worked correctly three years from now" was not.

That's a reasonable engineering tradeoff to have made in 2022. It's not a reasonable one to still be making in 2026.

What you actually need to do this quarter

Let's be practical. You have a team, a roadmap, and limited sprint capacity. Here's how to sequence this without it becoming a six-month compliance project:

  1. Audit what you're actually deciding with AI. Make a list of every automated decision your system makes that affects a user's access to financial services. Be specific: not "we use a credit model" but "the model produces a score, a threshold applies, and if below X the application is rejected with code Y." You need to know the exact decision boundary before you can log around it.
  2. Implement decision-level logging now. Even if your format isn't perfect yet, start capturing: decision ID, timestamp, model version identifier, raw inputs (or a hash of them), raw output, and explanation payload. Store this separately from your operational database with write-once semantics if possible. A log you can't edit is more useful than a record you can.
  3. Version your models explicitly. If you can't answer "which model version produced this decision" for a decision made 30 days ago, you have a problem. Tag every model deployment with a version, and store that version alongside every logged decision. This sounds obvious; most teams aren't doing it systematically.
  4. Map your human oversight gaps. For each automated decision type: what's the path for a user to request human review? What's the escalation path internally? Document it. Build the override workflow if it doesn't exist. This protects you both with regulators and in day-to-day operations.
  5. Talk to your legal team about retention schedules. 10 years for credit decisions is the headline number, but different decision types may have different retention requirements. Get clarity on what you're obligated to keep, and for how long, before you design your storage architecture.
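
To make step 2 concrete, here is a minimal decision record along those lines. The names and field choices are illustrative, not a prescribed schema; hashing the raw inputs lets the record commit to them even when the full inputs live in separate storage:

```python
import datetime
import hashlib
import json
import uuid

def make_decision_record(model_version, inputs, output, explanation):
    """Build one decision-level log entry: ID, timestamp, model version,
    a hash of the raw inputs, the raw output, and the explanation payload."""
    # Canonical serialization so the same inputs always produce the same hash.
    input_bytes = json.dumps(inputs, sort_keys=True).encode()
    return {
        "decision_id": str(uuid.uuid4()),
        "logged_at": datetime.datetime.now(datetime.timezone.utc).isoformat(),
        "model_version": model_version,
        "input_sha256": hashlib.sha256(input_bytes).hexdigest(),
        "output": output,
        "explanation": explanation,
    }

# Hypothetical credit decision captured at the moment of rejection.
record = make_decision_record(
    model_version="credit-scorer:2026-03-01",
    inputs={"debt_to_income": 0.52, "late_payments_12m": 3},
    output={"score": 412, "threshold": 550, "outcome": "rejected"},
    explanation="Score below threshold; dominant factor: debt-to-income ratio.",
)
```

Even an imperfect record like this, written once and never updated, beats a perfect schema you'll design next year.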

One thing you can skip for now: full conformity assessments, CE marking processes, and registration in the EU database. These matter, but they're longer-lead-time concerns. The record-keeping and oversight obligations are live today and are the ones most likely to bite you first in an audit or a user complaint.

The practical close

The EU AI Act isn't a boogeyman. It's a set of operational requirements that are, honestly, reasonable. If you're making automated decisions that affect people's financial lives, you should be able to explain those decisions, prove you made them fairly, and show that a human can intervene when needed. That's not compliance theater — that's good engineering.

The challenge is that the infrastructure to do this properly is non-trivial to build from scratch. Tamper-evident logs, model versioning that's linked to decision records, explanation generation that works against historical model states — none of that comes out of the box with the tools most teams are using.

Some teams will build it themselves. If you have the capacity and want full control, that's a reasonable call. The schema design alone is worth getting right before you start accumulating years of records.

If you'd rather not reinvent this particular wheel, SealVera is purpose-built for exactly this problem: tamper-evident decision records, model version tracking, and explanation storage for high-risk AI systems in fintech. It wires into your existing stack as a logging layer, so you don't have to redesign your architecture to get compliant.

Either way, the time to start is now — not when a regulator asks why you can't produce records for a credit decision from last year.

SealVera is one way to do this.

It's early, and we're still building. But if you want to see what decision-level logging looks like in practice, try it free. I'm also genuinely happy to talk through your setup if you're figuring this out.

Try it free