Fixed-scope audit · 2 weeks · $7,500

Accepting 1 new Fractional · June 2026
30-40 hrs/wk contract · Q2-Q3 2026

AI agent infrastructure readiness assessment

Is your agentic system ready for production in a regulated domain, or for EU AI Act Article 14 enforcement on Aug 2, 2026? 2 weeks. Written findings. Remediation roadmap.

Most AI agent teams focus on capability: does the agent complete the task? The infrastructure questions come later. Can you stop it when it does something unexpected? Can you prove what it did? Is there a policy layer between intent and action? When those questions surface during a regulatory review or a production incident, they arrive fast. This assessment answers them before they become problems.

EU AI Act Article 14 requires human oversight controls for high-risk AI systems (automated decision-making in credit, hiring, critical infrastructure, and similar regulated domains). Enforcement date: August 2, 2026. If you don't know your gap, you can't close it in time. For teams that need the deeper Article 14 implementation audit after this assessment, see the Article 14 Gap Audit.

$1,500 deposit to reserve your slot. Remainder due on report delivery. The deposit is non-refundable after Day 1 kickoff. If the report is not delivered by Day 14, you owe nothing on the balance.

42µs signing latency · 15 years in production · 40+ Rust crates shipped · Ex-Gemini · Ex-Akuna

Why this reviewer

Founder of ZeroCopy Systems. Built a hardware-attested signing and policy engine for autonomous systems in AWS Nitro Enclaves. The policy engine governs what actions a system can authorize: every signing request passes through a policy layer before reaching the enclave. This is the same control pattern Article 14 asks for.

Founding crypto DevOps engineer at Akuna Capital (Jun 2021–Sep 2022). Built autonomous execution infrastructure across 12+ exchanges: order routing, failover, circuit breakers. Operated systems where the cost of an unauthorized action was immediate and financial.

Lead Linux Infrastructure Engineer at FlexTrade (6 years, 2012–2018). Built and operated OMS/EMS infrastructure under MiFID II compliance requirements, a regulation with human oversight requirements that predate the EU AI Act by a decade.

What you get.

Human oversight controls review

I trace your kill switch implementation end-to-end: does it actually stop the system under realistic conditions, not just in unit tests? I review your intervention points (where a human can interrupt agent execution), your escalation paths (what happens when the agent hits a state it's not confident about), and whether these controls are tested and observable.
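To illustrate the kind of control this review looks for, here is a minimal Python sketch, not a reference implementation. It assumes an agent loop that runs discrete steps, a stop flag a human operator can trip, and a confidence threshold below which a step is escalated instead of executed. `KillSwitch`, `run_agent`, the step tuples, and the `0.7` floor are all hypothetical names and values chosen for the example.

```python
import threading


class KillSwitch:
    """Hypothetical stop flag that a human operator can trip at any time."""

    def __init__(self) -> None:
        self._stopped = threading.Event()

    def trip(self) -> None:
        self._stopped.set()

    def is_tripped(self) -> bool:
        return self._stopped.is_set()


def run_agent(steps, kill_switch, confidence_floor=0.7):
    """Run steps in order, checking the kill switch before every step.

    Each step is a (name, confidence, action) tuple. Steps below the
    (assumed) confidence floor are escalated to a human, not executed.
    Every decision is recorded so the control is observable.
    """
    events = []
    for name, confidence, action in steps:
        if kill_switch.is_tripped():
            events.append(("halted", name))
            break
        if confidence < confidence_floor:
            events.append(("escalated", name))
            continue
        action()
        events.append(("executed", name))
    return events


switch = KillSwitch()
steps = [
    ("fetch_record", 0.95, lambda: None),
    ("send_email", 0.40, lambda: None),    # below the floor: escalated
    ("operator_stop", 0.99, switch.trip),  # stands in for a human tripping the switch mid-run
    ("write_db", 0.99, lambda: None),      # must never execute
]
events = run_agent(steps, switch)
```

In this sketch the switch is checked only between steps, so a long-running step would finish before the halt takes effect. Whether a real system can interrupt an in-flight action is exactly what testing under realistic conditions is meant to reveal.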

Agent action authorization review

What determines whether an agent action is permitted? I review the policy layer between intent and execution: whether there's a policy engine, whether it's machine-verifiable (not just a prompt instruction), whether it covers side-effecting actions (API calls, database writes, external notifications), and whether a compromised or confused LLM response can bypass it.
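A machine-verifiable policy layer can be very small. The sketch below is one possible shape, assuming a deny-by-default table keyed by tool name. The tool names, the `POLICY` table, and the `authorize` function are hypothetical. The point it illustrates is that the decision is made in code outside the model, so a confused or compromised LLM response cannot talk its way past it.

```python
# Hypothetical policy table. Any tool not listed is denied by default.
POLICY = {
    "search_docs": "allow",                   # read-only
    "send_notification": "require_approval",  # side-effecting
    "update_record": "require_approval",      # side-effecting
}


def authorize(tool: str, human_approved: bool = False) -> tuple[bool, str]:
    """Decide whether a requested tool call may execute.

    `human_approved` is assumed to be set by the orchestrator after a real
    approval step, never parsed from model output.
    """
    rule = POLICY.get(tool, "deny")
    if rule == "allow":
        return True, "allowed by policy"
    if rule == "require_approval":
        if human_approved:
            return True, "allowed with human approval"
        return False, "side-effecting action requires human approval"
    return False, "tool not covered by policy"


decisions = {
    "read": authorize("search_docs"),
    "write_unapproved": authorize("update_record"),
    "write_approved": authorize("update_record", human_approved=True),
    "unknown": authorize("drop_database"),
}
```

Returning the reason alongside the verdict means the same value can be written to the audit log as the policy check result.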

Attestation and audit-trail review

Can you prove, after the fact, what the agent did and why? I review your audit log pipeline: what's logged at every agent action, whether the log is tamper-evident, whether you record the model version and context state alongside the action, and whether the logs would satisfy an Article 14 regulator asking for evidence of oversight. This is the most common gap I expect to find.

Deterministic safety-net review

What happens when the LLM produces a response outside the expected envelope? I review your deterministic guardrails: output validation, action-space constraints, rate limits on agent actions, and fallback paths when the model output is structurally invalid or semantically out-of-bounds. Prompt instructions alone are not a safety net.
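As a rough picture of what a deterministic guardrail can look like, the Python sketch below validates raw model output before anything executes. It assumes the model is asked to reply with a JSON object naming an action. The action names, the `max_amount` bound, the per-run limit, and the `validate` function are hypothetical and exist only to show the pattern: structural check, action-space check, bounds check, rate limit, and a fixed fallback.

```python
import json

# Hypothetical action space with per-action constraints.
ALLOWED_ACTIONS = {"refund": {"max_amount": 100.0}, "reply": {}}
MAX_ACTIONS_PER_RUN = 2
FALLBACK = {"action": "escalate_to_human"}


def validate(raw_output: str, actions_so_far: int) -> dict:
    """Return the parsed action if it passes every check, else a fallback."""
    if actions_so_far >= MAX_ACTIONS_PER_RUN:
        return {**FALLBACK, "reason": "rate limit reached"}
    try:
        parsed = json.loads(raw_output)
    except json.JSONDecodeError:
        return {**FALLBACK, "reason": "output is not valid JSON"}
    if not isinstance(parsed, dict):
        return {**FALLBACK, "reason": "output is not a JSON object"}
    action = parsed.get("action")
    if not isinstance(action, str) or action not in ALLOWED_ACTIONS:
        return {**FALLBACK, "reason": "action outside the allowed set"}
    limits = ALLOWED_ACTIONS[action]
    if "max_amount" in limits:
        amount = parsed.get("amount")
        is_number = isinstance(amount, (int, float)) and not isinstance(amount, bool)
        if not is_number or not 0 <= amount <= limits["max_amount"]:
            return {**FALLBACK, "reason": "amount out of bounds"}
    return parsed


ok = validate('{"action": "refund", "amount": 50}', actions_so_far=0)
garbled = validate("Sure! I will refund the customer.", actions_so_far=0)
too_large = validate('{"action": "refund", "amount": 5000}', actions_so_far=0)
unknown = validate('{"action": "delete_all"}', actions_so_far=0)
throttled = validate('{"action": "reply"}', actions_so_far=2)
```

Every failure path ends in the same escalation action, so the behavior on bad output is fixed in code and does not depend on what the model says next.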

Article 14 gap analysis

If your system qualifies as high-risk under EU AI Act Annex III, I map each Article 14 sub-clause to your current technical controls and give you a pass/fail verdict. The output is a prioritized list of what to fix before Aug 2, 2026. For teams that need the full 10-day implementation-focused Article 14 audit after this assessment, see the Article 14 Gap Audit.

Prioritized findings and remediation roadmap

Written report as PDF and markdown. Each finding gets a severity level, an impact description, and a specific remediation step. Followed by a 60-minute walkthrough call with your engineering team.

What the report looks like.

Sample table of contents. Actual scope adjusted to your system.

AI Agent Infrastructure Readiness Assessment | [Client] | [Date]

1. Executive summary

1.1 Production-readiness verdict (per domain)

1.2 Critical / high / medium / low findings

2. System scope and Article 14 applicability

2.1 Does this system qualify as high-risk under Annex III?

3. Human oversight controls

3.1 Kill switch implementation and test coverage

3.2 Intervention points

3.3 Escalation path review

4. Agent action authorization

4.1 Policy layer review

4.2 Side-effecting action coverage

5. Attestation and audit trail

6. Deterministic safety nets

7. Article 14 gap analysis (if applicable)

7.1 Sub-clause mapping

7.2 Aug 2 prioritization

8. Remediation roadmap (severity-ordered)

Example finding (anonymized)

Finding #1 - SEVERITY: CRITICAL

Audit trail · Article 14 §14(1)(d)

Description: The audit log pipeline records agent action outcomes but not agent action intents. When the agent selects a tool call, the tool invocation and response are logged, but the model's reasoning context (prompt, intermediate reasoning steps, model version) is not. A regulator asking for evidence of "appropriate human oversight" under Article 14 would not be able to reconstruct why the agent took a specific action from the audit log alone.

Remediation: Log the full context at action boundary: model version, prompt hash, reasoning summary (or full trace for high-risk actions), and the policy check result that authorized the action. Store logs in an append-only pipeline with a hash chain so they're tamper-evident. This is the minimum audit trail that satisfies Article 14 §14(1)(d).
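The remediation above could be sketched as follows in Python. This is an illustration under stated assumptions, not the client's implementation: the field names, `append_entry`, `verify_chain`, and the all-zero genesis hash are hypothetical, and an in-memory list stands in for what would be append-only storage in practice. Timestamps are left out to keep the example deterministic.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed starting value for the chain


def _digest(body: dict) -> str:
    """SHA-256 over a canonical JSON encoding of the entry body."""
    canonical = json.dumps(body, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def append_entry(log, *, model_version, prompt, reasoning_summary,
                 policy_result, action):
    """Append one action-boundary record, chained to the previous entry."""
    body = {
        "model_version": model_version,
        "prompt_hash": hashlib.sha256(prompt.encode("utf-8")).hexdigest(),
        "reasoning_summary": reasoning_summary,
        "policy_result": policy_result,
        "action": action,
        "prev_hash": log[-1]["entry_hash"] if log else GENESIS,
    }
    entry = {**body, "entry_hash": _digest(body)}
    log.append(entry)
    return entry


def verify_chain(log) -> bool:
    """Recompute every hash; any edited or reordered entry breaks the chain."""
    prev_hash = GENESIS
    for entry in log:
        body = {k: v for k, v in entry.items() if k != "entry_hash"}
        if entry["prev_hash"] != prev_hash or entry["entry_hash"] != _digest(body):
            return False
        prev_hash = entry["entry_hash"]
    return True


log = []
append_entry(log, model_version="example-model-v1", prompt="Summarize ticket 1",
             reasoning_summary="ticket is a refund request",
             policy_result="allowed by policy", action="read_ticket")
append_entry(log, model_version="example-model-v1", prompt="Issue refund",
             reasoning_summary="amount within limit",
             policy_result="allowed with human approval", action="refund")
intact = verify_chain(log)
```

A hash chain only makes tampering detectable. Someone who can rewrite the whole log can also recompute the chain, so the latest hash would need to be anchored somewhere the writer cannot modify.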

What this is not.

Not a model evaluation. I review the infrastructure surrounding the AI system: oversight controls, authorization, attestation, safety nets. I do not evaluate model quality, benchmark performance, or prompt engineering. Those are out of scope.

Not a full Article 14 implementation audit. This assessment surfaces your gaps. For teams that need a deeper 10-day engagement focused on implementation evidence, with a report you can show to regulators or counsel, see the Article 14 Gap Audit.

Not applicable to all AI systems. Article 14 applies to high-risk AI systems as defined in Annex III. If you're not sure whether your system qualifies, the free 20-minute diagnostic will clarify. Many AI agent systems don't qualify. Knowing that is also a useful output.

Not legal advice. I tell you what the technical controls are and whether you have them. Whether those controls satisfy your specific legal exposure is a question for your counsel. I work alongside your legal team, not instead of them.

Not an implementation engagement. The assessment tells you what to build. Building it is a separate scoped project.

2-week process.

Day 1

Kickoff

You share architecture documentation, code access (read-only), and any existing compliance documentation. I review your agent system's design: action space, tool definitions, authorization model, and deployment configuration. We align on scope: which agents, which action categories, which regulatory exposure is in scope.

Days 2–4

Controls review

I trace the kill switch, intervention points, and escalation paths. I test whether the kill switch actually stops execution under realistic conditions, not just in unit tests. I map the agent action space and review the authorization model for each action category.

Days 5–9

Attestation, safety nets, and Article 14 gap analysis

I review the audit log pipeline for completeness and tamper-evidence. I assess deterministic guardrails and fallback paths. If your system qualifies as high-risk, I map each Article 14 sub-clause to your current controls and draft the gap list.

Days 10–14

Findings report and walkthrough

I write the findings report. PDF and markdown delivered by end of Day 14. 60-minute walkthrough call with your engineering team on Day 14 or 15. Balance due on report delivery.

Questions.

What access do you need?

Read access to your codebase (specifically agent logic, policy layer, and audit log pipeline), architecture documentation, and your deployment configuration. I don't need access to production API keys, user data, or live agent execution. If your system handles regulated data, I can work with sanitized staging environments.

My system uses an LLM API (OpenAI, Anthropic, etc.) directly. Is this still relevant?

Yes. The infrastructure review is about what surrounds the model, not the model itself. Whether you call OpenAI's API or Anthropic's, the review covers the same questions: what's the policy layer, where's the kill switch, what does the audit log capture. The model provider is out of scope; your integration layer is in scope.

We're not sure if Article 14 applies to us.

That's a common starting point. Article 14 applies to AI systems that fall under Annex III of the EU AI Act: automated decision-making in credit, hiring, law enforcement, critical infrastructure, and education, among others. "AI agent" doesn't automatically qualify. The 20-minute diagnostic can clarify whether your system is in scope before you commit to the audit.

What's the refund policy on the $1,500 deposit?

Non-refundable after Day 1 kickoff. Before kickoff, full refund if you cancel at least 48 hours before the scheduled start. If I fail to deliver the report by Day 14, you owe nothing on the balance.

Who owns the report?

You do. Full IP transfer on final payment. I retain the right to reference the engagement type in aggregate, but will not disclose your firm name, system architecture, or findings without written permission. You can share the report with regulators, counsel, or investors.

Aug 2 is soon. Know your gap now.

If the assessment surfaces critical items, you need time to fix them. Two weeks for the assessment, then however long your engineering team needs to remediate. The earlier you know, the more options you have.

One slot per 2-week period. Reserve with the deposit.