AI Security, Offensive Security, Certification Review

My Experience Passing the Hack The Box Certified Offensive AI Expert — HTB COAE

Ryan McCutcheonVulnX Security, LLC

Most offensive security certifications focus on traditional attack surfaces: web applications, Active Directory, networks, privilege escalation, and post-exploitation. COAE is different.

AI Red TeamingCOAEHack The BoxAdversarial Machine LearningLLM SecurityOffensive AI Security

HTB Certified Offensive AI Expert COAE banner

Introduction

I recently passed the Hack The Box Certified Offensive AI Expert (HTB COAE) certification.

I am one of the top 50 people worldwide to pass the COAE. Earning the certification this early makes it a meaningful milestone in my offensive AI security journey.

This certification stood out to me because it sits at the intersection of two areas I care deeply about: offensive security and AI security.

Most offensive security certifications focus on traditional attack surfaces: web applications, Active Directory, networks, privilege escalation, and post-exploitation. COAE is different. It focuses on how modern AI and machine learning systems fail under adversarial pressure.

That includes large language models, prompt injection, adversarial machine learning, model evasion, data attacks, privacy risks, application-layer weaknesses, and the security implications of integrating AI into real business workflows.

This was not just a “jailbreak the chatbot” certification. It was a practical offensive AI assessment that required understanding how AI systems work, how they are deployed, how they can be abused, and how to communicate those risks in a professional report.

For VulnX, this certification directly supports the direction we are moving: helping organizations assess, attack, and harden modern systems before real attackers do.


Why I Pursued COAE

AI is being adopted faster than most organizations can secure it.

Companies are integrating LLMs, copilots, AI agents, retrieval-augmented generation systems, automation workflows, and machine learning models into production environments. In many cases, these systems are being connected to sensitive data, APIs, identity systems, customer information, internal documentation, and business-critical processes.

That changes the attack surface.

The question is no longer only:

Can the model be tricked?

The better question is:

What can the AI system access, influence, disclose, modify, or trigger?

That is where the real risk lives.

As a red teamer, I wanted to go deeper than surface-level AI security concepts. I wanted hands-on experience testing AI systems in a realistic way — not just reading about prompt injection, but actually exploiting AI-specific weaknesses and documenting impact like a real engagement.

COAE gave me that opportunity.


Certification at a Glance

Ryan McCutcheon HTB Certified Offensive AI Expert certificate

The HTB COAE is tied to Hack The Box Academy’s AI Red Teamer Job Role Path.

At a high level, the path covers:

  • AI and machine learning fundamentals
  • Practical AI development workflows
  • AI threat modeling and red team methodology
  • Prompt injection attacks
  • LLM output attacks
  • AI data attacks
  • AI application and system security
  • Evasion attacks
  • First-order adversarial attacks
  • Sparsity-based attacks
  • AI privacy
  • AI defense

The certification exam itself is a practical assessment. Candidates are placed in an AI-focused environment and must identify, exploit, document, and report security issues.

The final deliverable is not just flags. It requires a professional report, which I appreciated. In real consulting and red team work, the exploit is only part of the job. The report is what helps the client understand the risk, reproduce the issue, prioritize remediation, and improve their security posture.

That made the exam feel much closer to a real engagement.


My Background Going Into the Exam

Going into COAE, I already had a strong offensive security foundation. My background includes penetration testing, web application security, Active Directory attack paths, offensive tooling, vulnerability reporting, and hands-on lab work through platforms like Hack The Box.

I also hold the OSCP, which helped from a methodology standpoint.

However, COAE required a different type of thinking.

Traditional offensive security often rewards enumeration, chaining misconfigurations, abusing trust relationships, and finding the one overlooked weakness that leads to compromise.

AI security still requires that mindset, but it adds new layers:

  • Understanding model behavior
  • Understanding input/output boundaries
  • Thinking about data leakage and privacy risk
  • Evaluating how guardrails fail
  • Recognizing when model behavior can be manipulated
  • Understanding adversarial perturbations
  • Translating mathematical concepts into practical attacks
  • Thinking about how AI systems interact with surrounding applications

That last point is important.

AI vulnerabilities rarely exist in isolation. The real impact usually comes from how the model is connected to the rest of the system.


What Made the Course Valuable

The strongest part of the AI Red Teamer path is that it does not treat AI security as one topic.

It breaks the attack surface into multiple layers:

  • Model layer: How models make decisions and how adversarial inputs can influence predictions.
  • Data layer: How training data, datasets, labels, and model artifacts can introduce risk.
  • Application layer: How AI features are exposed through web apps, APIs, workflows, and user interfaces.
  • System layer: How the surrounding infrastructure, dependencies, and deployment patterns affect security.
  • Privacy layer: How attackers can infer sensitive information from models and outputs.
  • Defense layer: How mitigations such as guardrails, adversarial training, privacy-preserving techniques, and secure design patterns can reduce risk.

That layered approach is what made the path useful.

It forced me to stop thinking about AI security as simply “prompt injection” and start thinking about AI systems as full application ecosystems.


The Hardest Part: Adversarial Machine Learning

The most challenging part of the path for me was adversarial machine learning.

Topics like gradient-based attacks, norm constraints, perturbations, saliency maps, and model evasion require a different mental model than traditional web exploitation.

In traditional web testing, you may be looking for broken access control, injection flaws, insecure file handling, weak authentication, or exposed secrets.

In adversarial ML, you are thinking about how small changes to input data can cause a model to make the wrong prediction while still appearing valid to a human or downstream system.

That is a different kind of attack.

It requires more comfort with:

  • Python
  • PyTorch
  • Model inputs and outputs
  • Loss functions
  • Gradients
  • Optimization
  • Image classification workflows
  • Data preprocessing
  • Model confidence
  • Attack constraints

Some of the material was difficult, but that difficulty is also what made it valuable. By the end, I had a much better understanding of why adversarial attacks work, not just how to run them.

That distinction matters.

Running a tool is not expertise. Understanding why the attack works is what allows you to adapt when the environment changes.


The Exam Experience

Without giving away exam details, the COAE exam felt practical and fair.

The exam tested whether I could apply the concepts from the path in a realistic environment. It was not a trivia test, and it was not something I could pass by memorizing definitions.

The exam required me to:

  • Enumerate the environment
  • Understand the AI components in scope
  • Identify weaknesses
  • Develop working proof-of-concept attacks
  • Validate impact
  • Document evidence clearly
  • Write a professional report

The reporting requirement was one of the best parts of the certification.

In real offensive security work, finding a vulnerability is not enough. You need to explain:

  • What the vulnerability is
  • Why it exists
  • How it can be exploited
  • What the business or security impact is
  • How the client can remediate it
  • What short-term and long-term controls should be considered

That is especially important in AI security because many stakeholders are still learning how to reason about these risks. A vague finding like “prompt injection exists” is not enough. The report needs to explain the practical impact in the context of the system.


Key Lessons Learned

1. AI security is not just prompt injection

Prompt injection is important, but it is only one part of the AI attack surface.

AI systems can fail through weak application logic, poor access controls, unsafe tool usage, insecure data flows, model leakage, data poisoning, excessive trust in outputs, weak guardrails, and insecure integrations.

The model is only one component. The system around the model is often where the real impact happens.

2. Context determines severity

A prompt injection issue in a standalone demo chatbot may be low impact.

A prompt injection issue in an AI agent with access to internal documents, APIs, customer records, or business workflows may be much more serious.

The same technique can have very different severity depending on what the AI system can access or influence.

3. AI red teaming requires both offensive and defensive thinking

The best AI red teamers need to understand how attacks work and how defenses fail.

That includes guardrails, adversarial training, privacy-preserving methods, model monitoring, access control, logging, human approval flows, and secure architecture.

If you cannot explain remediation clearly, you are only doing half the job.

4. Reporting matters

AI findings can be difficult for non-specialists to understand.

A strong report needs to translate technical behavior into operational risk. It should include clear reproduction steps, evidence, affected components, business impact, and remediation guidance.

This is one area where COAE felt aligned with real consulting work.

5. AI systems expand existing attack surfaces

AI does not replace traditional security risk. It compounds it.

If an AI application has weak authentication, broken access control, insecure APIs, poor logging, exposed secrets, or unsafe backend integrations, AI can amplify the impact.

The fundamentals still matter.


Who Should Take COAE?

I would recommend COAE for:

  • Penetration testers moving into AI security
  • Red teamers who want to assess LLM and ML-enabled systems
  • Application security engineers supporting AI features
  • Security engineers responsible for AI risk
  • Consultants building AI security assessment services
  • Researchers interested in adversarial machine learning
  • Anyone who wants hands-on AI red team experience beyond basic prompt injection

That said, I would not recommend going into it completely cold.

Helpful prerequisites include:

  • Strong Python fundamentals
  • Basic web application security knowledge
  • Familiarity with machine learning concepts
  • Comfort reading technical documentation
  • Some exposure to PyTorch or similar ML libraries
  • Experience writing professional security reports

You do not need to be a machine learning engineer, but you should be willing to wrestle with the technical material.


How I Would Prepare

If I were preparing again, I would focus on four areas.

Build strong notes

Take detailed notes throughout the AI Red Teamer path. Do not just copy commands. Capture why each attack works, what assumptions it depends on, and how to recognize similar patterns later.

Use Jupyter notebooks

For the machine learning and adversarial attack portions, Jupyter notebooks are extremely useful. Build reusable snippets and keep your experiments organized.

During an exam or real engagement, having working reference code is much more valuable than having vague notes.

Practice reporting

Do not treat reporting as an afterthought.

For every major lab or exercise, practice writing the finding as if it were going to a client. Include description, impact, reproduction steps, evidence, remediation, and references.

Learn the defensive side

Do not skip the defense content.

Understanding mitigations makes you a better attacker and a better consultant. It helps you explain risk in a way that engineering teams can actually act on.


Final Thoughts

Passing COAE was a meaningful milestone for me.

It validated a skillset that I believe will become increasingly important: the ability to assess AI-enabled systems through an offensive security lens.

AI is already being embedded into applications, security tools, business workflows, customer support systems, developer tooling, and internal knowledge platforms. As that adoption continues, organizations will need security professionals who understand more than traditional web vulnerabilities.

They will need people who can evaluate how AI systems behave, how they fail, how they can be manipulated, and how those failures affect the broader environment.

That is the value of COAE.

It does not just teach isolated attacks. It helps build the mindset needed to assess AI systems as real-world attack surfaces.

For VulnX Security, this aligns directly with our mission: applying real-world attacker methodology, enterprise-grade reporting, and practical defensive guidance to help organizations identify risk before attackers do.

AI security is no longer theoretical.

It is here, it is moving quickly, and it deserves serious offensive testing.


About VulnX Security

VulnX Security, LLC provides offensive security consulting focused on real-world attacker methodology, clear reporting, and actionable remediation guidance.

Our services include:

  • Web application penetration testing
  • API security testing
  • External and internal penetration testing
  • Active Directory security assessments
  • AI and LLM security assessments
  • Cloud and Microsoft 365 security reviews
  • Vulnerability assessment and remediation guidance

If your organization is adopting AI-enabled applications, LLM workflows, or machine learning systems, now is the time to start assessing how those systems can fail under adversarial pressure.