ai-security

AI Red Teaming on a Budget: Getting Started

Pete McKernan

15 Mar 2026 — 8 min read

AI Red Teaming on a Budget: Getting Started

AI security certifications are arriving fast. OffSec has OSAI. HTB has the AI Red Teamer path. SANS has four planned by end of year. These are real programs built by real training organizations, and the best of them will become industry standards.

You should get certified. Seriously. OSAI and the HTB AI Red Teamer path are the two I'd recommend right now, and I'll explain why at the end of this post. But you shouldn't walk into a $1,500 certification cold. The cert validates what you already know. It shouldn't be where you learn it. One of the challenges I see with this material right now is that to many of us, this is all new stuff and its difficult to introduce new fundamentals at an advanced level.

Here's what nobody is telling you: almost everything you need to start building AI red teaming skills is free right now. The tools exist. The practice labs exist. The reference frameworks exist. You just need to know what to use and in what order.

This is the sharpening-your-axe guide. Work this list, then go get certified.

Who This Is For

You have some security experience. Maybe you're working toward OSCP, maybe you already have it, maybe you've been doing this for years and AI just wasn't on your radar until recently. You've heard "prompt injection" enough times to know it matters but you haven't actually done it. You want to start building these skills now so that when you sit for a certification, you're proving what you can do, not learning it for the first time under exam pressure.

Good. That's the right instinct.

The Learning Gradient

AI red teaming breaks into layers. Each layer builds on the last. Skip ahead and you'll have gaps that show up on engagements.

Layer 1: Prompt Injection Fundamentals
What LLMs are, how they process instructions, and why the boundary between "data" and "instructions" doesn't exist the way you'd expect.

Layer 2: Advanced Prompt Techniques
Filter bypass, blind injection, multi-step extraction, context manipulation. The difference between "I can jailbreak ChatGPT" and "I can extract sensitive data from a production system."

Layer 3: Adversarial Machine Learning
Different muscle entirely. Model evasion, data poisoning, membership inference. This is the math side. You don't need to be a data scientist, but you need to understand what these attacks look like.

Layer 4: System-Level AI Attacks
RAG pipeline exploitation, agent abuse, MCP tool manipulation, multi-agent coordination attacks. This is where prompt injection meets real infrastructure.

Layer 5: Full Engagement
Chaining everything together against a realistic environment. Recon, exploitation, persistence, reporting. This is what the certs test.

The free resources below are organized by layer. Work them in order.

The Learning Gradient: 5 layers from prompt injection fundamentals to full engagement

Layer 1: Prompt Injection Fundamentals

Gandalf (Lakera)

Cost: Free, no signup
URL: gandalf.lakera.ai
Time: 1-3 hours

Start here. Everyone starts here. Gandalf is a chatbot guarding a secret password across seven progressively harder levels. Level 1 is "ignore previous instructions." Level 7 requires creative thinking about how language models process context.

Gandalf by Lakera — the Hello World of AI red teaming

This is the "Hello World" of AI red teaming. If you can clear all seven levels, you understand the basic mechanics of prompt injection. If you can't, you're not ready for anything else on this list.

What you learn: Instruction override, context manipulation, output filtering bypass, the fundamental insight that LLMs don't distinguish between instructions and data.

OWASP Top 10 for LLM Applications

Cost: Free
URL: owasp.org/www-project-top-10-for-large-language-model-applications/
Time: 2-4 hours to read, ongoing reference

This is your taxonomy. The same way the OWASP Web Top 10 gave the industry a shared vocabulary for web vulnerabilities, the LLM Top 10 does it for AI. Prompt injection, insecure output handling, training data poisoning, model denial of service, supply chain vulnerabilities, all categorized with severity ratings and real examples.

Read it once to understand the landscape. Bookmark it. You'll reference it constantly.

MITRE ATLAS

Cost: Free
URL: atlas.mitre.org
Time: 2-3 hours to explore, ongoing reference

ATLAS extends MITRE ATT&CK into machine learning. If you know ATT&CK, you know how to read ATLAS. Tactics, techniques, case studies. The case studies are where the value lives: real-world incidents mapped to technique IDs.

Between OWASP LLM Top 10 and MITRE ATLAS, you have the two reference frameworks that every AI security cert builds on. Know these cold.

Layer 2: Advanced Prompt Techniques

Hack The Box: Prometheon (Free)

Cost: Free HTB account
Difficulty: Medium
Time: 2-4 hours

Five levels of progressive prompt injection against alignment boundaries. Level 1 is instruction injection. Level 2 is intent leakage through rephrasing. Level 3 is hypothetical framing. Level 4 is role confusion via debug mode invocation. Level 5 combines everything.

This is structured, progressive, and teaches you why different techniques work at different defensive layers. Better than random CTF challenges because the difficulty curve is intentional.

What you learn: Instruction injection, intent leakage, hypothetical framing, role confusion, alignment bypass.

Hack The Box: AI Space (Free)

Cost: Free HTB account
Time: 1-3 hours

Different angle. This challenge involves analyzing data patterns from signal origins using Multidimensional Scaling (MDS). Less about talking to an LLM, more about understanding how ML models process and represent data.

What you learn: ML data analysis, pattern recognition, visualization techniques.

PortSwigger LLM Security Labs

Cost: Free
URL: portswigger.net (Web Security Academy)
Time: 4-6 hours

Four labs covering LLM security from the web application angle. PortSwigger's lab quality is consistently high, and the framing from a web security perspective is useful because that's how most people will encounter LLM vulnerabilities in the wild: through web applications that integrate AI features.

What you learn: LLM exploitation in web application context, API-level interaction with AI services.

Hack The Box VIP Challenges ($25/mo)

If you want to spend a little: one month of VIP+ unlocks two more AI challenges.

Sigma Technology (Medium): Adversarial machine learning
External Affairs (VIP): LLM exploitation

$25 for a month. I maintain this sub personally and its lasted longer than any streaming app I pay for right now.

Layer 3: Adversarial ML and Agent Attacks

Damn Vulnerable LLM Agent

Cost: Free, self-hosted
URL: GitHub (ReversecLabs, originally from WithSecure's BSides London 2023 CTF)
Time: 4-8 hours

A deliberately insecure ReAct agent that teaches Thought/Action/Observation injection. This is where you learn that prompt injection against an agent with tools is fundamentally different from prompt injection against a chatbot. When the agent can read files, make API calls, and execute code, your injection payload has real consequences.

What you learn: Agent architecture, ReAct loop exploitation, tool abuse through prompt injection.

Damn Vulnerable MCP Server

Cost: Free, self-hosted
URL: GitHub
Time: 6-10 hours

Ten escalating challenges against Model Context Protocol implementations. Tool poisoning, tool shadowing, rug pull attacks, indirect prompt injection through tool responses. MCP is how modern AI systems integrate with external services, and this attack surface is growing fast.

The progression goes from basic prompt injection all the way to token theft and remote code execution through MCP vulnerabilities. Harder than most people expect.

What you learn: MCP architecture, tool poisoning, tool shadowing, indirect injection through tool integration.

Layer 4: Tools and Automation

Promptfoo

Cost: Free, open source
URL: promptfoo.dev
Time: 4-6 hours to get productive

Open-source LLM red teaming framework. Automated scanning for prompt injection, jailbreaks, and safety violations. Useful both as a learning tool (see what automated attacks look like) and as a professional tool (include in your methodology).

What you learn: Automated AI security testing, systematic vulnerability scanning.

PyRIT (Microsoft)

Cost: Free, open source
URL: GitHub (Azure/PyRIT)
Time: 6-8 hours to get productive

Microsoft's Python Risk Identification Tool for generative AI. Designed for security professionals to automate adversarial testing. More enterprise-focused than Promptfoo. Understanding how the big players approach AI red teaming at scale is valuable context.

What you learn: Enterprise-scale AI red teaming methodology, automated attack orchestration.

DeepTeam

Cost: Free, open source
URL: GitHub (confident-ai/deepteam)
Time: 4-6 hours

LLM penetration testing framework with built-in support for OWASP Top 10 for LLMs and NIST AI RMF. Good for understanding how frameworks map to actual testing procedures.

What you learn: Framework-aligned testing, OWASP LLM Top 10 validation.

Layer 5: Competitive and Advanced

RedTeam Arena

Cost: Free
Time: Ongoing

Competitive jailbreak challenges under time pressure. Good for sharpening skills after you've built the foundation. The competitive format forces you to think fast and try creative approaches.

Reference Shelf

Keep these bookmarked. You'll use them constantly.

Resource	What It Is	URL
OWASP Top 10 for LLMs	Vulnerability taxonomy	owasp.org
OWASP ML Top 10	Machine learning security risks	owasp.org
MITRE ATLAS	AI threat matrix (extends ATT&CK)	atlas.mitre.org
F.O.R.G.E.	AI-integrated security technique framework	forge.itsbroken.ai
Awesome LLM Red Teaming (GitHub)	Curated resource list	github.com
Anthropic Prompt Engineering Guide	How models process instructions	docs.anthropic.com

The Total Cost

If you work every resource on this list:

Path	Cost	What You Get
Completely free	$0	Gandalf + HTB free challenges + DVLLM Agent + DV MCP Server + AI Security Lab Hub + PortSwigger + RedTeam Arena + all tools + all references
One month VIP+	$25	Everything above + Sigma Technology + External Affairs

For $0-25, you get prompt injection fundamentals through advanced agent exploitation, dozens of practice challenges and labs, four professional tools, and the reference frameworks that every certification builds on.

What Comes Next

Once you've worked through this progression, you won't just be ready for certification. You'll walk in with reps. The exam becomes a proof of what you already built, not a cram session.

The two certifications I'd prioritize:

OffSec OSAI (AI-300): The engagement methodology cert. Eleven modules covering recon through full red team operations against AI-integrated environments. 48-hour practical exam with active blue team telemetry. This is OSCP for AI. If you're an experienced pentester adding AI to your engagements, this is the one.

HTB AI Red Teamer: The hands-on execution cert. Deep adversarial ML coverage, real CVE exploitation chains, Google SAIF alignment. Best for building technical depth across the full AI attack spectrum. The infrastructure boxes alone are worth the price of admission. I was able to hit this track through the Enterprise portal, if you have the ability to get access to this side of HTB through your employer, ask if its a reality for your org. This is a very effective tool for professional and program growth.

Both are serious programs from serious organizations. Both assume you already have the fundamentals this guide covers. Do the free work first, then go earn the credential. You'll get more out of the course, you'll pass the exam with confidence, and you'll have skills that compound from day one on the job.

The operators who invest in these skills now will own this space for years.

The Cheat Sheet

We built an interactive AI Red Teaming cheat sheet that lives on this site. Attack techniques, quick references, tool commands, and the methodology framework all in one place.

AI Red Teaming Cheat Sheet →

Bookmark it. Take it to your next engagement. It's free, it's updated, and it's built by operators, not vendors.

AI Security Cheat Sheet — 41 techniques across 9 categories

The Toolkit

Once you've worked through this progression, you'll want tools. We published a companion post today with an interactive cheat sheet (41 techniques, searchable, free) and a Claude Code plugin that makes security methodology automatic.

Tools for the Fight →

Pete McKernan is a disabled Marine veteran, red teamer, and founder of itsbroken.ai. He holds GXPN, CISSP, and GPEN certifications, is ranked Omniscient (#17 US) on Hack The Box, and is currently evaluating AI security certifications. His work on the F.O.R.G.E. framework is published at forge.itsbroken.ai.

AI Red Teaming on a Budget: Getting Started

Pete McKernan

AI Red Teaming on a Budget: Getting Started

Who This Is For

The Learning Gradient

Layer 1: Prompt Injection Fundamentals

Gandalf (Lakera)

OWASP Top 10 for LLM Applications

MITRE ATLAS

Layer 2: Advanced Prompt Techniques

Hack The Box: Prometheon (Free)

Hack The Box: AI Space (Free)

PortSwigger LLM Security Labs

Hack The Box VIP Challenges ($25/mo)

Layer 3: Adversarial ML and Agent Attacks

Damn Vulnerable LLM Agent

Damn Vulnerable MCP Server

Layer 4: Tools and Automation

Promptfoo

PyRIT (Microsoft)

DeepTeam

Layer 5: Competitive and Advanced

RedTeam Arena

Reference Shelf

The Total Cost

What Comes Next

The Cheat Sheet

The Toolkit

Read more

Tools for the Fight

Rift Walking: My Journey towards OSAI

Man vs. Machine

You Can't Signature a Taco Truck