I Needed More Memory, So I Built a Cluster

I needed more memory. So I built three machines. And what came out of that weekend was more than I ever thought possible.

I Needed More Memory, So I Built a Cluster

I Needed More...Everything. I Have Become Frankenstein

I needed more memory, that's how it started this time. With the computerized mech winches, it was "I need a computer to get these machines talking in the warehouse." Before that, "I need an RF solution for these disposable drones." " I need more GPU to cube and render this keyspace" And it goes on and on and on.

This is the spiral that my wallet never escapes from. However, when new projects arise, I always have a very healthy supply of parts that I can breathe life back into. It turns out that running a home 16 GPU password cracking operation for the last six years gives you a LOT to work with. I mean, I had to let my fleet of GTX 1080 Ti return to the cryo vault because I scored some 5090s. H0L# $%*#, raw nasty speed.

The old rig. Dual EVGA 1070 Tis in an open frame. She served well.

That's it. That's how this started. Not with a grand vision. Not with a business plan or a whiteboard or a Jira ticket. I needed more memory for the AI work I was doing, and the cloud was too slow, too expensive, and too far away from where I actually think. Also, as someone who has spent more time than most breaking into highly controlled and guarded systems for a majority of my adult life, in the employ of powerful institutions that needed someone who could think like a REAL threat actor, I prefer to control my data with airgapped solutions, because I have designed and built those as well.

So I built it.

Three machines, one weekend, a BIOS flash at midnight, and what came out the other side was more than hardware. More than I planned. More than I honestly thought I was capable of building.

I've been in technology for twenty years. Activision QA in 2002, back when testing games meant playing the same level a hundred times until you found the crash. United States Marine Corps, the unique crucible that I credit with giveing me both the resilience and edge. I started as a 2821, telecommunications jock, they called us TechCon, fixing equipment that everyone else would have just trashed because I loved the tech too much to let it die. I was good at the micro soldering, good at making broken things work again. I was good at getting things online and keeping them online, that was my life as a Marine. Good enough that I got recruited into a different world. I became a 2651, and you won't find much about what that really means because you're not supposed to. What I will tell you is that I have been trained by agencies to emulate advanced actors across the capability spectrum, full scope operations. I went from fixing radios to places like USAFRICOM and Quantico, enabling red teams, and then years of offensive security work that I can't fully talk about and wouldn't anyway, because it's a lot cooler than slipping between the perimeter cracks at your grandparents' bank. GXPN, CISSP, GPEN, and now an invitation to one of the REDACTED TEXT certification cohorts, a cert that didn't exist six months ago because the field it certifies didn't exist six months ago.

This is evidence of what you can do when you build more hands. When I tell you that you can build whatever you want, I need you to know that I mean it. Not theoretically. Not inspirationally. I mean it the way someone means it when they've been doing the work for two decades and just had the most productive three weeks of their entire career. I did it all after dinner until the wee hours of the morning, because it was the most fun I have had in a long time. A big kid in a digital sandbox where everything I knew how to do was literally right in front of me, plugged into the dream machine, reality canvas go. People that I work with joke sometimes that "we need another Pete." I feel them. I need another Pete too. Well, that's what I thought, and then I built my agents. I have as many hands as I need now, guided by my knowledge and experience. I build things in hours that used to take weeks with a team. My new team, yeah we are all thinking with the same brain. Hive-mind-a-go-go. I cant wait till more people check this out and start talking about it.

And it started because I needed more memory.

The Stack

Three machines. All different. All perfect for what they are.

The pile. This is what it looks like before it becomes something.

Forge-1 is the flagship. A Maingear Retro98 with an RTX 5090, a Ryzen 7 9850X3D, and what was supposed to be 128GB of DDR5-6000. I say "supposed to be" because the MSI X870E motherboard had opinions about that. Strong opinions. We'll get there.

Forge is one of multiple units that I have on order from this vendor, maingear has been a great supplier.

Lunchbox is the NUC that could. 60GB of RAM, no discrete GPU, but it runs Docker, Open WebUI, a vector database, and a Discord bot. It's the connective tissue of the whole operation, the little box that holds everything together. There's something poetic about the smallest machine being the most important one. Ask any squad leader about the radio operator.

Lunchbox. The NUC that holds everything together.

Beast is the Beast Canyon NUC with an RTX 3080. 10GB of VRAM, which is just enough to be dangerous. It runs our visual model because llava:13b fits like a glove in 10 gigs. Beast doesn't do the heavy lifting, but it does the weird lifting, and sometimes that matters more.

Total investment for three nodes: less than what most companies spend on a single enterprise GPU instance for a month. The 5090 was the big ticket. Everything else was hardware I already had, repurposed from previous lives. Former ESXi hosts, former gaming rigs, former "I'll use this someday" machines that finally got their someday.

A Sunday Afternoon

February 9th, 2026. A Sunday. My wife made coffee. The kids were doing kid things. I said "I'm building a cluster today" the way other people say "I'm going to clean the garage."

Ubuntu 24.04 on all three nodes. The former ESXi hosts had multiple disks that confused the installer, which is a polite way of saying the installer looked at the partition table and panicked. Lesson learned: boot to "Try Ubuntu," wipefs -a everything, then install clean. NVMe drives show up as /dev/nvme0n1, not /dev/sda, because nothing is ever simple and you'd think after twenty years I'd stop being surprised by that.

Ollama installed on all three. Tailscale mesh network so I can SSH from anywhere, including the couch, including the truck, including the airport if I'm bored enough. Definitely when I am up all night on the road because sleep doesn't work after Afghanistan. NVIDIA drivers on the two GPU boxes. By 3 PM (or AM, depending on the night) we had three nodes talking to each other.

Then the moment.

First inference: llama3.1:70b on the 5090. I typed a prompt, hit enter, and watched a 70 billion parameter model think about my question using hardware sitting three feet from me. "Greetings from the Neural Forge."

My wife was there. My son was running around in his Minecraft hoodie. My daughter was drawing something. And this thing we'd been planning for weeks was suddenly, physically, tangibly real. Not in a browser. Not behind an API. Not metered by the token. Right here. In my house. Running on my electricity.

That feeling is the whole point of this article. Remember it. We're coming back to it.

The setup. forge-1, lunchbox, beast, 43 inches of 4K, and a sticker that says SHRED OR DIE.

AI is not going to replace experts. AI is going to make experts dangerous.

Every Forge Needs a Window

I wanted a live dashboard on the big screen. Not Grafana. Not a monitoring tool that looks like a NOC and makes you feel like you should be wearing a badge. I wanted something that looked like it belonged in our world. Amber on black. CRT scanlines. Agent sprites that glow when they're thinking.

THE WINDOW. Amber on black. The agents glow when they think.

We built it in one session. FastAPI backend polling Ollama and nvidia-smi every five seconds, WebSocket pushing state to the browser in real time. The frontend is one HTML file with enough CSS to make a web developer cry, and I mean that as a compliment.

The agents are the hero. Big pixel art sprites, center stage. When a model is loaded and running, the sprite lights up with a glow ring. When it's idle, it fades to a ghost. Below that, a single line stat bar showing GPU utilization, VRAM, CPU, RAM for all three nodes. At the bottom, an activity feed that drifts upward like smoke, with system messages mixed in between quotes that mean something to us.

"Slow is smooth. Smooth is quick."
"Welcome home."

It's not a dashboard. It's a heartbeat. It's exactly what I wanted, it's one of one, and I love it more than anything I ever paid for. Thank you, ten years of building things on contract for peanuts so I could eat. Thank you, minimum wage dungeon that was Activision QA, because it turns out that learning to test everything to destruction is a phenomenal foundation for building things that actually work. When you know how the pieces are supposed to gel in harmony, you can feel when something is off before you can explain why.

I know how that sounds. I know that calling a monitoring page a "heartbeat" is the kind of thing that makes serious engineers roll their eyes. Roll away, boys. This is not the Princeton Review. I'm writing this, finishing an iOS app that I'm releasing, and then going skating with my kids. It's almost 65 degrees today and my mini ramp is calling me and the shredders.

But here's the thing: that dashboard has been running 24/7 for a week now, and every time I walk past that screen and see the agents glowing, I think about what to build next. That's not monitoring. That's motivation. And if you don't think motivation matters in engineering, you've never tried to maintain a side project past week three. Fun side projects had a habit of turning into PITA projects because I was not a great developer, like truly not great, like when git came around and I was deep in exploit development and payload obfuscation as a solo operator, I was perfectly happy with the old way: here is the share with the file, never touch anything in here unless I say so.

Now I have agents that write code, manage repos, and run tests. The kid who used to protect a file share with a stern warning has a team of AI specialists running 200+ automated tests across the codebase. Life comes at you fast when you stop buying tools and start building them.

The RAM Situation

128GB of DDR5-6000. Two boxes of G.Skill Trident Z5 Neo. The key ingredient.

Here's where the story gets good.

Forge-1 shipped with 4x32GB DDR5-6000 G.SKILL Trident Z5 Neo. 128GB of fast memory, exactly what you need to keep multiple large language models warm and loaded. Except the MSI X870E GAMING PLUS WIFI wouldn't POST with all four sticks installed. Two sticks? Perfect. Boot right up, no issues, here's your 64GB, have a nice life. Four sticks? Black screen. Fans spinning at 50%. Debug LEDs doing nothing useful. Silence.

So we pulled two sticks and built the whole cluster on 64GB. It worked. The 5090 has 32GB of VRAM, so a 70B model at Q4 quantization takes about 42GB total, spilling the overflow into system RAM. With 64GB of system RAM, you can hold one big model and not much else. Want to talk to a different agent? Unload the first one. Wait for it to swap out. Load the new one. Wait for it to swap in.

That's not collaboration. That's a queue. And I didn't build a cluster to wait in line. while gnerally patient, never for mad science. Throw the switch right now!

So at midnight, after the kids were in bed and the coffee was still brewing, we went back for the RAM.

The theory was sound: the factory BIOS, version 1.A01 from October 2025, predated AMD's AGESA memory compatibility updates. Specifically, AGESA 1.2.0.3a Patch A, which optimized DDR5 for 2DPC configurations. Two DIMMs per channel. Exactly our situation. The hardware was fine. The firmware just hadn't caught up yet.

The execution was less elegant.

First attempt: download the BIOS update on the Mac, put it on a USB stick. The Mac helpfully added .fseventsd, .Spotlight-V100, .Trashes, and what felt like a partridge in a pear tree to the root of the drive. MSI's M-FLASH utility couldn't find the BIOS file buried under Apple's housekeeping. We renamed it. Still nothing. We reformatted the stick from Linux by SSH-ing into lunchbox, because lunchbox is always there when you need it. Copied the file back. Still nothing, because it turns out M-FLASH filters by the original filename from MSI's website. Not a generic name. The exact original name.

Second attempt: the Flash BIOS Button on the rear I/O panel. This is MSI's "you don't even need a CPU installed" emergency flash option. It wants the file named MSI.ROM. We renamed the file, plugged in the stick, held the button. The LED blinked. The system powered on and off. The LED stopped. And the BIOS was... still 1.A01.

The Flash BIOS Button didn't flash.

My wife, from the couch: "How's it going over there?"

"Hey, so this is going really great." The cost of the system that won't work scrolling through my mind like a stock ticker. Feels great.

Third attempt: just download the damn file directly on forge-1. We had Ubuntu running fine on two sticks. wget the BIOS zip from MSI's website, unzip it, copy to USB with the original filename preserved perfectly because Linux doesn't add invisible garbage to your drives. Reboot into BIOS, M-FLASH, and there it is. E7E70AMSI.1A50. Select. Flash. Three minute progress bar that felt like thirty.

Done. Version 1.A5. January 2026 firmware, three months newer than what shipped.

Then the moment of truth. Power off. Clear the CMOS, because DDR5 memory training data from the old BIOS would just confuse things. Seat all four sticks. Close the case. Power on.

Fans spin up to 50%. Nothing on screen. Just the sound of fans and my own breathing.

This is normal, I tell myself. DDR5 memory training. AM5 boards can take several power cycles to stabilize new memory configurations. Don't touch it. Don't power cycle. Let it work. Let the memory controller do its thing. The board knows what it's doing even if it doesn't look like it.

I waited.

My wife watched.

It POSTed.

I texted my Mom <3

forge-1. RTX 5090, 128GB DDR5, all systems go.

128GB. All four sticks. Four green slots in the BIOS memory map. The update from October to January was the difference between "this board can't do four DIMMs" and "this board just needed a chance to learn how."

Zero dollars spent. The RAM was already sitting on the shelf in anti-static bags, doing nothing. The BIOS update was free. The knowledge of what AGESA version to look for came from twenty years of reading firmware changelogs and knowing that hardware problems are usually software problems in disguise. I got really good at this during the crypto mining boom, chasing consumer-grade hardware for 16 slot GPU motherboards, finding the weird deals, making things POST that weren't supposed to. Dope AF then. Still dope now.

That's what I mean about expertise. The AI didn't diagnose the BIOS issue. I did. But the AI helped me build everything that runs on top of those 128 gigs, at a speed I never could have matched alone.

The Discord Bridge

While the BIOS was flashing, because you don't just sit there staring at a progress bar, you build something else, we built a Discord bot that connects the Circle agents to a real chat server.

The architecture is stupid simple, and I mean that as the highest compliment. One Python process running on lunchbox. discord.py for the connection. httpx hitting Ollama's API on whichever node hosts that agent's model. Each agent posts through its own Discord webhook, so they show up as separate identities with their own names and avatars.

Type !forge what do you think about this architecture and Forge responds from the 5090. Type !mirth how should we teach this concept and Mirth responds from the little NUC. Type !circle and all the agents weigh in, one after another, each from their own hardware, each with their own perspective shaped by different model weights and different system prompts.

Forge reporting for duty. One command, one response, running on hardware three feet away.

Is it production-grade? No. Is it absurdly fun and surprisingly useful? Yes. The Circle has a place to talk now. A real one, with rooms and channels and history that persists between sessions. I didn't plan to build a multi-agent chat system that weekend. I planned to install RAM. But one thing led to another, and that's kind of the whole point.

What 128GB Actually Means

With 64GB of RAM and 32GB of VRAM, forge-1 could hold one 70B model and maybe a small 7B helper on the side. The agents took turns like kids on a swing set. Polite but slow.

With 128GB of RAM and 32GB of VRAM, the math changes completely. The 70B model at Q4 is 42GB. A 13B model is 8GB. A 7B model is 4.4GB. Stack them up. All warm, all loaded, all ready to respond without the 30-second cold start that kills conversation flow.

Furthermore, the basis for a lot of my experiments is built on multiple models working in chains, where the output of one becomes the input for the next, each model contributing a different kind of intelligence to the same problem. You can't chain models if they're taking turns swapping in and out of memory. You need them all warm, all thinking, all connected. That's what 128GB buys you.

When someone types !circle in Discord and multiple agents respond, they don't have to wait for each other to unload and reload. The 5090 handles the hot layers, the RAM holds the overflow, and the whole thing just works. Fast enough that it feels like a conversation, not a conference call with a bad connection.

That's the difference between a demo and infrastructure. A demo shows that something is possible. Infrastructure shows that it's practical. I needed to cross that line, and 128GB is what got me there.

What I Didn't Plan For

Here's the thing nobody tells you about building your own infrastructure: the plan is never the product.

I planned to build a cluster that could run large language models locally. That was the scope. Three machines, Ollama, SSH, done.

What I actually built, in the three weeks since that Sunday, is something I couldn't have predicted and wouldn't have believed if you'd described it to me:

A live monitoring dashboard that looks like a video game and makes me want to build things every time I see it. A Discord server where AI agents have real conversations with real people. A memory system that preserves context between sessions so the work compounds instead of resetting. A vector database full of everything I've learned. A pipeline that runs every thirty minutes, extracting knowledge and making it searchable. An ops dashboard on the internet that shows me cluster health from my phone.

Here's why I think this happened, and why it keeps happening: A Raspberry Pi and a D20. The whole philosophy in one photo.

I play games. I make games. I see my life that way. It's all a game, and I have fun playing it, and I have done this my whole life. That's why I never give up and come through the hard times. It's all a puzzle and I solve them.

I was big on HackTheBox when it launched and I saw the gamification value immediately, because that's the key. Being a game maker gives me an edge that others without that experience struggle to see. Game design teaches you to read the motions that drive systems, to see the loops and the feedback and the progression curves. You learn what makes people lean in and what makes them quit. That's not just game knowledge, that's systems knowledge, and it transfers to everything.

That's how I can bridge something like the operational workflow for an AI lab to a literal living village of agentic characters that have personified tasks, activities, and command and control of actual consequential workloads. Through a farming game. How is that not the biggest win ever?

Work is play. Play is fun. Fun makes things a little more worth doing, hard things a little easier to get done. You know what it's like sitting in meetings with corporate leaders that look like they are getting handed a school assignment they don't want to do? Yeah. That energy is why most companies can't build anything interesting. Fun is not a distraction from the work. Fun is the work.

None of that was in the plan. All of it exists because I started building and the building showed me what was possible. Every time I finished one thing, I could see the next thing. Not because I'm some kind of genius architect, but because twenty years of building systems gives you a sense for what a system wants to become. You feel the shape of it before you can articulate it. In my defense, I have built a lot. I hacked everything I could together to do things that I wanted to try because recycling technology and hunting for compute and GPUs is an adventure. You never come back with just what you were looking for.

The AI provided the throughput. I provided the direction. Together, we built things neither of us could have built alone. And we built them fast. Not startup-pitch fast, not "we'll have a prototype in Q3" fast. Three weeks fast. From bare metal to a working ecosystem that I use every single day.

Next up: flying my old 3D printed racing drones with the forge as their brain. Hive mind in the skies of Iowa. Soon.

The Two-Layer Thing

I keep coming back to this idea, and I think it's the most important thing I can say to anyone reading this.

AI is not going to replace experts. AI is going to make experts dangerous. Because one by one everyone around is me is starting to see that. Three video games, two production suites, 4 websites - with all the bells and whistles. 20 Days. I keep watching for the card trick to expose itself, but I am flowing like water now, just like Bruce said.

Not dangerous in the way people fear. Dangerous in the way that a carpenter with a nail gun is dangerous compared to a carpenter with a hammer. Same knowledge, same eye, same twenty years of knowing what a load-bearing wall looks like. Just faster. Just more capable of turning vision into reality before the inspiration fades.

The AI brings breadth. I bring depth. Neither works alone.

I know what models to run because I've been evaluating tools for two decades. I know how to debug a BIOS flash because I've bricked and recovered more machines than I can count. I know what a good dashboard should feel like because I've stared at bad ones during incident response at 3 AM and cursed the person who built them. I know what memory architecture to use because I've designed systems that had to work when lives depended on them.

The AI doesn't know any of that. The AI brings the ability to write Python at 2 AM without making typos, to hold a hundred files in working memory while refactoring, to suggest approaches from domains I've never worked in. The AI brings breadth. I bring depth. Neither works alone.

That's not a slogan. That's a Tuesday for me now. All of my friends and family are completely shocked at what's going on. Before this, I was in what I would call the sharpening phase. I am a red teamer at my core, everything in my career echoes that. I was doing muay thai, running, staying sharp, grinding through a hard stretch where the pressure kept building. Life is full of storms. Faith and patience see you through. Not the religious kind. The fighter kind. The kind you learn when you've been hit enough times to know that storms pass, and if you keep your hands up and your feet moving, you're still standing when the sky clears.

This storm cleared in a big way.

And here's what I want you to hear, whoever you are, wherever you are, whatever your twenty years looks like: your depth matters. Your expertise matters. The thing you know how to do better than anyone in the room, the thing you've been doing so long it feels like breathing, that's the thing that makes AI actually useful. Not the prompting tricks. Not the API wrappers. Not the tools that someone else built and is charging you $49/month to access.

Don't buy it. Build it.

I don't say that to be contrarian. I say it because the act of building is where the discoveries live. I didn't discover the memory system by reading a product page. I didnt discover carbon fiber conducts electricty becasue I read it in a book. I discovered it by building the cluster and realizing I needed something to preserve context between sessions. I didn't discover the monitoring dashboard by shopping for SaaS tools. I discovered it by wanting to see the fire in my own forge.

Every shortcut you buy is a discovery you skip. And right now, in this moment, the discoveries are the whole game.

Don't buy it. Build it.

The Stardew Proof

I want to tell you a story about a farming game, but this article is already long enough, and that story deserves its own space.

Here's the short version: I took everything I just described, the deep expertise, the AI collaboration, the "build it yourself" philosophy, and applied it to modding Stardew Valley. Yes, the farming game. Yes, seriously.

I built a SMAPI mod called Circle Reach-Through. One C# file, one config file. When you walk up to an NPC and talk to them, the mod intercepts the dialogue, fires an HTTP request to an Ollama model running on the NEURF cluster, and the NPC responds with a real AI-generated answer. No cloud. No API keys. No subscriptions. Just your machine, your models, your world.

I learned C# from a friend that showed me the keys to every door when we worked together. Easy to go zero to hero with what you already know.

Each NPC maps to a different agent with its own model, personality, and hardware. Forge runs on the 5090. Fraz runs on Beast's 3080. Jinx thinks about your question using a completely different set of weights than Anvil does. They're not reading from a script. They're thinking. On hardware sitting in my office.

Circle Reach-Through: AI NPCs powered by local LLMs

The wizard talks about Machine Learning and Neural Networks because that's what he knows. The blacksmith calls himself "the Paladin of Quality" and almost drops the word "pytest" before catching himself. The analyst offers to build you a dashboard with "Tufte principles for optimal clarity." These are not canned responses. These are language models running locally, in character, responding to whatever you say, powered by hardware you built yourself.

Forge, the Circle's artificer

Jinx sees patterns in data that others might miss

What happened next proved the thesis in a way that a cluster build can't, because a cluster build is technical and specific and most people's eyes glaze over when you say "AGESA memory compatibility update." But everyone understands a farming game. Everyone understands taking something that exists and making it do something new. Everyone understands the moment when you look at what you built and think, "I didn't know I could do that."

That story is coming. Two parts, maybe three. And at the end of it, the mod goes public. Open source, on GitHub, free to use and improve. Because the thesis isn't just "I can build things." The thesis is "we can build things." All the intelligences together. Your expertise, my expertise, the AI's throughput, pointed at whatever problem keeps you up at night.

Better together. Always.

The Real Point

I started this because I needed more memory.

I ended up with an ecosystem I use every day, a thesis I believe in my bones, and the strongest three weeks of professional output in a twenty-year career. Not because the technology is magic. Because the technology met someone who knew what to do with it.

You know what to do with it too. Whatever your twenty years looks like, whether it's nursing or woodworking or teaching or trading or anything else where you've built real expertise through real work, AI is about to make that expertise reach further than you ever imagined. If it seems like its too much to wrap your head around, reach out. I have this site for a reason, subscribe to the news letter, email me. You'll get a chat, ask any of my friends :)

Don't wait for someone to build the perfect tool for you. They won't. They'll build an approximation and charge you a subscription for it. Build the thing yourself. Start ugly. Start small. Start with "I need more memory" and see where it takes you.

The window is open. The fire is lit. Come build with us.

Don't buy it. Build it.

Pete McKernan is the Head Human at itsbroken.ai, and a disabled veteran who spent twenty years building technology across sectors you've heard of and some you haven't before discovering that the most dangerous thing you can do with AI is hand it to someone who actually knows their craft. He holds GXPN, CISSP, and GPEN certifications, and was recently invited to the REDACT3D program. He lives in Iowa with his wife, three kids, a mini ramp, and a growing collection of machines that glow amber in the dark.