Reverse engineering now and then

When I was a teenager in the 90s, a friend of mine had a side gig making cracks and key generators for games and software. To anyone who grew up with always-on broadband, that sentence probably needs some context.

A World Without Internet (by Default)

In Europe in the late 90s, Internet access wasn't the default mode of any device. You chose to go online — deliberately — by firing up your 56k modem on the family landline, knowing full well that every minute ticked on the phone bill. Mobile phones? They existed, barely. Internet on the phone did not. SMS came later and cost a small fortune per message.

This isolation-by-default created something remarkable: an entire intellectual arms race built on the assumption that software lived offline. Developers built protections knowing there was no server to phone home to. Hackers broke them knowing the same thing. It was an adversarial craft — part art, part sport — played entirely within the confines of a single machine.

Here’s a small curated sample of the art at the time 😉

Software distribution before the Internet

Back then, you discovered new software the way you discovered music: through curation. Tech magazines shipped with a free CD stuffed with freeware (fully free software) and shareware (a taste for free, the full experience once you'd paid for a License Key).

An example of the kind of publication around at the time: here, a CD-ROM issue from 1994

It was a surprisingly elegant distribution model. As a consumer, you got curated recommendations from a magazine you trusted — far better than the alternative of spending 16 hours downloading a 200 MB file at 3.5 KB/s, only to have the connection drop after five.

There were magazines dedicated to software, games, productivity, and yes, hacking. And among the search engines of the era, alongside the generalist AltaVista, sat its shadowy cousin: astalavista.box.sk. There was no real concept of a "dark web" at the time. The web was what it was, light and darkness mixed together, and you were expected to watch where you clicked.

The art of cracks & keygens

If you wanted to unlock a shareware program, you went to Astalavista and searched for it. You'd find either a keygen or a crack.

Keygens were the golden ticket. They generated a proper License Key — one that told the software "I'm good, I'm a paying customer." Since Internet was a luxury, the software never tried to verify that claim against a remote database. You didn't modify the program at all. You just had a key that fit the lock.
Cracks were more invasive. They patched the software itself — typically dropping in a modified DLL that monkey-patched (as we'd say today) the code responsible for checking registration. Surgery, not lockpicking.
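
To make the difference concrete, here is a minimal sketch of an offline key check, the keygen that mirrors it, and what a crack does instead. The scheme (a CRC32 of the user name mixed with a hard-coded constant), the `SECRET` value, and every function name are invented for illustration; this is not any real product's algorithm.

```python
import zlib

SECRET = 0x5EED1234  # hypothetical constant baked into the program


def expected_key(name: str) -> str:
    """Derive the key the program expects for a given user name."""
    digest = zlib.crc32(name.encode("utf-8")) ^ SECRET
    hexed = f"{digest:08X}"
    return f"{hexed[:4]}-{hexed[4:]}"


def is_registered(name: str, key: str) -> bool:
    """The offline license check: no server, just local arithmetic."""
    return key.strip().upper() == expected_key(name)


def keygen(name: str) -> str:
    """A keygen reimplements the derivation recovered from the binary."""
    return expected_key(name)


def is_registered_cracked(name: str, key: str) -> bool:
    """A crack leaves the keys alone and replaces the check itself."""
    return True


if __name__ == "__main__":
    key = keygen("alice")
    print(key, is_registered("alice", key))
```

The keygen never touches the program: it only needs the derivation to be understood once. The crack needs no understanding of the key format at all, only the location of the check.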

Keygens are mostly gone now. Cracks still exist in spirit — the modding community around games like Skyrim puts staggering effort into projects like the Unofficial Skyrim Special Edition Patch, which is essentially the same discipline applied with different intent.

But the underlying process was always the same, and to teenage me it looked like pure black magic. You had to reverse-engineer the code responsible for the license check — understand the key verification algorithm well enough to generate compliant keys, or understand the program's architecture well enough to surgically disconnect the licensing module without breaking everything else.

In practice, this meant a Windows machine tweaked at the boot level to run a decompiler or disassembler, tracing execution pathways and kernel calls to figure out what was actually happening. It required tremendous skill, deep knowledge, and pure grit.

I wondered recently: what would that process look like today, now that we have AI models like Claude that can supply the "grit" part on demand?

Let’s try it with today’s tech

Rather than reverse-engineer an existing proprietary format (let's keep things legal and self-contained), I created a toy problem. I designed a new binary file format called MIC (Multi Image Container) — built to store multiple images in a single binary file with error correction codes and thumbnail support.

The experiment has three steps:

1. Design the spec, published separately at ogirardot.github.io/mic
2. Implement a writer, a Python prototype at github.com/ogirardot/mic
3. Ask Claude to reverse-engineer the output with zero context

The format is pretty simple, and the base layout looks like this:

High level description of the MIC file format

I packed two of my own photos into a .mic file:

```shell
➜  mic git:(main) ✗ python mic.py pack mountains.mic IMG_20260208_145001.jpg PXL_20260228_111319728.MP.jpg
Wrote mountains.mic (2 images, 730248 bytes)
```

Then I handed the resulting binary to Claude with the simplest possible prompt:

Here's a strange file, can you reverse-engineer it to tell me what is it about and if there are any data inside?

No spec. No hints. No context. I launched Claude Code with CLAUDE_CODE_SIMPLE=1 to ensure a completely blank slate, and fed the same prompt to three different models: Haiku 4.5, Sonnet 4.6, and Opus 4.6.

The Results

My expectation was that this first prompt would be a long shot. I was wrong.

Sonnet 4.6 went first. After about 4 minutes of autonomously running xxd dumps and writing ad-hoc Python scripts, it produced a full reverse-engineering report: the magic bytes, the header structure, the directory layout, per-image metadata including dimensions, filenames, CRC32 checksums (verified!), and even a description of the actual photo content. It mapped the entire format.
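
The kind of ad-hoc script this exploration relies on can be sketched roughly as follows: dump the first bytes in an xxd-like layout, then search the file for known signatures. This is an illustration, not the script any of the models actually wrote. The sample buffer is synthetic, and the only real-world facts used are the JPEG start (`FF D8 FF`) and end (`FF D9`) markers.

```python
def hexdump(data: bytes, width: int = 16) -> str:
    """Render bytes in an xxd-like layout: offset, hex, printable ASCII."""
    lines = []
    for off in range(0, len(data), width):
        chunk = data[off:off + width]
        hexpart = " ".join(f"{b:02x}" for b in chunk)
        text = "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
        lines.append(f"{off:08x}  {hexpart:<{width * 3}} {text}")
    return "\n".join(lines)


def find_all(data: bytes, needle: bytes) -> list[int]:
    """Return every offset at which needle occurs in data."""
    hits, pos = [], data.find(needle)
    while pos != -1:
        hits.append(pos)
        pos = data.find(needle, pos + 1)
    return hits


SIGNATURES = {
    "jpeg_start": b"\xff\xd8\xff",
    "jpeg_end": b"\xff\xd9",
}

if __name__ == "__main__":
    # Synthetic buffer standing in for the unknown file.
    blob = b"MIC!" + bytes(12) + b"\xff\xd8\xff\xe0fake-jpeg\xff\xd9" + b"tail"
    print(hexdump(blob[:32]))
    for name, sig in SIGNATURES.items():
        print(name, find_all(blob, sig))
```

Readable ASCII in the right-hand column of the dump is usually the first handhold; signature offsets then hint at where payloads start and stop.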

Here's how all three models performed:

Model	Time	Result
Haiku 4.5	37 seconds	✅ Full reverse-engineering
Sonnet 4.6	4 min 25s	✅ Full reverse-engineering (bonus: opened the extracted images)
Opus 4.6	1 min 16s	✅ Full reverse-engineering

Yes — Haiku, the smallest and cheapest model, cracked it in 37 seconds.

Each model autonomously figured out the magic bytes (MIC!, IMG!, ENDMIC!), the header layout, the directory structure with offsets and sizes, image dimensions, embedded filenames, CRC32 integrity checks, and the raw JPEG payloads. Opus gave the most concise structural mapping. Sonnet went the extra mile and actually rendered the extracted images.
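
To show what "mapping the format" amounts to, here is a minimal sketch of a toy container with the same ingredients: magic bytes, a directory of offsets and sizes, CRC32 checks, and raw payloads. The marker names come from the findings above, but the field order, field sizes, and byte order are my assumptions for the example, not the published MIC spec.

```python
import struct
import zlib

# Assumed toy layout (NOT the real MIC spec):
#   header : b"MIC!" + uint16 version + uint16 image count
#   entry  : b"IMG!" + uint32 offset + uint32 size + uint32 crc32  (per image)
#   data   : raw payloads, back to back
#   footer : b"ENDMIC!"
HEADER = struct.Struct("<4sHH")
ENTRY = struct.Struct("<4sIII")
FOOTER = b"ENDMIC!"


def pack(payloads: list[bytes]) -> bytes:
    """Build a toy container holding the given payloads."""
    offset = HEADER.size + ENTRY.size * len(payloads)
    directory, body = b"", b""
    for blob in payloads:
        directory += ENTRY.pack(b"IMG!", offset, len(blob), zlib.crc32(blob))
        body += blob
        offset += len(blob)
    return HEADER.pack(b"MIC!", 1, len(payloads)) + directory + body + FOOTER


def parse(data: bytes) -> list[bytes]:
    """Walk the header and directory, verifying each payload's CRC32."""
    magic, _version, count = HEADER.unpack_from(data, 0)
    if magic != b"MIC!" or not data.endswith(FOOTER):
        raise ValueError("not a toy MIC container")
    payloads = []
    for i in range(count):
        tag, offset, size, crc = ENTRY.unpack_from(data, HEADER.size + i * ENTRY.size)
        blob = data[offset:offset + size]
        if tag != b"IMG!" or zlib.crc32(blob) != crc:
            raise ValueError(f"entry {i} is corrupt")
        payloads.append(blob)
    return payloads
```

A reverse engineer works this backwards: the `parse` function is the end product, reconstructed from nothing but the bytes that `pack` emitted.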

What Does This Mean?

This was admittedly a simple format — no encryption, no compression beyond what JPEG provides, clear magic bytes as handholds. A skilled reverse engineer would have cracked it with a hex editor in minutes.

But the interesting part isn't the difficulty of the problem. It's the nature of the process.

What used to require a human sitting in front of a disassembler for hours — loading hex dumps into working memory, forming hypotheses about byte sequences, writing test scripts, iterating — is now something an AI can do autonomously. The model writes its own exploration tools, tests its own hypotheses, and converges on a structural understanding through the same iterative loop a human would use. It just does it faster, and it never loses focus.
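
One turn of that hypothesis loop can be sketched in a few lines. The hypothesis here is "these 4 bytes are an offset pointing at an image", tested for both byte orders at every candidate position. The function names and the sample buffer are hypothetical; the JPEG start-of-image prefix is the only real-world fact used.

```python
import struct

JPEG_SOI = b"\xff\xd8\xff"  # real JPEG start-of-image prefix


def offset_points_at_jpeg(data: bytes, field_pos: int, fmt: str) -> bool:
    """Hypothesis: the 4 bytes at field_pos, decoded with fmt, are an
    offset to the start of a JPEG. True if the bytes there agree."""
    (candidate,) = struct.unpack_from(fmt, data, field_pos)
    return data[candidate:candidate + len(JPEG_SOI)] == JPEG_SOI


def surviving_hypotheses(data: bytes, max_pos: int) -> list[tuple[int, str]]:
    """Try every field position and both byte orders, keep what survives."""
    survivors = []
    for pos in range(max_pos):
        for fmt in ("<I", ">I"):  # little-endian, then big-endian uint32
            if offset_points_at_jpeg(data, pos, fmt):
                survivors.append((pos, fmt))
    return survivors


if __name__ == "__main__":
    # Synthetic file: 4-byte tag, a little-endian offset (12), padding, JPEG.
    blob = b"HDR!" + struct.pack("<I", 12) + bytes(4) + b"\xff\xd8\xff\xe0" + b"x" * 8
    print(surviving_hypotheses(blob, 8))
```

Most guesses die immediately, which is the point: each cheap test prunes the space until only a consistent reading of the bytes remains.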

The ability of modern models to generate on-the-fly Python debugging code, interpret binary patterns, form and revise hypotheses about data structures — that's a genuine capability shift. It doesn't replace human intuition on the hardest problems. But it compresses the iteration cycle dramatically.

My friend from the 90s spent weeks learning the tools and techniques before he could crack his first shareware. Today, the activation energy for that same intellectual exercise is a single prompt.

It's a brave new world. If something was designed using any kind of logical structure, it can now be understood at a speed we've never seen before.