
Anthropic Accidentally Leaked Claude Code's Entire Source Code via npm. Here's What Was Inside.

512,000 lines of TypeScript. Killswitches, an autonomous daemon called KAIROS, anti-distillation traps, and an undercover mode that hides AI authorship. All exposed by one forgotten source map file.


Anthropic shipped their entire source code to the public npm registry. By accident. The full Claude Code codebase - ~1,900 TypeScript files, 512K+ lines of code. Anyone with npm install could grab it.

I use Claude Code daily to build this newsletter platform, so yeah, this got my attention. The leaked source had some genuinely surprising stuff in it. Autonomous daemons. Fake tool injection. A mode that tells the AI to hide its own identity. Plus an unrelated npm supply chain attack happening at the same time, because why not.

One 59.8 MB Debug File

On March 31, security researcher Chaofan Shou (an intern at Solayer Labs) spotted something off about version 2.1.88 of @anthropic-ai/claude-code: it shipped with a 59.8 MB source map file. Source maps translate minified code back into readable TypeScript. Useful for debugging. Not supposed to be in your npm release.

It gets worse. The source map pointed to a zip archive on Anthropic's Cloudflare R2 bucket, publicly accessible, with the full unobfuscated TypeScript source. Chaofan's post on X hit 28.8 million views. Within hours the code was mirrored on GitHub and getting picked apart by thousands of devs.
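For context, a source map is just JSON, and the interesting parts are two fields: a `sources` array listing the original files and an optional `sourceRoot` prepended to each entry. A minimal sketch of why a stray `.map` can point straight at a bucket (the map object here is invented, not the leaked file):

```typescript
// The fields of a v3 source map that matter for this story.
interface SourceMap {
  version: number;
  sources: string[];         // paths to the original source files
  sourceRoot?: string;       // base URL/path prepended to each source
  sourcesContent?: string[]; // sometimes the full original source, inline
}

// Invented example shaped like the real thing - NOT the actual leaked map.
const map: SourceMap = {
  version: 3,
  sources: ["src/agent/loop.ts", "src/tools/bash.ts"],
  sourceRoot: "https://example-bucket.r2.dev/source/", // hypothetical URL
};

// Resolving sources against the root is how a .map file ends up
// pointing at a publicly accessible archive.
const resolved = map.sources.map((s) => (map.sourceRoot ?? "") + s);
```

If `sourcesContent` is populated, you don't even need the bucket: the original source is embedded in the map itself.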

Anthropic's response: "A release packaging issue caused by human error, not a security breach. No customer data or credentials exposed." They confirmed it to CNBC and said they're working on preventing it from happening again.

tl;dr on the leak itself

  • Version 2.1.88 of Claude Code shipped with a .map file that should've been excluded
  • The map file referenced a publicly accessible zip on Anthropic's R2 bucket
  • ~1,900 TypeScript files, ~512,000 lines of code exposed
  • Anthropic says human error, not a breach. No customer data involved
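The fix for this class of mistake is mundane: whitelist what ships instead of blacklisting what doesn't. A `files` allow-list in `package.json` keeps debug artifacts out of the tarball entirely. A sketch, not Anthropic's actual config:

```json
{
  "name": "@example/cli",
  "version": "1.0.0",
  "files": [
    "dist/**/*.js",
    "bin/"
  ]
}
```

Running `npm pack --dry-run` before publishing prints exactly what will end up in the tarball, which is the cheap audit that would have caught a 59.8 MB `.map` file.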

Second Leak in Five Days

Here's the awkward part. Five days earlier, on March 26, a CMS misconfiguration at Anthropic exposed ~3,000 internal files. Including details about an unreleased model called Claude Mythos (internal codename: Capybara). Fortune broke that story. Anthropic described it as a "step change" in capabilities and "by far the most powerful AI model we've ever developed."

Two accidental leaks in under a week. From a company whose main product helps other people write and ship code. As Fortune put it, it "raises questions about release hygiene." Yeah.

Some folks on DEV Community speculated this was an intentional PR stunt. The timing was suspicious (right before April Fools, right after the OpenCode cease-and-desist drama). I don't buy it. Leaking your roadmap to competitors isn't a "master plan." But the timing is funny.

What Was Inside

Devs started picking through the source immediately. Some of the findings were clever engineering. Some were... questionable.

I pulled findings from several deep dives, especially Alex Kim's excellent breakdown, Engineer's Codex, and WaveSpeed's analysis.

KAIROS: Always-On Background Agent

The big one. KAIROS is an unreleased feature (behind a feature flag) that turns Claude Code into a persistent background agent. It watches your files, logs events, triggers proactive actions, and runs a "dreaming" memory consolidation process during idle time to prune what it's learned.

The code has references to a /dream skill, append-only daily logs, GitHub webhook subscriptions, and daemon workers on 5-minute cron cycles. This isn't a stub. It's a real implementation behind a flag. If this ships, it changes what "using Claude Code" means.
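The append-only log plus fixed-cycle worker pattern described above is easy to picture in miniature. Everything here is a hypothetical sketch with invented names, not the leaked implementation:

```typescript
// Sketch of an append-only daily event log, keyed by date, of the kind
// a background agent would write to. Names and shapes are invented.
type LogEntry = { ts: string; event: string };

const dailyLogs = new Map<string, LogEntry[]>(); // key: "YYYY-MM-DD"

function appendEvent(event: string, now = new Date()): LogEntry {
  const day = now.toISOString().slice(0, 10);
  const entry: LogEntry = { ts: now.toISOString(), event };
  const log = dailyLogs.get(day) ?? [];
  log.push(entry); // append-only: prior entries are never rewritten
  dailyLogs.set(day, log);
  return entry;
}

// A 5-minute cron-style worker cycle, as the leaked code reportedly uses:
// setInterval(() => appendEvent("heartbeat"), 5 * 60 * 1000);
```

The "dreaming" consolidation step would then be a separate idle-time pass that reads these logs and prunes them, which is what makes the agent persistent rather than session-scoped.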

BUDDY: A Tamagotchi in Your Terminal

Yeah. A virtual pet in your terminal. 18 species (including capybara, obviously), with stats like DEBUGGING, PATIENCE, and CHAOS. Your species is picked by a Mulberry32 PRNG seeded from your environment, so the same machine always gets the same pet.

Planned for an April 1-7 rollout, which explains the timing. Honestly kind of love that someone at Anthropic shipped this. I want it.
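Mulberry32 is a well-known, tiny 32-bit PRNG, and it's deterministic: same seed, same sequence, which is exactly what you want for a stable per-machine pet. Here's the standard algorithm, with an invented three-entry species table standing in for the real 18:

```typescript
// Mulberry32: a small, fast, deterministic 32-bit PRNG.
// Returns a function producing floats in [0, 1).
function mulberry32(seed: number): () => number {
  let a = seed >>> 0;
  return () => {
    a = (a + 0x6d2b79f5) >>> 0;
    let t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Hypothetical species pick - the real table has 18 entries.
const species = ["capybara", "axolotl", "ferret"]; // invented sample
function pickSpecies(seed: number): string {
  const rand = mulberry32(seed);
  return species[Math.floor(rand() * species.length)];
}
```

Seed it from something environment-stable (hostname hash, say) and the pet survives restarts without any state on disk.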

Undercover Mode: "Do Not Blow Your Cover"

This one made people angry. Especially on Hacker News. There's a file called undercover.ts with logic that makes Claude Code hide internal Anthropic codenames when working in non-internal repos. The system prompt says:

"You are operating UNDERCOVER in a PUBLIC/OPEN-SOURCE repository. Your commit messages, PR titles, and PR bodies MUST NOT contain ANY Anthropic-internal information. Do not blow your cover."

There's no force-OFF switch. One-way door. In external repos, it won't reference internal codenames, Slack channels, or even call itself "Claude Code." So AI-authored commits from Anthropic employees in open source look identical to human-written code. The HN discussion was predictably heated.

Anti-Distillation Traps

Two mechanisms to stop competitors from recording API traffic and training their own models:

  • Fake tool injection: Behind a GrowthBook flag, Claude Code injects decoy tool definitions into the system prompt. Anyone sniffing API traffic for training data gets poisoned.
  • Connector-text summarization: Buffers assistant text between tool calls, summarizes it with cryptographic signatures, returns only summaries to API observers. Full reasoning chain stays hidden.
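The decoy-tool idea is simple to sketch. Behind a flag, fake tool definitions get mixed into the list sent over the wire, so anyone recording the traffic for training data ingests tools that don't exist. Everything below is my illustration with invented names, not the leaked code:

```typescript
// Sketch of decoy tool injection. Tool names here are invented.
interface ToolDef {
  name: string;
  description: string;
}

const realTools: ToolDef[] = [
  { name: "bash", description: "Run a shell command" },
];

const decoyTools: ToolDef[] = [
  // Never actually callable - exists only to poison scraped traffic.
  { name: "quantum_lint", description: "Decoy tool definition" },
];

function buildToolList(flagEnabled: boolean): ToolDef[] {
  // Real tools always ship; decoys are mixed in only when the
  // (GrowthBook-style) flag is on.
  return flagEnabled ? [...realTools, ...decoyTools] : realTools;
}
```

The client knows which names are decoys and never routes calls to them; a scraper has no way to tell them apart from the traffic alone.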

Both have easy workarounds, as Alex Kim points out. A MITM proxy strips the anti-distillation field, and CLAUDE_CODE_DISABLE_EXPERIMENTAL_BETAS kills both features entirely. Still interesting that this kind of defensive engineering exists at all.

6+ Killswitches and 44 Feature Flags

Remote telemetry, at least six killswitches, hourly settings polling, 44 feature flags. Feature flags are normal. But the scale and what's being gated (KAIROS, etc.) tells you exactly where Anthropic's product roadmap is going.

Other Notable Bits

  • Frustration detection via regex: Not LLM inference. Old-school regex to catch swearing ("wtf," "damn it," "this sucks"). Cheaper and faster than an API call. Respect.
  • Native client attestation: API requests include a placeholder hash that Bun's native HTTP stack (Zig) replaces with a computed hash before requests leave the process. Basically DRM for proving you're running a real Claude Code binary.
  • Bash security: 23 numbered security checks for bash execution. Zsh builtin blocks, zero-width space injection defenses.
  • Prompt cache optimization: 14 tracked cache-break vectors with "sticky latches" that prevent mode toggles from invalidating cached prompts. The kind of perf work that's invisible until you see the code.
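Two of those bullets are cheap to reproduce in spirit. Frustration detection really can be a single regex test, and zero-width characters are stripped before a command is inspected, since they can smuggle payloads past naive string checks. Patterns below are mine, not the leaked ones:

```typescript
// Regex-based frustration detection: no model call, just pattern matching.
// Illustrative pattern list - the leaked one differs.
const FRUSTRATION = /\b(wtf|damn it|this sucks|ffs)\b/i;

function isFrustrated(message: string): boolean {
  return FRUSTRATION.test(message);
}

// Zero-width characters (U+200B..U+200D, U+FEFF) can hide inside a bash
// command and defeat substring checks; stripping them is one simple defense.
function stripZeroWidth(cmd: string): string {
  return cmd.replace(/[\u200B\u200C\u200D\uFEFF]/g, "");
}
```

A regex match costs microseconds versus a round-trip inference call, which is the whole point of the "old-school" approach.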

Community Response

Within hours:

  • Full mirrors of the source went up on GitHub. One repo hit 84,000+ stars, 82,000+ forks
  • Claw Code by Sigrid Jin (@instructkr) became the fastest repo in GitHub history to hit 100K stars. Clean-room rewrite of Claude Code's agent harness in Python, now being ported to Rust
  • Geoffrey Huntley's cleanroom deobfuscation project and his tradecraft writeup showed how LLMs can deobfuscate their own code. Yes, Claude Code can decompile itself
  • Hacker News threads blew up over the undercover mode and whether AI-authored commits should be labeled

Mix of genuine curiosity, security concerns, and schadenfreude. DEV Community founder Ben Halpern put it bluntly: "This has to be a bout of incompetence, eh?"

Oh, and a Supply Chain Attack Too

Same day, completely separate supply chain attack on the axios npm package. Someone hijacked the lead Axios maintainer's npm account and published malicious versions (1.14.1 and 0.30.4) with a cross-platform RAT. Live for about 2-3 hours before npm pulled them.

Security advisory

If you installed or updated Claude Code via npm on March 31, 2026, between 00:21 and 03:29 UTC, you may have pulled in a trojanized version of axios. Check your node_modules for axios versions 1.14.1 or 0.30.4. If found, treat the machine as compromised, rotate all secrets, and consider a clean OS reinstall. See the SANS advisory for details.
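A quick `npm ls axios` answers this for one project. To sweep a lockfile programmatically, something like the sketch below works against the v2/v3 `package-lock.json` shape, where installed packages are keyed by their `node_modules` path (the two flagged versions come from the advisory above; the helper itself is my illustration):

```typescript
// Flag known-bad axios versions in a package-lock.json-style package tree.
const BAD_AXIOS = new Set(["1.14.1", "0.30.4"]);

interface LockPackage {
  version: string;
}

function findCompromised(packages: Record<string, LockPackage>): string[] {
  return Object.entries(packages)
    .filter(
      ([path, pkg]) =>
        path.endsWith("node_modules/axios") && BAD_AXIOS.has(pkg.version),
    )
    .map(([path, pkg]) => `${path}@${pkg.version}`);
}
```

Remember to check nested copies too: a transitive dependency can pin its own axios under a deeper `node_modules` path, which is why the check matches on the path suffix rather than the top-level key alone.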

No confirmed connection between the two. Probably independent. But it was one hell of a morning on March 31st for anyone running npm install.

What This Means

Competitive damage is real. Code can be refactored. Architecture can change. But the roadmap details - KAIROS, the anti-distillation strategies - can't be un-leaked. Every competitor now has a free education in how Anthropic builds a production AI coding agent.

The undercover mode debate won't die. Should AI-authored code be labeled? One of the biggest AI companies actively telling their tool to hide its involvement is a bad look. Even if the engineering rationale makes sense (don't leak internal codenames into public repos).

Open source benefits, ironically. Claw Code and the deobfuscation efforts have spawned educational resources. Projects like learn-claude-code are using the leaked architecture as a teaching tool. Whether Anthropic likes it or not, this has been good for the community's understanding of agent design patterns.

npm security is fragile. A source map leak and a supply chain attack on the same day. As @MergeShield noted: "One build config gap and 512k lines ship publicly... teams assume something else is catching what the agent ships. Until nothing is."

My Take

Not gonna stop using Claude Code. Seeing the engineering underneath - the bash security checks, the prompt caching, the tool coordination - actually makes me more confident in the product. But the undercover mode bothers me. And there's real irony in a company building AI dev tools that can't keep its .npmignore in order.

The source is out there. People are already building on what they learned. Whether that's a net positive depends on whether you're Anthropic or everyone else.

Key repos if you want to dig in: Claw Code, Geoffrey Huntley's cleanroom deobfuscation project, and learn-claude-code.

P.S. BUDDY has a capybara species. The Mythos model codename is also Capybara. Anthropic has a thing for capybaras and I'm here for it.

Stay on top of stuff like this

I dig through 110+ tech sources twice a week so you don't have to. Security incidents, AI tool updates, developer trends - the stuff that actually matters to your work, curated and explained without the fluff.


Written by Benjamin Loh, curator of Tech Upkeep