Browserbase vs Anchor vs Playwright: a KvK-watch verdict

A 17-person Amsterdam tax-advisory needed continuous KvK watch on 3,200 client entities. We tested three browser-automation stacks; one of them broke at 02:00 on a Sunday.

Jacob Molkenboer · Founder · A Brand New Company · 15 Aug 2025 · 6 min

Monday, 06:42. The compliance lead at a Keizersgracht tax-advisory opens her laptop, scrolls through eighteen overnight Slack cards, and stops at the third one: a long-standing client lost a board member at the KvK (Kamer van Koophandel, the Dutch Chamber of Commerce) overnight, and the firm did not know. Producing that card is the agent's job. The agent exists because doing this by hand for 3,200 entities stopped being possible in 2023.

The interesting question, six months into running that agent, is not whether to build it. It is which browser-automation layer to put underneath, now that we have real numbers from a real production run.

The job: 3,200 entities, every 8 working hours

The firm has 17 people, mostly seniors who came up at the Big Four. They do continuous compliance work for 3,200 Dutch entities. Director changes, handelsregister extracts, UBO shifts, dissolution filings, address moves. Before the agent, two paralegals checked maybe 80 entities a day by hand. Whatever fell off the list became next quarter's surprise.

The brief was small: pull every entity's handelsregister state every 8 working hours, diff against last known, post a Slack card to the responsible advisor when something changed. We dropped pulls for dormant entities and weekends, and settled on 4,800 pulls per week, steady-state.
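The diff step in that brief can be sketched in a few lines. This is a minimal illustration, not the production code: the snapshot is assumed to be a flat dict of already-scraped fields, and the field names, `Change`, `diff_snapshots`, and `slack_card_text` are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Change:
    field: str
    before: object
    after: object


def diff_snapshots(previous: dict, current: dict) -> List[Change]:
    """Return one Change per field whose value differs between two pulls."""
    changes = []
    # Union of keys so that added and removed fields are reported too.
    for field in sorted(previous.keys() | current.keys()):
        before, after = previous.get(field), current.get(field)
        if before != after:
            changes.append(Change(field, before, after))
    return changes


def slack_card_text(entity_id: str, changes: List[Change]) -> Optional[str]:
    """Build the card body, or None when nothing changed (no card is posted)."""
    if not changes:
        return None
    lines = [f"KvK change detected for {entity_id}:"]
    lines += [f"- {c.field}: {c.before!r} -> {c.after!r}" for c in changes]
    return "\n".join(lines)
```

Returning `None` for an unchanged entity keeps the "no change, no card" rule in one place, so the posting code never has to decide whether a card is worth sending.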

The official KvK Handelsregister API exists. It is competent, but for this volume the per-query cost added up faster than the firm wanted, and the API does not return the same enriched view that Mijn KvK shows after login. So the build was browser automation against the logged-in gateway.

Three candidate stacks ended up on the whiteboard.

The three stacks

Browserbase. A managed Chromium fleet you talk to over a WebSocket. You get stealth profiles, residential proxies, session-replay video, and a dashboard a non-engineer can open at 02:00 to see what broke.

Anchor Browser. Similar shape, newer, leaning into the "browser for agents" framing. Cheaper per minute at the time we tested. Smaller, cleaner API surface.

Self-hosted Playwright on chromium-headless-shell. One Hetzner CCX23 box (€30/month, 4 vCPU, 16 GB), a Redis queue, residential proxies bought separately, no UI. Engineering time was where the real cost lived.

Per-run cost at 4,800 pulls per week

A pull averaged 27 seconds end-to-end: session check, navigate, render, scrape, log out cleanly. Call it 30s with overhead.

4,800 runs × 30s = 40 browser-hours per week.

Rough numbers from our procurement (June 2025; pricing moves):

  • Browserbase with stealth and captcha solving: roughly €380 to €460 per week all-in. Dominated by minutes, not bandwidth.
  • Anchor Browser at the same minute count, residential proxy bundled: roughly €180 to €240 per week.
  • Self-hosted: €30 per month for the box, €110 per month for residential proxies, plus one engineer on call. So about €33 per week in hardware, plus whatever fraction of payroll you charge against this single agent.
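The arithmetic behind those figures, as a small sketch. The per-hour rates it derives are only what the quoted weekly ranges imply at 40 browser-hours; they are not vendor list prices, and the 12/52 month-to-week conversion is an assumption.

```python
PULLS_PER_WEEK = 4_800
SECONDS_PER_PULL = 30  # 27 s measured, rounded up for overhead


def browser_hours_per_week(pulls: int, seconds_per_pull: float) -> float:
    """Total browser time per week, in hours."""
    return pulls * seconds_per_pull / 3600


def monthly_to_weekly(eur_per_month: float) -> float:
    """Convert a monthly bill to a weekly one (assumes 12 months / 52 weeks)."""
    return eur_per_month * 12 / 52


hours = browser_hours_per_week(PULLS_PER_WEEK, SECONDS_PER_PULL)

# Self-hosted hardware: the box plus residential proxies, payroll excluded.
self_hosted_weekly = monthly_to_weekly(30 + 110)

# Implied all-in rate per browser-hour (low, high) from the weekly quotes.
browserbase_per_hour = (380 / hours, 460 / hours)
anchor_per_hour = (180 / hours, 240 / hours)

print(f"{hours:.0f} browser-hours/week")
print(f"self-hosted hardware: EUR {self_hosted_weekly:.2f}/week")
print(f"Browserbase implied: EUR {browserbase_per_hour[0]:.2f} to {browserbase_per_hour[1]:.2f}/hour")
print(f"Anchor implied: EUR {anchor_per_hour[0]:.2f} to {anchor_per_hour[1]:.2f}/hour")
```

Depending on how you convert months to weeks, the self-hosted hardware lands between roughly €32 and €33 per week. Either way the engineer's time is the term this calculation leaves out.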

If you stop the analysis here, self-hosted wins by an order of magnitude. We did not stop here.

Mijn KvK survival

Mijn KvK quietly tightened its session telemetry over the last year. Canvas fingerprinting, mouse-entropy checks, and a soft challenge that appears when the gateway is unsure about you. It is not a reCAPTCHA. It is a re-authentication nudge, and it is enough to break an unattended job.

Over a two-week pilot at the full 4,800/week volume, sessions-flagged-for-reauth landed at:

  • Browserbase, stealth profile + residential proxy: ~0.4% of runs.
  • Anchor Browser, default profile + residential proxy: ~1.1% of runs.
  • Self-hosted Playwright on chromium-headless-shell, residential proxy: ~3.7% of runs.

The self-hosted number moved. Once we ported stealth-plugin patterns to Playwright and rotated user agents per IP, self-hosted dropped to ~1.6%. Browserbase did not need tuning. That is what we were paying for.
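One way to read "rotated user agents per IP" is a stable mapping: each proxy exit IP always presents the same user agent, so the pairing looks like one consistent machine rather than a browser that changes identity between requests. The sketch below assumes that reading. The pool contents, `USER_AGENTS`, and `user_agent_for_ip` are hypothetical, and the IPs are documentation addresses.

```python
import hashlib
from typing import Sequence

# Hypothetical pool; a real one would hold current desktop Chrome strings
# that match the browser build actually being launched.
USER_AGENTS = (
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/126.0.0.0 Safari/537.36",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/126.0.0.0 Safari/537.36",
    "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/126.0.0.0 Safari/537.36",
)


def user_agent_for_ip(proxy_ip: str, pool: Sequence[str] = USER_AGENTS) -> str:
    """Pick a user agent deterministically from the proxy IP.

    sha256 is used instead of the built-in hash() because hash() is salted
    per process, which would reshuffle the mapping on every restart.
    """
    if not pool:
        raise ValueError("user agent pool is empty")
    digest = hashlib.sha256(proxy_ip.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(pool)
    return pool[index]
```

In Playwright for Python the chosen string would typically be passed as `browser.new_context(user_agent=...)`, alongside the matching `proxy` setting for that context. Whether this alone accounts for the drop we saw is not something the sketch can show.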

Warning

chromium-headless-shell is faster and lighter than full Chromium, but it is also more distinguishable. If the target site cares about fingerprint, run full Chromium and pay the RAM cost.

The 02:00 question

Cost is the easy axis. The interesting axis is: when a session breaks at 02:00 on a Sunday, who restores it.

This is the hidden price tag on self-hosted. The client did not have a duty engineer. We were the duty engineer. That meant either we ate the pages ourselves, or we wrote a runbook clean enough for the office manager to follow on Monday morning and accepted a 6-hour blind window.

Browserbase and Anchor both ship dashboards where the operator can watch the last failed session as video, log in once, mark the session healthy, and walk away. Anchor's UI was newer and rougher in June 2025, but workable. Browserbase's session replay was the feature the compliance lead actually used, twice during the pilot, at hours when she would not have called us.

If your client has on-call engineers, self-hosted is a real option. If your client is 17 accountants and nobody owns Linux, the managed stack pays for itself the first time something flags at midnight.

The audit table, briefly

Tangential but worth saying. Every pull writes a row to a Postgres audit table (entity_id, pulled_at, raw_html_hash, diff_summary). At 4,800 pulls per week that is roughly 250k rows per year, retained for seven years under Dutch tax retention rules. The table has been partitioned by month from day one. An HN front-page thread at the time reminded everyone that the only scalable delete in Postgres is DROP TABLE, which is exactly why declarative partitioning exists. When 2032 rolls around and the 2025 partitions age out, the retention job drops each expired partition. No VACUUM drama.

An aside that may save you a refactor in year three.
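The retention rule for the audit table can be sketched as a function that decides which monthly partitions are past seven years and emits the matching DROP statements. This is an illustration only: the `kvk_audit_YYYY_MM` naming scheme and both function names are hypothetical, and a real job would read the list of existing partitions from the database catalog rather than take it as an argument.

```python
from datetime import date
from typing import List, Tuple

RETENTION_YEARS = 7


def partition_name(year: int, month: int) -> str:
    """Hypothetical naming scheme for monthly partitions."""
    return f"kvk_audit_{year:04d}_{month:02d}"


def expired_partitions(
    existing: List[Tuple[int, int]],
    today: date,
    retention_years: int = RETENTION_YEARS,
) -> List[str]:
    """Names of partitions whose newest possible row is past retention.

    A month's partition ends at the first day of the following month, so it
    expires once that boundary plus the retention period is on or before today.
    """
    expired = []
    for year, month in sorted(existing):
        next_year, next_month = (year + 1, 1) if month == 12 else (year, month + 1)
        cutoff = date(next_year + retention_years, next_month, 1)
        if cutoff <= today:
            expired.append(partition_name(year, month))
    return expired


def drop_statements(names: List[str]) -> List[str]:
    """One DROP per expired partition; no row-level DELETE is needed."""
    return [f"DROP TABLE IF EXISTS {name};" for name in names]
```

Keying the cutoff on the end of the month, not its start, means a partition is only dropped once every row in it is older than the retention period.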

What we picked, and why

We picked Browserbase for the recurring working-hours job, with self-hosted Playwright kept for the Saturday-night bulk re-scrape.

The reasoning was mundane:

  • The marginal €200/week premium over Anchor bought us session replay that the compliance lead actually opened.
  • The marginal €400/week premium over self-hosted bought us not being the on-call team for a client whose runbook would never get loved enough to work at 02:00.
  • Anchor stays on the shortlist. If Browserbase pricing tightens or Anchor ships a session-replay UI a non-engineer can drive, we reopen the choice.

The Saturday bulk job, which has no SLA and runs across a 12-hour window, still uses self-hosted Playwright. Cost matters more there than reliability, because we can simply restart it on Sunday afternoon.

Takeaway

The cheapest browser stack is the one that does not page you at 02:00, even if its per-minute cost looks alarming on the invoice.

If you are picking this week

A five-minute audit before you commit to any of these:

  1. Take a sample of 50 pulls. Run them through your target site from a residential IP. Measure the flag rate.
  2. Multiply that flag rate by your weekly volume. That is how many human-recovery events your team will own per week.
  3. Decide who owns those events. Write the name down. If you cannot write a name, you are buying the managed option.
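Steps 1 and 2 amount to one line of arithmetic. The sketch below uses a made-up sample result (2 flagged pulls out of 50) purely to show the shape of the calculation; the function name is hypothetical.

```python
def expected_recovery_events(flagged: int, sample_size: int, weekly_volume: int) -> float:
    """Scale the flag rate observed in a sample up to a full week of pulls."""
    if sample_size <= 0:
        raise ValueError("sample_size must be positive")
    if not 0 <= flagged <= sample_size:
        raise ValueError("flagged must be between 0 and sample_size")
    # Multiply before dividing to avoid a rounded intermediate rate.
    return flagged * weekly_volume / sample_size


# Made-up sample: 2 of 50 pulls flagged, at this article's weekly volume.
events = expected_recovery_events(flagged=2, sample_size=50, weekly_volume=4_800)
print(f"about {events:.0f} recovery events per week")
```

A 50-pull sample is coarse: a single flagged pull moves the estimate by 2 percentage points, so treat the result as an order-of-magnitude check and rerun with a larger sample if the answer sits near your threshold.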

When we built this AI agent for the tax-advisory, the thing we kept underestimating was the cost of recovery, not the cost of running. We ended up solving it by paying for session replay and handing the compliance lead the keys, so the agent failed in a way her team could actually see.

FAQ

Why not use the official KvK Handelsregister API?

It works fine at low volume. At 4,800 weekly pulls the per-query cost compounds, and the API does not return the enriched logged-in view that Mijn KvK shows. Volume tipped the build toward browser automation.

Do you need residential proxies for KvK at this volume?

Yes. Datacenter IPs get flagged within a few hundred consecutive pulls on Mijn KvK. Residential is the boring answer that kept flag rates under 2% across all three stacks in our pilot, once the self-hosted setup was tuned.

Is chromium-headless-shell production-safe?

For inattentive targets, yes. For sites doing canvas and font fingerprinting, full Chromium survives longer. Headless-shell is lighter and faster but easier to detect.

What about pure HTTP scraping with no browser?

Mijn KvK is heavily JS-rendered behind a login. You can reconstruct the session manually, but the maintenance burden when they change the gateway eats whatever you saved on compute.

When does self-hosted Playwright actually win?

When you have a duty engineer on call, when the target does not fingerprint aggressively, and when the workload tolerates being restarted hours later instead of immediately.
