Process automation

Marketplace feed automation: a two-week Postgres playbook

A two-week playbook for swapping triple-keyed marketplace listings with one Postgres source-of-truth and a nightly diff worker. Code, gotchas, the kill switch.

Jacob Molkenboer· Founder · A Brand New Company· 6 Jun 2026· 10 min

Weathered brass switchboard plug with frayed cloth cord beside cream index card with green wax seal on ivory linen.

It is a Tuesday night in May. The operations lead at a kitchen-equipment wholesaler near the Maasvlakte has three browser tabs open: Bol Partner Platform, Amazon Seller Central, Marktplaats Admarkt. Summer promo prices go live Friday. She is on row 412 of an 1,840-row spreadsheet, copying prices into three portals because the inventory of fryers, mixers, and induction tops changes faster than her assistant can keep up. Two weeks ago a wrong decimal shipped a €1,299 combi-steamer at €129. Bol auto-cancelled the orders. Amazon did not.

This is the playbook we ran with that wholesaler. Two weeks, one Postgres database, one nightly worker. No more triple-keying.

One source of truth, three sinks

The first decision is the only one that matters. You pick a place where the product exists, and you make every marketplace a downstream sink. Never the other way around. Not Bol. Not Amazon. Not your ERP if your ERP is a shared Google Sheet from 2019 with merged cells.

We picked Postgres because the wholesaler already ran one for their B2B portal, and because Postgres handles JSON columns well enough that channel-specific quirks (Amazon's product-type schema, Marktplaats's category tree, Bol's delivery code) can sit in a sidecar column instead of polluting the product table.

The hard part is convincing the team to stop editing in the portals. We solved this on day one: revoke editor permissions, leave one person with read-only access for sanity checks, post the new rule in Slack with the date on it. If you cannot get that buy-in, stop reading. The rest of the playbook will not save you.

Days 1 to 3: model the product, not the listing

A product has a SKU, a name, dimensions, weight, a price, stock on hand, and a set of photos. A listing is what one marketplace shows. They are different things. People confuse them constantly because the marketplace UIs encourage it.

Here is the schema we landed on. Trimmed to the essentials.

create table products (
  sku              text primary key,
  ean              text not null,
  title            text not null,
  brand            text,
  weight_grams     integer,
  length_mm        integer,
  width_mm         integer,
  height_mm        integer,
  cost_cents       integer not null,
  list_price_cents integer not null,
  stock_on_hand    integer not null default 0,
  attrs            jsonb   not null default '{}'::jsonb,
  updated_at       timestamptz not null default now()
);

create table channel_listings (
  sku              text not null references products(sku) on delete cascade,
  channel          text not null check (channel in ('bol','amazon','marktplaats')),
  channel_sku      text not null,
  state            text not null check (state in ('active','paused','draft')),
  last_pushed_hash text,
  last_pushed_at   timestamptz,
  channel_attrs    jsonb not null default '{}'::jsonb,
  primary key (sku, channel)
);

create index on channel_listings (channel, last_pushed_at);

The attrs column holds canonical product attributes. channel_attrs holds per-channel overrides (a longer title for Amazon, the Bol delivery code, the Marktplaats category id). The last_pushed_hash column is the trick: it is a SHA-256 of the payload we last pushed to that channel for that SKU. The worker reads it before deciding to push again.

Days 4 to 6: backfill from the marketplaces

You cannot start pushing until you know what is already out there. Each marketplace gives you a way to enumerate your existing offers.

For Bol, the Retailer API exposes a request-an-export pattern: you POST to /retailer/offers/export, poll for the report id, then download a CSV of every offer you currently have live. For Amazon, the Selling Partner API Listings Items endpoint and Reports API together give you the same picture, though you will fight more enums per channel. Marktplaats's Admarkt is the friendliest of the three: a flat CSV in, a flat CSV out.

We loaded everything into a staging table, matched on EAN where SKU was missing, hand-resolved the 23 SKUs that did not line up, and then wrote the matched data into channel_listings. The mismatch count is your single best signal of how rough the manual era was. Twenty-three out of 1,840 is fine. Two hundred would mean the company has a deeper data problem to fix before anything else.

EAN matching has a quiet failure mode worth naming. Some of your SKUs will share an EAN: a six-pack of the same fryer carries the same barcode as the single unit. Match on EAN alone and you collapse two products into one. The marketplace will accept the offer because it does not care which one you meant; the customer support inbox will care two weeks later. We added a uniqueness check on (ean, pack_size) and flagged the duplicates for the ops lead to disambiguate. Twenty minutes of triage on day five saved a week of confused order tickets later.

Days 7 to 9: the diff worker

This is the engine. Every night at 03:00 the worker wakes up, walks the product table, and for each SKU computes what each channel should look like. Then it hashes that payload, compares to last_pushed_hash, and only pushes when they differ.

import { createHash } from 'node:crypto';
import { Pool } from 'pg';
import { renderBol, renderAmazon, renderMarktplaats, push } from './adapters';

const pool = new Pool();

export async function syncOne(sku: string) {
  const { rows: [product] } = await pool.query(
    'select * from products where sku = $1', [sku]
  );
  if (!product) return;

  const renders = {
    bol: renderBol(product),
    amazon: renderAmazon(product),
    marktplaats: renderMarktplaats(product),
  };

  for (const [channel, payload] of Object.entries(renders)) {
    const hash = createHash('sha256')
      .update(JSON.stringify(payload))
      .digest('hex');

    const { rows: [listing] } = await pool.query(
      `select last_pushed_hash from channel_listings
       where sku = $1 and channel = $2`,
      [sku, channel]
    );

    if (listing?.last_pushed_hash === hash) continue;

    const result = await push[channel](payload);
    if (!result.accepted) continue;

    await pool.query(
      `update channel_listings
         set last_pushed_hash = $1, last_pushed_at = now()
       where sku = $2 and channel = $3`,
      [hash, sku, channel]
    );
  }
}

Two things to note. First, the hash is over the rendered payload, not the source product, so a change to a channel-specific override triggers exactly the channels it should. Second, last_pushed_hash is written only after the marketplace confirms acceptance. If the push fails, the next run retries it. There is no separate retry table.

Concurrency is the other thing nobody warns you about until production. Bol rate-limits per offer-id, so you can fan out across SKUs aggressively. Amazon rate-limits per operation type with a token bucket that the docs describe in one place and then quietly tightens under load. We run the worker single-threaded per channel, with a per-channel queue, and let the slow channel slow itself. The total nightly run finishes in eleven minutes for 1,840 SKUs. Faster would not buy the ops lead anything she would notice.

Warning

Amazon queues most listing changes asynchronously. A 200 from patchListingsItem means accepted-for-processing, not accepted. Read the submission report before writing last_pushed_hash, or you will hash-skip a payload the marketplace silently rejected.

Days 10 to 12: the three adapters

Each adapter is a pure function from canonical product to channel payload, plus a push function that calls the API. We kept them small. Each is under 200 lines.

Bol's Retailer API is the most pleasant. PUT to /retailer/offers/{offer-id}, get a process-status id back, poll once a minute up to ten times. We send price, stock, fulfilment, and delivery code. Title and description live on the product, which you create separately and then attach the offer to.

Amazon's SP-API is a fight. The right endpoint for ongoing updates is patchListingsItem, which takes a JSON Patch document against the listing. The product-type schemas are versioned and large. We pulled the schema for each product type once, cached it in Postgres, and validated payloads locally before sending. This catches about 80% of what would otherwise be a rejected submission and a 15-minute round trip to find out.

Marktplaats is a CSV upload to a known SFTP endpoint, with a separate status feed you fetch to see which rows it accepted. It feels dated. It is also the most reliable of the three.

All three adapters share two patterns: an idempotency key derived from sku + channel + hash, and a structured log line per push so we can grep channel=amazon sku=XYZ result=rejected at 09:00 on a Monday without opening a dashboard.

The log line shape is worth a paragraph on its own. Every push emits a single JSON object with sku, channel, action, hash, status, latency_ms, and (on rejection) the marketplace's error code verbatim. One line per push. No multi-line stack traces unless something blows up inside the adapter itself. When something does go wrong the question is always "what changed for this SKU between yesterday and today," and the answer is two grep commands away.

Days 13 and 14: cutover and the kill switch

Two flags in a config table govern the worker: dry_run and enabled_channels. We ran day 13 with dry_run = true across all three channels. The worker computed every diff and logged what it would push, but pushed nothing. The ops lead read the log over morning coffee, flagged two SKUs where our title renderer had stripped a unit ("RVS 4L" became "RVS"), we patched the renderer, ran again. Clean.

Day 14, dry-run off for Marktplaats first. Watched for four hours. Bol next. Amazon last, because Amazon is the channel that surfaces every schema bug you missed.

The kill switch is a single SQL update: update worker_config set enabled_channels = '{}'. We taped a note to the monitor with that exact command. Three people have psql access. Nobody has needed to run it yet.

The morning ritual matters more than the cutover itself. Every day for the first two weeks, the ops lead opened the worker log at 09:00 with a coffee, scanned for anything not green, and either fixed it or queued it for our standup. We sat with her for the first three mornings. By day five she stopped calling. By day ten she was forwarding the log to her assistant. The point of automation is not the worker. It is the new shape of someone's morning.

What this stops costing you

The wholesaler's operations lead used to spend roughly nine hours a week on portal updates across the three channels. After cutover, she spends about 40 minutes a week, most of it reviewing the worker's morning log and deciding whether the four or five flagged warnings need a human. Pricing errors that used to leak through (the €129 combi-steamer) are caught at the renderer because list price has a sanity check against cost price plus a configurable floor margin.

The other thing it stops costing you is the meeting where someone asks "who changed the title on Amazon" and three people quietly check their browser history. Source of truth means the answer lives in products.updated_at and a git log of the renderers.

And the boring failure mode (the one nobody screenshots because it is too embarrassing to admit happened) goes away. Until April, a wholesale buyer was paying €1,299 list on Amazon for a SKU already discounted to €949 on Bol, because nobody had remembered to push the promo end-date through to Amazon. Source of truth means the next promo triggers all three channels at the same moment, or none of them.

What we would do differently

Three things, in order of how much pain they would have saved.

One: build dry-run mode on day one, not day twelve. Most of the renderer bugs we caught on day 13 would have been obvious on day 4 if we had been logging diffs against the existing portal state from the start.

Two: do not hash the entire payload. We started by hashing the whole rendered JSON. That meant any schema-version bump from Amazon forced a re-push of every SKU on that channel. We now hash only the semantic fields (price, stock, title, description, dimensions, photos), and treat schema envelope changes separately.

Three: write the audit log to the same Postgres. We initially shipped logs to a separate observability tool. Then someone needed to answer "show me every change to SKU 8421 in May" and we had to switch tools. Now there is a push_events table, and the query is one join.

When we built this for the Rotterdam wholesaler, the corner that ate most of our second week was Amazon's product-type schema validation: it changes more often than the docs suggest, and silent rejections look identical to silent acceptance for the first hour. We solved it by caching schemas locally and revalidating on push failure, which moved failure detection from "tomorrow morning" to "two seconds before the request leaves the worker." That kind of plumbing is most of what our process automation work actually looks like.

If you are about to start something like this, the smallest first move is to run one query against your own data: how many of your live SKUs across all channels have a title that does not match your master spreadsheet today. The answer is your starting point.

Key takeaway

If your team edits product data in more than one place, you do not have a product database. You have a synchronization problem dressed up as data entry.

FAQ

Why Postgres and not a dedicated PIM?

For 1,800 SKUs across three channels, a PIM is overkill and slow to configure. Postgres with a JSONB attrs column covers the same job in a week, and your team already knows SQL.

How do you handle marketplace rejections?

The worker writes last_pushed_hash only after the marketplace confirms acceptance. Failed pushes stay un-hashed and the next nightly run retries them automatically.

What about real-time stock updates?

Nightly is fine for prices and descriptions. For stock, run the same worker every 15 minutes with a filter that only touches the stock field. Same hash trick, faster cadence.

Can you skip the dry-run mode?

You can. You will regret it on day one. Dry-run takes an hour to build and catches the renderer bugs that would otherwise hit production listings.

process automationautomationintegrationse-commercearchitectureoperations

Building something?

Start a project