← Blog

Process automation

Google Sheet to Postgres: a half-day warehouse playbook

The warehouse spreadsheet has locked again at 11:18 with three pickers editing it from the floor. You have half a day to move it into Postgres before the next truck.

Jacob Molkenboer· Founder · A Brand New Company· 7 May 2024· 9 min
Open brass card-index drawer with ledger cards, chartreuse paper tab, red wax seal on ivory paper.

It is Wednesday afternoon. The operations lead at a parts distributor near Eindhoven sends a screenshot: the Google Sheet that tracks 4,200 SKUs has locked again because three pickers and a buyer are all editing it from the floor. The "Quantity on hand" column has two versions, one from 11:14 and one from 11:18, and nobody knows which is right. The sheet has 38 tabs, a VLOOKUP that takes eleven seconds to recalc, and a script trigger that fires twice on every save.

You have half a day before the next inbound truck. You can either spend it fixing the sheet, or move the data into Postgres with a tiny admin on top and never touch the sheet again. This is not a migration story in the Airbyte-and-dbt sense. It is the quiet version: one table, one CSV, one form, and a team that stops losing data on Wednesday afternoons. This is that second option, hour by hour.

Hour one: read the sheet honestly

Do not open the sheet with the intent to migrate it. Open it with the intent to understand it. Spreadsheets that run real operations always do three things at once: they store data, they compute reports, and they hold workflow rules in cell colours and conditional formatting. Your Postgres table only replaces the first one. The other two move to code or to the admin UI.

Pick the one tab that is the source of truth. There is always one. Everything else is a view, a buffer, or a report. For our distributor it was the tab called Voorraad, 4,217 rows wide, 22 columns deep. Of those columns, eight were real data and the rest were derived (VLOOKUPs, ARRAYFORMULAs, cell-coloured status flags). Write that down on paper. You will refer to it three times in the next ninety minutes.

Then list the writers. Who actually edits this sheet, and at what cadence? In our case: two warehouse pickers (every fifteen minutes during a shift), one buyer (twice a day), and the operations lead (whenever something breaks). Four humans. That is your concurrency budget. You are not building Shopify. You are building something three to five people will hit at the same time, and that lets you skip a lot of architecture.

Spend the last ten minutes reading the conditional formatting. Pickers and buyers encode meaning in cell colours: red means deleted but not actually deleted, yellow means waiting on the supplier, light green means counted today. Each colour is either a column you have to add to your schema or a workflow you have to honour outside it. The hard part of replacing a spreadsheet is not the schema. It is figuring out which columns were data and which were a UI hack the team built because they had no other tool.

Hour two: a schema that survives Monday

You now have a list of eight real columns. Resist the urge to model the world. You are not designing for the next decade. You are designing for the next month, and you want migrations to be cheap when you learn what you got wrong.

For the distributor, the table looked like this:

create table sku (
  id            bigserial primary key,
  code          text not null unique,
  description   text not null,
  location      text not null,
  on_hand       integer not null default 0,
  reorder_at    integer not null default 0,
  supplier      text,
  notes         text,
  updated_at    timestamptz not null default now(),
  updated_by    text not null default 'system'
);

create index sku_location_idx on sku (location);
create index sku_low_stock_idx on sku (on_hand) where on_hand < reorder_at;

Three things to notice. The code column is the natural key from the sheet, but the table still gets a synthetic id. Foreign keys to a SKU should never depend on the operations team renaming a part. The updated_at and updated_by columns are there from day one because the first question after go-live is always "who changed this and when". And the partial index on low stock is the report the team currently runs by sorting the sheet and squinting. Make it a query instead.

Two small calls worth defending. bigserial rather than uuid because a warehouse with ten thousand SKUs will never run out of bigint, and a sortable primary key makes the admin list cheaper to paginate. timestamptz rather than timestamp because someone, eventually, will run the admin from a different timezone, and a naive timestamp will silently shift by an hour. Both choices cost nothing today and save a migration in six months.

If you want a full audit trail, add a second table:

create table sku_event (
  id          bigserial primary key,
  sku_id      bigint not null references sku(id) on delete cascade,
  field       text not null,
  old_value   text,
  new_value   text,
  changed_at  timestamptz not null default now(),
  changed_by  text not null
);

A trigger fills this on every update. You will thank yourself the first time someone asks why a count dropped overnight. The Postgres docs on trigger functions cover the pattern in about forty lines.

Hour three: the load

Export the sheet as CSV (File, Download, Comma-separated values). Do not use a live API connection for the first load. You want a single, frozen snapshot that you can re-run when the import inevitably misses an edge case.

Clean the CSV in code, not by hand. Pickers type "12 stuks" in a quantity column. Buyers leave trailing spaces. One row in 200 has a comma inside a description and was saved with the wrong quoting. A forty-line Node script handles all of it and gives you a diff you can show the operations lead before the data hits Postgres.

import { readFileSync } from 'node:fs'
import { parse } from 'csv-parse/sync'
import postgres from 'postgres'

const sql = postgres(process.env.DATABASE_URL)
const rows = parse(readFileSync('voorraad.csv'), {
  columns: true,
  skip_empty_lines: true,
  trim: true,
})

const cleaned = rows
  .filter(r => r.Artikelcode)
  .map(r => ({
    code: r.Artikelcode.trim().toUpperCase(),
    description: r.Omschrijving.replace(/\s+/g, ' ').trim(),
    location: (r.Locatie || 'UNASSIGNED').trim(),
    on_hand: parseInt(String(r.Voorraad).replace(/[^0-9-]/g, ''), 10) || 0,
    reorder_at: parseInt(r.Minimum, 10) || 0,
    supplier: r.Leverancier?.trim() || null,
    notes: r.Notities?.trim() || null,
  }))

for (const row of cleaned) {
  await sql`
    insert into sku ${sql(row)}
    on conflict (code) do update set
      description = excluded.description,
      location    = excluded.location,
      on_hand     = excluded.on_hand,
      reorder_at  = excluded.reorder_at,
      supplier    = excluded.supplier,
      notes       = excluded.notes,
      updated_at  = now(),
      updated_by  = 'csv-import'
  `
}

console.log(`Loaded ${cleaned.length} rows.`)
await sql.end()

Before you insert anything in production, log the first ten cleaned rows and put them in front of the operations lead. They will spot the column you got wrong in about fifteen seconds. If Voorraad arrives as a string of mixed digits and Dutch words, you want to see that on screen before it lands in a column typed as integer.

Run it against a staging database first. Diff the row count against the sheet. If you are missing rows, it is almost always one of three things: an empty article code, a row hidden by a filter you forgot to clear, or a row the team marked deleted by colouring it red but never actually removed. Fix the script, not the data, and re-run.

Warning

If you find rows that exist only because of conditional-formatting "soft deletes", get the operations lead to confirm what to do with them before the cutover. Migrating ghost rows into Postgres is the fastest way to lose trust in the new system on day one.

Hour four: the admin

You are not building a CMS. You need three screens: a searchable list, an edit form, and a low-stock view. Next.js with the app router gives you all three in under 200 lines, and server actions remove the need for any client-side state management.

The list page is a server component that queries Postgres directly:

// app/sku/page.tsx
import { sql } from '@/lib/db'
import Link from 'next/link'

export default async function SkuList({ searchParams }) {
  const q = (searchParams.q ?? '').trim()
  const rows = await sql`
    select id, code, description, location, on_hand, reorder_at
    from sku
    where ${q
      ? sql`code ilike ${'%' + q + '%'} or description ilike ${'%' + q + '%'}`
      : sql`true`}
    order by code
    limit 200
  `

  return (
    <main>
      <form><input name="q" defaultValue={q} placeholder="Search SKU or description" /></form>
      <table>
        <thead>
          <tr><th>Code</th><th>Description</th><th>Location</th><th>On hand</th></tr>
        </thead>
        <tbody>
          {rows.map(r => (
            <tr key={r.id} className={r.on_hand < r.reorder_at ? 'low' : ''}>
              <td><Link href={`/sku/${r.id}`}>{r.code}</Link></td>
              <td>{r.description}</td>
              <td>{r.location}</td>
              <td>{r.on_hand}</td>
            </tr>
          ))}
        </tbody>
      </table>
    </main>
  )
}

The edit page is a form that posts to a server action. The action does one thing: update the row and write an event. No optimistic UI, no toast library, no state machine. A redirect after the update is fast enough that the team will not notice.

// app/sku/[id]/actions.ts
'use server'
import { sql } from '@/lib/db'
import { redirect } from 'next/navigation'
import { auth } from '@/lib/auth'

export async function updateSku(id, formData) {
  const user = await auth()
  const on_hand = parseInt(formData.get('on_hand'), 10)
  const location = formData.get('location').trim()

  await sql`
    update sku
    set on_hand    = ${on_hand},
        location   = ${location},
        updated_at = now(),
        updated_by = ${user.email}
    where id = ${id}
  `
  redirect('/sku')
}

Auth can be the simplest thing that works. For four users on an internal network, a single shared link with a passphrase behind Next.js middleware is fine for week one. Add Google SSO later when the warehouse manager asks for it, not before.

Do not add pagination, multi-column sorting, saved filters, or row-level highlighting until someone actually asks. The week-one admin is a thin shell. Every feature you add before the team asks for it is a feature you will defend later when someone wants it removed, and warehouse teams are merciless about unused buttons in the way of the count field.

Deploy on Vercel or a small VPS. Either works. Point Postgres at Supabase, Neon, or a managed RDS instance. The database choice matters less than getting the cutover right.

The cutover

Pick a low-traffic window. For a warehouse, that is end of shift on a weekday, not Monday morning. Tell the team in writing what is changing, what to do if the new tool breaks, and who to call. A printed one-pager taped to the picking station beats any chat announcement.

Bring printed cheat sheets to the floor before you flip the switch. One sheet, one side, shows the new URL, the login passphrase, and the three actions the team does daily: update on-hand count, change a location, mark a SKU for reorder. The pickers will reference it twice on day one and never again, and that is exactly the right amount of training material.

At cutover time, do exactly this:

  1. Lock the sheet (File, Share, viewer-only for everyone).
  2. Export a fresh CSV.
  3. Run the load script against the production database.
  4. Hand out the admin URL.
  5. Stay on call for the next ninety minutes.

The first edits will reveal three things you missed. A location code with a slash in it. A SKU the pickers track but the sheet never had a column for. A field the buyer needs that you assumed was derived. Fix them live. Each fix is a five-minute schema change and a deploy.

What runs for the next week

Leave the sheet visible but read-only for one week. Do not delete it. Two reasons: the team will keep referring to old comments in cells, and you want a frozen reference if a number in Postgres looks wrong. After seven days, archive the sheet to a folder named archive-pre-postgres-2026 and move on.

Add three things during that week. A nightly backup of the Postgres database (your provider almost certainly has this built in, turn it on). A weekly CSV export from the new admin, emailed to the operations lead, so they have the spreadsheet they are used to without anyone editing it. And a simple low-stock email that fires at 07:00 each morning listing every SKU below its reorder level. That last one is usually the moment the team stops missing the sheet.

Watch the audit table on day two. Sort by changed_at descending and skim the last hundred rows. You are looking for unexpected patterns: one picker doing all the edits because the others quietly gave up on the new tool, two users hammering the same SKU twenty seconds apart, or a field you forgot to add that someone is updating by writing the value into the notes column. Each pattern is feedback the spreadsheet never gave you.

The pattern, not the project

This playbook works because the scope is honest. You are not building an ERP. You are replacing a single spreadsheet with a single table and a thin UI, and you are doing it in the time it takes the warehouse to process two inbound trucks. The same shape applies to the invoice tracker the finance team runs, the lead pipeline marketing keeps in Airtable, and the project schedule that lives in a Notion database with seventeen views.

When we built a process automation like this for a wholesaler in Limburg last month, the thing we hit on cutover was a column called Status that turned out to hold four different concepts depending on the row. We split it into three boolean columns before go-live and nobody has asked for the old version back. Pick the spreadsheet that hurts most this week and give it half a day.

Key takeaway

Replacing a spreadsheet with a Postgres table plus a thin Next.js admin is a half-day job once you accept that you are not building an ERP, just retiring one sheet.

FAQ

Why Postgres instead of staying in Google Sheets with Apps Script?

Sheets locks under concurrent edits and has no real audit trail. Postgres gives you constraints, indexes, and a row-level history table that survives the first time someone asks why a count changed overnight.

Do I need an ORM for the Next.js admin?

No. For three screens and four users, a thin SQL client like postgres.js is enough. Add Drizzle or Prisma later if you outgrow it, not before.

What about offline edits from the warehouse floor?

If the floor has unreliable wifi, keep the admin online-only and give pickers a paper count sheet that gets typed in at the end of shift. Real offline sync is a separate, much larger project.

How do I handle the buyer who wants to keep using a spreadsheet?

Email them a daily CSV export of the same table. They get their spreadsheet, you keep one source of truth, and edits only happen in the admin.

Can I do this without writing the load script in Node?

Yes. psql with COPY works for clean CSVs, and Python with pandas is fine if you prefer it. The script language matters less than running it twice in staging before production.

process automationautomationworkflowoperationstoolingarchitecture

Building something?

Start a project