Nonfiction AI panic: the 17 mistakes a redacteur can undo

Seventeen ways a Dutch nonfiction publisher gets the AI panic wrong, ranked by which mistakes a redacteur can undo in one prompt and which mean a full corpus rebuild.

Jacob Molkenboer · Founder · A Brand New Company · 4 Jan 2026 · 8 min

It is 23:14 on a Tuesday, and the redacteur (editor) at a Dutch nonfiction house with a turnover somewhere between fifteen and eighteen million euros is on her third coffee. The najaarscatalogus (autumn catalogue) closes Friday at noon. Forty chapters sit in her inbox, each generated last weekend by a ghostwriting agent the CTO stood up after the board forwarded a Hacker News thread titled "AI killed self-help nonfiction." Each chapter is technically fine. None of them is readable. The lead author whose backlist they were supposed to extend hasn't been told yet.

This is the moment we get the call.

The wrong question

The thread on HN was not wrong, exactly. The cheapest nonfiction is now free, and a midlist self-help author no longer competes with three other midlist self-help authors. She competes with a Notion template and a forty-minute Claude conversation. The board reading that thread on a Sunday was not wrong to feel something. They were wrong about what to do on Monday.

"Ship an AI line by autumn" is not a response to the question. The question is: what does an AI-shaped tool actually move for a backlist publisher whose unit economics depend on a brand-name author, a copyeditor's eye, and a sales rep who knows the buyer at Bruna by first name?

You can answer that. But you have to stop generating chapters first.

How to read this list

We've now done editorial-assist work for three publishers in the Netherlands and Belgium. The same seventeen mistakes show up in every aborted v1. We've grouped them by what it costs to fix.

Tier 1 mistakes you can fix in one prompt revision while the redacteur sits next to you. Tier 2 mistakes need an editorial-pipeline change — a second agent, a new step, a different output schema — but no data work. Tier 3 mistakes mean the retrieval corpus underneath the agent is shaped wrong, and the only honest fix is to rebuild it. If you are five days from a catalogue deadline, you can afford Tier 1 and parts of Tier 2. Tier 3 is a Q1 conversation, not an autumn one.

Tier 1: undo in a single prompt revision

1. Asking for "an authoritative voice." Every bad ghostwriting prompt opens this way. It produces the LinkedIn-essay register: confident, sourceless, mid. Replace it with three short paragraphs from your actual author and the line "match this register, including the parts that feel uncertain."

2. No house-style examples. Your house has a style — semicolons or no semicolons, headings as sentences or as labels, footnotes or endnotes. Without three to five example paragraphs in the system prompt, the model defaults to a transatlantic blog voice that does not exist in any Dutch publisher's catalogue.

3. Temperature at 1.0 because the CTO thought "creative." For nonfiction prose where the facts are fixed and only the phrasing should vary, 0.4 to 0.6 is a sensible starting range; tune it against your own chapters. Above 0.8 you get the chapter where the author "spent a winter in Lapland," which she did not.

4. No reading-level constraint. Dutch trade nonfiction sits around CEFR B2. Without "write at CEFR B2, sentences under twenty words, one idea per paragraph" in the system prompt, the model writes at C1-going-on-C2 and your reader bounces in the bookshop.

5. No per-chapter length cap. A frontlist chapter is 2,800 to 4,200 words. Without a cap, the agent produces 7,000-word chapters that all start strong and dissolve in the second half. Anthropic's prompting guide covers length and structure control with examples — read it before you write your next system prompt.
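As a rough illustration of the five Tier 1 fixes taken together, the sketch below assembles a system prompt from real sample paragraphs, a reading-level line and a length cap. `PromptSettings` and `build_system_prompt` are hypothetical names, not part of any library. The temperature is only stored next to the prompt, because how you pass it depends on the model API you call.

```python
from dataclasses import dataclass


@dataclass
class PromptSettings:
    """Hypothetical container for the Tier 1 settings; every name is illustrative."""
    author_samples: list[str]        # short paragraphs by the actual author (item 1)
    house_style_samples: list[str]   # house-style example paragraphs (item 2)
    temperature: float = 0.5         # inside the 0.4 to 0.6 range from item 3
    reading_level: str = "CEFR B2"   # item 4
    max_sentence_words: int = 20     # item 4
    min_words: int = 2800            # item 5
    max_words: int = 4200            # item 5


def build_system_prompt(s: PromptSettings) -> str:
    """Return a system prompt that states voice, style, level and length explicitly."""
    if len(s.author_samples) < 3:
        raise ValueError("paste in at least three paragraphs from the author")
    if not 3 <= len(s.house_style_samples) <= 5:
        raise ValueError("use three to five house-style example paragraphs")

    parts = [
        "Match the register of these paragraphs by the author, "
        "including the parts that feel uncertain:",
        *s.author_samples,
        "Follow the house style shown in these example paragraphs:",
        *s.house_style_samples,
        f"Write at {s.reading_level}, sentences under {s.max_sentence_words} words, "
        "one idea per paragraph.",
        f"Keep the chapter between {s.min_words} and {s.max_words} words.",
    ]
    return "\n\n".join(parts)
```

The function refuses to build a prompt without real samples, so the easiest mistake to make (shipping a prompt with no author text in it) fails loudly instead of silently.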

Tier 2: editorial pipeline changes

6. Single-pass generation. Asking one model in one call to do research, structure, drafting, and self-edit is asking a junior writer to file finished copy from a brief. Split it into a four-step chain: outline agent, draft agent, editor agent, fact-check agent. Each step gets its own prompt, its own output schema, and one upstream dependency. The fact-check agent reads only the draft's own citations; if any claim has no resolved source, the chapter bounces back to draft before it ever reaches the redacteur's inbox.

7. No fact-check step. The agent will invent a Dutch researcher at Universiteit Leiden with a plausible name and a fabricated 2019 study. Run every claim through a retrieval-grounded checker and reject chapters with unresolved citations. Do not let the copyeditor be the fact-checker of last resort; you will burn her out by week three.

8. Same prompt for every chapter. Forty chapters from one prompt give you forty chapters that read the same. Pass the chapter's role in the book — opener, deepening, case study, recap — as a parameter, and let the prompt branch on it.

9. No author-voice samples in retrieval. The author's previous three books are the highest-signal voice data you have, and the most expensive to replace. Chunk them paragraph-wise, embed the chunks, and feed two or three retrieved paragraphs into every draft call as voice anchors.

10. Treating the agent as a writer instead of a research assistant. This is the philosophical mistake under most of the others. The agent is good at "find every claim in the corpus about late-life career changes, group by argument, surface contradictions." It is bad at "be Bregman." Use it for the first job. Pay your author to do the second.

11. No source preservation in the output. If your agent's output is plain prose, the copyeditor can't verify anything. Force a structured output where every paragraph carries the chunk IDs it was drafted from. The reader never sees these; the editor and the lawyer do.

12. No reader persona. "Write for the curious general reader" is the same as "write for nobody." Pin a specific persona — the 52-year-old who bought your author's last book at Schiphol — into the system prompt, and the prose tightens in one pass.
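The chain from item 6, the chapter-role parameter from item 8 and the source-carrying output from item 11 can be sketched in a few lines. This is a minimal sketch under stated assumptions: the outline, draft and edit steps are passed in as plain callables standing in for model calls, every paragraph is treated as a claim that needs at least one source, and the fact check is reduced to a lookup against known chunk IDs rather than a real retrieval-grounded checker. All names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable

CHAPTER_ROLES = {"opener", "deepening", "case study", "recap"}


@dataclass
class Paragraph:
    """One drafted paragraph plus the corpus chunk IDs it was drafted from."""
    text: str
    chunk_ids: list[str] = field(default_factory=list)


def unresolved(paragraphs: list[Paragraph], known_chunk_ids: set[str]) -> list[Paragraph]:
    """Return paragraphs that cite nothing, or cite a chunk missing from the corpus."""
    return [
        p for p in paragraphs
        if not p.chunk_ids or any(c not in known_chunk_ids for c in p.chunk_ids)
    ]


def run_chapter(
    brief: str,
    role: str,
    outline_step: Callable[[str, str], list[str]],
    draft_step: Callable[[list[str], str], list[Paragraph]],
    edit_step: Callable[[list[Paragraph]], list[Paragraph]],
    known_chunk_ids: set[str],
    max_rounds: int = 3,
) -> list[Paragraph]:
    """Outline once, then draft, edit and fact-check until every paragraph resolves."""
    if role not in CHAPTER_ROLES:
        raise ValueError(f"unknown chapter role: {role!r}")
    outline = outline_step(brief, role)
    for _ in range(max_rounds):
        edited = edit_step(draft_step(outline, role))
        if not unresolved(edited, known_chunk_ids):
            return edited  # only now does the chapter reach the redacteur
    raise RuntimeError("unresolved citations remain; chapter held back")
```

Raising after `max_rounds` is a deliberate choice in this sketch: a chapter that cannot be sourced is held back rather than passed to the copyeditor to sort out.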

Tier 3: full corpus rebuild

These are the ones that hurt. If your agent has any of these problems, you cannot ship the autumn line on its current foundation. You can still ship a smaller, honest thing: an editorial-assist tool for your living authors, with the corpus you already have, used differently.

13. The retrieval corpus is whole PDFs, not semantic chunks. A 280-page book indexed as one file, or split by page, is useless for retrieval. Chunk by argument boundary instead (usually 400 to 800 tokens, respecting paragraph breaks) and retrieval returns the right two paragraphs instead of the right two pages.

14. No metadata on chunks. Every chunk needs, at minimum: author, book, year, genre, register, language, and licence status. Without this, you cannot filter the retrieval, and the agent draws from your 1998 management title when it should be drawing from your 2023 climate-anxiety book.

15. Licensed and unlicensed sources in the same index. The agent does not know which sources you have the right to derive from. If your index mixes your own backlist with scraped Goodreads reviews and a few PDFs the intern downloaded, every output is legally radioactive. Separate indices, hard ACLs, and a log of which chunks each chapter drew from. The EU's AI Act (Regulation (EU) 2024/1689) adds transparency obligations around AI-generated content, so expect to be asked the provenance question. Your author's lawyer will ask it too.

16. Wrong embedding model. If your corpus is 90% Dutch and you embedded it with an English-tuned model, your retrieval will be lukewarm and your CTO will tell the board that "RAG doesn't work." Use a multilingual embedding model that lists Dutch in its benchmark table. Re-embedding when you switch is faster than you think — usually a weekend, not a quarter.

17. No provenance trail at all. When the author of the book whose voice you ghosted from asks, in writing, what was used to generate chapter eleven, you need a one-click answer. If your pipeline does not log, per paragraph, which chunks were retrieved, which prompts ran, and which model versions produced the text, you have built a liability and called it a product.
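For items 13 to 15, a minimal sketch of what "chunked, tagged and separated" could look like. It makes two loud simplifications: argument boundaries are approximated by packing whole paragraphs up to a size limit, and tokens are approximated by whitespace-separated words. The `Chunk` fields mirror the metadata listed in item 14; the function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    chunk_id: str
    text: str
    author: str
    book: str
    year: int
    genre: str
    register: str
    language: str
    licensed: bool


def pack_paragraphs(text: str, max_tokens: int = 800) -> list[str]:
    """Greedily pack whole paragraphs into chunks of at most max_tokens words.

    A single paragraph longer than max_tokens is kept whole in its own chunk.
    """
    chunks: list[str] = []
    current: list[str] = []
    size = 0
    for para in (p.strip() for p in text.split("\n\n")):
        if not para:
            continue
        n = len(para.split())  # whitespace words as a rough stand-in for tokens
        if current and size + n > max_tokens:
            chunks.append("\n\n".join(current))
            current, size = [], 0
        current.append(para)
        size += n
    if current:
        chunks.append("\n\n".join(current))
    return chunks


def build_chunks(text: str, book_id: str, max_tokens: int = 800, **metadata) -> list[Chunk]:
    """Attach the same book-level metadata to every chunk of one book."""
    return [
        Chunk(chunk_id=f"{book_id}-{i:04d}", text=t, **metadata)
        for i, t in enumerate(pack_paragraphs(text, max_tokens))
    ]


def split_by_licence(chunks: list[Chunk]) -> tuple[list[Chunk], list[Chunk]]:
    """Return (licensed, unlicensed) so the two never share an index."""
    licensed = [c for c in chunks if c.licensed]
    unlicensed = [c for c in chunks if not c.licensed]
    return licensed, unlicensed
```

Because `Chunk` has no default values, a chunk without author, year or licence status cannot be constructed at all, which is the point of item 14.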

Warning

The mistake that ends careers is not generating bad chapters. It is shipping bad chapters under a living author's name without her sign-off on every sentence. Don't.

What to do before Friday

If the catalogue closes this week and you are sitting at mistake fifteen, the answer is not "ship anyway and apologise in spring." The answer is: pull the AI line from the autumn catalogue and ship a smaller, true thing in its place. Your strongest living author's next title, say, with an editorial-assist agent helping her structure and fact-check, running on Tier 1 and Tier 2 fixes only.

This is a less exciting board update than "we have launched an AI imprint." It is also a board update you can give again next year, with the same author still under contract.

The five-minute audit

Before you do anything else, open the system prompt your team is using and check three things. Does it contain real example paragraphs from a real author in your house? Does it cap chapter length? Does it tell the model what register and reading level the output is for? If any of those three is missing, you're in Tier 1 territory and you can move the agent's output quality by 30% in the next hour. If they are all present and the output is still flat, the problem is in Tier 2 or Tier 3, and the rest of this guide is your map.
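The three checks can be roughed out as a script, assuming you have the system prompt as a string and a few of the author's paragraphs to hand. This is a crude text-matching sketch with hypothetical names, not a substitute for reading the prompt: it only looks for a verbatim author paragraph, a word count of three or more digits, and the words "register" plus "CEFR" or "reading level".

```python
import re

# A number with at least three digits or separators, followed by "words" or "woorden".
LENGTH_CAP = re.compile(r"\b\d[\d.,]{2,}\s*(?:words|woorden)\b", re.IGNORECASE)
READING_LEVEL = re.compile(r"\bCEFR\b|\breading level\b", re.IGNORECASE)
REGISTER = re.compile(r"\bregister\b", re.IGNORECASE)


def audit_system_prompt(prompt: str, author_paragraphs: list[str]) -> dict[str, bool]:
    """Run the three checks as crude text matches; True means the check passes."""
    has_author_text = any(
        bool(p.strip()) and p.strip() in prompt for p in author_paragraphs
    )
    return {
        "has_author_paragraph": has_author_text,
        "has_length_cap": bool(LENGTH_CAP.search(prompt)),
        "has_register_and_level": bool(
            READING_LEVEL.search(prompt) and REGISTER.search(prompt)
        ),
    }
```

Any `False` in the result points at a Tier 1 fix. All three `True` only means the words are present, so the output still needs a human read.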

When we built an editorial-assist agent for a Dutch nonfiction publisher last spring, the thing we ran into was mistake 16: the corpus was Dutch, the embeddings were English, retrieval was returning nothing useful, and the CTO had concluded the technology didn't work. We re-embedded with a multilingual model over a weekend, kept the rest of the pipeline, and the same agent started returning the paragraph you actually wanted — the architecture sits on our AI agents page.

Open your system prompt. Find the word "authoritative." Delete it. Paste in three paragraphs from your author. That is today's work.

Key takeaway

Most AI nonfiction failures are prompt-level mistakes a redacteur can fix in an hour. Only a handful force a full corpus rebuild — know which is which before Friday.

FAQ

Should a sub-€18M Dutch publisher still launch an AI nonfiction line in 2026?

Not as a standalone imprint. As an editorial-assist layer behind your living authors, yes. The economics work; the brand risk of ghostwritten frontlist does not.

Why does the Dutch-language corpus matter so much?

Most embedding and reasoning models are tuned primarily on English. If your corpus is 90% Dutch and your model is not multilingual, retrieval is weak and the agent looks broken when the real problem is the index.

Which of the seventeen mistakes can a redacteur fix without a developer?

All five Tier 1 mistakes and parts of Tier 2 — anything that lives in the system prompt or the editorial review step. Tier 3 needs engineering.

What's the smallest useful editorial-assist agent worth building this quarter?

A research-and-structure agent for one living author: it surfaces claims, finds contradictions, drafts an outline. The author writes the prose. Three weeks of work, no corpus rebuild.

How do we handle the EU AI Act for AI-generated nonfiction?

Log per-paragraph provenance: which chunks were retrieved, which prompts ran, which model versions produced the text. Keep licensed and unlicensed sources in separate indices with hard access controls.
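One possible shape for that per-paragraph log is a JSON Lines file with one record per paragraph. The sketch below is an assumption about layout, not a compliance recipe: the field names are hypothetical, prompts are stored as SHA-256 hashes so the log stays small, and the lookup simply scans the log for one chapter.

```python
import hashlib
import json
from dataclasses import asdict, dataclass
from typing import Iterable, TextIO


@dataclass
class ProvenanceRecord:
    chapter: int
    paragraph: int
    chunk_ids: list[str]        # which chunks were retrieved
    prompt_hashes: list[str]    # which prompts ran (hashed)
    model_versions: list[str]   # which model versions produced the text


def prompt_hash(prompt: str) -> str:
    """Hash a prompt so the log can reference it without storing it inline."""
    return hashlib.sha256(prompt.encode("utf-8")).hexdigest()


def log_paragraph(log: TextIO, record: ProvenanceRecord) -> None:
    """Append one record as a single JSON line."""
    log.write(json.dumps(asdict(record), ensure_ascii=False) + "\n")


def chapter_sources(log_lines: Iterable[str], chapter: int) -> dict[int, list[str]]:
    """Answer 'what was used for this chapter?' as paragraph number -> chunk IDs."""
    sources: dict[int, list[str]] = {}
    for line in log_lines:
        if not line.strip():
            continue
        record = json.loads(line)
        if record["chapter"] == chapter:
            sources[record["paragraph"]] = record["chunk_ids"]
    return sources
```

With a log like this, the question about chapter eleven becomes a call to `chapter_sources(open_log, 11)`, where `open_log` is whatever file handle or list of lines holds the records.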
