RFQ agent for a semicon shop: €4.12 to €0.38 per quote

A buyer pasted a STEP file into Teams at 16:52 on a Friday and asked for a quote by Monday. Behind that file sat a 12-year-old Plex ERP and 16 stacked RFQs.

Jacob Molkenboer · Founder · A Brand New Company · 19 Jun 2026 · 9 min

A buyer at one of Europe's larger fabless customers pasted a STEP file into Teams at 16:52 on a Friday, asked for a quote "by Monday morning, ideally," and went home. The shop floor lead at our client, a 28-person semiconductor supplier in Eindhoven, already had sixteen of those open. Quoting one of them, end to end, took her forty-five minutes on a good day and an hour on a bad one. Her week was already booked.

That single Teams paste was the third RFQ we watched that afternoon. By the end of the visit we had counted 164 of them across the week. Annualised, the same shop sees roughly 42,000 drawing requests a year. The cost per quote, fully loaded, was €4.12. The accept-rate on quotes returned after the Monday deadline was 11 percent. On quotes returned in under six hours it was 38 percent. The arithmetic on that gap funded the entire project.

The thing the buyer actually sends

A semicon drawing request is not one file. In a typical week it arrives as STEP or IGES geometry, a 2D PDF with GD&T callouts, a tolerance spec referencing a customer-specific drawing standard, and a freeform email saying things like "we need 12 of these, anodised black, but the threaded holes are H7, not H8 like last time." The chat agent's job is not to quote. Its job is to assemble the quoting context, ask the buyer the three questions that always get asked anyway, and hand a complete RFQ envelope to the shop floor lead.
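To make "RFQ envelope" concrete, here is a minimal sketch of what such a container could look like. The class name, the field names, and the idea of deriving open questions from missing fields are assumptions for illustration, not the shop's actual schema. The three required fields mirror the three confirmations the agent asks for: material, batch size, and surface finish.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class RFQEnvelope:
    """Hypothetical container for everything the shop lead needs to quote."""
    geometry_path: str                       # STEP or IGES file from the buyer
    drawing_pdf_path: Optional[str] = None   # 2D drawing with GD&T callouts
    customer_standard: Optional[str] = None  # customer-specific drawing standard
    material: Optional[str] = None
    batch_size: Optional[int] = None
    surface_finish: Optional[str] = None
    notes: list[str] = field(default_factory=list)  # freeform remarks from the email

    def open_questions(self) -> list[str]:
        """Return the required fields the buyer has not confirmed yet."""
        required = ("material", "batch_size", "surface_finish")
        return [name for name in required if getattr(self, name) is None]

    def is_complete(self) -> bool:
        """An envelope is ready for the shop lead once nothing is open."""
        return not self.open_questions()


envelope = RFQEnvelope(
    geometry_path="housing.step",  # illustrative file name
    batch_size=12,
    surface_finish="anodised black",
)
print(envelope.open_questions())  # ['material']
```

An envelope with open questions is what would trigger a clarification to the buyer; a complete one is what gets handed over.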

Doing that well required reading the STEP file. That was the part that, until April, lived inside a SolidWorks macro running on a Windows VM in the corner of the office.

Why the SolidWorks bridge had to die

The previous setup was, charitably, a Heath-Robinson contraption. SolidWorks ran inside a Windows VM. A C# macro opened each new STEP file, measured bounding box, surface area, hole count, and wall thickness, and wrote a JSON blob into a shared folder. A Python script picked the blob up, called a homegrown PostgreSQL stored procedure that mapped feature counts to routing operations (drilling, milling, EDM, surface treatment), and produced a draft quote.

It worked. It cost a SolidWorks seat at €4,800 a year, plus the Windows VM, plus the fact that any change to the macro required a developer who knew both SolidWorks API quirks and the firm's routing conventions. There were two such people in the building. One of them was on parental leave. When the macro crashed at 02:00 on a Saturday morning, which it did about once a month because Windows Update, nobody was on call.

The deeper problem was reach. The bridge could only answer the questions the macro had been written for. A buyer asking "what if we drop the H7 to H8, does the price move?" required a human to open SolidWorks, change the model, re-export, re-run. That round-trip killed the same-day accept-rate.

CadQuery as the parser

CadQuery is a Python library on top of the OpenCascade kernel. It reads STEP and IGES, gives you a real B-Rep tree, and lets you query features programmatically. We picked it for three reasons: it runs anywhere Python runs, it has no licensing cost, and its API is verb-shaped enough that Claude can call it through tool use without much hand-holding.

The replacement is a small FastAPI service that exposes the parts of CadQuery the agent actually needs:

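import os
import tempfile

import cadquery as cq
from fastapi import FastAPI, UploadFile  # uploads also need python-multipart installed

app = FastAPI()

@app.post("/inspect")
async def inspect(file: UploadFile):
    # CadQuery's importer reads from a path, so spool the upload to a temp file.
    with tempfile.NamedTemporaryFile(suffix=".step", delete=False) as tmp:
        tmp.write(await file.read())
        path = tmp.name
    try:
        body = cq.importers.importStep(path)
    finally:
        os.remove(path)  # delete=False means cleanup is on us

    solid = body.val()  # first shape in the file; multi-body files need more care
    bbox = solid.BoundingBox()
    # Cylindrical faces are a rough proxy for holes: bosses and fillets also match,
    # and a hole modelled as two half-cylinders counts twice.
    cylinders = [f for f in solid.Faces() if f.geomType() == "CYLINDER"]
    return {
        "bbox_mm":       [bbox.xlen, bbox.ylen, bbox.zlen],
        "volume_mm3":    solid.Volume(),
        "surface_mm2":   solid.Area(),
        "hole_count":    len(cylinders),
        # thin_wall_scan is the service's own ray-casting helper, defined elsewhere.
        "thin_walls_mm": thin_wall_scan(solid, threshold=1.5),
    }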

That endpoint is one of nine the agent can call. The others cover sectional analysis, tolerance extraction from the PDF (via a separate OCR step that runs in the same loop), looking up customer-specific drawing standards, and reading routing rules out of Postgres.

The Claude tool-use loop

The orchestrator is a Claude tool-use loop along the lines described in Anthropic's tool-use docs. The agent receives the Teams message, the attachments, and a system prompt that names it the firm's "RFQ-voorbereider" (RFQ preparer). It then plans, calls tools, asks the buyer follow-up questions when the geometry is ambiguous, and finally hands a complete envelope to the shop floor lead inside Plex.

import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

# input_schema bodies are elided here.
tools = [
    {"name": "inspect_step",                "input_schema": {...}},
    {"name": "extract_tolerances_from_pdf", "input_schema": {...}},
    {"name": "lookup_routing",              "input_schema": {...}},
    {"name": "lookup_customer_standard",    "input_schema": {...}},
    {"name": "draft_quote_in_plex",         "input_schema": {...}},
    {"name": "ask_buyer",                   "input_schema": {...}},
    {"name": "check_account_status",        "input_schema": {...}},
]

# SYSTEM_PROMPT, conversation, and run_tools are defined elsewhere in the service.
def ask_claude():
    return client.messages.create(
        model="claude-sonnet-4-5",
        system=SYSTEM_PROMPT,
        tools=tools,
        messages=conversation,
        max_tokens=4096,
    )

resp = ask_claude()
while resp.stop_reason == "tool_use":
    # Run every requested tool, echo the assistant turn back,
    # then answer all tool_use blocks in a single user turn.
    tool_results = run_tools(resp.content)
    conversation.append({"role": "assistant", "content": resp.content})
    conversation.append({"role": "user",      "content": tool_results})
    resp = ask_claude()

Two design choices matter here. First, draft_quote_in_plex writes to a staging table, never to the production quote table. A human in Plex still clicks "release." Second, ask_buyer is a tool, not a fallback. The agent is encouraged to ask, in plain Dutch, the same three questions the shop lead would have asked: confirm material, confirm batch size, confirm surface finish. Buyers like it. The conversion rate on agent-asked clarifications is higher than on the buyer's own free-text spec.
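The loop calls a run_tools helper that is not shown. As a rough sketch of what such a dispatcher might look like: it walks the response's content blocks, runs a handler for each tool_use block, and returns one tool_result block per call, keyed by tool_use_id. The HANDLERS registry and the stub inspect_step handler are hypothetical, and the SimpleNamespace objects only stand in for the SDK's content blocks so the example runs without an API key.

```python
import json
from types import SimpleNamespace


def inspect_step(path: str) -> dict:
    """Hypothetical stub; a real handler would call the /inspect endpoint."""
    return {"bbox_mm": [40.0, 25.0, 8.0]}


# Hypothetical registry mapping tool names to Python callables.
HANDLERS = {"inspect_step": inspect_step}


def run_tools(content_blocks) -> list[dict]:
    """Execute every tool_use block and return matching tool_result blocks."""
    results = []
    for block in content_blocks:
        if getattr(block, "type", None) != "tool_use":
            continue  # text blocks need no reply
        try:
            handler = HANDLERS.get(block.name)
            if handler is None:
                raise ValueError(f"unknown tool: {block.name}")
            payload, is_error = json.dumps(handler(**block.input)), False
        except Exception as exc:
            # Report the failure to the model instead of crashing the loop.
            payload, is_error = str(exc), True
        results.append({
            "type": "tool_result",
            "tool_use_id": block.id,
            "content": payload,
            "is_error": is_error,
        })
    return results


# Stand-ins for the content blocks of one assistant turn.
demo_blocks = [
    SimpleNamespace(type="text", text="Inspecting the part."),
    SimpleNamespace(type="tool_use", id="toolu_01", name="inspect_step",
                    input={"path": "housing.step"}),
    SimpleNamespace(type="tool_use", id="toolu_02", name="no_such_tool", input={}),
]
demo_results = run_tools(demo_blocks)
```

Returning errors as tool results with is_error set, rather than raising, lets the model see what went wrong and decide whether to retry or ask a person.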

Wiring it to a 12-year-old Plex install

Plex is a manufacturing ERP from the late 2000s, now owned by Rockwell Automation. The install we worked with had not seen a major upgrade since 2014. Its REST API exists but is undocumented for half the endpoints we needed. The routing data lived in a separate homegrown PostgreSQL database the firm had built in 2017 because Plex's routing module did not match how they actually sequenced operations.

We did not touch Plex. We wrote a small read/write adapter that talked to its database through a service account, plus a Postgres adapter for the routing DB. Both adapters are exposed to the agent as tools. Everything the agent writes goes into a staging table that Plex's nightly job picks up. If the staging row fails Plex validation, the agent gets the error back as a tool result and gets one retry before it pages the shop lead.
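The "one retry, then page" rule can be sketched as a small wrapper. This is a simplification under stated assumptions: in the real loop the validation error comes back to the agent as a tool result, whereas here a revise callback stands in for the agent's second attempt. StagingValidationError, write_with_one_retry, and all the callbacks are hypothetical names.

```python
from typing import Callable


class StagingValidationError(Exception):
    """Hypothetical error raised when Plex rejects a staging row."""


def write_with_one_retry(
    draft: dict,
    write_to_staging: Callable[[dict], None],
    revise: Callable[[dict, str], dict],
    page_shop_lead: Callable[[dict, str], None],
    max_retries: int = 1,
) -> bool:
    """Write a draft to staging; retry once on rejection, then escalate."""
    attempt = draft
    for retries_left in range(max_retries, -1, -1):
        try:
            write_to_staging(attempt)
            return True
        except StagingValidationError as err:
            if retries_left == 0:
                page_shop_lead(attempt, str(err))  # out of retries: hand to a human
                return False
            attempt = revise(attempt, str(err))    # stand-in for the agent's fix
    return False


# Illustrative fakes so the sketch runs on its own.
pages: list[str] = []

def fake_write(row: dict) -> None:
    if row.get("batch_size", 0) <= 0:
        raise StagingValidationError("batch_size must be positive")

fixed = write_with_one_retry(
    {"part": "housing", "batch_size": 0},
    fake_write,
    revise=lambda row, err: {**row, "batch_size": 12},
    page_shop_lead=lambda row, err: pages.append(err),
)
stuck = write_with_one_retry(
    {"part": "housing", "batch_size": 0},
    fake_write,
    revise=lambda row, err: row,  # a "fix" that changes nothing
    page_shop_lead=lambda row, err: pages.append(err),
)
```

Capping retries in code rather than in the prompt keeps a confused model from hammering the staging table.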

Warning

Resist the urge to "modernise the ERP first." Every project we have seen that started with "we'll just migrate Plex" took eighteen months and never reached the AI step. Wrap, don't replace.

The numbers, end of week twelve

The agent went live on a Monday in March. By the end of week twelve it was handling 820 drawing requests a week, which is a hair over 90 percent of total inbound. The shop floor lead reviewed every quote for the first month, then moved to spot-checking one in five.

The unit economics:

  • Per-request cost dropped from €4.12 to €0.38. About €0.21 of that €0.38 is Claude API tokens; the rest is the FastAPI service, OCR for the PDFs, and a slice of Postgres.
  • Same-day quote return rate moved from 24 percent to 87 percent.
  • Accept-rate on same-day quotes held at 38 percent. We expected it to dip because the agent quotes more aggressively. It did not.
  • The SolidWorks seat is gone. The Windows VM is gone. The Saturday on-call rota for the macro is gone.

Annualised, the cost saving on quoting alone is around €157,000. The revenue gain, from quotes that would otherwise have been late and rejected, is harder to pin down, but a conservative estimate puts it at roughly €1.1M of additional accepted work in the first year.
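The cost-saving figure follows directly from numbers already quoted in this post: roughly 42,000 requests a year and a per-request cost that fell from €4.12 to €0.38. The short check below reproduces it, along with the token share of the new cost and the ratio behind the same-day return rates. Variable names are illustrative; Decimal is used only to avoid floating-point noise in currency arithmetic.

```python
from decimal import Decimal

requests_per_year = 42_000        # annualised volume quoted earlier
cost_before = Decimal("4.12")     # EUR per request, before
cost_after = Decimal("0.38")      # EUR per request, after
token_cost = Decimal("0.21")      # EUR per request spent on API tokens

annual_saving = requests_per_year * (cost_before - cost_after)
token_share = token_cost / cost_after          # fraction of the new cost
same_day_ratio = Decimal(87) / Decimal(24)     # same-day return rate, after / before

print(f"annual saving:  EUR {annual_saving:,.0f}")
print(f"token share:    {token_share:.0%}")
print(f"same-day ratio: {same_day_ratio:.2f}x")
```

The saving comes to €157,080, tokens account for a little over half of the remaining per-request cost, and the same-day rate rose by a factor of about 3.6.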

What broke in the first three weeks

The agent was wrong about wall thickness on a thin-walled aluminium housing in week two. Our first heuristic, built on CadQuery's bounding box, missed an internal pocket. The shop lead caught it in review. We added a second tool, thin_wall_scan, that ray-casts across the model and reports the thinnest wall. The agent now calls that tool whenever the material is aluminium, magnesium, or titanium.
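To illustrate the ray-casting idea, here is a self-contained sketch that works on a plain triangle mesh rather than on CadQuery's B-Rep. It is not the shop's thin_wall_scan: the function names, the mesh representation, and the brute-force search are all assumptions. From each triangle's centroid it casts a ray along the inward normal and uses the Möller-Trumbore test to find the nearest opposite surface; the smallest such distance is the thinnest wall. Triangles must be wound so their normals point outward.

```python
import math

EPS = 1e-9

def sub(a, b): return (a[0] - b[0], a[1] - b[1], a[2] - b[2])
def dot(a, b): return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]
def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1], a[2] * b[0] - a[0] * b[2], a[0] * b[1] - a[1] * b[0])

def ray_hit(origin, direction, tri):
    """Möller-Trumbore: distance along the ray to the triangle, or None."""
    v0, v1, v2 = tri
    e1, e2 = sub(v1, v0), sub(v2, v0)
    p = cross(direction, e2)
    det = dot(e1, p)
    if abs(det) < EPS:
        return None  # ray is parallel to the triangle's plane
    s = sub(origin, v0)
    u = dot(s, p) / det
    if u < 0 or u > 1:
        return None
    q = cross(s, e1)
    v = dot(direction, q) / det
    if v < 0 or u + v > 1:
        return None
    t = dot(e2, q) / det
    return t if t > EPS else None  # ignore hits at or behind the origin

def thinnest_wall(triangles):
    """Smallest inward distance from any face centroid to the opposite surface."""
    best = math.inf
    for tri in triangles:
        n = cross(sub(tri[1], tri[0]), sub(tri[2], tri[0]))
        length = math.sqrt(dot(n, n))
        inward = (-n[0] / length, -n[1] / length, -n[2] / length)
        centroid = tuple(sum(vtx[i] for vtx in tri) / 3 for i in range(3))
        for other in triangles:
            if other is tri:
                continue
            d = ray_hit(centroid, inward, other)
            if d is not None and d < best:
                best = d
    return best

def box_mesh(dx, dy, dz):
    """Twelve outward-wound triangles for an axis-aligned box at the origin."""
    v = [(0, 0, 0), (dx, 0, 0), (dx, dy, 0), (0, dy, 0),
         (0, 0, dz), (dx, 0, dz), (dx, dy, dz), (0, dy, dz)]
    quads = [(0, 3, 2, 1), (4, 5, 6, 7), (0, 1, 5, 4),
             (2, 3, 7, 6), (0, 4, 7, 3), (1, 2, 6, 5)]
    tris = []
    for a, b, c, d in quads:
        tris.append((v[a], v[b], v[c]))
        tris.append((v[a], v[c], v[d]))
    return tris

slab = box_mesh(10.0, 10.0, 1.0)   # a 10 x 10 mm plate, 1 mm thick
thickness = thinnest_wall(slab)
flagged = thickness < 1.5          # same 1.5 mm threshold as the endpoint above
```

On the 1 mm slab the scan reports a thickness of 1 mm and flags it. The search is quadratic in the triangle count and samples one ray per triangle, so a production version would want a spatial index and denser sampling on a finely tessellated model.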

The agent was over-eager to ask clarifying questions in week one. Buyers complained. We tightened the system prompt: ask only when the geometry is genuinely ambiguous, not as a hedge. Tool-use loops will, by default, ask too much. You have to push them the other way.

The agent forwarded a quote to a buyer whose company was, that morning, in a payment dispute with the firm. Nobody had told the agent. We added check_account_status as the first tool fired on every conversation, and we now route flagged accounts to a human. The agent does not know what your CFO knows. Tell it.
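One way to make sure the account check really does fire first is to force it on the opening turn instead of relying on the prompt. The Anthropic Messages API accepts a tool_choice parameter that names a specific tool; whether the shop's deployment uses it is an assumption, as are the helper names and the shape of the status result below.

```python
FIRST_TOOL = "check_account_status"


def first_turn_request(base_request: dict) -> dict:
    """Copy the request and force the account check on the opening turn only."""
    return {**base_request, "tool_choice": {"type": "tool", "name": FIRST_TOOL}}


def route(account_status: dict) -> str:
    """Flagged accounts go to a person; everything else stays with the agent."""
    return "human" if account_status.get("flagged") else "agent"


# Illustrative request fields; tools, system prompt, and messages are omitted.
request = first_turn_request({"model": "claude-sonnet-4-5", "max_tokens": 4096})

# Hypothetical result shapes for check_account_status.
disputed = route({"flagged": True, "reason": "payment dispute"})
clean = route({"flagged": False})
```

Later turns would drop tool_choice again so the model is free to pick among the remaining tools.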

What the shop lead actually does now

Before this project, the shop floor lead spent her week copy-pasting between Teams, SolidWorks, Plex, and the routing database. She now spends it on the work the agent cannot do: walking the floor, talking to the buyers on the few RFQs the agent escalates, and renegotiating routing rules with the operators when something new walks through the door. The agent surfaced six routing rules in the first month that had drifted from what the operators were actually doing on the floor; she fixed them. Her job got harder in the interesting direction.

Why this shape of project ships now

This project would not have shipped two years ago. Tool-use loops were not reliable enough. CadQuery's STEP importer had bugs that have since been fixed. Claude's instruction-following on multi-step engineering tasks crossed a threshold somewhere in late 2025. Three earlier prototypes we ran in 2024, against the same client requirements, stalled on the same loop-reliability issue the current model handles cleanly. The fourth attempt was the one that shipped.

When we built this for the Eindhoven shop, the thing we kept running into was that the agent needed to know what a human quoter would have asked next, not what the file contained. The fix was not a smarter model. It was wrapping the firm's actual routing rules as tools and letting the loop call them. If your team spends its week assembling context before doing the real work, that gap is where an AI agent earns its keep.

The five-minute audit: count last week's inbound RFQs, tickets, or invoices. Multiply by the loaded cost of the person who handles them. If the answer is over €40,000 a year, you have a project worth scoping.
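The audit fits in a few lines. This sketch assumes "loaded cost" means the fully loaded handling cost per item and that last week was a typical week; the function name and the sample inputs are illustrative, not figures from the Eindhoven shop.

```python
def audit(items_last_week: int,
          loaded_cost_per_item_eur: float,
          threshold_eur: float = 40_000,
          weeks_per_year: int = 52) -> tuple[float, bool]:
    """Annualise last week's handling cost and compare it to the threshold."""
    annual_cost = items_last_week * weeks_per_year * loaded_cost_per_item_eur
    return annual_cost, annual_cost > threshold_eur


# Illustrative inputs: 100 items a week at EUR 9.50 each.
annual_cost, worth_scoping = audit(100, 9.50)
print(f"EUR {annual_cost:,.0f} a year, worth scoping: {worth_scoping}")
```

With those inputs the annual cost is €49,400, which clears the €40,000 bar.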

Key takeaway

Wrap the legacy ERP, do not replace it. A Claude tool-use loop that assembles RFQ context will outearn a quoting model trained to price.

FAQ

Why CadQuery instead of the SolidWorks API?

CadQuery runs anywhere Python runs, has no licensing cost, and its verb-shaped API is easy to expose as agent tools. The SolidWorks bridge required a Windows VM and a developer who knew both the SolidWorks API and the firm's routing rules.

Does the agent quote autonomously?

No. It writes to a staging table inside Plex. A human in Plex still clicks release. The agent's job is to assemble the quoting context, not to commit to a price.

What was the per-quote cost breakdown after the rebuild?

€0.38 per RFQ end to end. Roughly €0.21 is Claude API tokens. The rest is the FastAPI service, OCR for PDFs, and the slice of Postgres the routing adapter touches.

Did the accept-rate drop because the agent quotes faster?

No. Accept-rate on same-day quotes held at 38 percent and the volume of same-day quotes nearly quadrupled, so total accepted work went up sharply.

Why not migrate off Plex first?

ERP migrations take 12 to 18 months and stall AI work. Wrapping Plex with read/write adapters and a staging table got the agent live in twelve weeks without touching the ERP itself.
