Microsoft Graph quirks: 17 traps in an Outlook agent rollout
We shipped an inbox-triage agent into a 22-seat Outlook tenant in Hengelo and hit seventeen Graph quirks in three weeks. Here is the ranked list, worst first.

It is a Friday afternoon in Hengelo. The operations lead at a 22-person business-services firm is watching her inbox. The agent we shipped two days earlier has just replied to a forwarded RFP thread, and the reply landed in a new conversation instead of under the original. She opens Outlook, scrolls, and sees the same thread split into three boxes. The agent did its job. Microsoft Graph did not.
That afternoon kicked off three weeks of cataloguing every quirk the Graph, Exchange Online and Outlook REST surfaces threw at us. We ended with seventeen. The list below is ranked: the ones at the top fail silently and corrupt data; the ones at the bottom waste a developer hour and then behave. If you are pointing anything at Outlook on behalf of more than a handful of users, read down until something looks familiar.
1. conversationId rotates on long forwarded threads
The worst one. Microsoft documents conversationId as the stable handle for a thread. In practice, when a thread crosses roughly forty messages and has been forwarded outside the tenant at least once, the id rotates. The new value points at a sub-conversation Exchange invents to keep its own internal index sane. Your agent, which keyed its memory on the old id, now thinks it has never seen the customer before. It greets them politely. The customer is annoyed.
The fix is to also store internetMessageId (RFC 5322’s Message-ID header) and References for every message you index. When you see an unknown conversationId, walk the references backwards two hops; if you find a message you already know, merge.
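// Fallback for when conversationId looks new.
// Message, ThreadKey and db are the agent's own types and storage layer.
async function resolveThread(msg: Message): Promise<ThreadKey> {
  const known = await db.thread.byConvId(msg.conversationId);
  if (known) return known.key;
  // Walk the Message-ID chain, which survives the rotation.
  // Needs internetMessageHeaders in $select (quirk #6).
  for (const h of msg.internetMessageHeaders ?? []) {
    const name = h.name.toLowerCase(); // header names are case-insensitive
    if (name !== 'in-reply-to' && name !== 'references') continue;
    // References holds several whitespace-separated ids, oldest first;
    // check the two most recent, newest first.
    const ids = h.value.trim().split(/\s+/).reverse().slice(0, 2);
    for (const id of ids) {
      const parent = await db.message.byInternetId(id);
      if (parent) return parent.threadKey;
    }
  }
  return db.thread.create(msg);
}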
2. 202 Accepted that quietly drops categories
This one bit us in week two. A PATCH to /users/{shared}/messages/{id} that sets categories: ["Triaged"] returns 202 Accepted in about 120 ms. The status feels like a write confirmation. It is not. On shared mailboxes with more than about twenty-five delegated users, the categories array is dropped before the change is committed to the mailbox store. A subsequent GET shows the field empty. No 4xx, no warning header, no entry in the change log.
We only spotted it because our reconciler runs a GET after every PATCH. The workaround is to fall back to a single-user write via X-AnchorMailbox set to the delegate, then re-share via your own metadata table. Categories on shared mailboxes are per-delegate anyway (quirk #9), so the shared label was a fiction to begin with.
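The read-after-write check is small enough to sketch. This is an illustration, not our production reconciler: GraphClient is a hypothetical minimal interface standing in for whatever HTTP client you use, and patchCategoriesVerified is a name made up for this example.

```typescript
// Hypothetical minimal client shape so the sketch does not depend on a specific SDK.
interface GraphClient {
  patch(path: string, body: unknown): Promise<{ status: number }>;
  get(path: string): Promise<{ status: number; body: { categories?: string[] } }>;
}

// Returns true only when a follow-up GET shows every requested category.
async function patchCategoriesVerified(
  client: GraphClient,
  messagePath: string, // e.g. "/users/{shared}/messages/{id}"
  categories: string[],
): Promise<boolean> {
  const res = await client.patch(messagePath, { categories });
  if (res.status >= 400) return false;
  // A 2xx is not proof of the write: read the field back.
  const check = await client.get(`${messagePath}?$select=categories`);
  const stored = new Set(check.body.categories ?? []);
  return categories.every((c) => stored.has(c));
}
```

A false result is the signal to fall back to the per-delegate write described above.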
3. Throttling has undocumented sub-limits
The public throttling page lists 10,000 requests per 10 minutes per app per mailbox. What it does not list is a roughly 4 req/s burst ceiling per mailbox that fires before you hit the 10-minute window. For a triage agent that classifies on receipt, four messages arriving in the same second is normal traffic. You will get a 429 with Retry-After: 1. Honour it; do not back off exponentially, or one second of contention turns into a minute of silence.
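A minimal sketch of "honour it, do not back off exponentially", assuming your client can hand you the status code and the raw Retry-After value. ThrottleInfo, retryDelayMs and sendHonouringRetryAfter are illustrative names, and the one-second fallback is our assumption for a missing or non-numeric header.

```typescript
// Hypothetical response shape: status plus the raw Retry-After header value.
interface ThrottleInfo {
  status: number;
  retryAfter: string | null;
}

// Milliseconds to wait before retrying, or null when no retry is needed.
function retryDelayMs(res: ThrottleInfo, fallbackMs = 1000): number | null {
  if (res.status !== 429) return null;
  const raw = (res.retryAfter ?? "").trim();
  const seconds = Number(raw);
  // Retry-After may be missing or an HTTP date; fall back to a fixed wait.
  if (raw === "" || !Number.isFinite(seconds) || seconds < 0) return fallbackMs;
  return seconds * 1000;
}

// Retry loop that waits exactly as long as the server asked, with no multiplier.
async function sendHonouringRetryAfter<T extends ThrottleInfo>(
  send: () => Promise<T>,
  sleep: (ms: number) => Promise<void>,
  maxAttempts = 5,
): Promise<T> {
  let res = await send();
  for (let attempt = 1; attempt < maxAttempts; attempt++) {
    const delay = retryDelayMs(res);
    if (delay === null) return res;
    await sleep(delay);
    res = await send();
  }
  return res; // still throttled after maxAttempts; the caller decides what to do
}
```

Passing sleep in as a parameter keeps the loop testable without real timers.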
4. Delta tokens die at 30 days, with no warning
If you use $deltaToken for incremental sync (worth doing; see Microsoft’s delta query overview), the token returns 410 Gone after 30 days. Fine. But the failure mode for an agent that runs nightly is: it works for a month, then one morning re-syncs the entire mailbox. For a 50,000-message mailbox that is forty minutes of throttled GETs and a surprise on the cost dashboard. Stamp every delta token with a created-at and refresh proactively at day 25.
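The day-25 check is plain date arithmetic. A sketch, assuming you persist the creation time alongside each token; StoredDeltaToken and needsReprime are names invented for this example.

```typescript
// Hypothetical storage record: the token plus when we obtained it.
interface StoredDeltaToken {
  token: string;
  createdAt: Date;
}

const DAY_MS = 24 * 60 * 60 * 1000;

// True once the token is old enough that we should start a fresh delta run.
function needsReprime(t: StoredDeltaToken, now: Date, refreshAfterDays = 25): boolean {
  return now.getTime() - t.createdAt.getTime() >= refreshAfterDays * DAY_MS;
}
```

Run the check from a scheduled job, so the full re-sync happens at a time you chose rather than the morning the token expires.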
5. Webhook subscriptions need renewal every ~70 hours
Mail subscriptions max out at 4230 minutes, call it 70 hours. Renew at 48. Do not chain the renewal off the webhook itself; if you miss a delivery for any reason, the renewal goes with it, and the subscription dies in its sleep. Use a separate scheduled job.
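What the separate scheduled job has to compute is roughly this. The sketch assumes you track the last renewal time yourself; TrackedSubscription, dueForRenewal and renewalBody are illustrative names, and the 4200-minute lifetime is our choice of a value slightly under the ceiling to leave room for clock skew.

```typescript
const LIFETIME_MINUTES = 4200; // assumption: a little under the 4230-minute ceiling
const RENEW_AFTER_HOURS = 48;

// Hypothetical bookkeeping record kept in our own store.
interface TrackedSubscription {
  id: string;
  lastRenewedAt: Date;
}

// Subscriptions that were created or renewed at least 48 hours ago.
function dueForRenewal(subs: TrackedSubscription[], now: Date): TrackedSubscription[] {
  const cutoffMs = RENEW_AFTER_HOURS * 60 * 60 * 1000;
  return subs.filter((s) => now.getTime() - s.lastRenewedAt.getTime() >= cutoffMs);
}

// Body for PATCH /subscriptions/{id}: push the expiry forward from "now".
function renewalBody(now: Date): { expirationDateTime: string } {
  const expires = new Date(now.getTime() + LIFETIME_MINUTES * 60 * 1000);
  return { expirationDateTime: expires.toISOString() };
}
```

The scheduler calls dueForRenewal on a timer and sends one PATCH per result, independent of whether any webhook has fired.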
6. internetMessageHeaders is not returned by default
Every Graph response that says “complete message” is lying unless you $select the headers explicitly. internetMessageHeaders, parentFolderId and singleValueExtendedProperties are all opt-in. The default payload is shaped for Outlook web, not for an agent that needs the RFC fields.
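One way to stop forgetting the $select is to build message URLs in a single place. A sketch: the field list is our guess at what a triage agent needs, and messageUrl is a made-up helper. singleValueExtendedProperties is left out because, as far as we know, it is requested through $expand with a filter rather than $select.

```typescript
const GRAPH_BASE = "https://graph.microsoft.com/v1.0";

// Fields an agent typically needs beyond the default payload (an assumption; adjust to taste).
const AGENT_FIELDS = [
  "id",
  "subject",
  "conversationId",
  "internetMessageId",
  "parentFolderId",
  "internetMessageHeaders",
];

// Builds a message URL that always asks for the RFC-level fields explicitly.
function messageUrl(userId: string, messageId: string): string {
  const user = encodeURIComponent(userId);
  const msg = encodeURIComponent(messageId);
  return `${GRAPH_BASE}/users/${user}/messages/${msg}?$select=${AGENT_FIELDS.join(",")}`;
}
```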
7. immutableId needs a Prefer header
Message ids change when a message moves folders. To get a stable id, set Prefer: IdType="ImmutableId" on every request. Forget it on one endpoint and the same message will appear under two ids in your store.
8. /me and /users do not share scopes
Delegated Mail.Read on /me does not grant /users/{id}. You need Mail.Read.Shared for the delegated case and Mail.Read as an application permission for the unattended case. We saw a senior dev burn a full afternoon on a 403 because the consent screen had glossed over the difference.
9. Categories on shared mailboxes are per-delegate
Each delegate has their own masterCategories list. A category set by user A is invisible to user B unless both have a category of the same name. There is no shared category surface in Graph. Build your own table and project it back per delegate.
10. ReplyAll silently drops Bcc
The Graph /reply and /replyAll actions follow Outlook UX, which does not surface Bcc on reply. If you need Bcc to survive (audit, archive-to-CRM patterns), do not use the action. POST a new draft with toRecipients, ccRecipients and bccRecipients set explicitly, then /send.
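Building the explicit draft is mostly recipient bookkeeping. The sketch below is a hypothetical helper (buildReplyAllDraft and its types are not Graph SDK names) that produces a body you could POST as a new draft. It does not set In-Reply-To or References, so check in your own tenant whether the result threads under the original.

```typescript
interface Recipient {
  emailAddress: { address: string; name?: string };
}

// The subset of the original message this sketch needs.
interface OriginalMessage {
  subject: string;
  from: Recipient;
  toRecipients: Recipient[];
  ccRecipients: Recipient[];
}

interface DraftPayload {
  subject: string;
  body: { contentType: "HTML"; content: string };
  toRecipients: Recipient[];
  ccRecipients: Recipient[];
  bccRecipients: Recipient[];
}

// Reply-all recipients computed by hand so that Bcc survives.
function buildReplyAllDraft(
  original: OriginalMessage,
  selfAddress: string, // the mailbox the agent sends from
  html: string,
  bcc: Recipient[],
): DraftPayload {
  const self = selfAddress.toLowerCase();
  const notSelf = (r: Recipient) => r.emailAddress.address.toLowerCase() !== self;
  const subject = /^re:/i.test(original.subject) ? original.subject : `RE: ${original.subject}`;
  return {
    subject,
    body: { contentType: "HTML", content: html },
    toRecipients: [original.from, ...original.toRecipients].filter(notSelf),
    ccRecipients: original.ccRecipients.filter(notSelf),
    bccRecipients: bcc,
  };
}
```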
11. changeKey conflicts on draft updates
Every Outlook entity carries a changeKey for optimistic concurrency. PATCH a draft twice in a row from two threads in the same agent process and the second wins, or the second 412s, non-deterministically. Serialise writes per resource id.
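Serialising per resource id inside one process can be as small as a map of promise chains. A sketch under that single-process assumption; KeyedSerialQueue is an illustrative name, and it does nothing for writers in other processes.

```typescript
// Runs tasks for the same key one after another; different keys run in parallel.
class KeyedSerialQueue {
  private tails = new Map<string, Promise<void>>();

  run<T>(key: string, task: () => Promise<T>): Promise<T> {
    const prev = this.tails.get(key) ?? Promise.resolve();
    // Start this task only after the previous one for the same key has settled.
    const result = prev.then(() => task());
    // The stored tail never rejects, so one failed write does not block the next.
    const tail: Promise<void> = result.then(() => {}, () => {});
    this.tails.set(key, tail);
    // Drop the entry once the chain is idle to avoid unbounded growth.
    tail.then(() => {
      if (this.tails.get(key) === tail) this.tails.delete(key);
    });
    return result;
  }
}
```

Wrap each draft PATCH in queue.run(draftId, ...) and the two-threads race described above cannot happen within the agent process.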
12. Attachments above ~3MB need an upload session
The docs say 4MB inline. In practice, anything above 3MB occasionally rejects with 413 because base64 encoding pushes the payload past the limit. Use /createUploadSession for anything above 2.5MB and stop guessing.
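The arithmetic behind the 413 is worth seeing once. Base64 turns every 3 raw bytes into 4 characters, so a 3 MiB file is exactly 4 MiB encoded, before the JSON envelope is added. A sketch, reading "MB" as MiB; the 2.5 threshold is the one suggested above and the function names are illustrative.

```typescript
const MIB = 1024 * 1024;

// Length in characters of the padded base64 encoding of rawBytes bytes.
function base64Length(rawBytes: number): number {
  return 4 * Math.ceil(rawBytes / 3);
}

// Decide between an inline attachment and an upload session.
function needsUploadSession(rawBytes: number, thresholdBytes = 2.5 * MIB): boolean {
  return rawBytes > thresholdBytes;
}
```

For example, base64Length(3 * MIB) equals 4 * MIB, which is why files near 3 MB sit right on the request limit.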
13. Mentions require an explicit @odata.type
If you want a message body to render an @mention in Outlook, the mentions collection on the message entity needs "@odata.type": "#microsoft.graph.mention" per item. Omit it and the field is silently ignored. No 4xx.
14. Send-as needs application permissions
Send-on-behalf works with delegated Mail.Send.Shared. Send-as needs Mail.Send as an application permission plus an Exchange Online RBAC role assignment. The two are not interchangeable, and the failure mode is a sent message that arrives with the wrong From header.
15. inferenceClassification resets on move
Outlook’s Focused/Other split is exposed as inferenceClassification. Move a message between folders via Graph and the value resets to focused. If your agent moves messages, restore the classification in the same call.
16. Prefer: outlook.timezone is per-request
Set the timezone header once at startup and you will still get UTC on every subsequent request. The header is not session-sticky. Bake it into your HTTP client middleware.
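"Bake it into your middleware" can mean something as small as a wrapper that merges default headers into every call. A sketch, assuming your client exposes a send function of roughly this shape; Send and withDefaultHeaders are names made up here, and header-name matching is case-sensitive in this version.

```typescript
type HeaderMap = Record<string, string>;
type Send<R> = (path: string, headers: HeaderMap) => Promise<R>;

// Wraps a send function so every request carries the default headers.
function withDefaultHeaders<R>(send: Send<R>, defaults: HeaderMap): Send<R> {
  // Per-call headers win over defaults when the names match exactly.
  return (path, headers) => send(path, { ...defaults, ...headers });
}
```

Usage would look like withDefaultHeaders(rawSend, { Prefer: 'outlook.timezone="W. Europe Standard Time"' }), where rawSend is your existing client call.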
17. GET on a shared mailbox marks as read
Reading a message via Graph on a shared mailbox sets isRead: true, even if you only fetched the metadata. The fix is to PATCH it back, or use $select to skip the body and avoid the flip. The team in Hengelo noticed this first; the agent was burning through unread badges every morning before anyone arrived.
The pattern behind the pattern
Twelve of the seventeen quirks share one shape: the API returns 2xx, the side effect is wrong, and there is no warning. That is the part that erodes trust in agentic systems faster than anything else. The unreliable layer in an agent stack is rarely the model; it is the surface the model writes to. Microsoft Graph is one of the more honest examples. It tells you it accepted your write, and that is true, but accepted is not committed, and committed is not visible.
The defensive posture that worked for us, after three weeks of this:
- Read your own write before acting on it. Always.
- Key on internetMessageId, never on conversationId alone.
- Treat every 2xx as advisory. Reconcile against a GET.
- Keep a separate scheduled job for token, subscription and delta-token refreshes. Do not nest them inside the webhook handler.
- Log the request-id response header on every Graph call. When you open a Microsoft support ticket, it is the only thing they ask for.
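The last point costs a few lines. A sketch of a log formatter, assuming a response object with a fetch-style headers.get; GraphResponse and formatGraphCall are illustrative names.

```typescript
// Minimal shapes so the sketch works with any fetch-like client.
interface HeaderReader {
  get(name: string): string | null;
}
interface GraphResponse {
  status: number;
  headers: HeaderReader;
}

// One log line per Graph call, always including the request-id header.
function formatGraphCall(method: string, path: string, res: GraphResponse): string {
  const requestId = res.headers.get("request-id") ?? "missing";
  return `${method} ${path} -> ${res.status} request-id=${requestId}`;
}
```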
Microsoft Graph’s failure mode is the silent 2xx. If your agent is not reading back its own writes, it is lying to you on a schedule.
What we shipped, after
When we built the inbox-triage AI agent for the business-services firm in Hengelo, the real problem was that none of these quirks live in one document; they live in seventeen, half on learn.microsoft.com and half in old blog posts. We solved it by writing a reconciler that sits between every Graph call and our agent state, plus a nightly diff job that flags any mailbox where our local view has drifted from Outlook’s.
If you are about to point an agent at Outlook for the first time: write the reconciler first, then the agent. Spend an afternoon today logging the request-id header on every Graph response in your prototype; you will need it within the week.
FAQ
Does the conversationId rotation happen on internal-only threads?
We only reproduced it on threads that had been forwarded outside the tenant at least once and crossed roughly forty messages. Pure-internal long threads stayed stable in our tests.
Can application permissions read shared-mailbox categories across delegates?
Yes, but the categories array is still per-delegate. Reading via app permissions returns the mailbox owner's list, not the union across delegates. You have to project a shared view yourself.
Is the 4 req/s burst limit documented anywhere?
Not publicly. We inferred it from the Retry-After headers on 429s and confirmed it with Microsoft support on the phone. The 10,000 per 10 minutes figure is the only number written down.
How do you renew a delta token before it expires at 30 days?
Stamp every token with a created-at timestamp and trigger a fresh delta run at day 25. There is no extend endpoint; you re-prime the cursor and discard the old token.
Is send-as worth the extra RBAC setup over send-on-behalf?
Only if the From header matters to recipients. Send-on-behalf shows 'X on behalf of Y' in most clients, which is fine for internal flows but reads as suspicious externally.