Xero invoice sync delays

Started 2026-04-30 14:22 NZST · Resolved 16:08 NZST · Total 1h 46m · Affected: Xero sync (delayed, no data loss)

Resolved · Minor · 1h 46m · Xero sync

What happened

For 1h 46m, invoice sync to Xero ran up to 60 minutes behind. No data was lost.

A reference-back-fill job ran concurrently with the normal hourly sync, and together they exceeded Xero's per-organisation rate-limit. Local invoice state was accurate the whole time; what lagged was the visible-on-Xero state. All 2,400 in-flight invoices eventually synced, with idempotency enforced on the invoice id.

Impact

Who was affected, and how much.

Workspaces affected: 12 (of 91 active today)
Invoices delayed: 2,400 (all eventually synced)
Median delay: 22 min (p95: 58 min)
Data loss: 0 (zero invoices missed)

Services affected

  • Xero sync
    cron-xero-sync honouring 429s, backlog growing
    delayed
  • Web app
    No customer-visible degradation
    operational
  • Booking engine
    Invoices created and locally accurate throughout
    operational
  • Stripe Connect
    Payment capture and webhook delivery unaffected
    operational
  • API
    No elevated error rates observed
    operational

Customer impact

Zero member-visible impact. The delay was visible to operators only: Xero showed stale invoice state for up to an hour for 12 workspaces. We have not issued any SLA credits because the incident did not breach the 99.95% SLA for this calendar month.

Read our SLA policy

Timeline

What we did, in order.

Every public update we posted during the incident, plus the internal pager that opened it. Times are 2026-04-30 NZST (UTC+12).

5 updates
  1. 16:08 NZST
    Resolved

    Incident resolved · Xero sync caught up

    All 2,400 in-flight invoices have synced to Xero. The back-fill job that triggered the rate-limit has been re-throttled to half its previous batch size and scheduled overnight. We will publish a full post-mortem within 7 days; the root cause and action items below are the preliminary summary.

  2. 15:38 NZST
    Monitoring

    Backlog clearing · ~70% caught up

    Xero is accepting requests at normal throughput. Our cron-xero-sync function is draining the backlog and is currently processing approximately 480 invoices per 5-minute window. Estimated full catch-up: 30 minutes. No data loss possible — sync direction for in-flight invoices is local→Xero with idempotency on the invoice id.

  3. 15:02 NZST
    Identified

    Cause identified · rate-limit triggered by overlapping back-fill

    A KARO-440 reference back-fill job (rewriting legacy LiteHQ #N references to the canonical LiteHQ.com #N — <user-ref> shape) ran concurrently with the normal hourly sync. Together they exceeded the Xero per-organisation rate-limit of 5,000 calls/day. Xero began returning 429s with Retry-After headers; our cron honoured them but the backlog grew faster than the budget could replenish.

  4. 14:38 NZST
    Investigating

    Investigating · Xero 429 rate-limit responses elevated

    We are seeing a sustained spike in Xero 429 responses across 12 operator workspaces. Hourly cron-xero-sync is honouring Retry-After headers and not double-billing the rate-limit budget. Invoices are queued safely — they will sync once the budget recovers — but the visible-on-Xero state is lagging local state.

  5. 14:22 NZST
    Update

    Monitoring alert fired

    Sentry alert xero-sync-lag-sla-breach triggered. On-call paged (Theo). Sync lag is currently ~12 minutes against an SLA of <5 minutes; we are diagnosing.

Root cause · post-mortem

Why this happened, in three paragraphs.

1. The trigger

A KARO-440 reference back-fill job (rewriting legacy LiteHQ #N references to the canonical LiteHQ.com #N shape) was queued to run during the day. It ran concurrently with the normal hourly cron-xero-sync, and the two together pushed us over the Xero per-organisation rate-limit of 5,000 calls/day. Xero began returning 429 responses with Retry-After headers.
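To illustrate the kind of rewrite the back-fill performs, here is a minimal sketch. It assumes a legacy reference is the literal text "LiteHQ #" followed by a decimal number; the function names and the regular expression are illustrative and are not taken from the KARO-440 job itself.

```python
import re

# Assumption: legacy references look like "LiteHQ #123" (decimal N only).
LEGACY_REF = re.compile(r"\bLiteHQ #(\d+)")


def canonicalise_reference(reference: str) -> str:
    """Rewrite 'LiteHQ #N' to 'LiteHQ.com #N'.

    Already-canonical text does not match the pattern, so it is
    returned unchanged and the rewrite is safe to run more than once.
    """
    return LEGACY_REF.sub(r"LiteHQ.com #\1", reference)


def needs_backfill(reference: str) -> bool:
    """True only when the rewrite would change the reference."""
    return canonicalise_reference(reference) != reference
```

A check like `needs_backfill` lets a job skip rows that are already canonical, so it spends Xero calls only on invoices that actually change.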

2. Why no data was lost

cron-xero-sync honoured Retry-After headers correctly and never double-billed the rate-limit budget. Local invoice state remained the source of truth. Every queued invoice was idempotent on its invoice id, so when the budget recovered, the cron drained the queue without duplicates. The sync direction for in-flight invoices is local → Xero only; Xero never overwrote a local row mid-incident.
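The behaviour described above can be sketched as a small drain loop. This is an illustration under stated assumptions, not the cron-xero-sync source: `PushResult`, `SyncQueue`, `drain`, and the 60-second fallback wait are hypothetical, and the Xero call is passed in as a plain function so the sketch stays self-contained.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional

DEFAULT_WAIT_S = 60  # assumed fallback when a 429 carries no Retry-After


@dataclass
class PushResult:
    status: int                          # HTTP status from the (hypothetical) Xero client
    retry_after_s: Optional[int] = None  # Retry-After in seconds, when present


@dataclass
class SyncQueue:
    pending: list[str]                   # invoice ids waiting to sync, oldest first
    synced: set[str] = field(default_factory=set)


def drain(queue: SyncQueue, push: Callable[[str], PushResult]) -> Optional[int]:
    """Push queued invoices until the queue is empty or Xero answers 429.

    Returns None when the queue is drained, otherwise the number of
    seconds to wait before calling drain again.
    """
    while queue.pending:
        invoice_id = queue.pending[0]
        if invoice_id in queue.synced:
            # Idempotency on the invoice id: a duplicate entry is dropped
            # without spending another API call.
            queue.pending.pop(0)
            continue
        result = push(invoice_id)
        if result.status == 429:
            # Leave the invoice at the head of the queue and stop, so no
            # further calls are made before the wait has elapsed.
            if result.retry_after_s is not None:
                return result.retry_after_s
            return DEFAULT_WAIT_S
        if not 200 <= result.status < 300:
            raise RuntimeError(f"unexpected status {result.status} for {invoice_id}")
        queue.synced.add(invoice_id)
        queue.pending.pop(0)
    return None
```

Because a rate-limited invoice stays queued rather than being dropped, a 429 can delay a sync but cannot lose one, which matches what we observed during the incident.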

3. What we're changing

Back-fill jobs now run overnight only and are explicitly mutex'd against the hourly cron (already shipped). We have added a per-tenant Xero rate-limit counter so we no longer have to infer remaining budget from 429s (also shipped). And we are building an operator-facing inline banner that fires when Xero sync lag breaches SLA (in progress); until it ships, operators rely on us catching the lag on the Sentry dashboard.
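As a rough illustration of the per-tenant counter, the sketch below tracks calls spent per workspace per day and refuses a spend that would exceed the limit. It is an in-memory stand-in, not the Postgres implementation: the `TenantBudget` class, its method names, and the choice to key the counter by calendar date are all assumptions made for the example. The 5,000 figure is the per-organisation limit quoted in this report.

```python
from datetime import date

DAILY_LIMIT = 5000  # per-organisation daily limit quoted in this report


class TenantBudget:
    """In-memory stand-in for a per-tenant, per-day call counter."""

    def __init__(self, daily_limit: int = DAILY_LIMIT) -> None:
        self.daily_limit = daily_limit
        self._used: dict[tuple[str, date], int] = {}

    def remaining(self, tenant_id: str, day: date) -> int:
        """Calls still available to this tenant on the given day."""
        return self.daily_limit - self._used.get((tenant_id, day), 0)

    def try_spend(self, tenant_id: str, day: date, calls: int = 1) -> bool:
        """Reserve `calls` from the budget; refuse if they do not fit."""
        if calls > self.remaining(tenant_id, day):
            return False
        key = (tenant_id, day)
        self._used[key] = self._used.get(key, 0) + calls
        return True
```

With a counter like this, a back-fill can check `remaining` before it starts a batch and stand down while the hourly sync still needs the budget, instead of discovering the limit through 429s.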

Action items

Concrete changes, with owners and dates.

Two of these are already done. The other two are on the engineering calendar and will be linked back here as they ship.

4 items · 2 done
  • Back-fill jobs now run overnight only, never overlapping the hourly cron
    Owner · Platform / Theo · Target · 2026-04-30
    Done
  • Per-tenant Xero rate-limit budget tracked in a Postgres counter (not just inferred from 429s)
    Owner · Platform / Anya · Target · 2026-05-07
    Done
  • Operator-facing banner when Xero sync lag breaches SLA
    Owner · Frontend / Priya · Target · 2026-05-18
    In progress
  • Rate-limit-aware retry scheduler with circuit breaker on sustained 429s
    Owner · Platform / Theo · Target · 2026-05-30
    Scheduled
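The last item is still scheduled, so the following is only a sketch of the general technique: a circuit breaker that pauses calls for the Retry-After interval on any 429, and for a longer cooldown once 429s are sustained. The class name, the threshold of 3, and the 300-second cooldown are assumed values for illustration, not a committed design.

```python
from dataclasses import dataclass


@dataclass
class RateLimitBreaker:
    threshold: int = 3         # consecutive 429s that count as "sustained" (assumed)
    cooldown_s: float = 300.0  # pause once the threshold is reached (assumed)
    consecutive_429s: int = 0
    open_until: float = 0.0    # timestamp before which no call should be made

    def allow(self, now: float) -> bool:
        """True when the scheduler may make another Xero call."""
        return now >= self.open_until

    def record(self, status: int, now: float, retry_after_s: float = 0.0) -> None:
        """Update breaker state from the status of the latest response."""
        if status != 429:
            # Any non-429 response ends the streak.
            self.consecutive_429s = 0
            return
        self.consecutive_429s += 1
        # Always wait at least as long as Xero asked.
        self.open_until = max(self.open_until, now + retry_after_s)
        if self.consecutive_429s >= self.threshold:
            # Sustained 429s: open the breaker for the full cooldown and
            # start counting afresh once it closes.
            self.open_until = max(self.open_until, now + self.cooldown_s)
            self.consecutive_429s = 0
```

A scheduler would call `allow` before each request and `record` after each response. Timestamps are passed in rather than read from the clock, which keeps the breaker easy to test.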