Stripe webhook delays

Started 2026-05-14 09:42 NZST · Resolved 11:18 NZST · Total 1h 36m · Affected: webhooks (delayed)

Resolved · 1h 36m · Stripe Connect

What happened

Webhook delivery from Stripe Connect was delayed by up to 14 minutes for 1h 36m.

Our four-layer payment confirmation pipeline absorbed the delays. Bookings were still confirmed via the success-page hook and the platform webhook (Layers 1 and 2), backed by our internal scheduler. What slipped was the arrival time of confirmation emails, by 4–14 minutes for a subset of paid checkouts. There were zero double-charges and zero data loss.

Impact

Who was affected, and how much.

  • Webhooks delayed: 1,847 (across 38 operators)
  • Median delay: 6 min (p95: 14 min)
  • Bookings impacted: 362 of 4,118 today (8.8%)
  • Failed payments: 0 (zero customer-visible failures)

Services affected

  • Stripe Connect webhooks
    Layer 3 of payment pipeline — delayed up to 14 min
    delayed
  • Booking confirmation emails
    Triggered downstream of webhook receipt
    delayed
  • Web app
    No customer-visible degradation
    operational
  • Booking engine
    Slot reservation + success-page hook unaffected
    operational
  • API
    No elevated error rates observed
    operational

Customer impact

362 paid bookings received their confirmation email 4–14 minutes late. No bookings were missed and there were no double-charges. We have not issued any SLA credits because we did not breach the 99.95% SLA for this calendar month.

Read our SLA policy

Timeline

What we did, in order.

Every public update we posted during the incident, plus the internal page that opened it, listed newest first. Times are 2026-05-14 NZST (UTC+12).

6 updates
  1. 11:18 NZST
    Resolved

    Incident resolved · webhook backlog cleared

    All queued webhook events were delivered. Booking confirmation emails for paid checkouts caught up within 4 minutes of resolution. Synthetic webhook replay confirms p95 webhook age is back inside our SLA. We will publish a full post-mortem within 7 days; the root cause and action items below are the preliminary summary.

  2. 11:02 NZST
    Monitoring

    Backlog draining · 92% caught up

    Stripe restored normal delivery rates upstream. Our scheduler self-heal pipeline (Layer 4 of payment confirmation) is processing the backlog faster than new events are arriving. Estimated full catch-up: 15 minutes. No member-facing degradation reported in this interval.

  3. 10:24 NZST
    Identified

    Cause identified · upstream Stripe delivery delays

    We have confirmed with Stripe support that the delays are originating on their Connect webhook delivery pipeline. Status page link: status.stripe.com/incidents/0190fa-…. Our four-layer payment confirmation pipeline is doing exactly what it was designed for — paid bookings are still being confirmed via the success-page hook and the platform webhook (Layer 1 and Layer 2), so the impact is limited to confirmation email timing.

  4. 09:58 NZST
    Investigating

    Investigating · 4-layer payment pipeline absorbing impact

Our scheduler self-heal pipeline (the hourly cron that reconciles any missed webhooks) is being shortened to a 5-minute cadence for the duration of this incident. We have paged the Stripe partner on-call. No data loss is possible: the booking row is created before the Stripe checkout begins, and idempotency on payment_intent_id is enforced at every layer.

  5. 09:48 NZST
    Investigating

    Investigating · increased webhook latency

    We are seeing p95 webhook age climb above 4 minutes (normal: < 30 seconds). Stripe Connect webhook delivery is delayed across our fleet. We are checking whether this is platform-wide or scoped to a subset of accounts.

  6. 09:42 NZST
    Update

    Monitoring alert fired

    Sentry alert webhook-age-sla-breach triggered. PagerDuty paged the platform on-call (Anya). Initial triage in progress.
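For readers wondering what "p95 webhook age" means in the updates above, here is a minimal sketch of how such a metric could be computed. It is not our production alerting code. The function names and the nearest-rank percentile method are assumptions made for illustration, and the 240-second threshold is taken from the 09:48 update rather than from the actual alert configuration.

```python
from datetime import datetime, timedelta, timezone


def p95_age_seconds(events):
    """Nearest-rank 95th percentile of webhook age, in seconds.

    `events` is an iterable of (created_at, received_at) datetime pairs.
    "Age" here means received_at - created_at (an assumption).
    """
    ages = sorted((received - created).total_seconds() for created, received in events)
    if not ages:
        return 0.0
    # Nearest-rank method: rank = ceil(0.95 * n), using integer maths
    # to avoid floating-point rounding surprises.
    rank = (95 * len(ages) + 99) // 100
    return ages[rank - 1]


def breaches_threshold(p95_seconds, threshold_seconds=240.0):
    """True when p95 webhook age exceeds the (illustrative) threshold."""
    return p95_seconds > threshold_seconds
```

With ages that normally sit under 30 seconds, a check like this stays quiet; once a large share of events arrive minutes late, the p95 crosses the threshold and the alert fires.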

Root cause · post-mortem

Why this happened, in three paragraphs.

1. The trigger

At 09:42 NZST, Stripe's Connect webhook delivery pipeline began queuing events instead of delivering them in real time. This was a platform-wide issue inside Stripe — confirmed on status.stripe.com and via partner support. Our infrastructure did not cause the delay; we were a downstream consumer.

2. Why the impact was limited

LiteHQ confirms paid bookings via four independent layers — success-page hook, platform webhook, Connect webhook, and an hourly scheduler self-heal. With Layer 3 (Connect) delayed, Layers 1 and 2 still confirmed the bookings end-to-end. The only thing that depended on Layer 3 reaching us in real time was confirmation email timing, which arrived 4–14 minutes late for the affected 362 bookings.
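The reason several layers can confirm the same booking without double-processing is idempotency on payment_intent_id. The following is a minimal in-memory sketch of that idea, not our actual implementation: the class names, the layer identifiers, and the dictionary-backed store are all hypothetical, and a real system would enforce this with a database constraint or transaction.

```python
from dataclasses import dataclass
from typing import Dict, Optional

# Layer names follow the post-mortem; the identifiers are illustrative.
LAYERS = (
    "success_page_hook",
    "platform_webhook",
    "connect_webhook",
    "scheduler_self_heal",
)


@dataclass
class Booking:
    payment_intent_id: str
    status: str = "pending"
    confirmed_by: Optional[str] = None


class BookingStore:
    """Hypothetical store keyed by payment_intent_id."""

    def __init__(self) -> None:
        self._by_intent: Dict[str, Booking] = {}

    def create_pending(self, payment_intent_id: str) -> Booking:
        # The booking row exists before checkout begins, so every
        # layer has something to confirm.
        return self._by_intent.setdefault(payment_intent_id, Booking(payment_intent_id))

    def confirm(self, payment_intent_id: str, layer: str) -> bool:
        """Confirm once; return False if another layer got there first."""
        if layer not in LAYERS:
            raise ValueError(f"unknown layer: {layer}")
        booking = self._by_intent[payment_intent_id]
        if booking.status == "confirmed":
            return False  # duplicate delivery is a no-op
        booking.status = "confirmed"
        booking.confirmed_by = layer
        return True
```

In this sketch, a late Connect webhook arriving after the success-page hook has already confirmed the booking simply returns False and changes nothing, which is why a delayed layer cannot produce a double-charge or a duplicate booking.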

3. What we're changing

We tightened the scheduler self-heal cadence from hourly to 15 minutes by default (already shipped). We added a synthetic webhook replay to our weekly DR drill so we exercise this code path even when upstream is healthy. We are also building an operator-facing inline banner that surfaces when upstream Stripe is degraded; currently operators rely on us watching the partner status page.
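As a rough illustration of what one scheduler self-heal pass does, the sketch below scans pending bookings and picks out those that are old enough to have missed their webhook and that the payment provider reports as paid. Everything here is an assumption for illustration: the function name, the two-minute grace period, and the `is_paid` callable, which stands in for a real payment-status lookup and is not a Stripe API.

```python
from datetime import datetime, timedelta, timezone

# Default cadence after this incident, per the post-mortem.
SELF_HEAL_CADENCE = timedelta(minutes=15)


def self_heal_pass(pending, is_paid, now, grace=timedelta(minutes=2)):
    """Return payment_intent_ids that should be confirmed by this pass.

    pending: dict mapping payment_intent_id -> created_at (datetime)
    is_paid: hypothetical callable, payment_intent_id -> bool
    grace:   skip very fresh bookings whose webhook may still be in flight
             (the two-minute value is illustrative)
    """
    healed = []
    for intent_id, created_at in sorted(pending.items()):
        if now - created_at < grace:
            continue  # too recent; give the faster layers a chance first
        if is_paid(intent_id):
            healed.append(intent_id)
    return healed
```

A pass like this is cheap to run more often, which is the reasoning behind shortening the cadence: the worst-case wait for a booking whose webhook never arrives drops from about an hour to about 15 minutes.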

Action items

Concrete changes, with owners and dates.

Two of these are already done. The other three are on the engineering calendar and will be linked back here as they ship.

5 items · 2 done
  • Synthetic webhook replay added to weekly DR drill
    Owner · Platform / Anya · Target · 2026-05-25
    Done
  • Scheduler self-heal cadence shortened from hourly to 15-min by default
    Owner · Platform / Theo · Target · 2026-05-21
    Done
  • Public Stripe-status sub-status page on litehq.com/status (auto-mirrored)
    Owner · Web / Sara · Target · 2026-06-04
    In progress
  • Add operator-facing inline banner when upstream Stripe is degraded
    Owner · Frontend / Priya · Target · 2026-06-10
    In progress
  • Document the 4-layer confirmation pipeline in CUSTOMER-FAQ.md
    Owner · Docs / Marcus · Target · 2026-06-15
    Scheduled