# Webhook delivery

> How Hatched enqueues, signs, retries, and dedupes webhook deliveries — what the platform promises and what it expects from your handler.

Source: https://docs.hatched.live/docs/concepts/webhook-delivery

This page is the contract between Hatched and your backend. The
[Verify webhooks guide](/docs/guides/verify-webhooks) covers the per-framework
handler code; this one explains the _system_ delivering them.

## At-least-once delivery

Hatched stores every webhook payload in a BullMQ queue the moment the
originating event commits. Delivery is **at-least-once**: every event is
attempted, and the same `X-Hatched-Delivery` id can arrive more than once if
your endpoint returns a non-2xx response or times out while retry attempts
remain.

The implication: your handler **must be idempotent**. Hatched does not
attempt server-side delivery deduplication on your behalf because the
correct dedupe boundary is your business logic, not the HTTP layer.

The [Idempotency](/docs/guides/verify-webhooks#idempotency-in-detail)
section of the verify guide has the canonical Redis-SETNX pattern.
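
As a minimal single-process sketch of consumer-side dedupe, the handler can claim the delivery id before running any side effect and release the claim if the side effect fails. `DeliveryDeduper`, its TTL, and `handleOnce` are hypothetical names for illustration and are not part of the Hatched SDK; with more than one handler instance you would need a shared store such as the Redis pattern in the linked guide.

```typescript
// Hypothetical in-memory deduper keyed by the X-Hatched-Delivery header value.
// It only covers a single process and never purges expired entries.
class DeliveryDeduper {
  private readonly seen = new Map<string, number>(); // delivery id -> expiry (ms)
  private readonly ttlMs: number;

  constructor(ttlMs: number = 24 * 60 * 60 * 1000) {
    this.ttlMs = ttlMs;
  }

  // Returns true the first time an id is claimed, false while the claim is live.
  claim(deliveryId: string, now: number = Date.now()): boolean {
    const expiry = this.seen.get(deliveryId);
    if (expiry !== undefined && expiry > now) return false;
    this.seen.set(deliveryId, now + this.ttlMs);
    return true;
  }

  // Drops a claim so that a later retry of the same delivery is processed.
  release(deliveryId: string): void {
    this.seen.delete(deliveryId);
  }
}

// Runs the side effect at most once per delivery id.
function handleOnce(
  deduper: DeliveryDeduper,
  deliveryId: string,
  sideEffect: () => void,
): "processed" | "duplicate" {
  if (!deduper.claim(deliveryId)) return "duplicate";
  try {
    sideEffect();
    return "processed";
  } catch (err) {
    deduper.release(deliveryId); // let the platform's retry try again
    throw err;
  }
}
```

Releasing the claim on failure is a design choice: it keeps a crashed side effect from permanently swallowing the delivery, at the cost of requiring the side effect itself to tolerate a partial earlier run.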

## Retry curve

When your endpoint returns a 4xx/5xx or times out (default 10s), Hatched
re-enqueues the delivery with exponential backoff:

| Attempt     | Delay since previous |
| ----------- | -------------------- |
| 1 (initial) | —                    |
| 2           | +5 seconds           |
| 3           | +30 seconds          |
| 4 (final)   | +5 minutes           |

After the fourth attempt the delivery is marked `failed` in the delivery
log. Hatched **does not retry automatically beyond that** — the operator
can replay manually from the dashboard once the endpoint is healthy.

A 2xx response any time during the window stops retries. A 4xx terminates
faster than a 5xx because Hatched assumes the payload is structurally
unacceptable (most often: signature reject). Both still mark the delivery
`failed` after the final attempt.
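
To make the timing concrete, the delays in the table can be summed into offsets from the initial attempt. This sketch only does that arithmetic and ignores how long each attempt itself takes (up to the 10s timeout), so real offsets will be somewhat later. `RETRY_DELAYS_SECONDS` and `attemptOffsets` are illustrative names.

```typescript
// Delays before attempts 2, 3 and 4, copied from the retry table (seconds).
const RETRY_DELAYS_SECONDS: readonly number[] = [5, 30, 5 * 60];

// Offset of each attempt from the initial one, ignoring request duration.
function attemptOffsets(delays: readonly number[]): number[] {
  let elapsed = 0;
  const offsets = [elapsed]; // attempt 1 fires immediately
  for (const delay of delays) {
    elapsed += delay;
    offsets.push(elapsed);
  }
  return offsets;
}

console.log(attemptOffsets(RETRY_DELAYS_SECONDS)); // [ 0, 5, 35, 335 ]
```

Under these assumptions, an endpoint that stays down for a little over five and a half minutes after the first attempt will exhaust all four attempts.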

## Delivery id uniqueness

Webhook metadata lives in HTTP headers, not in the JSON body:

- **`X-Hatched-Event`** — the event name (`badge.awarded`, `buddy.hatched`).
- **`X-Hatched-Delivery`** — the outbound delivery id. This is the dedupe key.

The body is the raw per-event payload and does not contain a universal
`deliveryId`, `eventId`, `type` or `data` envelope. Some event payloads carry
domain ids such as `event_id`, `ledger_id`, `purchase_id` or `buddy_id`; use
those only when you intentionally want once-per-business-object semantics.

## Producer idempotency

Hatched itself dedupes on the _producer_ side using an internal
idempotency key derived from the originating action. Re-running the same
business action — for example a retried hatch on a stuck operation — will
not emit duplicate webhooks for the parts that already succeeded. This is
separate from your consumer-side dedupe and you don't need to do anything
to benefit from it.

## Ordering

Hatched does not guarantee global ordering. Two events for the same buddy
_tend_ to arrive in send order because the queue is FIFO per partition,
but cross-buddy or cross-event ordering is not reliable.

If ordering matters for your business logic, carry or compare domain-specific
timestamps/sequence ids in the payload rather than relying on arrival order.
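
One way to do that comparison is to remember the highest sequence applied per buddy and skip anything older. This is a sketch under assumptions: the `sequence` field is hypothetical (substitute whichever timestamp or sequence id your event payloads actually carry), and the in-memory `Map` stands in for your own datastore.

```typescript
// Hypothetical payload shape: `sequence` is an assumed per-buddy counter that
// only ever increases. `buddy_id` is one of the domain ids some payloads carry.
interface BuddyEvent {
  buddy_id: string;
  sequence: number;
}

const lastApplied = new Map<string, number>(); // buddy_id -> highest sequence applied

// Returns true if the event is newer than anything already applied for that buddy.
function shouldApply(event: BuddyEvent): boolean {
  const previous = lastApplied.get(event.buddy_id);
  if (previous !== undefined && event.sequence <= previous) return false;
  lastApplied.set(event.buddy_id, event.sequence);
  return true;
}
```

Events that arrive late are dropped rather than reordered, which suits "latest state wins" handlers; handlers that must apply every event in order would need to buffer instead.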

## Replay window

Each delivery carries an `X-Hatched-Timestamp` header (unix seconds) that
Hatched signs alongside the body — the `X-Hatched-Signature` HMAC is computed
over `` `${timestamp}.${rawBody}` ``. SDK adapters reject anything older than
300 seconds by default, the same tolerance Stripe and Slack use.

Consequences:

- Every (re)delivery is re-signed with a fresh `X-Hatched-Timestamp` at send
  time, so a retry minutes later and a dashboard replay both carry a current,
  valid timestamp. Hatched performs no server-side age check: the 5-minute
  window is enforced **only on your side** by the verifier (SDK adapters
  default to a 300s tolerance).
- You **must** validate the timestamp on your side. The SDK adapter does
  this automatically; manual implementations need to compare against
  `Date.now() / 1000`.

A persistent ~30s skew between Hatched and your handler signals NTP drift
on your host — fix the clock rather than widening the tolerance.
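
For manual implementations, the timestamp and signature checks can be sketched as follows. Two things here are assumptions and not stated on this page: the HMAC algorithm (SHA-256) and the signature encoding (lowercase hex). Confirm both against the verify guide before relying on this. `sign` and `verifyDelivery` are illustrative names.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

const TOLERANCE_SECONDS = 300;

// ASSUMPTION: HMAC-SHA256 with a hex digest. This page only states that the
// HMAC is computed over `${timestamp}.${rawBody}`.
function sign(secret: string, timestamp: string, rawBody: string): string {
  return createHmac("sha256", secret).update(`${timestamp}.${rawBody}`).digest("hex");
}

function verifyDelivery(
  secret: string,
  timestampHeader: string, // X-Hatched-Timestamp (unix seconds)
  signatureHeader: string, // X-Hatched-Signature
  rawBody: string, // the exact bytes received, before any JSON parsing
  nowSeconds: number = Date.now() / 1000,
): boolean {
  const timestamp = Number(timestampHeader);
  if (!Number.isFinite(timestamp)) return false;
  // Rejects stale timestamps; using abs() also rejects far-future ones,
  // which is a choice made in this sketch.
  if (Math.abs(nowSeconds - timestamp) > TOLERANCE_SECONDS) return false;

  const expected = Buffer.from(sign(secret, timestampHeader, rawBody), "utf8");
  const received = Buffer.from(signatureHeader, "utf8");
  // timingSafeEqual throws on length mismatch, so compare lengths first.
  return expected.length === received.length && timingSafeEqual(expected, received);
}
```

The signed string uses the timestamp header exactly as received, so a tampered timestamp fails the HMAC check even if it falls inside the tolerance window.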

## Delivery log

Every delivery — successful or failed — is recorded in
`webhook_delivery_logs` with:

- The masked request URL
- The signed raw payload
- Response status + body excerpt
- Attempt number
- Duration

Dashboard → Developers → Webhook deliveries surfaces this log per endpoint.
The SDK exposes it via `client.webhooks.deliveries({ endpointId })`.

## Health alerts and digest emails

Hatched rolls the delivery log into a customer-scoped health summary:

- `GET /webhook-configs/health` is available from the Webhooks settings surface
  and is not gated by analytics packaging.
- `GET /analytics/webhooks` returns the same contract for analytics dashboards
  and CSV exports.
- The summary includes active endpoint count, success rate, recent failures,
  retries in the last 24 hours, top failing events, alert severity, recommended
  action, and the last digest timestamp.
- A worker sweep runs hourly and sends at most one webhook health digest email
  per workspace per UTC day while delivery is degraded or has recent failures.

The digest links back to Settings → Webhooks so an operator can inspect recent
deliveries and replay the affected event after fixing the receiver.

## Dead-letter handling

Failed deliveries stay in the log for as long as the customer's data
retention setting allows (default 90 days). The operator can:

1. Inspect the response body to debug the handler.
2. Click **Replay** in the dashboard, or call
   `client.webhooks.replay(endpointId, deliveryId)`, after fixing the endpoint.
3. Bulk-replay a date range when migrating to a new endpoint.

There is no separate DLQ — the delivery log _is_ the DLQ.
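
Replaying several deliveries from code could look like the sketch below. Only the call shape `client.webhooks.replay(endpointId, deliveryId)` comes from this page; the `ReplayClient` interface, the `Promise` return type, and the `replayAll` helper are assumptions made so the example is self-contained.

```typescript
// Structural stand-in for the SDK client. ASSUMPTION: replay() returns a
// Promise and rejects when the replay request fails.
interface ReplayClient {
  webhooks: {
    replay(endpointId: string, deliveryId: string): Promise<unknown>;
  };
}

// Replays deliveries one at a time and reports which ids could not be replayed.
async function replayAll(
  client: ReplayClient,
  endpointId: string,
  deliveryIds: readonly string[],
): Promise<{ replayed: string[]; errored: string[] }> {
  const replayed: string[] = [];
  const errored: string[] = [];
  for (const deliveryId of deliveryIds) {
    try {
      await client.webhooks.replay(endpointId, deliveryId);
      replayed.push(deliveryId);
    } catch {
      errored.push(deliveryId); // keep going and retry these ids later
    }
  }
  return { replayed, errored };
}
```

Replaying sequentially keeps the load on a freshly repaired endpoint low; because replays reuse your normal handler, its idempotency check still applies.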

## Cause webhooks

The `cause.threshold_reached` event uses a parallel delivery system that
delivers to per-cause webhook URLs configured in the dashboard, rather than to
the customer-wide endpoints. It uses the same signing envelope and the same
replay window, but a different retry curve: 3 attempts (initial + 2 retries)
with +1s/+4s backoff, dispatched inline rather than via the BullMQ queue. The
[webhook payloads reference](/docs/reference/webhook-payloads#cause-threshold-reached)
covers the wire format.

## What this means for your handler

In summary, your endpoint needs three guarantees and one habit:

1. **Idempotent** — dedupe on `X-Hatched-Delivery` before any side effect.
2. **Signature-verifying** — never trust the body before checking the HMAC.
3. **Fast** — acknowledge with `2xx` within 10s; queue slow work.
4. **Observable** — log the delivery id and response status so you can
   correlate platform-side dashboard entries with your own traces.
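
The "fast" and "observable" points can be sketched together: record the job, log the delivery id, and return the status code without doing slow work in the request path. Everything named here (`Job`, `pending`, `acknowledge`, `drain`) is hypothetical, the in-memory array stands in for a durable queue, and signature verification and dedupe would run before the job is accepted.

```typescript
type Job = { deliveryId: string; event: string; rawBody: string };

// Stand-in for a durable queue (an assumption; an array is lost on restart).
const pending: Job[] = [];

// Accepts a delivery and returns the HTTP status to send. No slow work here.
function acknowledge(
  deliveryId: string | undefined, // X-Hatched-Delivery
  event: string | undefined, // X-Hatched-Event
  rawBody: string,
): number {
  if (!deliveryId || !event) return 400; // missing Hatched headers
  pending.push({ deliveryId, event, rawBody });
  console.log(`accepted delivery=${deliveryId} event=${event} status=200`);
  return 200;
}

// Runs later, outside the request path (worker loop, queue consumer, ...).
function drain(work: (job: Job) => void): number {
  let handled = 0;
  for (let job = pending.shift(); job !== undefined; job = pending.shift()) {
    work(job);
    handled += 1;
  }
  return handled;
}
```

Acknowledging before the work runs means a failure in `drain` is no longer visible to Hatched as a retryable error, so the queue you substitute for `pending` needs its own retry handling.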

Get those right and the platform handles the rest.
