Which HTTP methods require an Idempotency-Key header?

POST and PATCH require explicit idempotency keys because they are neither safe nor inherently idempotent by RFC 9110. PUT and DELETE are protocol-idempotent but still benefit from keys when side-effects (audit logs, event emission) must be deduplicated. GET, HEAD, OPTIONS, and TRACE never need keys — they are safe and idempotent by definition.

Can PUT requests cause duplicate side-effects even though they are idempotent?

Yes. PUT is idempotent with respect to the resource state it writes, but server implementations commonly fire audit log entries, webhook triggers, or timestamp updates on every invocation. If those side-effects must fire exactly once, the endpoint needs an explicit idempotency key regardless of the HTTP method.

What breaks when a reverse proxy strips the Idempotency-Key header?

The downstream service receives no deduplication token and treats every retried request as a new transaction. In payment pipelines this causes duplicate charges. In inventory systems it causes over-decrement. Header-stripping by WAFs, CDN edge workers, and API gateways is a leading cause of production idempotency failures.

HTTP Method Semantics & Safety

When a network failure causes a client to retry a request, the HTTP method is the first signal the infrastructure uses to decide whether that retry is safe. RFC 9110 encodes two orthogonal properties — safety (no server-side state change) and idempotency (repeated identical calls leave the same final state) — into the method itself. Getting this classification wrong is one of the most common root causes of duplicate charges, over-decremented inventory, and split-brain ledger state in distributed systems. This page maps every standard method to its RFC-defined contract, identifies where real implementations diverge from that contract, and shows how to enforce the correct boundaries at every layer from the load balancer to the application.

Guarantee model

RFC 9110 defines two independent properties:

Safe: the request must not cause any observable state change on the server. Clients and intermediaries may freely retry, cache, and pre-fetch safe requests.
Idempotent: sending the same request N times must produce the same server state as sending it once. The response may differ (e.g., a second DELETE returns 404), but the state does not.
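
A minimal in-memory sketch of the two properties, using a plain dict as a stand-in for server state. The `store` contents and the handler names are hypothetical and exist only for illustration.

```python
# Hypothetical in-memory "server": a dict keyed by resource id.
store = {"42": {"name": "widget"}}

def handle_get(resource_id):
    """Safe: reads state and never changes it."""
    return store.get(resource_id)

def handle_delete(resource_id):
    """Idempotent but not safe: the first call changes state, repeats do not."""
    if resource_id in store:
        del store[resource_id]
        return 204
    return 404

first = handle_delete("42")
state_after_first = dict(store)
second = handle_delete("42")
state_after_second = dict(store)

print(first, second)                             # 204 404
print(state_after_first == state_after_second)   # True
```

The two DELETE calls return different status codes, yet the stored state after the second call is identical to the state after the first, which is what idempotency requires.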

Method	Safe	Idempotent	Requires explicit key
GET	yes	yes	never
HEAD	yes	yes	never
OPTIONS	yes	yes	never
TRACE	yes	yes	never
PUT	no	yes	when side-effects must fire exactly once
DELETE	no	yes	when side-effects must fire exactly once
POST	no	no	always on write endpoints
PATCH	no	no	always on write endpoints

The boundary between “idempotent by protocol” and “idempotent in practice” is where most production incidents occur. PUT is protocol-idempotent, but a server that fires a webhook, writes an audit log entry, or updates a modified_at timestamp on every PUT call introduces side-effects that are not idempotent. The guarantee model breaks the moment business logic diverges from the pure resource-replacement semantics the RFC assumes.
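
The classification in the table can be expressed as a small lookup, sketched here under the assumption that the caller knows whether an endpoint has side-effects that must fire exactly once. The function name `key_policy` and its return labels (`never`, `required`, `optional`) are hypothetical.

```python
from typing import NamedTuple

class MethodContract(NamedTuple):
    safe: bool
    idempotent: bool

# Safe / idempotent flags as listed in the table above.
METHOD_CONTRACTS = {
    "GET": MethodContract(True, True),
    "HEAD": MethodContract(True, True),
    "OPTIONS": MethodContract(True, True),
    "TRACE": MethodContract(True, True),
    "PUT": MethodContract(False, True),
    "DELETE": MethodContract(False, True),
    "POST": MethodContract(False, False),
    "PATCH": MethodContract(False, False),
}

def key_policy(method: str, has_once_only_side_effects: bool = False) -> str:
    """Return 'never', 'required', or 'optional' for an HTTP method."""
    contract = METHOD_CONTRACTS[method.upper()]
    if contract.safe:
        return "never"
    if not contract.idempotent:
        return "required"
    # PUT / DELETE: protocol-idempotent, so the key only matters for side-effects.
    return "required" if has_once_only_side_effects else "optional"

print(key_policy("POST"))        # required
print(key_policy("PUT"))         # optional
print(key_policy("PUT", True))   # required
```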

Where it breaks under partition or clock skew:

A network partition between client and server can cause the client to retry a PUT while the first request is still in-flight, producing two concurrent writes that race at the storage layer.
Clock skew makes timestamp-based deduplication unreliable. A request arriving 200 ms late with an earlier Date header will appear to be the older request even though it arrived second.
Linearisable reads are required to detect an in-progress duplicate: eventually-consistent replicas will miss a key that was registered on the primary but has not yet replicated.

Why safe methods need zero deduplication overhead

Safe methods (GET, HEAD, OPTIONS, TRACE) carry a guarantee from RFC 9110: they are defined as read-only, so a client never requests a state change by sending one. The companion page “Why GET and HEAD are inherently idempotent” covers the protocol mechanics in detail. The architectural implication is direct: clients, load balancers, CDN edge nodes, and service mesh proxies can all retry these requests without any coordination. Adding an idempotency-key lookup on a GET endpoint wastes round-trip budget and adds needless complexity. It also suggests that the endpoint is secretly doing writes, a design smell worth investigating.

Enforcement rule: if your GET handler touches anything other than reads, split the mutation into a separate POST or PUT call and guard that call with an idempotency key.

Core algorithm: method-scoped request processing

The processing path diverges by method class at the earliest possible point — ideally inside the API gateway, before the request reaches application code.

Incoming request
      │
      ▼
┌─────────────────────┐
│ Method classifier   │  GET/HEAD/OPTIONS/TRACE → bypass dedup, serve or proxy
└────────┬────────────┘
         │ POST / PUT / PATCH / DELETE
         ▼
┌─────────────────────┐
│ AuthN + rate limit  │
└────────┬────────────┘
         │
         ▼
┌─────────────────────────────────────────────────────┐
│ Idempotency-Key header present?                     │
│                                                     │
│  no  → reject with 400 Bad Request (write methods) │
│  yes → atomic lookup in deduplication store         │
│         ├─ key found, status=COMPLETE → replay resp │
│         ├─ key found, status=IN_FLIGHT → 409        │
│         └─ key not found → register, continue       │
└────────┬────────────────────────────────────────────┘
         │
         ▼
  Business logic → commit → cache response → return

The IN_FLIGHT state prevents the window between key registration and business-logic completion from being exploited by a concurrent retry. Without it, two requests carrying the same key can both pass the “not found” check before either commits a result.
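
The flow in the diagram can be sketched with a single-process store in which a lock stands in for the atomic lookup. `InMemoryDedupStore` and `handle_write` are hypothetical names, and responses are modelled as `(status_code, body)` tuples for brevity.

```python
import threading

class InMemoryDedupStore:
    """Hypothetical single-process stand-in for a deduplication store."""

    def __init__(self):
        self._lock = threading.Lock()
        self._records = {}  # key -> {"status": ..., "response": ...}

    def begin(self, key):
        """Atomically look up the key and register it if it is absent."""
        with self._lock:
            record = self._records.get(key)
            if record is None:
                self._records[key] = {"status": "IN_FLIGHT", "response": None}
                return ("PROCEED", None)
            if record["status"] == "COMPLETE":
                return ("REPLAY", record["response"])
            return ("CONFLICT", None)

    def complete(self, key, response):
        """Cache the response and flip the key to COMPLETE."""
        with self._lock:
            self._records[key] = {"status": "COMPLETE", "response": response}


def handle_write(store, idempotency_key, business_logic):
    """Return (status_code, body) following the decision tree above."""
    if not idempotency_key:
        return (400, "Idempotency-Key header required")
    decision, cached = store.begin(idempotency_key)
    if decision == "REPLAY":
        return cached
    if decision == "CONFLICT":
        return (409, "a request with this key is still in flight")
    response = business_logic()
    store.complete(idempotency_key, response)
    return response
```

Because the lookup and the registration happen under one lock, two concurrent callers with the same key cannot both receive `PROCEED`. If `business_logic` raises, the key stays `IN_FLIGHT` in this sketch, so a real store also needs an expiry on in-flight records.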

[Figure: state transition for a single idempotency key across its full lifecycle]
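
One possible reading of that lifecycle, written as a transition table. The state names are the status values used elsewhere on this page (ABSENT, IN_FLIGHT, COMPLETE, FAILED); the exact set of allowed transitions, in particular what happens after FAILED, is an assumption rather than something the page specifies.

```python
# Allowed transitions (assumed). ABSENT means no record exists for the key.
ALLOWED_TRANSITIONS = {
    "ABSENT": {"IN_FLIGHT"},                         # first request registers the key
    "IN_FLIGHT": {"COMPLETE", "FAILED", "ABSENT"},   # ABSENT: in-flight record expired
    "COMPLETE": {"ABSENT"},                          # replay window expired
    "FAILED": {"ABSENT"},                            # record cleaned up
}

def transition(current: str, new: str) -> str:
    """Return the new state, or raise if the move is not allowed."""
    if new not in ALLOWED_TRANSITIONS.get(current, set()):
        raise ValueError(f"illegal transition {current} -> {new}")
    return new

state = "ABSENT"
state = transition(state, "IN_FLIGHT")
state = transition(state, "COMPLETE")
print(state)   # COMPLETE
```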

Implementation variants

Variant 1 — Redis atomic SET NX (high-throughput, eventual consistency)

Register the key with a single atomic script. Redis runs a Lua script to completion without interleaving other commands, so the check-and-set is indivisible even under concurrent requests.

-- Lua script executed atomically via EVAL; KEYS[1] is the idempotency key
local key    = KEYS[1]
local status = redis.call("GET", key)

if status == "COMPLETE" then
  return redis.call("GET", key .. ":response")   -- replay cached response
elseif status == "IN_FLIGHT" then
  return redis.error_reply("CONFLICT")
elseif status == false then
  -- NX guards the write; EX sets the value and its expiry in one command.
  -- 300 s matches the 5-minute IN_FLIGHT expiry recommended under TTL management.
  redis.call("SET", key, "IN_FLIGHT", "NX", "EX", 300)
  return "PROCEED"
end

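After business logic completes, write the serialised response and flip the state in a second script:

-- Completion script: KEYS[1] is the idempotency key, ARGV[1] the serialised response
local key = KEYS[1]
redis.call("SET", key .. ":response", ARGV[1], "EX", 86400)   -- expires together with the status key
redis.call("SET", key, "COMPLETE", "EX", 86400)               -- 24-hour replay window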

Trade-offs: sub-millisecond latency; single-node Redis is a SPOF; Cluster mode risks split-brain during network partition; no durability without AOF or RDB persistence.

Variant 2 — PostgreSQL unique constraint (strong consistency)

CREATE TABLE idempotency_keys (
  key         TEXT        PRIMARY KEY,
  status      TEXT        NOT NULL DEFAULT 'IN_FLIGHT'
                          CHECK (status IN ('IN_FLIGHT','COMPLETE','FAILED')),
  response    JSONB,
  created_at  TIMESTAMPTZ NOT NULL DEFAULT NOW(),
  expires_at  TIMESTAMPTZ NOT NULL
);

CREATE INDEX idx_idempotency_keys_expires
  ON idempotency_keys (expires_at)
  WHERE status = 'COMPLETE';

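INSERT INTO idempotency_keys (key, expires_at)
VALUES ($1, NOW() + INTERVAL '24 hours')
ON CONFLICT (key) DO NOTHING
RETURNING key;                    -- a row comes back only on a clean insert

If the INSERT returns a row, this request registered the key: proceed with business logic. If it returns no row, the key already exists, so read its state from the primary. (A DO UPDATE upsert would return a row in both cases, with status IN_FLIGHT after a clean insert, and could not tell a new key from a duplicate.)

SELECT status, response
FROM idempotency_keys
WHERE key = $1;

Then branch on the status column:

IN_FLIGHT → return 409 Conflict to the caller.
COMPLETE → return the cached response with the original status code.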

Trade-offs: serialisable isolation gives strong single-writer guarantees; connection pool exhaustion under burst write traffic is a real risk, so size pools for peak concurrency, not average load; p99 latency is 3–15 ms on local Postgres vs. 1–3 ms for Redis.
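
The unique-constraint pattern can be exercised without a running PostgreSQL server. This sketch swaps in Python's built-in `sqlite3` module and `INSERT OR IGNORE` as a stand-in for the PostgreSQL statement. The table is trimmed to three columns, and `register_key` / `complete_key` are hypothetical helper names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE idempotency_keys (
        key      TEXT PRIMARY KEY,
        status   TEXT NOT NULL DEFAULT 'IN_FLIGHT',
        response TEXT
    )
    """
)

def register_key(conn, key):
    """Return 'PROCEED', 'IN_FLIGHT', or the cached response text."""
    with conn:  # one transaction per registration attempt
        cur = conn.execute(
            "INSERT OR IGNORE INTO idempotency_keys (key) VALUES (?)", (key,)
        )
        if cur.rowcount == 1:
            return "PROCEED"  # this insert won the primary-key race
        status, response = conn.execute(
            "SELECT status, response FROM idempotency_keys WHERE key = ?", (key,)
        ).fetchone()
    return response if status == "COMPLETE" else "IN_FLIGHT"

def complete_key(conn, key, response):
    """Store the response and mark the key COMPLETE in one transaction."""
    with conn:
        conn.execute(
            "UPDATE idempotency_keys SET status = 'COMPLETE', response = ? "
            "WHERE key = ?",
            (response, key),
        )

print(register_key(conn, "order-1"))   # PROCEED
print(register_key(conn, "order-1"))   # IN_FLIGHT
complete_key(conn, "order-1", '{"id": 1}')
print(register_key(conn, "order-1"))   # {"id": 1}
```

The primary key decides the winner: only the insert that actually creates the row reports a row count of 1, and every other caller falls through to the read path.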

Variant 3 — DynamoDB conditional write (global, highly available)

import time

import boto3
from botocore.exceptions import ClientError

ddb = boto3.client("dynamodb")

def register_key(idempotency_key: str, ttl_seconds: int = 86400) -> str:
    """Returns 'PROCEED', 'IN_FLIGHT', or the cached response JSON."""
    try:
        # The condition makes the write succeed only if no item has this key yet.
        ddb.put_item(
            TableName="IdempotencyKeys",
            Item={
                "key":    {"S": idempotency_key},
                "status": {"S": "IN_FLIGHT"},
                "ttl":    {"N": str(int(time.time()) + ttl_seconds)},
            },
            ConditionExpression="attribute_not_exists(#k)",
            ExpressionAttributeNames={"#k": "key"},
        )
        return "PROCEED"
    except ClientError as e:
        if e.response["Error"]["Code"] != "ConditionalCheckFailedException":
            raise
        # The key already exists: read it back to see how far it got.
        row = ddb.get_item(
            TableName="IdempotencyKeys",
            Key={"key": {"S": idempotency_key}},
            ConsistentRead=True,       # must be strongly consistent
        ).get("Item")
        if row is None:
            # Item vanished between the write and the read (e.g. TTL deletion);
            # report IN_FLIGHT so the caller retries and registers cleanly.
            return "IN_FLIGHT"
        if row["status"]["S"] == "COMPLETE":
            return row["response"]["S"]
        return "IN_FLIGHT"

ConsistentRead=True is mandatory. An eventually-consistent read after a recent write may return a stale ABSENT state and allow a second processor to bypass deduplication.

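Trade-offs: global tables replicate keys to every region (replication latency is typically 100–300 ms cross-region), but a conditional write is checked only against the local replica, so two regions can each accept the same key before replication converges; route every request for a given key to one home region when cross-region deduplication matters. TTL expiry is eventually consistent, and records may persist up to 48 hours beyond their TTL value, so compare the ttl attribute against the current time on read. DynamoDB charges per write unit, so high-volume deduplication tables require capacity planning.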

Variant comparison

Variant	Consistency	p99 latency	Durability	Best for
Redis Lua	eventual (single-region)	1–3 ms	configurable (AOF)	high-throughput, latency-sensitive APIs
PostgreSQL unique	serialisable	3–15 ms	ACID	payment ledgers, financial APIs
DynamoDB conditional	linearisable (strong read)	5–20 ms (same-region)	managed	globally distributed platforms

Intermediary interference: where method semantics break in production

RFC compliance only holds if every intermediary between client and server honours the method and its headers. In practice, several layers routinely violate this contract.

Method override and header stripping

Framework-level overrides such as _method=PUT in form-encoded bodies, or X-HTTP-Method-Override: DELETE in stacks that tunnel every verb through POST, hide the real verb from routing layers that inspect only the wire method. Reverse proxies and WAFs configured to normalise requests may:

Strip Idempotency-Key headers as “unknown” or “potentially sensitive” custom headers.
Rewrite PATCH to POST for compatibility with HTTP/1.0 upstreams.
Buffer and re-transmit POST bodies without forwarding the original Content-Length, causing signature verification failures.

When a gateway retries a POST after stripping the Idempotency-Key header, the downstream service treats it as a distinct transaction. In a payment pipeline this is a duplicate charge. Header preservation must be an explicit gateway policy, not an assumed default.

Enforcement checklist:

Deny any X-HTTP-Method-Override header at the edge unless explicitly required.
Add Idempotency-Key to the gateway’s passthrough / allowlist header policy.
Validate Content-Length and Content-Type at the edge before forwarding write requests.
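
The checklist could be expressed as an edge-side policy function along these lines. This is a sketch, not a gateway configuration: `check_edge_policy` and `forward_headers` are hypothetical names, the allowlist contents are illustrative, and requiring a numeric Content-Length assumes the edge does not accept chunked uploads.

```python
WRITE_METHODS = {"POST", "PUT", "PATCH", "DELETE"}
BLOCKED_HEADERS = {"x-http-method-override"}
FORWARD_ALLOWLIST = {"idempotency-key", "content-type", "content-length"}

def check_edge_policy(method, headers):
    """Return (allowed, reason). Header names are compared case-insensitively."""
    names = {name.lower(): value for name, value in headers.items()}
    for blocked in BLOCKED_HEADERS:
        if blocked in names:
            return (False, f"{blocked} is not allowed")
    if method.upper() in WRITE_METHODS:
        if "content-type" not in names:
            return (False, "Content-Type is required on write requests")
        if not names.get("content-length", "").isdigit():
            return (False, "Content-Length must be a non-negative integer")
    return (True, "ok")

def forward_headers(headers):
    """Keep only allowlisted headers; Idempotency-Key is listed explicitly."""
    return {k: v for k, v in headers.items() if k.lower() in FORWARD_ALLOWLIST}

request_headers = {
    "Idempotency-Key": "3f2b6c1e-9d4a-4c7b-8f21-0a1b2c3d4e5f",
    "Content-Type": "application/json",
    "Content-Length": "42",
    "X-Debug": "1",
}
print(check_edge_policy("POST", request_headers))   # (True, 'ok')
print(sorted(forward_headers(request_headers)))
```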

Body buffering and stream consumption

Validating that a retried request carries the same payload as the original requires hashing the body. Streaming architectures make this non-trivial.

Node.js: the incoming stream can only be consumed once. Use a Transform that hashes each chunk as it passes through and re-emits the same bytes downstream; the SHA-256 digest becomes available on the transform once the stream has ended:

const crypto = require("crypto");
const { Transform } = require("stream");

function hashingTransform() {
  const hasher = crypto.createHash("sha256");
  return new Transform({
    transform(chunk, _enc, cb) {
      hasher.update(chunk);
      this.push(chunk);
      cb();
    },
    flush(cb) {
      this.digest = hasher.digest("hex");
      cb();
    },
  });
}

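Go net/http: io.TeeReader feeds the hasher as the body is read. To have the digest ready before routing, the body must be read to the end, so it is copied into a buffer and handed back to downstream handlers; cap the size with http.MaxBytesReader before calling this on untrusted input:

import (
    "bytes"
    "crypto/sha256"
    "encoding/hex"
    "io"
    "net/http"
)

func bodyDigest(r *http.Request) (string, error) {
    h := sha256.New()
    var buf bytes.Buffer
    // Copy the body into buf while the TeeReader mirrors every byte into the hasher
    if _, err := io.Copy(&buf, io.TeeReader(r.Body, h)); err != nil {
        return "", err
    }
    if err := r.Body.Close(); err != nil {
        return "", err
    }
    // Replace Body with the buffered copy so downstream handlers can still read it
    r.Body = io.NopCloser(&buf)
    return hex.EncodeToString(h.Sum(nil)), nil
}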

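Spring MVC: ContentCachingRequestWrapper buffers the entire body into memory and exposes it only after a handler has read the stream, which is too late for a pre-routing check. The wrapper below hashes the body with a DigestInputStream in the same pass that buffers it. It still holds the whole payload in memory, so enforce a request-size limit at the edge for multi-megabyte bodies:

import javax.servlet.ReadListener;
import javax.servlet.ServletInputStream;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletRequestWrapper;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.security.DigestInputStream;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class DigestRequestWrapper extends HttpServletRequestWrapper {
    private final byte[] body;
    private final String sha256;

    public DigestRequestWrapper(HttpServletRequest request)
            throws IOException, NoSuchAlgorithmException {
        super(request);
        MessageDigest md = MessageDigest.getInstance("SHA-256");
        // Hash the bytes in the same pass that buffers them
        try (DigestInputStream dis = new DigestInputStream(
                request.getInputStream(), md)) {
            body = dis.readAllBytes();
        }
        StringBuilder hex = new StringBuilder();
        for (byte b : md.digest()) {
            hex.append(String.format("%02x", b));
        }
        sha256 = hex.toString();
    }

    @Override
    public ServletInputStream getInputStream() {
        // Serve every later read from the buffered copy
        ByteArrayInputStream bais = new ByteArrayInputStream(body);
        return new ServletInputStream() {
            @Override public int read() { return bais.read(); }
            @Override public boolean isFinished() { return bais.available() == 0; }
            @Override public boolean isReady() { return true; }
            @Override public void setReadListener(ReadListener listener) {
                throw new UnsupportedOperationException();
            }
        };
    }

    public String getSha256() { return sha256; }
}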
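
Whichever stack computes the digest, the comparison step is the same: store the digest next to the key on first use and compare it when the key is seen again. A minimal sketch of that step, assuming the body is available as an iterable of byte chunks. `body_digest` and `check_replay` are hypothetical names, and how a mismatch is reported to the client is left to the caller.

```python
import hashlib
import hmac

def body_digest(chunks):
    """Hash an iterable of byte chunks incrementally, without joining them."""
    hasher = hashlib.sha256()
    for chunk in chunks:
        hasher.update(chunk)
    return hasher.hexdigest()

def check_replay(stored_digest, retry_chunks):
    """Return 'MATCH' if the retry carries the original payload, else 'MISMATCH'."""
    retry_digest = body_digest(retry_chunks)
    if hmac.compare_digest(stored_digest, retry_digest):
        return "MATCH"
    return "MISMATCH"

original = body_digest([b'{"amount":', b' 100}'])
print(check_replay(original, [b'{"amount": 100}']))   # MATCH
print(check_replay(original, [b'{"amount": 900}']))   # MISMATCH
```

The digest depends only on the byte sequence, not on how the body was chunked in transit, so a retry that arrives in different chunk sizes still matches.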

Middleware must always place idempotency validation after authentication and rate-limiting but before business logic. The correct pipeline order is:

Routing → AuthN → Rate limiting → Idempotency check → Business logic → Response cache

Placing idempotency before AuthN exposes the deduplication store to unauthenticated abuse. Placing it after business logic means the logic already ran — the key arrives too late.
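
The ordering argument can be made concrete with a toy middleware chain. This sketch keeps only the AuthN and idempotency stages (routing and rate limiting are omitted), uses a plain dict as the deduplication store, and skips the IN_FLIGHT state; all names are hypothetical.

```python
def build_pipeline(stages, business_logic):
    """Wrap business_logic so that the first stage in `stages` runs first."""
    handler = business_logic
    for stage in reversed(stages):
        handler = stage(handler)
    return handler

def authn(next_handler):
    def handle(request):
        if not request.get("user"):
            return (401, "unauthenticated")   # rejected before any dedup lookup
        return next_handler(request)
    return handle

def idempotency(store):
    def stage(next_handler):
        def handle(request):
            key = request["idempotency_key"]
            if key in store:
                return store[key]             # replay the cached response
            response = next_handler(request)
            store[key] = response
            return response
        return handle
    return stage

dedup_store = {}
executions = []

def create_order(request):
    executions.append(request["idempotency_key"])
    return (201, "created")

pipeline = build_pipeline([authn, idempotency(dedup_store)], create_order)
```

An unauthenticated request is turned away by `authn` and never touches `dedup_store`, while an authenticated retry with the same key is answered from the store without running `create_order` a second time.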

Edge cases & failure scenarios

Failure Scenario	Remediation Steps	Observability Hooks
Gateway strips `Idempotency-Key` header on retry	Audit gateway header-passthrough policy; add `Idempotency-Key` to explicit allowlist; log stripped headers at the edge	Edge access log field `x_stripped_headers`; alert when count > 0 per minute
Deduplication store returns stale ABSENT for an IN_FLIGHT key (replica lag)	Use `ConsistentRead=True` (DynamoDB) or read from the primary (Postgres; a synchronous standby is only guaranteed current when `synchronous_commit = remote_apply`); never read dedup state from async replicas	Replica lag metric `replication_delay_seconds`; alert at 500 ms
Partial write: business logic commits but response-cache write fails	Store idempotency key and response in the same atomic transaction (same DB) or use a two-phase write with explicit retry; never rely on two separate writes both succeeding	`idempotency.cache_write_failure_total` counter; alert if non-zero over 5 min
Concurrent `POST` requests with same key arrive within milliseconds	Rely on atomic SET NX / unique-constraint INSERT; return `409 Conflict` to the second caller; the client must poll or wait and retry after the first completes	`idempotency.concurrent_conflict_total`; spike indicates client retry loop misconfiguration
Client generates non-unique keys (e.g., using a timestamp with second resolution)	Reject keys shorter than 128-bit entropy at the edge (validate UUID format or length ≥ 22 chars base64); log and alert on rejection	`idempotency.key_format_rejected_total`; feed into idempotency key generation strategy runbook
PUT side-effects (webhooks, audit logs) fire on every retry	Wrap side-effect emission in the same idempotency guard as the resource write; check key status before emitting; record emitted events by key	`side_effect.duplicate_emission_total` per event type; alert if non-zero
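
The key-format rule in the table (UUID format, or at least 22 base64 characters) could be checked at the edge roughly as follows. The function name and the accepted alphabet are assumptions. A length check is only a proxy for entropy, so a long but predictable key still passes.

```python
import re
import uuid

# 22 or more base64 / base64url characters, with optional padding.
_TOKEN = re.compile(r"[A-Za-z0-9+/_-]{22,}={0,2}")

def is_acceptable_key(key: str) -> bool:
    """Accept a canonical UUID string or a base64-style token of 22+ characters."""
    try:
        if str(uuid.UUID(key)) == key.lower():
            return True
    except ValueError:
        pass  # not a UUID; fall through to the token check
    return bool(_TOKEN.fullmatch(key))

print(is_acceptable_key(str(uuid.uuid4())))   # True
print(is_acceptable_key("1700000000"))        # False
```

The second call models the timestamp-with-second-resolution case from the table: ten digits are far too short to be unique across clients, so the key is rejected.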

Operational concerns

TTL management

Short TTL (15–30 minutes): minimal storage cost, but if the client’s retry window exceeds the TTL during a prolonged network outage, a retry arrives after the key has expired and is executed as a new request. Use only when client retry windows are bounded by contract (e.g., mobile SDKs with a 10-minute retry budget).
Standard TTL (24 hours): covers virtually all client retry scenarios including overnight batch jobs. Recommended default for payment and booking APIs.
Extended TTL (72 hours): required for workflows where human approval can delay the retry (e.g., fraud review queues). Adds 3× storage cost; requires automated cleanup to prevent unbounded growth.

Set a hard expiry on IN_FLIGHT keys independently from COMPLETE keys. An IN_FLIGHT record that never transitions (because the worker crashed) will block all retries indefinitely. Expire IN_FLIGHT records after 5 minutes and treat the next retry as a fresh attempt.
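
A sketch of how a lookup might treat an existing record once the 5-minute in-flight expiry is taken into account. The record shape (`status`, `registered_at` in epoch seconds) and the decision labels are hypothetical. In a real store the take-over itself must be an atomic compare-and-set, so that two retries cannot both claim the stale key.

```python
IN_FLIGHT_TTL_SECONDS = 5 * 60

def resolve_existing(record, now):
    """Decide what to do with an existing record at time `now` (epoch seconds)."""
    if record["status"] == "COMPLETE":
        return "REPLAY"
    age = now - record["registered_at"]
    if record["status"] == "IN_FLIGHT" and age >= IN_FLIGHT_TTL_SECONDS:
        return "TAKE_OVER"   # worker presumed dead; treat the retry as fresh
    return "CONFLICT"

stuck = {"status": "IN_FLIGHT", "registered_at": 1_000}
print(resolve_existing(stuck, now=1_000 + 60))    # CONFLICT
print(resolve_existing(stuck, now=1_000 + 300))   # TAKE_OVER
```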

Index strategy

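For PostgreSQL: the primary key index on key is sufficient for lookups. Add a partial index on expires_at WHERE status = 'COMPLETE' so the TTL cleanup job runs as an index scan rather than a full table scan. The planner only uses a partial index when the query repeats its predicate, so write the job as DELETE FROM idempotency_keys WHERE status = 'COMPLETE' AND expires_at < NOW(), and expire stuck IN_FLIGHT rows separately. At 10,000 write RPS with 24-hour TTLs, the table gains roughly 864 million rows per day, so run the cleanup job every 15 minutes.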

Memory and storage budgeting

Backend	Key size (bytes)	Response cache size	Keys/GB
Redis string	~100 (UUID + metadata)	up to response size	~10M keys/GB (keys only)
PostgreSQL row	~200	JSONB (variable)	~5M rows/GB
DynamoDB item	~400 (with GSI)	up to 400 KB/item	~2.5M items/GB

Size the deduplication store for peak_rps × ttl_seconds × avg_item_bytes. At 5,000 RPS with 24-hour TTLs and 200-byte PostgreSQL rows: 5000 × 86400 × 200 ≈ 86 GB — plan for this in your database capacity.
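
The sizing formula as a small helper, reproducing the worked figure above and the row count quoted under Index strategy. The function name is hypothetical, and the result ignores index and storage-engine overhead.

```python
def dedup_store_bytes(peak_rps: int, ttl_seconds: int, avg_item_bytes: int) -> int:
    """Steady-state size: every key written within one TTL window is still stored."""
    return peak_rps * ttl_seconds * avg_item_bytes

DAY_SECONDS = 24 * 60 * 60

size = dedup_store_bytes(5_000, DAY_SECONDS, 200)
rows_per_day = 10_000 * DAY_SECONDS

print(size)           # 86400000000 bytes, i.e. about 86 GB
print(rows_per_day)   # 864000000 rows
```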

SRE alert thresholds

idempotency.key_lookup_p99_ms > 10 — dedup store under pressure; check index health.
idempotency.conflict_rate > 1% of write traffic — client retry configuration is misfiring; review retry logic and backoff fundamentals.
idempotency.replay_rate drops to 0% despite known retries — dedup store reads failing silently; check error logs.
idempotency.inflight_keys_stuck > 0 after 10 minutes — worker crash leaving orphaned IN_FLIGHT records; trigger the TTL cleanup job and page on-call.
idempotency.store_error_rate > 0.1% — fail-open vs. fail-closed policy decision point; default to fail-closed (reject the request) for financial APIs.
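
The thresholds above can be read as a rule set. The sketch below evaluates them against a snapshot of metric values. The dictionary keys mirror the metric names without the `idempotency.` prefix, `known_retries` is a hypothetical input, and `inflight_keys_stuck` is assumed to already count only keys older than 10 minutes.

```python
def firing_alerts(metrics: dict) -> list:
    """Return the names of the alerts whose threshold is crossed."""
    alerts = []
    if metrics["key_lookup_p99_ms"] > 10:
        alerts.append("key_lookup_p99_ms")
    if metrics["conflict_rate"] > 0.01:            # more than 1% of write traffic
        alerts.append("conflict_rate")
    if metrics["replay_rate"] == 0 and metrics["known_retries"] > 0:
        alerts.append("replay_rate")
    if metrics["inflight_keys_stuck"] > 0:
        alerts.append("inflight_keys_stuck")
    if metrics["store_error_rate"] > 0.001:        # more than 0.1%
        alerts.append("store_error_rate")
    return alerts

healthy = {
    "key_lookup_p99_ms": 4,
    "conflict_rate": 0.002,
    "replay_rate": 0.03,
    "known_retries": 120,
    "inflight_keys_stuck": 0,
    "store_error_rate": 0.0,
}
print(firing_alerts(healthy))   # []
```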

Idempotency Fundamentals & API Guarantees — parent: the full guarantee model, failure boundary map, and production readiness checklist for idempotency across distributed systems.
Why GET and HEAD are inherently idempotent — deep dive into the protocol mechanics that make safe methods retry-free.
Idempotency Key Generation Strategies — how to generate keys with sufficient entropy, namespace isolation, and deterministic-vs-random trade-offs.
Retry Logic & Backoff Fundamentals — exponential backoff, jitter, and how retry dynamics interact with idempotency key TTLs.
Redis Cache-Based Deduplication — production patterns for the Redis SET NX approach including Lua scripting, key namespacing, and cluster-mode caveats.
Database Unique Constraints & Upserts — PostgreSQL and MySQL implementations of constraint-based deduplication with index design and migration strategies.