Does Redis SETNX guarantee exactly-once processing?

Within a single Redis primary node, SET with NX provides a linearizable, exactly-once claim: only one caller can create a given key. End-to-end exactly-once processing additionally requires that the business logic is atomic and that a FAILED state is stored when a downstream write fails; otherwise the key blocks retries until its TTL expires.

What TTL should I set on idempotency keys?

Align the TTL to your client's maximum retry window plus a safety margin. For synchronous payment APIs, 86400 seconds (24 hours) covers most client retry strategies. Asynchronous job pipelines may require 604800 seconds (7 days). Never set TTL below the downstream service's p99 timeout.

How do I handle Redis unavailability without losing idempotency?

Implement a circuit breaker that routes traffic to a fallback database deduplication path on Redis errors. Use a short open-circuit timeout (5–10 seconds) and emit a dedup_fallback_total metric whenever the fallback path activates.
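
A minimal sketch of such a circuit breaker, assuming the Redis claim and the database claim are supplied as plain callables. The class name DedupCircuitBreaker, the on_fallback hook (where dedup_fallback_total would be incremented), and the injectable clock are illustrative and not part of any library.

```python
import time
from typing import Callable, Tuple, Type


class DedupCircuitBreaker:
    """Routes claims to a fallback path while Redis is failing (illustrative)."""

    def __init__(
        self,
        redis_claim: Callable[[str], object],
        db_claim: Callable[[str], object],
        open_seconds: float = 5.0,
        on_fallback: Callable[[], None] = lambda: None,
        clock: Callable[[], float] = time.monotonic,
        # Assumption: narrow this to redis.exceptions.ConnectionError and
        # redis.exceptions.TimeoutError in a real deployment.
        redis_errors: Tuple[Type[BaseException], ...] = (Exception,),
    ) -> None:
        self.redis_claim = redis_claim
        self.db_claim = db_claim
        self.open_seconds = open_seconds
        self.on_fallback = on_fallback
        self.clock = clock
        self.redis_errors = redis_errors
        self.opened_at = None  # None means the circuit is closed

    def claim(self, key: str):
        now = self.clock()
        # While the circuit is open, skip Redis entirely.
        if self.opened_at is not None and now - self.opened_at < self.open_seconds:
            self.on_fallback()
            return self.db_claim(key)
        try:
            result = self.redis_claim(key)
            self.opened_at = None  # Redis answered: close the circuit
            return result
        except self.redis_errors:
            self.opened_at = now  # open (or re-open) the circuit
            self.on_fallback()
            return self.db_claim(key)
```

Once the open-circuit timeout elapses, the next call probes Redis again; a failure re-opens the circuit for another interval.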

Redis & Cache-Based Deduplication

When clients implement exponential backoff with jitter and a downstream service crashes mid-request, the same payload can arrive two, five, or fifty times before the client gives up. Redis-based deduplication intercepts those duplicates at the in-memory layer — before they reach your database, payment processor, or event broker — by treating each request fingerprint as a single-use claim token. This page covers the guarantee model, the core algorithm, four implementation variants with production code, a failure scenarios table, and the operational knobs that keep the mechanism reliable under memory pressure and node failover.

Guarantee Model

Redis SET key value NX EX ttl provides linearizable exactly-once claim semantics on a single primary node. The first caller that issues the command for a given key wins; for every subsequent caller during the TTL window the command returns nil, and the handler returns the stored response instead of running business logic.

Where the guarantee holds:

Concurrent requests hitting the same API gateway pod — Redis serializes them.
Client retries after a 5xx timeout — the claim key already exists; the cached result is returned.
Duplicate webhook deliveries within the deduplication window — intercepted before the handler fires.

Where the guarantee breaks:

Async replica lag during failover. If the primary fails before replicating the SET, the promoted replica does not know the key was claimed. The first post-failover request for that key will be processed a second time.
Clock skew and very short TTLs. Redis expires keys against its own server clock, so a client whose timers drift can send a retry a few milliseconds after the key has expired; with a very short TTL the guard is then silently absent.
Memory eviction. If Redis evicts the key under allkeys-lru pressure before the TTL expires, subsequent requests will be treated as new.

The practical response to all three failure modes is the same: journal the idempotency state to a durable store asynchronously, so that Redis is the fast path and the database is the authoritative fallback. Idempotency Key Storage & TTL Management covers the durable-store side of this contract in detail.

Core Algorithm

The deduplication flow is a five-step state machine: derive a deterministic key, claim it atomically, run the business logic, store the terminal state, and mark the key FAILED if the logic raises.

The critical constraint: the PROCESSING state must have a short TTL (30 seconds) and the COMPLETED state a long TTL (86400 seconds). This prevents a crash after the claim but before the terminal state is written from permanently blocking retries, while ensuring successful responses are returned verbatim across the full client retry window.

Step-by-step in Python

import hashlib, json, redis

client = redis.Redis(host="localhost", decode_responses=True)

def handle_request(tenant_id: str, payload: dict) -> dict:
    # Step 1 – deterministic key
    canon = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    fingerprint = hashlib.sha256(canon.encode()).hexdigest()
    key = f"idemp:{tenant_id}:{fingerprint}"

    # Step 2 – atomic claim (NX = only if absent, EX = TTL in seconds)
    claimed = client.set(key, "PROCESSING", nx=True, ex=30)

    if not claimed:
        # Duplicate — return whatever state is stored
        existing = client.get(key)
        if existing and existing not in ("PROCESSING", "FAILED"):
            return json.loads(existing)
        # Still PROCESSING or FAILED: surface appropriate status to caller
        return {"status": existing or "UNKNOWN"}

    try:
        # Step 3 – business logic (your DB write, payment call, etc.)
        result = execute_business_logic(payload)

        # Step 4 – store terminal state with long TTL
        client.set(key, json.dumps(result), ex=86400)
        return result
    except Exception:
        # Step 5 – mark failed with a bounded TTL so retries are blocked
        # for at most 300 seconds rather than indefinitely
        client.set(key, "FAILED", ex=300)
        raise

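The Lua script equivalent folds the claim of step 2 and the lookup of any existing state into a single server-side round-trip. The SET NX in the Python code is already atomic; what the script removes is the small window between a failed SET NX and the follow-up GET, during which the key can expire and leave the caller with an UNKNOWN status: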

-- KEYS[1] = idempotency key
-- ARGV[1] = processing TTL (seconds)
local existing = redis.call('GET', KEYS[1])
if existing then
    return existing
end
redis.call('SET', KEYS[1], 'PROCESSING', 'EX', ARGV[1])
return nil
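
One way to call this script from Python is redis-py's register_script, which returns a callable that takes keys and args. The sketch below assumes client is a redis.Redis instance created with decode_responses=True, as in the Python example above; the helper name make_claimer is hypothetical.

```python
CLAIM_LUA = """
local existing = redis.call('GET', KEYS[1])
if existing then
    return existing
end
redis.call('SET', KEYS[1], 'PROCESSING', 'EX', ARGV[1])
return nil
"""


def make_claimer(client):
    """Register the claim script once and return a claim(key, ttl) helper."""
    script = client.register_script(CLAIM_LUA)

    def claim(key: str, processing_ttl: int = 30):
        # Returns None when the caller now owns the claim,
        # otherwise the state already stored under the key.
        return script(keys=[key], args=[processing_ttl])

    return claim
```

A None result means the caller should run the business logic; any other value is the stored state to return to the client.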

Implementation Variants

Variant 1 — Atomic SET NX EX (single-node Redis)

The simplest and most common form. A single SET key value NX EX ttl command provides atomic claim semantics on one Redis primary. Best for monolithic or single-region deployments where sub-millisecond latency matters more than cross-node durability.

# Claim in one round-trip
ok = client.set(f"idemp:{key}", "PROCESSING", nx=True, ex=30)

Variant 2 — Redlock (multi-node quorum)

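Distributed lock acquisition via Redlock spreads the claim across N independent Redis nodes (typically 5; the example below uses 3), requiring a majority quorum before granting the lock. This tolerates the failure of a minority of nodes. Note that a lock only excludes concurrent execution: because it is released once processing finishes, the terminal state must also be stored (as in the core algorithm) for later retries to be recognized as duplicates.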

from redlock import Redlock

dlm = Redlock([
    {"host": "redis-1", "port": 6379},
    {"host": "redis-2", "port": 6379},
    {"host": "redis-3", "port": 6379},
])

lock = dlm.lock(f"idemp:{key}", 30_000)  # TTL in ms
if lock:
    try:
        result = execute_business_logic(payload)
    finally:
        dlm.unlock(lock)

Variant 3 — Lua scripted claim + journal

Uses a Lua script to atomically claim the key and immediately enqueue a journal write to a persistent store (via a Redis LPUSH to a journaling queue). A background worker drains the queue to PostgreSQL, providing durable state without synchronous dual-writes.

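-- KEYS[1] = idempotency key
-- KEYS[2] = journal list (e.g. dedup_journal); declared as a key so the
--           script stays valid on Redis Cluster, where both keys must
--           share a hash slot
-- ARGV[1] = processing TTL (seconds)
-- Atomic claim + enqueue journal entry in one round-trip
local existing = redis.call('GET', KEYS[1])
if existing then return existing end
redis.call('SET', KEYS[1], 'PROCESSING', 'EX', ARGV[1])
redis.call('LPUSH', KEYS[2], KEYS[1])
return nil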
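
The background worker side might look like the following sketch. It assumes the journal list is named dedup_journal, that client exposes redis-py's rpop and get, and that upsert is a hypothetical callable that writes one row to PostgreSQL; none of these names are prescribed by the script itself.

```python
from typing import Callable, Optional


def drain_journal(
    client,
    upsert: Callable[[str, Optional[str]], None],
    queue: str = "dedup_journal",
    max_items: int = 100,
) -> int:
    """Move up to max_items journal entries from Redis to the durable store."""
    drained = 0
    while drained < max_items:
        # LPUSH + RPOP gives first-in, first-out ordering.
        key = client.rpop(queue)
        if key is None:
            break  # queue is empty
        # The key may already have expired, in which case state is None.
        state = client.get(key)
        upsert(key, state)
        drained += 1
    return drained
```

A crash between rpop and upsert loses that entry, so a production worker would likely need a reliable-queue pattern (for example, moving entries to an in-flight list first) rather than a bare pop.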

Variant 4 — Read-through with database fallback

On a Redis miss (cold start or post-eviction), the handler checks the database before processing. This prevents duplicate execution after Redis key eviction under memory pressure.

def safe_claim(key: str) -> str | None:
    """Returns None if claim is fresh, else returns stored state."""
    # Fast path: Redis
    existing = client.get(key)
    if existing:
        return existing
    # Slow path: database (handles post-eviction re-checks)
    row = db.query("SELECT state FROM idempotency_keys WHERE key = %s", key)
    if row:
        # Warm Redis from DB to restore fast path
        client.set(key, row.state, ex=3600)
        return row.state
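    # Genuine new request: only the caller whose SET NX succeeds owns the claim
    if client.set(key, "PROCESSING", nx=True, ex=30):
        return None
    # Lost the race to a concurrent caller; report its state instead
    return client.get(key) or "PROCESSING"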

Variant comparison

Variant	Consistency	Latency overhead	Survives node failure	Complexity
SET NX EX (single-node)	Linearizable on primary	~1 ms	No (replication lag)	Low
Redlock (quorum)	Strong majority consensus	~3–5 ms	Yes (majority alive)	Medium
Lua script + journal	Linearizable + async durable	~1 ms + async	Yes (journal covers gap)	Medium
Read-through + DB fallback	Eventual (warm) → strong (cold)	~1 ms / ~5 ms	Yes (DB covers eviction)	Medium–High

Key Derivation & Namespace Isolation

Deterministic hashing is the first line of defence against cache collisions. SHA-256 maps normalized payload bytes to a 256-bit fingerprint. BLAKE3 offers equivalent collision resistance at 2–3x higher throughput for CPU-bound key generation.

Normalization must strip fields that vary legitimately between retries — timestamps, User-Agent headers, request trace IDs — while preserving business-critical identifiers: amount, currency, account number, and the client-supplied X-Idempotency-Key value.

import hashlib, json

STRIP_FIELDS = {"timestamp", "request_id", "trace_id", "user_agent"}

def derive_key(tenant_id: str, env: str, payload: dict) -> str:
    normalized = {k: v for k, v in payload.items() if k not in STRIP_FIELDS}
    canon = json.dumps(normalized, sort_keys=True, separators=(",", ":"))
    digest = hashlib.sha256(canon.encode()).hexdigest()
    return f"idemp:{tenant_id}:{env}:{digest}"

The idemp:{tenant_id}:{env}:{hash} namespace pattern enforces three guarantees:

Tenant isolation — keys from tenant A never collide with tenant B’s keys.
Environment segregation — staging and production keys are structurally distinct even on shared Redis instances.
Collision resistance — 256-bit entropy makes accidental hash collisions computationally negligible at any realistic request volume.

High-cardinality endpoints (bulk payment batch submissions with thousands of line items) require special handling: hash each line item independently under its own sub-key rather than hashing the entire batch payload, which would create a single point of contention and a monster key.
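
A sketch of per-line-item key derivation for a batch, under two assumptions that are not stated in the text above: the batch has a client-supplied batch_id, and each item carries its own identifier (here line_id) so that two otherwise identical line items do not collapse into one key. The function name line_item_keys is illustrative.

```python
import hashlib
import json


def line_item_keys(tenant_id: str, env: str, batch_id: str, items: list) -> list:
    """Derive one idempotency sub-key per line item instead of one per batch."""
    keys = []
    for item in items:
        # Canonical JSON: sorted keys and no whitespace, as in derive_key.
        canon = json.dumps(item, sort_keys=True, separators=(",", ":"))
        digest = hashlib.sha256(canon.encode()).hexdigest()
        keys.append(f"idemp:{tenant_id}:{env}:{batch_id}:{digest}")
    return keys
```

Each sub-key can then be claimed independently, so a retry of a partially processed batch skips the items that already completed.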

Edge Cases & Failure Scenarios

Failure Scenario	Remediation Steps	Observability Hooks
Redis primary fails mid-claim; promoted replica lacks the key	Set key on the new primary; allow the request to reprocess. Add a post-failover reconciliation job that cross-checks Redis state against the database journal and marks duplicates in the event log.	Alert on `redis_replication_lag_seconds > 0.5`; emit `dedup_failover_replay_total` counter each time a reconciliation job resolves a duplicate.
`PROCESSING` key never transitions to `COMPLETED` (worker crash after claim, before logic succeeds)	The 30-second `PROCESSING` TTL auto-expires, allowing the next retry to reclaim the key. Ensure the retry carries the same fingerprint so it hits the same key. Alert if any key remains in `PROCESSING` beyond 2× the expected processing time.	Log `key`, `tenant_id`, and `stuck_since` timestamp. Alert on `dedup_stuck_processing_count > 0` for longer than 60 seconds.
Key evicted under `allkeys-lru` before TTL expires; duplicate arrives post-eviction	Route the post-eviction duplicate through the read-through database fallback path (Variant 4 above). The database journal acts as the authoritative deduplication record when Redis state is absent.	Track `dedup_cache_miss_after_eviction_total` using keyspace notifications (`notify-keyspace-events` with the `e` flag for evicted events, for example `Ee`; the `x` flag covers expirations only). Alert when eviction rate exceeds 100 keys/minute in the `idemp:*` namespace.
Concurrent requests race on identical key; both read `nil` before either `SET` completes	Use `SET key value NX EX ttl` — not `GET` + `SET` — to guarantee atomicity. The NX flag ensures exactly one caller succeeds even under tight concurrency.	Emit `dedup_race_condition_detected_total` if `GET` returns nil but a subsequent `SET NX` fails (indicates a race was detected and correctly resolved).
Schema evolution changes the canonical JSON representation of a payload	Introduce a version prefix in the key (`idemp:v2:{tenant}:{env}:{hash}`) during the migration window. Run both v1 and v2 deduplication in parallel for a 24-hour overlap period, then retire v1 keys.	Log `schema_version` alongside every deduplication event. Alert on unexpected version distribution shifts.

Operational Concerns

TTL windows

Align the COMPLETED key TTL to the client SDK’s maximum retry duration plus a 20% safety margin:

Synchronous payment APIs: 86400 seconds (24 hours).
Asynchronous job queues: 604800 seconds (7 days).
Webhook delivery systems: 259200 seconds (72 hours), matching typical webhook platform retry windows.
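
The sizing rule can be sketched as a small calculation. The helper names and the backoff parameters are illustrative assumptions; the retry window is an upper bound that ignores jitter, which only shortens the delays.

```python
import math


def max_retry_window(base_delay: float, multiplier: float, max_delay: float,
                     attempts: int, request_timeout: float) -> float:
    """Upper bound (seconds) from the first attempt to the end of the last one."""
    total = 0.0
    # There are attempts - 1 waits between attempts, each capped at max_delay.
    for n in range(attempts - 1):
        total += min(base_delay * multiplier ** n, max_delay)
    # Every attempt may run until its timeout.
    total += attempts * request_timeout
    return total


def completed_ttl(retry_window_seconds: float, margin_percent: int = 20) -> int:
    """COMPLETED key TTL: the retry window plus a safety margin, rounded up."""
    return math.ceil(retry_window_seconds * (100 + margin_percent) / 100)
```

For instance, a client that can keep retrying for 72,000 seconds (20 hours) maps to an 86,400-second TTL under the 20% margin.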

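The PROCESSING state TTL must be longer than the longest time a request can legitimately spend in flight, including the downstream service's timeout; otherwise the key expires mid-processing and a retry can claim it and execute a second time. A value of 30 seconds covers most synchronous API calls; set it to 300 seconds for workflows that invoke slow external services.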

Full lifecycle management — including sliding TTL strategies for long-running workflows — is covered in Idempotency Key Storage & TTL Management.

Memory budgeting

Each idempotency key costs approximately 200–800 bytes in Redis (key string + value payload + overhead). At 10,000 requests per second with an 86400-second TTL, and assuming every request carries a distinct key at roughly 600 bytes each, the steady-state keyspace is:

10,000 req/s × 86,400 s × 600 bytes = ~518 GB

This is unsustainable for most Redis deployments. Practical mitigations:

Reduce TTL to 3600 seconds for endpoints where clients retry within one hour (reduces the example above to ~21 GB).
Store only a status code + transaction ID in the value (50 bytes) rather than the full response payload.
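Shard the idemp:* namespace to a dedicated Redis instance with maxmemory set. Note that volatile-ttl evicts only keys that have a TTL, starting with the shortest remaining TTL, so idempotency keys are exactly the eviction candidates under that policy. Setting maxmemory-policy noeviction instead makes writes fail when memory is exhausted, which surfaces as an error the fallback path can handle rather than a silently dropped key.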
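
The budget arithmetic can be reproduced with a short helper, which makes it easy to try other TTLs or value sizes. The function name is illustrative, and the model assumes every request creates a distinct key.

```python
def steady_state_bytes(requests_per_second: int, ttl_seconds: int,
                       bytes_per_key: int) -> int:
    """Keys alive at steady state = arrival rate × TTL; multiply by size."""
    return requests_per_second * ttl_seconds * bytes_per_key


GB = 10**9

baseline = steady_state_bytes(10_000, 86_400, 600)
short_ttl = steady_state_bytes(10_000, 3_600, 600)

print(f"24 h TTL: {baseline / GB:.1f} GB")   # 518.4 GB
print(f"1 h TTL:  {short_ttl / GB:.1f} GB")  # 21.6 GB
```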

SRE alert thresholds

Configure the following alerts for any production Redis deduplication deployment:

redis_used_memory_rss_bytes / redis_maxmemory_bytes > 0.80 — memory headroom warning.
redis_evicted_keys_total rate > 10/second in the idemp:* keyspace — active eviction of live deduplication keys.
dedup_hit_rate (hits / total claims) < 0.001 — deduplication may not be firing correctly.
dedup_false_positive_rate > 0.0001 — hash collisions above the noise floor.
Redis instantaneous_ops_per_sec > 80% of tested throughput ceiling — capacity headroom alert.
dedup_stuck_processing_count > 0 for 60 seconds — worker crash leaves keys orphaned.

Multi-region deployments

Active-active Redis replication introduces write conflicts when identical requests hit geographically distributed nodes simultaneously. For global payment flows, use regional affinity routing in the API gateway: hash the X-Idempotency-Key header to select a canonical region for that key, then route all retries for that key to the same region. This eliminates cross-region conflict without requiring CRDT-based state merging.
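
A sketch of the region-selection step, assuming the gateway knows the full list of region names. The function name canonical_region and the region labels in the usage note are hypothetical.

```python
import hashlib
from typing import List


def canonical_region(idempotency_key: str, regions: List[str]) -> str:
    """Map an X-Idempotency-Key value to one region, deterministically."""
    if not regions:
        raise ValueError("at least one region is required")
    # Sort so every gateway picks the same region regardless of config order.
    ordered = sorted(regions)
    digest = hashlib.sha256(idempotency_key.encode()).digest()
    index = int.from_bytes(digest[:8], "big") % len(ordered)
    return ordered[index]
```

Calling canonical_region(key, ["us-east", "eu-west", "ap-south"]) returns the same region on every gateway for a given key. Plain modulo selection remaps many keys when the region list changes, so a consistent-hashing scheme may be preferable if regions are added or removed while keys are still live.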

When preventing race conditions across microservices, the deduplication layer must sit upstream of any service that can trigger side effects — not inside individual microservices — so that regional routing decisions are made once, before the request fans out.

Using Redis SETNX for Distributed Request Deduplication — step-by-step runbook for SET NX EX in Python, Go, and Node.js, with exact commands for simulating duplicates.
Idempotency Key Storage & TTL Management — storage selection (Redis vs. PostgreSQL vs. DynamoDB), TTL window sizing, and eviction policy configuration for the durable journal layer.
Database Unique Constraints & Upserts — the persistence-layer complement to Redis deduplication; layering the two creates defence-in-depth against duplicates that survive Redis eviction.
Transaction Scoping & Atomic Operations — coupling cache claim with downstream database commits so partial-execution rollbacks leave the system in a consistent state.
Backend Implementation & Storage Patterns — parent section covering all storage-layer strategies for idempotency and exactly-once processing.