Message Batching & Throughput Optimization for Web Push Delivery

High-volume web push campaigns demand precise orchestration to stay within provider rate limits, minimize latency, and keep delivery secure. An efficient batching strategy is foundational to the broader Backend Delivery Architecture & Queue Management framework. This guide provides production-oriented patterns for payload consolidation, connection pooling, and secure scaling, aimed at full-stack developers, growth engineers, and mobile-web teams responsible for subscription lifecycle delivery.

Architecting the Batching Engine

Push providers enforce strict rate limits and connection caps per origin. Dispatching every notification on its own connection, with no batching or throttling, exhausts connection pools and triggers 429 Too Many Requests responses. A deterministic batching engine groups subscriptions by push service (FCM, Safari APNs, Mozilla), applies dynamic chunk sizing, and dispatches via a controlled worker pool.

Implementation Steps

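Endpoint Normalization & Grouping: Partition subscriptions by push_service and endpoint domain. Standards-based Web Push uses the same VAPID-authenticated request format across services, but rate limits and error behavior differ per service, and legacy Safari push over APNs uses a separate authentication flow.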
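Dynamic Batch Sizing: Configure chunk sizes between 100–500 subscriptions per batch. Web Push (RFC 8030) sends one HTTP request per subscription, so a batch here is a unit of scheduling and concurrency rather than a single multi-recipient request. FCM typically handles batches of 500 efficiently, while APNs requires smaller batches (100–200) to avoid connection resets.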
Priority Queue Dispatch: Deploy a worker pool that drains from a Redis-backed priority queue. High-priority campaigns (e.g., security alerts) bypass standard batching windows.
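
The grouping and sizing steps above can be sketched as follows. This is a minimal illustration rather than a definitive implementation: RoutableSubscription, CHUNK_SIZE_BY_SERVICE, and planBatches are hypothetical names, and the per-service sizes are starting points taken from the ranges above (the mozilla value is a placeholder assumption) that should be tuned against observed 429 rates.

```typescript
// Service names mirror the pushService values used elsewhere in this guide.
type ServiceName = 'fcm' | 'apns' | 'mozilla';

interface RoutableSubscription {
  endpoint: string;
  pushService: ServiceName;
}

// Illustrative starting points; the mozilla value is a placeholder assumption.
const CHUNK_SIZE_BY_SERVICE: Record<ServiceName, number> = {
  fcm: 500,
  apns: 150,
  mozilla: 200,
};

function planBatches(
  subs: RoutableSubscription[]
): Map<string, RoutableSubscription[][]> {
  // Step 1: group by push service and endpoint host.
  const groups = new Map<string, { service: ServiceName; items: RoutableSubscription[] }>();
  for (const sub of subs) {
    const key = `${sub.pushService}:${new URL(sub.endpoint).host}`;
    let group = groups.get(key);
    if (!group) {
      group = { service: sub.pushService, items: [] };
      groups.set(key, group);
    }
    group.items.push(sub);
  }

  // Step 2: split each group using that service's chunk size.
  const plan = new Map<string, RoutableSubscription[][]>();
  groups.forEach((group, key) => {
    const size = CHUNK_SIZE_BY_SERVICE[group.service];
    const chunks: RoutableSubscription[][] = [];
    for (let i = 0; i < group.items.length; i += size) {
      chunks.push(group.items.slice(i, i + size));
    }
    plan.set(key, chunks);
  });
  return plan;
}
```

Keying the plan by service and host keeps each chunk bound to a single origin, so a worker can send the whole chunk over one connection.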

Production-Ready Dispatch Implementation

The following TypeScript implementation for Node.js demonstrates chunking with a semaphore-controlled concurrency limiter. Each batch runs inside its own error boundary with a timeout, so a failed or hung batch does not starve the remaining workers.

// Configuration constants; tune per push service against observed 429 rates
const BATCH_SIZE = 200;
const MAX_CONCURRENCY = 10;
const DISPATCH_TIMEOUT_MS = 30_000;

interface PushSubscription {
  endpoint: string;
  keys: { p256dh: string; auth: string };
  pushService: 'fcm' | 'apns' | 'mozilla';
  metadata?: Record<string, unknown>;
}

class Semaphore {
  private permits: number;
  private queue: (() => void)[] = [];

  constructor(permits: number) {
    this.permits = permits;
  }

  // Runs the task once a permit is available and always releases it afterwards
  async run<T>(task: () => Promise<T>): Promise<T> {
    await this.acquire();
    try {
      return await task();
    } finally {
      this.release();
    }
  }

  private acquire(): Promise<void> {
    if (this.permits > 0) {
      this.permits--;
      return Promise.resolve();
    }
    // No permit free: wait until release() hands one over
    return new Promise<void>((resolve) => this.queue.push(resolve));
  }

  private release(): void {
    const next = this.queue.shift();
    if (next) {
      next(); // pass the permit directly to the next waiter
    } else {
      this.permits++;
    }
  }
}

function chunkArray<T>(arr: T[], size: number): T[][] {
  const chunks: T[][] = [];
  for (let i = 0; i < arr.length; i += size) {
    chunks.push(arr.slice(i, i + size));
  }
  return chunks;
}

export async function dispatchBatches(
  subscriptions: PushSubscription[],
  sendPushBatch: (batch: PushSubscription[]) => Promise<void>
): Promise<void> {
  const batches = chunkArray(subscriptions, BATCH_SIZE);
  const semaphore = new Semaphore(MAX_CONCURRENCY);

  const dispatchPromises = batches.map((batch, index) =>
    semaphore.run(async () => {
      let timer: ReturnType<typeof setTimeout> | undefined;
      try {
        await Promise.race([
          sendPushBatch(batch),
          new Promise<never>((_, reject) => {
            timer = setTimeout(
              () => reject(new Error(`Batch ${index} timeout`)),
              DISPATCH_TIMEOUT_MS
            );
          })
        ]);
      } catch (error) {
        // Route to dead-letter queue or retry logic
        console.error(`[DISPATCH_ERROR] Batch ${index} failed:`, error);
        throw error;
      } finally {
        clearTimeout(timer); // do not leave a pending timer behind for each batch
      }
    })
  );

  // allSettled keeps one failed batch from rejecting the whole run
  await Promise.allSettled(dispatchPromises);
}

Architecture Trade-offs: Higher concurrency increases throughput but risks TCP connection exhaustion and provider-side IP reputation degradation. Monitor netstat or equivalent metrics to ensure TIME_WAIT sockets do not exceed system limits.

Optimizing Payload & Connection Throughput

Throughput bottlenecks frequently originate from redundant payload serialization and ephemeral HTTP connections. Enabling HTTP/2 multiplexing allows concurrent push requests over a single persistent TCP connection, drastically reducing TLS handshake overhead. Aligning expiration windows with TTL & Expiration Handling ensures stale payloads are discarded before consuming dispatch cycles.
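
The expiry check described above might look like the sketch below. QueuedMessage and dropExpired are hypothetical names, and the sketch assumes each queued message records when it was enqueued and the TTL (in seconds) that will be sent in the request's TTL header.

```typescript
interface QueuedMessage {
  id: string;
  enqueuedAtMs: number; // epoch milliseconds when the message entered the queue
  ttlSeconds: number;   // same value that will be sent in the TTL request header
}

// Returns only the messages whose TTL has not yet elapsed at nowMs.
function dropExpired(
  messages: QueuedMessage[],
  nowMs: number = Date.now()
): QueuedMessage[] {
  return messages.filter((m) => m.enqueuedAtMs + m.ttlSeconds * 1000 > nowMs);
}
```

Running this filter just before a batch is handed to a worker avoids spending a connection slot on a notification the push service would discard anyway.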

Implementation Steps

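HTTP/2 Multiplexing: Use a dedicated HTTP/2 client (in Node.js, the built-in node:http2 module or an HTTP client with HTTP/2 enabled). The fetch option keepalive: true does not provide this: it lets a browser request outlive its page and is unrelated to connection reuse. Reuse one session per push-service origin within each worker; sockets cannot be shared across worker threads, so each thread maintains its own pool.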
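Payload Compression: Keep each encrypted payload within about 4 KB. RFC 8030 only requires push services to accept bodies of up to 4096 bytes, and larger messages may be rejected with 413 Payload Too Large. Web Push encrypts the body with Content-Encoding: aes128gcm (aesgcm is the older draft encoding), and ciphertext does not compress, so any Brotli or GZIP compression must be applied to the plaintext before encryption and reversed in the service worker.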
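Pre-Encryption Pipeline: Encrypt payloads per RFC 8291 (ECDH on P-256 to derive the key, AES-128-GCM for the content) before queue insertion. Because each subscription has its own keys, ciphertext is per subscription; offloading these cryptographic operations to a dedicated pre-processing stage prevents CPU contention during high-throughput dispatch.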
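
One way to keep a single multiplexed connection per push-service origin is a small session cache built on Node's node:http2 module, sketched below. Http2SessionPool and SessionFactory are hypothetical names; the factory parameter is an assumption added so the pool can be exercised without a network, and error handling is reduced to logging.

```typescript
import { connect, type ClientHttp2Session } from 'node:http2';

type SessionFactory = (origin: string) => ClientHttp2Session;

class Http2SessionPool {
  private readonly sessions = new Map<string, ClientHttp2Session>();
  private readonly open: SessionFactory;

  // The factory is injectable so tests can substitute a fake session.
  constructor(open: SessionFactory = (origin) => connect(origin)) {
    this.open = open;
  }

  // Returns the live session for the endpoint's origin, connecting if needed.
  get(endpoint: string): ClientHttp2Session {
    const origin = new URL(endpoint).origin;
    const existing = this.sessions.get(origin);
    if (existing && !existing.closed && !existing.destroyed) {
      return existing;
    }
    const session = this.open(origin);
    session.on('error', (err) => console.error(`[HTTP2] ${origin}:`, err));
    // Forget the session once it closes so the next call reconnects.
    session.once('close', () => {
      if (this.sessions.get(origin) === session) this.sessions.delete(origin);
    });
    this.sessions.set(origin, session);
    return session;
  }

  closeAll(): void {
    this.sessions.forEach((session) => session.close());
    this.sessions.clear();
  }
}
```

A worker would call pool.get(subscription.endpoint) and open one stream per notification with session.request(...), so every request to the same origin shares one TLS handshake.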

Security & Compliance Posture

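PII Stripping: Remove or hash personally identifiable information before encryption. Payloads are encrypted end to end, but they still pass through the push service, which stores them until delivery or TTL expiry, and they are decrypted and displayed on the user's device.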
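VAPID Validation: Verify that the Authorization: vapid t=<jwt>, k=<public key> header (RFC 8292; earlier drafts used WebPush <token>) is signed with a key that is still active in the rotation schedule. Invalid or mismatched signatures are rejected immediately, typically with 401 Unauthorized or 403 Forbidden, wasting batch slots.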
Regulatory Alignment: Enforce GDPR/CCPA consent flags at the batching layer. Subscriptions lacking explicit opt-in must be filtered before entering the dispatch pipeline.
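
As an illustration of the VAPID header, the sketch below builds an RFC 8292 style Authorization value with Node's built-in crypto module. buildVapidAuthorization is a hypothetical helper, the 12-hour expiry is an arbitrary choice under the 24-hour maximum the RFC allows, and the throwaway key pair and example URLs stand in for keys and origins that would come from a secrets manager and real subscriptions.

```typescript
import { generateKeyPairSync, sign, type KeyObject } from 'node:crypto';

const toBase64Url = (data: Buffer | string): string =>
  (typeof data === 'string' ? Buffer.from(data, 'utf8') : data).toString('base64url');

function buildVapidAuthorization(
  audience: string,      // origin of the push service the request is sent to
  subject: string,       // contact URI (mailto: or https:)
  privateKey: KeyObject, // P-256 private key used for ES256 signing
  publicKey: KeyObject,  // matching public key, sent as the k parameter
  nowSeconds: number = Math.floor(Date.now() / 1000)
): string {
  const header = toBase64Url(JSON.stringify({ typ: 'JWT', alg: 'ES256' }));
  const claims = toBase64Url(
    JSON.stringify({ aud: audience, exp: nowSeconds + 12 * 60 * 60, sub: subject })
  );
  const signingInput = `${header}.${claims}`;

  // ES256 in a JWT uses the raw r||s form (64 bytes), not DER.
  const signature = sign('sha256', Buffer.from(signingInput, 'utf8'), {
    key: privateKey,
    dsaEncoding: 'ieee-p1363',
  });

  // k is the uncompressed P-256 point: 0x04 || X || Y (65 bytes).
  const jwk = publicKey.export({ format: 'jwk' });
  const rawPublicKey = Buffer.concat([
    Buffer.from([0x04]),
    Buffer.from(jwk.x ?? '', 'base64url'),
    Buffer.from(jwk.y ?? '', 'base64url'),
  ]);

  return `vapid t=${signingInput}.${toBase64Url(signature)}, k=${toBase64Url(rawPublicKey)}`;
}

// Usage with a throwaway key pair, for illustration only.
const demoKeys = generateKeyPairSync('ec', { namedCurve: 'prime256v1' });
const authorization = buildVapidAuthorization(
  'https://push.example.com',
  'mailto:ops@example.com',
  demoKeys.privateKey,
  demoKeys.publicKey
);
```

Because the token is scoped to one audience, a batch grouped by push-service origin can reuse the same header for every request until it expires.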

Monitoring & Acknowledgment Integration

Batching obscures individual delivery states. Without granular tracking, a single 410 Gone response within a batch of 200 can corrupt subscription health metrics. Implementing a correlation ID per batch and mapping it to individual subscription receipts enables accurate state reconciliation. This workflow directly feeds into Delivery Tracking & Acknowledgment systems, allowing real-time throughput adjustments and failure isolation without disrupting the main dispatch pipeline.

Implementation Steps

Correlation Mapping: Attach a unique batch_id (UUIDv4) and subscription_index to each HTTP request. Store mappings in an ephemeral cache (e.g., Redis with 15m TTL).
Response Parsing: Parse 201 Created, 202 Accepted, 404 Not Found, 410 Gone, and 429 Too Many Requests responses. Immediately flag 404/410 as invalid subscriptions and purge them from the active database, and honor the Retry-After header on 429.
Dead-Letter Routing: Route failed batches to a dedicated DLQ. Implement exponential backoff for transient 5xx errors and for 429 throttling responses, but permanently quarantine other 4xx client errors to prevent retry storms.
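
The response handling above can be reduced to a single classification function, sketched here. DeliveryAction and classifyPushResponse are hypothetical names, and the backoff constants are illustrative. Two choices in the sketch are assumptions rather than rules from the steps above: any 2xx status counts as delivered, and 429 is treated as retryable because it signals throttling rather than a malformed request.

```typescript
type DeliveryAction =
  | { kind: 'delivered' }
  | { kind: 'purge' }                  // subscription is gone; remove it
  | { kind: 'retry'; delayMs: number } // transient; try again later
  | { kind: 'quarantine' };            // permanent client error; send to DLQ

const BASE_BACKOFF_MS = 1_000;
const MAX_BACKOFF_MS = 5 * 60 * 1_000;

function classifyPushResponse(
  status: number,
  attempt: number,           // 0 for the first try
  retryAfterSeconds?: number // parsed Retry-After header, if present
): DeliveryAction {
  if (status >= 200 && status < 300) return { kind: 'delivered' };
  if (status === 404 || status === 410) return { kind: 'purge' };
  if (status === 429 || status >= 500) {
    // Exponential backoff, never shorter than the server's Retry-After hint.
    const backoffMs = Math.min(BASE_BACKOFF_MS * 2 ** attempt, MAX_BACKOFF_MS);
    const hintedMs = (retryAfterSeconds ?? 0) * 1000;
    return { kind: 'retry', delayMs: Math.max(backoffMs, hintedMs) };
  }
  return { kind: 'quarantine' };
}
```

Storing the returned action next to the batch_id and subscription_index mapping lets the reconciliation job update one subscription without touching the rest of its batch.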

Debugging Checklist

Record the Location header returned with each 201 Created response (the push message URI defined by RFC 8030) against the internal correlation ID.
Audit batch sizes against provider-specific Retry-After headers.
Ensure worker logs capture raw HTTP status codes before abstracting them into internal metrics.

Production Deployment & Queue Scaling

Deploying a high-throughput batching system requires rigorous validation of connection limits, memory allocation during encryption, and circuit breaker thresholds. Align batch intervals with campaign velocity and enforce strict idempotency keys to prevent duplicate notifications. For infrastructure teams evaluating distributed message brokers, refer to Scaling push queues with Redis or RabbitMQ to select the optimal persistence and routing strategy.

Implementation Steps

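VAPID Key Rotation Compatibility: Implement hot-swappable VAPID keys. Workers must fetch the current private signing key from a centralized secrets manager before signing each batch. Because a subscription is bound to the applicationServerKey it was created with, keep the previous key pair available for signing until those subscriptions have been renewed.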
Circuit Breaker Configuration: Set failure thresholds at 5% per provider. When exceeded, halt dispatch for 60s, drain pending batches to a retry queue, and alert SRE teams.
Backpressure Handling: Enforce queue depth limits at 1,000,000 pending messages. When exceeded, apply producer-side rate limiting and return 503 Service Unavailable to upstream campaign APIs.
Load Testing Protocol: Simulate 50,000 concurrent subscriptions with a 99.5% success target. Monitor GC pauses, heap allocation, and network I/O saturation during peak dispatch windows.
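
The circuit breaker step can be sketched as a small per-provider state machine using the 5% threshold and 60s pause given above. ProviderCircuitBreaker is a hypothetical name, and the minimum sample count is an added assumption so that a handful of early failures cannot trip the breaker. The counters are cumulative for simplicity; a production version would more likely use a rolling window.

```typescript
class ProviderCircuitBreaker {
  private successes = 0;
  private failures = 0;
  private openedAtMs: number | null = null;
  private readonly failureThreshold: number;
  private readonly cooldownMs: number;
  private readonly minSamples: number;

  constructor(failureThreshold = 0.05, cooldownMs = 60_000, minSamples = 100) {
    this.failureThreshold = failureThreshold;
    this.cooldownMs = cooldownMs;
    this.minSamples = minSamples; // assumption: ignore ratios from tiny samples
  }

  // Record one delivery outcome; opens the breaker when the ratio is exceeded.
  record(ok: boolean, nowMs: number): void {
    if (ok) this.successes++;
    else this.failures++;
    const total = this.successes + this.failures;
    if (total >= this.minSamples && this.failures / total > this.failureThreshold) {
      this.openedAtMs = nowMs;
      this.successes = 0;
      this.failures = 0;
    }
  }

  // False while the breaker is open; closes again after the cooldown.
  allowDispatch(nowMs: number): boolean {
    if (this.openedAtMs === null) return true;
    if (nowMs - this.openedAtMs >= this.cooldownMs) {
      this.openedAtMs = null;
      return true;
    }
    return false;
  }
}
```

A worker would check allowDispatch before each batch and, when it returns false, move the batch to the retry queue instead of sending it.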

Security & Operational Hardening

Immutable Audit Logging: Record every dispatched batch with correlation IDs, timestamps, and VAPID key fingerprints. Logs must be append-only and retained per compliance mandates.
Idempotency Enforcement: Generate deterministic idempotency keys using SHA-256(campaign_id + batch_hash + timestamp_window), joining the fields with a delimiter so that different inputs cannot concatenate to the same string. Reject duplicate dispatches within a 5-minute window.
Memory Safeguards: Cap worker heap usage at 70% of the configured heap limit (for Node.js, the value set by --max-old-space-size). Implement stream-based payload encryption to avoid loading entire batches into memory simultaneously.
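
The idempotency key described above can be derived as in the sketch below. idempotencyKey and DuplicateGuard are hypothetical names, and the in-memory map stands in for a shared store such as Redis. Note one consequence of bucketing the timestamp: the windows are fixed, so two dispatches that straddle a window boundary produce different keys.

```typescript
import { createHash } from 'node:crypto';

const WINDOW_MS = 5 * 60 * 1000;

// SHA-256 over campaign, batch hash, and the 5-minute window index.
function idempotencyKey(campaignId: string, batchHash: string, timestampMs: number): string {
  const windowIndex = Math.floor(timestampMs / WINDOW_MS);
  return createHash('sha256')
    // The newline delimiter keeps ("ab", "c") distinct from ("a", "bc").
    .update([campaignId, batchHash, String(windowIndex)].join('\n'))
    .digest('hex');
}

// In-memory stand-in for a shared store; entries expire after one window.
class DuplicateGuard {
  private readonly seen = new Map<string, number>();

  // Returns true the first time a key is seen, false for duplicates.
  accept(key: string, nowMs: number): boolean {
    this.seen.forEach((expiresAt, k) => {
      if (expiresAt <= nowMs) this.seen.delete(k);
    });
    if (this.seen.has(key)) return false;
    this.seen.set(key, nowMs + WINDOW_MS);
    return true;
  }
}
```

With several workers, the guard would need to live in a shared store with an atomic set-if-absent operation so that two workers cannot both accept the same key.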

By adhering to these batching, encryption, and monitoring protocols, engineering teams can scale web push delivery to enterprise volumes while maintaining strict security posture and provider compliance.