One day we got an alert: our dashboard was generating a 429 rate-limit storm — from a single user. One person. Not a hundred. Not a load test. Someone just opened a browser tab.
That moment made clear we had a deeper architectural problem, and a surface-level fix would not hold. This is the story of our journey from synchronized polling that hammered our API, to Server-Sent Events with a Redis Pub/Sub multiplexer that delivers real-time updates only when something actually changes.
The Problem: Synchronized Polling = Request Bursts
The ReplyQ dashboard shows conversations, messages, channel status, and metrics — all in real time. To keep data fresh, we built the frontend with setInterval calls that fetch data every few seconds.
The problem: all those pollers start at the same moment when the page loads. If five components each poll every 5 seconds, each one fires 12 times a minute, so that is 60 API requests per minute per person. With 50 users, that is 3,000 requests per minute. And with a rate limit keyed to IP address, everyone behind a corporate NAT, which is essentially every business customer, looks like a single client to the rate limiter.
The surprising finding: even with a single user, all pollers fired at exactly the same instant after page load. That produced a burst of 8–12 concurrent requests on every page load, and the burst repeated each time the polling intervals lined up again.
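The arithmetic above can be checked with a small sketch. It assumes the hypothetical case from the text (five pollers, all on a 5-second interval, all started at page load) rather than the dashboard's real poller mix, and `fire_times` is an illustrative helper, not ReplyQ code.

```python
from collections import Counter


def fire_times(interval_ms, horizon_ms):
    """Times (ms after page load) at which a setInterval-style poller fires."""
    return list(range(interval_ms, horizon_ms + 1, interval_ms))


# Assumption: five components, each polling every 5 seconds, all started together.
POLLERS = [5000] * 5
HORIZON_MS = 60_000

# Count how many requests land on each exact millisecond tick.
ticks = Counter(t for interval in POLLERS for t in fire_times(interval, HORIZON_MS))

requests_per_minute = sum(ticks.values())
largest_burst = max(ticks.values())

print(requests_per_minute)       # 60 requests per minute for one user
print(requests_per_minute * 50)  # 3000 requests per minute for 50 users
print(largest_burst)             # 5 requests fired at the same instant
```

Because every timer shares the same start time and period, every tick is a burst: the requests never spread out on their own.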
Phase One: A Temporary Fix — Staggering
Our first response was to add randomness to polling intervals. Instead of:
setInterval(fetchConversations, 5000)
setInterval(fetchMessages, 10000)
We changed to:
setInterval(fetchConversations, 5000 + Math.random() * 10000)
setInterval(fetchMessages, 10000 + Math.random() * 10000)
This spread requests over time and eliminated the burst. But it was not a real solution — we were still sending thousands of unnecessary requests even when nothing in the data had changed.
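To see why staggering removes the burst but not the load, here is a rough simulation in Python. It mirrors the `5000 + Math.random() * 10000` expression above, assumes five pollers, and groups requests into 100 ms windows; the window size, the seed, and the helper name `burst_and_total` are all choices made for this sketch.

```python
import random
from collections import Counter


def burst_and_total(intervals_ms, horizon_ms=60_000, window_ms=100):
    """Return (largest burst in any window, total requests) over the horizon."""
    buckets = Counter()
    for interval in intervals_ms:
        t = interval
        while t <= horizon_ms:
            buckets[int(t // window_ms)] += 1
            t += interval
    return max(buckets.values()), sum(buckets.values())


rng = random.Random(0)  # seeded only so the sketch is repeatable

synchronized = [5000.0] * 5
# Same idea as the JavaScript above: a random offset is added once per timer.
staggered = [5000 + rng.random() * 10000 for _ in range(5)]

sync_burst, sync_total = burst_and_total(synchronized)
stag_burst, stag_total = burst_and_total(staggered)

print(sync_burst, sync_total)  # 5 60
print(stag_burst, stag_total)  # depends on the random offsets drawn
```

The staggered timers still fire all minute long whether or not any data changed, which is the cost the next section removes.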
The Real Solution: SSE + Redis Pub/Sub
SSE (Server-Sent Events) is a persistent, unidirectional HTTP connection. The client opens a single connection, and the server sends data only when there is something to send. This is a natural fit for a dashboard:
- No polling — no wasted requests
- Instant updates — when a new message arrives, all listeners receive it within milliseconds
- Automatic reconnection — built into the browser EventSource API
- Works with standard HTTP — compatible with Cloudflare, nginx, any proxy
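For readers who have not seen the wire format, an SSE stream is plain text: each event is one or more `field: value` lines followed by a blank line, and a line starting with a colon is a comment that clients ignore. A minimal sketch of frame formatting follows; the helper names `sse_event` and `sse_comment` are illustrative and not taken from the ReplyQ codebase.

```python
import json
from typing import Optional


def sse_event(data: dict, event: Optional[str] = None) -> str:
    """Format one SSE frame; the trailing blank line terminates the event."""
    lines = []
    if event is not None:
        lines.append(f"event: {event}")
    # json.dumps emits a single line by default, so one data: field is enough.
    lines.append(f"data: {json.dumps(data)}")
    return "\n".join(lines) + "\n\n"


def sse_comment(text: str = "keepalive") -> str:
    """Format a comment frame; clients ignore it, but it keeps the socket active."""
    return f": {text}\n\n"


print(repr(sse_event({"type": "new_message", "conversation_id": 42})))
# 'data: {"type": "new_message", "conversation_id": 42}\n\n'
```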
The Architecture: Redis Pub/Sub
We built two SSE endpoints on the backend:
- /api/v1/sse/conversations: listens for conversation updates scoped to the tenant
- /api/v1/sse/messages/{conv_id}: listens for messages in a specific conversation
Each endpoint subscribes to a unique Redis channel. When an inbound message arrives via webhook (WhatsApp, Instagram, Messenger), we publish to that channel:
# webhooks.py: at the message-save point
await redis.publish(
    f"replyq:sse:{tenant_id}",
    json.dumps({"type": "new_message", "conversation_id": conv_id}),
)
The SSE endpoint receives the event and immediately pushes it to the client. No polling. No waiting.
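The push side of such an endpoint can be sketched as an async generator that drains a queue and emits SSE frames. This is a sketch under assumptions, not the actual ReplyQ handler: the web framework is left out, the queue is assumed to be filled by whatever reads Redis, `None` is a hypothetical sentinel for closing the stream, and the 25-second default echoes the keepalive interval mentioned later in this post.

```python
import asyncio
import json
from typing import AsyncIterator


async def event_stream(
    queue: asyncio.Queue, keepalive_seconds: float = 25.0
) -> AsyncIterator[str]:
    """Yield SSE frames from a queue, with comment frames while it is idle."""
    while True:
        try:
            payload = await asyncio.wait_for(queue.get(), timeout=keepalive_seconds)
        except asyncio.TimeoutError:
            # Nothing happened for a while: send a comment so proxies keep the
            # connection open. Clients ignore comment lines.
            yield ": keepalive\n\n"
            continue
        if payload is None:  # hypothetical sentinel: stop streaming
            return
        yield f"data: {payload}\n\n"


async def demo() -> list:
    queue: asyncio.Queue = asyncio.Queue()
    queue.put_nowait(json.dumps({"type": "new_message", "conversation_id": "c1"}))
    queue.put_nowait(None)
    return [frame async for frame in event_stream(queue, keepalive_seconds=0.05)]


frames = asyncio.run(demo())
print(frames)
```

In a real handler the generator would be handed to the framework's streaming response with the `text/event-stream` content type.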
The SSE Multiplexer: Saving Redis Connections
The next challenge: if 50 users are all listening to SSE for the same tenant, without optimization that means 50 Redis Subscribe connections — one per user.
We built an SSEManager that acts as a multiplexer: one Redis connection per channel, with fan-out to N asyncio queues:
# sse_manager.py: core concept
class SSEManager:
    async def subscribe(self, channel: str, queue: asyncio.Queue):
        # If there is already a subscriber for this channel,
        # just add another queue; no new Redis connection
        ...

    async def _reader_loop(self, channel: str):
        # One loop per channel → fan out to all queues
        ...
Result: 50 users on the same tenant = one Redis connection, not 50.
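The fan-out idea can be made concrete with a self-contained sketch. It is not the real `SSEManager`: to stay runnable without a Redis server, the upstream is a hypothetical `open_channel` factory that returns an async iterator of messages, where production code would wrap a Redis Pub/Sub subscription. The `unsubscribe` method and the `opened` counter are also additions for illustration.

```python
import asyncio
from typing import AsyncIterator, Callable, Dict, Set


class SSEManager:
    """Fan-out sketch: one upstream reader per channel, N local queues."""

    def __init__(self, open_channel: Callable[[str], AsyncIterator[str]]):
        self._open_channel = open_channel  # hypothetical upstream factory
        self._queues: Dict[str, Set[asyncio.Queue]] = {}
        self._readers: Dict[str, asyncio.Task] = {}

    async def subscribe(self, channel: str, queue: asyncio.Queue) -> None:
        self._queues.setdefault(channel, set()).add(queue)
        # Only the first subscriber on a channel starts an upstream reader.
        if channel not in self._readers:
            self._readers[channel] = asyncio.create_task(self._reader_loop(channel))

    async def unsubscribe(self, channel: str, queue: asyncio.Queue) -> None:
        queues = self._queues.get(channel, set())
        queues.discard(queue)
        # The last subscriber leaving tears the upstream reader down.
        if not queues and channel in self._readers:
            self._readers.pop(channel).cancel()
            self._queues.pop(channel, None)

    async def _reader_loop(self, channel: str) -> None:
        async for message in self._open_channel(channel):
            for queue in list(self._queues.get(channel, ())):
                queue.put_nowait(message)


opened = []  # records how many upstream subscriptions were opened


async def fake_upstream(channel: str) -> AsyncIterator[str]:
    opened.append(channel)
    for i in range(3):
        await asyncio.sleep(0)
        yield f"{channel}:{i}"


async def demo() -> list:
    manager = SSEManager(fake_upstream)
    queues = [asyncio.Queue() for _ in range(50)]
    for q in queues:
        await manager.subscribe("replyq:sse:tenant-1", q)
    received = []
    for q in queues:
        received.append([await q.get() for _ in range(3)])
    return received


received = asyncio.run(demo())
print(len(opened), len(received))  # 1 50
```

Fifty local queues receive every message while the upstream is opened once, which is the property the multiplexer exists to provide.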
The Frontend: A useSSE Hook
We built a custom React hook to manage the EventSource lifecycle:
- Opens EventSource with JWT token as a URL query parameter (standard EventSource cannot send headers)
- If a 401 is received (expired token), waits 3 seconds and reconnects with a fresh token from localStorage
- Uses a stable callback ref to avoid unnecessary React re-renders
Every page that needs real-time updates uses this hook. When an SSE event arrives, it triggers a targeted refetch of only the relevant data.
One More Optimization: The SSE Event Bus
We discovered that the conversations page was opening two EventSource connections — one inside layout.tsx and one in the page component itself. Two SSE connections to the same channel.
The fix: an sseEventBus. The layout holds a single SSE connection and broadcasts events to an internal EventEmitter. Page components subscribe to the bus instead of opening their own EventSource. Result: 33% fewer SSE connections per user session.
We Also Fixed the Rate Limit Key
Alongside the SSE migration, we changed the rate limit from IP-based to authenticated User ID-based:
- Before: 300 requests per minute per IP, so a company with 10 employees behind one NAT address got about 30 requests per minute each
- After: 1,000 requests per minute per authenticated user ID — each person gets an independent quota
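The effect of changing the key can be illustrated with a toy limiter. This sketch assumes a fixed one-minute window and an in-memory counter, neither of which is confirmed by this post; `FixedWindowLimiter`, `rejected`, and the example IP address are hypothetical.

```python
import time
from collections import defaultdict
from typing import Optional


class FixedWindowLimiter:
    """Toy fixed-window limiter; the key decides who shares a quota."""

    def __init__(self, limit: int, window_seconds: float = 60.0):
        self.limit = limit
        self.window_seconds = window_seconds
        self._counts = defaultdict(int)

    def allow(self, key: str, now: Optional[float] = None) -> bool:
        now = time.monotonic() if now is None else now
        bucket = (key, int(now // self.window_seconds))
        if self._counts[bucket] >= self.limit:
            return False
        self._counts[bucket] += 1
        return True


def rejected(limiter, key_for, users=10, requests_each=60):
    """Count rejections when each user sends requests_each calls in one window."""
    return sum(
        not limiter.allow(key_for(user), now=0.0)
        for user in range(users)
        for _ in range(requests_each)
    )


# Ten employees behind one NAT address, each sending 60 requests in a minute.
by_ip = rejected(FixedWindowLimiter(300), lambda user: "ip:203.0.113.7")
by_user = rejected(FixedWindowLimiter(1000), lambda user: f"user:{user}")

print(by_ip, by_user)  # 300 0
```

With the IP key, half of the office's 600 requests are refused even though no individual is misbehaving; with the user key, nobody is throttled.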
The Dashboard Init Endpoint: 4 Requests Become 1
We also added a /api/v1/dashboard/init endpoint that returns everything needed at dashboard load in a single response: user profile, tenant settings, feature flags, and channel list — all served from Redis cache in sub-millisecond time.
On first load: 4 separate API calls became 1. On internal navigation: only getMe runs (everything else is already seeded into the React Query cache from the init response).
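The aggregation step might look roughly like the sketch below. Everything specific in it is an assumption: the cache is an in-memory dict standing in for Redis, and the key names, the response field names, and the `cache_get` helper are hypothetical.

```python
import asyncio
import json

# Hypothetical in-memory stand-in for the Redis cache described above.
FAKE_CACHE = {
    "user:u1:profile": json.dumps({"name": "Ada"}),
    "tenant:t1:settings": json.dumps({"timezone": "UTC"}),
    "tenant:t1:flags": json.dumps({"sse": True}),
    "tenant:t1:channels": json.dumps(["whatsapp", "instagram"]),
}


async def cache_get(key: str):
    """Pretend cache read; a real one would await a Redis GET."""
    await asyncio.sleep(0)
    raw = FAKE_CACHE.get(key)
    return json.loads(raw) if raw is not None else None


async def dashboard_init(user_id: str, tenant_id: str) -> dict:
    """Gather the four pieces the dashboard needs into one response body."""
    keys = {
        "user": f"user:{user_id}:profile",
        "settings": f"tenant:{tenant_id}:settings",
        "feature_flags": f"tenant:{tenant_id}:flags",
        "channels": f"tenant:{tenant_id}:channels",
    }
    # Issue the four lookups concurrently, then pair them with their field names.
    values = await asyncio.gather(*(cache_get(k) for k in keys.values()))
    return dict(zip(keys, values))


payload = asyncio.run(dashboard_init("u1", "t1"))
print(sorted(payload))  # ['channels', 'feature_flags', 'settings', 'user']
```

The frontend can then seed each field into its own query cache entry, so later navigation does not refetch them.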
What We Gained
- Zero continuous polling — updates arrive only when something changes
- Instant delivery — an inbound message appears in the dashboard in under a second
- Redis connection efficiency — the multiplexer prevents N connections for the same channel
- User-scoped rate limits — NAT is no longer a problem
- Stability — automatic reconnect, 25-second keepalives to handle Cloudflare timeouts
Before: approximately 2,200 API requests per minute with 100 users. After: a small number of open SSE connections plus API requests only on actual data changes. More than 90% reduction in API load.
Lessons We Would Apply Again
- Polling is simplicity with a hidden cost. You do not feel it in development. In production, with real users and real NAT, you do.
- IP-based rate limiting is wrong for B2B. Business customers always share IPs. Always key the rate limit to an authenticated user or tenant ID.
- SSE beats WebSockets for dashboards. It is unidirectional, runs over plain HTTP through standard proxies, and browser reconnect is built in.
- Fan-out on the backend saves a lot. One Redis subscription loop broadcasting to N asyncio queues is far cheaper than N subscription loops.
If your dashboard is built on polling and you are starting to see 429 errors, the SSE path is worth considering. For ReplyQ, it was the right call at exactly the right time.