ReplyQ is an Israeli SaaS platform for automating business conversations on WhatsApp, Instagram, and Messenger. It includes a smart AI bot, a built-in CRM, appointment scheduling, automated follow-up, and voice-message transcription.

How much does ReplyQ cost?

ReplyQ offers four plans: Starter at 499 NIS per month, Pro at 1,199 NIS per month, Business at 1,999 NIS per month, and Enterprise at custom pricing. Onboarding includes a one-time setup fee.

Does ReplyQ work with my existing WhatsApp number?

Yes. ReplyQ supports Coexistence mode through 360Dialog: the bot runs on your existing WhatsApp Business number without changing it. You can also connect directly to Meta's WhatsApp Cloud API.

Can I integrate Google Calendar or Outlook?

Yes. ReplyQ includes full integration with Google Calendar and Microsoft 365. The bot offers customers open slots from the calendar in real time, automatically creates events with a Teams/Google Meet link, and sends reminders.

How We Replaced Frontend Polling with SSE and Eliminated 429 Rate-Limit Errors

One day we got an alert: our dashboard was generating a 429 rate-limit storm — from a single user. One person. Not a hundred. Not a load test. Someone just opened a browser tab.

That moment made clear we had a deeper architectural problem, and a surface-level fix would not hold. This is the story of our journey from synchronized polling that hammered our API, to Server-Sent Events with a Redis Pub/Sub multiplexer that delivers real-time updates only when something actually changes.

The Problem: Synchronized Polling = Request Bursts

The ReplyQ dashboard shows conversations, messages, channel status, and metrics — all in real time. To keep data fresh, we built the frontend with setInterval calls that fetch data every few seconds.

The problem: all those pollers start at the same moment when the page loads. If five components poll every 5 seconds, that is 60 API requests per minute per person. With 50 users, that is 3,000 requests per minute. And with a rate limit keyed to IP address, anyone behind a corporate NAT — which is essentially every business customer — looks like a single user to the rate limiter.
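
The arithmetic above can be checked with a few lines of Python. This is only a back-of-the-envelope sketch: the function name and parameters are illustrative, and it assumes every poller fires at a fixed interval with no pauses.

```python
def polling_requests_per_minute(pollers: int, interval_s: float, users: int = 1) -> float:
    """Steady-state request rate for timers that each fire every `interval_s` seconds."""
    return pollers * (60 / interval_s) * users


per_user = polling_requests_per_minute(pollers=5, interval_s=5)
fleet = polling_requests_per_minute(pollers=5, interval_s=5, users=50)
print(per_user, fleet)  # 60.0 3000.0
```

The rate grows linearly with both the number of polling components and the number of open tabs, and it does not depend on whether any data actually changed.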

The surprising finding: even with a single user, all pollers fired at the same instant after page load. That produced a burst of 8–12 concurrent requests on every reload, and because the timers stayed aligned, the bursts kept recurring for as long as the tab was open.

Phase One: A Temporary Fix — Staggering

Our first response was to add randomness to polling intervals. Instead of:

setInterval(fetchConversations, 5000)
setInterval(fetchMessages, 10000)

We changed to:

setInterval(fetchConversations, 5000 + Math.random() * 10000)
setInterval(fetchMessages, 10000 + Math.random() * 10000)

This spread requests over time and eliminated the burst. But it was not a real solution — we were still sending thousands of unnecessary requests even when nothing in the data had changed.
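
To see why a one-time random offset breaks up the burst, here is a small simulation in Python. It is a model of the timers, not the dashboard code: the mix of pollers in `bases` and the 60-second horizon are assumptions chosen for illustration.

```python
import random
from collections import Counter


def fire_times(interval_ms: int, horizon_ms: int) -> list:
    """Times (ms after page load) at which a setInterval-style timer fires."""
    return list(range(interval_ms, horizon_ms + 1, interval_ms))


def largest_burst(intervals_ms: list, horizon_ms: int = 60_000) -> int:
    """Largest number of timers that fire in the same millisecond."""
    counts = Counter(t for i in intervals_ms for t in fire_times(i, horizon_ms))
    return max(counts.values())


bases = [5000, 5000, 5000, 10000, 10000]  # hypothetical mix of pollers
rng = random.Random(42)                   # seeded so the run is repeatable
# Same shape as `base + Math.random() * 10000`: one random offset per timer.
jittered = [b + int(rng.random() * 10000) for b in bases]

print(largest_burst(bases))  # 5: every timer lines up at t = 10 s
print(largest_burst(jittered))
```

One side effect worth noting: adding up to 10 seconds to a 5-second base raises the average interval to about 10 seconds, so staggering this way also makes the data somewhat staler.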

The Real Solution: SSE + Redis Pub/Sub

SSE (Server-Sent Events) is a persistent, unidirectional HTTP connection. The client opens a single connection, and the server sends data only when there is something to send. This is a natural fit for a dashboard:

No polling — no wasted requests
Instant updates — when a new message arrives, all listeners receive it within milliseconds
Automatic reconnection — built into the browser EventSource API
Works with standard HTTP: compatible with Cloudflare, nginx, and other HTTP proxies, as long as response buffering is turned off for the stream
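
For readers who have not seen the wire format, an SSE response is plain text sent with the `text/event-stream` content type, where each event is a few `field: value` lines followed by a blank line. The helper below is a minimal sketch of that framing; the function name is illustrative and the payload mirrors the event shape used later in this article.

```python
import json
from typing import Optional


def sse_event(data: dict, event: Optional[str] = None) -> str:
    """Serialize one event in the text/event-stream wire format."""
    lines = []
    if event is not None:
        lines.append(f"event: {event}")  # optional event name
    lines.append(f"data: {json.dumps(data)}")
    return "\n".join(lines) + "\n\n"  # a blank line terminates the event


frame = sse_event({"type": "new_message", "conversation_id": "c1"})
print(repr(frame))
```

Because `json.dumps` emits a single line by default, one `data:` line is enough here; a payload containing raw newlines would need one `data:` line per line of text.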

The Architecture: Redis Pub/Sub

We built two SSE endpoints on the backend:

/api/v1/sse/conversations — listens for conversation updates scoped to the tenant
/api/v1/sse/messages/{conv_id} — listens for messages in a specific conversation

Each endpoint subscribes to a unique Redis channel. When an inbound message arrives via webhook (WhatsApp, Instagram, Messenger), we publish to that channel:

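# webhooks.py, at the message-save point
import json

# publish_new_message is an illustrative wrapper name; `redis` is an async
# Redis client (for example redis.asyncio.Redis).
async def publish_new_message(redis, tenant_id, conv_id):
    await redis.publish(
        f"replyq:sse:{tenant_id}",
        json.dumps({"type": "new_message", "conversation_id": conv_id}),
    )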

The SSE endpoint receives the event and immediately pushes it to the client. No polling. No waiting.

The SSE Multiplexer: Saving Redis Connections

The next challenge: if 50 users are all listening to SSE for the same tenant, without optimization that means 50 Redis Subscribe connections — one per user.

We built an SSEManager that acts as a multiplexer: one Redis connection per channel, with fan-out to N asyncio queues:

# sse_manager.py — core concept
class SSEManager:
    async def subscribe(self, channel: str, queue: asyncio.Queue):
        # If there is already a subscriber for this channel,
        # just add another queue — no new Redis connection
        ...

    async def _reader_loop(self, channel: str):
        # One loop per channel → fan out to all queues
        ...

Result: 50 users on the same tenant = one Redis connection, not 50.
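
The fan-out idea can be sketched without Redis at all. The class below is not the actual SSEManager: `FanOutHub`, `open_source`, and `fake_source` are hypothetical names, and the upstream is any async iterator standing in for a Redis subscription. It also assumes unbounded queues, so a slow listener is not handled.

```python
import asyncio
from typing import AsyncIterator, Callable, Dict, Set


class FanOutHub:
    """One upstream reader per channel, fanned out to any number of local queues."""

    def __init__(self, open_source: Callable[[str], AsyncIterator[str]]):
        self._open_source = open_source  # stand-in for a Redis subscribe call
        self._queues: Dict[str, Set[asyncio.Queue]] = {}
        self._readers: Dict[str, asyncio.Task] = {}

    async def subscribe(self, channel: str, queue: asyncio.Queue) -> None:
        self._queues.setdefault(channel, set()).add(queue)
        if channel not in self._readers:  # first listener opens the upstream
            self._readers[channel] = asyncio.create_task(self._reader_loop(channel))

    async def unsubscribe(self, channel: str, queue: asyncio.Queue) -> None:
        listeners = self._queues.get(channel, set())
        listeners.discard(queue)
        if not listeners and channel in self._readers:  # last listener closes it
            self._readers.pop(channel).cancel()
            self._queues.pop(channel, None)

    async def _reader_loop(self, channel: str) -> None:
        async for message in self._open_source(channel):
            for queue in list(self._queues.get(channel, ())):
                queue.put_nowait(message)


async def demo():
    opened = 0

    async def fake_source(channel: str):
        nonlocal opened
        opened += 1  # counts how many upstream subscriptions were opened
        await asyncio.sleep(0)
        yield f"{channel}: new_message"

    hub = FanOutHub(fake_source)
    q1, q2 = asyncio.Queue(), asyncio.Queue()
    await hub.subscribe("replyq:sse:t1", q1)
    await hub.subscribe("replyq:sse:t1", q2)
    messages = [await q1.get(), await q2.get()]
    return opened, messages


print(asyncio.run(demo()))
```

In the demo both queues receive the message while the upstream is opened only once, which is the property the multiplexer is meant to provide.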

The Frontend: A useSSE Hook

We built a custom React hook to manage the EventSource lifecycle:

Opens EventSource with JWT token as a URL query parameter (standard EventSource cannot send headers)
If a 401 is received (expired token), waits 3 seconds and reconnects with a fresh token from localStorage
Uses a stable callback ref to avoid unnecessary React re-renders

Every page that needs real-time updates uses this hook. When an SSE event arrives, it triggers a targeted refetch of only the relevant data.

One More Optimization: The SSE Event Bus

We discovered that the conversations page was opening two EventSource connections — one inside layout.tsx and one in the page component itself. Two SSE connections to the same channel.

The fix: an sseEventBus. The layout holds a single SSE connection and broadcasts events to an internal EventEmitter. Page components subscribe to the bus instead of opening their own EventSource. Result: 33% fewer SSE connections per user session.

We Also Fixed the Rate Limit Key

Alongside the SSE migration, we changed the rate limit from IP-based to authenticated User ID-based:

Before: 300 requests per minute per IP, so a company with 10 employees behind one NAT address gets about 30 requests per minute per person
After: 1,000 requests per minute per authenticated user ID — each person gets an independent quota
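
The effect of changing the key can be shown with a toy limiter. This is a simplified fixed-window counter written for illustration, not the limiter ReplyQ runs; the class name, the frozen clock, and the sample IP and user IDs are all assumptions, and old windows are never evicted.

```python
import time
from collections import defaultdict


class FixedWindowLimiter:
    """Allow at most `limit` requests per `window_s` seconds for each key."""

    def __init__(self, limit: int, window_s: float = 60.0, clock=time.monotonic):
        self.limit = limit
        self.window_s = window_s
        self._clock = clock
        self._counts = defaultdict(int)  # (key, window index) -> request count

    def allow(self, key: str) -> bool:
        window = int(self._clock() // self.window_s)
        self._counts[(key, window)] += 1
        return self._counts[(key, window)] <= self.limit


def frozen_clock() -> float:
    return 0.0  # keeps every request in the same window for a repeatable demo


by_ip = FixedWindowLimiter(limit=300, clock=frozen_clock)
by_user = FixedWindowLimiter(limit=1000, clock=frozen_clock)

# Ten employees behind one NAT address, 60 requests each within the same minute.
ip_rejected = sum(not by_ip.allow("203.0.113.7") for _ in range(10 * 60))
user_rejected = sum(
    not by_user.allow(f"user-{u}") for u in range(10) for _ in range(60)
)
print(ip_rejected, user_rejected)  # 300 0
```

With the IP key, half of the office's traffic is rejected even though no individual is misbehaving; with the user key, each person stays well under their own quota.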

The Dashboard Init Endpoint: 4 Requests Become 1

We also added a /api/v1/dashboard/init endpoint that returns everything needed at dashboard load in a single response: user profile, tenant settings, feature flags, and channel list — all served from Redis cache in sub-millisecond time.

On first load: 4 separate API calls became 1. On internal navigation: only getMe runs (everything else is already seeded into the React Query cache from the init response).
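
As a rough illustration of the aggregation, an init handler can gather its parts concurrently and return them as one payload. Everything named here is hypothetical: the loader functions, their return values, and `build_dashboard_init` stand in for whatever reads the cached data in the real endpoint.

```python
import asyncio
from typing import Awaitable, Callable, Dict

Loader = Callable[[], Awaitable[object]]


async def build_dashboard_init(loaders: Dict[str, Loader]) -> dict:
    """Run every loader concurrently and merge the results into one payload."""
    names = list(loaders)
    results = await asyncio.gather(*(loaders[name]() for name in names))
    return dict(zip(names, results))


# Hypothetical loaders; in practice each would read from a cache or database.
async def load_profile():
    return {"id": "u1"}


async def load_tenant_settings():
    return {"timezone": "UTC"}


async def load_feature_flags():
    return {"sse": True}


async def load_channels():
    return ["whatsapp", "instagram"]


payload = asyncio.run(build_dashboard_init({
    "user_profile": load_profile,
    "tenant_settings": load_tenant_settings,
    "feature_flags": load_feature_flags,
    "channels": load_channels,
}))
print(sorted(payload))
```

The client then makes a single round trip and can seed its local cache from the four keys in the response.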

What We Gained

Zero continuous polling — updates arrive only when something changes
Instant delivery — an inbound message appears in the dashboard in under a second
Redis connection efficiency — the multiplexer prevents N connections for the same channel
User-scoped rate limits — NAT is no longer a problem
Stability — automatic reconnect, 25-second keepalives to handle Cloudflare timeouts
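
The keepalive mentioned in the last item can be sketched as an async generator that waits on a queue with a timeout and emits an SSE comment line when nothing arrives. This is an assumed shape rather than the production code: `sse_stream`, the `None` sentinel, and the queue as event source are illustrative, and the demo shortens the 25-second interval so it finishes quickly.

```python
import asyncio
from typing import AsyncIterator


async def sse_stream(queue: asyncio.Queue, keepalive_s: float = 25.0) -> AsyncIterator[str]:
    """Yield SSE frames from `queue`, or a comment frame when it stays idle."""
    while True:
        try:
            payload = await asyncio.wait_for(queue.get(), timeout=keepalive_s)
        except asyncio.TimeoutError:
            yield ": keepalive\n\n"  # comment line, ignored by EventSource
            continue
        if payload is None:  # hypothetical sentinel meaning "close the stream"
            return
        yield f"data: {payload}\n\n"


async def demo() -> list:
    queue: asyncio.Queue = asyncio.Queue()
    stream = sse_stream(queue, keepalive_s=0.05)
    frames = [await stream.__anext__()]  # queue is idle, so a keepalive comes first
    queue.put_nowait('{"type": "new_message"}')
    queue.put_nowait(None)
    async for frame in stream:
        frames.append(frame)
    return frames


print(asyncio.run(demo()))
```

The comment frame carries no data, but it keeps bytes flowing on the connection so that an intermediary with an idle timeout does not close it.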

Before: approximately 2,200 API requests per minute with 100 users. After: a small number of open SSE connections plus API requests only on actual data changes. More than 90% reduction in API load.

Lessons We Would Apply Again

Polling is simplicity with a hidden cost. You do not feel it in development. In production, with real users and real NAT, you do.
IP-based rate limiting is wrong for B2B. Business customers always share IPs. Always key the rate limit to an authenticated user or tenant ID.
SSE beats WebSockets for dashboards. It is unidirectional, runs over plain HTTP through standard proxies, and browser reconnect is built in.
Fan-out on the backend saves a lot. One Redis subscription loop broadcasting to N asyncio queues is far cheaper than N subscription loops.

If your dashboard is built on polling and you are starting to see 429 errors, the SSE path is worth considering. For ReplyQ, it was the right call at exactly the right time.

Want to see the real-time dashboard in action? Schedule a free demo and we will walk you through it live.

Tags

Server-Sent Events, SSE vs polling, rate limit 429, Redis pub sub, Next.js real-time, WebSocket alternative, real-time dashboard, asyncio SSE
