The first version of DevBar polled every integration on a 60-second interval. It worked, but 60 seconds is an eternity during a production incident. By the time the menu bar updated to show a new PagerDuty alert, the on-call engineer had already gotten a phone call.

We needed sub-second delivery. Here’s what we built.

Why polling is the wrong model for alerts

Polling feels simple: fetch data every N seconds, diff it, surface changes. The problems compound quickly in practice.

Latency is bounded by your interval. A 60-second poll means worst-case 60-second delay. You can’t reduce this below 5-10 seconds without hammering the upstream APIs — and most of them rate-limit you before you get close to real-time.

Wasted requests. The vast majority of polls return nothing new. You’re paying API rate limit budget and backend CPU for responses that say “same as before.”

Battery drain on macOS. Native macOS apps are expected to be good citizens about energy use. Polling timers that fire every few seconds prevent the CPU from reaching idle states and show up in Activity Monitor’s energy impact column. App Store review will flag it. Users will notice.

The right model for alerts is push: the server notifies the client when something changes, not the other way around.

WebSockets with JWT auth via query parameter

The DevBar macOS client embeds a WKWebView for the popover UI. WKWebView handles most things well, but it inherits a well-known limitation of the browser WebSocket API: there is no way to attach custom headers to the WebSocket handshake. The constructor takes only a URL and an optional subprotocol list, so the Upgrade: websocket request goes out without an Authorization header.

This rules out the standard Authorization: Bearer <token> pattern for WebSocket auth.

Our solution is to pass the JWT as a query parameter on the WebSocket URL:

wss://api.devbar.app/ws?token=eyJhbGci...

The backend validates the token before upgrading the connection. Yes, the token appears in server access logs. We mitigate this with short-lived tokens (15-minute expiry, refreshed before connecting) and TLS everywhere. For a menu bar app talking to our own backend, this is an acceptable tradeoff. If you’re building something with stricter requirements, consider a ticket-exchange pattern: one HTTP request to get a short-lived opaque token, then connect with that.

The hub pattern

The WebSocket hub is a single goroutine that owns all connection state. No locks, no shared maps — all mutations happen in one place.

type Hub struct {
    clients    map[*Client]bool
    broadcast  chan []byte
    register   chan *Client
    unregister chan *Client
}

func (h *Hub) Run() {
    for {
        select {
        case client := <-h.register:
            h.clients[client] = true
        case client := <-h.unregister:
            if _, ok := h.clients[client]; ok {
                delete(h.clients, client)
                close(client.send)
            }
        case message := <-h.broadcast:
            for client := range h.clients {
                select {
                case client.send <- message:
                default:
                    close(client.send)
                    delete(h.clients, client)
                }
            }
        }
    }
}

Each Client has a buffered send channel. The hub never blocks on a slow client: if the channel is full (the default branch), that client is dropped. This prevents one slow connection from stalling delivery to everyone else.

Keepalive: ping/pong

WebSocket connections through proxies and load balancers get silently dropped after idle periods. We send a ping every 30 seconds and expect a pong back within 60 seconds:

const (
    pingInterval = 30 * time.Second
    pongTimeout  = 60 * time.Second
    writeWait    = 10 * time.Second // budget for any single write to complete
)

func (c *Client) writePump() {
    ticker := time.NewTicker(pingInterval)
    defer ticker.Stop()
    for {
        select {
        case message, ok := <-c.send:
            if !ok {
                // The hub closed the channel: tell the peer we're going away.
                c.conn.WriteMessage(websocket.CloseMessage, []byte{})
                return
            }
            c.conn.SetWriteDeadline(time.Now().Add(writeWait))
            if err := c.conn.WriteMessage(websocket.TextMessage, message); err != nil {
                return
            }
        case <-ticker.C:
            c.conn.SetWriteDeadline(time.Now().Add(writeWait))
            if err := c.conn.WriteMessage(websocket.PingMessage, nil); err != nil {
                return
            }
        }
    }
}

The read pump sets a pong handler that resets the deadline. If no pong arrives within 60 seconds, the read deadline fires, the connection errors, and the client is unregistered from the hub.

Webhook receivers → event bus → WebSocket push

The actual alert delivery path:

  1. PagerDuty, GitHub, or Datadog sends a webhook to our backend.
  2. The webhook handler validates the signature (HMAC-SHA256 for most, custom for PagerDuty).
  3. The event is published to an in-process event bus — a simple chan Event with typed subscribers.
  4. A subscriber for each event type formats the payload and sends it to hub.broadcast.
  5. The hub delivers it to all connected clients that are authenticated for that team/org.

Per-user filtering happens at the hub layer, not the webhook layer. All events flow into the bus; the hub decides which clients should see which events based on the user’s configured integrations and team membership.
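That hub-layer filter can be sketched as a predicate the broadcast loop consults before sending to each client. Field and type names here are illustrative, not DevBar's actual schema:

```go
package main

import "fmt"

// Event is what flows over the bus: Source is the integration that
// produced it, Team scopes who may see it.
type Event struct {
	Source string // "pagerduty", "github", "datadog"
	Team   string
	Body   []byte
}

// subscriber captures what the hub knows about a connected client.
type subscriber struct {
	team    string
	sources map[string]bool // integrations this user has enabled
}

// wants is the hub-side filter: the team must match and the
// integration must be enabled for this user.
func (s *subscriber) wants(e Event) bool {
	return s.team == e.Team && s.sources[e.Source]
}

func main() {
	alice := &subscriber{team: "platform", sources: map[string]bool{"pagerduty": true}}
	bob := &subscriber{team: "platform", sources: map[string]bool{"github": true}}

	e := Event{Source: "pagerduty", Team: "platform", Body: []byte("incident")}
	fmt.Println(alice.wants(e)) // prints "true"
	fmt.Println(bob.wants(e))   // prints "false"
}
```

Keeping the predicate next to the hub means the webhook handlers stay dumb pipes: they validate, normalize, and publish, and routing policy lives in exactly one place.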

Result

End-to-end latency from a PagerDuty incident firing to the DevBar menu bar icon updating is consistently under one second in our testing. For GitHub PR review requests and Datadog monitor alerts, it’s similar.

The tradeoff is connection overhead: each DevBar client holds an open WebSocket connection. At our current scale this is fine — Go handles hundreds of thousands of concurrent connections comfortably. If we get to the point where this is a scaling concern, the natural next step is a dedicated WebSocket gateway service (or just use a managed pubsub like Ably or Pusher), but we’re not there yet.

Polling was the right call for v1 — it got us to launch. Replacing it with push was the right call for v2. Most infrastructure decisions are about picking the right tool for the current problem size, not the theoretical future one.