Engineering · March 2026 · 4 min read

How Blocks routes tasks without a load balancer.

Traditional load balancers require open ports, static IPs, and ops overhead. Here's how Blocks distributes agent tasks across any number of instances using PubNub Presence, with no routing infrastructure of your own to operate.

One of the most common questions we get about Blocks: "If there's no load balancer, how do tasks get distributed across multiple agent instances?"

It's a fair question. Traditional load balancing involves an explicit piece of infrastructure sitting in front of your services. It has an IP, a DNS record, health check endpoints, and state about which backends are ready to accept traffic.

Blocks has none of that. And it still distributes tasks efficiently across as many instances as you want to run, anywhere on the planet.

Presence: PubNub's real-time occupancy layer.

The key is PubNub Presence. When an agent instance starts, it subscribes to a control channel and publishes its current capacity state — how many concurrent tasks it can handle and how many it's currently processing.
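
A minimal sketch of that start-up step, under a few assumptions. The `PresenceClient` interface is a hypothetical stand-in that mirrors the shape of the PubNub JavaScript SDK's `subscribe` and `setState` calls, and the helper names (`announceInstance`, `applyLoadDelta`, `reportLoadChange`) are illustrative rather than part of Blocks. The state fields `maxConcurrent` and `currentLoad` are the ones the routing code in this post reads.

```typescript
// Capacity state each instance advertises through Presence.
interface CapacityState {
  maxConcurrent: number;
  currentLoad: number;
}

// Hypothetical minimal client shape, modeled on the PubNub JS SDK calls used here.
interface PresenceClient {
  subscribe(params: { channels: string[] }): void;
  setState(params: { channels: string[]; state: CapacityState }): Promise<unknown>;
}

function controlChannel(agentType: string): string {
  return `blocks-control-${agentType}`;
}

// Subscribe to the control channel, then advertise an idle instance.
async function announceInstance(
  client: PresenceClient,
  agentType: string,
  maxConcurrent: number,
): Promise<CapacityState> {
  const channel = controlChannel(agentType);
  const state: CapacityState = { maxConcurrent, currentLoad: 0 };
  client.subscribe({ channels: [channel] });
  await client.setState({ channels: [channel], state });
  return state;
}

// Pure helper: +1 when a task starts, -1 when it finishes (never below zero).
function applyLoadDelta(state: CapacityState, delta: number): CapacityState {
  return { ...state, currentLoad: Math.max(0, state.currentLoad + delta) };
}

// Re-publish state so other readers see the new load.
async function reportLoadChange(
  client: PresenceClient,
  agentType: string,
  state: CapacityState,
  delta: number,
): Promise<CapacityState> {
  const next = applyLoadDelta(state, delta);
  await client.setState({ channels: [controlChannel(agentType)], state: next });
  return next;
}
```

In this sketch an instance would call `reportLoadChange` with `+1` when it accepts a task and `-1` when the task completes, so the advertised load tracks what it is actually doing.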

This state is available in real time to anyone who asks. When a new task arrives, the system calls hereNow() to get the current list of online instances and their load:

typescript
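// hereNow() keys its response by channel name, so keep the name in a variable.
const channel = `blocks-control-${agentType}`;

const presence = await pubnub.hereNow({
  channels: [channel],
  includeState: true,
});

// Keep instances with spare capacity, least-loaded first.
// Occupants that have not published any state yet are skipped.
const available = (presence.channels[channel]?.occupants ?? [])
  .filter(o => o.state && o.state.currentLoad < o.state.maxConcurrent)
  .sort((a, b) => a.state.currentLoad - b.state.currentLoad);

const target = available[0]; // lowest load first; undefined if nothing has capacity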
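
The selection step can also be pulled out into a pure function, which makes it easy to test without a live PubNub connection. This is a sketch, not Blocks' actual implementation: `Occupant` is a hypothetical type that mirrors the `uuid` and `state` fields of a `hereNow()` occupant, and `pickLeastLoaded` is an illustrative name.

```typescript
interface CapacityState {
  maxConcurrent: number;
  currentLoad: number;
}

// Hypothetical shape of one hereNow() occupant; state may be missing
// if the instance has subscribed but not yet published its capacity.
interface Occupant {
  uuid: string;
  state?: Partial<CapacityState> | null;
}

// Type guard: true only for occupants with complete state and spare capacity.
function hasCapacity(o: Occupant): o is Occupant & { state: CapacityState } {
  const s = o.state;
  return (
    !!s &&
    typeof s.currentLoad === "number" &&
    typeof s.maxConcurrent === "number" &&
    s.currentLoad < s.maxConcurrent
  );
}

// Returns the UUID of the least-loaded instance with capacity,
// or undefined when every instance is full or has no state.
function pickLeastLoaded(occupants: Occupant[]): string | undefined {
  const available = occupants
    .filter(hasCapacity)
    .sort((a, b) => a.state.currentLoad - b.state.currentLoad);
  return available[0]?.uuid;
}
```

The returned state is a snapshot: an instance's load can change between the `hereNow()` call and the moment the task is published, so a caller may want to treat an `undefined` result, or a target that turns out to be busy, as a signal to retry.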

Routing without routing infrastructure.

Once the system selects a target instance, it publishes the task to the control channel with the target's instance UUID embedded in the message. Every instance subscribed to the channel receives it, and each one's listener ignores anything not addressed to it:

typescript
pubnub.addListener({
  message: ({ message }) => {
    if (message.targetInstanceId !== myInstanceId) return;
    handleTask(message.task);
  },
});

The result: tasks flow directly to the right instance through the pub/sub fabric. No routing table. No health checks. No load balancer config.
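
The dispatch side of this exchange can be sketched as follows. The message fields `targetInstanceId` and `task` are the ones the listener above reads. Everything else is an assumption for illustration: `PublishClient` is a hypothetical stand-in mirroring the shape of the PubNub JavaScript SDK's `publish` call, and `buildEnvelope`, `isAddressedTo`, and `dispatchTask` are illustrative names.

```typescript
// Message shape read by the listener: who it is for, and the task itself.
interface TaskEnvelope<T> {
  targetInstanceId: string;
  task: T;
}

// Hypothetical minimal client shape, modeled on the PubNub JS SDK publish call.
interface PublishClient {
  publish(params: { channel: string; message: unknown }): Promise<unknown>;
}

function buildEnvelope<T>(targetInstanceId: string, task: T): TaskEnvelope<T> {
  return { targetInstanceId, task };
}

// Receiver-side check, equivalent to the early return in the listener.
function isAddressedTo(message: unknown, instanceId: string): message is TaskEnvelope<unknown> {
  return (
    typeof message === "object" &&
    message !== null &&
    (message as { targetInstanceId?: unknown }).targetInstanceId === instanceId
  );
}

// Publish a task on the shared control channel, addressed to one instance.
async function dispatchTask<T>(
  client: PublishClient,
  agentType: string,
  targetInstanceId: string,
  task: T,
): Promise<TaskEnvelope<T>> {
  const envelope = buildEnvelope(targetInstanceId, task);
  await client.publish({ channel: `blocks-control-${agentType}`, message: envelope });
  return envelope;
}
```

Because the channel is shared, every subscribed instance still receives each message and discards the ones meant for others; the addressing lives in the payload, not in the transport.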

Scaling is just starting another instance.

Want more capacity? Start another instance — on any machine, in any location. It subscribes to the control channel, publishes its capacity state, and immediately becomes eligible for task routing. No registration required.

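Want less? Stop an instance. A clean shutdown that unsubscribes is reflected in Presence almost immediately; an instance that crashes or loses its connection drops out once its presence timeout expires, so how quickly that happens depends on the timeout you configure. Either way, tasks stop routing to it. No deregistration required.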
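
One way to picture how joins and departures change routing eligibility is a small reducer over presence events. This is a sketch under assumptions: the `action` values (`join`, `leave`, `timeout`, `state-change`, `interval`) and the `uuid` and `state` fields follow the shape of PubNub presence events, while `Roster` and `applyPresenceEvent` are hypothetical names introduced here.

```typescript
interface CapacityState {
  maxConcurrent: number;
  currentLoad: number;
}

type PresenceAction = "join" | "leave" | "timeout" | "state-change" | "interval";

// Subset of a presence event that matters for routing eligibility.
interface PresenceEvent {
  action: PresenceAction;
  uuid?: string;
  state?: CapacityState;
}

// Instance UUID -> last known capacity (null until the instance publishes state).
type Roster = Record<string, CapacityState | null>;

// Returns a new roster; the input is left untouched.
function applyPresenceEvent(roster: Roster, event: PresenceEvent): Roster {
  const next: Roster = { ...roster };
  if (!event.uuid) return next; // e.g. interval events do not name a single instance
  switch (event.action) {
    case "join":
    case "state-change":
      // Keep any previously known state if this event carries none.
      next[event.uuid] = event.state ?? next[event.uuid] ?? null;
      break;
    case "leave":   // clean unsubscribe
    case "timeout": // instance vanished without unsubscribing
      delete next[event.uuid];
      break;
  }
  return next;
}
```

An instance is eligible for tasks only while it has an entry with known state, so both a clean `leave` and a `timeout` remove it from consideration without any explicit deregistration call.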

Presence-based routing covers the common scaling scenarios, adding and removing instances on demand, with very little operational overhead.

The deeper point.

This architecture isn't a clever hack. It's a natural consequence of building on real-time pub/sub infrastructure. PubNub was designed for exactly this kind of scenario: devices connecting and disconnecting, state changing in real time, events flowing to the right subscribers. Blocks applies those primitives to the agent communication problem.