Message queues and pub/sub are frequently treated as interchangeable — two names for the same idea of "sending messages between services asynchronously." In practice, they solve fundamentally different problems, and choosing the wrong one for a given use case produces bugs that are difficult to reproduce and expensive to fix in production.
This post draws a precise boundary between the two patterns, explains the underlying mechanism that separates them, and identifies four categories of problems where only a message queue provides the correct guarantee.
The core distinction
Both patterns decouple a sender from its receivers. The difference is in what happens to a message after it is sent.
In a message queue, a message is placed into an ordered structure and remains there until exactly one consumer retrieves and acknowledges it. The queue tracks ownership: while a consumer holds a message, no other consumer can access it. Once the consumer acknowledges successful processing, the message is removed. If the consumer fails without acknowledging, the message becomes available again.
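The claim/ack lifecycle can be sketched in a few lines. This is a hypothetical in-memory model, not a real broker — production queues add visibility timeouts, persistence, and delivery limits — but it captures the ownership tracking described above:

```python
import itertools

class SimpleQueue:
    """Illustrative queue with claim/ack semantics (a sketch, not a broker)."""

    def __init__(self):
        self._ids = itertools.count()
        self._ready = []        # (msg_id, body) waiting to be claimed
        self._in_flight = {}    # msg_id -> body, claimed but not yet acked

    def send(self, body):
        self._ready.append((next(self._ids), body))

    def receive(self):
        """Claim the oldest message; it becomes invisible to other consumers."""
        if not self._ready:
            return None
        msg_id, body = self._ready.pop(0)
        self._in_flight[msg_id] = body
        return msg_id, body

    def ack(self, msg_id):
        """Acknowledge successful processing: the message is removed for good."""
        del self._in_flight[msg_id]

    def nack(self, msg_id):
        """Report failure: the message becomes available again."""
        self._ready.insert(0, (msg_id, self._in_flight.pop(msg_id)))
```

While a message is in flight, `receive` returns nothing to other consumers — that is the visibility isolation; a `nack` (or, in a real broker, an expired visibility timeout) puts it back at the front of the queue.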
In pub/sub, a message is published to a topic and broadcast to every active subscriber. Each subscriber receives its own independent copy. The topic does not track who is processing what — it simply delivers and moves on. If two subscribers are listening, both receive the message. If ten are listening, all ten receive it.
The analogy is simple: a queue is a postal system — one parcel, one recipient, delivery confirmed. Pub/sub is a radio broadcast — one transmission, every receiver tuned to that frequency picks it up simultaneously.
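The broadcast side can be sketched just as briefly. This `Topic` class is an illustrative stand-in for a real broker; delivery here is a direct synchronous call, which is enough to show the fan-out semantics:

```python
class Topic:
    """Illustrative pub/sub topic: every subscriber gets its own copy.
    Real brokers deliver asynchronously over the network; the semantics
    are the same."""

    def __init__(self):
        self._subscribers = []

    def subscribe(self, callback):
        self._subscribers.append(callback)

    def publish(self, message):
        # Broadcast: deliver to every subscriber, then forget about it.
        # The topic keeps no record of who processed what.
        for deliver in self._subscribers:
            deliver(message)
```

Two subscribers means two deliveries of every published message; the publisher never learns, and never needs to learn, how many there are.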
What each pattern guarantees
A message queue guarantees delivery to exactly one consumer at a time, FIFO ordering within a queue, backpressure through the natural buffering of unconsumed messages, and visibility isolation — a message being processed by one consumer is invisible to all others. (Most brokers are at-least-once overall: a message that is never acknowledged is redelivered, which is why consumers should be idempotent.)
Pub/sub guarantees fan-out — every subscriber receives every message — and decoupling of producer from consumer count. The publisher does not need to know how many subscribers exist or what they do with the message. It publishes; distribution is the broker's concern.
Neither guarantee is superior. They serve different needs. The problem arises when a developer reaches for pub/sub in a situation that requires queue semantics.
Four problems only a queue solves correctly
1. Exactly-once job processing
Consider a payment processing pipeline. An order arrives, triggers a charge, and generates an invoice. This sequence must execute exactly once per order. With a queue, a single worker claims the message; all other workers in the pool are blocked from claiming the same message by the broker's visibility mechanism. Processing is atomic from the broker's perspective.
With pub/sub, every subscriber receives the event. If two billing services are subscribed — perhaps for redundancy — both execute the charge. The customer is billed twice. The root cause is not a bug in the billing service; it is an architectural mismatch. Pub/sub was designed to deliver to all subscribers, and it does so correctly. It simply cannot provide the exclusivity that payment processing requires.
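The mismatch is easy to demonstrate. In this sketch, `charge` is a hypothetical billing function; routing the same event through queue-style claiming and then through pub/sub-style broadcast produces one charge in the first case and two in the second:

```python
charges = []

def charge(order_id):
    """Hypothetical billing call: records a charge for the order."""
    charges.append(order_id)

# Queue semantics: two redundant workers, but only one claims the message.
order_queue = ["order-42"]
workers = [charge, charge]          # a redundant pair of billing workers
for worker in workers:
    if order_queue:                 # only the first worker finds a message
        worker(order_queue.pop(0))

assert charges == ["order-42"]      # billed exactly once

# Pub/sub semantics: both subscribers receive their own copy.
charges.clear()
subscribers = [charge, charge]
for deliver in subscribers:
    deliver("order-42")             # broadcast: every subscriber gets it

assert charges == ["order-42", "order-42"]   # billed twice
```

Both halves behave exactly as designed; only the first design matches what payment processing requires.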
2. Load balancing across a worker pool
A queue distributes work naturally. Multiple consumers read from the same queue, each claiming individual messages as capacity permits. If one hundred tasks arrive and three workers are running, the workload distributes across the pool — no task is processed more than once in aggregate.
Pub/sub fan-out is the inverse of load balancing. Three subscribers on a topic that receives one hundred messages means three hundred total executions. Work is replicated, not distributed. Pub/sub systems can approximate queue behavior through consumer group abstractions — Kafka's consumer groups being the canonical example — but this requires explicit configuration and introduces its own complexity. When the requirement is straightforward work distribution, a queue expresses that intent directly.
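The one-hundred-tasks-three-workers case can be sketched with Python's standard-library `queue.Queue`, which already provides the thread-safe claim semantics:

```python
import queue
import threading
from collections import Counter

# Sketch of queue-based load balancing: one hundred tasks, three
# workers, every task processed exactly once in aggregate.
tasks = queue.Queue()
for i in range(100):
    tasks.put(i)

processed = Counter()
lock = threading.Lock()

def worker():
    while True:
        try:
            task = tasks.get_nowait()   # claim one task; stop when empty
        except queue.Empty:
            return
        with lock:
            processed[task] += 1        # record that this task was handled

threads = [threading.Thread(target=worker) for _ in range(3)]
for t in threads:
    t.start()
for t in threads:
    t.join()

assert sum(processed.values()) == 100           # all tasks handled
assert all(n == 1 for n in processed.values())  # none handled twice
```

With a topic in place of the queue, the same three workers would each see all one hundred tasks — three hundred executions instead of one hundred.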
3. Retry logic and dead letter queues
Message queues support structured failure handling. When a worker fails to process a message — due to a transient network error, a downstream service being unavailable, or an unexpected exception — it can negatively acknowledge the message. The broker returns the message to the queue for reprocessing. After a configurable number of failures, the message is moved to a dead letter queue for inspection and manual intervention.
This retry mechanism depends on message ownership. The broker must know which consumer holds which message, and for how long, in order to detect abandonment and reassign. Pub/sub brokers do not maintain this per-message ownership model. While some pub/sub systems offer acknowledgment and retry at the subscriber level, they cannot prevent duplicate delivery across multiple subscribers — which returns us to the exactly-once problem described above.
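The retry-then-dead-letter flow can be sketched as follows. The handler here is hypothetical and always fails, and the retry limit of three is an arbitrary illustration of the "configurable number of failures":

```python
# Sketch of retry handling with a dead letter queue: a message that
# fails MAX_RETRIES times is parked for inspection instead of looping
# forever.
MAX_RETRIES = 3
main_queue = [("resize-image", 0)]   # (body, delivery_attempts)
dead_letter_queue = []

def handle(body):
    """Hypothetical handler standing in for a worker that keeps failing."""
    raise RuntimeError("downstream service unavailable")

while main_queue:
    body, attempts = main_queue.pop(0)
    try:
        handle(body)
    except RuntimeError:
        attempts += 1
        if attempts >= MAX_RETRIES:
            dead_letter_queue.append(body)       # give up; park for inspection
        else:
            main_queue.append((body, attempts))  # nack: requeue for retry

assert dead_letter_queue == ["resize-image"]
```

The delivery count travelling with the message is the piece of per-message state that makes this possible — exactly the ownership model a pure broadcast topic does not keep.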
4. Rate limiting and backpressure
Queues are pull-based. A consumer retrieves messages at its own pace, constrained by its processing capacity. If a producer sends one thousand messages per second but a worker can process only ten, the queue absorbs the difference. The worker is never overwhelmed; it simply works through the backlog at its natural rate.
Pub/sub is typically push-based. The broker delivers messages to subscribers as they arrive. A subscriber with insufficient capacity receives messages faster than it can process them, leading to memory pressure, dropped messages, or cascading failures. The queue's buffering behavior is not incidental — it is the mechanism that makes the two patterns behave differently under load.
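A bounded queue makes the backpressure concrete. In this sketch the capacity of 10 and the burst of 25 are arbitrary; the point is that a full queue pushes back on the producer rather than flooding the consumer:

```python
import queue

# Sketch of pull-based backpressure with a bounded buffer.
buffer = queue.Queue(maxsize=10)

accepted, rejected = 0, 0
for i in range(25):                  # producer bursts 25 messages
    try:
        buffer.put_nowait(i)
        accepted += 1
    except queue.Full:
        rejected += 1                # backpressure: producer must slow down

assert accepted == 10 and rejected == 15

# The consumer drains at its own pace, freeing capacity as it goes.
drained = 0
while not buffer.empty():
    buffer.get_nowait()
    drained += 1
assert drained == 10
```

In a real system the producer would block or retry with backoff rather than count rejections, but the signal is the same: the consumer's capacity, not the producer's send rate, sets the pace.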
When pub/sub is the correct choice
Pub/sub is the right pattern when an event should be known by multiple independent systems, and it is acceptable — or desirable — for all of them to act on it simultaneously.
A user registration event might need to trigger an email confirmation, a CRM entry creation, an analytics event, and an onboarding sequence. These are independent concerns. No single service should own the event. The email system does not need to know about the CRM; the analytics pipeline does not need to coordinate with onboarding. Pub/sub expresses this fan-out cleanly, and adding a new subscriber later requires no changes to the publisher or existing subscribers.
The test is straightforward: if duplicate processing would be a correctness problem — billing twice, sending two confirmation emails, reserving the same inventory slot twice — then a queue is required. If the same event legitimately needs to be processed by multiple independent systems, pub/sub is the appropriate tool.
Using both patterns together
Production systems frequently use pub/sub and message queues in combination. A common pattern is the fan-out queue: a pub/sub topic broadcasts an event to multiple queues, each owned by a separate service. Each queue then serializes delivery within its service, providing exactly-once semantics for that service's processing while preserving the fan-out behavior at the distribution layer.
AWS SNS combined with SQS is the canonical implementation of this pattern. Google Pub/Sub with per-subscriber subscriptions provides similar guarantees. The two patterns are not competing choices — they operate at different layers of a messaging architecture.
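The layering can be sketched with plain lists standing in for the topic and the per-service queues. The service names are illustrative:

```python
from collections import defaultdict

# Sketch of the fan-out-queue pattern: a topic broadcasts into one
# queue per subscribed service (the SNS-like layer), and each service
# then drains its own queue serially (the SQS-like layer).
service_queues = defaultdict(list)

def publish(event, subscriptions):
    for service in subscriptions:        # fan-out: one copy per service
        service_queues[service].append(event)

publish("user_registered", ["email", "crm", "analytics"])

# Each service's single worker claims messages from its own queue,
# one at a time -- exactly-once within the service, fan-out across them.
handled = {}
for service, q in service_queues.items():
    handled[service] = [q.pop(0) for _ in range(len(q))]

assert handled == {
    "email": ["user_registered"],
    "crm": ["user_registered"],
    "analytics": ["user_registered"],
}
```

Every service sees the event once, and within each service the queue serializes processing — the broadcast and the exclusivity live at different layers, which is the whole point of the pattern.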
Practical tool reference
For teams choosing infrastructure: RabbitMQ and Amazon SQS are purpose-built message queues. Note that standard SQS is at-least-once with best-effort ordering; SQS FIFO queues add strict ordering and deduplication, and idempotent consumers close the remaining gap to effectively exactly-once processing. BullMQ (Node.js, Redis-backed) and asynq (Go, Redis-backed) are well-suited for application-level job queues. Apache Kafka and NATS are primarily pub/sub systems, though Kafka's consumer group model supports queue-like behavior within a group. Google Cloud Pub/Sub and AWS SNS are managed pub/sub services that pair well with downstream queues for fan-out-then-serialize patterns.
Firebase, for completeness, provides neither a native message queue nor a fully featured pub/sub system. Cloud Pub/Sub is available as a separate Google Cloud service and requires explicit integration.
The distinction between these two patterns is not academic. It is the difference between a payment charged once and charged twice, between a task processed by one worker and processed by all of them. Understanding what each pattern guarantees — and what it does not — is a prerequisite for designing distributed systems that behave correctly under load and failure.