Study Notes — Certification Prep

Pub/Sub
Study Guide

Event-driven asynchronous messaging service that decouples senders and receivers. Allows for secure and highly available communication between independently written applications.

Updated: April 2026
Version: 1.2
Category: Messaging
Reading Time: ~20 min
Author: Michaël Bettan
01

Definition & Use Cases

What is Pub/Sub?

Pub/Sub is an event-driven asynchronous messaging service that decouples senders (producing events) and receivers (processing events). It allows for secure and highly available communication between independently written applications.

Availability
Global (no guarantee of multi-region storage)
SLA
>=99.95%

Use Cases

  • Data streaming from various processes or devices
  • Balancing workloads in network clusters
  • Distributing tasks efficiently through a queue
  • Implementing asynchronous workflows
  • Improving reliability in case of a zone failure
  • Distributing event notifications
  • Refreshing distributed caches
  • Logging to multiple systems

Billing

  • Data throughput: Volume of data processed (GB).
  • Data storage: Charges apply for storing messages, with different rates for different storage durations.
  • Network usage: Costs are incurred for data transferred between regions or out of GCP.
02

Core Concepts & Data Model

Core Concepts

Streaming Data
Unbounded data (continuous flow of data).
Messaging service
Queue between applications: one application writes, another reads asynchronously.
Loosely coupled architecture
Designing interfaces across modules to reduce the interdependencies across components (Fault tolerance, Scalability, Message Queuing).
Deduplication
Action to eliminate duplicate records.
Backlog
Accumulation of data to be processed.
Sensors
Device emitting events.
Publisher / Subscriber
Senders producing events / Receivers processing events.
Dead letter Queue
Holds messages that could not be processed successfully, for offline inspection.
Seek & Replay
Ability to reprocess & discard messages.
Ack deadline
Time Pub/Sub waits for the subscriber to ACK a message before redelivering it (applies to both push and pull).
Snapshot
Captures current state at a specific point in time.
Msg Retention Duration
Acknowledged messages can be retained for up to 7 days.

Data Model

Topic
A named resource to which messages are sent by publishers.
Subscription
A named resource representing the stream of messages from a single, specific topic, to be delivered to the subscribing application.
Message
The combination of data and optional attributes that a publisher sends to a topic and is eventually delivered to subscribers.
Message Attribute
Key-value pair that a publisher can define for a message.
03

Message Flow & Considerations

Message Flow

Considerations

04

Architecture, Patterns & Batching

Architecture Overview

Pull Subscriptions

Subscribers actively request messages from the subscription. They have to explicitly acknowledge messages after successful processing. This provides more control over the message consumption rate and is suitable for work queues where guaranteed processing is crucial.

  • Large volume of messages (many more than 1 per second)
  • Efficiency and throughput of message processing are critical
  • No public HTTPS endpoint is available for the subscriber
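The pull-and-acknowledge cycle can be sketched with a small in-memory stand-in. This is a toy model, not the Pub/Sub client API: the class `InMemorySubscription`, its methods, and the explicit `now` argument (used instead of a real clock so the behaviour is deterministic) are all assumptions made for illustration.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class InMemorySubscription:
    """Toy stand-in for a pull subscription; not the Pub/Sub client API."""
    ack_deadline: float = 10.0                        # seconds allowed before redelivery
    _backlog: deque = field(default_factory=deque)    # messages waiting for delivery
    _outstanding: dict = field(default_factory=dict)  # ack_id -> (data, deadline)
    _next_ack_id: int = 0

    def publish(self, data: bytes) -> None:
        self._backlog.append(data)

    def pull(self, now: float, max_messages: int = 10) -> list:
        """Return up to max_messages (ack_id, data) pairs, redelivering expired ones."""
        # Messages whose ack deadline has passed go back into the backlog.
        expired = [a for a, (_, deadline) in self._outstanding.items() if deadline <= now]
        for ack_id in expired:
            data, _ = self._outstanding.pop(ack_id)
            self._backlog.appendleft(data)
        batch = []
        while self._backlog and len(batch) < max_messages:
            data = self._backlog.popleft()
            ack_id = self._next_ack_id
            self._next_ack_id += 1
            self._outstanding[ack_id] = (data, now + self.ack_deadline)
            batch.append((ack_id, data))
        return batch

    def ack(self, ack_id: int) -> None:
        """Acknowledge a message so it is not redelivered."""
        self._outstanding.pop(ack_id, None)


sub = InMemorySubscription(ack_deadline=10.0)
sub.publish(b"a")
sub.publish(b"b")
first = sub.pull(now=0.0)          # both messages are delivered
sub.ack(first[0][0])               # only "a" is acknowledged
redelivered = sub.pull(now=10.0)   # "b" comes back once its deadline has passed
```

The subscriber controls the pace here: nothing is delivered until it calls `pull`, and an unacknowledged message returns to the backlog, which is why pull suits work queues.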

Push Subscriptions

Pub/Sub pushes messages to a registered HTTPS endpoint (webhook) on the subscriber. A success status code in the response (for example HTTP 200 OK) acts as the acknowledgement; any other response causes the message to be redelivered. Pub/Sub automatically adjusts the delivery rate based on the subscriber's responses.

  • Multiple topics must be processed by the same webhook
  • Lower latency
  • Subscriber dependencies (credentials, client libraries) can't be set up
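A minimal sketch of the webhook side of a push subscription, written as a plain function so it does not depend on any web framework. It assumes the usual push request body: a JSON envelope with a `message` object whose `data` field is base64-encoded. The function name `handle_push` and the `process` callback are hypothetical.

```python
import base64
import json


def handle_push(body: bytes, process) -> int:
    """Decode a push request body and return the HTTP status to send back.

    `process` is a hypothetical callback taking (data: bytes, attributes: dict).
    """
    try:
        envelope = json.loads(body)
        message = envelope["message"]
        data = base64.b64decode(message.get("data", ""))
        attributes = message.get("attributes", {})
    except (ValueError, KeyError, TypeError, AttributeError):
        return 400  # malformed request; a non-success status is not an ack
    try:
        process(data, attributes)
    except Exception:
        return 500  # processing failed: non-success status, so redelivery follows
    return 200      # success status acts as the acknowledgement


received = []
body = json.dumps({
    "message": {
        "data": base64.b64encode(b"hello").decode("ascii"),
        "attributes": {"origin": "sensor-1"},
        "messageId": "1",
    },
    "subscription": "projects/example/subscriptions/example-sub",
}).encode("utf-8")
status = handle_push(body, lambda data, attrs: received.append((data, attrs)))
```

In a real deployment this function would sit behind whatever HTTPS framework serves the endpoint; the project and subscription names above are placeholders.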

Message Distribution Patterns

One-to-One
A single publisher sends messages to a single topic, consumed by a single subscriber through a single subscription.
Fan-in (Load Balancing)
Multiple publishers send messages to the same topic. Multiple subscribers can consume these messages from the same subscription, enabling parallel processing and load balancing across the subscribers. Each message is delivered to only one of the subscribers attached to that subscription.
Fan-out (Distribution)
A single publisher sends messages to a topic, and multiple subscriptions are attached to that topic. Each subscription receives all the messages, allowing the same data to be processed by different systems.
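The difference between load balancing within one subscription and fan-out across subscriptions can be illustrated with a toy in-memory model. The `Topic` and `Subscription` classes below are illustrative assumptions, not client-library types, and round-robin is used only as a simple stand-in for however the service spreads messages across subscribers.

```python
class Subscription:
    """Toy subscription: each message goes to exactly one attached subscriber."""

    def __init__(self, name):
        self.name = name
        self.subscribers = []   # callables that accept one message
        self._delivered = 0

    def attach(self, subscriber):
        self.subscribers.append(subscriber)

    def deliver(self, message):
        if not self.subscribers:
            return  # a real subscription would keep the message in its backlog
        # Round-robin stands in for load balancing across subscribers.
        index = self._delivered % len(self.subscribers)
        self._delivered += 1
        self.subscribers[index](message)


class Topic:
    """Toy topic: every attached subscription receives every message."""

    def __init__(self):
        self.subscriptions = []

    def publish(self, message):
        for subscription in self.subscriptions:
            subscription.deliver(message)


topic = Topic()
billing, audit = Subscription("billing"), Subscription("audit")
topic.subscriptions += [billing, audit]

worker_a, worker_b, audit_log = [], [], []
billing.attach(worker_a.append)   # two workers share one subscription
billing.attach(worker_b.append)
audit.attach(audit_log.append)    # a separate subscription sees everything

for n in range(4):
    topic.publish(n)
```

After the loop, the two billing workers have split the four messages between them, while the audit subscription has received all four.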

Batching

Combines multiple messages into a single publish request, optimizing throughput and reducing cost per message.

  • Default behavior: Enabled by default in client libraries, simplifying implementation.
  • Throughput improvement: Significantly reduces the overhead associated with individual publish requests, allowing for higher message throughput.
  • Cost reduction: Fewer publish requests translate directly to lower costs.
  • Latency trade-off: Introduces latency as the publisher waits for enough messages to form a complete batch or a timeout to occur before sending.
  • Impact on ordering: Pub/Sub does not guarantee delivery order by default, and different batches can arrive at the subscriber out of order. If strict ordering is crucial, publish related messages with the same ordering key and enable message ordering on the subscription.
  • Flow control and Back pressure: Batching plays a role in flow control. If the publisher sends messages faster than they can be batched and published, the client library's internal buffers may fill up, leading to backpressure.
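The batching trade-offs above can be sketched with a simplified publisher-side batcher. This is not the client library's implementation: the `Batcher` class, the `send` callback, and the threshold values are assumptions for illustration, and the timeout-based flush is left out to keep the example deterministic.

```python
class Batcher:
    """Toy publisher-side batcher; `send` is a hypothetical callable taking a list."""

    def __init__(self, send, max_messages=3, max_bytes=1024):
        self.send = send
        self.max_messages = max_messages
        self.max_bytes = max_bytes
        self._pending = []
        self._pending_bytes = 0

    def publish(self, data: bytes) -> None:
        # Flush first if adding this message would exceed the byte budget.
        if self._pending and self._pending_bytes + len(data) > self.max_bytes:
            self.flush()
        self._pending.append(data)
        self._pending_bytes += len(data)
        # Flush as soon as the batch is full.
        if len(self._pending) >= self.max_messages:
            self.flush()

    def flush(self) -> None:
        """Send pending messages as one request; a timer would also call this."""
        if self._pending:
            self.send(list(self._pending))
            self._pending.clear()
            self._pending_bytes = 0


requests = []
batcher = Batcher(requests.append, max_messages=3)
for i in range(7):
    batcher.publish(f"m{i}".encode())
batcher.flush()  # push out the final partial batch
```

Seven messages go out in three requests instead of seven, which is the throughput and cost benefit; the last message waits until `flush` is called, which is the latency cost. The Python client library exposes comparable settings (message count, byte size, and maximum latency) through its batch settings.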

Security Access Control

  • Resource Levels: Configurable at Project, Topic or Subscription levels.
  • IAM Roles: Admin, Editor, Publisher, Subscriber, Viewer roles.
05

Kafka Comparison

Kafka Integration & Concepts

Feature Comparison

Persistence
Kafka: Durable
Pub/Sub: Ephemeral (options for persistence)
Delivery
Kafka: At-least-once, at-most-once, exactly-once
Pub/Sub: At-least-once (exactly-once options)
Ordering
Kafka: Within a partition
Pub/Sub: Not guaranteed by default (available per ordering key when enabled)
Scalability
Kafka: Very high
Pub/Sub: High
Consumer Mgmt
Kafka: Consumer-managed offsets
Pub/Sub: Service-managed
Deployment
Kafka: Self-hosted or managed service
Pub/Sub: Managed service
Ecosystem
Kafka: Rich
Pub/Sub: Integrates with cloud services
06

Operations, Metrics & Best Practices

Best Practices

Cloud Monitoring Metrics

Maintain a healthy subscription
  • Monitor message backlog
  • Monitor delivery latency health
  • Monitor acknowledgment deadline expiration
  • Monitor message throughput
  • Monitor push subscriptions
  • Monitor subscriptions with filters
  • Monitor forwarded undeliverable messages
Maintain a healthy publisher
  • Monitor message throughput
Message Rates
  • subscription/num_undelivered_messages: High counts indicate potential subscriber issues (processing too slow, crashes, etc.).
  • subscription/byte_cost: Track message size to optimize costs and identify large messages that might impact performance.
  • topic/send_message_operation_count: Monitor publish rates to detect anomalies like sudden spikes or drops that could indicate publisher problems or unusual traffic.
Latency
  • subscription/oldest_unacked_message_age: High values signify processing delays in subscribers. Crucial for time-sensitive applications.
  • topic/publish_latency: Tracks publish latency, helpful for identifying publisher-side bottlenecks.
Errors
  • subscription/pull_request_count: A high volume may indicate inefficient pull configuration.
  • topic/send_message_error_count: Publishing errors to identify issues with publisher code or topic availability.
Resource Usage
  • subscription/ack_message_operation_count: Compare against published messages to ensure messages are being processed and acknowledged properly. A significant mismatch could indicate message loss or subscriber problems.
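As a rough illustration of how these metrics might be combined into a health check, the sketch below takes already-fetched metric values and returns a list of alerts. The function name, the parameter names, and all thresholds are hypothetical; suitable limits depend entirely on the workload.

```python
def subscription_alerts(num_undelivered, oldest_unacked_age_s, published, acked,
                        max_backlog=1000, max_age_s=300, max_ack_gap=0.05):
    """Return alert labels for one subscription; all thresholds are illustrative."""
    alerts = []
    if num_undelivered > max_backlog:
        alerts.append("backlog")    # subscribers are falling behind
    if oldest_unacked_age_s > max_age_s:
        alerts.append("latency")    # messages are waiting too long
    if published and (published - acked) / published > max_ack_gap:
        alerts.append("ack_gap")    # far fewer acks than published messages
    return alerts


healthy = subscription_alerts(50, 10, published=1000, acked=990)
unhealthy = subscription_alerts(5000, 900, published=1000, acked=500)
```

The inputs correspond to `num_undelivered_messages`, `oldest_unacked_message_age`, and the publish and ack operation counts listed above; fetching them from the monitoring service is outside the scope of this sketch.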

Self-Assessment Questions

Q1. What is the main difference between Push and Pull subscriptions?

Pull subscribers request messages; Push subscribers have messages sent to an HTTP endpoint.

Q2. How does persistence differ between Kafka and Pub/Sub?

Kafka uses durable log-based persistence by default, whereas Pub/Sub is primarily ephemeral with options available for extended persistence (up to 31 days).

Q3. What is a Dead Letter Queue used for?

For offline inspection of messages that cannot be processed successfully.