Fully managed, PostgreSQL-compatible database service designed for demanding enterprise workloads. Combines a Google-built database engine with a disaggregated, log-structured storage layer. Optimized for high-throughput OLTP, real-time analytics (HTAP), and AI-driven applications.
Availability: Regional (Multi-zone HA) & Omni
SLA: 99.99%
Compatibility: 100% PostgreSQL
Author: Michaël Bettan
01
Overview & Billing
99.99%
SLA Guarantee
100%
PG Compatible
4x
Faster OLTP
100x
Faster OLAP
Compute
vCPU & Memory per hour for Primary and Read pool nodes (HA doubles primary cost).
Storage
Data storage per GiB/month. Pay-for-what-you-use, no provisioning required.
Network Egress
Charges apply for data transferred out of the system.
Backup & Logs
Snapshot and PITR log storage. PITR logs are free for the first 7 days.
I/O Charges
No I/O charges apply for reading or writing data.
02
Architecture & Storage Model
AlloyDB utilizes a Decoupled Compute & Storage architecture. Compute nodes and storage scale independently, with storage being a highly available distributed service shared across the cluster.
Compute Layer
Primary Instance: A single read/write node. If configured for HA, it includes an active/standby pair across zones. Max capacity is 288 vCPUs and 2,232 GiB RAM per instance.
Read Pool Instances: Up to 20 nodes for read scaling. All nodes share the same distributed storage layer, meaning no replication lag; writes on primary are immediately visible to all read pool nodes.
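As an illustrative sketch, a read pool can be attached to an existing cluster with the gcloud CLI. The cluster name, region, node count, and vCPU size below are placeholders; flag names follow the `gcloud alloydb` command surface and may evolve:

```shell
# Create a 4-node read pool instance attached to an existing cluster.
# All nodes share the primary's distributed storage (no replication lag).
gcloud alloydb instances create my-read-pool \
  --cluster=my-cluster \
  --region=us-central1 \
  --instance-type=READ_POOL \
  --read-pool-node-count=4 \
  --cpu-count=8
```

Node count can later be changed in place with `gcloud alloydb instances update`, scaling reads without touching the primary.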
Storage Layer
Replication: Replicated automatically across 3 zones in a region for durability.
Auto-scaling: Auto-scales seamlessly up to 128 TiB per cluster without requiring downtime or resizing operations.
Log-Structured Storage Layer
Compute nodes do not write data blocks to storage; they only write Write-Ahead Logs (WAL). The storage layer autonomously applies WALs to blocks, drastically reducing I/O bottlenecks and vacuuming overhead.
Cache Hierarchy
Multi-layer caching (Buffer Cache, Ultra-fast Local SSDs) significantly accelerates data access compared to standard PostgreSQL.
03
HTAP & Columnar Engine
AlloyDB is optimized for Hybrid Transactional and Analytical Processing (HTAP), delivering fast operational performance while simultaneously serving analytical queries.
01
Built-in Columnar Engine
Keeps a column-based representation of data in memory, specifically optimized for analytical queries (aggregations, scans).
02
Auto-columnarization (ML)
ML algorithms observe workload patterns and automatically decide which tables/columns to load into the columnar engine.
03
Real-Time Insights
Enables running complex BI reporting directly on the operational database without needing ETL pipelines to a separate data warehouse.
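The steps above can be sketched as commands. The flag and the `google_columnar_engine` extension/function names follow the AlloyDB documentation, but treat the exact signatures as assumptions; the instance and table names are placeholders:

```shell
# 1. Enable the columnar engine on the instance via a database flag.
gcloud alloydb instances update my-primary \
  --cluster=my-cluster --region=us-central1 \
  --database-flags=google_columnar_engine.enabled=on

# 2. From SQL, load the extension and (optionally) pin a hot table into
#    the in-memory columnar store. Auto-columnarization can also select
#    tables/columns automatically from observed workload patterns.
psql -c "CREATE EXTENSION IF NOT EXISTS google_columnar_engine;"
psql -c "SELECT google_columnar_engine_add('orders');"
```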
Connection Management
An instance supports up to 1,000 direct connections by default; AlloyDB's native Managed Connection Pooler scales this to a maximum of 240,000 concurrent connections.
Standard PostgreSQL port: 5432
Managed Connection Pooler port: 6432
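In practice, switching an application to the pooler is only a port change. A minimal sketch with psql (the host IP and credentials are placeholders):

```shell
# Direct PostgreSQL connection (port 5432):
psql "host=10.0.0.5 port=5432 user=postgres dbname=app"

# Same instance through the Managed Connection Pooler (port 6432),
# same credentials -- only the port differs:
psql "host=10.0.0.5 port=6432 user=postgres dbname=app"
```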
04
AlloyDB AI & Machine Learning
Vertex AI & Vector Search
Vertex AI Integration: Integrates seamlessly with Vertex AI to call ML models directly from SQL (e.g., using google_ml.predict_row()) for inference and generating embeddings.
Vector Search: Leverages Google's ScaNN index (the same technology powering Google Search and YouTube) to deliver up to 10x faster index creation, 4x faster pure vector search, and 10x faster filtered vector search than the standard pgvector HNSW index.
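A minimal ScaNN indexing sketch, assuming the `alloydb_scann` extension name and the `num_leaves` tuning parameter from the AlloyDB docs; the 3-dimensional vector is purely illustrative:

```shell
psql <<'SQL'
-- Enable pgvector and the ScaNN index extension.
CREATE EXTENSION IF NOT EXISTS vector;
CREATE EXTENSION IF NOT EXISTS alloydb_scann;

CREATE TABLE items (id bigint PRIMARY KEY, embedding vector(3));

-- ScaNN index partitioned into leaves (num_leaves is a tuning knob).
CREATE INDEX items_embedding_idx ON items
  USING scann (embedding cosine) WITH (num_leaves = 100);

-- Nearest-neighbour query using cosine distance:
SELECT id FROM items ORDER BY embedding <=> '[0.1, 0.2, 0.3]' LIMIT 10;
SQL
```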
Generative AI & Embeddings
Generative AI & RAG: Build scalable Generative AI and RAG applications directly on transactional data, invoking Large Language Models straight from SQL (e.g., google_ml.generate_content via the google_ml_integration extension).
Auto Vector Embeddings: Automatically synchronizes embeddings with transactional data. Supports incremental refresh for updated rows and a bulk backfill mode 130x faster than standard row-by-row processing.
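The SQL entry points above can be sketched as follows. The embedding model ID and the exact `predict_row` argument shapes are assumptions based on the `google_ml_integration` docs; the model path is a placeholder:

```shell
psql <<'SQL'
-- Expose Vertex AI models to SQL.
CREATE EXTENSION IF NOT EXISTS google_ml_integration;

-- Generate an embedding for a text value (model name is an example):
SELECT embedding('text-embedding-005', 'return policy for electronics');

-- Row-level inference against a deployed Vertex AI model endpoint:
SELECT google_ml.predict_row(
  'projects/my-project/locations/us-central1/endpoints/my-endpoint',
  '{"instances": [{"feature": 1.0}]}');
SQL
```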
05
Migration & Integrations
A common migration driver is high-throughput scaling: workloads that exceed standard Cloud SQL performance limits.
Database Migration Service (DMS): Native integration for secure, minimal-downtime migrations from legacy databases into AlloyDB with Gemini-assisted schema conversion for heterogeneous migrations (e.g., SQL Server → PostgreSQL). Supports Amazon RDS, Aurora, MySQL, Oracle, etc.
Datastream: Change Data Capture (CDC) to stream data from AlloyDB into BigQuery for long-term enterprise data warehousing.
06
AlloyDB Omni
Run AlloyDB Anywhere
AlloyDB Omni is a downloadable, containerized edition of AlloyDB designed to run anywhere while maintaining the same core performance characteristics.
Deployment Flexibility: Run on-premises, on developer laptops (via Docker/VMs), at the edge, or on competing clouds (AWS, Azure).
Core Engine Optimizations: Brings the exact same core engine optimizations (columnar engine, vector search, ScaNN) to self-managed infrastructure.
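For local experimentation, Omni can be started as a container. A sketch assuming the `google/alloydbomni` image name from Google's Docker Hub listing; the password and port mapping are placeholders:

```shell
# Pull and run AlloyDB Omni locally.
docker run --name alloydb-omni \
  -e POSTGRES_PASSWORD=my-secret \
  -p 5432:5432 \
  -d google/alloydbomni

# Connect with any standard PostgreSQL client:
psql "host=localhost port=5432 user=postgres password=my-secret"
```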
07
Security & Networking
Private Access
Connect securely via VPC peering to access instances through private IPs without traversing the public internet.
Encryption
Data is encrypted in transit and at rest, supporting Customer-Managed Encryption Keys (CMEK).
IAM Authentication
Centralized access control leveraging Google Cloud IAM.
VPC Service Controls
Establish secure perimeters around database resources.
Audit Logging
Detailed Cloud Audit Logs for database access and administrative actions.
PSC Support
Supports Private Service Connect (PSC) for secure connections.
08
High Availability & Disaster Recovery
High Availability & Backup
Rapid HA Failover: Automatic failover completes in < 60 seconds, independent of database size.
Continuous Backup & PITR: Point-in-time recovery retained for 14 days by default (adjustable from 1–35 days).
Intra-region Recovery: Provides zero RPO for point-in-time recovery scenarios within a region.
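Adjusting the recovery window is a single cluster update. A sketch using the gcloud CLI (cluster name and region are placeholders; the flag name follows the `gcloud alloydb` reference):

```shell
# Extend the continuous-backup recovery window from the 14-day default
# to the 35-day maximum (valid range: 1-35 days):
gcloud alloydb clusters update my-cluster \
  --region=us-central1 \
  --continuous-backup-recovery-window-days=35
```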
Disaster Recovery
Cross-region replication is available for robust DR strategies.
Planned switchover: Zero RPO for scheduled maintenance.
Unplanned regional outage: Minimal, non-zero RPO (dependent on replication lag at the moment of failure, though lag is up to 25x lower than standard PostgreSQL).
Key Takeaways for AlloyDB
Log-Structured Storage is the core of AlloyDB's performance, as compute nodes only write WALs, heavily reducing I/O bottlenecks.
Read pools have no replication lag because they share the same exact storage layer as the primary instance.
AlloyDB uses Google's ScaNN index for vector search, dramatically outperforming standard pgvector.
AlloyDB Omni allows you to take the exact same core engine optimizations and run them on-prem, locally, or on other clouds.
Connection scaling is native via the Managed Connection Pooler on port 6432, scaling up to 240,000 connections.
Self-Assessment Questions
Q1. How does AlloyDB eliminate I/O bottlenecks during heavy write operations?
Through its Log-Structured Storage Layer. Compute nodes only write Write-Ahead Logs (WAL), while the storage layer autonomously applies WALs to blocks.
Q2. Which index technology does AlloyDB leverage to accelerate vector searches for AI applications?
The ScaNN index (the same technology powering Google Search/YouTube), which is significantly faster than the standard pgvector HNSW index.
Q3. If you need to connect your application to the Managed Connection Pooler instead of standard PostgreSQL, which port should you use?
Port 6432 (Standard PostgreSQL connects on port 5432).
Q4. What is the expected RPO (Recovery Point Objective) for a planned cross-region disaster recovery switchover?
Zero RPO.
Q5. How many read pool nodes can you provision in an AlloyDB cluster, and what is the typical replication lag?
Up to 20 read nodes. There is zero replication lag because all nodes read from the same shared distributed storage layer.