
pg_fusion Documentation

pg_fusion runs selected analytical PostgreSQL SELECT queries through a shared DataFusion background worker. PostgreSQL still owns heap access, snapshots, MVCC visibility, TOAST, tuple decoding, and final result slots.

DataFusion is an analytical query execution engine written in Rust that operates on Apache Arrow columnar batches. pg_fusion uses it to execute eligible analytical queries on top of PostgreSQL scan streams; it does not replace PostgreSQL storage or MVCC.
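
To make the row-to-column boundary concrete, here is a minimal sketch in plain Rust, using only the standard library. It is not pg_fusion code and does not use the Arrow or DataFusion APIs; `Row`, `ColumnBatch`, `rows_to_batch`, and `sum_amount` are hypothetical names chosen for illustration. It shows decoded rows being transposed into per-column vectors, and an aggregate that then reads only the one column it needs.

```rust
// Illustrative only: hypothetical types, not pg_fusion, Arrow, or DataFusion APIs.

/// One decoded row, as a row-oriented executor would see it.
#[derive(Debug, Clone, PartialEq)]
pub struct Row {
    pub id: i64,
    pub amount: Option<f64>, // None models SQL NULL
}

/// A toy columnar batch: one contiguous vector per column.
#[derive(Debug, Default, PartialEq)]
pub struct ColumnBatch {
    pub id: Vec<i64>,
    pub amount: Vec<Option<f64>>,
}

/// Transpose a slice of rows into a columnar batch.
pub fn rows_to_batch(rows: &[Row]) -> ColumnBatch {
    let mut batch = ColumnBatch {
        id: Vec::with_capacity(rows.len()),
        amount: Vec::with_capacity(rows.len()),
    };
    for row in rows {
        batch.id.push(row.id);
        batch.amount.push(row.amount);
    }
    batch
}

/// Sum the `amount` column, skipping NULLs as SQL SUM does.
/// Returns None when the column has no non-NULL values.
pub fn sum_amount(batch: &ColumnBatch) -> Option<f64> {
    let mut total = 0.0;
    let mut seen = false;
    // Only the `amount` vector is touched; `id` is never read.
    for value in batch.amount.iter().flatten() {
        total += value;
        seen = true;
    }
    if seen { Some(total) } else { None }
}
```

Given the rows (1, 10.0), (2, NULL), and (3, 2.5), `sum_amount(&rows_to_batch(&rows))` yields `Some(12.5)`. Real Arrow arrays store values and validity bitmaps separately rather than as `Vec<Option<T>>`; the sketch only conveys the layout idea.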

Start with the pages that answer operational questions first.

Start Here

| Topic | Use It For |
| --- | --- |
| Quick start | Build the extension, configure a local pgrx cluster, and run a first query |
| Glossary | Learn the terms: DataFusion, Arrow, slots, page pool, filters, DPHyp, CTID scans |
| Architecture | Understand the backend/worker/shared-memory model and why rows cross into Arrow |
| Memory and pages | Understand shared blocks, zero-copy imports, materialization, and page reuse |
| Execution model | Follow one eligible query from planning to result slots |
| Query support | Check which query shapes and types are currently eligible |
| Compatibility matrix | Inspect PostgreSQL to DataFusion type, expression, function, aggregate, and window mappings |
| Workloads | Evaluate good and poor workload candidates |
| Limitations | Understand overhead cases, semantic boundaries, and unsupported features |

Operate

| Topic | Use It For |
| --- | --- |
| Configuration | Size the worker, shared memory, scan streaming, runtime filters, and spill |
| Metrics | Diagnose scan encoding, worker backpressure, result transfer, filters, and spill |
| Benchmarks | Run local comparison benchmarks and interpret the results |

Build And Contribute

| Topic | Use It For |
| --- | --- |
| Development | Set up Rust, pgrx, and the contributor workflow |
| Testing | Run standalone Rust tests and PostgreSQL-backed pgrx tests |
| Roadmap | Follow typed planning, PG18 support, compatibility, and testing direction |

Status

pg_fusion is experimental. Treat unsupported query shapes as not implemented, not as implicitly equivalent to PostgreSQL execution.