pg_fusion Documentation
pg_fusion runs selected analytical PostgreSQL SELECT queries through a
shared DataFusion background worker. PostgreSQL still owns heap access,
snapshots, MVCC visibility, TOAST, tuple decoding, and final result slots.
DataFusion is a Rust analytical query engine that operates on Apache Arrow
columnar batches. pg_fusion uses it to execute eligible analytical queries on
top of PostgreSQL scan streams; it does not replace PostgreSQL storage or MVCC.
Start with the pages that answer operational questions first.
Start Here
| Topic | Use It For |
|---|---|
| Quick start | Build the extension, configure a local pgrx cluster, and run a first query |
| Glossary | Learn the terms: DataFusion, Arrow, slots, page pool, filters, DPHyp, CTID scans |
| Architecture | Understand the backend/worker/shared-memory model and why rows cross into Arrow |
| Memory and pages | Understand shared blocks, zero-copy imports, materialization, and page reuse |
| Execution model | Follow one eligible query from planning to result slots |
| Query support | Check which query shapes and types are currently eligible |
| Compatibility matrix | Inspect PostgreSQL to DataFusion type, expression, function, aggregate, and window mappings |
| Workloads | Evaluate good and poor workload candidates |
| Limitations | Understand overhead cases, semantic boundaries, and unsupported features |
Operate
| Topic | Use It For |
|---|---|
| Configuration | Size the worker, shared memory, scan streaming, runtime filters, and spill |
| Metrics | Diagnose scan encoding, worker backpressure, result transfer, filters, and spill |
| Benchmarks | Run local comparison benchmarks and interpret the results |
Build And Contribute
| Topic | Use It For |
|---|---|
| Development | Set up Rust, pgrx, and the contributor workflow |
| Testing | Run standalone Rust tests and PostgreSQL-backed pgrx tests |
| Roadmap | Follow typed planning, PG18 support, compatibility, and testing direction |
Status
pg_fusion is experimental. Treat unsupported query shapes as not implemented,
not as implicitly equivalent to PostgreSQL execution.