The Sev-1 Database

[PREMIUM] Silent PostgreSQL Memory Killer

Technical deep-dive for PostgreSQL administrators, backend engineers, and platform teams running PostgreSQL on Linux. If you manage a database with shared_buffers above 8GB, this is required reading.

Haider Z @ Microsoft
Mar 31, 2026

Incident

Production PostgreSQL instance. 128GB RAM server. Suddenly OOM-killed by the Linux kernel: no warning, no graceful shutdown. Just:

pg_ctl: could not start server

The on-call engineer did what everyone does: restarted it and went back to sleep.

You restart it. It comes back. You go to sleep. It happens again.

This pattern, recurring OOM kills with no obvious cause, is one of the most misdiagnosed PostgreSQL production failures. The underlying cause here: Linux Huge Pages are not configured, and PostgreSQL is silently allocating gigabytes of shared memory using 4KB pages.


Table of Contents

1. Incident
2. What Logs Actually Said
3. What Are Linux Huge Pages and Why Does PostgreSQL Need Them
4. Trap: huge_pages
5. Diagnosing Problem
6. Calculating How Many Huge Pages You Actually Need
7. Fix: Step-by-Step
- Step 1: Verify Kernel Support
- Step 2: Allocate Huge Pages at OS Level
- Step 3: Make Allocation Permanent
- Step 4: Set OS Permissions for PostgreSQL User
- Step 5: Change PostgreSQL from try to on
- Step 6: Restart and Verify
8. Why This Actually Fixes OOM Mechanism
9. Quick Checklist
10. Summary


What Logs Actually Said

Check /var/log/syslog or dmesg:

dmesg | grep -i "oom\|killed\|postgres"

You’d see something like this:

Out of memory: Kill process 18423 (postgres) score 892 or sacrifice child
Killed process 18423 (postgres) total-vm:134217728kB, anon-rss:98304000kB

PostgreSQL logs themselves? Silent. The process didn’t crash; it was killed by the kernel.
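
When the kernel log is long, it can help to pull out just the victim of each kill. A minimal sketch, assuming the "Killed process <pid> (<name>)" line format shown above (the exact wording varies between kernel versions); oom_victims is a hypothetical helper name, not a standard tool:

```shell
#!/usr/bin/env bash
# oom_victims: hypothetical helper, for illustration only.
# Reads kernel log lines on stdin and prints "<pid> <name>" for each
# "Killed process <pid> (<name>)" line. All other lines are ignored.
oom_victims() {
  sed -n 's/.*Killed process \([0-9][0-9]*\) (\([^)]*\)).*/\1 \2/p'
}

# Demo using the two sample lines from this article.
oom_victims <<'EOF'
Out of memory: Kill process 18423 (postgres) score 892 or sacrifice child
Killed process 18423 (postgres) total-vm:134217728kB, anon-rss:98304000kB
EOF
```

On a live system the same filter can be fed from the kernel log, for example with dmesg | oom_victims (reading the kernel log may require root).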


What Are Linux Huge Pages and Why Does PostgreSQL Need Them?

The Linux kernel manages memory in fixed-size chunks called pages. By default, each page is 4KB. For most workloads, that is fine. For PostgreSQL, it is a serious problem.

PostgreSQL allocates a large contiguous block of shared memory at startup, primarily for shared_buffers. On a production server this might be 8GB, 16GB, or 32GB. The kernel must track every single page in that allocation:

  • 8GB shared_buffers → ~2,097,152 page table entries to manage

  • 16GB shared_buffers → ~4,194,304 page table entries

  • 32GB shared_buffers → ~8,388,608 page table entries
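
The entry counts above come from dividing the allocation size by the page size. A minimal shell sketch of that arithmetic, assuming sizes in GiB; page_count is a hypothetical helper name, and the 2MB huge page size used for comparison is the common x86-64 default, assumed here rather than taken from this server:

```shell
#!/usr/bin/env bash
# page_count: hypothetical helper, for illustration only.
# Usage: page_count <size_in_GiB> <page_size_in_KiB>
# Prints how many pages of the given size are needed to cover the allocation.
page_count() {
  local size_gib=$1 page_kib=$2
  # 1 GiB = 1024 * 1024 KiB
  echo $(( size_gib * 1024 * 1024 / page_kib ))
}

# Compare 4KB pages with 2MB (2048 KiB) huge pages for each shared_buffers size.
for gib in 8 16 32; do
  printf '%2d GiB: %8d pages at 4KB, %6d pages at 2MB\n' \
    "$gib" "$(page_count "$gib" 4)" "$(page_count "$gib" 2048)"
done
```

For 8GB this gives 2,097,152 pages at 4KB against 4,096 pages at 2MB, a 512-fold reduction in the number of entries the kernel has to track.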
