TL;DR: This post explains how to horizontally scale PostgreSQL for massive telemetry data by replacing a single bloated table with declarative table partitioning. You will learn how to use PARTITION BY RANGE to route time-series data into monthly child tables, bypassing index bottlenecks and MVCC table bloat, and how to automate this maintenance using Node.js.
⚡ Key Takeaways
- Avoid adding indexes to massive single tables (e.g., 500M+ rows), as updating the index will severely bottleneck high-volume insert speeds.
- Stop using standard DELETE queries to purge old logs, which causes MVCC "dead tuples" and table bloat that requires resource-heavy VACUUM processes to clean up.
- Implement declarative table partitioning using PARTITION BY RANGE (created_at) on a parent table to automatically route incoming telemetry into smaller child tables.
- Define time-bound child partitions using FOR VALUES FROM ('2026-01-01') TO ('2026-02-01') so you can instantly drop old data by removing the entire table instead of deleting individual rows.
- Automate the ongoing creation of future monthly partitions using a Node.js background worker to ensure uninterrupted data routing without altering your core application logic.
You launched a new application and made the smart choice to track user behavior. To do this, you created a simple database table to store telemetry—automated event data sent from your app to your server, such as button clicks, page views, error logs, or GPS coordinates.
At first, everything works perfectly. You insert thousands of rows a day into your PostgreSQL database without breaking a sweat. But fast forward six months, and your application is a massive success. Now, you are inserting millions of rows per day.
Suddenly, your database is struggling under the load. Simple queries take minutes to execute, and your server's CPU is constantly maxed out.
-- The simple table that worked perfectly on day one,
-- but is now bottlenecking your database on day 180.
CREATE TABLE app_telemetry (
id SERIAL PRIMARY KEY,
event_name VARCHAR(255) NOT NULL,
user_id INT NOT NULL,
event_data JSONB,
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
-- Selecting data from yesterday takes forever when the table has 500 million rows.
SELECT COUNT(*) FROM app_telemetry
WHERE created_at >= NOW() - INTERVAL '1 day';
When searching for a solution, the most common advice is to add an Index (a fast lookup mechanism, similar to a book's glossary). However, adding an index to a table with 500 million rows introduces a new bottleneck: your insert speeds slow to a crawl because the database must update this massive index every time a new event arrives.
Next, you might try deleting old data by running a query to remove logs older than 30 days. Surprisingly, this often makes things worse. In PostgreSQL, deleting large amounts of data causes Table Bloat. Thanks to Multi-Version Concurrency Control (MVCC), Postgres doesn't immediately erase deleted rows; it merely marks them as "dead tuples" and hides them until a background process called VACUUM reclaims the space.
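You can see this bloat for yourself. As a diagnostic sketch (run against your own database; the view and columns are built into PostgreSQL), the pg_stat_user_tables statistics view reports how many dead tuples each table is carrying:

```sql
-- Inspect dead tuples left behind by DELETE and UPDATE operations.
-- n_dead_tup counts rows that are invisible to queries but still
-- occupy disk space until VACUUM reclaims them.
SELECT relname, n_live_tup, n_dead_tup
FROM pg_stat_user_tables
ORDER BY n_dead_tup DESC
LIMIT 5;
```

A table with tens of millions of dead tuples is a table paying MVCC rent on rows you already deleted.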
At this point, you might think you need to rip out PostgreSQL and migrate to a complex, specialized time-series database. But that requires weeks of engineering effort and migrating your existing data.
What if you could keep using standard PostgreSQL, but change how it stores data under the hood? The solution is Declarative Table Partitioning. In this guide, we will break down exactly what partitioning is, how to set it up in PostgreSQL, and how to automate the maintenance process using Node.js.
What is Table Partitioning?
Imagine you run an office, and you store every single piece of paperwork in one giant cardboard box. Whenever your boss asks for yesterday's invoices, you have to sift through the entire box, pushing aside papers from three years ago just to find what you need.
Adding an index is like making a master list of where every paper is located. It helps you find things faster, but every time you file a new paper, you have to rewrite the massive list.
Table Partitioning is like buying a filing cabinet with multiple drawers. You label one drawer "January 2026," another "February 2026," and so on.
- When you need a document from January, you only open the January drawer.
- When you insert a new document, you drop it straight into the current month's drawer.
- When the documents in a drawer get too old, you just pick up the whole drawer and throw it away—no messy sorting or erasing required.
In PostgreSQL, we create a "Parent" table that acts like the filing cabinet. It doesn't actually store any data itself. Instead, it routes incoming data into smaller "Child" tables (the drawers) based on a predefined rule—most commonly, the created_at timestamp.
Production Note: Partitioning is ideal for time-series data, event logging, auditing, and high-volume transactions where you regularly query recent data and eventually delete or archive older records.
Let's look at how to set this up using standard SQL commands.
-- Step 1: Create the "Parent" table.
-- Notice the "PARTITION BY RANGE" command.
-- We are telling Postgres to divide the data based on the 'created_at' column.
CREATE TABLE partitioned_telemetry (
-- Note: no standalone PRIMARY KEY here. On a partitioned table, any
-- primary key or unique constraint must include the partition key.
id BIGSERIAL,
event_name VARCHAR(255) NOT NULL,
user_id INT NOT NULL,
event_data JSONB,
created_at TIMESTAMP NOT NULL
) PARTITION BY RANGE (created_at);
-- Step 2: Create the "Child" tables (the partitions).
-- We manually create a table specifically for January 2026 data.
CREATE TABLE telemetry_2026_01 PARTITION OF partitioned_telemetry
FOR VALUES FROM ('2026-01-01') TO ('2026-02-01');
-- We create another table for February 2026.
CREATE TABLE telemetry_2026_02 PARTITION OF partitioned_telemetry
FOR VALUES FROM ('2026-02-01') TO ('2026-03-01');
When your Node.js application inserts a row into partitioned_telemetry, PostgreSQL looks at the created_at timestamp. If the date is January 15th, it quietly routes that data into telemetry_2026_01. Your application code doesn't need to change at all!
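To make that routing rule concrete, here is a small standalone sketch (the partitionFor helper is our own illustration, not part of pg or PostgreSQL) that mirrors what the database does internally: given a timestamp, it computes which monthly child table the row lands in. Note that the FROM bound is inclusive and the TO bound is exclusive, so midnight on February 1st belongs to the February partition.

```javascript
// Hypothetical helper mirroring PostgreSQL's range routing:
// each timestamp maps to the child table covering its month.
function partitionFor(date) {
  const year = date.getUTCFullYear();
  const month = String(date.getUTCMonth() + 1).padStart(2, '0');
  return `telemetry_${year}_${month}`;
}

console.log(partitionFor(new Date('2026-01-15T10:30:00Z'))); // telemetry_2026_01
console.log(partitionFor(new Date('2026-02-01T00:00:00Z'))); // telemetry_2026_02
```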
Automating Partition Creation with Node.js
There is a catch to PostgreSQL partitioning: the database does not automatically create new partitions for future months. If February 28th rolls around and your app tries to insert data for March 1st, but the March partition table doesn't exist, PostgreSQL rejects the insert with a "no partition of relation found for row" error, and your writes start failing.
We need a way to automatically create next month's partition before the current month ends. As part of providing custom backend development services for enterprise clients, we heavily rely on Node.js background workers to automate this database maintenance.
We will use Node.js along with the pg (PostgreSQL client) library and node-cron (a task scheduler) to automate table generation.
First, let's install our dependencies in our Node.js project:
# Initialize a new Node.js project
npm init -y
# Install the PostgreSQL client and a cron job scheduler
npm install pg node-cron dotenv
Now, let's write a script that runs every day, checks if next month's partition exists, and creates it if it doesn't. We'll use a PostgreSQL connection Pool so the long-running script can reuse database connections reliably for days at a time.
// partition-manager.js
require('dotenv').config();
const { Pool } = require('pg');
const cron = require('node-cron');
// Setup the connection pool to our PostgreSQL database
const dbPool = new Pool({
user: process.env.DB_USER,
host: process.env.DB_HOST,
database: process.env.DB_NAME,
password: process.env.DB_PASSWORD,
port: process.env.DB_PORT,
});
async function createNextMonthPartition() {
try {
// 1. Figure out what the next month is
const today = new Date();
const nextMonth = new Date(today.getFullYear(), today.getMonth() + 1, 1);
const monthAfterNext = new Date(today.getFullYear(), today.getMonth() + 2, 1);
// Format dates to YYYY-MM-DD for PostgreSQL
const startDateStr = nextMonth.toISOString().split('T')[0];
const endDateStr = monthAfterNext.toISOString().split('T')[0];
// Format the table name (e.g., telemetry_2026_04)
const yearStr = nextMonth.getFullYear();
const monthStr = String(nextMonth.getMonth() + 1).padStart(2, '0');
const tableName = `telemetry_${yearStr}_${monthStr}`;
console.log(`Checking partition: ${tableName}...`);
// 2. Execute the SQL command to create the partition IF NOT EXISTS.
// Interpolating into SQL is safe here only because tableName and the
// date strings are generated from Date math above, never from user input.
const createQuery = `
CREATE TABLE IF NOT EXISTS ${tableName}
PARTITION OF partitioned_telemetry
FOR VALUES FROM ('${startDateStr}') TO ('${endDateStr}');
`;
await dbPool.query(createQuery);
console.log(`Successfully ensured partition ${tableName} exists.`);
} catch (error) {
console.error("Failed to create partition:", error);
}
}
// 3. Schedule this task to run every day at 2:00 AM
// Cron format: 'Minute Hour Day Month DayOfWeek'
cron.schedule('0 2 * * *', async () => {
console.log("Running daily partition maintenance...");
await createNextMonthPartition();
});
// Run once immediately on startup just to be safe
console.log("Partition manager running.");
createNextMonthPartition();
With this script running on your server, you completely eliminate the risk of missing a partition. It looks ahead to the next month and ensures the "drawer" is ready for incoming data.
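One subtlety worth calling out: the script leans on JavaScript's Date month-overflow behavior to handle year boundaries. Passing a month index past December (or before January) rolls the year automatically, so no December special-casing is needed. A quick standalone check:

```javascript
// JavaScript Date normalizes out-of-range month indexes,
// which is what makes the "+ 1 month" math safe in December.
const dec = new Date(2026, 11, 15); // December 15, 2026 (months are 0-indexed)

// Asking for month index 12 rolls forward into January 2027.
const next = new Date(dec.getFullYear(), dec.getMonth() + 1, 1);
console.log(next.getFullYear(), next.getMonth() + 1); // 2027 1

// Negative indexes roll backward too (the archiver later relies on this).
const past = new Date(2026, 0, 10); // January 10, 2026
past.setMonth(past.getMonth() - 3); // rolls back to October 2025
console.log(past.getFullYear(), past.getMonth() + 1); // 2025 10
```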
Mastering Indexes on Partitioned Tables
Now that our data is separated into smaller chunks, we still want our database queries to be lightning-fast. To do this, we need to create an index.
Starting in PostgreSQL 11, when you create an index on a partitioned parent table, PostgreSQL is smart enough to automatically cascade that index down to all existing child tables. Furthermore, it will automatically apply the index to any future child tables our Node.js script creates!
Beginner Tip: Always index the columns you use most often in your WHERE clauses. For telemetry data, this is almost always the timestamp or the user ID.
-- Creating an index on the parent table.
-- We are indexing the 'created_at' and 'user_id' columns
-- because we frequently search for a specific user's recent events.
CREATE INDEX idx_telemetry_created_user
ON partitioned_telemetry (created_at, user_id);
-- PostgreSQL will automatically cascade this index onto:
-- telemetry_2026_01
-- telemetry_2026_02
-- And the new tables our Node.js cron job creates!
Because the child tables are much smaller than a single massive table, keeping these indexes updated requires significantly less memory and CPU power. Your insert speeds remain incredibly fast, and your read speeds become blazing fast because PostgreSQL only scans the specific partition that matches the date in your query.
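You can verify this partition pruning yourself with EXPLAIN. As a sketch against the tables defined above, a date-bounded query should list only the matching partition in its plan, not every child table:

```sql
-- Ask PostgreSQL how it will execute a date-bounded query.
EXPLAIN SELECT COUNT(*)
FROM partitioned_telemetry
WHERE created_at >= '2026-02-10' AND created_at < '2026-02-11';

-- The resulting plan should reference only telemetry_2026_02;
-- telemetry_2026_01 is pruned and never touched.
```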
Archiving: The Smart Way to Delete Data
We have solved the insertion speed and the query speed. But what about database storage space? Telemetry data builds up fast, and we likely need to delete data older than 90 days.
Remember earlier when we mentioned that the DELETE command causes "Table Bloat" and leaves expensive VACUUM work behind? Partitioning offers a magical alternative called Detaching and Dropping.
When you want to throw away old data, you don't run a DELETE query row-by-row. Instead, you instantly disconnect (detach) the entire month's table from the parent table, and then you either export it to a cheap storage service (like Amazon S3) or simply delete (drop) the whole table in one swift motion.
This operation takes milliseconds, reclaims 100% of the disk space immediately, and doesn't stress the database CPU at all. We have successfully used this architecture to process massive scale event tracking for high-volume logistics platforms without a single blip in performance.
Let's build a final Node.js worker that safely drops partitions older than 3 months. This script is designed to run once and exit, making it perfect for an OS-level cron job or an AWS Lambda function.
// archiver.js
require('dotenv').config();
const { Client } = require('pg');
async function dropOldPartitions() {
const dbClient = new Client({
user: process.env.DB_USER,
host: process.env.DB_HOST,
database: process.env.DB_NAME,
password: process.env.DB_PASSWORD,
port: process.env.DB_PORT,
});
try {
await dbClient.connect();
// Calculate the date for 3 months ago
const targetDate = new Date();
targetDate.setMonth(targetDate.getMonth() - 3);
const yearStr = targetDate.getFullYear();
const monthStr = String(targetDate.getMonth() + 1).padStart(2, '0');
// Target table to delete: e.g., telemetry_2025_12
const tableNameToDrop = `telemetry_${yearStr}_${monthStr}`;
console.log(`Attempting to drop partition older than 3 months: ${tableNameToDrop}`);
// We use "IF EXISTS" so it doesn't throw an error if the table was already deleted
const dropQuery = `DROP TABLE IF EXISTS ${tableNameToDrop};`;
await dbClient.query(dropQuery);
console.log(`Successfully cleaned up old partition: ${tableNameToDrop}`);
} catch (error) {
console.error("Error dropping old partition:", error);
} finally {
// Safely close the database connection when done
await dbClient.end();
}
}
dropOldPartitions();
Note: In a production environment, before running DROP TABLE, you might run an ALTER TABLE ... DETACH PARTITION command followed by a COPY to export the table data to a CSV file, upload it to AWS S3, and then drop the table. This is known as "Cold Storage Archival".
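As a sketch of that cold-storage flow (the partition name and file path are illustrative, and COPY ... TO a server-side file requires appropriate database privileges), the sequence looks like this:

```sql
-- 1. Disconnect the old partition from the parent (a fast, metadata-only change).
ALTER TABLE partitioned_telemetry DETACH PARTITION telemetry_2025_12;

-- 2. Export the now-standalone table to a CSV file on the database server.
COPY telemetry_2025_12 TO '/var/backups/telemetry_2025_12.csv' WITH (FORMAT csv, HEADER);

-- 3. Upload the CSV to S3 (outside SQL), then drop the table to reclaim disk space.
DROP TABLE telemetry_2025_12;
```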
Summary
Scaling a database doesn't always mean learning a completely new technology stack or migrating your entire infrastructure to a specialized time-series database. By understanding how your database works at a fundamental level, you can achieve massive scale with the tools you already have.
By implementing native PostgreSQL declarative partitioning, we solved three major scaling problems:
- Insert Speeds: By keeping tables small and manageable, index updates no longer bottlenecked our application.
- Query Speeds: By querying specific timeframes, PostgreSQL only searches the relevant "drawers", drastically reducing query times.
- Storage and Cleanup: By using a Node.js worker to drop old partitions entirely, we safely removed millions of rows in milliseconds without causing database bloat.
Work With Us
Need help building this in production? SoftwareCrafting is a full-stack dev agency — we ship scalable React, Next.js, Node.js, React Native, and Flutter apps for global clients.
Frequently Asked Questions
Why does deleting old telemetry data in PostgreSQL cause performance issues?
In PostgreSQL, deleting large amounts of data doesn't immediately free up disk space due to Multi-Version Concurrency Control (MVCC). Instead, it creates "dead tuples" that lead to table bloat until a background process called VACUUM reclaims the space. This cleanup process can heavily tax your database CPU and slow down overall performance.
What is declarative table partitioning in PostgreSQL?
Declarative table partitioning allows you to divide a massive table into smaller, more manageable "child" tables based on a specific rule, such as a date range. A "parent" table acts as a router, automatically sending incoming data into the correct child partition. This makes querying recent data and archiving old time-series data significantly faster.
Do I need to modify my Node.js application code to use partitioned tables?
No, your application code does not need to change. When your Node.js app inserts a row into the parent table, PostgreSQL automatically evaluates the partition key (like the created_at timestamp) and quietly routes the data to the correct child table under the hood.
How does table partitioning solve the index bottleneck for high-volume inserts?
Adding an index to a massive, unpartitioned table slows down insert speeds because the database must update a giant index for every new event. Partitioning breaks the data into smaller chunks, meaning the database only needs to update the much smaller index of the specific active child partition, keeping insert speeds fast.
Can SoftwareCrafting help automate PostgreSQL partition maintenance?
Yes, the backend experts at SoftwareCrafting can design and implement automated Node.js archival workers tailored to your specific database needs. We help ensure that new partitions are created automatically before they are needed, and old partitions are safely archived or dropped without manual intervention.
Should I migrate to a specialized time-series database instead of partitioning PostgreSQL?
Migrating to a specialized time-series database often requires weeks of engineering effort, new infrastructure, and complex data migration. For many applications, implementing native PostgreSQL partitioning provides the necessary performance boost for high-volume event logging while allowing you to keep your existing, familiar database stack.
How can SoftwareCrafting assist with migrating an existing bloated table to a partitioned architecture?
Migrating hundreds of millions of rows from a standard table to a partitioned structure requires careful planning to avoid application downtime. SoftwareCrafting provides specialized database scaling services to safely execute these high-volume migrations, optimize your indexes, and future-proof your PostgreSQL architecture.
📎 Full Code on GitHub Gist: The complete partition-manager.js and archiver.js scripts from this post are available as a standalone GitHub Gist — copy, fork, or embed them directly.
