Blog

Connection Initiation Overhead is Killing Your Web App - Use a Connection Pool

Customers often ask us what is the correct setting for max_connections on their PostgreSQL cluster?  There is a short answer to this question, and there is a very, very long answer. The short answer is: accept the default if that works for you, or try 10 times the number of CPU cores and see if that works ok.

There is another short answer which is: you may be asking the wrong question.

The original implicit assumption is "more connections will get me more throughput". This is often false. A question usually worth pursuing is "How can my existing set of connections be used efficiently?"

If you think that your application wants more connections and you are considering raising max_connections ...

Read More

Copy Files Through an SSH Wormhole!

Here is a fun and useful Linux console hack you might like.

If you use ssh A LOT, maybe you encounter situations where you have a shell that is three or more hops removed your local workstation. And maybe you've been in the situation where there is some file that you need to get from "there" to "here" or vice versa. And you really are annoyed by the thought of scp-ing the file over 3+ hops (and not so lucky that all the keys and permissions in the pipeline enable easy piping through ssh).

Lucky you! In a moment of inspiration, I concocted this ultra-unixy hack to copy the file in ONE hop (more or less). Assuming the file ...

Read More

PostgreSQL Non-exclusive Base Backups in Bash

Here I'm posting a bash script that implements PostgreSQL's new (since 9.6) "non-exclusive" base backup.

I often find that new customers are shy about binary PostgreSQL backups and only schedule logical pg_dump backups. PostgreSQL provides a brilliantly simple binary backup solution that enables easy point in time recovery, unlike pg_dump backups. So I always try to steer people towards these binary "base backups", as they are called in the PostgreSQL world.

In version 9.6, this base backup feature became more flexible, with an additional option allowing more than one base backup to run at a time. For customers with aggressive (or badly scheduled) backup schedules, this allows both backups to succeed, where they would both fail ...

Read More

Postgres Load Testing with pgreplay

Overview

Usually we want to test before deploying changes to a production Postgresql cluster. Commonly the test itself is executed in a context that is not very similar to the production environment. How can you run a test that is realistic so there are no horrible surprises when you deploy on the production system?  Read on! This document describes a procedure where changes can be tested on a system that is running a load that is nearly identical to a production system.

This procedure utilizes pgreplay, which reads postgres' server logs and executes the sql statements found there, with the same timing that they were executed on the production system.

This procedure also uses lvm and its volume snapshot ...

Read More

Simpycity 2.0.0 released (An ORM in Python)

What Simpycity Is

Simpycity is an object-relational mapper. It seamlessly maps PostgreSQL query
and function result sets to Python classes and class attributes.

It allows for the easy and rapid development of query- and
stored procedure-based data representations. Simpycity leverages PostgreSQL's
powerful composite type system, and the advanced type handling of the psycopg2
database access library.

What Simpycity is Not

Simpycity is not a SQL generator and does not attempt to abstract or hide SQL.
Simpycity is designed for developers who deeply understand SQL and
desire to write the best possible SQL representations for their database.
Simpycity also rejects the Active Record paradigm, whose simplistic patterns
fail in even moderately complex systems.

Core Philosophy

The core philosophy behind Simpycity ...

Read More