So what is the problem? The problem is they weren't running Autovacuum. Now many of my brethren would say, "HERESY!" but in reality there are good reasons not to run Autovacuum (though I would not say the same of Autoanalyze). Autovacuum is unpredictable and can cause performance problems. 99% of the time you should run Autovacuum, but there is that 1% of the time when you should consider the alternatives.
The point I am trying to make here, in my sleep-deprived state, is that Autovacuum can be turned on with just a reload. It does not require a restart. I swore up and down that it required a restart, and I ended up being wrong. I am not sure why I thought it needed one.
So there you go folks. Tip of the day, "autovacuum = on" only needs a reload.
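In other words, something like the following is enough (a minimal sketch, assuming superuser access and a stock postgresql.conf):

```sql
-- 1. In postgresql.conf, set:
--      autovacuum = on
-- 2. Ask the server to re-read its configuration; no restart needed:
SELECT pg_reload_conf();

-- 3. Confirm the setting took effect:
SHOW autovacuum;
```

Running `pg_ctl reload -D <datadir>` from the shell sends the same signal.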
Tallyho, read the docs!
Yes, I really did just write that. I believe the FSF no longer fulfills its
mission. Wait, let's back up a step. I can feel the torches starting to be covered
in pitch and the Frankenstein-mob cry of "kill the heretic" starting to rumble
through the old streets of the Free Software country. I am not here to say that
the FSF is useless or that it doesn't have purpose. I am not here to say that
Richard Stallman shouldn't continue on his political mission to save the world
from the use of rightfully produced and licensed closed source software.
What I am saying is that the FSF and GNU should separate, and that this
separation will act as a catalyst allowing each to pursue its
mission in a more productive manner. That's right: fsf.org and gnu.org should
be two separate non-profits, with two different boards. Yes, I am aware that
the two are one and, supposedly, ever shall be. I am declaring that "for better
or worse" is now worse, and a divorce is in order.
I have the deepest respect for the GNU project. I write this blog entry largely
on software that would not be possible without GNU components. I run Linux (no,
not GNU/Linux). I run KDE 4.10. I write this in Kate, although I normally prefer
Joe. I run PostgreSQL. I run Pidgin, Thunderbird, Gimp, Google Chrome (no not
Chromium), Wine, Netflix Desktop, Python, LibreOffice and my music is playing
using Amarok. And this, my fine Open Source (yes, Open Source, not Free Software)
denizens, is exactly why I think GNU should fork from the FSF.
FSF/Richard Stallman is a political movement. A political ideal full of zealotry.
It is uncompromising, unrelenting, stalwart and venerable. It has done a lot of
good, it continues to strive to do a lot of good. However...
"The Free Software Foundation (FSF) is a nonprofit with a worldwide mission to
promote computer user freedom and to defend the rights of all free software users."
On the other hand:
"The primary and continuing goal of GNU is to offer a Unix-compatible system
that would be 100% free software. Not 95% free, not 99.5%, but 100%. The name
of the system, GNU, is a recursive acronym meaning GNU's Not Unix, a way of
paying tribute to the technical ideas of Unix, while at the same time saying
that GNU is something different. Technically, GNU is like Unix. But unlike
Unix, GNU gives its users freedom."
As you can see, although the two are inextricably intertwined, they are also
fundamentally different in their purpose. One is about the freedom and rights
of users. The other is about developing Free Software.
It is my assertion that the continued political movement of the FSF is causing
the GNU Project to suffer from slow, politicized and in some ways arcane
development. Consider the tools you use. How many of those tools are actually
from the GNU project? Do all the tools I previously listed need GNU?
Absolutely. Are any of them from GNU? Only GIMP. I think you will find this is
the case with most modern Open Source (and Free Software) users.
It is time for developers, not lobbyists, to run GNU.
Bucardo, an asynchronous multi-master replication system (or perhaps the asynchronous multi-master replication system for PostgreSQL, because I know of no other actively developed ones), deals with replication conflicts by providing conflict handlers in the form of standard (built-in) or custom code procedures. The built-in conflict handlers resolve a conflict by taking a row from the source or the target database (in Bucardo 4 only two masters are supported, called 'source' and 'target' below), or by using a delta row timestamp to determine the winner; other options include skipping conflicting rows altogether or picking one at random, but those are not very useful if you need the data to be consistent between replicas. For more complex conflict resolution rules it's necessary to write a custom conflict handler and, since the documentation is really scarce on that matter, I've decided to show how to create a simple one.
I don't usually post rants here, but this one might actually be helpful to others, so let's make an exception. It relates to installing PostgreSQL from distro-specific packages. Unlike the majority of users, I usually prefer setting up PostgreSQL from source; nevertheless, I'm familiar with how popular distros, like Debian or Fedora, manage their PostgreSQL layouts. Or so I thought until today.
My task was simple: install a PostgreSQL instance for testing on a fresh Linux box, using a non-standard port. The catch: the box was running a relatively new Fedora 17.
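For context, the sketch below is roughly what the Fedora-style dance looks like (commands and paths are from memory and may differ by release; Fedora 17 is precisely where the old `service postgresql initdb` idiom was replaced by a wrapper script):

```shell
# Install and initialize the cluster (Fedora 17+ uses a wrapper script
# instead of the old "service postgresql initdb"):
yum install postgresql-server
postgresql-setup initdb

# Point the instance at a non-standard port, then start it:
sed -i "s/^#port = 5432/port = 5433/" /var/lib/pgsql/data/postgresql.conf
systemctl start postgresql.service
```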
It is simple. Most of us Open Source developers aren't generally good with average people. We are good with our "breed" of people, but move us out of our element and suddenly we can be awkward, offensive, and generally weird. We talk differently than other people, we have inside humor that doesn't travel well, and we are about as inclusive as the richest Skull & Bones society members. Is this bad? No, it is reality. Whenever you take a group of individuals who are on a different playing field than the average person, you are going to end up in this situation.
The second point that Bruce states is that closed source developers have very little interaction with users. I think this is misunderstood. To say that Open Source has more interaction with users is, in my opinion, completely false, or at least given much more weight than reality warrants. Ask any consultant: the majority of their customers have zero idea about the workings of the community, how to communicate with the community, or how to interact with developers. Frankly, they don't want to. They have software to run, businesses to operate, and employees to pay.
This can be further illustrated by watching the community. It took PostgreSQL years longer than it should have to get replication, and the community is just now starting to look at logical replication: features that were available in closed source versions of PostgreSQL, and as open source add-ons, years ago. The users wanted integrated replication, but the community wasn't willing to implement it at the time.
Please don't get me wrong, I love Open Source. I love Open Source development. Heck, the only closed source software I run is to play Civ5 occasionally. Everything else is Open Source. But I do think that we need to keep perspective on what is going on in the very large world that does not involve Open Source. It is much bigger, in a lot of ways more productive, and employs more people (a rarity in today's economy) than Open Source could ever hope to.
This is the second part in a series of blog posts describing PostgreSQL analogs of common Oracle queries.
One of the most intricate Oracle-specific constructions is "START WITH ... CONNECT BY". According to Oracle's documentation, the syntax is:
SELECT [query] [START WITH initial_condition] CONNECT BY [nocycle] condition.
This statement is commonly used to traverse hierarchical data in the parent-child order. It's easier to illustrate how it works with an example.
Consider a table that stores the opponents' moves in a game of chess. Each table row contains the coordinates (in algebraic notation) of a single move by White and the responding move by Black, as well as a column that references the preceding move, making it possible to keep multiple continuations of a given move for post-game analysis.
CREATE TABLE moves(id integer, parent integer, white varchar(10), black varchar(10));
The following statements describe two variants of a very short game: the first one leading to a position where Black successfully avoids being checkmated, and the second one to an early checkmate (known as the Scholar's Mate).
INSERT INTO moves VALUES(1, 0, 'e4', 'e5');
INSERT INTO moves VALUES(2, 1, 'Qh5', 'Nc6');
INSERT INTO moves VALUES(3, 2, 'Bc4', 'g6');
INSERT INTO moves VALUES(4, 3, 'Qf3', 'Nf6'); -- checkmate is avoided
INSERT INTO moves VALUES(5, 2, 'Bc4', 'Nf6');
INSERT INTO moves VALUES(6, 5, 'Qxf7#', NULL); -- Black is checkmated
Let's build an Oracle query showing a sequence of moves that leads to the checkmate:
SELECT DISTINCT
       id AS final_move_id,
       LTRIM(SYS_CONNECT_BY_PATH(NVL(white,'') || ':' || NVL(black,''), ';'), ';') || ';' AS moves,
       LEVEL AS mate_in
FROM moves
WHERE white LIKE '%#' OR black LIKE '%#'
START WITH id = 1
CONNECT BY PRIOR id = parent;
The query instructs Oracle to start with the first move (START WITH id = 1), follow each move to its continuations (CONNECT BY PRIOR id = parent), and keep only the rows where a move ends with the checkmate sign '#'.
As a result, Oracle goes from one row to another only if the parent column of the new row contains the id of the current row, accumulating all visited rows in a result set. The SYS_CONNECT_BY_PATH clause produces a string out of the specified columns of the visited rows, connecting each (parent, child) pair by the designated character (';' in our case).
Being an Oracle SQL extension, CONNECT BY is not available in PostgreSQL. Recent versions of PostgreSQL implement Common Table Expressions (CTEs), the SQL-standard way of dealing with hierarchical data. Here's one possible rewrite of the query above for PostgreSQL using a recursive CTE:
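A sketch of one such rewrite, using the `moves` table from above (PostgreSQL 8.4 or later; column aliases mirror the Oracle version):

```sql
WITH RECURSIVE tree(id, white, black, moves, mate_in) AS (
    -- Anchor: the initial move (Oracle's START WITH id = 1)
    SELECT id, white, black,
           coalesce(white, '') || ':' || coalesce(black, '') || ';',
           1
    FROM moves
    WHERE id = 1
  UNION ALL
    -- Recursive step: parent -> child (Oracle's CONNECT BY PRIOR id = parent)
    SELECT m.id, m.white, m.black,
           t.moves || coalesce(m.white, '') || ':' || coalesce(m.black, '') || ';',
           t.mate_in + 1
    FROM moves m
    JOIN tree t ON m.parent = t.id
)
SELECT id AS final_move_id, moves, mate_in
FROM tree
WHERE white LIKE '%#' OR black LIKE '%#';
```

LEVEL becomes an explicit counter (`mate_in`), and SYS_CONNECT_BY_PATH becomes ordinary string concatenation accumulated along the recursion.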
Postgres-XC has been around for a while; it is primarily developed by NTT and EnterpriseDB. It has a small community but a dedicated engineering/hacker backing. Postgres-XC is interesting because it keeps reasonably up to date with the latest Postgres (1.0 is set to be based on PostgreSQL 9.1) while providing a shared-nothing clustering architecture. This type of infrastructure is one of the holy grails of web-based applications.
Should Postgres-XC deliver on its promises (hint: it does), you will be able to scale out (as opposed to up, which Postgres already does extremely well) at an almost 1 to 1 ratio. This means that instead of having to purchase 2 large machines at 10-12k apiece, you could purchase 4 machines at 1.5k apiece and achieve similar performance (theoretically; I need to test this). It also means that scaling out in the "cloud" will be easier.
I invite everyone interested in PostgreSQL to take a look at Postgres-XC. It is going to 1.0 soon, and it needs community members to help find and fix the warts that haven't been discovered yet.
Another Postgres fork that has recently appeared is tPostgres. tPostgres (doesn't that look wrong at the beginning of a sentence?) is set to do to Microsoft SQL Server what EnterpriseDB did to Oracle, with one minor, small, interesting exception: tPostgres is Open Source. Further, Microsoft SQL Server is more in line with PostgreSQL in the types of workloads you usually see it performing. Imagine tPostgres with Postgres-XC. Imagine an open source way to easily port Microsoft SQL apps to PostgreSQL.
Now don't get me wrong, the latest versions of Microsoft SQL are actually good products. Yes, I did just say that. However, they are not Open Source, they are expensive (comparatively) and let's get real, we want everyone to run Postgres.
Unfortunately, tPostgres has only just been announced and they are literally at the beginning of building their community, but as it is being initiated by Denis Lussier (co-founder of EnterpriseDB), I imagine that he will come through with something very interesting indeed.
That video represents why I put on the conferences. They were fun. We had a good time.
If you are looking for other Postgres conferences there are the following:
Personally, I would suggest staying local and attending or helping organize a local PUG day for PostgreSQL. PUG days are the best kind of small conference. You meet many locals, quite a few contributors usually show up, and you get to go home at night. The content is always top notch and chances are you know many of the people there. There are many of them: we recently had them in NYC, DC/Maryland, and Austin. There is a Denver PgDay on the 26th of October (no website yet) as well.
Why does this matter? It doesn't really. I am just rambling because my sister asked me today something that surprised me, "What is UNIX?". I had to just kind of stare at the screen for a moment. Of course she asked me this as she was happily proclaiming that she received an iPhone for her birthday. How far we have come.
I explained what UNIX was, the basic history, its involvement in the Internet, and it occurred to me that there was one very specific point in my life where my professional world went from "huh... give me my 7.50/hr" to "Hey, I can actually become educated in something useful." It was the mental absorption of this book.
That book allowed me to learn Unix, which allowed me to learn Linux (back when SLS was king), which brought me to Postgres95, which brought me to PostgreSQL, which led me to co-writing this book, and to becoming a major contributor to PostgreSQL, not least through my work with the Fundraising Group (via SPI). I would also bring up the conferences, but those are already mentioned today.
While waxing nostalgic I am reminded of a recent blog post by Bruce Momjian where he mentions, "Postgres adoption is probably five years behind Linux's adoption." I would agree with him, and would add that a lot of it is directly attributable to our development model. Many in the community have argued for years that time-based releases of PostgreSQL would help development; many others have argued for years that this is a bad idea. Many of those opponents of time-based releases, including one very influential one (TGL), are now starting to come around. More on that later, I have work to do!