Mammoth
  • |
  • Contact
  • |
CMD | Command Prompt, Inc. - PostgreSQL Solutions, Support & Hosting
  • |
  • |
  • |
  • |
  • |
PgWest: 2010 Call for Papers (2nd call)
Posted Wednesday Aug 25th, 2010 12:14pm
by Joshua Drake
| Permalink

Yes, it is the second call. That means some of you haven't submitted after the first call. Of course, I haven't submitted mine either; so it is time for everyone to get on it. West is just around the corner and from all observations this West stands to be the largest PostgreSQL Conference, ever. (O.k. we might not over take Brazil).

Here is the announcement for everyone to review, enjoy and click on the CFP link:

Following on the smashing success of PostgreSQL Conference East, PostgreSQL Conference West, The PostgreSQL Conference for Decision Makers, End Users and Developers, is being held at the Sir Francis Drake Hotel in San Francisco from November 2nd through 4th 2010. Please join us in making this the largest PostgreSQL Conference to date!

  • Main conference site
  • Call for Papers

    Thank you to our sponsors:
    Founding: Command Prompt
    Diamond: EnterpriseDB

    Time line:

    July 14th: Talk submission opens
    Sept 5th: Talk submission closes
    Sept 10th: Speaker notification
    This year we will be continuing our trend of covering the entire
    PostgreSQL ecosystem. We would like to see talks and tutorials on the
    following topics:

          * General PostgreSQL: 
                  * Administration 
                  * Performance 
                  * High Availability 
                  * Migration 
                  * GIS 
                  * Integration 
                  * Solutions and White Papers 
          * The Stack: 
                  * Python/Django/Pylons/TurboGears/Custom 
                  * Perl5/Catalyst/Bricolage 
                  * Ruby/Rails 
                  * Java (PLJava would be great)/Groovy/Grails 
                  * Operating System optimization
                    (Linux/FBSD/Solaris/Windows) 
                  * Solutions and White Papers 
    


    Categories: Business, OpenSource, PostgreSQL, Python, SQL

  • An update on Replicator
    Posted Monday Aug 16th, 2010 07:55am
    by Alexey Klyukin
    | Permalink

    This is my first post on Replicator, I'm going to start by describing the terminology we use, bringing some analogies from other replication systems.

    Replicator is an asynchronous master-to-multiple-slaves replication system. It works by propagating binary changes from a single read-write node (called master) to one or more read-only nodes (called slaves) through an intermediary (forwarder) process. The data changes are stored in binary transaction files on a per-transaction basis. Additionally, each file contains a list of replication tables the data belongs to. Every slave has a distinct set of tables to replicate. Finally, transaction files are addressed by a special data structure called replication queue.

    After connecting to the forwarder for the first time each slave node performs an initial sync (full dump) by requesting a complete up-to-date snapshot of replication tables. The forwarder doesn't necessarily resend such request to a master process. Instead, it checks the queue for past full dumps and reuses them if appropriate.

    To reduce the bandwidth and disk space consumption per each slave the set of tables replicated by the slave is compared with the set of current transaction's tables, and the forwarder decides to send a transaction to the slave only when these 2 sets intersects; thus, each slave receives data only for those tables it replicates. So far, there was an important exception to this rule: full dump transactions were always sent to every slave. The original justification for this was the fact that full dump was required after addition of a new replicated table, and every slave had to be aware of this addition.

    In 1.8 we introduced a new feature called 'per-table dumps', which allowed a slave to request a snapshot of a single table, instead of requesting a full dump. Currently, when a new table is added to the replication, only a single per-table dump is requested, and there's no need for a full dump. This made possible for a slave to 'skip' a full dump, and changes committed last week implement exactly this: if the slave is in sync (i.e. doesn't wait for a dump, or recovering from an error), the forwarder just skips sending full dump transactions to this slave, therefore avoiding most bandwidth-consuming transactions. It's a clear win!

    Additionally there is another related positive change. When a slave restores full dump transaction it replaces the data of each replicated table with the one from the dump, leaving the table inconsistent for some period of time (which is usually short, but depends on table size and other factors). By reducing the average number of dumps per each slave we also reduced the number of these 'inconsistency gaps'. Double win!

    The next version, 1.9, is still in development. We put it (as well as other open-source projects) on github, so you are welcome to check it out and join the replicator mailing list.

    Stay tuned for further updates!


    Categories: OpenSource, PostgreSQL

    FOSSExperts, day 2
    Posted Wednesday Jul 28th, 2010 09:35am
    by Joshua Drake
    | Permalink

    I expected feedback from the community on FOSSExperts. I did not expect feedback with such immediacy. All the feedback I have received so far is positive. Which is a great feeling. Here are the key points that are coming back.

    How do deal with disagreement about the deliverable:

    This is an interesting one. I wanted to keep FOSSExperts simple. That is why the deliverable on the ALTER TABLE project is simple, committed to PostgreSQL Core.

    That may not work in all circumstances. So I am considering one of two options. The first option is courtesy of Josh Berkus. The idea would be to have a board of people that determine whether or not the deliverable has been met. This has merit because you have a panel of experts that make the determination. It can make the review process painless but is also takes the power out of both the funders and developers.

    The second one is to take a vote. It would work something like this.


    • Developer states project is complete and demonstrates completeness based on the deliverable.
    • Every person who funded the project votes on whether or not the project is complete.

      • If 66% vote the project complete, developer gets their money.
      • If less than 66% vote project incomplete, developer doesn't get their money.

      There would have to be some caveats. First the funders need to be able to communicate with the developer because they may not have understood part of the spec. Further as we are working with Open Source, the end deliverable may have been changed based on the will of the community versus the developer (see the Hot Standby work with PostgreSQL).

      The voting would also need to be limited to a period of time. I was thinking 14 days. The 66% would be tallied against those that voted in that 14 days.

      I like this idea because it removes the third party and it stops a single funder from calling foul as they are part of a collective vote. What do you think?


    Other than that the feedback has been extremely positive. I even posted to the LedgerSMB list and multiple people are excited to see this opportunity.

    If you have ideas, please share them. We have setup a flame page just for this.

    Remember a lot of your questions can be answered in the FAQ as well.


    Categories: Business, OpenSource, PostgreSQL, Python, SQL

    FOSSExperts, a new way to fund Open Source (Beta)
    Posted Tuesday Jul 27th, 2010 03:37pm
    by Joshua Drake
    | Permalink

    The cat is out of the proverbial bag. I originally planned to have a quiet roll out with a few close contributors but that has gone by the wayside. Now I am going to be pushing hard for people to test, beat on, object to, argue about, flame upon, scream at, praise and hopefully help us build out something that is truly useful for the FOSS Community. What am I blathering on about? FOSSExperts of course.

    FOSSExperts is a new site specifically engineered to allow FOSS developers to raise money for projects they are trying to develop. The idea stemmed from the very cool Kickstarter. With our focus obviously being on a different kind of creative.

    This is long overdue in the FOSS Community. There are a great deal of communities out there (LedgerSMB for example) that can use a place for their developers to try and raise funds for a specific feature. LedgerSMB just recently had a discussion on developing a Payroll module. Developing a Payroll module will be expensive for a single small company to absorb, but 20 small companies? Not nearly as expensive.

    What FOSSExperts is not, is a place to send money to global projects such as Debian or PostgreSQL.org. It is for specific, well defined proposals and has a specific and defined delivery as well as refund policies etc.

    Right now, we are in closed Beta. If you have a project or proposal you would like to try out you need to email me directly but we are interested. So take a look, and let the rage begin! If you like, you can review one of the larger proposals already on the site as well.


    Categories: Business, OpenSource, PostgreSQL, Python, SQL

    A better backup with PostgreSQL using pg_dump
    Posted Friday Jul 23rd, 2010 11:23am
    by Joshua Drake
    | Permalink

    This is generously borrowed from the PostgreSQL Docs, and updated to something that represents a modern approach to PostgreSQL backups. This documentation has always bothered me because it should have been re-written years ago. Yes I plan on submitting a more comprehensive version as a patch but I don't have time to push it into DocBook right now. If someone else wants to grab it, please do.

    Yes, I really do believe the use of plain text backups is a mistake. Yes I realize PostgreSQL has the limitation of not being able to backup the cluster in anyway but plain text.

    The standard for portable backups with PostgreSQL is pg_dump and pg_dumpall. When used properly pg_dump will create a portable and highly customizable backup file that can be used to restore all or part of a single database.

    The pg_dump application acts as a standard PostgreSQL client. This means that you can perform this backup procedure from any remote host that has access to the database. You do not need to be a super user to use pg_dump but you must have read (and EXECUTE for functions) access to every object within the database.

    Backups created by pg_dump are internally consistent, meaning, the dump represents a snapshot of the database at the time pg_dump began running. The backup will not block other operations on the database while it is working. (Exceptions are those operations that need to operate with an exclusive lock, such as most forms of ALTER TABLE.)

    The minimum useful syntax for pg_dump is:
    pg_dump dbname > outfile

    However, the backup created from this method has limited usefulness. It can be used to restore a single database in full. A more useful and proper form of PostgreSQL backup syntax looks like this:
    pg_dump -U $username --format=c --file=$mydatabase.sqlc $dbname

    The options in detail are:
     -U, --username=NAME      connect as specified database user
     -F, --format=c|t|p       output file format (custom, tar, plain text)
     -f, --file=FILENAME      output file name
    

    The most important of which is --format. By default pg_dump uses the plain text format. The plain text format is useful for very small databases with a minimal number of objects but other than that, it should be avoided. The custom format allows for a wealth of customizability. Using the custom format you are able to restore single objects from a backup. For example to restore only a specified index from a backup file:
    pg_restore -U $username --dbname=$dbname --index=$indexname
    

    If you wanted to restore only a single function:
    pg_restore -U $username --dbname=$dbname --function=$functionname(args)
    

    If you wanted to restore only a single table:
    pg_restore -U $username --dbname=$dbname --table=$tablename
    

    For more information on all the pg_dump options, please see the reference page.

    Restoring the dump
    The command used to restore a backup file is pg_restore. It has similar options to pg_dump. A simple restore:
    pg_restore -U$username --dbname=$databasename $filename

    Where filename is the name of the backup file.
    Do not confuse --file with $filename. The --file option is used to turn a custom format backup into a plain text backup. The value of --file will be used as the output file for that transformation.


    If you make the mistake of creating a plain text backup, pg_restore can not be used as a restoration mechanism. You can use psql to restore it:
    psql $dbname < $backupfile


    Backing up every database
    The "postgresql" way of backing up every database is to use the command pg_dumpall. Unfortunately pg_dumpall can only create plain text backups and should be considered deprecated. However it is the only way to backup the globals in your cluster. A reasonable backup strategy to backup your globals and produce a flexible backup of every database in the cluster would look like this:
    pg_dumpall -g -U$username --file=$globals.sql; 
    psql -AtU postgres -c "SELECT datname FROM pg_database \
                              WHERE NOT datistemplate"| \
    while read f; 
       do pg_dump -Upostgres --format=c --file=$f.sqlc $f;
    done;

    If someone knows of some Windows code that produces a similar result, it would be great if you would share.
    Remember, pg_dumpall creates a plain text backup. This means you will need to use psql to restore the globals backup file.


    After restoring a backup, make sure you run ANALYZE to update the statistics.

    I know this isn't as comprehensive as it could be, but hey, its just a blog.

    Categories: Business, OpenSource, PostgreSQL, SQL

    Multiple Drupal installations, single login, 10 steps
    Posted Thursday Jul 22nd, 2010 08:21pm
    by Joshua Drake
    | Permalink

    We have several Drupal sites, no I am not typing this blog on one. We needed a way to have single sign on with these Drupal sites. One of which is PostgreSQL Conference. There are a few modules out there that can do it, some don't work with PostgreSQL, some are usable but not user friendly (HTTP AUTH) and still others use external services such as OAuth. I didn't want any of these. I wanted a simpler, more flexible solution. I found it with a little PostgreSQL know-how and a modification to the Drupal settings.php file.

    The following is ten steps that assume we have three sites. At the end of the steps we will have single login between the three sites..

    Step 1: Create users
    psql -U postgres;
    create user one with encrypted password 'foo';
    create user two with encrypted password 'bar'
    create user three with encrypted password 'baz';
    

    Step 2: Create database and schemas
    create database drupal;
    \c drupal -- (assumes the use of psql)
    create schema one authorization one;
    create schema two authorization two;
    create schema three authorization three;
    

    Step 3: Sandbox users
    alter user one set search_path = 'one';
    alter user two set search_path = 'two';
    alter user three set search_path = 'three';
    

    Step 4: Install Drupal
    For the sake of brevity I am going to assume you have unpacked three copies of drupal in the same directory. Perhaps /home/www/one, /home/www/two, /home/www/three . At this point you would use your web browser and set up drupal normally. Just assign your users appropriately to each install and set your database to drupal.

    Step 5: Turn off caching (for testing)
    Go into the Drupal Administration pages and turn off caching for every install.

    Step 6: Alter users and sessions location
    This will break your installs initially. Don't fret. It does not really matter which one you pick but for consistency we will assume that the drupal install one is the canonical version.
    alter table one.users set schema public;
    alter table one.sessions set schema public;
    

    Step 7: Fix perms
    create role drupal user one,two,three;
    alter table users owner to drupal;
    alter table sessions owner to drupal;
    grant insert,update,delete on users to drupal;
    grant insert,update,delete on sessions to drupal;
    

    Step 8: Modify settings.php
    Drupal offers the ability to use a single database for multiple installs using an array called db_prefix. Modify the value in each install to:
    $db_prefix = array('users' => 'public.',
                     'sessions' => 'public.',);
    

    Step 9: Test
    At this point you should be able to login to each site using the user/pass from the one install. To test it further add a new user to any of the installs and see if you can login on a different one.

    Step 10: Marvel (Oh and turn back on caching)
    That's right, marvel. No obnoxious plugins. Simple overhead. Works even if the installs aren't on the same machine (although you would need to modify pg_hba.conf and possibly postgresql.conf).

    Categories: Business, OpenSource, PostgreSQL, SQL

    Let the jokes begin! PostgreSQL Conference West has changed locations.
    Posted Wednesday Jul 21st, 2010 09:27am
    by Joshua Drake
    | Permalink

    About a week ago I announced PostgreSQL Conference West 2010 CFP. In that CFP I also announced the location. A nice place, the Westin at Union Square in San Francisco. We were excited, the hotel was top knotch.

    Then on Monday I received notice, the hotel acquisitions team (EDB) has received an amazing counter offer from a competing hotel.

    The hotel is still in San Francisco, it is a four star hotel and the rate is much better for attendees (159.00 vs. 199.00). Here is the catch, which if you are reading on Planet you have to wait until after the jump.... Read more...


    Categories: Business, PostgreSQL, SQL

    Simpycity now available on Github
    Posted Tuesday Jul 20th, 2010 05:43pm
    by Aurynn Shaw
    | Permalink

    Following up on our brand-new Simpycity 0.3.1 release from earlier today, you're now able to get hold of Simpycity via the ever-popular code-sharing platform GitHub.

    Check us out @ GitHub, and track all the Command Prompt projects!

    Categories: OpenSource, Python

    Announcement: Simpycity 0.3.1 Released
    Posted Tuesday Jul 20th, 2010 01:13pm
    by Aurynn Shaw
    | Permalink

    Following up on the blog post covering the new coolness in 0.3, and better docs on working with Simpycity, we've just released Simpycity 0.3.1, our best release yet!

    Simpycity can be downloaded from our Wiki, and our code is available from the Subversion repository.

    Finally, starting today, all new releases of Simpycity are available on the PyPI package index, and Simpycity installable via:
    $ easy_install Simpycity
    


    Categories: OpenSource, PostgreSQL, Python

    Active Object in Simpycity
    Posted Tuesday Jul 20th, 2010 01:04pm
    by Aurynn Shaw
    | Permalink

    Simpycity is, as we've previously covered, a small library that permits for the direct mapping of arbitrary SQL statements to Python callables.

    This power allows for the development of complex representations that do not need to directly map to the underlying database representation.

    This differs from most conventional ORM technology, which follows the ActiveRecord pattern.

    Simpycity was implemented in this way for a great many reasons, first and foremost that the Active Record pattern is not the best representation of your data to your application logic. Should the application need to be aware of underlying data relationships? Should
    the application be aware of foreign keys, structures, and other underlying constructs?

    Or should the application be able to interact with the data in a form that is logical, and sensible to the application, without needing deep
    knowledge of underlying representations?

    We thought so, and Simpycity, and a concept more along the lines of Active Object is our result.

    Disparate Representations


    Simpycity, instead of writing SQL for you via query generators, requires that the developer write SQL by hand.

    The reason for this is that database representations are not generalizable into object relations - this disconnect is the entire
    reason behind the object-relational difficulties.
    A proper object representation encapsulates all the possible information about a method in a single location, as well as all the necessary
    methods to act on that data. A single object then represents a single quantum of data.

    However, for a relational system, normal form requires that disparate pieces of information are further broken down, into points of absolute
    truth about the data. A person's name, for instance, is a point of absolute truth, and should exist in only a single place in the database,
    whereas a person's name could exist in several places in an Object system, in a sensible manner.

    The disparity comes in that a Person, in terms of business requirements, is rather different from a Person in SQL terms, to the point where it would not be sensible to represent a database Person as an object Person - Address information, birthdate, all sorts of ancillary data that would normally be present isn't, per correct normal form.

    Simpycity works to avoid this, by allowing for business models that have little if anything in common with the underlying table structure,
    allowing for proper normalization as well as useful business objects.

    Forging Anew


    As Simpycity does not impose the database structure on your objects, it can't immediately provide the functionalities of .new() in the way a conventional ORM can - even though we've seen Simpycity handle the .save() feature brilliantly.

    Instead, if you instance a Simpycity model, not from the database, you get precisely and only a Simpycity instance. As it's not connected to a known set of database data, all the functions and other associated items have no way of operating, and the model just sits there, forlorn and empty.

    But since we have to match the Active Record pattern, how would we go about providing .new() in Simpycity?

    Here's how we do it:

    Given a standard model that looks like this,
    class model(ctx.Model()):
         table = ['id','name']
         __load__ = ctx.Function("load_obj", ['id'])
    


    We're able to do simple and basic load operations. Right?

    But, to create a new object, the pattern more resembles:
    
    class model(ctx.Model()):
    ...
    
    new = ctx.Function("new_obj",['name'], return_type=model)
    


    Which allows for the external interface of:

    import yourmodel
    o = yourmodel.new("Some name")
    o.commit()
    


    Providing a clean and sensible model API, following the ActiveRecord pattern,
    but still offering all the power of Simpycity.

    Twisty Little Properties


    Another very nifty capability of ActiveRecord systems is that of reflection, automatically retrieving the far end of a foreign key constraint. This allows for useful functionality like

    aModel.comments
    


    correctly reaching across the one-to-many relationship and pulling all the comments.

    As Simpycity doesn't directly map tables, capabilities such as this aren't directly implemented in Simpycity.
    However, since we do realize that business objects need to perform similar tricks and load data in via properties, we added specific support for this into Simpycity.

    But, since Simpycity is entirely callable based, we had to be able to support this feature with our existing metaphors. To that end, we included a simple function that will take any Simpycity callable (or any callable, really), interrogate its argument list, and handle argument mapping as you'd expect.

    Using this feature is as simple as:

        from simpycity.helpers import prop
        
        class myTextObject( ctx.Model() ):
            table = ['id', 'value']
            __load__ = ctx.Function("textobject.by_id",['id'])
            comments = prop( get=ctx.Function("textobject.comments", ['id']) )
            
        mto = myText(1)
        comments = mto.comments
    


    Easily allowing for sensible properties to be created, based entirely on clean Simpycity code. Properties created in this way even support set and delete functionality, identical to a standard property, allowing for property accessors to easily manipulate the database layer.

    As a note, prop() is a new feature in 0.3.1. 0.3.0 and below should use

        from simpycity.helpers import sprop
        
        class myTextObject( ctx.Model() ):
            ...
            comments = property(sprop( get=ctx.Function("textobject.comments", ['id']) ))
    


    Obviously not as clean, and not as capable. You should upgrade ASAP.

    Next


    We've stabilized the API, made everything work through the consistent Context interface, and have built a powerful callable-based model infrastructure for all sorts of application development.

    So what's next for Simpycity? Well, some of the things we're planning on include breaking the Model object away from psycopg2 dependency, allowing us to use other PG drivers (such as pg8000), as well as opening up the Model protocol we've defined for other contexts - file access, for instance. Anywhere that an application needs to represent a complex underlying structure as a simple object, the Model could be used.

    More in the future, we're really looking forward to integrating Simpycity callables with Django and SQLAlchemy model objects, using Simpycity to provide strong, clean functional and raw query support in those environments. And vice-versa as well: Binding a SQLAlchemy or Django ORM chain to a Simpycity object, using it to populate an object, and building even more complex, effective business objects for your application.

    Even farther afield, we've been looking at integrating query generation to Simpycity. There's a lot of boilerplate SQL that needs writing, and being able to hand it to an elegant, PostgreSQL-focussed abstraction would be, we think, ideal.

    As always, Simpycity is available on our Wiki, and our code can always be checked out from our Subversion repository.

    Finally, Simpycity is easily installed from the PyPI index, using easy_install Simpycity.

    Categories: OpenSource, PostgreSQL, Python


    Copyright © 2000-2010 Command Prompt, Inc. All Rights Reserved. All trademarks property of their respective owners.