Replication

Replication is used by many businesses, especially financial institutions. For instance, many banks use complex replication systems to link their security trading operations at stock exchanges around the world. Therefore, many replication servers can be found in New York, London, and Tokyo.

This section explains the main ideas involved in replication, why it is important, and installing and configuring the enterprise Replication server (eR server) for PostgreSQL.

What is Replication?

Replication is the server process of copying data modifications from one location to another. The main characteristics of replication which make it appealing are:

Although the replication server is powerful and flexible, it is not easy to administrate. Another contributing factor is that the scope and complexity of the database administrator's job expands significantly when replication is involved. The previously independent data servers are now closely related because data is replicated between them. This effectively means that the data servers have become one large system. Thus, making a database administrator's job much more difficult.

Obtaining the eR Server

Also known as Rserv

The eR server is the enterprise Replication Server for PostgreSQL. To obtain the eR server, go to the web site at www.pgsql.com. Or check in the directory path /usr/local/src/postgresql-7.1.2/contrib/ for the rserv/ folder, which holds all of the source files.

NoteNote
 

If you downloaded the eR server from the Internet, then it is important that you decompress the zipped file in the contrib/ directory beneath PostgreSQL 7.1. Once the eR server is made, it begins looking for the PostgreSQL source files. If you did not install PostgreSQL 7.1 in /usr/local/src/, then you can move the compressed file to the contrib/ directory beneath where you installed PostgreSQL.

Listed below are the required components to make replication possible between a master and slave machine:

We will refer to the path /usr/local/src/postgresql-7.1.2/contrib/ as the location which holds the eR server. You can also substitute your database name into the places where we use the booktown database.

Step 1:Unpacking the Source

If you are using the existing files in your contrib/ directory, then you can skip this step. On the other hand, if you downloaded the eR server from the Internet, you will need to use the following command to unpack the eR server:

Step 2:Installing the eR server

After the last command has finished execution, a new directory named rserv/ was automatically created. Move to that directory:

   cd rserv/
   

You will now need to make the files. Depending on your system, you can either use the make or gmake command :

   gmake
   

NoteBSD users
 

In BSD, gmake is called gnumake.

After the make has finished, continue to make the installation files by using the command:

   gmake install
   

If you do not have the Pg module installed on both the master and slave machine, then you should walk through the steps to install the Pg module. If you have the Pg module installed, you are now ready to configure the eR server. You should skip to the section entitled Configuring the eR server. It will walk you through the main steps needed to use a replication server.

Installing the Pg module

You can use the Perl utility called cpan to install the Pg module. We will not cover cpan in its entirety. Details on configuring and using cpan is beyond the scope of this document.

Note

You need to have the Pg module installed on both the master and slave machine.

Step 1: Set environment variables

You need to set the environment variables POSTGRES_INCLUDES and POSTGRES_LIB to the include and lib directory. The command below will set them to the correct path:

   POSTGRES_INCLUDE = /usr/local/pgsql/include
   POSTGRES_LIB = /usr/local/pgsql/lib
   

Step 2: Create root user

A root user is required when installing the Pg module because the cpan utility will run some tests as the root user. You can disable these tests, but this document will not cover how to do so.

Log in as your postgres superuser and create a root database user. Use the command:

   su - postgres
   createuser root
   

WarningCreating a PostgreSQL root user
 

Normally, you shouldn't create a PostgreSQL root user. However, to install the Pg module requires some tests that run as the root user. To close any security holes that may occur from creating a PostgreSQL root user, you should closely follow all of the steps outlined in this section. Especially step 4, which is removing the PostgreSQL root user.

It will ask you what type of permission should this user have. You should specify the PostgreSQL root user to only have permission to create databases but not be able to create users.

Step 3: Using cpan to install the Pg module

Start the cpan process by using the command:

   cpan
   

You can now use the command below to install the Pg module. It will take a few minutes to install.

   install Pg
   

Once it has finished running, the last line should display the message:

 
  /usr/bin/make install  -- OK
   

You can exit the cpan program by typing:

 
   quit
   

Step 4: Removing the PostgreSQL root user

It is important that you drop the PostgreSQL user named root after the installation process has completed. It is not necessary to give a root user access to the PostgreSQL database because it is a security concern. Use the command:

   dropuser root
   

Configuring the eR Server

Step 1: Dumping the database

Go to the master machine that stores the database. Use the pg_dump command to dump the database to a file. For more information, refer to the pg_dump command in the reference section. This command dumps our booktown database into a file called booktown.db. You can give the extension name any valid set of characters. Make sure that the name will help you distinguish the file as a dumped database file.

pg_dump booktown > booktown.db

Step 2: Copy the dump to the slave machine

Use scp to copy the dump from the master machine to the slave machine. The command I used was:

   scp booktown.db pma@pma:
   

You need to move the dump into the path /usr/local/pgsql/ on the slave machine. It may require you to log in as the root user. This command can be used in the directory that holds the dumped file:

   mv booktown.db /usr/local/pgsql/
   

Step 3: Create the appropriate users

The dump command does not automatically create any users when it is undumped. Therefore, we need to create these users before undumping the database.

If you haven't done so, start the PostgreSQL service. Switch to the root user and start the database using this command:

    service postgresql start
   

NotePostgreSQL: unrecognized service
 

You may receive this error message if you did not set up the sysV start up script for PostgreSQL. You can instead use the command below to start PostgreSQL in the background:

nohup /usr/local/pgsql/bin/postmaster -D /usr/local/pgsql/data </dev/null >>server.log
2>&1 >/dev/null &
   

If you want to configure your system to use the command: service PostgreSQL start, then refer to Step 9 in the PostgreSQL Installation section.

Once the PostgreSQL database has started, you can now create the database users. You should be logged in as the postgres superuser or another superuser that was set up during installation. The command to create the booktown users are:

  
  createuser mark
  createuser jennifer
  createuser jessica
  createuser william
  createuser jonathan
   

When asked if the users are allowed to create databases and new users, you should specify no for both options. The only exception is user Mark. He is the manager of the entire booktown system and should be able to create new users and databases.

Step 4: Create the database

Now you can create the database that will store the users and the tables from the dump. You should still be logged in as the postgres user. Use the command below:

createdb booktown

Step 5: Import the database

psql booktown < booktown.db

Note\connect:Fatal 1: user "pma" does not exist
 

This message appears if you forgot to create a user that owns an object which existed in the dumped database. The object could be a table, sequence, function, ... etc.

To fix this problem, you can follow the steps below:

  1. Drop the database using the command:

           dropdb booktown
           

  2. Create the user that the message complained about:

            createuser pma 
           

  3. Create the database to hold the undumped database:

           createdb booktown       
           

  4. Import the booktown database:

            psql booktown < booktown.db  
           

You should receive a list of messages which says that you are now connected as the superusers.

Step 6: Edit the pg_hba.conf file

You need to make sure the slave machine will allow connections from the master machine. Editing the pg_hba.conf file allows these connections to occur. This file is located in the $PGDATA directory at /usr/local/pgsql/data/pg_hba.conf. You also need to add the IP address of the master machine into this file. All this requires is a line which tells the slave machine the type of authentication to use. For more information, refer to the Password Authentication sect1.

We can specify the slave machine to accept packets from the master machine with the IP address 192.168.1.65 after the encrypted password has been verified. The connection will only have access to the booktown database. The record added to the pg_hba.conf file looks like this:

host    booktown    192.168.1.65    255.255.255.255    crypt
   

Step 7: Initialize the database

Use the MasterInit file to initialize the database that will act as the master database. You can log onto the master machine and perform the following command:

   ./MasterInit booktown
   

Step 8: Using MasterAddTable

The commands executed for this step should be performed on the master machine. The MasterAddTable tool allows you to select the table you want to replicate from a database on the master machine.

  1. Choose a column that you wish to be the trigger on the table for replication. Then use the syntax below to specify the column you want to replicate:

         ./MasterAddTable dbname table column
         

    For instance, this replicates the lastname column in the customer table:

         ./MasterAddTable booktown customer lastname
         

    Note

    Replication only occurs when the column specified by the above command is modified.

  2. You can replicate multiple columns in a table by running MasterAddTable multiple times.

NoteERROR: ProcedureCreate: procedure _rserv_log_ already exists with same arguments
 

You may receive this error when executing the MasterAddTable command several times on the same column in the same table and database name. You can log into the database and drop the following items:

  • four tables which begin with _rserv. The table names can be viewed using the \d command.

  • two sequences which begin with _rserv. The sequence names can be viewed using the \d command.

  • three functions which are: _rserv_log_ (), _rserv_sync_(integer), _rserv_debug_(integer). You will only be able to drop one function at a time.

  • trigger named _rserv_trigger_t_:

          DROP TRIGGER _rserv_trigger_t_ ON customer; 
          

You can now re-execute your MasterInit command. You also need to execute the MasterAddTable command again.

You should receive the message below if the MasterInit command was successful:

NOTICE:  CREATE TABLE/UNIQUE will create implicit index '_rserv_servers__server_key' for table
'_rserv_servers_'
   

Step 9: Initializing the Slave Database

You will need to log onto the slave machine and run the following command to initialize the booktown database:

   ./SlaveInit booktown
   

Step 10: Selecting the table to Replicate

Use the SlaveAddTable to select the table you want to replicate on the master machine. The syntax is the same as MasterAddTable. We will replicate the firstname column in the customer table:

   ./SlaveAddTable booktown customer lastname
   

Step 11: Testing the configuration

After the first 10 steps are completed, you can check if you configured the system correctly by performing the following tests:

  1. Log into the master database and update the column you referenced using MasterAddTable. We changed Dr.John's last name from Nathan to Nathaniel:

         UPDATE customer 
            SET lastname = 'Nathaniel' 
          WHERE cust_id = 10; 
         

  2. From the command line, use the Replicate command. The syntax for the Replicate command is:

         ./Replicate masterdb slavedb
         

    The command below was used to replicate the booktown database:

         ./Replicate --user=pma --host=local 
                     --masterhost=localhost 
                     --slavehost=pma booktown booktown
         

    You will either receive an error message, or another message which showed that it worked.

Step 12: To verify that it works, log into the slave database and perform a query that searches for your modification.

Step 13: Why the replication does not run automatically

Replication Testing

If you want to perform replication testing, then you need to install PostgreSQL with tcl. If you have not done so yet, then you can go through the installation steps again and use the following commands:

  ./configure --with-tcl
  gmake clean 
  gmake
  gmake install
   

NoteGmake Clean
 

The make clean is only needed if you already ran the configure script before. If this is the first time you are running the configure script in this directory, then the gmake clean is not needed.