Share this page: Follow me to be notified of new articles:

From MySQL to PostgreSQL

Babar the elephant

Why did we decide to migrate from MySQL to PostgreSQL?

Since Oracle bought MySQL back in 2009 a large part of the MySQL community expressed concern about what was going to happen to MySQL. I won't express my personal opinion on that topic here, but I do believe that Oracle is trying to slow down the development of MySQL and I'm concerned for the future of this RDBMS.

With time and while following Dimitri Fontaine on Twitter, I realized that PostgreSQL was the fastest growing open source RDBMS. We quickly considered MariaDB ("An enhanced, drop-in replacement for MySQL") but it wasn't solving our concern regarding slow development (MariaDB "monthly merges with the MySQL code base to ensure we keep up our compatibility and get any and all features and bug fixes Oracle adds", more info here). By the way, if you just want a simple replacement solution to MySQL, I do believe MariaDB is a great project.

I became very interested in some of PostgreSQL functionalities:
  1. The Write-Ahead Logging (WAL) ensures data integrity (we had a lot of issues with corrupt tables when our MySQL servers would crash (long electric outages...) which forced us to always restore a dump from our database (we backup our db every 15 minutes).
  2. At the opposite end, the non-Write-Ahead Logging unlogged tables that do not write-ahead log in the WAL "which makes them considerably faster than ordinary tables". They are very useful for a lot of short-term stored data we use for our Ad Server.
  3. The transactions: They do exist in MySQL's storage engine InnoDB but not on MyISAM which we were using.
  4. The very advanced replication features.
  5. Custom Functions which allowed us, for example to re-create MySQL LAST_DAY() function.
  6. Advanced logging features (you can analyze the logs with the great tool pgbadger).
  7. ...
MySQL is still a great tool and I would not recommend anyone to stop using it. It's a technical choice and every company has different issues and might be very happy with MySQL.

A few numbers

Our MySQL servers:
  1. We have about 10GB of data
  2. That's about 50,000,000 entries
  3. We store them in 300+ tables
  4. We have about 2,000 SQL queries/second (it used to be the double but we started using great tools like Memcache and MemcacheQ

Our partners

We had help from our long term partner Cyril Bouthors from ISVTEC and from Dimitri Fontaine of 2ndQuadrant, a major contributor of the PostgreSQL project. I personally know Dimitri from Hi-Media, where we both worked at one point.

December 2012: the first tests

Install PostgreSQL and PostgreSQL's PHP APIs

On a Debian server, you can just run the following command:
apt-get install postgresql postgresql-client php5-pgsql
/etc/init.d/apache2 restart # (or "/etc/init.d/apache restart" for Apache 1) can then access the PostgreSQL server newly installed by running the following commands:
sudo bash
su - postgres

...create a user:

...and your first database:
CREATE DATABASE hello_world;

...finally you can re-connect yourself to your new database with your new user:
su - gab
psql -h localhost -d hello_world -U gab

Our first MySQL export/PostgreSQL import

There is no simple way to migrate your data from MySQL to PostgreSQL. Mysqldump offers a compatible export to PostgreSQL which simply does not work.
I tried to look at the tools available on PostgreSQL's wiki but I did not found any that fit my needs.
I ended up building one with PHP that I called PHP_my2pg. It's nothing amazing (for example, it does not re-create indexes) but it works ok.

Indexes import

I created a very simple PHP tool to detect all the indexes from our MySQL schemas and re-create them on PostgreSQL: on PostgreSQL, unlike with MySQL, you can't create indexes in the CREATE TABLE syntax (see PostgreSQL and MySQL documentations).

If you realize that you need it, here is the script:

I obviously could have integrated this functionality to PHP_my2pg but I was done importing my data at that point and did not want to re-import everything.

Reset all sequences' current values

At that point I had issues with PostgreSQL's serials values. A Serial is like MySQL's AUTO_INCREMENT.

When I tried inserting data in my tables I got sequences errors (PostgreSQL was trying to insert the "id" 1, 2, 3... in tables with already millions of lines).

I had to reset all the sequences' values to my last id inserted. I found a script wrote by Isura Silva, which I slightly modified to avoid a few issues.

It is written in PL/pgSQL. Here is the script you can easily copy/paste in psql:

(if you want, you can find the original version here)

You can then call it with psql by running the following command:
SELECT * FROM seq_update();

It worked. It worked great.

Updating our websites

We had to replace all the MySQL PHP API functions with the PostgreSQL one and rewrite a lot of our queries to make them PostgreSQL compliant.

I found this great article that makes a good list of query changes between MySQL and PostgreSQL.

I knew I was going to be able to automatically convert the major part of the PHP API and that I was probably going to be able to convert automatically 95% of our queries. I wrote some sed search-replace regular expressions.

Here is the sed functions I used (I removed a few of them, very specific to our own architecture):

I got a large part of my queries (about 95-97%) to work correctly with these changes. I had to rewrite some of them that had more specific problems or that were not re-written properly.

To "play" with our files, I made a copy of our development repository, basically like this:
rm -rf /var/www/dev/pg/
mkdir /var/www/dev/pg/
cp -r /var/www/dev/website /var/www/dev/pg/

Then, I would run the "sed" commands above with a .sh script. After updating the "sed" commands to correct them, I would just re-synchronise the PHP files (only them, to go faster) with "rsync":
rsync -rv /var/www/dev/website/ /var/www/dev/pg/website/ --include=*/ --include=*.php --exclude=*

This allowed me to make mistakes and start over with a "fresh copy" of our websites as much as I needed to.

January 2013: migration

From January 14th to January 24th 2013 we worked full time from Paris, France then from Florida, USA on the migration.

Converting our MySQL schemas to PostgreSQL

We first dumped our MySQL schema with mysqldump:
mysqldump  --no-data -u gab -p DB_NAME >

Then we used Jose Duarte's Perl program mysql2pgsql.perl:
./mysql2pgsql.perl --nodrop --enc_in="LATIN1" --sepfile

Finally we "tried" to import the resulted file to PostgreSQL (we had to correct the schema several times because mysql2pgsql.perl did not give a perfect result - it worked 80%):
psql -h localhost -U dev -d DB_NAME -f -1 -v ON_ERROR_STOP=1

Migrating the data

We tried several solutions to migrate our data. The first one was pgloader. The tool is great (but quite complicated) and the main issue with it was that it was a little slow for our needs (we have a lot of data and we did not want to shut down our services for more than 2 hours). We decided to not go with pgloader at the end mainly for this reason. But you should know that Dimitri Fontaine is considering developing pgloader 3 to solve this issue.

The final choice was also the easiest one:

Dumping the data with mysql client and its "SELECT ... INTO OUTFILE" option:
sed -e 's/"0000-00-00 00:00:00"/:null:/g' -e 's/"0000-00-00"/:null:/g' -e "s/\r\\\\$//"

We converted the data from LATIN1 to UTF-8:
iconv -f iso8859-1 -t utf-8 < /tmp/DB_NAME.csv  > /tmp/tmp.csv; mv /tmp/tmp.csv /tmp/DB_NAME.csv

Then importing the data into PostgreSQL with "COPY"

It worked because our data was very clean. In a lot of cases pgloader would have been a better solution (pgloader tells you where you have an issue with your data in case of an error where COPY would not be able to give you precise information about issues you may encounter).

Migrating the queries and the PHP API

To migrate our queries in our websites and to replace the MySQL PHP API with the PostgreSQL we used the sed functions I made for our tests (they worked great).

Importing the indexes

For performance reasons, we chose to import the database indexes separately from the schema (to be able to import the data faster and then add the indexes with the data already in PostgreSQL). We just used psql:
psql -U gab -d DB_NAME -f;

Reset all sequences' current values

Here again we used the same function as during the test.

And... we went live

We shut down all our services on Friday January 24th 01:00:00 +0100 UTC. The migration took 45 minutes. We re-opened our services at 01:50:00 +0100 UTC. Everything went as planned!

February 2013: 2 weeks after the migration

We are about 2 weeks after the migration as I write this. We are very happy with the move. Since we moved, Fedora and Suse now prefer MariaDb to MySQL ("The reasoning for this move is the uncertainty about Oracle's support of MySQL as an open-source project and moves to make the database more closed.").

Here are a few charts about our servers:
Main PG server 1 Main PG server 2
Transactions Transactions
Load Load

Read more

Take time to read Dimitri's article about our migration here.
Last update of this page: March 27, 2017
Please do not copy any of the content of this website without asking me first.