
MySQL vs PostgreSQL Benchmarks

18 June 2009

We are living in interesting times: MySQL was first purchased by SUN, and now SUN has been purchased by ORACLE. I don’t know what the future holds for MySQL, but at this moment MySQL seems to be coming very close to PostgreSQL. We are NOT talking about MySQL with MyISAM; we are talking about MySQL with InnoDB, because I’m interested in a set of properties that PostgreSQL already has built in and that MySQL achieves through InnoDB (and the new Maria engine). These properties are Atomicity, Consistency, Isolation and Durability, i.e. ACID: in other words, a very stable, crash-proof database with good integrity. Why an ACID database? Sometimes ACID guarantees for our data matter more than raw speed. For example, would you keep your savings in a bank that runs a non-ACID database? I think you understand my concern.

When ORACLE released InnoDB 1.0.3 with the Google SMP patch, I became very interested in seeing a benchmark between MySQL 5.1.30 and PostgreSQL 8.3.7. Since I could not find any benchmark comparing these databases, I found some free time and ran one myself.

Hardware Resources
CPU: Intel(R) Pentium(R) D CPU 3.00GHz Dual Core
RAM: 3 GB
HDD: WDC WD3200AAJS-0

Software Resources
Debian Lenny 5.0 64 bit arch
Linux painkiller 2.6.26-2-amd64 #1 SMP Wed May 13 15:37:46 UTC 2009 x86_64 GNU/Linux
MySQL 5.0.51a-24+lenny1
MySQL 5.1.30 Sun compiled
InnoDB 1.0.3 Plugin compiled by ORACLE for MySQL 5.1.30
PostgreSQL 8.3.7
OSDB 2.1 – the Open Source Database Benchmark.

OSDB Benchmark methodology (From OSDB manual)
DB structure creation (tables, indices, etc.)

Database creation includes:

* table creation – create_tables() – creates the tables in the OSDB database which form the basis for benchmark execution (temporary tables are not included in this section)
* data generation – load() – copies data from the provided data files into all previously created tables
* primary key and index creation – create_idx_…_h() or create_idx_…_bt() – creates primary keys and indices and checks the data in the specific column (if load() had problems and generated non-unique values for primary keys, you should see an error message). There are two types of index (see the example below):
* B-tree – represented by the “_bt” suffix (clustered indexes use only B-tree)
* hash – represented by the “_h” suffix
* foreign key formation – create_idx_…_foreign() – creates foreign keys. You may see warning messages such as “NOTICE: ALTER TABLE … ADD CONSTRAINT will create implicit trigger(s) for FOREIGN KEY check(s)”; these come from the DBMS to inform you that additional operations were required and performed in order to create the foreign keys

The result of this part of benchmark execution is given by the populateDataBase() line in seconds [s].
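For illustration, the two index styles translate roughly into the following SQL (PostgreSQL syntax; index and column names are taken from the legend at the end of this post; OSDB generates the backend-specific statements itself, so treat this only as a sketch):

CREATE UNIQUE INDEX uniques_key_bt ON uniques (col_key);        -- B-tree index ("_bt"), the default type
CREATE INDEX uniques_code_h ON uniques USING hash (col_code);   -- hash index ("_h")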

Single-user test part
Single-user test part contains simple data operations such as:

* select – simply select a record from a table with a given condition
* join – select record(s) from tables (2 or 3) joined together on some condition
* aggregation – computes averages, minimum, maximum or summary values
* view (report) – creates views (with aggregations)
* projection – selects with the SQL “distinct” option from one or more tables
* add – adds a record to a table
* update – updates one or more values (records) in a table
* delete – deletes a record from a table with a given condition
* bulk operation – adds/deletes many records into/from a table
* index update – updates indices after data changes
* integrity test – checks data integrity in tables with foreign keys (if DBMS does not provide integrity this test fails)

Operations are mixed and the result is given by the Single User Test line in seconds [s].

Multi-user test

The Multi-user test is designed for testing transactions and the behavior of the DBMS when many users try to simultaneously access data. It is divided into two main parts:

* Mixed IR – selects a tuple from a table with a given condition that is randomly generated from all possible values stored in the table. The result is given in tuples per second [tup/sec].
* Mixed OLTP (On-Line Transaction Processing) – updates a tuple, in a table, which fulfills a randomly selected condition. The result is given in tuples per second [tup/sec].

The following actions are executed after Mixed IR as well as after Mixed OLTP:

* crossSectionTests() – includes some parts of the single-user test plus multi-user data modifications
* mu_checkmod_…() – randomly and sequentially checks data after modifications

The result of the multi-user part is given by the Multi User Test line in seconds [s].

Database Structure

I added the structure of the tables from the MySQL InnoDB database for reference only.

CREATE TABLE `hundred` (
`col_key` int(11) NOT NULL,
`col_int` int(11) NOT NULL,
`col_signed` int(11) default NULL,
`col_float` float NOT NULL,
`col_double` float NOT NULL,
`col_decim` decimal(18,2) NOT NULL,
`col_date` char(20) NOT NULL,
`col_code` char(10) NOT NULL,
`col_name` char(20) NOT NULL,
`col_address` varchar(80) NOT NULL,
UNIQUE KEY `hundred_key_bt` (`col_key`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1

CREATE TABLE `tenpct` (
`col_key` int(11) NOT NULL,
`col_int` int(11) NOT NULL,
`col_signed` int(11) default NULL,
`col_float` float NOT NULL,
`col_double` float NOT NULL,
`col_decim` decimal(18,2) NOT NULL,
`col_date` char(20) NOT NULL,
`col_code` char(10) NOT NULL,
`col_name` char(20) NOT NULL,
`col_address` varchar(80) NOT NULL,
UNIQUE KEY `tenpct_key_bt` (`col_key`),
UNIQUE KEY `tenpct_key_code_bt` (`col_key`),
KEY `tenpct_int_bt` (`col_int`),
KEY `tenpct_signed_bt` (`col_signed`),
KEY `tenpct_double_bt` (`col_double`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1

CREATE TABLE `tiny` (
`col_key` int(11) NOT NULL,
UNIQUE KEY `tiny_key_bt` (`col_key`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1

CREATE TABLE `uniques` (
`col_key` int(11) NOT NULL,
`col_int` int(11) NOT NULL,
`col_signed` int(11) default NULL,
`col_float` float NOT NULL,
`col_double` float NOT NULL,
`col_decim` decimal(18,2) NOT NULL,
`col_date` char(20) NOT NULL,
`col_code` char(10) NOT NULL,
`col_name` char(20) NOT NULL,
`col_address` varchar(80) NOT NULL,
UNIQUE KEY `uniques_key_bt` (`col_key`),
KEY `uniques_code_h` (`col_code`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1

CREATE TABLE `updates` (
`col_key` int(11) NOT NULL,
`col_int` int(11) NOT NULL,
`col_signed` int(11) default NULL,
`col_float` float NOT NULL,
`col_double` float NOT NULL,
`col_decim` decimal(18,2) NOT NULL,
`col_date` char(20) NOT NULL,
`col_code` char(10) NOT NULL,
`col_name` char(20) NOT NULL,
`col_address` varchar(80) NOT NULL,
UNIQUE KEY `updates_key_bt` (`col_key`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1
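For comparison, here is a rough sketch of what the equivalent PostgreSQL table could look like. I did not dump the actual osdb-pg schema, so the type mapping below (integer/real/numeric) is only an assumption based on the MySQL definitions above:

CREATE TABLE uniques (
    col_key     integer NOT NULL,
    col_int     integer NOT NULL,
    col_signed  integer,
    col_float   real NOT NULL,
    col_double  real NOT NULL,
    col_decim   numeric(18,2) NOT NULL,
    col_date    char(20) NOT NULL,
    col_code    char(10) NOT NULL,
    col_name    char(20) NOT NULL,
    col_address varchar(80) NOT NULL
);
-- same index layout as in the MySQL dump above
CREATE UNIQUE INDEX uniques_key_bt ON uniques (col_key);
CREATE INDEX uniques_code_h ON uniques USING hash (col_code);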

Configuration

I tried, as much as possible, to tune all three databases. Below are the configurations I used:

MySQL 5.0.51a-24+lenny1 – Debian Lenny Standard

[client]
port = 3306
socket = /var/run/mysqld/mysqld.sock
[mysqld_safe]
socket = /var/run/mysqld/mysqld.sock
nice = 0
[mysqld]
user = mysql
pid-file = /var/run/mysqld/mysqld.pid
socket = /var/run/mysqld/mysqld.sock
port = 3306
basedir = /usr
datadir = /var/lib/mysql
tmpdir = /tmp
language = /usr/share/mysql/english
skip-external-locking
bind-address = 127.0.0.1
key_buffer = 384M
max_allowed_packet = 1M
thread_stack = 128K
thread_cache_size = 8
myisam-recover = BACKUP
max_connections = 400
table_cache = 512
thread_concurrency = 8
query_cache_limit = 1M
query_cache_size = 32M
expire_logs_days = 10
max_binlog_size = 100M
skip-bdb

default-storage-engine=InnoDB
innodb_file_per_table=1

innodb_additional_mem_pool_size = 16M
innodb_buffer_pool_size = 1500M
innodb_file_io_threads = 8
innodb_thread_concurrency = 16
innodb_max_dirty_pages_pct = 90
innodb_lock_wait_timeout = 120

[mysqldump]
quick
quote-names
max_allowed_packet = 16M

[mysql]

[isamchk]
key_buffer = 16M

MySQL 5.1.30 – Innodb 1.0.3

[client]
port = 3307
socket = /usr/local/mysql/mysqld.sock
[mysqld_safe]
socket = /usr/local/mysql/mysqld.sock
nice = 0

[mysqld]
user = mysql
pid-file = /usr/local/mysql/mysqld.pid
socket = /usr/local/mysql/mysqld.sock
port = 3307
basedir = /usr/local/mysql/
datadir = /usr/local/mysql/data
tmpdir = /tmp
language = /usr/local/mysql/share/english
skip-external-locking
bind-address = 127.0.0.1
key_buffer = 384M
max_allowed_packet = 1M
thread_stack = 128K
thread_cache_size = 8
myisam-recover = BACKUP
max_connections = 400
table_cache = 512
thread_concurrency = 8
query_cache_limit = 1M
query_cache_size = 32M
expire_logs_days = 10
max_binlog_size = 100M
skip-innodb

default-storage-engine=InnoDB
innodb_file_per_table=1
innodb_file_format=barracuda
innodb_strict_mode=1

innodb_additional_mem_pool_size = 16M
innodb_buffer_pool_size = 1500M
innodb_file_io_threads = 8
innodb_thread_concurrency = 16
innodb_max_dirty_pages_pct = 90
innodb_lock_wait_timeout = 120

[mysqldump]
quick
quote-names
max_allowed_packet = 16M

[mysql]

[isamchk]
key_buffer = 16M
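To confirm that the InnoDB 1.0.3 plugin is really the engine in use (the built-in InnoDB is disabled above with skip-innodb), something along these lines can be run from the mysql client; as far as I know the plugin exposes its own version through the innodb_version variable:

SHOW PLUGINS;
SHOW VARIABLES LIKE 'innodb_version';   -- should report 1.0.3 when the plugin is active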


PostgreSQL 8.3.7 – Debian Standard

data_directory = '/var/lib/postgresql/8.3/main' # use data in another directory
hba_file = '/etc/postgresql/8.3/main/pg_hba.conf' # host-based authentication file
ident_file = '/etc/postgresql/8.3/main/pg_ident.conf' # ident configuration file
external_pid_file = '/var/run/postgresql/8.3-main.pid' # write an extra PID file
port = 5432 # (change requires restart)
max_connections = 512 # (change requires restart)
unix_socket_directory = '/var/run/postgresql' # (change requires restart)
ssl = true # (change requires restart)
shared_buffers = 512MB # min 128kB or max_connections*16kB
temp_buffers = 8MB # min 800kB
work_mem = 10MB # min 64kB
maintenance_work_mem = 16MB # min 1MB
max_fsm_pages = 16001 # min max_fsm_relations*16, 6 bytes each
max_fsm_relations = 1000 # min 100, ~70 bytes each
max_files_per_process = 1000 # min 25
fsync = on # turns forced synchronization on or off
commit_delay = 1000 # range 0-100000, in microseconds
checkpoint_segments = 32 # in logfile segments, min 1, 16MB each
effective_cache_size = 1100MB
client_min_messages = log # values in order of decreasing detail
log_min_messages = log # values in order of decreasing detail
log_error_verbosity = default # terse, default, or verbose messages
log_min_error_statement = warning # values in order of decreasing detail
log_line_prefix = '%t ' # special values
autovacuum = on # Enable autovacuum subprocess? 'on'
datestyle = 'iso, mdy'
lc_messages = 'en_US.UTF-8' # locale for system error messages
lc_monetary = 'en_US.UTF-8' # locale for monetary formatting
lc_numeric = 'en_US.UTF-8' # locale for number formatting
lc_time = 'en_US.UTF-8' # locale for time formatting
default_text_search_config = 'pg_catalog.english'
max_locks_per_transaction = 64 # min 10
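To double-check that the server actually picked these values up after a restart, the effective settings can be read back from psql, for example:

SHOW shared_buffers;
SHOW effective_cache_size;
SHOW fsync;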

In my opinion there is NOT a big difference between these configurations, but honestly I’m not a DBMS genius, so if you think there are other improvements that could be made to these configurations, please don’t hesitate to let me know.

Running

Creating the benchmark data

osdb-my --generate-files --size 1024m --datadir /tmp --dbpassword password

I used the following commands to run the test

MySQL 5.0

osdb-my --mysql=innodb --datadir /tmp --dbpassword password

MySQL 5.1.30

osdb-my --mysql=innodb --datadir /tmp --dbpassword password --dbport 3307 --dbhost 127.0.0.1

OSDB had to be compiled against this version of MySQL; the standard client libraries from Debian do not work with 5.1.30.

PostgreSQL 8.3.7

su - postgres -c "osdb-pg --datadir /tmp --dbuser postgres --dbpassword dbpass"

The test was run on 1024 MB of data, and the multi-user test was run with 100 simultaneous users on the same database.

Results:

SQL Operation | MySQL 5.0.51 [seconds] | MySQL 5.1.30 with InnoDB 1.0.3 [seconds] | PostgreSQL [seconds]
(each row below lists the three timings in this order)
create_tables() 0.03 0.06 0.02
load() 277.7 253 208.45
create_idx_uniques_key_bt() 1756.51 112.25 29.46
create_idx_updates_key_bt() 1758.63 119.33 41.14
create_idx_hundred_key_bt() 1681.67 117.65 37.51
create_idx_tenpct_key_bt() 76.22 118.33 30.42
create_idx_tenpct_key_code_bt() 60.02 10.85 8.04
create_idx_tiny_key_bt() 0.02 0.05 0.01
create_idx_tenpct_int_bt() 69.66 12.74 6.95
create_idx_tenpct_signed_bt() 216.55 16.28 10.09
create_idx_uniques_code_h() 531.3 26.35 8.77
create_idx_tenpct_double_bt() 378.33 24.06 15.95
create_idx_updates_decim_bt() 458.86 21.65 25.41
create_idx_tenpct_float_bt() 594.48 27.46 15.6
create_idx_updates_int_bt() 244.93 13.17 6.95
create_idx_tenpct_decim_bt() 889.33 17.51 25.42
create_idx_hundred_code_h() 561.48 26.78 8.9
create_idx_tenpct_name_h() 1639.52 29.85 11
create_idx_updates_code_h() 568.32 22.39 11.1
create_idx_tenpct_code_h() 2242.57 24.61 11.47
create_idx_updates_double_bt() 790.16 26.65 15.78
create_idx_hundred_foreign() 349.51 295.99 11.66
populateDataBase() 14868.15 1063.95 331.65
Single user test
sel_1_cl() 0 0.02 0.04
join_3_cl() 0.06 0.05 0.02
sel_100_ncl() 0.01 0.01 0.03
table_scan() 10.54 4.13 0.64
agg_func() 3.21 2.9 1.77
agg_scal() 0 0 0.01
sel_100_cl() 0 0.01 0.01
join_3_ncl() 0.06 0.01 0.03
sel_10pct_ncl() 0.02 0 0
agg_simple_report() 14.14 11.55 4.09
agg_info_retrieval() 0.15 0.06 0.04
agg_create_view() 0.03 0.02 0.13
agg_subtotal_report() 0.62 0.49 4.43
agg_total_report() 0.47 0.4 2.03
join_2_cl() 0 0 0
join_2() 1.78 1.62 0.89
sel_variable_select_low() 0.03 0.02 0
sel_variable_select_high() 12.52 7.82 49.51
join_4_cl() 0 0 0
proj_100() 7.13 7.35 149.97
join_4_ncl() 0.03 0 0
proj_10pct() 9.13 2.68 20.06
sel_1_ncl() 0 0 0
join_2_ncl() 0 0 0
integrity_test() 3.99 3.18 1.4
drop_updates_keys() 1545.63 0.02 0.21
bulk_save() 0.02 0.05 0.01
bulk_modify() 0.2 0.05 640.61
upd_append_duplicate() 0 0 0.02
upd_remove_duplicate() 0 0 0
upd_app_t_mid() 0 0 0
upd_mod_t_mid() 0 0.01 0.61
upd_del_t_mid() 0 0 0.61
upd_app_t_end() 0 0 0.03
upd_mod_t_end() 0 0 0.63
upd_del_t_end() 0 0 0.61
create_idx_updates_code_h() 277.3 21.16 8.99
upd_app_t_mid() 0 0 0.01
upd_mod_t_cod() 0 0 0
upd_del_t_mid() 0 0 0.66
create_idx_updates_int_bt() 310.24 12.71 7.62
upd_app_t_mid() 0 0 0
upd_mod_t_int() 0 0 0
upd_del_t_mid() 0 0 0.61
bulk_append() 0.1 0.21 0.03
bulk_delete() 0.04 0.02 577.67
Results in seconds 2197.43 76.53 1473.99
Multi user test
*MixedIR(tup/sec)* 101.62 95.46 30.41
sel_1_ncl() 0 1.6 0.09
agg_simple_report() 1683.81 40.47 350.08
mu_sel_100_seq() 0.29 0.13 0.29
mu_sel_100_rand() 0.27 3.66 0.18
mu_mod_100_seq_abort() 68.32 83.06 0.45
mu_mod_100_rand() 0.22 0.12 0.01
mu_unmod_100_seq() 0.12 0.34 0.26
mu_unmod_100_rand() 0.1 0.17 0.57
*crossSectionTests(MixedIR)* 1753.13 129.55 382.33
mu_checkmod_100_seq() 0.01 0.04 0.03
mu_checkmod_100_rand() 0.01 0.04 0
*MixedOLTP(tup/sec)* 806.21 430.08 295.79
sel_1_ncl() 0 0 0.01
agg_simple_report() 126.42 177.46 72.46
mu_sel_100_seq() 0.66 0.35 0.12
mu_sel_100_rand() 0.11 0.18 0.04
mu_mod_100_seq_abort() 0.11 0.05 0.08
mu_mod_100_rand() 0.05 0.07 0.05
mu_unmod_100_seq() 0.11 0.02 0.02
mu_unmod_100_rand() 0.01 0.04 0.06
*crossSectionTests(MixedOLTP)* 127.47 178.16 72.83
mu_checkmod_100_seq() 0.03 0.17 0.06
mu_checkmod_100_rand() 0.19 0.06 0.03

Legend

Name | SQL Query
create_idx_uniques_key_bt() create unique index uniques_key_bt on uniques BTREE(col_key)
agg_create_view() create view reportview(col_key,col_signed,col_date,col_decim, col_name,col_code,col_int) as select updates.col_key, updates.col_signed, updates.col_date, updates.col_decim, hundred.col_name, hundred.col_code, hundred.col_int from updates, hundred where updates.col_key = hundred.col_key
agg_func() select min(col_key) from hundred group by col_name
agg_info_retrieval() select count(col_key) from tenpct where col_name = 'THE+ASAP+BENCHMARKS+' and col_int <= 100000000 and col_signed between 1 and 99999999 and not (col_float between -450000000 and 450000000) and col_double > 600000000 and col_decim < -600000000
agg_scal() select min(col_key) from uniques
agg_simple_report() select avg(updates.col_decim) from updates where updates.col_key in (select updates.col_key from updates, hundred where hundred.col_key = updates.col_key and updates.col_decim > 980000000)
agg_subtotal_report() select avg(col_signed), min(col_signed), max(col_signed), max(col_date), min(col_date), count(distinct col_name), count(col_name), col_code, col_int from reportview where col_decim >980000000 group by col_code, col_int
agg_total_report() select avg(col_signed), min(col_signed), max(col_signed), max(col_date), min(col_date), count(distinct col_name), count(col_name), count(col_code), count(col_int) from reportview where col_decim >980000000
bulk_append() insert into updates select * from saveupdates
bulk_delete() delete from updates where col_key < 0
bulk_modify() update updates set col_key = col_key - 100000 where col_key between 5000 and 5999
bulk_save() insert into saveupdates select * from updates where col_key between 5000 and 5999
create_idx_hundred_code_h() create index hundred_code_h on hundred HASH(col_code)
create_idx_hundred_key_bt() create unique index hundred_key_bt on hundred BTREE(col_key)
create_idx_tenpct_code_h() create index tenpct_code_h on tenpct HASH(col_code)
create_idx_tenpct_decim_bt() create index tenpct_decim_bt on tenpct BTREE(col_decim)
create_idx_tenpct_double_bt() create index tenpct_double_bt on tenpct BTREE(col_double)
create_idx_tenpct_float_bt() create index tenpct_float_bt on tenpct BTREE(col_float)
create_idx_tenpct_int_bt() create index tenpct_int_bt on tenpct BTREE(col_int)
create_idx_tenpct_key_bt() create unique index tenpct_key_bt on tenpct BTREE(col_key, col_code)
create_idx_tenpct_key_code_bt() create unique index tenpct_key_code_bt on tenpct BTREE(col_key, col_code)
create_idx_tenpct_name_h() create index tenpct_name_h on tenpct HASH(col_name)
create_idx_tenpct_signed_bt create index tenpct_signed_bt on tenpct BTREE(col_signed)
create_idx_tiny_key_bt() create unique index tiny_key_bt on tiny BTREE(col_key)
create_idx_uniques_code_h() create index uniques_code_h on uniques HASH(col_code)
create_idx_uniques_key_bt() create unique index uniques_key_bt on uniques BTREE(col_key)
create_idx_updates_code_h() create index updates_code_h on updates HASH(col_code)
create_idx_updates_decim_bt() create index updates_decim_bt on updates BTREE(col_decim)
create_idx_updates_double_bt() create index updates_double_bt on updates BTREE(col_double)
create_idx_updates_int_bt() create index updates_int_bt on updates BTREE(col_int)
create_idx_updates_key_bt() create unique index updates_key_bt on updates BTREE(col_key)
join_2() select uniques.col_signed, uniques.col_name, hundred.col_signed, hundred.col_name from uniques, hundred where uniques.col_address = hundred.col_address and uniques.col_address = 'SILICON VALLEY'
join_2_cl() select uniques.col_signed, uniques.col_name, hundred.col_signed, hundred.col_name from uniques, hundred where uniques.col_key = hundred.col_key and uniques.col_key =1000
join_2_ncl() select uniques.col_signed, uniques.col_name, hundred.col_signed, hundred.col_name from uniques, hundred where uniques.col_code = hundred.col_code and uniques.col_code = 'BENCHMARKS'
join_3_cl() select uniques.col_signed, uniques.col_date, hundred.col_signed, hundred.col_date, tenpct.col_signed, tenpct.col_date from uniques, hundred, tenpct where uniques.col_key = hundred.col_key and uniques.col_key = tenpct.col_key and uniques.col_key = 1000
join_3_ncl() select uniques.col_signed, uniques.col_date, hundred.col_signed, hundred.col_date, tenpct.col_signed, tenpct.col_date from uniques, hundred, tenpct where uniques.col_code = hundred.col_code and uniques.col_code = tenpct.col_code and uniques.col_code = 'BENCHMARKS'
join_4_cl() select uniques.col_date, hundred.col_date, tenpct.col_date, updates.col_date from uniques, hundred, tenpct, updates where uniques.col_key = hundred.col_key and uniques.col_key = tenpct.col_key and uniques.col_key = updates.col_key and uniques.col_key = 1000
join_4_ncl() select uniques.col_date, hundred.col_date, tenpct.col_date, updates.col_date from uniques, hundred, tenpct, updates where uniques.col_code = hundred.col_code and uniques.col_code = tenpct.col_code and uniques.col_code = updates.col_code and uniques.col_code = 'BENCHMARKS'
mu_checkmod_100_rand() select count(*) from updates, sel100rand where updates.col_key=sel100rand.col_key and not (updates.col_double=sel100rand.col_double)
mu_checkmod_100_seq() select count(*) from updates, sel100seq where updates.col_key=sel100seq.col_key and not (updates.col_double=sel100seq.col_double)
mu_ir_select() select col_key, col_code, col_date, col_signed, col_name from updates where col_key = %ld
mu_mod_100_rand() update updates set col_double=col_double+100000000 where col_int between 1001 and 1100
mu_mod_100_seq_abort() update updates set col_double = col_double+100000000 where col_int between 1001 and 1100
mu_oltp_update() update updates set col_signed=col_signed+1 where col_key=%ld
mu_sel_100_rand() insert into sel100rand select * from updates where updates.col_int between 1001 and 1100 USE_INTO_TEMP(`sel100rand~)
mu_sel_100_seq() insert into sel100seq select * from updates where updates.col_key between 1001 and 1100 USE_INTO_TEMP(`sel100seq~)
mu_unmod_100_rand() update updates set col_double=col_double-100000000 where col_key between 1001 and 1100
mu_unmod_100_seq() update updates set col_double=col_double-100000000 where col_key between 1001 and 1100
proj_100() select distinct col_address, col_signed from hundred
proj_10pct() select distinct col_signed from tenpct
sel_1_cl() select col_key, col_int, col_signed, col_code, col_double, col_name from updates where col_key = 1000
sel_1_ncl() select col_key, col_int, col_signed, col_code, col_double, col_name from updates where col_code = 'BENCHMARKS'
sel_100_cl() select col_key, col_int, col_signed, col_code, col_double, col_name from updates where col_key <= 100
sel_100_ncl() select col_key, col_int, col_signed, col_code, col_double, col_name from updates where col_int <= 100
sel_10pct_ncl() select col_key, col_int, col_signed, col_code, col_double, col_name from tenpct where col_name = 'THE+ASAP+BENCHMARKS+'
sel_variable_select_high() select col_key, col_int, col_signed, col_code, col_double, col_name from tenpct where col_signed < :prog_var;
sel_variable_select_low() select col_key, col_int, col_signed, col_code, col_double, col_name from tenpct where col_signed < :prog_var;
table_scan() select * from uniques where col_int = 1
upd_app_t_end() insert into updates values (1000000001, 50005, 50005, 50005.00, 50005.00, 50005.00, '1/1/1988', 'CONTROLLER', 'ALICE IN WONDERLAND', 'UNIVERSITY OF ILLINOIS AT CHICAGO')
upd_app_t_mid() insert into updates (col_key, col_int, col_signed, col_float, col_double, col_decim, col_date, col_code, col_name, col_address) values (5005, 5005, 50005, 50005.00, 50005.00, 50005.00, '1/1/1988', 'CONTROLLER', 'ALICE IN WONDERLAND', 'UNIVERSITY OF ILLINOIS AT CHICAGO')
upd_append_duplicate() insert into updates (col_key, col_int, col_signed, col_float, col_double, col_decim, col_date, col_code, col_name, col_address) values (6000, 0, 60000, 39997.90, 50005.00, 50005.00, '11/10/1985', 'CONTROLLER', 'ALICE IN WONDERLAND', 'UNIVERSITY OF ILLINOIS AT CHICAGO')
upd_del_t_end() delete from updates where col_key = '-1000'
upd_del_t_mid() delete from updates where (col_key='5005') or (col_key='-5000')
upd_mod_t_cod() update updates set col_code = 'SQL+GROUPS' where col_key = 5005
upd_mod_t_end() update updates set col_key = '-1000' where col_key = 1000000001
upd_mod_t_int() update updates set col_int = 50015 where col_key = 5005
upd_mod_t_mid() update updates set col_key = '-5000' where col_key = 5005
upd_remove_duplicate() delete from updates where col_key = 6000 and col_int = 0

Conclusions
I draw several conclusions from the results above:

  • MySQL 5.0.51a-24+lenny1 – the Debian stable version is the worst performer (the MySQL version is also a little bit old).
  • MySQL 5.1.30 with InnoDB 1.0.3 and the Google SMP patch – MySQL compiled by SUN and InnoDB compiled by ORACLE – outperforms PostgreSQL in some cases.
  • PostgreSQL 8.3.7-0lenny1 – the Debian stable version is the top performer among the stock Debian packages and is outperformed only by the latest InnoDB 1.0.3.
  • The differences between PostgreSQL and MySQL 5.1.30 with InnoDB 1.0.3 are very small.
  • PostgreSQL outperforms every MySQL version on table creation, data loading and index creation (this can be very useful during a database recovery).
  • I don’t know why PostgreSQL performs badly on the bulk_delete() and bulk_modify() queries … is it possible there is a glitch in my configuration?

Update:
I re-ran the PostgreSQL test (thanks Neil) with shared_buffers raised to 1500 MB, but the results are still the same (small differences in some cases, but nothing big). You can see the test results here: http://www.randombugs.com/docs/postgres2.log


Comments

  • SQLDenis said:

    The create index difference between MySQL and PostgreSQL and the two versions of MySQL itself is just astonishing

    Instead of bulk delete I would also love to see a truncate command comparison

  • admin (author) said:

    In the near future I will get a new server and then I will run new tests with OSDB and PolePosition. I hope the PolePosition benchmark software will be more flexible about adding new tests.

    Regards
    Adrian

  • lol said:

    You’re idiot. To make a meaningful benchmark you will have to use a scaled up database. At least several terabytes of data to do all the stuff against.

  • Amr Ellafi said:

    well done ! thanks for the effort

  • admin (author) said:

    You’re idiot. To make a meaningful benchmark you will have to use a scaled up database. At least several terabytes of data to do all the stuff against.

    Most users don’t have terabytes of data. Those who do probably use other solutions, not MySQL or PostgreSQL. Also, with terabytes of data you need the right hardware, and I don’t have such hardware. My benchmark helps me see which database performs better in 95% of real-life cases. Most users don’t have big databases, but they still need good performance from their small DBMS.

    If you are curious to see a benchmark on terabytes of data just go to spec.org: http://www.spec.org/osg/jAppServer2004/results/res2009q2/jAppServer2004-20090324-00129.html

    I would really like to see some of your terabyte-scale tests and the costs they implied. The study would be more complete with your help.

  • SR said:

    You’re not an idiot. Good job. One thing I don’t get is why the bulk_delete and bulk_update tests are so slow on Postgres. Any ideas?

  • admin (author) said:

    I saw the problem too; in another test, with only 100 MB of data, the results were almost the same:

    |bulk_delete()|0.19|62.74|
    |bulk_modify()|0.05|63.18|

    (first result is MySQL and second is PostgreSQL)

    I really don’t understand what is happening with PostgreSQL (could it be a configuration issue?), but I will run the tests again and try to find the problem.

    Anyway, this month I will publish another test made on a quad-core CPU (i7 920), 8 GB of RAM and 2 x 1 TB HDDs in RAID1. I want to run the next tests on a 20 GB data sample with more than 100 simultaneous users. I only hope to find some free time to do that.

  • Rick James said:

    Questions about MySQL:

    There are some more settings that are important:
    * innodb_flush_log_at_trx_commit
    * auto_commit
    Was the disk RAID? Did it have a write cache?

    The InnoDB plugin and Google patch might shine even better on a 4-core or 8-core box, but only if queries are done in parallel. Was there any parallelism?

    Did you fill up the 1.5GB buffer_pool? Provide this at the height of the tables sizes:
    * SHOW CREATE TABLE tbl\G
    * SHOW TABLE STATUS LIKE 'tbl'\G

    Query Cache seems to be turned on. Is that fair? Or perhaps the SELECTs would not take advantage of it anyway. In that case, turning it OFF would let MySQL run a little faster.

    What was the CPU usage during the test? I/O usage? (These questions apply to both MySQL and Postgres.) There can be a significant performance difference between a fully cached table versus hitting the disk.

    Making it clear that the db fits in RAM might have avoided lol’s insult. I agree with some of the other comments — small databases (eg smaller than RAM) are part of the ‘real world’. I can point at several complete systems each with a footprint under 1GB. OTOH, scaling up your benchmark so that the data is 10x the cache would also be an interesting data point. (Beyond 10x, into the terabytes, probably would not be much different. But would take longer to run.)

    Was each test run from a cold start, or were they run sequentially without restarting the server? (Just checking for loss of cache.)

  • admin (author) said:

    1. The disk is not RAID, it’s just a simple SATA 2 disk. It does have a write cache.
    2. The CPU is a dual core and we ran 100 queries simultaneously. I will follow up with another test made on a quad core with a 20 GB database.
    3. I didn’t check if the buffer_pool was filled. I updated the post with the table structure.
    4. Query cache is on… In the next benchmark I will turn it off.
    5. I didn’t monitor the CPU and IO!
    6. I don’t see the “REAL” reason for the insult. If he doesn’t agree with my tests he can explain why.
    7. Yes, it was a cold start for each database benchmark run.

    Thank you for your comment. It is very helpful and will help me with the next benchmarks.

  • Neil said:

    Why do you have:

    innodb_buffer_pool_size = 1500M

    for MySQL, but:

    shared_buffers = 512MB

    for PostgreSQL?

    Isn’t this a little unfair?

  • admin (author) said:

    Hmmm … probably this is the reason why MySQL performs better on bulk_delete and bulk_modify. I was a little confused by the PostgreSQL documentation (“Several tens of megabytes are recommended for production installations.”), but if that is all it takes, it means PostgreSQL rocks. I will repeat the tests.

  • admin (author) said:

    I re-ran the tests on PostgreSQL with shared_buffers = 1500 MB. Almost nothing changed … Results are here: http://www.randombugs.com/docs/postgres2.log

  • WebDevHobo said:

    Question: will you do this test again when MySQL 5.4 comes out? It would be very interesting to see the improvements.

  • admin (author) said:

    I think the InnoDB engine is the key here, not MySQL itself. But I will probably test MySQL 5.4; right now I’m preparing a test on more powerful hardware.

  • Gladia2r said:

    Good job!

  • sean said:

    why not compare a commercial distro using Postgres Plus from EnterpriseDB instead? use a supported, certified version of one like the other, and the results i’m sure will be even more weighted toward PostgreSQL.

  • Upquark said:

    Thanks for this, very interesting.

  • How to Clear Debts said:

    Very interesting

    I run several large ecommerce sites so I have to weigh number crunching speed with stability and security. They are ALL important

  • Gaspy said:

    A real eye-opener — I think we can put the “MySQL is fast, PostgreSQL is slow” myth to rest.

    Somewhat related, I’ve tried to use views in MySQL and they are horribly slow. Yeah, you might say “why use views at all?” Well, having a background in MSSQL, views can be pretty useful, but MySQL’s query optimizer is well behind other databases.

  • admin (author) said:

    Both databases are good, but I really prefer Postgres when we talk about integrity, scalability and recovery. I still don’t have enough trust in MySQL.

    Regards

  • Sean said:

    nice post some good data as opposed to a mud slinging match about which is better and what have you…………

  • pha said:

    you made one big mistake

    to get any meaningful performance from postgres, you must set

    fsync=off

    you’ll see different numbers for sure
    letting postgres do a lot of fsync() calls really kills performance

    also, try these:

    work_mem = 4M
    wal_buffers = 1M
    default_statistics_target = 1000
    bgwriter_delay = 200ms

  • admin (author) said:

    fsync was intentionally kept on. I wanted to test two crash-proof databases, not raw speed, and even in this configuration Postgres performs very, very well.

    Regards

  • Frank said:

    As mentioned by Pha, the wal_buffers might be the problem with the bulk operations.

    From the PostgreSQL-manual: The setting need only be large enough to hold the amount of WAL data generated by one typical transaction, since the data is written out to disk at every transaction commit.

    The default is only 64 kilobytes…

  • Frank said:

    >> select distinct col_address, col_signed from hundred

    PostgreSQL can’t use an index with distinct. Use a workaround with GROUP BY:
    select col_address, col_signed from hundred GROUP BY col_address, col_signed;

    And make sure you have an index on these two columns. Start testing EXPLAIN and you’ll see the difference.
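    For example, a combined index along these lines (the index name is just illustrative) would give the planner something to work with:

    CREATE INDEX hundred_addr_signed_bt ON hundred (col_address, col_signed);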

  • admin (author) said:

    Thanks for your discovery. I hope I will have enough time to retest everything with the latest versions.

    Regards

  • Mark Cotner said:

    Best test I’ve seen posted. Very well done.

    Now if we could just create a test to show functionality warts in MySQL vs PostgreSQL.

    I’m a full time MySQL DBA who loves PostgreSQL. :)

    Completely sick and tired of the bug of the week. I’m perfectly happy to deal with a few planner warts in certain circumstances on PG vs completely hair brained bug and improper feature additions in MySQL.

    PostgreSQL has a significantly better planner overall and supports many different types of joins. MySQL isn’t cost based and only supports loop joins.

    MySQL is a toy.

  • Ka Chun Leung said:

    The slow times for Postgresql Bulk Modify/Bulk Delete can be explained by foreign key references to the updates table.

    MySQL automatically creates indexes on foreign keys, which speeds up modifies/deletes on referenced fields at the expense of modify/insert performance on the referencing field. PostgreSQL lets you decide.
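    For example, assuming the OSDB foreign key on hundred references updates(col_key) through hundred.col_signed (check the actual constraint before relying on this), an index like the following should make the FK checks during bulk deletes/updates on the updates table much cheaper:

    -- hypothetical index name; the column is the referencing side of the FK
    CREATE INDEX hundred_signed_fk ON hundred (col_signed);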

  • admin (author) said:

    I totally agree with you. Anyway, MySQL is taking big steps towards the real enterprise. It will probably get better in the future.

    Regards

  • just walked by said:

    Nice job but there is only one problem!
    Your conclusion is making it hard for me to choose between MySQL and PostgreSQL.
    I was going for PostgreSQL… but now I am not certain.

  • another passer said:

    Hmmm, we have a production server that runs mysql… (mainly because we needed the speed) but it seems that postgres could do a “much better” job…
    anyway nice post… keep up the good work…

  • ivoras said:

    You should be careful when benchmarking anything involving a classical hard drive – differences in rotational speeds of different cylinders and sensitivity to file fragmentation can wreck performance of anything. Some suggestions: either use a RAM drive for the database itself (best – will eliminate all effects from the drive), use a SSD (but always blank it with the factory-supplied utility before each benchmark to erase internal fragmentation / wear leveling data) or set up a special partition which has a size close to the database size, always format it with the same file system and file system options before doing different benchmark runs (e.g. when changing databases).

    Running database benchmarks from a “generic” partition where all the rest of the system data is, and especially running benchmarks on different databases installed on the same partition / file system one after the other can unbelievably increase file system fragmentation.

  • Kevin Grittner said:

    As has already been mentioned, the bulk delete results don’t mean much because the index to make that fast was present in MySQL but not PostgreSQL. There are legitimate reasons you might not want such an index in some circumstances, such as an “insert-only” table where the effort of maintaining such an index would be wasted, so PostgreSQL lets you choose whether to create it.

    Also, it appears that the database only contained 1 GB of data, and the available RAM exceeded that. In such situations you generally get better plans in PostgreSQL by configuring random_page_cost and seq_page_cost both to 0.01. The given configuration would be more appropriate for a PostgreSQL database with hundreds or thousands of GB; and even there I usually have enough of the active portion of the database cached to drop random_page_cost to 2.

  • Greg Smith said:

    Ignore the suggestion to turn fsync=off for PostgreSQL, and everything else pha said for that matter, those are all terrible ideas. Disabling fsync means you’ll end up with a corrupted database eventually that way after a crash–one where the server won’t even start. If you want to play loose with transaction commit dynamics in order to speed things up, you can try changing synchronous_commit=off (available in PG8.3 and later). That will reduce the number of times fsync is called by committing things in larger chunks, with the downside you might lose something (but not have database corruption!) in that configuration after a crash.

    You’ve adjusted commit_delay = 1000, which is a bad idea too. Leave that one at the default. I doubt it will impact any of your tests, but it’s a badly implemented feature. It doesn’t do what you’d expect from the documentation, and it’s due to be retired soon. The effective impact of increasing that parameter right now is that you could be adding needless waits for a commit in some circumstances.

    The rest of your PostgreSQL configuration is decent, and I’d think a fair representative of what a basic (non-database expert) tuned database would look like given your hardware. Good job. There’s a guide to this area at http://wiki.postgresql.org/wiki/Tuning_Your_PostgreSQL_Server that covers most of what people know to get started here you might pick up some hints from. You might get a little boost by changing wal_buffers=16MB on some of the write-heavy tests here.

    As for the slow deletes, that could just be PostgreSQL having more overhead when deleting a record due to the MVCC scheme it uses to handle transaction visibility. That same overhead is what dramatically reduces locking and therefore improves scalability in more common read tasks, so in some respects it’s probably what you want. There are also some easy mistakes to make in how you setup foreign keys in PostgreSQL, and I’m not sure if osdb has been audited to look for those recently. I don’t see one thing I’d expect on a quick glance at its source code, so it’s possible you’re seeing an unfair comparison difference in the benchmark code itself. Will have to take a look at exactly what it’s doing here to confirm whether that’s really the case or not.

  • Magmatrix said:

    A few suggestions:

    – Try creating an index on the referenced table and re-run the bulk modify/bulk delete tests in PostgreSQL. Should be *much* faster.

    – Also, you say you wanted to test “crash proof databases”. Run the tests again, and do the power plug test when the disk is busy. Does MySQL handle recovery as well as PostgreSQL does?

  • Frank said:

    Very nice Testing :)
    A comparison with Firebird SQL and/or Oracle would be very interesting.

    Thx for testing !

  • Pablo Domínguez said:

    Impressive tests, best ones I’ve seen. Amazed with PostgreSQL current performance.

    Congrats!

  • Ganesh said:

    Sorry, i can’t forget Mysql

  • DotCom said:

    I have used MySQL for many years now. I always preferred InnoDB over MyISAM. Now I’m using PostgreSQL because of one thing: full text search. I can’t believe that a transactional engine like InnoDB doesn’t have this support. Anyway, I also noticed that my websites run faster with PostgreSQL and PHP PDO under high load.

    Sure, Postgres is harder to learn and more complex, but it’s worth it. If these benchmarks are true, it’s the world’s most advanced and now the fastest database.

  • Pino said:

    For a recent project we chose PostgreSQL because it has better support for stored procedures. It gives better control over constraints. The type system is strong and lets you extend it. It combines well with Haskell (our primary server language), which also features a strong and extendable type system; I find the two pleasantly alike. It also has better developed extensions (like PostGIS).

    But MySQL is simpler to use, which makes it suitable for rapid prototyping.

    Performance is not the only thing that counts. If it were, we would all go to custom data types or NoSQL databases, thereby dropping all the wonderful tools relational databases give us.

    And even if performance is the main issue, we can transform the offending parts of the system into better performing ones.

    Profile, optimize and eventually rewrite.

  • admin (author) said:

    Thanks Pino for your suggestions. I agree with you: every project has its own requirements, so just choose the DB that is suitable for you and your project.

  • vicsanca said:

    You need autovacuum to be disabled to make a fair comparison with MySQL. The bulk operations are surely slower because of vacuuming.

  • Eric Gillette said:

    Nice comparison, but it does seem a little skewed towards PostGreSQL.

    LOL!

    I think the performance benefits of a properly tuned InnoDB MySQL Database would seriously kill PostGreSQL.

    This isn’t to say PostGreSQL doesn’t have a place — I think it’s best used for back-end applications and/or intranet systems.

    But for modern day database-driven websites, I think MySQL is the clear choice since speed is one of the most important factors on a website.

  • ALEX said:

    Topic starter: for the “bulk insert” operation in PostgreSQL
    you should use the COPY command (see the manual).
    It has some nuances, but ACID is still observed.
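    A COPY-based bulk load would look roughly like this (the file path is only an example; the file must be readable by the PostgreSQL server process):

    COPY updates FROM '/tmp/asap.updates.data' WITH DELIMITER ',';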
    PS
    Sorry for my english.
    I went to shoot the bears.

  • Anuraag G said:

    The absolute best benchmark article I have read comparing MySQL versus PostgreSQL. Sure, there could be improvements, as people have suggested. But the attention to detail in your post is amazing.

    I wish you could find time to redo the benchmark with the latest versions of the databases – 5.6 for MySQL and 9.x for PG.

  • admin (author) said:

    I will try to finish another set of benchmarks, but I need some time for that. Anyway, thanks for the comment.

  • chris said:

    Does anyone actually Delete data?
    Flagging records as deleted is much better and more common.
    When a delete is a must … add it to a maintenance queue task/job and do it at a scheduled off-peak time.

  • Ranjeeth said:

    The author did an excellent job!!

    Now we are planning to deploy our applications on Postgres. Until now I was seriously confused by the tons of posts online comparing the Postgres and MySQL databases from both camps.

  • Ruslan said:

    Could you re-run these tests with “skip-innodb” commented out in the MySQL configuration? I have a feeling that InnoDB was not actually used…

  • admin (author) said:

    :) It was used. That option only disables the built-in InnoDB so that the InnoDB plugin is loaded instead. Read this: http://dev.mysql.com/doc/innodb-plugin/1.0/en/innodb-old-mysql.html

  • ferensick said:

    Very interesting. Tests like these are time consuming and this post is very appreciated.
    It would be nice to see MySQL fork tests vs PostgreSQL,
    such as Percona or MariaDB vs PostgreSQL. :)

  • Dawid said:

    I suggest turning fsync off:

    fsync = off
    synchronous_commit = off
    wal_sync_method = fsync

  • admin (author) said:

    The main idea was to test an ACID database …
