Some MongoDB and MySQL comparisons for a real world site

We recently did some tests with regards to replacing an existing MySQL implementation with MongoDB. I thought some of the tests would be interesting to share.

MySQL ver 14.12 Distrib 5.0.27, for Win32 (ia32)

MongoDB v2.0.4 for Win32 (journaling not enabled)

The test was centred around a table that has 400000 records with numbered names

The table was indexed on two fields, id and an_id

Selection from specific folder by name:

SELECT id FROM table WHERE (an_id=2 AND name=’some name_251504′);

db.files.find({an_id:1, name:’some name_255500′}, {id:1});

* no index for ‘name’

MySQL:

0.83 s

MongoDB:

0.44 s

 

Increased records number to 800 000 (reached limit on 32bit OS for the data file size)

*Added index for ‘name’

Data files size:

MySQL:

238 MB

MongoDB:

1.4 GB

 

Selection of files from specific folder by name pattern:

SELECT count(*) FROM table WHERE (an_id=1 AND name like ‘%ame_2%’);

db.files.find({an_id:0, fi_name:/ame_2/}, {id:1, fi_name:1}).count();

> 202 225 records found

MySQL:

9.69 s

0.69 s

MongoDB:

3.62 s

1.34 s

* first run and others (match pattern changes slightly to prevent cache usage)

Count

select count(*) from table where (id > 500000 and id < 550000);

db.files.find({id:{$gt:500000, $lt:550000}}).count()

> 50 000 records found

MySQL:

0.02 s

MongoDB:

0.08 s

 

Delete 10 records:

delete from table where (id > 800000 and id < 800010);

db.files.remove({id:{$gt:800000, $lt:800010}});

 

MySQL:

0.13 s

MongoDB:

0.00 s

 

Delete 50 000 records: 

delete from table where (id > 600000 and id < 650000);

db.files.remove({id:{$gt:600000, $lt:650000}});

MySQL:

5.72 s

MongoDB:

2.00 s

 

Update 10 records:

UPDATE table SET name=’some new name’ WHERE (an_id=2 AND id > 200000 AND id <= 200010);

db.files.update({an_id:1, id:{$gt:200000, $lte:200010}}, {$set:{name:’some new name’}}, false, true);

MySQL:

0.08 s

MongoDB:

0.02 s

 

Update 50 000 records:

UPDATE table SET name=’sone new name 2′ WHERE (id > 250000 AND id <= 300000);

db.files.update({id:{$gt:250000, $lte:300000}}, {$set:{name:’some new name2′}}, false, true);

MySQL:

10.63 s

MongoDB:

3.54 s

Insert 50 records:

MySQL:

0.08 s

MongoDB:

0.02 s

 

Insert 500 records:

MySQL:

0.13 s

MongoDB:

0.09 s

Conclusions and other thoughts:

MongoDB has a clear  advantage on speed and this increases as more records are added.

Concerns are:

– MongoDB is not as battle tested or hardened

– The “gotcha’s (lack of our knowledge in part..)

– In MySQL data can be obtained from multiple tables with a single query whereas in mongoDB it seems multiple queries are needed to obtain data from multiple collections. Whereas there are latency advantages when dealing with a single collection these are negligible when dealing with multiple collections. Also, tuning of MySQL buffers and partitioning reduces speed advantages once again.

The conclusion was to stick with MySQL but to keep an eye on MongoDB.

Dealing with MySQL issues in the Cloud: Automating restart on error

MySQL is the mainstay of most Cloud Applications (including this WordPress Blog !), however if MySQL has an issue, either through number of connections maxing out, or MySQL being locked and not available it can result in site outages. We’ve seen clients who have ended up with their SQL DB down from a couple of hours to a couple of days before they suddenly realised there was an issue.

To that end we wrote a small script that can be used to automate the restarting of MySQL in such scenarios.

The script is called mysqlrestart.sh and is listed below. You need root access to be able to use it. If you use it and  ever reboot the server you will need to login as root and run nohup ./mysqlrestart.sh  & to restart it.

Set the script to run every 30 seconds using Cron. It will then check for a number of connections and if it cannot get a connection or the number of connections is greater than the number defined (defined as 90 in the example below), it will restart mysql.

#!/bin/bash

SQLCONNECTION_THRESHOLD=90

echo `date`  sqlrestart started >> run.out

while true; do

        sqlconnections=`mysql –skip-column-names -s   -e  “SHOW STATUS LIKE ‘Threads_connected'” -u root | awk ‘{print $2}’`

        #exclude myself from the number of thread connections             

        sqlconnections=$((sqlconnections – 1))

        echo `date` sqlrestart connections  $sqlconnections  >> run.out

        if [ $sqlconnections -gt $SQLCONNECTION_THRESHOLD ] || [ $sqlconnections -lt 0 ]

        then

                echo `date` restarting mysql server $sqlconnections  >> restart.out

                service mysql restart >> restart.out 2>&1

                echo `date` restart  complete   >> restart.out

        fi

        sleep 30

done