Selecting random record from MySQL database table.

The simplest way of selecting random rows from the MySQL database is to use “ORDER BY RAND()” clause in the query.

Solution 1 [SQL]
SELECT * FROM `table` ORDER BY RAND() LIMIT 0,1;

The problem with this method is that it is very slow. The reason for it being so slow is that MySQL creates a temporary table with all the result rows and assigns each one of them a random sorting index. The results are then sorted and returned.

There are several workarounds to speed things up.

The basic idea is to get a random number and then select a specific row using this number.

In the case that all the rows have unique ids we will just have to pick a random number between the smallest and the biggest id and then select the row with id that equals that number. To make this method work when ids are not evenly distributed we will have to use “>=” operator instead of “=” in the last query.

To get the minimum and maximum id values in the entire table we will use MAX() and MIN() aggregate functions. These functions will return minimum and maximum value in the specified group. The group in our case is all the values of `id` column in our table.

Solution 2 [PHP]
$range_result = mysql_query( ” SELECT MAX(`id`) AS max_id , MIN(`id`) AS min_id FROM `table` “);
$range_row = mysql_fetch_object( $range_result );
$random = mt_rand( $range_row->min_id , $range_row->max_id );
$result = mysql_query( ” SELECT * FROM `table` WHERE `id` >= $random LIMIT 0,1 “);

As we mentioned this method is limited to tables with unique id for each row. What to do if it’s not the case?

The solution is to use the MySQL LIMIT clause. LIMIT accepts two arguments. The first argument specifies the offset of the first row to return, and the second specifies the maximum number of rows to return. The offset of the initial row is 0 (not 1).

To calculate the offset to the first row we will generate a random number between 0 and 1 using MySQL’s RAND() function. Then we will multiply this number by number of records in the table, which we will get using COUNT() function. Since LIMIT arguments must be integers and not float values we will round the resulting number using FLOOR() function. FLOOR() is an arithmetic function that calculates the largest integer value that is smaller than or equal to the expression. The resulting code will look like this:

Solution 3 [PHP]
$offset_result = mysql_query( ” SELECT FLOOR(RAND() * COUNT(*)) AS `offset` FROM `table` “);
$offset_row = mysql_fetch_object( $offset_result );
$offset = $offset_row->offset;
$result = mysql_query( ” SELECT * FROM `table` LIMIT $offset, 1 ” );

In MySQL 4.1 and later we can combine two previous methods using subquery like so:

Solution 4 [SQL]
SELECT * FROM `table` WHERE id >= (SELECT FLOOR( MAX(id) * RAND()) FROM `table` ) ORDER BY id LIMIT 1;

This solution has the same weakness as the solution 2 e.g. it only works for tables with unique ids.

Remember the reason we started looked for alternative ways of selecting random rows? Speed! So how do these methods compare in terms of execution times. I am not going to go into specifics of hardware and software configuration or give precise numbers. The approximate results are:

* The slowest method is solution 1. Let’s say that it took 100% of time to execute.
* Solution 2 took 79%.
* Solution 3 – 13%.
* Solution 4 – 16%.

The winner is solution 3.

Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

%d bloggers like this: