PHP/MySQL Efficient Paged Data
There was some talk recently about the inefficiency of running two queries verses a single query to retrieve paged data, so I decided to do some testing of my own, and here are my results.
The first way to retrieve paged data is to run two queries, the first query would be just like the second, just without the LIMIT statement. You would use mysql_num_rows to retreive the total count using the first query, and the second query would be your page of data.
Testing average for this configuration was 1.9 seconds for every 200 repetitions.
The second way is almost the same as the first, but instead of using mysql_num_rows to get a total count, we use SQL to provide the count.
Testing average for this configuration was 1.8 seconds for every 200 repetitions.
The third way is to use a single query, and utilize PHP to split the data into pages. It was suggested that this would be the fastest way because it only has a single call to the database, however testing showed that was not a correct assumption. The overhead for two calls to the database was not enough to out weigh the processing time needed for all the rows to be returned and PHP to process them.
Testing average for this configuration was 2.2 seconds for every 200 repetitions.
It would appear that the second method is the most efficient, followed closely by the first. This could change based on your dataset, for instance the third method might turn out to be the best in a situation where you dont have many rows in your table. There are also situations where the first method would run much slower than the second. My opinion would be to stick with the second method.



Nice examples, however, a count(*) is always faster than selected columns and a start/limit clause. The filtering of columns to exclude/include takes time.
Furthermore, depending on your mysql server in the network – localhost or on another host – you wil find larger differences between outcomes due to band width.
The thing which annoys me is the comparrisons you make. Your compare statements are not equal, and therefore will always show a difference in execution time. A ‘count(*)’ needs less bandwidth compared to a ‘select *’, as a count has only 1 record and a ‘select *’ has x records. It would be better if you had done a test on ‘select * from … LIMITstart, end’ vs ‘select column_name_1, column_name_n, … FROM … LIMIT start, end’. Than compare an unordered query, one ordered ASC and onother DESC and find the difference. Use a table with say 6,500,000 records … and compare it against queries run against a table with 1,000.
(Why do you use double quotes around your sql statements! It’s better to use single quotes in PHP code; concat strings/variables/constants escape required single quotes. It won’t show much different on sites with little visitors, but when you get 100+ per minute you definitely will notice the difference.)
Carl – im not sure if you actually read this post or looked at the code, because I never use a ‘select *’ anywhere in my tests. Sorry that annoys you.
My comparison is equal, because I wanted to know the quickest way to return paged data and a record count, and that’s what all of the examples do.
As for testing band with, that is completely out of my control, therefore not included in the test (im using localhost).
Thanks for the tip on double/single quotes.
Shea,
I read the post, but for me it’s probably logical to run 2 queries for a count and a record selection. I work with a separated mysql cluster (a master and 2 slaves) We use the master for writing only (if possible) and the slaves for reading. We need to take bandwith into account.
So a complete table dump to get a record count is strictly “not done”. It’s a ‘SELECT count(*)’, and after it’s a ‘SELECT whatevercolumnsonewants FROM whatevertable WHERE clause LIMIT start, end’. That’s fastest as we found out. We work for a number of publishers which are using vouchers to obtain licenses for ‘Electronic Study Systems’ (which is just another word for ‘learning by PC’). The registered students are currently near 400K, the vouchers number in millions, so indexes must be right and just enough – too many an index gives also a performance problem.
Using PHP to sort, limit or whatever manipulation on a dataset from MySQL, not done if it can be avoided. It will not be as fast as a tuned MySQLserver can do it, the latter is often – if not always – the case for poor database performance.
Carl
Im just not seeing your point here. Did you see the last sentence in this post?
“This could change based on your dataset, for instance the third method might turn out to be the best in a situation where you dont have many rows in your table. There are also situations where the first method would run much slower than the second.”