06 April 2011

As early as 2006, it was noted that the MAX() function is sometimes quite slow in a MySQL database. In fact, there are less clear references going further back than that. The easy-to-read MAX() query below was about 20-30 times slower than the second version that yields the exact same result:

  1. SELECT MAX(clicked) FROM {directmail_log}
  2. SELECT clicked FROM {directmail_log} WHERE clicked IS NOT NULL ORDER BY clicked DESC LIMIT 1

I assume that it recomputes the max and stores it with each row it looks at. That would be pretty wasteful for this basic query, but it would make sense in the context of other queries that have multiple aggregate functions (e.g., MIN, MAX and SUM in the same call). However, given that this problem is more than 5 years old, it would be nice for the engine to have an optimization for each of the basic aggregate functions so that they are at least fast in basic queries.

That assumption is somewhat established by the EXPLAIN. When using the order by, MySQL is able to use filesort. Apparently, that underlying technology is much faster than the in-memory computation.


blog comments powered by Disqus