INFORMATION_SCHEMA is usually the place to go when you want to get facts about a system (how many tables do we have? what are the 10 largest tables? What is data size and index size for table t?, etc). However it is also quite common that such queries are very slow and create lots of I/O load. Here is a tip to avoid theses hassles: set innodb_stats_on_metadata to OFF .

This is a topic we already talked about, but given the number of systems suffering from INFORMATION_SCHEMA slowness, I think it is good to bring innodb_stats_on_metadata back on the table.

The problem

Let’s look at a system I’ve seen recently: MySQL 5.5, working set fitting in memory but not the whole dataset, around 4000 InnoDB tables.

The I/O load is very light as the server is an idle replica. You can see the I/O load from this partial pt-diskstats output:

#ts device rd_s rd_avkb wr_s wr_avkb busy in_prg 1.0 sda2 0.0 0.0 0.0 0.0 0% 0 1.0 sda2 0.0 0.0 16.0 9.2 0% 0 1.0 sda2 0.0 0.0 0.0 0.0 0% 0 1.0 sda2 0.0 0.0 0.0 0.0 0% 0 1.0 sda2 0.0 0.0 2.0 4.0 0% 0 1 2 3 4 5 6 #ts device rd_s rd_avkb wr_s wr_avkb busy in_prg 1.0 sda2 0.0 0.0 0.0 0.0 0 % 0 1.0 sda2 0.0 0.0 16.0 9.2 0 % 0 1.0 sda2 0.0 0.0 0.0 0.0 0 % 0 1.0 sda2 0.0 0.0 0.0 0.0 0 % 0 1.0 sda2 0.0 0.0 2.0 4.0 0 % 0

The customer wanted to know what could be improved from the schema so we started by finding the 10 largest tables:



mysql> SELECT table_schema as 'DB',

table_name as 'TABLE',

CONCAT(ROUND(( data_length + index_length ) / ( 1024 * 1024 * 1024 ), 2), 'G') 'TOTAL'

FROM information_schema.TABLES

ORDER BY data_length + index_length DESC

LIMIT 10;

[...]

10 rows in set (1 min 32.23 sec)



1mn32s is slow, but it’s not really a problem. But the I/O load triggered by this query IS a problem:

#ts device rd_s rd_avkb wr_s wr_avkb busy in_prg 1.0 sda2 0.0 0.0 18.0 8.7 0% 0 1.0 sda2 0.0 0.0 0.0 0.0 0% 0 1.0 sda2 100.0 16.0 0.0 0.0 79% 1 1.0 sda2 184.0 16.0 5.0 4.8 96% 1 1.0 sda2 97.0 16.0 0.0 0.0 98% 1 1.0 sda2 140.0 16.0 0.0 0.0 98% 1 1.0 sda2 122.0 16.0 17.0 4.0 98% 1 1.0 sda2 147.0 16.0 0.0 0.0 98% 1 [...] 1.0 sda2 136.0 16.0 0.0 0.0 98% 1 1.0 sda2 139.0 16.0 0.0 0.0 98% 1 1.0 sda2 149.0 16.0 0.0 0.0 98% 1 1.0 sda2 114.0 16.0 0.0 0.0 98% 1 1.0 sda2 147.0 16.0 8.0 4.0 96% 1 1.0 sda2 192.0 16.0 0.0 0.0 97% 1 1.0 sda2 141.0 16.0 0.0 0.0 98% 1 1.0 sda2 167.0 16.0 0.0 0.0 98% 1 1.0 sda2 15.0 16.0 0.0 0.0 6% 0 1.0 sda2 0.0 0.0 0.0 0.0 0% 0 1.0 sda2 0.0 0.0 16.0 4.0 0% 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 #ts device rd_s rd_avkb wr_s wr_avkb busy in_prg 1.0 sda2 0.0 0.0 18.0 8.7 0 % 0 1.0 sda2 0.0 0.0 0.0 0.0 0 % 0 1.0 sda2 100.0 16.0 0.0 0.0 79 % 1 1.0 sda2 184.0 16.0 5.0 4.8 96 % 1 1.0 sda2 97.0 16.0 0.0 0.0 98 % 1 1.0 sda2 140.0 16.0 0.0 0.0 98 % 1 1.0 sda2 122.0 16.0 17.0 4.0 98 % 1 1.0 sda2 147.0 16.0 0.0 0.0 98 % 1 [ . . . ] 1.0 sda2 136.0 16.0 0.0 0.0 98 % 1 1.0 sda2 139.0 16.0 0.0 0.0 98 % 1 1.0 sda2 149.0 16.0 0.0 0.0 98 % 1 1.0 sda2 114.0 16.0 0.0 0.0 98 % 1 1.0 sda2 147.0 16.0 8.0 4.0 96 % 1 1.0 sda2 192.0 16.0 0.0 0.0 97 % 1 1.0 sda2 141.0 16.0 0.0 0.0 98 % 1 1.0 sda2 167.0 16.0 0.0 0.0 98 % 1 1.0 sda2 15.0 16.0 0.0 0.0 6 % 0 1.0 sda2 0.0 0.0 0.0 0.0 0 % 0 1.0 sda2 0.0 0.0 16.0 4.0 0 % 0

The disks are 100% busy reading InnoDB pages for our query. No doubt that if the server was running queries from the application, they would have been negatively impacted.

Now let’s execute the same query with innodb_stats_on_metadata = OFF ;



mysql> SET GLOBAL innodb_stats_on_metadata = OFF;

mysql> SELECT [...]

10 rows in set (0.45 sec)



And let’s look at pt-diskstats:

#ts device rd_s rd_avkb wr_s wr_avkb busy in_prg 1.0 sda2 0.0 0.0 16.0 9.2 0% 0 1.0 sda2 0.0 0.0 0.0 0.0 0% 0 1.0 sda2 0.0 0.0 0.0 0.0 0% 0 1.0 sda2 0.0 0.0 16.0 4.0 1% 0 1.0 sda2 0.0 0.0 0.0 0.0 0% 0 1.0 sda2 0.0 0.0 0.0 0.0 0% 0 1.0 sda2 0.0 0.0 0.0 0.0 0% 0 1.0 sda2 0.0 0.0 0.0 0.0 0% 0 1 2 3 4 5 6 7 8 9 #ts device rd_s rd_avkb wr_s wr_avkb busy in_prg 1.0 sda2 0.0 0.0 16.0 9.2 0 % 0 1.0 sda2 0.0 0.0 0.0 0.0 0 % 0 1.0 sda2 0.0 0.0 0.0 0.0 0 % 0 1.0 sda2 0.0 0.0 16.0 4.0 1 % 0 1.0 sda2 0.0 0.0 0.0 0.0 0 % 0 1.0 sda2 0.0 0.0 0.0 0.0 0 % 0 1.0 sda2 0.0 0.0 0.0 0.0 0 % 0 1.0 sda2 0.0 0.0 0.0 0.0 0 % 0

No read load this time (and a much faster query!).

What is innodb_stats_on_metadata?

When the option is set to ON, InnoDB index statistics are updated when running SHOW TABLE STATUS , SHOW INDEX or when querying INFORMATION_SCHEMA.TABLES or INFORMATION_SCHEMA.STATISTICS . These statistics include the cardinality and the number of entries, they are used by the optimizer to find an optimal execution plan.

So even if SELECT statements cannot change the real statistics, MySQL updates the statistics for InnoDB tables. This is counterintuitive.

Is it useful? Not really, because InnoDB will always compute statistics when you open a table for the first time and when significant portions of the table have been modified (and when you run ANALYZE TABLE ).

Now why did we have such a high read load when innodb_stats_on_metadata was set to ON ? For InnoDB, statistics are estimated from random index dives, which translates to random reads.

The problem was magnified in the example because the whole dataset was not fitting in memory, the number of tables was high and the I/O subsystem was not very powerful.

Conclusion

It’s worth mentioning that the default value is now OFF with MySQL 5.6. So if you’re using MySQL 5.6, there’s no need to change anything. If you’re using MySQL 5.1 or 5.5, set innodb_stats_on_metadata to OFF and show your boss how you were able to get a 200x performance boost on some queries! And if you’re using MySQL 5.0 or below, you’ve just found another reason to upgrade!