Quick Wins: Performance Tuning MySQL & Third Party Patches

        MySQL Meetup New York City
            28 October 2009

           Morgan Tocker, firstname at percona dot com
                 Director of Training, Percona Inc.




1
    You probably know Percona as:




2
        This talk comes in two parts:
★   First we’re going to look at Performance Tuning a MySQL
    installation.
★   Then we’re going to look at cool stuff - external
    enhancements to MySQL.




3
               Part 1: Introduction
★   So you’ve written/inherited an application.
★   Usage has gone crazy.
★   And you’ve diagnosed that the database is the
    bottleneck.
★   So how do we fix it?




4
                What this talk is about.
★   We’re going to be talking about tuning a system that’s
    never been shown love.
    ✦ There’s more than one strategy to do this.
    ✦ I’m going to show you the three common paths people take.




5
        The three paths people take:
★   Option 1: Upgrade the Hardware.
★   Option 2: Change a configuration setting in MySQL.
★   Option 3: Improve Indexing/Tune Queries.




6
           Option 1: Add Hardware
★   This “Kind of” works... but for how long?
★   There’s a price point where big machines get
    exponentially more expensive.




7
                 Add Hardware (cont.)
★   If you are going to do this, beware:
    ✦ Bigger machines will have the same speed drives.
    ✦ Check what your bottleneck is first!




8
               Add Hardware (cont.)
★   Potential Bottlenecks:

    DISK:    This is the number one bottleneck for a lot of people.
    RAM:     Use it to relieve pressure off our disks, particularly with
             expensive random reads.
    CPU:     Not normally a bottleneck, but having fast CPUs can mean
             locking code is blocking for less time.
    NETWORK: We care about round trips, but we’re not usually limited
             by throughput.



9
     The memory problem




10
     When this technique really works:
★    You had good performance when all of your working set
     [1] of data fit in main memory.
★    As your working set grows, you just increase the size
     of memory [2].



[1] Working set can be between 1% and 100% of database size. On poorly indexed systems
it is often a higher percentage - but the real value depends on what hotspots your data has.

[2] The economical maximum for main memory is currently 128GB. This can be done for
less than $10K with a Dell server.



11
                   Add Hardware (cont.)
★    Pros:
     ✦ Good if you have a large working set or an “excess money” problem.
★    Cons:
     ✦ Hard to get many multiples of performance improvement this way.
     ✦ Can get expensive once you pass a hardware price point.
     ✦ Still some features missing in MySQL that hurt the very top end of
       users (e.g. saving the contents of the buffer pool for warm restarts).
★    Conclusion:
     ✦ I wouldn’t normally recommend this as your first optimization path
       if the system hasn’t ever been tuned.

12
      Option 2: Change Configuration
★    The “--run-faster startup option”.
★    This may work, but it assumes a misconfigured setting
     to start with.
★    There are no silver bullets.




13
                 Changing Configuration
★    Most of this is based on running the command SHOW
     GLOBAL STATUS first, then analyzing the result.
★    Be careful when running this - the period of time it
     aggregates data over may cause skew.
     ✦ A very simple utility called ‘mext’ solves this:
       http://www.xaprb.com/mext
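
★    A usage sketch - this assumes the same calling convention as its
     Percona Toolkit successor, pt-mext; check the tool’s own help first:

     # -r prints relative (delta) columns; mysqladmin ext -i10 re-samples
     # SHOW GLOBAL STATUS every 10 seconds.
     $ mext -r -- mysqladmin ext -i10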




14
A few very quick examples....
            Temporary Tables on Disk
★    You may be able to raise tmp_table_size and
     max_heap_table_size to increase the threshold at which
     temporary tables spill to disk (see the sketch below).

     Created_tmp_disk_tables 0      0     0
     Created_tmp_files        5     0     0
     Created_tmp_tables   12550   421   417
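
★    A minimal my.cnf sketch - the values are illustrative, not
     recommendations; the in-memory limit is the smaller of the two:

     [mysqld]
     tmp_table_size      = 64M
     max_heap_table_size = 64M  # spills to disk above
                                # MIN(tmp_table_size, max_heap_table_size)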




16
     Temporary Tables on Disk (cont.)
★    These are often caused by some internal GROUP BYs
     and complex joins with an ORDER BY that can’t use an
     index.
★    They default to memory unless they grow too big, but...
★    All temporary tables with text/blob columns will be
     created on disk regardless!
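
★    For instance, grouping on a column with no usable index shows the
     implicit temporary table in EXPLAIN - a sketch against the world
     database (before the indexes we add later in this talk):

     mysql> EXPLAIN SELECT Continent, COUNT(*) FROM Country
            GROUP BY Continent\G
     ...
            Extra: Using temporary; Using filesort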




17
                    Binary Log cache
★    Updates are buffered before being written to the binary
     log. If they’re too big, the buffer creates a temporary file
     on disk:
★    mysql> show global status like 'binlog%';
     +-----------------------+-------+
     | Variable_name         | Value |
     +-----------------------+-------+
     | Binlog_cache_disk_use | 1082  |
     | Binlog_cache_use      | 78328 |
     +-----------------------+-------+
     2 rows in set (0.00 sec)
★    Corresponding Session Variable:
     binlog_cache_size
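
★    In the sample above roughly 1.4% of transactions (1082 of 78,328)
     spilled to disk. A my.cnf sketch - the size is illustrative; match
     it to your typical transaction size:

     [mysqld]
     binlog_cache_size = 1M   # per-session buffer for binary log events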


18
                         Table Cache
★    MySQL requires each connection to open its own copy of
     each table it uses. The default table_cache is lower than
     the default max_connections!
★    mysql> show global status like 'Open%tables';
     +---------------+--------+
     | Variable_name | Value  |
     +---------------+--------+
     | Open_tables   | 64     |
     | Opened_tables | 532432 |
     +---------------+--------+
     2 rows in set (0.00 sec)
★    Corresponding Global Variable:
     table_cache (renamed table_open_cache in MySQL 5.1) - sketch below.
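
★    A my.cnf sketch - the value is illustrative; after raising it, watch
     whether Opened_tables keeps climbing:

     [mysqld]
     table_cache = 2048         # MySQL 5.0 name
     # table_open_cache = 2048  # the MySQL 5.1 name for the same setting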


19
                       Thread Cache
★    Each connection in MySQL is a thread. You can
     reduce Operating System thread creation/destruction
     with a small thread_cache:
★    mysql> show global status like 'threads%';
     +-------------------+-------+
     | Variable_name     | Value |
     +-------------------+-------+
     | Threads_cached    | 16    |
     | Threads_connected | 67    |
     | Threads_created   | 4141  |
     | Threads_running   | 6     |
     +-------------------+-------+
     4 rows in set (0.00 sec)
★    Corresponding Global Variable:
     thread_cache_size
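
★    Threads_created at 4141 only matters if it keeps climbing quickly.
     The variable is dynamic, so a sketch:

     mysql> SET GLOBAL thread_cache_size = 64;
     -- re-check SHOW GLOBAL STATUS LIKE 'Threads_created' later;
     -- it should now grow slowly, or not at all.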
20
                 Cartesian Products?
★    Joining two tables without an index on either can often
     mean you’re doing something wrong. You can see this if
     Select_full_join > 0:
★    mysql> show global status like 'Select_full_join';
     +------------------+-------+
     | Variable_name    | Value |
     +------------------+-------+
     | Select_full_join | 0     |
     +------------------+-------+
     1 row in set (0.00 sec)




21
End of Example
                    Best way to do this?
★    A trained eye helps, but you still miss things sometimes.
★    Most of this can be automated. The tool I like the most
     (for simplicity) is this one:
     ✦ Matthew Montgomery’s Tuning Primer:
       http://forge.mysql.com/projects/project.php?id=44




23
             Every setting has a range!
★    You really can have too much of a good thing.
★    It takes more resources to allocate larger chunks of
     memory, and in some cases you’ll miss valuable CPU
     caches.
★    We’ve blogged about this with sort_buffer_size here:
     ✦ http://www.mysqlperformanceblog.com/2007/08/18/how-fast-can-you-sort-data-with-mysql/
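
★    Since sort_buffer_size is allocated per sort, per connection, you can
     experiment at session scope before touching the global value - a sketch:

     mysql> SET SESSION sort_buffer_size = 262144;  -- 256K
     mysql> SELECT * FROM Country ORDER BY Population;
     -- time the query, then repeat with other buffer sizes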




24
    Where this has substantial benefit:
★    InnoDB. The defaults are very conservative (they improve
     in MySQL 5.4!):
     ✦ 8M for innodb_buffer_pool_size
     ✦ 5M for innodb_log_file_size
★    These settings should be more like (sketch below):
     ✦ 70-80% of system memory for innodb_buffer_pool_size,
       and 64M to 256M for innodb_log_file_size.
★    More information:
     ✦ http://www.mysqlperformanceblog.com/2006/09/29/what-to-tune-in-mysql-server-after-installation/
     ✦ http://www.mysqlperformanceblog.com/2007/11/01/innodb-performance-optimization-basics/
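
★    A my.cnf sketch for a dedicated 16GB InnoDB box - the sizes are
     illustrative. Note that on 5.0/5.1, changing innodb_log_file_size
     requires a clean shutdown and moving the old ib_logfile* files aside
     before restarting:

     [mysqld]
     innodb_buffer_pool_size = 12G    # ~75% of RAM on a dedicated server
     innodb_log_file_size    = 128M   # clean shutdown first (see note above)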

25
           Change Configuration (cont.)
★    Pros:
     ✦ Can get some quick wins, sometimes.
★    Cons:
     ✦ Assumes a setting is misconfigured in the first place. Over-tuning
       can cause negative effects. Try setting your sort_buffer_size
       to 400M to find out how!
★    Conclusions:
     ✦ Not a bad approach - it is easy to apply without changing
       your application or rolling in new hardware.




26
              Option 3: Add an index
★    Should really be called “Add an index, or slightly rewrite a
     query”.
★    This is the least “fun” approach.
★    It delivers the most value for money though!




27
         The EXPLAIN Command
mysql> EXPLAIN SELECT Name FROM Country WHERE continent =
'Asia' AND population > 5000000 ORDER BY Name\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: Country
         type: ALL
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 239
        Extra: Using where; Using filesort
1 row in set (0.00 sec)



28
                  Explain (cont.)
mysql> ALTER TABLE Country ADD INDEX p (Population);
Query OK, 239 rows affected (0.01 sec)
Records: 239 Duplicates: 0 Warnings: 0
mysql> EXPLAIN SELECT Name FROM Country WHERE Continent =
'Asia' AND population > 5000000 ORDER BY Name\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: Country
         type: ALL
possible_keys: p
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 239
        Extra: Using where; Using filesort
1 row in set (0.06 sec)
29
                      Now it is...
mysql> EXPLAIN SELECT Name FROM Country WHERE Continent = 'Asia'
  AND population > 50000000 ORDER BY Name\G
  *************************** 1. row ***************************
             id: 1
    select_type: SIMPLE
          table: Country
           type: range
  possible_keys: p
            key: p
        key_len: 4
            ref: NULL
           rows: 54
          Extra: Using where; Using filesort
  1 row in set (0.00 sec)




30
                  Another Index..
mysql> ALTER TABLE Country ADD INDEX c (Continent);
  Query OK, 239 rows affected (0.01 sec)
  Records: 239 Duplicates: 0 Warnings: 0

mysql> EXPLAIN SELECT Name FROM Country WHERE Continent = 'Asia'
  AND population > 50000000 ORDER BY Name\G
  *************************** 1. row ***************************
             id: 1
    select_type: SIMPLE
          table: Country
           type: ref
  possible_keys: p,c
            key: c
        key_len: 1
            ref: const
           rows: 42
          Extra: Using where; Using filesort
  1 row in set (0.01 sec)
31
       Changes back to p at 500M!

mysql> EXPLAIN SELECT Name FROM Country WHERE Continent = 'Asia'
AND population > 500000000 ORDER BY Name\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: Country
         type: range
possible_keys: p,c
          key: p
      key_len: 4
          ref: NULL
         rows: 4
        Extra: Using where; Using filesort
1 row in set (0.00 sec)



32
               Try another index...
mysql> ALTER TABLE Country ADD INDEX p_c (Population, Continent);
  Query OK, 239 rows affected (0.01 sec)
  Records: 239 Duplicates: 0 Warnings: 0

mysql> EXPLAIN SELECT Name FROM Country WHERE Continent = 'Asia'
  AND population > 50000000 ORDER BY Name\G

*************************** 1. row ***************************
             id: 1
    select_type: SIMPLE
          table: Country
           type: ref
  possible_keys: p,c,p_c
            key: c
        key_len: 1
            ref: const
           rows: 42
          Extra: Using where; Using filesort
  1 row in set (0.01 sec)
33
             How about this one?
mysql> ALTER TABLE Country ADD INDEX c_p (Continent,Population);
  Query OK, 239 rows affected (0.01 sec)
  Records: 239 Duplicates: 0 Warnings: 0

mysql> EXPLAIN SELECT Name FROM Country WHERE Continent = 'Asia'
  AND population > 50000000 ORDER BY Name\G
  *************************** 1. row ***************************
             id: 1
    select_type: SIMPLE
          table: Country
           type: range
  possible_keys: p,c,p_c,c_p
            key: c_p
        key_len: 5
            ref: NULL
           rows: 7
          Extra: Using where; Using filesort
  1 row in set (0.00 sec)
34
                      The Best...
mysql> ALTER TABLE Country ADD INDEX c_p_n
  (Continent,Population,Name);
  Query OK, 239 rows affected (0.02 sec)
  Records: 239 Duplicates: 0 Warnings: 0
mysql> EXPLAIN SELECT Name FROM Country WHERE Continent = 'Asia'
   AND population > 50000000 ORDER BY Name\G
   *************************** 1. row ***************************
              id: 1
     select_type: SIMPLE
           table: Country
            type: range
   possible_keys: p,c,p_c,c_p,c_p_n
             key: c_p_n
         key_len: 5
             ref: NULL
            rows: 7
           Extra: Using where; Using index; Using filesort
   1 row in set (0.00 sec)
35
               So what’s the end result?
★    We’re looking at 7 rows, not the whole table.
     ✦ We’re returning those rows from the index - bypassing the table.
★    A simple example - but easy to demonstrate how to
     reduce table scans.
★    You wouldn’t add all these indexes - I’m just doing it as a
     demonstration.
     ✦ Indexes (generally) hurt write performance.




36
            Example 2: Join Analysis
   mysql> EXPLAIN SELECT * FROM city WHERE countrycode IN
   (SELECT code FROM country WHERE name='Australia')\G
   *************************** 1. row ***************************
              id: 1
     select_type: PRIMARY
           table: city
            type: ALL
   possible_keys: NULL
             key: NULL
         key_len: NULL
             ref: NULL
            rows: 4079
           Extra: Using where
   *************************** 2. row ***************************
              id: 2
     select_type: DEPENDENT SUBQUERY
           table: country
            type: unique_subquery
   possible_keys: PRIMARY
             key: PRIMARY
         key_len: 3
             ref: func
            rows: 1
           Extra: Using where
37
                  Join analysis (cont.)
mysql> EXPLAIN SELECT city.* FROM city, country WHERE
city.countrycode=country.code AND country.name='Australia'\G
*************************** 1. row ***************************
            id: 1
   select_type: SIMPLE
         table: city
          type: ALL
possible_keys: NULL
           key: NULL
       key_len: NULL
           ref: NULL
          rows: 4079
         Extra:
*************************** 2. row ***************************
            id: 1
   select_type: SIMPLE
         table: country
          type: eq_ref
possible_keys: PRIMARY
           key: PRIMARY
       key_len: 3
           ref: world.city.CountryCode
          rows: 1
          Extra: Using where
38
                  Try an index...
★    mysql> ALTER TABLE city ADD INDEX (countrycode);
     Query OK, 4079 rows affected (0.03 sec)
     Records: 4079 Duplicates: 0 Warnings: 0




39
                       Is that any better?
     mysql> EXPLAIN SELECT city.* FROM city, country WHERE city.countrycode=country.code
     AND country.name='Australia'\G
     *************************** 1. row ***************************
                id: 1
       select_type: SIMPLE
             table: city
              type: ALL
     possible_keys: CountryCode
               key: NULL
           key_len: NULL
               ref: NULL
              rows: 4079
             Extra:
     *************************** 2. row ***************************
                id: 1
       select_type: SIMPLE
             table: country
              type: eq_ref
     possible_keys: PRIMARY
               key: PRIMARY
           key_len: 3
               ref: world.city.CountryCode
              rows: 1
             Extra: Using where
     2 rows in set (0.01 sec)
40
                     Try Again
★    mysql> ALTER TABLE country ADD INDEX (name);
     Query OK, 239 rows affected (0.01 sec)
     Records: 239 Duplicates: 0 Warnings: 0




41
                          Looking good...
     mysql> EXPLAIN SELECT city.* FROM city, country WHERE city.countrycode=country.code
     AND country.name='Australia'\G
     *************************** 1. row ***************************
                id: 1
       select_type: SIMPLE
             table: country
              type: ref
     possible_keys: PRIMARY,Name
               key: Name
           key_len: 52
               ref: const
              rows: 1
             Extra: Using where
     *************************** 2. row ***************************
                id: 1
       select_type: SIMPLE
             table: city
              type: ref
     possible_keys: CountryCode
               key: CountryCode
           key_len: 3
               ref: world.country.Code
              rows: 18
             Extra:
     2 rows in set (0.00 sec)
42
                          My Advice
★    Focus on components of the WHERE clause.
★    The optimizer does cool things - don’t make assumptions.
     For example:
     ✦ EXPLAIN SELECT * FROM City WHERE id = 1810;
     ✦ EXPLAIN SELECT * FROM City WHERE id = 1810 LIMIT 1;
     ✦ EXPLAIN SELECT * FROM City WHERE id BETWEEN 100 AND 200;
     ✦ EXPLAIN SELECT * FROM City WHERE id >= 100 AND id <= 200;



43
                                The answer...
mysql> EXPLAIN SELECT * FROM City WHERE id = 1810;
    +----+-------------+-------+-------+---------------+---------+---------+-------+------+-------+
    | id | select_type | table | type | possible_keys | key      | key_len | ref   | rows | Extra |
    +----+-------------+-------+-------+---------------+---------+---------+-------+------+-------+
    | 1 | SIMPLE       | City | const | PRIMARY        | PRIMARY | 4       | const |    1 |       |
    +----+-------------+-------+-------+---------------+---------+---------+-------+------+-------+
    1 row in set (0.00 sec)

mysql> EXPLAIN SELECT * FROM City WHERE id = 1810 LIMIT 1;
    +----+-------------+-------+-------+---------------+---------+---------+-------+------+-------+
    | id | select_type | table | type | possible_keys | key      | key_len | ref   | rows | Extra |
    +----+-------------+-------+-------+---------------+---------+---------+-------+------+-------+
    | 1 | SIMPLE       | City | const | PRIMARY        | PRIMARY | 4       | const |    1 |       |
    +----+-------------+-------+-------+---------------+---------+---------+-------+------+-------+
    1 row in set (0.00 sec)




44
                               The answer (2)
mysql> EXPLAIN SELECT * FROM City WHERE id BETWEEN 100 and 200;
    +----+-------------+-------+-------+---------------+---------+---------+------+------+-------------+
    | id | select_type | table | type | possible_keys | key      | key_len | ref | rows | Extra        |
    +----+-------------+-------+-------+---------------+---------+---------+------+------+-------------+
    | 1 | SIMPLE       | City | range | PRIMARY        | PRIMARY | 4       | NULL | 101 | Using where |
    +----+-------------+-------+-------+---------------+---------+---------+------+------+-------------+
    1 row in set (0.01 sec)


mysql> EXPLAIN SELECT * FROM City WHERE id >= 100 and id <= 200;
    +----+-------------+-------+-------+---------------+---------+---------+------+------+-------------+
    | id | select_type | table | type | possible_keys | key      | key_len | ref | rows | Extra        |
    +----+-------------+-------+-------+---------------+---------+---------+------+------+-------------+
    | 1 | SIMPLE       | City | range | PRIMARY        | PRIMARY | 4       | NULL | 101 | Using where |
    +----+-------------+-------+-------+---------------+---------+---------+------+------+-------------+
    1 row in set (0.00 sec)




45
                       More information
★    http://dev.mysql.com/EXPLAIN
★    Some examples are also in:
     ✦ High Performance MySQL, 2nd Edition -
       authored by three Percona employees, et al.
     ✦ Zend Enterprise PHP Patterns -
       authored by John Coggeshall and myself.




46
               Add an index (conclusion)
★    Pros:
     ✦ The biggest wins. Seriously.
★    Cons:
     ✦ Takes a bit of time for analysis. If you need to rewrite a query,
       you need to go inside the application (not everyone can).
★    Conclusion:
     ✦ My #1 Recommendation.




47
                      The Scoreboard

        Option            Effort    Wins

        Add Hardware      **        1/2
        Tweak Settings    **        **
        Add an Index      ***       *****



48
Third Party Patches time!
         Patch #1 - Slow Query Filtering
★    Let me tell a story first:
     ✦ mysql> select * from employees WHERE birth_date
       BETWEEN '1960-01-01' AND '1960-03-31' ORDER BY RAND();
       117138 rows in set (0.63 sec)
     ✦ mysql> select * from employees WHERE birth_date
       BETWEEN '1960-01-01' AND '1960-03-31';
       117138 rows in set (0.25 sec)




50
                          First One:
★    mysql> EXPLAIN select * from employees WHERE birth_date BETWEEN
     '1960-01-01' AND '1960-03-31' ORDER by RAND()\G
     *************************** 1. row ***************************
                id: 1
       select_type: SIMPLE
             table: employees
              type: range
     possible_keys: birth_date
               key: birth_date
           key_len: 3
               ref: NULL
              rows: 11554
             Extra: Using where; Using temporary; Using filesort
     1 row in set (0.00 sec)




51
                        Second One
★    mysql> EXPLAIN select * from employees WHERE birth_date BETWEEN
     '1960-01-01' AND '1960-03-31'\G
     *************************** 1. row ***************************
                id: 1
       select_type: SIMPLE
             table: employees
              type: range
     possible_keys: birth_date
               key: birth_date
           key_len: 3
               ref: NULL
              rows: 11554
             Extra: Using where
     1 row in set (0.00 sec)




52
                     ... what do we see?
★    So the query that doesn’t have to sort records is slightly
     faster. No surprises here.
★    What about the difference in scalability between these two
     queries?
     ✦ Scalability != performance!




53
                  The empirical test
★    #!/bin/sh

     ITERATIONS=10
     CONCURRENCY="1,2,4,8,16,32,64,128"

     mkdir -p results

     ./mysqlslap -q "select * from employees WHERE birth_date BETWEEN '1960-01-01' AND '1960-03-31' ORDER BY RAND()" \
       --create-schema=employees -i $ITERATIONS -c $CONCURRENCY > results/ORDER_BY_rand.txt

     ./mysqlslap -q "select * from employees WHERE birth_date BETWEEN '1960-01-01' AND '1960-03-31'" \
       --create-schema=employees -i $ITERATIONS -c $CONCURRENCY > results/ORDER_BY_null.txt
54
                                    Results

     Response time in seconds (mysqlslap, 10 iterations):

     Concurrency        1     2     4     8     16     32     64     128
     ORDER BY RAND()    0.63  1.22  2.40  4.88  9.81   14.39  26.96  60.05
     No RAND()          0.25  0.47  0.89  1.76  3.48   4.55   5.16   18.56
55
                                 Why is this?
★    The query that requires the sort hits a different bottleneck:
     the sort.
     ✦ Sorts in memory cause a lot of CPU pressure.
       • Might not scale.
     ✦ Temporary tables or sorts on disk cause IO pressure.
       • Certainly won’t scale.




56
                 Why is this (cont.)?
★    Sometimes it’s just as important to monitor “what queries
     take a long time” as “what may not work with
     concurrency”.
★    Examples of things that might be expensive and reduce
     concurrency are.....




57
                        Expensive Things...


     qc_miss             The query was not found in the query cache.
     full_scan           The query performed a full table scan.
     full_join           The query performed a full join (a join without indexes).
     tmp_table           The query created an implicit internal temporary table.
     tmp_table_on_disk   The query's temporary table was stored on disk.
     filesort            The query used a filesort.
     filesort_on_disk    The filesort was performed on disk.




58
             Introducing Log slow filter...
★    [mysqld]
     log_slow_filter=tmp_table_on_disk,filesort_on_disk


       qc_miss             The query was not found in the query cache.
       full_scan           The query performed a full table scan.
       full_join           The query performed a full join (a join without indexes).
       tmp_table           The query created an implicit internal temporary table.
       tmp_table_on_disk   The query's temporary table was stored on disk.
       filesort            The query used a filesort.
       filesort_on_disk    The filesort was performed on disk.




59
                           Extra Feature #1
★    log_slow_verbosity - You can also *get* more
     information written to the log:
     microtime          Log queries with microsecond precision (mandatory).

     query_plan         Log information about the query's execution plan (optional).

     innodb             Log InnoDB statistics (optional).



     # User@Host: mailboxer[mailboxer] @ [192.168.10.165]
     # Thread_id: 11167745 Schema: board
     # QC_Hit: No Full_scan: No Full_join: No Tmp_table: Yes Disk_tmp_table: No
     # Filesort: Yes Disk_filesort: No Merge_passes: 0
     # Query_time: 0.000659 Lock_time: 0.000070 Rows_sent: 0 Rows_examined: 30
     Rows_affected: 0 Rows_read: 30
     #   InnoDB_IO_r_ops: 1 InnoDB_IO_r_bytes: 16384 InnoDB_IO_r_wait: 0.028487
     #   InnoDB_rec_lock_wait: 0.000000 InnoDB_queue_wait: 0.000000
     #   InnoDB_pages_distinct: 5
     select count(distinct author_id) from art87.article87 force index (forum_id) where
     forum_id = 240215 and thread_id = '710575'
60
                            Extra Feature #2
★    It also allows you to set long_query_time in
     microseconds on MySQL 5.0:
     ✦ long_query_time = 100000
★    If we think that it takes an average of up to 7* queries to
     generate a page, 1 second is too long.
     ✦ MySQL finally fixed this in MySQL 5.1.




         * Source: Brian Aker somewhere.
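
★    A my.cnf sketch of the two conventions - under the patch on 5.0 the
     value is in microseconds; stock 5.1 accepts fractional seconds instead:

     [mysqld]
     long_query_time = 100000   # patched 5.0: log anything over 0.1s
     # long_query_time = 0.1    # the stock MySQL 5.1 equivalent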
61
                       More information
★    This patch was originally authored by Percona
★    More Information:
     http://www.percona.com/docs/wiki/patches:microslow_innodb




62
           Patch #2 - Index Statistics
★    mysql> EXPLAIN SELECT Name FROM Country WHERE continent =
     'Asia' AND population > 5000000 ORDER BY Name\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: Country
         type: ALL
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 239
        Extra: Using where; Using filesort
1 row in set (0.00 sec)

     What were the possible indexes?



63
                             Answers
★    The following indexes work (to varying degrees):
     ✦ (continent)
     ✦ (continent, population)
     ✦ (continent, population, name)




64
                        Answers (cont.)
★    The following answers are incorrect:
     ✦ name. Using an index on name would avoid the sort, but it would
       actually be quicker to table scan.
     ✦ population. Using an index on population is not very helpful,
       since it doesn’t filter enough rows (most countries have
       populations > 5M).




65
                How would you tell...
★    .. if you added the wrong index?

     You know, given that indexes will hurt things like write
     performance.




66
                     The simple answer...
★    You can’t!
     ✦ There is no way in MySQL to find “dead indexes”.
       • Other than dropping all indexes and starting to add them again ;)
     ✦ Did I mention index selection changes over time based on data
       distribution?
★    ... and it’s really annoying.




67
            Introducing Index Statistics
★    SELECT DISTINCT s.TABLE_SCHEMA, s.TABLE_NAME, s.INDEX_NAME
     FROM information_schema.statistics `s`
     LEFT JOIN information_schema.index_statistics `is`
       ON (s.TABLE_SCHEMA = `is`.TABLE_SCHEMA
           AND s.TABLE_NAME = `is`.TABLE_NAME
           AND s.INDEX_NAME = `is`.INDEX_NAME)
     WHERE `is`.TABLE_SCHEMA IS NULL;
     +--------------+---------------------------+-----------------+
     | TABLE_SCHEMA | TABLE_NAME                | INDEX_NAME      |
     +--------------+---------------------------+-----------------+
     | art100       | article100                | ext_key         |
     | art100       | article100                | site_id         |
     | art100       | article100                | hash            |
     | art100       | article100                | forum_id_2      |
     | art100       | article100                | published       |
     | art100       | article100                | inserted        |
     | art100       | article100                | site_id_2       |
     | art100       | author100                 | PRIMARY         |
     | art100       | author100                 | site_id         |
     ...
     +--------------+---------------------------+-----------------+
     1150 rows in set (1 min 44.23 sec)

     Source: http://www.mysqlperformanceblog.com/2008/09/12/unused-indexes-by-single-query/
68
                     Index statistics is...
★    A very simple statistics tool that increments counters as
     index rows are read.
     ✦ Allows you to figure out which indexes should be dropped or
       tweaked.
     ✦ Very little overhead.
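
★    A usage sketch - the userstat patch exposes a ROWS_READ counter per
     index (verify the column names against your build’s documentation):

     mysql> SELECT TABLE_SCHEMA, TABLE_NAME, INDEX_NAME, ROWS_READ
            FROM information_schema.INDEX_STATISTICS
            ORDER BY ROWS_READ DESC LIMIT 10;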




69
                 More information
★    Actually comes as part of a patch for
     INDEX_STATISTICS, USER_STATISTICS,
     TABLE_STATISTICS.
★    Patch originally authored by Google.
★    More Information:
     http://www.percona.com/docs/wiki/patches:userstatv2




70
                                 The End
★    A few quick public service announcements if I may:

★    Many of the examples I use show up on our blog:
     ✦ www.mysqlperformanceblog.com
★    We’re a consulting company. Our most popular service is a
     Performance Audit. You might be interested:
     ✦ www.mysqlperformanceblog.com/2008/11/24/how-percona-does-a-mysql-performance-audit/
★    I’m in town to deliver InnoDB and XtraDB training on Friday.
     We have only 4 seats left!
     ✦ www.percona.com/training/performance-optimization-for-mysql-with-innodb-and-xtradb.html
★    I have two books to give away:




71
                    Finding bad queries
★   MySQL has a feature called the slow query log.
★   We can enable it, and then set the long_query_time to
    zero[1] seconds to capture a sample of all our queries.

[1] Requires MySQL 5.1, or a patched MySQL 5.0 release.
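
★   On MySQL 5.1 this can be done at runtime - a sketch (turn it back off
    promptly; logging every query grows the file fast):

    mysql> SET GLOBAL slow_query_log = ON;
    mysql> SET GLOBAL long_query_time = 0;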
72
     root@ubuntu:~# perl mk-query-digest /bench/mysqldata/ubuntu-slow.log
     # 1461.1s user time, 39.2s system time, 22.20M rss, 57.52M vsz
     # Overall: 7.26M total, 38 unique, 17.28k QPS, 18.88x concurrency ________
     #                    total     min     max     avg     95% stddev median
     # Exec time          7929s    12us   918ms     1ms     4ms     10ms  138us
     # Lock time           154s       0    17ms    21us    36us     33us   18us
     # Rows sent          5.90M       0     246    0.85    0.99     6.71   0.99
     # Rows exam          6.90M       0     495    1.00    0.99   13.48       0
     # Time range        2009-09-13 17:26:54 to 2009-09-13 17:33:54
     # bytes            765.14M       6     599 110.56 202.40     65.01   80.10
     # Rows read              0       0       0       0       0        0      0




73
     ..
     # Query 1: 655.60 QPS, 4.28x concurrency, ID 0x813031B8BBC3B329 at byte 518466
     # This item is included in the report because it matches --limit.
     #              pct   total      min     max    avg      95% stddev median
     # Count          3 274698
     # Exec time     22   1794s     12us   918ms    7ms      2ms   43ms   332us
     # Lock time      0        0       0       0      0        0       0      0
     # Rows sent      0        0       0       0      0        0       0      0
     # Rows exam      0        0       0       0      0        0       0      0
     # Users                   1 [root]
     # Hosts                   1 localhost
     # Databases               1    tpcc
     # Time range 2009-09-13 17:26:55 to 2009-09-13 17:33:54
     # bytes          0   1.57M        6       6      6        6       0      6
     # Query_time distribution
     #   1us
     # 10us ###
     # 100us ################################################################
     #   1ms ##
     # 10ms ##
     # 100ms #
     #    1s
     # 10s+
     commit\G




74
     ..
     # Query 2: 2.05k QPS, 4.20x concurrency, ID 0x10BEBFE721A275F6 at byte 17398977
     # This item is included in the report because it matches --limit.
     #              pct   total      min     max    avg      95% stddev median
     # Count         11 859757
     # Exec time     22   1758s     64us   812ms    2ms      9ms    9ms   224us
     # Lock time     17     27s     13us     9ms   31us     44us   26us    28us
     # Rows sent      0        0       0       0      0        0       0      0
     # Rows exam      0        0       0       0      0        0       0      0
     # Users                   1 [root]
     # Hosts                   1 localhost
     # Databases               1    tpcc
     # Time range 2009-09-13 17:26:55 to 2009-09-13 17:33:54
     # bytes         22 170.52M      192     213 207.97 202.40     0.58 202.40
     # Query_time distribution
     #   1us
     # 10us #
     # 100us ################################################################
     #   1ms ############
     # 10ms ###
     # 100ms #
     #    1s
     # 10s+
     # Tables
     #    SHOW TABLE STATUS FROM `tpcc` LIKE 'order_line'\G
     #    SHOW CREATE TABLE `tpcc`.`order_line`\G
     INSERT INTO order_line (ol_o_id, ol_d_id, ol_w_id, ol_number, ol_i_id, ol_supply_w_id,
     ol_quantity, ol_amount, ol_dist_info) VALUES (3669, 4, 65, 1, 6144, 38, 5, 286.943756103516,
     'sRgq28BFdht7nemW14opejRj')\G




75
     ..
     # Query 4: 2.05k QPS, 1.42x concurrency, ID 0x6E70441DF63ACD21 at byte 192769443
     # This item is included in the report because it matches --limit.
     #              pct   total      min     max    avg      95% stddev median
     # Count         11 859769
     # Exec time      7    597s     67us   794ms  693us   467us     6ms   159us
     # Lock time     12      19s     9us    10ms   21us    31us    25us    19us
     # Rows sent      0        0       0       0       0       0       0      0
     # Rows exam      0        0       0       0       0       0       0      0
     # Users                   1 [root]
     # Hosts                   1 localhost
     # Databases               1    tpcc
     # Time range 2009-09-13 17:26:55 to 2009-09-13 17:33:54
     # bytes          7 56.36M        64      70  68.73   65.89    0.30   65.89
     # Query_time distribution
     #   1us
     # 10us #
     # 100us ################################################################
     #   1ms #
     # 10ms #
     # 100ms #
     #    1s
     # 10s+
     # Tables
     #    SHOW TABLE STATUS FROM `tpcc` LIKE 'stock'\G
     #    SHOW CREATE TABLE `tpcc`.`stock`\G
     UPDATE stock SET s_quantity = 79 WHERE s_i_id = 89277 AND s_w_id = 51\G
     # Converted for EXPLAIN
     # EXPLAIN
     select s_quantity = 79 from stock where s_i_id = 89277 AND s_w_id = 51\G




76
     ..
     # Rank   Query ID           Response time    Calls   R/Call       Item
     # ====   ================== ================ ======= ==========   ====
     #    1   0x813031B8BBC3B329 1793.7763 23.9% 274698     0.006530   COMMIT
     #    2   0x10BEBFE721A275F6 1758.1369 23.5% 859757     0.002045   INSERT   order_line
     #    3   0xBD195A4F9D50914F   924.4553 12.3% 859770    0.001075   SELECT   UPDATE stock
     #    4   0x6E70441DF63ACD21   596.6281 8.0% 859769     0.000694   UPDATE   stock
     #    5   0x5E61FF668A8E8456   448.0148 6.0% 1709675    0.000262   SELECT   stock
     #    6   0x0C3504CBDCA1EC89   308.9468 4.1%    86102   0.003588   UPDATE   customer
     #    7   0xA0352AA54FDD5DF2   307.4916 4.1%    86103   0.003571   UPDATE   order_line
     #    8   0xFFDA79BA14F0A223   192.8587 2.6%    86122   0.002239   SELECT   customer warehouse
     #    9   0x0C3DA99DF6138EB1   191.9911 2.6%    86120   0.002229   SELECT   UPDATE customer
     #   10   0xBF40A4C7016F2BAE   109.6601 1.5% 860614     0.000127   SELECT   item
     #   11   0x8B2716B5B486F6AA   107.9319 1.4%    86120   0.001253   INSERT   history
     #   12   0x255C57D761A899A9   103.9751 1.4%    86120   0.001207   UPDATE   warehouse
     #   13   0xF078A9E73D7A8520   102.8506 1.4%    86120   0.001194   UPDATE   district
     #   14   0x9577D48F480A1260    91.3182 1.2%    56947   0.001604   SELECT   customer
     #   15   0xE5E8C12332AD11C5    87.2532 1.2%    86122   0.001013   SELECT   UPDATE district
     #   16   0x2276F0D2E8CC6E22    86.1945 1.1%    86122   0.001001   UPDATE   district
     #   17   0x9EB8F1110813B80D    83.1471 1.1%    86106   0.000966   UPDATE   orders
     #   18   0x0BF7CEAD5D1D2D7E    80.5878 1.1%    86122   0.000936   INSERT   orders
     #   19   0xAC36DBE122042A66    74.5417 1.0%     8612   0.008656   SELECT   order_line
     #   20   0xF8A4D3E71E066ABA    46.7978 0.6%     8612   0.005434   SELECT   orders




77
           Advanced mk-query-digest
★    Query Review - the best feature ever.
★    Saves the fingerprint of each slow query, and only shows
     you what you haven’t already reviewed:

     $ mk-query-digest --review h=host1,D=test,t=query_review \
       /path/to/slow.log




78