
Thursday, October 11, 2012

Query Optimization Tips


Listed below are some of the most important query optimization tips.

  • The parameters query_cache_type and query_cache_size can be set in the my.cnf file
    to speed up query performance.
    - The query cache is suitable where the database has to repeatedly run the
    same queries on the same data set, returning the same results each time. MySQL
    can then cache the result set, avoiding the overhead of running through the
    data over and over.

    - The appropriate query_cache_size differs depending upon the application’s
    requirements, the types of queries and the available RAM.

    - Possible values of query_cache_type:
    o 0 = disabled
    o 1 = enabled (recommended when queries are slow and plenty of RAM is
    available)
    o 2 = on demand (queries are cached only if the SQL_CACHE hint is used)
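    A minimal sketch of how these settings might look in my.cnf; the 64M size is only
    an illustrative value and should be tuned to the workload and available RAM:
    [mysqld]
    query_cache_type = 1
    query_cache_size = 64M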
  • Avoid using DISTINCT when you have or could use GROUP BY. Example:
    Original: select distinct col1 from table1
    Alternative: select col1 from table1 group by col1
  • If you use DISTINCT, try to use LIMIT (MySQL stops as soon as it finds row_count
    unique rows) and avoid ORDER BY (it requires a temporary table in many cases).
  • Enhance INSERT statement performance. An INSERT statement involves the following
    activities:
    o Connecting
    o Sending query to server
    o Parsing query
    o Inserting row
    o Inserting indexes
    o Closing
    - Batch INSERT and REPLACE should be used when inserting multiple rows at the
    same time.
    SYNTAX
    INSERT INTO tbl_name (col1, col2) VALUES (val1, val2);
    or
    INSERT INTO tbl_name SELECT...
    or
    INSERT INTO tbl_name VALUES (val1, val2), (val1, val2) …

    - For even better performance use LOAD DATA instead of INSERT. LOAD DATA
    INFILE reads rows from a text file into a table at a very high speed.
    SYNTAX
    LOAD DATA [LOW_PRIORITY | CONCURRENT] [LOCAL] INFILE 'file_name'
    [REPLACE | IGNORE]
    INTO TABLE tbl_name
    [CHARACTER SET charset_name]
    [{FIELDS | COLUMNS}
    [TERMINATED BY 'string']
    [[OPTIONALLY] ENCLOSED BY 'char']
    [ESCAPED BY 'char']
    ]
    [LINES
    [STARTING BY 'string']
    [TERMINATED BY 'string']
    ]
    [IGNORE number LINES]
    [(col_name_or_user_var,...)]
    [SET col_name = expr,...]
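    For illustration, a hypothetical customers table and CSV file loaded both ways
    (the table, column and file names are assumptions):
    INSERT INTO customers (id, name) VALUES (1, 'Alice'), (2, 'Bob'), (3, 'Carol');
    LOAD DATA INFILE '/tmp/customers.csv'
    INTO TABLE customers
    FIELDS TERMINATED BY ',' ENCLOSED BY '"'
    LINES TERMINATED BY '\n'
    IGNORE 1 LINES
    (id, name);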
  • LIMIT m, n may not be as fast as it sounds. Alternative: precompute row positions
    in an indexed POS column and then use POS BETWEEN (m+1) AND (m+n).
    Example:
    Original: select * from orders limit 100, 10;
    Alternative: select * from orders where POS between 101 and 110;
  • Don't use ORDER BY RAND() if you have more than ~2K records. Alternative:
    - Add an indexed column called “random” and populate it with random values.
    - Do ORDER BY random instead of ORDER BY RAND().
    - Once you’ve selected a row, update its “random” column to a new random value.
    - Periodically run a process to reset the random values.
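    A minimal sketch of this approach, assuming a hypothetical articles table with an
    id primary key:
    ALTER TABLE articles ADD COLUMN random DOUBLE, ADD INDEX idx_random (random);
    UPDATE articles SET random = RAND();                -- initial population
    SELECT * FROM articles ORDER BY random LIMIT 1;     -- uses the index
    UPDATE articles SET random = RAND() WHERE id = 42;  -- refresh the row just used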
  • Use the SQL_NO_CACHE hint when you are SELECTing frequently updated data or
    large sets of data.
  • Use SQL_CACHE if you have set MySQL query caching to on-demand mode
    (query_cache_type = 2) to tell MySQL which queries to cache.
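    Both hints go right after the SELECT keyword; the table names here are assumptions:
    SELECT SQL_NO_CACHE COUNT(*) FROM page_views;
    SELECT SQL_CACHE id, name FROM customers WHERE status = 'active';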
  • The HIGH_PRIORITY hint can be used on SELECT or INSERT statements to let MySQL
    know that this is a high-priority query.
  • The LOW_PRIORITY hint can be used on INSERT or UPDATE statements. If you use the
    LOW_PRIORITY keyword, execution of the query is delayed until no other clients are
    reading from the table.
  • An INSERT LOW_PRIORITY statement will not return until the statement has been
    executed, which could possibly be forever. Instead you can use an INSERT DELAYED
    statement. It returns immediately, but it still waits until other clients have
    closed the table before executing the statement.
    Note: INSERT DELAYED only works on MyISAM, MEMORY, and ARCHIVE tables.
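    A small illustration with a hypothetical log_events table:
    INSERT LOW_PRIORITY INTO log_events (msg) VALUES ('cache refreshed');
    INSERT DELAYED INTO log_events (msg) VALUES ('cache refreshed');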
  • The STRAIGHT_JOIN hint can be used to tell MySQL to join the tables in the order
    that they are specified in the FROM clause.
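    For example (table names are assumptions), this forces orders to be read before
    customers:
    SELECT STRAIGHT_JOIN o.id, c.name
    FROM orders o
    JOIN customers c ON c.id = o.customer_id;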
  • The SQL_BUFFER_RESULT hint tells MySQL to put the result of the query into a
    temporary table. This frees up the table lock while the result set is being sent to
    the client, so you would only want to use this on large result sets.
  • SQL_BIG_RESULT can be used with DISTINCT and GROUP BY SELECT statements. It tells
    MySQL that the result set will be big. MySQL then uses disk-based temporary tables
    directly if needed, and prefers sorting to using a temporary table with a key on
    the GROUP BY elements.
  • SQL_SMALL_RESULT tells MySQL to use fast in-memory temporary tables to store the
    resulting table instead of using sorting. Since this is typically the default route
    of the optimizer, this hint is often not needed.
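    Example placement of these result-set modifiers, on a hypothetical orders table:
    SELECT SQL_BUFFER_RESULT * FROM orders WHERE status = 'open';
    SELECT SQL_BIG_RESULT customer_id, COUNT(*) FROM orders GROUP BY customer_id;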
  • Avoid wildcards at the start of LIKE queries.
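    For example, assuming an index on the name column:
    SELECT * FROM customers WHERE name LIKE 'smi%';  -- can use the index
    SELECT * FROM customers WHERE name LIKE '%smi';  -- leading wildcard forces a full scan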
  • Avoid correlated subqueries. In SELECT and WHERE clauses, try to avoid the IN
    clause.
  • Avoid calculated comparisons and isolate indexed columns. Example:
    order_created is the indexed column.
    select * from orders where to_days(order_created) - to_days(current_date()) >= 7;
    - Here the index is not used.
    select * from orders where order_created >= current_date() - interval 7 day;
    - Here the index is used as the indexed column is isolated.
  • ORDER BY and LIMIT work best with equalities and covered indexes.
  • Separate text/blobs from metadata; don't put text/blobs in result sets if you don't
    need them.
  • Derived tables (subqueries in the FROM clause) can be useful for retrieving BLOBs
    without sorting them. (A self-join can speed up a query if the first part finds the
    IDs and then uses them to fetch the rest.)
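    A sketch of this pattern, using a hypothetical articles table that holds a BLOB
    column:
    SELECT a.*
    FROM (SELECT id FROM articles ORDER BY created DESC LIMIT 10) ids
    JOIN articles a ON a.id = ids.id;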
  • ALTER TABLE...ORDER BY can take data sorted chronologically and re-order it by a
    different field -- this can make queries on that field run faster.
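    For example, to physically re-order a table by a hypothetical customer_id column
    (a one-time rewrite; later inserts are not kept in this order):
    ALTER TABLE orders ORDER BY customer_id;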
  • Do not make queries very complex; rather, join smaller ones.
  • Delete small amounts at a time if you can, because DELETE acquires a lock on the
    table, blocking further access until execution completes.
  • Make similar queries consistent so the query cache is used.
  • Have good SQL query standards. (http://dev.mysql.com/doc/refman/5.1/...atibility.html)
  • Don't use deprecated features.
  • Turning OR on multiple indexed fields (< 5.0) into UNION may speed things up (with
    LIMIT); after 5.0, index_merge should pick this up.
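    A sketch of the rewrite, assuming separate indexes on customer_id and status:
    SELECT * FROM orders WHERE customer_id = 42
    UNION
    SELECT * FROM orders WHERE status = 'open'
    LIMIT 10;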
  • Don't run COUNT(*) on InnoDB tables for every search; do it only occasionally
    and/or use summary tables, or if you need the total number of rows, use
    SQL_CALC_FOUND_ROWS first and then SELECT FOUND_ROWS().
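    Example of the two-step pattern on a hypothetical orders table:
    SELECT SQL_CALC_FOUND_ROWS * FROM orders WHERE status = 'open' LIMIT 20;
    SELECT FOUND_ROWS();  -- total matching rows, ignoring the LIMIT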
  • If you often need to calculate COUNT or SUM based on information from a lot of
    rows (article ratings, poll votes, user registration counts, etc.), it makes sense
    to create a separate table and update the counter in real time, which is much
    faster. If you need to collect statistics from huge log tables, take advantage of
    a summary table instead of scanning the entire log table every time.
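    A minimal sketch of a real-time counter, with hypothetical table and column names:
    UPDATE article_stats SET vote_count = vote_count + 1 WHERE article_id = 123;
    -- instead of recomputing it on every read:
    SELECT COUNT(*) FROM votes WHERE article_id = 123;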
  • Don’t use REPLACE (which is DELETE + INSERT and wastes ids); use INSERT ... ON
    DUPLICATE KEY UPDATE instead (i.e. it’s INSERT + UPDATE if a conflict takes place).
    The same technique can be used when you would otherwise first run a SELECT to find
    out if the data is already in the database, and then run either INSERT or UPDATE.
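    For example, with a hypothetical page_counters table keyed on page_id:
    INSERT INTO page_counters (page_id, hits)
    VALUES (123, 1)
    ON DUPLICATE KEY UPDATE hits = hits + 1;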
  • Use groupwise maximum instead of subqueries.
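    A sketch of a groupwise maximum (latest order per customer) using a derived table
    instead of a correlated subquery; the column names are assumptions:
    SELECT o.*
    FROM orders o
    JOIN (SELECT customer_id, MAX(order_created) AS max_created
          FROM orders GROUP BY customer_id) m
      ON m.customer_id = o.customer_id AND m.max_created = o.order_created;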
  • Avoid using the HAVING clause; it’s rather slow.
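    Where possible, filter with WHERE before grouping rather than with HAVING after it:
    SELECT customer_id, COUNT(*) FROM orders WHERE customer_id > 100 GROUP BY customer_id;
    -- rather than:
    SELECT customer_id, COUNT(*) FROM orders GROUP BY customer_id HAVING customer_id > 100;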
  • If you have lots of DELETEs in your application, or updates of dynamic-format rows
    (rows containing a VARCHAR, BLOB or TEXT column are dynamic-format rows), schedule
    an OPTIMIZE TABLE query every weekend via crond.
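    For example (the table name is an assumption):
    OPTIMIZE TABLE messages;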
  • Use the leftmost index prefix: in MySQL you can define an index on several columns
    so that the left part of that index can be used as a separate index, which results
    in fewer indices.
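    For example, a composite index on (customer_id, order_created):
    ALTER TABLE orders ADD INDEX idx_cust_created (customer_id, order_created);
    SELECT * FROM orders WHERE customer_id = 42;                                    -- uses the index
    SELECT * FROM orders WHERE customer_id = 42 AND order_created >= '2012-01-01';  -- uses the index
    SELECT * FROM orders WHERE order_created >= '2012-01-01';                       -- cannot use it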
  • Do not use “!=” when you are about to compare two result sets; use a “NOT EXISTS”
    clause instead.
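    For example, to find customers with no orders (table names are assumptions):
    SELECT c.id FROM customers c
    WHERE NOT EXISTS (SELECT 1 FROM orders o WHERE o.customer_id = c.id);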
  • Disable unique and foreign key constraints during data import operations, and set
    autocommit to false. Re-enable them once the import is complete.
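    A minimal sketch of the surrounding session settings for a bulk import:
    SET unique_checks = 0;
    SET foreign_key_checks = 0;
    SET autocommit = 0;
    -- ... run the bulk INSERT / LOAD DATA statements here ...
    COMMIT;
    SET unique_checks = 1;
    SET foreign_key_checks = 1;
    SET autocommit = 1;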
