The SQL Problem
The SQL Problem
Table of Contents
The SQL Problem…………………………………………………page 3 The Right Way to Optimize SQL……………………………..page 6 Productivity………………………………………………………….page 10 Avoiding Database Optimization Pitfalls…………………..page 11 The Best Solution to the Problem…………………………….Page 15 LECCO SQL Expert solves SQL Performance Problem….Page 17
The SQL Problem
LECCO SQL Expert automates the process of identifying and optimizing problematic SQL statements.
"The bad news with SQL is that you don't get traditional debugging tools! The poor programmer is quickly back to the days of ‘Wright brothers software engineering’: Put it (the query) all together and push it off a cliff to see if it flies" Joe Celko, “CELKO”, February 9, 2000 Volume 3 - Number 3
The SQL Problem
SQL is not a procedural programming language. You cannot step through it one line at a time with a program debugging tool. Because it is based on Set Theory, all data is retrieved, stored, deleted, and updated in whole, indivisible sets of data. The SQL statement either results in the correct set of data accessed in a relational database or the wrong set, and there is no way to see what the results will be except to run the query to completion. The major challenge in the development of relational database technology has been the problem of creating a physical system using current computer and software technology that conforms to the logical rules of relational database theory while still delivering optimum performance for retrieving and storing data.
What is the difference between a good SQL statement and a performance optimized statement?
Theoretically, a good SQL statement is any one that correctly selects, deletes, updates or inserts the desired set of rows in the database. In reality, however, there are many semantically equivalent ways to write a given query. The problem is that some SQL statements have much better performance than other semantically equivalent ones. Therefore, a good SQL statement can be considered a problematic one if it takes too long to run. It is estimated that thirty percent or more of complex
LECCO Technology The SQL Problem Page 3
SQL statements can be rewritten to perform better, and that such problematic SQL statements are responsible for sixty to ninety percent of database performance problems. So the best SQL statement not only gets the correct result, but also produces optimum performance.
The problem defined: too much complexity
How many different ways are there to write this SQL statement?
SQL1 SELECT EMP_ID, EMP_SALARY FROM EMPLOYEE WHERE NVL(EMP_SALARY, 50000) BETWEEN 50000 AND 55000 ORDER BY EMP_ID, EMP_SALARY
According to the LECCO SQL Expert Optimizer, there are 9 different semantically equivalent ways to write this statement in Oracle 8, without Oracle optimizer Hints applied. With several Oracle optimizer Hints applied, LECCO SQL Expert generated 273. The fastest alternative ran in a few milliseconds, compared to original one, which ran in 2 minutes 17 seconds – an improvement factor of at least 13,700 percent. How many semantically equivalent ways are there to write the following query?
SQL2 select emp_name, dpt_name, grd_desc INTO EMPLOYEE_NAME, DEPARTMENT_NAME, GRADE_DESC from employee, department DEPARTMENT1, grade where emp_grade = grd_id and emp_dept = dpt_id and EXISTS (SELECT 'X') from department DEPARTMENT2 WHERE dpt_avg_salary in (select min(dpt_avg_salary) from department (DEPARTMENT3) AND dpt_id = EMPLOYEE.emp_dept)
LECCO Technology The SQL Problem Page 4
Actually, no one really knows exactly because there are too many ways to do it. According to the LECCO SQL Expert Optimizer, there are 71 different semantically equivalent ways to write this statement in Oracle 8, without Oracle optimizer Hints applied. With several of the Oracle optimizer Hints applied, LECCO SQL Expert generated 520 semantically equivalent statements. The fastest alternative ran in a 3.3 seconds compared to original one, which ran in 8.15 seconds – an improvement factor of 60 percent. The problem with SQL optimization is that there are so many semantically equivalent ways to write even a single, moderately complex SQL statement, that no programmer can possibly know how to write it every possible way. Even if the programmer did know how to write this statement in every possible way, he would be hard pressed to find the time to do it. Relational database technology is still evolving. The core of the relational database is the optimizer. As the RDBMS grows in functionality, it will also grow in complexity. As the RDBMS optimizer grows in sophistication, it will grow in complexity. The increase in complexity and sophistication of RDBMS optimizers will, accordingly, increase the complexity and sophistication of SQL syntax itself. Thus, because of the continuous development of RDBMS technology, SQL performance problems will continue to grow and will require an ever-increasing need for SQL training and performance optimization tools. This development comes at time when application performance and efficiency is more important than ever as the basis of most strategic applications, such as e-commerce, Enterprise Resource Planning (ERP), Customer Relationship Management (CRM) and data warehouses.
The SQL Problem
The Right Way to Optimize SQL
There are, basically, three steps in the process of optimizing SQL: • • • Identify the SQL statement causing the performance problem Rewrite the SQL statement Test the alternative statements
LECCO SQL Expert reduces the man-hours dedicated to this task by automating this process. For example, the time it takes to identify, rewrite and test can be reduced from hours to minutes, or from days to hours by using the LECCO SQL Expert Scanner and Optimizer. In a case study, a senior developer at a large custom database application development company in Plano, Texas used LECCO SQL Expert to identify, rewrite and test a problematic SQL statement and improved it’s performance from fifteen minutes to just three seconds--and this process took him only ten minutes! When asked how long it would have taken him without LECCO SQL Expert he replied, “At least a day and half.” Imagine this scenario with a key customer service application that is used many, many times a day. Today, DBAs are often confronted with optimizing SQL statements written by SQL developers. A more effective approach would be for developers, who write the original SQL statements and have detailed knowledge of the application and the data; to optimize their SQL statements at the time they write them. Currently, developers focus mainly on how to code a SQL statement to retrieve the correct set of data from the database. Performance issues are often not identified until after the developer has done the coding, either in Integration testing, QA or worst case, at run time. With LECCO SQL Expert it is possible for developers to optimize SQL achieving the best performing SQL from the very start. LECCO SQL Expert’s powerful SQL Rewriter, Scanner, In-process Training, and ease of use make this almost effortless. As a result, not only is there an increase in the quality and speed of the program but the DBA is relieved from reactive optimization.
The SQL Problem
Identify Before a problematic SQL statement can be tuned, it must be identified. Slow running SQL statements are usually identified in a production environment by user complaints or problem reports. This is a reactive approach to optimization. LECCO SQL Expert uses a truly proactive approach to SQL optimization. It can scan all objects in the database that contain SQL, analyze them and indicate which SQL statements can or should be optimized. So, for the first time, developers and DBAs can analyze every SQL statement in their database applications and identify potential SQL performance problems, proactively, before going into production or Quality Assurance (QA.)
Rewrite Any book, manual or class on database tuning will emphasize that the first step is always to rewrite the SQL statement. Rewriting a SQL statement can result in dramatic improvements in performance, often from hours to minutes, or from minutes to seconds, or from seconds to sub-seconds. How is this possible? SQL is not designed for performance. It is designed to implement relational database theory, independent of the hardware or software that is executing it. So the task of the RBDMS optimizer is a daunting one. Its task is to efficiently execute a logically complex request for data from a logically complex database structure on any kind of physical hardware and operating system it may be running on. Because the combination of physical data distribution, logical data definition, logical query, and environmental/system factors are often unique at any given moment, it is difficult for the RDBMS optimizer to always estimate correctly the most efficient access path (the execution plan). Also, because there are so many different semantically equivalent ways to write a given SQL statement, there are often a great variety of execution plans possible for a given query or application. It is this great variety of execution plans that can result in wildly disparate run-times for a set of semantically equivalent SQL statements.
The SQL Problem
The LECCO SQL Expert Optimizer can rewrite a given SQL statement as many different semantically equivalent ways as is possible to rewrite it. In other words the Optimizer can generate every possible execution plan for a given query. Because it generates every possible execution plan for a given query, it gives you the best chance of finding one that is faster than the original one. How does it do this? The LECCO SQL Expert Optimizer first rewrites a given query in every semantically equivalent way possible. It then obtains the execution plan for each new alternative SQL statement and compares it to the original query. If the new execution plan is the same as the original, it will be not displayed. Only the execution plans that are different from the original are displayed because they are the only ones that might be faster than the original. For “SQL1”, the first SQL example on page 4, LECCO SQL Expert generated 273 semantically equivalent alternatives. That would be hard enough for a human SQL developer to do, but it would be even more difficult to determine which of those 273 alternative queries had a unique execution plan. There were, actually, only nine. Note that SQL1 is not even a very complex statement. LECCO SQL Expert’s innovative “feedback searching” (artificial intelligence) engine automates the time-consuming, tedious, and difficult process of rewriting SQL statements. LECCO SQL Expert automates the first and most important step in database optimization. Test After a SQL statement is rewritten, it must be tested. There is no way to predict the run-time or the actual result set of a SQL statement. So it must always be run to see the results. It is difficult to know which SQL statement from a set of semantically equivalent SQL statements will be the fastest without actually running them.
The SQL Problem
LECCO SQL Expert provides an automated SQL testing function that enables the SQL developer to efficiently test all the alternative SQL statements generated by the Optimizer. The SQL developer can optimize and test more SQL, much more quickly; by using this automated function rather than manually running each one to completion, one at a time. LECCO SQL Expert will rank the alternative SQL statements by the Oracle Cost for each, but the cost is often not predictive of run time. There is a rule of thumb that the lower the Oracle Cost the faster the statement will run, but this is often not true. Oracle Cost is only an estimate of resource usage (CPU + Memory + I/O). High resource usage is not a predictor of short or long run time. In fact, a SQL statement with a high Oracle Cost, relative to the other semantically equivalent statements, can run much faster than one with a small Oracle Cost. LECCO SQL Expert’s Batch Run function makes the testing process highly efficient by automatically terminating any alternative SQL statement that runs longer than the original statement or the previous best. This can result in saving many hours of time that it would normally take to test and discover which SQL statement is the fastest. Report The documentation of the identification, rewrite, and testing results is often a necessary but time consuming part of the SQL developer’s or DBA’s work. LECCO SQL Expert will automatically generate reports of scanning, optimizing, and testing results. Such documentation is usually necessary to justify changes to existing SQL code. The Optimized SQL report documents all the alternative SQL statements and their corresponding execution plans and test run results. The scanned SQL report documents all the SQL statements belonging to a given database object, along with their execution plans, and any important analysis resulting from the plans.
The SQL Problem
How much time does it take to optimize SQL without LECCO SQL Expert?
A complex Decision Support System (DSS) query could take days or weeks to rewrite and optimize, and it is possible that significant improvements in run-time might never be achieved. Even a moderately complex SQL statement could take days to rewrite and test for a less skilled SQL developer. With LECCO SQL Expert, one DBA was able to rewrite, in less than five minutes, a problematic statement that would have taken him days to rewrite and improved run-time from 5.54 seconds to 0.47 seconds. His database, when scanned by the LECCO SQL Expert Scanner, identified hundreds of problematic statements. This application was a frequently run DSS query accessed 100s of times a day by key company managers. The automation of the process of identifying, rewriting, testing, and reporting can mean saving hundreds of hours of tedious work for DBAs and developers. The resulting improvement in database performance can mean saving thousands of dollars in deferred hardware upgrades, and improved productivity for database users.
The SQL Problem
Avoiding Database Optimization Pitfalls
There are many factors that affect database performance. These include database settings, indexes, memory, CPU and disk speed, database design, application design, etc. But the proper steps must be taken to achieve results and avoid waste.
1. Don’t skip the first step.
The first step in database optimization is
always to rewrite the poorly performing SQL statements. If this first step is skipped, the result can be hundreds of hours and thousands of dollars of futile, wasted effort. Some ERP and data warehouse queries optimized by LECCO SQL Expert have had run times reduced from several hours to a few minutes. If under performing hardware is your problem, the upgrading can give you great improvement. isn’t your problem, then you have wasted your money. database optimization can give incremental improvements in overall If it SQL hardware upgrades or
performance, at best. But it cannot resolve the performance problem of a particular SQL statement. Rewriting a SQL statement, on the other hand, for ERP or data warehouse applications can result in performance improvements that far surpass that of other optimization methods. Many ERP and vendor applications have generic SQL, written to be able to run on any brand of database. This a good solution for making the application multiplatform, but means that much of the SQL will probably perform poorly until it is optimized for the specific brand of database. The first step in optimizing the SQL is to rewrite it.
2. Tuning by Indexes?
Often, the vendor recommendation is for the
customer not to change SQL code. If the SQL code cannot be changed, then the only other alternative is to tune by indexes. If the SQL statements have not already been tuned, then tuning by indexes can worsen overall performance problems. Why is this?
The SQL Problem
The creation of an execution plan is influenced by the indexes on the tables referenced in the query. The more indexes there are in a database, and the more columns referenced in the query, the more options there are to consider for the access path, and the more difficult the optimization problem is for the database optimizer. Some queries may run faster because of an index, but others may run more slowly because of these same indexes. At one site in Dallas, LECCO SQL Expert improved one SQL statement’s run time from six minutes and fifty-one seconds to 0.02 seconds. How was this possible? The LECCO SQL Expert Optimizer generated a better execution plan by suppressing an index access in the query. This caused the database optimizer to create a more efficient access path.
3. The Execution Plan Analysis.
Relational databases can be very
complicated. A typical large database can have hundreds of tables, thousands of indexes and key columns, and millions or billions of rows of data. In a typical data warehouse or DSS database design, most of the tables have many relationships to many other tables. Reports and ad-hoc queries usually require data from multiple related tables and columns. In order to execute a query against the database, the database kernel must create an access path (an execution plan) to the physical data, and then attempts to find the fastest, most optimal path. The database optimizer performs this particular function. At the moment the optimizer is running, the target is not moving. However, data in a database is constantly changing, so the optimizer is often trying to create an optimal path to a moving target--the optimal path today is not the best execution plan tomorrow or next week. Thus, creating the need to continually optimize your database with LECCO SQL Expert. Before LECCO SQL Expert, the usual approach to SQL tuning was to generate an execution plan for the SQL statement and have a DBA analyze it to see if there were any potential performance problems, such as full table scans. This manual process of generating and reviewing the execution plan for an alternative statement is a good practice, but is time consuming and limited in effectiveness. It can only go so far as to reduce the possibility that a very badly coded SQL
LECCO Technology The SQL Problem Page 12
statement will go into production, but it cannot guarantee the best possible execution plan has been created.
With LECCO SQL Expert, every possible execution plan for a given query is automatically generated. This allows the SQL developer to pick the best possible execution plan for a query. If LECCO SQL Expert does not generate a faster SQL statement, then it is clear that factors other than the SQL statement are responsible for the performance problem of an application. LECCO SQL Expert also gives the DBA much of the information needed to further debug the SQL performance problem if none of the alternative statements are faster than the original by displaying every execution plan possible. A DBA can then examine each execution plan for the given query and see which indexes are being used or not used.
4. Reactive vs. Proactive Optimization.
In most organizations the
identification step of SQL optimization is done when a database performance problem has already occurred. Typically, the first indication that a problematic SQL statement needs to be identified is a complaint from a user. Most DBAs use some kind of database or SQL monitor to capture and identify SQL statements that are running too slowly in production database systems. The statistics they use for identifying problematic SQL statements are things like disk reads, buffer reads, buffer hit ratio, executions, etc. It is, however, proactive only in the sense that performance problems are being monitored and analyzed- but they are not being prevented. There can be thousands of objects in a database that contain SQL statements. Each object may contain multiple SQL statements. These objects can be programs, JAVA and HTML scripts, binary files, database views, stored procedures, the SQL cache, etc. It would be impossible for a human being to extract and analyze every SQL statement in a large database application manually within a reasonable amount of time. In fact, that is why no one ever does it.
The SQL Problem
Using the LECCO SQL Expert Scanner is the most proactive approach to identifying SQL problems before they occur. Potential bottlenecks, due to poor SQL performance, are identified by the developer in the development or quality assurance phases before the database application goes into production or can be proactively eliminated by the DBA; thereby, preventing performance degradation and user complaints.
The SQL Problem
The Best Solution to the Problem
LECCO SQL Expert offers the only solution to the SQL problem. Adopting LECCO SQL Expert in the development phase can dramatically improve the overall efficiency of database applications. LECCO SQL Expert can be adopted at any time during four separate stages: Programming and Unit Test, Stress Test and User Acceptance Test, Optimization and Maintenance.
Feasibility Study LECCO SQL Expert Professional Maintenance Design
LECCO SQL Expert Professional
Adopting LECCO SQL Expert within the Development and Deployment Cycle
Programming & Unit Test
LECCO SQL Expert Developer
Production Integration Test
LECCO SQL Expert Developer & Professional
Stress Test & User Acceptance Test
! Programming and Unit Test Stress Test and User Acceptance Test can be minimized when you use LECCO Expert to optimize, as the embedded SQL statements are already the most efficient at the time of implementation.
The SQL Problem
! Stress Test & User Acceptance Test Database changes and increases in data volume may have an impact on the SQL statements. It is best to review the database server to identify potentially problematic SQL statements, as you may need to optimize again.
! Optimization LECCO SQL Expert can be used to review SQL statements executed in production. Potentially problematic SQL statements can be identified and optimized easily.
! Maintenance Periodic SQL scanning and optimization no longer has to be a timeconsuming and expensive task, as with the aide of LECCO SQL Expert, all SQL statements can be easily reviewed.
The SQL Problem
LECCO SQL Expert solves SQL performance problems.
Ease of deployment Ease of use In-process training Benchmark Non-intrusive Proactive Multi-platform Migrations Every execution plan Automatically formats SQL Database Explorer Developers can tune Installs in minutes on Windows 95/98/NT/2000 workstation Developers and DBAs can be proficient with the product in one hour Helps SQL developers learn more about writing and Optimizing SQL Assures that the best possible SQL is being written Does not require any modifications to the server or database Identifies potential SQL performance problems before They occur Works with Oracle, MS SQL Server, and Sybase The best solution for proactive SQL optimization when Migrating to new releases of databases or applications Generates every possible execution plan for a given query LECCO SQL Expert Indenter gives SQL statements a Standard format Point and click to find details about any object in the database Enables developers to optimize SQL statements during the development phase
To find out more about LECCO Technology and LECCO SQL Expert, visit our web site at www.leccotech.com
Where the experts go for performance.
The SQL Problem