Friday, September 4, 2009

SQL Query Optimization

If you are a Dynamics GP developer in the US, I'm sure you're looking forward to the upcoming holiday weekend so that you can sit down at your computer and relax by reading articles on the SQL Server Performance web site.

No??? So it IS only just me??? (blush)

I discovered the web site several years ago when I was working on a pretty challenging project. My client was receiving detailed billing data from a vendor and had asked me to develop a routine to validate the transactions against data in their database before importing the voucher into Dynamics GP. The challenge was that the vendor was sending over 250,000 records each week, and the single table in the client's operational system database was 2 terabytes.

So there I was, trying to do a JOIN of 250,000 records against a 2TB table. Not pretty. My initial query would just run, and run, and run, while hogging CPU and memory on the server. That's when I stumbled on the SQL Server Performance web site. I optimized my JOINs, I added more WHERE clauses, tweaked my indexes, and I learned all about "non-sargable" WHERE clauses. My eternal query eventually took less than 10 seconds to run, making me quite proud. ;-)

Since then, I've always been aware that whenever I write a complex SQL query, there is usually a different, and likely better, more efficient way to write it. If the query is only run once a week and it takes 10 seconds, that may be adequate and no further optimization may be necessary. But if the query is run 50,000 times a day, even a sub-second query may benefit from further optimization.

One of the biggest learning curves I had was how to analyze the performance of my queries and determine which part of the query was inefficient. I found a book on SQL Server query optimization that taught me the "Old School" method, using the SET SHOWPLAN_TEXT ON statement. When you run this statement, and then run your query, SQL Server will display the execution plan instead of your query results.

For example, these statements:

SET SHOWPLAN_TEXT ON
SELECT COUNT(*) FROM PM00200 WHERE VENDORID LIKE '%ACE%'

Would display this result:

|--Compute Scalar(DEFINE:([Expr1004]=CONVERT_IMPLICIT(int,[Expr1005],0)))
|--Stream Aggregate(DEFINE:([Expr1005]=Count(*)))
|--Index Scan(OBJECT:([TWO].[dbo].[PM00200].[AK6PM00200]), WHERE:([TWO].[dbo].[PM00200].[VENDORID] like '%ACE%'))

(Just make sure to run SET SHOWPLAN_TEXT OFF when you are done.)

I eventually got past the arcane presentation of the text plan information, learned how to review the operations being performed, and found that I pretty much needed to look at the inner-most (most indented) operation to see where I had opportunities to optimize my query.

On the other hand, most people use the more modern and convenient "Display Estimated Execution Plan" feature in Query Analyzer and SQL Server Management Studio. This feature displays a graphical representation of the query plan and includes much more information and statistics on each operation. If you highlight a query in Management Studio, you can use the CTRL+L shortcut to show the execution plan.



And if you hover your mouse over one of the steps, you will get more detailed information about the operation.




It isn't always obvious or easy identifying which operations can benefit from optimization, but an easy way is to start with the step that has the highest cost and start researching to see if there is a different way to perform the query step.

I learned that query optimization was much more of an art than a science, requiring situation-specific analysis, and consideration of the environment, database, and tables that are involved with the query. But based on my experience, my general rule is that if you have a query that is taking longer than 10 seconds to execute, it is likely that there are opportunities for improving its performance.

1 comment:

Unknown said...

I have been thinking a lot about this subject recently we have nearly 2 million prices in SOP10108 that means you have to design your queries carefully around it.
I agree that anyone working with Dynamics GP could benefit from a holiday weekend reading up on this subject.
We don't have any bank holidays until Christmas having just had one last week, so a long wait....