Tuesday, March 13, 2018

Beware of MIN, MAX, and TOP in Dynamics GP SQL queries!

By Steve Endow

A few weeks ago I started some research to compare the performance of MAX vs. TOP(1) in SQL Server queries.

After finding some unexpected results, I created a second video showing some odd behavior of MAX and TOP on one particular Dynamics GP table.  At that time, I couldn't figure out what was causing the performance issue with the MAX function.

Well, thanks to some very generous help and amazing insight from Kendra Little, I finally have a definitive explanation for the performance issue.

After looking at the IV30500 table and sample queries using MAX and TOP 1, Kendra quickly noticed that the table had ANSI NULLs turned off.  I explained some history of Dynamics GP and it's older database design quirks, and she pondered the performance issue further.

The next morning, she had found the issue.  She sent me this query to check the ANSI_PADDING settings on the char fields in the IV30500 table.

select OBJECT_NAME(object_id) AS TableName, name AS ColumnName, is_ansi_padded, * from sys.columns
where object_id = OBJECT_ID('IV30500') AND system_type_id = 175

The query shows that is_ansi_padded = 0 for IV30500.  But if I run the same query on the IV30500REBUILD table, which was created by SQL Server when I exported data out of IV30500, I get different results.

select OBJECT_NAME(object_id) AS TableName, name AS ColumnName, is_ansi_padded, * from sys.columns
where object_id = OBJECT_ID('IV30500REBUILD') AND system_type_id = 175

So why does this matter?

Well, to understand that, you have to learn a bit more about the ANSI_PADDING setting in SQL Server and how that affects SQL queries.  I still don't fully understand the details, but here are some references in case you want to learn more:




The key sentence from the third link is:
When ANSI_PADDING set to OFF, queries that involve MIN, MAX, or TOP on character columns might be slower than in SQL Server 2000.

Apparently when ANSI PADDING is off for a field, when it needs to evaluate the field for a query, it must sometimes pad spaces on the end of every single row value before it can perform a comparison.  As a result it must scan every row in the table, or in an index, before it can fulfill the query.

This is what I was seeing with the MAX function on the TRXSORCE field of IV30500.  Every single record was being returned by the Index Scan.

The TOP 1 operator apparently does not have to perform this same padding operation, so it is able to simply retrieve one row from the index and return it.

This is a pretty big deal.

While I'm guessing that MIN and MAX aren't widely used in Dynamics GP queries, there are certainly some situations where they would be useful.  If they are used on a table with a large number of rows, the performance hit will be significant.

In some cases, using TOP 1 may help, but as the key sentence above states, even the TOP operator may trigger the same performance issue in some queries.

If you are using MIN, MAX, or TOP and see an Index Scan in your execution plan that returns every row in the table, that may be a sign that you are encountering this issue.

In the video above, I show one way to modify a specific field to turn ANSI_PADDING on, but I don't know that I would recommend this in a production Dynamics GP environment.  It may work fine, but you'll have to be careful to perform the update after every Dynamics GP release or service pack, as any tables re-scripted by Dynamics GP will likely revert back to having ANSI_PADDING off.

And that, mercifully, finally solves the riddle of the poor performance of MAX in Dynamics GP.

Steve Endow is a Microsoft MVP in Los Angeles.  He is the owner of Precipio Services, which provides Dynamics GP integrations, customizations, and automation solutions.

You can also find him on Twitter, YouTube, and Google+

No comments: