OK, I'm definitely not the first person in the world to figure this out, but it bit me in the behind last week. So I figured I would share this bit of wisdom with you so it doesn't happen to you.
Most of the data that's stored in this MSDE (SQL Server) database lives in an "engineering variable data" table. In this parallel universe, the term "variable" doesn't mean quite what we programmers think it should mean, so I'll call it a "datum" A "datum" can have a value, a specification (that determines how it is used in a calculation), upper and lower limits, residuals, bound type, and other properties of interest to engineers.
I experienced a dreadful performance problem in the production system. Out of necessity, the program would sometimes have to delete 10,000 records from the VariableData table and insert a new copy of those records. In the development system with a much smaller database, this worked as intended. However, when we started getting upwards of a million records in the table, this data replacement step would bring the program to its knees for 30-120 seconds. Sometimes the queries would time out, gobble up RAM like candy on Halloween, or both. I tried various hacks and workarounds to improve performance, including using a bulk insert, but to no avail.
We finally took a look at the indexes:
IX_SolutionCaseID: key to an integer in SolutionCase tableIX_Specification: key to a varchar(7), whose value can be one of 7 possibilitiesIX_Guid: unique key for a uniqueidentifierIX_VariableID: key to an integer in Variable tableIX_VariableVariableID: a redundant key
I realized that the way I get data out of the database is by SolutionCaseID and VariableID, which uniquely identifies a datum. So the queries I was currently running were using one of the existing indexes and then doing a table scan to find the record I actually wanted. Ick. So I added a new two-field key:
IX_SolutionCaseIDVariableID: added key for both SolutionCaseID and VariableID
Adding this index to a populated table only took a few seconds and it cut the wait time in half!
Next I started hacking away indexes that I never use.
IX_VariableVariableID: deleted - this had no noticeable effect on performance.IX_Specification : deleted - this improved performance. This is analogous to have an index on sex "M" or "F", which is always a bad idea.IX_Guid: deleted - I never look up data by Guid, this is for UI magic only. This had a huge benefit on performance
So the conclusions can be:
And if you didn't know this already, you can always force an index to be used in a select query (use at your own discretion):
SELECT Field1, Field2, Field3 FROM TableName WITH (INDEX = MyIndexName) WHERE Field4 = 42
And now my client is much happier, and there was much rejoicing!
Remember Me
Disclaimer The opinions expressed herein are my own personal opinions and do not represent my employer's view in anyway.