Statistics Used by the Query Optimizer in Microsoft SQL Server 2005

The SQL Server query optimiser uses statistical information (stats) about the distribution of data within columns and indexes in order to determine the best execution plan to use. If this information is unavailable or unusable for any reason then a poor plan is more likely to be chosen. It can not be stressed enough how important good stats are to achieving good performance. The Statistics Used by the Query Optimizer in Microsoft SQL Server 2005 article on Microsoft TechNet provides some very good coverage of stats in SQL Server 2005. Here are a few points of note from the article;

  • New in SQL 2005; string summary stats, asynchronous auto-update, more intelligent stats updating.
  • The automatic creation of stats is on by default, asynchronous update of stats is off by default.
  • Automatic creation of stats always uses the default sample rate. Specifying the full sample rate for a manual creation will cause the full rate to be used for manual updates with RESAMPLE specified, even if the table originally had 10 rows and now has 10,000,000.
  • Auto creation of stats ONLY CREATES SINGLE-COLUMN STATS. Consider manually creating multi-column stats to support multi-column predicates.
  • Be aware that auto creation of stats CAN LEGITIMATELY fail - watch out for a workload pattern that causes particular auto stats to regularly fail.
  • Default sample rate assumes a random ordering of the data within the column.
  • Temp tables DO have stats, table variables and multi-statement table-valued functions DO NOT.
  • "If you use a local variable in a query predicate instead of a parameter or literal, the optimizer resorts to a reduced-quality estimate, or a guess for selectivity of the predicate. Use parameters or literals in the query instead of local variables, and the optimizer typically will be able to pick a better query plan.". Also try to minimise complexity in expressions; aim for <A> <op> <B> where <A> and <B> are (as opposed to result in) scalars.
  • Parameter sniffing uses the value of parameters as passed into an SP - do not modify the parameter inside the SP.

Note: Missings stats info is written to the default trace - there's no need to do anything to capture this info as it's already being captured. See my page on Using the Default Trace for details of how you can access this info.

A couple of things that you might not know about stats are that Statistics are not transactional, and that sp_updatestats does not do indexed view stats.

Some useful BOL links;

SQL Server 2000 Statistics Notes

Some work done quite a while ago on stats for SQL Server 2000 produced these links which are still worthwhile.

Nice microsoft.public.sqlserver.server discussion around UPDATE STATISTICS and sample/fullscan.

MS KB FIX: Cardinality Estimates Are Too Low When Value Is Outside Histogram

Another microsoft.public.sqlserver.server discussion around stats with a nicely detailed response from MS at the end.

| Copyright 2007-2020 RMJCS