Tuesday, June 16, 2009

Tips for Selecting Non-Clustered Indexes

Selecting non-clustered indexes is somewhat easier than clustered indexes because you can created as many as is appropriate for your table. Here are some tips for selecting which columns in your tables might be helped by adding non-clustered indexes.

  • Non-clustered indexes are best for queries that return few rows (including just one row) and where the index has good selectivity (above 95%).
  • If a column in a table is not at least 95% unique, then most likely the SQL Server Query Optimizer will not use a non-clustered index based on that column. Because of this, don't add non-clustered indexes to columns that aren't at least 95% unique. For example, a column with "yes" or "no" as the data won't be at least 95% unique.
  • Keep the "width" of your indexes as narrow as possible, especially when creating composite (multi-column) indexes. This reduces the size of the index and reduces the number of reads required to read the index, boosting performance.
  • If possible, try to create indexes on columns that have integer values instead of characters. Integer values have less overhead than character values.
  • If you know that your application will be performing the same query over and over on the same table, consider creating a covering index on the table. A covering index includes all of the columns referenced in the query. Because of this, the index contains the data you are looking for and SQL Server doesn't have to look up the actual data in the table, reducing logical and/or physical I/O. On the other hand, if the index gets too big (too many columns), this can increase I/O and degrade performance.
  • An index is only useful to a query if the WHERE clause of the query matches the column(s) that are leftmost in the index. So if you create a composite index, such as "City, State", then a query such as "WHERE City = 'Houston'" will use the index, but the query "WHERE STATE = 'TX'" will not use the index.

Generally, if a table needs only one index, make it a clustered index. If a table needs more than one index, then you have no choice but to use non-clustered indexes. By following the above recommendations, you will be well on your way to selecting the optimum indexes for your tables.

SELECT COUNT(*) performance

If you have ever performed a SELECT COUNT(*) on a very large table, you know how long it can take. For example, when I ran the following command on a large table I manage:


SELECT COUNT(*) from


It took 1:09 to count 10,725,948 rows in the table. At the same time, SQL Server had to perform a lot of logical and physical I/O in order to perform the count, chewing up important SQL Server resources.


A much faster, and more efficient, way of counting rows in a table is to run the following query:


SELECT rows

FROM sysindexes

WHERE id = OBJECT_ID('') AND indid <>


When I run the query against the same table, it takes less than a second to run, and it gave me the same results. Not a bad improvement and it took virtually no I/O. This is because the row count of your tables is stored in the sysindexes system table of your database. So instead of counting rows when you need to, just look up the row count in the sysindexes table.


There is one potential downside to using the sysindexes table. And that this system table is not updated in real time, so it might underestimate the number of rows you actually have. Assuming you have the database option turned on to "Auto Create Statistics" and "Auto Update Statistics," the value you get should be very close to being correct, if not correct. If you can live with a very close estimate, then this is the best way to count rows in your tables.