INDEX
OVERVIEW |
See Also
- SQL Server Books Online - Indexes
DEFINITION |
|
An index
is a structure associated with a table that speeds retrieval
of the rows in the table. An index contains keys built from
one or more columns in the table. These keys are stored in a
structure that allows SQL Server to find the row or rows associated
with the key values quickly and efficiently. |
|
|
|
USING
CLUSTERED INDEXES |
|
A clustered
index determines the physical order of data in a table. A clustered
index is analogous to a telephone directory, which arranges
data by last name. Because the clustered index dictates the
physical storage order of the data in the table, a table can
contain only one clustered index. However, the index can comprise
multiple columns (a composite index), like the way a telephone
directory is organized by last name and first name.
A clustered index is particularly efficient on columns that
are often searched for ranges of values. After the row with
the first value is found using the clustered index, rows with
subsequent indexed values are guaranteed to be physically adjacent.
For example, if an application frequently executes a query to
retrieve records between a range of dates, a clustered index
can quickly locate the row containing the beginning date, and
then retrieve all adjacent rows in the table until the last
date is reached. This can help increase the performance of this
type of query. Also, if there is a column(s) that is used frequently
to sort the data retrieved from a table, it can be advantageous
to cluster (physically sort) the table on that column(s) to
save the cost of a sort each time the column(s) is queried.
Clustered indexes are also efficient for finding a specific
row when the indexed value is unique. For example, the fastest
way to find a particular employee using the unique employee
ID column emp_id is to create a clustered index or PRIMARY KEY
constraint on the emp_id column. |
|
|
Note:
PRIMARY KEY constraints create clustered indexes automatically
if no clustered index already exists on the table and a nonclustered
index is not specified when you create the PRIMARY KEY constraint.
Alternatively, a clustered index could be created on lname,
fname (last name, first name), because employee records are
often grouped and queried in this way rather than by employee
ID |
|
|
|
|
Considerations |
|
|
It is important
to define the clustered index key with as few columns as possible.
If a large clustered index key is defined, any nonclustered
indexes that are defined on the same table will be significantly
larger because the nonclustered index entries contain the
clustering key.
Before creating clustered indexes, understand how your data
will be accessed.
Use a clustered index for:
Columns that contain a large number of distinct values.
Queries that return a range of values using operators
such as BETWEEN, >, >=, <, and <=
Columns that are accessed sequentially.
Queries that return large result sets.
Columns that are frequently accessed by queries involving
join or GROUP BY clauses; typically these are foreign key
columns. An index on the column(s) specified in the ORDER
BY or GROUP BY clause eliminates the need for SQL Server to
sort the data because the rows are already sorted. This improves
query performance.
OLTP-type applications where very fast single row lookup is
required, typically by means of the primary key. Create a
clustered index on the primary key.
Clustered indexes are not a good choice for:
Columns that undergo frequent changes - This results in the
entire row moving (because SQL Server must keep the data values
of a row in physical order). This is an important consideration
in high-volume transaction processing systems where data tends
to be volatile.
·
Wide keys - The key values from the clustered index are used
by all nonclustered indexes as lookup keys and therefore are
stored in each nonclustered index leaf entry.
|
|
|
|
USING
NONCLUSTERED INDEXES |
|
A nonclustered
index is analogous to an index in a textbook. The data is stored
in one place, the index in another, with pointers to the storage
location of the data. The items in the index are stored in the
order of the index key values, but the information in the table
is stored in a different order (which can be dictated by a clustered
index). If no clustered index is created on the table, the rows
are not guaranteed to be in any particular order.
Similar to the way you use an index in a book, Microsoft®
SQL Server 2000 searches for a data value by searching
the nonclustered index to find the location of the data value
in the table and then retrieves the data directly from that
location. This makes nonclustered indexes the optimal choice
for exact match queries because the index contains entries describing
the exact location in the table of the data values being searched
for in the queries.
If the underlying table is sorted using a clustered index, the
location is the clustering key value; otherwise, the location
is the row ID (RID) comprised of the file number, page number,
and slot number of the row. For example, to search for an employee
ID (emp_id) in a table that has a nonclustered index on the
emp_id column, SQL Server looks through the index to find an
entry that lists the exact page and row in the table where the
matching emp_id can be found, and then goes directly to that
page and row. |
|
|
|
|
Multiple
Nonclustered Indexes |
|
|
Some
books contain multiple indexes. For example, a gardening book
can contain one index for the common names of plants and another
index for the scientific names because these are the two most
common ways in which the readers find information. The same
is true for nonclustered indexes. You can define a nonclustered
index for each of the columns commonly used to find the data
in the table. |
|
|
|
|
Considerations |
|
|
Before
you create nonclustered indexes, understand how your data will
be accessed.
Consider using nonclustered indexes for:
Columns that contain a large number of distinct values, such
as a combination of last name and first name (if a clustered
index is used for other columns). If there are very few distinct
values, such as only 1 and 0, most queries will not use the
index because a table scan is usually more efficient.
Queries that do not return large result sets.
Columns frequently involved in search conditions of a query
(WHERE clause) that return exact matches.
Decision-support-system applications for which joins and grouping
are frequently required. Create multiple nonclustered indexes
on columns involved in join and grouping operations, and a clustered
index on any foreign key columns.
Covering all columns from one table in a given query. This eliminates
accessing the table or clustered index altogether. |
|
|