Friday, June 29, 2018

Indexing in Database

To retrieve instructor details given department name, db system would look up an index to find on which disk block the corresponding record resides, and then fetch the disk block, to get the appropriate instructor details.

Two kinds of indices

1) Ordered Indices- based on sorted ordering of the values
2)Hash Indices- based on uniform distribution of values determined by hash function across a range of buckets

Each technique evaluated based on access types, access time, insertion time, deletion time and space overhead.
An attribute or set of attributes used to look up records in a file is called search key. If there are several indices on a file, there are several search keys.
 Ordered Indices
Index structure is used to gain fast random access to records in a file.
Primary index: in a sequentially ordered file, the index whose search key specifies the sequential order of the file.
  • Also called clustering index. A clustered index is very useful when you want to retrieve many rows of data, a range of data, or when the BETWEEN clause is used in the WHERE clause.
  • The search key of a primary index is usually but not necessarily the primary key. Files with a clustering index on the search key are called index-sequential files.
Types of Ordered Indices

Dense index: An index entry appears for every search key value in the file.  In a dense clustering index, index record contains the search key value and a pointer to the first data record with that search key value. In a dense non clustering index, index must store list of pointers to all records with the same search-key value.

Sparse Index: contains index records for only some search-key values.Applicable when records are sequentially ordered on search-key. Search file sequentially starting at the record to which the index record points. Generally slower , less space and less maintenance overhead for insertions and deletions.than dense index for locating records.



Creating an index: 

It’s a good practice to only create indices for columns rather than rows, because the indices need to be updated when the table is updated as well.


 1) The table to be indexed must be in your own schema.
 2) You must have the INDEX object privilege on the table to be indexed.
 3) You must have the CREATE ANY INDEX system privilege.

CREATE INDEX the_index_name ON the_table_name (the_column_name);
create index  on (); 
create index dept_index on instructor(dept_name);
create unique index dept_index on instructor(dept_name);
dept_name is the candidate key for instructor.

To drop index: drop index indexname;
Multilevel Indices: Indices with two or more levels.Construct a sparse outer index on the original index (inner index ). index entries are always in sorted order. To search large tuples (Eg: 100,000,000), binary search is expensive and time consuming as primary index does not fit in memory. solution is treat primary index kept on disk as a sequential file and construct a sparse index on it. It is closely related to tree structures such as binary trees used for in-memory indexing. On insertion and deletion, indexes must be updated.
  • outer index –a sparse index of primary index
  • inner index –the primary index file
Secondary index: An index whose search key specifies an order different from the sequential order of the file. Also called non-clustering index. it must be dense. A secondary index must contain pointers to all records. If a person wants to find all the records whose values in a particular field satisfy some condition. For example: Employee relation stored sequentially by EName (non search key), find all employees in CS department.
A sequential scan in clustering index order is efficient because records in the file are stored physically in the same order as the index order.

Sequential scan using primary index is efficient, but a sequential scan using a secondary index is expensive.

SQL Server unique indexes vs. unique constraints:

  • A unique index ensures that the values in the index key columns are unique.
  • A unique constraint also guarantees that no duplicate values can be inserted into the column(s) on which the constraint is created. When a unique constraint is created a corresponding unique index is automatically created on the column(s).
Table without a clustered index. It’s an assorted collection of objects. It has multiple extents with different page data. Generally there is no functional difference between a unique index and a unique constraint. The latter is also listed as a constraint, however this is only a method to emphasize the purpose of the index. There is no difference for the query optimizer whether the index is created as a unique index or a unique constraint, therefore there is no performance difference. However there are some differences for creation where some index creation options are not available for unique constraints.

Clustered index- The data rows are stored in order based on clustered index key. It is implemented as a B-tree index structure. It have only one row in sys.partitions, with index_id=1. Data pages in the leaf level are linked in a doubly linked list.


Non Clustered index- It is implemented as a B-tree index structure. It do not affect the order of data rows. Each index row contains the non-clustered key value, row locator (row id) and any included or non-key columns.
Indexing tips
Tip 1
Unused Indexes: Drop unused indexes
Tip 2: Missing index
Tip 3: Duplicate indexes: Drop duplicate index

No comments:

Post a Comment

GEN AI

  Stay Tuned....