Tuesday, July 13, 2021

Difference between Clustered and Non-Clustered Indexes in Database? SQL Question Answer

Clustered vs Nonclustered Indexes in SQL
The difference between Clustered and Nonclustered index in a relational database is one of the most popular SQL interview questions almost as popular as the primary key vs unique key, the difference between truncate and delete, and correlated vs noncorrelated subqueries. Indexes are a very important concept, it makes your queries run fast and if you compare a SELECT query which uses an indexed column to one who doesn't you will see a big difference in performance. 

There can be two kinds of indexes in relational databases Clustered and Nonclustered indexes. A clustered index determines the physical sorting order of rows in a table similar to entries on yellow pages which are sorted in alphabetical order.

Suppose you have a table Employee, which contains emp_id as primary key then a clustered index which is created on a primary key will sort the Employee table as per emp_id. That was a brief introduction of What is clustered index in SQL.

On another hand, the Non-Clustered index involves one extra step which points to the physical location of the record. In this SQL Interview question, we will see some more differences between clustered and nonclustered indexes in point format.

Btw, if you are very new to SQL and don't even know what is an index, what is the real use of an index in a table, and how to create and drop an index then you should first go through these free SQL courses to start with. It's one of the best resources to learn SQL fundamentals in a quick time.

Most of the things we discuss here will make more sense if you have a basic understanding of what is n index and how they work.



Difference between Clustered and Nonclustered Indexes in SQL

In the last section, we have seen What is Clustered and Nonclustered indexes and How they resolve the physical location of the record. Let's see some more differences between them to answer this SQL interview question :

1.Physical Sorting
 A Clustered Index physically sorts all rows while Nonclustered Index doesn't.

2. Count
In SQL, one table can only have one Clustered Index but there is no such restriction on the NonClustered Index.

3. Primary Key
In many relational databases, the Clustered Index is automatically created on the primary key column.

4. Performance
Clustered Index can improve the performance of data retrieval while a non-clustered index should be created on columns that are used in the join, where, and order by clause. You can further see the SQL Performance Explained by Markus Winand to learn more about how to index work in different databases like MySQL, SQL Server, and Oracle.

Anyway, here are some key differences between a clustered index and a non-clustered index in the tabular format:

What is difference between Clustered and Non-Clustered Indexes in SQL



That's all on the difference between Clustered vs Nonclustered Index in SQL. An index is very critical to query performance but too many indexes can actually harm the performance of INSERT and UPDATE queries because these require rearrangement of indexes which can be time-consuming based on the size of the table or number of records on the table.

Always create Indexes on columns that are frequently used in a SELECT query because there is a huge performance difference between an indexed and non-indexed column in SQL queries, though you should be careful with creating too many indexes because they will make the update and delete statement slow.

An index is maintained in separate places and costs to update the index tree are directly proportional to the number of indexes you have. If you have indexed in all columns of the database then obviously, you have to re-arrange most of those index trees whenever you update or delete a row.


Other database and SQL articles you may like
What are the ACID properties of database transactions?
10 frequently used MySQL commands for beginners
Top 15 Linux command interview questions with answers


P. S. - If you are looking for online training/courses to learn SQL from scratch, I suggest you join these best SQL and Database courses. It's one of the best resources to learn SQL fundamentals like join, subquery, aggregate functions, window functions, grouping data, advanced filtering, and SQL query optimization.

7 comments:

  1. One of the key thing to remember about Clustered Index is that it organize data in cluster, which helps to quickly scan and retrieve data in a range. also since clustered index decides physical order of data, there can be only one clustered index in a table. All other non clustered index, uses a pointer, which points to clustered index. That's why in index tree, in case of clustered index, leaf contains data, but in case of nonclustered index, it contain pointer to data.

    ReplyDelete
  2. Always remember indexes in SQL are double edged sword, though index improves SELECT query performance (retrieval), it affects insert and update query performance, because index needs to be updated, rearranged every time.

    ReplyDelete
  3. NONCluster index create separate physical structure for searching

    ReplyDelete
  4. A table with a clustered index is called clustered table. Its rows are stored in a B-Tree structure sorted.
    Advantage: The query will run much faster than if the rows were being stored in some random order on the disk
    A non-clustered index requires separate storage than the table storage to store the index information.
    A table without any clustered indexes is called non-clustered table. Its rows are stored in heap structure unsorted. Both are very useful in sql.

    ReplyDelete
    Replies
    1. A table without a clustered index is called a heap table.

      Delete
  5. it should always remove unused indexes that affect insert , update , delete

    ReplyDelete
  6. Nice article, but I would have at least hinted at storage locations. If the database has multiple file groups, with files on multiple disks (or SAN locations), then it is always a best practice for performance to create your non-clustered indexes on a different file group from the clustered index. Doing this will allow for parallel I/O operations to occur.

    ReplyDelete

Feel free to comment, ask questions if you have any doubt.