Hello guys, the difference between UNION and UNION ALL is one of the most popular SQL interview questions, and it is often asked of programmers during a telephonic round of interviews. Though both UNION and UNION ALL are used to combine the results of two SELECT queries, the main difference between them is that UNION doesn't include duplicate records but UNION ALL does.
Another difference between them is that UNION ALL is faster than UNION, but it may look slower because it returns more data, which takes more time to travel over the network. The difference between UNION and UNION ALL can be a tricky SQL question, especially for developers who have never used these useful keywords.
Since the UNION clause is not as common as the SELECT clause in SQL, it's usually asked in a telephonic round of programming interviews to check whether the candidate is comfortable with SQL or not. It's in the same league as questions like clustered vs non-clustered index or primary vs unique key. UNION is different from most other SQL commands because it works on the rows returned by the queries, matching columns by position rather than by name.
Anyway, the answer to this question is simple: though both UNION and UNION ALL are used to combine the results of two separate SQL queries on the same or different tables, UNION does not keep duplicate records (a row is considered a duplicate if the values of all its columns are the same), while UNION ALL does.
Since you mostly don't want duplicate rows, UNION is preferred over UNION ALL in reporting and application development. By the way, you should keep in mind that UNION ALL performs better than UNION because it doesn't have to remove duplicates, so there is no extra work.
Both keywords are very well supported by all major databases like Oracle, Microsoft SQL Server, MySQL, and PostgreSQL. Another thing to keep in mind is the amount of data returned by UNION ALL; if your database server is quite far away and you have limited bandwidth, UNION ALL may appear slower than UNION because of the number of duplicates it returns.
The cost of transferring duplicate rows can exceed the query execution benefits in many cases. We will see a couple of examples of UNION and UNION ALL in SQL, but before that, a few things to keep in mind. In order to combine the results of two queries, they must contain the same number of columns.
For example, if one query contains 3 columns and the other contains 4 columns, then you cannot use UNION or UNION ALL. Rows are compared column by column, by position, and a row is considered a duplicate only when all of its columns have the same values, irrespective of the names of the columns themselves.
By the way, if you are just starting with SQL and are not familiar with essential SQL concepts and commands, then I highly recommend you join an online SQL course like SQL for Newbs: Data Analysis for Beginners on Udemy to start with. It is one of the highest-rated beginner SQL courses on Udemy and you can get it for just $10 on sale.
UNION and UNION ALL Example in Microsoft SQL Server
Let's see one simple example of UNION and UNION ALL. It will show you not only how they work but also where you can use them. This example is from my sample database and the following screenshot is from SQL Server Management Studio.
We have two tables, Employee and Customer. To demonstrate UNION and UNION ALL, I have kept the same persons as both employee and customer, so you will see the same id in emp_id and customer_id, and the same name as well.
If you look at the results of the first two SELECT queries, you will see that the first query returns two rows and the second query returns three rows, of which two have exactly the same data as the rows from the first query.
The key thing to note is that the column names are different in the two result sets: the first has emp_id and emp_name, while the second has customer_id and customer_name. Most importantly, both result sets have exactly two columns. This is a must in order to combine them using the UNION and UNION ALL keywords.
The third query is an example of how to use the UNION clause in SQL. You can see that the combined result has just three rows, all of them unique. Duplicate rows from the second result set were not included. This is like a union in set theory, where the final result contains the data from both sets with each element appearing only once.
The fourth query shows how to use UNION ALL. Its result contains five rows, two from the first query and three from the second query. It has not removed the duplicate rows from the second query, which is why you see Ken and Bob appearing twice.
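Since the screenshot may not be visible, here is a minimal sketch of the same four queries. It uses Python's built-in sqlite3 module with an in-memory SQLite database instead of SQL Server, which is an assumption made only so that the example runs anywhere. The ids and the third customer name (Amy) are also made up for illustration; only Ken and Bob come from the example above.

```python
import sqlite3

# In-memory SQLite database standing in for the SQL Server sample database.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Employee (emp_id INTEGER, emp_name TEXT);
    CREATE TABLE Customer (customer_id INTEGER, customer_name TEXT);
    INSERT INTO Employee VALUES (1, 'Ken'), (2, 'Bob');
    -- 'Amy' is a hypothetical third customer.
    INSERT INTO Customer VALUES (1, 'Ken'), (2, 'Bob'), (3, 'Amy');
""")

# Queries 1 and 2: the individual result sets.
employee_rows = conn.execute("SELECT emp_id, emp_name FROM Employee").fetchall()
customer_rows = conn.execute(
    "SELECT customer_id, customer_name FROM Customer"
).fetchall()

# Query 3: UNION removes the rows that appear in both result sets.
union_cursor = conn.execute("""
    SELECT emp_id, emp_name FROM Employee
    UNION
    SELECT customer_id, customer_name FROM Customer
    ORDER BY 1
""")
union_rows = union_cursor.fetchall()
# The combined result takes its column names from the first query.
union_columns = [column[0] for column in union_cursor.description]

# Query 4: UNION ALL keeps every row, duplicates included.
union_all_rows = conn.execute("""
    SELECT emp_id, emp_name FROM Employee
    UNION ALL
    SELECT customer_id, customer_name FROM Customer
    ORDER BY 1
""").fetchall()

print(len(employee_rows), len(customer_rows))  # 2 3
print(union_columns)        # ['emp_id', 'emp_name']
print(union_rows)           # [(1, 'Ken'), (2, 'Bob'), (3, 'Amy')]
print(len(union_all_rows))  # 5
```

The ORDER BY is there only to make the output order predictable; neither UNION nor UNION ALL guarantees an order on its own.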
This example teaches us the core concept that UNION doesn't depend on the column names but on the data. You can combine the results of as many queries as you like, as long as the number of columns in all of them is the same and the data is from the same set.
Regarding performance, you need to run UNION and UNION ALL against a large database containing millions of rows. There you can monitor how much time each one takes and compare them.
Theoretically, UNION ALL should take less time to execute but more time to transfer data to the client. By the way, if you are new to SQL Server then I recommend you go through an introductory course like Microsoft SQL for Beginners on Udemy. It's a great course to start learning SQL using MSSQL and its tools.
Difference between UNION and UNION ALL command in SQL
Now that we know how UNION and UNION ALL work and have some background from the above examples, let's summarise the similarities and differences between them for quick revision:
1. Combining Results
Both UNION and UNION ALL are used to combine the results of two separate SQL queries. The queries can be on the same table or on different tables, but the data should be of the same kind. For example, if product_id is used in two tables like Product and Order, then two SQL queries which pull product_id from these two tables can be combined using UNION or UNION ALL.
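As a rough sketch of the product_id case, the snippet below combines one query per table. It assumes Python's built-in sqlite3 module with an in-memory database, and the columns and rows are hypothetical. Note that Order is a reserved word in SQL, so the table name is quoted here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Product (product_id INTEGER, name TEXT);
    -- "Order" must be quoted because ORDER is a reserved word.
    CREATE TABLE "Order" (order_id INTEGER, product_id INTEGER);
    INSERT INTO Product VALUES (10, 'Keyboard'), (20, 'Mouse');
    INSERT INTO "Order" VALUES (1, 10), (2, 10), (3, 30);
""")

# One SELECT per table, each returning a single product_id column.
union_ids = [row[0] for row in conn.execute("""
    SELECT product_id FROM Product
    UNION
    SELECT product_id FROM "Order"
    ORDER BY product_id
""")]

union_all_ids = [row[0] for row in conn.execute("""
    SELECT product_id FROM Product
    UNION ALL
    SELECT product_id FROM "Order"
    ORDER BY product_id
""")]

print(union_ids)      # [10, 20, 30]
print(union_all_ids)  # [10, 10, 10, 20, 30]
```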
2. Duplicates
The key difference between UNION and UNION ALL is that the former removes duplicates but the latter keeps them. In other words, UNION is equal to running DISTINCT on the output of UNION ALL. For example, if product_id 10 is returned by both SQL queries then it will appear only once if you use UNION and twice if you use UNION ALL.
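The DISTINCT equivalence is easy to check for yourself. Here is a small sketch, again assuming Python's sqlite3 module and two hypothetical single-column tables (first_query and second_query) that both contain product_id 10.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical tables; product_id 10 appears in both.
    CREATE TABLE first_query (product_id INTEGER);
    CREATE TABLE second_query (product_id INTEGER);
    INSERT INTO first_query VALUES (10), (20);
    INSERT INTO second_query VALUES (10), (30);
""")


def fetch(sql):
    """Run a query and return all of its rows as a list of tuples."""
    return conn.execute(sql).fetchall()


union_rows = fetch("""
    SELECT product_id FROM first_query
    UNION
    SELECT product_id FROM second_query
    ORDER BY product_id
""")

# DISTINCT applied on top of UNION ALL gives the same rows as UNION.
distinct_rows = fetch("""
    SELECT DISTINCT product_id FROM (
        SELECT product_id FROM first_query
        UNION ALL
        SELECT product_id FROM second_query
    )
    ORDER BY product_id
""")

union_all_rows = fetch("""
    SELECT product_id FROM first_query
    UNION ALL
    SELECT product_id FROM second_query
    ORDER BY product_id
""")

print(union_rows == distinct_rows)  # True
print(union_rows)                   # [(10,), (20,), (30,)]
print(union_all_rows)               # [(10,), (10,), (20,), (30,)]
```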
3. Execution time
Because UNION ALL does not have to remove duplicates, its query execution time is smaller than that of UNION, which means the former runs faster than the latter. So if you want faster output and don't care about duplicates, use UNION ALL.
This is something you can deduce from your existing SQL knowledge, and that's where working on fundamentals pays off. If you want to improve your SQL skills or just want to revise SQL concepts, then the An Introductory Guide to SQL course on Educative, an interactive learning platform, is a good place to start.
And if you find the Educative platform and its interview courses like Grokking the System design useful, then consider getting an Educative Subscription, which provides access to their 100+ courses for just $14 per month. It's very cost-effective and great for preparing for coding interviews.
4. Speed and Bandwidth Usage
You should keep in mind that the benefit gained by not removing duplicates can easily be wiped out by transferring more data over a low-bandwidth network connection. That's why in practice UNION ALL sometimes appears slower than UNION: it returns a lot of data with duplicates, which requires more time to travel from the database server to the client machine. So evaluate the performance of UNION and UNION ALL case by case.
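If you want to try such a comparison yourself, a rough sketch could look like the one below. It assumes Python's sqlite3 module with an in-memory database and two hypothetical tables, a and b, filled with heavily duplicated values. There is no network involved here, so the row count is only a stand-in for the amount of data that would travel to the client, and the timings will differ on a real server.

```python
import sqlite3
import time

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE a (n INTEGER)")
conn.execute("CREATE TABLE b (n INTEGER)")

# 100,000 rows per table, but only 1,000 distinct values: lots of duplicates.
rows = [(i % 1000,) for i in range(100_000)]
conn.executemany("INSERT INTO a VALUES (?)", rows)
conn.executemany("INSERT INTO b VALUES (?)", rows)


def run(sql):
    """Execute the query, fetch every row, and return (row count, seconds)."""
    start = time.perf_counter()
    result = conn.execute(sql).fetchall()
    return len(result), time.perf_counter() - start


union_count, union_seconds = run("SELECT n FROM a UNION SELECT n FROM b")
all_count, all_seconds = run("SELECT n FROM a UNION ALL SELECT n FROM b")

print(f"UNION:     {union_count} rows in {union_seconds:.4f}s")
print(f"UNION ALL: {all_count} rows in {all_seconds:.4f}s")
```

With this data UNION returns 1,000 rows while UNION ALL returns 200,000, which shows how quickly the amount of data to transfer can grow when duplicates are kept.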
5. Number of Columns in the Result Set
Another thing worth noting while using UNION and UNION ALL is that all queries combined using a UNION, INTERSECT, or EXCEPT operator must have an equal number of expressions in their target lists. For example, if the result of query 1 has three columns and the result of query 2 has two columns, then you cannot combine them using the UNION command.
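Here is a small sketch of what happens when the column counts differ, assuming Python's sqlite3 module. SQLite rejects the query with its own error, whose wording differs from the SQL Server message quoted above. Padding the shorter query with NULL, shown at the end, is just one possible workaround and not something the rule itself requires.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Three columns on the left, two on the right: the database rejects this.
try:
    conn.execute("SELECT 1, 2, 3 UNION SELECT 1, 2")
    error_message = None
except sqlite3.OperationalError as exc:
    error_message = str(exc)

print("Query failed:", error_message)

# One possible workaround: pad the shorter query so both have three columns.
padded_rows = conn.execute("SELECT 1, 2, 3 UNION SELECT 1, 2, NULL").fetchall()
print(len(padded_rows))  # 2
```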
That's all on the difference between the UNION and UNION ALL commands in SQL. They are useful commands for combining the results of two SELECT queries when they contain the same kind of data. There are many practical scenarios where UNION is very useful, for example when you need to create a list out of different tables containing data from the same set.
The main difference between UNION and UNION ALL is about duplicates: the former removes them while the latter keeps them. The other differences between them, in performance and network bandwidth usage, can easily be derived from this one. Also keep in mind that both are well supported by the big three databases: MySQL, Oracle, and SQL Server. Let us know if you have been asked this question in your SQL interview.
Thanks for reading this article. If you like this SQL article, then please share it with your friends and colleagues. If you have any questions or feedback, then please drop a note.
P. S. - If you are a beginner interested in learning Database and SQL and looking for some free resources to start your journey, then you can also take a look at the Introduction to Databases and SQL Querying free course on Udemy to kick-start your learning.
Other related SQL queries, interview questions, and articles:
- Write a SQL query to find all table names on a database in MySQL (solution)
- 5 Courses to learn Database and SQL Better (courses)
- How to join three tables in one single SQL query (solution)
- Top 5 Websites to learn SQL online for FREE? (resource)
- 4 ways to find the Nth highest salary in SQL (solution)
- Difference between Self and Equi Join in SQL? (answer)
- 5 Free Courses to learn Oracle and SQL Server? (courses)
- Top 5 Courses to learn MySQL Database for Beginners (Courses)
- What is the difference between View and Materialized View in Database? (answer)
- 5 Courses to learn Oracle and Microsoft SQL Server database (courses)
- Difference between clustered and non-clustered indexes in SQL? (answer)
- 5 Advanced SQL books to level up your SQL skills (books)
- Write a SQL query to copy or backup a table in MySQL (solution)
- Top 5 Courses to learn PostgreSQL in-depth (courses)
- 5 Free Courses to learn T-SQL and SQL Server for Beginners (Courses)
- How to migrate SQL queries from Oracle to SQL Server 2008? (answer)
- Difference between Primary and Foreign keys in the table? (answer)
- Top 5 Courses to learn Microsoft SQL Server in-depth (courses)
- How do you find the duplicate rows in a table on a database? (solution)
- The real difference between WHERE and HAVING clause in SQL? (answer)
- Difference between the Unique and Primary keys in the table? (answer)
- 5 Free Courses to learn Database and SQL (free courses)
I am going to explain the use of the UNION operator in a SQL Server database with a real-life scenario and example.
The UNION operator is used to combine the result sets of two or more SELECT statements.
Notice that each SELECT statement within the UNION must have the same number of columns. The columns must also have similar data types. Also, the columns in each SELECT statement must be in the same order.
Before moving on to the UNION query, let's understand our scenario. Suppose we have three tables (Product, Customer and Order) in our database and we want to see the number of orders by month name, by weekday, and by three-hour intervals of the day.
For the full implementation, refer here: http://www.mindstick.com/Articles/adeb48fa-85cc-4671-8b2e-86adfb34406d/Union%20Example%20in%20SQL%20Server

UNION removes duplicate records (where all columns in the results are the same), UNION ALL does not. UNION and UNION ALL should work on all SQL Servers.