ORACLE SQL TIPS, TRICKS AND HACKS
Carlos January 31, 2017 24 Comments
Count(*) Vs Count(1) Again?
I decided to write this article because the question keeps being asked, and I usually �nd myself explaining my answer again and again and/or looking for links to someone else’s articles that can help support my answer.
Here I discuss a couple of common misconceptions and share a short but irrefutable demonstration.
So, these are the question I will answer here:
What is the di៛�erence between COUNT(*) and COUNT(1)?
Short answer: None!
Is COUNT(1) more e៝�cient or faster than COUNT(*)?
Short answer: No!
Is there a way to demonstrate or prove that they are the same thing?
Short answer: Yes!
Contents [Hide]
What is the di៛�erence between COUNT(*) and COUNT(1)? Is COUNT(1) more e៝�cient or faster than COUNT(*)?
Is there a way to demonstrate or prove that they are the same thing? The proof:
One of the most common answers I see for this question is that they might produce di៛�erent results because COUNT(1) counts only rows in which the �rst column is not null.
Why is that answer incorrect?
Because COUNT receives an expression as parameter, not a column position. COUNT(1) doesn’t mean COUNT(<�rst column>). It means COUNT(1), with 1 being treated as a numeric literal.
The confusion comes most likely from the fact that you can use a column position in the ORDER BY clause, but using column positions is not allowed with the COUNT function.
Look at the syntax diagram for the ORDER BY CLAUSE:
Order By clause syntax diagram.
As you can see, it explicitly says that you can provide a position instead of an expression or an alias, but if you look at the syntax of the COUNT function, it expects an expression:
Count function syntax diagram.
So, what is the consequence of 1 being treated as a numeric literal by the COUNT function?
Well, it is true that COUNT counts only rows in which the expression passed is not null, but since 1 is a literal, and a literal doesn’t change, it will always be 1, for each and every row, and will never be null, so the �nal result is that COUNT(1) counts all of the rows returned by the query, regardless of the existence of nulls in any of the columns.
If you have doubts, try this query:
SELECT COUNT(99999), COUNT('MONKEY'), COUNT(*) FROM DUAL;
If the parameter passed to COUNT was treated as a column position, then the �rst call to COUNT would produce an error, because there is only one column in the dual table.
And the second call to COUNT shows you that you can use any type of literal, and the result is the same, because, again, a literal will not change, and thus ‘MONKEY’ will never be null, so, COUNT(1), COUNT(99999) and COUNT(‘MONKEY’) are all equivalent to COUNT(*).
Is COUNT(1) more e៝�cient or faster than COUNT(*)?
The most common argument in favor of COUNT(1) I have seen in this kind of discussion is that COUNT(*) needs to check the value of all columns in the row to determine if it needs to be counted, because COUNT doesn’t count nulls.
Why is that incorrect?
Because the o៝�cial documentation about the COUNT function explicitly says this:
“If you specify the asterisk (*), then this function returns all rows, including duplicates and nulls.”
So, COUNT(*) always counts all rows, and thus, the database doesn’t need to check the column values, because it will always count all rows, regardless of their contents.
Is there a way to demonstrate or prove that they are the same
thing?
I have seen some people use execution plans to demonstrate that COUNT(1) is equivalent to COUNT(*), but unfortunately, an execution plan comparison is not enough because even if they were di៛�erent, and one of them actually did more work than the other one, the execution plans could still be equal, so that is not a convincing or irrefutable demonstration.
If you want irrefutable proof, here it is, and if you really want to �nd the truth, this should be enough to make you take the correct side of this debate, forever
The proof:
There is a part of the database software that is called “The Optimizer”, which is de�ned in the o៝�cial
One of the components of the optimizer is called “the transformer”, whose role is to determine whether it is advantageous to rewrite the original SQL statement into a semantically equivalent SQL statement that could be more e៝�cient.
The optimizer does a lot of work under the hood, but we usually don’t notice it.
Would you like to see what the optimizer does when you write a query using COUNT(1)?
I’m going to show you how to generate an optimizer trace, also known as a 10053 trace, in which you will be able to see a log of the optimizer work. There are more methods to do it, but here is one:
You will have to use a database user to which the ALTER SESSION privilege has been granted.
I’m doing this on a SQL*Plus session, but if you are using SQL Developer, you might need to manually disconnect the session instead of executing the EXIT command in the last step.
First, you need to do something to make your trace �le easy to identify:
Then you have to enable the optimizer tracing:
Then you have to run the command you want to trace, which in this case is a SELECT statement that uses COUNT(1):
For the trace to be generated, a “hard parse” of the statement needs to occur, and in simpli�ed words, it will occur if the exact same statement has not been executed before, so if you want to run and trace the same statement several times, make sure to add a comment that makes it di៛�erent from the other times you have run the same statement. In this case, if I want to trace the same statement again, I would change the
comment to ‘test-2’, for example.
Ok, now, you need to exit the session for the trace to be written to the �le:
As I mentioned before, if you are using SQL Developer, you might need to manually disconnect the session, as just executing EXIT or DISCONNECT doesn’t appear to really end the session.
ALTER SESSION SET tracefile_identifier = 'My_count_test';
ALTER SESSION SET events 'trace [SQL_Optimizer.*]';
SELECT /* test‐1 */ COUNT(1) FROM employees;
EXIT;
Here is how this looks like on my system:
And now, the exciting part! We are going to take a look at the contents of the trace we generated, but where is this �le located?
It might vary from system to system, but you can �re this query to see the path where the trace �les are located:
Ok, here are the relevant portions of the 1850-lines trace �le that was generated.
******************************************
—– Current SQL Statement for this session (sql_id=4nbgdngzf4024) —– SELECT /* test-1 */ COUNT(1) FROM employees
******************************************* Legend
The following abbreviations are used by optimizer trace.
…
CNT – count(col) to count(*) transformation
…
As you can see, there is a “count(col) to count(*)” transformation, that is represented in the trace as CNT.
A little later in the trace, there is this:
CNT: Considering count(col) to count(*) on query block SEL$1 (#0) *************************
Count(col) to Count(*) (CNT) ************************* CNT: COUNT() to COUNT(*) done.
And a little later, there is this:
Final query after transformations:******* UNPARSED QUERY IS ******* SELECT COUNT(*) “COUNT(1)” FROM “COURSE”.”EMPLOYEES” “EMPLOYEES”
Do you see the role “COUNT(1)” plays in the �nal query?
I will put it here again, as an image:
“COUNT(1)” is just an alias! What Oracle actually runs is COUNT(*) but it returns COUNT(1) as the column title in the results, because that is what you asked for, but it actually runs COUNT(*)!
So, NO, COUNT(1) is not more e៝�cient nor faster than COUNT(*), because COUNT(1) is actually never run. It is always transformed into COUNT(*), so you always do COUNT(*) even if you don’t want it.
Share this!
Venkat Reddy
2
REPLY 14 days 10 hours ago Guest
I could even say that COUNT(1) is at least a little bit less e៝�cient, because it requires the optimizer to do a transformation that would not be needed if COUNT(*) was used from the beginning.
Are we on the same side now?
Great!
Have something to say?
Great! Post your comments below.
SQL,
Tuning
10053,
count,
COUNT(1),
optimizer,
query,
Trace,
TransformationAbout Carlos
I've been working with Oracle databases on a daily basis for more than 10 years.
View all posts by Carlos →
Leave a Reply
24 Comments on "Count(*) Vs Count(1) Again?"
Sort by: newest | oldest | most voted
It’s Really Good Sir. Thanks,
Join the discussion
0
REPLY 10 days 24 minutes agoCarlos
Author
Trung
1
REPLY 13 days 1 hour ago
0
REPLY 13 days 1 hour agoCarlos
Author Guest
Max
1
REPLY 12 days 20 hours ago
0
REPLY 10 days 24 minutes agoCarlos
Author Guest
Rocky
Guest
You are welcome, my friend.
Is it right for MySQL and other databases?
Hi, Trung.
I can con�rm the Oracle optimizer performs this transformation, but to be honest, I don’t know if other databases do something similar.
It should be researched individually for each RDBMS.
Excelente explicación y, sobretodo, aclaración. Muchas gracias.
Saludos.
Por nada, Max!
Me alegro que te haya parecido útil.
Also I thought that they are equivalent, but I had no evidence for it. Thank you for the deep and accurate deduction!
1
REPLY 12 days 7 hours ago
0
REPLY 10 days 25 minutes agoCarlos
Author
Jared Still
1
REPLY 11 days 22 hours ago
0
REPLY 10 days 26 minutes agoCarlos
Author Guest
Santhosh
1
REPLY 11 days 10 hours ago
0
REPLY 10 days 26 minutes agoCarlos
Author Guest
Rohinton Kazak
1
REPLY 10 days 21 hours ago Guest
That is the case for many people, Rocky. You are welcome!
Well done.
Thanks!
Thank you very much for the clear explanation Mr Carlos.
You are most welcome, Santhosh.
Very good and comprehensive explanation.
0
REPLY 10 days 27 minutes agoCarlos
Author
raj
1
REPLY 10 days 14 hours ago
0
REPLY 10 days 27 minutes agoCarlos
Author Guest
AlexKru
0
REPLY 10 days 7 hours ago Guest
I’m glad you liked it, Rohinton.
Excellent explanation. When i saw this question, answer pop’ed in my mind that * vs (1) is di៛�erent but ORACLE is di៛�erent for sure. I have worked with Teradata DB and it has di៛�erent syntax obviously and (1) – represents First col and not as literal – it avoids dups & NULL and (*) entire col which includes NULL & dups. However getting to know it works di៛�erent in ORACLE is a good learner for me. Your other videos are awesome and so easy to understand
Interesting!
Thanks for commenting, and for your kind words, Raj.
[* Shield plugin marked this comment as “0”. Reason: Google reCAPTCHA was not submitted. *] There are some issues about dependencies tracking for version less than 11. Only starting from Oracle 11g was introduced Fine-Grained Dependencies tracking mechanism which allow to avoid such invalidation: SQL> select * from v$version where rownum = 1; BANNER
—————————————————————- Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 – 64bi SQL> create table t_count (id number(*,0)); Table T_COUNT created. SQL> create or replace procedure p_count is v_count pls_integer; begin select count(*) into v_count from t_count; end; Procedure P_COUNT compiled SQL> select object_name, object_type, status from all_objects where object_name in (‘T_COUNT’,’P_COUNT’); OBJECT_NAME OBJECT_TYPE STATUS —————————— ——————- ——- P_COUNT PROCEDURE VALID T_COUNT TABLE VALID SQL> alter table t_count add (val varchar2(1)); Table T_COUNT altered. SQL> select object_name, object_type, status from all_objects where object_name in (‘T_COUNT’,’P_COUNT’); OBJECT_NAME OBJECT_TYPE STATUS —————————— ——————- ——- P_COUNT PROCEDURE INVALID T_COUNT TABLE VALID
0
REPLY 10 days 28 minutes ago Carlos
AlexKru
1
REPLY 8 days 6 hours ago
0
REPLY 8 days 23 minutes ago Carlos
Laurent Schneider
Guest Author Guest Author
Thanks for commenting, Alex.
I’m not sure how this relates to the article’s topic, though…
[* Shield plugin marked this comment as “0”. Reason: Google reCAPTCHA was not submitted. *] Sorry for bad formatting. I just wanted to say that probably COUT(1) is more safety than COUNT(*) for releases less than Oracle 11g. Because if you make some changes into table upon which you use COUNT(*) it will invalidate subprograms (procedures, functions etc.) in which it is used: SQL> select * from v$version where rownum = 1; BANNER
—————————————————————- Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 – 64bi SQL> create table t_count (id number(*,0)); Table T_COUNT created. SQL> create or replace procedure p_count is v_count
pls_integer; begin select count(*) into v_count from t_count; end; / Procedure P_COUNT compiled SQL> select object_name, object_type, status from all_objects where object_name in (‘T_COUNT’, ‘P_COUNT’); OBJECT_NAME OBJECT_TYPE STATUS —————————— ——————- ——- P_COUNT PROCEDURE VALID T_COUNT TABLE VALID SQL> alter table t_count add (val varchar2(1)); Table T_COUNT altered. SQL> select object_name, object_type, status from all_objects where object_name in (‘T_COUNT’, ‘P_COUNT’); OBJECT_NAME OBJECT_TYPE STATUS —————————— ——————- ——- P_COUNT PROCEDURE INVALID T_COUNT TABLE VALID
Ah, I see it now, thanks for the clari�cation.
That is an interesting observation. So, that doesn’t happen when the procedure calls COUNT(1) instead of COUNT(*)? (I don’t have a 10g instance to test it).
Count(1) has been rewritten in count(*) since 7.3 because Oracle like to Auto-tune mythic statements. In earlier Oracle7, oracle had to
1
REPLY 4 days 6 hours ago
0
REPLY 2 days 20 hours agoCarlos
Author
Balasubramaniam Ramesh
1
REPLY 3 days 2 hours ago
0
REPLY 18 hours 34 minutes agoCarlos
Author Guest
evaluate (1) for each row, as a function, before DETERMINISTIC and NON-DETERMINISTIC exist.
So two decades ago, count(*) was faster
The count(1) myth is ancient… really ancient. And untrue
Agreed.
I’m surprised how often the question gets asked in spite of how old the myth is.
Thanks for the info great
You are most welcome!