Why You Should Never Use SELECT Star (*)
I think that nearly everyone who does database programming writes a statement like this at some point:
SELECT *
FROM tblFoo F
WHERE F.pah = 791
Most database programmers should (and do) know not to do this. However, at my job I am constantly finding old SELECT * queries, usually because they broke something.
This aggravates me. As a public service, I will discuss my reasons why you should never use SELECT *.
It makes adding fields a pain.
When you run a SELECT *, most databases will cache which fields are returned from this. This means that if you add a field to a table, an existing SELECT * query will not return the field!.
For example, let’s consider the following table and query:
CREATE TABLE dbo.user_table (
user_id int IDENTITY (1, 1) NOT NULL ,
user_name varchar (50) NOT NULL ,
password varchar(50) NULL ,
email varchar(100) NOT NULL ,
date_time_created smalldatetime NOT NULL
)
SELECT *
FROM user_table
If we add a field, ‘vchMajor’ to this table, our query will not retrieve it. This is generally considered a bad thing.
It’s better to write the query like this:
SELECT
user_id,
user_name,
password,
email,
date_time_created
FROM user_table
You have to add new fields by hand either way, but this way is a lot less misleading.
It wastes database resources.
If I need to join across 5 tables to retrieve a set of 5 fields, why the heck would I want to return every field in all 5 tables?
That is potentially a huge waste of memory, for no reason! I would much rather type field names out than waste memory on the database server. When querying a database, I am always looking for a result set with a defined set of fields.
Honestly, if you find it necessary to pull an undefined set of fields (or whole tables), you should revisit your application design.
It can break your application.
Here is a problem I found at work recently.
Here is the meat of a stored procedure in our app:
IF EXISTS(
SELECT U.*
FROM User_Table U WITH(NOLOCK)
WHERE U.user_id = @iUser_ID
AND U.Paper_ID = @iPaper_ID)
SELECT @isOK = 1
ELSE
SELECT @isOK = 0
This procedure generated the following error:
“The text, ntext, and image data types are invalid in this subquery or aggregate expression.”
This error occurred because SELECT * brought back an ntext field. Why would you ever use a SELECT * in an EXISTS block? This is a very bad practice, even if it did not break the application.
A few weeks ago, I received a flood of error emails from my search application.
The problem revolved around a ColdFusion query of queries which UNIONs two queries together. The source queries used SELECT *. SELECT * and ColdFusion’s query of query duck typing caused my app to go boom. The solution was to defined the fields in the source queries, and explicitly cast them as the datatypes I expect.
While SELECT * was not the entire problem, it was part of it. I could have debugged it in half the time if the fields were defined explicitly in the source query.
The Exception That Proves the Rule?
Okay, I know that this is contrary to my entire argument, but I do use SELECT * under one condition.
I will occasionally run something like this in Query Analyzer
SELECT TOP 1 *
FROM User_Table
ORDER By dtCreated DESC
I will use this while testing certain conditions in a development server, but I never use it to drive an application. I guess I should have titled this post “Why You Should (Almost) Never Use SELECT *”.
I agree with the SELECT * fiasco. But I wonder what your feelings are on using COUNT(*) to pull a quick return of total record count. Is it viewed as wasteful, also?
EG: SELECT COUNT(*) FROM MY_TABLE
I kinda abuse that sql statement a lot, especially when I embed sql in VB code, to get a quick recordcount of a table so I can iterate through it w/o having to open it as a dynamic ADO so I can use the ADO.Recordset.Recordcount function. (IE: I find it wasteful to open a recordset as a larger, dynamic object just so I can have access to it’s recordcount, so I jury rig around it by using a COUNT(*) of the table first.)
Just curious about your thoughts on this.
COUNT(*) is one of those damned-if-ya-do situations.
Usually, the only way that you can get an accurate record count is to use COUNT(*). So, you’re kind of stuck using it.
I haven’t written and VB code in a few years now. I need to look at some of old VB code before I can speak to your question.
I don’t view count(*) as wasteful, since the resultSet of total record count doesn’t really return a large resultSet. Usually in most programming language it returns one single number.
Most database can handle the row count using that count(*), btw.
Hello admin! This is very helpful posting thanks.
I forget which specific ones (It’s been a while) but some databases actually apply extra optimizations on COUNT(*), relying on the implicit “Count returned rows” meaning.
I could be wrong, but I think it has something to do with how COUNT(column) interacts with NULL… possibly in the context of indexing.
Hello admin, thanks for posting and link back to my site.
Hello admin, thanks for posting and link back to my site.
“When you run a SELECT *, most databases will cache which fields are returned from this. This means that if you add a field to a table, an existing SELECT * query will not return the field!.”
This is certainly bullshit.
Select * is wrong because it broke your app ? Sounds more like your app didn’t handle a result that the query could have returned. I use Select * in some code (& Select field1, field2 etc in other code) because I want all fields & my code is robust enough to handle whatever result the query returns. Also what Ricky says about BS is correct.
Okay, I wrote this like 5 years ago. I can guarantee that this behavior happened then. I haven’t tested it since then, and using ORM probably makes this moot anyway. But since you’re all having so much fun commenting on a 5-year old post, go for it. After, you and Ricky can go find a post from 2003 complaining that T-SQL doesn’t have try/catch statements, and call that bullshit too.