SELECT DISTINCT and ORDER BY
June 24, 2009 – 8:52 pmORDER BY items must appear in the select list if SELECT DISTINCT is specified.
Recently discovered the above error (T-SQL, SQL Server) when trying to select DISTINCT rows for one set of columns while ordering the rows by another column.
Jeff Smith had a good explanation of why this occurs (article). In a nutshell, if you’re trying to order your distinct results but the other column you are ordering by is NOT distinct, then how will SQL Server know which value to use for ordering?
The solution is to use a GROUP BY and aggregate function (eg. MAX or MIN) to achieve the same effect. So for example, we do a GROUP BY on our “distinct” columns: column1 and column2. Then we order the results by column3. However, we use an aggregate function (MAX in this example) to ensure SQL Server has a specific column3 value with which to sort by.
SELECT column1, column2
FROM tablename
GROUP BY column1, column2
ORDER BY MAX(column3)
DISTINCT and ORDER BY with 2 non-DISTINCT columns
The above solution didn’t however provide a solution for my specific scenario. I infact had 2 non-distinct columns which I needed to deal with while still selecting distinct rows based on a different set of columns.
I had a [Rank] column by which I wanted to sort the results. The above solution would assist with that. However, I also had a non-distinct [ResultTypeID] column that I also wanted to return in my results. It didn’t matter which ResultTypeID I returned but I still wanted to return a value for this column. The solution? Again, another aggregate function…
SELECT MIN(ResultTypeID), column1, column2
FROM tablename
GROUP BY column1, column2
ORDER BY MAX(Rank)