I ran into a good example for the UNPIVOT operator today and I thought I’d quickly write it up so it does not get lost going forward.
I found a very simple data set that shows the number of wins per year by MLB team. The formatting is not ideal as it includes the team code as a column value:
(data set here)
I imported this data set into my SQL Server Express edition using the Import/Export wizard. The data set looks like this:
..but I want the team code to be an attribute of the data set so that I can write queries against it to easily see the most number of wins in a season regardless of team.
The easiest way to transform this data set is by using the UNPIVOT operator. Because the PIVOT and UNPIVOT operators are not used in everyday coding, it is no wonder that these are not widely understood.
Any easy way to think about these operators is to think about column names and data attributes. If you want column names to become attributes of the data set, it needs to be UNPIVOTED. If you want attributes (values) to become column names, the set needs to be PIVOTED.
In our MLB data set, team codes are set as column names. but we want them to be attributes of the data set. An UNPIVOT is in order.
To use UNPIVOT, we need to tell the operator the columns that we want to UNPIVOT. Here is a quick example from Microsoft:
-- Create the table and insert values as portrayed in the previous example. CREATE TABLE pvt (VendorID int, Emp1 int, Emp2 int, Emp3 int, Emp4 int, Emp5 int); GO INSERT INTO pvt VALUES (1,4,3,5,4,4); INSERT INTO pvt VALUES (2,4,1,5,5,5); INSERT INTO pvt VALUES (3,4,3,5,4,4); INSERT INTO pvt VALUES (4,4,2,5,5,4); INSERT INTO pvt VALUES (5,5,1,5,5,5); GO -- Unpivot the table. SELECT VendorID, Employee, Orders FROM (SELECT VendorID, Emp1, Emp2, Emp3, Emp4, Emp5 FROM pvt) p UNPIVOT (Orders FOR Employee IN (Emp1, Emp2, Emp3, Emp4, Emp5) )AS unpvt; GO
In the example above, you can see that after the UNPIVOT operator, they have specified a new columns named Orders and Employees. After specifying the new column names that will be created, the columns to collapse are specified.
For our example, we will create a new columns named ‘Wins’ and ‘Team’. We will then collapse all of the columns with the team code into these columns. The values go to a column called ‘Wins’ and the column name goes to a column named ‘Team’.
There are 31 columns that will collapse. To easy get these values, let’s query the system table:
Now, we have everything that we need to construct our code:
SELECT c.name+',' from sys.columns c JOIN sys.tables t on t.object_id=c.object_id and t.name='MLBRecords' WHERE LEN(c.name)=3
SELECT [Year], [G], Team, Wins FROM (SELECT * from stg.MLBRecords) p UNPIVOT (Wins FOR Team IN (ARI,ATL,BLA,BAL,BOS,CHC,CHW,CIN,CLE,COL,DET,HOU,KCR,LAA,LAD,MIA,MIL,MIN,NYM,NYY,OAK,PHI,PIT,SDP,SFG,SEA,STL,TBR,TEX,TOR,WSN)) as UNPVT
And the output looks much better now:
You can always run an internet search to remind yourself of the syntax for PIVOT and UNPIVOT. The most important thing to remember is that if you want values to become columns, use PIVOT. If you want columns to become values, use UNPIVOT.