Pivo ting data is the pro cess o f aggregating o r mo ving ro ws o f data into co lumns. Suppo se o ur sto re keeps track o f sales data. The bo ss o nly cares abo ut Unit s So ld, and he really wants to kno w ho w many units were so ld in the No rt h and ho w many were so ld in the So ut h. Business users and pro grammers o ften view data in drastically different ways.
Pro grammers might think in ro ws and co lumns o f data, such as the fo llo wing table with sales data by regio n and date: Re gio n Quart e r Unit s So ld
No rth 20 0 7-0 3-31 15,0 8 9 No rth 20 0 7-0 6 -30 18 ,79 5 No rth 20 0 7-0 9 -30 19 ,0 6 5 No rth 20 0 7-12-31 19 ,9 8 7 So uth 20 0 7-0 3-31 20 ,0 15 So uth 20 0 7-0 6 -30 19 ,8 0 6 So uth 20 0 7-0 9 -30 21,0 53 So uth 20 0 7-12-31 23,0 6 8
Sho wing the data in this way is useful, especially to pro grammers, but many business users may want to co mpare the units so ld by regio n and by date. It's difficult to lo o k at the previo us table and co mpare seco nd quarter 20 0 7 sales fo r the No rth and So uth regio ns.
Business users wo uld pro bably prefer to see the data this way:
Re gio n Unit s So ld20 0 7 -Q1 Unit s So ld20 0 7 -Q2 Unit s So ld20 0 7 -Q3 Unit s So ld20 0 7 -Q4 No rth 15,0 8 9 18 ,79 5 19 ,0 6 5 19 ,9 8 7 So uth 20 ,0 15 19 ,8 0 6 21,0 53 23,0 6 8
This representatio n sho ws exactly the same data, but no w it's easy fo r business users to see that fo r 20 0 7-Q2 the So uth o utso ld the No rth by 1,0 11 units. Graphically, a pivo t may lo o k like this:
Majo r databases such as Oracle and SQL Server no w have PIVOT keywo rds added to their SQL dialects. While MySQL do esn't currently have that keywo rd, yo u can still pivo t yo ur data witho ut much tro uble.
Let's start by adding a simple table fo r o ur demo nstratio ns. We'll use the same data and same basic structure as the first table.
Note
If yo u haven't clo sed the terminal sessio n since the last lesso n, set yo ur delimiter back fro m "//" to ";".Type the fo llo wing at the MySQL pro mpt:
mysql> CREATE TABLE SalesAnalysis
-> (
-> Region varchar(10) NOT NULL,
-> Quarter date NOT NULL,
-> UnitsSold integer NOT NULL
-> ) ENGINE=INNODB;
Query OK, 0 rows affected (0.01 sec) mysql>
Once yo u have the table created, po pulate it with o ur sample data.
T ip
Remember yo u can use the Up arro w key to repeat a co mmand in mysql and then edit the co mmand. Fo rexample, after yo u enter the first co mmand belo w, press the up arro w key and then edit the Quarter date and the UnitsSo ld integer to create the next co mmand.Type the fo llo wing at the MySQL pro mpt:
mysql> INSERT INTO SalesAnalysis values ('NORTH', '2007-03-31',15089);
Query OK, 0 rows affected (0.01 sec)
mysql> INSERT INTO SalesAnalysis VALUES ('NORTH', '2007-06-30',18795);
Query OK, 0 rows affected (0.01 sec)
mysql> INSERT INTO SalesAnalysis VALUES ('NORTH', '2007-09-30',19065);
Query OK, 0 rows affected (0.01 sec)
mysql> INSERT INTO SalesAnalysis VALUES ('NORTH', '2007-12-31',19987);
Query OK, 0 rows affected (0.01 sec)
mysql> INSERT INTO SalesAnalysis VALUES ('SOUTH', '2007-03-31',20015);
Query OK, 0 rows affected (0.01 sec)
mysql> INSERT INTO SalesAnalysis VALUES ('SOUTH', '2007-06-30',19806);
Query OK, 0 rows affected (0.01 sec)
mysql> INSERT INTO SalesAnalysis VALUES ('SOUTH', '2007-09-30',21053);
Query OK, 0 rows affected (0.01 sec)
mysql> INSERT INTO SalesAnalysis VALUES ('SOUTH', '2007-12-31',23068);
Query OK, 0 rows affected (0.01 sec) mysql>
Check that yo u entered everything co rrectly:
Type the fo llo wing at the MySQL pro mpt:
mysql> SELECT * FROM SalesAnalysis;
+---+---+---+ | Region | Quarter | UnitsSold | +---+---+---+ | NORTH | 2007-03-31 | 15089 | | NORTH | 2007-06-30 | 18795 | | NORTH | 2007-09-30 | 19065 | | NORTH | 2007-12-31 | 19987 | | SOUTH | 2007-03-31 | 20015 | | SOUTH | 2007-06-30 | 19806 | | SOUTH | 2007-09-30 | 21053 | | SOUTH | 2007-12-31 | 23068 | +---+---+---+ 8 rows in set (0.00 sec)
mysql>
No w that we have so me sample data, ho w do we go abo ut pivo ting the ro ws into co lumns? If yo u recall fro m the previo us lesso ns, we've already learned ho w to use the CASE statement and aggregates. We'll use bo th o f tho se features to co me up with o ur PIVOT.
The co lumn we want to pivo t is Unit sSo ld, and we want to pivo t that co lumn by the Quart e r co lumn. We'll add fo ur new co lumns named 20 0 7 -Q1, 20 0 7 -Q2, 20 0 7 -Q3, and 20 0 7 -Q4 , and use a CASE statement to allo cate UnitsSo ld to each o f tho se new co lumns. Let's try it!
Type the fo llo wing at the MySQL pro mpt:
mysql> SELECT Region,
-> CASE WHEN Quarter='2007-03-31' THEN UnitsSold END AS '2007-Q1',
-> CASE WHEN Quarter='2007-06-30' THEN UnitsSold END AS '2007-Q2',
-> CASE WHEN Quarter='2007-09-30' THEN UnitsSold END AS '2007-Q3',
-> CASE WHEN Quarter='2007-12-31' THEN UnitsSold END AS '2007-Q4'
-> FROM SalesAnalysis;
+---+---+---+---+---+ | Region | 2007-Q1 | 2007-Q2 | 2007-Q3 | 2007-Q4 | +---+---+---+---+---+ | NORTH | 15089 | NULL | NULL | NULL | | NORTH | NULL | 18795 | NULL | NULL | | NORTH | NULL | NULL | 19065 | NULL | | NORTH | NULL | NULL | NULL | 19987 | | SOUTH | 20015 | NULL | NULL | NULL | | SOUTH | NULL | 19806 | NULL | NULL | | SOUTH | NULL | NULL | 21053 | NULL | | SOUTH | NULL | NULL | NULL | 23068 | +---+---+---+---+---+ 8 rows in set (0.00 sec)
mysql>
Let's lo o k a little clo ser.
OBSERVE:
mysql> SELECT Region,
-> CASE WHEN Quarter='2007-03-31' THEN UnitsSold END AS '2007-Q1', -> CASE WHEN Quarter='2007-06-30' THEN UnitsSold END AS '2007-Q2', -> CASE WHEN Quarter='2007-09-30' THEN UnitsSold END AS '2007-Q3', -> CASE WHEN Quarter='2007-12-31' THEN UnitsSold END AS '2007-Q4' -> FROM SalesAnalysis;
Here, the UnitsSo ld in the f irst quart e r are entered into the new '20 0 7 -Q1' co lumn, the UnitsSo ld in the se co nd quart e r are put into the new '20 0 7 -Q2' co lumn, and so o n. But the results are no t exactly co rrect—they still co ntain as many ro ws as the o riginal table, with a bunch o f NULL values in between the useful values.
Ho w can we co llapse this into meaningful data? We'll use an aggregate! In this example we can use SUM to add up o ur new co lumns, gro uped by the regio n. Fo r go o d practice, we sho uld also add an ELSE clause to the CASE statement. Instead o f using NULL, we'll use zero .
Let's try o ur updated pivo t!
Type the fo llo wing at the MySQL pro mpt:
mysql> SELECT Region,
-> SUM(CASE WHEN Quarter='2007-03-31' THEN UnitsSold ELSE 0 END) AS '2007-Q1',
-> SUM(CASE WHEN Quarter='2007-06-30' THEN UnitsSold ELSE 0 END) AS '2007-Q2',
-> SUM(CASE WHEN Quarter='2007-09-30' THEN UnitsSold ELSE 0 END) AS '2007-Q3',
-> SUM(CASE WHEN Quarter='2007-12-31' THEN UnitsSold ELSE 0 END) AS '2007-Q4'
-> FROM SalesAnalysis -> GROUP BY Region; +---+---+---+---+---+ | Region | 2007-Q1 | 2007-Q2 | 2007-Q3 | 2007-Q4 | +---+---+---+---+---+ | NORTH | 15089 | 18795 | 19065 | 19987 | | SOUTH | 20015 | 19806 | 21053 | 23068 | +---+---+---+---+---+ 2 rows in set (0.00 sec)
This o ne lo o ks much better, and returns the co rrect results. Our business users will be very happy.