Simplifying your data with Pivot tables using Python

Matthew Prince
3 min readJan 23, 2020

--

Photo by Markus Spiske on Unsplash

To tell a better and understandable story with your data, it has to be presented in a simple way for one to understand in a glance or two. Pivot table helps in that regard.

Wikipedia defines a pivot table as a table of statistics that summarizes the data of a more extensive table. Shortly we will see how a pivot table summarizes a larger table and why you should have pivot table in your tool kit for data exploration and visualization.

A pivot table usually helps you make comparisons between desired subsets of a large table.

You may download the file we will use to follow me on my github page. the CSV file is titled “life expectancy”.

To create a pivot table with python, we have to import some python libraries and dependencies.

Using the read_csv() function from pandas we will read the downloaded csv file as “myfile”. We will preview the first 35 rows of the data to have a sense of what our data looks like .

(1704, 6)
country year pop continent lifeExp gdpPercap
0 Afghanistan 1952 8425333.0 Asia 28.801 779.445314
1 Afghanistan 1957 9240934.0 Asia 30.332 820.853030
2 Afghanistan 1962 10267083.0 Asia 31.997 853.100710
3 Afghanistan 1967 11537966.0 Asia 34.020 836.197138
4 Afghanistan 1972 13079460.0 Asia 36.088 739.981106
5 Afghanistan 1977 14880372.0 Asia 38.438 786.113360
6 Afghanistan 1982 12881816.0 Asia 39.854 978.011439
7 Afghanistan 1987 13867957.0 Asia 40.822 852.395945
8 Afghanistan 1992 16317921.0 Asia 41.674 649.341395
9 Afghanistan 1997 22227415.0 Asia 41.763 635.341351
10 Afghanistan 2002 25268405.0 Asia 42.129 726.734055
11 Afghanistan 2007 31889923.0 Asia 43.828 974.580338
12 Albania 1952 1282697.0 Europe 55.230 1601.056136
13 Albania 1957 1476505.0 Europe 59.280 1942.284244
14 Albania 1962 1728137.0 Europe 64.820 2312.888958
15 Albania 1967 1984060.0 Europe 66.220 2760.196931
16 Albania 1972 2263554.0 Europe 67.690 3313.422188
17 Albania 1977 2509048.0 Europe 68.930 3533.003910
18 Albania 1982 2780097.0 Europe 70.420 3630.880722
19 Albania 1987 3075321.0 Europe 72.000 3738.932735
20 Albania 1992 3326498.0 Europe 71.581 2497.437901
21 Albania 1997 3428038.0 Europe 72.950 3193.054604
22 Albania 2002 3508512.0 Europe 75.651 4604.211737
23 Albania 2007 3600523.0 Europe 76.423 5937.029526
24 Algeria 1952 9279525.0 Africa 43.077 2449.008185
25 Algeria 1957 10270856.0 Africa 45.685 3013.976023
26 Algeria 1962 11000948.0 Africa 48.303 2550.816880
27 Algeria 1967 12760499.0 Africa 51.407 3246.991771
28 Algeria 1972 14760787.0 Africa 54.518 4182.663766
29 Algeria 1977 17152804.0 Africa 58.014 4910.416756
30 Algeria 1982 20033753.0 Africa 61.368 5745.160213
31 Algeria 1987 23254956.0 Africa 65.799 5681.358539
32 Algeria 1992 26298373.0 Africa 67.744 5023.216647
33 Algeria 1997 29072015.0 Africa 69.152 4797.295051
34 Algeria 2002 31287142.0 Africa 70.994 5288.040382

I showed the first 35 so you see that the countries are arranged alphabetically with other information following horizontally.

For example, we want to compare the life expectancy between countries in a given year, lets say 1997, that will be a tough task for our eyes and brain and it will take us a while to fetch the info from the table.

Pivot table to the rescue!

We can create a simpler table that will show us how life expectancy interacts with each country in a given year and allow us to make comparisons and have a good insight of our data. Below is the code to create a pivot table with seaborn.

country  Afghanistan  Albania  Algeria  Angola  Argentina  Australia
year
1952 28.801 55.230 43.077 30.015 62.485 69.120
1957 30.332 59.280 45.685 31.999 64.399 70.330
1962 31.997 64.820 48.303 34.000 65.142 70.930
1967 34.020 66.220 51.407 35.985 65.634 71.100
1972 36.088 67.690 54.518 37.928 67.065 71.930
1977 38.438 68.930 58.014 39.483 68.481 73.490
1982 39.854 70.420 61.368 39.942 69.942 74.740
1987 40.822 72.000 65.799 39.906 70.774 76.320
1992 41.674 71.581 67.744 40.647 71.868 77.560
1997 41.763 72.950 69.152 40.963 73.275 78.830
2002 42.129 75.651 70.994 41.003 74.340 80.370
2007 43.828 76.423 72.301 42.731 75.320 81.235

From our pivot table output, we have countries on the column and years on the rows. The cells are filled with life expectancy for the particular country and year.

Congratulations! you now understand pivot table and can create one with python.

--

--

Matthew Prince
Matthew Prince

Written by Matthew Prince

I am a writer who is trying to understand the world. I write on philosophy, psychology, social justice, and everything else. For more info: princedet5@gmail.com

No responses yet