Data Tricks
Let's suppose to start with a csv data file (testdata.csv) structured in three columns as shown below, where each row represents an item that belongs to a particular category, and has a specific value.
item,cat,value
7656866085,D,46998772
6516448034,A,58853106
1588654381,A,538375067
5222817048,C,83453539
475012984,C,39846344
1532273337,H,39754601
...
GNU datamash latest version (1.7) can be compiled with Mingw 64 bit toolchain on Windows, and it can be used to check if all rows have the same number of fields:
$ datamash.exe -t, check <testdata.csv
454 lines, 3 fields
The sum of all the values in the last column for each category can be obtained with:
$ cat testdata.csv | datamash.exe -t, -H -s groupby 2 sum 3 >pareto.csv
and saved to a new csv file.
GroupBy(cat),sum(value)
A,5502572868
B,408375237
C,3887575164
D,1455346293
F,836073125
H,560329867
M,110014
R,17583634
S,14034088
T,23165466
W,13736123
This file can then be imported inside the Ipython console directly from the command line with csvpy as shown in the video below:
To create a Pareto chart from the previous data, it's necessary to extend the dataframe with a column that contains the categorical cumulative percentage (right vertical axis).
...
In [8]: df=pd.DataFrame(data[1:],columns=data[0])
In [9]: df = df.apply(pd.to_numeric, errors='ignore')
In [10]: df=df.sort_values(by='sum(value)',ascending=False)
In [11]: df["per_c"]=100*df['sum(value)'].cumsum()/df['sum(value)'].sum()
...
The below Pareto chart is a bar and line graph that displays the relative importance attributed to each category. Bars are arranged in decreasing magnitude from left to right along the x-axis, while the solid red line shows the cumulative percentage. Applied to the Business Model Canvas, a Pareto diagram shows that the majority of costs, for instance, are due to a small subset of resources and activities. This may not be trivial to identify by just looking at the data.
Pandas GUI
From the Ipython console, it's also possible to export all the dataframes to the Pandas-GUI with this simple set of instructions.