Creating a filled barchart with matplotlib

It’s fairly common to encode a variable with color intensity, as in the chart below, where each bar's color reflects its magnitude. This post takes you step by step through creating one.

[Figure: filled bar chart]

Let's make a filled bar chart from the pandas DataFrame below. Our aim is to map the diff column to color values.

d_per =

State   Boy    Girl   diff   Boys_better
JH      38.0   35.0    3.0   True
BR      35.8   37.0   -1.2   False
UK      34.5   33.0    1.5   True
UP      44.0   43.0    1.0   True
KL      35.0   40.0   -5.0   False

Step:1 Imports and Data

import matplotlib
import matplotlib.pyplot as plt
import matplotlib.patches as mpatches
from matplotlib import colors
import numpy as np
import pandas as pd

d_per = \
pd.DataFrame({
    'State':['JH','BR','UK','UP','KL'],
    'Boy': [38,35.8,34.5,44,35],
    'Girl': [35,37,33,43,40],
})
d_per['diff'] = d_per['Boy'] - d_per['Girl']
d_per['Boys_better'] = d_per['diff']>0

Step:2 Map data to color

Define _COLORS, which maps the values in the color column to RGBA values using matplotlib.cm.ScalarMappable(norm, cmap). It takes two parameters: norm (controls how data values are spaced along the color scale) and cmap (the colormap to use). Some popular norm classes are listed below; more on colormaps can be found in the matplotlib documentation. Each norm has its own parameters, which you can experiment with.

  • matplotlib.colors.Normalize
  • matplotlib.colors.LogNorm
  • matplotlib.colors.PowerNorm
  • matplotlib.colors.SymLogNorm
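To get a feel for how the choice of norm changes the spacing, here is a minimal sketch that pushes the same diff values through Normalize and SymLogNorm and prints the resulting positions on the 0 to 1 color scale. The linthresh=1.0 value and the explicit vmin/vmax are assumptions made for illustration, not settings taken from the original post.

```python
import numpy as np
from matplotlib import colors

# The diff column from the example DataFrame.
diff = np.array([3.0, -1.2, 1.5, 1.0, -5.0])

# Linear scaling: vmin maps to 0 and vmax maps to 1.
linear = colors.Normalize(vmin=diff.min(), vmax=diff.max())

# Symmetric log scaling: linear within +/- linthresh, logarithmic outside.
# linthresh=1.0 is an assumed value chosen only for this comparison.
symlog = colors.SymLogNorm(linthresh=1.0, vmin=diff.min(), vmax=diff.max())

# Calling a norm returns a masked array; convert to plain floats for printing.
linear_scaled = np.ma.filled(linear(diff), np.nan)
symlog_scaled = np.ma.filled(symlog(diff), np.nan)

for value, lin, sym in zip(diff, linear_scaled, symlog_scaled):
    print(f"diff={value:5.1f}  Normalize={lin:.3f}  SymLogNorm={sym:.3f}")
```

Both norms keep the ordering of the data; they differ only in how far apart neighbouring values end up on the color scale.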

The colors are generated by calling to_rgba() on the data to encode: _COLORS = cc_.to_rgba(d_per["diff"]).

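# SymLogNorm requires linthresh (the half-width of the linear region around zero).
# The value 1.0 is an assumed choice; the exact colors printed below depend on it.
cc_ = matplotlib.cm.ScalarMappable(norm=colors.SymLogNorm(linthresh=1.0), cmap='bwr')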

_COLORS = cc_.to_rgba(d_per["diff"])

print(_COLORS)

[[1.         0.         0.         1.        ]
 [0.34509804 0.34509804 1.         1.        ]
 [1.         0.16470588 0.16470588 1.        ]
 [1.         0.25882353 0.25882353 1.        ]
 [0.         0.         1.         1.        ]]
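One thing to watch with a diverging colormap such as bwr: when the norm autoscales to the data range (here -5 to 3), zero does not necessarily land on the white midpoint. A possible variation, sketched below, is to make the limits symmetric around zero so that the sign of diff always decides between red and blue. The names limit and centered_colors and the linthresh=1.0 value are assumptions for illustration.

```python
import numpy as np
from matplotlib import cm, colors

# The diff column from the example DataFrame.
diff = np.array([3.0, -1.2, 1.5, 1.0, -5.0])

# Use the largest absolute value as a symmetric limit so 0 sits mid-scale.
limit = np.abs(diff).max()
centered_norm = colors.SymLogNorm(linthresh=1.0, vmin=-limit, vmax=limit)

centered = cm.ScalarMappable(norm=centered_norm, cmap="bwr")
centered_colors = centered.to_rgba(diff)  # array of shape (5, 4): R, G, B, A

for value, rgba in zip(diff, centered_colors):
    print(f"diff={value:5.1f}  rgba={np.round(rgba, 3)}")
```

With symmetric limits, positive differences come out red-dominant and negative ones blue-dominant, at the cost of not using the full color range on the shorter side of the data.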

Step:3 Plot the figure

Now that you have the data converted to colors, you can pass them to ax.bar() through its color argument.

fig, ax = plt.subplots(figsize=(6,3))

ax.bar(np.arange(len(d_per["diff"])),
        height = d_per["diff"],
        width = 0.5, 
        align = "center", 
        color=_COLORS)

Set ticks and labels

ax.set_xticks(d_per.index)
ax.set_xticklabels(d_per['State'].tolist())
ax.set_xlabel("State")
ax.set_ylabel("Median Performance Difference\n(Boys - Girls)")

# add a thin line at zero for separation
ax.axhline(0, color='k', linewidth = 0.5)

Step:3a Create legend patches

Finally, add legend patches that explain what the two colors mean.

red_patch = mpatches.Patch(color='red', label='Boys are Better')
blue_patch = mpatches.Patch(color='blue', label='Girls are Better')
plt.legend(handles=[red_patch, blue_patch], loc=3)
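The legend patches explain the sign but not the intensity. As an optional extra, the same ScalarMappable can drive a colorbar that shows how magnitude maps to color. The following is a self-contained sketch under a few assumptions: linthresh=1.0, explicit vmin/vmax taken from the data, and the Agg backend so it runs without a display (drop that line in a notebook).

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; an assumption for headless runs
import matplotlib.pyplot as plt
import numpy as np
from matplotlib import cm, colors

states = ["JH", "BR", "UK", "UP", "KL"]
diff = np.array([3.0, -1.2, 1.5, 1.0, -5.0])

# Fix the limits up front so the bars and the colorbar share one scale.
norm = colors.SymLogNorm(linthresh=1.0, vmin=diff.min(), vmax=diff.max())
mappable = cm.ScalarMappable(norm=norm, cmap="bwr")

fig, ax = plt.subplots(figsize=(6, 3))
positions = np.arange(len(diff))
bars = ax.bar(positions, diff, width=0.5, align="center",
              color=mappable.to_rgba(diff))

ax.set_xticks(positions)
ax.set_xticklabels(states)
ax.axhline(0, color="k", linewidth=0.5)

# Reuse the same mappable so the colorbar matches the bar colors exactly.
cbar = fig.colorbar(mappable, ax=ax)
cbar.set_label("Boys - Girls")
```

Call plt.show() or fig.savefig(...) afterwards to view the result.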
