Photo by Adeolu Eletu on Unsplash

The eyes are our most important organs, as we perceive around 80% of all our impressions through vision. It is not surprising that visualizations are the easiest way for us to gather and analyze information. When it comes to data science, plots and graphs of various kinds help us understand problems of diverse complexity. They allow us to identify patterns, relationships, and outliers in our data. So, no matter what data we want to analyze, data visualization is a crucial first step. When working with Python, Matplotlib and corresponding add-ons like seaborn are the go-to tools to achieve that quickly.

In this article, I want to show you a couple of tips to enhance and enrich your Matplotlib figures. Most importantly, I provide you with a nice way of annotating various kinds of bar charts.

Interested? So let’s dive in!

Prerequisites

To follow along with the example, you need Python 3.7+ with Matplotlib, pandas, and seaborn. As always, I recommend using Poetry for managing your Python packages and environments. You can check this article on how to set it up. As a shortcut, I recommend using pip or pipx to install it on your machine.

As a heads-up, we will first create bar charts from sample data without further styling. The goal of this article is to enhance and enrich these charts. You can find all the example code on my GitHub repository.

Setup

First, we create a Poetry project named nice-plots, where we implement the example, and add the necessary packages:

poetry new nice-plots
cd nice-plots
poetry add pandas matplotlib seaborn
touch nice_plots/bar_charts.py

Now we have an isolated Python environment with everything installed that we will need. Great, we can start working!

The Data

As data, we use the famous iris flower dataset. Luckily, that comes directly with seaborn. To create “meaningful” bar charts, I first group the dataset by species using mean as the aggregation function. From this data, I create four different bar charts: vertical and horizontal bar charts, each in a normal and a stacked version. In code, this looks like

import os
from dataclasses import dataclass
from typing import Callable, Tuple

import matplotlib
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

sns.set()

# BLOCK 1
# BLOCK 2
# BLOCK 4
# BLOCK 6

data = sns.load_dataset("iris").groupby("species").mean()
fig, axes = plt.subplots(2, 2)

# Left column: vertical bar charts, right column: horizontal bar charts.
# Top row: normal versions, bottom row: stacked versions.
data.plot.bar(ax=axes[0][0])
data.plot.barh(ax=axes[0][1])
data.plot.bar(stacked=True, ax=axes[1][0])
data.plot.barh(stacked=True, ax=axes[1][1])

# BLOCK 3
# BLOCK 5
# BLOCK 7

plt.show()

I have added comments # BLOCK N where I forward reference the enhancements that follow below. I hope that is not too confusing. I have also added all imports that are required later on.
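In case the grouping step is unfamiliar, here is a minimal sketch of what groupby followed by mean produces. The numbers and the species names "a" and "b" are made up for illustration and are not the iris data; only the shape of the result matters.

```python
import pandas as pd

# Made-up toy numbers, NOT the iris data; only the shape of the result matters.
toy = pd.DataFrame(
    {
        "species": ["a", "a", "b", "b"],
        "sepal_length": [1.0, 3.0, 4.0, 6.0],
        "petal_length": [0.5, 1.5, 2.0, 4.0],
    }
)

# One row per species, one column per measurement.
toy_means = toy.groupby("species").mean()
print(toy_means)
```

Each row of the grouped frame becomes one group of bars, and each column becomes one bar within that group.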

As you might not sit in front of your computer, here is what the resulting bar charts look like

Hmm, not super nice, right? I guess it is worth the effort to make them nicer.

Enhancements

Figure and Font Sizes

The first thing that is obvious when looking at the graphs is that they are way too small compared to the rest of the figure. There are several ways to change that. I prefer to set the global rcParams from pyplot. Global means that they apply to all figures you create and not only to a specific one. To change both the figure size and some font sizes, all you have to do is

# BLOCK 1
def set_sizes(fig_size: Tuple[int, int] = (9, 6), font_size: int = 10):
    plt.rcParams["figure.figsize"] = fig_size
    plt.rcParams["font.size"] = font_size
    plt.rcParams["xtick.labelsize"] = font_size
    plt.rcParams["ytick.labelsize"] = font_size
    plt.rcParams["axes.labelsize"] = font_size
    plt.rcParams["axes.titlesize"] = font_size
    plt.rcParams["legend.fontsize"] = font_size

set_sizes((12, 8), 10)

Running this produces

That’s already way better than before. But there is still room for improvement.
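If you would rather not change the settings for every figure, Matplotlib also offers the rc_context context manager, which applies rcParams only temporarily. The following is a minimal sketch of that idea, not part of the example script; the sizes (4, 3) and 8 are arbitrary placeholder values.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt

default_size = list(plt.rcParams["figure.figsize"])

# Inside the with-block the override is active ...
with plt.rc_context({"figure.figsize": (4, 3), "font.size": 8}):
    small_fig, small_ax = plt.subplots()
    size_inside = list(small_fig.get_size_inches())

# ... and afterwards the previous settings are back.
size_after = list(plt.rcParams["figure.figsize"])
plt.close(small_fig)
```

For a script that produces a single figure, the global approach is simpler. The context manager is mainly useful when one figure needs different settings than the rest.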

Rotate Tick Labels

Vertical text, like the x-tick labels of the vertical bar charts, does not look appealing to me. Besides that, this vertical text also wastes a lot of figure space. To fix that, Matplotlib offers a simple way to rotate tick labels via

# BLOCK 2
def rotate_xticks(ax: matplotlib.axes.Axes, degrees: float = 45):
    ax.set_xticklabels(ax.get_xticklabels(), rotation=degrees)

# BLOCK 3
# Keyword arguments must not precede positional ones, so pass the axis first.
rotate_xticks(axes[0][0], degrees=0)
rotate_xticks(axes[1][0], degrees=0)

As rotating x-ticks by 45 degrees is very common, I make that the default value. However, in this example printing labels horizontally makes the most sense. You can do that by setting degrees to 0.

The output produced on my machine looks like
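As an alternative sketch, the same rotation can be set through tick_params, which changes only the rotation and leaves the label texts alone. The bar heights and the 45 degree angle below are placeholders for illustration.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
ax.bar(["setosa", "versicolor", "virginica"], [1.0, 2.0, 3.0])  # placeholder heights

# Rotate the x-tick labels without touching the label texts themselves.
ax.tick_params(axis="x", labelrotation=45)

rotations = [label.get_rotation() for label in ax.get_xticklabels()]
plt.close(fig)
```

Which variant you use is mostly a matter of taste. The helper above keeps the call site short, while tick_params avoids re-setting the labels.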

Annotate Bar Charts

Bar charts are perfect to get a quick intuition about how different groups compare to each other. However, we might not only be interested in a relative comparison but also want to know the corresponding absolute values. We can achieve both by annotating each bar with the respective value.

For that, I have created a class AnnotateBars that allows you to annotate both vertical and horizontal bar charts, in stacked and unstacked versions:

# BLOCK 4
# Alias types to reduce typing, no pun intended
Patch = matplotlib.patches.Patch
PosVal = Tuple[float, Tuple[float, float]]
Axis = matplotlib.axes.Axes
PosValFunc = Callable[[Patch], PosVal]

@dataclass
class AnnotateBars:
    font_size: int = 10
    color: str = "black"
    n_dec: int = 2

    def horizontal(self, ax: Axis, centered: bool = False):
        def get_vals(p: Patch) -> PosVal:
            value = p.get_width()
            div = 2 if centered else 1
            pos = (
                p.get_x() + p.get_width() / div,
                p.get_y() + p.get_height() / 2,
            )
            return value, pos

        ha = "center" if centered else "left"
        self._annotate(ax, get_vals, ha=ha, va="center")

    def vertical(self, ax: Axis, centered: bool = False):
        def get_vals(p: Patch) -> PosVal:
            value = p.get_height()
            div = 2 if centered else 1
            pos = (
                p.get_x() + p.get_width() / 2,
                p.get_y() + p.get_height() / div,
            )
            return value, pos

        va = "center" if centered else "bottom"
        self._annotate(ax, get_vals, ha="center", va=va)

    def _annotate(self, ax, func: PosValFunc, **kwargs):
        cfg = {"color": self.color, "fontsize": self.font_size, **kwargs}
        for p in ax.patches:
            value, pos = func(p)
            ax.annotate(f"{value:.{self.n_dec}f}", pos, **cfg)

Phew, that is a lot of code, but no worries, I will guide you through how to use it.

First, you need to create an instance of AnnotateBars. You can specify the font-size, the color of the text, and the number of decimals that should be printed. All these parameters have sensible default values.

Next, you need to call vertical or horizontal depending on the kind of bar chart you want to annotate. To those functions, you need to pass the axis object that holds the respective bar chart. Furthermore, they accept an additional parameter called centered. With that, you can determine if the annotations are printed in the center of the bar or on top/right of it. This is especially helpful when you work with stacked bar charts.
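To make the mechanics less abstract, here is a stripped-down sketch of what happens for a vertical, non-centered chart: every bar is a rectangle patch, and its geometry yields both the value and the position of the label. The two bars and their heights are placeholders.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
ax.bar(["a", "b"], [1.5, 2.0])  # two placeholder bars

# Each bar is a Rectangle patch; its geometry gives value and label position.
tops = []
for patch in ax.patches:
    value = patch.get_height()  # bar height equals the plotted value
    x_center = patch.get_x() + patch.get_width() / 2
    y_top = patch.get_y() + patch.get_height()
    tops.append((value, (x_center, y_top)))
    ax.annotate(f"{value:.2f}", (x_center, y_top), ha="center", va="bottom")
plt.close(fig)
```

For stacked bars, get_y returns the bottom of the respective segment, which is why adding the height (or half of it when centered) also lands in the right place there.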

Enough talking, let’s make use of this class and annotate our four bar charts with various configurations

# BLOCK 5
AnnotateBars().vertical(axes[0][0])
AnnotateBars(color="blue").vertical(axes[1][0], True)
AnnotateBars().horizontal(axes[0][1])
AnnotateBars(font_size=8, n_dec=1).horizontal(axes[1][1], True)

And here are the resulting output charts

Now we are even smarter and know the absolute values and not only the relations. Awesome!

As a side note, when we have very small bars in stacked bar charts, overlapping values become an issue. You can see that from the stacked Setosa chart. In such cases, you have to decide whether you can accept that or require something different.
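As a point of comparison, and assuming you are on Matplotlib 3.4 or newer, there is also a built-in Axes.bar_label method that covers the simple cases. The sketch below uses placeholder values and is not part of the example script. For charts created through pandas, the bar containers it expects should be available via ax.containers.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
bars = ax.bar(["a", "b"], [1.5, 2.25])  # placeholder values

# label_type="edge" puts the text at the bar end, "center" inside the bar.
labels = ax.bar_label(bars, fmt="%.2f", label_type="edge", padding=2)
label_texts = [label.get_text() for label in labels]
plt.close(fig)
```

The class shown above remains useful when you want one object that carries color, font size, and number of decimals across several charts.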

Save Plot

Finally, when you have created awesome plots that you are proud of, you might want to share them with your colleagues. For that, you have to store your plots in a format like PNG that you can easily distribute. Even though the syntax for dumping figures to images is fairly simple, I tend to forget it. That is why I use this helper function

# BLOCK 6
def save_figure(fig: matplotlib.figure.Figure, path: str):
    folder = os.path.dirname(path)
    if folder:
        os.makedirs(folder, exist_ok=True)
    fig.savefig(path, bbox_inches="tight")

You only have to pass it a Figure object and the path to the output image file. If the folder you want to store your image in does not exist, the function automatically creates it for you. For me, this comes in handy quite often. Let’s add this last block of code, and we are done

# BLOCK 7

save_figure(fig, "./plots/medium/bar-charts.png")

And that is how I created all the plots you have seen previously. I definitely eat my own dog food :)!
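If you want to convince yourself that the folder creation works, here is a small self-contained sketch that repeats the helper and writes into a temporary directory. The nested folder names and the file name demo.png are arbitrary placeholders.

```python
import os
import tempfile

import matplotlib

matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt


def save_figure(fig, path):
    # Same logic as the helper above: create the folder if needed, then save.
    folder = os.path.dirname(path)
    if folder:
        os.makedirs(folder, exist_ok=True)
    fig.savefig(path, bbox_inches="tight")


fig, ax = plt.subplots()
ax.bar(["a", "b"], [1.0, 2.0])  # placeholder data

with tempfile.TemporaryDirectory() as tmp_dir:
    # Neither "plots" nor "nested" exists yet inside the temporary directory.
    target = os.path.join(tmp_dir, "plots", "nested", "demo.png")
    save_figure(fig, target)
    file_was_written = os.path.isfile(target)
plt.close(fig)
```

The temporary directory is removed again when the with-block ends, so the sketch leaves no files behind.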

Wrap Up

In this article, I showed you a couple of Matplotlib functions to enhance your plots. Most importantly, I presented you with a nice way to annotate various kinds of bar charts.

Thank you for following this post. As always, feel free to contact me for questions, comments, or suggestions. Happy plotting!