Introduction This guest post that walks through a great example of using python to automate a report generating process. I think PB Python readers will enjoy learning from this real world example using python, jupyter notebooks, papermill and several other tools. Before we get started, I would like to introduce the author: My name is Duarte Carmo and I’m a product manager and digital consultant. Originally from Lisbon - Portugal, but currently living and working in Copenhagen - Denmark. Find more about my work and leisure in my website.

Part 1 - Tool roundup Welcome to part 1 of this two-part series post about automating report generation using python, jupyter, papermill, and a couple of other tools. In the first part, we will cover 4 main important workflows that are part of the automation process. In the second and final part, we will bring everything together and build our own report automation system. Note: This code was written in python 3.7. You might have to adapt the code for older versions of python. All of the code for this article is available on github. Alright, let’s get to work.

Automating report generation with Python - Why? Not everyone can code. This might seem like an obvious statement, but once you start using python to automate or analyze things around you, you start to encounter a big problem: reproducibility. Not everyone knows how to run your scripts, use your tools, or even use a modern browser. Let us say you built a killer script. How exactly do you make someone who has never heard the word “python” use it? You could teach them python, but that would take a long time. In this series, we will teach you how you can automatically generate shareable Html reports from any excel file using a combination of tools, centered around python.

Creating a Jupyter Notebook reports from Excel files Let us say you have an excel file sales_january.xlsx with a list of the sales generated by a group of employees. Just like this: Let’s start by using a jupyter notebook sales_january.ipynb to create a very simple analysis of that sales data. We start by importing the pandas and maplotlib libraries. After that, we specify the name of our file using the filename variable. Finally, we use the read_excel function to read our data into a pandas DataFrame. import pandas as pd import matplotlib.pyplot as plt % matplotlib inline # so plots are printed automatically filename = "sales_january.xlsx" data = pd . read_excel ( filename , index_col = 0 ) When printing the data dataframe, we get the following: After that, we plot the data using pandas: data . plot ( kind = "bar" , title = f "Sales report from { filename } " ) And we get the following: And that’s it! We have a jupyter notebook that analyzes (a very simple analysis let us say) a sales report in excel. Now let’s say we want to share that report with other people in the organization, what do we do?

Generating Html reports from Jupyter Notebooks to share with colleagues In my experience, the easiest way to share a report with colleagues is to use a little tool called nbconvert. Nbconvert allows you to generate an Html version of your notebook. To install it simply run pip install nbconvert . To do this, start by navigating to the same directory where your notebook is and run the following from your terminal: $ jupyter nbconvert sales_january.ipynb You will see that a new file named sales_january.html was created. Html files are better than ipynb in the measure that they are easily shareable via email, message, or any other way. Just make sure the person receiving the file opens it via a relatively modern browser. But lets us say that this sales report comes in every month, how can we automatically run this notebook with any excel file that has the same format?