Love it or loathe it, PowerPoint is widely used in most business settings. This article will not debate the merits of PowerPoint but will show you how to use Python to remove some of the drudgery by automating the creation of PowerPoint slides.

Fortunately for us, there is an excellent Python library for creating and updating PowerPoint files: python-pptx. The API is very well documented so it is pretty easy to use. The only tricky part is understanding the PowerPoint document structure, including the various master layouts and elements. Once you understand the basics, it is relatively simple to automate the creation of your own PowerPoint slides. This article will walk through an example of reading in and analyzing some Excel data with pandas, creating tables and building a graph that can be embedded in a PowerPoint file.

Python-pptx can create blank PowerPoint files but most people are going to prefer working with a predefined template that they can customize with their own content. Python-pptx's API supports this process quite simply as long as you know a few things about your template.

Before diving into some code samples, there are two key components you need to understand: Slide Layouts and Placeholders. In the images below you can see an example of two different layouts as well as the template's placeholders where you can populate your content.

In the image below, you can see that we are using Layout 0 and there is one placeholder on the slide at index 1.

In this image, we use Layout 1 for a completely different look.

In order to make your life easier with your own templates, I created a simple standalone script that takes a template and marks it up with the various elements.

I won't explain all the code line by line but you can see analyze_ppt.py on GitHub. Here is the function that does the bulk of the work:

""" Take the input file and analyze the structure.
The output file contains marked up information to make it easier
# Loop through them all and see where the various elements are
# Not every slide has to have a title
# Go through all the placeholders and identify them by index and type
# Do not overwrite the title which is just a special placeholder

The basic flow of this function is to loop through and create an example of every layout included in the source PowerPoint file. Then on each slide, it will populate the title (if it exists). Finally, it will iterate through all of the placeholders included in the template and show the index of the placeholder as well as the type.

If you want to try it yourself:

Refer to the input and output files to see what you get.
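To see which layouts and placeholders a template offers, something like the following can list them. This is a minimal sketch and not the analyze_ppt.py script itself: the function names describe_layouts and describe_template are made up for illustration, and the code assumes the template exposes the usual python-pptx attributes (slide_layouts, placeholders, placeholder_format.idx, name).

```python
def describe_layouts(prs):
    """Return (layout index, layout name, placeholder idx, placeholder name) rows.

    `prs` is assumed to be a python-pptx Presentation, or any object that
    exposes the same attributes.
    """
    rows = []
    for layout_index, layout in enumerate(prs.slide_layouts):
        for shape in layout.placeholders:
            rows.append((layout_index, layout.name,
                         shape.placeholder_format.idx, shape.name))
    return rows


def describe_template(path):
    """Open a .pptx file and describe its layouts (requires python-pptx)."""
    # Imported here so describe_layouts has no hard dependency on python-pptx.
    from pptx import Presentation
    return describe_layouts(Presentation(path))
```

Calling describe_template on your own template and printing each row gives a quick text-only view of the layout indexes and placeholder indexes you will need later.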

Creating your own PowerPoint

For the dataset and analysis, I will be replicating the analysis in Generating Excel Reports from a Pandas Pivot Table. The article explains the pandas data manipulation in more detail so it will be helpful to make sure you are comfortable with it before going too much deeper into the code.

Let’s get things started with the inputs and basic shell of the program:

from __future__ import print_function
from pptx import Presentation
from pptx.util import Inches
import argparse
import pandas as pd
import numpy as np
from datetime import date
import matplotlib.pyplot as plt
import seaborn as sns

# Functions go here

if __name__ == "__main__":
    args = parse_args()
    df = pd.read_excel(args.report.name)
    report_data = create_pivot(df)
    create_chart(df, "report-image.png")
    create_ppt(args.infile.name, args.outfile.name, report_data, "report-image.png")

After we parse our command line args, we read the source Excel file into a pandas DataFrame. Next, we use that DataFrame as an input to create the pivot table summary of the data:

def create_pivot(df, index_list=["Manager", "Rep", "Product"],
                 value_list=["Price", "Quantity"]):
    """
    Take a DataFrame and create a pivot table
    Return it as a DataFrame pivot table
    """
    table = pd.pivot_table(df, index=index_list,
                           values=value_list,
                           aggfunc=[np.sum, np.mean], fill_value=0)
    return table

Consult the Generating Excel Reports from a Pandas Pivot Table article if this does not make sense to you.
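To get a feel for the shape of the result, here is a small sketch on made-up data (the names and numbers are invented, not taken from the sales file). It passes the aggregations as the strings "sum" and "mean", which is an assumption on my part about what works cleanly on current pandas; the result has the same layout of one column per (aggregation, value) pair.

```python
import pandas as pd

# Made-up rows that mimic the columns used in this article
sales = pd.DataFrame({
    "Manager": ["Ann", "Ann", "Bob"],
    "Rep": ["Cy", "Cy", "Di"],
    "Product": ["CPU", "CPU", "Monitor"],
    "Price": [100, 200, 50],
    "Quantity": [1, 3, 2],
})

table = pd.pivot_table(sales,
                       index=["Manager", "Rep", "Product"],
                       values=["Price", "Quantity"],
                       aggfunc=["sum", "mean"],
                       fill_value=0)

# Rows are indexed by (Manager, Rep, Product); columns by (aggregation, value)
ann_total_price = table.loc[("Ann", "Cy", "CPU"), ("sum", "Price")]
ann_mean_quantity = table.loc[("Ann", "Cy", "CPU"), ("mean", "Quantity")]
```

The two-level row index is what makes it possible to pull out one manager at a time later on.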

The next piece of the analysis is creating a simple bar chart of sales performance by account:

def create_chart(df, filename):
    """ Create a simple bar chart saved to the filename based on the dataframe
    passed to the function
    """
    df['total'] = df['Quantity'] * df['Price']
    # Series.order() was removed from pandas; sort_values() is its replacement
    final_plot = df.groupby('Name')['total'].sum().sort_values().plot(kind='barh')
    fig = final_plot.get_figure()
    # Size is the same as the PowerPoint placeholder
    fig.set_size_inches(6, 4.5)
    fig.savefig(filename, bbox_inches='tight', dpi=600)
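The aggregation behind the chart can be tried on its own, without any plotting. This sketch uses invented account names and numbers, and it sorts with Series.sort_values, the sorting method available in current pandas releases.

```python
import pandas as pd

# Invented orders, only to illustrate the aggregation step
orders = pd.DataFrame({
    "Name": ["Acme", "Bolt", "Acme", "Cogs"],
    "Quantity": [2, 1, 1, 5],
    "Price": [10.0, 60.0, 30.0, 4.0],
})

orders["total"] = orders["Quantity"] * orders["Price"]
# One total per account, smallest first
totals = orders.groupby("Name")["total"].sum().sort_values()
```

Because a horizontal bar chart draws the first entry at the bottom, sorting in ascending order puts the largest account at the top of the plot.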

Here is a scaled down version of the image:

We have a chart and a pivot table completed. Now we are going to embed that information into a new PowerPoint file based on a given PowerPoint template file.

Before I go any further, there are a couple of things to note. You need to know what layout you would like to use as well as where you want to populate your content. Looking at the output of analyze_ppt.py, we know that the title slide is layout 0 and that it has a title attribute and a subtitle at placeholder 1.

Here is the start of the function that we use to create our output PowerPoint:

def create_ppt(input, output, report_data, chart):
    """ Take the input powerpoint file and use it as the template for the output
    file.
    """
    prs = Presentation(input)
    # Use the output from analyze_ppt to understand which layouts and placeholders
    # to use
    # Create a title slide first
    title_slide_layout = prs.slide_layouts[0]
    slide = prs.slides.add_slide(title_slide_layout)
    title = slide.shapes.title
    subtitle = slide.placeholders[1]
    title.text = "Quarterly Report"
    subtitle.text = "Generated on {:%m-%d-%Y}".format(date.today())

This code creates a new presentation based on our input file, adds a single slide and populates the title and subtitle on the slide. It looks like this:

Pretty cool, huh?
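The subtitle text relies on the fact that date objects accept strftime codes inside a format spec. A tiny sketch with a fixed date makes the result predictable; the helper name subtitle_for is made up for this example.

```python
from datetime import date


def subtitle_for(day):
    """Build the subtitle text for a given date (month-day-year)."""
    return "Generated on {:%m-%d-%Y}".format(day)


example = subtitle_for(date(2015, 8, 17))
print(example)  # Generated on 08-17-2015
```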

The next step is to embed our picture into a slide.

From our previous analysis, we know that the graph slide we want to use is layout index 8, so we create a new slide, add a title then add a picture into placeholder 1. The final step adds a subtitle at placeholder 2.

    # Create the summary graph
    graph_slide_layout = prs.slide_layouts[8]
    slide = prs.slides.add_slide(graph_slide_layout)
    title = slide.shapes.title
    title.text = "Sales by account"
    placeholder = slide.placeholders[1]
    pic = placeholder.insert_picture(chart)
    subtitle = slide.placeholders[2]
    subtitle.text = "Results consistent with last quarter"

Here is our masterpiece:

For the final portion of the presentation, we will create a table for each manager with their sales performance.

Here is an image of what we’re going to achieve:

Creating tables in PowerPoint is a good news / bad news story. The good news is that there is an API to create one. The bad news is that you can't easily convert a pandas DataFrame to a table using the built-in API. However, we are very fortunate that someone has already done all the hard work for us and created PandasToPowerPoint.

This excellent piece of code takes a DataFrame and converts it to a PowerPoint compatible table. I have taken the liberty of including a portion of it in my script. The original has more functionality that I am not using so I encourage you to check out the repo and use it in your own code.
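To give a rough idea of what such a conversion involves, here is a simplified sketch. It is not the PandasToPowerPoint code and it does none of its formatting; the helper names frame_to_rows and add_rows_as_table are hypothetical. It assumes slide is a python-pptx slide, whose shapes.add_table(rows, cols, left, top, width, height) call returns a frame holding the table.

```python
import pandas as pd


def frame_to_rows(df):
    """Return the header row plus data rows as lists of strings."""
    header = []
    for col in df.columns:
        if isinstance(col, tuple):
            # Multi-level column labels: join the non-empty parts
            header.append(" ".join(str(part) for part in col if str(part) != ""))
        else:
            header.append(str(col))
    body = [[str(value) for value in row]
            for row in df.itertuples(index=False, name=None)]
    return [header] + body


def add_rows_as_table(slide, rows, left, top, width, height):
    """Create a table on the slide and fill it cell by cell."""
    frame = slide.shapes.add_table(len(rows), len(rows[0]),
                                   left, top, width, height)
    for r, row in enumerate(rows):
        for c, text in enumerate(row):
            frame.table.cell(r, c).text = text
    return frame
```

Splitting the work this way keeps the pandas part testable on its own, since only add_rows_as_table touches the presentation.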

    # Create a slide for each manager
    for manager in report_data.index.get_level_values(0).unique():
        slide = prs.slides.add_slide(prs.slide_layouts[2])
        title = slide.shapes.title
        title.text = "Report for {}".format(manager)
        top = Inches(1.5)
        left = Inches(0.25)
        width = Inches(9.25)
        height = Inches(5.0)
        # Flatten the pivot table by resetting the index
        # Create a table on the slide
        df_to_table(slide, report_data.xs(manager, level=0).reset_index(),
                    left, top, width, height)
    prs.save(output)

The code takes each manager out of the pivot table and builds a simple DataFrame that contains the summary data. It then uses the df_to_table function to convert the DataFrame into a PowerPoint compatible table.
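The per-manager slicing is easier to follow on a toy pivot table. The sketch below uses made-up data and a single aggregation to keep the columns flat; it only illustrates the get_level_values, xs and reset_index steps.

```python
import pandas as pd

# Made-up data with the same index columns as the report
sales = pd.DataFrame({
    "Manager": ["Ann", "Ann", "Bob"],
    "Rep": ["Cy", "Dee", "Eli"],
    "Product": ["CPU", "Monitor", "CPU"],
    "Quantity": [1, 3, 2],
})
report = pd.pivot_table(sales, index=["Manager", "Rep", "Product"],
                        values=["Quantity"], aggfunc="sum")

# Level 0 of the row index holds the manager names
managers = list(report.index.get_level_values(0).unique())

# xs() selects one manager and drops that level; reset_index() turns the
# remaining index levels (Rep, Product) back into ordinary columns
per_manager = {m: report.xs(m, level=0).reset_index() for m in managers}
```

Each value in per_manager is a plain DataFrame with Rep, Product and Quantity columns, which is the flat shape a table on a slide needs.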

If you want to run this on your own, the full command would look something like this:

python create_ppt.py simple-template.pptx sales-funnel.xlsx myreport.pptx
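The parse_args function is not shown in this article. A sketch that is consistent with the command above and with the args.infile.name, args.report.name and args.outfile.name accesses in the main block might look like the following. The use of argparse.FileType and the help texts are assumptions inferred from those attribute names, not the script's actual code.

```python
import argparse


def parse_args(argv=None):
    """Parse the template, Excel report and output file names."""
    parser = argparse.ArgumentParser(
        description="Create a PowerPoint report from Excel data")
    # FileType opens each file, so the parsed value has a .name attribute
    parser.add_argument("infile", type=argparse.FileType("r"),
                        help="PowerPoint template to start from")
    parser.add_argument("report", type=argparse.FileType("r"),
                        help="Excel file with the source data")
    parser.add_argument("outfile", type=argparse.FileType("w"),
                        help="PowerPoint file to write")
    return parser.parse_args(argv)
```

One consequence of this design is that a missing template or Excel file is reported by argparse before any analysis starts, and that the output file is created as soon as the arguments are parsed.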

All of the relevant files are available in the GitHub repository.