Sales forecasting is one of the most common tasks in sales-driven organizations. When done well, it enables an organization to plan for the future with a degree of confidence. In this tutorial, we’ll use Prophet, a package developed by Facebook, to show how this can be done. The package is available in both Python and R. We assume that the reader has a basic understanding of handling time series data in Python.

Plan of Attack

1. Introduction and Installation
2. Model Fitting
3. Making Future Predictions
4. Obtaining the Forecasts
5. Plotting the Forecasts
6. Plotting the Forecast Components
7. Cross Validation
8. Obtaining the Performance Metrics
9. Visualizing Performance Metrics
10. Conclusion

Introduction and Installation

Prophet works best with hourly, daily, or weekly observations that span at least a few months, and preferably a year or more, of history. According to Facebook Research:

At its core, the Prophet procedure is an additive regression model with four main components:
1. A piecewise linear or logistic growth curve trend. Prophet automatically detects changes in trends by selecting change points from the data.
2. A yearly seasonal component modeled using Fourier series.
3. A weekly seasonal component using dummy variables.
4. A user-provided list of important holidays.
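
The additive structure described in the quote can be written compactly. The notation below is an assumption borrowed from the Prophet paper rather than from the quote itself: g(t) is the trend, s(t) collects the seasonal components, h(t) is the holiday effect, and the last term is the error. The second line sketches the Fourier-series form of a seasonal component with period P (for example, P = 365.25 days for yearly seasonality).

```latex
% Additive decomposition (symbols assumed, following the Prophet paper)
y(t) = g(t) + s(t) + h(t) + \varepsilon_t

% Fourier-series form of a seasonal component with period P
s(t) = \sum_{n=1}^{N} \left( a_n \cos\frac{2\pi n t}{P} + b_n \sin\frac{2\pi n t}{P} \right)
```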

Prophet can be installed with pip. It depends on a Python module called pystan, which is installed automatically along with Prophet.
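
As a rough sketch, the snippet below checks whether Prophet is already importable and prints the pip command if it is not. It assumes the package is published as prophet (releases before 1.0 were published as fbprophet); the helper name find_prophet_package is hypothetical and not part of Prophet.

```python
import importlib.util


def find_prophet_package():
    """Return the importable Prophet package name, or None if neither is installed."""
    # "prophet" is the current package name; "fbprophet" was used before version 1.0.
    for name in ("prophet", "fbprophet"):
        if importlib.util.find_spec(name) is not None:
            return name
    return None


installed = find_prophet_package()
if installed is None:
    # Run this command in a terminal, not inside the Python interpreter.
    print("Prophet not found. Install it with: pip install prophet")
else:
    print(f"Prophet is available as the '{installed}' package")
```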

The dataset we’ve used for this tutorial is available here. It’s actually a weather-related dataset that tracks temperature changes in a river near Dallas, TX over time. It isn’t sales data (I don’t happen to have any formatted sales data on hand), but I chose it because it is well formatted and ready to use. The same approach applies to actual sales data that has been formatted like this dataset.

Once you download the dataset, make sure to delete the unnecessary rows towards the end of the file, because they might interfere with the analysis. For this univariate analysis, Prophet expects the dataset to have two columns named ds and y: ds is the date column, while y is the column that we’re forecasting.
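
If your own data uses different column names, a minimal sketch of reshaping it into the ds and y layout might look like the following. The column names Date and Temperature and the three rows are made up for illustration; they are not taken from the tutorial’s dataset.

```python
import pandas as pd

# Made-up rows with hypothetical column names, standing in for a raw export.
raw = pd.DataFrame({
    "Date": ["2017-01-01", "2017-01-02", "2017-01-03"],
    "Temperature": [10.5, 11.0, 9.8],
})

# Rename to the two column names Prophet expects.
df = raw.rename(columns={"Date": "ds", "Temperature": "y"})

# Parse the dates and keep only complete rows in the two required columns.
df["ds"] = pd.to_datetime(df["ds"])
df = df[["ds", "y"]].dropna()

print(df.dtypes)
```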

Let’s get the ball rolling by importing Pandas for data manipulation and Prophet for forecasting. Next, we load in our dataset and check its head.
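
A sketch of this step is shown below. Since the original file is not reproduced here, a small inline CSV with made-up values stands in for it; with the real download you would pass the file path to pd.read_csv instead. The import is guarded so the data-loading part still runs when Prophet is not installed.

```python
import io

import pandas as pd

try:
    from prophet import Prophet  # current package name
except ImportError:
    try:
        from fbprophet import Prophet  # package name before version 1.0
    except ImportError:
        Prophet = None  # Prophet is not installed; loading the data still works

# Made-up stand-in for the downloaded file, already in the ds / y layout.
csv_text = """ds,y
2017-01-01,10.5
2017-01-02,11.0
2017-01-03,9.8
2017-01-04,10.1
"""

# With the real dataset, replace io.StringIO(csv_text) with the path to the CSV file.
df = pd.read_csv(io.StringIO(csv_text), parse_dates=["ds"])

# Inspect the first few rows.
print(df.head())
```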