Time Series Forecasting with Prophet

Learn how to use Facebook's Prophet library for time series forecasting. This tutorial covers installation, data preparation, model building, evaluation, and practical applications.

Introduction to Prophet

Prophet is a procedure for forecasting time series data based on an additive model where non-linear trends are fit with yearly, weekly, and daily seasonality, plus holiday effects. It works best with time series that have strong seasonality effects and several seasons of historical data. Prophet is robust to missing data and shifts in the trend, and typically handles outliers well.

Installation

Install the Prophet library using pip. Make sure you have Python and pip installed on your system.

pip install prophet

Data Preparation

Prophet requires the input data to be in a specific format. The time column must be named 'ds' (datetime), and the value column must be named 'y'. The code snippet demonstrates how to load data from a CSV file, rename the columns accordingly, and convert the 'ds' column to datetime objects.

Note: Replace 'example_data.csv' with your actual data file path and 'Date' and 'Value' with the correct column names.

import pandas as pd
from prophet import Prophet

# Load the data
df = pd.read_csv('example_data.csv')

# Rename columns to 'ds' (datetime) and 'y' (value)
df.rename(columns={'Date': 'ds', 'Value': 'y'}, inplace=True)

# Convert 'ds' to datetime objects
df['ds'] = pd.to_datetime(df['ds'])

print(df.head())
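
Before fitting, it can help to confirm that the frame really has the shape Prophet expects. The following is a minimal sketch that uses an in-memory frame in place of the CSV file; the 'Date' and 'Value' column names and the helper prepare_for_prophet are illustrative assumptions, not part of Prophet.

```python
import pandas as pd


def prepare_for_prophet(raw, date_col, value_col):
    """Return a copy of `raw` with Prophet's expected 'ds' and 'y' columns.

    Hypothetical helper for illustration; not part of the Prophet API.
    """
    df = raw[[date_col, value_col]].rename(columns={date_col: "ds", value_col: "y"})
    df["ds"] = pd.to_datetime(df["ds"])                # parse strings into datetimes
    df["y"] = pd.to_numeric(df["y"], errors="coerce")  # non-numeric values become NaN
    df = df.drop_duplicates(subset="ds")               # keep one row per timestamp
    return df.sort_values("ds").reset_index(drop=True)


# Small in-memory stand-in for the CSV file (made-up values)
raw = pd.DataFrame({
    "Date": ["2023-01-03", "2023-01-01", "2023-01-02", "2023-01-02"],
    "Value": ["12.5", "10.0", "11.0", "11.0"],
})
prepared = prepare_for_prophet(raw, "Date", "Value")
print(prepared)
print(prepared.dtypes)
```

Sorting and de-duplicating are not strictly required by the format described above, but they make problems in the source file easier to spot before training.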

Model Building

Create a Prophet model instance and fit it to your historical data. The fit() method trains the model using the provided time series data.

model = Prophet()
model.fit(df)

Making Predictions

To make predictions, first create a dataframe that contains the dates for which you want to forecast. The make_future_dataframe() method generates a dataframe with the specified number of periods (days in this case) into the future. Then, use the predict() method to generate the forecast. The forecast dataframe includes columns for the predicted values (yhat), lower bound (yhat_lower), and upper bound (yhat_upper).

# Create a future dataframe for predictions
future = model.make_future_dataframe(periods=365)

# Make predictions
forecast = model.predict(future)

print(forecast[['ds', 'yhat', 'yhat_lower', 'yhat_upper']].tail())
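
To see which dates end up in the future dataframe, the idea can be approximated with plain pandas. This is a sketch of the behavior, not Prophet's implementation: it assumes daily data and that the historical dates are kept alongside the new ones.

```python
import pandas as pd

# Stand-in for the historical 'ds' column (10 made-up daily dates)
history_dates = pd.Series(pd.date_range("2023-01-01", periods=10, freq="D"), name="ds")

periods = 5  # number of future steps to add

# New dates start one step after the last observed date
last_date = history_dates.max()
future_dates = pd.date_range(start=last_date, periods=periods + 1, freq="D")[1:]

# Historical dates followed by the new dates
all_dates = pd.concat([history_dates, pd.Series(future_dates)], ignore_index=True)
future_sketch = pd.DataFrame({"ds": all_dates})

print(len(future_sketch))            # 10 historical rows plus 5 future rows
print(future_sketch["ds"].iloc[-1])  # last future date
```

Because the historical dates are included, predict() returns fitted values for the past as well as forecasts for the new dates. For data that is not daily, make_future_dataframe() accepts a freq argument.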

Visualizing the Forecast

Prophet provides built-in plotting functions to visualize the forecast and its components (trend, yearly seasonality, weekly seasonality). The plot() method displays the forecast along with the historical data. The plot_components() method shows the individual components of the forecast.

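import matplotlib.pyplot as plt

# Plot the forecast
fig1 = model.plot(forecast)

# Plot the components of the forecast (trend, seasonality)
fig2 = model.plot_components(forecast)

# Display the figures when running as a script (not needed in a notebook)
plt.show()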

Evaluating Model Performance

To evaluate the model's performance, use cross-validation. The cross_validation function produces simulated historical forecasts: for a series of cutoff dates, it fits the model on the data up to each cutoff and forecasts the period that follows. The initial parameter specifies the initial training period, the period parameter specifies the spacing between cutoff dates, and the horizon parameter specifies the forecast horizon. With the values used here, the history must span at least initial plus horizon (about three years) for any cutoff to exist. Then, calculate performance metrics such as Mean Absolute Error (MAE), Mean Squared Error (MSE), and Root Mean Squared Error (RMSE) using the performance_metrics function.

from prophet.diagnostics import cross_validation
from prophet.diagnostics import performance_metrics

# Perform cross-validation
df_cv = cross_validation(model, initial='730 days', period='180 days', horizon='365 days')

# Calculate performance metrics
df_p = performance_metrics(df_cv)
print(df_p.head())
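
How initial, period, and horizon interact is easiest to see by listing the cutoff dates they imply. The sketch below is a simplified, pandas-only approximation that assumes gap-free data; the function name sketch_cutoffs is hypothetical, and Prophet's own logic handles more edge cases.

```python
import pandas as pd


def sketch_cutoffs(dates, initial, period, horizon):
    """Approximate the cutoff dates used for simulated historical forecasts.

    Hypothetical helper for illustration; assumes `dates` has no large gaps.
    """
    initial, period, horizon = (pd.Timedelta(x) for x in (initial, period, horizon))
    earliest_allowed = dates.min() + initial  # training data must span at least `initial`
    cutoff = dates.max() - horizon            # last cutoff leaves a full horizon to score
    cutoffs = []
    while cutoff >= earliest_allowed:
        cutoffs.append(cutoff)
        cutoff -= period                      # step back by `period` for the next cutoff
    return sorted(cutoffs)


# Five years of made-up daily dates
dates = pd.Series(pd.date_range("2015-01-01", "2019-12-31", freq="D"))
cutoffs = sketch_cutoffs(dates, initial="730 days", period="180 days", horizon="365 days")
for c in cutoffs:
    print(c.date())
```

For each cutoff, the model is trained on data up to that date and scored on the horizon that follows. The resulting pairs of actual and predicted values are what performance_metrics summarizes.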

Adding Seasonality

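By default, Prophet fits yearly and weekly seasonality when the history is long enough to support them (and daily seasonality for sub-daily data). You can add custom seasonality patterns, such as monthly seasonality, with the add_seasonality() method. The period parameter specifies the length of the seasonality cycle in days, and the fourier_order parameter controls the flexibility of the seasonality curve. The example below turns off the built-in weekly and yearly components so that only the monthly pattern is fitted; omit those two arguments to keep them.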

model = Prophet(weekly_seasonality=False, yearly_seasonality=False)
model.add_seasonality(name='monthly', period=30.5, fourier_order=5)
model.fit(df)
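
Prophet models seasonality with Fourier terms, which is why period and fourier_order together determine the shapes the seasonal curve can take. The following numpy sketch builds such terms for the monthly example; the function name fourier_features is hypothetical, and the code illustrates the idea rather than reproducing Prophet's internals.

```python
import numpy as np


def fourier_features(t_days, period, fourier_order):
    """Return an array with 2 * fourier_order columns of sine and cosine terms.

    Hypothetical helper for illustration. `t_days` is time measured in days.
    """
    t = np.asarray(t_days, dtype=float)
    columns = []
    for n in range(1, fourier_order + 1):
        angle = 2.0 * np.pi * n * t / period
        columns.append(np.sin(angle))
        columns.append(np.cos(angle))
    return np.column_stack(columns)


t = np.arange(0, 61)  # 61 consecutive days
features = fourier_features(t, period=30.5, fourier_order=5)
print(features.shape)  # (61, 10)
```

Each increase in fourier_order adds one sine and one cosine column at a higher frequency, so the fitted curve can change more quickly within a cycle, at the cost of a greater risk of overfitting.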

Adding Holidays

Prophet can also incorporate the effects of holidays on the time series. Create a dataframe containing the dates and names of the holidays, along with optional lower and upper window parameters to capture effects before and after the holiday. Pass this dataframe to the Prophet constructor using the holidays parameter.

# Create a dataframe of holidays
holidays = pd.DataFrame({
  'holiday': 'new_year',
  'ds': pd.to_datetime(['2017-01-01', '2018-01-01', '2019-01-01']),
  'lower_window': 0,
  'upper_window': 0,
})

# Initialize Prophet with the holidays
model = Prophet(holidays=holidays)
model.fit(df)
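
The window columns are easiest to understand by listing the days they cover. The sketch below expands each holiday row into its affected dates with plain pandas; the helper expand_holiday_windows and the 'black_friday' row are made up for illustration, and the code is not how Prophet applies holidays internally.

```python
import pandas as pd

# Two holidays with different windows (dates chosen for illustration)
holidays_example = pd.DataFrame({
    "holiday": ["new_year", "black_friday"],
    "ds": pd.to_datetime(["2019-01-01", "2019-11-29"]),
    "lower_window": [-1, 0],  # days before the holiday (zero or negative)
    "upper_window": [0, 2],   # days after the holiday (zero or positive)
})


def expand_holiday_windows(holidays):
    """List every date covered by each holiday, including its window."""
    rows = []
    for row in holidays.itertuples(index=False):
        for offset in range(row.lower_window, row.upper_window + 1):
            rows.append({
                "holiday": row.holiday,
                "ds": row.ds + pd.Timedelta(days=offset),
                "offset_days": offset,
            })
    return pd.DataFrame(rows)


affected = expand_holiday_windows(holidays_example)
print(affected)
```

With both windows set to 0, as in the tutorial's example, only the holiday date itself is affected. Widening a window lets the model estimate a separate effect for each extra day.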

Concepts Behind the Snippet

Prophet leverages a decomposable time series model with three main components: trend, seasonality, and holidays. The trend component models long-term changes in the data. Seasonality captures recurring patterns, and holidays account for irregular events that impact the time series.
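
Written as a formula, the decomposition takes the following form, where g(t) is the trend, s(t) the seasonality, h(t) the holiday effect, and the final term is the error not captured by the other components. The symbol names follow common usage and are not taken from the text above.

```latex
y(t) = g(t) + s(t) + h(t) + \varepsilon_t
```

Because the components are added together, each one can be inspected on its own, which is what plot_components() displays.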

Real-Life Use Cases

Retail Sales Forecasting: Predicting future sales based on historical data, taking into account seasonality (e.g., holiday shopping seasons) and promotional events. This allows retailers to optimize inventory management and staffing levels.

Demand Forecasting for Energy: Predicting energy demand to optimize power generation and distribution, considering factors like weather patterns (temperature-dependent usage) and time of day.

Website Traffic Forecasting: Predicting website traffic to plan server capacity, marketing campaigns, and content releases, accounting for weekly and yearly patterns and special events.

Best Practices

Data Cleaning: Inspect the series before fitting and remove or mask extreme outliers, since they can distort the trend and the uncertainty intervals. Handle missing values deliberately: Prophet can fit around gaps, so removal is often sufficient and interpolation is optional.

Feature Engineering: Consider adding relevant external regressors to improve the model's accuracy (e.g., weather data, economic indicators).

Parameter Tuning: Experiment with different Prophet parameters (e.g., seasonality_prior_scale for seasonality strength, changepoint_prior_scale for trend flexibility) to optimize model performance.

Cross-Validation: Use cross-validation to rigorously evaluate the model's performance and avoid overfitting to the training data.
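
One way to act on the data-cleaning advice is to blank out extreme values rather than delete their rows, since rows with a missing 'y' are simply left out of the fit. The sketch below uses an interquartile-range rule; the 1.5 multiplier and the helper name mask_outliers are assumptions for illustration, and a domain-specific rule may suit your data better.

```python
import numpy as np
import pandas as pd


def mask_outliers(df, k=1.5):
    """Return a copy of `df` with outlying 'y' values replaced by NaN.

    Hypothetical helper; uses the k * IQR rule, which is only one possible choice.
    """
    q1, q3 = df["y"].quantile([0.25, 0.75])
    iqr = q3 - q1
    is_outlier = (df["y"] < q1 - k * iqr) | (df["y"] > q3 + k * iqr)
    cleaned = df.copy()
    cleaned.loc[is_outlier, "y"] = np.nan  # keep the date, blank only the value
    return cleaned


# Made-up series with one obvious spike
df_example = pd.DataFrame({
    "ds": pd.date_range("2023-01-01", periods=8, freq="D"),
    "y": [10.0, 11.0, 10.5, 9.5, 250.0, 10.2, 11.1, 9.9],
})
cleaned = mask_outliers(df_example)
print(cleaned)
```

A global rule like this one ignores trend and seasonality, so on strongly seasonal data it may flag legitimate peaks. Inspecting the flagged rows before masking them is a reasonable safeguard.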

Interview Tip

When discussing Prophet in an interview, emphasize its strengths in handling seasonality and holiday effects. Be prepared to explain the underlying model and its components. Also, be ready to discuss scenarios where Prophet may not be the best choice (e.g., time series with complex dependencies or short time horizons).

When to Use Prophet

Use Prophet when you have time series data with strong seasonality and/or holiday effects, and when you need a relatively easy-to-use and automated forecasting tool. It is particularly well-suited for business forecasting problems.

Memory Footprint

Prophet's memory footprint depends on the size of the input data and the complexity of the model. For large datasets, consider downsampling or using a smaller number of changepoints to reduce memory usage.
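
Downsampling can be done in pandas before the data reaches Prophet. The sketch below aggregates made-up daily values to weekly means; whether a mean, a sum, or another aggregate is appropriate depends on what 'y' measures, so treat the choice here as an assumption.

```python
import numpy as np
import pandas as pd

# Four weeks of made-up daily data starting on a Monday
daily = pd.DataFrame({
    "ds": pd.date_range("2023-01-02", periods=28, freq="D"),
    "y": np.arange(28, dtype=float),
})

# Aggregate to one row per week (weeks ending on Sunday), averaging 'y'
weekly = (
    daily.set_index("ds")
         .resample("W")
         .mean()
         .reset_index()
)

print(len(daily), "->", len(weekly))
print(weekly)
```

When the model is fitted on weekly data, pass a matching frequency to make_future_dataframe(), for example freq='W', so that the future dates are weekly as well.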

Alternatives

ARIMA: Autoregressive Integrated Moving Average models. Suitable for time series that are stationary or can be made stationary by differencing.

SARIMA: Seasonal ARIMA models. Extend ARIMA to handle seasonality.

Exponential Smoothing: Methods like Holt-Winters, suitable for time series with trend and seasonality.

Deep Learning (LSTM): Long Short-Term Memory networks. Can handle complex time series patterns, but require more data and computational resources.

Pros

Easy to use: Simple API and automatic handling of many common time series characteristics.

Robust to missing data and outliers: Can handle missing values and outliers relatively well.

Interpretable: Provides insights into trend, seasonality, and holiday effects.

Automatic seasonality detection: Automatically detects yearly and weekly seasonality.

Cons

Limited ability to model complex dependencies: May not perform well when time series are highly dependent on external factors that are not included in the model.

Requires sufficient historical data: Needs several seasons of historical data to accurately model seasonality.

Can be less accurate than more complex models: May not achieve the same level of accuracy as more sophisticated models like deep learning methods in some cases.

FAQ

  • What data format does Prophet require?

    Prophet requires a pandas DataFrame with at least two columns: 'ds' (date or datetime) and 'y' (numeric value).
  • How does Prophet handle missing data?

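    Prophet is robust to missing data. Observations do not need to be evenly spaced, and rows with a missing 'y' are left out of the fit. The fitted model can still produce predictions for those dates if they appear in the dataframe passed to predict().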
  • How can I add custom seasonality to Prophet?

    Use the add_seasonality() method to add custom seasonality patterns, specifying the period and Fourier order.
  • How do I evaluate the performance of my Prophet model?

    Use the cross_validation and performance_metrics functions from the prophet.diagnostics module.
  • How does the changepoint_prior_scale parameter affect the model?

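    The changepoint_prior_scale parameter controls the flexibility of the trend. Higher values let the trend change more sharply at changepoints and increase the risk of overfitting, while lower values produce a smoother, more rigid trend and increase the risk of underfitting. The default is 0.05.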