Time series decomposition is a technique I use regularly, largely because I work with fairly seasonal sales data in my day job.

While this has become second nature to me because I use it often, it could use some explanation for those who are not familiar. Fair warning: this will get technical, with R and an application in Tableau.

**What is Time Series Decomposition?**

In non-math terms, time series decomposition is a technique for breaking time series data into components so you can analyze the parts instead of the whole. Typically, algorithms split a series into trend, seasonal, and remainder (the part that can't be explained by trend or seasonality) components.

There are two different types of time series decomposition, additive and multiplicative.

- The additive model is useful when the seasonal variation is relatively constant over time.
- The multiplicative model is useful when the seasonal variation increases over time.

There are a few ways to estimate the seasonal variation. I typically use moving averages, and most of the data I work with comes out additive.
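To make the two model types concrete, here's a minimal sketch using base R's decompose(), which estimates the trend with a centered moving average (the same idea described above). It uses the built-in AirPassengers series as a stand-in dataset, since that ships with every R install:

```r
# Classical moving-average decomposition, both model types.
# Additive:       observed = trend + seasonal + remainder
# Multiplicative: observed = trend * seasonal * remainder
add_parts  <- decompose(AirPassengers, type = "additive")
mult_parts <- decompose(AirPassengers, type = "multiplicative")

# Each result holds $trend, $seasonal, and $random components.
plot(add_parts)
```

AirPassengers actually has seasonal swings that grow over time, which is exactly the case where the multiplicative model fits better.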

**So why should I care?**

Time series decomposition can help explain a lot of business problems. For example, let's say your boss comes to you and asks, "We just ran a promotion. What's the impact on sales?" or "How are we trending in X product?" Instead of supplying a line chart that looks like a cardiogram, what if you could show them the true trend?

That's how I got into time series decomposition. Sales guys LOVE year over year comparisons. I loathe them. Seasonality has a **HUGE** impact on these numbers. What if last year you ran a huge promotion? What if this month is 10 degrees warmer than the same month last year? Time series decomposition takes all of this into account to give a better look at the true trend, the true seasonality, and what is just random noise.

**Let's look at some data.**

Here's a dataset of monthly beer sales in millions of barrels, 01/1975 – 12/1990. It comes from the TSA package in R. You can download it here: beersales. Here's what it looks like plotted:

Just by looking at it, you can see the seasonality fairly easily. The variation looks relatively constant and consistent, so we can guess it’s an additive model. (You can test it if you want.)
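If you want to reproduce that plot yourself, here's a short sketch in R. It assumes you've installed the TSA package (install.packages("TSA")):

```r
# Load the beersales series from the TSA package and plot it.
library(TSA)
data(beersales)       # monthly beer sales, 1975-1990, millions of barrels
frequency(beersales)  # 12, i.e. monthly observations
plot(beersales, ylab = "Sales (millions of barrels)")
```

The repeating within-year shape you see in the plot is the seasonality we'll extract below.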

Cool, we have some time series data.

Now let's decompose it.

If you are using Tableau, you are going to be connecting to R. For those not familiar with R, I know that can sound scary, but it's really easy.

1.) Download R. Most people will tell you to go to CRAN and download it from there, but my personal preference is to download it from the Microsoft R Application Network. I have a couple of reasons why I prefer this: a.) It includes the Intel Math Kernel Library (MKL), which supports multi-threading. b.) Its packages are stored in MRAN (a mirror of CRAN), which promotes reproducibility.

2.) Once you have R installed, install the Rserve package:

install.packages("Rserve")

3.) While still in R, load the package and run the following command to launch Rserve:

library(Rserve)
Rserve()

It will then output something like the following:

4.) Go back into Tableau. Click on Help > Settings and Performance > Manage External Service Connection

5.) I’m using Tableau 2018.3, and this is a screenshot of what you will see:

Under Server, type localhost, and click Test Connection to make sure you configured it correctly. If not, go back and follow the steps again.

**Finally, we’re going to decompose some data**

1.) Once we have all of that set up, we can finally get into the meat of the analysis.

We're going to be using the stl function from R, which implements a Seasonal-Trend decomposition procedure based on Loess. I tend to lean on this one a lot for weekly, monthly, and yearly data. stl is in the base stats package in R, so we won't have to install any libraries to get it to work.

2.) Set up a new calculated field called Trend with the following calculation:

SCRIPT_REAL("tmp_table <- ts(.arg1, frequency=12);
stl_data <- stl(tmp_table, s.window='periodic', robust=TRUE);
stl_data$time.series[,2]
", SUM([Sales]))

I’ll walk through this one.

tmp_table <- ts(.arg1, frequency=12)

This creates a temporary time series object in R with a frequency of 12, meaning 12 observations per year, since our data is monthly. If you had weekly data, it would be 52.

stl_data <- stl(tmp_table, s.window='periodic', robust=TRUE);

Now we are applying the stl function to that time series object. (To read more, the documentation is here.) The s.window='periodic' part is my default; it forces the seasonal component to be identical from year to year. Setting robust=TRUE makes the trend and seasonal fits more robust to outliers, so more of the unexplained variation ends up in the remainder.
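The same decomposition the Tableau calculation runs can be sketched in plain R, which makes it easier to inspect the pieces directly. Again using AirPassengers as a stand-in monthly series:

```r
# STL decomposition: seasonal, trend, and remainder components.
stl_data <- stl(AirPassengers, s.window = "periodic", robust = TRUE)

# time.series is a matrix whose columns are, in order:
# [,1] seasonal   [,2] trend   [,3] remainder
head(stl_data$time.series)

trend     <- stl_data$time.series[, "trend"]
seasonal  <- stl_data$time.series[, "seasonal"]
remainder <- stl_data$time.series[, "remainder"]

plot(stl_data)  # panel plot of all three components
```

That column order is why the Tableau calculations below index [,2] for trend, [,1] for seasonality, and [,3] for the remainder.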

Now that we've made our trend, we can make our seasonality and remainder (or random) components using the same method.

Seasonality:

SCRIPT_REAL("tmp_table <- ts(.arg1, frequency=12);
stl_data <- stl(tmp_table, s.window='periodic', robust=TRUE);
stl_data$time.series[,1]
", SUM([Sales]))

Remainder:

SCRIPT_REAL("tmp_table <- ts(.arg1, frequency=12);
stl_data <- stl(tmp_table, s.window='periodic', robust=TRUE);
stl_data$time.series[,3]
", SUM([Sales]))
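As a sanity check on the additive model, the three components should reconstruct the original series exactly (stl defines the remainder as the data minus the seasonal and trend fits). A quick check in R, again with AirPassengers standing in for your sales series:

```r
# Verify that seasonal + trend + remainder rebuilds the original data.
stl_data <- stl(AirPassengers, s.window = "periodic", robust = TRUE)
recon <- rowSums(stl_data$time.series)  # sum the three component columns
all.equal(as.numeric(AirPassengers), as.numeric(recon))  # expect TRUE
```

If you ever see a meaningful gap here, something went wrong upstream (usually a mismatched frequency on the ts object).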

Throwing these all on the rows, and doing a little bit of formatting, we see the following:

It raises a lot of questions, like: what happened in December 1975? Or May 1985? Or January 1982?

As you can see, time series decomposition can open up a lot more avenues for analysis and interesting data points.

I hope you found this useful and potentially a new way of looking at your time series data!

Thanks!

Paul