Time Series Decomposition Explained (and an application in Tableau)

Time Series Decomposition is a technique I use regularly, namely because I work with fairly seasonal sales data for my day job.

While this has become second nature because I use it often, it could use some explanation to those who are not familiar. Just a warning, this will get technical with R and an application in Tableau.

What is Time Series Decomposition?

In non-math terms, Time Series Decomposition is a technique of breaking time series data into parts so you can analyze its parts instead of the whole. Typically, algorithms break them into their trend, seasonal and remainder (part that can’t be explained by trend or seasonality) parts.

There are two different types of time Series decomposition, additive and multiplicative.

  • The additive model is useful when the seasonal variation is relatively constant over time.
  • The multiplicative model is useful when the seasonal variation increases over time.

There are a few ways to estimate the seasonal variation, I typically use moving averages, and with most of the data I use it comes out additive.

So why should I care?

Time Series decomposition can help explain a lot of business problems. For example, lets say your boss comes to you and asks “We just ran a promotion, what’s the impact on sales?” Or “How are we trending in X product?” Instead of supplying a line chart that looks like a cardiogram, what if you could show him/her the true trend?

That’s how I got into time series decomposition. Sales guys LOVE year over year comparisons. I loathe them. Seasonality has a HUGE impact on these numbers. What if last year you ran a huge promotion? What if it is 10 degrees warmer this month compared to same month last year? Time Series decomposition takes all these into account to give a better look at the true trends, the true seasonality and what is just random noise.

Lets look at some data.

Here’s a dataset of monthly beer sales in millions of barrels, 01/1975 – 12/1990. This comes from the TSA library in R. You can download it here: beersales. Here’s what it looks like plotted:beer_capture

Just by looking at it, you can see the seasonality fairly easily. The variation looks relatively constant and consistent, so we can guess it’s an additive model. (You can test it if you want.)

Cool, we have some time series data.

Now lets decompose it.

If you are using Tableau, you are going to be connecting to R. For those not familiar with R, I know that can sounds scary, but it’s really easy.

1.) Download R. Most people will tell you to go to CRAN, and download it from there, but my personal preference is to download it from Microsoft R Application Network. I have a couple reasons why I prefer this: a.) It includes the Math Kernel Libraries (MKL), which supports multi-threading. b.) It’s libraries are stored in MRAN (a mirror of CRAN), which promotes reproducibility.

2.) Once you have R installed, install the RServe library.

install.packages(“Rserve”)

3.) While still in R, run the following command to launch Rserve

Rserve()

It will then output something like the following:beer_Capture1

4.) Go back into Tableau. Click on Help > Settings and Performance > Manage External Service Connection

beer_Capture2

5.) I’m using Tableau 2018.3, and this is a screenshot of what you will see:beer_Capture3

Under server, type localhost, and click on Test Connection to make sure you configured it correctly. If not, go back and follow the steps again.

Finally, we’re going to decompose some data

1.) Once we have those all setup, we can finally get into the meat of the analysis.

We’re going to be using the stl function from R which stands for A Seasonal-Trend Decomposition Procedure Based on Loess. I tend to lean on this one a lot for weekly, monthly and yearly data. stl is in the base stats package in R, so we won’t have to install any libraries to get it to work.

2.) Set up a new calculated field called Trend with the following calculation:

SCRIPT_REAL(“tmp_table <- ts(.arg1, frequency=12);
stl_data <- stl(tmp_table, s.window=’periodic’, robust=TRUE);
stl_data$time.series[,2]
“,SUM([Sales]))

I’ll walk through this one.

tmp_table <- ts(.arg1, frequency=12)

This just means I’m creating a temporary time series object in R, with frequency of 12. This just means we have 12 months in every year, since our data is monthly. If you had weekly data, it would be 52.

stl_data <- stl(tmp_table, s.window=’periodic’, robust=TRUE);

Now we are using the stl function on that time series object. To read more, the documentation is here. For the s.window=’periodic’ part, this is my default. Robust is TRUE essentially makes the trend and seasonality more robust, so more of the unexplained/error appears in the  remainder.

Now that we made our trend, now we can make our seasonality and remainder (or random) using the same methods.

Seasonality:

SCRIPT_REAL(“tmp_table <- ts(.arg1, frequency=12);
stl_data <- stl(tmp_table, s.window=’periodic’, robust=TRUE);
stl_data$time.series[,1]
“,SUM([Sales]))

Remainder:

SCRIPT_REAL(“tmp_table <- ts(.arg1, frequency=12);
stl_data <- stl(tmp_table, s.window=’periodic’, robust=TRUE);
stl_data$time.series[,3]
“,SUM([Sales]))

Throwing these all on the rows, and doing a little bit of formatting, we see the following:

beer_Capture4

It begs a lot of questions. Like What happened in December 1975? Or May 1985? Or January 1982?

As you can see, time series decomposition can open up a lot more avenues for analysis and interesting data points.

I hope you found this useful and potentially a new way of looking at your time series data!

Thanks!

Paul

Advertisements

Proportional Brushing with Tableau

I have a confession to make. I’m in love with another software package other than Tableau. It’s called JMP. I’ve used JMP for about the last 7 years or so, and I fell in love pretty quickly with its Graph Builder and Distribution capabilities.

One of their core principles is that every stat has a graph and every graph has a stat. Building complex models, such as neural nets or forest models, is also quick and easy. As of JMP 14, they also added a way to export your models to SQL, which is pretty awesome. Anyway, I digress.

In JMP, the main example data source is called Big Class. In the Tableau world, Big Class is the Superstore dataset. It’s pretty much what everyone starts out with. So instead of using Superstore, I’m going to bring JMP and Big Class into Tableau. Here’s an export of Big Class for you to look at. Big Class

Like I said, in JMP you can do distributions pretty quickly, and even export them out to interactive HTML, (pretty cool stuff.)

ezgif.com-crop

So how do you do this in Tableau? It was pretty much impossible. Now with set actions, it is very possible. Here’s a walkthrough of how to create a similar viz.

1.) Create 4 sheets, one for each. For age, create a bar chart, and for age, height, and weight, create a histogram from the Show Me Pane. For each of the histograms, they should create bins as dimensions.

2.) Drag them all onto a dashboard, and distribute evenly.

You should end up with something like this.

screenshot

3.) Go back to your Gender sheet (or any other sheet). Create a set on gender (sex). Then add it to the color mark on each of your sheets.

4.) On your Dashboard, add a set action under Dashboard, Actions, change set values. Click on Select under (“Run action on”), choose Gender Set, and click on Add all values to set. (Screenshot below).

screenshot4

5.) Now going back to your dashboard you see the following behavior.

ezgif.com-gif-maker (3)

That’s it!

Here’s the link to workbook on Tableau Public

Thanks!

Paul

 

Creating a Drill-Down Map Using Set Actions

Hey guys, welcome to the first blog post! I’ve been looking forward to Tableau 2018.3 since the beta testing, knowing that density marks were going to be included. However, Set Actions are a game changer.

In this post, I’m going to go through how to create a drill-down map, from state to counties using set actions.

First, I have a dataset I’ve compiled of State, County and Zip Code relationships. You can download it here. state_fips_zip

Like Superstore, lets create a few sets and dynamic names:

1.) Create a set of States. Call it State Set

2.) Create a set of Counties (FIPS). Call is County Set

3.) Create a Calculated Field Called Dynamic County:

IF [State Set] then [FIPS] ELSE [State] END

4.) Create a Calculated Field Called Dynamic Zip:

  • If both sets are true, return the zip code otherwise return NULL.
  • If the result is NULL return the first calculated field, otherwise return the result from step one.

IFNULL(if [State Set] and [County Set] THEN [Zip Code] END,[Dynamic County])

5.) Now create a worksheet with State, Dynamic County and Dynamic Zip in the details.

Capture14

6.) Create a new Dashboard and create 2 set actions:

7.) By clicking on a state, you see that it works…kind of. You need to go back to your sheet, and change some geographic roles.

Capture17

8.) Now, going back to your dashboard, you can drill-down from state to county with ease!

ezgif.com-gif-maker (1)

Here’s a link to the workbook on Tableau Public. There are still issues with Tableau public and set actions working, so you’ll have to download it to get the full effect.

Thanks!

Paul