How to Generate Tableau Extracts Faster

By: Eric Parker


Eric Parker lives in Seattle and has been teaching Tableau and Alteryx since 2014. He's helped thousands of students solve their most pressing problems. If you have a question, feel free to reach out to him directly via email. You can also sign up for a Tableau Office Hour to work with him directly!

I’ve been working on a project with a client for several months where we are reporting against a sizable data source. Initially, that data source took about 1-3 hours to extract locally, which was okay. The faster the better, but I could live with it.

We recently needed to make some updates to our data source (e.g. renaming fields), so we needed to extract it again. Unfortunately, that extract could not complete because the data source had grown so much in the couple of months since we last generated a local extract. It would get about 25% done within a 7-hour period, and then my VPN would time out.

I tried all sorts of wacky solutions and found some that were operational but not sustainable. I reached out to Tableau Support, who provided this link. That document shares one vital solution I did not realize was possible: you can publish your data source as a live connection and then flip it to an extract once it is on Tableau Server.

Here are the steps to publish a live connection and then convert it to an Extract on Tableau Server:

First, you have to publish the live connection to Tableau Server.

169-1.png

It will show up as a live connection.

169-2.png


Click the Live link and choose Extract from the dropdown.

 
169-3.png
 

And once the extract is generated, you can even create an extract refresh cycle!

 
169-4.png
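The steps above use the Tableau Server web interface. If you would rather script the Live-to-Extract flip, the Tableau REST API documents a "Create an Extract for a Data Source" method. Below is a minimal sketch that only builds the request without sending it. The server URL, API version, site ID, data source ID and auth token are all placeholders, and the function name is my own; check the endpoint path against the REST API reference for your server version before relying on it.

```python
from urllib.request import Request


def build_create_extract_request(server_url, api_version, site_id,
                                 datasource_id, auth_token):
    """Build (but do not send) a POST request asking Tableau Server to
    convert a published live data source into an extract.

    Assumption: the endpoint path follows the REST API's documented
    'createExtract' method for data sources. Verify it for your version.
    """
    url = (
        f"{server_url.rstrip('/')}/api/{api_version}"
        f"/sites/{site_id}/datasources/{datasource_id}/createExtract"
    )
    # The token comes from a prior REST API sign-in call (not shown here).
    return Request(
        url,
        data=b"",
        method="POST",
        headers={"X-Tableau-Auth": auth_token},
    )


# Placeholder values for illustration only.
request = build_create_extract_request(
    server_url="https://tableau.example.com/",
    api_version="3.19",
    site_id="site-luid",
    datasource_id="ds-luid",
    auth_token="token",
)
print(request.get_method(), request.full_url)
```

To actually submit it, you would pass the request to `urllib.request.urlopen` after signing in; the server then runs the extract as a background job, much like selecting Extract from the dropdown.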
 

We tested this and it worked great. Generating an extract that would’ve taken 24+ hours on my local computer only took 20 minutes on Tableau Server. Our new process is that whenever we need to make structural changes to the data source, we pick a 30-minute window when traffic to our views is minimal, publish a live connection, and then quickly flip it to an Extract.
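For anyone curious where the 24+ hour figure comes from, here is a rough back-of-the-envelope sketch. It assumes the extract progresses at a constant rate, which is a simplification, and the function name is just for illustration.

```python
def estimate_total_hours(elapsed_hours, fraction_done):
    """Linearly extrapolate total run time from partial progress."""
    if not 0 < fraction_done <= 1:
        raise ValueError("fraction_done must be in (0, 1]")
    return elapsed_hours / fraction_done


# The local extract got about 25% done in 7 hours.
local_hours = estimate_total_hours(7, 0.25)

# The same extract took about 20 minutes on Tableau Server.
server_minutes = 20
speedup = local_hours * 60 / server_minutes

print(f"Estimated local run: {local_hours:.0f} hours")
print(f"Approximate speedup on Tableau Server: {speedup:.0f}x")
```

Under that assumption the local extract would have needed roughly 28 hours, so running it on Tableau Server was on the order of 80 times faster.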

Here are a couple quirks you should know about if you pursue this approach:

● While the connection is live and the extract has not yet been generated, your views will probably load more slowly for your users.

● If a user has a view open when the data source switches from Live to an Extract, they will likely need to reload that view; otherwise they may get an error.

 

Hopefully this saves you from the hours of researching and testing I needed to do!
