What Data Format Does Tableau Prefer?

Tableau is usually the last 20% of any data communication effort. What is the first 80%? This document is meant to provide an example of what the finished output of data preparation for Tableau should look like.

 

Data Structures

Normalized: Databases store data in a normalized data structure. Databases have many, smaller tables, each with their own specific focus to reduce data redundancy.

For example, a restaurant chain may have a database with many tables. Each of those tables will be a fact table (primarily contains numbers that you want to analyze) or a dimension table (primarily contains information on how data is categorized).

A couple of examples of tables that would exist in a restaurant’s database are:

A Transaction Table (fact table):

2-1.png

A Receipt Table (dimension table):

2-2.png

A Store Table (dimension table):

2-3.png

 

More on database normalization here: https://en.wikipedia.org/wiki/Database_normalization.

Denormalized: When data is denormalized, separate fact and dimension tables are merged. Denormalization is done by combining normalized tables with a combination of joins and unions. Dimensions (dates, geographies, business units, etc.) will have their own columns and are repeated as many times as necessary so that each value has its own row in the dataset.

Using the three above tables as an example, when the data is joined together and denormalized the output looks like the following:

2-4.png

Notice that fields like “Address” are repeated for each unique row of data. For simple scenarios like the above, this data preparation can take place in Tableau. For more complex scenarios you’ll want to consider an alternative like SQL Server, Alteryx, or some other data preparation tool.

More on denormalization here: https://en.wikipedia.org/wiki/Denormalization.

 

Which does Tableau Prefer?

Tableau functions best on “flat”, denormalized, disaggregated data sources. That means Tableau prefers data sources like TDEs, SQL views or combined tables as opposed to cube data sources. It also means that your data source shouldn’t have rows of data summarizing previous rows of data. All that aggregation is best done in Tableau. When you are combining multiple data tables for analysis in Tableau, this denormalization process can be done in Tableau.  For more complex data,  it might need to be done elsewhere.

Need more help?  Please contact us at freesupport@onenumber.biz.