How to Anonymize Private Data in Tableau Prep

By: Eric Parker


Eric Parker lives in Seattle and has been teaching Tableau and Alteryx since 2014. He's helped thousands of students solve their most pressing problems. If you have a question, feel free to reach out to him directly via email. You can also sign up for a Tableau Office Hour to work with him directly!

While working with personally identifiable information, you may need to suppress sensitive data. Let’s say that you are working with healthcare data and want to suppress patient names.

First, you’ll want to create an aggregate step to get a unique list of names.

 
166-1.png
 

Next, output the list as a .csv file.

166-2.png
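For readers who want to see the logic outside Tableau Prep, the first two steps can be sketched in Python. This is only an illustration, not part of the Prep flow: the sample rows and the field name `patient_name` are hypothetical, and the CSV is written to an in-memory buffer rather than a file.

```python
import csv
import io

# Hypothetical sample rows; a real flow would read these from the data source.
rows = [
    {"patient_name": "Ana Ruiz", "visit_date": "2021-03-01"},
    {"patient_name": "Bo Chen", "visit_date": "2021-03-02"},
    {"patient_name": "Ana Ruiz", "visit_date": "2021-03-09"},
]

# Rough equivalent of the aggregate step: one entry per distinct name.
unique_names = sorted({row["patient_name"] for row in rows})

# Rough equivalent of the output step: write the list as CSV text.
buffer = io.StringIO()
writer = csv.writer(buffer, lineterminator="\n")
writer.writerow(["patient_name"])
writer.writerows([name] for name in unique_names)
names_csv = buffer.getvalue()
```

The length of `unique_names` is the row count you would request from the name generator.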

Once you know how many unique names need suppression, you can use a tool (like Mockaroo) to generate the same number of random names.

166-3.png

Next, you’ll put the original and generated name columns side-by-side in one table, so each original name is paired with exactly one generated name.

166-4.png
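The side-by-side table is essentially a lookup from each original name to one replacement. A minimal Python sketch of that pairing follows; the names are made up, and `generated_names` stands in for whatever file the generator produced.

```python
import random

# Hypothetical inputs: the unique originals and a generated list of the same size.
unique_names = ["Ana Ruiz", "Bo Chen", "Cy Diaz"]
generated_names = ["Person A", "Person B", "Person C"]

# Each original needs its own replacement, so the pool must be large enough.
pool = sorted(set(generated_names))
if len(pool) < len(unique_names):
    raise ValueError("not enough distinct generated names")

# Draw without replacement so no two originals share a generated name.
replacements = random.sample(pool, k=len(unique_names))

# The two columns side by side: original name -> generated name.
lookup = dict(zip(unique_names, replacements))
```

Drawing without replacement keeps the mapping one-to-one, which matters if the anonymized names will be used to count distinct patients.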

Last, bring the generated file into the original data flow and join it to the data on the original name field. After the join, remove the original name field so that only the generated names remain.

166-5.png
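The join-then-remove logic can be sketched in Python as well. This assumes a lookup table like the one built in the previous step; the field names `patient_name` and `display_name` and the sample values are hypothetical.

```python
# Hypothetical source rows and lookup table (original name -> generated name).
rows = [
    {"patient_name": "Ana Ruiz", "visit_date": "2021-03-01"},
    {"patient_name": "Bo Chen", "visit_date": "2021-03-02"},
    {"patient_name": "Ana Ruiz", "visit_date": "2021-03-09"},
]
lookup = {"Ana Ruiz": "Person B", "Bo Chen": "Person A"}

anonymized = []
for row in rows:
    # Copy every field except the original name.
    new_row = {key: value for key, value in row.items() if key != "patient_name"}
    # Join on the original name; a missing name raises KeyError instead of
    # silently passing real data through.
    new_row["display_name"] = lookup[row["patient_name"]]
    anonymized.append(new_row)
```

Failing loudly on an unmatched name is a deliberate choice here: a row that slips through without a replacement would defeat the purpose of the exercise.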

If you do this, make sure you remove the original name from the workflow before outputting your table! This works great for any field, not just names: item descriptions, IDs, dates, and amounts can all be randomized in the same manner.

This exact approach is great for a one-time anonymization but won’t automate well if the list of names grows and changes often. However, the same approach can be scaled using other tools for complete and automated anonymization of data.
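The article does not say which tools to use for that, so the following is only one possible shape for the automation, sketched in Python: keep the existing lookup table and assign unused generated names to any names that are new since the last run. The function `extend_lookup` and all sample values are hypothetical.

```python
def extend_lookup(lookup, current_names, name_pool):
    """Return a copy of lookup that also covers every name in current_names."""
    updated = dict(lookup)
    used = set(updated.values())
    # Generated names that have not been handed out yet, in pool order.
    available = [name for name in name_pool if name not in used]
    for name in sorted(set(current_names)):
        if name in updated:
            continue  # keep earlier assignments stable across runs
        if not available:
            raise ValueError("name pool exhausted; generate more names")
        updated[name] = available.pop(0)
    return updated


# Hypothetical example: one name was mapped earlier, two are new.
lookup = {"Ana Ruiz": "Person B"}
lookup = extend_lookup(
    lookup,
    ["Ana Ruiz", "Bo Chen", "Cy Diaz"],
    ["Person A", "Person B", "Person C"],
)
```

Existing assignments are never changed, so a name anonymized in an earlier run keeps the same replacement. New names take replacements in pool order here, so shuffle the pool first if the assignment order should not be predictable.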

 

Looking for help with your own Tableau project? Book a Tableau Office Hour!
