Rules of dynamic masking
The client asks you to mask data in a dashboard based on certain rules. The rules are dynamic and based on how the data is partitioned and filtered in a visualisation.
These are the requirements:
- Privacy rule 1: If the total of a single partition is below a certain threshold, mask each cell of the partition, the subtotal of that partition and the grand total;
- Privacy rule 2: Because a certain category is extra sensitive, if a certain single cell of that sensitive category differs only X from the subtotal of that partition, mask each cell of that partition, the subtotal of that partition and the grand total.
General rules:
- Masking means: show a '(*)' ;
- You should NOT be able to circumvent the masking by exporting the viz.
Deciding what route to take based on your data
There are two ways to mask data in Tableau and it depends on your data which of these two you might go for:
- If data needs masking, convert it to NULL and use the Format > Special Values (eg. NULL) option to display '(*)' when a value is NULL. This of course only works if you don't have NULLs in your source data as with this approach, you can’t distinguish a masked value from an actual NULL value.
- If data needs masking, convert it to a (large) negative number and use custom number formatting to display values below 0 as '(*)'. Like this: '0.00;(/*);0'. This of course only works if you don't have negative numbers in your source data.
So, do you have NULLS in your data? Go with option 2.
Do you have negative numbers in your data? Go with option 1.
I went with option 2 as I didn’t have negatives in the data.
The solution
LOD or Table calculation?
Since these calculations are based on the partitioning ánd filtering in the viz, you cannot pre-calculate this and thus you need either an LOD (EXCLUDE, not FIXED!) or a Table Calc. I went with the LOD route.
Setup the masking calculations
Let's say we have this table and we need masking for every region/year combination that has a quantity of less than 1400. This is the first privacy rule and is calculated along the red horizontal lines.
Also we need masking if the Office Supplies category differs less than 750 from the total of the region/year combination because Office Supplies is a sensitive category. This is the difference between the two columns marked in blue.
Then, we need 3 calculated fields:
- Privacy rule 1:
This first rule calculates the total of the row (region/year combination) by excluding the category from the level of detail. If the row total is below 1400, then we set this value to -999999.
- Privacy rule 2:
This second rule calculates the difference between the Office Supplies column and the total of the row (region/year combination). If this difference is smaller than 750, then we set this value to -999999.
- Privacy rules consolidated:
Then we need another calculation to combine these previous two calculations and check if one of them triggered the masking (or: if one of them is a negative number!). An extra complication here is that we also need to account for the (sub)total rows.
Show the masking calculations
When we then display these three calculations, it becomes clearer what we’re actually doing:
The final result then looks like this: