I wanted to share some of the most useful Row Level Security articles that I’m aware of and talk high level about one of Tableau’s newer features, ‘multi-table extracts’. I’ve deployed all these solutions as a customer at Seagate, once upon a time, so I’m happy to connect with other Tableau Champions and guide them through the minutiae of implementing this.
A quick summary of my experience with Row Level Security in Extracts is this: You can automate extracts to refresh even on an hourly cadence to get near-real time data that performs at scale. The most common misconception with Row Level Security is that you must duplicate data to make it work in an extract - This is not true and I’ve implemented alternatives successfully at scale. You can use your preferred security tables and extract them to be updated hourly.
So, with some thoughtful data preparation or even just a minor reconfiguration, one can get near real-time data in a performant Hyper extract that’s sitting on server and from there you can apply row level security to it. It’s a win-win, as you’re no longer dependent on an under-performing data platform for a live connection and when resorting to extracts, you’re not suffering the performance impact caused by inflating data with cross joins to duplicate data to set up row level security.
Multi-Table Extracts
https://www.tableau.com/about/blog/2018/10/you-can-now-choose-multiple-table-storage-extracts-94776
Because we now offer multi-table extracts, that’s another great option for solving the explosion of data due to row level security and there’s no data prep involved. In the example below, you could have two tables like the following:
[Account] | [Authorized User]
NetApp | Antonia Kealy
NetApp | Joe Schafer
Joined by Account to:
[Account] | [Qty]
NetApp | 100
Creating a denormalized data set like this:
[Customer | [Authorized User] | [Qty]
NetApp | Antonia Kealy | 100
NetApp | Joe Schafer | 100
But because we’re storing them in the extract without joining in advance, it’ll only ever need to query one row and the data set won’t be duplicated in advance! So, in short, we have lots of ways to efficiently scale Row Level Security in Tableau! I’m happy to help folks work through some POC use cases on this topic. For reference, there may be cases where Multi-Table Extracts aren’t as desirable as the fancier option of creating a concatenated string of authorized users. Chiefly, when you already need to prepare the data in advanced ways, or you want to asynchronously pull your ‘metrics data’ compared to your ‘authorization data’. For example, if your metrics data is so large that you can’t pull it more than once a day, but you want the authorization data pulled hourly still, that might be a good scenario to prefer one option over another.
Recommended Reading on Row Level Security:
General overview:
My former manager’s blog on how we implemented RLS at scale at Seagate using a hybrid approach of Active Directory groups and row level security. Note that not mentioned in his blog is a trick that shows up in the Part 2 article at the bottom of the list, where we later optimized our extract sizes using some data prep tricks.
Some additional strategies for implementing Row Level Security in parts 1 and 2:
The short version of the final article is instead of having two rows to give access to a NetApp row of data for two individuals with authority to see NetApp’s row of data, we’d have the latter single row of data, which, at scale, will drastically reduce the size of extracts.
Instead of this:
\\ filter logic: username() = [Authorized User]
\\ returns TRUE for Antonia and Joe
[Customer | [Authorized User] | [Qty]
NetApp | Antonia Kealy | 100
NetApp | Joe Schafer | 100
Use this:
\\ filter logic: contains( [Authorized Users] , username() )
\\ returns TRUE for Antonia and Joe
[Customer | [Authorized Users] | [Qty]
NetApp | Antonia Kealy, Joe Schafer | 100