HR/Legal/Other Data
I said I was looking for salary data…
“Employee type data, for example, salary or performance information, should not be stored outside whitelisted areas.”
The expected outcome from this recipe is a list of files in unsafe locations that contain employee data.
In many respects, this recipe is a lot like the PII outside whitelisted areas recipe. What is required here is some creativity in finding what makes HR data distinct from other information.
Crawl your data sources and assign them to datasets appropriately.
Working out search terms for HR data is largely a matter of finding what makes that data stand out from background information. Here are some things to consider:
It can be tempting simply to search for terms such as “salary” or “performance”. Our experience shows that these terms are so common that they are unlikely to be effective on their own.
Does your HR department have templates that they use? Key phrases from those templates can be a really good starting point to identify those kinds of documents.
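To illustrate why template phrases work better than single common words, here is a minimal Python sketch that runs outside the product. The phrases in TEMPLATE_PHRASES and the helper names are hypothetical; replace the phrases with wording taken from your own HR templates.

```python
import re

# Hypothetical key phrases lifted from HR templates; substitute your own.
TEMPLATE_PHRASES = [
    "annual performance review form",
    "reviewing manager signature",
    "proposed salary adjustment",
]

def normalize(text):
    """Collapse runs of whitespace and lower-case the text for matching."""
    return re.sub(r"\s+", " ", text).strip().lower()

def template_hits(text):
    """Return the template phrases that appear in the given text."""
    haystack = normalize(text)
    return [phrase for phrase in TEMPLATE_PHRASES if phrase in haystack]
```

A document that merely mentions “salary” and “performance” produces no hits, while a document built from the template matches one or more full phrases.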
Use a file path term to exclude specific locations from the search. To exclude every approved location, include as many path terms as you need and join them with an OR condition.
If a specific file is considered a valid location, you can use a combination of a path and file name to identify it unambiguously.
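The exclusion logic described above can be sketched in Python as follows. This is not the product's query syntax, only an illustration of the idea: an item is approved if it sits under any approved location (the OR condition) or if its full path and file name match an approved file. All paths and names below are hypothetical.

```python
from pathlib import PurePosixPath

# Hypothetical approved locations; an item under ANY of these is approved.
APPROVED_DIRS = [
    PurePosixPath("/shares/hr"),
    PurePosixPath("/shares/legal/employment"),
]

# Hypothetical individually approved files, identified by path plus file name.
APPROVED_FILES = {
    PurePosixPath("/shares/finance/payroll/salaries.xlsx"),
}

def is_approved(path):
    """True if the path is an approved file or lies under an approved location."""
    p = PurePosixPath(path)
    if p in APPROVED_FILES:
        return True
    # Compare whole path components, so /shares/hr-old is not treated as /shares/hr.
    return any(approved in p.parents for approved in APPROVED_DIRS)

def unsafe_matches(paths):
    """Keep only the search matches that fall outside the approved areas."""
    return [path for path in paths if not is_approved(path)]
```

Matching on whole path components rather than on a raw string prefix avoids accidentally approving a sibling folder whose name merely starts with an approved folder's name.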
This gives you a list of the items that currently match your search terms. You can save the search to run again later, and you can create a workflow from it to keep track of progress.
Note for advanced users
Do your items contain case codes with a distinctive format? Adding that format as a custom rule at ingestion time can help you find documents containing text that matches the pattern.
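As a rough illustration of what such a pattern might look like, the Python sketch below assumes a made-up case code format of “HR-”, a four-digit year, and a five-digit sequence number (for example HR-2023-00417). The format, the pattern, and the function name are assumptions for illustration only; how the pattern is entered as a custom rule depends on the product, and the Support team can advise on that.

```python
import re

# Assumed format: "HR-" + 4-digit year + "-" + 5-digit sequence number.
# The word boundaries stop the pattern matching inside longer tokens.
CASE_CODE = re.compile(r"\bHR-\d{4}-\d{5}\b")

def find_case_codes(text):
    """Return every substring of text that matches the assumed case code format."""
    return CASE_CODE.findall(text)
```

Testing the pattern against a handful of real documents before adding it as a rule helps confirm that it is specific enough to avoid false matches.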
Our Support team is available to help you with this functionality.