The juicy bits

Let's face it, we all prefer the tasty fruity bits over the oats and bran flakes…


“Bowl of organic muesli and spoon” by Vegan Photo is licensed under CC BY 2.0

“Any data relating to an extinguished or closed account should not be retained.”

The expected outcome from this recipe is a list of files that contain over-retained data.

These notes discuss some variations on the Over-retained data recipe. That recipe looks for kinds of data, but it can be adapted to find files that relate to an individual or to an account held by an individual. Dealing with data for multiple individuals is more complicated, so we present some advanced strategies here.

When you are looking for an individual or an account, the personally identifiable information (PII) for that individual, or the account identifiers, become the “identifying features” for the over-retained data recipe.

Here are some things to think about:

  • Are there multiple date formats in your data?

    Have you considered how you will tell a date of birth apart from a date of publication? Is the presence of a name, a key phrase, or other information enough to increase your certainty?

  • When dealing with names, are there common nicknames or variants that the account holder might use?

  • The obvious and tempting route is often to simply chain AND queries in the advanced search. But can you define subsets and combinations of features, each adequate on its own, that you could then OR together to get a broader search?
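The questions above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the query syntax (quoted phrases joined with AND, OR and parentheses) is a generic stand-in rather than the syntax of any particular search tool, the helper names are hypothetical, and the name, date and account number are made-up example values.

```python
from datetime import date
from itertools import combinations


def dob_variants(dob: date) -> list[str]:
    """Render one date of birth in several common written formats."""
    formats = ["%d/%m/%Y", "%m/%d/%Y", "%Y-%m-%d"]
    # dict.fromkeys drops duplicates (e.g. 05/05/1980) while keeping order.
    return list(dict.fromkeys(dob.strftime(f) for f in formats))


def any_of(terms: list[str]) -> str:
    """OR a list of phrases together: ("a" OR "b")."""
    return "(" + " OR ".join(f'"{t}"' for t in terms) + ")"


def build_query(features: dict[str, list[str]], min_match: int = 2) -> str:
    """OR together every combination of `min_match` features, each ANDed.

    A file matches if it contains any `min_match` of the identifying
    features, rather than requiring all of them at once.
    """
    groups = [any_of(values) for values in features.values()]
    clauses = [" AND ".join(combo) for combo in combinations(groups, min_match)]
    return " OR ".join(f"({clause})" for clause in clauses)


# Made-up identifying features for one account holder.
features = {
    "name": ["Robert Smith", "Bob Smith"],      # formal name plus nickname
    "dob": dob_variants(date(1980, 3, 7)),      # same date, several formats
    "account": ["ACC-001234"],                  # hypothetical account number
}
query = build_query(features)
print(query)
```

Requiring any two features out of three gives a broader search than ANDing all three, while still being stricter than matching on a date alone, which could just as easily be a date of publication.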

Note for advanced users with some scripting experience

Searching for large numbers of people can be a challenging task. Consider whether you can create a query for one person, export it, and then use it as a template, ANDing multiple identifying features together for each further person.
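As a rough sketch of the template idea, the snippet below fills a query template once per person. Everything here is an assumption for illustration: the template text stands in for whatever your search tool actually exports, the field names `name` and `account_id` are hypothetical, and the people listed are invented.

```python
from string import Template

# Stand-in for an exported query with placeholders added by hand.
# The real export format depends on the search tool in use.
QUERY_TEMPLATE = Template('"$name" AND "$account_id"')


def queries_for(people: list[dict[str, str]]) -> dict[str, str]:
    """Return one filled-in query per person, keyed by account identifier.

    Template.substitute raises KeyError if a record is missing a field,
    which surfaces incomplete records instead of emitting a broken query.
    """
    return {p["account_id"]: QUERY_TEMPLATE.substitute(p) for p in people}


# Invented example records; in practice these might come from a CSV export.
people = [
    {"name": "Robert Smith", "account_id": "ACC-001234"},
    {"name": "Jane Doe", "account_id": "ACC-005678"},
]

for account_id, query in queries_for(people).items():
    print(account_id, "->", query)
```

Keeping one query per person makes it easy to trace each matching file back to the account it relates to. Note that this sketch does not escape values, so a name containing a double quote would need handling before substitution.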