
De-identification Practices

21 The five levels of identifiability

The concept of identifiability is critical to managing the privacy risks when collecting, using, and disclosing personal information. In this article we therefore present a framework to reason about this concept, and then at the end some important implications…

22 What are the different types of disclosure risk?

There are two general kinds of re-identification risk that are of concern. The first is when an intruder can assign an identity to any record in the disclosed database. For example, the intruder would be able to determine that record number 7 in the disclosed…

23 What are the quasi-identifiers that I should use for managing prosecutor risk?

If you are trying to manage prosecutor risk, then you assume that the intruder has a specific target person in mind and is trying to re-identify that person's records in the disclosed data set. The intruder is also able to get some background information…

24 What is a quasi-identifier?

As noted in a different KnowledgeBase article, the primary type of disclosure risk to focus on is identity disclosure. An underlying assumption for this type of risk is that there is an intruder who has two pieces of information:…

25 What is the difference between prosecutor and journalist risk?

Disclosure risk can be characterized as prosecutor risk or journalist risk (see http://www.jamia.org/cgi/content/abstract/15/5/627). These are just colorful names for two common types of risks. They are similar in that they both pertain to the risk of an…

26 What is the re-identification risk from small simple counts of disease cases?

A custodian has been asked to release counts of people with a particular disease. For example, in 2008, four people in Ontario had that disease. Since the count is less than five, is there a re-identification risk in disclosing this information?…

27 What is the relationship between prosecutor, journalist, and marketer risk?

In the context of identity disclosure, we are concerned with managing three kinds of risk: prosecutor risk, journalist risk, and marketer risk. These three types of risk can be measured objectively. In a previous KnowledgeBase article we described the difference…

28 What quasi-identifiers should I use for managing journalist risk?

With journalist risk the intruder is not looking for a specific person in the disclosed data set; re-identifying any person will achieve the goal. A classic example is the reporter who is going through a leaked medical database to find someone with a sensitive…

29 Which type of threshold should we use for de-identification?

Many types of thresholds have been suggested and used for deciding when a data set is de-identified. Some common ones are: cell size of 5, 3, or 10; uniqueness; and rareness. A question that comes up in practice is "which threshold should we use?" In fact, all…
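
As a rough illustration of two of the thresholds named above (cell size and uniqueness), the sketch below counts how many records share each combination of quasi-identifier values. The toy records, the field names (`age_band`, `sex`, `region`), and the helper functions are hypothetical and are not taken from the article; this is a minimal sketch of the general idea, not the article's method.

```python
from collections import Counter

def cell_sizes(records, quasi_identifiers):
    """Count how many records share each combination of quasi-identifier values."""
    return Counter(tuple(r[q] for q in quasi_identifiers) for r in records)

def smallest_cell(records, quasi_identifiers):
    """Size of the smallest group of records with identical quasi-identifiers."""
    return min(cell_sizes(records, quasi_identifiers).values())

def unique_records(records, quasi_identifiers):
    """Number of records whose quasi-identifier combination appears exactly once."""
    sizes = cell_sizes(records, quasi_identifiers)
    return sum(1 for n in sizes.values() if n == 1)

# Hypothetical toy data set (assumption, for illustration only).
records = [
    {"age_band": "30-39", "sex": "F", "region": "K1A"},
    {"age_band": "30-39", "sex": "F", "region": "K1A"},
    {"age_band": "30-39", "sex": "F", "region": "K1A"},
    {"age_band": "40-49", "sex": "M", "region": "K2B"},
    {"age_band": "40-49", "sex": "M", "region": "K2B"},
    {"age_band": "50-59", "sex": "F", "region": "K1A"},
]
QIS = ["age_band", "sex", "region"]  # assumed quasi-identifiers
THRESHOLD = 5                        # one of the cell sizes listed above

# Cell-size check: every combination must cover at least THRESHOLD records.
meets_threshold = smallest_cell(records, QIS) >= THRESHOLD
# Uniqueness check: how many records stand alone in their cell.
n_unique = unique_records(records, QIS)
```

On this toy data the smallest cell holds a single record, so the data set fails a cell-size-of-5 check and contains one unique record.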

30 Who cares about my medical records?

One question that is sometimes posed is "why would anyone want to re-identify my records?" The argument goes that if the medical records have no value to someone else, then why would anyone bother getting access to and re-identifying them? Below are the reasons…