HRPP Manual Section 12-5

Use of Anonymous and Deidentified Data

The dictionary defines anonymous as “not named or identified.” In terms of subject confidentiality issues, anonymous means that data is collected or de-identified in such a way that the identity of any subject cannot be discerned through the data.

The federal regulations imply that anonymous means "the information obtained is recorded in such a manner that the human subjects cannot be identified, directly or through identifiers linked to the subjects" (45 CFR 46.101(b)).

If an individual electronic database file contains subject numbers as the only identifiers, that particular file has been de-identified. However, if an investigator maintains a code linking subject numbers to identifiers, the research study itself is not completely de-identified (anonymous) as long as the investigator retains the ability to identify an individual. Anonymous means that no one, including the investigator, should be able to link an individual person to that person's responses.

  • Face-to-face interviews are typically not anonymous.
  • Video recordings are not anonymous.
  • Audio recordings may or may not be anonymous based on the use of identifiers.
  • If phone numbers are not stored, then random digit dialing telephone interviews could be considered anonymous.
  • Mail-back questionnaires are considered anonymous only if no tracking codes are used.
  • Internet surveys are considered anonymous only if no identifying information is collected and no IP addresses are obtained.

De-identified data is data from which an investigator or others are not able to determine the identity of any particular individual. De-identified data cannot contain the following identifiers:

  • Names
  • Street address (city, state, and 5-digit ZIP code are acceptable)
  • Telephone, cell, and fax numbers
  • Email address
  • Social security number
  • Medical record number
  • Health plan number
  • Account numbers
  • Certificate/license numbers
  • Vehicle identifiers and serial numbers (including license plate numbers)
  • Device identifiers and serial numbers
  • Web universal resource locators (URLs)
  • Internet protocol (IP) address numbers
  • Biometric identifiers, including fingerprints and voiceprints
  • Photographic images
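
As an illustration only, the removal of the identifiers listed above from a data file could be sketched in Python as follows. The field names are hypothetical and would need to be mapped to the columns of the actual dataset. The sketch does not inspect free-text fields, and it does not make a study anonymous if a linking code is kept elsewhere.

```python
# Hypothetical field names (assumption); map them to the columns in your own dataset.
DIRECT_IDENTIFIER_FIELDS = {
    "name", "street_address", "phone", "fax", "email", "ssn",
    "medical_record_number", "health_plan_number", "account_number",
    "license_number", "vehicle_id", "device_id", "url", "ip_address",
    "biometric_id", "photo",
}


def strip_direct_identifiers(records):
    """Return copies of the records without the listed identifier fields."""
    return [
        {key: value for key, value in record.items()
         if key.lower() not in DIRECT_IDENTIFIER_FIELDS}
        for record in records
    ]


# Illustrative record with made-up values.
raw = [{"name": "Jane Doe", "email": "jd@example.org", "zip": "12345", "age": 34}]
clean = strip_direct_identifiers(raw)
```

The function returns new records and leaves the originals unchanged, so the identified file and the de-identified file can be stored and access-controlled separately.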

In most instances, the omission of these specific identifiers, such as name, social security number, or patient number, is sufficient to qualify a study as anonymous. Indirect identifiers, such as birth date, race, gender, height or weight, usually do not allow for identification under reasonable circumstances. In fact, investigators may preserve a subject’s anonymity while still retaining data on individual characteristics such as age, gender, ethnic origin, occupation, or diagnosis. However, anonymity is possible only when studying large samples or populations. When the number of potential subjects is small and/or the research setting is identified (e.g., a classroom), anonymity can be threatened or compromised even when direct identifiers have been removed from the data. Thus, Institutional Review Board (IRB) reviewers must consider what indirect identifiers are being collected and what the sample size is in order to determine if data is truly (or reasonably) de-identified.

In some cases, the IRB may require evidence that a formal disclosure analysis was performed to reasonably prevent the identification of individual subjects using variables within the database.
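
One simple check that might form part of such a disclosure analysis is to count how many records share each combination of indirect identifiers and to flag combinations that occur only a few times. The Python sketch below is an assumption about how this could be done, not a method prescribed by this manual. The function name, the field names, and the threshold `k` are hypothetical, and an appropriate threshold depends on the study.

```python
from collections import Counter


def small_cells(records, indirect_identifiers, k=5):
    """Return value combinations of the indirect identifiers shared by fewer than k records."""
    counts = Counter(
        tuple(record[field] for field in indirect_identifiers)
        for record in records
    )
    return {combo: n for combo, n in counts.items() if n < k}


# Illustrative records with made-up values.
records = [
    {"age_group": "30-39", "gender": "F", "occupation": "teacher"},
    {"age_group": "30-39", "gender": "F", "occupation": "teacher"},
    {"age_group": "60-69", "gender": "M", "occupation": "principal"},
]

# With k=2, any combination held by a single record is flagged.
flagged = small_cells(records, ["age_group", "gender", "occupation"], k=2)
```

A flagged combination indicates that a subject may be identifiable from indirect identifiers alone, which is most likely when the sample is small or the research setting is known.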

If research involves anonymous data, the investigator may not attempt to discern the identity of individuals without the express approval of the IRB. Any attempt to identify individuals from an anonymous database without IRB approval will be considered noncompliance.

This guidance document supersedes those previously drafted.

Version Date: 8-9-2011

Related HRPP Manual Sections