Maintaining patient privacy while geocoding patient addresses: Do Not Use R to Geocode

Imagine that a clinical researcher disclosed a list of patient addresses to a third party (a government agency, for-profit company, or not-for-profit entity) outside their hospital or health system. Imagine that the researcher then publicly announced that they had disclosed the addresses to the third party, that the addresses belonged to patients with a specific disease, and that those patients were being treated at a specific hospital. The researcher’s Institutional Review Board (IRB) and Health Insurance Portability and Accountability Act (HIPAA) compliance office would be outraged at these violations of patient privacy. Yet this sequence of events can happen inadvertently in studies of how neighborhood conditions, such as access to medical facilities or neighborhood food environments, affect clinical outcomes in specific patient populations. A quick search of Google Scholar shows many articles that, through this sequence of events, have disclosed patient health data.

In a recent pre-press publication we show how geocoding patient or study subject addresses using a variety of R packages, Stata, SAS, and QGIS can set off a cascade of events that discloses Personally Identifiable Information (PII) and Protected Health Information (PHI) in violation of usual IRB and HIPAA rules. We also show the flaws in several approaches proposed to protect PII and PHI in neighborhood health effects research, and we propose best practices to protect patient and study subject confidentiality in such studies.

