Maintaining patient privacy while geocoding patient addresses: Do Not Use R to Geocode

Imagine if a clinical researcher were to disclose a list of patient addresses to a third party – a government agency, for-profit company or not-for-profit entity – that was outside of their hospital or health system. Imagine the researcher then publicly announced that they had disclosed the addresses to the third party, that the addresses belonged to patients with a specific disease, and that those patients were being treated at a specific hospital. The researcher’s Institutional Review Board (IRB) and Health Insurance Portability and Accountability Act (HIPAA) compliance office would be outraged at these violations of patient privacy. Yet this sequence of events can happen inadvertently when studying how neighborhood conditions such as access to medical facilities or neighborhood food environments affect clinical outcomes in specific patient populations. A quick search of Google Scholar shows many articles that, through this sequence of events, have disclosed patient health data.

In a recent pre-press publication we show how geocoding patient or study subject addresses using a variety of R packages, Stata, SAS and QGIS can set off a cascade of events that discloses Personal Identifying Information (PII) and Protected Health Information (PHI) in violation of usual IRB and HIPAA rules. We also show the flaws in several approaches proposed to protect PII and PHI in neighborhood health effects research and propose best practices to protect patient and study subject confidentiality in studies on neighborhood health effects.

Posted in Health Care Access, Methods, Tools

Neighborhood Walkability and Body Mass Index among African American Cancer Survivors

Increasingly, health care systems are becoming stakeholders in urban design and infrastructure planning processes, and are considering how neighborhood environments can support the health of communities and patient populations within health system catchment areas. To this end, health systems are: contracting with planning firms to create Health District Plans with urban infrastructure that promotes healthy lifestyles; working alongside community partners to improve alignment and delivery of local health and supportive care resources; and incorporating neighborhood-level data in electronic health records (EHR) as a new type of patient “vital sign.” And while extant literature indicates associations between neighborhood built environments and health outcomes, particularly for obesity, in the United States (US) general population, few studies have explored these relationships in cancer survivors – even fewer in non-white cancer survivor populations.

Cancer survivors are at heightened risk of weight gain after diagnosis, since they are susceptible to the energy-balance-related causes of weight gain common in the general population, as well as to cancer treatment-related weight gain. From 1997 to 2014, obesity increased more rapidly among adult cancer survivors compared with the general population. Colorectal and breast cancer survivors and non-Hispanic black survivors were at the highest risk of experiencing obesity during this period. Recent evidence among survivors of obesity-related cancers suggests that weight gain is related to higher risk of recurrence, cancer mortality, and all-cause mortality. 

In a recent publication, we assessed the cross-sectional association of residential neighborhood walkability with body mass index (BMI) in 2089 African Americans who had recently been diagnosed with cancer in Metropolitan Detroit, Michigan. Similar to prior research in the general population, among these cancer survivors we found that greater neighborhood walkability was associated with lower BMI. When we stratified these analyses by biologic sex, we observed the inverse associations in men but not women and, separately, among survivors reporting any regular physical activity post-diagnosis. The sex-specific findings are similar to those observed in previous studies of general populations. To our knowledge, this is the first survivorship investigation of neighborhood walkability and BMI to focus solely on African Americans, to include survivors of more than one obesity-related cancer (i.e., breast, colorectal, prostate), to incorporate men, and to use a multidimensional walkability index. Our research provides initial evidence that built environment factors influence weight among African American cancer survivors and supports health system involvement in local urban design and planning decisions.

This research was conducted in collaboration with colleagues at Wayne State University, using data from cancer survivors participating in the Detroit Research on Cancer Survivors (ROCS) cohort study.

Posted in Adults, Cancer Survivors, Urban Design, Walkability

Improving the measurement of Neighborhood Physical Disorder

Neighborhood audit methods (also known as Systematic Social Observation) are often used to create measures of neighborhood built and social environments. But even with the enhanced efficiency of virtual neighborhood audit methods using CANVAS-Street View, it is generally not possible to collect data from every block in a city. Thus, spatial interpolation methods, such as kriging, are often used to estimate neighborhood conditions at locations not visited by the audit team. Ordinary Kriging uses the data collected at visited locations (blocks or intersections), and the spatial correlation between the data elements, to estimate conditions at all other locations in a neighborhood or city. In recently published work we investigated whether Universal Kriging could be used to create improved estimates of neighborhood physical disorder across entire cities. Universal Kriging builds upon Ordinary Kriging by using additional external data, such as Census data, in the interpolation/estimation process. We find that using additional data on housing vacancy, along with the observed physical disorder metrics, in a Universal Kriging model could improve model fit and estimation of physical disorder across a city. In addition, Universal Kriging could create equivalently accurate estimates of physical disorder while requiring the collection of disorder measures from fewer locations across a city.
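To make the difference between Ordinary and Universal Kriging concrete, here is a minimal NumPy sketch. It is not the model from the published work. It assumes an exponential covariance function with made-up sill and range values (a real analysis would fit these to the audit data), and the coordinates, disorder scores and vacancy rates are hypothetical. Calling the function without a covariate gives Ordinary Kriging; passing a covariate such as housing vacancy adds it as a trend term, which is one common form of Universal Kriging.

```python
import numpy as np

def exp_cov(h, sill=1.0, corr_range=500.0):
    """Exponential covariance; sill and range are illustrative, not fitted."""
    return sill * np.exp(-h / corr_range)

def krige(xy_obs, z_obs, xy_new, drift_obs=None, drift_new=None):
    """Ordinary kriging, or universal kriging when drift covariates are given."""
    xy_obs = np.asarray(xy_obs, float)
    xy_new = np.asarray(xy_new, float)
    z_obs = np.asarray(z_obs, float)
    n, m = len(xy_obs), len(xy_new)
    # Trend basis: an intercept, plus external covariates for universal kriging.
    F = np.ones((n, 1))
    f_new = np.ones((m, 1))
    if drift_obs is not None:
        F = np.hstack([F, np.asarray(drift_obs, float).reshape(n, -1)])
        f_new = np.hstack([f_new, np.asarray(drift_new, float).reshape(m, -1)])
    p = F.shape[1]
    # Pairwise distances among audited sites, and from audited to new sites.
    d_obs = np.linalg.norm(xy_obs[:, None, :] - xy_obs[None, :, :], axis=2)
    d_new = np.linalg.norm(xy_obs[:, None, :] - xy_new[None, :, :], axis=2)
    # Kriging system: covariances bordered by the unbiasedness constraints.
    A = np.zeros((n + p, n + p))
    A[:n, :n] = exp_cov(d_obs)
    A[:n, n:] = F
    A[n:, :n] = F.T
    b = np.vstack([exp_cov(d_new), f_new.T])
    weights = np.linalg.solve(A, b)[:n]  # drop the Lagrange multipliers
    return weights.T @ z_obs

# Hypothetical audited blocks: coordinates in meters, disorder score, vacancy rate.
xy_obs = np.array([[0, 0], [400, 0], [0, 400], [400, 400], [200, 150]])
disorder = np.array([1.2, 0.4, 2.1, 0.9, 1.5])
vacancy = np.array([0.15, 0.05, 0.30, 0.10, 0.20])

# One unaudited block whose vacancy rate is known from Census-type data.
xy_new = np.array([[100, 300]])
vacancy_new = np.array([0.25])

ok_estimate = krige(xy_obs, disorder, xy_new)
uk_estimate = krige(xy_obs, disorder, xy_new, vacancy, vacancy_new)
print(ok_estimate, uk_estimate)
```

Because this sketch has no nugget term, both variants reproduce the observed score at every audited location, which makes a handy sanity check before trusting estimates at unaudited ones.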

Street Viewing 125th Street
Posted in CANVAS, Methods, Street View, Tools

Newly Funded Work on Pedestrian Injury

We have recently been funded by NIH to conduct a four-year study of how urban design, the locations of alcohol-selling establishments, nightlife districts and locations of services for the homeless influence pedestrian fatality risk. We will be conducting a location-based case-control study of all pedestrian fatalities that occurred in metropolitan areas of the U.S. in 2017 and 2018 and matched control locations. We will also be expanding the capabilities of our CANVAS tool for conducting nationwide virtual neighborhood audits via Google Street View. This research builds upon our work to understand how urban form and access to parks and green spaces influence physical activity patterns. Pedestrian safety is often cited as influencing engagement in pedestrian activity, particularly for older adults and for youth. As cities make urban design changes to promote walking, there is a concern that pedestrian injuries will increase as more people take to the sidewalks. In prior work we piloted the use of virtual neighborhood audits in pedestrian injury research.

However, a closer look at the pedestrian fatality data shows that, in 2018, 38% of all fatally struck pedestrians, and 44% of those between the ages of 21-65 years, were under the influence of alcohol when they were struck, a prevalence that has been consistent since at least 1997. Cities across the U.S. are currently developing plans to prevent all pedestrian fatalities, yet few if any of these city plans even mention the role of alcohol use by pedestrians or describe action items to address this modifiable risk factor. The most recent set of policy and design recommendations for reducing injuries among intoxicated pedestrians was a 1996 report from Monash University in Australia. Thus, part of our research will focus on the locations of, and urban design around, alcohol-selling establishments and the locations of nightlife districts as risk factors for pedestrian fatality. We have developed methods to apply SaTScan analyses to NETS business listing data to automatically identify clusters of nightlife businesses.

We developed methods to apply SaTScan analyses to business listing data to identify clusters of nightlife businesses. Panel A. Clusters of nightlife businesses detected by SaTScan in Philadelphia in 2014. Panel B. Blue rectangle in panel A, showing the Center City/Old City clusters and the East Passyunk Corridor clusters with Census blocks that overlap with the clusters and have nightlife businesses color-coded by number of nightlife businesses.

In addition, individuals experiencing homelessness appear to be at particular risk for pedestrian injury and to comprise a large percentage of the pedestrians killed while under the influence of alcohol. The Department of Housing and Urban Development estimated that there were 552,830 people experiencing homelessness in the U.S. in 2018 and that 35% of these individuals lived in unsheltered locations, often close to traffic (e.g., under bridges and overpasses). The location of services for the homeless, and the pedestrian infrastructure around these services, appear to contribute to the risk of injury among those experiencing homelessness. Case studies in the 1996 Monash University report highlight the location of homeless shelters and other services for the homeless near high-volume roadways as a risk factor for pedestrian injuries. Our study will be the first systematic investigation of the association between alcohol-involved pedestrian fatalities and the locations of services for the homeless, encampments, and informal areas where those experiencing homelessness shelter, and the first to study pedestrian infrastructure around providers of services for the homeless.

We will also be building a new version of our CANVAS tool for conducting virtual neighborhood audits using Google Street View. The new version will include: (a) automated sampling of locations for case-control studies; (b) image annotation tools; (c) linkage to neighborhood data via Application Programming Interfaces (APIs) for Google Maps and the US Census; (d) addition of scale development and spatial statistics tools; and (e) integration with emerging online street/road visualization technologies and tools. The CANVAS tool is publicly available and has already been used in numerous completed and ongoing studies.

Posted in Active Transport, CANVAS, Economic Development, Methods, Pedestrian Injury, Safety, Street View, Tools, Urban Design, Walkability

Physical Activity in an Urban Environment and Associations with Air Pollution and Lung Function

In new work published in the Annals of the American Thoracic Society we analyzed the links between where children are physically active and their exposure to air pollution and lung function. Physical activity is associated with increased ventilation because of more rapid and deeper breathing. Thus, being active while being exposed to high air pollution could lead to increased inhalation of pollutant particles and gases. In urban communities, features of the built environment, including the locations where children engage in physical activity, could put individuals at risk for harmful inhaled exposures leading to lung function decrements. Thus, we conducted a study to investigate locations throughout New York City where children engaged in moderate-vigorous activity. Our hypothesis was that being physically active outdoors, particularly near major sources of traffic pollution, would be associated with increased air pollution exposure and decreased lung function.

Our study was conducted as part of a longitudinal birth cohort study in affiliation with the Columbia Center for Children’s Environmental Health that recruited pregnant, non-smoking mothers who lived in Northern Manhattan and the Bronx. At the time of enrollment in this secondary study, children were 9-14 years of age and still lived in NYC. The 151 children enrolled in the study wore global positioning system (GPS) devices in a vest to identify their locations and accelerometers on their wrists that measured physical activity level. Devices were worn for 24 hours, with repeated measures after 5 days. We paired the GPS and accelerometer data and mapped the data using ArcGIS to determine where children engaged in moderate-vigorous activity. We also used data from the New York City Community Air Survey (NYCCAS) to determine annual average air pollution concentrations in the locations where children were physically active. To account for daily fluctuations in pollution we adjusted our analysis for daily NYC pollution measured at a single site as well as daily temperature and humidity. Lastly, we measured lung function at the end of each 24-hour physical activity monitoring period.
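One way to picture the pairing of GPS and accelerometer data is as nearest-timestamp matching, sketched below in plain Python. This is an illustration only, not the study's ArcGIS workflow: the record layout, the 30-second tolerance, the activity labels and the coordinates are all hypothetical.

```python
from bisect import bisect_left

def pair_by_time(gps_fixes, accel_epochs, tolerance_s=30):
    """Attach the nearest-in-time GPS fix to each accelerometer epoch.

    gps_fixes: list of (unix_time, lat, lon), sorted by time.
    accel_epochs: list of (unix_time, activity_level).
    Epochs with no fix within tolerance_s seconds are dropped.
    """
    times = [t for t, _, _ in gps_fixes]
    paired = []
    for t, level in accel_epochs:
        i = bisect_left(times, t)
        # Candidates are the fixes just before and just after the epoch.
        candidates = [j for j in (i - 1, i) if 0 <= j < len(times)]
        if not candidates:
            continue
        j = min(candidates, key=lambda k: abs(times[k] - t))
        if abs(times[j] - t) <= tolerance_s:
            _, lat, lon = gps_fixes[j]
            paired.append((t, level, lat, lon))
    return paired

# Hypothetical records: times in seconds from the start of monitoring.
gps = [(0, 40.8100, -73.9500), (60, 40.8102, -73.9498), (300, 40.8150, -73.9440)]
accel = [(5, "sedentary"), (55, "moderate-vigorous"), (180, "moderate-vigorous")]

paired = pair_by_time(gps, accel)
active = [row for row in paired if row[1] == "moderate-vigorous"]
print(active)
```

Dropping epochs that have no nearby fix, rather than borrowing a distant one, avoids assigning activity to a location the child may already have left. The tolerance itself is a judgment call that depends on how often the GPS device logs.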

On average, children spent more time physically active indoors (71.9 ±74.7 min/day) compared to outdoors (38.2 ±39.6 min/day). However, the majority of outdoor physical activity was along sidewalks and roadbeds (30.2 ±33.3 min/day) where nitrogen dioxide (NO2) pollution was relatively high. More time spent in outdoor activity was associated with higher exposure to NO2. In warmer months, more time spent in outdoor activity was associated with lower lung function even after adjusting for air pollution exposure. This finding suggests that in addition to air pollution there may be other environmental factors that contribute to decreased lung function when children are active outdoors.

Because physical activity leads to increased ventilation, the ability to identify specific locations of activity, especially near sources of high pollution, can improve risk assessment for lung function impairment. Our findings of a positive association between outdoor active time in warmer weather months and both elevated NO2 exposure and reduced lung function demonstrate a need to inform individuals in urban communities about location-specific exposure risks. Our findings also support the need for continued attention towards improving air quality, especially in urban communities.

Posted in Accelerometers, Asthma, Childhood, GPS

Mapping Food Insecurity During the COVID-19 Pandemic

Prior to the COVID-19 pandemic, 11% of households and nearly 16% of families with children were food insecure. With schools closed and families out of work, food insecurity rates are expected to skyrocket in the coming months. During the crisis, food store shelves have frequently been empty due to bulk purchasing and an increase in at-home meal consumption. In an effort to ensure food is available for SNAP shoppers when they receive their monthly benefits, social media campaigns have encouraged non-SNAP shoppers to avoid food shopping during SNAP distribution dates. This is a particular concern in counties with large SNAP populations and in states with only a few SNAP distribution days per month (e.g., Nevada, Virginia, New Jersey). To inform these efforts, BEH’s Eliza Kinsey has developed a web mapping tool of SNAP distribution schedules and participation rates by county. This tool is intended to be used both by household food shoppers and by policy-makers in designing strategic efforts to combat the food insecurity and nutrition challenges brought on by the COVID-19 crisis.

The maps use current SNAP distribution schedule information and 2018 participation counts. Dr. Kinsey plans to update the mapping tool regularly with pertinent SNAP participation changes and unemployment data.

SNAP recipients by County

Posted in Food Environment, Social Determinants, Socioeconomic status

Updated: County Level Estimates of Highly Stressed Health Care Systems

Our online mapping tool has been updated with new data showing counties that are at high risk of experiencing patient volumes that exceed their hospital capacity over the next 6 weeks. The maps show at-risk counties for three different levels of social distancing and two levels of intensity of surge responses by hospitals. The estimates use Jeffrey Shaman and colleagues’ models of disease spread and the estimates posted previously of how many critical care hospital beds can be made available under various assumptions of hospital responses to patient surges. A full description of the methods can be found here.

As in prior posts, the mapping site is a work in progress and will be updated frequently.

Posted in Community Needs Assessment, Health Care Access

County Level Estimates of When Hospital Capacity will be Overwhelmed

We have been working as part of a multi-institution team to make county-level estimates for the U.S. of the time until health systems are overwhelmed with patients. The analyses use a 28-day look-forward window from 3/24/2020 and identify numerous counties where the health care system is expected to be overwhelmed; these 28-day look-forward analyses will be redone weekly. Projections of time to health care systems being overwhelmed have been made for various levels of social distancing and various levels of intensity of hospital response to patient surges. A paper detailing all of the estimates will be uploaded to pre-press sites soon.

Our online mapping tool is currently displaying maps based on these analyses that show which counties are projected to experience patient volumes that exceed their hospital capacity over the next 28 days, under three scenarios: 1) no social distancing and low intensity hospital response to patient surge; 2) no social distancing and medium intensity hospital response to patient surge; and 3) no social distancing and high intensity hospital response to patient surge. The estimates combine Jeffrey Shaman and colleagues’ models of disease spread and the estimates posted previously of how many critical care hospital beds can be made available under various assumptions of hospital responses to patient surges.
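At its simplest, a look-forward estimate of this kind finds the first day on which projected patient demand exceeds the beds available under a given surge response. The Python sketch below illustrates only that final comparison. The demand series, growth rate and bed counts are hypothetical, and the actual estimates rest on the disease spread models and bed availability estimates described above.

```python
def days_until_overwhelmed(projected_demand, capacity, window=28):
    """Return the first day (1-based) within the window on which projected
    patient demand exceeds bed capacity, or None if it never does."""
    for day, demand in enumerate(projected_demand[:window], start=1):
        if demand > capacity:
            return day
    return None

# Hypothetical county: critical care demand growing 15% per day from 10 patients.
demand = [10 * 1.15 ** d for d in range(28)]

# Hypothetical bed counts under low, medium and high intensity surge responses.
for label, beds in [("low", 40), ("medium", 60), ("high", 90)]:
    print(label, days_until_overwhelmed(demand, beds))
```

A result of None means demand stays within capacity for the whole window, which is why a county can drop off a map when the surge response scenario changes.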

As in prior posts, the mapping site is a work in progress and will be updated frequently.

Time to patient demand exceeding hospital capacity: 28 day look forward from 3/24/20, no social distancing and a medium surge response

Posted in Community Needs Assessment, Health Care Access

Estimated ICU Beds Available to Respond to Patient Surges

In collaboration with Charles Branas in the Department of Epidemiology and colleagues from Patient Insight, the Mount Sinai Health System and MIT, we have been working to estimate the number of hospital critical care beds, including ICU beds and other hospital beds used for critical care purposes, that could be made available by hospitals in response to patient surges. Three scenarios of intensity of hospital response were created, taking into account existing ICU bed availability; currently occupied ICU beds that can be made available; other beds, such as post-anesthesia care unit beds, operating room beds, and step-down beds, that could be converted to critical care beds for COVID-19 patients; and the possibility of having two patients use one ventilator in the ICU. All civilian acute medical-surgical tertiary care hospitals and long-term acute care hospitals in the US for which data were available are included.

The data are mapped on our online interactive COVID-19 mapping tool.  The documentation of the methods is here.

Screen shot of the BEH COVID-19 mapping tool showing the distribution of COVID-19 cases (blue) and estimated available ICU beds under a Moderate Intensity Response to patient surges

Posted in Community Needs Assessment, Health Care Access, Social Determinants

At Risk Populations for Severe COVID-19, Part IV

Our geographer extraordinaire, James Quinn, built a new version of our interactive mapping tool for severe COVID-19. The map depicts populations at high risk of severe COVID-19 due to older age or underlying health conditions, the availability of ICU beds and the ratios of high-risk populations to ICU beds. The interactive mapping tool is here. This is an ongoing project and we will keep updating the maps with new data and features as the pandemic continues.

ICU bed counts are from HRSA. As roughly 66% of these beds are currently occupied, these numbers overstate the number of beds available for COVID-19 patients.
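The effect of that occupancy caveat on a ratio of high-risk population to ICU beds can be shown with a short Python sketch. The county figures are hypothetical, and the 66% occupancy is the rough figure quoted above, applied uniformly here as a simplifying assumption.

```python
def high_risk_per_available_bed(high_risk_pop, icu_beds, occupancy=0.66):
    """People at high risk of severe COVID-19 per unoccupied ICU bed."""
    available = icu_beds * (1 - occupancy)
    if available <= 0:
        return float("inf")
    return high_risk_pop / available

# Hypothetical county with 50,000 high-risk residents and 100 ICU beds.
unadjusted = 50_000 / 100
adjusted = high_risk_per_available_bed(50_000, 100)
print(round(unadjusted), round(adjusted))
```

With about a third of beds free, the adjusted ratio is roughly three times the unadjusted one, so ratios computed from total bed counts are best read as a lower bound on the pressure per available bed.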

Posted in Health Care Access, Tools