Machine Learning Approaches for Measuring Neighborhood Environments in Epidemiologic Studies

We recently published a review article in Current Epidemiology Reports describing the use of machine learning to measure neighborhood environments in epidemiologic studies. Innovations in information technology, initiatives by local governments to share administrative data, and growing inventories of data available from commercial data aggregators have immensely expanded the information available to describe neighborhood environments, supporting an approach to research we call Urban Health Informatics.

Increasingly, researchers are turning to machine-learning-based approaches to work with these large pools of data and develop measures of neighborhood environments. Prominent machine learning applications in this field include automated image analysis of archived imagery such as Google Street View images, variable selection methods to identify neighborhood environment factors that predict health outcomes from large pools of exposure variables, and spatial interpolation methods to estimate neighborhood conditions across large geographic areas. In the review we highlighted successes and cautions in the application of machine learning, with particular attention to the terms of use and possible legal issues in applying machine learning approaches to Google’s geospatial data.

Google’s overall terms of use and those of their Google Maps product (including the “Geo Guidelines” included in the Google Maps terms) prohibit essentially every step used in common workflows for applying machine learning to image analysis of Google Street View images. The Geo Guidelines explicitly state that nonprofit and academic uses are not exempt from the terms: “these restrictions apply to all academic, nonprofit, and commercial projects,” and further that they will not grant exceptions: “If your use is not allowed, we are not able to grant exceptions, so please do not submit a request.” The enforceability of Terms of Use contracts is an area of active litigation and, as such, is unclear, but this uncertainty has led our team to avoid applying machine learning techniques to Street View images.

A typical workflow for applying machine learning techniques to Street View images in relation to Google’s Terms of Use.
Posted in Methods, Street View

The Evolution of Disparities in Spatial Access to Social Services in the U.S., 1990 to 2014

To address patients’ unmet social needs and improve health outcomes, health systems have developed programs to refer patients in need to social service agencies. However, the capacity to respond to patient referrals varies tremendously across communities. To understand how disparities in spatial access to social service agencies arose, we used the National Establishment Time Series (NETS) data set to analyze the annual density of social service agencies (agencies/km²) in all populated Census tracts in the U.S. from 1990 to 2014. Our paper describing this work was published in BMC Health Services Research.

Throughout the period, the density of social service agencies (agencies/km²) increased within tracts, and the tracts with the highest poverty rates in 1990 had the highest density of agencies throughout the 1990 to 2014 period. But from 1990 to 2014 a spatial mismatch emerged between the availability of social services and the expected need for social services as the population characteristics of neighborhoods changed. Tracts that experienced high poverty in 1990 and then experienced the steepest declines in household income through 2010 had the lowest access to social service agencies in 1990 and the smallest increases in access over the years. Conversely, high-poverty tracts that experienced the largest gains in household income from 1990 to 2010 began the period with the highest density of agencies and gained the most agencies through the study period.

We theorize that the benefits of agglomeration and the marketization of welfare may explain the emergence of this spatial mismatch between expected population need for services and the availability of services. Agglomeration economics posits that there are advantages when similar businesses and institutions locate near one another and near other physical and commercial resources that support the mission of these institutions. Agglomerative effects may create centers of gravity that increasingly concentrate service providers in certain neighborhoods through time. The marketization of welfare describes social service providers’ increasing budgetary reliance on fee-for-service activities. Service providers who partially rely on fee-for-service activities may be particularly attracted to high-poverty areas that are on an upward economic trajectory and gaining residents who can afford to pay for services.

Agglomeration benefits predict that social services will spatially cluster, and further analyses of our data suggested that clustering of agencies was linked to other elements of urban built form, such as a more robust retail, commercial and institutional environment and access to rail transit. As hospitals and health care systems are increasingly becoming stakeholders in local urban planning, zoning and economic development decisions, they should consider how decisions about urban form may influence spatial access to social services. Hospital systems’ advocacy for transit-oriented design and mixed land use may create the conditions that attract social service agencies into hospital catchment areas. Given the significant role that market forces play in determining the placement of services, attention to interdisciplinary theories across urban planning, economics and social and health services research is needed to improve spatial access to social services.

Posted in Economic Development, Health Care Access, Social Determinants, Socioeconomic status, Transportation

Higher Neighborhood Walkability is Associated with a Lower Risk of Excessive Weight Gain During Pregnancy

In partnership with the NYC Department of Health and Mental Hygiene we have been studying how neighborhood environments influence health during pregnancy and birth outcomes, with recent work focusing on weight gain during pregnancy.   In 2009, the Institute of Medicine (IOM) issued revised recommendations for healthy gestational weight gain (GWG). However, despite the new guidelines, most pregnant individuals in the U.S. still do not gain the recommended amount of weight during pregnancy; almost 50% of pregnant individuals gain more weight than is recommended for a healthy pregnancy. Excessive GWG is associated with higher risk of pregnancy complications, including pregnancy-related hypertension and greater long-term postpartum weight retention. Excessive GWG is also associated with increased odds of child asthma, obesity, and greater percent body fat and abdominal adiposity.   

Using birth record data from all births in New York City in 2015, we found that higher neighborhood walkability was associated with lower risk of excessive gestational weight gain. This protective association was seen after controlling for the pregnant individual’s age, race, place of birth, and education and for the poverty rate in the residential neighborhood. Further analyses that adjusted for pre-pregnancy body mass index suggest that the link between neighborhood walkability and lower risk of excessive gestational weight gain was due to differences in physical activity patterns, especially walking, during pregnancy. This interpretation is consistent with past studies that find pregnant individuals favor lower intensity forms of exercise such as walking and that walking activity during mid-pregnancy is associated with lower risk of excessive gestational weight gain. We have previously shown that higher neighborhood walkability is associated with more walking and more total physical activity. The paper describing our research on gestational weight gain was published in the journal Obesity.

Multiple guidelines exist for planners and architects on how to design for health, including the NYC Active Design Guidelines, the WELL Community Standard, the American Institute of Architects Healthy Design Research Consortium, and the Department of Health and Human Services’ Healthy People 2020 guidelines. However, due to limited research on the implications of active design for health during pregnancy, few such guides consider pregnant individuals and their infants. Given the long-lasting benefits of healthy pregnancies for parental and child health, this research provides further impetus for the use of urban design to support healthy weight and reduce the risk of excessive gestational weight gain and associated health risks.

Posted in Active Transport, Adults, Childhood, Healthy Pregnancies, Physical Activity, Urban Design, Walkability

Maintaining patient privacy while geocoding patient addresses: Do Not Use R to Geocode

Imagine if a clinical researcher were to disclose a list of patient addresses to a third party (a government agency, for-profit company or not-for-profit entity) outside of their hospital or health system. Imagine the researcher then publicly announced that they had disclosed the addresses to the third party, that the addresses belonged to patients with a specific disease, and that those patients were being treated at a specific hospital. The researcher’s Institutional Review Board (IRB) and Health Insurance Portability and Accountability Act (HIPAA) compliance office would be outraged at these violations of patient privacy. Yet this sequence of events can happen inadvertently when studying how neighborhood conditions such as access to medical facilities or neighborhood food environments affect clinical outcomes in specific patient populations. A quick search of Google Scholar shows many articles that, through this sequence of events, have disclosed patient health data.

In a recent pre-press publication we show how geocoding patient or study subject addresses using a variety of R packages, Stata, SAS and QGIS can set off a cascade of events that discloses Personally Identifiable Information (PII) and Protected Health Information (PHI) in violation of usual IRB and HIPAA rules. We also show the flaws in several approaches proposed to protect PII and PHI in neighborhood health effects research and propose best practices to protect patient and study subject confidentiality in studies on neighborhood health effects.

Posted in Health Care Access, Methods, Tools

Neighborhood Walkability and Body Mass Index among African American Cancer Survivors

Increasingly, health care systems are becoming stakeholders in urban design and infrastructure planning processes, and are considering how neighborhood environments can support the health of communities and patient populations within health system catchment areas. To this end, health systems are: contracting with planning firms to create Health District Plans with urban infrastructure that promotes healthy lifestyles; working alongside community partners to improve alignment and delivery of local health and supportive care resources; and incorporating neighborhood-level data in electronic health records (EHR) as a new type of patient “vital sign.” And while the extant literature indicates associations between neighborhood built environments and health outcomes, particularly for obesity, in the United States (US) general population, few studies have explored these relationships in cancer survivors, and even fewer in non-white cancer survivor populations.

Cancer survivors are at heightened risk of weight gain after diagnosis, since they are susceptible to the energy-balance-related causes of weight gain common in the general population, as well as to cancer treatment-related weight gain. From 1997 to 2014, obesity increased more rapidly among adult cancer survivors compared with the general population. Colorectal and breast cancer survivors and non-Hispanic black survivors were at the highest risk of experiencing obesity during this period. Recent evidence among survivors of obesity-related cancers suggests that weight gain is related to higher risk of recurrence, cancer mortality, and all-cause mortality. 

In a recent publication, we assessed the cross-sectional association of residential neighborhood walkability with body mass index (BMI) in 2089 African Americans who had recently been diagnosed with cancer in Metropolitan Detroit, Michigan. Similar to prior research in the general population, among these cancer survivors we found that greater neighborhood walkability was associated with lower BMI. When we stratified these analyses by biologic sex, we observed the inverse association in men but not women and, separately, among survivors reporting any regular physical activity post-diagnosis. The sex-specific findings are similar to those observed in previous studies of general populations. To our knowledge, this is the first survivorship investigation of neighborhood walkability and BMI to focus solely on African Americans, to include survivors of more than one obesity-related cancer (i.e., breast, colorectal, prostate), to incorporate men, and to use a multidimensional walkability index. Our research provides initial evidence that built environment factors influence weight among African American cancer survivors and supports health systems’ involvement in local urban design and planning decisions.

This research was conducted in collaboration with colleagues at Wayne State University, using data from cancer survivors participating in the Detroit Research on Cancer Survivors (ROCS) cohort study.

Posted in Adults, Cancer Survivors, Urban Design, Walkability

Improving the measurement of Neighborhood Physical Disorder

Neighborhood audit methods (also known as Systematic Social Observation) are often used to create measures of neighborhood built and social environments. But even with the enhanced efficiency of virtual neighborhood audit methods using CANVAS-Street View, it is generally not possible to collect data from every block in a city. Thus, spatial interpolation methods, such as kriging, are often used to estimate neighborhood conditions at locations not visited by the audit team. Ordinary kriging uses the data collected at visited locations (blocks or intersections), and the spatial correlation between the data elements, to estimate conditions at all other locations in a neighborhood or city. In recently published work we investigated whether universal kriging could be used to create improved estimates of neighborhood physical disorder across entire cities. Universal kriging builds upon ordinary kriging by using additional external data, such as Census data, in the interpolation/estimation process. We found that using additional data on housing vacancy, along with the observed physical disorder metrics, in a universal kriging model could improve model fit and estimation of physical disorder across a city. In addition, universal kriging could produce equivalently accurate estimates of physical disorder while requiring the collection of disorder measures from fewer locations across a city.
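To make the difference between the two approaches concrete, here is a minimal sketch in Python with NumPy. It is not the code used in our published analysis. The exponential variogram and its default parameters, the function names, and the toy coordinates and vacancy values are all assumptions chosen for illustration; in real work the variogram is fitted to the audit data using established geostatistics software.

```python
import numpy as np

def exp_variogram(h, sill=1.0, rng=1.0, nugget=0.0):
    """Exponential variogram (assumed model); gamma(0) is defined as 0."""
    h = np.asarray(h, dtype=float)
    return np.where(h > 0, nugget + sill * (1.0 - np.exp(-h / rng)), 0.0)

def krige(coords, values, target, drift=None, target_drift=None, **vg):
    """Predict the value at `target` from observed `values` at `coords`.

    With drift=None this is ordinary kriging (constant unknown mean).
    Passing an external covariate (e.g. housing vacancy) as `drift`,
    with its value at the target as `target_drift`, gives universal
    kriging with an external drift term.
    Returns (prediction, kriging weights).
    """
    coords = np.asarray(coords, dtype=float)
    values = np.asarray(values, dtype=float)
    target = np.asarray(target, dtype=float)
    n = len(values)

    # Drift matrix: a column of ones, plus optional covariate columns.
    F = np.ones((n, 1))
    f0 = np.ones(1)
    if drift is not None:
        F = np.column_stack([F, np.asarray(drift, dtype=float)])
        f0 = np.concatenate([f0, np.atleast_1d(np.asarray(target_drift, dtype=float))])
    p = F.shape[1]

    # Kriging system: [[Gamma, F], [F^T, 0]] [w, mu] = [gamma_0, f_0]
    dists = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=2)
    A = np.zeros((n + p, n + p))
    A[:n, :n] = exp_variogram(dists, **vg)
    A[:n, n:] = F
    A[n:, :n] = F.T
    b = np.concatenate([exp_variogram(np.linalg.norm(coords - target, axis=1), **vg), f0])

    weights = np.linalg.solve(A, b)[:n]
    return float(weights @ values), weights

# Toy example (made-up numbers): disorder scores at five audited blocks.
coords = [[0, 0], [1, 0], [0, 1], [1, 1], [2, 1]]
vacancy = [0.10, 0.30, 0.20, 0.50, 0.40]      # external covariate
disorder = [0.8, 1.4, 1.1, 2.0, 1.7]          # observed audit measure

ok_pred, _ = krige(coords, disorder, [0.5, 0.5])
uk_pred, _ = krige(coords, disorder, [0.5, 0.5], drift=vacancy, target_drift=0.25)
print(ok_pred, uk_pred)
```

The only difference between the two calls is the extra drift column: ordinary kriging constrains the weights to sum to one, while the universal kriging call also constrains the weighted vacancy at the audited blocks to match the vacancy at the unvisited location.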

Street Viewing 125th Street
Posted in CANVAS, Methods, Street View, Tools

Newly Funded Work on Pedestrian Injury

We have recently been funded by NIH to conduct a four-year study of how urban design, the locations of alcohol-selling establishments, nightlife districts and locations of services for the homeless influence pedestrian fatality risk. We will be conducting a location-based case-control study of all pedestrian fatalities that occurred in metropolitan areas of the U.S. in 2017 and 2018 and matched control locations. We will also be expanding the capabilities of our CANVAS tool for conducting nationwide virtual neighborhood audits via Google Street View. This research builds upon our work to understand how urban form and access to parks and green spaces influence physical activity patterns. Pedestrian safety is often cited as influencing engagement in pedestrian activity, particularly for older adults and for youth. As cities make urban design changes to promote walking, there is a concern that pedestrian injuries will increase as more people take to the sidewalks. In prior work we piloted the use of virtual neighborhood audits in pedestrian injury research.

However, a closer look at the pedestrian fatality data shows that, in 2018, 38% of all fatally struck pedestrians, and 44% of those between the ages of 21 and 65 years, were under the influence of alcohol when they were struck, a prevalence that has been consistent since at least 1997. Cities across the U.S. are currently developing plans to prevent all pedestrian fatalities, yet few if any of these city plans even mention the role of alcohol use by pedestrians or describe action items to address this modifiable risk factor. The most recent set of policy and design recommendations for reducing injuries among intoxicated pedestrians was a 1996 report from Monash University in Australia. Thus, part of our research will focus on the locations of, and urban design around, alcohol-selling establishments and the locations of nightlife districts as risk factors for pedestrian fatality. We have developed methods to apply SaTScan analyses to NETS business listing data to automatically identify clusters of nightlife businesses.

We developed methods to apply SaTScan analyses to business listing data to identify clusters of nightlife businesses. Panel A. Clusters of nightlife businesses detected by SaTScan in Philadelphia in 2014. Panel B. Blue rectangle in panel A, showing the Center City/Old City clusters and the East Passyunk Corridor clusters with Census blocks that overlap with the clusters and have nightlife businesses color-coded by number of nightlife businesses.

In addition, individuals experiencing homelessness appear to be at particular risk for pedestrian injury and to comprise a large percentage of the pedestrians killed while under the influence of alcohol. The Department of Housing and Urban Development estimated that there were 552,830 people experiencing homelessness in the U.S. in 2018 and that 35% of these individuals lived in unsheltered locations, often close to traffic (e.g., under bridges and overpasses). The location of services for the homeless, and the pedestrian infrastructure around these services, appear to contribute to the risk of injury among those experiencing homelessness. Case studies in the 1996 Monash University report highlight the location of homeless shelters and other services for the homeless near high-volume roadways as a risk factor for pedestrian injuries. Our study will be the first systematic investigation of the association between alcohol-involved pedestrian fatalities and the locations of services for the homeless, encampments, and informal areas where those experiencing homelessness shelter, and the first to study pedestrian infrastructure around providers of services for the homeless.

We will also be building a new version of our CANVAS tool for conducting virtual neighborhood audits using Google Street View. The new version will include: (a) automated sampling of locations for case-control studies; (b) image annotation tools; (c) linkage to neighborhood data via application programming interfaces (APIs) for Google Maps, Walkscore.com, and the US Census; (d) addition of scale development and spatial statistics tools; and (e) integration with emerging online street/road visualization technologies and tools. The CANVAS tool is publicly available and has already been used in numerous completed and ongoing studies.

Posted in Active Transport, CANVAS, Economic Development, Methods, Pedestrian Injury, Safety, Street View, Tools, Urban Design, Walkability

Physical Activity in an Urban Environment and Associations with Air Pollution and Lung Function

In new work published in the Annals of the American Thoracic Society we analyzed the links between where children are physically active and their exposure to air pollution and lung function. Physical activity is associated with increased ventilation because of more rapid and deeper breathing. Thus, being active while being exposed to high air pollution could lead to increased inhalation of pollutant particles and gases. In urban communities, features of the built environment, including the locations where children engage in physical activity, could put individuals at risk for harmful inhaled exposures leading to lung function decrements. Thus, we conducted a study to investigate locations throughout New York City where children engaged in moderate-vigorous activity. Our hypothesis was that being physically active outdoors, particularly near major sources of traffic pollution, would be associated with increased air pollution exposure and decreased lung function.

Our study was conducted as part of a longitudinal birth cohort study in affiliation with the Columbia Center for Children’s Environmental Health that recruited pregnant, non-smoking mothers who lived in Northern Manhattan and the Bronx. At the time of enrollment in this secondary study, children were 9 to 14 years of age and still lived in NYC. The 151 children enrolled in the study wore global positioning system (GPS) devices in a vest to identify their locations and accelerometers on their wrists that measured physical activity level. Devices were worn for 24 hours with repeated measures after 5 days. We paired the GPS and accelerometer data and mapped the data using ArcGIS to determine where children engaged in moderate-vigorous activity. We also used data from the New York City Community Air Survey (NYCCAS) to determine annual average air pollution concentrations in the locations where children were physically active. To account for daily fluctuations in pollution we adjusted our analysis for daily NYC pollution measured at a single site as well as daily temperature and humidity. Lastly, we measured lung function at the end of each 24-hour physical activity monitoring period.
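The pairing step can be pictured with a small sketch in Python. This is an illustration only, not the study's processing pipeline (the mapping itself was done in ArcGIS). It assumes that each GPS fix has already been labeled with a setting such as "indoors" or "sidewalk", that both data streams are at one-minute resolution, and that a hypothetical accelerometer cut point defines moderate-vigorous activity; the function name and the cut point are invented for the example.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical counts-per-minute cut point for moderate-vigorous activity.
MVPA_THRESHOLD = 2000

def minutes_active_by_setting(gps_fixes, accel_epochs, threshold=MVPA_THRESHOLD):
    """Count active minutes per setting.

    gps_fixes:    iterable of (timestamp, setting label) pairs
    accel_epochs: iterable of (timestamp, activity counts) pairs
    Records are paired on the minute in which they were recorded;
    accelerometer minutes with no GPS fix are skipped.
    """
    setting_at = {
        ts.replace(second=0, microsecond=0): setting for ts, setting in gps_fixes
    }
    minutes = defaultdict(int)
    for ts, counts in accel_epochs:
        minute = ts.replace(second=0, microsecond=0)
        if counts >= threshold and minute in setting_at:
            minutes[setting_at[minute]] += 1
    return dict(minutes)

# Made-up records for one child over five minutes.
gps = [
    (datetime(2020, 1, 1, 15, 0, 12), "sidewalk"),
    (datetime(2020, 1, 1, 15, 1, 9), "sidewalk"),
    (datetime(2020, 1, 1, 15, 2, 30), "indoors"),
    (datetime(2020, 1, 1, 15, 3, 5), "park"),
]
accel = [
    (datetime(2020, 1, 1, 15, 0), 2500),
    (datetime(2020, 1, 1, 15, 1), 100),
    (datetime(2020, 1, 1, 15, 2), 3000),
    (datetime(2020, 1, 1, 15, 3), 2000),
    (datetime(2020, 1, 1, 15, 4), 5000),  # no GPS fix for this minute
]
print(minutes_active_by_setting(gps, accel))
```

Summing the resulting minutes per setting across a monitoring day gives totals of the kind reported as min/day for indoor, outdoor, and sidewalk or roadbed activity.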

On average, children spent more time physically active indoors (71.9 ± 74.7 min/day) compared to outdoors (38.2 ± 39.6 min/day). However, the majority of outdoor physical activity was along sidewalks and roadbeds (30.2 ± 33.3 min/day) where nitrogen dioxide (NO2) pollution was relatively high. More time spent in outdoor activity was associated with higher exposure to NO2. In warmer months, more time spent in outdoor activity was associated with lower lung function even after adjusting for air pollution exposure. This finding suggests that in addition to air pollution there may be other environmental factors that contribute to decreased lung function when children are active outdoors.

Because physical activity leads to increased ventilation, the ability to identify specific locations of activity, especially near sources of high pollution, can improve risk assessment for lung function impairment. Our findings that more outdoor active time in warmer weather months was associated with both higher NO2 exposure and lower lung function demonstrate a need to inform individuals in urban communities about location-specific exposure risks. Our findings also support the need for continued attention to improving air quality, especially in urban communities.

Posted in Accelerometers, Asthma, Childhood, GPS

Mapping Food Insecurity During the COVID-19 Pandemic

Prior to the COVID-19 pandemic, 11% of households and nearly 16% of families with children were food insecure. With schools closed and families out of work, food insecurity rates are expected to skyrocket in the coming months. During the crisis, food store shelves have frequently been empty due to bulk purchasing and an increase in at-home meal consumption. In an effort to ensure food is available for SNAP shoppers when they receive their monthly benefits, social media campaigns have encouraged non-SNAP shoppers to avoid food shopping on SNAP distribution dates. This is a particular concern in counties with large SNAP populations and in states with only a few SNAP distribution days per month (e.g., Nevada, Virginia, New Jersey). To inform these efforts, BEH’s Eliza Kinsey has developed a web mapping tool of SNAP distribution schedules and participation rates by county. This tool is intended to be used both by household food shoppers and by policy-makers designing strategic efforts to combat the food insecurity and nutrition challenges brought on by the COVID-19 crisis.

The maps use current SNAP distribution schedule information and 2018 participation counts. Dr. Kinsey plans to update the mapping tool regularly with pertinent SNAP participation changes and unemployment data.

SNAP recipients by County

Posted in Food Environment, Social Determinants, Socioeconomic status

Updated: County Level Estimates of Highly Stressed Health Care Systems

Our online mapping tool has been updated with new data showing counties that are at high risk of experiencing patient volumes that exceed their hospital capacity over the next 6 weeks. The maps show at-risk counties for three different levels of social distancing and two levels of intensity of surge responses by hospitals. The estimates use Jeffrey Shaman and colleagues’ models of disease spread and the previously posted estimates of how many critical care hospital beds can be made available under various assumptions of hospital responses to patient surges. A full description of the methods can be found here.

As in prior posts, the mapping site is a work in progress and will be updated frequently.

Posted in Community Needs Assessment, Health Care Access