Disaggregating Latino nativity in equity research using electronic health records

Miguel Marino, Katie Fankhauser, Jessica Minnier, Jennifer A. Lucas, Sophia Giebultowicz, Jorge Kaufmann, Jun Hwang, Steffani R. Bailey, Danielle M. Crookes, Andrew Bazemore, Shakira F. Suglia, John Heintzman

Research output: Contribution to journal › Article › peer-review

Abstract

Objective: To develop and validate prediction models for inference of Latino nativity to advance health equity research.

Data Sources/Study Setting: This study used electronic health records (EHRs) from 19,985 Latino children with self-reported country of birth seeking care from January 1, 2012 to December 31, 2018 at 456 community health centers (CHCs) across 15 states, along with census-tract geocoded neighborhood composition and surname data.

Study Design: We constructed and evaluated the performance of prediction models within a broad machine learning framework (Super Learner) for the estimation of Latino nativity. Outcomes included binary indicators denoting nativity (US vs. foreign-born) and Latino country of birth (Mexican, Cuban, Guatemalan). The performance of these models was compared using the area under the receiver operating characteristic curve (AUC) from an externally withheld patient sample.

Data Collection/Extraction Methods: Census surname lists, census neighborhood composition, and Forebears administrative data were linked to EHR data.

Principal Findings: Of the 19,985 Latino patients, 10.7% reported a non-US country of birth (5.1% Mexican, 4.7% Guatemalan, 0.8% Cuban). Overall, prediction models for nativity showed outstanding performance with external validation (US-born vs. foreign: AUC = 0.90; Mexican vs. non-Mexican: AUC = 0.89; Guatemalan vs. non-Guatemalan: AUC = 0.95; Cuban vs. non-Cuban: AUC = 0.99).

Conclusions: Among the challenges facing health equity researchers in health services is the absence of methods for data disaggregation, in particular the ability to determine Latino country of birth (nativity) to inform disparities research. Recent interest in more robust health equity research has called attention to the importance of data disaggregation. In a multistate network of CHCs, using multilevel inputs from EHR data linked to surname and community data, we developed and validated novel prediction models that use available EHR data to infer Latino nativity for health disparities research in primary care and health services research. This is a significant potential methodologic advance in studying this population.
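
For readers unfamiliar with the AUC metric reported in the abstract, the following is a minimal sketch of how an AUC can be computed for a binary outcome such as US-born vs. foreign-born. It uses the rank interpretation of AUC: the probability that a randomly chosen positive case receives a higher predicted score than a randomly chosen negative case, with ties counted as one half. The function name `auc` and the toy labels and scores are illustrative assumptions; they are not the study's data, and the abstract does not state which software the authors used.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    labels: iterable of 0/1 outcomes (1 = positive class, e.g. foreign-born).
    scores: iterable of predicted scores, higher meaning "more likely positive".
    Returns the fraction of (positive, negative) pairs ranked correctly,
    counting tied scores as half a correct ranking.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical held-out sample: made-up labels and model scores.
labels = [1, 0, 1, 0, 0, 1]
scores = [0.9, 0.2, 0.4, 0.5, 0.1, 0.8]
print(f"AUC = {auc(labels, scores):.2f}")
```

An AUC of 0.5 corresponds to ranking no better than chance and 1.0 to perfect separation. The pairwise loop is quadratic in sample size, so for a cohort of this size a sort-based implementation from a statistics library would be the more practical choice.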

Original language: English (US)
Pages (from-to): 1119-1130
Number of pages: 12
Journal: Health Services Research
Volume: 58
Issue number: 5
State: Published - Oct 2023

Keywords

  • U.S. Census location
  • ethnicity
  • health disparities
  • machine learning
  • surname data

ASJC Scopus subject areas

  • Health Policy
