Building A Longitudinal Cohort From 9-1-1 to 1-Year Using Existing Data Sources, Probabilistic Linkage, and Multiple Imputation: A Validation Study

Craig D. Newgard, Susan Malveau, Dana Zive, Joshua Lupton, Amber Lin

Research output: Contribution to journal › Article › peer-review

20 Scopus citations


Objective: The objective was to describe and validate construction of a population-based, longitudinal cohort of injured older adults from 9-1-1 call to 1-year follow-up using existing data sources, probabilistic linkage, and multiple imputation.

Methods: This was a descriptive cohort study conducted in seven counties in Oregon and Washington from January 1, 2011, through December 31, 2011, with follow-up through December 31, 2012. The primary cohort included all injured adults ≥ 65 years served by 44 emergency medical services (EMS) agencies. We used nine existing databases to assemble the cohort, including EMS data, two state trauma registries, two state discharge databases, two state vital statistics databases, the Oregon Physician Order for Life-Sustaining Treatment registry, and Medicare claims data. We matched data files using probabilistic linkage and handled missing values with multiple imputation. We independently validated data processes using 1,350 randomly sampled records for probabilistic linkage and 3,140 randomly sampled records for variables created from existing data sources.

Results: There were 15,649 injured older adults in the primary cohort, with 13,661 (87.3%) total matched records and 9,337 (59.7%) matches to the index ED/hospital visit. The sensitivity of linkage was 99.9% (95% confidence interval [CI] = 99.3%–100%) for any match and 98.3% (95% CI = 96.2%–99.4%) for index event matches. The specificity of linkage was 95.7% (95% CI = 93.7%–97.2%) for any match and 100% (95% CI = 99.2%–100%) for index event matches. Name, date of birth, home zip code, age, and hospital had the highest yield for linkage. Patients with matched records tended to be higher acuity than unmatched patients, suggesting selection bias if unmatched patients were excluded. Compared to hand-abstracted values, the sensitivity of electronically derived variables ranged from 18.2% (abdominal-pelvic Abbreviated Injury Scale score ≥ 3) to 97.4% (in-hospital mortality), with specificity of 88.0% to 99.8%.

Conclusions: A population-based emergency care cohort with long-term outcomes can be constructed from existing data sources with high accuracy and reasonable validity of resulting variables.
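The match rates reported above follow directly from the cohort counts, and the linkage accuracy figures are standard 2x2 validation statistics. The sketch below reproduces the two match percentages from the reported counts and shows how sensitivity, specificity, and a 95% interval might be computed. It is a minimal illustration only: the abstract does not state which interval method the authors used, so the Wilson score interval here is an assumption, and the 2x2 counts in the example are hypothetical, not taken from the study.

```python
from math import sqrt


def percent(part: int, whole: int) -> float:
    """Return part/whole as a percentage."""
    return 100.0 * part / whole


def wilson_interval(successes: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion.

    Assumption: chosen for illustration; the abstract does not say
    which confidence-interval method the study used.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half = z * sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return max(0.0, centre - half), min(1.0, centre + half)


def sensitivity_specificity(tp: int, fn: int, tn: int, fp: int) -> tuple[float, float]:
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    return tp / (tp + fn), tn / (tn + fp)


# Counts reported in the abstract.
COHORT = 15_649
ANY_MATCH = 13_661
INDEX_MATCH = 9_337

any_match_pct = percent(ANY_MATCH, COHORT)      # rounds to 87.3
index_match_pct = percent(INDEX_MATCH, COHORT)  # rounds to 59.7

# Hypothetical 2x2 validation counts (NOT from the study), used only to
# show the calculation of sensitivity and specificity with an interval.
tp, fn, tn, fp = 90, 10, 95, 5
sens, spec = sensitivity_specificity(tp, fn, tn, fp)
sens_lo, sens_hi = wilson_interval(tp, tp + fn)
```

With the study's actual validation counts in place of the hypothetical ones, the same functions would yield the corresponding point estimates; the interval bounds could differ slightly from the published ones if the authors used a different method, such as an exact binomial interval.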

Original language: English (US)
Pages (from-to): 1268-1283
Number of pages: 16
Journal: Academic Emergency Medicine
Issue number: 11
State: Published - Nov 2018

ASJC Scopus subject areas

  • Emergency Medicine

