Partially versus purely data-driven approaches in SARS-CoV-2 prediction

Samar A. Shilbayeh, Abdullah Abonamah, Ahmad A. Masri

Research output: Contribution to journal › Article › peer-review

1 Scopus citations


Machine learning models for predicting coronavirus disease range from forecasting future suspected cases and predicting mortality rates to modeling country-specific pandemic end dates. We categorize the approaches found in the literature for predicting future suspected infections and deaths into two groups. The first is a purely data-driven approach, whose goal is to build a mathematical model that relates the output variables to the input variables in order to detect general patterns. The discovered patterns can then be used to predict future infected cases without any expert input. The second approach is partially data-driven: it uses historical data but also allows expert input, such as the SIR epidemic model. This approach assumes that the epidemic will end according to medical reasoning. In this paper, we compare the purely data-driven and partially data-driven approaches by applying them to data from three countries with different past patterns: the US, Jordan, and Italy. The two approaches yield significantly different results. The purely data-driven approach depends entirely on past behavior and does not show any decline in the number of infected cases if the country has not yet experienced one. The partially data-driven approach, on the other hand, guarantees a timely decline of the infected curve to zero. Comparing the two approaches highlights the importance of human intervention in guiding the learning process in pandemic prediction, as opposed to the purely data-driven approach, which predicts future cases based solely on the patterns detected in the data.
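The contrast described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation and does not use their data: the population size, the rates `beta` and `gamma`, the 30-day observation window, and all function names are assumptions chosen for illustration. An exponential regression fitted to the early growth phase of a synthetic outbreak keeps growing when extrapolated, while an SIR simulation of the same outbreak eventually declines toward zero.

```python
import math


def simulate_sir(population, infected0, beta, gamma, days, steps_per_day=10):
    """Integrate the SIR equations with forward Euler; return daily infected counts."""
    s, i = float(population - infected0), float(infected0)
    dt = 1.0 / steps_per_day
    infected = [i]
    for _ in range(days):
        for _ in range(steps_per_day):
            new_infections = beta * s * i / population * dt
            new_recoveries = gamma * i * dt
            s -= new_infections
            i += new_infections - new_recoveries
        infected.append(i)
    return infected


def fit_exponential(values):
    """Least-squares fit of log(y) = a + b * t for t = 0, 1, 2, ...; return (a, b)."""
    n = len(values)
    ts = range(n)
    logs = [math.log(v) for v in values]
    t_mean = sum(ts) / n
    log_mean = sum(logs) / n
    slope_num = sum((t - t_mean) * (y - log_mean) for t, y in zip(ts, logs))
    slope_den = sum((t - t_mean) ** 2 for t in ts)
    b = slope_num / slope_den
    a = log_mean - b * t_mean
    return a, b


def predict_exponential(a, b, day):
    """Extrapolate the fitted exponential curve to the given day."""
    return math.exp(a + b * day)


# Synthetic outbreak (all parameter values are illustrative assumptions).
POPULATION = 1_000_000
sir_curve = simulate_sir(POPULATION, infected0=10, beta=0.3, gamma=0.1, days=300)

# Purely data-driven view: only the first 30 days, which contain no decline.
observed = sir_curve[:30]
a, b = fit_exponential(observed)

if __name__ == "__main__":
    for day in (30, 100, 300):
        print(day, round(sir_curve[day], 2), f"{predict_exponential(a, b, day):.3g}")
```

In this sketch the decline in the SIR curve comes from the model structure (the susceptible pool is depleted and infected individuals recover), not from anything in the 30 observed days, which is the kind of expert input the abstract attributes to the partially data-driven approach.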

Original language: English (US)
Article number: 5696
Journal: Applied Sciences (Switzerland)
Issue number: 16
State: Published - Aug 2020


Keywords

  • Corona virus detection
  • Exponential regression
  • Exponential smoothing model
  • Infected cases prediction
  • Linear regression
  • Partially data-driven approach
  • Purely data-driven approach
  • SIR model

ASJC Scopus subject areas

  • Materials Science(all)
  • Instrumentation
  • Engineering(all)
  • Process Chemistry and Technology
  • Computer Science Applications
  • Fluid Flow and Transfer Processes

