This paper is the fifth in a five-part series on statistical methodology for performance assessment of multi-parametric quantitative imaging biomarkers (mpQIBs) for radiomic analysis. Radiomics is the process of extracting visually imperceptible features from radiographic medical images using data-driven algorithms. We refer to these radiomic features as data-driven imaging markers (DIMs): quantitative measures, discovered from images under a data-driven framework, that lie beyond visual recognition but are evident as patterns of disease processes, irrespective of whether ground truth exists for the true value of the DIM. This paper aims to set guidelines on how to build machine learning models using DIMs in radiomics and how to apply and report them appropriately. We provide a list of recommendations, named RANDAM (an abbreviation of “Radiomic ANalysis and DAta Modeling”), for analysis, modeling, and reporting in a radiomic study, with the goal of making machine learning analyses in radiomics more reproducible. RANDAM contains five main components to use in reporting radiomics studies: design, data preparation, data analysis and modeling, reporting, and material availability. Real case studies in lung cancer research are presented along with simulation studies to compare different feature selection methods and several validation strategies.
Keywords
- data-driven imaging markers
- machine learning
ASJC Scopus subject areas
- Radiology, Nuclear Medicine and Imaging