TY - JOUR
T1 - An interpretable clustering approach to safety climate analysis
T2 - Examining driver group distinctions
AU - Sun, Kailai
AU - Lan, Tianxiang
AU - Goh, Yang Miang
AU - Safiena, Sufiana
AU - Huang, Yueng Hsiang
AU - Lytle, Bailey
AU - He, Yimin
N1 - Publisher Copyright: © 2023 Elsevier Ltd
PY - 2024/3
Y1 - 2024/3
AB - The transportation industry, particularly the trucking sector, is prone to workplace accidents and fatalities. Accidents involving large trucks accounted for a considerable percentage of overall traffic fatalities. Recognizing the crucial role of safety climate in accident prevention, researchers have sought to understand its factors and measure its impact within organizations. While existing data-driven safety climate studies have made remarkable progress, clustering employees based on their safety climate perception is innovative and has not been extensively utilized in research. Identifying clusters of drivers based on their safety climate perception allows the organization to profile its workforce and devise more impactful interventions. The lack of utilizing the clustering approach could be due to difficulties interpreting or explaining the factors influencing employees’ cluster membership. Moreover, existing safety-related studies did not compare multiple clustering algorithms, resulting in potential bias. To address these problems, this study introduces an interpretable clustering approach for safety climate analysis. This study compares five algorithms for clustering truck drivers based on their safety climate perceptions. It also proposes a novel method for quantitatively evaluating partial dependence plots (QPDP). Then, to better interpret the clustering results, this study introduces different interpretable machine learning measures (Shapley additive explanations, permutation feature importance, and QPDP). The Python code used in this study is available at https://github.com/NUS-DBE/truck-driver-safety-climate. This study explains the clusters based on the importance of different safety climate factors. Drawing on data collected from more than 7,000 American truck drivers, this study significantly contributes to the scientific literature. It highlights the critical role of supervisory care promotion in distinguishing various driver groups. Moreover, it showcases the advantages of employing machine learning techniques, such as cluster analysis, to enrich the scientific knowledge in this field. Future studies could involve experimental methods to assess strategies for enhancing supervisory care promotion, as well as integrating deep learning clustering techniques with safety climate evaluation.
KW - Accident prevention
KW - Cluster analysis
KW - Feature importance
KW - Interpretable machine learning
KW - Safety climate
KW - Truck driver
UR - http://www.scopus.com/inward/record.url?scp=85181052188&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85181052188&partnerID=8YFLogxK
U2 - 10.1016/j.aap.2023.107420
DO - 10.1016/j.aap.2023.107420
M3 - Article
C2 - 38159513
AN - SCOPUS:85181052188
SN - 0001-4575
VL - 196
JO - Accident Analysis and Prevention
JF - Accident Analysis and Prevention
M1 - 107420
ER -