Abstract
In clustering problems, latent variable models are frequently used to model the intrinsic structure of unlabeled data. These model-based clustering methods often provide a clustering rule that minimizes the total false assignment error. However, in many clustering applications, it is desirable to treat false assignment errors for a certain cluster differently. In this paper, we introduce the false assignment rate for clustering and estimate it using the extended likelihood approach. We propose VRclust, a novel clustering rule that controls errors differently across clusters. Real data examples illustrate how the false assignment rate is estimated, and a simulation study shows that the error control is consistent as the sample size increases.
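To make the idea of a per-cluster false assignment rate concrete, the following is a minimal sketch and not the paper's method: it assumes posterior membership probabilities are already available from some fitted mixture model, assigns each observation to its most probable cluster, and averages one minus the posterior over the members of a cluster. The function names and the toy `posteriors` values are hypothetical, and this plug-in average stands in for the extended likelihood estimator, which the abstract does not specify.

```python
from typing import List, Sequence


def map_assign(posteriors: Sequence[Sequence[float]]) -> List[int]:
    """Assign each observation to the cluster with the largest posterior."""
    return [max(range(len(row)), key=row.__getitem__) for row in posteriors]


def estimated_false_assignment_rate(
    posteriors: Sequence[Sequence[float]],
    labels: Sequence[int],
    cluster: int,
) -> float:
    """Plug-in estimate (an assumption, not the paper's estimator):
    the mean of 1 - posterior over observations assigned to `cluster`."""
    members = [row[cluster] for row, lab in zip(posteriors, labels) if lab == cluster]
    if not members:
        return 0.0  # no observation assigned to this cluster
    return sum(1.0 - p for p in members) / len(members)


# Hypothetical posterior probabilities for five observations and two clusters.
posteriors = [
    [0.90, 0.10],
    [0.80, 0.20],
    [0.60, 0.40],
    [0.30, 0.70],
    [0.05, 0.95],
]

labels = map_assign(posteriors)
far_by_cluster = [
    estimated_false_assignment_rate(posteriors, labels, k) for k in range(2)
]
```

In this sketch the two clusters receive different estimated rates, which is the kind of cluster-specific quantity a rule with unequal error control would act on.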
Original language | English (US) |
---|---|
Pages (from-to) | 2932-2944 |
Number of pages | 13 |
Journal | Statistical Methods in Medical Research |
Volume | 29 |
Issue number | 10 |
State | Published - Oct 1 2020 |
Keywords
- Clustering
- extended likelihood
- false assignment rate
ASJC Scopus subject areas
- Epidemiology
- Statistics and Probability
- Health Information Management