Revealing Hidden Repeaters in the CHIME/FRB Catalog: Semi-Supervised Insights into the Fast Radio Burst Population
Authors
N. Mankatwit
P. Thongkonsing
S. Loekkesee
P. Chainakun
W. Luangtip
S. Sanpa-arsa
Abstract
Fast radio bursts (FRBs) are millisecond-duration extragalactic transients, observationally classified as repeaters or non-repeaters. This classification may be biased, as some apparently non-repeating sources could simply have undetected subsequent bursts. To address this, we develop a semi-supervised learning framework to identify distinguishing features of repeaters using primary observational parameters from the Blinkverse database, which draws from the CHIME/FRB Catalogs. The framework combines labeled data (known repeaters and confidently classified non-repeaters) with unlabeled sources previously flagged as non-repeaters but exhibiting repeater-like characteristics. We employ uniform manifold approximation and projection (UMAP) with a nearest-neighbor scheme to select potential candidates, followed by semi-supervised classification using five base estimators: random forest, support vector machine, logistic regression, AdaBoost, and gradient boosting. Each model is tuned through cross-validation, and a voting strategy among the five models is employed to enhance robustness. All models achieve consistently high performance and identify dispersion measure, peak frequency, and fluence as the most discriminative features. Repeaters tend to show lower dispersion measures, higher peak frequencies, and higher fluences than non-repeaters. We also identify a set of candidate repeaters, several of which are consistent with prior independent studies. Our approach identifies 36 additional repeater candidates that conventional methods may have missed. Finally, the results highlight dispersion measure as a key discriminator between repeaters and non-repeaters, revealing a tension between physical and instrumental origins: either environmental effects, if the two populations arise from distinct progenitors, or detection bias, as nearby sources are more easily observed.
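The voting step among the five models can be illustrated with a minimal sketch. The abstract does not state whether the vote is hard or soft, or what threshold is used, so this assumes hard (label) voting with a strict-majority default. The function name `majority_vote`, the `min_votes` parameter, and all per-model labels below are hypothetical and are not taken from the paper.

```python
# Hypothetical per-model predictions for four sources
# (1 = repeater-like, 0 = non-repeater). The model names follow the
# abstract; the labels are made up purely for illustration.
predictions = {
    "random_forest":       [1, 0, 1, 0],
    "svm":                 [1, 0, 0, 0],
    "logistic_regression": [1, 1, 0, 0],
    "adaboost":            [0, 0, 1, 0],
    "gradient_boosting":   [1, 0, 1, 1],
}


def majority_vote(per_model_labels, min_votes=None):
    """Return one label per source: 1 if at least `min_votes` models predict 1.

    Assumption: hard voting on binary labels. If `min_votes` is None,
    a strict majority of the models is required.
    """
    models = list(per_model_labels.values())
    n_models = len(models)
    if min_votes is None:
        min_votes = n_models // 2 + 1  # strict majority, e.g. 3 of 5
    n_sources = len(models[0])
    if any(len(labels) != n_sources for labels in models):
        raise ValueError("all models must label the same number of sources")
    return [
        1 if sum(labels[i] for labels in models) >= min_votes else 0
        for i in range(n_sources)
    ]


consensus = majority_vote(predictions)               # at least 3 of 5 agree
unanimous = majority_vote(predictions, min_votes=5)  # all 5 must agree
print(consensus, unanimous)
```

Raising `min_votes` trades completeness for purity: a unanimous vote yields fewer candidates, but each one is supported by every model.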