Abstract
Datasets often contain heavily underrepresented classes. This class imbalance biases models toward frequent classes and reduces performance on rare but important categories. In-process strategies such as loss weighting remain under-explored for software engineering artefacts. We investigate loss-weighting functions for code comment classification and package our methods into Beyond Balance, a reusable implementation offering multiple weighting strategies for Transformer- and Sentence-Transformer–based models. Loss weighting consistently improves F1 scores across datasets, which makes it an effective and easily adoptable imbalance-handling technique, available through Beyond Balance.