Bogus Bugs, Duplicates, and Revealing Comments: Data Quality Issues in NPR

Julian Aron Prenner; R Robbes

doi:10.1109/APR66717.2025.00012

Conference proceeding

2025 IEEE/ACM International Workshop on Automated Program Repair, APR 2025, Ottawa, Ontario, Canada, 29 April 2025, Proceedings, pp.43-47

2025 IEEE/ACM International Workshop on Automated Program Repair (APR) (Ottawa, 29/04/2025–29/04/2025)

2025

DOI: https://doi.org/10.1109/APR66717.2025.00012

Handle: https://hdl.handle.net/10863/51371

Abstract

The performance of a machine learning system is not only determined by the model but also, to a substantial degree, by the data it is trained on. With the increasing use of machine learning, issues related to data quality have become a concern also in automated program repair research. In this position paper, we report some of the data-related issues we have come across when working with several large APR datasets and benchmarks, including, for instance, duplicates or 'bogus bugs'. We briefly discuss the potential impact of these problems on repair performance and propose possible remedies. We believe that more data-focused approaches could improve the performance and robustness of current and future APR systems.

Keywords: automated program repair; data quality

Files and links (1)

https://doi.org/10.1109/APR66717.2025.00012

Details

Title: Bogus Bugs, Duplicates, and Revealing Comments: Data Quality Issues in NPR
Creators: Julian Aron Prenner (Free University of Bozen-Bolzano); R Robbes (Laboratoire Bordelais de Recherche en Informatique)
Publication Details: 2025 IEEE/ACM International Workshop on Automated Program Repair, APR 2025, Ottawa, Ontario, Canada, 29 April 2025, Proceedings, pp.43-47
ISBN: 9798331525859
Conference: 2025 IEEE/ACM International Workshop on Automated Program Repair (APR) (Ottawa, 29/04/2025–29/04/2025)
Publisher: IEEE, Piscataway, NJ
Format: Online
Number of pages: 5
Identifiers: 979-8-3315-2585-9; (UNIBZ)94008334; 991007295850301241
Web of Science ID: 001558912100007
Scopus ID: 2-s2.0-105009594050
Academic Unit: Faculty of Engineering
Language: English
Resource Type: Conference proceeding
Author Names String: Prenner JA, Robbes R
