Query-Based Configuration of Text Retrieval Solutions for Software Engineering Tasks
Di Penta, M
Text Retrieval (TR) approaches have been used to leverage the textual information contained in software artifacts to address a multitude of software engineering (SE) tasks. However, TR approaches must be configured properly to produce good results. Current approaches for automatic TR configuration in SE configure a single TR approach and then use it for all possible queries. In this paper, we show that such a configuration strategy leads to suboptimal results, and propose QUEST, the first approach bringing TR configuration selection to the query level. QUEST recommends the best TR configuration for a given query, using a supervised learning approach that determines, from the properties of each query, the TR configuration that performs best for it. We evaluated QUEST in the context of feature and bug localization, using a data set with more than 1,000 queries. We found that QUEST recommends one of the top three TR configurations for a query with 69% accuracy, on average. We compared the results obtained with the configurations recommended by QUEST for every query against those obtained using a single TR configuration for all queries in a system and in the entire data set. We found that QUEST yields better results than any of the considered TR configurations.
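
The idea of selecting a TR configuration per query can be illustrated with a minimal sketch. The abstract does not name the classifier, the query properties, or the candidate configurations, so everything below is an assumption made for illustration: a k-nearest-neighbour vote stands in for the supervised learner, the two-element property vectors (a query length and a specificity score) are hypothetical, and the configuration labels are invented.

```python
from collections import Counter
from math import dist

# Hypothetical training data: (query property vector, best-performing TR configuration).
# Property vectors are (query length, specificity score); both the properties and
# the configuration labels are illustrative assumptions, not taken from the paper.
TRAINING = [
    ((3.0, 0.9), "VSM+stemming"),
    ((4.0, 0.8), "VSM+stemming"),
    ((12.0, 0.3), "LSI+stopwords"),
    ((15.0, 0.2), "LSI+stopwords"),
    ((14.0, 0.4), "LSI+stopwords"),
]


def recommend_configuration(query_properties, training=TRAINING, k=3):
    """Return the configuration most common among the k nearest training queries."""
    # Rank past queries by Euclidean distance to the new query's properties.
    nearest = sorted(training, key=lambda example: dist(example[0], query_properties))[:k]
    # Majority vote over the configurations that worked best for those neighbours.
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    print(recommend_configuration((3.5, 0.85)))
    print(recommend_configuration((13.0, 0.3)))
```

In this sketch, a short and specific query is routed to one configuration and a long and vague query to another, in contrast to applying a single configuration to every query.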