2023 IEEE International Conference on Data Mining (ICDM)

Abstract

Software bug localization is the task of identifying the source code files (i.e., the bug location) that correspond to the bug described in a bug report. Most existing bug localization approaches have three limitations: (L1) they use only partial content of the bug report (i.e., the title and description); (L2) they attempt direct semantic understanding of entire bug reports and source files; and (L3) they rely solely on semantic matching between bug reports and source files. To overcome these limitations, this paper constructs datasets in which the content of each bug report is augmented with prefix comments to address L1, and presents a novel bug localization model named BRS_BL. Specifically, BRS_BL includes a specially tailored bug report summarization module that extracts core information for the semantic representation of bug reports, and a source file chunking module that splits source code files into blocks based on lines and words, to address L2. It further uses a fine-grained matching module that combines semantic matching with well-characterized software-specific features to address L3. The experimental results show that BRS_BL significantly outperforms existing representative bug localization techniques on several evaluation metrics across four real-world projects.
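As an illustration of the chunking idea only, the following minimal sketch splits a source file into blocks bounded by a line budget and a word budget. It is not the BRS_BL implementation: the function name `chunk_source`, the parameters `max_lines` and `max_words`, their default values, and the whitespace-based word count are all assumptions made for this example, since the abstract does not specify them.

```python
def chunk_source(text, max_lines=50, max_words=200):
    """Split source text into blocks limited by line count and word count.

    Hypothetical helper: the budgets and the whitespace tokenization are
    assumptions for illustration, not values taken from the paper.
    """
    blocks = []      # finished blocks, each a newline-joined string
    current = []     # lines collected for the block being built
    words = 0        # number of whitespace-separated words in `current`

    for line in text.splitlines():
        n = len(line.split())
        # Close the current block if adding this line would exceed a budget.
        if current and (len(current) + 1 > max_lines or words + n > max_words):
            blocks.append("\n".join(current))
            current, words = [], 0
        # A single line longer than max_words still gets a block of its own.
        current.append(line)
        words += n

    if current:
        blocks.append("\n".join(current))
    return blocks
```

Under these assumptions, each block could then be encoded and matched against the bug report summary separately, rather than encoding the whole file at once.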