Abstract
Software defect data has an imbalanced and highly skewed class distribution. The misclassification costs of two classes are not equal nor are known. It is critical to find the optimum bound, i.e. threshold, which would best separate defective and defect-free classes in software data. We have applied decision threshold optimization on Naïve Bayes classifier in order to find the optimum threshold for software defect data. ROC analyses show that decision threshold optimization significantly decreases false alarms (on the average by 11%) without changing probability of detection rates.