Abstract
Static code attributes are widely used in defect prediction studies as an abstraction model because they capture general properties of the program. To counter buffer overflow exploits, programmers use buffer size checking and input validation schemes. In this paper, we propose lightweight, easily extracted static code attributes that characterize the buffer overflow safety mechanisms and input validation checks implemented in the code. We then apply data mining methods to the collected attributes to predict buffer overflows in application programs. In experiments across five applications, our best classifier achieved a recall of 95% and a precision of over 80%, suggesting that the proposed static code attributes are effective indicators for predicting buffer overflows.