Proceedings 26th Annual International Computer Software and Applications

Abstract

The data mining system LERS (Learning from Examples based on Rough Sets) was used to induce rule sets from a data set describing melanoma, a dangerous skin cancer. The main objective of our research was to decrease the error rates for diagnosis of two fatal forms of melanoma based on these rule sets. The improvement was accomplished using two techniques for postprocessing rule sets: changing rule strengths and applying rule truncation cutoffs. A rule strength is defined as the number of training cases correctly classified by the rule. Rule truncation means eliminating weaker rules. The simultaneous use of these two techniques is introduced in this paper for the first time. The criterion for choosing the optimal form of the rule sets was the minimum of the sum of error rates for diagnosis of the two fatal forms of melanoma. Our research shows that, at the cost of a minimal increase in the total error rate for patients who do not need immediate help, the sum of error rates for the dangerous forms of melanoma may be substantially decreased. Moreover, for the optimal rule set, the sum of error rates for all forms of melanoma is minimal as well.
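
The two definitions above (rule strength and rule truncation) can be illustrated with a minimal Python sketch. It is not the LERS implementation: the `Rule` class, the toy attributes and class labels, the inclusive cutoff, and the use of a per-class multiplier to change rule strengths are all assumptions made here for illustration only.

```python
from dataclasses import dataclass


@dataclass
class Rule:
    """Hypothetical rule: a conjunction of attribute-value conditions and a decision."""
    conditions: dict  # attribute name -> required value
    decision: str     # class predicted when all conditions hold

    def matches(self, case):
        return all(case.get(attr) == val for attr, val in self.conditions.items())


def rule_strength(rule, training_cases):
    """Number of training cases that the rule matches and classifies correctly."""
    return sum(
        1
        for case, label in training_cases
        if rule.matches(case) and label == rule.decision
    )


def truncate(rules, training_cases, cutoff):
    """Eliminate weaker rules: keep those with strength >= cutoff (inclusive by assumption)."""
    return [r for r in rules if rule_strength(r, training_cases) >= cutoff]


def scaled_strength(rule, training_cases, multipliers):
    """Change a rule's strength by a per-class multiplier (assumed mechanism)."""
    return rule_strength(rule, training_cases) * multipliers.get(rule.decision, 1)


# Toy, made-up training cases; not data from the paper.
training = [
    ({"color": "dark", "border": "irregular"}, "malignant"),
    ({"color": "dark", "border": "regular"}, "malignant"),
    ({"color": "light", "border": "regular"}, "benign"),
    ({"color": "light", "border": "irregular"}, "benign"),
    ({"color": "dark", "border": "irregular"}, "benign"),
]

r1 = Rule({"color": "dark"}, "malignant")
r2 = Rule({"color": "light"}, "benign")
r3 = Rule({"color": "dark", "border": "irregular"}, "benign")
rules = [r1, r2, r3]

kept = truncate(rules, training, cutoff=2)
boosted = scaled_strength(r1, training, {"malignant": 3})
```

In this toy setting, raising the multiplier for a dangerous class gives its rules more weight, while raising the cutoff discards rules supported by few training cases. Searching over both settings together would correspond to the combined use of the two techniques described in the abstract.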
