Abstract
Allergy, a hypersensitivity disorder of the immune system, is a steadily growing health problem worldwide. The use of transgenic food products is constantly increasing, and such products must therefore be assessed for potential allergenicity before they are introduced into the human food chain. Although several bioinformatics approaches exist for allergen prediction, discriminating allergen-like non-allergens from allergens and non-allergens remains challenging. Here we describe a novel fuzzy rule-based system for assessing protein allergenicity. It combines five modules: a machine learning classifier (MLC), motif search, global similarity with allergens, the FAO/WHO evaluation scheme, and global similarity with allergen-like putative non-allergens (APN). The AllerHunter dataset, consisting of 1405 known allergens and 12474 non-allergens, was used to train and test the proposed method. The system was validated through 10-fold cross-validation, receiver operating characteristic (ROC) curve analysis, and the Matthews correlation coefficient (MCC). The validation results showed that the proposed method is effective in distinguishing allergen, non-allergen, and APN proteins. A test on an independent dataset of 1443 protein sequences (sensitivity = 89.9%, specificity = 98.6%, accuracy = 97.85%, MCC = 0.87, AUC = 0.95) indicated that the proposed method performs significantly better than existing methods.
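For readers unfamiliar with the reported measures, the following minimal sketch shows how sensitivity, specificity, accuracy, and MCC are conventionally derived from the counts of a binary confusion matrix. It is an illustration only, not the authors' evaluation code: the function name `binary_metrics` and the example counts are hypothetical and do not correspond to the results reported above.

```python
from math import sqrt


def binary_metrics(tp, fn, tn, fp):
    """Compute standard binary-classification measures from confusion-matrix counts.

    tp/fn: allergens predicted as allergen / non-allergen
    tn/fp: non-allergens predicted as non-allergen / allergen
    Assumes the test set contains at least one positive and one negative.
    """
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative example")

    sensitivity = tp / (tp + fn)              # true positive rate
    specificity = tn / (tn + fp)              # true negative rate
    accuracy = (tp + tn) / (tp + fn + tn + fp)

    # Matthews correlation coefficient; by convention 0.0 when the
    # denominator vanishes (e.g. every sequence gets the same prediction).
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0

    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "mcc": mcc,
    }


# Hypothetical counts, chosen only to exercise the formulas.
metrics = binary_metrics(tp=90, fn=10, tn=95, fp=5)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Because non-allergens greatly outnumber allergens in datasets of this kind, accuracy alone can be misleading; MCC uses all four confusion-matrix cells and is therefore commonly reported alongside it.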