2022 IEEE International Conferences on Internet of Things (iThings) and IEEE Green Computing & Communications (GreenCom) and IEEE Cyber, Physical & Social Computing (CPSCom) and IEEE Smart Data (SmartData) and IEEE Congress on Cybermatics (Cybermatics)

Abstract

Recently, imbalanced traffic classification has attracted growing attention because most internet traffic exhibits imbalanced behavior. However, only a few works have considered real-time imbalanced traffic classification. In this project, we propose a comparative study of several machine learning algorithms across nine different scenarios. We vary dataset and flow sizes following an under-sampling approach in order to establish an objective evaluation of the best parameters for classification. The results show that: 1) combined with packet length, inter-arrival time and maximum segment size, features related to TCP session signaling enhance imbalanced traffic classification performance; 2) ensemble approaches, especially Bagged Random Forest, achieve the best results for real-time imbalanced traffic classification; 3) increasing flow sizes while reducing training set sizes (to a certain level) enhances classification performance, because more is learned about each individual instance. The best classification scenario uses 500 samples per class with 8-packet flows.
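
The two parameters varied in the study, flow size and per-class training set size, can be illustrated with a minimal sketch. The abstract does not describe the actual pipeline, so everything here is an assumption: the function names `truncate_flow` and `undersample` are hypothetical, the under-sampling is assumed to be uniform random selection, and each packet is reduced to a made-up length value. The defaults of 8 packets and 500 samples mirror the best scenario reported above.

```python
import random
from collections import defaultdict

def truncate_flow(packets, flow_size=8):
    """Keep only the first `flow_size` packets of a flow (hypothetical helper)."""
    return packets[:flow_size]

def undersample(flows, labels, per_class=500, seed=0):
    """Randomly keep at most `per_class` flows from every class.

    Assumes plain random under-sampling; the paper may select samples differently.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for flow, label in zip(flows, labels):
        by_class[label].append(flow)

    out_flows, out_labels = [], []
    for label in sorted(by_class):
        members = by_class[label]
        # Classes already at or below the target size are kept whole.
        keep = members if len(members) <= per_class else rng.sample(members, per_class)
        out_flows.extend(keep)
        out_labels.extend([label] * len(keep))
    return out_flows, out_labels

# Toy imbalanced data (invented for illustration): 2000 "web" flows and
# 600 "voip" flows, 20 packets each, one packet length per packet.
flows = [[100 + i % 50] * 20 for i in range(2600)]
labels = ["web"] * 2000 + ["voip"] * 600

short_flows = [truncate_flow(f, flow_size=8) for f in flows]
balanced_flows, balanced_labels = undersample(short_flows, labels, per_class=500)
```

The balanced, truncated flows would then be turned into feature vectors and passed to a classifier such as a bagged Random Forest; that step is omitted here because the abstract gives no details of the feature extraction.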