Principal Feature Selection Impact for Internet Traffic Classification Using Naïve Bayes
Paramita, Adi Suryaputra
Abstract Feature selection plays an important role in internet traffic classification. A good feature selection method yields more accurate data and more accurate traffic classification, which in turn provides precise information for bandwidth optimization. A key consideration is how to choose the features that deliver better and more precise classification results. This research investigates how to select the principal and discriminant features. We use Principal Component Analysis (PCA) to find discriminant and principal features for internet traffic classification, and we combine PCA with a second feature selection algorithm, Correlation Feature Selection (CFS). CFS is used to find the best feature subsets in the existing data, so that Internet traffic with similar correlation can be assigned to the same class. The internet traffic dataset is collected, formatted, classified and analyzed using Naïve Bayes. This paper also studies the process of fitting the features. The results show that internet traffic classification using Naïve Bayes with PCA improves classification accuracy. The most significant finding is that combining PCA and CFS for feature selection improved internet traffic classification accuracy by more than 10%.
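To make the CFS step more concrete, the following is a minimal sketch of correlation-based subset scoring. It is not taken from the paper: it assumes the commonly used CFS merit heuristic, merit = k * r_cf / sqrt(k + k(k-1) * r_ff), where r_cf is the mean feature-class correlation and r_ff the mean feature-feature correlation, and it assumes absolute Pearson correlation as the correlation measure. The function names, the `min_gain` threshold, and the toy data are hypothetical and only for illustration.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length numeric lists (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / math.sqrt(vx * vy)

def cfs_merit(columns, labels, subset):
    """CFS merit of a feature subset; `subset` holds indices into `columns`."""
    k = len(subset)
    if k == 0:
        return 0.0
    # Mean absolute feature-class correlation.
    r_cf = sum(abs(pearson(columns[i], labels)) for i in subset) / k
    if k == 1:
        return r_cf
    # Mean absolute feature-feature correlation over all pairs in the subset.
    pairs = [(i, j) for pos, i in enumerate(subset) for j in subset[pos + 1:]]
    r_ff = sum(abs(pearson(columns[i], columns[j])) for i, j in pairs) / len(pairs)
    return k * r_cf / math.sqrt(k + k * (k - 1) * r_ff)

def greedy_cfs(columns, labels, min_gain=1e-9):
    """Forward selection: add the feature that raises the merit most, stop when none does."""
    selected, best = [], 0.0
    remaining = list(range(len(columns)))
    while remaining:
        merit, candidate = max(
            (cfs_merit(columns, labels, selected + [i]), i) for i in remaining
        )
        if merit <= best + min_gain:
            break
        selected.append(candidate)
        remaining.remove(candidate)
        best = merit
    return selected, best

# Made-up toy data (not from the paper): three features for six flows, two classes.
labels = [0, 0, 0, 1, 1, 1]
columns = [
    [1, 2, 1, 8, 9, 8],        # tracks the class closely
    [2, 4, 2, 16, 18, 16],     # a rescaled copy of the first feature (redundant)
    [5, 1, 4, 2, 5, 3],        # same mean in both classes (uninformative)
]
selected, merit = greedy_cfs(columns, labels)
```

On this toy input the redundant copy adds nothing to the merit and the uninformative feature lowers it, so the greedy search stops after one feature. In a pipeline like the one the abstract describes, the selected columns would then be passed on to the classifier; how PCA and CFS are ordered or combined there is not specified in the abstract.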