Principal Feature Selection Impact for Internet Traffic Classification Using Naïve Bayes
Paramita, Adi Suryaputra
Feature selection plays an important role in internet traffic classification. Selecting the right features yields more accurate data and more accurate classification, which in turn provides precise information for bandwidth optimization. A key consideration is how to choose features that deliver better and more precise results for the classification process. This research investigates how to select the principal and discriminant features. We use Principal Component Analysis (PCA) to find the discriminant and principal features for internet traffic classification, and we combine PCA with a second feature selection algorithm, Correlation Feature Selection (CFS). CFS is used to find the best subsets of the existing data, in which internet traffic with the same correlation fits into the same class. The internet traffic dataset is collected, formatted, classified and analyzed using a Naïve Bayesian classifier. This paper also studies the process of fitting the features. The results show that internet traffic classification using Naïve Bayesian and PCA improves classification accuracy. The most significant result is that combining PCA and CFS for feature selection improved internet traffic classification accuracy by more than 10%.
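The abstract does not say how CFS was implemented, so the following is only a minimal sketch of the general idea, not the author's method. It assumes the commonly cited CFS merit heuristic, k * r_cf / sqrt(k + k(k-1) * r_ff), where r_cf is the mean feature-class correlation and r_ff is the mean feature-feature correlation, and it assumes Pearson correlation on numeric features with a 0/1 class label. The function names, the greedy forward search, and the toy data are all hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0

def cfs_merit(columns, labels, subset):
    """CFS merit of a feature subset: k*r_cf / sqrt(k + k*(k-1)*r_ff)."""
    k = len(subset)
    # Mean absolute correlation between each chosen feature and the class.
    r_cf = sum(abs(pearson(columns[i], labels)) for i in subset) / k
    # Mean absolute correlation between each pair of chosen features.
    pairs = [(i, j) for i in subset for j in subset if i < j]
    r_ff = (sum(abs(pearson(columns[i], columns[j])) for i, j in pairs)
            / len(pairs)) if pairs else 0.0
    return k * r_cf / math.sqrt(k + k * (k - 1) * r_ff)

def cfs_forward_select(columns, labels):
    """Greedy forward search: keep adding the feature that raises the merit."""
    selected, best = [], 0.0
    while True:
        candidates = [i for i in range(len(columns)) if i not in selected]
        scored = [(cfs_merit(columns, labels, selected + [i]), i)
                  for i in candidates]
        if not scored:
            break
        merit, idx = max(scored)
        if merit <= best:  # no candidate improves the subset, so stop
            break
        selected.append(idx)
        best = merit
    return selected, best

# Toy data (hypothetical): 8 flows, 2 classes, 3 standardized features.
labels = [0, 0, 0, 0, 1, 1, 1, 1]
columns = [
    [-1, -1, -1, 1, -1, 1, 1, 1],   # feature 0: moderately predictive
    [-1, -1, 1, -1, 1, -1, 1, 1],   # feature 1: predictive, uncorrelated with 0
    [-3, -3, -3, 3, -3, 3, 3, 3],   # feature 2: rescaled copy of feature 0
]

selected, merit = cfs_forward_select(columns, labels)
print(selected, round(merit, 4))
```

On this toy data the search keeps feature 1 together with only one of the two mutually redundant features 0 and 2, which is the behaviour the merit is designed to reward: features that correlate with the class but not with each other. In a pipeline like the one the abstract describes, the retained columns would then be passed on to PCA and the Naïve Bayesian classifier.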