[ACM Press the 50th Annual Southeast Regional Conference - Tuscaloosa, Alabama (2012.03.29-2012.03.31)] Proceedings of the 50th Annual Southeast Regional Conference on - ACM-SE '12 - Applying random projection to the classification of malicious applications using data mining algorithms
Scribed by Durand, Jan; Atkison, Travis
- Book ID
- 121288498
- Publisher
- ACM Press
- Year
- 2012
- Weight
- 137 KB
- Category
- Article
- ISBN
- 1450312039
Synopsis
This research is part of a continuing effort to show the viability of using random projection as a feature extraction and reduction technique in the classification of malware to produce more accurate classifiers. In this paper, we use a vector space model with n-gram analysis to produce weighted feature vectors from binary executables, which we then reduce to a smaller feature set using the random projection method proposed by Achlioptas, and the feature selection method of mutual information to produce two separate data sets. We then apply several popular machine learning algorithms including J48 decision tree, naïve Bayes, support vector machines, and an instance-based learner to the data sets to produce classifiers for the detection of malicious executables. We evaluate the performance of the different classifiers and discover that using a data set reduced by random projection can improve the performance of support vector machine and instance-based learner classifiers.
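The reduction step described in the synopsis can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the 2-byte n-gram size, the raw-count weighting, and the target dimension `k` are all assumptions chosen for illustration. The projection matrix uses the sparse distribution commonly attributed to Achlioptas, in which each entry is √3 times +1, 0, or −1 with probabilities 1/6, 2/3, and 1/6, and the projected vector is scaled by 1/√k.

```python
import math
import random
from collections import Counter


def byte_ngram_counts(data: bytes, n: int = 2) -> Counter:
    """Count overlapping byte n-grams in a binary blob (n=2 is an assumption)."""
    return Counter(data[i:i + n] for i in range(len(data) - n + 1))


def to_vector(counts: Counter, vocabulary: list) -> list:
    """Lay counts out in a fixed vocabulary order.

    Raw term frequency is used here as a stand-in; the synopsis does not
    say which weighting scheme the paper applies.
    """
    return [float(counts.get(gram, 0)) for gram in vocabulary]


def achlioptas_matrix(d: int, k: int, seed: int = 0) -> list:
    """Build a d x k sparse random matrix.

    Each entry is sqrt(3) * (+1, 0, -1) with probabilities (1/6, 2/3, 1/6).
    """
    rng = random.Random(seed)
    scale = math.sqrt(3.0)
    return [
        [scale * rng.choices((1.0, 0.0, -1.0), weights=(1, 4, 1))[0]
         for _ in range(k)]
        for _ in range(d)
    ]


def project(vector: list, matrix: list) -> list:
    """Project a d-dimensional vector down to k dimensions, scaled by 1/sqrt(k)."""
    k = len(matrix[0])
    norm = 1.0 / math.sqrt(k)
    return [
        norm * sum(v * row[j] for v, row in zip(vector, matrix))
        for j in range(k)
    ]


# Hypothetical toy "executables": two short byte strings.
samples = [bytes([0x4D, 0x5A, 0x90, 0x00, 0x4D, 0x5A]),
           bytes([0x7F, 0x45, 0x4C, 0x46])]
counts = [byte_ngram_counts(s, n=2) for s in samples]
vocabulary = sorted(set().union(*counts))          # all n-grams seen
R = achlioptas_matrix(len(vocabulary), k=3)        # k=3 is arbitrary
reduced = [project(to_vector(c, vocabulary), R) for c in counts]
```

Because two thirds of the matrix entries are zero, most multiplications can be skipped in practice, which is one reason this distribution is attractive for very large n-gram vocabularies. The reduced vectors would then be passed to whichever classifier is under evaluation.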
SIMILAR VOLUMES
Disassemblers generally assume that assembly language instructions do not overlap, therefore, an obvious obfuscation against such disassemblers is to overlap instructions. This is difficult to implement, however, as the number of instructions existing in a program which can be overlapped are typical