Correlation-based Feature Ranking in Combination with Embedded Feature Selection. Ales Pilny, Pavel Kordik, Miroslav Snorek, Wolfgang Oertel

Abstract. Most feature ranking and feature selection approaches can be used for categorical data only. Some of them rely on statistical measures of the data; others are tailored to a specific data mining algorithm (the wrapper approach). In this paper we present new methods for feature ranking and selection obtained as a combination of the above-mentioned approaches. The data mining algorithm (GAME) is designed for numerical data, but it can be applied to categorical data as well. It incorporates feature selection mechanisms, and the new methods proposed in this paper derive a feature ranking from the final data mining model. The rank of each feature selected by the model is computed by processing, in different ways, the correlations between the outputs of neighboring neurons in the model. We used four different methods based on fuzzy logic, certainty factors and simple calculus. The performance of these four feature ranking methods was tested on artificial data sets, on the well-known Ionosphere data set and on the well-known Housing data set with continuous variables. The results indicated that the method based on the simple calculus approach was significantly worse than the other three methods. These methods produce rankings consistent with recently published studies.

Keywords. Feature Ranking, Feature Selection, Correlation, FAKE GAME, Embedded Model.
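
The abstract does not spell out how the correlations are combined, so the following Python sketch is only an illustration under stated assumptions, not the authors' implementation. It assumes a single layer of neurons fed directly by input features, takes the absolute Pearson correlation between a feature and each neuron output it feeds as the evidence for that feature, and merges the evidence with three textbook rules: Zadeh's fuzzy OR (maximum), the certainty-factor rule for positive evidence (a + b - a*b), and a plain sum standing in for "simple calculus". All function names, the toy neurons and the data are hypothetical.

```python
import numpy as np

def abs_corr(a, b):
    """Absolute Pearson correlation between two 1-D arrays."""
    return abs(float(np.corrcoef(a, b)[0, 1]))

def fuzzy_or(values):
    """Zadeh fuzzy OR: the strongest single piece of evidence wins."""
    return max(values)

def certainty_factor(values):
    """Combine positive certainty factors pairwise: cf = a + b - a * b."""
    cf = 0.0
    for v in values:
        cf = cf + v - cf * v
    return cf

def plain_sum(values):
    """Plain arithmetic (assumed reading of 'simple calculus'): add the evidence."""
    return sum(values)

def rank_features(X, neurons, combine):
    """Return feature indices ordered from most to least important.

    X       : array of shape (n_samples, n_features)
    neurons : list of (input_feature_indices, output_vector) pairs (hypothetical model)
    combine : one of the combination rules defined above
    """
    scores = []
    for j in range(X.shape[1]):
        # Evidence for feature j: its correlation with every neuron it feeds.
        evidence = [abs_corr(X[:, j], out) for idx, out in neurons if j in idx]
        scores.append(combine(evidence) if evidence else 0.0)
    return sorted(range(X.shape[1]), key=lambda j: scores[j], reverse=True)

def demo():
    """Rank three synthetic features with each combination rule."""
    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 3))
    # Two toy "neurons": feature 0 dominates both outputs by construction.
    neurons = [
        ((0, 1), 2.0 * X[:, 0] + 0.1 * X[:, 1]),
        ((0, 2), 1.0 * X[:, 0] + 0.5 * X[:, 2]),
    ]
    rules = (fuzzy_or, certainty_factor, plain_sum)
    return {rule.__name__: rank_features(X, neurons, rule) for rule in rules}
```

Calling demo() returns one ranking per rule; in this toy setup feature 0 should come first under every rule because it drives both neuron outputs. A real GAME model has several layers, so evidence would also have to be propagated between neurons, which this sketch does not attempt.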


Last modified by Gleb on 10/29/09 15:24:50 (3 years ago)
