Feature Selection in High Dimensional Regression Problem

Mariusz Kubus

Authors

Mariusz Kubus, Opole University of Technology, Department of Mathematics and Applied Computer Science

Keywords:

feature selection, filters, embedded methods, high dimension

Abstract

Feature selection methods currently discussed in the literature fall into three main approaches: selecting variables before the model is built (filters), searching the feature space and selecting variables based on an assessment of model quality (wrappers), and methods with a built-in feature selection mechanism (embedded methods). When the number of variables exceeds the number of observations, the first or the third approach is mainly recommended. The aim of the article is to compare selected methods representing these two approaches when the feature space is high-dimensional. In the simulations, correlated variables were included in the artificially generated data.
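The contrast between the first and third approaches named in the abstract can be sketched in a few lines. The example below is only an illustration under stated assumptions, not the simulation design of the article: the data sizes (30 observations, 60 variables), the noisy copy of one relevant variable, the correlation-based filter, the lasso as the embedded method, and the penalty value 0.5 are all choices made here for demonstration. The helper names (`make_data`, `filter_select`, `lasso_cd`) are hypothetical.

```python
import random

def center(v):
    m = sum(v) / len(v)
    return [a - m for a in v]

def make_data(n, p, seed):
    """Artificial data with more variables than observations (assumed setup)."""
    rng = random.Random(seed)
    cols = [[rng.gauss(0, 1) for _ in range(n)] for _ in range(p)]
    # Column 3 is a noisy copy of column 0: a correlated, redundant variable.
    cols[3] = [a + 0.3 * rng.gauss(0, 1) for a in cols[0]]
    # Only columns 0, 1 and 2 enter the response.
    y = [3 * cols[0][i] + 2 * cols[1][i] - 2 * cols[2][i] + rng.gauss(0, 1)
         for i in range(n)]
    return [center(c) for c in cols], center(y)

def pearson(x, y):
    """Pearson correlation of two centered or uncentered sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5 if sxx > 0 and syy > 0 else 0.0

def filter_select(cols, y, k):
    """Filter: keep the k variables most correlated with y, before any model is fit."""
    ranked = sorted(range(len(cols)), key=lambda j: -abs(pearson(cols[j], y)))
    return sorted(ranked[:k])

def soft_threshold(z, t):
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0

def lasso_cd(cols, y, lam, n_iter=100):
    """Embedded: lasso by cyclic coordinate descent on
    (1/2n)*||y - Xb||^2 + lam*||b||_1, assuming centered columns and y."""
    n, p = len(y), len(cols)
    beta = [0.0] * p
    resid = list(y)
    norms = [sum(v * v for v in c) / n for c in cols]
    for _ in range(n_iter):
        for j in range(p):
            if norms[j] == 0:
                continue
            c = cols[j]
            # Correlation of column j with the partial residual.
            rho = sum(c[i] * resid[i] for i in range(n)) / n + norms[j] * beta[j]
            new = soft_threshold(rho, lam) / norms[j]
            if new != beta[j]:
                d = new - beta[j]
                for i in range(n):
                    resid[i] -= d * c[i]
                beta[j] = new
    return beta

cols, y = make_data(n=30, p=60, seed=0)
filter_set = filter_select(cols, y, k=5)
beta = lasso_cd(cols, y, lam=0.5)          # lam=0.5 is an arbitrary choice
lasso_set = [j for j, b in enumerate(beta) if b != 0.0]
print("filter keeps:", filter_set)
print("lasso keeps: ", lasso_set)
```

A filter scores each variable separately, so it tends to keep a redundant copy alongside the variable it duplicates, whereas the lasso selects variables jointly while fitting the model. Which behaviour is preferable on correlated data is the kind of question the article's simulations address; this sketch makes no claim about their outcome.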


