Selected Robust Logistic Regression Specification for Classification of Multi‑dimensional Functional Data in Presence of Outlier
DOI:
https://doi.org/10.18778/0208-6018.334.04
Keywords:
basis functions representation, classification problem, functional regression analysis, logistic regression model, multi‑dimensional functional data, robust estimation
Abstract
In this paper, the binary classification problem for multi‑dimensional functional data is considered. To solve this problem, a regression technique based on the functional logistic regression model is used. This model is re‑expressed as a particular logistic regression model by using basis expansions of the functional coefficients and explanatory variables. Based on the re‑expressed model, a classification rule is proposed. To handle outlying observations, robust methods for estimating the unknown parameters are also considered. Numerical experiments suggest that the proposed methods may perform satisfactorily in practice.
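The re‑expression described in the abstract can be illustrated with a minimal sketch. It is not the paper's classifier (7) and makes several assumptions: an orthonormal Fourier basis on [0, 1], simulated two‑dimensional toy curves, and plain maximum likelihood fitted by gradient ascent in place of the robust estimators. All function names are hypothetical. With an orthonormal basis, the integral of β(t)x(t) reduces to the inner product of the two coefficient vectors, so the functional model becomes an ordinary logistic regression on the basis coefficients.

```python
import math
import random

def fourier_basis(t, n_basis):
    """First n_basis orthonormal Fourier functions on [0, 1], evaluated at t."""
    values = [1.0]
    k = 1
    while len(values) < n_basis:
        values.append(math.sqrt(2.0) * math.sin(2.0 * math.pi * k * t))
        if len(values) < n_basis:
            values.append(math.sqrt(2.0) * math.cos(2.0 * math.pi * k * t))
        k += 1
    return values

def basis_coefficients(curve, grid, n_basis):
    """Riemann-sum approximation of c_b = integral of x(t) * phi_b(t) dt."""
    step = 1.0 / len(grid)
    coefs = [0.0] * n_basis
    for t, x in zip(grid, curve):
        for b, phi in enumerate(fourier_basis(t, n_basis)):
            coefs[b] += x * phi * step
    return coefs

def design_row(observation, grid, n_basis):
    """Intercept followed by the basis coefficients of every dimension."""
    row = [1.0]
    for curve in observation:
        row.extend(basis_coefficients(curve, grid, n_basis))
    return row

def probability(weights, row):
    """Logistic probability of class 1 for one design row."""
    z = sum(w * x for w, x in zip(weights, row))
    z = max(-30.0, min(30.0, z))  # guard against overflow in exp
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(rows, labels, n_iter=200, rate=0.5):
    """Plain (non-robust) maximum likelihood via gradient ascent."""
    weights = [0.0] * len(rows[0])
    for _ in range(n_iter):
        grad = [0.0] * len(weights)
        for row, label in zip(rows, labels):
            err = label - probability(weights, row)
            for j, x in enumerate(row):
                grad[j] += err * x / len(rows)
        weights = [w + rate * g for w, g in zip(weights, grad)]
    return weights

def simulate(rng, label, grid):
    """Toy two-dimensional functional observation; the class flips the sign of x1."""
    sign = 1.0 if label == 1 else -1.0
    x1 = [sign * math.sin(2 * math.pi * t) + rng.gauss(0, 0.3) for t in grid]
    x2 = [0.5 * math.cos(2 * math.pi * t) + rng.gauss(0, 0.3) for t in grid]
    return [x1, x2]

rng = random.Random(0)
grid = [j / 50 for j in range(50)]  # uniform grid on [0, 1)
n_basis = 5  # truncation parameter B, assumed equal for both dimensions

train_labels = [i % 2 for i in range(40)]
test_labels = [i % 2 for i in range(20)]
train_rows = [design_row(simulate(rng, y, grid), grid, n_basis) for y in train_labels]
test_rows = [design_row(simulate(rng, y, grid), grid, n_basis) for y in test_labels]

weights = fit_logistic(train_rows, train_labels)
# Classification rule: assign class 1 when the fitted probability exceeds 0.5.
predicted = [1 if probability(weights, row) > 0.5 else 0 for row in test_rows]
accuracy = sum(p == y for p, y in zip(predicted, test_labels)) / len(test_labels)
print(f"test accuracy: {accuracy:.2f}")
```

A robust variant would keep the same design matrix and replace only `fit_logistic` with a bounded‑influence or weighted estimator, which is the role of the MALLOWS, CUBIF, BY and WBY estimators named in the additional files.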
References
Ahmad S., Ramli N.M., Midi H. (2010), Robust estimators in logistic regression: A Comparative simulation study, “Journal of Modern Applied Statistical Methods”, vol. 9, pp. 502–511.
Bianco A.M., Yohai V.J. (1996), Robust estimation in the logistic regression model, [in:] H. Rieder (ed.), Robust Statistics, Data Analysis, and Computer Intensive Methods, Springer Verlag, New York.
Chiou J.M., Müller H.G., Wang J.L. (2004), Functional response models, “Statistica Sinica”, vol. 14, pp. 675–693.
Chiou J.M., Yang Y.F., Chen Y.T. (2016), Multivariate functional linear regression and prediction, “Journal of Multivariate Analysis”, vol. 146, pp. 301–312.
Collazos J.A.A., Dias R., Zambom A.Z. (2016), Consistent variable selection for functional regression models, “Journal of Multivariate Analysis”, vol. 146, pp. 63–71.
Croux C., Haesbroeck G. (2003), Implementing the Bianco and Yohai estimator for logistic regression, “Computational Statistics & Data Analysis”, vol. 44, pp. 273–295.
Febrero‑Bande M., Galeano P., González‑Manteiga W. (2007), A functional analysis of NO_x levels: location and scale estimation and outlier detection, “Computational Statistics”, vol. 22, pp. 411–427.
Febrero‑Bande M., Galeano P., González‑Manteiga W. (2008), Outlier detection in functional data by depth measures, with application to identify abnormal NO_x levels, “Environmetrics”, vol. 19, pp. 331–345.
Febrero‑Bande M., Oviedo de la Fuente M. (2012), Statistical computing in functional data analysis: The R package fda.usc, “Journal of Statistical Software”, vol. 51, pp. 1–28.
Ferraty F., Vieu P. (2006), Nonparametric Functional Data Analysis: Theory and Practice, Springer, New York.
Giacofci M., Lambert‑Lacroix S., Marot G., Picard F. (2013), Wavelet‑based clustering for mixed‑effects functional models in high dimension, “Biometrics”, vol. 69, pp. 31–40.
Górecki T., Krzyśko M., Wołyński W. (2015), Classification problem based on regression models for multidimensional functional data, “Statistics in Transition New Series”, no. 16, pp. 97–110.
Górecki T., Łaźniewska E. (2013), Funkcjonalna analiza składowych głównych PKB, “Wiadomości Statystyczne”, no. 4, pp. 23–34.
Górecki T., Smaga Ł. (2015), A comparison of tests for the one‑way ANOVA problem for functional data, “Computational Statistics”, vol. 30, pp. 987–1010.
Górecki T., Smaga Ł. (2017), Multivariate analysis of variance for functional data, “Journal of Applied Statistics”, vol. 44, pp. 2172–2189.
Horváth L., Kokoszka P. (2012), Inference for Functional Data with Applications, Springer, New York.
Hubert M., Rousseeuw P.J., Segaert P. (2015), Multivariate functional outlier detection, “Statistical Methods & Applications”, vol. 24, pp. 177–202.
James G.M., Hastie T.J. (2001), Functional linear discriminant analysis for irregularly sampled curves, “Journal of the Royal Statistical Society: Series B (Statistical Methodology)”, vol. 63, pp. 533–550.
Jaworski S., Pietrzykowski R. (2014), Spatial comparison of the level and rate of change of farm income in the years 2004–2012, “Acta Universitatis Lodziensis, Folia Oeconomica”, no. 307, pp. 29–44.
Kayano M., Konishi S. (2009), Functional principal component analysis via regularized Gaussian basis expansions and its application to unbalanced data, “Journal of Statistical Planning and Inference”, vol. 139, pp. 2388–2398.
Krzyśko M., Waszak Ł. (2013), Canonical correlation analysis for functional data, “Biometrical Letters”, no. 50, pp. 95–105.
Krzyśko M., Wołyński W. (2009), New variants of pairwise classification, “European Journal of Operational Research”, vol. 199, pp. 512–519.
Krzyśko M., Wołyński W., Górecki T., Skorzybut M. (2008), Learning Systems, WNT, Warsaw.
Künsch H.R., Stefanski L.A., Carroll R.J. (1989), Conditionally unbiased bounded influence estimation in general regression models, with applications to generalized linear models, “Journal of the American Statistical Association”, vol. 84, pp. 460–466.
Maechler M., Rousseeuw P., Croux C., Todorov V., Ruckstuhl A., Salibian‑Barrera M., Verbeke T., Koller M., Conceicao E.L.T., di Palma M.A. (2016), robustbase: Basic Robust Statistics, R package version 0.92–7, http://CRAN.R‑project.org/package=robustbase [accessed: 5.04.2017].
Mallows C.L. (1975), On some topics in robustness, Bell Telephone Laboratories, Murray Hill.
Matsui H., Konishi S. (2011), Variable selection for functional regression models via the L1 regularization, “Computational Statistics & Data Analysis”, vol. 55, pp. 3304–3310.
Olszewski R.T. (2001), Generalized feature extraction for structural pattern recognition in time‑series data. Ph.D. Thesis, Carnegie Mellon University, Pittsburgh, http://www.cs.cmu.edu/~bobski [accessed: 10.04.2017].
Ramsay J.O., Hooker G., Graves S. (2009), Functional Data Analysis with R and MATLAB, Springer, Berlin.
Ramsay J.O., Silverman B.W. (2002), Applied Functional Data Analysis. Methods and Case Studies, Springer, New York.
Ramsay J.O., Silverman B.W. (2005), Functional Data Analysis, 2nd Edition, Springer, New York.
Ramsay J.O., Wickham H., Graves S., Hooker G. (2014), fda – Functional Data Analysis, R package version 2.4.3, http://CRAN.R‑project.org/package=fda [accessed: 28.01.2017].
R Core Team (2017), R: A language and environment for statistical computing, R Foundation for Statistical Computing, Vienna, https://www.R‑project.org/ [accessed: 10.01.2017].
Rodriguez J.J., Alonso C.J., Maestro J.A. (2005), Support vector machines of interval based features for time series classification, “Knowledge‑Based Systems”, vol. 18, pp. 171–178.
Rousseeuw P.J. (1985), Multivariate estimation with high breakdown point, [in:] W. Grossmann, G. Pflug, I. Vincze, W. Wertz (eds.), Mathematical Statistics and Applications, vol. B, Reidel, Dordrecht.
Wang J., Zamar R., Marazzi A., Yohai V., Salibian‑Barrera M., Maronna R., Zivot E., Rocke D., Martin D., Maechler M., Konis K. (2014), robust: Robust Library, R package version 0.4–16, https://CRAN.R‑project.org/package=robust [accessed: 6.04.2017].
Zhang J.T. (2013), Analysis of Variance for Functional Data, Chapman & Hall, London.
Downloads
Additional Files
- Canadian weather data
- ECG data
- 10-fold cross-validation error rates (as percentages) for different values of truncation parameter B by using classifier (7) based on the MLE, MALLOWS, CUBIF and BY estimators for Canadian weather data
- 10-fold cross-validation error rates (as percentages) for different values of truncation parameter B by using classifier (7) based on the MLE, MALLOWS, CUBIF, BY and WBY estimators for ECG data