Selection of Non-zero Loadings in Sparse Principal Component Analysis
Number of Authors: 3
2017 (English). In: Chemometrics and Intelligent Laboratory Systems, ISSN 0169-7439, E-ISSN 1873-3239, Vol. 162, p. 160-171. Article in journal (Refereed), Published
Principal component analysis (PCA) is a widely accepted procedure for summarizing data through dimensionality reduction. In PCA, selecting the appropriate number of components and interpreting those components remain the key challenges. Sparse principal component analysis (SPCA) is a relatively recent technique that produces principal components with sparse loadings via a variance-sparsity trade-off. Although several techniques for deriving sparse loadings have been proposed, no detailed guidelines exist for choosing the penalty parameters to obtain a desired level of sparsity. In this paper, we propose using a genetic algorithm (GA) to select the number of non-zero loadings (NNZL) in each principal component when using SPCA. The proposed approach considerably improves the interpretability of the principal components and addresses the difficulty of selecting the NNZL in SPCA. Furthermore, we compare the performance of PCA and SPCA in uncovering the underlying latent structure of the data. The key features of the methodology are assessed through a synthetic example, the pitprops data, and a comparative study of the benchmark Tennessee Eastman process.
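The variance-sparsity trade-off mentioned in the abstract can be illustrated with a minimal sketch. It is not the SPCA formulation or the genetic algorithm of the paper: it swaps in a truncated power iteration that fixes the number of non-zero loadings k directly, and simply enumerates k instead of searching with a GA. The function names `sparse_loading` and `explained_variance`, the synthetic data, and all parameter values are assumptions made for illustration.

```python
import numpy as np

def sparse_loading(S, k, n_iter=200):
    """Unit-norm loading vector for covariance S with at most k non-zero entries.

    Illustrative truncated power iteration, not the paper's SPCA solver.
    """
    p = S.shape[0]
    # Start from the ordinary (dense) leading eigenvector of S.
    _, vecs = np.linalg.eigh(S)
    v = vecs[:, -1]
    for _ in range(n_iter):
        w = S @ v
        # Keep the k largest-magnitude entries and zero out the rest.
        keep = np.argsort(np.abs(w))[-k:]
        v_new = np.zeros(p)
        v_new[keep] = w[keep]
        v_new /= np.linalg.norm(v_new)
        if np.allclose(v_new, v):
            break
        v = v_new
    return v

def explained_variance(S, v):
    """Variance of the data projected onto the unit vector v."""
    return float(v @ S @ v)

# Hypothetical synthetic data: one latent factor drives the first 3 of 8 variables.
rng = np.random.default_rng(0)
n, p = 200, 8
latent = rng.normal(size=n)
pattern = np.array([1.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0])
X = np.outer(latent, pattern) + 0.3 * rng.normal(size=(n, p))
S = np.cov(X, rowvar=False)

# Enumerate the number of non-zero loadings and report the variance retained.
for k in range(1, p + 1):
    v = sparse_loading(S, k)
    print(k, round(explained_variance(S, v), 3), np.flatnonzero(v))
```

In this toy setting the explained variance should level off once k reaches the number of variables carrying the latent factor, which is the kind of behavior a search over NNZL is meant to exploit. With several components and many variables, enumerating every combination becomes impractical, which is where a search heuristic such as the GA proposed in the paper comes in.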
Reliability and Maintenance
Research subject: Quality Technology and Management
Identifiers: URN: urn:nbn:se:ltu:diva-61741; DOI: 10.1016/j.chemolab.2017.01.018; Scopus ID: 2-s2.0-85012293100; OAI: oai:DiVA.org:ltu-61741; DiVA: diva2:1070159
Validated; 2017; Level 2; 2017-02-23 (andbra). 2017-01-31, 2017-01-31, 2017-02-23. Bibliographically approved