Efficient ℓ0-norm feature selection based on augmented and penalized minimization.

  • Author(s): Li X; Xie S; Zeng D; Wang Y
  • Source:
    Statistics in medicine [Stat Med] 2018 Feb 10; Vol. 37 (3), pp. 473-486. Date of Electronic Publication: 2017 Oct 30.
  • Publication Type:
    Journal Article; Research Support, N.I.H., Extramural
  • Language:
    English
  • Additional Information
    • Source:
      Publisher: Wiley; Country of Publication: England; NLM ID: 8215016; Publication Model: Print-Electronic; Cited Medium: Internet; ISSN: 1097-0258 (Electronic); Linking ISSN: 0277-6715; NLM ISO Abbreviation: Stat Med; Subsets: MEDLINE
    • Publication Information:
      Original Publication: Chichester ; New York : Wiley, c1982-
    • Abstract:
      Advances in high-throughput technologies in genomics and imaging yield unprecedentedly large numbers of prognostic biomarkers. To accommodate the scale of biomarkers and study their association with disease outcomes, penalized regression is often used to identify important biomarkers. The ideal variable selection procedure would search for the best subset of predictors, which is equivalent to imposing an ℓ0-penalty on the regression coefficients. Since this optimization is a nondeterministic polynomial-time hard (NP-hard) problem that does not scale with the number of biomarkers, alternative methods mostly place smooth penalties on the regression parameters, which lead to computationally feasible optimization problems. However, empirical studies and theoretical analyses show that convex approximations of the ℓ0-norm (eg, ℓ1) do not outperform their ℓ0 counterpart. Progress on ℓ0-norm feature selection has been slower; the main methods are greedy algorithms such as stepwise regression or orthogonal matching pursuit. Penalized regression based on regularizing the ℓ0-norm remains much less explored in the literature. In this work, inspired by the recently popular augmenting and data splitting algorithms, including the alternating direction method of multipliers, we propose a 2-stage procedure for ℓ0-penalty variable selection, referred to as augmented penalized minimization-L0 (APM-L0). APM-L0 targets the ℓ0-norm as closely as possible while keeping computation tractable, efficient, and simple, which is achieved by iterating between a convex regularized regression and a simple hard-thresholding estimation. The procedure can be viewed as arising from regularized optimization with a truncated ℓ1 norm. Thus, we propose to treat the regularization parameter and the thresholding parameter as tuning parameters and to select them by cross-validation. A 1-step coordinate descent algorithm is used in the first stage to significantly improve computational efficiency. Through extensive simulation studies and a real data application, we demonstrate superior performance of the proposed method in terms of selection accuracy and computational speed as compared to existing methods. The proposed APM-L0 procedure is implemented in the R package APML0.
      (Copyright © 2017 John Wiley & Sons, Ltd.)
    • Grant Information:
      R01 CA082659 United States CA NCI NIH HHS; R37 GM047845 United States GM NIGMS NIH HHS; R01 GM047845 United States GM NIGMS NIH HHS; R01 NS073671 United States NS NINDS NIH HHS; U01 NS082062 United States NS NINDS NIH HHS
    • Contributed Indexing:
      Keywords: ADMM; biomarker signature; censored data; variable selection; ℓ0-penalty
    • Accession Number:
      0 (Biomarkers)
    • Publication Date:
      Date Created: 20171031 Date Completed: 20191011 Latest Revision: 20240610
    • PMCID:
      PMC5768461
    • DOI:
      10.1002/sim.7526
    • PMID:
      29082539
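
The abstract describes alternating a convex regularized regression with a simple hard-thresholding step. The following is a minimal Python sketch of that general idea only. It is not the authors' algorithm and not the APML0 package interface. It assumes a plain linear model (the paper also covers censored outcomes), runs coordinate descent to convergence rather than a 1-step update, performs a single lasso-then-threshold pass, and fixes the two tuning parameters by hand instead of selecting them by cross-validation. All function names (`lasso_cd`, `hard_threshold_refit`), the simulated data, and the values of `lam` and `k` are hypothetical.

```python
import numpy as np


def soft_threshold(z, t):
    """Soft-thresholding operator used by lasso coordinate descent."""
    return np.sign(z) * max(abs(z) - t, 0.0)


def lasso_cd(X, y, lam, n_iter=200):
    """Convex stage: minimize (1/2n)||y - Xb||^2 + lam * ||b||_1."""
    n, p = X.shape
    beta = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n
    resid = np.asarray(y, dtype=float).copy()  # equals y - X @ beta
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # Correlation of feature j with the partial residual.
            rho = X[:, j] @ resid / n + col_sq[j] * beta[j]
            new = soft_threshold(rho, lam) / col_sq[j]
            resid += X[:, j] * (beta[j] - new)
            beta[j] = new
    return beta


def hard_threshold_refit(X, y, beta, k):
    """Thresholding stage: keep the k largest |coefficients|, refit by least squares."""
    support = np.sort(np.argsort(-np.abs(beta))[:k])
    support = support[beta[support] != 0.0]  # never keep exact zeros
    out = np.zeros_like(beta)
    if support.size:
        coef, *_ = np.linalg.lstsq(X[:, support], y, rcond=None)
        out[support] = coef
    return out, support


# Simulated data (hypothetical): 3 true signals among 50 candidate features.
rng = np.random.default_rng(0)
n, p = 200, 50
X = rng.standard_normal((n, p))
true_beta = np.zeros(p)
true_beta[[3, 17, 42]] = [2.0, -3.0, 1.5]
y = X @ true_beta + 0.1 * rng.standard_normal(n)

beta_lasso = lasso_cd(X, y, lam=0.1)  # lam: regularization parameter
beta_l0, support = hard_threshold_refit(X, y, beta_lasso, k=3)  # k: thresholding parameter
print("selected features:", support.tolist())
```

In practice, `lam` and `k` would be chosen jointly over a grid by cross-validation, as the abstract proposes for the regularization and thresholding parameters.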