Spectral data contain powerful information that can be used to identify unknown compounds and their chemical structures. In this paper, we study fused lasso logistic regression (FLLR) to classify the spectral data into two groups. We show that the FLLR has a grouping property on regression coefficients, which simultaneously selects a group of highly correlated variables together. Both the sparsity and the grouping property of the FLLR provide great advantages in the analysis of the spectral data. In particular, it resolves the well-known peak misalignment problem of the spectral data by providing data dependent binning, and provides a better interpretable classifier than other ℓ1-regularization methods. We also analyze the gas chromatography/mass spectrometry data to classify the origin of herbal medicines, and illustrate the advantages of the FLLR over other existing ℓ1-regularized methods.
- Fused lasso regression
- Mass spectral data
- Penalized logistic regression
ASJC Scopus subject areas
- Analytical Chemistry
- Process Chemistry and Technology
- Computer Science Applications