TY - GEN
T1 - Exploring deep parametric embeddings for breast CADx
AU - Jamieson, Andrew R.
AU - Alam, Rabi
AU - Giger, Maryellen L.
N1 - Copyright 2011 Elsevier B.V., All rights reserved.
PY - 2011
Y1 - 2011
N2 - Computer-aided diagnosis (CADx) involves training supervised classifiers using labeled ("truth-known") data. Often, training data consists of high-dimensional feature vectors extracted from medical images. Unfortunately, very large data sets may be required to train robust classifiers for high-dimensional inputs. To mitigate the risk of classifier over-fitting, CADx schemes may employ feature selection or dimension reduction (DR), for example, principal component analysis (PCA). Recently, a number of novel "structure-preserving" DR methods have been proposed [1]. Such methods are attractive for use in CADx schemes for two main reasons. First, they provide visualization of high-dimensional data structure. Second, because DR can be unsupervised or semi-supervised, unlabeled ("truth-unknown") data may be incorporated [2]. However, the practical application of state-of-the-art DR techniques such as t-SNE [3] to breast CADx was inhibited by the inability to retain a parametric embedding function capable of mapping new input data to the reduced representation. Deep (more than one hidden layer) neural networks can be used to learn such parametric DR embeddings. We explored the feasibility of such methods for use in CADx by conducting a variety of experiments using simulated feature data, including models based on breast CADx features. Specifically, we investigated the unsupervised parametric t-SNE [4] (pt-SNE), the supervised deep t-distributed MCML [5] (dt-MCML), and hybrid semi-supervised modifications combining the two.
AB - Computer-aided diagnosis (CADx) involves training supervised classifiers using labeled ("truth-known") data. Often, training data consists of high-dimensional feature vectors extracted from medical images. Unfortunately, very large data sets may be required to train robust classifiers for high-dimensional inputs. To mitigate the risk of classifier over-fitting, CADx schemes may employ feature selection or dimension reduction (DR), for example, principal component analysis (PCA). Recently, a number of novel "structure-preserving" DR methods have been proposed [1]. Such methods are attractive for use in CADx schemes for two main reasons. First, they provide visualization of high-dimensional data structure. Second, because DR can be unsupervised or semi-supervised, unlabeled ("truth-unknown") data may be incorporated [2]. However, the practical application of state-of-the-art DR techniques such as t-SNE [3] to breast CADx was inhibited by the inability to retain a parametric embedding function capable of mapping new input data to the reduced representation. Deep (more than one hidden layer) neural networks can be used to learn such parametric DR embeddings. We explored the feasibility of such methods for use in CADx by conducting a variety of experiments using simulated feature data, including models based on breast CADx features. Specifically, we investigated the unsupervised parametric t-SNE [4] (pt-SNE), the supervised deep t-distributed MCML [5] (dt-MCML), and hybrid semi-supervised modifications combining the two.
KW - computer-aided diagnosis
KW - deep embedding
KW - dimension reduction
KW - feature-space
KW - machine learning
KW - semi-supervised learning
UR - http://www.scopus.com/inward/record.url?scp=79955752000&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=79955752000&partnerID=8YFLogxK
U2 - 10.1117/12.878331
DO - 10.1117/12.878331
M3 - Conference contribution
AN - SCOPUS:79955752000
SN - 9780819485052
T3 - Progress in Biomedical Optics and Imaging - Proceedings of SPIE
BT - Medical Imaging 2011
T2 - Medical Imaging 2011: Computer-Aided Diagnosis
Y2 - 15 February 2011 through 17 February 2011
ER -