Talk
"Learning low dimensional representations of high dimensional data"
Fei Sha, University of Pennsylvania
Monday, March 31, 2008 at 12:00 Noon
Room 368 (CIT 3rd floor)
Statistical modeling of high-dimensional, complex data is a challenging task in machine learning. A powerful strategy for tackling this problem is to identify and exploit low-dimensional structures intrinsic to the data. For example, text and image data can often be represented as superpositions of meaningful and interpretable structures such as "object parts" and "topics". These structures are composed of visually salient image patches and groups of semantically related words, respectively. Examples of learning algorithms that discover such structures include nonnegative matrix factorization (NMF) and latent Dirichlet allocation (LDA), where parts and topics are encoded by nonnegative basis matrices and probability distributions, respectively.
In this talk, I will focus on my research that has brought new and interesting developments to the frameworks of NMF and LDA. In the first project, I show how to extend the original NMF approach to learn meaningful "audio parts" from speech and audio data. The audio parts robustly encode harmonic structures in voices, which are key acoustic features for building machines that can analyze complicated acoustic signals as well as human listeners do. In the second project, I investigate how to incorporate supervisory information such as class labels into LDA models. In supervised LDA, topics are discovered by grouping words based not only on semantic similarity but also on class label proximity. These topics yield compact representations with better predictive power than those derived from the original unsupervised LDA.
Towards the end of the talk, I will briefly summarize my work on learning other types of latent structures such as manifolds and clusters. I will then conclude by discussing all these approaches from a general perspective and speculating on a few interesting directions for future work.
Host: Amy Greenwald