[CS241] Fwd: talk at MIT Nov 9: Projection Pursuit, Gaussian Scale Mixtures, and the EM algorithm

Kimberley Burchett kim.burchett at gmail.com
Tue Oct 31 16:05:57 EST 2006


---------- Forwarded message ----------
From: CSAIL Event Calendar <eventcalendar at csail.mit.edu>
Date: Oct 31, 2006 2:05 PM
Subject: TALK:Thursday 11-9-06 Projection Pursuit, Gaussian Scale
Mixtures, and t
To: seminars at csail.mit.edu



Projection Pursuit, Gaussian Scale Mixtures, and the EM Algorithm
Language, Learning, Vision and Graphics Seminar Series (LLVG) 2006/2007
Speaker: Sanjoy Dasgupta
Speaker Affiliation: University of California, San Diego
Host: Regina Barzilay
Host Affiliation: MIT CSAIL

Date: 11-9-2006
Time: 4:00 PM - 5:00 PM
Refreshments: 3:45 PM
Location: 32-G475A Stata Center - 4th Floor Gates Lounge

The EM algorithm for fitting Gaussian mixture models is one of the
most widely-used clustering methods. Yet, surprisingly little is known
about its behavior. For instance, there are many different ways to
initialize EM, and to merge/remove intermediate clusters, and the
effects of these different strategies are not understood in a
principled way.

I'll describe a new probabilistic analysis of EM. First of all, it
will emerge that many common methods of initializing and running EM
produce hopelessly suboptimal results even in ideal clustering
scenarios. On the other hand, a particular variant of EM will provably
recover a near-optimal clustering, provided that the clusters are
adequately separated and that their distributions are of a certain
fairly general form.

The type of cluster distributions allowed is motivated by a new result
in projection pursuit, along the lines of the folklore that "all
projections of high-dimensional data look Gaussian". Specifically,
I'll show that for any D-dimensional distribution with finite second
moments,  there is a precise sense in which almost all of its linear
projections into d < D dimensions look like a scale-mixture of
spherical Gaussians (concentric spherical Gaussians with the same
center). The extent of this effect depends upon the ratio of d to D,
and upon the "eccentricity" of the high-dimensional distribution.

Sanjoy Dasgupta is an Assistant Professor in the Department of
Computer Science and Engineering at UC San Diego. He received his PhD
from Berkeley, and spent two years at AT&T Research Labs before
joining UCSD. His area of research is algorithmic statistics, with a
focus on unsupervised learning. He has just completed a textbook,
"Algorithms" (with Christos Papadimitriou and Umesh Vazirani).

Relevant URL(S):
For more information please contact: Marcia Davidson, 617-253-2448,
marcia at csail.mit.edu

_______________________________________________
Seminars mailing list
Seminars at lists.csail.mit.edu
https://lists.csail.mit.edu/mailman/listinfo/seminars


-- 
Kimberley Burchett
http://www.kimbly.com/


More information about the CS241 mailing list