Micha Elsner
I am now a postdoctoral researcher at the University of Edinburgh. I
am working on Bayesian word segmentation with Sharon Goldwater,
although I intend to keep working on discourse as well.
My thesis focused on discourse coherence: the way a document or
conversation is structured to provide context for new information. I
constructed models looking at where and how entities (things in the
world) are mentioned in a text. I also showed that these models can be
used to disentangle the different threads of conversation going on in
a crowded chat room.
I got my Ph.D. from Brown University in 2011, advised by Eugene
Charniak, with Mark Johnson and Regina Barzilay as committee members.
At Brown, I worked in the Brown Laboratory for Linguistic Information
Processing (BLLIP).
I graduated from the University of
Rochester in 2005 with degrees in Computer Science and Classics. I
got my MS from Brown in 2007.
Publications
- Micha Elsner and Deepak Santhanam.
Learning to Fuse Disparate Sentences.
Workshop on Monolingual Text-to-Text Generation (T2T 2011),
Portland, Oregon.
[PDF][Slides (PDF)]
- Micha Elsner and Eugene Charniak.
Extending the Entity Grid with Entity-specific Features.
Proceedings of the Association for Computational Linguistics (ACL 2011), Portland, Oregon.
[PDF]
[Slides (PDF)]
- Micha Elsner and Eugene Charniak.
Disentangling Chat with Local Coherence Models.
Proceedings of the Association for Computational Linguistics (ACL 2011), Portland, Oregon.
[PDF]
[Poster (PDF)]
[Thesis defense slides (PDF)]
- Micha Elsner and Eugene Charniak.
Disentangling Chat.
Computational Linguistics 36(3), September 2010.
[PDF]
- Micha Elsner and Eugene Charniak.
The Same-head Heuristic for Coreference.
Proceedings of the Association for Computational Linguistics (ACL
2010), Uppsala, Sweden.
[PDF]
[Poster (PDF)]
[Slides (PDF)]
(This is a short version of the same-head work.)
- Micha Elsner and Eugene Charniak.
The Same-head Heuristic for Coreference.
Northeast Student Conference on Artificial Intelligence (NESCAI
2010), Amherst, Massachusetts.
[PDF]
[Poster (PDF)]
(This is a full-length version of the same-head work.)
- Micha Elsner and Warren Schudy.
Bounding and Comparing Methods for Correlation Clustering Beyond
ILP.
NAACL-HLT 2009 Workshop on Integer Linear Programming for Natural Language
Processing (ILP-NLP 2009), Boulder, Colorado.
[PDF]
[Slides (PDF)]
- Micha Elsner, Eugene Charniak, and Mark Johnson.
Structured Generative Models for Unsupervised Named-Entity
Clustering.
Proceedings of the Conference on Human Language Technology and North
American chapter of the Association for Computational Linguistics
(HLT-NAACL 2009), Boulder, Colorado.
[PDF]
[Slides (PDF)]
- Eugene Charniak and Micha Elsner.
EM Works for Pronoun Anaphora Resolution.
Proceedings of the Conference of the European Chapter of the
Association for Computational Linguistics (EACL 2009), Athens, Greece.
[PDF]
- Micha Elsner and Eugene Charniak.
You Talking to Me? A Corpus and Algorithm for Conversation
Disentanglement.
Proceedings of the Association for Computational Linguistics: Human
Language Technologies (ACL-HLT 2008), Columbus, Ohio.
[PDF]
[Slides (PDF)]
- Micha Elsner and Eugene Charniak.
Coreference-inspired Coherence Modeling.
Proceedings of the Association for Computational Linguistics: Human
Language Technologies (ACL-HLT 2008), Columbus, Ohio.
[PDF]
[Poster (PDF)]
- Micha Elsner, Joseph Austerweil, and Eugene Charniak.
A Unified Local and Global Model for Discourse Coherence.
Proceedings of the Conference on Human Language Technology and North
American chapter of the Association for Computational Linguistics
(HLT-NAACL 2007), Rochester, New York.
[PDF]
[Slides (PDF)]
Note: this publication contains a bug affecting development
results. A short explanation has been attached to the beginning of the
PDF.
- Eugene Charniak, Mark Johnson, Micha Elsner, Joseph Austerweil, David
Ellis, Isaac Haxton, Catherine Hill, Shrivaths Iyengar, Jeremy Moore,
Michael Pozar, and Theresa Vu.
Multilevel Coarse-to-fine PCFG Parsing.
Proceedings of the Conference on Human Language Technology and North
American chapter of the Association for Computational Linguistics
(HLT-NAACL 2006), Brooklyn, New York.
[PDF]
[Slides (PDF)]
- Micha Elsner, Mary Swift, James Allen, and Daniel Gildea.
Online Statistics for a Unification-Based Dialogue Parser.
Proceedings of the Ninth International Workshop on Parsing
Technologies (IWPT 2005), Vancouver.
[PDF]
[Poster (PDF)]
- Thomas Kollar, Jonathan Schmid, Eric Meisner, Micha Elsner, Diana
Calarese, Chikita Purav, Chris Brown, Jenine Turner, Dasun Peramunage,
Gautam Altekar, and Victoria Sweetser.
Mabel: Extending Human Interaction and Robot Rescue Designs.
AAAI Mobile Robot Competition 2003: Papers from the AAAI
Workshop (ed. Smart, Smart, Bugajska), Acapulco.
[PDF]
Thesis
- Generalizing Local Coherence Modeling.
Brown University, January 2011.
[PDF]
[Defense slides (PDF)]
Primarily based on work from NAACL-07, ACL-08a and b, ILP-NLP-09,
CL-10, and ACL-11a and b.
Tech Reports
- Micha Elsner and Eugene Charniak.
A Generative Discourse-New Model for Text Coherence.
Technical Report CS-07-04, Brown University.
[PDF]
Talks
- Dialogue Structure in Microtext. Invited talk, AAAI-11 Workshop on
Analyzing Microtext, Aug. 8, 2011, San Francisco.
[Slides (PDF)]
- Generalizing Local Coherence Modeling. Thesis defense, Jan. 10,
2011, Brown Univ. (Also delivered as ILCC Seminar, Feb. 4, 2011,
University of Edinburgh.)
[Slides (PDF)]
- Learning to Fuse Disparate Sentences. Invited talk, Nov. 15,
2010, Columbia Univ. (Extended version of the PIRE talk.)
[Slides (PDF)]
- Learning to Fuse Disparate Sentences. PIRE grant meeting, July
15, 2010, Uppsala.
[Slides (PDF)]
- Debugging Samplers: Making MCMC Work in Practice. Tutorial for
Machine Learning Reading Group, Jul. 8, 2010, Brown Univ.
[Slides (PDF)]
[Example code]
- Reference Patterns for Discourse Coherence. Thesis proposal,
May 10, 2010, Brown Univ.
[Slides (PDF)]
- The Same-head Heuristic for Coreference. Invited talk, Jan. 20,
2010, Stanford Univ.
[Slides (PDF)]
- Learning Maximum-entropy Models of Salience via EM. Pattern
theory reading group, Sept. 30, 2009, Brown Univ.
[Slides (PDF)]
- Entity-based Coherence: Going Off the Grid.
Invited talk, Mar. 4, 2009, Univ. of Pennsylvania.
[Slides (PDF)]
- The Dangling Conversation: A Corpus and Algorithm for Conversation
Disentanglement (extended version of ACL 2008 talk).
Invited talk, Jan. 21, 2009, Univ. of Maryland.
[Slides (PDF)]
- Given/New Information and the Discourse Coherence Problem.
Invited talk, Oct. 10, 2007, MIT.
[Slides (PDF)]
Software
- Brown Coherence Toolkit: software for a variety of local
coherence models, now including the extended entity grid, and test
applications for ordering and chat disentanglement (C++).
New version 1.0 as of 2011!
[Bitbucket]
- Sentence Fusion Software: software for preprocessing,
training and running our English sentence fusion system
(C++/Python). By me and Deepak Santhanam.
[Bitbucket]
- Waterworks: Python utility package, including
ClusterMetrics library for evaluating clusterings. Mostly by David
McClosky.
[Python Package Index]
- Correlation Clustering System: framework for creating and
analyzing datasets (Python), heuristic solvers, LP, ILP and SDP
bounding systems (C++).
This is the release version; the evaluation code requires Waterworks.
README,
[tgz]
You may also want the data matrices we constructed for 20
newsgroups: [tgz]
- Unsupervised Pronoun Anaphora System: EM learner, pre-trained
model (newswire) and pronoun resolver (C++).
Eugene wrote this software, so although I'm pleased to answer
questions on it, I don't know the gory innards in detail.
[tgz]
If you're planning to use this software, you should consider running
Shane Bergsma's NADA non-referential pronoun detector as a
preprocessing step. The reported results demonstrate significant
improvements over our built-in non-referential detector.
- IRC Chat Data and Disentanglement Model: annotated IRC chat
data, annotation software (Java), analysis and disentanglement model
(Python).
README,
[tgz]
Teaching
People I've Worked With
Service
- Faculty-Graduate Liaison (FGL), Brown CS, 2008-9.
- Brown/MIT/Harvard NLP 2009 Summit co-organizer
- 2006-7: Leader of Machine Learning Reading Group
- Reviewing:
- NAACL 2009-10
- NAACL 2010 student research workshop
- ACL 2008-9
- ACL 2010-11 student research workshop
- AAAI 2010
- EMNLP 2010-11
- JAIR
- COLING 2008, 2010
- NESCAI 2007-8, 2010
- ICML 2007 co-reviewer
- AISTATS 2011
- IJCAI 2011 senior PC member
- AAAI-11 workshop on microtext, PC
- melsner0@gmail.com
- Informatics Forum
- 10 Crichton St.
- Edinburgh, UK EH8 9AB
- melsner0 (skype)