[Brown CS Talks] Brown CS Seminar: Jan Hajic talk in Lubrano on 6/14/02 at 2 pm
talks-admin@list.cs.brown.edu
Wed, 12 Jun 2002 15:53:22 -0400
CS Seminar
The Department of Computer Science
BROWN UNIVERSITY
presents
Jan Hajic
Institute of Formal and Applied Linguistics, Charles
University, Prague (Czech Republic)
Friday, June 14, 2002 at 2 pm
Lubrano Conference Room (CIT 4th floor)
Refreshments will be served at 1:45 pm
The Prague Dependency Treebank (and its possible use)
Abstract
The Prague Dependency Treebank, a corpus of about one million words of
Czech with a rich linguistic annotation scheme, will be described. All three
layers of sentence analysis representation will be shown: from the
simplest (morphology), through analytical (i.e., surface) dependency
syntax, to the so-called ``tectogrammatical'' representation (TR). TR
represents the deepest analysis of sentence structure, at what we
call ``the linguistic meaning'' level. While the TR layer is still being
annotated, data annotated at the first two
layers is already available for experiments (about 90,000 sentences).
Before concluding, the talk will speculate on the potential use of the
tectogrammatical representation in machine translation systems, with
some examples from English, Arabic and Czech.
Host: Professor Eugene Charniak