[Brown CS Talks] Brown CS Seminar: Jan Hajic talk in Lubrano on 6/14/02 at 2 pm

talks-admin@list.cs.brown.edu
Wed, 12 Jun 2002 15:53:22 -0400


			      CS Seminar

		  
		  The Department of Computer Science
			   BROWN UNIVERSITY

			      
			       presents

			      Jan Hajic

	 Institute of Formal and Applied Linguistics, Charles
		 University, Prague (Czech Republic)


		    Friday, June 14, 2002 at 2 pm
	       Lubrano Conference Room (CIT 4th floor)
		Refreshments will be served at 1:45 pm


	The Prague Dependency Treebank (and its possible use)


			       Abstract

The Prague Dependency Treebank, a Czech corpus of about one million
words with a rich linguistic annotation scheme, will be described. All three
layers of sentence analysis representation will be shown: from the
simplest (morphology) to the analytical (i.e., surface) dependency
syntax to the so-called ``tectogrammatical'' representation (TR). TR
represents the deepest analysis of sentence structure, at what we
call the level of ``linguistic meaning''.  While the TR layer is still
being annotated, the data annotated at the first two layers is
already available for experiments (about 90,000 sentences).
Toward the end of the talk, the potential use of the tectogrammatical
representation in machine translation systems will be considered
(speculatively), with some examples from English, Arabic and Czech.


		   Host: Professor Eugene Charniak