CCL Centre for Computational Linguistics K.U.Leuven
Lassy: Large Scale Syntactic Annotation of Written Dutch
Time span: 2006-2009
Programme: STEVIN (Spraak- en Taaltechnologische Essentiële Voorzieningen In het Nederlands; Essential Speech and Language Technology Resources for Dutch)
F. Van Eynde, I. Schuurman, V. Vandeghinste
Other participant: R.U.Groningen (coordinator)

A large corpus of written Dutch texts (1,000,000 words) is annotated syntactically, with manual correction, using the CGN/D-COI annotation guidelines. In addition, the full D-COI corpus (499,000,000 words) is syntactically annotated automatically. For the manually corrected corpus, the part-of-speech (PoS) tags and lemmatization will be corrected as well.
The project aims to extend the syntactically annotated corpora available for Dutch, both in size and in the range of text genres and topical domains covered. In addition, various browse and search tools for syntactically annotated corpora will be developed further and made available, and their potential for applications in corpus linguistics will be tested and evaluated.

CCL is responsible for the correction of PoS and lemmatization and for part of the manually corrected syntactic annotation.

More information about Lassy can be found here.

   
Copyright © Katholieke Universiteit Leuven | Comments on the content: Ineke Schuurman
Produced by: Bert de Bruijn | Last modified: 10 May 2007
URL: http://apt.ccl.kuleuven.ac.be/about/Lassy.php