Data source: CROSBI

Rule Based Chunker for Croatian (CROSBI ID 36063)

Book chapter | original scientific paper

Vučković, Kristina ; Tadić, Marko ; Dovedan, Zdravko Rule Based Chunker for Croatian // Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC2008) / Calzolari, Nicoletta ; Choukri, Khalid ; Maegaard, Bente et al. (eds.). Marrakech: European Language Resources Association (ELRA), 2008. pp. 2544-2549

Responsibility details

Vučković, Kristina ; Tadić, Marko ; Dovedan, Zdravko

English

Rule Based Chunker for Croatian

In this paper we discuss a rule-based approach to chunking sentences in Croatian, implemented using local regular grammars within the NooJ development environment. We describe the rules and their implementation by regular grammars, and at the same time show that in the NooJ environment it is extremely easy to fine-tune their different sub-rules. Since Croatian has strong morphosyntactic features that are shared between most or all elements of a chunk, the rules are built by taking these features into account and relying strongly on them. For the evaluation of our chunker we used an extracted set of manually annotated sentences from a 100 kw MSD-tagged and disambiguated Croatian corpus. Our chunker performed best on VP-chunks (F: 97.01), while NP-chunks (F: 92.31) and PP-chunks (F: 83.08) were of lower quality. The results are comparable to chunker performance in the CoNLL-2000 shared task on chunking.

chunker, rule based, local regular grammar, Croatian

Chapter details

2544-2549.

published

Book details

Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC2008)

Calzolari, Nicoletta ; Choukri, Khalid ; Maegaard, Bente ; Mariani, Joseph ; Odijk, Jan ; Piperidis, Stelios ; Tapias, Daniel

Marrakech: European Language Resources Association (ELRA)

2008.

2-9517408-4-0

Research fields

Information and communication sciences, Philology

Links