The article describes an analysis of automatic term recognition results for single- and multi-word terms obtained with the LUIZ term extraction system. The target application of the results is a dictionary of Public Relations, and the main resource is the KoRP Public Relations Corpus. Our analysis focuses on two segments: (a) single-word noun term candidates, which we compare with the frequency list of nouns from KoRP and whose termhood we evaluate on the basis of the judgements of two domain experts, and (b) multi-word term candidates with a verb as headword. To better assess the performance of the system and the soundness of our approach, we also performed an analysis of recall. Our results show that the terminological relevance of extracted nouns is indeed higher than that of merely frequent nouns, and that verbal phrases only rarely count as proper terms. The analysis of recall shows low inter-annotator agreement, but nevertheless very satisfactory recall levels.
COBISS.SI-ID: 31519069
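The term extraction abstract above reports recall and inter-annotator agreement without saying how either was computed. As an illustration only, the sketch below shows one common way to compute both: Cohen's kappa for two annotators, and recall of extracted candidates against a gold-standard term list. The choice of Cohen's kappa, the function names, and the toy data are assumptions, not details taken from the article.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labelling the same items.

    Assumption: the article does not name its agreement measure;
    kappa is used here only as a common choice for two annotators.
    """
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both annotators must label the same non-empty item list")
    n = len(labels_a)
    # Observed agreement: share of items on which the annotators agree.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Expected chance agreement from each annotator's label distribution.
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    expected = sum(count_a[label] * count_b[label] for label in count_a) / (n * n)
    if expected == 1.0:
        # Both annotators used a single identical label throughout.
        return 1.0
    return (observed - expected) / (1 - expected)


def recall(extracted, gold):
    """Share of gold-standard terms that appear among the extracted candidates."""
    gold = set(gold)
    if not gold:
        raise ValueError("gold standard must not be empty")
    return len(gold & set(extracted)) / len(gold)


# Hypothetical toy data: 1 = judged a term, 0 = judged not a term.
expert_1 = [1, 1, 0, 0]
expert_2 = [1, 0, 0, 0]
kappa = cohen_kappa(expert_1, expert_2)

# Hypothetical candidate and gold-standard lists.
term_recall = recall({"press release", "stakeholder"},
                     {"press release", "stakeholder", "publics", "spin"})
```

With the toy data the two experts agree on three of four items, yet kappa is only moderate once chance agreement is discounted. Recall is computed independently of the agreement score, so the two figures do not have to move together.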
A selection of entries in the lexical database for Slovene, an activity within the Communication in Slovene project, was compiled by automatically extracting lexical information from the Gigafida corpus (via the Sketch Engine corpus tool) and importing the obtained information directly into the dictionary writing system iLex. The paper describes the individual steps in the preparation of the automatic extraction procedure, especially the adjustment of the sketch grammar, the development of the GDEX configuration for the selection of good corpus examples, and the programming of the API script. We briefly present the initial results and suggest improvements to the methodology of automatic extraction of lexical data, as well as the inclusion of additional language technologies.
COBISS.SI-ID: 34731309
To analyse corpus data, lexicographers need software that allows them to search, manipulate, and save data. A good corpus tool is the key to a comprehensive lexicographic analysis. The functionality and user-friendliness of corpus tools have improved considerably since they were first used in dictionary projects. The paper presents the Sketch Engine and its TickBox Lexicography tool.
COBISS.SI-ID: 31850845