The paper presents an innovative approach to extracting Slovene definition candidates from domain-specific corpora using morphosyntactic patterns, automatic terminology recognition and semantic tagging with wordnet senses. First, a classification model was trained on examples from Slovene Wikipedia, which was then used to find well-formed definitions among the extracted candidates.
B.03 Paper at an international scientific conference
COBISS.SI-ID: 43122530
In this paper we present an approach to automatically extract and align multi-word terms from an English-Slovene comparable health corpus. First, the terms are extracted from the corpus for each language separately using a list of user-adjustable morphosyntactic patterns and a term weighting measure. Then, the extracted terms are aligned in a bag-of-equivalents fashion with a seed bilingual lexicon. In an extension of the approach we also show that the small general seed lexicon can be enriched with domain-specific vocabulary by harvesting it directly from the comparable corpus, which significantly improves the results of multi-word term mapping. While most previous efforts in bilingual lexicon extraction from comparable corpora have focused on mapping single words, the proposed technique successfully augments them in that it is able to deal with multi-word terms as well. Since the proposed approach requires minimal knowledge resources, it is easily adaptable to a new language pair or domain, which is one of its biggest advantages.
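The alignment step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `align_terms`, the scoring rule (share of target words found among the pooled seed-lexicon equivalents of the source words) and the toy English-Slovene data are all assumptions made for illustration only.

```python
from typing import Dict, List, Set, Tuple

def align_terms(
    source_terms: List[str],
    target_terms: List[str],
    seed_lexicon: Dict[str, Set[str]],
) -> Dict[str, Tuple[str, float]]:
    """Map each source term to its best-scoring target term (hypothetical scoring)."""
    alignments: Dict[str, Tuple[str, float]] = {}
    for src in source_terms:
        src_words = src.split()
        # Pool the lexicon equivalents of all source words, ignoring word order.
        equivalents: Set[str] = set()
        for word in src_words:
            equivalents |= seed_lexicon.get(word, set())
        best, best_score = None, 0.0
        for tgt in target_terms:
            tgt_words = tgt.split()
            matched = sum(1 for w in tgt_words if w in equivalents)
            # Normalise by the longer term so partial matches score below 1.
            score = matched / max(len(tgt_words), len(src_words))
            if score > best_score:
                best, best_score = tgt, score
        if best is not None:
            alignments[src] = (best, best_score)
    return alignments

# Toy data, invented for this sketch (not taken from the paper).
seed_lexicon = {
    "blood": {"kri", "krvni"},
    "pressure": {"tlak", "pritisk"},
    "heart": {"srce", "srčni"},
    "attack": {"napad"},
}
english_terms = ["blood pressure", "heart attack"]
slovene_terms = ["krvni tlak", "srčni napad", "sladkorna bolezen"]

alignments = align_terms(english_terms, slovene_terms, seed_lexicon)
```

In this sketch, enriching the seed lexicon with domain-specific entries would simply mean adding further keys to `seed_lexicon` before calling `align_terms`.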
B.03 Paper at an international scientific conference
COBISS.SI-ID: 49683298
The present study analyses the hypothetical Romance lexical and morphological elements in Slovenian conversational patterns found in the dictionary Vocabolario Italiano e Schiauo (1607), written by the Italian friar Gregorio Alasia da Sommaripa. The analysis shows that not all the supposed Romance elements can be regarded merely as examples of Romance (Italian) linguistic interference in Alasia’s conversations, as at least some of them could already have been present in a wider Slovene area (as confirmed by the Protestant and Baroque literary traditions, cf. the comparative adjective form vech drago ‘more expensive’) or in western Slovenian dialects (cf. the verb stati with the meaning ‘to be, to feel’). Such features are to be described as a result of positive transfer (the status of these elements within a particular variety must be addressed separately) and have to be distinguished from the idiolectal examples of negative transfer or other errors attributable to Alasia’s limited grasp of Slovene (of the local Slovene vernacular).
B.03 Paper at an international scientific conference
COBISS.SI-ID: 52693602
The turbulent sociopolitical events that engulfed former Yugoslavia in the 1990s resulted in a change in the relations between individual cultures and languages. At the same time, the social changes that brought about the dissolution of Yugoslavia caused the "death" of what linguistically is a single language (Serbo-Croatian) and gave way to the creation of four standard languages – Croatian, Serbian, Bosnian, and Montenegrin, as well as a change in the status of Slovene and Macedonian. All this received much attention from (socio)linguists at home and abroad. In the seven chapters of the monograph, established (socio)linguists deal, from a critical sociolinguistic perspective, with current sociolinguistic situations and the relations between language policy and language reality in the newly formed countries of former Yugoslavia.
C.01 Editorial board of a foreign/international collection of papers/book
COBISS.SI-ID: 200399884
sloWTool is an all-in-one wordnet tool that enables browsing, editing and visualization of wordnet content with hyperbolic graphs and images. It is freely available under the CC-BY-SA licence and based on MySQL and PHP technologies, which makes the tool lightweight and portable. It is browser-independent and allows quick queries. Scripts for automatic database transformations from and into several standardized formats, such as DEBVisDic XML and LMF, are provided so that a wordnet for another language can be imported at any time. The online browser is simple to use for non-experts but also offers advanced search and view settings for expert users, who can enter complex search queries, decide which fields to display and toggle between a monolingual and a multilingual option. sloWTool can be used online or downloaded: https://launchpad.net/slowtool
F.16 Improvements to an existing information system/databases
COBISS.SI-ID: 25364007