The present study analyses hypothetical Romance lexical and morphological elements in the Slovenian conversational patterns found in the dictionary Vocabolario Italiano e Schiauo (1607), written by the Italian friar Gregorio Alasia da Sommaripa. The analysis shows that not all the supposed Romance elements can be regarded merely as examples of Romance (Italian) linguistic interference in Alasia’s conversations, as at least some of them could already have been present in a wider Slovene area (as confirmed by the Protestant and Baroque literary traditions, cf. the comparative adjective form vech drago ‘more expensive’) or in western Slovenian dialects (cf. the verb stati with the meaning ‘to be, to feel’). Such features are to be described as the result of positive transfer (the status of these elements within a particular variety must be addressed separately) and have to be distinguished from idiolectal examples of negative transfer or other errors attributable to Alasia’s limited grasp of Slovene (that is, of the local Slovene vernacular).
B.03 Paper at an international scientific conference
COBISS.SI-ID: 52693602
In this paper we present a corpus-based approach to the automatic identification of false friends for Slovene and Croatian, a pair of closely related languages. By taking advantage of the lexical overlap between the two languages, we focus on measuring the difference in meaning between identically spelled words by using frequency and distributional information. We analyze the impact of corpora of different origin and size together with different association and similarity measures and compare them to a simple frequency-based baseline. With the best performing setting we obtain very good average precision of 0.973 and 0.883 on different gold standards. The presented approach works on non-parallel datasets and is knowledge-lean and language-independent, which makes it attractive for natural language processing tasks that often lack the lexical resources and cannot afford to build them by hand.
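The distributional idea described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the fixed context window, the raw co-occurrence counts, and the use of cosine similarity are all assumptions made for illustration, and the sketch further assumes that context words shared by both corpora can serve as common features. Identically spelled words whose contexts differ most between the two corpora are ranked first as false-friend candidates.

```python
from collections import Counter
from math import sqrt


def context_vector(sentences, target, window=2):
    """Count words occurring within `window` tokens of `target` (hypothetical helper)."""
    counts = Counter()
    for sentence in sentences:
        tokens = sentence.lower().split()
        for i, tok in enumerate(tokens):
            if tok != target:
                continue
            lo = max(0, i - window)
            hi = min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    counts[tokens[j]] += 1
    return counts


def cosine(a, b):
    """Cosine similarity between two sparse count vectors; 0.0 if either is empty."""
    dot = sum(a[w] * b[w] for w in set(a) & set(b))
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_false_friend_candidates(corpus_a, corpus_b, words, window=2):
    """Sort identically spelled words by cross-corpus similarity, lowest first."""
    scored = [
        (w, cosine(context_vector(corpus_a, w, window),
                   context_vector(corpus_b, w, window)))
        for w in words
    ]
    return sorted(scored, key=lambda pair: pair[1])
```

A realistic setting would replace the raw counts with an association measure and compare the ranking against a frequency-based baseline, as the abstract describes; the sketch leaves both out for brevity.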
B.03 Paper at an international scientific conference
COBISS.SI-ID: 52673634
The paper describes the process of compiling an on-line dictionary of terminology (http://www.termis.fdv.uni-lj.si/index-en.html, July 2011-June 2013). The compilation began from an LSP corpus (i.e. KoRP, a corpus of public relations texts) and involved automatic term recognition for single- and multi-word terms, the automatic extraction of lexical information from the corpus, and the development of a technological infrastructure containing tools for building specialised corpora, automatically extracting terminology, and incorporating lists of term candidates in a web program for the production and processing of dictionary entries.
B.03 Paper at an international scientific conference
COBISS.SI-ID: 32098397
The turbulent socio-political events that engulfed former Yugoslavia in the 1990s changed the relations between individual cultures and languages. At the same time, the social changes that brought about the dissolution of Yugoslavia caused the "death" of what is linguistically a single language (Serbo-Croatian) and gave way to the creation of four standard languages (Croatian, Serbian, Bosnian, and Montenegrin), as well as to a change in the status of Slovene and Macedonian. All this received much attention from (socio-)linguists at home and abroad. In the seven chapters of the monograph, established (socio-)linguists deal, from a critical sociolinguistic perspective, with current sociolinguistic situations and the relations between language policy and language reality in the newly formed countries of former Yugoslavia.
C.01 Editorial board of a foreign/international collection of papers/book
COBISS.SI-ID: 200399884
The monograph contains nine new contributions in the field of translation studies, all based on corpus linguistics methodology. With the inclusion of corpus linguistics, translation studies saw a turn in methodology. This also holds true for Slovenian translation studies; however, up to this point Slovenian research has not focused on a planned process of compiling translations. The common denominator of all contributions in the monograph is that they are based on a single source, i.e., the first Slovenian translation corpus SPOOK.
C.02 Editorial board of a national monograph
COBISS.SI-ID: 265692928