The 660-page monograph addresses several key issues in drafting a modern Slovene dictionary in the digital age. The articles are divided into ten segments: i) the dictionary is placed in its sociolinguistic context; ii) the outline of a language-resource-based lexicographic work is presented; iii) the first studies of dictionary users in Slovenia are presented; iv) corpus sources are analyzed from the standpoint of lexicographical usability; v) modern lexicographical procedures are presented in their complexity; vi) spoken language, intonation and pronunciation are addressed; vii) solutions regarding specialized lexis are presented; viii) a new approach to stylistic marking is implemented; ix) a new outline of word classes in Slovene is proposed in order to ensure consistency in description; x) crowdsourcing at different stages of dictionary making is presented as one possible approach in modern lexicography.
COBISS.SI-ID: 281482496
The monograph describes the construction and content of the Lexical Database of Slovene as a starting point for an exhaustive corpus-based lexical description of modern written Slovene, with emphasis on word use in the contexts of syntax, meaning, and text. The book gives a detailed description of lexicographic processes based on current European trends in e-lexicography, especially those that have not yet been adopted in Slovenian lexical and lexicographical theory and practice. The concrete solutions for the lexicographical examination of Slovene, described in detail in the monograph, form the basis for lexicographical description in a dictionary of modern Slovene.
COBISS.SI-ID: 282009600
In this paper we present a language-independent, fully modular and automatic approach to bootstrapping a wordnet for a new language by recycling different types of existing language resources, such as machine-readable dictionaries, parallel corpora, and Wikipedia. The approach, which we apply here to Slovene, takes into account monosemous and polysemous words, general and specialised vocabulary, as well as simple and multi-word lexemes. The extracted words are then assigned one or several synset IDs by a classifier that relies on several features, including distributional similarity. Finally, we identify and remove highly dubious (literal, synset) pairs, based on simple distributional information extracted from a large corpus in an unsupervised way.
COBISS.SI-ID: 56782434
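The final filtering step of the wordnet abstract above (removing dubious (literal, synset) pairs using distributional information) can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that a literal is dubious when its co-occurrence vector has low cosine similarity to the summed vectors of the other literals in the same synset. The function names, the window size, the threshold, the toy corpus, and the synset ID `car.n.01` are all hypothetical.

```python
from collections import Counter
from math import sqrt


def context_vectors(sentences, window=2):
    """Count the words co-occurring with each token inside a symmetric window."""
    vectors = {}
    for tokens in sentences:
        for i, word in enumerate(tokens):
            lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
            context = tokens[lo:i] + tokens[i + 1:hi]
            vectors.setdefault(word, Counter()).update(context)
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse count vectors (0.0 if either is empty)."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def filter_pairs(pairs, vectors, threshold=0.1):
    """Keep (literal, synset) pairs whose literal resembles its synset mates.

    Assumption: similarity to the summed vectors of the other literals in the
    synset stands in for the "simple distributional information" of the abstract.
    """
    members = {}
    for literal, synset in pairs:
        members.setdefault(synset, []).append(literal)

    kept = []
    for literal, synset in pairs:
        others = [m for m in members[synset] if m != literal]
        if not others:
            kept.append((literal, synset))  # nothing to compare against
            continue
        centroid = Counter()
        for m in others:
            centroid.update(vectors.get(m, Counter()))
        if cosine(vectors.get(literal, Counter()), centroid) >= threshold:
            kept.append((literal, synset))
    return kept


# Toy corpus (hypothetical); a real run would use a large corpus.
sentences = [
    ["avto", "vozi", "po", "cesti"],
    ["avtomobil", "vozi", "po", "cesti"],
    ["banka", "odobri", "kredit"],
]
# Hypothetical candidate pairs; "banka" is wrongly attached to the car synset.
pairs = [("avto", "car.n.01"), ("avtomobil", "car.n.01"), ("banka", "car.n.01")]

kept = filter_pairs(pairs, context_vectors(sentences))
print(kept)  # [('avto', 'car.n.01'), ('avtomobil', 'car.n.01')]
```

In this toy run the pair for "banka" is dropped because it shares no contexts with the other two literals. With real data the threshold would need tuning against a manually checked sample.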
Slovene Sign Language (SZJ) has so far received little attention from linguists. This article presents some basic facts about SZJ, its history and current status, and describes the Slovene Sign Language Corpus and Pilot Grammar (SIGNOR) project, which compiled and annotated a representative corpus of SZJ. Finally, selected quantitative data extracted from the corpus are presented. The article discusses certain lexical and semantic properties of SZJ, for example, the role of fillers and gestures. The figures are compared with those reported in related work, particularly corpus-based studies of British Sign Language (BSL) and Auslan.
COBISS.SI-ID: 57485154
This paper shows that the constraint *Lapse cannot generate ternary stress and creates pathologies in Harmonic Serialism (HS). The constraint *Lapse works properly only when it can evaluate an entirely metrified string, which is impossible in HS. Only *FootFoot, which refers to metrical constituents rather than the distribution of peaks and troughs, can derive ternarity. This supports an analysis based on non-adjacency of constituent edges; in HS, feet are therefore required.
COBISS.SI-ID: 60005218