1.

Automatic discovery of semantic shifts in translation

The paper describes an attempt to develop a methodology for the automatic identification of semantic shifts in translation. First, the original side of a parallel corpus is annotated with synset IDs from WordNet; the target semantic domain is then selected using these IDs and the domain tags in WordNet. In our case we focus on terms pertaining to gastronomy, which we disambiguate using the UKB toolkit. We analyse the extracted sentences and classify them according to the translation strategy employed.

COBISS.SI-ID: 47263842

2.

SPOOK.Sem: semantic annotation of a parallel translation corpus

This paper presents the first attempt to semantically annotate the parallel translational corpus SPOOK. The sense inventory used for annotation was the Slovene semantic lexicon sloWNet, which is based on the Princeton WordNet and was developed automatically from a number of freely available corpus and lexical resources. The main goal of the study was to compare the annotations assigned in both languages in order to determine to what extent the concepts overlap in the two languages and whether the English-based sloWNet is suitable for annotating Slovene texts. In addition, we wanted to develop and test an annotation scheme suitable for annotating a larger corpus, and to look into the possibilities of automating the annotation of parallel corpora at the semantic level.

COBISS.SI-ID: 50256738

3.

Translations into Slovene from the corpus-based perspective

The book presents 9 original papers describing the properties of Slovene translations and the SPOOK translational corpus.

COBISS.SI-ID: 265692928

4.

The construction and analysis of corpora in translation studies.

This paper presents guidelines for the construction and analysis of corpora in translation studies. The first part introduces some basic concepts in corpus linguistics and discusses the goals and types of corpus-based translation studies, gives theoretical and practical guidelines regarding the representativeness of specialized corpora and outlines the corpus annotation process at several levels and for different languages. The second part presents methods of corpus analysis with an emphasis on the tools that support the Slovene language and are appropriate for analysing both publicly available and custom-built corpora. The most important functions, such as the use of concordancers, frequency lists, collocations and keywords, are described and illustrated with practical examples.

COBISS.SI-ID: 41547106

5.

Le gérondif et le participe présent et leur évolution vers la grammaticalisation

The paper is a contrastive study of the grammaticalization of gerunds and present participles in Slovene and French. The study uses three corpus resources: FraSloK, Evrokorpus and Fidaplus.

COBISS.SI-ID: 47687266

J6-2009 — Final report
