The objective of this research is to concentrate on adjective-noun collocations and to describe methods for creation of two resources based on large-scale corpora of contemporary Japanese (BCCWJ and JpTenTen). The first resource is “Adjective-Noun Collocation Data.” It uses 500 adjectives as headwords and lists the nouns that combine with each adjective, along with context. A total of 23,247 collocations have already been extracted from the ten-billion-word corpus JpTenTen. The second resource is a “Japanese Language Learner’s Dictionary of Adjective-Noun Collocations.” It aims at a detailed description of the 25 most frequent basic adjectives, which account for 62% of overall Japanese adjective usage. In this paper, the highly frequent adjective takai serves as a model to show how the data from the first resource (“Adjective-Noun Collocation Data”) can be used as a basis for the creation of a dictionary for Japanese language learners. The dictionary sorts the modified nouns by difficulty levels and arranges them semantically into lexical maps, putting emphasis on collocations that are difficult for language learners to predict and providing corpus-informed information on register and special usages. Finally, the paper discusses some possible theoretical and practical implications. Once the two resources are complete, they will provide data currently not available for Japanese that can be used for research on lexis and grammar, as well as for the creation of syllabi and language learning materials.
COBISS.SI-ID: 54273378
In this paper, we explore learner production of adjectives using the Japanese language learner's corpus C-JAS (Corpus of Japanese As a Second language). Firstly, we describe the overall usage of adjectives in the corpus and discuss the distribution of the adjectives among learners including their correct and incorrect usages. Then, we take the frequently used adjective takai as an example and show how the learners’ production of adjectives develops in terms of form, correct/incorrect usages, and lexico-semantic coverage.
COBISS.SI-ID: 53567586
This paper examines Japanese i-adjectives that are annotated as short-unit and long-unit words in the Balanced Corpus of Contemporary Written Japanese (BCCWJ). The large gap between the number of lemmas as short-unit and long-unit words (suw and luw), besides revealing the nature of lemma design in the new morphological dictionary UniDic and BCCWJ, gives some new insights into the productivity of compound and derived adjectives. A careful examination of the phenomenon yields results applicable to Japanese language learning, where the treatment of compound and derived adjectives is not systematically covered, and shows that the BCCWJ, annotated with two-fold lexical units, can be used as an essential resource for further investigation of productivity in Japanese. In addition, the research provides suggestions for possible improvement of suw and luw annotation data for adjectives.
COBISS.SI-ID: 54303330