• Title/Summary/Keyword: Type-token ratio

A Study on the Lexical Diversity of Korean-Chinese Bilingual Children (한국어·중국어 이중 언어 사용 아동의 어휘 다양성)

  • Choi, Jiyoung
    • Journal of Korean language education / v.28 no.4 / pp.245-271 / 2017
  • This study investigated lexical diversity in the "Frog Story" narratives of Korean-Chinese bilingual children. Six Korean-Chinese bilingual children (four boys and two girls) were audio-recorded as they produced narratives based on pictures from Mercer Mayer's book "Frog, Where Are You?" The order of narration was counterbalanced. The vocabulary in the narratives was analyzed in terms of types, tokens, TTR (type-token ratio), and D value using the CLAN (Computerized Language Analysis) program. The findings showed that the pattern of lexical diversity in Korean is similar to that in Chinese, but the TTR and D values for Chinese remain low compared with those for Korean. In addition, Korean seems to have a significant influence on Chinese in language usage patterns, and vice versa.
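
As a rough illustration of the type, token, and TTR counts mentioned in this abstract, the following Python sketch computes them for a list of tokens. It is not CLAN's implementation and does not cover the D value. The function name `type_token_ratio`, the sample sentence, and the use of whitespace splitting as a stand-in for real tokenization are all assumptions made for the example.

```python
def type_token_ratio(tokens):
    """Return the number of distinct types divided by the number of tokens.

    Returns 0.0 for an empty token list to avoid division by zero.
    """
    if not tokens:
        return 0.0
    return len(set(tokens)) / len(tokens)


# Hypothetical sample; whitespace splitting stands in for real tokenization.
sample = "the frog jumped and the boy looked for the frog".split()

n_tokens = len(sample)       # every running word counts as a token
n_types = len(set(sample))   # each distinct word form counts once
ttr = type_token_ratio(sample)

print(n_types, n_tokens, ttr)
```

Because TTR tends to fall as a text gets longer, raw values are only comparable across samples of similar length.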

Analysis on Preschoolers' Mean Length of Utterance and Type-Token Ratio by their Sex and Play Situation Type (유아의 성별과 놀이상황 유형별 평균발화길이와 어휘다양도)

  • Sung, Mi Young; Chang, Moon Soo
    • Korean Journal of Childcare and Education / v.10 no.6 / pp.43-56 / 2014
  • The purpose of this study was to analyze differences in preschoolers' utterance features by gender and play situation type. A total of 40 five-year-old children participated. Dyads took part in each play session for 10 minutes. The play sessions were videotaped, and the videotaped data were transcribed by CBS (2014). The collected data were analyzed using an independent t-test and a paired t-test. The main results are as follows. First, girls' MLU-e, MLU-w, and MLU-m were longer than those of boys in a familiar play situation. Second, preschoolers' MLU-w was longer, and their type-token ratio was higher, in an unfamiliar play situation than in a familiar one. Implications for the importance of preschoolers' spontaneous speech are discussed.
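
The mean length of utterance reported in this abstract can be sketched as the total number of units divided by the number of utterances. The Python example below assumes each utterance has already been segmented into the unit of interest (the study's own segmentation into the MLU-e, MLU-w, and MLU-m units is not reproduced here). The function name and the toy transcript are hypothetical.

```python
def mean_length_of_utterance(utterances):
    """Average number of units per utterance.

    `utterances` is a list of utterances, each a list of pre-segmented
    units (words, morphemes, and so on). Empty utterances are skipped.
    """
    lengths = [len(u) for u in utterances if u]
    if not lengths:
        return 0.0
    return sum(lengths) / len(lengths)


# Hypothetical transcript, already segmented into words.
transcript = [
    ["I", "want", "the", "red", "block"],
    ["give", "me", "that"],
    ["no"],
]

mlu = mean_length_of_utterance(transcript)
print(mlu)
```

Passing morpheme-segmented utterances to the same function would give a morpheme-based MLU instead of a word-based one.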

A Study on the Development of English Inflectional Morphemes Based on the CHILDES Corpus (CHILDES 코퍼스를 기반으로 한 아동의 영어 굴절형태소 발달 연구)

  • Min, Myung Sook; Jun, Jongsup; Lee, Sun-Young
    • Korean Journal of Cognitive Science / v.24 no.3 / pp.203-235 / 2013
  • The goal of this paper is to test findings in the literature about English-speaking children's acquisition of inflectional morphemes using a large-scale database. To this end, we obtained a 4.7-million-word corpus from the CHILDES (Child Language Data Exchange System) database and analyzed 1,630 British and American children's uses of English inflectional morphemes up to age 7. We analyzed the type and token frequencies, the type-token ratio (TTR), and the lexical diversity (D) for inflectional morphemes such as the present progressive -ing, the past tense -(e)d, and the comparative and superlative -er/-est, with reference to children's nationality and age groups. The correlations between the D value and children's age varied from morpheme to morpheme: we found no correlation for -ing, a marginal correlation for -ed, and a strong correlation for -er/-est. Our findings are consistent with Brown's (1973) classical observation that children learn progressive forms earlier than the past tense marker. In addition, overgeneralization errors were frequently found for -ed, but rarely for -ing, showing a U-shaped developmental pattern at ages 2-3. Finally, American children showed higher D scores than British children, indicating that American children used inflectional morphemes with more word types than British children did. The significance of the present study lies in testing earlier findings in the literature by setting up a well-defined methodology for analyzing the entire CHILDES database.

Adjusting Weights of Single-word and Multi-word Terms for Keyphrase Extraction from Article Text

  • Kang, In-Su
    • Journal of the Korea Society of Computer and Information / v.26 no.8 / pp.47-54 / 2021
  • Given a document, keyphrase extraction is the task of automatically extracting words or phrases that topically represent its content. In unsupervised keyphrase extraction approaches, candidate words or phrases are first extracted from the input document, scores are then calculated for the candidates, and the final keyphrases are selected based on those scores. For the scoring step, this study proposes a method that adjusts the scores of keyphrase candidates according to their type: word-type or phrase-type. To do so, the type-token ratios of word-type and phrase-type candidates, as well as the information content of high-frequency word-type and phrase-type candidates, are collected from the input document and used to adjust the candidate scores. In experiments on four keyphrase extraction evaluation datasets constructed from full-text articles in English, the proposed method outperformed a baseline method and comparison methods on three datasets.

Use of Emotion Words by Korean English Learners

  • Lee, Jin-Kyong
    • English Language & Literature Teaching / v.17 no.4 / pp.193-206 / 2011
  • The purpose of the study is to examine the use of emotion vocabulary by Korean English learners. Three basic emotion fields (pleasure, anger, and fear) were selected to elicit the participants' responses. Data from L1 English speakers were also collected for comparison. The major results are as follows. First, English learners responded with various inappropriate verb forms like I feel~ and I am~, while the majority of native English-speaking teachers responded with subjunctive forms like I would feel~. In addition, L2 English learners used mostly simple and coordinate sentences. Second, lexical richness, measured by type/token ratio, was higher in the English L1 data than in the English L2 data. The proportion of emotion lemmas reflects the lexical richness or diversity of the emotion words. Lastly, L2 English learners' responses focused on a few typical adjectives like happy, angry, and scared. The structural and semantic distinctiveness of Korean English learners' emotion words is discussed from a pedagogical perspective.

A Study on the Diachronic Evolution of Ancient Chinese Vocabulary Based on a Large-Scale Rough Annotated Corpus

  • Yuan, Yiguo; Li, Bin
    • Asia Pacific Journal of Corpus Research / v.2 no.2 / pp.31-41 / 2021
  • This paper presents a quantitative analysis of the diachronic evolution of ancient Chinese vocabulary by constructing a large-scale, roughly annotated corpus and computing statistics over it. The texts from Si Ku Quan Shu (a collection of ancient Chinese books) are automatically segmented to obtain ancient Chinese vocabulary with time information, which is used to compute statistics on word frequency, standardized type/token ratio, and the proportions of monosyllabic and disyllabic words. The data analysis yields four findings. First, the high-frequency words in ancient Chinese are stable to a certain extent. Second, there is no obvious trend toward disyllabic words in ancient Chinese vocabulary. Third, the Northern and Southern Dynasties (420-589 AD) and the Yuan Dynasty (1271-1368 AD) are probably the two periods with the richest vocabulary in ancient Chinese. Finally, the unique high-frequency words in each dynasty are mainly official titles with real power. The study breaks away from the qualitative methods used in traditional research on the history of the Chinese language and instead uses quantitative methods to draw macroscopic conclusions from a large-scale corpus.
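
A standardized type/token ratio is commonly computed by splitting the token stream into consecutive, equally sized chunks, taking the TTR of each chunk, and averaging the results, which reduces the dependence of TTR on text length. The Python sketch below follows that common definition. The abstract does not state the chunk size or the exact procedure the authors used, so the function name, the default `chunk_size`, and the toy token list are assumptions.

```python
def standardized_ttr(tokens, chunk_size=1000):
    """Mean TTR over consecutive, non-overlapping chunks of `chunk_size` tokens.

    A trailing partial chunk is discarded. Returns None when the input is
    shorter than one full chunk. The default chunk size is an assumption.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    n_chunks = len(tokens) // chunk_size
    if n_chunks == 0:
        return None
    ratios = []
    for i in range(n_chunks):
        chunk = tokens[i * chunk_size:(i + 1) * chunk_size]
        ratios.append(len(set(chunk)) / chunk_size)
    return sum(ratios) / n_chunks


# Hypothetical toy input: 12 tokens split into three chunks of 4.
toy = "a b c d a a b b c d e f".split()
sttr = standardized_ttr(toy, chunk_size=4)
print(sttr)
```

Here the three chunks have TTRs of 1.0, 0.5, and 1.0, so the average is about 0.83, whereas the plain TTR of the whole list is 6/12 = 0.5.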

A Research on the Interlanguage of Chinese Speaking Korean Language Learners: Focusing on MLU and Characteristics Found in Vocabulary Usage (중국인 한국어 학습자의 중간언어 연구 - 평균발화길이(MLU)와 어휘적 특성을 중심으로)

  • Kim, Seon-Jung; Kim, Mok-Ah
    • Cross-Cultural Studies / v.22 / pp.303-327 / 2011
  • This study aims to uncover learners' language proficiency as shown in the writing of Chinese learners of Korean at the elementary and intermediate levels. Because error analysis provides only partial information about learners' proficiency, this study analyzes the learners' interlanguage in terms of Mean Length of Utterance (MLU) to capture the overall picture of their proficiency in a more balanced way. The vocabulary analysis follows a general examination of learners' language development based on MLU-m (morpheme) and MLU-w (word) in compositions by Chinese-speaking learners of Korean. MLU increased slightly from the elementary to the intermediate level; however, morphemes appeared difficult to use, as there was a notable difference between Chinese learners and Korean university students. For vocabulary, the study examines vocabulary diversity, usage by word class, and predicate usage; learners tend to use a larger and more varied vocabulary as proficiency increases. In terms of predicate use, Chinese learners use fewer vocabulary types.

Foreign Language Education of Korean Peninsula: Insights from Nogeoldae (『노걸대』 분석을 통해서 바라본 우리 반도의 외국어 교육)

  • Kim, Jeong-ryeol
    • The Journal of the Korea Contents Association / v.17 no.6 / pp.408-414 / 2017
  • This paper investigates the value and resilience of Nogeoldae, which was written at the end of the Koryo dynasty and was used as the most important foreign language teaching material throughout the 500 years of the Chosun dynasty. To this end, 106 volumes of dialogues, 12 of meeting, 17 of lodging, 21 of Daedo bound, 34 of Daedo lives and 11 of return in Nogeoldae are analyzed in terms of average sentence length, average word length, type-token ratio, number of words before main verbs, and number of words before nouns to identify how complexity progresses through the text. The analysis shows that Nogeoldae exhibits the kind of progressive complexity desired in modern foreign language textbooks.