• Title/Summary/Keyword: Word Frequency


The Influence of Age of Acquisition in Hangul Word Recognition (한글단어재인에서 습득연령의 영향)

  • Lee, Hye-Won;Kim, Sun-Kyoung
    • Korean Journal of Cognitive Science / v.24 no.4 / pp.339-363 / 2013
  • The age of acquisition (AoA) effect is the phenomenon in which words acquired early in life are processed better than words acquired later in life. Age of acquisition and word frequency are critical factors in lexical processing. In this study we examined AoA effects in Hangul word recognition. In Experiment 1, we examined AoA effects in word naming and lexical decision tasks. The results showed an interaction between task and age of acquisition: AoA effects appeared only in the lexical decision task. In Experiment 2, we examined the relationship between age of acquisition and word frequency in the lexical decision task. Both variables had significant effects: early-acquired words were processed better than later-acquired words, and high-frequency words were processed better than low-frequency words. However, there was no interaction between the two variables. In Experiment 3, we examined how phonological changes in Hangul words influence AoA effects. The results showed that AoA effects were similar whether or not phonological changes occurred. Our results are discussed in terms of several theoretical hypotheses.


Word Sense Disambiguation based on Concept Learning with a focus on the Lowest Frequency Words (저빈도어를 고려한 개념학습 기반 의미 중의성 해소)

  • Kim Dong-Sung;Choe Jae-Woong
    • Language and Information / v.10 no.1 / pp.21-46 / 2006
  • This study proposes a Word Sense Disambiguation (WSD) algorithm based on concept learning, with special emphasis on statistically meaningful lowest-frequency words. Previous work on WSD typically makes use of collocation frequency and its probability. Such probability-based WSD approaches tend to ignore the lowest-frequency words, which could be meaningful in context. In this paper, we present an algorithm that extracts and makes use of the meaningful lowest-frequency words in WSD. The learning method is adopted from the Find-Specific algorithm of Mitchell (1997), according to which the search proceeds from the specific predefined hypothesis spaces to the general ones. In our model, this algorithm first finds contexts with the most specific classifiers and then moves to more general ones. We build a small set of seed data and apply it to relatively large test data. Following the algorithm in Yarowsky (1995), the classified test data are exhaustively added to the seed data, thus expanding the seed data. However, this may introduce considerable noise into the seed data. We therefore introduce the 'maximum a posteriori hypothesis', based on the Bayes assumption, to check the new seed data for noise. We use the Naive Bayes Classifier and demonstrate that applying the Find-Specific algorithm improves the accuracy of WSD.


The Neighborhood Effect in Korean Visual Word Recognition (한국어 시각단어재인에서 나타나는 이웃효과)

  • Kwon, You-An;Cho, Hyae-Suk;Kim, Choong-Myung;Nam, Ki-Chun
    • MALSORI / no.60 / pp.29-45 / 2006
  • We investigated whether the first syllable plays an important role in lexical access in Korean visual word recognition. To do so, we examined the nature of the syllabic neighborhood effect in one lexical decision task (LDT) experiment and two form-primed LDT experiments. In Experiment 1, syllabic neighborhood density and syllabic neighborhood frequency were manipulated. The results showed that lexical decision latencies were influenced only by syllabic neighborhood frequency. The purpose of Experiment 2 was to confirm the results of Experiment 1 with a form-primed LDT. Lexical decision latency was slower in the form-related condition than in the form-unrelated condition. The effect of syllabic neighborhood density was significant only in the form-related condition. This means that the first syllable plays an important role in sub-lexical processing. In Experiment 3, we conducted another form-primed LDT, manipulating the number of syllabic neighbors of words with higher-frequency neighbors. The interaction between syllabic neighborhood density and form relation was significant. This result confirmed that, at the lexical level, words with higher-frequency neighbors are inhibited more by neighbors sharing the first syllable than words with no higher-frequency neighbors. These findings suggest that the first syllable is the unit of the neighborhood and that the syllable is the unit of sub-lexical representation in Korean.


Adaptive Changes in the Grain-size of Word Recognition (단어재인에 있어서 처리단위의 적응적 변화)

  • Lee, Chang H.
    • Proceedings of the Korean Society for Cognitive Science Conference / 2002.05a / pp.111-116 / 2002
  • The regularity effect for printed word recognition and naming depends on ambiguities between single letters (small grain-size) and their phonemic values. As a given word is repeated and becomes more familiar, letter-aggregate size (grain-size) is predicted to increase, thereby decreasing the ambiguity between spelling pattern and phonological representation and, therefore, decreasing the regularity effect. Lexical decision and naming tasks were used to study the effect of repetition on the regularity effect for words. The familiarity of a word form was manipulated by presenting low and high frequency words as well as by presenting half the stimuli in mixed upper- and lowercase letters (an unfamiliar form) and half in uniform case. In lexical decision, the regularity effect was initially strong for low frequency words but became null after two presentations; in naming it was also initially strong but was merely reduced (although still substantial) after three repetitions. Mixed case words were recognized and named more slowly and tended to show stronger regularity effects. The results were consistent with the primary hypothesis that familiar word forms are read faster because they are processed at a larger grain-size, which requires fewer operations to achieve lexical selection. Results are discussed in terms of a neurobiological model of word recognition based on brain imaging studies.


A Convergence Study of the Research Trends on Stress Urinary Incontinence using Word Embedding (워드임베딩을 활용한 복압성 요실금 관련 연구 동향에 관한 융합 연구)

  • Kim, Jun-Hee;Ahn, Sun-Hee;Gwak, Gyeong-Tae;Weon, Young-Soo;Yoo, Hwa-Ik
    • Journal of the Korea Convergence Society / v.12 no.8 / pp.1-11 / 2021
  • The purpose of this study was to analyze the trends and characteristics of 'stress urinary incontinence' research through word frequency analysis and to model the relationships among words using word embedding. Abstracts of 9,868 papers indexed in PubMed's MEDLINE were extracted using a Python program. Then, through frequency analysis, the 10 most frequent keywords were selected. The similarity of words related to the keywords was analyzed with the Word2Vec machine learning algorithm. The locations and distances of words were visualized using the t-SNE technique, and the resulting groups were classified and analyzed. The number of studies related to stress urinary incontinence has increased rapidly since the 1980s. The keywords used most frequently in the abstracts were 'woman', 'urethra', and 'surgery'. In the Word2Vec model, words such as 'female', 'urge', and 'symptom' were among those most closely related to the keywords. In addition, using the t-SNE technique, keywords and related words could be classified into three groups focusing on the symptoms, anatomical characteristics, and surgical interventions of stress urinary incontinence. This study is the first to examine trends in stress urinary incontinence research using keyword frequency analysis and word embedding of abstracts. The results can serve as a basis for future researchers in selecting topics and directions in research related to stress urinary incontinence.
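
The frequency-analysis step described in this abstract can be illustrated with a minimal sketch. It is not the authors' program: the function name `top_keywords`, the small stop-word list, and the two sample abstracts are assumptions made up for illustration, and only the Python standard library is used.

```python
import re
from collections import Counter

# Hypothetical stop-word list; a real analysis would use a fuller one.
STOPWORDS = {"the", "of", "in", "and", "a", "for", "with", "to", "was", "were"}

def top_keywords(abstracts, k=10):
    """Return the k most frequent non-stop-word tokens across all abstracts."""
    counts = Counter()
    for text in abstracts:
        # Lowercase and keep alphabetic tokens only.
        tokens = re.findall(r"[a-z]+", text.lower())
        counts.update(t for t in tokens if t not in STOPWORDS)
    return counts.most_common(k)

# Invented sample data, not taken from MEDLINE.
sample = [
    "Surgery for stress urinary incontinence in women",
    "Urethra support and surgery outcomes in women",
]
print(top_keywords(sample, k=2))
```

The selected keywords would then be the starting point for a similarity model such as Word2Vec, which is outside the scope of this sketch.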

A Study on the Integration of Similar Sentences in Atomatic Summarizing of Document (자동초록 작성시에 발생하는 유사의미 문장요소들의 통합에 관한 연구)

  • Lee, Tae-Young
    • Journal of the Korean Society for Library and Information Science / v.34 no.2 / pp.87-115 / 2000
  • The effects of Case, Part of Speech, Word and Clause Location, Word Frequency, and other factors on discriminating similar sentences in Korean text were studied. Word Frequency was strongly related to the discrimination of similarity, Title Word and Functional Clause were weakly related, and the other factors were not related. The cosine coefficient and Salton's similarity measure are used to measure the similarity between sentences. The change of clauses between sentences is also used to unify similar sentences into a representative sentence.

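
The cosine coefficient mentioned in this abstract can be sketched as follows. This is a generic illustration under stated assumptions, not the paper's implementation: sentences are represented as raw term-frequency vectors over whitespace-separated tokens, and the function name `cosine_similarity` is hypothetical.

```python
import math
from collections import Counter

def cosine_similarity(sentence_a, sentence_b):
    """Cosine coefficient between two sentences using term-frequency vectors."""
    a = Counter(sentence_a.lower().split())
    b = Counter(sentence_b.lower().split())
    # Dot product over the words the two sentences share.
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # An empty sentence is treated as similar to nothing.
    return dot / (norm_a * norm_b)

print(cosine_similarity("the cat sat", "the cat ran"))
```

Sentence pairs whose score exceeds a chosen threshold could be treated as candidates for merging into one representative sentence; the threshold itself would have to be tuned and is not given in the abstract.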

Word Accent of Cheju Dialects in Korean (제주 방언의 낱말 악센트)

  • Park, Soon-Bok
    • MALSORI / v.55 / pp.33-43 / 2005
  • This paper investigates the word accent pattern of Cheju dialects in Korean and determines whether it varies according to the speaker's age, the word itself, and where the speaker comes from. On the basis of the theory of pitch accent suggested by Koo (1993) and Jung (1965) for the Korean standard accent, the fundamental frequency of each syllable is measured. The syllable with the highest frequency is labelled 2 and the rest are labelled 1. The results of the experiment show that two-syllable words have a 21 accent pattern, three-syllable words a 121 pattern, and four-syllable words a 1211 pattern. In addition to this characteristic accent pattern of Cheju dialects, it is interesting that the older the speaker, the less the utterances show the accent patterns described above.

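
The labelling rule in this abstract (the syllable with the highest fundamental frequency gets 2, the others 1) can be written as a short sketch. The function name `accent_pattern` and the F0 values in Hz are invented for illustration; in this sketch a tie is resolved in favour of the earliest syllable, which is an assumption the abstract does not address.

```python
def accent_pattern(f0_by_syllable):
    """Label the syllable with the highest F0 as '2' and all others as '1'."""
    if not f0_by_syllable:
        raise ValueError("at least one syllable is required")
    # Index of the first syllable carrying the maximum F0.
    peak = max(range(len(f0_by_syllable)), key=lambda i: f0_by_syllable[i])
    return "".join("2" if i == peak else "1" for i in range(len(f0_by_syllable)))

# Hypothetical measurements for a three-syllable word.
print(accent_pattern([170.0, 220.0, 160.0]))
```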

The Perception-Based Study of a Weak Syllable in English Words Containing Weak-Strong Pattern by Korean Learners (I) (약강구조를 포함하는 영어단어에 대한 영어학습자의 약음절 지각과 반응시간(I))

  • Shin Ji-Young;Kim Kee-Ho;Kim Hee-Sung
    • MALSORI / no.57 / pp.31-42 / 2006
  • The purpose of this study is to observe how Korean learners perceive an English weak syllable in words containing a weak-strong (WS) syllable pattern. In an automated discrimination task using E-Prime, stimuli with the same syllable pattern yielded a higher percentage of correct answers and faster reaction times than stimuli with different syllable patterns. Specifically, for stimuli with different syllable patterns, the frequency (familiarity) of the stressed word following the weak syllable and whether the weak syllable had a coda were two important factors in distinguishing between a word with and a word without a weak syllable. Although Koreans with high English proficiency had faster reaction times than those with low proficiency, all Korean learners had difficulty perceiving the weak syllable at the beginning of a word.


A Novel Text to Image Conversion Method Using Word2Vec and Generative Adversarial Networks

  • LIU, XINRUI;Joe, Inwhee
    • Proceedings of the Korea Information Processing Society Conference / 2019.05a / pp.401-403 / 2019
  • In this paper, we propose a generative adversarial network (GAN) based text-to-image generation method. In many natural language processing tasks, word representations are determined by their term frequency-inverse document frequency (TF-IDF) scores. Word2Vec is a type of neural network model that, given an unlabeled corpus, produces a vector expressing the semantics of each word in the corpus; an image is then generated by GAN training according to the obtained vector. Thanks to this understanding of the words, we can generate higher-quality and more realistic images. Our GAN structure is based on deep convolutional neural networks and pixel recurrent neural networks. Comparing the generated images with the real images, we obtain about 88% similarity on the Oxford-102 flowers dataset.

A Study on Korean Students' Production and Perception of English Word-final Stop Voicing

  • Kang, Seok-Han
    • Speech Sciences / v.14 no.1 / pp.105-119 / 2007
  • The purpose of this study is to examine Korean students' production and perception of word-final stop voicing in light of their overseas experience. Subjects were native English speakers, Korean university students with residence experience in America, Korean university students without residence experience in America, and Korean elementary school students. They participated in both production and perception tests. Results showed that the production and perception of the students with residence experience in America were quite similar to those of the native English speakers. In the production tests, we observed somewhat different results for temporal and frequency features. One year of residence in America had some influence on the frequency features, but not on the temporal features, of word-final stop production. That difference could be seen in the perception tests, too. We could not find any difference in the identification test of the final release environment between the Korean university students who had studied abroad and those who had not. Rather, the difference was found in the cue influence test in both the final release and non-release environments.
