• Title/Summary/Keyword: Word Input

Search Result 225, Processing Time 0.024 seconds

Phonological Process and Word Recognition in Continuous Speech: Evidence from Coda-neutralization (음운 현상과 연속 발화에서의 단어 인지 - 종성중화 작용을 중심으로)

  • Kim, Sun-Mi;Nam, Ki-Chun
    • Phonetics and Speech Sciences
    • /
    • v.2 no.2
    • /
    • pp.17-25
    • /
    • 2010
  • This study explores whether Koreans exploit their native coda-neutralization process when recognizing words in Korean continuous speech. According to the phonological rules in Korean, coda-neutralization process must come before the liaison process, as long as the latter(i.e. liaison process) occurs between 'words', which results in liaison-consonants being coda-neutralized ones such as /b/, /d/, or /g/, rather than non-neutralized ones like /p/, /t/, /k/, /ʧ/, /ʤ/, or /s/. Consequently, if Korean listeners use their native coda-neutralization rules when processing speech input, word recognition will be hampered when non-neutralized consonants precede vowel-initial targets. Word-spotting and word-monitoring tasks were conducted in Experiment 1 and 2, respectively. In both experiments, listeners recognized words faster and more accurately when vowel-initial target words were preceded by coda-neutralized consonants than when preceded by coda non-neutralized ones. The results show that Korean listeners exploit the coda-neutralization process when processing their native spoken language.

  • PDF

Four-Year-Old Children's Counting Skills and Their Mothers' Use of Number Words: The Mediating Role of Children's Number Word Use (4세 유아의 수세기 기술과 어머니의 수 단어 사용: 유아 수 단어 사용의 매개효과)

  • Jihyeon Park;Youjeong Park;Yujin Lee;Sunjung Baik;Sukyoung Choe
    • Korean Journal of Childcare and Education
    • /
    • v.19 no.6
    • /
    • pp.79-95
    • /
    • 2023
  • Objective: This study examines the relationships among four-year-olds' counting skills, their use of number words, and their mothers' use of number words during mother-child free play. Specifically, we assess whether children's use of number words mediates the relationship between their counting skills and their mothers' use of number words during play. Methods: Forty-two 4-year-old children and their mothers were asked to play freely with a given set of toys at their home for 10 minutes. Children also completed a counting skill test. Frequencies of number word use were calculated for mothers and children from transcriptions of the free play. Results: Children's counting skills, the frequency of their number word use, and their mothers' frequency of number word use were positively correlated with each other. Additionally, the frequency of children's number-word use completely mediated the relationship between their counting skills and their mothers' frequency of number-word use. Conclusion/Implications: The results suggest that children's use of number language may play a crucial role in the provision of number-related language input by parents, based on their children's math skills. Practical implications of the findings are discussed.

Adjusting Weights of Single-word and Multi-word Terms for Keyphrase Extraction from Article Text

  • Kang, In-Su
    • Journal of the Korea Society of Computer and Information
    • /
    • v.26 no.8
    • /
    • pp.47-54
    • /
    • 2021
  • Given a document, keyphrase extraction is to automatically extract words or phrases which topically represent the content of the document. In unsupervised keyphrase extraction approaches, candidate words or phrases are first extracted from the input document, and scores are calculated for keyphrase candidates, and final keyphrases are selected based on the scores. Regarding the computation of the scores of candidates in unsupervised keyphrase extraction, this study proposes a method of adjusting the scores of keyphrase candidates according to the types of keyphrase candidates: word-type or phrase-type. For this, type-token ratios of word-type and phrase-type candidates as well as information content of high-frequency word-type and phrase-type candidates are collected from the input document, and those values are employed in adjusting the scores of keyphrase candidates. In experiments using four keyphrase extraction evaluation datasets which were constructed for full-text articles in English, the proposed method performed better than a baseline method and comparison methods in three datasets.

Performance Improvement of Context-Sensitive Spelling Error Correction Techniques using Knowledge Graph Embedding of Korean WordNet (alias. KorLex) (한국어 어휘 의미망(alias. KorLex)의 지식 그래프 임베딩을 이용한 문맥의존 철자오류 교정 기법의 성능 향상)

  • Lee, Jung-Hun;Cho, Sanghyun;Kwon, Hyuk-Chul
    • Journal of Korea Multimedia Society
    • /
    • v.25 no.3
    • /
    • pp.493-501
    • /
    • 2022
  • This paper is a study on context-sensitive spelling error correction and uses the Korean WordNet (KorLex)[1] that defines the relationship between words as a graph to improve the performance of the correction[2] based on the vector information of the word embedded in the correction technique. The Korean WordNet replaced WordNet[3] developed at Princeton University in the United States and was additionally constructed for Korean. In order to learn a semantic network in graph form or to use it for learned vector information, it is necessary to transform it into a vector form by embedding learning. For transformation, we list the nodes (limited number) in a line format like a sentence in a graph in the form of a network before the training input. One of the learning techniques that use this strategy is Deepwalk[4]. DeepWalk is used to learn graphs between words in the Korean WordNet. The graph embedding information is used in concatenation with the word vector information of the learned language model for correction, and the final correction word is determined by the cosine distance value between the vectors. In this paper, In order to test whether the information of graph embedding affects the improvement of the performance of context- sensitive spelling error correction, a confused word pair was constructed and tested from the perspective of Word Sense Disambiguation(WSD). In the experimental results, the average correction performance of all confused word pairs was improved by 2.24% compared to the baseline correction performance.

Ternary Decomposition and Dictionary Extension for Khmer Word Segmentation

  • Sung, Thaileang;Hwang, Insoo
    • Journal of Information Technology Applications and Management
    • /
    • v.23 no.2
    • /
    • pp.11-28
    • /
    • 2016
  • In this paper, we proposed a dictionary extension and a ternary decomposition technique to improve the effectiveness of Khmer word segmentation. Most word segmentation approaches depend on a dictionary. However, the dictionary being used is not fully reliable and cannot cover all the words of the Khmer language. This causes an issue of unknown words or out-of-vocabulary words. Our approach is to extend the original dictionary to be more reliable with new words. In addition, we use ternary decomposition for the segmentation process. In this research, we also introduced the invisible space of the Khmer Unicode (char\u200B) in order to segment our training corpus. With our segmentation algorithm, based on ternary decomposition and invisible space, we can extract new words from our training text and then input the new words into the dictionary. We used an extended wordlist and a segmentation algorithm regardless of the invisible space to test an unannotated text. Our results remarkably outperformed other approaches. We have achieved 88.8%, 91.8% and 90.6% rates of precision, recall and F-measurement.

A Computational Model of Language Learning Driven by Training Inputs

  • Lee, Eun-Seok;Lee, Ji-Hoon;Zhang, Byoung-Tak
    • Proceedings of the Korean Society for Cognitive Science Conference
    • /
    • 2010.05a
    • /
    • pp.60-65
    • /
    • 2010
  • Language learning involves linguistic environments around the learner. So the variation in training input to which the learner is exposed has been linked to their language learning. We explore how linguistic experiences can cause differences in learning linguistic structural features, as investigate in a probabilistic graphical model. We manipulate the amounts of training input, composed of natural linguistic data from animation videos for children, from holistic (one-word expression) to compositional (two- to six-word one) gradually. The recognition and generation of sentences are a "probabilistic" constraint satisfaction process which is based on massively parallel DNA chemistry. Random sentence generation tasks succeed when networks begin with limited sentential lengths and vocabulary sizes and gradually expand with larger ones, like children's cognitive development in learning. This model supports the suggestion that variations in early linguistic environments with developmental steps may be useful for facilitating language acquisition.

  • PDF

A Performance Analysis Based on Hadoop Application's Characteristics in Cloud Computing (클라우드 컴퓨팅에서 Hadoop 애플리케이션 특성에 따른 성능 분석)

  • Keum, Tae-Hoon;Lee, Won-Joo;Jeon, Chang-Ho
    • Journal of the Korea Society of Computer and Information
    • /
    • v.15 no.5
    • /
    • pp.49-56
    • /
    • 2010
  • In this paper, we implement a Hadoop based cluster for cloud computing and evaluate the performance of this cluster based on application characteristics by executing RandomTextWriter, WordCount, and PI applications. A RandomTextWriter creates given amount of random words and stores them in the HDFS(Hadoop Distributed File System). A WordCount reads an input file and determines the frequency of a given word per block unit. PI application induces PI value using the Monte Carlo law. During simulation, we investigate the effect of data block size and the number of replications on the execution time of applications. Through simulation, we have confirmed that the execution time of RandomTextWriter was proportional to the number of replications. However, the execution time of WordCount and PI were not affected by the number of replications. Moreover, the execution time of WordCount was optimum when the block size was 64~256MB. Therefore, these results show that the performance of cloud computing system can be enhanced by using a scheduling scheme that considers application's characteristics.

A Study on the Accuracy Improvement of Movie Recommender System Using Word2Vec and Ensemble Convolutional Neural Networks (Word2Vec과 앙상블 합성곱 신경망을 활용한 영화추천 시스템의 정확도 개선에 관한 연구)

  • Kang, Boo-Sik
    • Journal of Digital Convergence
    • /
    • v.17 no.1
    • /
    • pp.123-130
    • /
    • 2019
  • One of the most commonly used methods of web recommendation techniques is collaborative filtering. Many studies on collaborative filtering have suggested ways to improve accuracy. This study proposes a method of movie recommendation using Word2Vec and an ensemble convolutional neural networks. First, in the user, movie, and rating information, construct the user sentences and movie sentences. It inputs user sentences and movie sentences into Word2Vec to obtain user vectors and movie vectors. User vectors are entered into user convolution model and movie vectors are input to movie convolution model. The user and the movie convolution models are linked to a fully connected neural network model. Finally, the output layer of the fully connected neural network outputs forecasts of user movie ratings. Experimentation results showed that the accuracy of the technique proposed in this study accuracy of conventional collaborative filtering techniques was improved compared to those of conventional collaborative filtering technique and the technique using Word2Vec and deep neural networks proposed in a similar study.

Coreference Resolution using Hierarchical Pointer Networks (계층적 포인터 네트워크를 이용한 상호참조해결)

  • Park, Cheoneum;Lee, Changki
    • KIISE Transactions on Computing Practices
    • /
    • v.23 no.9
    • /
    • pp.542-549
    • /
    • 2017
  • Sequence-to-sequence models and similar pointer networks suffer from performance degradation when an input is composed of multiple sentences or when the length of the input sentence is long. To solve this problem, this paper proposes a hierarchical pointer network model that uses both the word level and sentence level information to encode input sequences composed of several sentences at the word level and sentence level. We propose a hierarchical pointer network based coreference resolution that performs a coreference resolution for all mentions. The experimental results show that the proposed model has a precision of 87.07%, recall of 65.39% and CoNLL F1 74.61%, which is an improvement of 21.83% compared to an existing rule-based model.

SOC Verification Based on WGL

  • Du, Zhen-Jun;Li, Min
    • Journal of Korea Multimedia Society
    • /
    • v.9 no.12
    • /
    • pp.1607-1616
    • /
    • 2006
  • The growing market of multimedia and digital signal processing requires significant data-path portions of SoCs. However, the common models for verification are not suitable for SoCs. A novel model--WGL (Weighted Generalized List) is proposed, which is based on the general-list decomposition of polynomials, with three different weights and manipulation rules introduced to effect node sharing and the canonicity. Timing parameters and operations on them are also considered. Examples show the word-level WGL is the only model to linearly represent the common word-level functions and the bit-level WGL is especially suitable for arithmetic intensive circuits. The model is proved to be a uniform and efficient model for both bit-level and word-level functions. Then Based on the WGL model, a backward-construction logic-verification approach is presented, which reduces time and space complexity for multipliers to polynomial complexity(time complexity is less than $O(n^{3.6})$ and space complexity is less than $O(n^{1.5})$) without hierarchical partitioning. Finally, a construction methodology of word-level polynomials is also presented in order to implement complex high-level verification, which combines order computation and coefficient solving, and adopts an efficient backward approach. The construction complexity is much less than the existing ones, e.g. the construction time for multipliers grows at the power of less than 1.6 in the size of the input word without increasing the maximal space required. The WGL model and the verification methods based on WGL show their theoretical and applicable significance in SoC design.

  • PDF