• Title/Summary/Keyword: English Word

Text Classification Using Parallel Word-level and Character-level Embeddings in Convolutional Neural Networks

  • Geonu Kim;Jungyeon Jang;Juwon Lee;Kitae Kim;Woonyoung Yeo;Jong Woo Kim
    • Asia pacific journal of information systems
    • /
    • v.29 no.4
    • /
    • pp.771-788
    • /
    • 2019
  • Deep learning techniques such as Convolutional Neural Networks (CNNs) and Recurrent Neural Networks (RNNs) outperform traditional approaches such as Support Vector Machines (SVMs) and Naïve Bayesian approaches in text classification. When CNNs are used for text classification, word embedding or character embedding transforms words or characters into fixed-size vectors before they are fed into convolutional layers. In this paper, we propose a parallel word-level and character-level embedding approach in CNNs for text classification. The proposed approach can capture word-level and character-level patterns concurrently. To show the usefulness of the proposed approach, we perform experiments with two English and three Korean text datasets. The experimental results show that character-level embedding works better in Korean and word-level embedding performs well in English. The results also reveal that the proposed approach performs better than traditional CNNs with only word-level or only character-level embedding on both Korean and English documents. A more detailed investigation shows that, compared with the traditional embedding approaches, the proposed approach tends to perform better when the amount of data is relatively small.
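As a rough illustration of the parallel embedding idea described in this abstract, the sketch below encodes one text at both the word level and the character level, runs each sequence through its own embedding lookup, convolution, and max-over-time pooling, and concatenates the two pooled vectors. It is a minimal sketch in plain Python, not the authors' architecture: the function names, sequence lengths, filter counts, and random (untrained) weights are all assumptions made for the example.

```python
import random

def build_vocab(items):
    """Map each distinct item to an integer id; id 0 is reserved for padding and unknowns."""
    return {item: i + 1 for i, item in enumerate(sorted(set(items)))}

def encode(items, vocab, length):
    """Turn a list of tokens into a fixed-length id list (truncate or zero-pad)."""
    ids = [vocab.get(item, 0) for item in items[:length]]
    return ids + [0] * (length - len(ids))

def random_matrix(rows, cols, rng):
    """Random weights standing in for parameters a real model would learn."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]

def conv_max_pool(embedded, filters, width):
    """1-D convolution over the sequence, then ReLU and max-over-time pooling."""
    pooled = []
    for filt in filters:  # each filter holds width * dim weights
        best = 0.0  # starting at 0 applies the ReLU floor
        for start in range(len(embedded) - width + 1):
            window = [x for vec in embedded[start:start + width] for x in vec]
            best = max(best, sum(w * x for w, x in zip(filt, window)))
        pooled.append(best)
    return pooled

def parallel_features(text, word_vocab, char_vocab, p):
    """Run the word-level and character-level branches and concatenate their outputs."""
    text = text.lower()
    word_ids = encode(text.split(), word_vocab, p["max_words"])
    char_ids = encode(list(text), char_vocab, p["max_chars"])
    word_emb = [p["word_table"][i] for i in word_ids]
    char_emb = [p["char_table"][i] for i in char_ids]
    word_feat = conv_max_pool(word_emb, p["word_filters"], p["word_width"])
    char_feat = conv_max_pool(char_emb, p["char_filters"], p["char_width"])
    return word_feat + char_feat  # input to a classifier layer (not shown)

# Hypothetical toy setup: two tiny documents and arbitrary hyperparameters.
rng = random.Random(0)
corpus = ["the movie was great", "the plot was dull"]
word_vocab = build_vocab(w for doc in corpus for w in doc.split())
char_vocab = build_vocab(c for doc in corpus for c in doc)
DIM = 4
params = {
    "max_words": 8, "max_chars": 32, "word_width": 2, "char_width": 3,
    "word_table": random_matrix(len(word_vocab) + 1, DIM, rng),
    "char_table": random_matrix(len(char_vocab) + 1, DIM, rng),
    "word_filters": random_matrix(3, 2 * DIM, rng),
    "char_filters": random_matrix(2, 3 * DIM, rng),
}
features = parallel_features("the movie was dull", word_vocab, char_vocab, params)
print(len(features))  # 3 word-level features followed by 2 character-level features
```

In a trained model the concatenated vector would feed a fully connected classification layer, and the embedding tables and filters would be learned jointly rather than drawn at random.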

Target Word Selection for English-Korean Machine Translation System using Multiple Knowledge (다양한 지식을 사용한 영한 기계번역에서의 대역어 선택)

  • Lee, Ki-Young;Kim, Han-Woo
    • Journal of the Korea Society of Computer and Information
    • /
    • v.11 no.5 s.43
    • /
    • pp.75-86
    • /
    • 2006
  • Target word selection is one of the most important and difficult tasks in English-Korean machine translation, and it affects the translation accuracy of machine translation systems. In this paper, we present a new approach to selecting a Korean target word for an English noun with translation ambiguities, using multiple knowledge sources such as verb frame patterns, sense vectors based on collocations, statistical Korean local context information, and co-occurring POS information. Verb frame patterns constructed from a dictionary and a corpus play an important role in resolving the sparseness problem of collocation data. A sense vector is a set of collocation data for the case in which an English word with target selection ambiguities is translated to a specific Korean target word. Statistical Korean local context information is N-gram information generated from a Korean corpus. The co-occurring POS information is a statistically significant POS clue that appears with an ambiguous word. The experiment showed promising results for diverse sentences from web documents.

An Acoustical Study of English Word Stress Produced by Americans and Koreans

  • Yang, Byung-Gon
    • Speech Sciences
    • /
    • v.9 no.1
    • /
    • pp.77-88
    • /
    • 2002
  • Acoustical correlates of stress can be classified as duration, intensity and fundamental frequency. This study examined the acoustical differences in the first two syllables of stressed English words produced by ten American and Korean speakers. The Korean subjects scored very high on the TOEFL. They read at a normal speed a fable from which the acoustical parameters of eight words were analyzed. In order to make the data comparison meaningful, each parameter was collected at 100 dynamic time points proportional to the total duration of the two syllables. Then the ratio of the parameter sum of the first rime to that of the second rime was calculated to determine the relative prominence of the syllables. Results showed that the durations of the first two syllables were almost comparable between the Americans and Koreans. However, statistically significant differences showed up in the diphthong pronunciations and in the words with the second syllable stressed. Also, remarkably high r-squared values were found between pairs of the three acoustical parameters, which suggests that either one or a combination of two or more parameters may account for the prominence of a syllable within a word.
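The time-normalized comparison described in this abstract can be sketched as follows: a parameter track covering the two syllables is resampled to 100 points, and the sum over the first rime is divided by the sum over the second rime. This is a minimal sketch under stated assumptions, not the study's procedure: linear interpolation, the function names, and the representation of rime boundaries as fractions of the two-syllable duration are all choices made for illustration.

```python
def resample(track, n_points=100):
    """Linearly interpolate a list of measurements to n_points evenly spaced samples."""
    if len(track) < 2:
        raise ValueError("need at least two measurements")
    last = len(track) - 1
    samples = []
    for k in range(n_points):
        pos = k * last / (n_points - 1)  # fractional index into the original track
        lo = int(pos)
        hi = min(lo + 1, last)
        frac = pos - lo
        samples.append(track[lo] * (1 - frac) + track[hi] * frac)
    return samples

def rime_ratio(track, rime1, rime2, n_points=100):
    """Ratio of the summed parameter over the first rime to that over the second.

    rime1 and rime2 are (start, end) fractions of the total two-syllable
    duration; this boundary format is an assumption of the sketch.
    """
    samples = resample(track, n_points)

    def span_sum(span):
        start, end = span
        return sum(samples[round(start * n_points):round(end * n_points)])

    denominator = span_sum(rime2)
    if denominator == 0:
        raise ValueError("second rime has zero summed value")
    return span_sum(rime1) / denominator

# Hypothetical intensity track: louder first syllable, quieter second syllable.
track = [2.0] * 5 + [1.0] * 5
ratio = rime_ratio(track, rime1=(0.1, 0.4), rime2=(0.6, 0.9))
print(ratio)
```

A ratio above 1 indicates that the first rime is more prominent on that parameter, and a ratio below 1 indicates that the second rime is.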

Base-Identity Effects in Some Morphophonemic Alternations in English

  • Kim, Heeyong
    • Korean Journal of English Language and Linguistics
    • /
    • v.2 no.2
    • /
    • pp.185-205
    • /
    • 2002
  • Within the framework of Generalized Sympathy (GS) (Jun 1999), this paper investigates the reasons why phonological rules such as Cluster Simplification, Closed Syllable æ-Tensing, and Belfast Dentalization overapply or underapply in Class 2 affixed words in English. According to GS, a morphologically independent word can be treated as a derived word in that it is assumed to have any possible outputs as bases to resemble. As a result, a correspondence relation is triggered between a morphologically independent word being represented as Derived (D) and any possible outputs represented as Base (B), i.e., BD-Faith. In analyses of affixed words, BA-Faith is evoked instead of BD-Faith. Furthermore, as Benua (1997) suggests, BA-Faith is classified into two correspondence relations: $BA_1$-Faith between Base and Class 1 affixed words, and $BA_2$-Faith between Base and Class 2 affixed words. When the $BA_1$-Faith takes precedence over phonological constraints, three rules misapply in Class 2 affixed words. In other words, the misapplications are driven by base-identity effects.

A Study on the Voice Onset Time of English Voiceless Stops in the Buckeye Corpus (벅아이 코퍼스를 이용한 영어 무성파열음의 VOT 연구)

  • Yoon, Kyu-Chul
    • Phonetics and Speech Sciences
    • /
    • v.4 no.2
    • /
    • pp.33-40
    • /
    • 2012
  • The purpose of this paper is to investigate the voice onset time (VOT) of the English voiceless stops [p, t, k] found in the Buckeye Corpus of Conversational Speech [1]. Three young female speakers were chosen for this study and their VOT values were semi-automatically extracted along with other factors. The factors used for the analysis were place of articulation, location in word, syllabic stress, content word or not, word frequency calculated from the corpus, and the speech rate expressed in syllables per second. Results showed that, for the three places of articulation of each speaker, all the factors had a statistically significant effect on the VOT values. This paper has significance in that the materials used for the analysis were from a corpus of spontaneous natural English speech.

Extra Vowel Addition Produced in Korean Students' English Pronunciation of Word-final Stop Consonants (영어 폐쇄자음 발음 뒤에 나타나는 모음추가 현상)

  • Hwang, Young-Soon
    • Speech Sciences
    • /
    • v.7 no.4
    • /
    • pp.169-186
    • /
    • 2000
  • This paper aims to confirm the mispronunciations of native Korean students that stem from differences between the phonetic and phonological systems of English and Korean, and to identify by experiment what remains to be done. Many Korean students tend to differentiate word-final stop consonants not by vowel duration or by allophones but by the phoneme of the consonant itself. In English, stops are realized as aspirated, unaspirated, or unreleased sounds depending on the context. In Korean, however, these are not allophones of a phoneme but distinct phonemes. Therefore, many Korean students are apt to add an extra vowel sound /i/ after the final stop consonant in the eve form, owing both to their failure to perceive the difference between the phonemes and the allophones of stop consonants and to the influence of the Korean sound-sequence relationship. Because replacing allophones or adding an extra vowel does not change the meaning, the importance of the issue has been largely overlooked. Nevertheless, this kind of study is essential for the precise learning and use of the English language.

A Study on the Voice Onset Times of the Buckeye Corpus Stops (벅아이 코퍼스 파열음의 성대진동 개시시간 연구)

  • Park, Soo Hee;Yoon, Kyuchul
    • Phonetics and Speech Sciences
    • /
    • v.8 no.1
    • /
    • pp.9-17
    • /
    • 2016
  • The purpose of this work is to examine the voice onset times (VOTs) of the voiceless and voiced stops produced by the ten young male speakers of the Buckeye corpus [9]. The factors that are known to affect VOTs were also extracted, including the place of articulation, height of following vowels, location within the word, presence of a preceding [s], status of the target word as a content or function word, presence of syllabic stress, word frequency, and speech rate. Findings from this work mostly agreed with those from earlier studies on English, but with some exceptions and new discoveries. We hope that this work can contribute to understanding the nature and properties of spontaneous English speech.

A Study on the Influence of English Vowel Pronunciation Training on Word Initial Stop Pronunciation of Korean English Learners (영어 모음 발음 교육이 한국인 학습자의 어두 폐쇄음 발화에 미치는 영향에 대한 연구)

  • Kim, Ji-Eun
    • Phonetics and Speech Sciences
    • /
    • v.5 no.3
    • /
    • pp.31-38
    • /
    • 2013
  • This study investigated the influence of English vowel pronunciation training on English word-initial stop pronunciation. For that purpose, VOT values of English stops produced by twenty Korean learners of English (five male and five female speakers of the Youngnam dialect, and five male and five female speakers of the Kangwon dialect) were measured using the Speech Analyzer, and their post-training production was compared with their pre-training production. The results show that the post-training VOT values of voiced stops became closer to those of native English speakers in all four groups. Hence, analysis of the change in the quality of the following vowels (especially low vowels) and in the degree of stress suggests that vowel pronunciation training is effective for correcting the pronunciation of voiced stops.

Acoustic Analysis for Natural Pronunciation Programs

  • Lim Un
    • MALSORI
    • /
    • no.44
    • /
    • pp.1-14
    • /
    • 2002
  • Because accuracy and fluency are the essence of English speaking, both are very important in English teacher training and in-service English training programs. To achieve accuracy and fluency, the causes and phenomena of unnatural pronunciation have to be diagnosed. Consequently, the problematic and unnatural pronunciation of Korean elementary and secondary English teachers should be analyzed using acoustic analysis tools such as CSL, Multi-speech, and Praat. In addition, an attempt was made to pinpoint the causes of unnatural pronunciation. Next, a procedure and steps were proposed for in-service training programs that would cultivate fluency and accuracy. In the case of elementary teachers, unnatural pronunciation was found frequently in both segmental and suprasegmental features. Therefore, segmental features should be emphasized at the beginning of pronunciation training courses, and then suprasegmental features have to be emphasized. In the case of secondary teachers, unnatural pronunciation was found frequently in suprasegmental features. Therefore, segmental and suprasegmental features have to be addressed at the same time. In other words, word-level features should be addressed first for elementary English teachers, and features at and beyond the word level should be trained at the same time for secondary English teachers.

English Word Game System Recognizing Newly Coined Words (신조어를 인식할 수 있는 영어단어 게임시스템)

  • Shim, Dong-uk;Park, So-young;Kim, Ki-sub;Kang, Han-gu;Jang, Jun-ho;Kim, Dae-woong
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference
    • /
    • 2009.05a
    • /
    • pp.521-524
    • /
    • 2009
  • Anyone can easily acquire learning materials in the rapidly developing web environment. Because the importance of English education is increasingly emphasized, many English education systems have been introduced. However, most previous English education systems support only a single-user mode and cannot deal with a newly coined word such as 'WIKIPEDIA'. In order to foster a user's learning ability with interest and enjoyment, this paper proposes an online English word game system implementing a 'scrabble' board game. The proposed English word game system has the following characteristics. First, it supports both a single-user mode and a multi-user mode with a virtual user based on artificial intelligence. Second, it can recognize newly coined words such as 'WIKIPEDIA' by using the NAVER Open API dictionary. Third, it offers a familiar user interface so that a user can play the game without any manual. Therefore, it is expected that the proposed system can help users learn English words with interest and enjoyment.
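One way a word game might accept newly coined words is to check a built-in word list first and fall back to an external dictionary service only when the word is missing. The sketch below illustrates that pattern. It is an assumption-laden example rather than the system described in the abstract: `WordValidator`, `stub_lookup`, and the word lists are hypothetical, and the external dictionary is replaced by a local stub so that no real API is called or implied.

```python
from typing import Callable, Dict, Iterable, List

class WordValidator:
    """Check words against a local list, then an injected external lookup."""

    def __init__(self, base_words: Iterable[str], remote_lookup: Callable[[str], bool]):
        self._known = {w.upper() for w in base_words}
        self._remote_lookup = remote_lookup
        self._cache: Dict[str, bool] = {}  # remembers earlier remote answers

    def is_valid(self, word: str) -> bool:
        key = word.strip().upper()
        if not key.isalpha():
            return False  # game tiles carry letters only
        if key in self._known:
            return True
        if key not in self._cache:
            # Only words missing from the local list reach the external service.
            self._cache[key] = bool(self._remote_lookup(key))
        return self._cache[key]

# Hypothetical stand-in for an online dictionary query; it logs each call.
remote_calls: List[str] = []

def stub_lookup(word: str) -> bool:
    remote_calls.append(word)
    return word in {"WIKIPEDIA"}

validator = WordValidator(["cat", "dog"], stub_lookup)
print(validator.is_valid("cat"), validator.is_valid("wikipedia"))
```

Injecting the lookup as a callable keeps the game logic testable offline, and the cache avoids repeating a remote query when the same new word is played again.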
