• Title/Summary/Keyword: Korean-Spoken English

Search Result 83

Generating a Korean Sentiment Lexicon Through Sentiment Score Propagation (감정점수의 전파를 통한 한국어 감정사전 생성)

  • Park, Ho-Min;Kim, Chang-Hyun;Kim, Jae-Hoon
    • KIPS Transactions on Software and Data Engineering
    • /
    • v.9 no.2
    • /
    • pp.53-60
    • /
    • 2020
  • Sentiment analysis is the automated process of understanding attitudes and opinions about a given topic from written or spoken text. One approach to sentiment analysis is dictionary-based, in which a sentiment dictionary plays an important role. In this paper, we propose a method to automatically generate a Korean sentiment lexicon from the well-known English sentiment lexicon VADER (Valence Aware Dictionary and sEntiment Reasoner). The proposed method consists of three steps. The first step is to build a Korean-English bilingual lexicon using a Korean-English parallel corpus. The bilingual lexicon is a set of pairs between VADER sentiment words and Korean morphemes that serve as candidate Korean sentiment words. The second step is to construct a bilingual word graph using the bilingual lexicon. The third step is to run the label propagation algorithm over the bilingual graph. Finally, a new Korean sentiment lexicon is generated by repeatedly applying the propagation algorithm until the values of all vertices converge. Empirically, the dictionary-based sentiment classifier using the Korean sentiment lexicon outperforms machine learning-based approaches on the KMU sentiment corpus and the Naver sentiment corpus. In the future, we will apply the proposed approach to generate multilingual sentiment lexica.
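The propagation step described in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: the update rule (each unlabeled vertex takes the weighted average of its neighbors while seed vertices stay clamped), the function name `propagate_scores`, the toy edges, and the seed values are all assumptions made for illustration.

```python
def propagate_scores(edges, seed_scores, tol=1e-6, max_iter=1000):
    """Spread seed scores over an undirected weighted graph.

    edges: dict mapping (node_a, node_b) -> positive edge weight
    seed_scores: dict mapping seed node -> fixed score (clamped every round)
    """
    # Build a symmetric adjacency map from the edge list.
    neighbors = {}
    for (a, b), w in edges.items():
        neighbors.setdefault(a, {})[b] = w
        neighbors.setdefault(b, {})[a] = w

    # Seeds start at their known score; every other vertex starts at 0.
    scores = {n: seed_scores.get(n, 0.0) for n in neighbors}

    for _ in range(max_iter):
        new_scores = {}
        for node, nbrs in neighbors.items():
            if node in seed_scores:
                new_scores[node] = seed_scores[node]  # keep seeds fixed
            else:
                total = sum(nbrs.values())
                new_scores[node] = sum(w * scores[m] for m, w in nbrs.items()) / total
        # Stop once no vertex moved by more than the tolerance.
        delta = max(abs(new_scores[n] - scores[n]) for n in scores)
        scores = new_scores
        if delta < tol:
            break
    return scores


# Toy bilingual graph: English seed words linked to hypothetical Korean morphemes.
edges = {
    ("good", "좋"): 1.0,
    ("bad", "나쁘"): 1.0,
    ("good", "괜찮"): 3.0,
    ("bad", "괜찮"): 1.0,
}
seeds = {"good": 1.0, "bad": -1.0}  # illustrative values, not actual VADER scores
lexicon = propagate_scores(edges, seeds)
```

In this toy graph the morpheme linked to both seeds ends up between them, pulled toward the seed with the heavier edge. In practice the edge weights would come from the parallel corpus; how the paper derives them is not stated in the abstract.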

AI-based language tutoring systems with end-to-end automatic speech recognition and proficiency evaluation

  • Byung Ok Kang;Hyung-Bae Jeon;Yun Kyung Lee
    • ETRI Journal
    • /
    • v.46 no.1
    • /
    • pp.48-58
    • /
    • 2024
  • This paper presents the development of language tutoring systems for non-native speakers by leveraging advanced end-to-end automatic speech recognition (ASR) and proficiency evaluation. Given the frequent errors in non-native speech, high-performance spontaneous speech recognition must be applied. Our systems accurately evaluate pronunciation and speaking fluency and provide feedback on errors by relying on precise transcriptions. End-to-end ASR is implemented and enhanced by using diverse non-native speaker speech data for model training. For performance enhancement, we combine semi-supervised and transfer learning techniques using labeled and unlabeled speech data. Automatic proficiency evaluation is performed by a model trained to maximize the statistical correlation between the fluency score manually determined by a human expert and a calculated fluency score. We developed an English tutoring system for Korean elementary students called EBS AI Peng-Talk and a Korean tutoring system for foreigners called KSI Korean AI Tutor. Both systems were deployed by South Korean government agencies.

Abstract of Professor John Wells's Invited Lecture (존 웰즈 교수의 초청 강연 초록)

  • Wells, John
    • MALSORI
    • /
    • no.15_18
    • /
    • pp.71-80
    • /
    • 1989
  • It is an honour to be speaking on phonetics at the invitation of the Phonetic Society of Korea. Through the Korean Hangul script, invented in the fifteenth century at the instigation of the great King Sejong, and the work Hunminjeongeum which describes it, this country has an important place in the world history of phonetics. Phonetics is the description and analysis of pronunciation. Spoken language can be investigated at three points: in the speaker (articulatory phonetics), in the hearer (auditory phonetics), and in the physical speech signal (acoustic phonetics)... Beginners in English whose mother tongue is Korean have to learn to make the sound 'f' as in "coffee", which is a voiceless labio-dental fricative, made with the lower lip against the upper teeth. They also have to learn to make the [θ] sound in "think", a voiceless dental fricative.

A Comparative Study on the Korean and English Genderlect: Focused on Polite Expressions (한국어와 영어 성별어 비교연구: 공손표현과 관련하여)

  • Kim, Hyun Hyo
    • Journal of the Korea Academia-Industrial cooperation Society
    • /
    • v.16 no.10
    • /
    • pp.6527-6533
    • /
    • 2015
  • It is generally accepted that men and women differ in linguistic communication style. Genderlect is a sociolinguistic term for the linguistic differences in the speech of a specific gender. Several linguistic features are offered as evidence of genderlects: pitch, lexicon, intonation, grammar, and style. The purpose of this paper is to compare the characteristics of genderlect in English and Korean. To do so, I analyzed the scripts of an English movie, 'Mrs. Doubtfire', and a Korean TV drama, 'Oohlala couple'. In 'Mrs. Doubtfire', tension and laughter arose from the discrepancy between the way he looked (as a woman) and the way he spoke (like a man). The same is true of 'Oohlala couple'. In the language of 'Mrs. Doubtfire', male speech characteristics were salient in nouns, while in 'Oohlala couple' they were salient in verb forms, especially in honorific style, which shows a difference between Korean and English genderlect. Korean has special genderlect characteristics in the honorific speech style realized in verb endings. In Korean, the highest honorific speech style, 'Habsho-che', is used in official situations, and men are more accustomed to it than women. When women have to use polite expressions, they must choose between the highest honorific style, 'Habsho-che', losing the female characteristics, and the second-highest honorific style, 'Haeyo-che', keeping the female characteristics.

An Artificial Intelligence Approach for Word Semantic Similarity Measure of Hindi Language

  • Younas, Farah;Nadir, Jumana;Usman, Muhammad;Khan, Muhammad Attique;Khan, Sajid Ali;Kadry, Seifedine;Nam, Yunyoung
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.15 no.6
    • /
    • pp.2049-2068
    • /
    • 2021
  • AI combined with NLP techniques has promoted the use of Virtual Assistants and has made people rely on them for many diverse uses. Conversational Agents are the most promising technique for assisting computer users through their operation. An important challenge in developing Conversational Agents globally is transferring the groundbreaking expertise obtained in English to other languages. AI is making it possible to transfer this learning. There is a dire need to develop systems that understand secular languages. One such difficult language is Hindi, which is the fourth most spoken language in the world. Semantic similarity is an important part of Natural Language Processing, with applications such as ontology learning and information extraction, for developing conversational agents. Most of the research is concentrated on English and other European languages. This paper presents a corpus-based word semantic similarity measure for Hindi. An experiment involving the translation of the English benchmark dataset to Hindi is performed, investigating the incorporation of the corpus, with human and machine similarity ratings. A significant correlation between human intuition and the algorithm ratings has been calculated to analyze the accuracy of the proposed similarity measures. The method can be adapted to various applications of word semantic similarity or used as a module for any other language.
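The agreement between human and algorithm similarity ratings mentioned in the abstract above is typically summarized with a correlation coefficient. The sketch below computes Pearson's r as one plausible choice; the abstract does not say which coefficient was used, and the function name `pearson` and both rating lists are made up for illustration.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of products of deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical ratings for four word pairs (not data from the paper).
human_ratings = [3.9, 0.5, 2.8, 1.2]        # e.g. averaged human judgments
machine_ratings = [0.91, 0.10, 0.55, 0.40]  # e.g. similarity scores from a measure
r = pearson(human_ratings, machine_ratings)
```

A value of r close to 1 means the measure ranks and spaces word pairs much as the human raters do; the two rating scales do not need to match, since the coefficient is invariant to linear rescaling.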

Subdialogue Cues, Speaker Intention, and the Deletion of Hearer Arguments in Spoken Korean (대화체에서 부대화의 개시/종료 및 화자의 의도, 그리고 청자 논항의 생략)

  • Hong, Min-Pyo;Lee, Hyon-Ho
    • Annual Conference on Human and Language Technology
    • /
    • 1998.10c
    • /
    • pp.358-364
    • /
    • 1998
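  • As part of research on the pragmatic knowledge needed to build a cognitive model of Korean dialogue, this study analyzes the intentions of discourse participants at the points where subdialogues begin and end, together with the surface markers associated with those intentions. Furthermore, as a preliminary step toward recovering the meaning of hearer and speaker arguments, which are obligatory argument components yet are frequently omitted in spoken dialogue, it classifies and analyzes the types of hearer-argument deletion in relation to the properties of predicate endings and the speaker's linguistic act, or speech act. Such research is an essential step in developing Korean dialogue agents and can provide a basic framework for future research on spoken-dialogue understanding systems.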

Prosodic Strengthening in Speech Production and Perception: The Current Issues

  • Cho, Tae-Hong
    • Speech Sciences
    • /
    • v.14 no.4
    • /
    • pp.7-24
    • /
    • 2007
  • This paper discusses some current issues regarding how prosodic structure is manifested in fine-grained phonetic details, how prosodically-conditioned articulatory variation is explained in terms of speech dynamics, and how such phonetic manifestation of prosodic structure may be exploited in spoken word recognition. Prosodic structure is phonetically manifested in prosodically important landmark locations such as prosodic domain-final position, domain-initial position and stressed/accented syllables. It will be discussed how each of the prosodic landmarks engenders particular phonetic patterns, how articulatory variation in such locations is dynamically accounted for, and how prosodically-driven fine-grained phonetic detail is exploited by listeners in speech comprehension.

Error Correction and Praat Script Tools for the Buckeye Corpus of Conversational Speech (벅아이 코퍼스 오류 수정과 코퍼스 활용을 위한 프랏 스크립트 툴)

  • Yoon, Kyu-Chul
    • Phonetics and Speech Sciences
    • /
    • v.4 no.1
    • /
    • pp.29-47
    • /
    • 2012
  • The purpose of this paper is to show how to convert the label files of the Buckeye Corpus of Spontaneous Speech [1] into Praat format and to introduce some of the Praat scripts that will enable linguists to study various aspects of spoken American English present in the corpus. During the conversion process, several types of errors were identified and corrected either manually or automatically by the use of scripts. The Praat script tools that have been developed can help extract from the corpus massive amounts of phonetic measures such as the VOT of plosives, the formants of vowels, word frequency information and speech rates that span several consecutive words. The script tools can extract additional information concerning the phonetic environment of the target words or allophones.

Zero Pronoun Resolution for Korean-English Spoken Language MT (한국어-영어 대화체 번역시스템을 위한 영형 대명사 해소)

  • Park, Arum;Ji, Eun-Byul;Hong, Munpyo
    • Annual Conference on Human and Language Technology
    • /
    • 2011.10a
    • /
    • pp.98-101
    • /
    • 2011
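  • This paper presents a new method for zero pronoun resolution in a Korean-English spoken language translation system. A zero pronoun is an element omitted from a sentence that can be inferred from context, situation, or world knowledge. The paper focuses on subject pronoun drop in particular, because it is very common in spoken Korean such as drama scripts and instant messenger chats. We propose a simple method that does not require a large amount of knowledge. In the evaluation, our method achieved an F-measure of 0.79 and improved the overall translation rate by about 4.1%.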

GMM based Nonlinear Transformation Methods for Voice Conversion

  • Vu, Hoang-Gia;Bae, Jae-Hyun;Oh, Yung-Hwan
    • Proceedings of the KSPS conference
    • /
    • 2005.11a
    • /
    • pp.67-70
    • /
    • 2005
  • Voice conversion (VC) is a technique for modifying the speech signal of a source speaker so that it sounds as if it is spoken by a target speaker. Most previous VC approaches used a linear transformation function based on GMM to convert the source spectral envelope to the target spectral envelope. In this paper, we propose several nonlinear GMM-based transformation functions in an attempt to deal with the over-smoothing effect of linear transformation. In order to obtain high-quality modifications of speech signals, our VC system is implemented using the Harmonic plus Noise Model (HNM) analysis/synthesis framework. Experimental results are reported on the English corpus, MOCHA-TIMIT.
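For orientation, the conventional linear GMM-based mapping that the abstract above contrasts with can be sketched in one dimension. This illustrates the baseline idea only, not the nonlinear functions the paper proposes: each mixture component contributes a linear regression from source to target, weighted by the posterior probability of that component given the source value. The function names, dictionary keys, and parameter values are assumptions for illustration; real systems operate on multi-dimensional spectral features with parameters estimated from aligned training data.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def convert(x, components):
    """Map a scalar source feature x to the target space.

    components: list of dicts with keys
      weight  - mixture weight of the component
      mean_x  - source mean, var_x - source variance
      mean_y  - target mean, cov_yx - target/source covariance
    """
    # Weighted likelihood of x under each component's source marginal.
    likelihoods = [
        c["weight"] * gaussian_pdf(x, c["mean_x"], c["var_x"]) for c in components
    ]
    total = sum(likelihoods)
    y = 0.0
    for lik, c in zip(likelihoods, components):
        posterior = lik / total  # probability of the component given x
        # Per-component linear regression from source to target.
        y += posterior * (c["mean_y"] + c["cov_yx"] / c["var_x"] * (x - c["mean_x"]))
    return y


# Hypothetical two-component model (values chosen only for illustration).
model = [
    {"weight": 0.5, "mean_x": -1.0, "var_x": 1.0, "mean_y": -2.0, "cov_yx": 0.0},
    {"weight": 0.5, "mean_x": 1.0, "var_x": 1.0, "mean_y": 2.0, "cov_yx": 0.0},
]
converted = convert(0.0, model)
```

Because the output is a posterior-weighted average of the component regressions, inputs lying between components are pulled toward the middle of the target means, which is one way to see the over-smoothing effect the abstract mentions.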
