• Title/Summary/Keyword: Language Training

Search Result 689, Processing Time 0.025 seconds

Automatic Generation of Training Corpus for a Sentiment Analysis Using a Generative Adversarial Network (생성적 적대 네트워크를 이용한 감성인식 학습데이터 자동 생성)

  • Park, Cheon-Young;Choi, Yong-Seok;Lee, Kong Joo
    • Annual Conference on Human and Language Technology
    • /
    • 2018.10a
    • /
    • pp.389-393
    • /
    • 2018
  • 딥러닝의 발달로 기계번역, 대화 시스템 등의 자연언어처리 분야가 크게 발전하였다. 딥러닝 모델의 성능을 향상시키기 위해서는 많은 데이터가 필요하다. 그러나 많은 데이터를 수집하기 위해서는 많은 시간과 노력이 소요된다. 본 연구에서는 이미지 생성 모델로 좋은 성능을 보이고 있는 생성적 적대 네트워크(Generative adverasarial network)를 문장 생성에 적용해본다. 본 연구에서는 긍/부정 조건에 따른 문장을 자동 생성하기 위해 SeqGAN 모델을 수정하여 사용한다. 그리고 분류기를 포함한 SeqGAN이 긍/부정 감성인식 학습데이터를 자동 생성할 수 있는지 실험한다. 실험을 수행한 결과, 분류기를 포함한 SeqGAN 모델이 생성한 문장과 학습데이터를 혼용하여 학습할 경우 실제 학습데이터만 학습 시킨 경우보다 좋은 정확도를 보였다.

  • PDF

Listening Strategy Use of Korean EFL Middle School Students

  • Lee, Jung-Soo
    • English Language & Literature Teaching
    • /
    • v.17 no.1
    • /
    • pp.165-190
    • /
    • 2011
  • This research investigates listening strategies and the relationship between the employment of strategy and listening proficiency of Korean EFL middle school students. One hundred and four middle school students (N = 104) participated in this study and their strategy use was assessed through a questionnaire adapted from Oxford's (1990) SILL and O'Malley and Chamot (1990). To measure listening proficiency, the English Listening Ability Test designed by 15 city and provincial offices of education in Korea was used. The results show that students employed a moderate use of strategies; compensation strategies were used most frequently and metacognitive strategies were used the least frequently. Significant differences were found in the use of implicit strategy among different listening proficiency groups, but not in their use of behavioral strategy. Furthermore, there were significant differences in the use of implicit memory, cognitive and compensation strategies among groups of students with different listening proficiencies, but not in their use of metacognitive strategy. The results from multiple regression analysis indicate that implicit strategy use could play an important role in listening comprehension. The findings suggest the need for additional research to explore the effect of listening strategy training for English language learners.

  • PDF

Automated Call Routing Call Center System Based on Speech Recognition (음성인식을 이용한 고객센터 자동 호 분류 시스템)

  • Shim, Yu-Jin;Kim, Jae-In;Koo, Myung-Wan
    • Speech Sciences
    • /
    • v.12 no.2
    • /
    • pp.183-191
    • /
    • 2005
  • This paper describes the automated call routing for call center system based on speech recognition. We focus on the task of automatically routing telephone calls based on a users fluently spoken response instead of touch tone menus in an interactive voice response system. Vector based call routing algorithm is investigated and normalization method suggested. Call center database which was collected by KT is used for call routing experiment. Experimental results evaluating call-classification from transcribed speech are reported for that database. In case of small training data, an average call routing error reduction rate of 9% is observed when normalization method is used.

  • PDF

Development of English Speech Recognizer for Pronunciation Evaluation (발성 평가를 위한 영어 음성인식기의 개발)

  • Park Jeon Gue;Lee June-Jo;Kim Young-Chang;Hur Yongsoo;Rhee Seok-Chae;Lee Jong-Hyun
    • Proceedings of the KSPS conference
    • /
    • 2003.10a
    • /
    • pp.37-40
    • /
    • 2003
  • This paper presents the preliminary result of the automatic pronunciation scoring for non-native English speakers, and shows the developmental process for an English speech recognizer for the educational and evaluational purposes. The proposed speech recognizer, featuring two refined acoustic model sets, implements the noise-robust data compensation, phonetic alignment, highly reliable rejection, key-word and phrase detection, easy-to-use language modeling toolkit, etc., The developed speech recognizer achieves 0.725 as the average correlation between the human raters and the machine scores, based on the speech database YOUTH for training and K-SEC for test.

  • PDF

Bilingual Voice Conversion Using Frequency Warping on Formant Space (포만트 공간에서의 주파수 변환을 이용한 이중 언어 음성 변환 연구)

  • Chae, Yi-Geun;Yun, Young-Sun;Jung, Jin Man;Eun, Seongbae
    • Phonetics and Speech Sciences
    • /
    • v.6 no.4
    • /
    • pp.133-139
    • /
    • 2014
  • This paper describes several approaches to transform a speaker's individuality to another's individuality using frequency warping between bilingual formant frequencies on different language environments. The proposed methods are simple and intuitive voice conversion algorithms that do not use training data between different languages. The approaches find the warping function from source speaker's frequency to target speaker's frequency on formant space. The formant space comprises four representative monophthongs for each language. The warping functions can be represented by piecewise linear equations, inverse matrix. The used features are pure frequency components including magnitudes, phases, and line spectral frequencies (LSF). The experiments show that the LSF-based voice conversion methods give better performance than other methods.

Machine Scoring Methods Highly-correlated with Human Ratings in Speech Recognizer Detecting Mispronunciation of Foreign Language (한국인의 외국어 발화오류검출 음성인식기에서 청취판단과 상관관계가 높은 기계 스코어링 기법)

  • Bae, Min-Young;Kwon, Chul-Hong
    • Speech Sciences
    • /
    • v.11 no.2
    • /
    • pp.217-226
    • /
    • 2004
  • An automatic pronunciation correction system provides users with correction guidelines for each pronunciation error. For this purpose, we develop a speech recognition system which automatically classifies pronunciation errors when Koreans speak a foreign language. In this paper, we propose a machine scoring method for automatic assessment of pronunciation quality by the speech recognizer. Scores obtained from an expert human listener are used as the reference to evaluate the different machine scores and to provide targets when training some of algorithms. We use a log-likelihood score and a normalized log-likelihood score as machine scoring methods. Experimental results show that the normalized log-likelihood score had higher correlation with human scores than that obtained using the log-likelihood score.

  • PDF

Research on a Model of Extracting Persons' Information Based on Statistic Method and Conceptual Knowledge

  • Wei, XiangFeng;Jia, Ning;Zhang, Quan;Zang, HanFen
    • Proceedings of the Korean Society for Language and Information Conference
    • /
    • 2007.11a
    • /
    • pp.508-514
    • /
    • 2007
  • In order to extract some important information of a person from text, an extracting model was proposed. The person's name is recognized based on the maximal entropy statistic model and the training corpus. The sentences surrounding the person's name are analyzed according to the conceptual knowledge base. The three main elements of events, domain, situation and background, are also extracted from the sentences to construct the structure of events about the person.

  • PDF

ICAO Language Proficiency Requirements and the training results of Korea Air Traffic Controller

  • Choi, Youn-Chul;Moon, Woo-Choon
    • Journal of the Korean Society for Aviation and Aeronautics
    • /
    • v.16 no.2
    • /
    • pp.36-42
    • /
    • 2008
  • 영어는 1947년부터 국제민간항공기구에 의해 국제 항공공통어로 사용되기 시작하였다. 그러나 원어민을 제외한 대부분 국가의 조종사와 항공교통관제사는 항공영어로 인한 어려움을 토로하고 있으며 항공기 사고의 많은 부분도 항공영어를 사용하는 communication 문제로 발생하고 있다. 이 점을 인식한 ICAO에서는 2008년부터 항공영어의 등급을 제도화하여 비영어권 국가의 항공종사자에 대한 영어능력의 향상을 도모하고 있다. 본 연구는 관제사를 대상으로 한 항공영어교육의 결과를 SPSS11을 이용하여 분석하였다. 분석결과 교육기간과 교육시간이 영어성적 결과에 유의미한 차이를 보이며, 항공영어의 평가요소 6가지가 상호 유의미한 영향을 미치는 것으로 분석되었다.

  • PDF

GNI Corpus Version 1.0: Annotated Full-Text Corpus of Genomics & Informatics to Support Biomedical Information Extraction

  • Oh, So-Yeon;Kim, Ji-Hyeon;Kim, Seo-Jin;Nam, Hee-Jo;Park, Hyun-Seok
    • Genomics & Informatics
    • /
    • v.16 no.3
    • /
    • pp.75-77
    • /
    • 2018
  • Genomics & Informatics (NLM title abbreviation: Genomics Inform) is the official journal of the Korea Genome Organization. Text corpus for this journal annotated with various levels of linguistic information would be a valuable resource as the process of information extraction requires syntactic, semantic, and higher levels of natural language processing. In this study, we publish our new corpus called GNI Corpus version 1.0, extracted and annotated from full texts of Genomics & Informatics, with NLTK (Natural Language ToolKit)-based text mining script. The preliminary version of the corpus could be used as a training and testing set of a system that serves a variety of functions for future biomedical text mining.

A Comparative Study of Recognition Rate According to the Variance of Speech Bandwidth (대역폭 변화에 따른 음성 인식률 비교연구)

  • Sohn, Il-Hyun;Doh, Sam-Joo;Koo, Myoung-Wan
    • Annual Conference on Human and Language Technology
    • /
    • 1992.10a
    • /
    • pp.193-199
    • /
    • 1992
  • 이 논문에서는 123개 단어의 한국어 음성에 대하여 음성의 대역폭 변화에 따른 인식률을 비교하였다. 인식률 비교실험을 위해 hidden Markov model과 음소와 유사한 131개의 한국어 subword 유니트를 사용한 화자독립 격리단어 인식 시스팀을 사용하였다. 이 실험은 대역폭이 각각 0 - 4.5kHz 및 0.3 - 3.3kHz인 두가지 종류의 음성 데이타베이스를 사용하였다. 훈련과정에서 corrective training의 반복회수를 2로 하고 state transition duration 정보를 사용하였을 때, 0 - 4.5kHz 와 0.3 - 3.3kHz 대역폭에 대해 각각 98.8 % 및 98.2 % 의 최고 인식률을 얻었다. 이로부터 전화대역폭에서도 음성인식률은 크게 저하되지 않음을 알 수 있다.

  • PDF