• Title/Abstract/Keywords: language training

Search results: 685 items (processing time: 0.023 s)

Understanding recurrent neural network for texts using English-Korean corpora

  • Lee, Hagyeong;Song, Jongwoo
    • Communications for Statistical Applications and Methods / Vol. 27, No. 3 / pp.313-326 / 2020
  • Deep Learning is the most important key to the development of Artificial Intelligence (AI). There are several distinguishable architectures of neural networks, such as MLP, CNN, and RNN. Among them, we try to understand one of the main architectures, the Recurrent Neural Network (RNN), which differs from other networks in handling sequential data, including time series and texts. As one of the main recent tasks in Natural Language Processing (NLP), we consider Neural Machine Translation (NMT) using RNNs. We also summarize the fundamental structures of recurrent networks and some topics in representing natural words as reasonable numeric vectors. We organize the topics so as to follow the estimation procedure from representing input source sequences to predicting target translated sequences. In addition, we apply multiple translation models with Gated Recurrent Units (GRUs) in Keras to English-Korean sentences comprising about 26,000 sentence pairs in total from two different corpora, colloquialism and news. We verified some crucial factors that influence the quality of training. We found that loss decreases with more recurrent dimensions and with a bidirectional RNN in the encoder when dealing with short sequences. We also computed BLEU scores, the main measure of translation performance, and compared them with the scores from Google Translate on the same test sentences. We sum up some difficulties in training a proper translation model as well as in dealing with the Korean language. Using Keras in Python for all tasks, from processing raw texts to evaluating the translation model, also allows us to include some useful functions and vocabulary libraries.
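
The abstract above reports BLEU scores without saying which variant was computed. As a rough illustration only, the sketch below implements a single-reference, unsmoothed BLEU for one tokenized sentence pair: clipped n-gram precisions combined by a geometric mean, times a brevity penalty. The function names `ngram_counts` and `bleu` and the whitespace tokenization in the usage note are assumptions for this example, not the authors' code.

```python
import math
from collections import Counter


def ngram_counts(tokens, n):
    """Count the n-grams of order n in a list of tokens."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def bleu(candidate, reference, max_n=4):
    """Unsmoothed BLEU for one candidate and one reference (token lists)."""
    if not candidate:
        return 0.0
    log_precision_sum = 0.0
    for n in range(1, max_n + 1):
        cand = ngram_counts(candidate, n)
        ref = ngram_counts(reference, n)
        # Clipped matches: a candidate n-gram counts at most as often
        # as it occurs in the reference.
        matches = sum(min(count, ref[gram]) for gram, count in cand.items())
        total = sum(cand.values())
        if matches == 0 or total == 0:
            # Without smoothing, a single empty order zeroes the score.
            return 0.0
        log_precision_sum += math.log(matches / total)
    # Brevity penalty: penalize candidates shorter than the reference.
    if len(candidate) >= len(reference):
        brevity_penalty = 1.0
    else:
        brevity_penalty = math.exp(1.0 - len(reference) / len(candidate))
    return brevity_penalty * math.exp(log_precision_sum / max_n)
```

For example, `bleu("the cat sat on mat".split(), "the cat sat on the mat".split(), max_n=2)` combines a unigram precision of 5/5, a bigram precision of 3/4, and a brevity penalty of exp(-0.2). Published BLEU figures are usually corpus-level and often smoothed, so this sentence-level sketch would not reproduce them exactly.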

Health Service Delivery and Attitudes toward Multi-cultural Clients of Community Health Practitioners

  • 김진학;송민선
    • 가정간호학회지 / Vol. 23, No. 1 / pp.5-15 / 2016
  • Purpose: This study was conducted to evaluate health service delivery and attitudes toward multi-cultural clients among community health practitioners (CHPs). Methods: A survey was conducted among 242 CHPs from December 10 to 22, 2015. The collected data were analyzed with the chi-square test, t-test, and ANOVA using SPSS 18.0. Results: General awareness of multi-culturalism varied significantly by CHPs' age and language ability. Additionally, by location of community health centers (CHCs), service utilization was significantly higher in rural than in urban CHCs for post-partum maternal and neonatal caregiver services (in maternal and child health), management of health education programs and of physical exercise (in implementing a healthy lifestyle), and networking of resources inside and outside CHCs (in management of chronic disease). Conclusion: CHPs deliver health-care services to multi-cultural clients but have not received sufficient training or education to serve these clients effectively. CHPs who received multi-cultural and foreign language training had more positive experiences with multi-cultural clients. This supports the need to develop educational programs that enhance multi-cultural understanding among CHPs.

Unsupervised Semantic Role Labeling for Korean Adverbial Case

  • 김병수;이용훈;나승훈;김준기;이종혁
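    • 한국정보과학회 언어공학연구회:학술대회논문집(한글 및 한국어 정보처리) / Korean Institute of Information Scientists and Engineers, Language Engineering Research Society, 18th Conference on Hangul and Korean Information Processing (2006) / pp.32-39 / 2006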
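  • This paper addresses semantic role labeling, the problem of mapping syntactic relations to semantic relations in Korean information processing. For Korean, large training corpora are hard to obtain, and building one takes considerable time and effort. Therefore, instead of tagging a training corpus by hand, we build the training corpus automatically from a case frame dictionary and use a modified self-training algorithm that applies a simple probabilistic model and trains the model incrementally. In the experiments, the method reached an average accuracy of 81.81% on four adverbial case particles, and the modified self-training method improved on the existing method in both performance and running time.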
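
The abstract does not spell out the modified self-training algorithm, so the following is only a generic self-training sketch under stated assumptions: a small, non-empty seed set (standing in for examples derived from a case frame dictionary), a tiny naive Bayes model with add-one smoothing as the "simple probabilistic model", and a confidence threshold for promoting unlabeled examples. All names (`train`, `predict`, `self_train`), the feature tuples, and the role labels are hypothetical.

```python
from collections import Counter, defaultdict


def train(labeled):
    """Fit role priors and per-role feature counts from (features, role) pairs."""
    role_counts = Counter(role for _, role in labeled)
    feature_counts = defaultdict(Counter)
    for features, role in labeled:
        feature_counts[role].update(features)
    return role_counts, feature_counts


def predict(model, features, vocab_size):
    """Return (best_role, posterior) using naive Bayes with add-one smoothing."""
    role_counts, feature_counts = model
    total = sum(role_counts.values())
    scores = {}
    for role, count in role_counts.items():
        score = count / total
        denom = sum(feature_counts[role].values()) + vocab_size
        for feature in features:
            score *= (feature_counts[role][feature] + 1) / denom
        scores[role] = score
    best = max(scores, key=scores.get)
    return best, scores[best] / sum(scores.values())


def self_train(seed, unlabeled, threshold=0.8, max_rounds=10):
    """Grow the labeled set with confident predictions until none remain."""
    labeled, pool = list(seed), list(unlabeled)
    vocab = {f for feats, _ in labeled for f in feats}
    vocab |= {f for feats in pool for f in feats}
    for _ in range(max_rounds):
        model = train(labeled)  # retrain on everything labeled so far
        newly_labeled, remaining = [], []
        for features in pool:
            role, confidence = predict(model, features, len(vocab))
            if confidence >= threshold:
                newly_labeled.append((features, role))
            else:
                remaining.append(features)
        if not newly_labeled:
            break  # nothing confident enough; stop early
        labeled.extend(newly_labeled)
        pool = remaining
    return labeled, pool
```

With a seed such as two `(("e", "place"), "LOC")` and two `(("ro", "tool"), "INS")` examples, an unlabeled `("e", "place")` is promoted in the first round, while an ambiguous `("e", "tool")` stays in the pool because its posterior never reaches the threshold.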

Sentiment analysis of Korean movie reviews using XLM-R

  • Shin, Noo Ri;Kim, TaeHyeon;Yun, Dai Yeol;Moon, Seok-Jae;Hwang, Chi-gon
    • International Journal of Advanced Culture Technology / Vol. 9, No. 2 / pp.86-90 / 2021
  • Sentiment refers to a person's thoughts, opinions, and feelings toward an object. Sentiment analysis is the process of collecting opinions on a specific target and classifying them by emotion, and it applies to opinion mining, which analyzes product reviews and other reviews on the web. Companies and users can use it to grasp public opinion and decide how to respond. Recently, natural language processing models using the Transformer structure have appeared, and Google's BERT is a representative example. Afterwards, various models came out by remodeling BERT. Among them, the Facebook AI team unveiled XLM-R (XLM-RoBERTa), an upgraded XLM model. XLM-R solved the data limitation and the curse of multilinguality by training XLM on 2TB or more of refined CommonCrawl (CC) data rather than Wikipedia data. This model showed that a multilingual model can perform similarly to a single-language model when the model size and the training data are adjusted. Therefore, in this paper, we study the improvement of Korean sentiment analysis using a pre-trained XLM-R model, which solved the curse of multilinguality and improved performance.

Governmentality, Training, and Subjectivation in Mark Twain's A Connecticut Yankee in King Arthur's Court

  • 김혜진
    • 영어영문학 / Vol. 58, No. 4 / pp.679-700 / 2012
  • This study aims to examine Mark Twain's criticism of American capitalistic ideals in the late nineteenth century. During this second industrial revolution, industry grew rapidly and capitalism established an order, while America suffered under the monopolization of capitalistic conglomerates. This resulted in a widening gap between the rich and the poor and in the dehumanization caused by rapid industrialization. In A Connecticut Yankee in King Arthur's Court, Hank Morgan, the protagonist--who represents nineteenth-century America's industrialism, individualism, and capitalism--is sent back in time to the sixth century of Arthurian England. Hank attempts to introduce nineteenth-century technologies and machines to build a capitalistic system in the Middle Ages. However, Hank's efforts lead to a disaster in which the country and civilization he worked to build are completely destroyed. Although Twain does not deny capitalistic ideals, he criticizes the "governmentality" that drives Hank's reform system to the extreme. Hank values efficiency and utilizes human beings as capital. Hank's economic reason not only transforms the Round-Table knights into speculators but also transforms their religious acts and abstract ideals into moneymaking businesses. The destructive ending anticipates the World Wars and the Great Depression in the first half of the twentieth century and even serves to predict the dangers that follow.

Perceptual training on Korean obstruents for Vietnamese learners

  • 황효성
    • 말소리와 음성과학 / Vol. 15, No. 4 / pp.17-26 / 2023
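  • This study aims to determine how adult Vietnamese learners perceive Korean word-initial onset obstruents at each stage of learning, and whether perceptual training can correct their errors. To this end, perceptual training on Korean onset obstruents was conducted with 105 Vietnamese learners at beginner, intermediate, and advanced levels. The training materials were natural stimuli recorded by native speakers and were built to make active use of Korean minimal pairs. Learners in the experimental group completed five self-directed perceptual training sessions of 20-40 minutes each over about two weeks, while learners in the control group took part only in the pre-test and post-test. The results showed that perception of sounds that had been poorly distinguished before training improved considerably, and that not only beginners but also advanced learners benefited on sounds that had long resisted correction. Through this large-scale perceptual training, the study confirmed that perceptual training can play an important role in helping Vietnamese learners acquire the appropriate acoustic cues for distinguishing different Korean sounds.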

An Analysis on the Empathic Changing Process of the Members in Empathy Training Program

  • 김미영
    • 초등상담연구 / Vol. 7, No. 1 / pp.205-226 / 2008
  • The purpose of this study is to verify the effectiveness reported in existing quantitative research and to put the Empathy Training Program to practical use for participating children. To that end, it examines the changes in empathic understanding that emerge in teacher-child and child-child relationships. The research question was: What kinds of change can be seen in the empathic understanding of children participating in the Empathy Training Program? To answer it, six female sixth-grade elementary school students were chosen and completed twelve sessions of the Empathy Training Program. The children were given a sentence completion exam, recognition work, a neat writing exam, and a school adaptation exam both before and after the program, and the results served as data for analysis. For the analysis, participants first had one or two meetings of forty to fifty minutes each. Progress through the program's curriculum was recorded, and, using the repeating and copying method, instances of empathic language and behavior that revealed the children's empathic understanding were selected. Next, according to these criteria, I examined visible changes in the children's empathic expressions, classified and analyzed the changes in empathic understanding, and analyzed and synthesized six cases of common change in the empathic understanding shown in the participants' relationships. The findings are as follows. First, by the criterion of empathic language, each child showed understanding at the beginning and then progressed through stages of care, insight, and emotional expression.
Second, by the criterion of empathic behavior, gaze and the ability to concentrate attention were connected from beginning to end. Nodding, which looked like a brief nod at first, was by the end no longer a simple nod but conveyed deep empathy. Facial expressions came to match the situation and content, and at the very end the children were expressive, stretching out their arms to hold and pat the other person; holding hands was also observed. Of the many empathic behaviors, this final stage was shown by half of the children. Third, many cases were observed from the first stage to the last. The further the children progressed, the more complete their empathic language became. Their vocabulary grew and became more diverse, accompanied by empathic actions. Also, compared with the beginning, visible actions and expressions became more natural and sincere at the end. In sum, through the experience of receiving empathic understanding, participating children showed self-confidence and expressed themselves peacefully, without being aggressive or defensive about problems. In addition, by understanding empathic expressions, the children felt closer in their relationships. These outcomes suggest that children can use empathic understanding to solve their own problems and to form close relationships with their teachers and others. It will also contribute to smooth classroom management.

Grounded Theory Approach to the Procedure of International Students' Learning Korean

  • 김아영;강이화;김대현
    • 수산해양교육연구 / Vol. 21, No. 4 / pp.523-542 / 2009
  • The purpose of this study was to identify the process by which international students learn Korean. The research question was: What is the process of learning Korean for international students in Korean universities? To achieve this purpose, the study used semi-structured interviews. Nineteen international students participated. The 60-minute interviews with all participants were audio-recorded, and the collected data consisted of the transcripts of each interview. The study investigated the research question based on grounded theory, using open coding, axial coding, and selective coding. Results indicated that international students first learned everyday Korean and then adapted to academic Korean in their major fields. Both personality and mother tongue influenced Korean language learning, positively and negatively. International students' improvement in Korean was related to studying with Korean mass media such as TV soap dramas, talk shows, and songs. International students think that TOPIK (Test of Proficiency in Korean) is not closely related to their Korean language fluency. In conclusion, the researchers suggested placing more emphasis on academic training courses in Korean and improving TOPIK with respect to general academic Korean.

A Study on the Language Independent Dictionary Creation Using International Phoneticizing Engine Technology

  • 신좌철;우인성;강흥순;황인수;김석동
    • The Journal of the Acoustical Society of Korea / Vol. 26, No. 1E / pp.1-7 / 2007
  • One result of the trend towards globalization is an increased number of projects that focus on natural language processing. Automatic speech recognition (ASR) technologies, for example, hold great promise in facilitating global communications and collaborations. Unfortunately, to date, most research projects focus on single widely spoken languages. Therefore, the cost to adapt a particular ASR tool for use with other languages is often prohibitive. This work takes a more general approach. We propose an International Phoneticizing Engine (IPE) that interprets input files supplied in our Phonetic Language Identity (PLI) format to build a dictionary. IPE is language independent and rule based. It operates by decomposing the dictionary creation process into a set of well-defined steps. These steps reduce rule conflicts, allow for rule creation by people without linguistics training, and optimize run-time efficiency. Dictionaries created by the IPE can be used with the Sphinx speech recognition system. IPE defines an easy-to-use systematic approach that can lead to internationalization of automatic speech recognition systems.

A Study on Recognition of Citation Metadata using Bidirectional GRU-CRF Model based on Pre-trained Language Model

  • 지선영;최성필
    • 정보관리학회지 / Vol. 38, No. 1 / pp.221-242 / 2021
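  • In this study, we used a bidirectional gated recurrent unit (GRU) model and a conditional random field (CRF) model, built on a pre-trained language model, to automatically recognize the metadata that make up bibliographic references. The experimental data consist of 161,315 references extracted by rule-based analysis from 53,562 scholarly documents in PDF format, collected from 40 journals published in 2018. To build the experimental set, we also carried out work on automatically extracting reference metadata by analyzing the references in PDF-format scholarly documents. Through this study, we identified the language model with the highest performance, ran additional experiments on that model to compare recognition performance by training set size, and finally examined performance for each metadata type.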
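
In a GRU-CRF tagger of the kind this abstract describes, the recurrent layers typically produce a score per token and tag, and the CRF layer picks the best tag sequence by adding tag-to-tag transition scores, usually with Viterbi decoding. The sketch below shows only that decoding step in plain Python. The tag names, the score values in the usage note, and the function name `viterbi` are invented for illustration and do not come from the paper.

```python
def viterbi(emissions, transitions, tags):
    """Return the highest-scoring tag sequence.

    emissions: list (one entry per token) of dicts mapping tag -> score.
    transitions: dict mapping (previous_tag, current_tag) -> score.
    """
    if not emissions:
        return []
    # best[t][tag]: best score of any path that ends in `tag` at token t.
    best = [{tag: emissions[0][tag] for tag in tags}]
    backpointers = []
    for t in range(1, len(emissions)):
        scores, pointers = {}, {}
        for cur in tags:
            # Choose the previous tag that maximizes the path score so far.
            prev = max(tags, key=lambda p: best[-1][p] + transitions[(p, cur)])
            scores[cur] = best[-1][prev] + transitions[(prev, cur)] + emissions[t][cur]
            pointers[cur] = prev
        best.append(scores)
        backpointers.append(pointers)
    # Trace back from the best final tag.
    path = [max(tags, key=lambda tag: best[-1][tag])]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path
```

As a usage example, take tags `["AUTHOR", "TITLE"]`, emissions that prefer AUTHOR, TITLE, AUTHOR token by token (scores 3 vs 0, 0 vs 1, 3 vs 0), and a transition table that is zero everywhere except a penalty of -5 for TITLE followed by AUTHOR. Decoding then returns AUTHOR for all three tokens, because the penalized switch back costs more than the small emission gain on the middle token. This is the kind of sequence-level consistency a CRF layer adds over per-token decisions.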