• Title/Summary/Keyword: Korean language training

A study on Language Environment and Korean Language Education problems in Sakhalin, Russia (러시아 사할린 지역의 언어 환경과 한국어교육 문제 연구)

  • Cho, Hyun Yong;Lee, Sang Hyeok
    • Journal of Korean language education
    • /
    • v.23 no.1
    • /
    • pp.257-282
    • /
    • 2012
  • Sakhalin, Russia is a highly distinctive region for Korean language education. Decades of imposed separation and isolation mean that the Korean used in Sakhalin mixes South Korean, North Korean, Gyeongsang Province dialect, Japanese, and Russian. Close study of the language Sakhalin Koreans actually use is needed to support Korean language education there. This study makes five points: 1. The approach should differ for ethnic Koreans, foreigners, Korean Language Schools (Hangeul Hakgyo), and Korean classes in local schools. 2. Textbooks tailor-made for Sakhalin are required. 3. Korean textbooks that match local circumstances are needed; a base text should be written by a professor in a local Korean department and then supervised, revised, and supplemented by Korean language education researchers in Korea. 4. Expanded Korean training programs are needed; furthermore, if Korean and Russian university students are to study in Korea, programs such as dual degrees should be offered. 5. A methodical, comprehensive survey of overseas Korean communities such as Sakhalin is necessary; in the Russian Far East, ties among Vladivostok, Khabarovsk, and Sakhalin also need to be strengthened.

Korean language model construction and comparative analysis with Cross-lingual Post-Training (XPT) (Cross-lingual Post-Training (XPT)을 통한 한국어 언어모델 구축 및 비교 실험)

  • Suhyune Son;Chanjun Park;Jungseob Lee;Midan Shim;Sunghyun Lee;JinWoo Lee;Aram So;Heuiseok Lim
    • Annual Conference on Human and Language Technology
    • /
    • 2022.10a
    • /
    • pp.295-299
    • /
    • 2022
  • In low-resource language settings, there are limits to building the large corpora needed to pretrain a language model. This paper applies Cross-lingual Post-Training (XPT), a methodology that can overcome this limitation, and analyzes its efficiency for Korean, a comparatively low-resource language. Using only small Korean corpora of 400K and 4M, we conduct an overall performance comparison and analysis against various Korean pretrained models (KLUE-BERT, KLUE-RoBERTa, Albert-kor) and mBERT. Performance on Korean downstream tasks is evaluated with the KLUE benchmark, the representative Korean benchmark dataset; on five of the seven tasks, the XPT-4M model shows the best or second-best performance among the compared Korean language models. This demonstrates that XPT not only performs on par with Korean language models trained on far more data, but that its training process is also highly efficient.
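
As a rough illustration of the kind of KLUE evaluation this paper performs, the sketch below fine-tunes and scores a Korean pretrained model on one KLUE task with the Hugging Face datasets and transformers libraries. The checkpoint "klue/bert-base", the YNAT task, and all hyperparameter values are illustrative assumptions, not the paper's exact setup.

```python
# A minimal sketch (not the paper's exact pipeline): fine-tune a Korean
# pretrained model on the KLUE YNAT topic-classification task and report
# accuracy. Assumes the Hugging Face "klue" dataset and "klue/bert-base".
import numpy as np
from datasets import load_dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

dataset = load_dataset("klue", "ynat")
tokenizer = AutoTokenizer.from_pretrained("klue/bert-base")

def tokenize(batch):
    # YNAT classifies news headlines, stored in the "title" field.
    return tokenizer(batch["title"], truncation=True,
                     padding="max_length", max_length=128)

encoded = dataset.map(tokenize, batched=True)
model = AutoModelForSequenceClassification.from_pretrained(
    "klue/bert-base", num_labels=7)  # YNAT has 7 topic classes

def accuracy(eval_pred):
    logits, labels = eval_pred
    return {"accuracy": (np.argmax(logits, axis=-1) == labels).mean()}

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="out", num_train_epochs=3,
                           per_device_train_batch_size=32),
    train_dataset=encoded["train"],
    eval_dataset=encoded["validation"],
    compute_metrics=accuracy,
)
trainer.train()
print(trainer.evaluate())
```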

Hyperparameter experiments on end-to-end automatic speech recognition

  • Yang, Hyungwon;Nam, Hosung
    • Phonetics and Speech Sciences
    • /
    • v.13 no.1
    • /
    • pp.45-51
    • /
    • 2021
  • End-to-end (E2E) automatic speech recognition (ASR) has achieved promising performance gains with the introduction of the self-attention network, the Transformer. However, because of long training times and the number of hyperparameters, finding the optimal hyperparameter set is computationally expensive. This paper investigates the impact of hyperparameters in the Transformer network to answer two questions: which hyperparameters play a critical role in task performance, and which in training speed. The trained Transformer network combines encoder and decoder networks with Connectionist Temporal Classification (CTC). We trained the model on Wall Street Journal (WSJ) SI-284 and tested on dev93 and eval92. Seventeen hyperparameters were selected from the ESPnet training configuration, and a range of values was tried for each. The results show that the "num blocks" and "linear units" hyperparameters in the encoder and decoder networks reduce Word Error Rate (WER) significantly, and the performance gain is more prominent when they are altered in the encoder network. Training duration also increased linearly as the values of "num blocks" and "linear units" grew. Based on the experimental results, we collected the optimal value of each hyperparameter and reduced the WER by up to 2.9/1.9 on dev93 and eval92, respectively.
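
The two hyperparameters the study singles out, "num blocks" and "linear units", correspond to fields in an ESPnet2-style Transformer configuration. The fragment below renders such a configuration as a Python dict; the field names follow ESPnet2 conventions, but the values are placeholders rather than the paper's tuned optimum.

```python
# Illustrative ESPnet2-style Transformer ASR configuration rendered as a
# Python dict. "num_blocks" and "linear_units", the two hyperparameters
# the study found most influential, appear in both sub-networks; the
# values here are placeholders, not the paper's tuned set.
config = {
    "encoder": "transformer",
    "encoder_conf": {
        "num_blocks": 12,        # encoder depth: strongest effect on WER
        "linear_units": 2048,    # feed-forward width inside each block
        "attention_heads": 4,
        "output_size": 256,
    },
    "decoder": "transformer",
    "decoder_conf": {
        "num_blocks": 6,         # decoder depth: weaker effect than encoder
        "linear_units": 2048,
        "attention_heads": 4,
    },
    "model_conf": {
        "ctc_weight": 0.3,       # hybrid CTC/attention training objective
    },
}
```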

Personal Computer Based Aids to Navigation Training Simulator Using Virtual Reality Modeling Language

  • Yim, Jeong-Bin;Park, Sung-Hyeon;Jeong, Jung-Sik
    • Proceedings of KOSOMES biannual meeting
    • /
    • 2003.05a
    • /
    • pp.77-87
    • /
    • 2003
  • This paper describes a recently developed PC-based Aids to Navigation Training Simulator (AtoN-TS) using the Virtual Reality Modeling Language (VRML). The purpose of AtoN-TS is to train entry-level cadets and thereby reduce the amount of sea-time training required. The practical procedure for applying VR technology to implement AtoN-TS is presented, and a method for constructing the virtual waterway world according to the guidelines of the International Association of Lighthouse Authorities (IALA) is proposed. Design concepts and simulation experiments are also discussed. Results from trial tests and evaluation by subjective assessment provide practical insight into the importance of AtoN-TS.
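
To give a flavor of the VRML scene description such a simulator builds on, the sketch below emits a minimal VRML97 world containing a single red buoy from Python. The geometry, colors, and placement are invented for illustration and follow IALA buoyage only loosely; a real AtoN-TS world would model an entire waterway.

```python
# Hedged illustration only: write a minimal VRML97 world containing one
# red buoy. A real AtoN-TS scene would model a full waterway per IALA
# guidelines; every value below is an invented placeholder.
BUOY = """#VRML V2.0 utf8
Transform {{
  translation {x} 0 {z}
  children [
    Shape {{
      appearance Appearance {{ material Material {{ diffuseColor 1 0 0 }} }}
      geometry Cylinder {{ radius 0.5 height 2.0 }}
    }}
  ]
}}
"""

with open("port_hand_buoy.wrl", "w") as f:
    f.write(BUOY.format(x=12.0, z=-7.5))  # place the buoy in the channel
```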

A Training Feasibility Evaluation of Nuclear Safeguards Terms for the Large Language Model (LLM) (거대언어모델에 대한 원자력 안전조치 용어 적용 가능성 평가)

  • Sung-Ho Yoon
    • Proceedings of the Korean Society of Computer Information Conference
    • /
    • 2024.01a
    • /
    • pp.479-480
    • /
    • 2024
  • In this paper, we qualitatively evaluated the answers that an open large language model (LLM), additionally trained on nuclear safeguards terminology using a fine-tuning algorithm, produced for safeguards-related questions. The evaluation showed that, for questions within the scope of the training data, the fine-tuned model output answers that added low-level inference over the additional training data to the base model's answers. Directions for improving the additional training were derived from the evaluation, and the approach is expected to be useful for building low-cost, domain-specific language models.
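
The paper does not publish its training script, so the following is only a generic sketch of the low-cost fine-tuning workflow it describes: parameter-efficient additional training of an open LLM on domain text, here using Hugging Face transformers with LoRA via peft. The checkpoint "EleutherAI/polyglot-ko-1.3b" and the data file name are placeholder assumptions.

```python
# Generic sketch of low-cost domain fine-tuning with LoRA. The checkpoint
# and the data file are placeholders, not the paper's actual setup.
from datasets import load_dataset
from peft import LoraConfig, get_peft_model
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

base = "EleutherAI/polyglot-ko-1.3b"          # placeholder open Korean LLM
tokenizer = AutoTokenizer.from_pretrained(base)
tokenizer.pad_token = tokenizer.eos_token     # ensure a pad token exists
model = AutoModelForCausalLM.from_pretrained(base)

# Train only small low-rank adapters instead of all model weights.
model = get_peft_model(model, LoraConfig(
    task_type="CAUSAL_LM", r=8, lora_alpha=16,
    target_modules=["query_key_value"]))      # GPT-NeoX attention projection

# One domain sentence or term definition per line (hypothetical file name).
data = load_dataset("text", data_files={"train": "safeguards_terms.txt"})
train = data["train"].map(
    lambda b: tokenizer(b["text"], truncation=True, max_length=512),
    batched=True, remove_columns=["text"])

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="out", num_train_epochs=3,
                           per_device_train_batch_size=4),
    train_dataset=train,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
model.save_pretrained("safeguards-lora")      # saves adapters only
```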

A Basic Study on Maritime English Education and the Need for Raising the Instructor Profile

  • Davy, James G.;Noh, Chang-Kyun
    • Journal of Navigation and Port Research
    • /
    • v.34 no.7
    • /
    • pp.533-538
    • /
    • 2010
  • English is the accepted common working language of the maritime world, and competence in its use is essential to the safety of ships, their crews, and the marine environment. This paper responds to the urgent need for a suitable solution to the problem of providing maritime students with quality instruction in Maritime English. It shows what type of English instructor is best suited to help cadets gain at least a basic grasp of Maritime English communication, with a view to reaching the level required by STCW 95 within the shortest time. It presents ways that maritime institutes can develop their own qualified, or 'marinated', English instructors and what qualifications should be required. It concludes that further research, interviews, and questionnaires can clearly establish the language needs of universities and the shipping industry in Korea as a whole. By examining such data, present language education systems can be evaluated for efficacy and relevance, allowing 'best practice' to be established and implemented within the training institute. This will support well-informed decisions about how best to improve the language competencies of graduating cadets, creating a catalyst for the success of future seafarers while raising the image of the institute and of Korean shipping worldwide.

A study on implementation of courseware for Digital System Simulation and Circuit Synthesis (디지털 시스템의 시뮬레이션과 회로합성을 위한 코스웨어 구현에 관한 연구)

  • 이천우;김형배;강호성;박인정
    • Journal of the Korean Institute of Telematics and Electronics T
    • /
    • v.36T no.3
    • /
    • pp.94-100
    • /
    • 1999
  • In this paper, we implement courseware that integrates digital system analysis, design theory, hardware description language (HDL) training, and logic analysis. The work has two aspects: learning digital system analysis and design theory while simultaneously training in an HDL, and experimenting with the courseware itself. To teach the HDL, the courseware provides explanations with sound and moving images, guides set-up of a simulation or synthesis program, and runs simulations. We also propose an integrated system for HDL training and logic synthesis. The reliability of the tool was verified, and the Korea Computer Research Association confirmed that the implemented digital system courseware tool operates efficiently.

An Efficient Machine Learning-based Text Summarization in the Malayalam Language

  • P Haroon, Rosna;Gafur M, Abdul;Nisha U, Barakkath
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.16 no.6
    • /
    • pp.1778-1799
    • /
    • 2022
  • Automatic text summarization is a procedure that condenses a large amount of content into a shorter text that retains the significant information. Malayalam is one of the most difficult languages used in India, spoken most commonly in Kerala and in Lakshadweep. Natural language processing in Malayalam is relatively undeveloped owing to the complexity of the language and the scarcity of available resources. In this paper, an approach to text summarization of Malayalam documents is proposed that trains a model based on the Support Vector Machine (SVM) classification algorithm. Different features of the text are taken into account in training so that the system can extract the most important content from the input text. The classifier assigns sentences to four classes (most important, important, average, and least significant), and on this basis the machine creates a summary of the input document. The user can select a compression ratio, and the system outputs that fraction of the document as the summary. Model performance is measured on Malayalam documents of different genres as well as documents from a single domain, and is evaluated with the content-evaluation measures precision, recall, F-score, and relative utility. The precision and recall values obtained show that the model is trustworthy and more relevant than the other summarizers compared.
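
The core mechanism, an SVM that scores sentences plus a compression ratio that decides how many to keep, can be sketched compactly. The two features below are toy stand-ins for the paper's engineered feature set, and the four-class scheme is collapsed to a binary include/exclude label, so treat this as a schematic rather than the authors' system.

```python
# Schematic extractive summarizer in the paper's spirit: a linear SVM
# scores sentence importance and a user-chosen compression ratio decides
# how many sentences to keep. Features and training data are toy examples.
import numpy as np
from sklearn.svm import SVC

def features(sentences):
    # Toy features only: relative position in the document and word count.
    n = len(sentences)
    return np.array([[i / n, len(s.split())] for i, s in enumerate(sentences)])

# Assumed labeled data: 1 = belongs in the summary, 0 = does not.
train_sents = ["First sentence of a report.",
               "Some filler text in the middle.",
               "The key finding is stated here."]
train_labels = [1, 0, 1]
clf = SVC(kernel="linear").fit(features(train_sents), train_labels)

def summarize(sentences, ratio=0.3):
    scores = clf.decision_function(features(sentences))  # importance scores
    k = max(1, int(len(sentences) * ratio))              # compression ratio
    keep = sorted(np.argsort(scores)[-k:])               # keep document order
    return " ".join(sentences[i] for i in keep)

print(summarize(train_sents, ratio=0.5))
```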

The Effectiveness of a Comprehensive Language Teaching Program Using Web-Based Picture Books (웹 기반 그림동화 활용 포괄적 언어교수 프로그램의 효과)

  • Park, Soo Jin;Joo, Eun Hee
    • Korean Journal of Child Studies
    • /
    • v.27 no.4
    • /
    • pp.81-102
    • /
    • 2006
  • This study investigated the effects of a comprehensive language-teaching program using web-based picture books on young children's vocabulary and reading ability. The program was run for 9 weeks by a classroom teacher who had received in-service training for it. The language course for the 23 children in the control group consisted only of ordinary language activities using teacher-made picture cards. Test results analyzed by t-test showed that the 25 children in the experimental group gained more than the control group in reading attitude, including concept of reading, accuracy, verbal expression, participation, content, and originality. The ability to read a fairy tale aloud also increased more in the experimental group.

A Survey on Open Source based Large Language Models (오픈 소스 기반의 거대 언어 모델 연구 동향: 서베이)

  • Ha-Young Joo;Hyeontaek Oh;Jinhong Yang
    • The Journal of Korea Institute of Information, Electronics, and Communication Technology
    • /
    • v.16 no.4
    • /
    • pp.193-202
    • /
    • 2023
  • In recent years, the outstanding performance of large language models (LLMs) trained on extensive datasets has become a hot topic. Because LLM research is increasingly released as open source, the ecosystem is expanding rapidly: task-specific, lightweight, high-performing models are being actively disseminated by applying additional training techniques to pre-trained LLMs used as foundation models. On the other hand, the performance of LLMs on Korean is subpar, because English makes up a significant proportion of the training datasets of existing LLMs. Research is therefore being carried out on Korean-specific LLMs that are further trained on Korean language data. This paper identifies trends in open-source LLMs, introduces research on Korean-specific large language models, and describes the applications and limitations of large language models.