• Title/Summary/Keyword: language transfer

Knowledge Transfer in Multilingual LLMs Based on Code-Switching Corpora (코드 스위칭 코퍼스 기반 다국어 LLM의 지식 전이 연구)

  • Seonghyun Kim;Kanghee Lee;Minsu Jeong;Jungwoo Lee
    • Annual Conference on Human and Language Technology
    • /
    • 2023.10a
    • /
    • pp.301-305
    • /
    • 2023
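  • Recently introduced Large Language Models (LLMs) have achieved remarkable results in natural language processing, but the research has been predominantly English-centric, which limits them. This study explores the possibility of cross-lingual knowledge transfer in pre-trained LLMs, focusing on Korean. To this end, we built a code-switching corpus consisting of Korean and English and compared the performance of the base model, LLAMA-2, with that of a model further trained on the code-switching corpus. In the model trained with the proposed methodology, semantic information was transferred effectively between the two languages, and knowledge could be linked across them. This study is expected to contribute to multilingual LLM research that reflects diverse languages and cultures, and to the spread and democratization of AI technology that includes minority languages.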

AI-based language tutoring systems with end-to-end automatic speech recognition and proficiency evaluation

  • Byung Ok Kang;Hyung-Bae Jeon;Yun Kyung Lee
    • ETRI Journal
    • /
    • v.46 no.1
    • /
    • pp.48-58
    • /
    • 2024
  • This paper presents the development of language tutoring systems for non-native speakers that leverage advanced end-to-end automatic speech recognition (ASR) and proficiency evaluation. Given the frequent errors in non-native speech, high-performance spontaneous speech recognition must be applied. Our systems accurately evaluate pronunciation and speaking fluency and provide feedback on errors by relying on precise transcriptions. End-to-end ASR is implemented and enhanced by using diverse non-native speech data for model training. To improve performance, we combine semi-supervised and transfer learning techniques using labeled and unlabeled speech data. Automatic proficiency evaluation is performed by a model trained to maximize the statistical correlation between the fluency score manually assigned by a human expert and a calculated fluency score. We developed an English tutoring system for Korean elementary students called EBS AI Peng-Talk and a Korean tutoring system for foreigners called KSI Korean AI Tutor. Both systems were deployed by South Korean government agencies.
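
The abstract above says the proficiency model is trained to maximize the statistical correlation between human and calculated fluency scores. It does not say which correlation measure is used. As a minimal sketch, assuming Pearson correlation, the quantity could be computed as follows; the function name `pearson` and the sample scores are illustrative only and do not come from the paper.

```python
import math


def pearson(xs, ys):
    """Pearson correlation between two equal-length score lists.

    Assumption: the abstract only says "statistical correlation";
    Pearson's r is used here purely for illustration.
    """
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant scores")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical scores: a human rater versus a calculated fluency score.
human_scores = [1.0, 2.0, 3.0, 4.0, 5.0]
model_scores = [1.5, 2.5, 3.5, 4.5, 5.5]
r = pearson(human_scores, model_scores)
```

A value of `r` close to 1 means the calculated scores rank and space the speakers the same way the human rater does, even if the two scales differ by a constant offset, as in the sample data here.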

An Analysis of Hemisphere-cylindrical Shell Structure by Transfer Matrix Method (전달행렬법에 의한 반구 원통형 쉘구조의 해석)

  • 김용희;이윤영
    • Magazine of the Korean Society of Agricultural Engineers
    • /
    • v.45 no.4
    • /
    • pp.115-125
    • /
    • 2003
  • Shell structures are widely used in a variety of engineering applications, but mathematical solutions for shell structures are available only for a few special cases. The solution becomes more complicated when the structure is subject to conditions such as a Winkler foundation, among other problems. In this study, several simplified methods (the analogy of a beam on an elastic foundation, the finite element method, and the transfer matrix method) are applied to analyze a hemisphere-cylindrical shell structure on an elastic foundation. The transfer matrix method is used extensively for structural analysis because of its theoretical background and applicability. Therefore, this paper presents an analysis of a hemisphere-cylindrical shell structure based on the transfer matrix method. The technique is attractive for numerical solution by means of a computer program coded in FORTRAN with only a few elements. To demonstrate this, the method is shown to give good results that compare well with the finite element method.

INTONATION OF TAIWANESE: A COMPARATIVE OF THE INTONATION PATTERNS IN L1, IL, AND L2

  • Chin Chin Tseng
    • Proceedings of the KSPS conference
    • /
    • 1996.10a
    • /
    • pp.574-575
    • /
    • 1996
  • The current study examines the intonation of Taiwanese (Tw.) by comparing the intonation patterns in the native language (L1), the target language (L2), and the interlanguage (IL). Studies on interlanguage have dealt primarily with segments. Although some studies have addressed interlanguage intonation, more often than not they offered no evidence for their statements, and their hypotheses were based mainly on impression. Therefore, a formal description of interlanguage intonation is necessary for further development in this field. The basic assumption of this study is that native speakers of one language perceive and produce a second language in ways closely related to the patterns of their first language. Several studies on interlanguage prosody have suggested that prosodic structure and rules are more subject to transfer than certain other phonological phenomena, given their abstract structural nature and generality (Vogel 1991). Broselow (1988) also shows that interlanguage may provide evidence for particular analyses of the native language grammar that may not be available from the study of the native language alone. Several research questions are addressed in the current study: A. How does duration vary among native and non-native utterances? The results show that there is a significant difference in duration between the beginning English learners and the native speakers of American English for all eleven English sentences. The mean duration shows that the beginning English learners take almost twice as much time (1.70 sec.) as Americans (0.97 sec.) to produce English sentences. The results also show that American speakers take significantly longer to produce all ten Taiwanese utterances. The mean duration shows that Americans take almost twice as much time (2.24 sec.) as adult Taiwanese (1.14 sec.) to produce Taiwanese sentences. B. Does proficiency level influence the performance of interlanguage intonation? Can native intonation patterns be achieved by a non-native speaker? Wenk (1986) suggests that proficiency level may be a variable related to the extent of L1 influence. His study showed that beginners do transfer rhythmic features of the L1 and that advanced learners can and do succeed in overcoming mother-tongue influence. The current study shows that proficiency level does play a role in the acquisition of English intonation by Taiwanese speakers. The duration and pitch range of the advanced learners are much closer to those of the native American English speakers than those of the beginners are, but even advanced learners still cannot achieve native-like intonation patterns. C. Do Taiwanese have a narrower pitch range in comparison with American English speakers? Ross et al. (1986) suggest that the presence of tone in a language significantly inhibits the unrestricted manipulation of three acoustical measures of prosody which are involved in producing local pitch changes in the fundamental frequency contour during affective signaling. Will the presence of tone in a language inhibit the ability of speakers to modulate intonation? The results do show that Taiwanese have a narrower pitch range in comparison with American English speakers. Both advanced (84 Hz) and beginning learners (58 Hz) of English show a significantly narrower F0 range than that of Americans (112 Hz), and the difference is greater between the beginning learners' group and native American English speakers.

Testing the Validity of Crosslinguistic Influence in EFL Learning

  • Lee, Gun-Soo
    • English Language & Literature Teaching
    • /
    • no.6
    • /
    • pp.35-47
    • /
    • 2000
  • This study questions the validity of Crosslinguistic Influence (CLI) in EFL learning. A ten-minute grammaticality judgement test involving resumptive pronouns in English relative clauses was given to 15 female subjects. The research results, which were analysed in terms of language transfer and universalist arguments, support the existence of a universal process that guides L2 learning, and some common developmental patterns between the two processes of L1 and L2 learning. Hence, the universalist view should be given at least equal weight as the CLI approach.

Cross-language Transfer of Phonological Awareness and Its Relations with Reading and Writing in Korean and English (음운인식의 언어 간 전이와 한글 및 영어의 읽기 쓰기와의 관계)

  • Kim, Sangmi;Cho, Jeung-Ryeul;Kim, Ji-Youn
    • Korean Journal of Cognitive Science
    • /
    • v.26 no.2
    • /
    • pp.125-146
    • /
    • 2015
  • This study investigated the contribution of Korean phonological awareness to English phonological awareness and the relations of phonological awareness with reading and writing in Korean Hangul and English among Korean 5th graders. With age and vocabulary knowledge statistically controlled, Korean phonological awareness transferred to English phonological awareness. Specifically, syllable and phoneme awareness in Korean transferred to syllable awareness in English, and Korean phoneme awareness transferred to English phoneme awareness. In addition, English phoneme awareness independently explained significant variance in reading and writing in Korean and English after controlling for age and vocabulary. Syllable awareness in Korean and English explained Hangul reading and writing, respectively. The results suggest cross-language transfer of phonological awareness, a metalinguistic skill. Phoneme awareness is important for reading and writing in English, whereas both syllable and phoneme awareness are important for Korean literacy.

Secure Information Flow Analysis in Mini x86 Assembly Language (Mini x86 어셈블리어에서 보안 정보 흐름 분석)

  • Kim, Je Min;Kim, Ki Tae;Yoo, Weon Hee
    • Journal of Korea Society of Digital Industry and Information Management
    • /
    • v.5 no.3
    • /
    • pp.87-98
    • /
    • 2009
  • This paper discusses secure information flow analysis and its visualization. An information leak is defined as the existence of an information flow from variables that hold a user's private information to variables that anyone can access. Secure information flow analysis decides whether such a leak exists. Much research on secure information flow analysis concerns high-level programming languages. In practice, however, the programs that users execute are not accompanied by source code in a high-level language, so there is a need to analyze programs represented in a low-level language. Beyond the analysis itself, visualizing the analysis is also very important, so this paper also discusses visualization. In this paper, Mini x86 Assembly Language, a subset of x86 assembly language, is defined, and a secure information flow analysis of programs written in it is proposed. In addition, this paper defines the transfer function used for the analysis and shows how to visualize the control flow graph.
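
The leak definition in the abstract above (a flow from private variables to publicly accessible ones) can be illustrated with a toy taint analysis. This is only a sketch under stated assumptions. The two-instruction set (`mov`, `add`), the tuple encoding, and the names `transfer` and `find_leaks` are hypothetical and are not the paper's Mini x86 definition or its transfer function. Only explicit flows in straight-line code are tracked, with no branches or control flow graph.

```python
def transfer(state, instr):
    """Toy transfer function: map the taint state before an instruction
    to the taint state after it. `state` maps a name to True if it may
    hold private data. The instruction format is an assumption."""
    op, dst, src = instr
    new_state = dict(state)
    src_private = state.get(src, False)
    if op == "mov":
        # dst is overwritten, so it takes exactly the taint of src.
        new_state[dst] = src_private
    elif op == "add":
        # dst combines its old value with src, so taints are joined.
        new_state[dst] = state.get(dst, False) or src_private
    else:
        raise ValueError(f"unsupported instruction: {op}")
    return new_state


def find_leaks(program, private_vars, public_vars):
    """Return the public variables that may hold private data at the end."""
    state = {name: True for name in private_vars}
    for instr in program:
        state = transfer(state, instr)
    return sorted(v for v in public_vars if state.get(v, False))


# Hypothetical program: `secret` reaches `out` through eax and ebx.
program = [
    ("mov", "eax", "secret"),
    ("add", "ebx", "eax"),
    ("mov", "out", "ebx"),
    ("mov", "log", "ecx"),
]
leaks = find_leaks(program, {"secret"}, {"out", "log"})
```

Here `leaks` contains only `out`, because `log` receives its value from `ecx`, which never held private data. A realistic analysis would also have to handle branches, memory, and implicit flows.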

Building Specialized Language Model for National R&D through Knowledge Transfer Based on Further Pre-training (추가 사전학습 기반 지식 전이를 통한 국가 R&D 전문 언어모델 구축)

  • Yu, Eunji;Seo, Sumin;Kim, Namgyu
    • Knowledge Management Research
    • /
    • v.22 no.3
    • /
    • pp.91-106
    • /
    • 2021
  • With the recent rapid development of deep learning technology, the demand for analyzing huge volumes of text documents in the national R&D field from various perspectives is rapidly increasing. In particular, interest is growing in applying BERT (Bidirectional Encoder Representations from Transformers) language models that have been pre-trained on a large corpus. However, the terminology used frequently in highly specialized fields such as national R&D is often not sufficiently learned by basic BERT. This has been pointed out as a limitation in understanding documents in specialized fields through BERT. Therefore, this study proposes a method to build an R&D KoBERT language model that transfers national R&D field knowledge to basic BERT using further pre-training. In addition, to evaluate the performance of the proposed model, we performed classification analysis on about 116,000 R&D reports in the health care and information and communication fields. Experimental results showed that the proposed model achieved higher accuracy than the pure KoBERT model.

Performance Comparison and Error Analysis of Korean Bio-medical Named Entity Recognition (한국어 생의학 개체명 인식 성능 비교와 오류 분석)

  • Jae-Hong Lee
    • The Journal of the Korea institute of electronic communication sciences
    • /
    • v.19 no.4
    • /
    • pp.701-708
    • /
    • 2024
  • The advent of transformer architectures in deep learning has been a major breakthrough in natural language processing research. Named entity recognition is a branch of natural language processing and an important research area for tasks such as information retrieval. It is also important in the biomedical field, but the lack of Korean biomedical corpora for training has limited the development of Korean clinical research using AI. In this study, we built a new biomedical corpus for Korean biomedical named entity recognition and selected language models pre-trained on a large Korean corpus for transfer learning. We compared the recognition performance of the selected language models by F1-score and by recognition rate per tag, and analyzed the errors. In terms of recognition performance, KlueRoBERTa performed relatively well. The error analysis of the tagging process shows that recognition performance for Disease is excellent, but that for Body and Treatment is relatively low. This is due to over-segmentation and under-segmentation that fail to properly categorize entity names based on context, and it will be necessary to build a more precise morphological analyzer and a rich lexicon to compensate for the incorrect tagging.