• Title/Abstract/Keywords: target language

Search results: 476 items (processing time: 0.026 seconds)

Proposed Methodology for Building Korean Machine Translation Data sets Considering Phonetic Features

  • 장칭하오;양홍진;김세린;권혁철
    • 한국정보과학회 언어공학연구회: Conference Proceedings (Hangul and Korean Language Information Processing) / 한국정보과학회언어공학연구회, 34th Conference on Hangul and Korean Language Information Processing, 2022 / pp. 592-595 / 2022
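  • Sino-Korean words and loanwords make up a very large share of the Korean vocabulary: about 53% of everyday words and about 92% of technical terms. Sino-Korean words and loanwords are words used in Korea that came in under the influence of China or other countries. When the Hangul spelling and the original-language spelling of such words are pronounced, the two pronunciations turn out to be quite similar. For example, the Sino-Korean word 도서관 (图书馆) is pronounced thu.ʂu.kwan in Chinese, which is very close to how Koreans pronounce the word. This paper aims to build Korean-Chinese and Korean-English word-level machine translation datasets that take five phonetic features into account: Source Length, Source IPA Length, Target Length, Target IPA Length, and IPA Distance.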
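The five features named in the abstract can be illustrated with a minimal sketch. The abstract does not define how the features are computed, so everything here is an assumption: lengths are counted in Unicode characters with syllable dots removed, IPA Distance is taken to be plain Levenshtein edit distance, and the Korean transcription `to.sʌ.gwan` is an illustrative guess rather than data from the paper. The names `edit_distance`, `PhoneticFeatures`, and `phonetic_features` are hypothetical.

```python
from dataclasses import dataclass


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance between two strings (assumed metric, not the paper's)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


@dataclass
class PhoneticFeatures:
    source_length: int
    source_ipa_length: int
    target_length: int
    target_ipa_length: int
    ipa_distance: int


def phonetic_features(source: str, source_ipa: str,
                      target: str, target_ipa: str) -> PhoneticFeatures:
    """Compute the five features for one word pair; syllable dots are ignored."""
    src_ipa = source_ipa.replace(".", "")
    tgt_ipa = target_ipa.replace(".", "")
    return PhoneticFeatures(
        source_length=len(source),
        source_ipa_length=len(src_ipa),
        target_length=len(target),
        target_ipa_length=len(tgt_ipa),
        ipa_distance=edit_distance(src_ipa, tgt_ipa),
    )


# Target IPA is taken from the abstract; the source IPA is an assumed transcription.
features = phonetic_features("도서관", "to.sʌ.gwan", "图书馆", "thu.ʂu.kwan")
print(features)
```

A small IPA Distance relative to the IPA lengths would mark a pair as phonetically similar; how the paper actually thresholds or weights these features is not stated in the abstract.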


The Effects of Corpus Use on Learning L2 Collocations of Light Verbs and Nouns

  • Yoshiho Satake
    • 아시아태평양코퍼스연구 / Vol. 4, No. 2 / pp. 41-55 / 2023
  • In data-driven learning (DDL), learners explore a corpus to understand vocabulary and grammar. Although many studies have emphasized the role of DDL in second language (L2) acquisition, L2 light verbs have been largely under-explored. To bridge this gap, this study focused on the learning outcomes of L2 light verbs among 29 intermediate-level Japanese university students. The research focused on six prevalent light verbs in English: "make," "do," "take," "have," "give," and "get." Over nine weeks, the participants engaged with verb-noun collocations using worksheets that juxtaposed Japanese translations of the target collocations with their English equivalents, with the verbs omitted. With the aid of Wordbanks Online, they filled in the blanks and constructed accurate sentences. Before this activity, the participants were given a 20-minute tutorial on how to interpret concordance lines. The effectiveness of the DDL method was evaluated using pre-tests, immediate post-tests, and delayed post-tests. The results showed that DDL significantly improved the participants' knowledge of the target collocations of light verbs and nouns; the post-test and delayed post-test scores were significantly higher than the pre-test scores. Overall, DDL contributed to memorizing the collocations of light verbs and nouns; however, its effects differed across light verbs. The extent of work on the worksheet is not the only factor in the retention of a collocation, and observing concordance lines may promote learners' memorization of light-verb collocations.
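For readers unfamiliar with concordance lines, the following sketch shows the keyword-in-context (KWIC) layout that learners read in DDL. It is not Wordbanks Online output: the sentences are made up for illustration, and the function name `kwic` and the `width` parameter are hypothetical.

```python
import re


def kwic(sentences, keyword, width=30):
    """Return keyword-in-context lines with the keyword aligned in one column."""
    pattern = re.compile(rf"\b{re.escape(keyword)}\b", re.IGNORECASE)
    lines = []
    for sentence in sentences:
        for match in pattern.finditer(sentence):
            left = sentence[max(0, match.start() - width):match.start()]
            right = sentence[match.end():match.end() + width]
            # Right-align the left context so every keyword starts at the same column.
            lines.append(f"{left:>{width}} [{match.group()}] {right}")
    return lines


# Made-up example sentences, not corpus data.
SENTENCES = [
    "Learners often make a mistake with light verbs.",
    "You should make an effort to read concordance lines.",
    "They take a break after the tutorial.",
]

concordance = kwic(SENTENCES, "make")
for line in concordance:
    print(line)
```

Aligning the keyword in a single column lets a learner scan the nouns to its right, which is where the verb-noun collocations become visible.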

An Analysis of the Current State and Prospects of Arabic and Arab Culture in France: Focusing on the Paper by Sabhan Rabina Al-Baldhawe (Analysis about the actual situation of Arabic education and his culture in France and his view)

  • 정일영
    • 비교문화연구 / Vol. 25 / pp. 107-129 / 2011
  • This article examines the role of Arabic in France and analyzes its future, based on Sabhan Rabina Al-Baldhawe's paper published at University Paris 8 in 2007. The first part is a historical analysis: it examines how policy on Arabic and French became established in France, and notes trends in the use of Arabic in France in relation to the Maghreb countries and other Arab countries. The second part focuses on the situation of Arabic in French educational institutions and analyzes the most important reasons for learning Arabic as the target language: (1) to support the children of Maghreb immigrants who are able to speak French; (2) to relieve the psychological insecurity felt abroad because of differences in culture and language experience; and (3) to strengthen solidarity and resolve heterogeneity by using Arabic at home among family members. The third part looks at how the French education system teaches Arabic as the target language, with courses separated into elementary, middle, and high school levels. Finally, drawing on the concrete, survey-based examples in Sabhan Rabina Al-Baldhawe's paper, we try to find ways to revitalize Arabic in France. The long relationship between France and the Arab countries is evident in their history, and economic cooperation has made that relationship even closer. Many French people also see a need for Arabic education and take a positive stance toward children learning Arabic in French educational institutions. For French students, learning Arabic widens the range of future career choices. It is also true that many points in the Arabic curriculum still need to be addressed. Internally, Arabic is regarded as a minority language, so an active stance on research and analysis is required for Arabic language education in France to become more competitive; externally, the position of Arabic in France should be strengthened through more trade agreements on the economic side and through cultural exchange and international cooperation.

The effective use of literary text in English education

  • 한상택
    • 영어어문교육 / Vol. 7, No. 1 / pp. 179-208 / 2001
  • Using literary materials as resources for English learning, rather than as objects of literary study, can be a genuine tool for students to learn English through oral and written communication. This case study examined two applications: using a whole text across an entire course divided into pre-reading, while-reading, and post-reading activities, and using partial passages extracted from various texts for teaching objectives at many levels of difficulty. The study found that literary texts can be good materials for teaching the target language in an EFL setting. Students learning English as a foreign language with little linguistic competence may be limited at first, but they can soon accelerate their linguistic competence by reinforcing their literary competence through literary texts. To achieve a desired goal effectively when using literary texts as resources for language development, several concrete techniques should be introduced: teacher-guided question strategies that place central emphasis on the text itself, and problem-solving through student-centered activities. Process-based and open-ended activities should be presented in a variety of ways, matched to the teaching procedure, with careful selection of the texts.


Semantic Processing in Korean and English Word Production

  • 김효선;최원일;김충명;남기준
    • 대한음성학회: Conference Proceedings / 대한음성학회 2005 Fall Conference Proceedings / pp. 131-135 / 2005
  • Previous studies on bilinguals' lexical selection have offered some evidence in favor of the language-specific hypothesis. The purpose of this study was to see whether the Korean and English semantic systems of Korean-English bilinguals are shared or separate. In a series of picture-word interference tasks, participants named pictures in Korean or in English while distractor words were printed in either Korean or English. The distractor words were semantically identical, related, or unrelated to the picture, or absent. Naming was faster when the distractor words were semantically identical, both for same-language pairs (naming pictures in English/Korean with English/Korean distractor words) and for different-language pairs (naming pictures in English with Korean distractor words and vice versa). This facilitation effect was stronger when naming was produced in the native language, which in this case was Korean. In addition, an inhibitory effect appeared when the picture and its distractor word were semantically related, in both same- and different-language conditions. These results show that bilinguals' two lexicons compete to some extent when the target word is selected. From this viewpoint, it can be concluded that the lexicons of the two languages overlap partly, though not entirely, in bilinguals.


Enhancement of a language model using two separate corpora of distinct characteristics

  • 조세형;정태선
    • 한국지능시스템학회논문지 / Vol. 14, No. 3 / pp. 357-362 / 2004
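  • A language model raises recognition rates in tasks such as speech recognition and handwritten character recognition by predicting the next word. However, language models differ from domain to domain, and collecting a sufficiently large corpus for each domain is nearly impossible. To overcome the limits of a small corpus when building an N-gram language model, this paper presents a method that uses the statistics of two corpora: a small spoken-language corpus and a large written-language corpus. To validate the approach, a broadcast corpus of several hundred thousand words was combined with a newspaper corpus of more than several million words, which improved perplexity on broadcast scripts by 30%.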
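One common way to combine the statistics of two corpora is linear interpolation of their probability estimates. The abstract does not say which combination method the paper uses, so the sketch below is an assumption, and it uses add-one-smoothed unigram models instead of full N-grams to stay short. The toy corpora, the weight `lam`, and all function names are hypothetical.

```python
import math
from collections import Counter


def unigram_model(tokens, vocab):
    """Add-one-smoothed unigram probabilities over a shared vocabulary."""
    counts = Counter(tokens)
    total = len(tokens) + len(vocab)
    return {word: (counts[word] + 1) / total for word in vocab}


def interpolate(p_small, p_large, lam):
    """Mix two models: lam weights the small in-domain model."""
    return {word: lam * p_small[word] + (1 - lam) * p_large[word]
            for word in p_small}


def perplexity(model, tokens):
    """Perplexity of a unigram model on a token sequence."""
    log_prob = sum(math.log(model[word]) for word in tokens)
    return math.exp(-log_prob / len(tokens))


# Toy stand-ins for the small spoken corpus and the large written corpus.
spoken = "good evening here is the news tonight".split()
written = ("the government announced the budget today and "
           "the news agency reported the figures this evening").split()
test_tokens = "here is the evening news".split()

# A shared vocabulary keeps both distributions defined over the same words.
vocab = set(spoken) | set(written) | set(test_tokens)
p_spoken = unigram_model(spoken, vocab)
p_written = unigram_model(written, vocab)
mixed = interpolate(p_spoken, p_written, lam=0.7)

print("spoken only :", perplexity(p_spoken, test_tokens))
print("written only:", perplexity(p_written, test_tokens))
print("interpolated:", perplexity(mixed, test_tokens))
```

In practice the weight would be tuned on held-out in-domain text; the value 0.7 here is arbitrary.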

Vocabulary Education for Korean Beginner Level Using PWIM

  • 정연숙;이병운
    • 한국어교육 / Vol. 29, No. 3 / pp. 325-344 / 2018
  • The purpose of this study is to summarize PWIM (Picture Words Inductive Model), a learner-centered vocabulary teaching-learning model, and to suggest ways to implement it in Korean language education. Pictures used in Korean language education help visualize the specific shape, color, and texture of the target vocabulary, and thus help beginner learners recognize the meaning of the sound. Visual material stimulates the learner's intrinsic schema; it not only becomes a 'bridge' connecting the mother tongue and Korean, but also reduces the difficulty of learning a foreign language that arises from the ambiguity between meaning and sound in Korean and in all languages. PWIM shares the use of visual materials with existing learning methods. In earlier teacher-centered methods, however, learners only imitated the teacher, who showed a fragmentary, out-of-life photograph and taught the word. PWIM is a learner-centered method that stimulates learners to find vocabulary on their own by presenting visual information that reflects the context. This paper argues that PWIM is more suitable for beginner learners who are learning specific, concrete vocabulary such as personal identity (mainly objects), residence and environment, daily life, shopping, health, climate, and traffic. The study also aimed to develop a method of using PWIM, and teaching procedures, suitable for Korean language learners. The researchers rearranged the procedures from previous research into three steps: brainstorming and word organization, generalization of the semantic and morphological rules of the extracted words, and application of the words. All three steps can be carried out at once, or they can be divided and taught at different times. It is expected that the PWIM teaching-learning method, which uses realistic visual materials, will enable teachers and learners to make an effective class together.

GARDIAN: A Rule-Based Modeling Evaluation and Support Tool for a Real-Time Embedded Software Development Methodology (GARDIAN: Rule Based Modeling Validation for Concurrent Object Modeling and Architectural Design mEThod(COMET))

  • 김순태;김진태;박수용
    • 한국정보과학회논문지:소프트웨어및응용 / Vol. 34, No. 8 / pp. 721-730 / 2007
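  • UML (Unified Modeling Language) is widely used in most software development methodologies to analyze and design the target software, and the target software is built from artifacts written in UML. However, because a methodology's modeling guidelines are usually written in natural language, it is difficult to verify whether a model of the target software properly follows them. This paper targets COMET, a methodology for real-time embedded systems, and proposes GARDIAN, a rule-based framework for evaluating COMET guidelines that expresses the modeling guidelines and evaluates models against them. To verify the usefulness of the proposed framework, it was applied to the navigation system of an intelligent robot that non-experts had analyzed and designed using UML.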
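The idea of turning natural-language modeling guidelines into machine-checkable rules can be sketched as follows. This is not GARDIAN's rule language or the actual COMET guidelines: the two rules, the `UmlClass` structure, and the sample model are all hypothetical and chosen only to show the pattern of evaluating a model against explicit rules.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class UmlClass:
    """Tiny stand-in for a class element in a UML model."""
    name: str
    stereotype: str = ""
    operations: List[str] = field(default_factory=list)


@dataclass
class Rule:
    """A guideline expressed as a predicate that a model element must satisfy."""
    rule_id: str
    description: str
    check: Callable[[UmlClass], bool]


# Hypothetical guidelines, invented for illustration only.
RULES = [
    Rule("R1", "every class declares a stereotype",
         lambda c: bool(c.stereotype)),
    Rule("R2", "a control class exposes at least one operation",
         lambda c: c.stereotype != "control" or len(c.operations) > 0),
]


def validate(model: List[UmlClass], rules: List[Rule]) -> List[Tuple[str, str]]:
    """Return (class name, rule id) for every violated rule."""
    return [(cls.name, rule.rule_id)
            for cls in model
            for rule in rules
            if not rule.check(cls)]


# Made-up model fragment.
model = [
    UmlClass("Navigator", "control"),
    UmlClass("Sensor"),
    UmlClass("MotorDriver", "entity", ["start"]),
]

violations = validate(model, RULES)
for class_name, rule_id in violations:
    print(f"{class_name} violates {rule_id}")
```

Keeping each rule as data (an id, a description, and a predicate) means new guidelines can be added without changing the validation loop.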

A Preliminary Report on Perceptual Resolutions of Korean Consonant Cluster Simplification and Their Possible Change over Time

  • Cho, Tae-Hong
    • 말소리와 음성과학 / Vol. 2, No. 4 / pp. 83-92 / 2010
  • The present study examined how listeners of Seoul Korean would recover deleted phonemes in consonant cluster simplification. In a phoneme monitoring experiment, listeners had to monitor for C2 (/k/ or /p/) in C1C2C3 when C2 was deleted (C1 was preserved) or preserved (C1 was deleted). The target consonant (C2) was either /k/ or /p/ (e.g., ilk-təlato vs. palp-təlato), and there were two listener groups, one tested in 2002 and the other in 2009. Several points emerged from the results. First, listeners were able to detect deleted phonemes as accurately and rapidly as preserved phonemes, showing that the physical presence of the acoustic information did not improve the listeners' performance. This suggests that listeners must have relied on language-specific phonological knowledge about consonant cluster simplification, rather than on low-level acoustic-phonetic information. Second, the listener groups (participants in 2002 vs. 2009) differed in processing /p/ versus /k/: listeners in 2009 failed to detect /p/ more frequently than those in 2002, suggesting that the way the consonant cluster sequence is produced and perceived has changed over time. This result was interpreted as coming from statistical patterns of speech production in contemporary Seoul Korean as reported in a recent study by Cho & Kim (2009): /p/ is deleted far more often than /p/ is preserved, which is likely reflected in the way listeners process simplified variants. Finally, listeners processed /k/ more efficiently than /p/, especially when the target was physically present (in C-preserved condition), indicating that listeners benefited more from the presence of /k/ than of /p/. This was interpreted as supporting the view that velars are perceptually more robust than labials, which constrains the shaping of the language's phonological patterns. These results were then discussed in terms of their implications for theories of spoken word recognition.


News Data Analysis Using Acoustic Model Output of Continuous Speech Recognition

  • 이경록
    • 한국콘텐츠학회논문지 / Vol. 6, No. 10 / pp. 9-16 / 2006
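  • This paper analyzes news data using the acoustic model output of continuous speech recognition. The news database used in the experiments consists of 2,093 articles. Existing Korean continuous speech recognition performs poorly because of its weak language model, which makes it unsuitable for news data analysis. To compensate, this paper post-processes the recognition results of the comparatively robust acoustic model to create a keyword information file. With the acoustic model output-level threshold set to 100, 86.9% of all target morphemes were recognized. Applying length-based normalization under the same conditions yielded 81.25%. The purpose of the normalization is to compensate for long morphemes. In the experiments, the recognition rate for target morphemes was 75.13%. In addition, a 314 MB keyword information file was produced from 5,040 MB of news data, a 93.8% reduction in the absolute amount of information.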
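The reported size reduction can be checked directly from the two file sizes in the abstract. The variable names below are illustrative; only the two sizes come from the source.

```python
# Sizes reported in the abstract, in megabytes.
news_data_mb = 5040
keyword_file_mb = 314

# Relative reduction: (original - reduced) / original.
reduction_pct = (news_data_mb - keyword_file_mb) / news_data_mb * 100
print(f"Reduction: {reduction_pct:.1f}%")
```

The result rounds to 93.8%, matching the figure in the abstract.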
