• Title/Abstract/Keyword: Korean Language Model

Search results: 1,570

다중로봇을 위한 관리제어 시스템의 설계 (A design of supervisory control system for a multi-robot system)

  • 서일홍;여희주;김재현;류종석;오상록
    • 대한전기학회논문지
    • /
    • Vol. 45, No. 1
    • /
    • pp.100-112
    • /
    • 1996
  • This paper presents a design experience of a control language for the coordination of a multi-robot system. To program job commands effectively, a Petri-net-type Graphical Robot Language (PGRL) is proposed, in which coordination functions among tasks, such as concurrency and synchronization, can be easily programmed. In our system, the proposed task commands of PGRL are implemented by employing formal model languages composed of three modules: a sensory module, a data handling module, and an action module. It is expected that, by using the proposed PGRL and formal languages, one can easily describe a job or task and hence effectively operate a complex real-time, concurrent system. The control system is being implemented on VME-based 32-bit microprocessor boards for the supervisory controller and each module controller (arm, hand, leg, and sensor data processing modules), running a real-time multi-tasking operating system (VxWorks).
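
  • The coordination functions named above (concurrency and synchronization among robot tasks) can be illustrated with a tiny Petri-net interpreter. The sketch below is a hypothetical Python example, not the paper's PGRL notation or its VxWorks implementation: a fork transition starts two module tasks concurrently and a join transition synchronizes them before the job completes.

    # Minimal Petri-net sketch (hypothetical; not the paper's PGRL):
    # places hold tokens, and a transition fires only when all of its
    # input places are marked, which gives fork/join synchronization.

    class PetriNet:
        def __init__(self, places):
            self.marking = dict(places)       # place name -> token count
            self.transitions = {}             # name -> (input places, output places)

        def add_transition(self, name, inputs, outputs):
            self.transitions[name] = (inputs, outputs)

        def enabled(self, name):
            inputs, _ = self.transitions[name]
            return all(self.marking[p] > 0 for p in inputs)

        def fire(self, name):
            inputs, outputs = self.transitions[name]
            if not self.enabled(name):
                raise RuntimeError(f"transition {name} is not enabled")
            for p in inputs:
                self.marking[p] -= 1
            for p in outputs:
                self.marking[p] += 1

    # Job: a fork starts concurrent arm and hand tasks; a join waits for both.
    net = PetriNet({"job_start": 1, "arm_ready": 0, "hand_ready": 0,
                    "arm_done": 0, "hand_done": 0, "job_done": 0})
    net.add_transition("fork",      ["job_start"], ["arm_ready", "hand_ready"])
    net.add_transition("arm_task",  ["arm_ready"], ["arm_done"])
    net.add_transition("hand_task", ["hand_ready"], ["hand_done"])
    net.add_transition("join",      ["arm_done", "hand_done"], ["job_done"])

    for t in ["fork", "arm_task", "hand_task", "join"]:
        net.fire(t)
    print(net.marking["job_done"])            # 1: both concurrent tasks synchronized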

An Efficient Machine Learning-based Text Summarization in the Malayalam Language

  • P Haroon, Rosna;Gafur M, Abdul;Nisha U, Barakkath
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • Vol. 16, No. 6
    • /
    • pp.1778-1799
    • /
    • 2022
  • Automatic text summarization is a procedure that condenses a large amount of content into a shorter text that retains the significant information. Malayalam is one of the more difficult languages used in certain areas of India, most commonly in Kerala and in Lakshadweep. Natural language processing work in Malayalam is relatively limited due to the complexity of the language as well as the scarcity of available resources. In this paper, an approach is proposed for summarizing Malayalam documents by training a model based on the Support Vector Machine (SVM) classification algorithm. Different features of the text are taken into account for training so that the system can output the most important content from the input text. The classifier assigns sentences to most important, important, average, and least significant classes, and based on this the system creates a summary of the input document. The user can select a compression ratio so that the system outputs that fraction of the text as the summary. Model performance is measured using Malayalam documents of different genres as well as documents from the same domain. The model is evaluated with the content evaluation measures precision, recall, F-score, and relative utility. The obtained precision and recall values show that the model is trustworthy and more relevant compared to other summarizers.
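
  • As a rough illustration of the sentence-classification idea above, the sketch below trains an SVM on toy sentence features, scores each sentence, and keeps the top fraction given by a compression ratio. The features, labels, and two-class setup are simplified assumptions, not the paper's Malayalam features or its four-class scheme.

    # Hedged sketch of SVM-based extractive summarization (not the paper's pipeline).
    from sklearn.svm import SVC

    def sentence_features(idx, sent, n_sents):
        words = sent.split()
        return [
            len(words),                        # sentence length
            idx / max(n_sents - 1, 1),         # relative position in the document
            sum(c.isdigit() for c in sent),    # digit count (numbers often matter)
        ]

    def summarize(sentences, clf, compression_ratio=0.3):
        feats = [sentence_features(i, s, len(sentences)) for i, s in enumerate(sentences)]
        scores = clf.decision_function(feats)  # graded importance per sentence
        keep = max(1, int(len(sentences) * compression_ratio))
        top = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)[:keep]
        return [sentences[i] for i in sorted(top)]   # preserve original order

    # Toy training set with two importance classes (the paper uses four).
    train_sents = ["Revenue grew 12 percent in 2021.", "The meeting was pleasant.",
                   "Costs fell by 8 percent.", "Coffee was served afterwards."]
    labels = [1, 0, 1, 0]
    X = [sentence_features(i, s, len(train_sents)) for i, s in enumerate(train_sents)]
    clf = SVC(kernel="linear").fit(X, labels)

    print(summarize(train_sents, clf, compression_ratio=0.5))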

Zero-shot learning 기반 대규모 언어 모델 한국어 품질 비교 분석 (Comparative analysis of large language model Korean quality based on zero-shot learning)

  • 허윤아;소아람;이태민;신중민;박정배;박기남;안성민;임희석
    • 한국정보과학회 언어공학연구회:학술대회논문집(한글 및 한국어 정보처리)
    • /
    • 한국정보과학회언어공학연구회 2023년도 제35회 한글 및 한국어 정보처리 학술대회
    • /
    • pp.722-725
    • /
    • 2023
  • A large language model (LLM) is a deep learning algorithm that can recognize, summarize, translate, predict, and generate text and various other content based on knowledge learned from large-scale data. Early publicly released LLMs were English-based models from which high performance could not be expected in non-English settings, and as a result independent LLM research and development has become active in Korea, China, and elsewhere. In this paper, to examine whether language affects LLM performance, we compare a Korean-based LLM and an English-based LLM on four KoBEST tasks. The results confirm that adding prior knowledge of Korean affects LLM performance.
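
  • A minimal sketch of the kind of zero-shot comparison described above: each item is turned into a prompt with no in-context demonstrations, and the generated answer is compared with the gold label. The model name, the prompt template, and the toy item are placeholders, not the KoBEST setup or the models compared in the paper.

    # Hedged zero-shot evaluation sketch with Hugging Face transformers
    # (model name, prompt, and data are illustrative placeholders).
    from transformers import pipeline

    generator = pipeline("text-generation", model="skt/kogpt2-base-v2")  # placeholder Korean LM

    def zero_shot_answer(premise, question):
        # Zero-shot: the model sees only the instruction, no worked examples.
        prompt = f"문장: {premise}\n질문: {question}\n답변(예/아니오):"
        out = generator(prompt, max_new_tokens=3, do_sample=False)[0]["generated_text"]
        answer = out[len(prompt):].strip()
        return "예" if answer.startswith("예") else "아니오"

    examples = [  # toy stand-ins for KoBEST-style yes/no items
        {"premise": "고양이가 소파 위에서 자고 있다.", "question": "고양이는 자고 있는가?", "label": "예"},
    ]
    correct = sum(zero_shot_answer(e["premise"], e["question"]) == e["label"] for e in examples)
    print(f"zero-shot accuracy: {correct / len(examples):.2f}")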

자기 지도 학습 기반의 언어 모델을 활용한 다출처 정보 통합 프레임워크 (Multi-source information integration framework using self-supervised learning-based language model)

  • 김한민;이정빈;박규동;손미애
    • 인터넷정보학회논문지
    • /
    • Vol. 22, No. 6
    • /
    • pp.141-150
    • /
    • 2021
  • With the use of artificial intelligence (AI) technology, AI-enabled warfare is expected to become the core of future warfare. Natural language processing is a key part of this AI technology and can dramatically reduce the burden on commanders and staff of opening and checking, one by one, reports, intelligence, and information written in natural language. In this paper, we propose a Language model-based Multi-source Information Integration (LAMII) framework to reduce the information processing burden of commanders and staff and to support rapid command decisions. The proposed LAMII framework consists of two key steps: representation learning based on a language model trained with self-supervised learning, and document integration using an autoencoder. In the first step, representation learning is performed with a self-supervised learning technique so that similarity relations between two structurally heterogeneous sentences can be identified. In the second step, the trained model is used to find documents from multiple sources that contain similar content or topics and to integrate them. At this point, an autoencoder is used to measure sentence redundancy in order to remove duplicated sentences. To demonstrate the merit of this work, we conducted comparative experiments predicting similarity relations between heterogeneous sentences using existing language models together with representative benchmark sets used to evaluate them. The experimental results show that the proposed LAMII framework predicts similarity relations between heterogeneous sentence structures more effectively than the other language models.
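
  • The two steps of the framework can be sketched roughly as follows: a self-supervised sentence encoder maps structurally different sentences into one embedding space, and an autoencoder trained on one source's embeddings flags sentences from another source as redundant when they reconstruct with low error. The encoder name, the MLP stand-in for the autoencoder, and the threshold are assumptions for illustration, not the authors' implementation.

    # Rough sketch of the two LAMII-style steps (placeholder models and threshold).
    import numpy as np
    from sentence_transformers import SentenceTransformer
    from sklearn.neural_network import MLPRegressor

    # Step 1 stand-in: a self-supervised sentence encoder gives comparable
    # embeddings for structurally heterogeneous sentences.
    encoder = SentenceTransformer("paraphrase-multilingual-MiniLM-L12-v2")  # placeholder

    def integrate(source_a, source_b, redundancy_threshold=0.01):
        emb_a = encoder.encode(source_a)
        emb_b = encoder.encode(source_b)
        # Step 2 stand-in: an autoencoder (an MLP trained to reconstruct the
        # embeddings of source A) measures redundancy; a source-B sentence that
        # reconstructs well is close to content already present in source A.
        autoencoder = MLPRegressor(hidden_layer_sizes=(64,), max_iter=1000).fit(emb_a, emb_a)
        errors = np.mean((autoencoder.predict(emb_b) - emb_b) ** 2, axis=1)
        novel = [s for s, err in zip(source_b, errors) if err > redundancy_threshold]
        return list(source_a) + novel          # integrated text without duplicates

    merged = integrate(
        ["Enemy vehicles were observed near the bridge at 06:00.",
         "Two platoons crossed the river overnight."],
        ["At 06:00 enemy vehicles were seen close to the bridge.",
         "The supply convoy is delayed by two hours."],
    )
    print(merged)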

N-gram 기반의 유사도를 이용한 대화체 연속 음성 언어 모델링 (Spontaneous Speech Language Modeling using N-gram based Similarity)

  • 박영희;정민화
    • 대한음성학회지:말소리
    • /
    • No. 46
    • /
    • pp.117-126
    • /
    • 2003
  • This paper presents our language model adaptation for Korean spontaneous speech recognition. Compared with written text corpora, Korean spontaneous speech exhibits various characteristics of content and style, such as filled pauses, word omission, and contraction. Our approach focuses on improving the estimation of domain-dependent n-gram models by relevance weighting of out-of-domain text data, where style is represented by n-gram based tf*idf similarity. In addition to relevance weighting, we use disfluencies as predictors of the neighboring words. The best result reduces the word error rate by 9.7% relative and shows that n-gram based relevance weighting reflects style differences well and that disfluencies are good predictors.
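
  • The central step above, weighting out-of-domain text by its n-gram tf*idf similarity to the in-domain data before re-estimating n-gram counts, can be sketched as below. The toy corpora and the exact weighting are simplified assumptions, not the paper's formulation.

    # Hedged sketch of n-gram tf*idf relevance weighting for LM adaptation.
    import math
    from collections import Counter

    def ngrams(tokens, n=2):
        return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

    def tfidf_vector(tokens, idf, n=2):
        tf = Counter(ngrams(tokens, n))
        return {g: c * idf.get(g, 0.0) for g, c in tf.items()}

    def cosine(u, v):
        dot = sum(w * v[g] for g, w in u.items() if g in v)
        nu = math.sqrt(sum(w * w for w in u.values()))
        nv = math.sqrt(sum(w * w for w in v.values()))
        return dot / (nu * nv) if nu and nv else 0.0

    # In-domain spontaneous-speech sentences vs. out-of-domain written documents.
    in_domain = [["uh", "i", "mean", "let", "s", "go"], ["well", "you", "know", "right"]]
    out_domain = [["the", "government", "announced", "a", "new", "policy"],
                  ["well", "you", "know", "i", "mean", "it", "is", "fine"]]

    docs = in_domain + out_domain
    df = Counter(g for d in docs for g in set(ngrams(d)))
    idf = {g: math.log(len(docs) / c) for g, c in df.items()}
    in_vec = tfidf_vector([t for d in in_domain for t in d], idf)

    # Each out-of-domain document contributes n-gram counts scaled by its
    # similarity to the in-domain style, so stylistically close text dominates.
    weighted_counts = Counter()
    for d in out_domain:
        weight = cosine(tfidf_vector(d, idf), in_vec)
        for g in ngrams(d):
            weighted_counts[g] += weight
    print(weighted_counts.most_common(3))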

대화체 연속음성 인식을 위한 언어모델 적응 (Language Model Adaptation for Conversational Speech Recognition)

  • 박영희;정민화
    • 대한음성학회:학술대회논문집
    • /
    • 대한음성학회 2003년도 5월 학술대회지
    • /
    • pp.83-86
    • /
    • 2003
  • This paper presents our style-based language model adaptation for Korean conversational speech recognition. Compared with written text corpora, Korean conversational speech exhibits various characteristics of content and style, such as filled pauses, word omission, and contraction. For style-based language model adaptation, we report two approaches. They focus on improving the estimation of domain-dependent n-gram models by relevance weighting of out-of-domain text data, where style is represented by n-gram based tf*idf similarity. In addition to relevance weighting, we use disfluencies as predictors of the neighboring words. The best result reduces the word error rate by 6.5% absolute and shows that n-gram based relevance weighting reflects style differences well and that disfluencies are good predictors.

무속 공간모형에 의한 남사마을 공간 해석에 관한 연구 (An Interpretative Study on the Nam-Sa Village Space by Shamanistic Space Model)

  • 김동찬;이윤수;임상재
    • 한국조경학회지
    • /
    • Vol. 27, No. 2
    • /
    • pp.95-107
    • /
    • 1999
  • Shamanism is an ancient culture that is also regarded as a religious rite by most people, so it is an important part of Korean tradition and should be a significant basis for theories of Korean exterior space organization. However, in the field of landscape architecture the principle of exterior space organization has not yet been clearly identified as shamanistic. We therefore believe that this study can present a model for the study of shamanistic space language and its application to one of Korea's villages, Namsa. The results of this study are summarized below. 1. The extracted models are unspecialized, circular, and continuous space, analyzed on the basis of the shamanistic space language. Shamanistic space languages are grounded in the common Korean belief in eternal human identity and a circular view of the world. 2. Applying the shamanistic space models to Namsa village shows that they follow the Korean space organization principle. Some areas of the village do not fit, because they were built on the structure of the social hierarchy between families or the difference between head households and collateral households. 3. Overall, applying the shamanistic space model to Namsa village shows that it follows the Korean space organization principle; therefore we can say that Namsa village was built on a shamanistic system that pursued eternal human identity.

Korean Text to Gloss: Self-Supervised Learning Approach

  • Thanh-Vu Dang;Gwang-hyun Yu;Ji-yong Kim;Young-hwan Park;Chil-woo Lee;Jin-Young Kim
    • 스마트미디어저널
    • /
    • Vol. 12, No. 1
    • /
    • pp.32-46
    • /
    • 2023
  • Natural Language Processing (NLP) has grown tremendously in recent years. Bilingual and multilingual translation models have been deployed widely in machine translation and have gained wide attention from the research community. In contrast, few studies have focused on translating between spoken and sign languages, especially for non-English languages. Prior work on Sign Language Translation (SLT) has shown that a mid-level sign gloss representation enhances translation performance. Therefore, this study presents a new large-scale Korean sign language dataset, the Museum-Commentary Korean Sign Gloss (MCKSG) dataset, including 3828 pairs of Korean sentences and their corresponding sign glosses used in museum-commentary contexts. In addition, we propose a translation framework based on self-supervised learning, where the pretext task is a text-to-text task from a Korean sentence to its back-translated versions, after which the pre-trained network is fine-tuned on the MCKSG dataset. Using self-supervised learning helps to overcome the drawback of a shortage of sign language data. Experimental results show that our proposed model outperforms a baseline BERT model by 6.22%.
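
  • A rough sketch of the two training stages described above: a pretext step that maps a Korean sentence to a back-translated paraphrase, followed by fine-tuning the same weights on sentence-gloss pairs. The model name and the toy pairs are placeholders; the MCKSG data and the authors' exact objective are not reproduced here.

    # Hedged sketch of pretext training then gloss fine-tuning (placeholder model, toy data).
    import torch
    from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

    model_name = "gogamza/kobart-base-v2"          # placeholder Korean seq2seq model
    tok = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_name)
    optimizer = torch.optim.AdamW(model.parameters(), lr=3e-5)

    def train_step(src, tgt):
        batch = tok(src, return_tensors="pt", padding=True, truncation=True)
        labels = tok(tgt, return_tensors="pt", padding=True, truncation=True).input_ids
        loss = model(**batch, labels=labels).loss  # standard seq2seq cross-entropy
        loss.backward()
        optimizer.step()
        optimizer.zero_grad()
        return loss.item()

    # Stage 1 (pretext): Korean sentence -> its back-translated paraphrase.
    train_step(["박물관은 오전 9시에 문을 엽니다."],
               ["박물관은 아침 9시에 개관합니다."])

    # Stage 2 (fine-tune the same weights): Korean sentence -> sign gloss sequence.
    train_step(["박물관은 오전 9시에 문을 엽니다."],
               ["박물관 아침 9시 열다"])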

작업기억과 언어발달장애: 문헌연구 (Working Memory and Language Disorders : Literature Review)

  • 김수진;김정연;이혜란
    • 대한음성학회지:말소리
    • /
    • No. 51
    • /
    • pp.39-55
    • /
    • 2004
  • Working memory is the term used to refer to the mental workspace in which information can be temporarily stored and manipulated during complex everyday activities such as understanding language. Studies on language and working memory are based on Baddeley's phonological working memory model and Daneman and Carpenter's functional working memory model. This article reviews the two working memory models and the studies on language and working memory based on each of them, and discusses the implications of working memory for language development and for the evaluation and treatment of specific language impairment.

A Temporal Data Model and a Query Language Based on the OO Data Model

  • Shu, Yongmoo
    • 경영과학
    • /
    • Vol. 14, No. 1
    • /
    • pp.87-105
    • /
    • 1997
  • There has been a great deal of research on temporal data management over the past two decades. Most of it is based on some logical data model, especially the relational data model, although there are some conceptual data models that are independent of logical data models. Many properties and issues regarding temporal data models and temporal query languages have also been studied, but some of them were shown to be incompatible, which means there cannot be a complete temporal data model satisfying all the desired properties at the same time. Many modeling issues discussed in those papers do not arise if the object-oriented data model is taken as the base model. Therefore, this paper proposes a temporal data model based on the object-oriented data model, mainly discussing the most essential issues that are common to many temporal data models. The new temporal data model and query language are illustrated with a small database created by a set of sample transactions.
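
  • As a small illustration of the modeling style discussed above, the sketch below attaches valid-time intervals to object attribute values and answers a simple "as of" query. The class and method names are invented for this example, not the notation defined in the paper.

    # Hedged sketch of a temporal object with valid-time attribute histories.
    from datetime import date

    class TemporalObject:
        """An object whose attribute values carry valid-time intervals."""
        def __init__(self, oid):
            self.oid = oid
            self.history = {}                 # attribute -> list of (start, end, value)

        def set(self, attr, value, start, end=date.max):
            self.history.setdefault(attr, []).append((start, end, value))

        def as_of(self, attr, when):
            """Return the value of attr that was valid at time `when`."""
            for start, end, value in self.history.get(attr, []):
                if start <= when < end:
                    return value
            return None

    # A small database of one employee object updated by sample transactions.
    emp = TemporalObject("emp42")
    emp.set("salary", 50000, date(2020, 1, 1), date(2022, 1, 1))
    emp.set("salary", 56000, date(2022, 1, 1))

    print(emp.as_of("salary", date(2021, 6, 1)))   # -> 50000
    print(emp.as_of("salary", date(2023, 6, 1)))   # -> 56000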
