Search | Korea Science

An Efficient Machine Learning-based Text Summarization in the Malayalam Language

P Haroon, Rosna;Gafur M, Abdul;Nisha U, Barakkath
- KSII Transactions on Internet and Information Systems (TIIS) / v.16 no.6 / pp.1778-1799 / 2022
Automatic text summarization condenses a large body of text into a shorter version that retains the significant information. Malayalam, spoken in parts of India, most commonly in Kerala and Lakshadweep, is among the more difficult languages to process. Natural language processing for Malayalam remains relatively underdeveloped because of the complexity of the language and the scarcity of available resources. This paper proposes an approach to summarizing Malayalam documents by training a model based on the Support Vector Machine classification algorithm. Different features of the text are used for training so that the system can output the most important information from the input text. The classifier assigns sentences to separate classes (most important, important, average, and least significant), and the system builds a summary of the input document from this classification. The user can select a compression ratio, and the system outputs a summary of the corresponding size. Model performance is measured on Malayalam documents of different genres as well as on documents from the same domain. The model is evaluated using the content evaluation measures precision, recall, F score, and relative utility. The precision and recall values obtained indicate that the model is reliable and more relevant than the other summarizers it was compared with.
https://doi.org/10.3837/tiis.2022.06.001
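The selection step described in the abstract above (sentences sorted into four importance classes, then trimmed to a user-chosen compression ratio) might look roughly like the following. This is a minimal sketch, not the authors' implementation: the label names, the `select_summary` function, and the tie-breaking by document position are all assumptions made for illustration, and the SVM classifier itself is assumed to have already produced the labels.

```python
import math

# Hypothetical class labels, ordered from most to least important.
# The paper names four classes; these identifiers are assumptions.
CLASS_RANK = {
    "most_important": 0,
    "important": 1,
    "average": 2,
    "least_significant": 3,
}


def select_summary(sentences, labels, compression_ratio):
    """Keep the given fraction of sentences, preferring higher-importance classes.

    `labels` holds one predicted class per sentence (assumed to come from a
    previously trained classifier). The chosen sentences are returned in
    their original document order.
    """
    if len(sentences) != len(labels):
        raise ValueError("sentences and labels must have the same length")
    if not 0 < compression_ratio <= 1:
        raise ValueError("compression_ratio must be in (0, 1]")

    # Number of sentences to keep; at least one for a non-empty document.
    keep = max(1, math.ceil(len(sentences) * compression_ratio))

    # Rank by class first, then by position so earlier sentences win ties.
    ranked = sorted(range(len(sentences)), key=lambda i: (CLASS_RANK[labels[i]], i))

    # Restore document order among the selected sentences.
    chosen = sorted(ranked[:keep])
    return [sentences[i] for i in chosen]
```

Restoring document order after ranking is a design choice made here so the output reads as running text; the abstract does not say how the original system orders its output.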

A Training Feasibility Evaluation of Nuclear Safeguards Terms for the Large Language Model (LLM) (거대언어모델에 대한 원자력 안전조치 용어 적용 가능성 평가)

Sung-Ho Yoon
- Proceedings of the Korean Society of Computer Information Conference / 2024.01a / pp.479-480 / 2024
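In this paper, we qualitatively evaluated the answers to safeguards-related questions given by a public large language model (LLM) that was further trained on nuclear safeguards terminology using a fine-tuning algorithm. The evaluation showed that, for questions within the scope of the training data, the trained model produced answers that added low-level inference drawn from the additional training data to the base model's answers. From these results we derived directions for improving the additional training, which are expected to be useful for building low-cost, domain-specific language models.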

Domain-Adaptation Technique for Semantic Role Labeling with Structural Learning

Lim, Soojong;Lee, Changki;Ryu, Pum-Mo;Kim, Hyunki;Park, Sang Kyu;Ra, Dongyul
- ETRI Journal / v.36 no.3 / pp.429-438 / 2014
Semantic role labeling (SRL) is a task in natural-language processing that aims to detect predicates in the text, choose their correct senses, identify their associated arguments, and predict the semantic roles of the arguments. Developing a high-performance SRL system for a domain requires a large amount of manually annotated training data from the same domain. However, SRL training data of sufficient size is available for only a few domains, and constructing it for a new domain is very expensive. Domain adaptation in SRL can therefore be regarded as an important problem. In this paper, we show that domain adaptation for SRL systems can achieve state-of-the-art performance when based on structural learning and exploiting a prior model approach. We provide experimental results with three different target domains showing that our method is effective even if only a small amount of training data is available for the target domains. In our experiments, the proposed method outperforms those of other research works by about 2% to 5% in F-score.
https://doi.org/10.4218/etrij.14.0113.0645

A VQ Codebook Design Based on Phonetic Distribution for Distributed Speech Recognition (분산 음성인식 시스템의 성능향상을 위한 음소 빈도 비율에 기반한 VQ 코드북 설계)

Oh Yoo-Rhee;Yoon Jae-Sam;Lee Gil-Ho;Kim Hong-Kook;Ryu Chang-Sun;Koo Myoung-Wa
- Proceedings of the KSPS conference / 2006.05a / pp.37-40 / 2006
In this paper, we propose a VQ codebook design of speech recognition feature parameters in order to improve the performance of a distributed speech recognition system. For the context-dependent HMMs, a VQ codebook should be correlated with phonetic distributions in the training data for HMMs. Thus, we focus on a selection method of training data based on phonetic distribution instead of using all the training data for an efficient VQ codebook design. From the speech recognition experiments using the Aurora 4 database, the distributed speech recognition system employing a VQ codebook designed by the proposed method reduced the word error rate (WER) by 10% when compared with that using a VQ codebook trained with the whole training data.

A study on Language Environment and Korean Language Education problems in Sakhalin, Russia (러시아 사할린 지역의 언어 환경과 한국어교육 문제 연구)

Cho, Hyun Yong;Lee, Sang Hyeok
- Journal of Korean language education / v.23 no.1 / pp.257-282 / 2012
Sakhalin, Russia is a very specific area for Korean language education. Because of the imposed separation and isolation of this region, the language used in Sakhalin mixes South Korean, North Korean, the Gyeongsang Province dialect, Japanese, and Russian. The actual language use of Sakhalin Koreans needs to be examined closely in order to support Korean language education. This study makes the following points: 1. The approach should differ depending on the situation of Koreans, foreigners, Korean Language Schools (Hangeul Hakgyo), and Korean classes in local schools. 2. Tailor-made textbooks for Sakhalin are required. 3. Korean textbooks that match local circumstances are needed; a basic text should be written by a professor from a local Korean department, with supervision, modification, and supplements from Korean language education researchers in Korea. 4. Expanded Korean training programs are needed; furthermore, if Korean and Russian university students are to study in Korea, there should be programs offering, among other things, a dual degree. 5. A methodical, overall examination of overseas Korean regions such as Sakhalin is necessary. In the case of Far East Russia, connectivity between Vladivostok, Khabarovsk, and Sakhalin also needs to be strengthened.

A Study on Semantic Logic Platform of multimedia Sign Language Content (멀티미디어 수화 콘텐츠의 Semantic Logic 플랫폼 연구)

Jung, Hoe-Jun;Park, Dea-Woo;Han, Kyung-Don
- Journal of the Korea Society of Computer and Information / v.14 no.10 / pp.199-206 / 2009
With the development of broadband multimedia content, sign language for the deaf is being used in education. Most content used in sign language training consists of videos that show the signs for Hangul words. People learning sign language for the first time are unfamiliar with its characteristics, so they find the displayed signs difficult to understand and difficult to express. In this paper, we design a video-based platform for a multimedia sign language content model that applies Semantic Logic, so that online learners can express signs with reference to their attributes.
https://doi.org/10.9708/jksci.2009.14.10.199

Considerations Regarding the Application of IMO Maritime English Model Course 3.17 in Korean Contexts

Choi, Seung-Hee;Park, Jin-Soo
- Journal of Navigation and Port Research / v.40 no.5 / pp.299-304 / 2016
The importance of clear and effective communication at sea has been greatly emphasized due to the increase in multiculturalism on board both ocean-going and coastal vessels, and the necessity of systematic English training based on 'Knowledge, Understanding, and Proficiency' specified in STCW has also been recognized. With these growing needs in mind, the International Maritime Organization (IMO) updated the Maritime English (ME) Model Course 3.17 in 2015 by providing guidelines on language education within two separate categories, General Maritime English (GME) and Specialized Maritime English (SME). The IMO is now attempting to create a new, global framework of ME education and training, and this new course model must first be thoroughly understood in order to explore ways to apply the modified version to the context of current ME education in Korea and to design an updated language curriculum. Therefore, the general structural features of the new model course will be explained in this paper, and the course focus set by IMO and to be considered and/or adopted by the Republic of Korea will be closely examined. Finally, suggestions will be made on how to implement this revised model course in practice with the following focus: the development of localized curriculum for GME and SME; the provision of practical teaching guidance through relevant online and offline materials for class and self-study; and the establishment of qualification guidelines and a teaching support system for language teachers in maritime and language education.
https://doi.org/10.5394/KINPR.2016.40.5.299

English Education for International Sports Events (국제 스포츠 행사를 위한 영어교육 방안)

Kim, Ji-Eun;Yoo, Ho
- The Journal of the Korea Contents Association / v.15 no.6 / pp.589-596 / 2015
The purpose of this study was (1) to examine the present state of English education for the 2018 Pyeongchang Winter Olympics preparations, and (2) to identify the need for an English language training program for international sports events. With these goals in mind, the information was gathered from telephone interviews with educational administration officials who were responsible for international sporting events. For the survey, a total of twenty-six participants responded to a questionnaire designed to gauge their self-perceived English instructional needs for international sporting events. The principal results obtained from this study were: the Pyeongchang Organizing Committee for the 2018 Winter Olympic Games signed an agreement with an official supplier of language training services. In addition, Gangneung city is providing a 'Global Leaders' Academy' and English programs for taxi drivers and citizens. The result of the survey shows that the majority of participants reported that an English language training program is essential for international sporting events and it should be different from general English language education.
https://doi.org/10.5392/JKCA.2015.15.06.589

Building Specialized Language Model for National R&D through Knowledge Transfer Based on Further Pre-training (추가 사전학습 기반 지식 전이를 통한 국가 R&D 전문 언어모델 구축)

Yu, Eunji;Seo, Sumin;Kim, Namgyu
- Knowledge Management Research / v.22 no.3 / pp.91-106 / 2021
With the recent rapid development of deep learning technology, demand for analyzing huge volumes of text documents in the national R&D field from various perspectives is rapidly increasing. In particular, interest is growing in applying BERT (Bidirectional Encoder Representations from Transformers) language models that have been pre-trained on a large corpus. However, the terminology used frequently in highly specialized fields such as national R&D is often not sufficiently learned by basic BERT, which has been pointed out as a limitation when using BERT to understand documents in specialized fields. Therefore, this study proposes a method to build an R&D KoBERT language model that transfers national R&D field knowledge to basic BERT using further pre-training. To evaluate the performance of the proposed model, we performed classification analysis on about 116,000 R&D reports in the health care and information and communication fields. Experimental results showed that the proposed model achieved higher accuracy than the pure KoBERT model.
https://doi.org/10.15813/kmr.2021.22.3.006

The Effectiveness of a Comprehensive Language Teaching Program Using Web-Based Picture Books (웹 기반 그림동화 활용 포괄적 언어교수 프로그램의 효과)

Park, Soo Jin;Joo, Eun Hee
- Korean Journal of Child Studies / v.27 no.4 / pp.81-102 / 2006
This study investigated the effects on young children's vocabulary and reading ability of the comprehensive language-teaching program using web-based picture books. The comprehensive language program was put into operation for 9 weeks with a classroom teacher who had in-service training for this program. The language course for the 23 children in the control group consisted only of ordinary language activities using teacher-made picture cards. Test results analyzed by t-test showed that the 25 children in the experimental group gained more than the control group on reading attitude including the concept of reading, accuracy, verbal expression, participation, contents and originality. Also, the ability to read a fairy tale aloud increased in the experimental group.
