• Title/Summary/Keyword: language data

Trend Analysis of Research Using Evaluation Tools of Languages Abilities for Young Children: Based on Early Children Education Journals registered with the Korea Research Foundation (유아 언어능력 평가연구의 동향 분석 -한국학술진흥재단 등재 학회지를 중심으로)

  • Youn, Jin-Ju
    • Korean Journal of Human Ecology / v.16 no.4 / pp.677-690 / 2007
  • This study aims to identify trends in language research by analyzing the evaluation tools and methods that researchers have used to assess young children's language abilities. To that end, it examined 237 language ability evaluation methods drawn from 121 studies of young children's language ability. The papers were selected from four early childhood education journals registered with the Korea Research Foundation. The collected data were analyzed by frequency and percentage. The results were as follows. First, among single-age groups the most frequently selected subjects were five-year-olds, among mixed-age groups the most frequently selected subjects were three- to five-year-olds, and most studies had fewer than fifty subjects. The studies were sorted into an 'experimental/investigational research' type, which has frequently been reused by others; an 'interview' type, defined by data collection method; and a 'difference verification' type, defined by data analysis method, which was used in the majority of studies. Second, the number of papers that required data analysis has increased since 1996. In conclusion, the analysis of evaluation tools for young children's language ability shows that many studies aimed to examine children's knowledge about language and their language functions, such as speaking, reading, writing, and listening, while the evaluation content focused on speaking and writing.
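The frequency-and-percentage analysis mentioned in the abstract above can be sketched in a few lines of Python. This is only an illustration: the function name `frequency_table` and the sample age-group labels are hypothetical and are not taken from the paper.

```python
from collections import Counter


def frequency_table(values):
    """Return {category: (count, percentage)} ordered by descending count."""
    counts = Counter(values)
    total = sum(counts.values())
    return {
        category: (count, round(100 * count / total, 1))
        for category, count in counts.most_common()
    }


# Hypothetical subject age groups coded from a handful of studies.
age_groups = ["5", "5", "3-5", "5", "4", "3-5"]
table = frequency_table(age_groups)

for category, (count, percent) in table.items():
    print(f"{category}: n={count}, {percent}%")
```

Each category is reported with its raw count and its share of the total, which is the form of descriptive summary the abstract describes.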

Design and Implementation of Data Acquisition and Storage Systems for Multi-view Points Sign Language (다시점 수어 데이터 획득 및 저장 시스템 설계 및 구현)

  • Kim, Geunmo;Kim, Bongjae
    • The Journal of the Institute of Internet, Broadcasting and Communication / v.22 no.3 / pp.63-68 / 2022
  • There are 395,789 people with hearing impairment in Korea, according to the 2021 Disability Statistics Annual Report by the Korea Institute for the Development of Disabled Persons. These people experience considerable inconvenience because of hearing impairment, and many studies on the recognition and translation of Korean sign language are being conducted to address this problem. In sign language recognition and translation research, collecting sign language data is difficult because few people use sign language professionally. In addition, most existing data is sign language data recorded from the front of the signer. To solve this problem, in this paper, we designed and developed a storage system that can collect sign language data from multiple viewpoints in real time, rather than from a single viewpoint, and store and manage it with high usability.

A BERT-Based Automatic Scoring Model of Korean Language Learners' Essay

  • Lee, Jung Hee;Park, Ji Su;Shon, Jin Gon
    • Journal of Information Processing Systems / v.18 no.2 / pp.282-291 / 2022
  • This research applies a pre-trained bidirectional encoder representations from transformers (BERT) handwriting recognition model to predict foreign Korean-language learners' writing scores. A corpus of 586 answers to midterm and final exams written by foreign learners at the Intermediate 1 level was acquired and used for pre-training, resulting in consistent performance, even with small datasets. The test data were pre-processed and fine-tuned, and the results were calculated in the form of a score prediction. The difference between the prediction and actual score was then calculated. An accuracy of 95.8% was demonstrated, indicating that the prediction results were strong overall; hence, the tool is suitable for the automatic scoring of Korean written test answers, including grammatical errors, written by foreigners. These results are particularly meaningful in that the data included written language text produced by foreign learners, not native speakers.

Robust Sentiment Classification of Metaverse Services Using a Pre-trained Language Model with Soft Voting

  • Haein Lee;Hae Sun Jung;Seon Hong Lee;Jang Hyun Kim
    • KSII Transactions on Internet and Information Systems (TIIS) / v.17 no.9 / pp.2334-2347 / 2023
  • Metaverse services generate text data, a form of ubiquitous computing data, in real time, and this data can be used to analyze user emotions. Analysis of user emotions is an important task in metaverse services. This study aims to classify user sentiments using deep learning and pre-trained language models based on the transformer structure. Previous studies collected data from a single platform, whereas the current study incorporated review data collected with the keyword "Metaverse" from the YouTube and Google Play Store platforms for general applicability. As a result, the Bidirectional Encoder Representations from Transformers (BERT) and Robustly optimized BERT approach (RoBERTa) models using the soft voting mechanism achieved the highest accuracy, 88.57%. In addition, the area under the curve (AUC) score of the ensemble model comprising RoBERTa, BERT, and A Lite BERT (ALBERT) was 0.9458. The results demonstrate that the ensemble combined with the RoBERTa model exhibits good performance. Therefore, the RoBERTa model can be applied on platforms that provide metaverse services. The findings contribute to the advancement of natural language processing techniques in metaverse services, which are increasingly important in digital platforms and virtual environments. Overall, this study provides empirical evidence that sentiment analysis using deep learning and pre-trained language models is a promising approach to improving user experiences in metaverse services.
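For readers unfamiliar with the soft voting mechanism named in the abstract above, the following is a minimal sketch in Python. It assumes unweighted averaging of per-class probabilities. The function names and the probability values are hypothetical and do not come from the paper, whose exact ensemble configuration is not given here.

```python
from typing import Dict, List


def soft_vote(model_probs: List[Dict[str, float]]) -> Dict[str, float]:
    """Average each class probability across models (unweighted soft voting)."""
    labels = model_probs[0].keys()
    n_models = len(model_probs)
    return {
        label: sum(probs[label] for probs in model_probs) / n_models
        for label in labels
    }


def predict(model_probs: List[Dict[str, float]]) -> str:
    """Return the class with the highest averaged probability."""
    averaged = soft_vote(model_probs)
    return max(averaged, key=averaged.get)


# Hypothetical class probabilities from three classifiers for one review.
model_a = {"negative": 0.90, "positive": 0.10}
model_b = {"negative": 0.45, "positive": 0.55}
model_c = {"negative": 0.40, "positive": 0.60}

ensemble_label = predict([model_a, model_b, model_c])
print(ensemble_label)
```

In this made-up example two of the three models individually favor "positive", so a majority (hard) vote would pick "positive", while soft voting picks "negative" because the first model is far more confident. That sensitivity to confidence is the usual reason for choosing soft over hard voting.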

Issues of Discourse Studies in Korean Language Education (한국어교육학에서의 담화 연구 분석)

  • Kang, Hyounhwa
    • Journal of Korean language education / v.23 no.1 / pp.219-256 / 2012
  • The aim of this study is to observe trends in discourse studies in language education and to analyze the main issues by reviewing the discourse-related literature in Korean language education over the last ten years. The study examined discourse studies conducted in Korean language education from the perspectives of study subject, study method, and study data. Based on the results, it also assessed the achievements and effectiveness of those studies. The subjects of discourse studies were mainly discourse function, discourse pattern, discourse marker, and discourse structure. Corpus analysis and surveys were the main study methods, and spoken, written, and semi-spoken corpora were used as study materials. Among these, the semi-spoken corpus was used at a particularly high rate. This showed that discourse study in Korean language education was mainly focused on spoken corpus study. The study divided Korean language education into four fields (linguistic knowledge, communication function, teaching activities, and learning activities) and observed the trends of discourse study in each. Overall, relatively many studies focused on linguistic knowledge, particularly from a pragmatic perspective. Discourse-based study can be said to be effective for language education in that it is based on actual data and improves practical communication skills in diverse language environments.

Construction of Text Summarization Corpus in Economics Domain and Baseline Models

  • Sawittree Jumpathong;Akkharawoot Takhom;Prachya Boonkwan;Vipas Sutantayawalee;Peerachet Porkaew;Sitthaa Phaholphinyo;Charun Phrombut;Khemarath Choke-mangmi;Saran Yamasathien;Nattachai Tretasayuth;Kasidis Kanwatchara;Atiwat Aiemleuk;Thepchai Supnithi
    • Journal of information and communication convergence engineering / v.22 no.1 / pp.33-43 / 2024
  • Automated text summarization (ATS) systems rely on language resources as datasets. However, creating these datasets is a complex and labor-intensive task requiring linguists to extensively annotate the data. Consequently, certain public datasets for ATS, particularly in languages such as Thai, are not as readily available as those for the more popular languages. The primary objective of the ATS approach is to condense large volumes of text into shorter summaries, thereby reducing the time required to extract information from extensive textual data. Owing to the challenges involved in preparing language resources, publicly accessible datasets for Thai ATS are relatively scarce compared to those for widely used languages. The goal is to produce concise summaries and accelerate the information extraction process using vast amounts of textual input. This study introduced ThEconSum, an ATS architecture specifically designed for the Thai language, using economy-related data. An evaluation of this research revealed the significant remaining tasks and limitations for the Thai language.

Types and Construction Method of Multimedia Materials for the Korean Language Education: For the Construction of Digital Library on Nuri-Sejonghakdang (한국어 교육 멀티미디어 자료의 유형과 구축 방식 - 누리-세종학당의 '디지털 자료관' 구축을 위하여 -)

  • Lee, Hyun Ju;Cho, Tae-Rin
    • Journal of Korean language education / v.23 no.1 / pp.25-45 / 2012
  • The purpose of this article is to examine the types and construction methods of multimedia materials for Korean language education, with the ultimate aim of constructing a digital library on Nuri-Sejonghakdang. First, the article reviews major concepts such as teaching material, multimedia, learning object, metadata, and reusability. Second, multimedia materials are divided into three types (namely, example material, explanation material, and training and evaluation material) according to their characteristics as learning objects. The article then proposes a classification-search system and metadata elements for the effective search and use of multimedia materials. Finally, it concludes by presenting a long-term plan for digital library construction on Nuri-Sejonghakdang and some follow-up tasks for this study.

Linguistics in Postmodern Science Fiction: Delany's Babel 17 and Stephenson's Snow Crash

  • Kim, Il-Gu
    • English Language & Literature Teaching / v.12 no.2 / pp.41-59 / 2006
  • As the late partner to science fiction, various experimental languages such as animal language, telepathic language, newly invented language, and alien language often appear as "unexpected and frightened situations" in SF. Like generative semanticists, some SF writers daringly delve into the sacred mystery of semantics in language, whereas others avoid the dream of a universal language by holding themselves to manageable data. Samuel Delany's description of the ideal telepathic universal language in Babel 17 reveals the human dream of being like God by showing us a new process of communication in a factual interplanetary environment. Similar to the mystery of alien language in SF, a baby's babbling reveals how language is both simple and complicated. Children's language shows us the changing process of a soul revealed by language use, and it is no wonder that many AI languages in SF borrow from children's language acquisition processes. In short, science fiction as a repository of tropes illuminates other literary language studies and other literary genres. Especially in terms of the futuristic study of linguistics, the relationship between science fiction and linguistics is much closer than we thought.

Implementation of R-language-based REST API and Solution for Security Issues (R 언어 기반의 REST API 구현 및 보안문제의 해결 방안)

  • Kang, DongHoon;Oh, Sejong
    • Asia-pacific Journal of Multimedia Services Convergent with Art, Humanities, and Sociology / v.9 no.1 / pp.387-394 / 2019
  • Recently, the importance of big data has increased, and so has the demand for big data analysis. The R language was developed for data analysis, and users analyze data with the algorithms in R's various statistics, machine learning, and data mining packages. However, it is difficult to develop an application using R. Earlier studies proposed calling R scripts from another language such as PHP or Java. However, this approach is cumbersome because it requires writing code in another language in addition to R. In this study, we introduce how to write an API using only the R language, without another language, by using the Plumber package. We also propose a solution for security issues related to R APIs. Using the proposed technology to develop web applications, we can expect high productivity, ease of use, and ease of maintenance.

A Proposal of Evaluation of Large Language Models Built Based on Research Data (연구데이터 관점에서 본 거대언어모델 품질 평가 기준 제언)

  • Na-eun Han;Sujeong Seo;Jung-ho Um
    • Journal of the Korean Society for information Management / v.40 no.3 / pp.77-98 / 2023
  • Large Language Models (LLMs) are becoming the major trend in the natural language processing field. These models were built based on research data, but information such as the types, limitations, and risks of using research data is unknown. This research presents how to analyze and evaluate, from the perspective of research data, LLMs that were built with research data: LLaMA and LLaMA-based models such as Alpaca from Stanford and Vicuna from the Large Model Systems Organization, as well as ChatGPT from OpenAI. The quality evaluation focuses on the validity, functionality, and reliability criteria of Data Quality Management (DQM). Furthermore, we adopted the Holistic Evaluation of Language Models (HELM) to understand its evaluation criteria and then discussed its limitations. This study presents quality evaluation criteria for LLMs using research data and future development directions.