• Title/Summary/Keyword: 한국어

A Study on Fine-Tuning and Transfer Learning to Construct Binary Sentiment Classification Model in Korean Text (한글 텍스트 감정 이진 분류 모델 생성을 위한 미세 조정과 전이학습에 관한 연구)

  • JongSoo Kim
    • Journal of Korea Society of Industrial Information Systems
    • /
    • v.28 no.5
    • /
    • pp.15-30
    • /
    • 2023
  • Recently, generative models based on the Transformer architecture, such as ChatGPT, have been gaining significant attention. The Transformer architecture has been applied to various neural network models, including Google's BERT (Bidirectional Encoder Representations from Transformers) sentence generation model. This paper proposes a method for building a binary text classification model that determines whether a Korean movie review comment is positive or negative. To this end, a pre-trained multilingual BERT-Base sentence generation model (104 languages, 12 layers, 768 hidden units, 12 attention heads, and 110M parameters) is fine-tuned and transfer-learned on a new Korean training dataset. To turn the pre-trained BERT-Base model into a text classification model, the input and output layers were fine-tuned, resulting in a new model with 178 million parameters. Using the fine-tuned model with a maximum word count of 128, a batch size of 16, and 5 epochs, transfer learning was conducted with 10,000 training samples and 5,000 test samples. The resulting binary sentiment classification model for Korean movie reviews achieved an accuracy of 0.9582, a loss of 0.1177, and an F1 score of 0.81. Transfer learning with a dataset five times larger produced a model with an accuracy of 0.9562, a loss of 0.1202, and an F1 score of 0.86.
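
The abstract above reports accuracy and F1 score side by side, and the two can differ noticeably for the same model. The following is a minimal sketch of how both metrics are commonly computed for binary labels. The function name `binary_metrics` and the toy labels are hypothetical illustrations, not taken from the paper, and the paper's own evaluation code is not shown in the source.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Return (accuracy, precision, recall, f1) for binary labels."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    pairs = list(zip(y_true, y_pred))
    # Count the confusion-matrix cells that the metrics need.
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)

    accuracy = correct / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return accuracy, precision, recall, f1


# Hypothetical toy labels (1 = positive review, 0 = negative review).
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 0, 1, 0, 0, 0, 1, 0]
accuracy, precision, recall, f1 = binary_metrics(y_true, y_pred)
print(f"accuracy={accuracy:.2f} f1={f1:.2f}")
```

Because F1 ignores true negatives, it can sit below accuracy when the negative class is classified well but some positives are missed, which is one possible reading of the gap between the reported scores.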

A Validation Study of the Korean Version of the Workplace Intergenerational Climate Scale(K-WICS) (한국판 세대친화적 조직문화척도(K-WICS) 타당화 연구)

  • Seoyeong Jeong;Hee Woong Park;Young Woo Sohn
    • Korean Journal of Culture and Social Issue
    • /
    • v.29 no.4
    • /
    • pp.429-453
    • /
    • 2023
  • Due to recent demographic changes, employees from diverse generations now work together in organizations, so research on intergenerational cooperation is needed. However, the lack of valid and reliable measures of intergenerational climate in the workplace is an obstacle to such research. We therefore translated the Workplace Intergenerational Climate Scale (WICS) into Korean and validated it with a sample of 1,052 Korean full-time employees. First, an exploratory factor analysis on sample 1 (N = 460) revealed a five-factor solution. Second, a confirmatory factor analysis on sample 2 (N = 592) showed good model fit for the correlated five-factor model. Third, the scale's discriminant and convergent validity was supported by negative correlations with four types of existing ageism scales and by positive correlations with trust, organizational commitment, work engagement, psychological safety, intention to remain, job satisfaction, and communication satisfaction. Moreover, the scale demonstrated significant incremental validity in predicting positive outcome variables even when controlling for the existing ageism scales. Lastly, we confirmed strict measurement invariance of the scale between age groups (below 40 versus above 40). The findings support the reliability and validity of the Korean version of the WICS among Korean employees. The scale can be broadly applied to measure the intergenerational climate of organizations and offers practical implications for HR management.

AI-based stuttering automatic classification method: Using a convolutional neural network (인공지능 기반의 말더듬 자동분류 방법: 합성곱신경망(CNN) 활용)

  • Jin Park;Chang Gyun Lee
    • Phonetics and Speech Sciences
    • /
    • v.15 no.4
    • /
    • pp.71-80
    • /
    • 2023
  • This study primarily aimed to develop an automated stuttering identification and classification method using artificial intelligence technology. In particular, it aimed to develop a deep learning-based identification model using convolutional neural networks (CNNs) for Korean speakers who stutter. To this end, speech data were collected from 9 adults who stutter and 9 normally fluent speakers. The data were automatically segmented at the phrasal level using Google Cloud speech-to-text (STT), and labels such as 'fluent', 'blockage', 'prolongation', and 'repetition' were assigned to the segments. Mel frequency cepstral coefficients (MFCCs) and the CNN-based classifier were then used to detect and classify each type of stuttered disfluency. However, only five instances of prolongation were found, so this type was excluded from the classifier model. Results showed that the accuracy of the CNN classifier was 0.96, and the F1-scores for classification performance were as follows: 'fluent' 1.00, 'blockage' 0.67, and 'repetition' 0.74. Although the effectiveness of the automatic classifier using CNNs to detect stuttered disfluencies was validated, its performance was found to be inadequate, especially for the blockage and prolongation types. Consequently, establishing a large speech database that collects data by type of stuttered disfluency was identified as a necessary foundation for improving classification performance.

Non-Keyword Model for the Improvement of Vocabulary Independent Keyword Spotting System (가변어휘 핵심어 검출 성능 향상을 위한 비핵심어 모델)

  • Kim, Min-Je;Lee, Jung-Chul
    • The Journal of the Acoustical Society of Korea
    • /
    • v.25 no.7
    • /
    • pp.319-324
    • /
    • 2006
  • We propose two new methods for non-keyword modeling to improve the performance of a speaker- and vocabulary-independent keyword spotting system. The first method is decision tree clustering of monophones at the state level, instead of monophone clustering based on the K-means algorithm. The second method is multi-state, multiple-mixture modeling at the syllable level, rather than a single-state multiple-mixture model, for the non-keyword. To evaluate our methods, we used the ETRI speech DB for training and for a keyword spotting test (closed test). We also conducted an open test to spot 100 keywords in 400 sentences uttered by 4 speakers in an office environment. The experimental results showed that the decision tree-based state clustering method improved keyword spotting by 28%/29% (closed/open test) over the monophone clustering method based on the K-means algorithm, and that multi-state non-keyword modeling at the syllable level improved it by 22%/2% (closed/open test) over the single-state non-keyword model. These results show that the two proposed methods improve keyword spotting performance.

The Error Pattern Analysis of the HMM-Based Automatic Phoneme Segmentation (HMM기반 자동음소분할기의 음소분할 오류 유형 분석)

  • Kim Min-Je;Lee Jung-Chul;Kim Jong-Jin
    • The Journal of the Acoustical Society of Korea
    • /
    • v.25 no.5
    • /
    • pp.213-221
    • /
    • 2006
  • Phone segmentation of the speech waveform is especially important for concatenative text-to-speech synthesis, which uses segmented corpora to construct synthesis units, because the quality of synthesized speech depends critically on the accuracy of the segmentation. Initially, phone segmentation was performed manually, but this requires huge effort and causes long delays. HMM-based approaches adopted from automatic speech recognition are the most widely used for automatic segmentation in speech synthesis, providing a consistent and accurate phone labeling scheme. Even though the HMM-based approach has been successful, it may place a phone boundary at a different position than expected. In this paper, we categorized adjacent phoneme pairs and analyzed the mismatches between hand-labeled transcriptions and HMM-based labels. We then described the dominant error patterns that must be addressed for speech synthesis. For the experiment, a hand-labeled standard Korean speech DB from ETRI was used as the reference DB. A time difference larger than 20 ms between the hand-labeled phoneme boundary and the auto-aligned boundary was treated as an automatic segmentation error. Our experimental results for the female speaker revealed that plosive-vowel, affricate-vowel, and vowel-liquid pairs showed high accuracies of 99%, 99.5%, and 99%, respectively, whereas stop-nasal, stop-liquid, and nasal-liquid pairs showed very low accuracies of 45%, 50%, and 55%. The results for the male speaker showed a similar tendency.
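
The error criterion described above (a boundary counts as a segmentation error when the hand-labeled and auto-aligned positions differ by more than 20 ms) can be sketched as follows. This is an illustrative sketch only: the function name `boundary_accuracy` and the boundary times are hypothetical, and the paper's per-phoneme-pair breakdown is not reproduced here.

```python
def boundary_accuracy(hand_ms, auto_ms, tolerance_ms=20.0):
    """Fraction of boundaries whose auto-aligned position is within
    tolerance_ms of the hand-labeled position."""
    if len(hand_ms) != len(auto_ms) or not hand_ms:
        raise ValueError("inputs must be non-empty and of equal length")
    # A boundary is an error only if the gap is strictly larger than
    # the tolerance, matching the "larger than 20 ms" wording.
    errors = sum(1 for h, a in zip(hand_ms, auto_ms)
                 if abs(h - a) > tolerance_ms)
    return 1.0 - errors / len(hand_ms)


# Hypothetical boundary times in milliseconds (not data from the paper).
hand = [120.0, 310.0, 455.0, 640.0]
auto = [125.0, 345.0, 450.0, 661.0]
acc = boundary_accuracy(hand, auto)
print(f"boundary accuracy: {acc:.2f}")
```

To obtain per-class figures like those in the abstract, the same function could be applied separately to the boundaries of each adjacent phoneme pair category.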

The Effect of Syllable Frequency, Syllable Type and Final Consonant on Hangeul Word and Pseudo-word Lexical Decision: An Analysis of the Korean Lexicon Project Database (한글 두 글자 단어와 비단어의 어휘판단에 글자 빈도, 글자 유형, 받침이 미치는 영향: KLP 자료의 분석)

  • Myong Seok Shin;ChangHo Park
    • Korean Journal of Cognitive Science
    • /
    • v.34 no.4
    • /
    • pp.277-297
    • /
    • 2023
  • This study examined how lexical decision for two-syllable words and pseudo-words is affected by syllabic information, such as syllable frequency, syllable (i.e., vowel) type, and presence of a final consonant (i.e., batchim), through an analysis of the Korean Lexicon Project Database (KLP-DB). Hierarchical regression of RT data showed that lexical decision for words was influenced by the frequency of the first syllable, the syllable type of the first and second syllables, and batchim for the first and second syllables, as well as by the interaction of the two syllable types and the interaction of syllable frequency and batchim of the second syllable. For pseudo-words, lexical decision was influenced by the frequency of the first and second syllables, the syllable type of the first syllable, and batchim for the first and second syllables, as well as by the interaction of the two syllable frequencies, the interaction of the two syllable types, and the interaction of syllable frequency and batchim of the first syllable. Word frequency had a strong effect on lexical decision for words, while syllabic information had a stable effect on lexical decision for pseudo-words. These results indicate that syllabic information should be seriously considered when constructing word and pseudo-word lists and when interpreting lexical decision times. Understanding the effect of syllabic information will also contribute to understanding the word recognition process.

The Influence of Korean Chinese Students' Sense of Cultural Identity on Second Language Acquisition -Mediating Effect of Learning Motivation and Learning Strategies- (재한 중국유학생의 문화정체감이 제2언어 습득에 미치는 영향 -학습동기와 학습전략의 매개효과-)

  • Gong Ruoning;Cho, Mi Young
    • The Journal of the Convergence on Culture Technology
    • /
    • v.10 no.3
    • /
    • pp.749-761
    • /
    • 2024
  • This study analyzes the cultural identity, learning motivation, learning strategies, and second language acquisition of Chinese students living in Korea to reveal the structural relationships among these four variables, with the aim of providing basic data to promote international students' cultural identity, learning motivation, and learning strategies in the Korean language learning process. Reliability and validity were verified through a preliminary survey of 200 people. The main survey was conducted on 1,006 Chinese international students at six universities in Seoul, Gyeonggi-do, Busan, and Chungcheong-do from May 28 to June 15, 2023. The results were as follows. First, regarding the structural relationships among the variables, cultural identity had a positive (+) effect on learning motivation, learning strategies, and second language acquisition. Second, learning motivation had a positive (+) effect on learning strategies and second language acquisition. Third, learning strategies had a positive (+) effect on second language acquisition. Fourth, learning motivation and learning strategies were found to play positive (+) mediating and multiple mediating roles between cultural identity and learning strategy. Therefore, to promote international students' cultural identity, learning motivation, and learning strategies in the Korean language learning process, it is necessary to increase opportunities for international students to directly experience the formation of cultural identity and to organize and teach a multifaceted, practice-centered curriculum.

Analysis of the scholastic capability of ChatGPT utilizing the Korean College Scholastic Ability Test (대학입시 수능시험을 평가 도구로 적용한 ChatGPT의 학업 능력 분석)

  • WEN HUILIN;Kim Jinhyuk;Han Kyonghee;Kim Shiho
    • Journal of Platform Technology
    • /
    • v.11 no.5
    • /
    • pp.72-83
    • /
    • 2023
  • ChatGPT, commercially launched in late 2022, has shown successful results in various professional exams, including the US Bar Exam and the United States Medical Licensing Exam (USMLE), demonstrating its ability to pass qualifying exams in professional domains. However, further experimentation and analysis are required to assess ChatGPT's scholastic capability, such as logical inference and problem-solving skills. This study evaluated ChatGPT's scholastic performance using the Korean College Scholastic Ability Test (KCSAT) subjects of Korean, English, and Mathematics. The experimental results revealed that ChatGPT achieved a relatively high accuracy rate of 69% on the English exam but lower rates of 34% and 19% in the Korean Language and Mathematics domains, respectively. By analyzing the results of the Korean language exam, the English exam, and TOPIK II, we evaluated ChatGPT's strengths and weaknesses in comprehension and logical inference. Although ChatGPT, as a generative language model, can understand and respond to general Korean, English, and Mathematics problems, it is considered weak in tasks involving higher-level logical inference and complex mathematical problem-solving. This study may provide simple yet accurate and effective evaluation criteria for assessing the performance of generative artificial intelligence through the analysis of KCSAT scores.

Korean Food Review Analysis Using Large Language Models: Sentiment Analysis and Multi-Labeling for Food Safety Hazard Detection (대형 언어 모델을 활용한 한국어 식품 리뷰 분석: 감성분석과 다중 라벨링을 통한 식품안전 위해 탐지 연구)

  • Eun-Seon Choi;Kyung-Hee Lee;Wan-Sup Cho
    • The Journal of Bigdata
    • /
    • v.9 no.1
    • /
    • pp.75-88
    • /
    • 2024
  • Recently, there have been cases reported in the news of individuals experiencing symptoms of food poisoning after consuming raw beef purchased from online platforms, or reviews claiming that cherry tomatoes tasted bitter. This suggests the potential for analyzing food reviews on online platforms to detect food hazards, enabling government agencies, food manufacturers, and distributors to manage consumer food safety risks. This study proposes a classification model that uses sentiment analysis and large language models to analyze food reviews and detect negative ones, multi-labeling key food safety hazards (food poisoning, spoilage, chemical odors, foreign objects). The sentiment analysis model effectively minimized the misclassification of negative reviews with a low False Positive rate using a 'funnel' model. The multi-labeling model for food safety hazards showed high performance with both recall and accuracy over 96% when using GPT-4 Turbo compared to GPT-3.5. Government agencies, food manufacturers, and distributors can use the proposed model to monitor consumer reviews in real-time, detect potential food safety issues early, and manage risks. Such a system can protect corporate brand reputation, enhance consumer protection, and ultimately improve consumer health and safety.

AN OBSERVATIONAL MULTI-CENTER STUDY FOR EVALUATION OF EFFICACY, SAFETY AND PARENTAL SATISFACTION OF METHYLPHENIDATE-OROS IN CHILDREN WITH ADHD (주의력결핍과잉운동장애 아동에게 Methylphenidate-OROS 투여시 효용성과 안전성 및 부모 만족도를 평가하기 위한 다기관관찰연구)

  • Kim, Bong-Seog;Park, Eun-Jin
    • Journal of the Korean Academy of Child and Adolescent Psychiatry
    • /
    • v.16 no.2
    • /
    • pp.279-285
    • /
    • 2005
  • Objectives: This study was performed to evaluate the efficacy and safety of MPH-OROS and parental satisfaction with it in the treatment of children with ADHD. Method: The 569 participants were clinically diagnosed with ADHD using DSM-IV criteria. We switched their current medication to MPH-OROS or introduced MPH-OROS for the treatment of ADHD. We assessed the Clinical Global Impression-Severity of Illness (CGI-S) and the Clinical Global Impression-Improvement (CGI-I). The parents of the participants completed the Korean version of the Conners rating scale at baseline and at the 1st week and the 3rd week after starting MPH-OROS. At the 3rd week, the parents completed the parent satisfaction questionnaire. Results: 13% of participants dropped out for several reasons, including side effects. CGI-S decreased significantly. On the CGI-I, the improvement was 72.3% at the 1st week and 87.4% at the 3rd week. The total score on the Korean version of the Conners parent rating scale decreased significantly. A total of 119 participants (20.7%) complained of one or more side effects, and the most common side effect was anorexia. 94% of parents replied that they were satisfied overall with the MPH-OROS trial. The advantages of MPH-OROS reported by parents were the long duration of the drug, improvement in schoolwork and attitude, improvement in home behavior and homework, and improvement in overactivity. Conclusion: MPH-OROS is effective and well tolerated in actual clinical use for ADHD.
