• Title/Abstract/Keywords: pretrained

Search results: 93 items (processing time: 0.023 s)

Alzheimer's disease recognition from spontaneous speech using large language models

  • Jeong-Uk Bang;Seung-Hoon Han;Byung-Ok Kang
    • ETRI Journal
    • /
    • Vol. 46, No. 1
    • /
    • pp.96-105
    • /
    • 2024
  • We propose a method to automatically predict Alzheimer's disease from speech data using the ChatGPT large language model. Alzheimer's disease patients often exhibit distinctive characteristics when describing images, such as difficulty recalling words, grammatical errors, repetitive language, and incoherent narratives. For prediction, we first employ a speech recognition system to transcribe participants' speech into text. We then gather opinions by feeding ChatGPT the transcribed text together with a prompt designed to solicit fluency evaluations. Next, we extract embeddings from the speech, text, and opinions using pretrained models. Finally, a classifier consisting of transformer blocks and linear layers identifies participants with this type of dementia. Experiments are conducted on the widely used ADReSSo dataset. The results yield a maximum accuracy of 87.3% when speech, text, and opinions are used in conjunction, suggesting the potential of leveraging evaluation feedback from language models to address challenges in Alzheimer's disease recognition.
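
As a rough illustration of the final classification stage, the following PyTorch sketch fuses three modality embeddings (speech, text, and ChatGPT opinion) with transformer blocks and a linear head. The dimensions, layer counts, and pooling are illustrative assumptions, not the paper's actual configuration.

```python
import torch
import torch.nn as nn

class FusionClassifier(nn.Module):
    """Toy stand-in for the transformer-block classifier described above."""
    def __init__(self, dim=768, n_layers=2, n_heads=8, n_classes=2):
        super().__init__()
        layer = nn.TransformerEncoderLayer(d_model=dim, nhead=n_heads,
                                           batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=n_layers)
        self.head = nn.Linear(dim, n_classes)

    def forward(self, speech_emb, text_emb, opinion_emb):
        # Treat each modality embedding as one token of a length-3 sequence.
        x = torch.stack([speech_emb, text_emb, opinion_emb], dim=1)
        x = self.encoder(x)
        return self.head(x.mean(dim=1))  # pool over modalities, then classify

model = FusionClassifier()
speech = torch.randn(4, 768)   # embeddings from a pretrained speech model
text = torch.randn(4, 768)     # embeddings of the ASR transcript
opinion = torch.randn(4, 768)  # embeddings of ChatGPT's fluency opinions
logits = model(speech, text, opinion)  # shape (4, 2): dementia vs. control
```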

On the Significance of Domain-Specific Pretrained Language Models for Log Anomaly Detection

  • 레리사 아데바 질차;김득훈;곽진
    • Korea Information Processing Society: Conference Proceedings
    • /
    • 2024 Spring Conference of the Korea Information Processing Society
    • /
    • pp.337-340
    • /
    • 2024
  • Pretrained language models (PLMs) are extensively utilized to enhance the performance of log anomaly detection systems. Their effectiveness lies in their capacity to extract valuable semantic information from logs, thereby strengthening detection performance. Nonetheless, discrepancies in the distribution of log messages hinder the development of robust, generalizable detection systems. This study investigates the structural and distributional variation across log message datasets, underscoring the crucial role of domain-specific PLMs in overcoming this challenge and devising robust, generalizable solutions.
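
The embedding step this argument hinges on can be sketched as follows; `bert-base-uncased` is a stand-in checkpoint, and a domain-specific PLM would simply be substituted in the same place. The example log lines are invented.

```python
import torch
from transformers import AutoModel, AutoTokenizer

# Any PLM checkpoint can be swapped in here; a log-domain model would
# replace the generic one below.
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

logs = ["Receiving block blk_-168 src: /10.0.0.1",
        "PacketResponder 1 terminating"]
batch = tokenizer(logs, padding=True, truncation=True, return_tensors="pt")
with torch.no_grad():
    out = model(**batch)

# Mean-pool token vectors into one semantic vector per log message.
mask = batch["attention_mask"].unsqueeze(-1)
emb = (out.last_hidden_state * mask).sum(1) / mask.sum(1)
print(emb.shape)  # (2, 768); these vectors feed the downstream detector
```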

E-commerce data based Sentiment Analysis Model Implementation using Natural Language Processing Model

  • 최준영;임희석
    • Journal of the Korea Convergence Society
    • /
    • Vol. 11, No. 11
    • /
    • pp.33-39
    • /
    • 2020
  • Research in natural language processing is active across areas such as translation, morphological tagging, question answering, and sentiment analysis. In sentiment analysis, transfer learning from pretrained models has achieved high classification accuracy on single-domain English datasets. In this study, we use Korean e-commerce product review data spanning diverse domains and implement and compare the classification performance of a word-frequency-based BOW (Bag of Words) model, LSTM [1], Attention, CNN [2], ELMo [3], and KoBERT [4]. We confirm that transfer-learning models, which embed the same word differently depending on context, achieve higher accuracy than models that always embed a word identically; by analyzing model performance across 17 product categories, we propose a sentiment analysis model configuration applicable to the e-commerce industry. We also compare inference speed against model size to suggest a research direction for models that can support real-time service.
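
For orientation, here is a minimal sketch of the word-frequency baseline the comparison starts from: bag-of-words counts plus a linear classifier. The sample reviews and labels are invented placeholders; the contextual models compared in the paper (ELMo, KoBERT) would replace the count features with context-dependent embeddings.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Tiny invented sample of Korean product reviews (1 = positive, 0 = negative).
reviews = ["배송이 빠르고 좋아요", "품질이 너무 안 좋아요",
           "재구매 의사 있어요", "최악이에요"]
labels = [1, 0, 1, 0]

# Word-frequency BOW features followed by a linear classifier.
clf = make_pipeline(CountVectorizer(), LogisticRegression())
clf.fit(reviews, labels)
print(clf.predict(["정말 좋아요"]))  # expected: [1]
```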

Reverting Gene Expression Pattern of Cancer into Normal-Like Using Cycle-Consistent Adversarial Network

  • Lee, Chan-hee;Ahn, TaeJin
    • International Journal of Advanced Culture Technology
    • /
    • Vol. 6, No. 4
    • /
    • pp.275-283
    • /
    • 2018
  • Cancer shows a distinct pattern of gene expression compared with normal tissue, and this difference underlies its malignant characteristics. Many cancer drugs target this difference so that they can selectively kill cancer cells. One recent demand of personalized cancer treatment is to obtain normal tissue from a patient so that the gene expression difference between cancer and normal can be assessed. In most clinical situations, however, retrieving normal tissue is difficult, because a biopsy of normal tissue may damage organ function or expose the patient to a risk of infection or other side effects. The challenge is therefore to estimate the gene expression of the normal cells from which a cancer originated without taking an additional biopsy. In this paper, we propose an in-silico prediction of normal-cell gene expression from the gene expression data of a tumor sample; we call this challenge reverting cancer into normal. We divide it into two parts. The first is building a generator able to fool a pretrained discriminator. The discriminator is trained on public data (9,601 cancers, 7,240 normals) and discriminates cancer from normal gene expression patterns with an accuracy of 0.997, so deceiving it means our method generates very normal-like gene expression data. The second is assessing whether the generated normal profile is the true reverse form of the input cancer data. We approach both parts with cycle-consistent adversarial networks, since they can translate one domain into another while maintaining the original domain's features and adding the new domain's features. We show that when cancer data are fed into a cycle-consistent adversarial network, it retains most of the input information while changing the data toward normal, and we evaluate whether the generated normal-tissue expression is the biological reverse form of the input cancer expression.
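
The cycle-consistency constraint that preserves the input tumor's information can be sketched in a few lines of PyTorch. The generator shapes and gene count are illustrative assumptions, and the adversarial losses against the pretrained discriminator are omitted for brevity.

```python
import torch
import torch.nn as nn

N_GENES = 978  # illustrative profile length, not the paper's feature count

def generator():
    return nn.Sequential(nn.Linear(N_GENES, 512), nn.ReLU(),
                         nn.Linear(512, N_GENES))

G = generator()  # cancer -> normal-like expression
F = generator()  # normal -> cancer-like expression (the reverse mapping)

cancer = torch.randn(8, N_GENES)  # batch of tumor expression profiles
fake_normal = G(cancer)           # should fool the pretrained discriminator
# Cycle loss: reconstructing the input keeps the tumor's original information.
cycle_loss = nn.functional.l1_loss(F(fake_normal), cancer)
cycle_loss.backward()             # trained jointly with the adversarial losses
```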

Recent Automatic Post Editing Research

  • 문현석;박찬준;어수경;서재형;임희석
    • Journal of Digital Convergence
    • /
    • Vol. 19, No. 7
    • /
    • pp.199-208
    • /
    • 2021
  • Automatic post-editing (APE) is a research field proposed to automatically correct errors in machine-translated sentences. Its goal is to build an error-correction model that improves translation quality independently of the translation system, using source sentences, their machine translations, and human post-edits of those translations for training. In particular, recent APE research applies pretrained multilingual language models before training on the post-editing data. This paper introduces the multilingual pretrained language models used in recent studies, along with how each study applies them, and on that basis proposes future research directions that exploit translation models and mBART.
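
A common setup the survey describes can be sketched as follows: the source sentence and the raw machine translation are concatenated into one input, and a pretrained multilingual seq2seq model (mBART here) is fine-tuned to emit the human post-edit. The checkpoint, separator, and example sentences are assumptions for illustration.

```python
from transformers import MBartForConditionalGeneration, MBart50TokenizerFast

tokenizer = MBart50TokenizerFast.from_pretrained("facebook/mbart-large-50")
model = MBartForConditionalGeneration.from_pretrained("facebook/mbart-large-50")
tokenizer.src_lang, tokenizer.tgt_lang = "en_XX", "ko_KR"

src = "The committee approved the budget."  # source sentence
mt = "위원회가 예산을 승인하다."              # raw MT output to be corrected
pe = "위원회가 예산을 승인했습니다."          # human post-edit (training target)

# One common APE input format: source and MT joined by a separator token.
inputs = tokenizer(f"{src} </s> {mt}", return_tensors="pt")
labels = tokenizer(text_target=pe, return_tensors="pt").input_ids
loss = model(**inputs, labels=labels).loss  # fine-tuning objective
loss.backward()
```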

Korean and Multilingual Language Models Study for Cross-Lingual Post-Training (XPT)

  • 손수현;박찬준;이정섭;심미단;이찬희;박기남;임희석
    • Journal of the Korea Convergence Society
    • /
    • Vol. 13, No. 3
    • /
    • pp.77-89
    • /
    • 2022
  • Many studies have shown that pretrained language models trained on large corpora improve performance across a variety of natural language processing tasks. However, in low-resource language environments there are limits to building the large corpora needed to pretrain such models. We use Cross-lingual Post-Training (XPT), a methodology that can overcome this limitation, and analyze its effectiveness for Korean, a comparatively low-resource language. XPT selectively reuses the parameters of a pretrained language model for a resource-rich language (English) and employs adaptation layers to learn the relationship between the two languages. We show that, on a relation extraction task, XPT outperforms monolingual pretrained models of the source language using only a small amount of target-language data. In addition, we survey the Korean pretrained language models and Korean multilingual pretrained models released by domestic and international academia and industry, and analyze each model's characteristics.
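
The adaptation-layer idea can be sketched as a small residual bottleneck trained while the reused English encoder stays frozen. The bottleneck size and placement below are assumptions, not the paper's exact design.

```python
import torch
import torch.nn as nn

class Adapter(nn.Module):
    """Small trainable layer learning the source-to-target language mapping."""
    def __init__(self, dim=768, bottleneck=64):
        super().__init__()
        self.down = nn.Linear(dim, bottleneck)
        self.up = nn.Linear(bottleneck, dim)

    def forward(self, x):
        return x + self.up(torch.relu(self.down(x)))  # residual adaptation

# In XPT-style training, the reused encoder's parameters are frozen:
#   for p in english_encoder.parameters(): p.requires_grad = False
# and only adapters (and the task head) are updated on target-language data.
adapter = Adapter()
hidden = torch.randn(2, 10, 768)  # hidden states from the frozen encoder
print(adapter(hidden).shape)      # (2, 10, 768)
```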

Layer-wise hint-based training for knowledge transfer in a teacher-student framework

  • Bae, Ji-Hoon;Yim, Junho;Kim, Nae-Soo;Pyo, Cheol-Sig;Kim, Junmo
    • ETRI Journal
    • /
    • Vol. 41, No. 2
    • /
    • pp.242-253
    • /
    • 2019
  • We devise a layer-wise hint training method to improve the existing hint-based knowledge distillation (KD) training approach, which is employed for knowledge transfer in a teacher-student framework using a residual network (ResNet). To achieve this objective, the proposed method first iteratively trains the student ResNet and incrementally employs hint-based information extracted from the pretrained teacher ResNet, which contains several hint and guided layers. Next, typical softening factor-based KD training is performed using the previously estimated hint-based information. We compare the recognition accuracy of the proposed approach with that of KD training without hints, hint-based KD training, and ResNet-based layer-wise pretraining on widely used benchmark datasets, including CIFAR-10, CIFAR-100, and MNIST. When the selected multiple hint-based information items and their layer-wise transfer are used in the proposed method, the trained student ResNet reflects the pretrained teacher ResNet's rich information more accurately than it does under the baseline training methods, for all the benchmark datasets considered in this study.
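
The hint loss at the core of this family of methods (in the FitNets style) can be sketched as follows; the feature-map shapes and the 1x1-convolution regressor are illustrative assumptions.

```python
import torch
import torch.nn as nn

teacher_hint = torch.randn(8, 64, 16, 16)  # feature map from a frozen teacher layer
student_feat = torch.randn(8, 32, 16, 16)  # feature map from the guided student layer

# A small regressor aligns the student's channels with the teacher's hint layer.
regressor = nn.Conv2d(32, 64, kernel_size=1)
hint_loss = nn.functional.mse_loss(regressor(student_feat), teacher_hint)
hint_loss.backward()  # drives the student toward the teacher's representation
```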

Dialog-based multi-item recommendation using automatic evaluation

  • Euisok Chung;Hyun Woo Kim;Byunghyun Yoo;Ran Han;Jeongmin Yang;Hwa Jeon Song
    • ETRI Journal
    • /
    • Vol. 46, No. 2
    • /
    • pp.277-289
    • /
    • 2024
  • In this paper, we describe a neural network-based application that recommends multiple items from dialog context input while simultaneously outputting a response sentence. We instantiate multi-item recommendation as a set of clothing recommendations, which requires a multimodal fusion approach that can process both clothing-related text and images. We also examine how a pretrained language model can meet the requirements of the downstream models, and we propose gate-based multimodal fusion and multiprompt learning on top of a pretrained language model. In addition, we propose an automatic evaluation technique to address the one-to-many mapping problem of multi-item recommendation. A Korean fashion-domain multimodal dataset is constructed and tested, and various experimental settings are verified with the automatic evaluation method. The results show that our method yields confidence scores for multi-item recommendation results, unlike traditional accuracy-based evaluation.
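
One plausible reading of the gate-based fusion is the per-dimension sigmoid gate sketched below, which decides how much of the text and image representations to mix; the dimensions and gating form are assumptions, not the paper's exact module.

```python
import torch
import torch.nn as nn

class GatedFusion(nn.Module):
    """Learned gate mixing text and image embeddings per dimension."""
    def __init__(self, dim=768):
        super().__init__()
        self.gate = nn.Linear(dim * 2, dim)

    def forward(self, text_emb, image_emb):
        g = torch.sigmoid(self.gate(torch.cat([text_emb, image_emb], dim=-1)))
        return g * text_emb + (1 - g) * image_emb  # gated convex mixture

fusion = GatedFusion()
fused = fusion(torch.randn(4, 768), torch.randn(4, 768))
print(fused.shape)  # (4, 768); fed to the recommendation/response heads
```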

Improving Chest X-ray Image Classification via Integration of Self-Supervised Learning and Machine Learning Algorithms

  • Tri-Thuc Vo;Thanh-Nghi Do
    • Journal of information and communication convergence engineering
    • /
    • Vol. 22, No. 2
    • /
    • pp.165-171
    • /
    • 2024
  • In this study, we present a novel approach for enhancing chest X-ray image classification (normal, Covid-19, edema, mass nodules, and pneumothorax) by combining contrastive learning and machine learning algorithms. A vast amount of unlabeled data was leveraged to learn representations, improving data efficiency and addressing the limited availability of labeled X-ray data. Our approach involves training classification algorithms on features extracted from a linearly fine-tuned Momentum Contrast (MoCo) model. The MoCo architecture, with a ResNet34, ResNet50, or ResNet101 backbone, is trained to learn features from unlabeled data. Instead of only fine-tuning the linear classifier layer on the MoCo-pretrained model, we propose training nonlinear classifiers as substitutes for the softmax layer in deep networks. The empirical results show that the linearly fine-tuned ImageNet-pretrained models achieved a highest accuracy of only 82.9%, the linearly fine-tuned MoCo-pretrained models improved this to 84.8%, and our proposed method offered a significant further improvement, achieving the highest accuracy of 87.9%.
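
The substitution of a nonlinear classifier for the linear softmax layer can be sketched as below, with an RBF-kernel SVM trained on frozen MoCo features; the random features, labels, and classifier choice are placeholders, not the paper's exact setup.

```python
import numpy as np
from sklearn.svm import SVC

# Placeholder for features extracted from a frozen MoCo-pretrained backbone.
features = np.random.randn(200, 512)
labels = np.random.randint(0, 5, 200)  # 5 chest X-ray classes

clf = SVC(kernel="rbf")  # nonlinear substitute for the linear softmax head
clf.fit(features, labels)
print(clf.score(features, labels))  # training accuracy of the sketch
```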

Zero-shot Lexical Semantics based on Perplexity of Pretrained Language Models

  • 최형준;나승훈
    • KIISE SIG on Language Engineering: Conference Proceedings (Hangul and Korean Information Processing)
    • /
    • The 33rd Hangul and Korean Information Processing Conference, KIISE SIG on Language Engineering (2021)
    • /
    • pp.473-475
    • /
    • 2021
  • Implementing synonym recommendation requires computing the similarity between words. However, existing methods for computing word similarity cannot handle words that do not appear in the dataset. To address this, we compute word similarity using the perplexity (PPL) of a language model, and when recommending synonyms with this method we observe an MRR of 41.31%.
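
The perplexity-scoring step can be sketched with any causal language model; GPT-2 below is a stand-in for the Korean LM the paper would actually use, and the context sentence and candidate words are invented.

```python
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2").eval()

def perplexity(sentence: str) -> float:
    ids = tokenizer(sentence, return_tensors="pt").input_ids
    with torch.no_grad():
        loss = model(ids, labels=ids).loss  # mean token cross-entropy
    return torch.exp(loss).item()

# Rank candidate substitutes by how naturally they fit the context:
context = "The movie was absolutely {}."
for word in ["wonderful", "terrible", "refrigerator"]:
    print(word, perplexity(context.format(word)))  # lower PPL = better fit
```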
