• Title/Summary/Keyword: BERT

Search Result 396, Processing Time 0.022 seconds

Effects of Gamma-ray and Chemical Mutagens on the Germination and Seedling Growth in Stevia rebaudiana Bert. (감마선 및 화학적 돌연변이원 처리가 스테비아 (Stevia rebaudiana Bert.)의 종자 발아 및 초기 생장에 미치는 영향)

  • Yoon, Tai-Young;Kim, Ee-Youb;Kim, Young-Ho;Choi, Gin-Su;Hyun, Kyung-Sup;Seong, Yoon-Hee;Jo, Han-Jig;Kim, Dong Sub;Kang, Si-Yong;Ko, Jeong-Ae
    • Journal of Radiation Industry
    • /
    • v.6 no.2
    • /
    • pp.189-197
    • /
    • 2012
  • This study was carried out to develop the improved useful mutants for yield or composition of stevia plants using the gamma ray or chemical mutagens treatments. The seeds of stevia 'Suwon No. 11' were irradiated up to 400 Gy of gamma ray. Chemical mutagens were treated on the seeds of the 'Suwon No. 11' using 0.07% colchicine, 10 mM sodium azide, or 10 mM NMU for various durations. The germination rate, and shoot and root growth of seedling were estimated at 30 days after gamma ray irradiation or chemical mutagen treatment, and the plant height, the number of branches, and leaf length and width were examined at 3 months after mutagenesis treatments. In the case of gamma ray treatments, the germination rate and early-stage growth were decreased as the increase of radiation dose, and the 50% lethal dose was found to be 200 Gy. the plant height was decreased as the increase of radiation dose, while the number of branches per plant and leaf length were increased. Leaf shape was modified to the relatively longer one compared to the control, which was identified more apparently at the treatments of higher than 150 Gy. In the treatment of chemical mutagens, the rate of germination and survival were decreased as the increase of incubation time. The 50% lethal dose for germination rate were identified as the conditions of the 15 hours incubation in 0.07% colchicine, the 4 hrs in 10 mM sodium azide, and the 2 hrs in 10 mM NMU, in the three chemical mutagens treatments. Chemical mutagens had no influence on shoot growth, while root growth was increased, especially as the incubation time was extended. The highest root growth occurred in the NMU treatment at 6 hrs incubation time. The plant height was decreased as the increase of incubation time in the chemical mutagens treatments. Among the chemical mutagens, NMU was the most effective to induce the mutants with long-shaped or the least lobed leaves.

A Study on Classification of Mobile Application Reviews Using Deep Learning (딥러닝을 활용한 모바일 어플리케이션 리뷰 분류에 관한 연구)

  • Son, Jae Ik;Noh, Mi Jin;Rahman, Tazizur;Pyo, Gyujin;Han, Mumoungcho;Kim, Yang Sok
    • Smart Media Journal
    • /
    • v.10 no.2
    • /
    • pp.76-83
    • /
    • 2021
  • With the development and use of smart devices such as smartphones and tablets increases, the mobile application market based on mobile devices is growing rapidly. Mobile application users write reviews to share their experience in using the application, which can identify consumers' various needs and application developers can receive useful feedback on improving the application through reviews written by consumers. However, there is a need to come up with measures to minimize the amount of time and expense that consumers have to pay to manually analyze the large amount of reviews they leave. In this work, we propose to collect delivery application user reviews from Google PlayStore and then use machine learning and deep learning techniques to classify them into four categories like application feature advantages, disadvantages, feature improvement requests and bug report. In the case of the performance of the Hugging Face's pretrained BERT-based Transformer model, the f1 score values for the above four categories were 0.93, 0.51, 0.76, and 0.83, respectively, showing superior performance than LSTM and GRU.

Multimodal Sentiment Analysis Using Review Data and Product Information (리뷰 데이터와 제품 정보를 이용한 멀티모달 감성분석)

  • Hwang, Hohyun;Lee, Kyeongchan;Yu, Jinyi;Lee, Younghoon
    • The Journal of Society for e-Business Studies
    • /
    • v.27 no.1
    • /
    • pp.15-28
    • /
    • 2022
  • Due to recent expansion of online market such as clothing, utilizing customer review has become a major marketing measure. User review has been used as a tool of analyzing sentiment of customers. Sentiment analysis can be largely classified with machine learning-based and lexicon-based method. Machine learning-based method is a learning classification model referring review and labels. As research of sentiment analysis has been developed, multi-modal models learned by images and video data in reviews has been studied. Characteristics of words in reviews are differentiated depending on products' and customers' categories. In this paper, sentiment is analyzed via considering review data and metadata of products and users. Gated Recurrent Unit (GRU), Long Short-Term Memory (LSTM), Self Attention-based Multi-head Attention models and Bidirectional Encoder Representation from Transformer (BERT) are used in this study. Same Multi-Layer Perceptron (MLP) model is used upon every products information. This paper suggests a multi-modal sentiment analysis model that simultaneously considers user reviews and product meta-information.

Artificial Intelligence for Assistance of Facial Expression Practice Using Emotion Classification (감정 분류를 이용한 표정 연습 보조 인공지능)

  • Dong-Kyu, Kim;So Hwa, Lee;Jae Hwan, Bong
    • The Journal of the Korea institute of electronic communication sciences
    • /
    • v.17 no.6
    • /
    • pp.1137-1144
    • /
    • 2022
  • In this study, an artificial intelligence(AI) was developed to help with facial expression practice in order to express emotions. The developed AI used multimodal inputs consisting of sentences and facial images for deep neural networks (DNNs). The DNNs calculated similarities between the emotions predicted by the sentences and the emotions predicted by facial images. The user practiced facial expressions based on the situation given by sentences, and the AI provided the user with numerical feedback based on the similarity between the emotion predicted by sentence and the emotion predicted by facial expression. ResNet34 structure was trained on FER2013 public data to predict emotions from facial images. To predict emotions in sentences, KoBERT model was trained in transfer learning manner using the conversational speech dataset for emotion classification opened to the public by AIHub. The DNN that predicts emotions from the facial images demonstrated 65% accuracy, which is comparable to human emotional classification ability. The DNN that predicts emotions from the sentences achieved 90% accuracy. The performance of the developed AI was evaluated through experiments with changing facial expressions in which an ordinary person was participated.

Study on Zero-shot based Quality Estimation (Zero-Shot 기반 기계번역 품질 예측 연구)

  • Eo, Sugyeong;Park, Chanjun;Seo, Jaehyung;Moon, Hyeonseok;Lim, Heuiseok
    • Journal of the Korea Convergence Society
    • /
    • v.12 no.11
    • /
    • pp.35-43
    • /
    • 2021
  • Recently, there has been a growing interest in zero-shot cross-lingual transfer, which leverages cross-lingual language models (CLLMs) to perform downstream tasks that are not trained in a specific language. In this paper, we point out the limitations of the data-centric aspect of quality estimation (QE), and perform zero-shot cross-lingual transfer even in environments where it is difficult to construct QE data. Few studies have dealt with zero-shots in QE, and after fine-tuning the English-German QE dataset, we perform zero-shot transfer leveraging CLLMs. We conduct comparative analysis between various CLLMs. We also perform zero-shot transfer on language pairs with different sized resources and analyze results based on the linguistic characteristics of each language. Experimental results showed the highest performance in multilingual BART and multillingual BERT, and we induced QE to be performed even when QE learning for a specific language pair was not performed at all.

Topic Model Augmentation and Extension Method using LDA and BERTopic (LDA와 BERTopic을 이용한 토픽모델링의 증강과 확장 기법 연구)

  • Kim, SeonWook;Yang, Kiduk
    • Journal of the Korean Society for information Management
    • /
    • v.39 no.3
    • /
    • pp.99-132
    • /
    • 2022
  • The purpose of this study is to propose AET (Augmented and Extended Topics), a novel method of synthesizing both LDA and BERTopic results, and to analyze the recently published LIS articles as an experimental approach. To achieve the purpose of this study, 55,442 abstracts from 85 LIS journals within the WoS database, which spans from January 2001 to October 2021, were analyzed. AET first constructs a WORD2VEC-based cosine similarity matrix between LDA and BERTopic results, extracts AT (Augmented Topics) by repeating the matrix reordering and segmentation procedures as long as their semantic relations are still valid, and finally determines ET (Extended Topics) by removing any LDA related residual subtopics from the matrix and ordering the rest of them by F1 (BERTopic topic size rank, Inverse cosine similarity rank). AET, by comparing with the baseline LDA result, shows that AT has effectively concretized the original LDA topic model and ET has discovered new meaningful topics that LDA didn't. When it comes to the qualitative performance evaluation, AT performs better than LDA while ET shows similar performances except in a few cases.

Open Domain Machine Reading Comprehension using InferSent (InferSent를 활용한 오픈 도메인 기계독해)

  • Jeong-Hoon, Kim;Jun-Yeong, Kim;Jun, Park;Sung-Wook, Park;Se-Hoon, Jung;Chun-Bo, Sim
    • Smart Media Journal
    • /
    • v.11 no.10
    • /
    • pp.89-96
    • /
    • 2022
  • An open domain machine reading comprehension is a model that adds a function to search paragraphs as there are no paragraphs related to a given question. Document searches have an issue of lower performance with a lot of documents despite abundant research with word frequency based TF-IDF. Paragraph selections also have an issue of not extracting paragraph contexts, including sentence characteristics accurately despite a lot of research with word-based embedding. Document reading comprehension has an issue of slow learning due to the growing number of parameters despite a lot of research on BERT. Trying to solve these three issues, this study used BM25 which considered even sentence length and InferSent to get sentence contexts, and proposed an open domain machine reading comprehension with ALBERT to reduce the number of parameters. An experiment was conducted with SQuAD1.1 datasets. BM25 recorded a higher performance of document research than TF-IDF by 3.2%. InferSent showed a higher performance in paragraph selection than Transformer by 0.9%. Finally, as the number of paragraphs increased in document comprehension, ALBERT was 0.4% higher in EM and 0.2% higher in F1.

Industrial Technology Leak Detection System on the Dark Web (다크웹 환경에서 산업기술 유출 탐지 시스템)

  • Young Jae, Kong;Hang Bae, Chang
    • Smart Media Journal
    • /
    • v.11 no.10
    • /
    • pp.46-53
    • /
    • 2022
  • Today, due to the 4th industrial revolution and extensive R&D funding, domestic companies have begun to possess world-class industrial technologies and have grown into important assets. The national government has designated it as a "national core technology" in order to protect companies' critical industrial technologies. Particularly, technology leaks in the shipbuilding, display, and semiconductor industries can result in a significant loss of competitiveness not only at the company level but also at the national level. Every year, there are more insider leaks, ransomware attacks, and attempts to steal industrial technology through industrial spy. The stolen industrial technology is then traded covertly on the dark web. In this paper, we propose a system for detecting industrial technology leaks in the dark web environment. The proposed model first builds a database through dark web crawling using information collected from the OSINT environment. Afterwards, keywords for industrial technology leakage are extracted using the KeyBERT model, and signs of industrial technology leakage in the dark web environment are proposed as quantitative figures. Finally, based on the identified industrial technology leakage sites in the dark web environment, the possibility of secondary leakage is detected through the PageRank algorithm. The proposed method accepted for the collection of 27,317 unique dark web domains and the extraction of 15,028 nuclear energy-related keywords from 100 nuclear power patents. 12 dark web sites identified as a result of detecting secondary leaks based on the highest nuclear leak dark web sites.

Building Sentence Meaning Identification Dataset Based on Social Problem-Solving R&D Reports (사회문제 해결 연구보고서 기반 문장 의미 식별 데이터셋 구축)

  • Hyeonho Shin;Seonki Jeong;Hong-Woo Chun;Lee-Nam Kwon;Jae-Min Lee;Kanghee Park;Sung-Pil Choi
    • KIPS Transactions on Software and Data Engineering
    • /
    • v.12 no.4
    • /
    • pp.159-172
    • /
    • 2023
  • In general, social problem-solving research aims to create important social value by offering meaningful answers to various social pending issues using scientific technologies. Not surprisingly, however, although numerous and extensive research attempts have been made to alleviate the social problems and issues in nation-wide, we still have many important social challenges and works to be done. In order to facilitate the entire process of the social problem-solving research and maximize its efficacy, it is vital to clearly identify and grasp the important and pressing problems to be focused upon. It is understandable for the problem discovery step to be drastically improved if current social issues can be automatically identified from existing R&D resources such as technical reports and articles. This paper introduces a comprehensive dataset which is essential to build a machine learning model for automatically detecting the social problems and solutions in various national research reports. Initially, we collected a total of 700 research reports regarding social problems and issues. Through intensive annotation process, we built totally 24,022 sentences each of which possesses its own category or label closely related to social problem-solving such as problems, purposes, solutions, effects and so on. Furthermore, we implemented four sentence classification models based on various neural language models and conducted a series of performance experiments using our dataset. As a result of the experiment, the model fine-tuned to the KLUE-BERT pre-trained language model showed the best performance with an accuracy of 75.853% and an F1 score of 63.503%.

A Deep Learning-based Depression Trend Analysis of Korean on Social Media (딥러닝 기반 소셜미디어 한글 텍스트 우울 경향 분석)

  • Park, Seojeong;Lee, Soobin;Kim, Woo Jung;Song, Min
    • Journal of the Korean Society for information Management
    • /
    • v.39 no.1
    • /
    • pp.91-117
    • /
    • 2022
  • The number of depressed patients in Korea and around the world is rapidly increasing every year. However, most of the mentally ill patients are not aware that they are suffering from the disease, so adequate treatment is not being performed. If depressive symptoms are neglected, it can lead to suicide, anxiety, and other psychological problems. Therefore, early detection and treatment of depression are very important in improving mental health. To improve this problem, this study presented a deep learning-based depression tendency model using Korean social media text. After collecting data from Naver KonwledgeiN, Naver Blog, Hidoc, and Twitter, DSM-5 major depressive disorder diagnosis criteria were used to classify and annotate classes according to the number of depressive symptoms. Afterwards, TF-IDF analysis and simultaneous word analysis were performed to examine the characteristics of each class of the corpus constructed. In addition, word embedding, dictionary-based sentiment analysis, and LDA topic modeling were performed to generate a depression tendency classification model using various text features. Through this, the embedded text, sentiment score, and topic number for each document were calculated and used as text features. As a result, it was confirmed that the highest accuracy rate of 83.28% was achieved when the depression tendency was classified based on the KorBERT algorithm by combining both the emotional score and the topic of the document with the embedded text. This study establishes a classification model for Korean depression trends with improved performance using various text features, and detects potential depressive patients early among Korean online community users, enabling rapid treatment and prevention, thereby enabling the mental health of Korean society. It is significant in that it can help in promotion.