• Title/Summary/Keyword: Model Tuning

Korean Food Review Analysis Using Large Language Models: Sentiment Analysis and Multi-Labeling for Food Safety Hazard Detection (대형 언어 모델을 활용한 한국어 식품 리뷰 분석: 감성분석과 다중 라벨링을 통한 식품안전 위해 탐지 연구)

  • Eun-Seon Choi;Kyung-Hee Lee;Wan-Sup Cho
    • The Journal of Bigdata / v.9 no.1 / pp.75-88 / 2024
  • Recent news reports have described consumers developing food poisoning symptoms after eating raw beef bought on online platforms, as well as reviews complaining that cherry tomatoes tasted bitter. Such cases suggest that food reviews on online platforms can be analyzed to detect food hazards, helping government agencies, food manufacturers, and distributors manage consumer food safety risks. This study proposes a classification model that uses sentiment analysis and large language models to analyze food reviews, detect negative ones, and multi-label them with key food safety hazards (food poisoning, spoilage, chemical odors, foreign objects). The sentiment analysis model, built as a 'funnel' model, minimized the misclassification of negative reviews and kept the False Positive rate low. The multi-labeling model for food safety hazards performed better with GPT-4 Turbo than with GPT-3.5, with both recall and accuracy above 96%. Government agencies, food manufacturers, and distributors can use the proposed model to monitor consumer reviews in real time, detect potential food safety issues early, and manage risks. Such a system can protect corporate brand reputation, enhance consumer protection, and ultimately improve consumer health and safety.

Business Application of Convolutional Neural Networks for Apparel Classification Using Runway Image (합성곱 신경망의 비지니스 응용: 런웨이 이미지를 사용한 의류 분류를 중심으로)

  • Seo, Yian;Shin, Kyung-shik
    • Journal of Intelligence and Information Systems / v.24 no.3 / pp.1-19 / 2018
  • Large amounts of data are now available to the research and business sectors for knowledge extraction. Much of this data is unstructured, such as audio, text, and images, and can be analyzed with deep learning, which is now widely used for estimation, classification, and prediction problems. The fashion business in particular adopts deep learning techniques for apparel recognition, apparel search and retrieval engines, and automatic product recommendation. The core model of these applications is image classification using Convolutional Neural Networks (CNNs). A CNN is made up of neurons that learn parameters such as weights as inputs pass through the network to the outputs. Its layer structure is well suited to image classification: convolutional layers generate feature maps, pooling layers reduce the dimensionality of the feature maps, and fully-connected layers classify the extracted features. However, most classification models have been trained on online product images taken under controlled conditions, showing either the apparel by itself or a professional model wearing it. Such images may not train a classifier effectively for street fashion or walking images, which are taken in uncontrolled conditions and involve movement and unexpected poses. We therefore propose training the model on a runway apparel image dataset, which captures mobility. This exposes the classification model to far more variable data and improves its adaptation to diverse query images. To achieve both convergence and generalization, we apply Transfer Learning to our training network. Because Transfer Learning in CNNs consists of pre-training and fine-tuning stages, we divide the training step into two.
First, we pre-train our architecture on a large-scale dataset, ImageNet, which consists of 1.2 million images in 1000 categories including animals, plants, activities, materials, instrumentations, scenes, and foods. We use GoogLeNet as our main architecture because it achieved high accuracy with efficiency in the ImageNet Large Scale Visual Recognition Challenge (ILSVRC). Second, we fine-tune the network on our own runway image dataset. Because we could not find any existing public runway image dataset, we collected one from Google Image Search, obtaining 2426 images of 32 major fashion brands: Anna Molinari, Balenciaga, Balmain, Brioni, Burberry, Celine, Chanel, Chloe, Christian Dior, Cividini, Dolce and Gabbana, Emilio Pucci, Ermenegildo, Fendi, Giuliana Teso, Gucci, Issey Miyake, Kenzo, Leonard, Louis Vuitton, Marc Jacobs, Marni, Max Mara, Missoni, Moschino, Ralph Lauren, Roberto Cavalli, Sonia Rykiel, Stella McCartney, Valentino, Versace, and Yves Saint Laurent. We perform 10-fold experiments to account for the random generation of training data, and the proposed model achieved an accuracy of 67.2% on the final test. To the best of our knowledge, no previous study has trained a network for apparel image classification on a runway image dataset, which is an advantage of our research over related work. We suggest training the model on images that capture all possible postures, which we denote as mobility, by using our own runway apparel image dataset. Moreover, by applying Transfer Learning and using the checkpoint and parameters provided by TensorFlow Slim, we reduced the time spent training the classifier to 6 minutes per experiment. This model can be used in many business applications where the query image may be a runway image, product image, or street fashion image.
Specifically, a runway query image can be used by a mobile application service during fashion week to facilitate brand search, a street style query image can be classified during fashion editorial tasks to label the brand or style, and a website query image can be processed by an e-commerce multi-complex service to provide item information or recommend similar items.
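The ten-fold experiments mentioned above can be illustrated with a minimal Python sketch. It assumes the abstract means standard 10-fold cross-validation over the 2426 collected images, which the abstract does not state explicitly. The helper names `k_fold_indices` and `train_test_splits` are hypothetical, and no actual model training is shown.

```python
import random

def k_fold_indices(n_items, k=10, seed=0):
    """Shuffle indices 0..n_items-1 and deal them into k nearly equal folds."""
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)  # fixed seed keeps the split reproducible
    # Fold i takes every k-th shuffled index starting at position i.
    return [indices[i::k] for i in range(k)]

def train_test_splits(folds):
    """Yield (train, test) index lists, holding out one fold at a time."""
    for held_out, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != held_out for idx in fold]
        yield train, test

# 2426 is the image count reported in the abstract; k=10 is our assumption.
folds = k_fold_indices(2426, k=10)
splits = list(train_test_splits(folds))
```

Each of the ten splits would then fine-tune the pre-trained network on `train` and report accuracy on `test`. Averaging the ten accuracies gives one overall figure.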

Novel Power Bus Design Method for High-Speed Digital Boards (고속 디지털 보드를 위한 새로운 전압 버스 설계 방법)

  • Wee, Jae-Kyung
    • Journal of the Institute of Electronics Engineers of Korea SD / v.43 no.12 s.354 / pp.23-32 / 2006
  • A fast and accurate power bus design (FAPUD) method for multi-layer high-speed digital boards is devised as the basis of a power supply network design tool for accurate high-speed boards. FAPUD is built on two main algorithms: the PBEC (Path Based Equivalent Circuit) model and the network synthesis method. The PBEC model uses simple arithmetic expressions to derive a lumped 1-D circuit model from the electrical parameters of a 2-D power distribution network. The circuit-level design based on PBEC is carried out with the proposed regional approach. It directly calculates and determines the size of on-chip decoupling capacitors, the size and location of off-chip decoupling capacitors, and the effective inductances of the package power bus. As design output, FAPUD produces a lumped circuit model and a pre-layout of the power bus including all decoupling capacitors. In the tuning procedure, the board can be re-optimized to account for simultaneous switching noise (SSN) added by I/O switching, because the effect of I/O switching on power supply noise can be estimated over the operating frequency range with the lumped circuit model. Furthermore, if a design changes or needs to be tuned, FAPUD can modify the design by replacing decoupling capacitors without consuming other design resources. Finally, FAPUD is accurate compared with conventional PEEC-based design tools, and it completes a design 10 times faster than they do.

Development of Joint-Based Motion Prediction Model for Home Co-Robot Using SVM (SVM을 이용한 가정용 협력 로봇의 조인트 위치 기반 실행동작 예측 모델 개발)

  • Yoo, Sungyeob;Yoo, Dong-Yeon;Park, Ye-Seul;Lee, Jung-Won
    • KIPS Transactions on Software and Data Engineering / v.8 no.12 / pp.491-498 / 2019
  • A digital twin is a technology that virtualizes physical objects of the real world on a computer. It collects sensor data through IoT and uses that data to link physical and virtual objects in both directions. Its advantages are that risk can be minimized by tuning the operation of the virtual model through simulation, and that varying environments can be handled by running experiments in advance. With the recent attention on artificial intelligence and machine learning, there is a growing tendency to virtualize the behavior of physical objects, observe virtual models, and apply various scenarios. In particular, building a digital twin for a co-robot, which is at the heart of Industry 4.0 factory automation, requires recognizing each of the robot's motions. Compared with modeling-based research on recognizing co-robot motion, there have been few attempts to predict motion from sensor data. This paper therefore builds an experimental environment for collecting current and inertia data from a co-robot to detect its motion, and proposes a motion prediction model based on the collected sensor data. The proposed method classifies the co-robot's motion commands into 9 types based on joint position and uses current and inertial sensor values to predict them by accumulated learning. The data used for accumulated learning are the sensor values collected while the co-robot operates with a margin in the input parameters of the motion commands. As a result, the model predicts not only the nine movements along the same path but also movements along similar paths. After training with SVM, the accuracy, precision, and recall of the model averaged 97%.

Enhanced Variable Structure Control With Fuzzy Logic System

  • Charnprecharut, Veeraphon;Phaitoonwattanakij, Kitti;Tiacharoen, Somporn
    • Institute of Control, Robotics and Systems (제어로봇시스템학회): Conference Proceedings / 2005.06a / pp.999-1004 / 2005
  • This paper presents an algorithm for a hybrid controller for nonlinear systems that consists of a sliding mode control part and a fuzzy logic part. The sliding mode part is based on an "eigenvalue/vector"-type controller, which is used as the backstepping approach for tracking errors. The fuzzy logic part is a Mamdani fuzzy model, designed by applying the sliding mode control (SMC) method to the dynamic model. The main objective is to use SMC to keep the update dynamics in a stable region. The plant behavior is then presented to the training procedure of an adaptive neuro-fuzzy inference system (ANFIS). The ANFIS architecture is determined and the relevant formulation for the approach is given. The inputs are the error (e) and the rate of error (de), which arise from the difference between the desired output value (yd) and the actual output value (y) of the system. A dynamic adaptation law is proposed, and the particular chosen form of the adaptation strategy is proved. Subsequently, variable structure control (VSC) creates a sliding mode in the plant behavior while the parameters of the controller are also in a sliding mode (stable trainer). This study considers an ANFIS structure with a first-order Sugeno model containing nine rules. Bell-shaped membership functions with the product inference rule are used at the fuzzification level. Finally, the Mamdani fuzzy logic, which depends on the ANFIS structure, is designed. In the transfer stage from ANFIS to the Mamdani fuzzy model, the membership functions of the input values (e, de) and the actual output value (y) of the system are changed to trapezoidal and triangular functions by tuning the parameters of the membership functions and the rule base. This helps adjust the contributions of both fuzzy control and variable structure control to the overall control value. As an application example, control of a mass-damper system is considered. The simulation was done using MATLAB.
Three controller cases are considered: a backstepping sliding-mode controller, a hybrid controller, and an adaptive backstepping sliding-mode controller. A numerical example is simulated to verify the performance of the proposed control strategy, and the simulation results show that the designed controller is more effective than the adaptive backstepping sliding-mode controller.
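The membership function shapes named in this abstract (bell-shaped, triangular, trapezoidal) can be written down compactly. The sketch below assumes the generalized bell form commonly paired with ANFIS, 1 / (1 + |(x - c) / a|^(2b)). The abstract does not give the authors' exact parameterization, so the function names and parameters here are illustrative only.

```python
def bell_mf(x, a, b, c):
    """Generalized bell membership: 1 / (1 + |(x - c) / a|^(2b)).

    a sets the width, b the steepness of the shoulders, c the center.
    """
    return 1.0 / (1.0 + abs((x - c) / a) ** (2 * b))

def triangular_mf(x, left, peak, right):
    """Triangle rising from 0 at `left` to 1 at `peak`, back to 0 at `right`."""
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)

def trapezoidal_mf(x, left, top_left, top_right, right):
    """Trapezoid that equals 1 on the plateau [top_left, top_right]."""
    if x <= left or x >= right:
        return 0.0
    if x < top_left:
        return (x - left) / (top_left - left)
    if x <= top_right:
        return 1.0
    return (right - x) / (right - top_right)
```

Tuning the membership functions, as described above, amounts to adjusting these breakpoints (or `a`, `b`, `c` for the bell form) for the error and rate-of-error inputs.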

Change Attention-based Vehicle Scratch Detection System (변화 주목 기반 차량 흠집 탐지 시스템)

  • Lee, EunSeong;Lee, DongJun;Park, GunHee;Lee, Woo-Ju;Sim, Donggyu;Oh, Seoung-Jun
    • Journal of Broadcast Engineering / v.27 no.2 / pp.228-239 / 2022
  • In this paper, we propose an unmanned vehicle scratch detection deep learning model for car sharing services. Conventional scratch detection models consist of two steps: 1) a deep learning module for scratch detection of images before and after rental, 2) a manual matching process for finding newly generated scratches. In order to build a fully automatic scratch detection model, we propose a one-step unmanned scratch detection deep learning model. The proposed model is implemented by applying transfer learning and fine-tuning to the deep learning model that detects changes in satellite images. In the proposed car sharing service, specular reflection greatly affects the scratch detection performance since the brightness of the gloss-treated automobile surface is anisotropic and a non-expert user takes a picture with a general camera. In order to reduce detection errors caused by specular reflected light, we propose a preprocessing process for removing specular reflection components. For data taken by mobile phone cameras, the proposed system can provide high matching performance subjectively and objectively. The scores for change detection metrics such as precision, recall, F1, and kappa are 67.90%, 74.56%, 71.08%, and 70.18%, respectively.
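For readers unfamiliar with the four change detection metrics listed above, the following sketch computes them from binary confusion counts using the textbook definitions. The function name and the example counts are hypothetical and are not taken from the paper. The paper may also compute these metrics per pixel or per region, which the abstract does not specify.

```python
def change_detection_scores(tp, fp, fn, tn):
    """Precision, recall, F1, and Cohen's kappa from binary confusion counts."""
    total = tp + fp + fn + tn
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    # Kappa compares observed agreement with the agreement expected by chance.
    observed = (tp + tn) / total
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / total ** 2
    kappa = (observed - expected) / (1 - expected)
    return {"precision": precision, "recall": recall, "f1": f1, "kappa": kappa}

# Hypothetical counts, chosen only to illustrate the arithmetic.
scores = change_detection_scores(tp=40, fp=10, fn=10, tn=40)
```

Kappa is useful in change detection because unchanged regions usually dominate, so raw agreement alone can look high even for a weak detector.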

Breast Cancer Histopathological Image Classification Based on Deep Neural Network with Pre-Trained Model Architecture (사전훈련된 모델구조를 이용한 심층신경망 기반 유방암 조직병리학적 이미지 분류)

  • Mudeng, Vicky;Lee, Eonjin;Choe, Se-woon
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2022.05a / pp.399-401 / 2022
  • A definitive diagnosis of breast malignancy status may be achieved by microscopic analysis of a surgical open biopsy. However, this procedure requires experts specializing in histopathological image analysis, which makes it time-consuming and costly. To overcome these issues, deep learning is considered a practical and efficient way to categorize breast cancer as benign or malignant from histopathological images in order to assist pathologists. This study presents a pre-trained convolutional neural network model architecture with a 100% fine-tuning scheme and the Adagrad optimizer to classify breast cancer histopathological images as benign or malignant using the 40× magnification BreaKHis dataset. The pre-trained architecture was built on the InceptionResNetV2 model, producing a modified InceptionResNetV2 in which the last layer is replaced with dense and dropout layers. The results, with a training loss of 0.25%, training accuracy of 99.96%, validation loss of 3.10%, validation accuracy of 99.41%, test loss of 8.46%, and test accuracy of 98.75%, indicate that the modified InceptionResNetV2 model can reliably predict the breast malignancy type from histopathological images. Future work should focus on k-fold cross-validation, the optimizer, the model, hyperparameter optimization, and classification at 100×, 200×, and 400× magnification.

Generative AI service implementation using LLM application architecture: based on RAG model and LangChain framework (LLM 애플리케이션 아키텍처를 활용한 생성형 AI 서비스 구현: RAG모델과 LangChain 프레임워크 기반)

  • Cheonsu Jeong
    • Journal of Intelligence and Information Systems / v.29 no.4 / pp.129-164 / 2023
  • As recent developments in generative AI technology expand the use and adoption of Large Language Models (LLMs), existing studies offer few actual application cases or implementation methods for using internal company data. Accordingly, this study presents a method of implementing generative AI services with an LLM application architecture built on the LangChain framework, which is the most widely used. We review various ways to overcome the problem of lack of information, focusing on the use of LLMs, and present specific solutions. To this end, we analyze methods of fine-tuning or direct use of document information and look in detail at the main steps of information storage and retrieval using the retrieval augmented generation (RAG) model. In particular, similar context recommendation and Question-Answering (QA) systems are used as a way to store and search information in a vector store with the RAG model. In addition, the specific operation method and the major implementation steps and cases, including the implementation source and user interface, are presented to enhance understanding of generative AI technology. This work is meaningful and valuable in that it enables LLMs to be actively used in implementing services within companies.
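The store-and-retrieve step of RAG described above can be sketched without any framework. The toy below is not LangChain code. `ToyVectorStore`, `embed`, and `build_prompt` are hypothetical names, and the bag-of-words "embedding" is a stand-in for the learned embedding model a real system would use. The sample documents are invented for illustration.

```python
import math
from collections import Counter

def embed(text):
    """Toy embedding: a bag-of-words count vector (assumption, not a real model)."""
    return Counter(text.lower().split())

def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[t] * v[t] for t in u if t in v)
    norm = math.sqrt(sum(c * c for c in u.values())) * math.sqrt(sum(c * c for c in v.values()))
    return dot / norm if norm else 0.0

class ToyVectorStore:
    """Keeps (text, vector) pairs and returns the texts most similar to a query."""

    def __init__(self):
        self.items = []

    def add(self, text):
        self.items.append((text, embed(text)))

    def search(self, query, k=1):
        q = embed(query)
        ranked = sorted(self.items, key=lambda item: cosine(q, item[1]), reverse=True)
        return [text for text, _ in ranked[:k]]

def build_prompt(question, store, k=1):
    """Augment the question with retrieved context before it is sent to an LLM."""
    context = "\n".join(store.search(question, k))
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"

store = ToyVectorStore()
store.add("vacation requests are approved by the team lead")
store.add("expense reports are due on the fifth of each month")
prompt = build_prompt("when are expense reports due", store)
```

The resulting `prompt` would be passed to the language model, which is how internal company documents can inform an answer without fine-tuning the model itself.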

Deep Learning-Assisted Diagnosis of Pediatric Skull Fractures on Plain Radiographs

  • Jae Won Choi;Yeon Jin Cho;Ji Young Ha;Yun Young Lee;Seok Young Koh;June Young Seo;Young Hun Choi;Jung-Eun Cheon;Ji Hoon Phi;Injoon Kim;Jaekwang Yang;Woo Sun Kim
    • Korean Journal of Radiology / v.23 no.3 / pp.343-354 / 2022
  • Objective: To develop and evaluate a deep learning-based artificial intelligence (AI) model for detecting skull fractures on plain radiographs in children. Materials and Methods: This retrospective multi-center study consisted of a development dataset acquired from two hospitals (n = 149 and 264) and an external test set (n = 95) from a third hospital. Datasets included children with head trauma who underwent both skull radiography and cranial computed tomography (CT). The development dataset was split into training, tuning, and internal test sets in a ratio of 7:1:2. The reference standard for skull fracture was cranial CT. Two radiology residents, a pediatric radiologist, and two emergency physicians participated in a two-session observer study on an external test set with and without AI assistance. We obtained the area under the receiver operating characteristic curve (AUROC), sensitivity, and specificity along with their 95% confidence intervals (CIs). Results: The AI model showed an AUROC of 0.922 (95% CI, 0.842-0.969) in the internal test set and 0.870 (95% CI, 0.785-0.930) in the external test set. The model had a sensitivity of 81.1% (95% CI, 64.8%-92.0%) and specificity of 91.3% (95% CI, 79.2%-97.6%) for the internal test set and 78.9% (95% CI, 54.4%-93.9%) and 88.2% (95% CI, 78.7%-94.4%), respectively, for the external test set. With the model's assistance, significant AUROC improvement was observed in radiology residents (pooled results) and emergency physicians (pooled results) with the difference from reading without AI assistance of 0.094 (95% CI, 0.020-0.168; p = 0.012) and 0.069 (95% CI, 0.002-0.136; p = 0.043), respectively, but not in the pediatric radiologist with the difference of 0.008 (95% CI, -0.074-0.090; p = 0.850). Conclusion: A deep learning-based AI model improved the performance of inexperienced radiologists and emergency physicians in diagnosing pediatric skull fractures on plain radiographs.
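As a rough illustration of the 7:1:2 split and the reported metrics, the sketch below partitions the 413 development cases (149 + 264) and defines sensitivity and specificity. It assumes a plain random split. The study may have stratified by fracture status or by hospital, which the abstract does not say. The function names and the example counts are hypothetical.

```python
import random

def split_7_1_2(case_ids, seed=0):
    """Randomly split cases into training, tuning, and internal test sets (7:1:2)."""
    ids = list(case_ids)
    random.Random(seed).shuffle(ids)
    n_train = len(ids) * 7 // 10
    n_tune = len(ids) // 10
    # Whatever remains after the training and tuning sets forms the test set.
    return ids[:n_train], ids[n_train:n_train + n_tune], ids[n_train + n_tune:]

def sensitivity_specificity(tp, fn, tn, fp):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    return tp / (tp + fn), tn / (tn + fp)

train, tune, test = split_7_1_2(range(149 + 264))

# Hypothetical counts, only to show the calculation.
sens, spec = sensitivity_specificity(tp=8, fn=2, tn=9, fp=1)
```

Here the tuning set plays the role usually called a validation set: it guides model selection, while the internal and external test sets are held back for the final evaluation.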

Theoretical Investigations on Compatibility of Feedback-Based Cellular Models for Dune Dynamics : Sand Fluxes, Avalanches, and Wind Shadow ('되먹임 기반' 사구 역학 모형의 호환 가능성에 대한 이론적 고찰 - 플럭스, 사면조정, 바람그늘 문제를 중심으로 -)

  • RHEW, Hosahng
    • Journal of the Korean association of regional geographers / v.22 no.3 / pp.681-702 / 2016
  • Two different modelling approaches to dune dynamics have been established thus far: continuous models, which emphasize the precise representation of the wind field, and feedback-based models, which focus on the interactions between dunes rather than on aerodynamics. Though feedback-based models have proven able to capture the essence of dune dynamics, the compatibility of these models has received less attention. This research investigated, mostly from a theoretical point of view, the algorithmic compatibility of three feedback-based dune models: sand slab models, the Nishimori model, and the de Castro model. The major findings are as follows. First, sand slab models and the de Castro model are compatible in terms of flux, whereas the Nishimori model needs a tuning factor. Second, the avalanching algorithm can be easily implemented via repetitive spatial smoothing, showing high compatibility between models. Finally, the wind shadow rule might not be a necessary component for reproducing dune patterns, contrary to the interpretation or assumption of previous studies. Rather, the wind shadow rule might be more important in understanding bedform-level interactions. Overall, the three models show high compatibility with one another, or seem to require relatively small modifications, though more thorough investigation is needed.
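The idea of implementing avalanching through repetitive spatial smoothing can be illustrated on a one-dimensional row of cells. This is a sketch under our own assumptions, not the algorithm of any of the three models. The function name `relax_slopes`, the 1-D layout, and the rule of moving half the excess between neighbors are all illustrative choices.

```python
def relax_slopes(heights, max_diff, tol=1e-9, max_sweeps=10_000):
    """Repeatedly smooth neighboring cells until no height difference exceeds max_diff.

    Sand is only moved between cells, never created or removed.
    """
    h = [float(v) for v in heights]
    for _ in range(max_sweeps):
        moved = False
        for i in range(len(h) - 1):
            diff = h[i] - h[i + 1]
            excess = abs(diff) - max_diff
            if excess > tol:
                # Move half the excess from the higher cell to the lower one,
                # which leaves this pair exactly at the threshold.
                shift = excess / 2 if diff > 0 else -excess / 2
                h[i] -= shift
                h[i + 1] += shift
                moved = True
        if not moved:
            break  # every neighboring pair is within the threshold
    return h

# A single steep pile relaxes into a gentler mound with the same total sand.
pile = relax_slopes([0, 0, 5, 0, 0], max_diff=1.0)
```

In a 2-D dune model the same sweep would run over each cell's neighbors, with `max_diff` derived from the angle of repose and the cell size.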
