• 제목/요약/키워드: Deep CNN

Search Result 1,148, Processing Time 0.028 seconds

A Comparison of Image Classification System for Building Waste Data based on Deep Learning (딥러닝기반 건축폐기물 이미지 분류 시스템 비교)

  • Jae-Kyung Sung;Mincheol Yang;Kyungnam Moon;Yong-Guk Kim
    • The Journal of the Institute of Internet, Broadcasting and Communication
    • /
    • v.23 no.3
    • /
    • pp.199-206
    • /
    • 2023
  • This study utilizes deep learning algorithms to automatically classify construction waste into three categories: wood waste, plastic waste, and concrete waste. Two models, VGG-16 and ViT (Vision Transformer), which are convolutional neural network image classification algorithms and NLP-based models that sequence images, respectively, were compared for their performance in classifying construction waste. Image data for construction waste was collected by crawling images from search engines worldwide, and 3,000 images, with 1,000 images for each category, were obtained by excluding images that were difficult to distinguish with the naked eye or that were duplicated and would interfere with the experiment. In addition, to improve the accuracy of the models, data augmentation was performed during training with a total of 30,000 images. Despite the unstructured nature of the collected image data, the experimental results showed that VGG-16 achieved an accuracy of 91.5%, and ViT achieved an accuracy of 92.7%. This seems to suggest the possibility of practical application in actual construction waste data management work. If object detection techniques or semantic segmentation techniques are utilized based on this study, more precise classification will be possible even within a single image, resulting in more accurate waste classification

Developing a deep learning-based recommendation model using online reviews for predicting consumer preferences: Evidence from the restaurant industry (딥러닝 기반 온라인 리뷰를 활용한 추천 모델 개발: 레스토랑 산업을 중심으로)

  • Dongeon Kim;Dongsoo Jang;Jinzhe Yan;Jiaen Li
    • Journal of Intelligence and Information Systems
    • /
    • v.29 no.4
    • /
    • pp.31-49
    • /
    • 2023
  • With the growth of the food-catering industry, consumer preferences and the number of dine-in restaurants are gradually increasing. Thus, personalized recommendation services are required to select a restaurant suitable for consumer preferences. Previous studies have used questionnaires and star-rating approaches, which do not effectively depict consumer preferences. Online reviews are the most essential sources of information in this regard. However, previous studies have aggregated online reviews into long documents, and traditional machine-learning methods have been applied to these to extract semantic representations; however, such approaches fail to consider the surrounding word or context. Therefore, this study proposes a novel review textual-based restaurant recommendation model (RT-RRM) that uses deep learning to effectively extract consumer preferences from online reviews. The proposed model concatenates consumer-restaurant interactions with the extracted high-level semantic representations and predicts consumer preferences accurately and effectively. Experiments on real-world datasets show that the proposed model exhibits excellent recommendation performance compared with several baseline models.

Analysis of Surface Urban Heat Island and Land Surface Temperature Using Deep Learning Based Local Climate Zone Classification: A Case Study of Suwon and Daegu, Korea (딥러닝 기반 Local Climate Zone 분류체계를 이용한 지표면온도와 도시열섬 분석: 수원시와 대구광역시를 대상으로)

  • Lee, Yeonsu;Lee, Siwoo;Im, Jungho;Yoo, Cheolhee
    • Korean Journal of Remote Sensing
    • /
    • v.37 no.5_3
    • /
    • pp.1447-1460
    • /
    • 2021
  • Urbanization increases the amount of impervious surface and artificial heat emission, resulting in urban heat island (UHI) effect. Local climate zones (LCZ) are a classification scheme for urban areas considering urban land cover characteristics and the geometry and structure of buildings, which can be used for analyzing urban heat island effect in detail. This study aimed to examine the UHI effect by urban structure in Suwon and Daegu using the LCZ scheme. First, the LCZ maps were generated using Landsat 8 images and convolutional neural network (CNN) deep learning over the two cities. Then, Surface UHI (SUHI), which indicates the land surface temperature (LST) difference between urban and rural areas, was analyzed by LCZ class. The results showed that the overall accuracies of the CNN models for LCZ classification were relatively high 87.9% and 81.7% for Suwon and Daegu, respectively. In general, Daegu had higher LST for all LCZ classes than Suwon. For both cities, LST tended to increase with increasing building density with relatively low building height. For both cities, the intensity of SUHI was very high in summer regardless of LCZ classes and was also relatively high except for a few classes in spring and fall. In winter the SUHI intensity was low, resulting in negative values for many LCZ classes. This implies that UHI is very strong in summer, and some urban areas often are colder than rural areas in winter. The research findings demonstrated the applicability of the LCZ data for SUHI analysis and can provide a basis for establishing timely strategies to respond urban on-going climate change over urban areas.

A Deep Learning Based Approach to Recognizing Accompanying Status of Smartphone Users Using Multimodal Data (스마트폰 다종 데이터를 활용한 딥러닝 기반의 사용자 동행 상태 인식)

  • Kim, Kilho;Choi, Sangwoo;Chae, Moon-jung;Park, Heewoong;Lee, Jaehong;Park, Jonghun
    • Journal of Intelligence and Information Systems
    • /
    • v.25 no.1
    • /
    • pp.163-177
    • /
    • 2019
  • As smartphones are getting widely used, human activity recognition (HAR) tasks for recognizing personal activities of smartphone users with multimodal data have been actively studied recently. The research area is expanding from the recognition of the simple body movement of an individual user to the recognition of low-level behavior and high-level behavior. However, HAR tasks for recognizing interaction behavior with other people, such as whether the user is accompanying or communicating with someone else, have gotten less attention so far. And previous research for recognizing interaction behavior has usually depended on audio, Bluetooth, and Wi-Fi sensors, which are vulnerable to privacy issues and require much time to collect enough data. Whereas physical sensors including accelerometer, magnetic field and gyroscope sensors are less vulnerable to privacy issues and can collect a large amount of data within a short time. In this paper, a method for detecting accompanying status based on deep learning model by only using multimodal physical sensor data, such as an accelerometer, magnetic field and gyroscope, was proposed. The accompanying status was defined as a redefinition of a part of the user interaction behavior, including whether the user is accompanying with an acquaintance at a close distance and the user is actively communicating with the acquaintance. A framework based on convolutional neural networks (CNN) and long short-term memory (LSTM) recurrent networks for classifying accompanying and conversation was proposed. First, a data preprocessing method which consists of time synchronization of multimodal data from different physical sensors, data normalization and sequence data generation was introduced. We applied the nearest interpolation to synchronize the time of collected data from different sensors. Normalization was performed for each x, y, z axis value of the sensor data, and the sequence data was generated according to the sliding window method. Then, the sequence data became the input for CNN, where feature maps representing local dependencies of the original sequence are extracted. The CNN consisted of 3 convolutional layers and did not have a pooling layer to maintain the temporal information of the sequence data. Next, LSTM recurrent networks received the feature maps, learned long-term dependencies from them and extracted features. The LSTM recurrent networks consisted of two layers, each with 128 cells. Finally, the extracted features were used for classification by softmax classifier. The loss function of the model was cross entropy function and the weights of the model were randomly initialized on a normal distribution with an average of 0 and a standard deviation of 0.1. The model was trained using adaptive moment estimation (ADAM) optimization algorithm and the mini batch size was set to 128. We applied dropout to input values of the LSTM recurrent networks to prevent overfitting. The initial learning rate was set to 0.001, and it decreased exponentially by 0.99 at the end of each epoch training. An Android smartphone application was developed and released to collect data. We collected smartphone data for a total of 18 subjects. Using the data, the model classified accompanying and conversation by 98.74% and 98.83% accuracy each. Both the F1 score and accuracy of the model were higher than the F1 score and accuracy of the majority vote classifier, support vector machine, and deep recurrent neural network. In the future research, we will focus on more rigorous multimodal sensor data synchronization methods that minimize the time stamp differences. In addition, we will further study transfer learning method that enables transfer of trained models tailored to the training data to the evaluation data that follows a different distribution. It is expected that a model capable of exhibiting robust recognition performance against changes in data that is not considered in the model learning stage will be obtained.

Flaw Evaluation of Bogie connected Part for Railway Vehicle Based on Convolutional Neural Network (CNN 기반 철도차량 차체-대차 연결부의 결함 평가기법 연구)

  • Kwon, Seok-Jin;Kim, Min-Soo
    • Journal of the Korea Academia-Industrial cooperation Society
    • /
    • v.21 no.11
    • /
    • pp.53-60
    • /
    • 2020
  • The bogies of railway vehicles are one of the most critical components for service. Fatigue defects in the bogie can be initiated for various reasons, such as material imperfection, welding defects, and unpredictable and excessive overloads during operation. To prevent the derailment of a railway vehicle, it is necessary to evaluate and detect the defect of a connection weldment between the car body and bogie accurately. The safety of the bogie weldment was checked using an ultrasonic test, and it is necessary to determine the occurrence of defects using a learning method. Recently, studies on deep learning have been performed to identify defects with a high recognition rate with respect to a fine and similar defect. In this paper, the databases of weldment specimens with artificial defects were constructed to detect the defect of a bogie weldment. The ultrasonic inspection using the wedge angle was performed to understand the detection ability of fatigue cracks. In addition, the convolutional neural network was applied to minimize human error during the inspection. The results showed that the defects of connection weldment between the car body and bogie could be classified with more than 99.98% accuracy using CNN, and the effectiveness can be verified in the case of an inspection.

Multi-modal Emotion Recognition using Semi-supervised Learning and Multiple Neural Networks in the Wild (준 지도학습과 여러 개의 딥 뉴럴 네트워크를 사용한 멀티 모달 기반 감정 인식 알고리즘)

  • Kim, Dae Ha;Song, Byung Cheol
    • Journal of Broadcast Engineering
    • /
    • v.23 no.3
    • /
    • pp.351-360
    • /
    • 2018
  • Human emotion recognition is a research topic that is receiving continuous attention in computer vision and artificial intelligence domains. This paper proposes a method for classifying human emotions through multiple neural networks based on multi-modal signals which consist of image, landmark, and audio in a wild environment. The proposed method has the following features. First, the learning performance of the image-based network is greatly improved by employing both multi-task learning and semi-supervised learning using the spatio-temporal characteristic of videos. Second, a model for converting 1-dimensional (1D) landmark information of face into two-dimensional (2D) images, is newly proposed, and a CNN-LSTM network based on the model is proposed for better emotion recognition. Third, based on an observation that audio signals are often very effective for specific emotions, we propose an audio deep learning mechanism robust to the specific emotions. Finally, so-called emotion adaptive fusion is applied to enable synergy of multiple networks. The proposed network improves emotion classification performance by appropriately integrating existing supervised learning and semi-supervised learning networks. In the fifth attempt on the given test set in the EmotiW2017 challenge, the proposed method achieved a classification accuracy of 57.12%.

Method of ChatBot Implementation Using Bot Framework (봇 프레임워크를 활용한 챗봇 구현 방안)

  • Kim, Ki-Young
    • The Journal of Korea Institute of Information, Electronics, and Communication Technology
    • /
    • v.15 no.1
    • /
    • pp.56-61
    • /
    • 2022
  • In this paper, we classify and present AI algorithms and natural language processing methods used in chatbots. A framework that can be used to implement a chatbot is also described. A chatbot is a system with a structure that interprets the input string by constructing the user interface in a conversational manner and selects an appropriate answer to the input string from the learned data and outputs it. However, training is required to generate an appropriate set of answers to a question and hardware with considerable computational power is required. Therefore, there is a limit to the practice of not only developing companies but also students learning AI development. Currently, chatbots are replacing the existing traditional tasks, and a practice course to understand and implement the system is required. RNN and Char-CNN are used to increase the accuracy of answering questions by learning unstructured data by applying technologies such as deep learning beyond the level of responding only to standardized data. In order to implement a chatbot, it is necessary to understand such a theory. In addition, the students presented examples of implementation of the entire system by utilizing the methods that can be used for coding education and the platform where existing developers and students can implement chatbots.

Evaluating the Effectiveness of an Artificial Intelligence Model for Classification of Basic Volcanic Rocks Based on Polarized Microscope Image (편광현미경 이미지 기반 염기성 화산암 분류를 위한 인공지능 모델의 효용성 평가)

  • Sim, Ho;Jung, Wonwoo;Hong, Seongsik;Seo, Jaewon;Park, Changyun;Song, Yungoo
    • Economic and Environmental Geology
    • /
    • v.55 no.3
    • /
    • pp.309-316
    • /
    • 2022
  • In order to minimize the human and time consumption required for rock classification, research on rock classification using artificial intelligence (AI) has recently developed. In this study, basic volcanic rocks were subdivided by using polarizing microscope thin section images. A convolutional neural network (CNN) model based on Tensorflow and Keras libraries was self-producted for rock classification. A total of 720 images of olivine basalt, basaltic andesite, olivine tholeiite, trachytic olivine basalt reference specimens were mounted with open nicol, cross nicol, and adding gypsum plates, and trained at the training : test = 7 : 3 ratio. As a result of machine learning, the classification accuracy was over 80-90%. When we confirmed the classification accuracy of each AI model, it is expected that the rock classification method of this model will not be much different from the rock classification process of a geologist. Furthermore, if not only this model but also models that subdivide more diverse rock types are produced and integrated, the AI model that satisfies both the speed of data classification and the accessibility of non-experts can be developed, thereby providing a new framework for basic petrology research.

Water Segmentation Based on Morphologic and Edge-enhanced U-Net Using Sentinel-1 SAR Images (형태학적 연산과 경계추출 학습이 강화된 U-Net을 활용한 Sentinel-1 영상 기반 수체탐지)

  • Kim, Hwisong;Kim, Duk-jin;Kim, Junwoo
    • Korean Journal of Remote Sensing
    • /
    • v.38 no.5_2
    • /
    • pp.793-810
    • /
    • 2022
  • Synthetic Aperture Radar (SAR) is considered to be suitable for near real-time inundation monitoring. The distinctly different intensity between water and land makes it adequate for waterbody detection, but the intrinsic speckle noise and variable intensity of SAR images decrease the accuracy of waterbody detection. In this study, we suggest two modules, named 'morphology module' and 'edge-enhanced module', which are the combinations of pooling layers and convolutional layers, improving the accuracy of waterbody detection. The morphology module is composed of min-pooling layers and max-pooling layers, which shows the effect of morphological transformation. The edge-enhanced module is composed of convolution layers, which has the fixed weights of the traditional edge detection algorithm. After comparing the accuracy of various versions of each module for U-Net, we found that the optimal combination is the case that the morphology module of min-pooling and successive layers of min-pooling and max-pooling, and the edge-enhanced module of Scharr filter were the inputs of conv9. This morphologic and edge-enhanced U-Net improved the F1-score by 9.81% than the original U-Net. Qualitative inspection showed that our model has capability of detecting small-sized waterbody and detailed edge of water, which are the distinct advancement of the model presented in this research, compared to the original U-Net.

Application of convolutional autoencoder for spatiotemporal bias-correction of radar precipitation (CAE 알고리즘을 이용한 레이더 강우 보정 평가)

  • Jung, Sungho;Oh, Sungryul;Lee, Daeeop;Le, Xuan Hien;Lee, Giha
    • Journal of Korea Water Resources Association
    • /
    • v.54 no.7
    • /
    • pp.453-462
    • /
    • 2021
  • As the frequency of localized heavy rainfall has increased during recent years, the importance of high-resolution radar data has also increased. This study aims to correct the bias of Dual Polarization radar that still has a spatial and temporal bias. In many studies, various statistical techniques have been attempted to correct the bias of radar rainfall. In this study, the bias correction of the S-band Dual Polarization radar used in flood forecasting of ME was implemented by a Convolutional Autoencoder (CAE) algorithm, which is a type of Convolutional Neural Network (CNN). The CAE model was trained based on radar data sets that have a 10-min temporal resolution for the July 2017 flood event in Cheongju. The results showed that the newly developed CAE model provided improved simulation results in time and space by reducing the bias of raw radar rainfall. Therefore, the CAE model, which learns the spatial relationship between each adjacent grid, can be used for real-time updates of grid-based climate data generated by radar and satellites.