• Title/Summary/Keyword: Convolutional Neural Network

Search Result 1,517, Processing Time 0.024 seconds

Deep Learning-based Interior Design Recognition (딥러닝 기반 실내 디자인 인식)

  • Wongyu Lee;Jihun Park;Jonghyuk Lee;Heechul Jung
    • IEMEK Journal of Embedded Systems and Applications
    • /
    • v.19 no.1
    • /
    • pp.47-55
    • /
    • 2024
  • We spend a lot of time in indoor space, and the space has a huge impact on our lives. Interior design plays a significant role to make an indoor space attractive and functional. However, it should consider a lot of complex elements such as color, pattern, and material etc. With the increasing demand for interior design, there is a growing need for technologies that analyze these design elements accurately and efficiently. To address this need, this study suggests a deep learning-based design analysis system. The proposed system consists of a semantic segmentation model that classifies spatial components and an image classification model that classifies attributes such as color, pattern, and material from the segmented components. Semantic segmentation model was trained using a dataset of 30000 personal indoor interior images collected for research, and during inference, the model separate the input image pixel into 34 categories. And experiments were conducted with various backbones in order to obtain the optimal performance of the deep learning model for the collected interior dataset. Finally, the model achieved good performance of 89.05% and 0.5768 in terms of accuracy and mean intersection over union (mIoU). In classification part convolutional neural network (CNN) model which has recorded high performance in other image recognition tasks was used. To improve the performance of the classification model we suggests an approach that how to handle data that has data imbalance and vulnerable to light intensity. Using our methods, we achieve satisfactory results in classifying interior design component attributes. In this paper, we propose indoor space design analysis system that automatically analyzes and classifies the attributes of indoor images using a deep learning-based model. This analysis system, used as a core module in the A.I interior recommendation service, can help users pursuing self-interior design to complete their designs more easily and efficiently.

Analysis of deep learning-based deep clustering method (딥러닝 기반의 딥 클러스터링 방법에 대한 분석)

  • Hyun Kwon;Jun Lee
    • Convergence Security Journal
    • /
    • v.23 no.4
    • /
    • pp.61-70
    • /
    • 2023
  • Clustering is an unsupervised learning method that involves grouping data based on features such as distance metrics, using data without known labels or ground truth values. This method has the advantage of being applicable to various types of data, including images, text, and audio, without the need for labeling. Traditional clustering techniques involve applying dimensionality reduction methods or extracting specific features to perform clustering. However, with the advancement of deep learning models, research on deep clustering techniques using techniques such as autoencoders and generative adversarial networks, which represent input data as latent vectors, has emerged. In this study, we propose a deep clustering technique based on deep learning. In this approach, we use an autoencoder to transform the input data into latent vectors, and then construct a vector space according to the cluster structure and perform k-means clustering. We conducted experiments using the MNIST and Fashion-MNIST datasets in the PyTorch machine learning library as the experimental environment. The model used is a convolutional neural network-based autoencoder model. The experimental results show an accuracy of 89.42% for MNIST and 56.64% for Fashion-MNIST when k is set to 10.

Algorithm development for texture and color style transfer of cultural heritage images (문화유산 이미지의 질감과 색상 스타일 전이를 위한 알고리즘 개발 연구)

  • Baek Seohyun;Cho Yeeun;Ahn Sangdoo;Choi Jongwon
    • Conservation Science in Museum
    • /
    • v.31
    • /
    • pp.55-70
    • /
    • 2024
  • Style transfer algorithms are currently undergoing active research and are used, for example, to convert ordinary images into classical painting styles. However, such algorithms have yet to produce appropriate results when applied to Korean cultural heritage images, while the number of cases for such applications also remains insufficient. Accordingly, this study attempts to develop a style transfer algorithm that can be applied to styles found among Korean cultural heritage. The algorithm was produced by improving data comprehension by enabling it to learn meaningful characteristics of the styles through representation learning and to separate the cultural heritage from the background in the target images, allowing it to extract the style-relevant areas with the desired color and texture from the style images. This study confirmed that, by doing so, a new image can be created by effectively transferring the characteristics of the style image while maintaining the form of the target image, which thereby enables the transfer of a variety of cultural heritage styles.

Predictive modeling algorithms for liver metastasis in colorectal cancer: A systematic review of the current literature

  • Isaac Seow-En;Ye Xin Koh;Yun Zhao;Boon Hwee Ang;Ivan En-Howe Tan;Aik Yong Chok;Emile John Kwong Wei Tan;Marianne Kit Har Au
    • Annals of Hepato-Biliary-Pancreatic Surgery
    • /
    • v.28 no.1
    • /
    • pp.14-24
    • /
    • 2024
  • This study aims to assess the quality and performance of predictive models for colorectal cancer liver metastasis (CRCLM). A systematic review was performed to identify relevant studies from various databases. Studies that described or validated predictive models for CRCLM were included. The methodological quality of the predictive models was assessed. Model performance was evaluated by the reported area under the receiver operating characteristic curve (AUC). Of the 117 articles screened, seven studies comprising 14 predictive models were included. The distribution of included predictive models was as follows: radiomics (n = 3), logistic regression (n = 3), Cox regression (n = 2), nomogram (n = 3), support vector machine (SVM, n = 2), random forest (n = 2), and convolutional neural network (CNN, n = 2). Age, sex, carcinoembryonic antigen, and tumor staging (T and N stage) were the most frequently used clinicopathological predictors for CRCLM. The mean AUCs ranged from 0.697 to 0.870, with 86% of the models demonstrating clear discriminative ability (AUC > 0.70). A hybrid approach combining clinical and radiomic features with SVM provided the best performance, achieving an AUC of 0.870. The overall risk of bias was identified as high in 71% of the included studies. This review highlights the potential of predictive modeling to accurately predict the occurrence of CRCLM. Integrating clinicopathological and radiomic features with machine learning algorithms demonstrates superior predictive capabilities.

Speech Emotion Recognition in People at High Risk of Dementia

  • Dongseon Kim;Bongwon Yi;Yugwon Won
    • Dementia and Neurocognitive Disorders
    • /
    • v.23 no.3
    • /
    • pp.146-160
    • /
    • 2024
  • Background and Purpose: The emotions of people at various stages of dementia need to be effectively utilized for prevention, early intervention, and care planning. With technology available for understanding and addressing the emotional needs of people, this study aims to develop speech emotion recognition (SER) technology to classify emotions for people at high risk of dementia. Methods: Speech samples from people at high risk of dementia were categorized into distinct emotions via human auditory assessment, the outcomes of which were annotated for guided deep-learning method. The architecture incorporated convolutional neural network, long short-term memory, attention layers, and Wav2Vec2, a novel feature extractor to develop automated speech-emotion recognition. Results: Twenty-seven kinds of Emotions were found in the speech of the participants. These emotions were grouped into 6 detailed emotions: happiness, interest, sadness, frustration, anger, and neutrality, and further into 3 basic emotions: positive, negative, and neutral. To improve algorithmic performance, multiple learning approaches were applied using different data sources-voice and text-and varying the number of emotions. Ultimately, a 2-stage algorithm-initial text-based classification followed by voice-based analysis-achieved the highest accuracy, reaching 70%. Conclusions: The diverse emotions identified in this study were attributed to the characteristics of the participants and the method of data collection. The speech of people at high risk of dementia to companion robots also explains the relatively low performance of the SER algorithm. Accordingly, this study suggests the systematic and comprehensive construction of a dataset from people with dementia.

Satellite-Based Cabbage and Radish Yield Prediction Using Deep Learning in Kangwon-do (딥러닝을 활용한 위성영상 기반의 강원도 지역의 배추와 무 수확량 예측)

  • Hyebin Park;Yejin Lee;Seonyoung Park
    • Korean Journal of Remote Sensing
    • /
    • v.39 no.5_3
    • /
    • pp.1031-1042
    • /
    • 2023
  • In this study, a deep learning model was developed to predict the yield of cabbage and radish, one of the five major supply and demand management vegetables, using satellite images of Landsat 8. To predict the yield of cabbage and radish in Gangwon-do from 2015 to 2020, satellite images from June to September, the growing period of cabbage and radish, were used. Normalized difference vegetation index, enhanced vegetation index, lead area index, and land surface temperature were employed in this study as input data for the yield model. Crop yields can be effectively predicted using satellite images because satellites collect continuous spatiotemporal data on the global environment. Based on the model developed previous study, a model designed for input data was proposed in this study. Using time series satellite images, convolutional neural network, a deep learning model, was used to predict crop yield. Landsat 8 provides images every 16 days, but it is difficult to acquire images especially in summer due to the influence of weather such as clouds. As a result, yield prediction was conducted by splitting June to July into one part and August to September into two. Yield prediction was performed using a machine learning approach and reference models , and modeling performance was compared. The model's performance and early predictability were assessed using year-by-year cross-validation and early prediction. The findings of this study could be applied as basic studies to predict the yield of field crops in Korea.

Image-Based Automatic Bridge Component Classification Using Deep Learning (딥러닝을 활용한 이미지 기반 교량 구성요소 자동분류 네트워크 개발)

  • Cho, Munwon;Lee, Jae Hyuk;Ryu, Young-Moo;Park, Jeongjun;Yoon, Hyungchul
    • KSCE Journal of Civil and Environmental Engineering Research
    • /
    • v.41 no.6
    • /
    • pp.751-760
    • /
    • 2021
  • Most bridges in Korea are over 20 years old, and many problems linked to their deterioration are being reported. The current practice for bridge inspection mainly depends on expert evaluation, which can be subjective. Recent studies have introduced data-driven methods using building information modeling, which can be more efficient and objective, but these methods require manual procedures that consume time and money. To overcome this, this study developed an image-based automaticbridge component classification network to reduce the time and cost required for converting the visual information of bridges to a digital model. The proposed method comprises two convolutional neural networks. The first network estimates the type of the bridge based on the superstructure, and the second network classifies the bridge components. In avalidation test, the proposed system automatically classified the components of 461 bridge images with 96.6 % of accuracy. The proposed approach is expected to contribute toward current bridge maintenance practice.

Development of Deep Learning Structure to Secure Visibility of Outdoor LED Display Board According to Weather Change (날씨 변화에 따른 실외 LED 전광판의 시인성 확보를 위한 딥러닝 구조 개발)

  • Sun-Gu Lee;Tae-Yoon Lee;Seung-Ho Lee
    • Journal of IKEEE
    • /
    • v.27 no.3
    • /
    • pp.340-344
    • /
    • 2023
  • In this paper, we propose a study on the development of deep learning structure to secure visibility of outdoor LED display board according to weather change. The proposed technique secures the visibility of the outdoor LED display board by automatically adjusting the LED luminance according to the weather change using deep learning using an imaging device. In order to automatically adjust the LED luminance according to weather changes, a deep learning model that can classify the weather is created by learning it using a convolutional network after first going through a preprocessing process for the flattened background part image data. The applied deep learning network reduces the difference between the input value and the output value using the Residual learning function, inducing learning while taking the characteristics of the initial input value. Next, by using a controller that recognizes the weather and adjusts the luminance of the outdoor LED display board according to the weather change, the luminance is changed so that the luminance increases when the surrounding environment becomes bright, so that it can be seen clearly. In addition, when the surrounding environment becomes dark, the visibility is reduced due to scattering of light, so the brightness of the electronic display board is lowered so that it can be seen clearly. By applying the method proposed in this paper, the result of the certified measurement test of the luminance measurement according to the weather change of the LED sign board confirmed that the visibility of the outdoor LED sign board was secured according to the weather change.

A Design of the Vehicle Crisis Detection System(VCDS) based on vehicle internal and external data and deep learning (차량 내·외부 데이터 및 딥러닝 기반 차량 위기 감지 시스템 설계)

  • Son, Su-Rak;Jeong, Yi-Na
    • The Journal of Korea Institute of Information, Electronics, and Communication Technology
    • /
    • v.14 no.2
    • /
    • pp.128-133
    • /
    • 2021
  • Currently, autonomous vehicle markets are commercializing a third-level autonomous vehicle, but there is a possibility that an accident may occur even during fully autonomous driving due to stability issues. In fact, autonomous vehicles have recorded 81 accidents. This is because, unlike level 3, autonomous vehicles after level 4 have to judge and respond to emergency situations by themselves. Therefore, this paper proposes a vehicle crisis detection system(VCDS) that collects and stores information outside the vehicle through CNN, and uses the stored information and vehicle sensor data to output the crisis situation of the vehicle as a number between 0 and 1. The VCDS consists of two modules. The vehicle external situation collection module collects surrounding vehicle and pedestrian data using a CNN-based neural network model. The vehicle crisis situation determination module detects a crisis situation in the vehicle by using the output of the vehicle external situation collection module and the vehicle internal sensor data. As a result of the experiment, the average operation time of VESCM was 55ms, R-CNN was 74ms, and CNN was 101ms. In particular, R-CNN shows similar computation time to VESCM when the number of pedestrians is small, but it takes more computation time than VESCM as the number of pedestrians increases. On average, VESCM had 25.68% faster computation time than R-CNN and 45.54% faster than CNN, and the accuracy of all three models did not decrease below 80% and showed high accuracy.

Integrated receptive field diversification method for improving speaker verification performance for variable-length utterances (가변 길이 입력 발성에서의 화자 인증 성능 향상을 위한 통합된 수용 영역 다양화 기법)

  • Shin, Hyun-seo;Kim, Ju-ho;Heo, Jungwoo;Shim, Hye-jin;Yu, Ha-Jin
    • The Journal of the Acoustical Society of Korea
    • /
    • v.41 no.3
    • /
    • pp.319-325
    • /
    • 2022
  • The variation of utterance lengths is a representative factor that can degrade the performance of speaker verification systems. To handle this issue, previous studies had attempted to extract speaker features from various branches or to use convolution layers with different receptive fields. Combining the advantages of the previous two approaches for variable-length input, this paper proposes integrated receptive field diversification that extracts speaker features through more diverse receptive field. The proposed method processes the input features by convolutional layers with different receptive fields at multiple time-axis branches, and extracts speaker embedding by dynamically aggregating the processed features according to the lengths of input utterances. The deep neural networks in this study were trained on the VoxCeleb2 dataset and tested on the VoxCeleb1 evaluation dataset that divided into 1 s, 2 s, 5 s, and full-length. Experimental results demonstrated that the proposed method reduces the equal error rate by 19.7 % compared to the baseline.