• 제목/요약/키워드: Convolutional Neural Networks (CNN)

검색결과 355건 처리시간 0.024초

Human Face Tracking and Modeling using Active Appearance Model with Motion Estimation

  • Tran, Hong Tai;Na, In Seop;Kim, Young Chul;Kim, Soo Hyung
    • 스마트미디어저널
    • /
    • 제6권3호
    • /
    • pp.49-56
    • /
    • 2017
  • Images and Videos that include the human face contain a lot of information. Therefore, accurately extracting human face is a very important issue in the field of computer vision. However, in real life, human faces have various shapes and textures. To adapt to these variations, A model-based approach is one of the best ways in which unknown data can be represented by the model in which it is built. However, the model-based approach has its weaknesses when the motion between two frames is big, it can be either a sudden change of pose or moving with fast speed. In this paper, we propose an enhanced human face-tracking model. This approach included human face detection and motion estimation using Cascaded Convolutional Neural Networks, and continuous human face tracking and modeling correction steps using the Active Appearance Model. A proposed system detects human face in the first input frame and initializes the models. On later frames, Cascaded CNN face detection is used to estimate the target motion such as location or pose before applying the old model and fit new target.

Waste Classification by Fine-Tuning Pre-trained CNN and GAN

  • Alsabei, Amani;Alsayed, Ashwaq;Alzahrani, Manar;Al-Shareef, Sarah
    • International Journal of Computer Science & Network Security
    • /
    • 제21권8호
    • /
    • pp.65-70
    • /
    • 2021
  • Waste accumulation is becoming a significant challenge in most urban areas and if it continues unchecked, is poised to have severe repercussions on our environment and health. The massive industrialisation in our cities has been followed by a commensurate waste creation that has become a bottleneck for even waste management systems. While recycling is a viable solution for waste management, it can be daunting to classify waste material for recycling accurately. In this study, transfer learning models were proposed to automatically classify wastes based on six materials (cardboard, glass, metal, paper, plastic, and trash). The tested pre-trained models were ResNet50, VGG16, InceptionV3, and Xception. Data augmentation was done using a Generative Adversarial Network (GAN) with various image generation percentages. It was found that models based on Xception and VGG16 were more robust. In contrast, models based on ResNet50 and InceptionV3 were sensitive to the added machine-generated images as the accuracy degrades significantly compared to training with no artificial data.

안개영상의 의미론적 분할 및 안개제거를 위한 심층 멀티태스크 네트워크 (Deep Multi-task Network for Simultaneous Hazy Image Semantic Segmentation and Dehazing)

  • 송태용;장현성;하남구;연윤모;권구용;손광훈
    • 한국멀티미디어학회논문지
    • /
    • 제22권9호
    • /
    • pp.1000-1010
    • /
    • 2019
  • Image semantic segmentation and dehazing are key tasks in the computer vision. In recent years, researches in both tasks have achieved substantial improvements in performance with the development of Convolutional Neural Network (CNN). However, most of the previous works for semantic segmentation assume the images are captured in clear weather and show degraded performance under hazy images with low contrast and faded color. Meanwhile, dehazing aims to recover clear image given observed hazy image, which is an ill-posed problem and can be alleviated with additional information about the image. In this work, we propose a deep multi-task network for simultaneous semantic segmentation and dehazing. The proposed network takes single haze image as input and predicts dense semantic segmentation map and clear image. The visual information getting refined during the dehazing process can help the recognition task of semantic segmentation. On the other hand, semantic features obtained during the semantic segmentation process can provide cues for color priors for objects, which can help dehazing process. Experimental results demonstrate the effectiveness of the proposed multi-task approach, showing improved performance compared to the separate networks.

Efficient Large Dataset Construction using Image Smoothing and Image Size Reduction

  • Jaemin HWANG;Sac LEE;Hyunwoo LEE;Seyun PARK;Jiyoung LIM
    • 한국인공지능학회지
    • /
    • 제11권1호
    • /
    • pp.17-24
    • /
    • 2023
  • With the continuous growth in the amount of data collected and analyzed, deep learning has become increasingly popular for extracting meaningful insights from various fields. However, hardware limitations pose a challenge for achieving meaningful results with limited data. To address this challenge, this paper proposes an algorithm that leverages the characteristics of convolutional neural networks (CNNs) to reduce the size of image datasets by 20% through smoothing and shrinking the size of images using color elements. The proposed algorithm reduces the learning time and, as a result, the computational load on hardware. The experiments conducted in this study show that the proposed method achieves effective learning with similar or slightly higher accuracy than the original dataset while reducing computational and time costs. This color-centric dataset construction method using image smoothing techniques can lead to more efficient learning on CNNs. This method can be applied in various applications, such as image classification and recognition, and can contribute to more efficient and cost-effective deep learning. This paper presents a promising approach to reducing the computational load and time costs associated with deep learning and provides meaningful results with limited data, enabling them to apply deep learning to a broader range of applications.

Evaluation of Deep Learning Model for Scoliosis Pre-Screening Using Preprocessed Chest X-ray Images

  • Min Gu Jang;Jin Woong Yi;Hyun Ju Lee;Ki Sik Tae
    • 대한의용생체공학회:의공학회지
    • /
    • 제44권4호
    • /
    • pp.293-301
    • /
    • 2023
  • Scoliosis is a three-dimensional deformation of the spine that is a deformity induced by physical or disease-related causes as the spine is rotated abnormally. Early detection has a significant influence on the possibility of nonsurgical treatment. To train a deep learning model with preprocessed images and to evaluate the results with and without data augmentation to enable the diagnosis of scoliosis based only on a chest X-ray image. The preprocessed images in which only the spine, rib contours, and some hard tissues were left from the original chest image, were used for learning along with the original images, and three CNN(Convolutional Neural Networks) models (VGG16, ResNet152, and EfficientNet) were selected to proceed with training. The results obtained by training with the preprocessed images showed a superior accuracy to those obtained by training with the original image. When the scoliosis image was added through data augmentation, the accuracy was further improved, ultimately achieving a classification accuracy of 93.56% with the ResNet152 model using test data. Through supplementation with future research, the method proposed herein is expected to allow the early diagnosis of scoliosis as well as cost reduction by reducing the burden of additional radiographic imaging for disease detection.

Image-based rainfall prediction from a novel deep learning method

  • Byun, Jongyun;Kim, Jinwon;Jun, Changhyun
    • 한국수자원학회:학술대회논문집
    • /
    • 한국수자원학회 2021년도 학술발표회
    • /
    • pp.183-183
    • /
    • 2021
  • Deep learning methods and their application have become an essential part of prediction and modeling in water-related research areas, including hydrological processes, climate change, etc. It is known that application of deep learning leads to high availability of data sources in hydrology, which shows its usefulness in analysis of precipitation, runoff, groundwater level, evapotranspiration, and so on. However, there is still a limitation on microclimate analysis and prediction with deep learning methods because of deficiency of gauge-based data and shortcomings of existing technologies. In this study, a real-time rainfall prediction model was developed from a sky image data set with convolutional neural networks (CNNs). These daily image data were collected at Chung-Ang University and Korea University. For high accuracy of the proposed model, it considers data classification, image processing, ratio adjustment of no-rain data. Rainfall prediction data were compared with minutely rainfall data at rain gauge stations close to image sensors. It indicates that the proposed model could offer an interpolation of current rainfall observation system and have large potential to fill an observation gap. Information from small-scaled areas leads to advance in accurate weather forecasting and hydrological modeling at a micro scale.

  • PDF

Deep learning of sweep signal for damage detection on the surface of concrete

  • Gao Shanga;Jun Chen
    • Computers and Concrete
    • /
    • 제32권5호
    • /
    • pp.475-486
    • /
    • 2023
  • Nondestructive evaluation (NDE) is an important task of civil engineering structure monitoring and inspection, but minor damage such as small cracks in local structure is difficult to observe. If cracks continued expansion may cause partial or even overall damage to the structure. Therefore, monitoring and detecting the structure in the early stage of crack propagation is important. The crack detection technology based on machine vision has been widely studied, but there are still some problems such as bad recognition effect for small cracks. In this paper, we proposed a deep learning method based on sweep signals to evaluate concrete surface crack with a width less than 1 mm. Two convolutional neural networks (CNNs) are used to analyze the one-dimensional (1D) frequency sweep signal and the two-dimensional (2D) time-frequency image, respectively, and the probability value of average damage (ADPV) is proposed to evaluate the minor damage of structural. Finally, we use the standard deviation of energy ratio change (ERVSD) and infrared thermography (IRT) to compare with ADPV to verify the effectiveness of the method proposed in this paper. The experiment results show that the method proposed in this paper can effectively predict whether the concrete surface is damaged and the severity of damage.

스마트폰 다종 데이터를 활용한 딥러닝 기반의 사용자 동행 상태 인식 (A Deep Learning Based Approach to Recognizing Accompanying Status of Smartphone Users Using Multimodal Data)

  • 김길호;최상우;채문정;박희웅;이재홍;박종헌
    • 지능정보연구
    • /
    • 제25권1호
    • /
    • pp.163-177
    • /
    • 2019
  • 스마트폰이 널리 보급되고 현대인들의 생활 속에 깊이 자리 잡으면서, 스마트폰에서 수집된 다종 데이터를 바탕으로 사용자 개인의 행동을 인식하고자 하는 연구가 활발히 진행되고 있다. 그러나 타인과의 상호작용 행동 인식에 대한 연구는 아직까지 상대적으로 미진하였다. 기존 상호작용 행동 인식 연구에서는 오디오, 블루투스, 와이파이 등의 데이터를 사용하였으나, 이들은 사용자 사생활 침해 가능성이 높으며 단시간 내에 충분한 양의 데이터를 수집하기 어렵다는 한계가 있다. 반면 가속도, 자기장, 자이로스코프 등의 물리 센서의 경우 사생활 침해 가능성이 낮으며 단시간 내에 충분한 양의 데이터를 수집할 수 있다. 본 연구에서는 이러한 점에 주목하여, 스마트폰 상의 다종 물리 센서 데이터만을 활용, 딥러닝 모델에 기반을 둔 사용자의 동행 상태 인식 방법론을 제안한다. 사용자의 동행 여부 및 대화 여부를 분류하는 동행 상태 분류 모델은 컨볼루션 신경망과 장단기 기억 순환 신경망이 혼합된 구조를 지닌다. 먼저 스마트폰의 다종 물리 센서에서 수집한 데이터에 존재하는 타임 스태프의 차이를 상쇄하고, 정규화를 수행하여 시간에 따른 시퀀스 데이터 형태로 변환함으로써 동행 상태분류 모델의 입력 데이터를 생성한다. 이는 컨볼루션 신경망에 입력되며, 데이터의 시간적 국부 의존성이 반영된 요인 지도를 출력한다. 장단기 기억 순환 신경망은 요인 지도를 입력받아 시간에 따른 순차적 연관 관계를 학습하며, 동행 상태 분류를 위한 요인을 추출하고 소프트맥스 분류기에서 이에 기반한 최종적인 분류를 수행한다. 자체 제작한 스마트폰 애플리케이션을 배포하여 실험 데이터를 수집하였으며, 이를 활용하여 제안한 방법론을 평가하였다. 최적의 파라미터를 설정하여 동행 상태 분류 모델을 학습하고 평가한 결과, 동행 여부와 대화 여부를 각각 98.74%, 98.83%의 높은 정확도로 분류하였다.

Contactless User Identification System using Multi-channel Palm Images Facilitated by Triple Attention U-Net and CNN Classifier Ensemble Models

  • Kim, Inki;Kim, Beomjun;Woo, Sunghee;Gwak, Jeonghwan
    • 한국컴퓨터정보학회논문지
    • /
    • 제27권3호
    • /
    • pp.33-43
    • /
    • 2022
  • 본 논문에서는 기존의 스마트폰 카메라 센서를 사용하여 비접촉식 손바닥 기반 사용자 식별 시스템을 구축하기 위해 Attention U-Net 모델과 사전 훈련된 컨볼루션 신경망(CNN)이 있는 다채널 손바닥 이미지를 이용한 앙상블 모델을 제안한다. Attention U-Net 모델은 손바닥(손가락 포함), 손바닥(손바닥 미포함) 및 손금을 포함한 관심 영역을 추출하는 데 사용되며, 이는 앙상블 분류기로 입력되는 멀티채널 이미지를 생성하기 위해 결합 된다. 생성된 데이터는 제안된 손바닥 정보 기반 사용자 식별 시스템에 입력되며 사전 훈련된 CNN 모델 3개를 앙상블 한 분류기를 사용하여 클래스를 예측한다. 제안된 모델은 각각 98.60%, 98.61%, 98.61%, 98.61%의 분류 정확도, 정밀도, 재현율, F1-Score를 달성할 수 있음을 입증하며, 이는 저렴한 이미지 센서를 사용하고 있음에도 불구하고 제안된 모델이 효과적이라는 것을 나타낸다. 본 논문에서 제안하는 모델은 COVID-19 펜데믹 상황에서 기존 시스템에 비하여 높은 안전성과 신뢰성으로 대안이 될 수 있다.

Variations of AlexNet and GoogLeNet to Improve Korean Character Recognition Performance

  • Lee, Sang-Geol;Sung, Yunsick;Kim, Yeon-Gyu;Cha, Eui-Young
    • Journal of Information Processing Systems
    • /
    • 제14권1호
    • /
    • pp.205-217
    • /
    • 2018
  • Deep learning using convolutional neural networks (CNNs) is being studied in various fields of image recognition and these studies show excellent performance. In this paper, we compare the performance of CNN architectures, KCR-AlexNet and KCR-GoogLeNet. The experimental data used in this paper is obtained from PHD08, a large-scale Korean character database. It has 2,187 samples of each Korean character with 2,350 Korean character classes for a total of 5,139,450 data samples. In the training results, KCR-AlexNet showed an accuracy of over 98% for the top-1 test and KCR-GoogLeNet showed an accuracy of over 99% for the top-1 test after the final training iteration. We made an additional Korean character dataset with fonts that were not in PHD08 to compare the classification success rate with commercial optical character recognition (OCR) programs and ensure the objectivity of the experiment. While the commercial OCR programs showed 66.95% to 83.16% classification success rates, KCR-AlexNet and KCR-GoogLeNet showed average classification success rates of 90.12% and 89.14%, respectively, which are higher than the commercial OCR programs' rates. Considering the time factor, KCR-AlexNet was faster than KCR-GoogLeNet when they were trained using PHD08; otherwise, KCR-GoogLeNet had a faster classification speed.