• Title/Summary/Keyword: Feature Maps

Deconvolution Pixel Layer Based Semantic Segmentation for Street View Images (디컨볼루션 픽셀층 기반의 도로 이미지의 의미론적 분할)

  • Wahid, Abdul;Lee, Hyo Jong
    • Annual Conference of KIPS / 2019.05a / pp.515-518 / 2019
  • Semantic segmentation remains a challenging problem in computer vision. Convolutional neural network (CNN) models have solved many complex computer vision problems. Semantic segmentation is the task of assigning each pixel of an image to a category. With the help of CNNs, results have improved steadily over time. We propose a CNN model that combines a fully convolutional network with deconvolutional pixel layers. The goal is to create a hierarchy of features: the fully convolutional model does the primary learning, and the deconvolutional model then visually segments the target image. The proposed approach creates a direct link among adjacent pixels in the resulting feature maps. It also preserves spatial features such as corners and edges, which improves the accuracy of the outputs. We test our algorithm on the Karlsruhe Institute of Technology and Toyota Technological Institute (KITTI) street view dataset. Our method achieves an mIoU of 92.04%.
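
The mIoU figure quoted in this abstract is a mean intersection-over-union score. The sketch below shows one common way to compute it, assuming flattened integer label maps; the function name `mean_iou`, the toy labels, and the choice to skip classes absent from both maps are illustrative assumptions, not the authors' evaluation code.

```python
def mean_iou(pred, target, num_classes):
    """Mean IoU over the classes that appear in pred or target (assumed convention)."""
    ious = []
    for c in range(num_classes):
        # Pixels where both maps agree on class c.
        inter = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        # Pixels where either map says class c.
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union > 0:  # skip classes absent from both maps
            ious.append(inter / union)
    return sum(ious) / len(ious) if ious else 0.0


# Hypothetical six-pixel example with three classes.
pred = [0, 0, 1, 1, 2, 2]
target = [0, 1, 1, 1, 2, 0]
print(mean_iou(pred, target, num_classes=3))
```

Per-class IoU values here are 1/3, 2/3, and 1/2, so the mean is 0.5.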

MPEG-4 Video Rate Control Algorithm using SOFM-Based Neural Classifier (SOFM 신경망 분류기를 이용한 MPEG-4 비디오 전송률 제어)

  • Park, Gwang-Hoon;Lee, Yoon-Jin
    • Journal of KIISE:Software and Applications / v.29 no.7 / pp.425-435 / 2002
  • This paper introduces a macroblock-based rate control algorithm that uses a neural classifier based on Self-Organizing Feature Maps (SOFM). In contrast to conventional rate control methods based on a mathematical rate-distortion (RD) model and feedback regression, the proposed method actively adapts to rapidly varying image characteristics by establishing a global model for bitrate control and using the SOFM-based neural classifier to manage that model. Measured by the Peak Signal-to-Noise Ratio (PSNR) of the encoded video, the proposed algorithm outperforms the MPEG-4 macroblock-based rate control algorithm by 0.2 dB to 0.6 dB while maintaining similar overall computational complexity.
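
The dB gains in this abstract are differences in Peak Signal-to-Noise Ratio. As a reminder of what is being measured, here is a minimal sketch of the standard PSNR formula for 8-bit samples; the function name `psnr` and the sample values are hypothetical, and this is not the paper's evaluation code.

```python
import math


def psnr(original, encoded, max_value=255):
    """PSNR in dB between two equal-length sample sequences (8-bit range assumed)."""
    mse = sum((o - e) ** 2 for o, e in zip(original, encoded)) / len(original)
    if mse == 0:
        return math.inf  # identical signals have unbounded PSNR
    return 10 * math.log10(max_value ** 2 / mse)


# Hypothetical pixel values: every sample is off by one, so the MSE is 1.
print(psnr([100, 120, 140, 160], [101, 119, 141, 159]))
```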

The Comparison of the SIFT Image Descriptor by Contrast Enhancement Algorithms with Various Types of High-resolution Satellite Imagery

  • Choi, Jaw-Wan;Kim, Dae-Sung;Kim, Yong-Min;Han, Dong-Yeob;Kim, Yong-Il
    • Korean Journal of Remote Sensing / v.26 no.3 / pp.325-333 / 2010
  • Image registration involves overlapping images of an identical region and assigning the data to one coordinate system. Image registration has proved important in remote sensing, enabling registered satellite imagery to be used in applications such as image fusion, change detection, and the generation of digital maps. The image descriptor, which extracts matching points from each image, is necessary for automatic registration of remotely sensed data. Using contrast enhancement algorithms such as histogram equalization and image stretching, the normalized data are applied to the image descriptor. Because the spectral characteristics of high-resolution satellite imagery differ by sensor type and acquisition date, the applied normalization method can change the results of matching interest point descriptors. In this paper, matching points are extracted by the scale-invariant feature transform (SIFT) using various contrast enhancement algorithms and injection of Gaussian noise. The extracted matching points are compared in terms of the number of correct matching points and the matching rates.
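
Image stretching, one of the contrast enhancement steps named in this abstract, can be illustrated with a linear min-max stretch. This is a sketch under the assumption of a simple linear mapping onto the 0 to 255 range; the function name `stretch_contrast` and the sample values are hypothetical, and the paper may use a different stretching variant.

```python
def stretch_contrast(pixels, out_min=0, out_max=255):
    """Linearly map the darkest pixel to out_min and the brightest to out_max."""
    lo, hi = min(pixels), max(pixels)
    if hi == lo:
        # A flat image has no contrast to stretch.
        return [out_min for _ in pixels]
    span = out_max - out_min
    return [round(out_min + (p - lo) * span / (hi - lo)) for p in pixels]


# Hypothetical low-contrast pixel values.
print(stretch_contrast([60, 100, 160]))
```

The example maps 60 to 0, 160 to 255, and 100 to 102.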

A Multi-Stage Convolution Machine with Scaling and Dilation for Human Pose Estimation

  • Nie, Yali;Lee, Jaehwan;Yoon, Sook;Park, Dong Sun
    • KSII Transactions on Internet and Information Systems (TIIS) / v.13 no.6 / pp.3182-3198 / 2019
  • Vision-based human pose estimation is considered a challenging research subject because of problems such as confounding background clutter, the diversity of human appearances, and illumination changes in scenes. To tackle these problems, we propose a new multi-stage convolution machine for estimating human pose. To provide better heatmap prediction of body joints, the proposed machine repeatedly produces predictions stage by stage, with a receptive field large enough to learn long-range spatial relationships. The stages are composed of various modules according to their strategic purposes. A pyramid stacking module and a dilation module are used to handle human pose at multiple scales. Their multi-scale information from different receptive fields is fused by concatenation, which captures more contextual information from different features. The spatial and channel information of a given input is converted to gating factors by squeezing each feature map to a single numeric value that reflects its importance, so that each network channel receives a different weight. In experiments on the standard LSP and MPII pose benchmarks, the proposed architecture achieved higher accuracy than other ConvNet-based architectures.
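
The gating step described in this abstract, squeezing each feature map to one value and using it to weight the channel, can be sketched as follows. This is a simplified illustration that assumes global average pooling for the squeeze and a plain sigmoid for the gate; the learned layers that a real network would place between those two steps are omitted, and all function names are hypothetical.

```python
import math


def squeeze(feature_maps):
    """Reduce each channel (a 2D list) to its mean value."""
    return [
        sum(sum(row) for row in channel) / (len(channel) * len(channel[0]))
        for channel in feature_maps
    ]


def gate(descriptors):
    """Map each channel descriptor to a weight in (0, 1) with a sigmoid."""
    return [1 / (1 + math.exp(-d)) for d in descriptors]


def reweight(feature_maps):
    """Scale every value in a channel by that channel's gating factor."""
    gates = gate(squeeze(feature_maps))
    return [
        [[value * g for value in row] for row in channel]
        for channel, g in zip(feature_maps, gates)
    ]


# Two hypothetical 2x2 channels: the all-zero channel gets gate 0.5.
maps = [[[0.0, 0.0], [0.0, 0.0]], [[2.0, 2.0], [2.0, 2.0]]]
print(squeeze(maps))
print(reweight(maps))
```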

RDNN: Rumor Detection Neural Network for Veracity Analysis in Social Media Text

  • SuthanthiraDevi, P;Karthika, S
    • KSII Transactions on Internet and Information Systems (TIIS) / v.16 no.12 / pp.3868-3888 / 2022
  • A widely used social networking service such as Twitter can disseminate information to large groups of people, even during a pandemic. At the same time, it is a convenient medium for sharing irrelevant and unverified information online, which poses a potential threat to society. In this research, conventional machine learning algorithms are analyzed for classifying data as either rumor or non-rumor. Machine learning techniques have limited tuning capability and make decisions based on their learning. To tackle this problem, the authors propose a deep learning-based Rumor Detection Neural Network (RDNN) model to predict rumor tweets in real-world events. The model comprises three layers: an AttCNN layer that extracts local and position-invariant features from the data, an AttBi-LSTM layer that extracts important semantic or contextual information, and an HPOOL layer that combines the downsampled patches of the input feature maps from the average and maximum pooling layers. A dataset from Kaggle and the ground dataset #gaja are used to train the proposed RDNN to determine the veracity of a rumor. The experimental results of the RDNN classifier demonstrate an accuracy of 93.24% and 95.41% in identifying rumor tweets in real-time events.
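
The HPOOL layer is described here only as combining average-pooled and max-pooled patches of the feature maps. One possible reading is sketched below on a 1D feature sequence; combining the two results by concatenation, the window size of 2, and the names `pool` and `hybrid_pool` are all assumptions, since the abstract does not specify them.

```python
def pool(seq, size, reduce_fn):
    """Apply reduce_fn to consecutive non-overlapping windows of length size."""
    return [reduce_fn(seq[i:i + size]) for i in range(0, len(seq) - size + 1, size)]


def hybrid_pool(seq, size=2):
    """Concatenate average-pooled and max-pooled outputs (assumed combination)."""
    averaged = pool(seq, size, lambda window: sum(window) / len(window))
    maxed = pool(seq, size, max)
    return averaged + maxed


# Hypothetical feature sequence: averages [2.0, 4.0] followed by maxima [3, 6].
print(hybrid_pool([1, 3, 2, 6]))
```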

PathGAN: Local path planning with attentive generative adversarial networks

  • Dooseop Choi;Seung-Jun Han;Kyoung-Wook Min;Jeongdan Choi
    • ETRI Journal / v.44 no.6 / pp.1004-1019 / 2022
  • For autonomous driving without high-definition maps, we present a model capable of generating multiple plausible paths from egocentric images for autonomous vehicles. Our generative model comprises two neural networks: feature extraction network (FEN) and path generation network (PGN). The FEN extracts meaningful features from an egocentric image, whereas the PGN generates multiple paths from the features, given a driving intention and speed. To ensure that the paths generated are plausible and consistent with the intention, we introduce an attentive discriminator and train it with the PGN under a generative adversarial network framework. Furthermore, we devise an interaction model between the positions in the paths and the intentions hidden in the positions and design a novel PGN architecture that reflects the interaction model for improving the accuracy and diversity of the generated paths. Finally, we introduce ETRIDriving, a dataset for autonomous driving, in which the recorded sensor data are labeled with discrete high-level driving actions, and demonstrate the state-of-the-art performance of the proposed model on ETRIDriving in terms of accuracy and diversity.

Lightweight Deep Learning Model for Heart Rate Estimation from Facial Videos (얼굴 영상 기반의 심박수 추정을 위한 딥러닝 모델의 경량화 기법)

  • Gyutae Hwang;Myeonggeun Park;Sang Jun Lee
    • IEMEK Journal of Embedded Systems and Applications / v.18 no.2 / pp.51-58 / 2023
  • This paper proposes a deep learning method for estimating the heart rate from facial videos. The proposed method estimates remote photoplethysmography (rPPG) signals to predict the heart rate. Although several methods have been proposed for estimating rPPG signals, most cannot be used on low-power single-board computers because of their computational complexity. To address this problem, we construct a lightweight student model and employ a knowledge distillation technique to limit its performance degradation relative to a deeper network model. The teacher model consists of 795k parameters, whereas the student model contains only 24k parameters, and the inference time was therefore reduced by a factor of 10. By distilling the knowledge of the intermediate feature maps of the teacher model, we improved the accuracy of the student model in estimating the heart rate. Experiments were conducted on the UBFC-rPPG dataset to demonstrate the effectiveness of the proposed method. Moreover, we collected our own dataset to verify the accuracy and processing time of the proposed method on real-world data. Experimental results on an NVIDIA Jetson Nano board demonstrate that the proposed method can infer the heart rate in real time with a mean absolute error of 2.5183 bpm.
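
Distilling intermediate feature maps is often implemented as a regression loss between teacher and student activations. The sketch below assumes a mean squared error loss and feature maps of identical shape; both are assumptions, since the abstract does not state the loss, and the function name and sample values are hypothetical. It also works out the parameter ratio from the counts quoted above.

```python
def feature_distillation_loss(teacher_map, student_map):
    """Mean squared error between two 2D feature maps of equal shape (assumed loss)."""
    squared_diffs = [
        (t - s) ** 2
        for t_row, s_row in zip(teacher_map, student_map)
        for t, s in zip(t_row, s_row)
    ]
    return sum(squared_diffs) / len(squared_diffs)


# Parameter counts quoted in the abstract.
teacher_params = 795_000
student_params = 24_000
compression = teacher_params / student_params  # 33.125, roughly 33x fewer parameters

# Hypothetical 2x2 activations that differ in one position by 2.
teacher = [[1.0, 2.0], [3.0, 4.0]]
student = [[1.0, 2.0], [3.0, 2.0]]
print(feature_distillation_loss(teacher, student), compression)
```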

Image Segmentation of Fuzzy Deep Learning using Fuzzy Logic (퍼지 논리를 이용한 퍼지 딥러닝 영상 분할)

  • Jongjin Park
    • The Journal of the Institute of Internet, Broadcasting and Communication / v.23 no.5 / pp.71-76 / 2023
  • In this paper, we propose a fuzzy U-Net, a fuzzy deep learning model that applies fuzzy logic to improve the performance of deep learning-based image segmentation. Fuzzy modules using fuzzy logic were combined with U-Net, a deep learning model that has shown excellent performance in image segmentation, and various types of fuzzy modules were simulated. The fuzzy module of the proposed model learns intrinsic and complex rules between the feature maps of images and the corresponding segmentation results. The superiority of the proposed method was demonstrated by applying it to dental CBCT data. The simulations show that, within the proposed fuzzy U-Net, the ADD-RELU fuzzy module structure with the addition skip connection performs best, scoring 0.7928 on the test dataset.

A Study on Improving License Plate Recognition Performance Using Super-Resolution Techniques

  • Kyeongseok JANG;Kwangchul SON
    • Korean Journal of Artificial Intelligence / v.12 no.3 / pp.1-7 / 2024
  • In this paper, we propose a super-resolution technique to address the reduced accuracy of license plate recognition caused by low-resolution images. Conventional vehicle license plate recognition systems rely on images from fixed traffic surveillance cameras to perform vehicle detection, tracking, and license plate recognition. In this process, however, image quality degrades because of the physical distance between the camera and the vehicle, vehicle movement, and external environmental factors such as weather and lighting conditions. In particular, low-resolution images acquired because of camera performance limitations have been a major cause of significantly reduced recognition accuracy. To solve this problem, we propose a Single Image Super-Resolution (SISR) model with a parallel structure that combines multi-scale feature extraction with an attention mechanism. The model effectively extracts features at various scales and focuses on important areas. Specifically, it generates feature maps of various sizes through a multi-branch structure and emphasizes the key features of license plates using the attention mechanism. Experimental results show that the proposed model significantly improves recognition accuracy compared to existing vehicle license plate super-resolution methods using bicubic interpolation.

Design and Implementation of Internet Spatial Data Service Component based Open GIS Specification (개방형 GIS 기반 인터넷 공간 데이터서비스 컴포넌트의 설계 및 구현)

  • Choi, Sang-Kil;Lee, Jin-Kyu;Lee, Jong-Won;Kim, Jang-Su
    • Journal of Korea Spatial Information System Society / v.1 no.2 s.2 / pp.21-31 / 1999
  • With the completion of spatial database construction by central and local government authorities and the rapid popularization of information services over the internet, there is strong demand for spatial information services on the World Wide Web. To provide a high-quality spatial information service, it is crucial to have a Web-based GIS (Geographic Information System) service system that offers public availability, convenient accessibility, and an easy-to-use user interface. In this paper, we introduce a new component system for Web-based spatial information services based on the OpenGIS Simple Feature specification for OLE/COM[3] and the OLE DB specification[4]. Important functionality of a Web-based spatial information service system includes access to various existing GIS server systems and huge databases, as well as the resolution of response-time delays caused by transmitting large amounts of digital maps over the internet[6]. To cope with these problems, our component system has been designed to access heterogeneous databases in a transparent manner and to support vector-based and/or image-based image production techniques that shorten transmission time.