• Title/Summary/Keyword: Temporal Convolutional Network (시간 컨볼루션 네트워크)

Search Results: 10

Harnessing Deep Learning for Abnormal Respiratory Sound Detection (이상 호흡음 탐지를 위한 딥러닝 활용)

  • Gyurin Byun;Huigyu Yang;Hyunseung Choo
    • Annual Conference of KIPS / 2023.11a / pp.641-643 / 2023
  • Automatic analysis of respiratory sounds using deep learning (DL) plays a pivotal role in the early diagnosis of lung disease. However, current DL methods are often limited because they examine the spatial and temporal characteristics of respiratory sounds separately. This study proposes a new DL framework that captures spatial features through convolution operations and exploits the spatial-temporal correlations of these features using a temporal convolutional network. The proposed framework integrates convolutional networks within an ensemble learning approach, greatly improving the accuracy of detecting respiratory abnormalities and diseases in lung sound recordings. Experiments on the well-known ICBHI 2017 Challenge dataset show that the proposed framework outperforms comparison models on the 4-class task for detecting respiratory abnormalities and diseases. In particular, in terms of the score metric, which reflects sensitivity and specificity, improvements of up to 45.91% and 14.1% are shown on the binary and multi-class respiratory anomaly detection tasks, respectively. These results highlight the distinct advantages of our method over existing techniques and demonstrate its potential to drive future innovation in respiratory healthcare technology.
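
The abstract above gives no implementation details, so the following is only a minimal PyTorch sketch of the general pattern it describes: a 2D convolutional front end for spatial (spectrogram) features followed by dilated 1D convolutions over time. Layer sizes, the mel-spectrogram input, and all names are illustrative assumptions, not the authors' code.

```python
import torch
import torch.nn as nn

class CNNTCNClassifier(nn.Module):
    """Illustrative CNN + temporal-convolution model for respiratory sound clips.

    Input: mel-spectrogram tensor of shape (batch, 1, n_mels, n_frames).
    The 2D CNN captures spatial (frequency-local) patterns, and the dilated
    1D convolutions model correlations of those features across time.
    """
    def __init__(self, n_classes=4, channels=64):
        super().__init__()
        self.spatial = nn.Sequential(              # spatial feature extractor
            nn.Conv2d(1, channels, 3, padding=1), nn.ReLU(),
            nn.MaxPool2d((2, 1)),                  # pool the frequency axis only
            nn.Conv2d(channels, channels, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d((1, None)),       # collapse frequency -> (B, C, 1, T)
        )
        self.temporal = nn.Sequential(             # simple TCN-style stack
            nn.Conv1d(channels, channels, 3, padding=1, dilation=1), nn.ReLU(),
            nn.Conv1d(channels, channels, 3, padding=2, dilation=2), nn.ReLU(),
            nn.Conv1d(channels, channels, 3, padding=4, dilation=4), nn.ReLU(),
        )
        self.head = nn.Linear(channels, n_classes)

    def forward(self, x):
        f = self.spatial(x).squeeze(2)             # (B, C, T)
        f = self.temporal(f)                       # temporal correlations
        return self.head(f.mean(dim=-1))           # clip-level prediction

logits = CNNTCNClassifier()(torch.randn(8, 1, 64, 256))  # e.g. 8 spectrogram clips
```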

Teacher-Student Architecture Based CNN for Action Recognition (동작 인식을 위한 교사-학생 구조 기반 CNN)

  • Zhao, Yulan;Lee, Hyo Jong
    • KIPS Transactions on Computer and Communication Systems / v.11 no.3 / pp.99-104 / 2022
  • Convolutional neural networks (CNNs) for action recognition generally use a two-stream architecture with an RGB stream and an optical flow stream: the RGB frame stream captures appearance, while the optical flow stream interprets motion. However, the standard method of computing optical flow is costly in computation time and adds latency to action recognition. The purpose of this study was to evaluate a novel way of creating two sub-networks within a neural network, with the optical flow sub-network assigned as the teacher and the RGB frame sub-network as the student. In the training stage, the optical flow sub-network extracts features as the teacher and transmits this information to the student sub-network for training. In the test stage, only the student sub-network is used, reducing latency because optical flow no longer needs to be computed. Experimental results show that our network, fed only by the RGB stream, achieves a competitive accuracy of 54.5% on HMDB51, which is 1.5 times better than that of R3D-18.
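
As a rough illustration of the teacher-student transfer described above (optical-flow teacher, RGB student), the sketch below combines a feature-matching distillation term with the usual classification loss in a single training step. The loss weighting, the feature-matching choice, and the network interfaces are assumptions, not the paper's exact formulation.

```python
import torch
import torch.nn.functional as F

def training_step(teacher, student, flow_clip, rgb_clip, labels, alpha=0.5):
    """One illustrative training step for flow-to-RGB knowledge transfer.

    teacher: frozen sub-network fed with optical-flow stacks
    student: trainable sub-network fed with raw RGB frames,
             assumed to return (features, logits)
    The student mimics the teacher's features and also fits the labels,
    so at test time only the RGB student is needed (no optical flow).
    """
    with torch.no_grad():                       # teacher only provides targets
        t_feat = teacher(flow_clip)             # (B, D) feature vectors
    s_feat, s_logits = student(rgb_clip)

    distill = F.mse_loss(s_feat, t_feat)        # match the teacher's features
    task = F.cross_entropy(s_logits, labels)    # ordinary action-label loss
    return task + alpha * distill
```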

Implementation of Neural Network Accelerator for Rendering Noise Reduction on OpenCL (OpenCL을 이용한 랜더링 노이즈 제거를 위한 뉴럴 네트워크 가속기 구현)

  • Nam, Kihun
    • The Journal of the Convergence on Culture Technology / v.4 no.4 / pp.373-377 / 2018
  • In this paper, we propose an implementation of a neural network accelerator for reducing rendering noise using OpenCL. Among rendering algorithms, we select ray tracing to ensure high-quality graphics. Ray tracing renders a scene by casting rays: using fewer rays results in noise, while using more rays produces a higher-quality image but takes longer to compute. To reduce computation time while using fewer rays, a learning-based filtering algorithm using a neural network was applied, but it does not always produce an optimal result. This paper therefore introduces a new approach to matrix multiplication based on General Matrix Multiplication (GEMM) for improved performance. As the development environment, we used OpenCL, which is specialized for high-speed parallel processing. The proposed architecture was verified on a Kintex UltraScale XKU6909T-2FDFG1157C FPGA board. The time it takes to compute the parameters is about 1.12 times faster than that of the Verilog-HDL implementation.
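
The paper maps the network's computations onto General Matrix Multiplication for the OpenCL accelerator. As a host-side illustration of that idea (not the OpenCL kernel itself), the NumPy sketch below expresses a convolution as im2col followed by one GEMM; all shapes are illustrative.

```python
import numpy as np

def im2col(x, k):
    """Unfold a (C, H, W) image into a matrix of k*k patches (valid padding)."""
    c, h, w = x.shape
    oh, ow = h - k + 1, w - k + 1
    cols = np.empty((c * k * k, oh * ow), dtype=x.dtype)
    idx = 0
    for i in range(oh):
        for j in range(ow):
            cols[:, idx] = x[:, i:i + k, j:j + k].ravel()
            idx += 1
    return cols, oh, ow

def conv_as_gemm(x, weights):
    """Convolution expressed as one large matrix multiply (the GEMM kernel)."""
    n_filters, c, k, _ = weights.shape
    cols, oh, ow = im2col(x, k)                     # (C*k*k, OH*OW)
    w_mat = weights.reshape(n_filters, c * k * k)   # (F, C*k*k)
    return (w_mat @ cols).reshape(n_filters, oh, ow)

out = conv_as_gemm(np.random.rand(3, 32, 32), np.random.rand(8, 3, 3, 3))
```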

Efficient Super-Resolution of 2D Smoke Data with Optimized Quadtree (최적화된 쿼드트리를 이용한 2차원 연기 데이터의 효율적인 슈퍼 해상도 기법)

  • Choe, YooYeon;Kim, Donghui;Kim, Jong-Hyun
    • Proceedings of the Korean Society of Computer Information Conference / 2021.01a / pp.261-264 / 2021
  • In this paper, we propose a quadtree-based optimization technique that enables fast super-resolution (SR) computation by efficiently classifying and partitioning the data needed to compute SR. The proposed method reduces the time required for the quadtree computation by downscaling the smoke data used as input, and by binarizing the smoke density at this stage it avoids the problem of density being lost during downscaling. The data used for training is the COCO 2017 Dataset, and the artificial neural network is a VGG19-based network. To prevent data loss when passing through the convolutional layers, the output of the previous layer is added during training, similar to a residual approach. As a result, the proposed method achieves a speed-up of about 15 to 18 times compared to the previous technique.
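
As a rough sketch of the quadtree step described above, the Python snippet below subdivides a binarized 2D density field, refining only blocks that contain both smoke and empty cells. The threshold, minimum block size, and square power-of-two grid are assumptions for illustration.

```python
import numpy as np

def build_quadtree(mask, x=0, y=0, size=None, min_size=4):
    """Recursively partition a binary smoke mask into quadtree leaves.

    mask: 2D array of 0/1 obtained by thresholding (binarizing) the density.
    A block becomes a leaf if it is uniform (all smoke or all empty) or is
    already at the minimum size; only mixed blocks are subdivided further.
    Returns a list of (x, y, size, occupied) leaves.
    """
    if size is None:
        size = mask.shape[0]                      # assumes a square power-of-two grid
    block = mask[y:y + size, x:x + size]
    if block.min() == block.max() or size <= min_size:
        return [(x, y, size, bool(block.max()))]
    h = size // 2
    leaves = []
    for dy in (0, h):
        for dx in (0, h):
            leaves += build_quadtree(mask, x + dx, y + dy, h, min_size)
    return leaves

density = np.random.rand(128, 128)                           # downscaled smoke density
leaves = build_quadtree((density > 0.3).astype(np.uint8))    # binarize, then subdivide
```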


A New Residual Attention Network based on Attention Models for Human Action Recognition in Video

  • Kim, Jee-Hyun;Cho, Young-Im
    • Journal of the Korea Society of Computer and Information / v.25 no.1 / pp.55-61 / 2020
  • With the development of deep learning technology and advances in computing power, video-based research is gaining more and more attention. Video data contains a large amount of temporal as well as spatial information, which is its biggest difference from image data, and its volume is much larger; it has therefore attracted intense attention in computer vision. Among these topics, action recognition is one of the main research focuses, yet recognizing human actions in video remains an extremely complex and challenging subject. Based on extensive research on human cognition, attention mechanisms have been found to be an efficient model of perception, well suited to processing image information and complex continuous video information. We introduce this attention mechanism into video action recognition, attending to human actions in video and effectively improving recognition efficiency. In this paper, we propose a new 3D residual attention network that uses a convolutional neural network based on two attention models to identify human actions in video. The evaluation of our model showed up to 90.7% accuracy.
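
The abstract does not describe the attention models at layer level, so the sketch below shows only one common form of a residual attention block on 3D (spatio-temporal) feature maps: a squeeze-and-excitation style channel gate combined with an identity shortcut. It illustrates the general idea, not the authors' architecture.

```python
import torch
import torch.nn as nn

class ResidualAttention3D(nn.Module):
    """Channel attention on 3D feature maps with a residual shortcut."""
    def __init__(self, channels, reduction=8):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv3d(channels, channels, 3, padding=1), nn.ReLU(),
            nn.Conv3d(channels, channels, 3, padding=1),
        )
        self.attn = nn.Sequential(                 # squeeze-and-excitation style gate
            nn.AdaptiveAvgPool3d(1),
            nn.Conv3d(channels, channels // reduction, 1), nn.ReLU(),
            nn.Conv3d(channels // reduction, channels, 1), nn.Sigmoid(),
        )

    def forward(self, x):                          # x: (B, C, T, H, W)
        f = self.body(x)
        return x + f * self.attn(f)                # attention-weighted residual

y = ResidualAttention3D(32)(torch.randn(2, 32, 8, 56, 56))
```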

Conv-LSTM-based Range Modeling and Traffic Congestion Prediction Algorithm for the Efficient Transportation System (효율적인 교통 체계 구축을 위한 Conv-LSTM기반 사거리 모델링 및 교통 체증 예측 알고리즘 연구)

  • Seung-Young Lee;Boo-Won Seo;Seung-Min Park
    • The Journal of the Korea institute of electronic communication sciences / v.18 no.2 / pp.321-327 / 2023
  • With the development of artificial intelligence, prediction systems have become one of the essential technologies in our lives. Despite the growth of these technologies, traffic congestion at intersections continues to be a problem in the 21st century. This paper proposes a system that predicts intersection traffic congestion using a Convolutional LSTM (Conv-LSTM) algorithm. The proposed system models data obtained by learning traffic information by time of day at intersections where congestion occurs, and congestion is predicted from traffic volume data recorded over time. Based on the predicted result, the intersection traffic signal is controlled so that a constant traffic volume is maintained. Road congestion data was defined using VDS sensors, and each intersection was configured with a Conv-LSTM-based network system to keep traffic flowing.
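
PyTorch has no built-in Conv-LSTM layer, so the sketch below implements a minimal ConvLSTM cell of the standard form (one convolution over the concatenated input and hidden state producing the four gates), only to illustrate the kind of model the paper refers to. The grid size and channel counts are assumptions.

```python
import torch
import torch.nn as nn

class ConvLSTMCell(nn.Module):
    """Minimal ConvLSTM cell: LSTM gates computed with 2D convolutions."""
    def __init__(self, in_ch, hid_ch, k=3):
        super().__init__()
        self.hid_ch = hid_ch
        # one convolution produces all four gates (i, f, o, g) at once
        self.gates = nn.Conv2d(in_ch + hid_ch, 4 * hid_ch, k, padding=k // 2)

    def forward(self, x, state):
        h, c = state                               # hidden and cell state maps
        i, f, o, g = torch.chunk(self.gates(torch.cat([x, h], dim=1)), 4, dim=1)
        i, f, o = torch.sigmoid(i), torch.sigmoid(f), torch.sigmoid(o)
        c = f * c + i * torch.tanh(g)              # update the cell state
        h = o * torch.tanh(c)                      # new hidden state
        return h, c

# e.g. a sequence of 12 traffic-volume "maps" over a gridded intersection area
cell = ConvLSTMCell(in_ch=1, hid_ch=16)
h = c = torch.zeros(1, 16, 32, 32)
for t in range(12):
    h, c = cell(torch.randn(1, 1, 32, 32), (h, c))
```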

Multiple Binarization Quadtree Framework for Optimizing Deep Learning-Based Smoke Synthesis Method

  • Kim, Jong-Hyun
    • Journal of the Korea Society of Computer and Information / v.26 no.4 / pp.47-53 / 2021
  • In this paper, we propose a quadtree-based optimization technique that enables fast super-resolution (SR) computation by efficiently classifying and partitioning the physics-based simulation data required to calculate SR. The proposed method reduces the time required for quadtree computation by downscaling the smoke simulation data used as input. By binarizing the smoke density in this process, a quadtree is constructed while mitigating the numerical loss of density caused by downscaling. The data used for training is the COCO 2017 Dataset, and the artificial neural network is a VGG19-based network. To prevent data loss when passing through the convolutional layers, the output of the previous layer is added during training, similar to the residual method. In the case of smoke, the proposed method achieved a speed improvement of about 15 to 18 times compared to the previous approach.
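
Both this journal version and the conference paper above mention adding the previous layer's output, similar to a residual connection, to limit information loss through the convolutional layers. A minimal PyTorch sketch of that pattern on a VGG-style block follows; the block itself is an assumed illustration, not the paper's network.

```python
import torch
import torch.nn as nn

class ResidualConvBlock(nn.Module):
    """VGG-style 3x3 conv block whose input is added back to its output."""
    def __init__(self, channels=64):
        super().__init__()
        self.convs = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1), nn.ReLU(),
            nn.Conv2d(channels, channels, 3, padding=1),
        )

    def forward(self, x):
        # adding the previous layer's output preserves low-level detail
        return torch.relu(x + self.convs(x))

y = ResidualConvBlock()(torch.randn(1, 64, 128, 128))
```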

Accurate Prediction of VVC Intra-coded Block using Convolutional Neural Network (VVC 화면 내 예측에서의 딥러닝 기반 예측 블록 개선을 통한 부호화 효율 향상 기법)

  • Jeong, Hye-Sun;Kang, Je-Won
    • Journal of Broadcast Engineering / v.27 no.4 / pp.477-486 / 2022
  • In this paper, we propose a novel intra-prediction method using a convolutional neural network (CNN) to improve the quality of the predicted block in VVC. The proposed algorithm follows a two-step procedure. First, an input prediction block is generated using one of the VVC intra-prediction modes. Second, the prediction block is refined through a CNN model that takes the prediction block itself and the reconstructed reference samples at its boundary as inputs. The proposed algorithm outputs a refined block that reduces residual signals and enhances coding efficiency, and it is enabled by a CU-level flag. Experimental results demonstrate that the proposed method achieves improved rate-distortion performance compared to the VVC reference software, i.e., VTM version 10.0.
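
The abstract describes refining an intra-predicted block with a CNN that also sees the reconstructed boundary reference samples. The PyTorch sketch below shows one simple way such a refinement network could be wired, feeding the prediction and a padded reference-sample map as two input channels; it is not the paper's network, and all shapes are assumptions.

```python
import torch
import torch.nn as nn

class IntraRefineCNN(nn.Module):
    """Refine an intra-prediction block using boundary reference samples.

    Inputs (illustrative): pred - (B, 1, N, N) block from a VVC intra mode,
                           refs - (B, 1, N, N) map holding the reconstructed
                                  top/left reference samples (zero elsewhere).
    Output: the prediction plus a learned residual correction.
    """
    def __init__(self, ch=32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(2, ch, 3, padding=1), nn.ReLU(),
            nn.Conv2d(ch, ch, 3, padding=1), nn.ReLU(),
            nn.Conv2d(ch, 1, 3, padding=1),
        )

    def forward(self, pred, refs):
        correction = self.net(torch.cat([pred, refs], dim=1))
        return pred + correction                   # refined prediction block

refined = IntraRefineCNN()(torch.randn(1, 1, 16, 16), torch.randn(1, 1, 16, 16))
```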

A Temporal Convolutional Network for Hotel Demand Prediction Based on NSGA3 Feature Selection

  • Keehyun Park;Gyeongho Jung;Hyunchul Ahn
    • Journal of the Korea Society of Computer and Information / v.29 no.10 / pp.121-128 / 2024
  • Demand forecasting is a critical element of revenue management in the tourism industry. Since the 2010s, with the globalization of tourism and the rise of new forms of marketing and information sharing such as SNS, forecasting has become difficult due to non-linear activity and unstructured information. Various forecasting models have been studied to address these problems, and ML models have been used effectively. In this study, we applied a feature selection technique (NSGA3) to time series models and compared their performance. In hotel demand forecasting, the TCN model was found to have a high forecasting performance of 9.73% MAPE, a 7.05% improvement compared to using no feature selection. The results of this study are expected to be useful for decision support through improved forecasting performance.
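
Since the search keyword here is the temporal convolutional network itself, the sketch below shows the canonical TCN building block (dilated causal 1D convolutions with a residual connection) applied to a univariate demand series. It follows the widely used generic TCN formulation rather than the paper's exact configuration, and all sizes are assumptions.

```python
import torch
import torch.nn as nn

class CausalBlock(nn.Module):
    """Dilated causal convolution block with a residual connection."""
    def __init__(self, in_ch, out_ch, k=3, dilation=1):
        super().__init__()
        self.pad = (k - 1) * dilation              # left-pad so no future leakage
        self.conv = nn.Conv1d(in_ch, out_ch, k, dilation=dilation)
        self.skip = nn.Conv1d(in_ch, out_ch, 1) if in_ch != out_ch else nn.Identity()

    def forward(self, x):                          # x: (B, C, T)
        y = self.conv(nn.functional.pad(x, (self.pad, 0)))
        return torch.relu(y + self.skip(x))

class TCNForecaster(nn.Module):
    """Stack of causal blocks with doubling dilation, predicting the next value."""
    def __init__(self, features=1, hidden=32, levels=4):
        super().__init__()
        blocks, ch = [], features
        for level in range(levels):
            blocks.append(CausalBlock(ch, hidden, dilation=2 ** level))
            ch = hidden
        self.tcn = nn.Sequential(*blocks)
        self.out = nn.Linear(hidden, 1)

    def forward(self, series):                       # series: (B, features, T)
        return self.out(self.tcn(series)[:, :, -1])  # forecast from the last step

pred = TCNForecaster()(torch.randn(16, 1, 60))       # 60 past days -> next-day demand
```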

Transfer Learning using Multiple ConvNet Layers Activation Features with Principal Component Analysis for Image Classification (전이학습 기반 다중 컨볼류션 신경망 레이어의 활성화 특징과 주성분 분석을 이용한 이미지 분류 방법)

  • Byambajav, Batkhuu;Alikhanov, Jumabek;Fang, Yang;Ko, Seunghyun;Jo, Geun Sik
    • Journal of Intelligence and Information Systems / v.24 no.1 / pp.205-225 / 2018
  • A Convolutional Neural Network (ConvNet) is one class of powerful deep neural network that can analyze and learn hierarchies of visual features. The first such neural network (the Neocognitron) was introduced in the 1980s. At that time, neural networks were not broadly used in either industry or academia because of the shortage of large-scale datasets and low computational power. A few decades later, in 2012, Krizhevsky made a breakthrough in the ILSVRC-12 visual recognition competition using a convolutional neural network, which revived interest in neural networks. The success of convolutional neural networks rests on two main factors. The first is the emergence of advanced hardware (GPUs) for sufficient parallel computation; the second is the availability of large-scale datasets such as the ImageNet (ILSVRC) dataset for training. Unfortunately, many new domains are bottlenecked by these factors. For most domains, gathering a large-scale dataset to train a ConvNet is difficult and requires a great deal of effort, and even with such a dataset, training a ConvNet from scratch requires expensive resources and is time-consuming. These two obstacles can be addressed with transfer learning, a method for transferring knowledge from a source domain to a new domain. There are two major transfer learning settings: using the ConvNet as a fixed feature extractor, and fine-tuning the ConvNet on a new dataset. In the first case, a ConvNet pre-trained on a dataset such as ImageNet is used to compute feed-forward activations of the image, and activation features are extracted from specific layers. In the second case, the ConvNet classifier is replaced and retrained on the new dataset, and the weights of the pre-trained network are then fine-tuned with backpropagation. In this paper, we focus only on using multiple ConvNet layers as a fixed feature extractor. However, directly using the high-dimensional features extracted from multiple ConvNet layers is still a challenging problem. We observe that features extracted from different ConvNet layers capture different characteristics of the image, which means a better representation can be obtained by finding the optimal combination of multiple ConvNet layers. Based on that observation, we propose to employ multiple ConvNet layer representations for transfer learning instead of a single ConvNet layer representation. Overall, our pipeline has three steps. First, images from the target task are fed forward through a pre-trained AlexNet, and the activation features of its three fully connected layers are extracted. Second, the activation features of these three layers are concatenated to obtain a multiple-layer representation that captures more information about the image; the concatenated representation has 9,192 (4,096 + 4,096 + 1,000) dimensions. Third, because features extracted from multiple layers of the same ConvNet are redundant and noisy, Principal Component Analysis (PCA) is used to select salient features before the training phase. With salient features, the classifier can classify images more accurately and the performance of transfer learning improves.
To evaluate the proposed method, experiments were conducted on three standard datasets (Caltech-256, VOC07, and SUN397) to compare multiple ConvNet layer representations against a single ConvNet layer representation, using PCA for feature selection and dimensionality reduction. Our experiments demonstrated the importance of feature selection for the multiple ConvNet layer representation. Moreover, our proposed approach achieved 75.6% accuracy compared to the 73.9% achieved by the FC7 layer on the Caltech-256 dataset, 73.1% compared to the 69.2% achieved by the FC8 layer on the VOC07 dataset, and 52.2% compared to the 48.7% achieved by the FC7 layer on the SUN397 dataset. We also showed that our approach achieved superior performance, with accuracy improvements of 2.8%, 2.1%, and 3.1% on Caltech-256, VOC07, and SUN397, respectively, compared to existing work.
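
The pipeline described above (extract the activations of AlexNet's three fully connected layers, concatenate them into a 9,192-dimensional vector, reduce with PCA, then train a classifier) can be sketched roughly as below with torchvision and scikit-learn, assuming a recent torchvision. The PCA size, the LinearSVC classifier, and the random data are illustrative assumptions, not the authors' exact setup.

```python
import torch
import numpy as np
from torchvision import models
from sklearn.decomposition import PCA
from sklearn.svm import LinearSVC

# Pre-trained AlexNet used purely as a fixed feature extractor.
alexnet = models.alexnet(weights=models.AlexNet_Weights.IMAGENET1K_V1).eval()

def fc_features(batch):
    """Return concatenated FC6/FC7/FC8 activations (4096 + 4096 + 1000 = 9192 dims)."""
    with torch.no_grad():
        x = torch.flatten(alexnet.avgpool(alexnet.features(batch)), 1)
        feats = []
        for layer in alexnet.classifier:           # Dropout/Linear/ReLU sequence
            x = layer(x)
            if isinstance(layer, torch.nn.Linear): # capture each FC layer's output
                feats.append(x)
        return torch.cat(feats, dim=1).numpy()

# Illustrative data: in practice these would be preprocessed target-task images.
images = torch.randn(32, 3, 224, 224)
labels = np.random.randint(0, 5, size=32)

X = fc_features(images)                            # (32, 9192) concatenated features
X = PCA(n_components=20).fit_transform(X)          # keep the salient components
clf = LinearSVC().fit(X, labels)                   # train a simple classifier
```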