• Title/Summary/Keyword: 3D Convolution

A New Residual Attention Network based on Attention Models for Human Action Recognition in Video

  • Kim, Jee-Hyun;Cho, Young-Im
    • Journal of the Korea Society of Computer and Information / v.25 no.1 / pp.55-61 / 2020
  • With the development of deep learning technology and advances in computing power, video-based research is gaining more and more attention. Video data contains a large amount of temporal and spatial information, which is its biggest difference from image data, and it is also larger in volume. It has therefore attracted intense attention in computer vision, where action recognition is one of the research focuses. However, recognizing human actions in video is an extremely complex and challenging subject. Drawing on a large body of research on human beings, we have found that attention mechanisms of the kind used in artificial intelligence are an efficient model for cognition. This efficient model is well suited to processing image information and complex continuous video information. We introduce this attention mechanism into video action recognition, attending to the human actions in the video and effectively improving recognition efficiency. In this paper, we propose a new 3D residual attention network, built on a convolutional neural network and two attention models, to identify human actions in video. In our evaluation, the model achieved up to 90.7% accuracy.
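
For readers unfamiliar with the search keyword, the sketch below shows what a 3D convolution over a video clip computes. The kernel slides along time as well as height and width, so one filter can respond to motion. This is a plain-Python illustration only and is not the network from the paper above. The function name `conv3d_valid`, the `[t][y][x]` nested-list layout, and the choice of no padding and stride 1 are all assumptions made for the example.

```python
def conv3d_valid(volume, kernel):
    """Naive 3D cross-correlation (the operation CNN layers call convolution).

    volume: nested lists indexed [t][y][x] (frames, rows, columns).
    kernel: nested lists indexed [dt][dy][dx].
    Assumes no padding and stride 1, so each output axis shrinks by kernel size - 1.
    """
    T, H, W = len(volume), len(volume[0]), len(volume[0][0])
    kt, kh, kw = len(kernel), len(kernel[0]), len(kernel[0][0])
    out = []
    for t in range(T - kt + 1):
        plane = []
        for y in range(H - kh + 1):
            row = []
            for x in range(W - kw + 1):
                acc = 0.0
                # Sum over the temporal and both spatial kernel axes.
                for dt in range(kt):
                    for dy in range(kh):
                        for dx in range(kw):
                            acc += volume[t + dt][y + dy][x + dx] * kernel[dt][dy][dx]
                row.append(acc)
            plane.append(row)
        out.append(plane)
    return out
```

With a kernel of shape 2x1x1 holding the values -1 and 1, the output is the difference between consecutive frames, which is a simple motion cue that a purely 2D kernel cannot express.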

Super-resolution based on multi-channel input convolutional residual neural network (다중 채널 입력 Convolution residual neural networks 기반의 초해상화 기법)

  • Youm, Gwang-Young;Kim, Munchurl
    • Proceedings of the Korean Society of Broadcast Engineers Conference / 2016.06a / pp.37-39 / 2016
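  • Recently, Super-Resolution Convolutional Neural Networks (SRCNN), a super-resolution method based on convolutional neural networks (CNN), was reported to achieve good PSNR performance [1]. However, just as many proposed methods show limits in restoring high-frequency components, SRCNN is also limited in restoring them. In addition, although it is widely known that deepening the SRCNN network yields good PSNR performance, a deeper network tends to make the network parameters harder to learn. This is because, in a deep network, the gradient values diverge or converge to zero toward the lower layers (in the backward direction), so the network parameters are not trained properly. Therefore, instead of deepening the network, this paper proposes composing the input from multiple channels to give the network additional information about high-frequency components. Noting that many super-resolution methods lack the ability to restore high-frequency components, we assumed that the network needs a large amount of information about them. Accordingly, we constructed the low-resolution input images so that high-frequency components enter the network at several strengths. We also introduced residual networks so that parameter learning can concentrate on restoring high-frequency components. To verify the effectiveness of the proposed method, we ran experiments on the Set5 and Set14 datasets. Compared with SRCNN, the method achieved average PSNR gains of 0.29, 0.35, and 0.17 dB for scale factors of 2, 3, and 4 on Set5, and an average PSNR gain of 0.20 dB for a scale factor of 3 on Set14.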
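
The abstract above reports its gains in PSNR. As a reminder of how that metric is computed, here is a minimal sketch using the standard definition, 10·log10(peak² / MSE). The function name `psnr`, the nested-list grayscale image layout, and the default 8-bit peak value of 255 are assumptions for illustration. The paper's exact evaluation protocol (color channel used, border cropping) is not stated in the abstract and is not reproduced here.

```python
import math


def psnr(reference, restored, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized grayscale images.

    reference, restored: lists of rows of pixel values.
    peak: maximum possible pixel value (assumed 255 for 8-bit images).
    """
    if len(reference) != len(restored) or any(
        len(a) != len(b) for a, b in zip(reference, restored)
    ):
        raise ValueError("images must have the same shape")
    n = sum(len(row) for row in reference)
    # Mean squared error over all pixels.
    mse = sum(
        (a - b) ** 2
        for row_a, row_b in zip(reference, restored)
        for a, b in zip(row_a, row_b)
    ) / n
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(peak ** 2 / mse)
```

Because the scale is logarithmic, a gain of a few tenths of a dB, as reported above, corresponds to a reduction in mean squared error of a few percent.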

Trellis Coded Spread Spectrum with the multiple symbol detection (다중 심벌 검파를 이용한 트렐리스 부호화된 대역 확산 통신 시스템)

  • 김상태;김종일
    • Journal of the Korea Institute of Information and Communication Engineering / v.4 no.3 / pp.517-526 / 2000
  • In this paper, we propose a trellis-coded spread spectrum communication system in which one channel signal of the subset is selected by the PN code. We also propose a Viterbi decoder that uses the multiple symbol detection method, taking as branch metrics the squared Euclidean distance of the order phase difference as well as the 1st-order phase difference. The TCM method was developed to cope efficiently with limited power and bandwidth in digital communication. To apply TCM to spread spectrum, we multiply one of the convolutional code's output data by the PN code. We investigated the performance of the direct-sequence spread spectrum communication system with trellis-coded modulation. In this system, we were able to improve the coding gain in the spread spectrum.

The Evaluation of Evenness of Nonwovens Using Image Analysis Method

  • Jeong, Sung-Hoon;Kim, Si-Hwan;Hong, Cheol-Jae
    • Fibers and Polymers / v.2 no.3 / pp.164-170 / 2001
  • The authors studied the applicability of an image analysis technique using a scanner with a CCD (charge-coupled device) to the evaluation of the evenness of nonwovens, because it saves considerable time and labor in the analysis compared with classical methods. Two types of specimens, unpatterned and patterned, were prepared for the experiment. For the unpatterned specimens, webs were chemically bonded, while for the patterned specimens, webs were thermally calendered with an engraved roller. Several webs having various areal densities were prepared and bonded. The coefficient of variation (CV%) was used as the parameter to evaluate evenness. Scanning conditions could be suitably set by comparing the total variance to the between-group variance and to the within-group variance, respectively, on images scanned under different conditions. A 2D convolution method with a smoothing filter kernel was introduced to further filter the noise in the scanned images. After the filtering process, increasing the web areal density gave a uniform decrease in CV%. This showed that scanned image analysis with a proper filtering process can be successfully applied to the evaluation of evenness in nonwovens.
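
The two operations named in this abstract, smoothing by 2D convolution and the CV% evenness measure, can be sketched as follows. The abstract does not give the kernel, so a k x k averaging (box) kernel is assumed here. The function names `smooth_mean` and `cv_percent`, the nested-list image layout, and the use of the population standard deviation are likewise assumptions for illustration, not the authors' implementation.

```python
import statistics


def smooth_mean(image, k=3):
    """Convolve a grayscale image with a k x k averaging kernel.

    Only the valid region is returned (no padding), so each axis of the
    output is k - 1 pixels shorter than the input. The box kernel is
    symmetric, so convolution and correlation give the same result.
    """
    h, w = len(image), len(image[0])
    out = []
    for i in range(h - k + 1):
        row = []
        for j in range(w - k + 1):
            window = [image[i + di][j + dj] for di in range(k) for dj in range(k)]
            row.append(sum(window) / (k * k))
        out.append(row)
    return out


def cv_percent(image):
    """Coefficient of variation of pixel intensities, in percent: 100 * std / mean."""
    values = [v for row in image for v in row]
    return 100.0 * statistics.pstdev(values) / statistics.fmean(values)
```

On a small checkerboard of intensities 90 and 110, `cv_percent` gives 10% before smoothing and a much smaller value after `smooth_mean`. Smoothing suppresses pixel-level noise before evenness is measured, which is the role the abstract assigns to the filtering step.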

Artificial neural network reconstructs core power distribution

  • Li, Wenhuai;Ding, Peng;Xia, Wenqing;Chen, Shu;Yu, Fengwan;Duan, Chengjie;Cui, Dawei;Chen, Chen
    • Nuclear Engineering and Technology / v.54 no.2 / pp.617-626 / 2022
  • To effectively monitor the various distributions of neutron flux, fuel power, or temperature in the reactor core, ex-core and in-core neutron detectors are usually employed. Thermocouples for temperature measurement are installed at the coolant inlet or outlet of the respective fuel assemblies. It is necessary to reconstruct the measurement information for every position in the reactor. However, the readings of different types of detectors in the core reflect different aspects of the 3D power distribution. To synthesize the useful information from the various detectors, this paper analyzes the feasibility of reconstructing the three-dimensional core power distribution using different combinations of in-core detectors, ex-core detectors, and thermocouples. A comparison of a multilayer perceptron (MLP) network and a radial basis function (RBF) network is performed. Compared with MLP networks, RBF results are more precise but also more sensitive to detector failure and uncertainty. This is because the localized neural network could offer conservative regression in RBF. Adding random disturbance to the training dataset helps reduce the influence of detector failure and uncertainty. Some convolutional neural networks seem to help obtain more accurate results by using more spatial layout information, though related research is still under way.

Attention-Based Heart Rate Estimation using MobilenetV3

  • Yeo-Chan Yoon
    • Journal of the Korea Society of Computer and Information / v.28 no.12 / pp.1-7 / 2023
  • The advent of deep learning technologies has led to the development of various medical applications, making healthcare services more convenient and effective. Among these applications, heart rate estimation is considered a vital method for assessing an individual's health. Traditional methods, such as photoplethysmography through smart watches, have been widely used but are invasive and require additional hardware. Recent advancements allow for contactless heart rate estimation through facial image analysis, providing a more hygienic and convenient approach. In this paper, we propose a lightweight methodology capable of accurately estimating heart rate in mobile environments, using a specialized 2-channel network structure based on 2D convolution. Our method considers both subtle facial movements and color changes resulting from blood flow and muscle contractions. The approach comprises two major components: an Encoder for analyzing image features and a regression layer for evaluating Blood Volume Pulse. By incorporating both features simultaneously our methodology delivers more accurate results even in computing environments with limited resources. The proposed approach is expected to offer a more efficient way to monitor heart rate without invasive technology, particularly well-suited for mobile devices.

MLCNN-COV: A multilabel convolutional neural network-based framework to identify negative COVID medicine responses from the chemical three-dimensional conformer

  • Pranab Das;Dilwar Hussain Mazumder
    • ETRI Journal / v.46 no.2 / pp.290-306 / 2024
  • Comparatively few medicines have been approved to treat the novel COronaVIrus Disease (COVID). Due to the global pandemic status of COVID, several medicines are being developed to treat patients. The modern COVID medicine development process faces various challenges, including predicting and detecting hazardous COVID medicine responses. Moreover, correctly predicting harmful COVID medicine reactions is essential for health safety. Significant developments in computational models for medicine development can make it possible to identify adverse COVID medicine reactions. Since the beginning of the COVID pandemic, there has been significant demand for developing COVID medicines. Therefore, this paper presents a transfer-learning methodology and a multilabel convolutional neural network for COVID (MLCNN-COV) medicine development model to identify negative responses to COVID medicines. For analysis, a framework is proposed with five multilabel transfer-learning models, namely MobileNetv2, ResNet50, VGG19, DenseNet201, and Inceptionv3, and an MLCNN-COV model is designed with an image augmentation (IA) technique and validated through experiments on images of the three-dimensional chemical conformers of 17 COVID medicines. The RGB color channels are used to represent the features of the image, and image features are extracted with Convolution2D and MaxPooling2D layers. The findings for the current MLCNN-COV are promising: it can identify individual adverse reactions to medicines with accuracy ranging from 88.24% to 100%, which outperforms the transfer-learning models. This shows that three-dimensional conformers adequately identify negative COVID medicine responses.

Analysis of the major factors of influence on the conditions of the Intensity Modulated Radiation Therapy planning optimization in Head and Neck (두경부 세기견조방사선치료계획 최적화 조건에서 주요 인자들의 영향 분석)

  • Kim, Dae Sup;Lee, Woo Seok;Yoon, In Ha;Back, Geum Mun
    • The Journal of Korean Society for Radiation Therapy / v.26 no.1 / pp.11-19 / 2014
  • Purpose: To derive the most appropriate factors by considering the effects of the major factors applied to the optimization algorithm, thereby aiding the effective design of an ideal treatment plan. Materials and Methods: The Eclipse treatment planning system (Eclipse 10.0, Varian, USA) was used in this study. The PBC (Pencil Beam Convolution) algorithm was used for dose calculation, and the DVO (Dose Volume Optimizer 10.0.28) optimization algorithm was used for intensity modulated radiation therapy. The experimental group consisted of patients receiving intensity modulated radiation therapy for head and neck cancer, with doses of 2.2 Gy and 2.0 Gy prescribed simultaneously to two planned target volumes. Treatment planning was done with inverse dose calculation methods using a 6 MV beam and 7 fields. The optimization parameters of the established plan were selected based on the volume dose-priority (Constrain) and the dose fluence smooth value, and the impact on the treatment plan was analyzed as each factor was varied. The volume dose-priority determines the reference conditions, and the optimization process was carried out under conditions using the same ratio but different absolute values. We evaluated the normal organs surrounding the treatment volume as the absolute values of the volume dose-priority changed. The dose fluence smooth value was applied by simply changing the reference conditions (absolute value) and by changing the related volume dose-priority. The treatment plan was evaluated using the Conformal Index (CI), Paddick's Conformal Index (PCI), the Homogeneity Index (HI), and the average dose to each organ. Results: When the volume dose-priority values were scaled proportionally by changing the absolute values, the CI values were found to differ. However, PCI was $1.299{\pm}0.006$, HI was $1.095{\pm}0.004$, and D5%/D95% was $1.090{\pm}1.011$. The impact on the prescribed dose was similar. The average dose to the parotid gland decreased to 67.4, 50.3, 51.2, and 47.1 Gy when the absolute values of the volume dose-priority were increased by 40, 60, 70, and 90. When the dose smooth strength of each treatment plan was increased, the PCI value increased to $1.338{\pm}0.006$. Conclusion: The optimization algorithm was influenced more by the ratio between conditions than by the absolute value of the volume dose-priority. If the same ratio was maintained, a similar treatment plan was established even when the absolute values differed. The volume dose-priority of the treatment volume should be more than 50% of the normal-organ volume dose-priority to achieve a successful treatment plan. The dose fluence smooth value should increase or decrease in proportion to the volume dose-priority. Volume dose-priority is not enough to satisfy the conditions when only the absolute values are applied.

Prediction of pathological complete response in rectal cancer using 3D tumor PET image (3차원 종양 PET 영상을 이용한 직장암 치료반응 예측)

  • Jinyu Yang;Kangsan Kim;Ui-sup Shin;Sang-Keun Woo
    • Proceedings of the Korean Society of Computer Information Conference / 2023.07a / pp.63-65 / 2023
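  • In this paper, we studied the prediction of post-treatment complete response in rectal cancer patients using a deep learning network that takes FDG-PET images as input. Rectal cancer is one of the common malignant tumors, but the likelihood of a pathological complete response is very low, so it is important to predict the response after treatment and choose an appropriate treatment method. Therefore, in this study we built a deep learning network by applying a convolutional neural network (CNN) model to FDG-PET images and predicted the treatment response of rectal cancer patients. FDG-PET images of 116 rectal cancer patients were acquired. The cohort consisted of patients with a tumor size of 2 cm or more, and 21 of them achieved a complete response after treatment. The FDG-PET images were evaluated separately for the whole-body region and the tumor region. The deep learning network consisted of CNN models for 2D and 3D image inputs. Using the trained CNN models, we evaluated the performance of predicting complete response after rectal cancer treatment. In the training results, the mean accuracy and precision were 0.854 and 0.905, respectively, with performance reported for every CNN model and image region. In the test results, we confirmed that accuracy was high for the network that used the 3D CNN model with only the tumor region. This study evaluated how performance differs with the CNN input image and with the image region, and we expect the deep learning network model to help predict rectal cancer treatment response and guide the choice of an appropriate treatment.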

Watermarking for Digital Hologram by a Deep Neural Network and its Training Considering the Hologram Data Characteristics (딥 뉴럴 네트워크에 의한 디지털 홀로그램의 워터마킹 및 홀로그램 데이터 특성을 고려한 학습)

  • Lee, Juwon;Lee, Jae-Eun;Seo, Young-Ho;Kim, Dong-Wook
    • Journal of Broadcast Engineering / v.26 no.3 / pp.296-307 / 2021
  • A digital hologram (DH) is ultra-high value-added video content that includes 3D information in 2D data. Therefore, its intellectual property rights must be protected for distribution. To this end, this paper proposes a watermarking method for DH using a deep neural network. The method provides watermark (WM) invisibility and robustness against attacks, and it is a blind watermarking method that does not use host information in WM extraction. The proposed network consists of four sub-networks: pre-processing for each of the host and the WM, WM embedding, and WM extraction. Considering that a DH has a strong high-frequency component, this network expands the WM data to the size of the host, instead of shrinking the host data to the size of the WM, and concatenates it to the host to insert the WM. In addition, in training this network, the difference in performance according to the data distribution property of the DH is identified, and a method of selecting a training dataset with the best performance for all types of DH is presented. The proposed method is tested against various types and strengths of attacks to show its performance. The results also show that the method is highly practical, as it operates independently of the resolution of the host DH and the WM data.