• Title/Summary/Keyword: 3D Convolution


Effects of the Complexity of 3D Modeling on the Acoustic Simulations and Auralized Sounds (3D 모델의 구체성이 건축음향 시뮬레이션 및 가청화시재에 미치는 영향)

  • Park, Chan-Jae;Haan, Chan-Hoon
    • The Journal of the Acoustical Society of Korea / v.30 no.1 / pp.22-32 / 2011
  • This study examined how the complexity of a 3D model affects the results of acoustic simulation, the predominant tool in the acoustical design of buildings, and how it affects auralized sounds. Four 3D models of a real auditorium, each with a different number of surfaces, were built at different levels of complexity following the guidance of the ODEON room acoustics software. The model set-up was also based on the level of the transition order of the program. Room acoustic parameters including SPL, RT, C80, and D50 were measured in the auditorium, acoustic computer simulations were run on the four models, and the simulated results were compared with the measured parameters. In addition, sound sources were recorded in the field, and auralized sounds were produced by convolving them with the impulse responses obtained from the acoustic models. Subjective tests were then carried out using the auralized sounds. The simulation results came closer to the real room acoustic properties as the 3D model became more detailed. In the subjective tests, the listening materials were judged to be more similar to the real sound source when a more complex 3D model was used. It is concluded that the complexity of the 3D model affects both the results of acoustic modeling and the subjective tests.
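The auralization step described in this abstract, convolving a recorded source with a simulated impulse response, can be sketched as follows. This is a minimal illustration using NumPy, not the authors' workflow; the function name `auralize` and the toy signals are assumptions made for the example.

```python
import numpy as np

def auralize(dry_signal, impulse_response):
    """Convolve a dry recording with a room impulse response.

    The output is peak-normalized so it can be written to an audio file
    without clipping. Both inputs are 1-D arrays at the same sample rate.
    """
    wet = np.convolve(dry_signal, impulse_response, mode="full")
    peak = np.max(np.abs(wet))
    return wet / peak if peak > 0 else wet

# Hypothetical toy data: a single click and an impulse response that
# contains the direct sound plus one half-amplitude reflection.
dry = np.array([1.0, 0.0, 0.0, 0.0])
ir = np.array([1.0, 0.0, 0.5])
wet = auralize(dry, ir)
```

For impulse responses several seconds long, an FFT-based convolution is usually preferred over direct convolution for speed.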

SVD Pseudo-inverse and Application to Image Reconstruction from Projections (SVD Pseudo-inverse를 이용한 영상 재구성)

  • 심영석;김성필
    • Journal of the Korean Institute of Telematics and Electronics / v.17 no.3 / pp.20-25 / 1980
  • A singular value decomposition (SVD) pseudo-inversion method is applied to image reconstruction from projections. This approach is relatively unknown and differs from conventionally used reconstruction methods such as Fourier convolution and iterative techniques. Two SVD pseudo-inversion methods are discussed in the search for optimum reconstruction and restoration: one uses truncated inverse filtering, the other scalar Wiener filtering. These methods partly overcome the ill-conditioned nature of restoration problems by trading off noise against signal quality. To test the SVD pseudo-inversion method, simulations were performed on projection data obtained from a phantom using truncated inverse filtering. The results are presented together with some limitations particular to applying the method to the general class of 3-D image reconstruction and restoration.
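The truncated inverse filtering mentioned in this abstract can be illustrated with a small NumPy sketch. It assumes the standard truncated-SVD formulation (singular values below a relative threshold are discarded rather than inverted); the function name `truncated_pinv` and the threshold `rel_tol` are hypothetical and not taken from the paper.

```python
import numpy as np

def truncated_pinv(A, rel_tol=1e-3):
    """Pseudo-inverse of A that ignores small singular values.

    Singular values not exceeding rel_tol * (largest singular value) are
    treated as zero, which limits noise amplification in ill-conditioned
    problems at the cost of some resolution.
    """
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    keep = s > rel_tol * s[0]
    s_inv = np.zeros_like(s)
    s_inv[keep] = 1.0 / s[keep]
    # Scale the columns of V by the filtered inverse singular values.
    return (Vt.T * s_inv) @ U.T

# Hypothetical ill-conditioned system: the second singular value is tiny,
# so the truncated inverse drops that component entirely.
A = np.array([[1.0, 0.0], [0.0, 1e-6], [0.0, 0.0]])
A_pinv = truncated_pinv(A)
```

Raising `rel_tol` suppresses more noise but discards more of the signal, which is the trade-off the abstract refers to.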


Attention based Feature-Fusion Network for 3D Object Detection (3차원 객체 탐지를 위한 어텐션 기반 특징 융합 네트워크)

  • Sang-Hyun Ryoo;Dae-Yeol Kang;Seung-Jun Hwang;Sung-Jun Park;Joong-Hwan Baek
    • Journal of Advanced Navigation Technology / v.27 no.2 / pp.190-196 / 2023
  • Recently, with the development of LIDAR technology, which measures the distance to an object, interest in LIDAR-based 3D object detection networks has been growing. Previous networks produce inaccurate localization results because spatial information is lost during voxelization and downsampling. In this study, we propose an attention-based fusion method and a camera-LIDAR fusion system to obtain high-level features and high positional accuracy. First, by introducing an attention method into the Voxel-RCNN structure, a grid-based 3D object detection network, multi-scale sparse 3D convolution features are fused effectively to improve 3D object detection performance. Additionally, we propose a late-fusion mechanism that fuses the outputs of the 3D object detection network and a 2D object detection network to remove false positives. Comparative experiments with existing algorithms are performed on the KITTI data set, which is widely used in the field of autonomous driving. The proposed method improved performance in both 2D object detection on BEV and 3D object detection. In particular, precision improved by about 0.54% for the car moderate class compared to Voxel-RCNN.
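One plausible form of late fusion for removing false positives is sketched below. The abstract does not specify the actual rule, so this is only an assumption: a 3D detection is kept when its image-plane projection overlaps some 2D detection by at least a chosen IoU. The names `iou`, `late_fusion`, and `iou_threshold` are hypothetical.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def late_fusion(projected_3d_boxes, boxes_2d, iou_threshold=0.5):
    """Keep 3D detections whose 2D projection is confirmed by a 2D detector.

    projected_3d_boxes: image-plane boxes of the 3D detections (assumed to
    be already projected with the camera calibration).
    boxes_2d: boxes from the 2D object detector.
    """
    return [
        p for p in projected_3d_boxes
        if any(iou(p, d) >= iou_threshold for d in boxes_2d)
    ]

# Hypothetical detections: the second 3D box has no 2D support and is dropped.
kept = late_fusion([(0, 0, 10, 10), (50, 50, 60, 60)], [(1, 1, 10, 10)])
```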

A Method for 3D Human Pose Estimation based on 2D Keypoint Detection using RGB-D information (RGB-D 정보를 이용한 2차원 키포인트 탐지 기반 3차원 인간 자세 추정 방법)

  • Park, Seohee;Ji, Myunggeun;Chun, Junchul
    • Journal of Internet Computing and Services / v.19 no.6 / pp.41-51 / 2018
  • Recently, deep learning methods have been applied to intelligent video surveillance systems, making it possible to robustly detect various events such as crime, fire, and abnormal phenomena. However, because projecting the 3D real world onto a 2D image loses 3D information and causes occlusion, the occlusion problem must be considered in order to detect objects and estimate poses accurately. In this paper, we therefore detect moving objects by adding depth information to the existing RGB information, which addresses the occlusion problem in the object detection process. Then, a convolutional neural network applied to the detected region predicts the positions of 14 keypoints of the human joints. Finally, to solve the self-occlusion problem that occurs in pose estimation, we describe a method for 3D human pose estimation that extends the estimation range to 3D space using the predicted 2D keypoints and a deep neural network. The 2D and 3D pose estimation results of this research can be used as data for future human behavior recognition and can contribute to the development of industrial technology.

A Pipelined Parallel Optimized Design for Convolution-based Non-Cascaded Architecture of JPEG2000 DWT (JPEG2000 이산웨이블릿변환의 컨볼루션기반 non-cascaded 아키텍처를 위한 pipelined parallel 최적화 설계)

  • Lee, Seung-Kwon;Kong, Jin-Hyeung
    • Journal of the Institute of Electronics Engineers of Korea SD / v.46 no.7 / pp.29-38 / 2009
  • In this paper, a high-performance pipelined computing design of parallel multiplier, temporal buffer, and parallel accumulator is presented for the convolution-based non-cascaded architecture, aiming at real-time Discrete Wavelet Transform (DWT) processing. The multiplications in the DWT convolution can be reduced to as little as 1/4 by exploiting the symmetry of the filter coefficients and the up/down sampling, and computation becomes 3-5 times faster with LUT-based DA multiplication, in which the products of multiple filter coefficients with an image datum are computed in parallel. Further, the computed product terms can be reused by storing them in the temporal buffer, which saves computation as well as dynamic power by 50%. The convolved product terms of image data and filter coefficients are realigned and stored in the temporal buffer for accumulated addition. The buffer of parallel aligned storage is then managed for high-speed sequential retrieval of the parallel accumulations. The convolution is pipelined through the parallel multiplier, temporal buffer, and parallel accumulator, and the parallelization of the temporal buffer and accumulator is optimized with respect to the performance of the parallel DA multiplier to improve pipelining performance. The back-end design of the proposed architecture with a 0.18 µm library verifies a throughput of 30 fps for SVGA (800$\times$600) images at 90 MHz.
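The saving from filter coefficient symmetry can be shown in software. The sketch below is not the hardware design from the paper; it only illustrates that for an odd-length symmetric kernel the two samples sharing a coefficient can be added first, so each output needs roughly half the multiplications. The function name `symmetric_fir` is hypothetical, and the kernel is the 5/3 low-pass analysis filter used in JPEG2000.

```python
import numpy as np

def symmetric_fir(x, h):
    """Filter x with an odd-length symmetric kernel h using folded additions.

    Each output uses (len(h) // 2 + 1) multiplications instead of len(h).
    The result matches np.convolve(x, h, mode="same") for len(x) >= len(h).
    """
    x = np.asarray(x, dtype=float)
    h = np.asarray(h, dtype=float)
    assert len(h) % 2 == 1 and np.allclose(h, h[::-1]), "kernel must be symmetric"
    half = len(h) // 2
    n = len(x)
    xp = np.pad(x, half)                      # zero-pad both ends
    y = h[half] * xp[half:half + n]           # centre tap
    for k in range(1, half + 1):
        # Add the two mirrored samples first, then multiply once.
        folded = xp[half + k:half + k + n] + xp[half - k:half - k + n]
        y = y + h[half + k] * folded
    return y

# 5/3 low-pass analysis filter of JPEG2000.
h53 = np.array([-1.0, 2.0, 6.0, 2.0, -1.0]) / 8.0
x = np.arange(16, dtype=float)
y = symmetric_fir(x, h53)
```

Keeping only every second output (`y[::2]`) mirrors the downsampling step of the DWT; computing only those outputs halves the work again, which is consistent with the factor of 1/4 cited in the abstract.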

Estimation of Significant Wave Heights from X-Band Radar Based on ANN Using CNN Rainfall Classifier (CNN 강우여부 분류기를 적용한 ANN 기반 X-Band 레이다 유의파고 보정)

  • Kim, Heeyeon;Ahn, Kyungmo;Oh, Chanyeong
    • Journal of Korean Society of Coastal and Ocean Engineers / v.33 no.3 / pp.101-109 / 2021
  • Wave observations using a marine X-band radar are conducted by analyzing the radar signal backscattered from the sea surface. Wave parameters are extracted using the Modulation Transfer Function obtained from 3D wavenumber and frequency spectra, which are calculated by a 3D FFT of a time series of sea surface images (42 images per minute). The accuracy of the estimated significant wave height is therefore critically dependent on the quality of the radar images. Wave observations during Typhoons Maysak and Haishen in the summer of 2020 show large errors in the estimated significant wave heights, because the radar images were degraded by raindrops falling on the sea surface. This paper presents an algorithm developed to increase the accuracy of wave height estimation from radar images by adopting a convolutional neural network (CNN) that automatically classifies radar images into rain and non-rain cases. An algorithm for deriving Hs is then proposed that creates different ANN models and applies them selectively according to the rain or non-rain case. The developed algorithm was applied to heavy rain cases during typhoons and showed markedly improved results.
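The 3D FFT step that produces the wavenumber-frequency spectrum can be sketched with NumPy. This is an illustrative outline only: the function name, the array layout `(time, y, x)`, and the normalization are assumptions, and the Modulation Transfer Function stage of the paper is not reproduced.

```python
import numpy as np

def wavenumber_frequency_spectrum(frames):
    """Power spectrum of an image sequence shaped (time, y, x).

    Returns an array of the same shape whose axes are (frequency, ky, kx),
    shifted so that the zero frequency/wavenumber sits at the centre.
    """
    frames = np.asarray(frames, dtype=float)
    frames = frames - frames.mean()           # remove the constant offset
    spectrum = np.fft.fftn(frames, axes=(0, 1, 2))
    return np.abs(np.fft.fftshift(spectrum)) ** 2 / frames.size

# Frequency axis for 42 images per minute (sampling interval 60/42 s),
# here for a hypothetical one-minute record.
n_t = 42
freqs_hz = np.fft.fftshift(np.fft.fftfreq(n_t, d=60.0 / 42.0))
```

A travelling wave in the image sequence appears as a pair of symmetric peaks in this spectrum, at the wavenumber and frequency of the wave.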

Alzheimer's Disease Classification with Automated MRI Biomarker Detection Using Faster R-CNN for Alzheimer's Disease Diagnosis (치매 진단을 위한 Faster R-CNN 활용 MRI 바이오마커 자동 검출 연동 분류 기술 개발)

  • Son, Joo Hyung;Kim, Kyeong Tae;Choi, Jae Young
    • Journal of Korea Multimedia Society / v.22 no.10 / pp.1168-1177 / 2019
  • To diagnose and prevent Alzheimer's Disease (AD), it is becoming increasingly important to develop a CAD (Computer-aided Diagnosis) system for AD diagnosis, which supports effective treatment of patients by analyzing 3D MRI images. Powerful deep learning algorithms are essential for automatically classifying the stages of Alzheimer's Disease and for developing an Alzheimer's Disease diagnosis support system that detects the hippocampus and CSF (cerebrospinal fluid), which are important biomarkers in the diagnosis of Alzheimer's Disease. In this paper, we classify given MRI data into three categories (AD, mild cognitive impairment, and normal control) by applying 3D brain MRI images to the Faster R-CNN model, and we detect the hippocampus and CSF in the MRI images. To do this, we feed 2D MRI slice images extracted from the 3D MRI data to the Faster R-CNN and apply the widely used majority voting algorithm to the resulting bounding box labels for classification. To verify the proposed method, we used the public ADNI data set, a standard brain MRI database. Experimental results show that the proposed method achieves impressive classification performance compared with other state-of-the-art methods.
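The majority voting step over per-slice labels can be sketched in a few lines. The label strings and the function name `majority_vote` are assumptions for illustration; the abstract does not describe how ties are handled, so this sketch simply returns the label that reached the top count first.

```python
from collections import Counter

def majority_vote(slice_labels):
    """Return the most frequent label among the per-slice predictions."""
    if not slice_labels:
        raise ValueError("no slice labels to vote on")
    label, _count = Counter(slice_labels).most_common(1)[0]
    return label

# Hypothetical per-slice outputs for one subject.
subject_label = majority_vote(["AD", "MCI", "AD", "NC", "AD"])
```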

Filtering of a Dissonant Frequency for Speech Enhancement

  • Kang, Sang-Ki;Baek, Seong-Joon;Lee, Ki-Yong;Sun, Koeng-Mo
    • The Journal of the Acoustical Society of Korea / v.22 no.3E / pp.110-112 / 2003
  • There have been numerous studies on the enhancement of noisy speech signals. In this paper, we propose a completely new speech enhancement scheme, that is, filtering of a dissonant frequency (especially F# in each octave of the tempered scale) based on the fundamental frequency, which is developed in the frequency domain. To evaluate the performance of the proposed enhancement scheme, subjective tests (MOS tests) were conducted. The subjective test results indicate that the proposed method provides a significant gain in audible improvement, especially for speech contaminated by colored noise and speech spoken in a husky voice. Therefore, when the filter is employed as a pre-filter for speech enhancement, the output speech quality and intelligibility are greatly enhanced.

Issues in Localising 3D Sound in Space Using Head-Related Transfer Functions (머리전달함수를 이용한 공간 음상 정위의 문제점 고찰)

  • Cheung Wan-Sup;Hwang Shin;Lee Jeung-Hoon;Kyun Hyu-Sang
    • Proceedings of the Acoustical Society of Korea Conference / spring / pp.149-152 / 1999
  • This paper addresses major issues in localising sound sources in space using experimental data sets of head-related responses in the time or frequency domain. The issues arise from the technical realisation steps: implementing the convolution of HRIRs with sound sources, cross-talk cancellation for transaural filtering, matched time delay compensation, etc. In reality, those technical matters seem to be minor because they can be realised in off-line signal processing schemes. This paper puts much emphasis on what we have misunderstood about the sets of HRTFs or HRIRs. More specifically, the sets of HRTFs or HRIRs do supply information relevant to sound localisation, but they also include much useless 'rubbish' that has caused us to fail to put a spatial image into real sound signals such as voices and music. This paper proposes possible reasons for such failure and, furthermore, introduces detailed subjects that should be addressed in order to resolve them.


Bit-rate Scalable Video Coder Using a $2{\times}2{\times}2$ DCT for Progressive Transmission

  • Woo, Seock-Hoon;Park, Jin-Hyung;Won, Chee-Sun
    • Proceedings of the IEEK Conference / 2000.07a / pp.66-69 / 2000
  • In this paper, we propose a progressive transmission of video using a 2$\times$2$\times$2 DCT. First, the video data is transformed into a multiresolution representation using a 2$\times$2$\times$2 DCT. Then, it is encoded with a 3-D EZT (Embedded Zero Tree) coding for progressive transmission with bit-rate scalability. The proposed progressive transmission algorithm needs much less computation and buffer memory than higher-order convolution-based wavelet filters. Also, since the 2$\times$2$\times$2 DCT requires only independent local computations, parallel processing can be applied.
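A 2×2×2 DCT on a single block can be sketched as a separable transform. The example assumes the orthonormal 2-point DCT-II, whose matrix reduces to a scaled sum/difference pair; the function names and the `(t, y, x)` block layout are hypothetical, and the EZT coding stage is not shown.

```python
import numpy as np

# Orthonormal 2-point DCT-II matrix: a scaled sum and difference.
C2 = np.array([[1.0, 1.0], [1.0, -1.0]]) / np.sqrt(2.0)

def dct_2x2x2(block):
    """Apply the 2-point DCT along each axis of a 2x2x2 block."""
    return np.einsum("ai,bj,ck,ijk->abc", C2, C2, C2, block)

def idct_2x2x2(coeffs):
    """Inverse transform; C2 is orthogonal, so its transpose inverts it."""
    return np.einsum("ai,bj,ck,abc->ijk", C2, C2, C2, coeffs)

# Hypothetical block: two frames of 2x2 pixels.
block = np.arange(8, dtype=float).reshape(2, 2, 2)
coeffs = dct_2x2x2(block)
restored = idct_2x2x2(coeffs)
```

Each block is transformed independently using only additions, subtractions, and a scale factor, which is why the transform parallelizes well. The coefficient `coeffs[0, 0, 0]` is a scaled block average, so collecting it from every block gives a half-resolution version of the video in both space and time.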
