• Title/Abstract/Keywords: 3-D Neural Network

419 search results (processing time: 0.029 s)

Design of a Neural Network Recognition System for 3-D Object Recognition

  • 김대영;이창순
    • 한국산업정보학회논문지
    • /
    • Vol. 2, No. 1
    • /
    • pp.73-87
    • /
    • 1997
  • A multilayer neural network using a modified backpropagation learning algorithm was introduced to automatically identify different types of aircraft in a variety of 3-D orientations. The 3-D shape of an aircraft can be described by a library of 2-D images corresponding to its projected views. From each 2-D binary aircraft image we extracted a 2-D invariant (L, Φ) feature vector for training the neural network aircraft classifier. In simulations, the classification rate of the neural network was compared with that of a nearest-neighbor classifier (NNC), which has widely served as a performance benchmark. We also introduced a reliability measure for the designed neural network classifier.
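The nearest-neighbor benchmark mentioned in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract does not define the (L, Φ) features, so the two-component vectors and the class labels below are made-up placeholders, and Euclidean distance is an assumed choice.

```python
import math

def nearest_neighbor_classify(query, library):
    """Return the label of the library feature vector closest to `query`.

    `library` is a list of (feature_vector, label) pairs.
    Euclidean distance is an assumption; the abstract does not name a metric.
    """
    best_label, best_dist = None, math.inf
    for features, label in library:
        dist = math.dist(query, features)
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label

# Hypothetical 2-D feature vectors for two made-up aircraft classes.
library = [((0.20, 1.10), "type_A"), ((0.25, 1.00), "type_A"),
           ((0.80, 0.30), "type_B"), ((0.75, 0.40), "type_B")]
print(nearest_neighbor_classify((0.70, 0.35), library))
```

Because the classifier stores every labeled view and has no training phase, it makes a simple baseline against which a trained network can be compared.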

Effective Hand Gesture Recognition by Key Frame Selection and 3D Neural Network

  • Hoang, Nguyen Ngoc;Lee, Guee-Sang;Kim, Soo-Hyung;Yang, Hyung-Jeong
    • 스마트미디어저널
    • /
    • Vol. 9, No. 1
    • /
    • pp.23-29
    • /
    • 2020
  • This paper presents an approach to dynamic hand gesture recognition that uses an algorithm based on a 3D Convolutional Neural Network (3D_CNN), later extended to 3D Residual Networks (3D_ResNet), together with neural-network-based key frame selection. Typically, a 3D deep neural network classifies gestures from image frames randomly sampled from video data. In this work, to improve classification performance, we use key frames that represent the overall video as the input to the classification network. The key frames are extracted by SegNet instead of conventional clustering algorithms for video summarization (VSUMM), which require heavy computation. By using a deep neural network, key frame selection can be performed in a real-time system. Experiments are conducted using models with 3D convolutional kernels such as 3D_CNN, Inflated 3D_CNN (I3D), and 3D_ResNet for gesture classification. Our algorithm achieved up to 97.8% classification accuracy on the Cambridge gesture dataset. The experimental results show that the proposed approach is efficient and outperforms existing methods.

Sources separation of passive sonar array signal using recurrent neural network-based deep neural network with 3-D tensor

  • 이상헌;정동규;유재석
    • 한국음향학회지
    • /
    • Vol. 42, No. 4
    • /
    • pp.357-363
    • /
    • 2023
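  • Separating individual signals from underwater signals in which various sources are mixed has long been studied, but it remains a hard problem because of the low quality of underwater signals. The method most commonly used today obtains a spectrogram of the received acoustic signal with the short-time Fourier transform and then separates the signals by analyzing their frequency characteristics. However, its parameters are difficult to optimize, and phase information is lost in the conversion to a spectrogram. To address these problems, this study proposes a Triple-path Recurrent Neural Network. It is based on the Dual-path Recurrent Neural Network, which has performed well on long time-series signals, and is modified to process a 3-D tensor built from the input signals of a multi-channel sensor. The proposed technique first divides the multi-channel input signal into short chunks and constructs a 3-D tensor that accounts for the relationships among samples within a chunk, among chunks, and among channel signals, from which it learns local and global features. The proposed method was confirmed to achieve improved root mean square error and scale-invariant signal-to-noise ratio compared with existing methods.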
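The chunking step and the scale-invariant signal-to-noise ratio named in this abstract can be illustrated with a short sketch. It rests on stated assumptions rather than on the paper's code: the chunk length, the hop size, the dropping of trailing samples, and the function names `segment` and `si_snr` are all hypothetical, and the SI-SNR follows the commonly used zero-mean definition.

```python
import math

def segment(channels, chunk_len, hop):
    """Split each channel into overlapping chunks.

    Returns a nested list indexed as tensor[channel][chunk][sample].
    Trailing samples that do not fill a whole chunk are dropped (an assumption).
    """
    tensor = []
    for signal in channels:
        chunks = [signal[start:start + chunk_len]
                  for start in range(0, len(signal) - chunk_len + 1, hop)]
        tensor.append(chunks)
    return tensor

def si_snr(estimate, target):
    """Scale-invariant signal-to-noise ratio in dB for two equal-length signals."""
    mean_e = sum(estimate) / len(estimate)
    mean_t = sum(target) / len(target)
    est = [e - mean_e for e in estimate]          # remove the mean of each signal
    tgt = [t - mean_t for t in target]
    dot = sum(e * t for e, t in zip(est, tgt))
    energy = sum(t * t for t in tgt)
    scaled = [dot / energy * t for t in tgt]      # projection of the estimate onto the target
    noise = [e - s for e, s in zip(est, scaled)]  # what the projection leaves over
    return 10 * math.log10(sum(s * s for s in scaled) / sum(n * n for n in noise))

# Two hypothetical channels of 8 samples each, chunks of 4 samples with 50% overlap.
channels = [[float(i) for i in range(8)], [float(-i) for i in range(8)]]
tensor = segment(channels, chunk_len=4, hop=2)
print(len(tensor), len(tensor[0]), len(tensor[0][0]))  # channels, chunks, samples per chunk
```

Rescaling the estimate by any nonzero constant leaves `si_snr` unchanged, which is the property that makes the metric suitable for comparing separated signals whose gain is arbitrary.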

Development of a 3D Simulator and Intelligent Control of Track Vehicle

  • 장영희;신행봉;정동연;서운학;한성현;고희석
    • 한국지능시스템학회: Conference Proceedings
    • /
    • 한국퍼지및지능시스템학회, Proceedings of the 1998 Spring Conference
    • /
    • pp.107-111
    • /
    • 1998
  • This paper presents a new approach to the design of an intelligent control system for a track vehicle using fuzzy logic based on a neural network. The proposed control scheme uses a Gaussian function as the unit function in the neural network-fuzzy controller and the back-propagation algorithm to train the fuzzy-neural network controller within the specialized learning architecture. Moreover, we develop a Windows 95 dynamic simulator that can simulate a track vehicle model in 3D graphics space. We propose a learning controller consisting of two neural network-fuzzy modules based on independent reasoning and a connection net with fixed weights to simplify the neural network-fuzzy structure. The dynamic simulator for the track vehicle was developed in Microsoft Visual C++, and the OpenGL graphics library from Silicon Graphics, Inc. was used for 3D graphics. The performance of the proposed controller is illustrated by simulating trajectory tracking of the track vehicle's speed.
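A Gaussian unit function of the kind this abstract mentions can be sketched as a fuzzy membership function. The rule format and the weighted-average defuzzification in `fuzzy_output` are assumptions added for illustration; the abstract does not describe the controller's actual inference scheme, and all numeric values are hypothetical.

```python
import math

def gaussian_membership(x, center, width):
    """Degree (between 0 and 1) to which x belongs to a fuzzy set centered at `center`."""
    return math.exp(-((x - center) ** 2) / (2 * width ** 2))

def fuzzy_output(x, rules):
    """Weighted average of rule outputs (an assumed defuzzification).

    `rules` is a list of (center, width, output) triples.
    """
    strengths = [gaussian_membership(x, c, w) for c, w, _ in rules]
    weighted = sum(s * out for s, (_, _, out) in zip(strengths, rules))
    return weighted / sum(strengths)

# Hypothetical rules mapping a tracking error to a correction command.
rules = [(-1.0, 0.5, -2.0), (0.0, 0.5, 0.0), (1.0, 0.5, 2.0)]
print(fuzzy_output(0.3, rules))
```

Because the Gaussian is differentiable in both `center` and `width`, parameters of this form can be adjusted by gradient-based training such as back-propagation.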


3D Object Recognition and Accurate Pose Calculation Using a Neural Network

  • 박강
    • 대한기계학회논문집A
    • /
    • Vol. 23, No. 11
    • /
    • pp.1929-1939
    • /
    • 1999
  • This paper presents a neural network approach, named PRONET, to 3D object recognition and pose calculation. 3D objects are represented using a set of centroidal profile patterns that describe the boundaries of the 2D views taken from evenly distributed viewpoints. PRONET consists of a training stage and an execution stage. In the training stage, a three-layer feed-forward neural network is trained with the centroidal profile patterns using an error back-propagation method. In the execution stage, the neural network matches the centroidal profile pattern of the given image with the best-fitting stored pattern, which yields the identity and approximate orientation of the real object, such as a workpiece in an arbitrary pose. The matching procedure also yields the line-to-line correspondence between image features and 3D CAD features. An iterative model posing method then calculates a more exact pose of the object based on the initial orientation and the correspondence.
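One plausible reading of a centroidal profile is the distance from the shape's centroid to its boundary, sampled over evenly spaced angular bins. The sketch below follows that reading under stated assumptions: the centroid is taken as the mean of the boundary points, each bin keeps the largest distance it sees, and the function name and bin count are hypothetical rather than taken from PRONET.

```python
import math

def centroidal_profile(boundary, n_bins=8):
    """Distance from the centroid to the boundary, binned by angle.

    `boundary` is a list of (x, y) contour points. Bins without any
    boundary point keep the value 0.0.
    """
    cx = sum(x for x, _ in boundary) / len(boundary)
    cy = sum(y for _, y in boundary) / len(boundary)
    profile = [0.0] * n_bins
    for x, y in boundary:
        angle = math.atan2(y - cy, x - cx) % (2 * math.pi)
        bin_index = int(angle / (2 * math.pi) * n_bins) % n_bins
        profile[bin_index] = max(profile[bin_index], math.hypot(x - cx, y - cy))
    return profile

# A hypothetical contour: 16 points on a unit circle centered at (3, 4).
circle = [(3 + math.cos(2 * math.pi * k / 16), 4 + math.sin(2 * math.pi * k / 16))
          for k in range(16)]
print(centroidal_profile(circle))
```

A fixed-length vector of this kind is convenient as input to a feed-forward network because its size does not depend on the number of contour points.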

A Study on Unsupervised Learning Method of RAM-based Neural Net

  • 박상무;김성진;이동형;이수동;옥철영
    • 한국컴퓨터정보학회논문지
    • /
    • Vol. 16, No. 1
    • /
    • pp.31-38
    • /
    • 2011
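  • The RAM-based 3-D neural network is a weightless neural network built by giving a binary neural network (BNN) multiple information-storage bits that accumulate the number of training repetitions. It is a highly efficient network that learns from a single training pass. The recognition method for the 3-D neural network that uses the MRD (Maximum Response Detector) technique is based on supervised learning, so the network cannot distinguish categories on its own through learning and performs well only with training data from well-separated categories. This paper proposes an unsupervised learning algorithm for the existing 3-D neural network in which the network itself learns from the input patterns and distinguishes categories without labeled training data. With the proposed algorithm, the network acquires a structure that can adjust the number of discriminators on its own, which ensures flexible extensibility of the network. Recognition experiments were carried out using randomly sampled offline handwritten digits, consisting of multiple patterns from 0 to 9, as training patterns. The experiments show that the network determines the number of discriminators by itself through unsupervised learning, which can be interpreted as the network forming a concept of each handwritten digit.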
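The supervised baseline described in this abstract (RAM nodes that accumulate counts, followed by maximum-response detection) can be sketched roughly as below. This is a generic weightless-network illustration under assumptions, not the paper's 3-D architecture: the consecutive-bit address mapping, the `Discriminator` class, and the toy 8-bit patterns are all hypothetical.

```python
from collections import defaultdict

class Discriminator:
    """One class's bank of RAM nodes; each node counts how often an address was seen."""

    def __init__(self, n_inputs, tuple_size):
        # Assumed fixed mapping: consecutive input bits form each node's address.
        self.groups = [range(i, min(i + tuple_size, n_inputs))
                       for i in range(0, n_inputs, tuple_size)]
        self.rams = [defaultdict(int) for _ in self.groups]

    def _addresses(self, bits):
        return [tuple(bits[i] for i in group) for group in self.groups]

    def train(self, bits):
        for ram, address in zip(self.rams, self._addresses(bits)):
            ram[address] += 1  # accumulate repetitions instead of storing a single bit

    def response(self, bits):
        # Number of RAM nodes that have seen this pattern's address at least once.
        return sum(1 for ram, address in zip(self.rams, self._addresses(bits))
                   if ram.get(address, 0) > 0)

def classify(bits, discriminators):
    """Maximum-response detection: pick the label whose discriminator responds most."""
    return max(discriminators, key=lambda label: discriminators[label].response(bits))

# Toy 8-bit patterns for two made-up classes, each trained in a single pass.
discriminators = {"bar": Discriminator(8, 2), "dot": Discriminator(8, 2)}
discriminators["bar"].train([1, 1, 1, 1, 0, 0, 0, 0])
discriminators["dot"].train([0, 0, 0, 1, 1, 0, 0, 0])
print(classify([1, 1, 1, 0, 0, 0, 0, 0], discriminators))
```

In this supervised form the set of discriminators is fixed in advance by the labels, which is the limitation the proposed unsupervised algorithm is meant to remove.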

Neural Network Based Camera Calibration and 2-D Range Finding

  • 정우태;고국원;조형석
    • 한국정밀공학회: Conference Proceedings
    • /
    • 한국정밀공학회, Proceedings of the 1994 Autumn Conference
    • /
    • pp.510-514
    • /
    • 1994
  • This paper deals with an application of neural networks to the calibration of a camera with a wide-angle lens and to 2-D range finding. A wide-angle lens has the advantage of a wide field of view for mobile environment recognition and robot eye-in-hand systems, but it has severe radial distortion. A multilayer neural network is used to calibrate the camera while accounting for lens distortion, and it is trained by the error back-propagation method. The MLP can map between the camera image plane and the plane made by the structured light. In the experiments, the camera was calibrated with a calibration chart printed on a laser printer at 300 dpi resolution. A high-distortion lens, COSMICAR 4.2 mm, was used to see whether the neural network could effectively calibrate camera distortion. The 2-D range of several objects is measured with a laser range finding system composed of a camera, a frame grabber, and laser structured light. The performance of the 3-D range finding system was evaluated through experiments and analysis of the results.


Speech Emotion Recognition Using 2D-CNN with Mel-Frequency Cepstrum Coefficients

  • Eom, Youngsik;Bang, Junseong
    • Journal of information and communication convergence engineering
    • /
    • Vol. 19, No. 3
    • /
    • pp.148-154
    • /
    • 2021
  • With the advent of context-aware computing, many attempts have been made to understand emotions. Among these, Speech Emotion Recognition (SER) is a method of recognizing a speaker's emotions from speech information. Successful SER depends on selecting distinctive 'features' and 'classifying' them in an appropriate way. In this paper, the performance of SER using neural network models (e.g., fully connected network (FCN), convolutional neural network (CNN)) with Mel-Frequency Cepstral Coefficients (MFCC) is examined in terms of the accuracy and distribution of emotion recognition. For the Ryerson Audio-Visual Database of Emotional Speech and Song (RAVDESS) dataset, after tuning model parameters, a two-dimensional Convolutional Neural Network (2D-CNN) model with MFCC showed the best performance, with an average accuracy of 88.54% for five emotions (anger, happiness, calm, fear, and sadness) of men and women. In addition, an examination of the distribution of emotion recognition accuracies across the neural network models indicates that the 2D-CNN with MFCC can be expected to achieve an overall accuracy of 75% or more.

Recognition of Virtual Written Characters Based on Convolutional Neural Network

  • Leem, Seungmin;Kim, Sungyoung
    • Journal of Platform Technology
    • /
    • Vol. 6, No. 1
    • /
    • pp.3-8
    • /
    • 2018
  • This paper proposes a technique, based on a convolutional neural network (CNN), for recognizing online handwritten cursive data obtained by tracing the motion trajectory of a user writing in 3D space. A virtual character input by a user in 3D space is difficult to recognize because it includes both character strokes and movement strokes. In this paper, we divide each syllable into consonant and vowel units by using a labeling technique in addition to the result of localizing letter strokes and movement strokes from the previous study. The coordinate information of the separated consonants and vowels is converted into image data, and Korean handwriting recognition is performed using a convolutional neural network. After the neural network is trained on 1,680 syllables written by five writers, accuracy is measured on new writers who did not contribute to the training data. The accuracy of phoneme-based recognition with the convolutional neural network is 98.9%. The proposed method has the advantage of drastically reducing the amount of training data compared with syllable-based learning.
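The step of converting stroke coordinates into image data could look roughly like the sketch below. It is an assumed illustration only: the grid size, the normalization to the bounding box, and the function name `rasterize` are hypothetical, and the sketch marks only the sampled points rather than drawing lines between them.

```python
def rasterize(points, size=8):
    """Map (x, y) trajectory points onto a size-by-size binary grid.

    Coordinates are shifted to the bounding box and scaled by its longer
    side so that the aspect ratio of the stroke is preserved.
    """
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    min_x, min_y = min(xs), min(ys)
    span = max(max(xs) - min_x, max(ys) - min_y) or 1.0  # avoid dividing by zero
    image = [[0] * size for _ in range(size)]
    for x, y in points:
        col = min(int((x - min_x) / span * (size - 1) + 0.5), size - 1)
        row = min(int((y - min_y) / span * (size - 1) + 0.5), size - 1)
        image[row][col] = 1
    return image

# A hypothetical diagonal stroke sampled at four points.
for row in rasterize([(0, 0), (1, 1), (2, 2), (3, 3)], size=4):
    print(row)
```

Normalizing each consonant or vowel to a fixed grid gives the CNN inputs of a constant shape regardless of how large the character was written in space.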

Camera Calibration Using Neural Network with a Small Amount of Data

  • 도용태
    • 센서학회지
    • /
    • Vol. 28, No. 3
    • /
    • pp.182-186
    • /
    • 2019
  • When a camera is employed for 3D sensing, accurate camera calibration is vital because it is a prerequisite for the subsequent steps of the sensing process. Camera calibration is usually performed by complex mathematical modeling and geometric analysis. In contrast, data learning using an artificial neural network can establish a transformation between the 3D space and the 2D camera image without explicit camera modeling. However, a neural network requires a large amount of accurate data for learning, and collecting extensive data accurately in practice takes considerable time and work with a precise system setup. In this study, we propose a two-step neural calibration method that is effective when only a small amount of learning data is available. In the first step, the camera projection transformation matrix is determined using the limited available data. In the second step, the transformation matrix is used to generate a large amount of synthetic data, and the neural network is trained on the generated data. Simulation results have shown that the proposed method is valid and effective.
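The second step, generating synthetic 3D-to-2D training pairs from a projection matrix, can be sketched as follows. The sketch assumes the 3x4 matrix is already known; the matrix `P` below is a made-up pinhole camera (focal length 500, principal point (320, 240)), and the sampling bounds and function names are hypothetical. Estimating the matrix from the limited real data (the first step) is not shown.

```python
import random

def project(P, point):
    """Apply a 3x4 projection matrix to a 3-D point and return pixel (u, v)."""
    X = (*point, 1.0)  # homogeneous coordinates
    u, v, w = (sum(p * x for p, x in zip(row, X)) for row in P)
    return u / w, v / w

def synthesize(P, n, bounds, seed=0):
    """Sample n random 3-D points inside `bounds` and pair each with its projection."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n):
        point = tuple(rng.uniform(lo, hi) for lo, hi in bounds)
        pairs.append((point, project(P, point)))
    return pairs

# Hypothetical camera at the origin looking along +z.
P = [[500.0, 0.0, 320.0, 0.0],
     [0.0, 500.0, 240.0, 0.0],
     [0.0, 0.0, 1.0, 0.0]]
# Keep z strictly positive so the homogeneous divisor never reaches zero.
data = synthesize(P, 100, bounds=[(-1.0, 1.0), (-1.0, 1.0), (1.0, 5.0)])
print(len(data), data[0])
```

Pairs produced this way inherit any error in the estimated matrix, so the quality of the first step bounds what the network can learn from the synthetic data.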