• Title / Abstract / Keywords: 2D Dataset

200 search results (processing time: 0.026 s)

Speech Emotion Recognition Using 2D-CNN with Mel-Frequency Cepstrum Coefficients

  • Eom, Youngsik;Bang, Junseong
    • Journal of Information and Communication Convergence Engineering / Vol. 19, No. 3 / pp. 148-154 / 2021
  • With the advent of context-aware computing, many attempts have been made to understand emotions. Among these, Speech Emotion Recognition (SER) recognizes the speaker's emotions from speech information. SER succeeds when distinctive features are selected and classified in an appropriate way. In this paper, the performance of SER using neural network models (e.g., a fully connected network (FCN) and a convolutional neural network (CNN)) with Mel-Frequency Cepstral Coefficients (MFCC) is examined in terms of the accuracy and distribution of emotion recognition. On the Ryerson Audio-Visual Database of Emotional Speech and Song (RAVDESS) dataset, after tuning model parameters, a two-dimensional Convolutional Neural Network (2D-CNN) with MFCC showed the best performance, with an average accuracy of 88.54% for five emotions (anger, happiness, calm, fear, and sadness) across male and female speakers. In addition, an examination of the distribution of recognition accuracies across the neural network models shows that the 2D-CNN with MFCC can be expected to achieve an overall accuracy of 75% or more.
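
A minimal sketch of the pipeline this abstract describes: MFCC features extracted from speech and fed to a small 2D-CNN classifier. The layer sizes, hyperparameters, and helper names below are illustrative assumptions, not the authors' exact model.

```python
# Sketch: MFCC extraction + a small 2D-CNN emotion classifier (illustrative only).
import numpy as np
import librosa
from tensorflow.keras import layers, models

def extract_mfcc(path, sr=22050, n_mfcc=40, max_frames=200):
    y, sr = librosa.load(path, sr=sr)
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc)          # (n_mfcc, frames)
    mfcc = librosa.util.fix_length(mfcc, size=max_frames, axis=1)   # pad/trim time axis
    return mfcc[..., np.newaxis]                                    # (n_mfcc, frames, 1)

def build_2d_cnn(input_shape=(40, 200, 1), n_classes=5):
    model = models.Sequential([
        layers.Conv2D(32, (3, 3), activation="relu", input_shape=input_shape),
        layers.MaxPooling2D((2, 2)),
        layers.Conv2D(64, (3, 3), activation="relu"),
        layers.MaxPooling2D((2, 2)),
        layers.Flatten(),
        layers.Dense(128, activation="relu"),
        layers.Dense(n_classes, activation="softmax"),  # anger, happiness, calm, fear, sadness
    ])
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model
```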

3D Mesh Reconstruction Technique from Single Image using Deep Learning and Sphere Shape Transformation Method

  • 김정윤;이승호
    • 전기전자학회논문지 / Vol. 26, No. 2 / pp. 160-168 / 2022
  • This paper proposes a 3D mesh reconstruction technique from a single image using deep learning and a sphere shape transformation method. The proposed technique differs from existing approaches in the following ways. First, instead of constructing edges or faces by connecting nearby points as in conventional methods, a deep learning network modifies the positions of a sphere's vertices so that they closely match the 3D point cloud of the object. Because a 3D point cloud is used, less memory is required, and because only addition operations between the sphere's vertices and the offset values are performed, faster computation is possible. Second, the 3D mesh is reconstructed by applying the sphere's face information to the modified vertices. Even when the points of the 3D point cloud generated by modifying the sphere's vertex positions are unevenly spaced, the sphere's face information, which already indicates which points are connected, serves as the face information of the 3D mesh, preventing oversimplified or incomplete representations. To objectively evaluate the reliability of the proposed technique, experiments were conducted on the publicly available ShapeNet dataset in the same manner as the comparison papers; the proposed technique achieved an IoU of 0.581 and a chamfer distance of 0.212. Since a higher IoU and a lower chamfer distance indicate better results, these values demonstrate that the proposed technique reconstructs 3D meshes more effectively than the techniques reported in other papers.
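
A minimal sketch of the core idea described above: a network predicts per-vertex offsets that are simply added to a template sphere's vertices, and the sphere's triangle connectivity is reused unchanged as the face information of the reconstructed mesh. The offset predictor is a placeholder; the paper's actual network and training are not reproduced.

```python
import numpy as np

def reconstruct_mesh(image_feature, sphere_vertices, sphere_faces, predict_offsets):
    """
    image_feature   : features extracted from the single input image (assumed)
    sphere_vertices : (N, 3) template sphere vertex positions
    sphere_faces    : (M, 3) triangle indices into the vertex array
    predict_offsets : callable (image_feature, sphere_vertices) -> (N, 3) offsets,
                      standing in for the paper's deep network
    """
    offsets = predict_offsets(image_feature, sphere_vertices)
    deformed_vertices = sphere_vertices + offsets   # only additions, as described above
    # The sphere's face list already encodes which vertices are connected,
    # so it is reused directly as the face information of the output mesh.
    return deformed_vertices, sphere_faces
```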

Integration of IKONOS-2 Satellite Imagery and ALS Dataset by Compensating Biases of RPC Models

  • 이재빈;유기윤;이창노;송우석
    • 대한토목학회논문집 / Vol. 28, No. 3D / pp. 437-444 / 2008
  • This study aims to compensate for the biases in the RPC model, the sensor model of IKONOS-2 satellite imagery, using airborne laser scanning (ALS) data, and to apply and verify various adjustment models for integrating IKONOS-2 imagery with ALS data. Considering the characteristics of the ALS data (range data) and the IKONOS-2 imagery (image data), linear features were used as the basic registration primitives for bias compensation. The biases in the RPC model of the IKONOS-2 imagery are adjusted in image space using linear features extracted from the ALS data. In particular, this study applied the registration methodology to the various transformation functions proposed in previous studies for compensating RPC biases and evaluated the registration accuracy achieved with each transformation function. The effect of the number of linear features used for registration was also evaluated. Finally, accuracy assessment was performed using checkpoints obtained by DGPS surveying, and the effectiveness of the results was verified.
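
A minimal sketch of the bias-compensation idea under a simplifying assumption: the RPC-projected image coordinates are corrected with an affine transform estimated by least squares. The paper compares several transformation functions and uses linear features extracted from the ALS data as registration primitives; point correspondences stand in for those primitives here.

```python
import numpy as np

def fit_affine_bias(rpc_xy, ref_xy):
    """rpc_xy, ref_xy: (N, 2) image coordinates from the RPC projection and from
    reference (ALS-derived) features. Returns a 2x3 affine correction matrix."""
    ones = np.ones((rpc_xy.shape[0], 1))
    A = np.hstack([rpc_xy, ones])                         # (N, 3) design matrix
    params, *_ = np.linalg.lstsq(A, ref_xy, rcond=None)   # (3, 2) least-squares solution
    return params.T                                       # (2, 3)

def apply_affine_bias(rpc_xy, affine):
    """Apply the estimated affine correction to RPC-projected image coordinates."""
    ones = np.ones((rpc_xy.shape[0], 1))
    return np.hstack([rpc_xy, ones]) @ affine.T           # corrected (N, 2) coordinates
```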

Generating A Synthetic Multimodal Dataset for Vision Tasks Involving Hands

  • 이창화;이선경;김동욱;정찬양;백승렬
    • 한국정보처리학회:학술대회논문집 / 한국정보처리학회 2020년도 추계학술발표대회 / pp. 1052-1055 / 2020
  • This paper proposes a new multimodal synthetic benchmark that can be used for various computer vision tasks involving human hands, including 3D mesh information, RGB-D hand poses, and 2D/3D hand/segmentation masks. The generated dataset covers diverse hand pose variations by leveraging the existing large-scale BigHand2.2M dataset and the deformable 3D hand mesh MANO model. First, to reduce redundant hand poses, a strategic sampling method is used and the 3D mesh model is fitted to the sampled hands. The shape and viewpoint parameters of the 3D mesh are explored to handle the natural variability of human hand images. Finally, the multimodal data are generated. The data space of hand joints, shapes, and viewpoints is compared with that of existing benchmarks, showing that the proposed benchmark fills the gaps left by previous work. In addition, RGB-based hand pose estimation experiments, in which the proposed data are used for network training, show that the generated data are of good quality and sufficient quantity. The proposed data are expected to accelerate progress on tasks that have been hindered by the lack of large, high-quality datasets, such as RGB-based 3D hand pose estimation and semantic hand segmentation.
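
A minimal sketch of one step of the pipeline described above: strategically sampling hand poses so that near-duplicate poses are reduced, here via greedy farthest-point sampling in joint-parameter space. The pose source (e.g., BigHand2.2M) and the subsequent MANO mesh fitting and rendering are not reproduced; `all_poses` is an assumed array of pose vectors.

```python
import numpy as np

def farthest_point_sample_poses(all_poses, n_samples, seed=0):
    """all_poses: (N, D) pose vectors; returns indices of n_samples diverse poses."""
    rng = np.random.default_rng(seed)
    chosen = [int(rng.integers(len(all_poses)))]
    dists = np.linalg.norm(all_poses - all_poses[chosen[0]], axis=1)
    for _ in range(n_samples - 1):
        idx = int(np.argmax(dists))                  # pose farthest from the chosen set
        chosen.append(idx)
        dists = np.minimum(dists, np.linalg.norm(all_poses - all_poses[idx], axis=1))
    return np.array(chosen)
```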

The Effect of R&D Expenditure on Firm Output: Empirical Evidence from Vietnam

  • BINH, Quan Minh Quoc;TUNG, Le Thanh
    • The Journal of Asian Finance, Economics and Business / Vol. 7, No. 6 / pp. 379-385 / 2020
  • The effect of research and development (R&D) expenditure on firm output is an interesting topic, but it has hardly been explored in developing countries due to the unavailability of data. This study investigates the topic in the context of Vietnam by utilizing a novel dataset of 343 firms listed on the Vietnam Stock Exchange over the 2010-2018 period. The effect of R&D expenditure is examined under the production function framework. To check the robustness of the quantitative results, we estimate the production function with two complementary techniques, OLS and 2SLS. An instrumental variable regression technique is adopted to avoid the endogeneity problem between R&D expenditure and other variables. In our empirical analysis, we find that R&D expenditure has a positive and significant impact on output growth, and the finding is robust in both the OLS and 2SLS frameworks. Moreover, the output elasticity with respect to R&D expenditure in our results is much higher than the elasticities estimated for other countries, implying that a 1% increase in R&D expenditure in Vietnam expands output by more than a 1% increase in R&D investment does elsewhere. The findings provide important implications for firm managers, investors, and policymakers in Vietnam.
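
A minimal sketch of the production-function framework under the usual log-log (Cobb-Douglas) assumption: output regressed on capital, labour, and R&D expenditure by OLS. Column names are illustrative; the paper additionally estimates a 2SLS specification with instruments, which is not shown here.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm

def fit_rd_production_function(df: pd.DataFrame):
    """df columns (assumed): output, capital, labour, rd_expenditure."""
    y = np.log(df["output"])
    X = sm.add_constant(np.log(df[["capital", "labour", "rd_expenditure"]]))
    model = sm.OLS(y, X).fit()
    # The coefficient on log R&D expenditure is the output elasticity of R&D.
    return model
```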

Definition of 3D Modeling Level of Detail in BIM Regeneration Through Reverse Engineering - Case Study on 3D Modeling Using Terrestrial LiDAR -

  • 채재현;이지영
    • 한국BIM학회 논문집 / Vol. 7, No. 4 / pp. 8-20 / 2017
  • When BIM is built through reverse engineering, the level of detail (LoD) required for the final deliverables differs according to the purpose. Concrete criteria defining the LoD of 3D modeling are therefore needed for each reverse-engineering purpose. This research presents criteria for 1) positional accuracy, 2) generalization level, 3) scale level, 4) scope of description, and 5) applicable areas, classifying 3D-modeling LoD from 1 to 6 for each reverse-engineering purpose. Moreover, by applying these criteria to a 3D point cloud dataset of a building captured with terrestrial LiDAR, the research measures the working hours required for reverse-engineering 3D modeling at each defined LoD. These findings are expected to serve as basic information that helps modeling practitioners decide whether a deliverable at a given LoD suits their budget and intended application.
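
A minimal sketch of a data structure capturing the five criteria the paper uses to define each reverse-engineering LoD; no numeric thresholds from the paper are reproduced, and the field descriptions are paraphrased assumptions.

```python
from dataclasses import dataclass

@dataclass
class ReverseEngineeringLoD:
    level: int                 # 1..6
    positional_accuracy: str   # allowable deviation of the model from the point cloud
    generalization_level: str  # how much geometric simplification is allowed
    scale_level: str           # drawing/map scale the model corresponds to
    scope_of_description: str  # which building elements must be modeled
    application_area: str      # uses for which this LoD is sufficient
```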

A Novel Method for Hand Posture Recognition Based on Depth Information Descriptor

  • Xu, Wenkai;Lee, Eung-Joo
    • KSII Transactions on Internet and Information Systems (TIIS) / Vol. 9, No. 2 / pp. 763-774 / 2015
  • Hand posture recognition has had a wide range of applications in Human Computer Interaction and Computer Vision for many years. The problem is challenging mainly due to the high dexterity of the hand, self-occlusions under the camera's limited view, and illumination variations. To remedy these problems, this paper proposes a hand posture recognition method using 3-D point clouds that explicitly exploits the 3-D information in depth maps. Firstly, the hand region is segmented with a set of depth thresholds. Next, hand image normalization is performed to ensure that the extracted feature descriptors are scale and rotation invariant. By robustly coding and pooling 3-D facets, the proposed descriptor can effectively represent various hand postures. After that, an SVM with a Gaussian kernel function is used to perform posture recognition. Experimental results on a posture dataset captured with a Kinect sensor (postures 1 to 10) demonstrate the effectiveness of the proposed approach; the average recognition rate of our method is over 96%.
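
A minimal sketch of the classification stage only: an SVM with a Gaussian (RBF) kernel trained on pre-extracted posture descriptors. The depth-based 3-D facet descriptor itself is not reproduced; `X` and `y` are assumed to hold descriptor vectors and posture labels (1 to 10), and the hyperparameters are illustrative.

```python
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

def train_posture_svm(X, y):
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)
    clf = SVC(kernel="rbf", C=10.0, gamma="scale")   # Gaussian-kernel SVM
    clf.fit(X_tr, y_tr)
    print("held-out accuracy:", accuracy_score(y_te, clf.predict(X_te)))
    return clf
```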

Design of RBFNNs Pattern Classifier Realized with the Aid of PSO and Multiple Point Signature for 3D Face Recognition

  • 오성권;오승훈
    • 전기학회논문지 / Vol. 63, No. 6 / pp. 797-803 / 2014
  • In this paper, a 3D face recognition system is designed using polynomial-based RBFNNs. In 2D face recognition, recognition performance is reduced by external environmental factors such as illumination and facial pose; 3D face recognition is adopted to compensate for these shortcomings. In the preprocessing part, the 3D face shapes obtained at different pose angles are transformed into frontal shapes through pose compensation. The depth data of the face shape are then extracted using Multiple Point Signature, and overall facial depth information is obtained using two or more reference points. Directly using the extracted high-dimensional data degrades both learning speed and recognition performance, so the principal component analysis (PCA) algorithm is exploited to reduce the dimensionality of the data. Parameter optimization is carried out with the aid of PSO for effective training and recognition. The proposed pattern classifier is experimented with and evaluated using a dataset obtained in the IC & CI Lab.
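
A minimal sketch of the dimensionality-reduction step described above: PCA applied to the high-dimensional depth features before classification. The Multiple Point Signature extraction, the polynomial-based RBFNN classifier, and the PSO optimization are not reproduced; `n_components` is an illustrative value.

```python
from sklearn.decomposition import PCA

def reduce_depth_features(X, n_components=50):
    """X: (n_samples, n_features) high-dimensional depth feature vectors."""
    pca = PCA(n_components=n_components)
    X_reduced = pca.fit_transform(X)   # project onto the leading principal components
    print("explained variance:", pca.explained_variance_ratio_.sum())
    return X_reduced, pca
```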

Evaluation and validation of stem volume models for Quercus glauca in the subtropical forest of Jeju Island, Korea

  • Seo, Yeon Ok;Lumbres, Roscinto Ian C.;Won, Hyun Kyu;Jung, Sung Cheol;Lee, Young Jin
    • Journal of Ecology and Environment / Vol. 38, No. 4 / pp. 485-491 / 2015
  • This study was conducted to develop stem volume models for the volume estimation of Quercus glauca Thunb. in Jeju Island, Republic of Korea, and to validate the developed models using an independent dataset. A total of 167 trees were measured for diameter at breast height (DBH), total height, and stem volume using non-destructive sampling methods. Eighty percent of the dataset was used for initial model development, while the remaining 20% was used for model validation. The performance of the different models was evaluated using the following fit statistics: standard error of estimate (SEE), mean bias, absolute mean deviation (AMD), coefficient of determination (R2), and root mean square error (RMSE). The AMD of the five models across the different DBH classes was determined using the validation dataset. Model 5 (V = aD^bH^c), which estimates volume using DBH and total height as predictor variables, had the best SEE (0.02745), AMD (0.01538), R2 (0.97603), and RMSE (0.02746). Overall, volume models with two independent variables (DBH and total height) performed better than those with only one (DBH), based on the model evaluation and validation. The models developed in this study can provide forest managers with accurate estimates of the stem volumes of Quercus glauca in the subtropical forests of Jeju Island, Korea.
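
A minimal sketch of fitting the best-performing model form, V = aD^bH^c, by nonlinear least squares. Variable names and starting values are illustrative; the paper compares five candidate models and validates them on a held-out 20% of the data, which is not shown here.

```python
import numpy as np
from scipy.optimize import curve_fit

def volume_model(X, a, b, c):
    dbh, height = X
    return a * dbh**b * height**c

def fit_volume_model(dbh, height, volume):
    """dbh, height, volume: 1-D arrays of tree measurements."""
    (a, b, c), _ = curve_fit(volume_model, (dbh, height), volume, p0=(1e-4, 2.0, 1.0))
    rmse = np.sqrt(np.mean((volume_model((dbh, height), a, b, c) - volume) ** 2))
    return (a, b, c), rmse
```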

Object Classification based on Weakly Supervised E2LSH and Saliency map Weighting

  • Zhao, Yongwei;Li, Bicheng;Liu, Xin;Ke, Shengcai
    • KSII Transactions on Internet and Information Systems (TIIS) / Vol. 10, No. 1 / pp. 364-380 / 2016
  • The most popular approach to object classification is based on the bag-of-visual-words model, which has several fundamental problems that restrict its performance, such as low time efficiency, the synonymy and polysemy of visual words, and the lack of spatial information between visual words. In view of this, an object classification method based on weakly supervised E2LSH and saliency-map weighting is proposed. Firstly, E2LSH (Exact Euclidean Locality Sensitive Hashing) is employed to generate a group of weakly randomized visual dictionaries by clustering SIFT features of the training dataset, and the selection of hash functions is effectively supervised, inspired by random forest ideas, to reduce the randomness of E2LSH. Secondly, the graph-based visual saliency (GBVS) algorithm is applied to detect the saliency map of each image and to weight the visual words according to the saliency prior. Finally, a saliency-map-weighted visual language model is used to perform object classification. Experimental results on the Pascal 2007 and Caltech-256 datasets indicate that the distinguishability of objects is effectively improved and that our method is superior to state-of-the-art object classification methods.
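
A minimal sketch of the E2LSH (p-stable, Euclidean) hash family used to map SIFT descriptors into buckets: h(v) = floor((a·v + b) / w), with a drawn from a Gaussian and b uniform on [0, w). The weakly supervised selection of hash functions and the GBVS saliency weighting are not reproduced; parameter values are illustrative.

```python
import numpy as np

class E2LSHHasher:
    def __init__(self, dim, n_hashes=20, w=4.0, seed=0):
        rng = np.random.default_rng(seed)
        self.a = rng.normal(size=(n_hashes, dim))    # p-stable (Gaussian) projections
        self.b = rng.uniform(0.0, w, size=n_hashes)  # random offsets in [0, w)
        self.w = w

    def hash(self, v):
        """Return the integer bucket index of vector v under each hash function."""
        return np.floor((self.a @ v + self.b) / self.w).astype(int)
```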