• Title/Summary/Keyword: Model Feature Map


A study on speech enhancement using complex-valued spectrum employing Feature map Dependent attention gate (특징 맵 중요도 기반 어텐션을 적용한 복소 스펙트럼 기반 음성 향상에 관한 연구)

  • Jaehee Jung;Wooil Kim
    • The Journal of the Acoustical Society of Korea / v.42 no.6 / pp.544-551 / 2023
  • Speech enhancement, which aims to improve the perceptual quality and intelligibility of noisy speech, has moved from methods that use only the magnitude spectrum to methods that use a complex-valued spectrum, which can improve both magnitude and phase. This paper studies how to apply an attention mechanism to complex-valued spectrum-based speech enhancement systems to further improve the intelligibility and quality of noisy speech. The attention is based on additive attention, and the attention weight is calculated with the complex-valued spectrum taken into account. In addition, global average pooling is used to account for the importance of each feature map. Complex-valued spectrum-based speech enhancement was performed with the Deep Complex U-Net (DCUNET) model, and additive attention was applied based on the method proposed in the Attention U-Net model. Experiments on noisy speech in a living room environment showed that the proposed method improved performance over the baseline model on evaluation metrics such as Source to Distortion Ratio (SDR), Perceptual Evaluation of Speech Quality (PESQ), and Short-Time Objective Intelligibility (STOI), and that the improvement was consistent across various background noise environments and low Signal-to-Noise Ratio (SNR) conditions. These results demonstrate the effectiveness of the proposed speech enhancement system in improving the intelligibility and quality of noisy speech.
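
The role of global average pooling in weighting feature maps can be illustrated with a minimal sketch. It is not the authors' implementation: pooling the magnitude of the complex values and gating with a sigmoid are assumptions made here for illustration, and the names `global_average_pool`, `channel_weights`, and `apply_weights` are hypothetical.

```python
import math

def global_average_pool(feature_maps):
    """Return one scalar per feature map: the mean magnitude of its complex entries."""
    pooled = []
    for fmap in feature_maps:
        magnitudes = [abs(v) for row in fmap for v in row]
        pooled.append(sum(magnitudes) / len(magnitudes))
    return pooled

def channel_weights(pooled):
    """Squash each pooled scalar into (0, 1) with a sigmoid (assumed gating function)."""
    return [1.0 / (1.0 + math.exp(-p)) for p in pooled]

def apply_weights(feature_maps, weights):
    """Scale every entry of a feature map by that map's importance weight."""
    return [[[w * v for v in row] for row in fmap]
            for fmap, w in zip(feature_maps, weights)]

# Two toy 2x2 complex-valued feature maps (values are made up).
maps = [
    [[1 + 1j, 0j], [0j, 1 - 1j]],
    [[3 + 4j, 3 + 4j], [3 + 4j, 3 + 4j]],
]
pooled = global_average_pool(maps)
weights = channel_weights(pooled)
gated = apply_weights(maps, weights)
```

In this toy input the second map has the larger average magnitude, so it receives the larger weight and is attenuated less than the first.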

Parking Space Recognition for Autonomous Valet Parking Using Height and Salient-Line Probability Maps

  • Han, Seung-Jun;Choi, Jeongdan
    • ETRI Journal / v.37 no.6 / pp.1220-1230 / 2015
  • An autonomous valet parking (AVP) system is designed to locate a vacant parking space and park the vehicle in which it resides on behalf of the driver, once the driver has left the vehicle. In addition, the AVP system can direct the vehicle to a location desired by the driver when requested. In this paper, we introduce technology that recognizes a parking space for an AVP system using image sensors. The proposed technology is divided into three main parts. First, spatial analysis is carried out using a height map based on dense motion stereo. Second, road markings are modelled using a probability map built with a new salient-line feature extractor. Finally, parking spaces are recognized with a Bayesian classifier. The experimental results show an execution time of up to 10 ms and a recognition rate of over 99%. The performance and properties of the proposed technology were also evaluated with a variety of data. Our algorithms, which are part of the proposed technology, are expected to be applicable to various research areas regarding autonomous vehicles, such as map generation, road marking recognition, localization, and environment recognition.

An Algorithm to Update a Codebook Using a Neural Net (신경회로망을 이용한 코드북의 순차적 갱신 알고리듬)

  • 정해묵;이주희;이충웅
    • Journal of the Korean Institute of Telematics and Electronics / v.26 no.11 / pp.1857-1866 / 1989
  • This paper proposes an algorithm that uses a neural network to update a codebook across consecutive images. With Kohonen's self-organizing feature map, we adopt an iterative technique that updates the centroid of each cluster instead of the unsupervised learning technique. Because the performance of this neural model is comparable to that of the LBG algorithm, the codebooks of consecutive TV frames can be updated sequentially, and the method can be realized in hardware for real-time operation.
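
As a rough illustration of updating a codebook from one frame to the next, the sketch below performs a single batch centroid update of the kind used in LBG-style clustering, starting from the previous frame's codebook. It is not the algorithm from the paper; `nearest`, `update_codebook`, and the sample vectors are hypothetical.

```python
def nearest(codebook, vec):
    """Index of the codeword closest to vec in squared Euclidean distance."""
    return min(range(len(codebook)),
               key=lambda i: sum((c - x) ** 2 for c, x in zip(codebook[i], vec)))

def update_codebook(codebook, vectors):
    """Assign each vector to its nearest codeword, then move each codeword
    to the centroid of the vectors assigned to it."""
    dim = len(codebook[0])
    sums = [[0.0] * dim for _ in codebook]
    counts = [0] * len(codebook)
    for vec in vectors:
        i = nearest(codebook, vec)
        counts[i] += 1
        for d in range(dim):
            sums[i][d] += vec[d]
    # A codeword with no assigned vectors keeps its previous value.
    return [
        [s / counts[i] for s in sums[i]] if counts[i] else list(codebook[i])
        for i in range(len(codebook))
    ]

previous = [[0.0, 0.0], [10.0, 10.0]]          # codebook from the previous frame
frame = [[1.0, 1.0], [3.0, 1.0], [9.0, 11.0]]  # training vectors from the new frame
updated = update_codebook(previous, frame)
```

Reusing the previous codebook as the starting point is what makes a sequential update cheap when consecutive frames are similar: most codewords move only slightly.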


Active Shape Model-based Object Tracking using Depth Sensor (깊이 센서를 이용한 능동형태모델 기반의 객체 추적 방법)

  • Jung, Hun Jo;Lee, Dong Eun
    • Journal of Korea Society of Digital Industry and Information Management / v.9 no.1 / pp.141-150 / 2013
  • This study proposes a method that separates an object using a depth sensor and tracks it with an Active Shape Model (ASM). Unlike a common visual camera, a depth sensor is not affected by illumination intensity, so the object can be extracted more robustly. The proposed algorithm removes the horizontal component from the initial depth map and separates the object using the vertical component. In addition, morphology and labeling operations are applied for efficient image correction and object extraction. Applying the Active Shape Model to the extracted object makes tracking more robust, because the Active Shape Model is robust to object occlusion. Compared with visual camera-based object tracking algorithms, the proposed method, which uses the depth information from the sensor, is more efficient and robust at object tracking. Experimental results show that the proposed ASM-based algorithm using a depth sensor can robustly track objects in real time.

Convex Sharp Edge Detection of CAD Surfaces without Topology (토폴로지 정보가 없는 CAD 곡면의 꺾인 모서리 탐색)

  • 박정환;이정근
    • Journal of the Korean Society for Precision Engineering / v.17 no.2 / pp.73-79 / 2000
  • The part surface of a mold or stamping die is a compound surface made up of many composite surfaces, and it may contain various types of feature shapes, including convex sharp edges (CSE). These CSE features must be handled with care when machining the surface, which requires extracting CSE curves on the compound surface. This is relatively easy for a solid model, which has complete topology information. For a compound surface without topology information, however, the CSE curves must be gathered through geometric calculations that take considerable computation time. This paper presents a method for extracting CSE curves by constructing a CSE region map, which reduces the computation time, and for detecting various common edge types.


A Study on Attention Mechanism in DeepLabv3+ for Deep Learning-based Semantic Segmentation (딥러닝 기반의 Semantic Segmentation을 위한 DeepLabv3+에서 강조 기법에 관한 연구)

  • Shin, SeokYong;Lee, SangHun;Han, HyunHo
    • Journal of the Korea Convergence Society / v.12 no.10 / pp.55-61 / 2021
  • In this paper, we propose a DeepLabv3+-based encoder-decoder model that uses an attention mechanism for precise semantic segmentation. DeepLabv3+ is a deep learning-based semantic segmentation method mainly used in applications such as autonomous vehicles and infrared image analysis. The conventional DeepLabv3+ makes little use of the encoder's intermediate feature maps in the decoder, which causes loss in the restoration process. This restoration loss reduces segmentation accuracy. Therefore, the proposed method first minimizes the restoration loss by using one additional intermediate feature map. Furthermore, to use this feature map effectively, we fuse the feature maps hierarchically, starting from the smallest. Finally, we apply an attention mechanism to the decoder to maximize its ability to fuse the intermediate feature maps. We evaluated the proposed method on the Cityscapes dataset, which is commonly used for street scene image segmentation research. Experimental results showed that the proposed method improved segmentation results compared to the conventional DeepLabv3+. The proposed method can be used in applications that require high accuracy.

Analysis of Three Dimensional Positioning Accuracy of Vectorization Using UAV-Photogrammetry (무인항공사진측량을 이용한 벡터화의 3차원 위치정확도 분석)

  • Lee, Jae One;Kim, Doo Pyo
    • Journal of the Korean Society of Surveying, Geodesy, Photogrammetry and Cartography / v.37 no.6 / pp.525-533 / 2019
  • There are two feature collection methods in digital mapping using UAV (Unmanned Aerial Vehicle) photogrammetry: vectorization and stereo plotting. In vectorization, planar information is extracted from orthomosaics, and elevation values are obtained from a DSM (Digital Surface Model) or a DEM (Digital Elevation Model). However, the positional accuracy of 3D features such as ground facilities and buildings remains unclear, because the accuracy of vectorization results has mainly been analyzed using only check points placed on the ground. Thus, this study aims to review the feasibility of 3D spatial information acquisition and digital map production by vectorization, analyzing the corner point coordinates of different layers as well as check points. To this end, images were taken by a Phantom 4 (DJI) with a GSD (Ground Sample Distance) of 3.6 cm at an altitude of 90 m. The outcomes indicate that the horizontal RMSE (Root Mean Square Error) of the vectorization method is 0.045 cm, calculated from residuals at check points compared with the field survey results. It is therefore possible to produce a digital topographic (plane) map of 1:1,000 scale using ortho images. On the other hand, the three-dimensional accuracy of vectorization was 0.068~0.162 m in horizontal RMSE and 0.090~1.840 m in vertical RMSE. It is thus difficult to obtain 3D spatial information and produce a 1:1,000 digital map by vectorization because of the large error in elevation.
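
For readers unfamiliar with how an RMSE is derived from check-point residuals, the following minimal sketch shows the arithmetic. The residual values are invented for illustration and are not taken from the study; treating the horizontal RMSE as the RMSE of the combined planar error is also an assumption, since some reports give per-axis values instead.

```python
import math

def rmse(residuals):
    """Root mean square error of a list of residuals (measured minus reference)."""
    return math.sqrt(sum(r * r for r in residuals) / len(residuals))

def horizontal_rmse(dx, dy):
    """RMSE of the planar error, combining easting (dx) and northing (dy) residuals."""
    return math.sqrt(sum(x * x + y * y for x, y in zip(dx, dy)) / len(dx))

# Hypothetical residuals in metres at three check points.
dx = [0.03, -0.04, 0.00]
dy = [0.04, 0.03, 0.05]
dz = [0.10, -0.10, 0.10]

horizontal = horizontal_rmse(dx, dy)
vertical = rmse(dz)
```

Because residuals are squared before averaging, a few large elevation errors dominate the vertical RMSE, which is consistent with the wide vertical range reported above.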

Emotion recognition in speech using hidden Markov model (은닉 마르코프 모델을 이용한 음성에서의 감정인식)

  • 김성일;정현열
    • Journal of the Institute of Convergence Signal Processing / v.3 no.3 / pp.21-26 / 2002
  • This paper presents a new approach to identifying human emotional states such as anger, happiness, normal, sadness, or surprise. This is accomplished using discrete duration continuous hidden Markov models (DDCHMM). For this, the emotional feature parameters are first defined from input speech signals. In this study, we used prosodic parameters such as pitch signals, energy, and their derivatives, which were then used to train HMMs for recognition. Speaker-adapted emotional models based on maximum a posteriori (MAP) estimation were also considered for speaker adaptation. The simulation results showed that the recognition rate for vocal emotion gradually increased as the number of adaptation samples increased.


A Study on Speaker Adaptation of Large Continuous Spoken Language Using back-off bigram (Back-off bigram을 이용한 대용량 연속어의 화자적응에 관한 연구)

  • 최학윤
    • The Journal of Korean Institute of Communications and Information Sciences / v.28 no.9C / pp.884-890 / 2003
  • In this paper, we studied speaker adaptation methods that improve a speaker-independent recognition system. For speaker-independent recognition, we compared results between the bigram and the back-off bigram, and between MAP and MLLR. Because the back-off bigram uses the unigram and a back-off weight as the bigram probability value, it has the effect of adding a small weight to the bigram probability. The experiment used 39-dimensional feature vectors as speech parameters, consisting of 12 MFCCs, log energy, and their delta and delta-delta parameters. For this recognition experiment, we constructed a system based on CHMMs with triphone recognition units and bigram and back-off bigram language models.
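
The idea of backing off from a bigram to a weighted unigram can be sketched as follows. The abstract does not state which discounting scheme the paper uses, so absolute discounting is assumed here purely for illustration; `train`, `backoff_prob`, and the `discount` parameter are hypothetical names.

```python
from collections import Counter

def train(tokens):
    """Count unigrams and adjacent-word bigrams in a token list."""
    return Counter(tokens), Counter(zip(tokens, tokens[1:]))

def backoff_prob(w1, w2, unigrams, bigrams, discount=0.5):
    """P(w2 | w1): discounted bigram estimate if the pair was seen,
    otherwise a back-off weight times the unigram probability."""
    total = sum(unigrams.values())
    p_uni = unigrams[w2] / total
    followers = {b: c for (a, b), c in bigrams.items() if a == w1}
    context = sum(followers.values())
    if context == 0:
        return p_uni                      # w1 never seen as a context
    if w2 in followers:
        return (followers[w2] - discount) / context
    if p_uni == 0.0:
        return 0.0                        # w2 is outside the vocabulary
    # Probability mass removed by discounting, spread over unseen followers.
    reserved = discount * len(followers) / context
    unseen_mass = 1.0 - sum(unigrams[w] / total for w in followers)
    return reserved / unseen_mass * p_uni

unigrams, bigrams = train("a b a b a c".split())
p_seen = backoff_prob("a", "b", unigrams, bigrams)    # pair observed in training
p_unseen = backoff_prob("a", "a", unigrams, bigrams)  # pair never observed
```

The unseen pair still receives a nonzero probability, and the probabilities over the vocabulary for a given context sum to one.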

Receptor-oriented Pharmacophore-based in silico Screening of Human Catechol O-Methyltransferase for the Design of Antiparkinsonian Drug

  • Lee, Jee-Young;Baek, Sun-Hee;Kim, Yang-Mee
    • Bulletin of the Korean Chemical Society / v.28 no.3 / pp.379-385 / 2007
  • Receptor-oriented pharmacophore-based in silico screening is a powerful tool for rapidly screening large numbers of compounds for interactions with a given protein. Inhibition of the enzyme catechol-O-methyltransferase (COMT) offers a novel possibility for treating Parkinson's disease. Bisubstrate inhibitors of COMT containing the adenine of S-adenosylmethionine (SAM) and a catechol moiety are a new class of potent and selective inhibitors. In the present study, we used receptor-oriented pharmacophore-based in silico screening to examine the interactions between the active site of human COMT (hCOMT) and bisubstrate inhibitors. We generated 20 pharmacophore maps, of which 4 reproduced the docking model of hCOMT and a bisubstrate inhibitor. Only one of these four, pharmacophore map I, effectively described the common features of a series of bisubstrate inhibitors. Pharmacophore map I consisted of one hydrogen bond acceptor (to Mg2+), three hydrogen bond donors (to Glu199, Glu90, and Gln120), and one hydrophobic feature (an active site region surrounded by several aromatic and hydrophobic residues). This map represented the most essential pharmacophore for explaining interactions between hCOMT and a bisubstrate inhibitor. These results revealed a pharmacophore that should help in the development of new drugs for treating Parkinson's disease.