• Title/Summary/Keyword: feature enhancement

Search results: 258 items; processing time: 0.025 s

Comparative Study of Sonar Image Processing for Underwater Navigation

  • 신영식;조영근;이영준;최현택;김아영
    • 한국해양공학회지 / Vol. 30, No. 3 / pp.214-220 / 2016
  • Imaging sonars such as side-scan sonar and forward-looking sonar are becoming fundamental sensors in the underwater robotics field. However, using sonar images for underwater perception presents many challenges: sonar images usually have low resolution and inherent speckle noise. To overcome the limited sensor information available for underwater perception, we investigated preprocessing methods for sonar images and feature detection methods for a nonlinear scale space. In this paper, we focus on a comparative analysis of (1) preprocessing for sonar images and (2) feature detection performance in relation to the scale space composition.
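
The abstract does not say which preprocessing methods were compared. As a rough illustration of what speckle-noise preprocessing can look like, the sketch below applies a 3x3 median filter, a common generic choice that is assumed here and not taken from the paper. The function name `median_filter_3x3` and the list-of-rows image layout are hypothetical.

```python
from statistics import median


def median_filter_3x3(image):
    """Smooth a grayscale image (list of rows of numbers) with a 3x3 median.

    Border pixels use only the neighbours that actually exist, so the
    output has the same size as the input.
    """
    rows, cols = len(image), len(image[0])
    out = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            # Collect the pixel and its in-bounds neighbours.
            window = [
                image[rr][cc]
                for rr in range(max(0, r - 1), min(rows, r + 2))
                for cc in range(max(0, c - 1), min(cols, c + 2))
            ]
            out[r][c] = median(window)
    return out


# Toy "sonar patch": a flat background with one bright speckle in the middle.
patch = [
    [10, 10, 10],
    [10, 255, 10],
    [10, 10, 10],
]
smoothed = median_filter_3x3(patch)
```

In this toy patch the isolated bright pixel is replaced by the background value, which is the behaviour that makes median filtering attractive before feature detection.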

Estimation of Speech Feature Vectors and Enhancement of Speech Recognition Performance Using Lip Information

  • 민소희;김진영;최승호
    • 대한음성학회지:말소리 / No. 44 / pp.83-92 / 2002
  • Speech recognition performance is severely degraded in noisy environments. One approach to this problem is audio-visual speech recognition. In this paper, we discuss experimental results for bimodal speech recognition based on speech feature vectors enhanced using lip information. We try various speech features, such as linear prediction coefficients, cepstrum, and log area ratio, for transforming lip information into speech parameters. The experimental results show that the cepstrum parameter is the best feature in terms of recognition rate. We also present desirable weighting values for audio and visual information depending on the signal-to-noise ratio.
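
The SNR-dependent weighting of audio and visual information can be pictured as a weighted sum of per-word scores from the two streams. The sketch below is only an assumption about the general form of such a fusion; the function names, the log-likelihood values, and the weights 0.9 and 0.3 are hypothetical and are not the values reported in the paper.

```python
def fuse_scores(audio_scores, visual_scores, audio_weight):
    """Combine per-word log-likelihoods from the audio and visual streams.

    audio_weight is in [0, 1]; the visual stream gets 1 - audio_weight.
    """
    if not 0.0 <= audio_weight <= 1.0:
        raise ValueError("audio_weight must be between 0 and 1")
    return {
        word: audio_weight * audio_scores[word]
        + (1.0 - audio_weight) * visual_scores[word]
        for word in audio_scores
    }


def recognize(audio_scores, visual_scores, audio_weight):
    """Return the word with the highest fused score."""
    fused = fuse_scores(audio_scores, visual_scores, audio_weight)
    return max(fused, key=fused.get)


# Hypothetical log-likelihoods for two candidate words.
audio = {"one": -10.0, "nine": -9.0}
visual = {"one": -4.0, "nine": -8.0}

clean_result = recognize(audio, visual, audio_weight=0.9)  # trust audio at high SNR
noisy_result = recognize(audio, visual, audio_weight=0.3)  # lean on lips at low SNR
```

With these made-up scores, the decision follows the audio stream when its weight is high and switches to the word favoured by the lip information when the weight is lowered.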

Enhancement of Ship's Wheel Order Recognition System using Speaker's Intention Predictive Parameters

  • 문성배
    • Journal of Advanced Marine Engineering and Technology / Vol. 32, No. 5 / pp.791-797 / 2008
  • The officer of the deck (OOD) may sometimes have to keep lookout as well as handle the autopilot without a quartermaster at sea. The purpose of this paper is to develop a ship's autopilot control module using speech recognition in order to reduce the potential risk of a one-man bridge system. Feature parameters predicting the OOD's intention were extracted from sample wheel orders written in SMCP (IMO Standard Marine Communication Phrases). We designed a pre-recognition procedure that generates candidate words using the DTW (Dynamic Time Warping) algorithm and a post-recognition procedure that makes a final decision from the candidate words using the feature parameters. To evaluate the effectiveness of these procedures, an experiment was conducted with 500 wheel orders.
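
The pre-recognition step can be sketched as a DTW comparison between an input utterance and stored word templates, keeping the closest few as candidates. This is a minimal textbook DTW under stated assumptions: features are plain numeric sequences, the local cost is the absolute difference, and the template words and values are hypothetical. The abstract does not give the paper's actual features or candidate count.

```python
def dtw_distance(a, b):
    """Classic dynamic time warping distance between two numeric sequences."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = best accumulated cost aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # a advances, b stays on the same element
                cost[i][j - 1],      # b advances, a stays on the same element
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


def candidate_words(query, templates, k=2):
    """Return the k template words closest to the query under DTW."""
    ranked = sorted(templates, key=lambda word: dtw_distance(query, templates[word]))
    return ranked[:k]


# Hypothetical one-dimensional templates for three wheel-order words.
templates = {
    "port": [1, 2, 3],
    "starboard": [5, 6, 7, 8],
    "midships": [1, 2, 4],
}
candidates = candidate_words([1, 2, 2, 3], templates, k=2)
```

A post-recognition step, as described in the abstract, would then pick one word from `candidates` using additional information about the speaker's likely intention.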

Variation Analysis of Feature Parameters According to the Channel Distortion of Korean Telephone Digit Speech

  • 정성윤;손종목;김민성;배건성
    • 대한전자공학회:학술대회논문집 / 대한전자공학회 2002 Summer Conference Proceedings (4) / pp.191-194 / 2002
  • The ultimate goal of this paper is to improve the speech recognition rate when the telephone environments of the training and test data are matched. To analyze the effect of telephone channel distortion, which changes on every call, MFCC is used as the feature parameter and CMN, RTCN, and RASTA are used as channel compensation techniques. For each case, the variation of the feature parameters of all phones is analyzed. We then obtain recognition rates for each compensation method using a continuous HMM recognizer and examine the relationship between variation and recognition rate.
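
Of the compensation techniques named above, CMN (cepstral mean normalization) is the simplest to show: a fixed channel appears as a roughly constant offset in the cepstral domain, so subtracting the per-coefficient mean over an utterance removes it. The sketch below assumes features are given as a list of frames, each a list of cepstral coefficients; the function name and toy values are hypothetical.

```python
def cepstral_mean_normalize(frames):
    """Subtract the per-coefficient mean over all frames of one utterance."""
    n_frames = len(frames)
    n_coeffs = len(frames[0])
    means = [sum(frame[d] for frame in frames) / n_frames for d in range(n_coeffs)]
    return [[frame[d] - means[d] for d in range(n_coeffs)] for frame in frames]


# Two toy frames with two coefficients each.
clean = [[1.0, 2.0], [3.0, 6.0]]
# The same frames after a hypothetical channel adds a constant offset (+5, -3).
distorted = [[6.0, -1.0], [8.0, 3.0]]

normalized_clean = cepstral_mean_normalize(clean)
normalized_distorted = cepstral_mean_normalize(distorted)
```

Both inputs normalize to the same frames, which is why CMN reduces call-to-call variation of the feature parameters when the distortion is a constant offset. Real channels are only approximately constant, so the cancellation is not exact in practice.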

Enhanced and applicable algorithm for Big-Data by Combining Sparse Auto-Encoder and Load-Balancing, ProGReGA-KF

  • Kim, Hyunah;Kim, Chayoung
    • International Journal of Advanced Culture Technology / Vol. 9, No. 1 / pp.218-223 / 2021
  • The pervasive growth and required enforcement of the Internet of Things (IoT) in distributed massively multiplayer online architectures have resulted in massive growth of Big-Data in terms of server overload. Some previous works have tried to overcome server overloading. However, there is a lack of well-considered methods that are commonly applicable. Therefore, we propose combining a Sparse Auto-Encoder with a Load-Balancing scheme, ProGReGA, for Big-Data of server loads. In the Sparse Auto-Encoder process, when feature patterns are selected, the less relevant feature patterns can be eliminated from the Big-Data. In relation to Load-Balancing, the alleviated degradation of ProGReGA can take advantage of the less redundant feature patterns, which means that the most relevant Big-Data representation can be used. In the performance evaluation, we find that the proposed method becomes more approachable and stable.

End-to-End Learning-based Spatial Scalable Image Compression with Multi-scale Feature Fusion Module

  • 신주연;강제원
    • 한국방송∙미디어공학회:학술대회논문집 / 한국방송∙미디어공학회 2022 Fall Conference / pp.1-3 / 2022
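  • Recently, algorithms that perform compression through end-to-end learning of neural networks, instead of the conventional image compression pipeline, have been actively studied. This paper proposes an end-to-end learning-based spatially scalable compression technique. More specifically, it introduces a multi-scale feature fusion module that, at each layer of the network, fuses the learned features of the lower layer and passes them to the upper layer, so that the upper layer can learn richer feature information and better remove feature redundancy between layers. Compared with the existing method, a BD-rate improvement of 1.37% is obtained in the enhancement layer.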

Improvement in Supervector Linear Kernel SVM for Speaker Identification Using Feature Enhancement and Training Length Adjustment

  • 소병민;김경화;김민석;양일호;김명재;유하진
    • 한국음향학회지 / Vol. 30, No. 6 / pp.330-336 / 2011
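  • In this paper, we propose a method for improving the performance of a speaker identification system that uses a supervector linear kernel SVM. The basic idea of the proposed method is to split long training data into several short training segments. To evaluate the proposed method, we applied feature enhancement with PCA, GKPCA, and KMDA to four different databases, ran experiments, and analyzed the results. The experiments confirmed that the proposed method improves speaker identification performance with the supervector linear kernel SVM.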
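
The basic idea of splitting long training data into several short segments can be sketched as follows. The segment length and the choice to drop a trailing partial segment are assumptions made for illustration; the abstract does not state how the paper chooses segment boundaries, and `split_into_segments` is a hypothetical helper name.

```python
def split_into_segments(frames, segment_length):
    """Split a long sequence of feature frames into equal-length segments.

    A trailing remainder shorter than segment_length is dropped so that
    every training segment has the same length.
    """
    if segment_length <= 0:
        raise ValueError("segment_length must be positive")
    last_start = len(frames) - segment_length
    return [
        frames[start:start + segment_length]
        for start in range(0, last_start + 1, segment_length)
    ]


# Ten toy frames (integers stand in for feature vectors), segments of four.
long_utterance = list(range(10))
segments = split_into_segments(long_utterance, 4)
```

Each resulting segment could then be turned into its own supervector, so one long recording contributes several training examples per speaker instead of one.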

AANet: Adjacency auxiliary network for salient object detection

  • Li, Xialu;Cui, Ziguan;Gan, Zongliang;Tang, Guijin;Liu, Feng
    • KSII Transactions on Internet and Information Systems (TIIS) / Vol. 15, No. 10 / pp.3729-3749 / 2021
  • At present, deep convolution network-based salient object detection (SOD) has achieved impressive performance. However, it is still challenging to make full use of the multi-scale information of the extracted features and to choose an appropriate feature fusion method for processing the feature maps. In this paper, we propose a new adjacency auxiliary network (AANet) based on multi-scale feature fusion for SOD. First, we design a parallel connection feature enhancement module (PFEM) for each layer of feature extraction, which improves the feature density by connecting different dilated convolution branches in parallel and adds a channel attention flow to fully extract the context information of the features. Then, adjacent-layer features with a similar degree of abstraction but different characteristic properties are fused through the adjacent auxiliary module (AAM) to eliminate the ambiguity and noise of the features. In addition, to refine the features effectively and obtain more accurate object boundaries, we design an adjacency decoder (AAM_D) based on the adjacency auxiliary module (AAM), which concatenates the features of adjacent layers, extracts their spatial attention, and then combines them with the output of the AAM. The outputs of AAM_D, which carry the semantic information and spatial detail obtained from each feature, are used as salient prediction maps for multi-level joint feature supervision. Experimental results on six benchmark SOD datasets demonstrate that the proposed method outperforms similar previous methods.

Experiment on Low Light Image Enhancement and Feature Extraction Methods for Rover Exploration in Lunar Permanently Shadowed Region

  • 박재민;홍성철;신휴성
    • 대한토목학회논문집 / Vol. 42, No. 5 / pp.741-749 / 2022
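  • With the discovery of water in the form of ice in the lunar permanently shadowed regions, major space agencies are preparing rover-based in-situ exploration. A lunar permanently shadowed region is the central part of a polar crater that direct sunlight does not reach, but a certain level of low-light conditions is expected to be maintained by sunlight reflected from the crater walls. In this study, we built an indoor testbed simulating the illumination and terrain of a lunar permanently shadowed region and captured simulated terrain images. We applied low-light image enhancement methods (CLAHE, Dehaze, RetinexNet, GLADNet) to the simulated images and analyzed their effects on brightness and color restoration, and we analyzed the resulting performance improvement of feature extraction and matching methods (SIFT, SURF, ORB, AKAZE). In the experiments, GLADNet followed by Dehaze showed the most robust visibility improvement in low-light conditions. In contrast, feature detection and matching performance improved most with Dehaze images followed by GLADNet images, and the performance of ORB and AKAZE in particular improved greatly. In lunar exploration, rover-mounted cameras are used for constructing 3D terrain information and for geological surveys. We therefore judge GLADNet to be useful for identifying soil composition and rock types, and Dehaze to be suitable for constructing 3D terrain information as the rover drives.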

Two-Microphone Binary Mask Speech Enhancement in Diffuse and Directional Noise Fields

  • Abdipour, Roohollah;Akbari, Ahmad;Rahmani, Mohsen
    • ETRI Journal / Vol. 36, No. 5 / pp.772-782 / 2014
  • Two-microphone binary mask speech enhancement (2mBMSE) has been of particular interest in recent literature and has shown promising results. Current 2mBMSE systems rely on spatial cues of speech and noise sources. Although these cues are helpful for directional noise sources, they lose their efficiency in diffuse noise fields. We propose a new system that is effective in both directional and diffuse noise conditions. The system exploits two features. The first determines whether a given time-frequency (T-F) unit of the input spectrum is dominated by a diffuse or directional source. A diffuse signal is certainly a noise signal, but a directional signal could correspond to a noise or speech source. The second feature discriminates between T-F units dominated by speech or directional noise signals. Speech enhancement is performed using a binary mask, calculated based on the proposed features. In both directional and diffuse noise fields, the proposed system segregates speech T-F units with hit rates above 85%. It outperforms previous solutions in terms of signal-to-noise ratio and perceptual evaluation of speech quality improvement, especially in diffuse noise conditions.
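
Applying a binary mask to time-frequency units and scoring it with a hit rate can be sketched as below. How the paper derives its mask from the two proposed features is not given in the abstract, so the mask here is simply supplied as input. The function names and toy values are hypothetical, and the hit rate is taken in its usual sense as the fraction of speech-dominated units that the estimated mask keeps.

```python
def apply_binary_mask(spectrum, mask):
    """Keep T-F units where the mask is 1 and zero out the rest."""
    return [
        [value if keep else 0.0 for value, keep in zip(row, mask_row)]
        for row, mask_row in zip(spectrum, mask)
    ]


def hit_rate(estimated_mask, reference_mask):
    """Fraction of reference speech units (1s) that the estimate also keeps."""
    hits = 0
    total = 0
    for est_row, ref_row in zip(estimated_mask, reference_mask):
        for est, ref in zip(est_row, ref_row):
            if ref:
                total += 1
                if est:
                    hits += 1
    return hits / total if total else 0.0


# Toy 2x2 magnitude spectrum (rows are frames, columns are frequency bins).
spectrum = [[1.0, 2.0], [3.0, 4.0]]
reference = [[1, 1], [0, 1]]  # units truly dominated by speech
estimated = [[1, 0], [1, 1]]  # hypothetical output of a mask estimator

enhanced = apply_binary_mask(spectrum, estimated)
rate = hit_rate(estimated, reference)
```

In this toy case the estimate keeps two of the three speech-dominated units, so the hit rate is 2/3. The noise unit that was wrongly kept would count against a separate false-alarm measure, which is not computed here.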