• Title/Abstract/Keywords: Speech Processing


Semantic Processing in Korean and English Word Production

  • 김효선;남기춘;김충명
    • 대한음성학회지:말소리 / No. 57 / pp. 59-72 / 2006
  • The purpose of this study was to determine whether the Korean and English semantic systems of Korean-English bilinguals are shared or separate. In a series of picture-word interference tasks, participants named pictures in Korean or in English while distractor words were printed in either Korean or English. Each distractor word was identical to, semantically related to, or neutral with respect to the picture. Naming was faster when the distractor word was semantically identical to the picture, for both same- and different-language pairs. This facilitation effect was stronger when participants named in their native language, which in this case was Korean. An inhibitory effect was also found when the picture and its distractor word were semantically related, in both the same- and different-language conditions. From these results it can be concluded that the semantic representations of Korean and English in bilinguals overlap partly but not entirely.


A Study on the Speech Signal Processing for Cochlear Implant using the PLP Analysis

  • 김영선;최두일;박상희;백승화
    • 대한의용생체공학회: Conference Proceedings / 대한의용생체공학회 1992 Spring Conference / pp. 167-170 / 1992
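  • In this paper, we configure an auditory prosthesis so that people with sensorineural hearing loss can perceive speech in a way similar to people with normal hearing. The PLP (Perceptual Linear Prediction) method is used to extract the speech formants, and an autocorrelation method with a three-level clipping function is used for pitch extraction. Using a multi-channel, multi-electrode scheme, we also simulate applying signals through 17 electrodes inserted at the hair cells of the inner ear. The experimental data are the vowels /a/, /e/, /i/, /o/, and /u/. The differences between front and back vowels are distinguished, and the change in the second formant and the formant integration theory are verified.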
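
The pitch-extraction step named in this abstract (autocorrelation with a three-level clipping function) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the clipping threshold (a fixed fraction `ratio` of the frame peak), the search range `f_min` to `f_max`, and all function names are assumptions introduced here.

```python
def three_level_clip(frame, ratio=0.6):
    """Map each sample to +1, 0, or -1.

    Assumption: the threshold is a fixed fraction of the frame's peak
    magnitude; the abstract does not state how the threshold is chosen.
    """
    peak = max(abs(x) for x in frame)
    threshold = ratio * peak
    return [1 if x > threshold else -1 if x < -threshold else 0 for x in frame]


def estimate_pitch(frame, sample_rate, f_min=60.0, f_max=400.0, ratio=0.6):
    """Return a pitch estimate in Hz, or None if no lag correlates positively."""
    clipped = three_level_clip(frame, ratio)
    lag_min = int(sample_rate / f_max)
    lag_max = min(int(sample_rate / f_min), len(clipped) - 1)
    best_lag, best_value = None, 0
    for lag in range(lag_min, lag_max + 1):
        # Autocorrelation of the clipped signal at this lag; because the
        # samples are only -1, 0, or +1, this is just signed counting.
        value = sum(clipped[n] * clipped[n + lag]
                    for n in range(len(clipped) - lag))
        if value > best_value:
            best_lag, best_value = lag, value
    return sample_rate / best_lag if best_lag else None
```

Calling `estimate_pitch(frame, 8000)` on a 50 ms frame of a voiced vowel would return the frequency whose period gives the strongest clipped autocorrelation. Clipping to three levels suppresses formant structure and reduces the correlation to additions, which is why it is a common choice for low-cost pitch trackers.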


Polyphase Representation of the Relationships Among Fullband, Subband, and Block Adaptive Filters

  • Tsai, Chimin
    • 제어로봇시스템학회: Conference Proceedings / 제어로봇시스템학회 2005 ICCAS / pp. 1435-1438 / 2005
  • In hands-free telephone systems, the received speech signal is fed back to the microphone and constitutes the so-called echo. To cancel the effect of this time-varying echo path, it is necessary to devise an adaptive filter between the receiving and the transmitting ends. For a typical FIR realization, the length of the fullband adaptive filter results in high computational complexity and a low convergence rate. Consequently, subband adaptive filtering schemes have been proposed to improve the performance. In this work, we use a deterministic approach to analyze the relationship between fullband and subband adaptive filtering structures. With the block adaptive filtering structure as an intermediate stage, the analysis is divided into two parts. First, to avoid aliasing, the matrix of block adaptive filters is found to be pseudocirculant, and the elements of this matrix are the polyphase components of the fullband adaptive filter. Second, to transmit the near-end voice signal faithfully, the analysis and the synthesis filter banks in the subband adaptive filtering structure must form a perfect reconstruction pair. Using the polyphase representation, the relationship between the block and the subband adaptive filters is derived.
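
The polyphase components that this abstract refers to can be illustrated with a short sketch. It shows only the standard decomposition of an FIR filter into `M` polyphase components, e_k[m] = h[mM + k], and checks numerically that filtering branch by branch gives the same output as the fullband filter. It does not reproduce the paper's pseudocirculant analysis, and the function names are hypothetical.

```python
def polyphase_components(h, M):
    """Split FIR taps h into M polyphase components: e_k[m] = h[m*M + k]."""
    return [h[k::M] for k in range(M)]


def fir_filter(h, x):
    """Direct-form fullband FIR filtering, output truncated to len(x)."""
    return [sum(h[i] * x[n - i] for i in range(len(h)) if n - i >= 0)
            for n in range(len(x))]


def polyphase_filter(h, x, M):
    """Same output as fir_filter, accumulated one polyphase branch at a time."""
    y = [0.0] * len(x)
    for k, e_k in enumerate(polyphase_components(h, M)):
        for m, tap in enumerate(e_k):
            delay = m * M + k  # position of this tap in the fullband filter
            for n in range(delay, len(x)):
                y[n] += tap * x[n - delay]
    return y
```

For example, `polyphase_components([1, 2, 3, 4, 5], 2)` splits the taps into the even-indexed and odd-indexed subsequences, and `polyphase_filter(h, x, 2)` matches `fir_filter(h, x)` sample for sample.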


Performance evaluation of a multiple-access algorithm for PCN in microcell environment

  • 전영희;이재형;최형진
    • 전자공학회논문지A / Vol. 33A, No. 7 / pp. 55-63 / 1996
  • In this paper, a multiple-access algorithm for PCN is proposed. The proposed algorithm provides an integrated service for information sources and operates stably under high load. It uses the given bandwidth efficiently in the microcell environment, and system performance can be improved through statistical multiplexing. To handle the speech signal, which usually requires real-time processing, we adopt ALOHA-type random access as the basic protocol structure and assume an ALOHA-reservation form. We have analyzed the performance of the proposed algorithm in terms of system throughput and packet delay in the microcell environment.


A Study on Front-End Processing Methods of Environmental Noise for Speech Recognition

  • 김광수
    • 한국음향학회: Conference Proceedings / 한국음향학회 영남지회 1997 Symposium Proceedings (Acoustic Society of Korean Youngnam Chapter Symposium Proceedings) / pp. 17-22 / 1997
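  • Among the factors that degrade speech recognizer performance, this paper addresses additive noise and the channel distortion caused by microphone variation. To reduce both at once, it introduces a histogram-based processing technique, which improves on the shortcomings of existing front-end environmental-noise processing, and confirms its effectiveness. To do so, word recognition experiments were carried out to compare the introduced technique with several well-known existing noise processing methods. With additive noise only, the commonly used SS, CMN, and RASTA methods improved the recognition rate by 25% or more, depending on the SNR, over the baseline recognition rate obtained without front-end processing. In particular, CDCN and H-RASTA improved the recognition rate by about 15 to 30%, regardless of SNR, for speech containing both channel distortion and additive noise, confirming that these are the strongest of the existing methods. Applying histogram-based estimation on top of them improved front-end performance by a further 10 to 15%, confirming the effectiveness of the introduced method.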
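
Of the baseline methods listed in this abstract, CMN (commonly expanded as cepstral mean normalization) is the simplest to show in code. The sketch below is a generic textbook form, not the paper's implementation: it assumes each utterance is a list of frames, each frame a list of cepstral coefficients, and the function name is hypothetical.

```python
def cepstral_mean_normalize(frames):
    """Subtract the per-coefficient mean taken over all frames of one utterance.

    A fixed channel (for example, a particular microphone) adds an
    approximately constant offset to every cepstral frame, so removing
    the utterance-level mean removes that offset.
    """
    if not frames:
        return []
    n_coeffs = len(frames[0])
    means = [sum(f[c] for f in frames) / len(frames) for c in range(n_coeffs)]
    return [[f[c] - means[c] for c in range(n_coeffs)] for f in frames]
```

Two recordings of the same utterance that differ only by a constant cepstral offset normalize to the same frames, which is the property that makes CMN useful against channel distortion. It does not by itself address additive noise.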


Convolutional Neural Networks for Character-level Classification

  • Ko, Dae-Gun;Song, Su-Han;Kang, Ki-Min;Han, Seong-Wook
    • IEIE Transactions on Smart Processing and Computing / Vol. 6, No. 1 / pp. 53-59 / 2017
  • Optical character recognition (OCR) automatically recognizes text in an image. OCR is still a challenging problem in computer vision. A successful solution to OCR has important device applications, such as text-to-speech conversion and automatic document classification. In this work, we analyze character recognition performance using current state-of-the-art deep-learning structures: AlexNet, LeNet, and SPNet. For this, we have built our own dataset that contains digits and upper- and lower-case characters. We experiment in the presence of salt-and-pepper noise or Gaussian noise, and report the performance comparison in terms of recognition error. Five-fold cross-validation results indicate that the SPNet structure (our approach) outperforms AlexNet and LeNet in recognition error.
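
The salt-and-pepper corruption mentioned in this abstract can be sketched as below. This is an illustrative assumption about how such noise is typically injected, not the authors' procedure: the image is taken to be a list of rows of 8-bit grayscale values, and `density`, `low`, `high`, and the function name are hypothetical.

```python
import random


def add_salt_and_pepper(image, density, rng=None, low=0, high=255):
    """Return a copy of image with a fraction of pixels forced to low or high.

    Each pixel is corrupted independently with probability `density`;
    a corrupted pixel becomes salt (high) or pepper (low) with equal chance.
    """
    rng = rng or random.Random()
    noisy = [row[:] for row in image]  # copy so the input is left untouched
    for row in noisy:
        for c in range(len(row)):
            if rng.random() < density:
                row[c] = high if rng.random() < 0.5 else low
    return noisy
```

Passing a seeded generator, for example `add_salt_and_pepper(img, 0.05, random.Random(0))`, makes the corruption reproducible across cross-validation folds.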

Development of an Embedded System for Ship's Steering Gear using Voice Recognition Module

  • 서기열;홍태호;김화영;박계각
    • 한국지능시스템학회: Conference Proceedings / 한국퍼지및지능시스템학회 2004 Spring Conference Proceedings, Vol. 14, No. 1 / pp. 144-148 / 2004
  • Recently, various studies have been made on automatic control systems for small ships in order to improve maneuvering and to reduce labor and work on board. Automation techniques have developed rapidly to make small-ship operation more efficient, but ship operation has become more complicated because of the need to handle various gauges and instruments. To solve these problems, speech information processing technologies, one of the human interface methods, have in some cases been applied to ship system operation, but no definite system has yet been fully implemented. Therefore, the purpose of this paper is to implement a control system for ship steering using a voice recognition module.


Development of a Cryptographic Dongle for Secure Voice Encryption over GSM Voice Channel

  • Kim, Tae-Yong;Jang, Won-Tae;Lee, Hoon-Jae
    • Journal of information and communication convergence engineering / Vol. 7, No. 4 / pp. 561-564 / 2009
  • A cryptographic dongle capable of transmitting encrypted voice signals over the CDMA/GSM voice channel was designed and implemented. The dongle used a PIC microcontroller for signal processing, including analog-to-digital and digital-to-analog conversion, encryption, and communication with the smart phone. The smart phone supplied power to the dongle and received the encrypted speech from it, then transmitted the signal to the network. A number of tests were conducted to check the efficiency of the dongle, the firmware programming, the encryption algorithms, the secret key management system, the interface between the smart phone and the dongle, and the noise level.

Sensibility Classification Algorithm of EEGs using Multi-template Method

  • 김동준
    • 대한전기학회논문지:시스템및제어부문D / Vol. 53, No. 12 / pp. 834-838 / 2004
  • This paper proposes an algorithm for EEG pattern classification using the multi-template method, which is a kind of speaker adaptation method from speech signal processing. 10-channel EEG signals are collected in various environments. The linear prediction coefficients of the EEGs are extracted as the feature parameters of human sensibility. The human sensibility classification algorithm is developed using neural networks. Using EEGs recorded on comfortable and uncomfortable seats, the proposed algorithm achieved about 75% classification performance in the subject-independent test. In the tests using EEG signals recorded under room temperature and humidity variations, the proposed algorithm tracked changes in pleasantness well, and the subject-independent tests produced performance similar to the subject-dependent ones.
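
The feature-extraction step in this abstract relies on linear prediction coefficients. The abstract does not say how they are computed, so the sketch below uses the autocorrelation method with the Levinson-Durbin recursion, a standard choice, as an assumption. The function names and the sign convention (x[n] predicted as the sum of a[k] * x[n-k]) are introduced here for illustration.

```python
def autocorrelation(signal, max_lag):
    """Biased autocorrelation r[0..max_lag] of one channel's samples."""
    n = len(signal)
    return [sum(signal[i] * signal[i + lag] for i in range(n - lag))
            for lag in range(max_lag + 1)]


def lpc(signal, order):
    """Levinson-Durbin recursion.

    Returns [a1, ..., a_order] such that x[n] is predicted by
    a1*x[n-1] + ... + a_order*x[n-order].
    """
    r = autocorrelation(signal, order)
    a = [0.0] * (order + 1)
    error = r[0]
    for i in range(1, order + 1):
        if error == 0:
            break  # silent or perfectly predicted signal: stop early
        acc = r[i] - sum(a[j] * r[i - j] for j in range(1, i))
        k = acc / error  # reflection coefficient for this order
        new_a = a[:]
        new_a[i] = k
        for j in range(1, i):
            new_a[j] = a[j] - k * a[i - j]
        a = new_a
        error *= 1.0 - k * k
    return a[1:]
```

Under these assumptions, a feature vector for one analysis window would be the concatenation of `lpc(channel, order)` over the EEG channels; the model order is a free parameter that the abstract does not state.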

Intensified Sentiment Analysis of Customer Product Reviews Using Acoustic and Textual Features

  • Govindaraj, Sureshkumar;Gopalakrishnan, Kumaravelan
    • ETRI Journal / Vol. 38, No. 3 / pp. 494-501 / 2016
  • Sentiment analysis incorporates natural language processing and artificial intelligence and has evolved as an important research area. Sentiment analysis on product reviews has been used in widespread applications to improve customer retention and business processes. In this paper, we propose a method for performing an intensified sentiment analysis on customer product reviews. The method involves the extraction of two feature sets from each of the given customer product reviews, a set of acoustic features (representing emotions) and a set of lexical features (representing sentiments). These sets are then combined and used in a supervised classifier to predict the sentiments of customers. We use an audio speech dataset prepared from Amazon product reviews and downloaded from the YouTube portal for the purposes of our experimental evaluations.