• Title/Summary/Keyword: Speech detection

Search Result 472

A Collaborative Framework for Discovering the Organizational Structure of Social Networks Using NER Based on NLP (NLP기반 NER을 이용해 소셜 네트워크의 조직 구조 탐색을 위한 협력 프레임 워크)

  • Elijorde, Frank I.;Yang, Hyun-Ho;Lee, Jae-Wan
    • Journal of Internet Computing and Services, v.13 no.2, pp.99-108, 2012
  • Many methods have been developed to improve the accuracy of extracting information from vast amounts of data. This paper combines several natural language processing methods, such as named entity recognition (NER), sentence extraction, and part-of-speech tagging, to carry out text analysis. The data source comprises texts obtained from the web using a domain-specific data extraction agent. A framework for extracting information from unstructured data was developed using these methods. We simulated the performance of our work in extracting and analyzing texts to detect organizational structures. The simulation shows that our approach outperformed other NER classifiers, such as MUC and CoNLL, on information extraction.

6 Clinical Reports of Temporary Severe Amnesia Patients -focusing on amnesia, hysteric convulsion, dissociative disorder (단기 기억상실을 주증(主症)으로 하는 6례(例)의 임상보고 -중기(中氣), 건망(健忘), 해리성 기억장애 중심으로)

  • Oh, Young-Jin;Kim, Bo-Kyung
    • Journal of Oriental Neuropsychiatry, v.16 no.2, pp.287-299, 2005
  • Dissociative disorder is a psychiatric disorder characterized by a sudden loss of memory that has no organic cause or explanation. It usually occurs after heavy psychosocial stress or a traumatic experience. A transient cerebral ischemic attack (TIA) is an acute episode of temporary, focal loss of cerebral function of vascular origin. TIAs are rapid in onset; symptoms reach their maximal manifestation in fewer than 5 minutes. Manifestations are of variable duration and typically last 2-15 minutes (rarely as long as 24 h). Most TIAs last less than 1 hour. Careful detection of changes in behavior, speech, gait, memory, movement, and vision is important. TIAs are uncommon in persons younger than 60 years. Six patients with sudden temporary amnesia were treated with oriental medicine, and they improved. All of them had amnesia for 6 to 10 hours. During that time, they showed behavioral changes but were not unconscious. After recovery, they could not remember what had happened during the episode. They also had some emotional cause. In conclusion, 4 of the cases were classified as dissociative disorder and the other 2 as TIA.

An Enhanced Text-Prompt Speaker Recognition Using DTW (DTW를 이용한 향상된 문맥 제시형 화자인식)

  • 신유식;서광석;김종교
    • The Journal of the Acoustical Society of Korea, v.18 no.1, pp.86-91, 1999
  • This paper presents a text-prompt method to overcome the weaknesses of text-dependent and text-independent speaker recognition. An enhanced dynamic time warping (DTW) algorithm is applied for speaker recognition. For real-time processing, we use a simple end-point detection algorithm that does not increase computational complexity. Tests show that the weighted cepstrum is the most suitable of the various speech parameters for speaker recognition. In experiments with the proposed algorithm on three prompt words, the speaker identification error rate is 0.02%; for speaker verification with a properly set threshold, the false rejection rate is 1.89%, the false acceptance rate is 0.77%, and the total verification error rate is 0.97%.
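
The abstract above names dynamic time warping but does not describe its enhanced variant. As a point of reference only, the following is a minimal sketch of the plain textbook DTW distance between two one-dimensional feature sequences. The function name `dtw_distance` and the absolute-difference local cost are assumptions made for illustration, not the authors' method.

```python
def dtw_distance(a, b):
    """Textbook DTW cost between two 1-D sequences (illustrative only).

    The local cost is the absolute difference between samples; a real
    speaker-recognition system would compare feature vectors instead.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] holds the minimal accumulated cost of aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            # Allowed steps: advance in a only, in b only, or in both.
            cost[i][j] = local + min(
                cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1]
            )
    return cost[n][m]
```

A sequence and a time-stretched copy of it, such as `[1, 2, 3]` and `[1, 2, 2, 3]`, align with zero cost, which is the property that makes DTW tolerant of differences in speaking rate.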

Adaptive Noise Reduction of Speech using Wavelet Transform (웨이브렛 변환을 이용한 음성의 적응 잡음 제거)

  • Im Hyung-kyu;Kim Cheol-su
    • Journal of the Korea Computer Industry Society, v.6 no.2, pp.271-278, 2005
  • This paper proposes a new time-adapted threshold that uses the standard deviations of wavelet coefficients after a frame-wise wavelet transform. The time-adapted threshold is set using the sum of the standard deviations of the level 3 approximation coefficients and the weighted level 1 detail coefficients. Level 3 approximation coefficients represent voiced sound at low frequencies, and level 1 detail coefficients represent unvoiced sound at high frequencies. After noise is reduced by soft thresholding with the proposed time-adapted threshold, residual noise remains in silent intervals. To reduce this residual noise, an algorithm for detecting silent intervals is proposed. Simulation results demonstrate that the proposed algorithm improves SNR and MSE performance more than the wavelet transform and the wavelet packet transform do.
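
The thresholding step described above can be sketched roughly as follows. This is one reading of the abstract, not the authors' implementation: the threshold is assumed to be the standard deviation of the level 3 approximation coefficients plus a weighted standard deviation of the level 1 detail coefficients, the `weight` value is hypothetical, and the wavelet coefficients are taken as given inputs rather than computed here.

```python
import statistics


def time_adapted_threshold(approx3, detail1, weight=0.5):
    """Per-frame threshold: std(level 3 approx) + weight * std(level 1 detail).

    The weight of 0.5 is a placeholder; the abstract does not state its value.
    """
    return statistics.pstdev(approx3) + weight * statistics.pstdev(detail1)


def soft_threshold(coeffs, t):
    """Standard soft thresholding: shrink each coefficient toward zero by t."""
    out = []
    for c in coeffs:
        shrunk = abs(c) - t
        if shrunk <= 0:
            out.append(0.0)  # coefficients within the threshold are zeroed
        else:
            out.append(shrunk if c > 0 else -shrunk)
    return out
```

Soft thresholding zeroes small coefficients and shrinks the rest, so it removes low-level noise at the cost of slightly attenuating the retained signal components.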

Decoding Brain States during Auditory Perception by Supervising Unsupervised Learning

  • Porbadnigk, Anne K.;Gornitz, Nico;Kloft, Marius;Muller, Klaus-Robert
    • Journal of Computing Science and Engineering, v.7 no.2, pp.112-121, 2013
  • Recent years have seen rising interest in using electroencephalography-based brain-computer interfacing methodology to investigate non-medical questions, beyond the purpose of communication and control. One of these novel applications is to examine how signal quality is processed neurally, which is of particular interest to industry, besides providing neuroscientific insights. As in most behavioral experiments in the neurosciences, the subject's assessment of a given stimulus is required. Based on an EEG study on the speech quality of phonemes, we first discuss the information contained in the neural correlate of this judgement. Typically, this is done by analyzing the data along behavioral responses/labels. However, participants in such complex experiments often guess at the threshold of perception. This leads to labels that are only partly correct, and oftentimes random, which is a problematic scenario for supervised learning. Therefore, we propose a novel supervised-unsupervised learning scheme, which aims to differentiate true labels from random ones in a data-driven way. We show that this approach provides a crisper view of the brain states that experimenters are looking for, besides discovering additional brain states to which the classical analysis is blind.

Statistical Model-Based Voice Activity Detection Using Spatial Cues for Dual-Channel Noisy Speech Recognition (이중채널 잡음음성인식을 위한 공간정보를 이용한 통계모델 기반 음성구간 검출)

  • Shin, Min-Hwa;Park, Ji-Hun;Kim, Hong-Kook
    • Proceedings of the Korean Society of Broadcast Engineers Conference, 2010.07a, pp.150-151, 2010
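  • This paper proposes a statistical model-based voice activity detection (VAD) method for dual-channel speech recognition in noisy environments. The proposed method uses spatial information obtained from the multichannel input signal to derive speech presence and speech absence probability models, and performs voice activity detection with them. The spatial information consists of the interchannel time difference and the interchannel level difference between the two channels, and the speech presence and absence probabilities are represented by probability models based on Gaussian kernel density. Speech segments are detected by estimating, for each time frame, the ratio between the speech presence probability and the speech absence probability. To evaluate the proposed method, speech recognition performance is measured using only the detected segments as input. In the experiments, the proposed spatial-information statistical model-based VAD achieved relative recognition error rate improvements of 15.6% and 15.4% over a statistical model-based VAD using frequency energy and a VAD based on frequency spectral density, respectively.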
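
The per-frame decision described above can be illustrated with a minimal sketch. It assumes a single scalar spatial feature per frame (for example, the interchannel time difference), Gaussian kernel density estimates built from labelled example values, and a fixed decision threshold. The function names, the bandwidth, and the threshold are hypothetical and are not taken from the paper.

```python
import math


def gaussian_kde(samples, bandwidth):
    """Return a 1-D Gaussian kernel density estimate built from samples."""
    norm = 1.0 / (len(samples) * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        return norm * sum(
            math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples
        )

    return density


def detect_speech(frames, speech_pdf, nonspeech_pdf, threshold=1.0):
    """Flag a frame as speech when p(x | speech) / p(x | non-speech) > threshold."""
    eps = 1e-12  # guards against division by zero far from all samples
    return [speech_pdf(x) / (nonspeech_pdf(x) + eps) > threshold for x in frames]
```

With speech examples clustered near one feature value and non-speech examples near another, a frame close to the speech cluster yields a ratio above 1 and is kept, while a frame close to the non-speech cluster is rejected.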

A Study on Isolated Words Speech Recognition in a Running Automobile (주행중인 자동차 환경에서의 고립단어 음성인식 연구)

  • 유봉근
    • Proceedings of the Acoustical Society of Korea Conference, 1998.06e, pp.381-384, 1998
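  • This paper aims to allow speech input and output at all times, without operating any auxiliary switch, in order to secure both driver safety and convenience in a running automobile. To obtain noise-robust thresholds, the reference energy and zero crossing rate are updated at fixed time intervals, and end-point detection is performed automatically and accurately in real time in two stages using a bandpass filter. DMS (Dynamic Multi-Section) is used for the reference patterns, and the use of two models is proposed to improve speaker discrimination. In addition, for robustness to the noise of a running vehicle, the models are divided into normal driving (up to 80 km/h) and high-speed driving (80 km/h or more) and are selected automatically according to the level of the vehicle's varying noise. 13th-order PLP is used as the speech feature vector and One-Stage Dynamic Programming (OSDP) as the recognition algorithm. In experiments on 33 frequently used commands for controlling vehicle convenience devices, recognition rates of 89.75% (speaker-independent) and 90.08% (speaker-dependent) were obtained on the Jungbu and Yeongdong Expressways (80 km/h or more), and 92.29% (speaker-independent) and 92.42% (speaker-dependent) on the Gyeongbu Expressway. In a low-speed driving environment (up to 80 km/h, on cement and asphalt roads in and around Seoul), recognition rates of 92.89% (speaker-independent) and 94.44% (speaker-dependent) were obtained.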
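
As a rough illustration of end-point detection from frame energy and zero crossing rate, the sketch below marks a frame as active when either measure exceeds a fixed threshold and reports the first and last active frame. It is a simplification under stated assumptions: the periodic threshold updates, the bandpass filtering, and the two-stage procedure mentioned in the abstract are omitted, and all names and threshold values are hypothetical.

```python
def short_time_energy(frame):
    """Mean squared amplitude of one frame."""
    return sum(x * x for x in frame) / len(frame)


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ (frame length >= 2)."""
    crossings = sum(
        1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
    )
    return crossings / (len(frame) - 1)


def find_endpoints(samples, frame_len, energy_thr, zcr_thr):
    """Return (first, last) active frame indices, or None if no frame is active."""
    # Split into non-overlapping frames; a trailing partial frame is ignored.
    frames = [
        samples[i:i + frame_len]
        for i in range(0, len(samples) - frame_len + 1, frame_len)
    ]
    active = [
        i for i, f in enumerate(frames)
        if short_time_energy(f) > energy_thr or zero_crossing_rate(f) > zcr_thr
    ]
    if not active:
        return None
    return active[0], active[-1]
```

Energy tends to catch voiced segments, while the zero crossing rate helps with low-energy unvoiced sounds, which is why the two measures are commonly combined.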

A Study on the Voice Traffic Efficiency and Buffer Management by Priority Control in ATM Multiplexer (ATM 멀티플렉서에서 우선순위 제어에 의한 음성전송효율 및 버퍼관리에 관한 연구)

  • 이동수;최창수;강준길
    • The Journal of Korean Institute of Communications and Information Sciences, v.19 no.2, pp.354-363, 1994
  • This paper describes a method for serving voice traffic efficiently in BISDN. Voice is divided into talkspurts and silent periods, and speech activity detection makes it possible to transmit only the talkspurts. The paper describes a voice traffic control algorithm for ATM networks in which cell discarding is applied to embedded ADPCM voice data. For traffic control, low-priority cells are discarded when the queue length exceeds the queue threshold. To estimate the efficiency of the traffic control algorithm, a computer simulation was performed with cell loss probability, queue length, and mean delay as performance parameters. Embedded ADPCM voice coding and cell discarding improved voice cell traffic efficiency and dynamic control of network congestion.
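
The threshold-based discarding rule can be sketched as a small queue model. This is one plausible reading of the abstract, assuming that an arriving low-priority cell is dropped once the queue has reached the threshold and that any cell is dropped when the buffer is full. The class name, its parameters, and the drop counter are hypothetical.

```python
from collections import deque


class PriorityDiscardQueue:
    """FIFO cell buffer that drops low-priority arrivals above a threshold."""

    def __init__(self, capacity, threshold):
        self.capacity = capacity    # total buffer size in cells
        self.threshold = threshold  # queue length at which low priority is dropped
        self.cells = deque()
        self.dropped = 0

    def enqueue(self, cell, high_priority):
        """Try to queue a cell; return True if accepted, False if discarded."""
        if len(self.cells) >= self.capacity:
            self.dropped += 1  # buffer full: every arrival is lost
            return False
        if not high_priority and len(self.cells) >= self.threshold:
            self.dropped += 1  # congestion: discard low-priority cells only
            return False
        self.cells.append(cell)
        return True

    def dequeue(self):
        """Serve the oldest queued cell, or return None if the queue is empty."""
        return self.cells.popleft() if self.cells else None
```

Reserving the space between the threshold and the full capacity for high-priority cells lets the multiplexer shed less important voice data first when congestion builds.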

Analysis and Synthesis of Audio Signals using a Sinusoidal Model with Psychoacoustic Criteria (정현파 모델을 이용한 오디오 신호의 심리음향적 분석 및 합성)

  • 남승현;강경옥;홍진우
    • The Journal of the Acoustical Society of Korea, v.18 no.2, pp.77-82, 1999
  • The sinusoidal model has been widely used in the analysis and synthesis of speech and audio signals, and has become one of the efficient candidates for high-quality, low-bit-rate audio coders. One of the crucial steps in analysis and synthesis using a sinusoidal model is the detection of tonal components. This paper proposes an efficient method for the analysis and synthesis of audio signals using a sinusoidal model, which uses psychoacoustic criteria such as the masking effect, the masking index, and JNDf (Just Noticeable Difference in Frequency). Simulation results show that the proposed method reduces the number of sinusoids significantly without degrading the quality of the synthesized audio signals.

A Development of Intelligent Service Robot System for Store Management in Unmanned Environment (무인화 환경 기반의 상점 자동 관리를 위한 지능형 서비스 로봇 시스템)

  • Ahn, Ho-Seok;Sa, In-Kyu;Baek, Young-Min;Lee, Dong-Wook
    • Journal of Institute of Control, Robotics and Systems, v.17 no.6, pp.539-545, 2011
  • This paper describes an intelligent service robot system for managing a store in an unmanned environment. The robot can be a good replacement for humans because it can work all day and remember a large amount of information. We design a system architecture for configuring the many intelligent functions of an intelligent service robot system; it consists of four layers: a User Interaction Layer, a Behavior Scheduling Layer, an Intelligent Module Layer, and a Hardware Layer. We develop an intelligent service robot, 'Part Timer', based on the designed system architecture. 'Part Timer' has many intelligent function modules, such as a face detection-recognition-tracking module, a speech recognition module, a navigation module, a manipulator module, and an appliance control module. 'Part Timer' can also answer the phone, which gives users a convenient interface.