Search | Korea Science

Phoneme Segmentation in Consideration of Speech feature in Korean Speech Recognition (한국어 음성인식에서 음성의 특성을 고려한 음소 경계 검출)

서영완;송점동;이정현
- Journal of Internet Computing and Services / v.2 no.1 / pp.31-38 / 2001
A speech database built of phonemes is important in studies of speech recognition, speech synthesis, and speech analysis. Phonemes consist of voiced and unvoiced sounds. Although voiced and unvoiced sounds differ in many features, traditional algorithms for detecting phoneme boundaries do not take these differences into account; they determine the boundary by comparing the parameters of the current frame with those of the previous frame in the time domain. In this paper, we propose the assort algorithm, a block-based method for phoneme segmentation that reflects the feature differences between voiced and unvoiced sounds. The assort algorithm uses a distance measure based on MFCC (Mel-Frequency Cepstrum Coefficient) to compare spectra, and uses the energy, zero crossing rate, spectral energy ratio, and formant frequency to separate voiced sounds from unvoiced sounds. In our experiment, the proposed system showed about 79 percent precision on isolated words of 3 or 4 syllables, an improvement of about 8 percent in precision over the existing phoneme segmentation system.
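As a rough illustration of two of the features named in this abstract, the following sketch computes short-time energy and zero crossing rate for one frame and applies a simple voiced/unvoiced rule. The thresholds `energy_min` and `zcr_max`, the 8 kHz sampling rate, and the synthetic test frames are assumptions made for illustration; they are not taken from the paper, which also uses spectral energy ratio and formant frequency.

```python
import math

def frame_features(frame):
    """Return (short-time energy, zero crossing rate) for one frame of samples."""
    energy = sum(x * x for x in frame) / len(frame)
    # Count sign changes between consecutive samples.
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    zcr = crossings / (len(frame) - 1)
    return energy, zcr

def is_voiced(frame, energy_min=0.01, zcr_max=0.25):
    """Hypothetical rule: voiced frames have high energy and a low ZCR."""
    energy, zcr = frame_features(frame)
    return energy >= energy_min and zcr <= zcr_max

# Synthetic 30 ms frames at an assumed 8 kHz sampling rate.
rate = 8000
voiced_like = [0.5 * math.sin(2 * math.pi * 150 * n / rate) for n in range(240)]
unvoiced_like = [0.05 * (-1) ** n for n in range(240)]  # weak, rapidly alternating

print(is_voiced(voiced_like), is_voiced(unvoiced_like))
```

A low-frequency, high-energy frame passes the rule, while a weak frame that changes sign at every sample does not.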

Implementation of Power Line Modem Using a Direct Sequence Spread Spectrum Technique (직접대역확산 기법을 적용한 전력선 모뎀의 구현)

송문규;김대우;사공석진;차균현
- The Journal of Korean Institute of Communications and Information Sciences / v.18 no.2 / pp.218-230 / 1993
A power line modem (PLM) that transfers data safely through power lines in houses or small offices is considered. When a power line is used for communications, transmitted signals can be affected by channel characteristics such as frequency-selective fading, interference, and time-varying attenuation. To overcome these impairments, a direct sequence (DS) technique, which is well known as an effective instrument against a variety of interferences and hostile channel properties, is employed. Using a DS technique, however, requires additional circuits such as PN code generator circuits, code modification circuits, and complicated synchronization circuits, and it also results in substantial acquisition delay. In this paper, some of these circuits are implemented in software programmed in the system controller, and the complicated synchronization circuits are replaced by simple circuits that use the 60 Hz power signal for synchronization. The synchronization circuits used in this paper virtually eliminate the substantial acquisition delay, and are also designed to be free from the influence of the 60 Hz zero-crossing jitter present in the power signal. As a result, a PLM using a DS technique is realized in the form of a wall-socket plug, and the PLM hardware can be greatly simplified.
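The direct sequence idea in this abstract can be sketched in a few lines: each data bit is multiplied by a PN chip sequence before transmission, and the receiver correlates each block of chips with the same sequence. The 7-chip code `PN`, the function names, and the assumption that chip timing is already aligned (the paper derives timing from the 60 Hz signal) are illustrative choices, not details from the paper.

```python
# Assumed 7-chip PN code for illustration only.
PN = [1, 1, 1, -1, -1, 1, -1]

def spread(bits, pn=PN):
    """Map each bit to +1/-1 and multiply it by every chip of the PN code."""
    chips = []
    for bit in bits:
        symbol = 1 if bit else -1
        chips.extend(symbol * c for c in pn)
    return chips

def despread(chips, pn=PN):
    """Correlate each chip block with the PN code and decide by the sign."""
    bits = []
    for i in range(0, len(chips), len(pn)):
        block = chips[i:i + len(pn)]
        corr = sum(x * c for x, c in zip(block, pn))
        bits.append(1 if corr > 0 else 0)
    return bits

data = [1, 0, 1, 1, 0]
received = spread(data)
received[3] = -received[3]    # simulate interference on one chip of bit 0
received[10] = -received[10]  # and on one chip of bit 1
print(despread(received) == data)
```

Because the decision uses the correlation over all seven chips, a single corrupted chip per bit does not change the recovered data.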

Voice Activity Detection Based on Entropy in Noisy Car Environment (차량 잡음 환경에서 엔트로피 기반의 음성 구간 검출)

Roh, Yong-Wan;Lee, Kue-Bum;Lee, Woo-Seok;Hong, Kwang-Seok
- Journal of the Institute of Convergence Signal Processing / v.9 no.2 / pp.121-128 / 2008
Accurate voice activity detection has a great impact on the performance of speech applications, including speech recognition, speech coding, and speech communication. In this paper, we propose voice activity detection methods that can adapt to various car noise situations during driving. Existing voice activity detection has used methods such as time-domain energy, frequency-domain energy, zero crossing rate, and spectral entropy, whose weak point is that performance declines rapidly in noisy environments. Building on the existing spectral entropy approach to VAD, we propose voice activity detection methods that use MFB (Mel-frequency filter bank) spectral entropy, gradient FFT (Fast Fourier Transform) spectral entropy, and gradient MFB spectral entropy. MFB is the FFT spectrum multiplied by the Mel scale, a nonlinear scale that reflects the characteristics of human sound perception. In experiments, the proposed MFB spectral entropy method clearly improved the ability to discriminate between speech and non-speech in various noisy car environments, achieving 93.21% accuracy. Compared with the spectral entropy method, the proposed voice activity detection gives an average improvement in the correct detection rate of more than 3.2%.
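For readers unfamiliar with the baseline measure, here is a minimal sketch of plain spectral entropy for one frame: the power spectrum is normalized into a probability distribution and its Shannon entropy is computed. It uses a naive DFT to stay dependency-free and does not implement the MFB or gradient variants the paper proposes; the frame length and test signals are assumptions.

```python
import cmath
import math

def power_spectrum(frame):
    """Naive DFT power spectrum for bins 0..N/2 (fine for short frames)."""
    n = len(frame)
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        spectrum.append(abs(acc) ** 2)
    return spectrum

def spectral_entropy(frame):
    """Shannon entropy (in bits) of the normalized power spectrum."""
    power = power_spectrum(frame)
    total = sum(power)
    if total == 0:
        return 0.0
    probs = [p / total for p in power]
    return -sum(p * math.log2(p) for p in probs if p > 0)

n = 64
tone = [math.sin(2 * math.pi * 4 * t / n) for t in range(n)]  # energy in one bin
impulse = [1.0] + [0.0] * (n - 1)                             # flat spectrum

print(spectral_entropy(tone) < spectral_entropy(impulse))
```

A signal whose energy sits in a few bins has low entropy, while a flat spectrum reaches the maximum; a detector of this kind compares the per-frame value against a threshold that would have to be tuned for the noise condition.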

A Study on the Improvement of DTW with Speech Silence Detection (음성의 묵음구간 검출을 통한 DTW의 성능개선에 관한 연구)

Kim, Jong-Kuk;Jo, Wang-Rae;Bae, Myung-Jin
- Speech Sciences / v.10 no.4 / pp.117-124 / 2003
Speaker recognition is a technology that confirms the identity of a speaker from the characteristics of speech. It is classified into speaker identification and speaker verification: the former discriminates the speaker from a preregistered group and recognizes the word, while the latter verifies a speaker who claims an identity. This approach, which extracts speaker information from speech and confirms individual identity, has become one of the most efficient technologies as services over the telephone network have become popular. Some problems, however, must be solved for real applications. First, a safe method is needed to reject impostors, because recognition is not performed only for preregistered customers. Second, the characteristics of speech change over time, which severely degrades the recognition rate and inconveniences users as the number of times they must utter the text increases. Third, characteristics common among speakers cause wrong recognition results. In addition, silence included in the middle of speech decreases the identification rate. In this paper, we propose improving the identification rate by removing silent parts before running the identification algorithm. Zero crossing rate and signal energy are used to detect the start point and end point of the speech, and the DTW algorithm is then applied using these two measures. As a result, the proposed method improved the recognition rate by about 3% compared with conventional methods.
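The effect described in this abstract can be sketched with a textbook dynamic time warping (DTW) distance applied before and after dropping low-level frames. The one-dimensional feature sequences, the `threshold` value, and the simple magnitude test in `remove_silence` are assumptions for illustration; the paper detects silence with zero crossing rate and energy.

```python
def dtw_distance(a, b):
    """Classic DTW between two 1-D sequences with absolute-difference cost."""
    inf = float("inf")
    n, m = len(a), len(b)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three allowed predecessor paths.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]

def remove_silence(values, threshold=0.1):
    """Hypothetical silence filter: drop frames whose level is below threshold."""
    return [v for v in values if abs(v) >= threshold]

template = [1, 2, 3, 2, 1]
utterance = [0, 0, 1, 2, 0, 0, 3, 2, 1, 0]  # same shape with silent gaps

print(dtw_distance(template, utterance))
print(dtw_distance(template, remove_silence(utterance)))
```

With the silent frames removed, the utterance aligns with the template at zero cost, whereas the untrimmed sequence accumulates a penalty for every silent frame.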

Dimming Control Signal Transmisson of Electronic Ballast on the Power Line and Characteristics Measurement (전력선을 이용한 전자식 안정기 조광 신호 전송과 특성 측정)

이상곤;정은택;강복연;양병렬;유홍균
- The Journal of Korean Institute of Communications and Information Sciences / v.19 no.4 / pp.691-700 / 1994
A power line has poor characteristics for communication because it is a medium for transferring commercial electrical power and carries a great deal of load noise and high-frequency noise. Thus, a simple method for transferring a remote control signal over the power line is studied. The existing method transmits two signals with the upper part eliminated every N steps. In contrast, the method investigated here has the transmitter send a one-period signal with a portion eliminated at an arbitrary phase, so that the transmission power loss due to signal elimination can be minimized. To implement it, a timer that measures the time from the zero-crossing point to the phase is required. The 87C51 microcontroller calculates the phase precisely using one of its two built-in timers. As a result, a remote control signal transmitter and receiver using a partially eliminated signal, which is more efficient in power transmission than the conventional technique using a half-eliminated signal, are realized, and their characteristics are analyzed.

Feature Extraction and Evaluation for Classification Models of Injurious Falls Based on Surface Electromyography

Lim, Kitaek;Choi, Woochol Joseph
- Physical Therapy Korea / v.28 no.2 / pp.123-131 / 2021
Background: Only 2% of falls in older adults result in serious injuries (e.g., hip fracture). Therefore, it is important to differentiate injurious from non-injurious falls, which is critical for developing effective interventions for injury prevention. Objectives: The purpose of this study was to (a) extract the best features of surface electromyography (sEMG) for classification of injurious falls, and (b) find the best model provided by data mining techniques using the extracted features. Methods: Twenty young adults self-initiated falls and landed sideways. Falling trials consisted of three initial fall directions (forward, sideways, or backward) and three knee positions at the time of hip impact (the impacting-side knee contacted the other knee ("knee together") or the mat ("knee on mat"), or the impacting-side knee contacted neither the other knee nor the mat ("free knee")). Falls involving "backward initial fall direction" or "free knee" were defined as "injurious falls," as suggested by previous studies. Nine features were extracted from sEMG signals of four hip muscles during a fall, including integral of absolute value (IAV), Wilson amplitude (WAMP), zero crossing (ZC), number of turns (NT), mean of amplitude (MA), root mean square (RMS), average amplitude change (AAC), and difference absolute standard deviation value (DASDV). The decision tree and support vector machine (SVM) were used to classify the injurious falls. Results: For the initial fall direction, the accuracy of the best model (SVM with DASDV) was 48%. For the knee position, the accuracy of the best model (SVM with AAC) was 49%. Furthermore, no model had sensitivity and specificity of 80% or greater. Conclusion: Our results suggest that classification models built upon the sEMG features of the four hip muscles are not effective for classifying injurious falls. Future studies should consider other data mining techniques with different muscles.
https://doi.org/10.12674/ptk.2021.28.2.123
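Several of the time-domain sEMG features named in this abstract have widely used definitions, sketched below for a single window of samples. The exact normalizations, the WAMP threshold, and the reading of MA as the mean absolute amplitude are assumptions based on common usage; the paper may define them differently, and number of turns (NT) is omitted.

```python
import math

def semg_features(signal, wamp_threshold=2.5):
    """Common time-domain sEMG features for one window (assumed definitions)."""
    n = len(signal)
    diffs = [b - a for a, b in zip(signal, signal[1:])]
    iav = sum(abs(x) for x in signal)
    return {
        "iav": iav,                                          # integral of absolute value
        "ma": iav / n,                                       # mean absolute amplitude
        "rms": math.sqrt(sum(x * x for x in signal) / n),    # root mean square
        "zc": sum(1 for a, b in zip(signal, signal[1:]) if a * b < 0),  # sign changes
        "wamp": sum(1 for d in diffs if abs(d) > wamp_threshold),       # large jumps
        "aac": sum(abs(d) for d in diffs) / n,               # average amplitude change
        "dasdv": math.sqrt(sum(d * d for d in diffs) / (n - 1)),
    }

window = [1, -1, 2, -2, 1]  # toy values, not real sEMG data
features = semg_features(window)
print(features)
```

Each feature yields one number per muscle and window, and those numbers form the input vector for a classifier such as a decision tree or SVM.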

Variable Quad Rate ADPCM for Efficient Speech Transmission and Real Time Implementation on DSP (효율적인 음성신호의 전송을 위한 4배속 가변 변환율 ADPCM기법 및 DSP를 이용한 실시간 구현)

한경호
- Journal of the Korean Institute of Illuminating and Electrical Installation Engineers / v.18 no.1 / pp.129-136 / 2004
In this paper, we propose a quad variable-rate ADPCM coding method for efficient speech transmission, and real-time processing is implemented on the TMS320C6711-DSP. The modified ADPCM applies four variable coding rates, 16 kbps, 24 kbps, 32 kbps, and 40 kbps, to windows of speech samples to achieve good-quality speech transmission with a small number of data bits, and real-time encoding and decoding are implemented on the DSP. ZCR is used to identify the influence of noise on the speech signal and to decide the rate-change threshold. For noise-dominant signals, low coding rates are applied to minimize data bits, and for signals with little noise, high coding rates are applied to enhance speech quality. Because silent periods take up more than half of the signal in most speech telecommunications, speech quality close to that of 40 kbps can be obtained at a comparatively low number of data bits, as shown by simulation and experiments. The TMS320C6711-DSK board has 128K of flash memory and a performance of 1333 MIPS, and meets the requirements for real-time implementation of the proposed coding algorithm.
https://doi.org/10.5207/JIEIE.2004.18.1.129

Real Time Environmental Classification Algorithm Using Neural Network for Hearing Aids (인공 신경망을 이용한 보청기용 실시간 환경분류 알고리즘)

Seo, Sangwan;Yook, Sunhyun;Nam, Kyoung Won;Han, Jonghee;Kwon, See Youn;Hong, Sung Hwa;Kim, Dongwook;Lee, Sangmin;Jang, Dong Pyo;Kim, In Young
- Journal of Biomedical Engineering Research / v.34 no.1 / pp.8-13 / 2013
Persons with sensorineural hearing impairment have trouble hearing in noisy environments because of their deteriorated hearing levels and the low spectral resolution of the auditory system; therefore, they use hearing aids to compensate for weakened hearing abilities. Various algorithms for hearing loss compensation and environmental noise reduction have been implemented in hearing aids; however, the performance of these algorithms varies with the external sound situation, so it is important to tune the operation of the hearing aid appropriately for a wide variety of sound situations. In this study, a sound classification algorithm that can be applied to hearing aids is suggested. The proposed algorithm classifies speech situations into four categories: 1) speech-only, 2) noise-only, 3) speech-in-noise, and 4) music-only. The proposed classification algorithm consists of two sub-parts: a feature extractor and a speech situation classifier. The former extracts seven characteristic features - short time energy and zero crossing rate in the time domain; spectral centroid, spectral flux and spectral roll-off in the frequency domain; mel frequency cepstral coefficients and power values of mel bands - from the recent input signals of two microphones, and the latter classifies the current speech situation. The experimental results showed that the proposed algorithm could classify the kinds of speech situations with an accuracy of over 94.4%. Based on these results, we believe that the proposed algorithm can be applied to hearing aids to improve speech intelligibility in noisy environments.
https://doi.org/10.9718/JBER.2013.34.1.8

Facial Feature Retraction for Face and Facial Expression Recognition (얼굴인식 및 표정 인식을 위한 얼굴 및 얼굴요소의 윤곽선 추출)

이경희;변혜란;정찬섭
- Proceedings of the Korean Society for Emotion and Sensibility Conference / 1998.11a / pp.25-29 / 1998
This paper presents a method for extracting the regions and contours of the face and its main components (eyes, mouth, and eyebrows), which carry important features for face recognition and facial expression recognition. For region extraction, the minimum enclosing rectangle (MER) containing the face and each facial component was extracted through projection analysis that combines edge information with a binarized image. Contour studies on face images have used deformable templates and snakes (active contour models); the deformable template method is slow and tends to produce contours of a uniform shape. In this paper, we define a snake model that reflects individual differences in the shape of facial components and converges quickly, and we conduct contour extraction experiments on the eyes, mouth, eyebrows, and face. Because the initial contour strongly affects the extraction result of a snake, setting the initial contour is very important. We extract the MER of the face and of each facial component and, within each extracted MER, set the general shape of the face or component as the initial contour. In the experiments, MER extraction succeeded for all eyes, mouths, and faces; eyebrow MER extraction was poor only for people with faint eyebrows. Applying the snake model on the basis of the extracted MERs produced contours that reflect the varied shapes of eyes, mouths, eyebrows, and faces. For the eyes in particular, an energy function combining edges from a first-derivative edge operator with zero crossings from a second-derivative operator gave better contour extraction results. For the face contour, an energy function combining edge values and brightness values also yielded relatively accurate results.

A Study on the Utilization and Control Method of Hybrid Switching Tap Based Automatic Voltage Regulator on Smart Grid (스마트그리드의 탭 전환 자동 전압 조정기의 다중 스위칭 제어 방법 및 활용 방안에 관한 연구)

Park, Gwang-Yun;Kim, Jung-Ryul;Kim, Byung-Gi
- Journal of the Korea Society of Computer and Information / v.17 no.12 / pp.31-39 / 2012
In this paper, we propose a microprocessor-based automatic voltage regulator (AVR) to reduce consumers' electric energy consumption and to help control peak power demand. The Hybrid Switching Automatic Voltage Regulator (HS-AVR) consists of a toroidal core, several tap control switches, and display and command control parts. The coil forms an autotransformer that has a serial main winding and four parallel auxiliary windings. It controls the output voltage by changing the combination of the coils and the switches. Relays are adopted as the link switches of the coils to minimize loss. To make the connecting and disconnecting times accurate, the relays in the circuit have parallel TRIACs. A software phase-locked loop (PLL) is used to synchronize the timing of the switches to the voltage waveform. The software PLL reports the timing of the input voltage zero crossings and positive and negative peaks. Traditional voltage transformers and AVRs have the disadvantage of requiring a large capacity to accommodate the maximum inrush current and avoid switch contact damage. In contrast, we propose an AVR with reduced size and increased efficiency that is suitable for a wide range of purposes in the smart grid.
https://doi.org/10.9708/jksci/2012.17.12.031
