Search | Korea Science

A Study on the Voice Dialing using HMM and Post Processing of the Connected Digits (HMM과 연결 숫자음의 후처리를 이용한 음성 다이얼링에 관한 연구)

Yang, Jin-Woo;Kim, Soon-Hyob
- The Journal of the Acoustical Society of Korea, v.14 no.5, pp.74-82, 1995
This paper studies voice dialing using HMMs and post-processing of connected digits. The HMM (Hidden Markov Model) algorithm is widely used in speech recognition with good results. However, maximum likelihood estimation in HMM training does not necessarily lead to parameter values that maximize the recognition rate. To address this problem, we applied post-processing to the segmental K-means procedure in the recognition experiment. Korean connected digits are influenced by prolongation more than English connected digits. To decrease segmentation errors in the level building algorithm, some word models that can be produced by prolongation are added. Rules for the added models are applied to the recognition result, which is then updated. The recognition system was implemented with a DSP board having a TMS320C30 processor and an IBM PC. The reference patterns were made by 3 male speakers in a noisy laboratory. The recognition experiment was performed on 21 kinds of telephone numbers, 252 data samples in total. The recognition rate was $6\%$ in the speaker-dependent test and $80.5\%$ in the speaker-independent test.

Voice Personality Transformation Using a Probabilistic Method (확률적 방법을 이용한 음성 개성 변환)

Lee Ki-Seung
- The Journal of the Acoustical Society of Korea, v.24 no.3, pp.150-159, 2005
This paper addresses a voice personality transformation algorithm that makes one person's voice sound like another person's voice. In the proposed method, a person's voice is represented by LPC cepstrum, pitch period, and speaking rate, and appropriate transformation rules are constructed for each parameter. A Gaussian Mixture Model (GMM) is used to model one speaker's LPC cepstra, and a conditional probability is used to model the relationship between the two speakers' LPC cepstra. To obtain the parameters of each probabilistic model, a Maximum Likelihood (ML) estimation method is employed. The transformed LPC cepstra are obtained using a Minimum Mean Square Error (MMSE) criterion. Pitch period and speaking rate are used as the parameters for prosody transformation, which is implemented using the ratio of their average values. The proposed method shows superior performance to the previous VQ-based method in objective measures, including average cepstrum distance reduction ratio and likelihood increasing ratio. In the subjective test, we obtained almost the same correct identification ratio as the previous method, and we also confirmed that high-quality transformed speech is obtained, which is due to the smoothly evolving spectral contours over time.
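The prosody step described in this abstract (scaling by the ratio of average values) can be illustrated with a minimal sketch. The function names and the direction of the ratio (target average divided by source average) are assumptions made for illustration; the abstract does not give the paper's exact formulation.

```python
from statistics import mean


def prosody_scale_factors(src_pitch, tgt_pitch, src_rate, tgt_rate):
    """Return (pitch_ratio, rate_ratio), each computed as
    target average / source average (assumed direction)."""
    pitch_ratio = mean(tgt_pitch) / mean(src_pitch)
    rate_ratio = mean(tgt_rate) / mean(src_rate)
    return pitch_ratio, rate_ratio


def transform_pitch(src_pitch_periods, pitch_ratio):
    """Scale every source pitch period by the average-value ratio."""
    return [p * pitch_ratio for p in src_pitch_periods]
```

With training pitch periods from both speakers, `prosody_scale_factors` would be computed once, and `transform_pitch` applied to each new source utterance.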

Traffic-Aware TXOP adjusting Algorithm for IEEE 802.11e Network (IEEE 802.11e에서 전송흐름을 고려한 TXOP 조정 알고리듬)

Joung, Soo-Kyoung;Kim, Nam-Il
- Journal of the Institute of Electronics Engineers of Korea CI, v.48 no.1, pp.33-43, 2011
This paper proposes a traffic-aware TXOP adjustment algorithm for IEEE 802.11e networks. In the proposed algorithm, the access point (AP) periodically monitors the network traffic and adjusts the TXOP value of the non-QoS traffic in order to improve network throughput while maintaining the QoS of video and voice applications. The experimental results show that the proposed algorithm outperforms legacy IEEE 802.11e in terms of throughput and fairness.

A study on the Speaker Recognition using the Pitch (피치계수를 이용한 화자인식에 관한 연구)

김에녹
- Journal of the Korea Computer Industry Society, v.2 no.4, pp.471-480, 2001
In this thesis, we conduct a speaker recognition experiment that identifies vowels in each speaker's pronunciation using the Adaptive Resonance Theory 2 (ART2) model. Five adult males and five adult females pronounced the digits 0 to 9. We first extracted the vowels from each speaker's pronunciation, and then extracted characteristic coefficients through a pitch detection algorithm, LPC analysis, and LPC cepstral analysis to generate the input patterns for ART2. The experimental results showed that pitch coefficients performed somewhat better than LPC or LPC cepstral coefficients.

Korean Continuous Speech Recognition Using Discrete Duration Control Continuous HMM (이산 지속시간제어 연속분포 HMM을 이용한 연속 음성 인식)

Lee, Jong-Jin;Kim, Soo-Hoon;Hur, Kang-In
- The Journal of the Acoustical Society of Korea, v.14 no.1, pp.81-89, 1995
In this paper, we report a continuous speech recognition system using a continuous HMM with discrete duration control and regression coefficients. We also carry out a recognition experiment on 25 sentences of robot control commands using the One Pass DP method with finite state network (FSN) context control. In the experiment on 4 connected spoken digits, the recognition rate is $93.8\%$ when the discrete duration control and the regression coefficients are included, and $80.7\%$ when they are not. In the experiment on the 25 sentences of robot control commands, the recognition rate is $90.9\%$ when the FSN is not included and $98.4\%$ when it is included.

Comparison of Speech Intelligibility depending on the Sound Source Location in the Classrooms of Middle and High Schools (음원의 위치에 따른 중·고등학교 교실의 음성명료도 비교)

Lee Hwan-Hee;Haan Chan-Hoon
- Proceedings of the Acoustical Society of Korea Conference, spring, pp.487-490, 2002
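Because much of school education depends on conveying spoken information in the classroom, improvement of the listening environment should be examined. In addition, the Korean and English listening sections of the College Scholastic Ability Test in middle and high schools, as well as various language tests, are administered through audiovisual equipment, so the acoustic environment of the classroom is a very important factor. In this paper, we verified through experiments how intelligibility changes with the location of the sound source, which governs the acoustic environment, and sought the source location that gives high intelligibility with a uniform distribution throughout the classroom. As source locations, we selected the commonly used column (wall-mounted, exposed) and ceiling (ceiling-recessed) positions and an arbitrary cluster (front center) position, and measured the sound field parameters. RASTI showed only slight differences, with values of 0.54 to 0.55 for all three types, and reverberation time appeared in the order ceiling > cluster > column. Reverberation and intelligibility are generally known to be inversely related; however, in the experiment, the column speaker, reported as having the longest reverberation time at 1.33 s, showed the highest D50 value at about $47\%$. This was attributed to the column speaker having the shortest average direct sound path distance between the source and each student's position.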

A Study on the Effectiveness of the Lungs Hand Acupuncture Based on Bio Signal Analysis (생체신호분석 기술을 적용한 폐 수지침 요법에 대한 효과성 연구)

Kim, Bong-Hyun;Cho, Dong-Uk
- The KIPS Transactions:PartB, v.19B no.2, pp.77-82, 2012
In this paper, we carried out a study to demonstrate the effectiveness of stimulating the hand points corresponding to the lungs, applying analysis parameters for image and audio signals. To this end, we collected facial images and voice recordings from 25 men in their 20s before and after stimulating the hand points corresponding to the lungs. Based on the collected data, we analyzed changes in the color of the right cheek area, which corresponds to the lungs according to Oriental medicine diagnostic theory, as well as changes in voice energy and speaking rate. As a result, after hand acupuncture, the L value of the right cheek area decreased by 2.33 on average, and the a and b values increased by 0.76 and 0.97 on average, respectively. In addition, voice energy increased by 0.42 on average, and speaking rate decreased by 0.07 on average. In other words, the results suggest that lung function was improved by hand acupuncture at the points corresponding to the lungs.
https://doi.org/10.3745/KIPSTB.2012.19B.2.077

A New EGG System Design and Speech Analysis for Quantitative Analysis of Human Glottal Vibration Patterns (성문진동 패턴의 정량적인 해석을 위한 새로운 시스템 설계와 음성분석)

김종찬;이재천;김덕원;오명환;윤대희;차일환
- Journal of Biomedical Engineering Research, v.20 no.4, pp.427-433, 1999
The purpose of this study is to develop an improved pitch extraction method that can be used in a variety of speech applications, such as high-quality compression and vocoding, and speech recognition and synthesis. To do so, we develop a new electroglottograph (EGG) measurement system based on four modulation-demodulation type spot electrodes for detecting the EGG signals. The glottal closure instant (GCI) is then determined from the EGG signals in real time, and the pitch contour is obtained from the GCI information. The new pitch contour algorithm (PCA) turns out to operate more reliably than the conventional speech-only algorithm. In addition, we study speech source models and glottal vibratory patterns for Koreans by measuring and analyzing the diverse vibration patterns of the vocal folds from the EGG signals.

Tracking Performance Improvement of the Double-Talk Robust Algorithm for Network Echo Cancellation (네트워크 반향제거를 위한 동시통화에 강인한 알고리듬의 추적 성능 개선)

Yoo, Jae-Ha
- The Journal of the Institute of Internet, Broadcasting and Communication, v.12 no.1, pp.195-200, 2012
We present a new algorithm that improves the tracking performance of the double-talk robust algorithm. We propose a method for detecting echo path changes and a modification to the update equation of the conventional adaptive filter. The duration for which the ratio of the error signal to the scale parameter remains high varies with the call status, and this property is used to detect echo path changes. The proposed update equation improves tracking performance by preventing wrong selection of the error signal. Simulations using real speech signals and echo paths from the ITU-T G.168 standard confirmed that the proposed algorithm improved tracking performance by more than 4 dB compared with the conventional algorithm.
https://doi.org/10.7236/JIWIT.2012.12.1.195

Video QoE Measurement Algorithm by Parameter Matching for IPTV Services (파라메터 매칭에 의한 IPTV 영상 QoE 측정 알고리즘)

Ha, Sang-Yong;Kim, Chin-Chol;Shin, Dong-Jin;Jo, Yong-Hyun;Roh, Byeong-Hee
- The Journal of Korean Institute of Communications and Information Sciences, v.36 no.5B, pp.451-463, 2011
QoE is defined as the quality perceived by users of a given service. Although a standard method to measure voice QoE (MOS) has been developed, no standard method to measure video QoE has been defined. In this paper, we propose an efficient algorithm to measure video QoE automatically for IPTV services. The proposed method selects candidate scenarios that directly affect the users' MOS and derives weight factors for the selected scenarios. The video QoE value is then calculated from the weight factors for the scenarios. To validate the proposed algorithm, we produced degraded videos reflecting the parameters. By comparing the user-perceived MOS values for the degraded videos with the video QoE values derived by the proposed algorithm, we show that the two are highly correlated with each other.
https://doi.org/10.7840/KICS.2011.36B.5.451
