Search | Korea Science

Voiced-Unvoiced-Silence Detection Algorithm using Perceptron Neural Network (퍼셉트론 신경회로망을 사용한 유성음, 무성음, 묵음 구간의 검출 알고리즘)

Choi, Jae-Seung
- The Journal of the Korea Institute of Electronic Communication Sciences / v.6 no.2 / pp.237-242 / 2011
This paper proposes an algorithm that classifies each frame as voiced, unvoiced, or silence using a multi-layer perceptron neural network. For each frame, the power spectrum and the FFT (fast Fourier transform) coefficients are used as the input to the neural network, and the network is trained on these features. The algorithm was evaluated by its detection rates on various speech signals degraded by white noise, which served as the input data of the neural network. The detection rates were 92% or more for such noisy speech when the training data and the evaluation data were different.
https://doi.org/10.13067/JKIECS.2011.6.2.237
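
The per-frame input described in the abstract above can be illustrated with a minimal sketch. It covers only the feature step (framing and power spectrum), not the perceptron itself. The frame length, hop size, and function names are assumptions for illustration; the abstract does not state them, and a direct DFT stands in for an FFT to keep the example dependency-free.

```python
import cmath
import math


def frame_signal(samples, frame_len, hop):
    """Split samples into overlapping frames; a trailing partial frame is dropped."""
    return [samples[i:i + frame_len]
            for i in range(0, len(samples) - frame_len + 1, hop)]


def power_spectrum(frame):
    """Power |X[k]|^2 for bins 0..N/2, computed with a direct DFT for clarity."""
    n = len(frame)
    spectrum = []
    for k in range(n // 2 + 1):
        x_k = sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n)
                  for t in range(n))
        spectrum.append(abs(x_k) ** 2)
    return spectrum


# A pure tone with 2 cycles per 16-sample frame: its energy lands in bin 2.
tone = [math.sin(2 * math.pi * 2 * t / 16) for t in range(32)]
frames = frame_signal(tone, frame_len=16, hop=8)   # hypothetical sizes
features = [power_spectrum(f) for f in frames]      # one feature vector per frame
peak_bins = [max(range(len(p)), key=p.__getitem__) for p in features]
```

Each entry of `features` is the kind of vector that would be fed to a classifier with three outputs (voiced, unvoiced, silence); the network topology and training procedure are not given in the abstract and are left out here.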

An Algorithm for Stable Video Conference System (안정적인 화상회의 시스템을 위한 알고리즘)

Lee Moon-Ku
- Journal of the Institute of Electronics Engineers of Korea CI / v.42 no.2 s.302 / pp.11-20 / 2005
In previous video conference systems, when the number of participants increases to n, bandwidth and memory on the order of n² are required. This also increases traffic and, for voice data transmission, raises the problem of who has the right to speak during a conference. In this paper, we propose a remote video conference algorithm that uses a server-side video data buffering method and a silence detection algorithm to address the heavy traffic that comes with an increase in participants. The video data buffering algorithm does not broadcast from the server to the other clients. Instead, it uses two methods: a buffering method in which the server receives compressed video data from clients, and an indexing method by which each client acquires the other participants' video data according to its bandwidth and network transmission speed. We apply a voice transmission algorithm and a channel management algorithm to the remote video conference system. The voice transmission algorithm uses silence detection, so that silent participants' voice data is not sent to the server. The channel management algorithm grants the right to speak to the participants who have priority. Given an average of 20 frames and 30 ms regardless of the number of participants, we conclude that the transmission of video and voice data is stable.
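
The idea of not sending silent participants' voice data can be sketched as a simple energy gate. This is an illustration only: the abstract does not say how silence is detected, so the mean-square energy measure, the `threshold` value, and the function names are all assumptions.

```python
def frame_energy(frame):
    """Mean squared amplitude of one audio frame."""
    return sum(s * s for s in frame) / len(frame)


def frames_to_send(frames_by_participant, threshold=0.01):
    """Keep only participants whose current frame reaches the energy threshold.

    `threshold` is a hypothetical value; the abstract gives no detection rule.
    """
    return {name: frame
            for name, frame in frames_by_participant.items()
            if frame_energy(frame) >= threshold}


current = {
    "alice": [0.30, -0.25, 0.28, -0.31],     # speaking
    "bob": [0.001, -0.002, 0.001, 0.000],    # silent
    "carol": [0.000, 0.001, -0.001, 0.002],  # silent
}
sent = frames_to_send(current)  # only alice's frame would go to the server
```

With one active speaker out of three, only one of the three voice frames is uploaded for this interval, which is the traffic saving the abstract attributes to silence detection.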

The Research of Reducing the Fixed Codebook Search Time of G.723.1 MP-MLQ (잡음 환경에서의 전송율 감소를 위한 G.723.1 VAD 성능개선에 관한 연구)

김정진;박영호;배명진
- Proceedings of the IEEK Conference / 2000.06d / pp.98-101 / 2000
The G.723.1 6.3/5.3 kbps dual-rate speech codec, a CELP-type vocoder developed for Internet telephony and videoconferencing, uses VAD (Voice Activity Detection) and CNG (Comfort Noise Generator) to reduce the bit rate during silence periods. To reduce the bit rate effectively, in this paper we first set a boundary condition on the energy threshold to avoid unnecessary processing time, and then use three decision rules to detect an active frame, based on energy, pitch gain, and LSP distance. To evaluate the performance of the proposed algorithm, we use silence-inserted speech data at SNRs of 0, 5, 10, and 20 dB. When the SNR is over 5 dB, the bit rate is reduced by up to about 40% without speech degradation, and the processing time is also decreased.
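
One possible reading of the decision structure in the abstract above is sketched here: an energy boundary check that settles clear-cut frames early, followed by three rules for the ambiguous ones. Every threshold value, the `FrameFeatures` type, and the choice to combine the rules with a logical OR are assumptions; the abstract does not specify them, and this is not the G.723.1 Annex A procedure.

```python
from dataclasses import dataclass


@dataclass
class FrameFeatures:
    energy_db: float     # frame energy in dB
    pitch_gain: float    # long-term predictor gain, roughly 0..1
    lsp_distance: float  # spectral distance from a running noise estimate


def is_active(f, floor_db=-60.0, ceil_db=-20.0,
              energy_min_db=-40.0, gain_min=0.5, lsp_min=0.1):
    """Return True if the frame is treated as speech. All thresholds are hypothetical."""
    # Boundary check: clearly quiet or clearly loud frames skip the detailed rules.
    if f.energy_db < floor_db:
        return False
    if f.energy_db > ceil_db:
        return True
    # Ambiguous frames: any one of the three rules marks the frame active (assumed OR).
    return (f.energy_db > energy_min_db
            or f.pitch_gain > gain_min
            or f.lsp_distance > lsp_min)


quiet = FrameFeatures(energy_db=-70.0, pitch_gain=0.9, lsp_distance=0.5)
voiced_but_soft = FrameFeatures(energy_db=-50.0, pitch_gain=0.7, lsp_distance=0.0)
decisions = [is_active(quiet), is_active(voiced_but_soft)]
```

The early return is where processing time would be saved in this reading: frames outside the energy bounds never reach the pitch gain and LSP comparisons.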

A Study on the Improvement of DTW with Speech Silence Detection (음성의 묵음구간 검출을 통한 DTW의 성능개선에 관한 연구)

Kim, Jong-Kuk;Jo, Wang-Rae;Bae, Myung-Jin
- Speech Sciences / v.10 no.4 / pp.117-124 / 2003
Speaker recognition is the technology that confirms a speaker's identity using the characteristics of speech. It is classified into speaker identification and speaker verification: the former discriminates the speaker from a preregistered group and recognizes the word, and the latter verifies a speaker who claims an identity. This approach, which extracts speaker information from speech and confirms the individual's identity, has become one of the most efficient technologies as services over the telephone network have become widespread. Some problems, however, must be solved for real applications. First, a safe method is needed to reject impostors, because those who attempt recognition are not limited to preregistered customers. Second, the characteristics of speech change over time, which severely degrades the recognition rate and inconveniences users as the number of times they must utter the text increases. Last, characteristics common among speakers cause wrong recognition results. Silence segments in the middle of speech also decrease the identification rate. In this paper, we propose improving the identification rate by removing silence segments before running the identification algorithm. Zero crossing rate and signal energy are used to detect the start point and end point of the speech, and the DTW algorithm is then applied using these two methods. As a result, the proposed method improves the recognition rate by about 3% compared with conventional methods.
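
The effect described above, removing silence before DTW matching, can be illustrated with a small sketch. It is not the paper's implementation: the thresholds `energy_min` and `zcr_min`, the use of per-frame energy as the only DTW feature, and all function names are assumptions chosen to keep the example short.

```python
def short_time_energy(frame):
    """Sum of squared samples in one frame."""
    return sum(s * s for s in frame)


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)


def remove_silence(frames, energy_min=0.5, zcr_min=0.5):
    """Keep frames that are energetic (voiced) or cross zero often (unvoiced)."""
    return [f for f in frames
            if short_time_energy(f) >= energy_min or zero_crossing_rate(f) >= zcr_min]


def dtw_distance(a, b):
    """Classic dynamic time warping cost between two 1-D feature sequences."""
    inf = float("inf")
    cost = [[inf] * (len(b) + 1) for _ in range(len(a) + 1)]
    cost[0][0] = 0.0
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[len(a)][len(b)]


voiced = [0.5, 0.5, 0.5, 0.5]       # high energy, no zero crossings
unvoiced = [0.1, -0.1, 0.1, -0.1]   # low energy, many zero crossings
silence = [0.0, 0.0, 0.0, 0.0]

reference = [voiced, unvoiced]
utterance = [voiced, silence, silence, unvoiced]  # same word with a pause inside

ref_feat = [short_time_energy(f) for f in reference]
raw_feat = [short_time_energy(f) for f in utterance]
clean_feat = [short_time_energy(f) for f in remove_silence(utterance)]

raw_cost = dtw_distance(ref_feat, raw_feat)      # pause adds warping cost
clean_cost = dtw_distance(ref_feat, clean_feat)  # pause removed before matching
```

In this toy case the cleaned utterance matches the reference exactly, while the version with the pause does not. On real recordings, low-level background noise can also have a high zero crossing rate, so a practical detector would normally pair the zero-crossing test with an energy floor.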

Impact of Voice Activity Detection on Channel Allocation in Cellular Networks

Limsaksri, Wichan;Thipchaksurat, Sakchai;Varakulsiripunth, Ruttikorn
- 제어로봇시스템학회:학술대회논문집 (Institute of Control, Robotics and Systems: Conference Proceedings) / 2004.08a / pp.1067-1071 / 2004
In this paper, a performance enhancement algorithm for channel allocation for voice and data transmission in cellular networks is proposed. Voice activity detection is applied to the dynamic channel allocation procedure to detect and separate silence and speech within conversation periods, so that a data user can use the silent periods of an active voice channel to transmit its information. The selection of channel allocation policies is controlled by the number of data items in the transmission waiting queue, which is determined for the performance measurement. In the simulation results, the improvement in performance is shown through quality-of-service measures, namely the average delay in queue and the blocking probability, and the impact of the proposed scheme on the system is presented.
