• Title/Summary/Keyword: on-the-pause


Dialect classification based on the speed and the pause of speech utterances (발화 속도와 휴지 구간 길이를 사용한 방언 분류)

  • Jonghwan Na;Bowon Lee
    • Phonetics and Speech Sciences
    • /
    • v.15 no.2
    • /
    • pp.43-51
    • /
    • 2023
  • In this paper, we propose an approach for dialect classification based on the speed and pause of speech utterances as well as the age and gender of the speakers. Dialect classification is one of the important techniques for speech analysis. For example, an accurate dialect classification model can potentially improve the performance of speaker or speech recognition. According to previous studies, research based on deep learning using Mel-Frequency Cepstral Coefficients (MFCC) features has been the dominant approach. We focus on the acoustic differences between regions and conduct dialect classification based on the extracted features derived from the differences. In this paper, we propose an approach of extracting underexplored additional features, namely the speed and the pauses of speech utterances along with the metadata including the age and the gender of the speakers. Experimental results show that our proposed approach results in higher accuracy, especially with the speech rate feature, compared to the method only using the MFCC features. The accuracy improved from 91.02% to 97.02% compared to the previous method that only used MFCC features, by incorporating all the proposed features in this paper.

Age classification of emergency callers based on behavioral speech utterance characteristics (발화행태 특징을 활용한 응급상황 신고자 연령분류)

  • Son, Guiyoung;Kwon, Soonil;Baik, Sungwook
    • The Journal of Korean Institute of Next Generation Computing
    • /
    • v.13 no.6
    • /
    • pp.96-105
    • /
    • 2017
  • In this paper, we investigate speaker age classification by analyzing voice calls to an emergency center. We classified callers as adult or elderly using behavioral speech utterance characteristics and an SVM (Support Vector Machine), a machine learning classifier. Through analysis of the emergency center call data, we selected two behavioral characteristics: silent pause and turn-taking latency. These criteria for age classification were first selected through analysis of the behavioral speech characteristics of the emergency calls and were then found to be significant (p < 0.05) by statistical analysis. We analyzed 200 data samples (adult: 100, elderly: 100) with 5-fold cross-validation using the SVM classifier. As a result, we achieved 70% accuracy using the two behavioral characteristics, which is higher than the accuracy obtained with a single characteristic. These results suggest a new age classification method based on behavioral speech characteristics; in future work, acoustic information (MFCC) will be combined with the behavioral characteristics on real voice data. Furthermore, this work will contribute to the development of an emergency situation judgment system that relies on age classification.

Implementation of the PVR(Personal Video Recorder) Chip for HDTV (HDTV용 PVR(Personal Video Recorder) 칩 구현)

  • 정수운;이동호
    • Proceedings of the IEEK Conference
    • /
    • 2001.09a
    • /
    • pp.943-946
    • /
    • 2001
  • We have developed a PVR (Personal Video Recorder) chip that is capable of simultaneous playback and recording of HD-quality MPEG-2 streams for digital TV. It provides viewers with some advanced features as well as the pause, instant replay, skip forward and fast forward/rewind functions found in conventional PVRs for analog TV. This paper describes the enhanced and innovative features that are implemented on our PVR.


MPEG-2 TS Streaming System based on nCUBE RTSP Protocol (nCUBE RTST 기반 MPEG-2 TS 스트리밍 시스템 개발)

  • 조창식;배수영;마평수;강지훈
    • Proceedings of the Korea Multimedia Society Conference
    • /
    • 2003.11b
    • /
    • pp.503-507
    • /
    • 2003
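  • As a result of users' demand for high picture quality and service providers' efforts to offer differentiated services, VOD services that use high-quality MPEG-2 video rather than the existing MPEG-4 are being proposed as a new alternative. MPEG-2 video has the drawback of requiring high network bandwidth, but it can provide users with good picture quality, and its use of a standard makes content maintenance easier. This paper describes a VOD system that streams MPEG-2 TS data in conjunction with the nCUBE server, a commercial streaming server. RTSP (Real Time Streaming Protocol) is used as the VOD control protocol, and UDP/IP is used as the stream transport protocol. The supported VCR functions are FF, RW, STOP, and Pause.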


Influence of 2-bromo-α-ergocryptine on Plasma Prolactin, Oestradiol-17β and Progesterone Levels in Domestic Hen

  • Reddy, I.J.;David, C.G.;Singh, Khub
    • Asian-Australasian Journal of Animal Sciences
    • /
    • v.15 no.8
    • /
    • pp.1103-1109
    • /
    • 2002
  • This study investigated the effect of 2-bromo-$\alpha$-ergocryptine (an anti-prolactin agent) on plasma levels of prolactin, oestradiol-17$\beta$ and progesterone in the domestic hen during the active period of lay. Fifty healthy female White Leghorn birds were administered the anti-prolactin agent (2-bromo-$\alpha$-ergocryptine, Sigma, USA, methane sulphonate salt, $C_{32}H_{40}BrN_5O_5 \cdot CH_4SO_3$) subcutaneously at 100 $\mu$g/kg body weight at weekly intervals from the 17th to the 36th week of age. Another group of fifty birds, serving as controls, were given a placebo in place of bromocriptine. The level of prolactin remained lower in treated birds than in the control birds from 19 to 36 weeks of age. The level of prolactin even in the control group was found to decrease during the peak production period. Oestradiol-17$\beta$ and progesterone concentrations in treated birds were significantly (p<0.01) higher than in the controls during the treatment. Egg production was positively correlated with oestradiol-17$\beta$ (r=0.02; r=0.67) and progesterone (r=0.49; r=0.90) in the control and treated groups, respectively, whereas prolactin level was positively correlated with egg production in the control birds (r=0.07). Prolactin levels were negatively correlated with egg production (r=-0.55) in treated birds, and with oestradiol-17$\beta$ (r=-0.71; r=-0.53) and progesterone (r=-0.22; r=-0.27) in the control and treated groups, respectively. The total number of pause days during the treatment period decreased significantly (p<0.01) in the treated group compared to the control group. The reduction in pause days in the treated group resulted in a 1.76% increase in egg production over that in the control group. The increases in egg laying days and total egg production were found to be significant (p<0.01). These results indicate that a lower level of prolactin in circulatory blood enhances egg production in the domestic hen.

A Study on the Perception of English Rhythm and Intonation Structure by Korea University Students (대학생의 영어 리듬과 억양구조 인식에 대한 연구)

  • Park Joo-Hyun
    • Proceedings of the KSPS conference
    • /
    • 1997.07a
    • /
    • pp.92-114
    • /
    • 1997
  • This study aims to identify the actual problems in the perception of English rhythm and intonation structure by Korean university students who have studied English in secondary school for the past six years, and to establish systems of English rhythm and intonation structure for Korean students of English. For this study, a listening test was administered to 100 students chosen as subjects. The noticeable findings are summarized as follows: (1) Koreans perceive word stress comparatively well in nonsense words, unfamiliar place names, and familiar words. (2) Koreans do not perceive the isochrony of English rhythm well enough. The perception of sentence stress is very unstable, especially in sentences involving polysyllabic words, compound words, and 'emphatic stress' or 'contrastive stress' (or in different rhythmic patterns). (3) Koreans do not perceive the nucleus well enough. The perception of the nucleus is more stable in content words than in function words, at the end of a sentence than in the middle of a sentence, and in monosyllabic words than in polysyllabic words. (4) Koreans do not perceive the boundary (or pause) of an intonation group well enough. The perception of the pause is unstable in long or complex sentences. (5) Koreans discriminate the meaning of English word stress comparatively well, especially in disyllabic words. But the discrimination is somewhat unstable in polysyllabic words and between 'adjective' and 'verb'. (6) Koreans' discrimination of the intonation meaning is below the level. Koreans do not perceive the differences of intonation meaning according to the pitch accent or the focus.
In conclusion, the writer proposes procedures for teaching rhythm and intonation in the following order: word stress drill → stressed and reduced syllables drill → rhythm group drill → varying rhythm drill → sentence stress drill → nucleus drill → intonation group drill → long utterance drill of more than two intonation groups.


Design of a System for Collecting and Utilizing Student Feedback Information in Asynchronous Indivisual Learning (비실시간 온라인 수업에서 학습자의 피드백 정보 수집 및 활용 시스템의 설계 및 구현)

  • Tae-Hwan Kim;Dae-Soo Cho;Seung-Min Park
    • The Journal of the Korea institute of electronic communication sciences
    • /
    • v.19 no.1
    • /
    • pp.225-232
    • /
    • 2024
  • Asynchronous individual learning offers advantages such as allowing learners to study at their preferred times without spatial constraints. However, since these classes are not conducted in real time, there are limitations in conveying learners' feedback on problematic or inadequately explained course content to the instructors. This paper proposes a system for relaying feedback information from learners who view course content to the instructors. Learners can investigate the reasons for pausing online recorded class content, and they can transmit these pause reasons along with the time information of the paused content to the instructors. Instructors receive feedback information and pause times of learners' online recorded class videos in graphical form, making it easier to identify areas with numerous issues in the course content at a glance. Instructors can incorporate this feedback to re-upload the content, resulting in higher-quality course materials, which, in turn, can enhance learners' academic achievements.

Performance Analysis of Routing Protocols for Mobile Ad-hoc Network under Group Mobility Environment (그룹 이동 환경에서의 무선 애드혹 네트워크 라우팅 알고리즘 성능 분석)

  • Yang, Hyo-Sik;Yeo, In-Ho;Rhee, Jong-Myung
    • Journal of the Institute of Electronics Engineers of Korea CI
    • /
    • v.45 no.6
    • /
    • pp.52-59
    • /
    • 2008
  • Most prior performance analysis results for ad-hoc routing protocols have been based on a model in which each node in the network moves independently without restriction. In most real environments, however, it is very common for a group or multiple groups to move under the direction of a group leader or group leaders rather than for each node to move independently. This paper presents a performance analysis of routing protocols for mobile ad-hoc networks under a group mobility environment. Comparative simulations were carried out between a table-driven protocol, DSDV, and two on-demand protocols, AODV and DSR, under a group mobility model, RPGM, which is suitable for practical applications such as military tactical operations. Multiple group movements are also included. The results show that protocol performance for single group movement is very similar to that in the independent node movement case. However, some differences were observed when varying pause time and connectivity.

Anti-asthmatic Effects of Samjajihwang-tang in OVA-induced Mice (삼자지황탕(三子地黃湯)의 생쥐 모델에 대한 항천식 효과)

  • Kim, Woon-Kil;Park, Yang-Chun
    • Journal of Physiology & Pathology in Korean Medicine
    • /
    • v.23 no.2
    • /
    • pp.343-350
    • /
    • 2009
  • This study aimed to evaluate the anti-asthmatic effects of Samjajihwang-tang (SJT) using an OVA-induced asthmatic mouse model. The asthmatic mouse model was established by repeated OVA challenge in C57BL/6 mice. Each group was treated with distilled water, SJT extract (400 mg/kg or 200 mg/kg) or cyclosporin A (10 mg/kg) for the latter 8 weeks. Penh (plethysmography and enhanced pause), immune cell subpopulations, eotaxin, IL-5, TNF-$\alpha$ and anti-OVA-IgE in BALF (bronchoalveolar lavage fluid), and lung tissue were analyzed. No cytotoxicity of SJT was shown on hFCs (human fibroblast cells). Administration of SJT significantly decreased Penh levels compared with the control group. SJT treatment significantly ameliorated the increase in total cell and eosinophil numbers, including the immune cell subpopulations of $CD3^+/CD69^+$, $CCR3^+$, $B220^+/CD22^+$, $B220^+/CD45^+$, and $B220^+/IgE^+$ cells, in BALF compared with the control group. Eotaxin, IL-5, TNF-$\alpha$, and anti-OVA-IgE levels in BALF were also significantly decreased by SJT treatment. Histopathological findings verified the improvement in infiltration of inflammatory cells and collagen tissue in the SJT groups compared with the control group. These results strongly suggest that SJT would be an effective candidate for a herbal-originated anti-asthmatic drug. However, this drug should be further studied to characterize its precise action and underlying mechanisms using various disease models in the future.

Voice Activity Detection Method Using Psycho-Acoustic Model Based on Speech Energy Maximization in Noisy Environments (잡음 환경에서 심리음향모델 기반 음성 에너지 최대화를 이용한 음성 검출 방법)

  • Choi, Gab-Keun;Kim, Soon-Hyob
    • The Journal of the Acoustical Society of Korea
    • /
    • v.28 no.5
    • /
    • pp.447-453
    • /
    • 2009
  • This paper introduces a method for detecting voice and the exact end point at low SNR by maximizing voice energy. Conventional VAD (Voice Activity Detection) algorithms estimate the noise level, so they tend to detect the end point inaccurately. Moreover, because they use a relatively long analysis range to reflect temporal changes in noise, their computing load is too high for practical application. In this paper, the SEM-VAD (Speech Energy Maximization-Voice Activity Detection) method, which uses psycho-acoustic Bark-scale filter banks to maximize voice energy within frames, is introduced. Stable threshold values are obtained in various noise environments (SNR 15 dB, 10 dB, 5 dB, 0 dB). In a voice detection test in a noisy car environment, the PHR (Pause Hit Rate) was 100% in every noise environment, and the FAR (False Alarm Rate) was 0% at SNR 15 dB and 10 dB, 5.6% at SNR 5 dB and 9.5% at SNR 0 dB.