• Title/Abstract/Keyword: Speech Class


Education System to Learn the Skills of Management Decision-Making by Using Business Simulator with Speech Recognition Technology

  • Sakata, Daiki;Akiyama, Yusuke;Kaneko, Masaaki;Kumagai, Satoshi
    • Industrial Engineering and Management Systems, Vol. 13, No. 3, pp. 267-277, 2014
  • In this paper, we propose an educational system that combines a business game simulator with a related curriculum. To develop these two elements, we examined the decision-making process of business management and identified the significant skills it requires. We then created an original simulator, named BizLator (http://bizlator.com), to help students develop these skills efficiently, and designed a curriculum suited to the simulator. We confirmed the effectiveness of the simulator and curriculum in a business-game-based class at Aoyama Gakuin University in Tokyo, and on this basis compared our education system with a conventional one, which allowed us to identify the advantages of and remaining issues with the proposed system. Furthermore, we proposed a speech recognition support system named BizVoice to provide teachers with more meaningful feedback, such as the students' level of understanding. Concretely, BizVoice captures the students' discussions during the game and converts the voice data to text using speech recognition technology. Teachers can then gauge how well the students understand the material, and the students in turn can use BizLator more effectively in class. We also confirmed the effectiveness of this system in the class at Aoyama Gakuin University.
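
The abstract describes BizVoice only at the system level. As a minimal sketch of its capture-and-transcribe step, the fragment below turns a recorded discussion into text; the SpeechRecognition library, Google's free web recognizer, the Japanese language code, and the file name are all illustrative assumptions, since the paper does not name its recognition engine.

```python
# Sketch of a BizVoice-style step: record classroom discussion audio and
# convert it to text. The SpeechRecognition library and Google's free web
# API are assumptions for illustration; the paper does not name its engine.
import speech_recognition as sr

def transcribe_discussion(wav_path: str, language: str = "ja-JP") -> str:
    recognizer = sr.Recognizer()
    with sr.AudioFile(wav_path) as source:
        # Reduce steady classroom noise before capturing the utterance.
        recognizer.adjust_for_ambient_noise(source, duration=0.5)
        audio = recognizer.record(source)
    try:
        return recognizer.recognize_google(audio, language=language)
    except sr.UnknownValueError:
        return ""  # Speech was unintelligible.

text = transcribe_discussion("group3_discussion.wav")  # hypothetical file
print(text)  # A teacher could scan the transcript for understanding-related cues.
```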

Vowel Classification of Imagined Speech in an Electroencephalogram using the Deep Belief Network

  • 이태주;심귀보
    • 제어로봇시스템학회논문지, Vol. 21, No. 1, pp. 59-64, 2015
  • In this paper, we demonstrate the usefulness of the deep belief network (DBN) in the field of brain-computer interfaces (BCI), particularly for imagined speech. In recent years, growing interest in BCI has led to a number of useful applications, such as robot control, game interfaces, and exoskeleton limbs. Imagined speech, which could be used for communication or military-purpose devices, is one of the most exciting BCI applications, but implementing such a system raises several problems. In a previous paper, we addressed some issues of imagined speech using the International Phonetic Alphabet (IPA), although that work still needed to be extended to multi-class classification. Accordingly, this paper provides a solution for vowel classification of imagined speech. We used the DBN, a deep learning algorithm, for multi-class vowel classification, and selected four vowel pronunciations from the IPA: /a/, /i/, /o/, /u/. For the experiment, we obtained 32-channel raw electroencephalogram (EEG) data from three male subjects, with electrodes placed on the scalp over the frontal lobe and both temporal lobes, which are related to thinking and verbal function. The eigenvalues of the covariance matrix of the EEG data were used as the feature vector for each vowel. For comparison, we also report the classification results of a back-propagation artificial neural network (BP-ANN). The BP-ANN achieved 52.04% accuracy, whereas the DBN achieved 87.96%, i.e., the DBN was 35.92 percentage points better at multi-class imagined speech classification, and it also required much less total computation time. In conclusion, the DBN algorithm is efficient for BCI system implementation.
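
The feature extraction named in the abstract, eigenvalues of the EEG covariance matrix, can be sketched in a few lines; the epoch shape, the descending sort, and the scale normalization below are illustrative assumptions rather than the authors' exact recipe.

```python
# Sketch of the covariance-eigenvalue feature described in the abstract,
# assuming EEG epochs shaped (channels, samples); names are illustrative.
import numpy as np

def covariance_eigen_features(epoch: np.ndarray) -> np.ndarray:
    """Return the sorted eigenvalues of the channel covariance matrix."""
    cov = np.cov(epoch)                      # (32, 32) channel covariance
    eigvals = np.linalg.eigvalsh(cov)        # real eigenvalues, ascending
    return eigvals[::-1] / eigvals.sum()     # descending, scale-normalized

rng = np.random.default_rng(0)
epoch = rng.standard_normal((32, 512))       # stand-in for one imagined-vowel trial
features = covariance_eigen_features(epoch)  # 32-dimensional feature vector
print(features[:5])
```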

The Effect of a Professor's Image-Making on College Students' Class Satisfaction and Class Commitment

  • 정혜림;박선주
    • 한국의상디자인학회지, Vol. 23, No. 3, pp. 73-85, 2021
  • The purpose of this study is to understand how a professor's image-making (internal, external, and social image), as perceived by college students, influences instructional outcomes. The influence of the professor's image-making on class satisfaction and class commitment was analyzed, along with the mediating effect of class satisfaction on the relationship between image-making and class commitment. First, the professor's external and social images were found to have a significant effect on class satisfaction: both interpersonal qualities, such as communication, manners, and intimacy, and the management of external presentation, including clothing style, makeup, hair, gestures, posture, attitude, voice, speech, and speech rate, contribute to satisfaction with the class. Second, the professor's internal, external, and social images all had a significant effect on class commitment; to support students' immersion in class, professors need to manage all three. Third, class satisfaction had a significant effect on class commitment: higher class satisfaction means greater class immersion. Fourth, class satisfaction fully mediated the relationship between the professor's social image and class commitment, and partially mediated the relationship between the professor's external image and class commitment. In short, the social image of professors as perceived by college students improves class satisfaction, which in turn further enhances class immersion.

Noise Effects on Foreign Language Learning

  • 임은수;김현기;김병삼;김종교
    • 음성과학, Vol. 6, pp. 197-217, 1999
  • In a noisy class, the acoustic-phonetic features of the teacher's speech and the perceptual responses of learners change in comparison with a quiet environment. Acoustic analyses were carried out on a set of French monosyllables consisting of 17 consonants and three vowels /a, e, i/, produced by one male speaker talking in quiet and in 50, 60, and 70 dB SPL of masking noise presented over headphones. The analyses showed consistent differences in the energy and formant center-frequency amplitude of consonants and vowels, the $F_1$ frequency of vowels, and the duration of voiceless stops, suggesting increased vocal effort. Perceptual experiments, in which 18 female undergraduates learning French served as subjects, were conducted in quiet and in 50 and 60 dB of masking noise. Consonant identification scores were higher for Lombard speech than for normal speech, suggesting that the speaker's increased vocal effort helps to overcome the masking effect of noise. As the noise level increased, the perceptual responses to the French consonants tended to become more confused, and subjective reaction scores describing the noise with vocabulary expressing an 'unpleasant' sensation became higher. Finally, from the point of view of L2 (second language) acquisition, the influence of L1 (first language) on L2 observed in the perceptual results supports the interference theory.
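
One of the acoustic measurements behind such Lombard-speech findings, the increase in overall energy under masking noise, can be sketched as follows; the file names and the dB reference are illustrative assumptions.

```python
# Sketch of one vocal-effort measurement used in Lombard-speech studies:
# comparing the RMS energy of a syllable recorded in quiet vs. in noise.
import numpy as np
from scipy.io import wavfile

def rms_db(path: str) -> float:
    rate, samples = wavfile.read(path)
    x = samples.astype(np.float64)
    rms = np.sqrt(np.mean(x ** 2))
    return 20.0 * np.log10(rms + 1e-12)  # dB re. an arbitrary reference

quiet = rms_db("pa_quiet.wav")        # hypothetical recordings
lombard = rms_db("pa_noise70.wav")
print(f"energy increase under 70 dB masking: {lombard - quiet:.1f} dB")
```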


Class Language Model based on Word Embedding and POS Tagging

  • 정의석;박전규
    • 정보과학회 컴퓨팅의 실제 논문지, Vol. 22, No. 7, pp. 315-319, 2016
  • Recent technical progress in language modeling for speech recognition has come from deep-neural-network-based approaches. However, most deep-neural-network language models under study can only be applied at the rescoring stage after speech recognition, and deep-network approaches to large vocabularies still need time to mature. This paper therefore seeks an approach that applies word embedding, a simplified form of the deep-neural-network language model, directly to the underlying N-gram model rather than as recognition post-processing. A class language model is one such approach: we first build word embeddings, then cluster the per-word vectors to construct a class language model, integrate it with the existing word-based N-gram model, and check whether the language model's performance improves. To validate the class language model, we examine experiments with various numbers of classes and a comparison with an RNN LM, and then propose a POS-tagged class language model generation method that guarantees a performance improvement for all of the language models tested.
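
A minimal sketch of the class-construction step described above, assuming gensim word embeddings and k-means clustering (the abstract does not fix the clustering algorithm); the toy corpus, vector size, and class count are placeholders.

```python
# Sketch: train word embeddings, cluster them, and map each word to a class
# ID for a class-based N-gram model. All sizes are illustrative assumptions.
from gensim.models import Word2Vec
from sklearn.cluster import KMeans

sentences = [["음성", "인식", "성능"], ["언어", "모델", "성능"]]  # toy corpus
w2v = Word2Vec(sentences, vector_size=50, min_count=1, seed=0)

words = list(w2v.wv.index_to_key)
vectors = w2v.wv[words]

kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(vectors)
word_to_class = dict(zip(words, kmeans.labels_))
# A class N-gram then models P(class_i | class_{i-1}) * P(word_i | class_i).
print(word_to_class)
```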

Classification of Pathological Voice from ARS using Neural Network

  • 조철우;김광인;김대현;권순복;김기련;김용주;전계록;왕수건
    • 음성과학, Vol. 8, No. 2, pp. 61-71, 2001
  • Speech material collected from an ARS (Automatic Response System) was analyzed and classified into disease and non-disease states. The material covers 11 different diseases. Along with the ARS speech, DAT (Digital Audio Tape) speech was collected in parallel to serve as a benchmark. The speech material was analyzed with tools developed in our laboratory, which provide improved and robust estimates of the extracted parameters. To classify speech into the disease and non-disease classes, a multi-layered neural network was used. Three combinations of 3, 6, and 12 parameters were tested to determine a proper network size and to find the best performance. From the experiment, a classification rate of 92.5% was obtained.
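
As a rough sketch of the classification stage, the fragment below trains a small multi-layer network on 12-dimensional parameter vectors with scikit-learn's MLPClassifier; the random data, hidden-layer size, and train/test split are illustrative assumptions, not the authors' configuration.

```python
# Sketch of the disease/non-disease decision with a small multi-layer network.
# Feature values are placeholders; the paper's exact acoustic parameters differ.
import numpy as np
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)
X = rng.standard_normal((200, 12))   # e.g., jitter, shimmer, HNR, ... (12 params)
y = rng.integers(0, 2, size=200)     # 0 = non-disease, 1 = disease

clf = MLPClassifier(hidden_layer_sizes=(12,), max_iter=2000, random_state=0)
clf.fit(X[:150], y[:150])
print(f"held-out accuracy: {clf.score(X[150:], y[150:]):.3f}")
```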


Speech Emotion Recognition by Speech Signals on a Simulated Intelligent Robot

  • 장광동;권오욱
    • 대한음성학회:학술대회논문집, 2005년도 추계 학술대회 발표논문집, pp. 163-166, 2005
  • We propose a speech emotion recognition method for a natural human-robot interface. In the proposed method, emotion is classified into six classes: angry, bored, happy, neutral, sad, and surprised. Features for an input utterance are extracted from statistics of phonetic and prosodic information. Phonetic information includes log energy, shimmer, formant frequencies, and Teager energy; prosodic information includes pitch, jitter, duration, and rate of speech. Finally, a pattern classifier based on Gaussian support vector machines decides the emotion class of the utterance. We recorded speech commands and dialogs uttered 2 m away from microphones in five different directions. Experimental results show that the proposed method yields 59% classification accuracy while human classifiers give about 50% accuracy, which confirms that the proposed method achieves performance comparable to that of humans.
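
The decision stage, a Gaussian-kernel SVM over utterance-level feature statistics, can be sketched as follows; the eight-dimensional features, labels, and SVM hyperparameters are placeholders for the pitch, energy, jitter, and shimmer statistics named in the abstract.

```python
# Sketch of the utterance-level decision step: a Gaussian (RBF) kernel SVM
# over per-utterance statistics of phonetic and prosodic features.
import numpy as np
from sklearn.svm import SVC

EMOTIONS = ["angry", "bored", "happy", "neutral", "sad", "surprised"]

rng = np.random.default_rng(0)
X = rng.standard_normal((300, 8))             # stand-in feature statistics
y = rng.integers(0, len(EMOTIONS), size=300)  # stand-in emotion labels

clf = SVC(kernel="rbf", C=10.0, gamma="scale")  # Gaussian SVM
clf.fit(X, y)
print(EMOTIONS[int(clf.predict(X[:1])[0])])
```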


Speech Emotion Recognition on a Simulated Intelligent Robot

  • 장광동;김남;권오욱
    • 대한음성학회지:말소리, No. 56, pp. 173-183, 2005
  • We propose a speech emotion recognition method for an affective human-robot interface. In the proposed method, emotion is classified into six classes: angry, bored, happy, neutral, sad, and surprised. Features for an input utterance are extracted from statistics of phonetic and prosodic information. Phonetic information includes log energy, shimmer, formant frequencies, and Teager energy; prosodic information includes pitch, jitter, duration, and rate of speech. Finally, a pattern classifier based on Gaussian support vector machines decides the emotion class of the utterance. We recorded speech commands and dialogs uttered 2 m away from microphones in five different directions. Experimental results show that the proposed method yields 48% classification accuracy while human classifiers give 71% accuracy.


Performance Analysis of a Class of Single Channel Speech Enhancement Algorithms for Automatic Speech Recognition

  • 송명석;이창헌;이석필;강홍구
    • The Journal of the Acoustical Society of Korea, Vol. 29, No. 2E, pp. 86-99, 2010
  • This paper analyzes the performance of various single-channel speech enhancement algorithms when they are applied as a preprocessor to automatic speech recognition (ASR) systems. The functional modules of speech enhancement systems are first divided into four major modules: a gain estimator, a noise power spectrum estimator, an a priori signal-to-noise ratio (SNR) estimator, and a speech absence probability (SAP) estimator. We investigate the relationship between speech recognition accuracy and the role of each module. Simulation results show that the Wiener filter outperforms other gain functions, such as the minimum mean square error-short time spectral amplitude (MMSE-STSA) and minimum mean square error-log spectral amplitude (MMSE-LSA) estimators, when a perfect noise estimator is available. When the performance of the noise estimator degrades, however, the MMSE methods, together with the decision-directed a priori SNR estimation module and the SAP estimation module, help to improve the performance of the enhancement algorithm for speech recognition systems.
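
Two of the modules discussed above, the decision-directed a priori SNR estimator and the Wiener gain, can be written down compactly; the smoothing constant alpha = 0.98 and the stand-in spectra below are illustrative assumptions.

```python
# Sketch of two modules from the analysis: decision-directed a priori SNR
# estimation and the Wiener gain, applied per frequency bin on STFT magnitudes.
import numpy as np

def wiener_gain(xi: np.ndarray) -> np.ndarray:
    """Wiener filter gain from the a priori SNR xi."""
    return xi / (1.0 + xi)

def decision_directed_xi(prev_clean_mag, noisy_mag, noise_psd, alpha=0.98):
    """Decision-directed a priori SNR estimate (Ephraim-Malah style)."""
    gamma = (noisy_mag ** 2) / noise_psd          # a posteriori SNR
    ml_part = np.maximum(gamma - 1.0, 0.0)        # maximum-likelihood term
    return alpha * (prev_clean_mag ** 2) / noise_psd + (1.0 - alpha) * ml_part

# One-frame example with stand-in spectra:
rng = np.random.default_rng(0)
noisy_mag = np.abs(rng.standard_normal(257))
noise_psd = np.full(257, 0.5)
prev_clean = np.zeros(257)   # previous frame's enhanced magnitude

xi = decision_directed_xi(prev_clean, noisy_mag, noise_psd)
enhanced_mag = wiener_gain(xi) * noisy_mag
```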

A Radiographic Study of Tongue Posture at Rest Position and During the Phonation of /s/ in Class III Malocclusion

  • 이기헌;김종철
    • 대한치과교정학회지, Vol. 23, No. 2, pp. 179-197, 1993
  • Tongue posture at rest position in Class III malocclusion is important to both the malocclusion itself and to phonation: because Class III malocclusion shows a low tongue position, speech defects commonly occur. This study evaluated the correlation between tongue posture, at rest position and during /s/ phonation, and the facial skeleton in centric occlusion. Thirty subjects with Class III malocclusion were selected who had no orofacial defects such as cleft palate, no medical history of neurologic pathology or hearing defects, and no previous speech therapy. Ninety lateral cephalometric radiographs, taken at rest position, during /s/ phonation, and in centric occlusion, were traced, measured, and statistically analyzed. The results were as follows: 1. In Class III malocclusion, the posture of the tongue was positively correlated with the position of the hyoid body; the hyoid body was positioned more anteriorly and inferiorly as the vertical facial skeleton increased in centric occlusion. 2. The vertical position of the tongue tip at rest was not correlated with the facial skeleton in centric occlusion, but the horizontal position showed a low correlation with mandibular body length, APDI, and $\underline{1}$ to SN. 3. There was a tendency for the dorsal position of the tongue to lower as the vertical facial skeleton increased. 4. The vertical and horizontal positions of the tongue tip during /s/ phonation were not correlated with the facial skeleton in centric occlusion.
