• Title/Abstract/Keywords: speech effort

67 search results (processing time: 0.021 seconds)

ACOUSTIC CHARACTERISTICS OF KOREAN TRADITIONAL SINGING VOICE: A PRELIMINARY REPORT

  • Moon, Seung-Jae
    • Korean Society of Phonetic Sciences: Conference Proceedings
    • /
    • Korean Society of Phonetic Sciences, October 1996 Conference Proceedings
    • /
    • pp.367-371
    • /
    • 1996
  • Most Koreans agree that the Korean traditional singing voice sounds very distinctive compared to the Western singing voice. The goal of this paper is to investigate the acoustic characteristics of the Korean traditional singing voice known as 'Pansori'. Materials from 3 male and 4 female professional singers were analyzed. Their singing was compared with their own conversation and with non-singers' conversation. Long-term average spectra indicated that all the singers showed much less spectral tilt than non-singers. This held for the professional singers not only in their singing but also in their conversation, which suggests that it is not the result of a temporary effort but may involve a permanent change in their physiological configuration. (To assess this hypothesis, the voice source should be examined directly. Therefore, use of the Rothenberg mask (Rothenberg, 1973) is strongly recommended in further research.) In addition to the long-term average spectra, individual vowel formants will be studied later.

A Study on Routine Formulas and Downgraders of Request Act in High School English Textbooks

  • Yang, Eun-Mi
    • English Language & Literature Teaching
    • /
    • Vol. 11, No. 2
    • /
    • pp.111-134
    • /
    • 2005
  • This paper examines high school English textbooks to ascertain whether they appropriately reflect the kinds and frequencies of routine formulas and downgraders that native English speakers use in requests. Textbooks need to present authentic routine formulas so that students acquire proper, efficient, and safe strategies for communicating with other English speakers. For the analysis, seven currently available series comprising 21 high school English textbooks under the 7th National Curriculum were selected. Each series contains three grade-level textbooks: High School English, High School English I, and High School English II. The results show that the textbooks generally give a one-sided reflection of native English speakers' use of request strategies and downgraders. That is, the textbooks present mostly casual routine formulas and do not sufficiently cover the elaborated, polite routine formulas for requesting that native English speakers frequently use. Some of the frequently used downgraders also appear in very small proportions in the textbooks. Teachers and researchers should make more effort to remedy this deficiency.

An Empirical Analysis of Auditory Interfaces in Human-computer Interaction

  • Nam, Yoonjae
    • International Journal of Contents
    • /
    • Vol. 9, No. 3
    • /
    • pp.29-34
    • /
    • 2013
  • This study compared the usability of three auditory interfaces in personal computing environments: verbal messages (speech sounds), earcons (musical sounds), and auditory icons (natural sounds). Usability is treated here as a comprehensive concept that includes safety, utility, effectiveness, and efficiency. The study hypothesized that verbal messages would offer higher usability than earcons and auditory icons, since verbal messages are easy to interpret and understand based on the semiotic process. Usability was measured by a set of seven items: ability to inform what the program is doing, relevance to visual interfaces, degree of stimulation, degree of understandability, perceived time pressure, clearness of sound outputs, and degree of satisfaction. The experimental results showed that verbal messages provided the highest level of usability. In contrast, auditory icons showed the lowest level of usability, as they require users to establish new coding schemes and thus demand more mental effort from users.

Survey of Recent Research in Education based on Artificial Intelligence

  • 전형배;정훈;강병옥;이윤경
    • Electronics and Telecommunications Trends
    • /
    • Vol. 36, No. 1
    • /
    • pp.71-80
    • /
    • 2021
  • Artificial intelligence (AI) will have a huge impact on future education. We look at the role of AI in education and changes in schools. Personalized education is being attempted in limited services, and an interactive tutor service with speech recognition/dialog technology is being developed. In the future, we look forward to fully personalized education for individual students through AI teachers. Teachers are expected to make more effort to teach creative thinking, critical thinking, communication, and collaboration. As the speed of development of AI technology accelerates, we expect that AI-based education will be deeply established around us in the near future. We first introduce the details of the personalization technology and then discuss the AI-based foreign language speaking education research conducted by ETRI.

Interactive Data Acquisition System based on Hand Tracking to evaluate Children's Cognitive Abilities

  • Ekaterina, Ten;Lee, Suk-Ho
    • International Journal of Internet, Broadcasting and Communication
    • /
    • Vol. 14, No. 3
    • /
    • pp.108-114
    • /
    • 2022
  • Autism (ASD) is a mental disorder characterized by a pronounced deficit in personal, social, speech, and other aspects of development and communication skills. Since autism is a complex developmental disorder that takes considerable effort to recognize, this research developed an interactive data acquisition system to detect the first signs of ASD in children. The proposed system presents several task variants in an entertaining form, using hand tracking to attract children's attention and hold their interest so that more accurate results can be obtained. The system is built on libraries such as OpenCV, PyGame, TensorFlow, and Mediapipe. The ultimate goal of the paper is to obtain data on autism in children for use in further diagnosis by medical experts.

A Study on Low Pitch Accent Produced in Different Locations in English Sentences

  • 이서배;김수정
    • Phonetics and Speech Sciences
    • /
    • Vol. 3, No. 4
    • /
    • pp.63-70
    • /
    • 2011
  • Recent studies on English L* (low pitch accent) have revealed differences in acoustic manifestation between utterances produced by Koreans and those produced by native speakers of English. However, little effort has been made to compare L* focused constituents with non-focused constituents, and most previous work on focus realization lacks normalization of the acoustic measurements. This research therefore compares the L* focused and non-focused items realized by Koreans and Americans and examines the realization of English L* by the two language groups with improved normalization of the acoustic features (F0, intensity, and duration). Within-group analysis comparing focused and non-focused words showed that both Americans and Koreans prolonged the L* focused syllables, but the effect size of syllable lengthening was far smaller for Koreans than for Americans. Furthermore, significant F0 lowering was found in Americans but not in Koreans. The effect of intensity change caused by L* focus was not significant within either group. The effect of focused words was also tested between the two groups, revealing that Koreans implemented English L* focus with higher F0, lower intensity, and shorter duration than Americans. Where a significant Group x Focus Location (sentence-initial, sentence-medial, and sentence-final) interaction was found, further analysis tested the effect of Group at each Focus Location. This testing showed that the Koreans produced shorter syllables in sentence-initial and sentence-medial positions and higher F0 in sentence-initial position than the Americans. Implications for intonation training are also discussed.

WalkieTagging: Efficient Speech-Based Video Annotation Method for Smart Devices

  • 박준영;이수빈;강동엽;석영태
    • Journal of the Korea Society of IT Services
    • /
    • Vol. 12, No. 1
    • /
    • pp.271-287
    • /
    • 2013
  • The rapid growth and spread of touch-based mobile devices such as smartphones and tablet PCs gives numerous benefits to people who use a variety of multimedia content. Because these devices are portable, users can watch a soccer game, search for videos on YouTube, and sometimes tag content while on the road. However, the limited screen size of mobile devices, and the touch-based character input methods that depend on it, remain major obstacles to searching and tagging multimedia content. In this paper, we propose WalkieTagging, which provides a much more intuitive approach than previous ones. Like earlier video tagging services, WalkieTagging, as a voice-based annotation service, supports inserting detailed annotation data, including start time, duration, and tags, with little effort from users. To evaluate our method, we developed an Android-based WalkieTagging application and conducted a two-week user study. In experiments with a total of 46 people, we observed that participants found our system more convenient and useful than a touch-based one. Consequently, we found that voice-based annotation methods can provide users with more convenience and satisfaction than touch-based methods in mobile environments.

Emotion Recognition of Low Resource (Sindhi) Language Using Machine Learning

  • Ahmed, Tanveer;Memon, Sajjad Ali;Hussain, Saqib;Tanwani, Amer;Sadat, Ahmed
    • International Journal of Computer Science & Network Security
    • /
    • Vol. 21, No. 8
    • /
    • pp.369-376
    • /
    • 2021
  • One of the most active areas of research in affective computing and signal processing is emotion recognition. This paper proposes emotion recognition for a low-resource language, Sindhi. The work is unique in examining emotions in a language for which there is currently no publicly accessible dataset. It provides the academic community with a dataset named MAVDESS (Mehran Audio-Visual Database of Emotional Speech in Sindhi) for Sindhi, a significant language spoken mainly in Pakistan; with few exceptions, no generic data for such languages is available for machine learning. Furthermore, the emotions in MAVDESS are analyzed and annotated using line features such as pitch, volume, and base, with toolkits such as OpenSmile and Scikit-Learn and classification schemes such as LR, SVC, DT, and KNN, implemented in Python to train the models. The dataset will be made accessible via https://doi.org/10.5281/zenodo.5213073.

Using Syntax and Shallow Semantic Analysis for Vietnamese Question Generation

  • Phuoc Tran;Duy Khanh Nguyen;Tram Tran;Bay Vo
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • Vol. 17, No. 10
    • /
    • pp.2718-2731
    • /
    • 2023
  • This paper presents a method of using syntax and shallow semantic analysis for Vietnamese question generation (QG). Specifically, our proposed technique investigates both the syntactic and the shallow semantic structure of each sentence. The main goal of our method is to generate questions from a single sentence. The generated questions are factoid questions, which require short, fact-based answers. In general, syntax-based analysis is one of the most popular approaches in the QG field, but it requires expert linguistic knowledge as well as a deep understanding of the syntax rules of Vietnamese. It is thus considered a high-cost and inefficient solution, because significant human effort is needed to produce qualified syntax rules. To deal with this problem, we collected Vietnamese syntax rules from a Vietnamese language textbook. We also used several natural language processing (NLP) techniques to analyze Vietnamese shallow syntax and semantics for the QG task: sentence segmentation, word segmentation, part-of-speech tagging, chunking, dependency parsing, and named entity recognition. We used human evaluation to assess the credibility of our model; that is, we manually wrote questions from the corpus and compared them with the automatically generated questions. The empirical evidence shows that our proposed technique performs well, in that the generated questions are very similar to those created by humans.

Creation of a Voice Recognition-Based English Aided Learning Platform

  • Hui Xu
    • Journal of Information Processing Systems
    • /
    • Vol. 20, No. 4
    • /
    • pp.491-500
    • /
    • 2024
  • To address the poor quality of information input in online spoken-English teaching, the study creates an English teaching assistance model based on a recognition algorithm called dynamic time warping (DTW) and on automated voice recognition technology. To improve the algorithm's efficiency, the study modifies the speech signal's time-domain properties during the pre-processing stage and improves the algorithm's performance in terms of computational effort and storage space. Finally, a simulation experiment is used to evaluate the model's efficacy. The findings show that the revised DTW model achieves recognition rates above 95% for all phonetic symbols and performs best on voiced consonants, with rates of 98.5%, 98.8%, and 98.7% across the three tests, respectively. The enhanced DTW voice recognition model is also more efficient and requires less time for training and testing. In the KS value analysis, the DTW model's KS value of 0.63 is the highest among the models analyzed. Among the comparative models, the model also presents the lowest curve position for both test functions. This shows that the upgraded DTW model has superior voice recognition capabilities, which could significantly improve online English education and lead to better teaching outcomes.
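
For readers unfamiliar with the technique named in this abstract, the following is a minimal sketch of textbook dynamic time warping used for template matching. It is not the paper's modified algorithm, whose pre-processing changes the abstract does not detail. The function names `dtw_distance` and `recognize`, the one-dimensional feature sequences, and the absolute-difference local cost are all assumptions made for illustration.

```python
from math import inf


def dtw_distance(a, b):
    """Textbook DTW distance between two 1-D numeric sequences.

    Assumption: local cost is the absolute difference between samples.
    Time and memory are both O(len(a) * len(b)).
    """
    n, m = len(a), len(b)
    # cost[i][j] holds the minimal cumulative cost of aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three allowed predecessor alignments.
            cost[i][j] = local + min(
                cost[i - 1][j],      # a advances, b stays
                cost[i][j - 1],      # b advances, a stays
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


def recognize(sample, templates):
    """Return the label of the template with the smallest DTW distance.

    `templates` is a hypothetical dict mapping a label to a reference sequence.
    """
    return min(templates, key=lambda label: dtw_distance(sample, templates[label]))
```

Because the warping path may repeat samples, a sequence and a time-stretched copy of it (for example `[1, 2, 3]` and `[1, 2, 2, 3]`) align at zero cost, which is why DTW suits speech produced at varying speeds.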