• Title/Summary/Keyword: speech analysis

A Study of Emotional Variation Tendency by Movie Genre Based on Speech Signal Analysis (음성신호 분석 기반의 영화 장르별 감정변화 특성 연구)

  • Yoo, Hwang-Jun;Han, Sang-Hyo;Kim, Bong-Hyun;Ka, Min-Kyoung;Cho, Dong-Uk
    • Proceedings of the KAIS Fall Conference / 2011.12a / pp.295-298 / 2011
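  • The most remarkable of human abilities is the capacity to acquire language and use it to communicate with one another. Every language has not only characteristics unique to it but also universal properties shared across languages. Beyond this, the human voice layered over language is an important clue to understanding the other party's psychological state during communication. In particular, language can be used only once it has been acquired, it is influenced by the environment in which it is acquired, and a person's voice, intonation, and so on change according to that environment. In this paper, we therefore applied speech signal analysis techniques to study how the visual and auditory factors of watching movies of different genres affect the voice. To this end, we conducted an experiment that analyzes emotional change by measuring changes in vocal fold vibration and in the magnitude of speech energy after subjects watched movies of each genre.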

Animation OST Musical Element Analysis based on A Narrative Process Classification Model (내러티브 프로세스 분류 모델 기반 애니메이션 OST의 음악적 요소 분석)

  • Jang, Soeun;Sung, Bongsun;Lee, Jang Hoon;Kim, Jae Ho
    • Journal of Korea Multimedia Society / v.17 no.10 / pp.1239-1252 / 2014
  • The OST (Original Sound Track) of a film plays a vital role in increasing consensus with and concentration on the storyline. Four selected animations are classified into 17 Narrative Processes (NPs) using the NP Classification Model [1]. For each NP that has an OST, the authors investigated six kinds of objective musical elements of the OST: sound (speech, music, effect), tonality, tempo, range, intensity, and instrumentation. Across the NPs with OSTs, 33.3% of the musical elements are found to be common, and among these, 71.9% of the properties of the musical elements are common. This research is meaningful as the first to show that common properties of objective musical elements exist in each NP and its corresponding OST.

Recent update on reading disability (dyslexia) focused on neurobiology

  • Kim, Sung Koo
    • Clinical and Experimental Pediatrics / v.64 no.10 / pp.497-503 / 2021
  • Reading disability (dyslexia) refers to an unexpected difficulty with reading for an individual who has the intelligence to be a much better reader. Dyslexia is most commonly caused by a difficulty in phonological processing (the appreciation of the individual sounds of spoken language), which affects the ability of an individual to speak, read, and spell. In this paper, I describe reading disabilities by focusing on their underlying neurobiological mechanisms. Neurobiological studies using functional brain imaging have uncovered the reading pathways, brain regions involved in reading, and neurobiological abnormalities of dyslexia. The reading pathway is in the order of visual analysis, letter recognition, word recognition, meaning (semantics), phonological processing, and speech production. According to functional neuroimaging studies, the important areas of the brain related to reading include the inferior frontal cortex (Broca's area), the midtemporal lobe region, the inferior parieto-temporal area, and the left occipitotemporal region (visual word form area). Interventions for dyslexia can affect reading ability by causing changes in brain function and structure. An accurate diagnosis and timely specialized intervention are important in children with dyslexia. In cases in which national infant development screening tests have been conducted, as in Korea, if language developmental delay and early predictors of dyslexia are detected, careful observation of the progression to dyslexia and early intervention should be made.

Emotion Recognition of Low Resource (Sindhi) Language Using Machine Learning

  • Ahmed, Tanveer;Memon, Sajjad Ali;Hussain, Saqib;Tanwani, Amer;Sadat, Ahmed
    • International Journal of Computer Science & Network Security / v.21 no.8 / pp.369-376 / 2021
  • One of the most active areas of research in affective computing and signal processing is emotion recognition. This paper proposes emotion recognition for a low-resource language, Sindhi. The work is unique in that it examines emotions in a language for which there is currently no publicly accessible dataset. It provides the academic community with a dataset named MAVDESS (Mehran Audio-Visual Database of Emotional Speech in Sindhi). Sindhi is a significant language mainly spoken in Pakistan, yet, with few exceptions, no generic data for such languages is accessible for machine learning. Furthermore, the various emotions of Sindhi speech in MAVDESS are analyzed and annotated using line features such as pitch, volume, and base, with toolkits such as OpenSmile and Scikit-Learn and classification schemes such as LR, SVC, DT, and KNN, implemented in Python to train the models. The dataset can be accessed via https://doi.org/10.5281/zenodo.5213073.

Cybercrime as a Discourse of Interpretations: the Semantics of Speech Silence vs Psychological Motivation for Actual Trouble

  • Matveev, Vitaliy;Eduardivna, Nykytchenko Olena;Stefanova, Nataliia;Khrypko, Svitlana;Ishchuk, Alla;Pasko, Katerina
    • International Journal of Computer Science & Network Security / v.21 no.8 / pp.203-211 / 2021
  • The article studies the discourse on and the legal uncertainty of the popular and generally understood concept of cybercrime. The authors review doctrinal approaches to the definitions of cybercrime, cyberspace, and computer crime. International legal acts and the legislation of Ukraine on fighting cybercrime are analyzed. The authors conclude that national legislation needs to be improved and international cooperation established in order to develop tools for countering cybercrime and minimizing its negative outcomes. The phenomenon of nicknames is studied as a semantic source that potentially generates a number of threats and troubles: the crisis of traditional anthroponymic culture, identity crisis, hidden sociality, indefinite institutionalization, incognito style, and a range of manifestations of loneliness, from voluntary solitude to traumatic isolation and forced detachment. The core idea is that the phenomenon of incognito and the hidden name (nickname and other alternatives) is the motivational stimulus for information trouble or crime.

Development of a Voice Command-Based Mobile Application for Improving Document Editing Accessibility (문서 편집 접근성 향상을 위한 음성 명령 기반 모바일 어플리케이션 개발)

  • Park, Joo Hyun;Park, Seah;Lee, Muneui;Lim, Soon-Bum
    • Journal of Korea Multimedia Society / v.21 no.11 / pp.1342-1352 / 2018
  • Voice command systems are an important means of ensuring accessibility to digital devices in situations where both hands are not free and for people with disabilities. Interest in services using speech recognition technology has been increasing. In this study, we developed a mobile writing application using voice recognition and voice command technology that helps people create and edit documents easily. The application is characterized by minimal touching of the screen and by the writing of memos by voice. We systematically designed a mode that distinguishes voice writing from voice commands, so that writing and command execution can be used simultaneously in one voice interface. It provides a shortcut function for controlling the cursor by voice, which makes document editing as convenient as possible. This allows people to access writing applications conveniently by voice under both physical and environmental constraints.

Information Technologies in The Process of Teaching Foreign Languages in Higher Educational Institutions

  • Fabian, Myroslava;Shavlovska, Tetiana;Shpenyk, Silviia;Khanykina, Nataliia;Tyshchenko, Oleh;Lebedynets, Hanna
    • International Journal of Computer Science & Network Security / v.21 no.3 / pp.76-82 / 2021
  • An anthological analysis of known literature and historical sources is carried out in the work. It was found that the development of foreign language training of future professionals was influenced by a number of factors: socio-economic (focus on the needs of the labor market, integration into the international space, scientific and technological progress) and educational (updating legal documents in the field of education, standardization of educational content, development of methods of professional development of a specialist). The historical period is analyzed and the following stages are determined: ideological (realization of the ideological imperative in language and professional training of future specialists); educational-methodical (preparation according to unified curricula, reading and translation as the leading type of speech activity); integration (integration of foreign language teaching and multicultural education); and methodological (use of traditional verbal methods, standardized textbooks). Thus, the research conducted in the article indicates the periods (stages) of formation, functioning, and development of foreign language education.

Human Laughter Generation using Hybrid Generative Models

  • Mansouri, Nadia;Lachiri, Zied
    • KSII Transactions on Internet and Information Systems (TIIS) / v.15 no.5 / pp.1590-1609 / 2021
  • Laughter is one of the most important nonverbal sounds that humans generate. It is a means of expressing emotions. The acoustic and contextual features of this specific sound are different from those of speech, and many difficulties arise in modeling them. In this work, we propose an audio laughter generation system based on unsupervised generative models: the autoencoder (AE) and its variants. The procedure combines three main sub-processes: (1) the analysis, which consists of extracting the log magnitude spectrogram from the laughter database, (2) the training of the generative models, and (3) the synthesis stage, which involves an intermediate mechanism, the vocoder. To improve the synthesis quality, we suggest hybrid models (LSTM-VAE, GRU-VAE and CNN-VAE) that combine the representation learning capacity of the variational autoencoder (VAE) with the temporal modelling ability of a long short-term memory RNN (LSTM) and the ability of the CNN to learn invariant features. To assess the performance of the proposed audio laughter generation process, an objective evaluation (RMSE) and a perceptual audio quality test (listening test) were conducted. According to these evaluation metrics, the GRU-VAE outperforms the other VAE models.
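The analysis step (1) and the objective RMSE evaluation named in this abstract can be illustrated with a minimal, standard-library-only sketch. It is not the authors' implementation: the frame length, hop size, Hann window, naive DFT, the toy sine input, and the function names `log_magnitude_spectrogram` and `rmse` are all assumptions chosen for illustration.

```python
import cmath
import math


def log_magnitude_spectrogram(signal, frame_len=64, hop=32, eps=1e-10):
    """Return a list of frames; each frame holds log-magnitudes for bins 0..frame_len//2.

    frame_len, hop, and eps are illustrative defaults, not values from the paper.
    """
    # Symmetric Hann window to reduce spectral leakage at the frame edges.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / (frame_len - 1))
              for n in range(frame_len)]
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = [signal[start + n] * window[n] for n in range(frame_len)]
        bins = []
        for k in range(frame_len // 2 + 1):
            # Naive DFT of one bin; a real system would use an FFT library.
            coeff = sum(frame[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                        for n in range(frame_len))
            # eps keeps the logarithm finite for silent bins.
            bins.append(math.log(abs(coeff) + eps))
        frames.append(bins)
    return frames


def rmse(spec_a, spec_b):
    """Root-mean-square error between two spectrograms of equal shape."""
    diffs = [(a - b) ** 2
             for frame_a, frame_b in zip(spec_a, spec_b)
             for a, b in zip(frame_a, frame_b)]
    return math.sqrt(sum(diffs) / len(diffs))


# Toy input standing in for a laughter recording: a 1 kHz tone sampled at 8 kHz.
sample_rate = 8000
tone = [math.sin(2 * math.pi * 1000 * t / sample_rate) for t in range(256)]
spec = log_magnitude_spectrogram(tone)
peak_bin = max(range(len(spec[0])), key=lambda k: spec[0][k])
```

With these assumed settings each bin spans 125 Hz, so the 1 kHz tone peaks in bin 8. Comparing a generated spectrogram against a reference with `rmse` mirrors the kind of objective measure the abstract mentions, although the exact quantities the authors compared are not stated here.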

Relationship between depressive experience and unmet dental needs in the elderly (노인의 우울 경험과 미충족 치과의료 경험의 관계)

  • Kim, Sun-Mi;Jung, Mi-Hee;Ahn, Eunsuk
    • Journal of Korean Academy of Dental Administration / v.8 no.1 / pp.30-36 / 2020
  • This study was conducted on 1,725 elderly people over 65 years of age using 2018 data from the 7th Korea National Health and Nutrition Examination Survey (KNHANES). The analysis considers the general characteristics of the elderly and their oral health status (chewing discomfort, speech problems, etc.) to confirm the relationship between the unmet dental care experience of the elderly and their depressive experience. The results showed that depressive experiences among the elderly resulted in unmet dental care experiences, and that income level and complaints of chewing discomfort also had an effect. Based on these results, oral health policies should be developed to reduce unmet dental care experiences by considering the socio-economic level and depressive experiences of the elderly. Such policy development is expected to improve not only the oral health of the elderly but also their quality of life through health promotion.

The Redefinition of Support System for Lifelong Education for the Developmental Disabled Based on University: Leading the Establishment of an Integrated Composition System Between Cooperation with Local Related Organizations and Fostering Qualifications for Professionals Through Connection with Curriculum Beyond the Level of Use of Physical Space

  • Kim, Young-Jun;Kim, Wha-Soo;Rhee, Kun-Yong
    • International Journal of Advanced Culture Technology / v.9 no.4 / pp.52-60 / 2021
  • This study aims to redefine the university-based lifelong education support system for the developmentally disabled. The research method combined literature analysis with expert meetings. The study first presents a composition system that identifies the problems of, and solutions for, university-based lifelong education for the developmentally disabled. Through this, it suggests that universities, together with convergence fields, can form an academic foundation for establishing a lifelong education support system for the developmentally disabled. In addition, a related structural model is presented, along with the point that universities can develop a lifelong education curriculum for the developmentally disabled according to the school's foundation. A composition system is also suggested in which universities develop such a curriculum to lead the cooperation of local related organizations, such as welfare centers for the disabled and lifelong education centers for the developmentally disabled. The study finds that leadership in the university-based lifelong education support system for the developmentally disabled can contribute to fostering the qualifications of professional manpower and to establishing cooperation with local related organizations within an integrated composition system.