• Title/Summary/Keyword: Vocalization

Search Result 115, Processing Time 0.017 seconds

Comparison of Korean Real-time Text-to-Speech Technology Based on Deep Learning (딥러닝 기반 한국어 실시간 TTS 기술 비교)

  • Kwon, Chul Hong
    • The Journal of the Convergence on Culture Technology
    • /
    • v.7 no.1
    • /
    • pp.640-645
    • /
    • 2021
  • The deep learning based end-to-end TTS system consists of Text2Mel module that generates spectrogram from text, and vocoder module that synthesizes speech signals from spectrogram. Recently, by applying deep learning technology to the TTS system the intelligibility and naturalness of the synthesized speech is as improved as human vocalization. However, it has the disadvantage that the inference speed for synthesizing speech is very slow compared to the conventional method. The inference speed can be improved by applying the non-autoregressive method which can generate speech samples in parallel independent of previously generated samples. In this paper, we introduce FastSpeech, FastSpeech 2, and FastPitch as Text2Mel technology, and Parallel WaveGAN, Multi-band MelGAN, and WaveGlow as vocoder technology applying non-autoregressive method. And we implement them to verify whether it can be processed in real time. Experimental results show that by the obtained RTF all the presented methods are sufficiently capable of real-time processing. And it can be seen that the size of the learned model is about tens to hundreds of megabytes except WaveGlow, and it can be applied to the embedded environment where the memory is limited.

Deriving the Key Factors of Commentaries in Classical Music Concerts with Commentaries Using DHP (DHP를 이용한 해설이 있는 클래식공연의 해설 핵심요인 도출)

  • Oh, Dae-young;Han, Joo-hee
    • Korean Association of Arts Management
    • /
    • no.53
    • /
    • pp.179-206
    • /
    • 2020
  • The objective of this study is to derive key factors of commentaries in classical music concerts with commentaries and to measure the importance of each attribute, thereby presenting the characteristics of commentaries and commentators as well as suggestions to concert planners in terms of composition. In addition, by developing a scale that can measure classical commentary, a questionnaire is provided so that concert planners can plan programs that gathered the opinions of the audience. To this end, the first, second and third rounds of the Delphi survey and AHP were applied to concert planners, musicians (performing artists), and academic experts. A questionnaire was developed based on the results, and the survey was verified by conducting a pilot test with the general audience. The results of this study can be summarized as follows: First, the purpose of commentaries must be focused on arousing the audience's interest rather than on delivering information. Second, commentators must meet the auditory satisfaction of the audience with a good voice and clear pronunciation based on impeccable vocalization. Third, commentaries must be concise, with the commentaries appearing at least five times per concert, each of which must not exceed five minutes. Fourth, as a result of the pilot test, this study derived 14 items to rate commentary skills across four factors: four items for "arousing interest," three items for "delivering information," three items for "favorability," and four items for "expressiveness." Based on these results, the authors of study presented effective implications for concert planning.

The Development of Stuttering Therapy Device and Clinical Application Cases Using Breathing Control Prolonged Speech Method (호흡 조절식 연장기법을 이용한 말더듬치료 장치개발 및 적용사례 연구)

  • Rhee, Kun Min;Kwon, Sang Nam;Jung, Hyo Jae
    • 재활복지
    • /
    • v.15 no.2
    • /
    • pp.147-173
    • /
    • 2011
  • The purpose of this study was to develop a stuttering therapy device to aid in stutter therapy. The research method used for this study was as follows: First, the stuttering therapy device based on analysis of the prolonged speech method used at home and abroad was designed to achieve the goal of research. Second, the stuttering therapy device was to be developed to maintain a vocalization state, to use bio-feedback visualization, to have enough inspiration, to use Korean language in this device, and to use transfer and maintenance training in daily life. Third, the stuttering therapy device effectiveness was to be verified through use in clinical cases. The results of subjects receiving speech therapy and using the breathing control prolonged speech device and SI(stuttering Interview) evaluation programs for 3 months were as follows: For subject A, the stuttered word rate was reduced from 3.20 SW/M to 0.5 SW/M. For subject B, the stuttered word rate was reduced from 1.90 SW/M to 0.75 SW/M. For subject C, the stuttered word rate was reduced from 3.37 SW/M to 0.34 SW/M. For Subject D, the stuttered word rate was reduced from 0.51 SW/M to 0 SW/M. Follow-up evaluations verified the effectiveness of how the stuttering therapy device can reduce subjects' SW/M.

The actual aspects of North Korea's 1950s Changgeuk through the Chunhyangjeon in the film Moranbong(1958) and the album Corée Moranbong(1960) (영화 <모란봉>(1958)과 음반 (1960) 수록 <춘향전>을 통해 본 1950년대 북한 창극의 실제적 양상)

  • Song, Mi-Kyoung
    • (The) Research of the performance art and culture
    • /
    • no.43
    • /
    • pp.5-46
    • /
    • 2021
  • The film Moranbong is the product of a trip to North Korea in 1958, when Armangati, Chris Marker, Claude Lantzmann, Francis Lemarck and Jean-Claude Bonardo left at the invitation of Joseon Film. However, for political reasons, the film was not immediately released, and it was not until 2010 that it was rediscovered and received attention. The movie consists of the narratives of Young-ran and Dong-il, set in the Korean War, that are folded into the narratives of Chunhyang and Mongryong in the classic Chunhyangjeon of Joseon. At this time, Joseon's classics are reproduced in the form of the drama Chunhyangjeon, which shares the time zone with the two main characters, and the two narratives are covered in a total of six scenes. There are two layers of middle-story frames in the movie, and if the same narrative is set in North Korea in the 1950s, there is an epic produced by the producers and actors of the Changgeuk Chunhyangjeon and the Changgeuk Chunhyangjeon as a complete work. In the outermost frame of the movie, Dong-il is the main character, but in the inner double frame, Young-ran, who is an actor growing up with the Changgeuk Chunhyangjeon and a character in the Changgeuk Chunhyangjeon, is the center. The following three OST albums are Corée Moranbong released in France in 1960, Musique de corée released in 1970, and 朝鮮の伝統音樂-唱劇 「春香伝」と伝統樂器- released in 1968 in Japan. While Corée Moranbong consists only of the music from the film Moranbong, the two subsequent albums included additional songs collected and recorded by Pyongyang National Broadcasting System. However, there is no information about the movie Moranbong on the album released in Japan. Under the circumstances, it is highly likely that the author of the record label or music commentary has not confirmed the existence of the movie Moranbong, and may have intentionally excluded related contents due to the background of the film's ban on its release. The results of analyzing the detailed scenes of the Changgeuk Chunhyangjeon, Farewell Song, Sipjang-ga, Chundangsigwa, Bakseokti and Prison Song in the movie Moranbong or OST album in the 1950s are as follows. First, the process of establishing the North Korean Changgeuk Chunhyangjeon in the 1950s was confirmed. The play, compiled in 1955 through the Joseon Changgeuk Collection, was settled in the form of a Changgeuk that can be performed in the late 1950s by the Changgeuk Chunhyangjeon between 1956 and 1958. Since the 1960s, Chunhyangjeon has no longer been performed as a traditional pansori-style Changgeuk, so the film Moranbong and the album Corée moranbong are almost the last records to capture the Changgeuk Chunhyangjeon and its music. Second, we confirmed the responses of the actors to the controversy over Takseong in the North Korean creative world in the 1950s. Until 1959, there was a voice of criticism surrounding Takseong and a voice of advocacy that it was also a national characteristic. Shin Woo-sun, who almost eliminated Takseong with clear and high-pitched phrases, air man who changed according to the situation, who chose Takseong but did not actively remove Takseong, Lim So-hyang, who tried to maintain his own tone while accepting some of modern vocalization. Although Cho Sang-sun and Lim So-hyang were also guaranteed roles to continue their voices, the selection/exclusion patterns in the movie Moranbong were linked to the Takseong removal guidelines required by North Korean musicians in the name of Dang and People in the 1950s. Second, Changgeuk actors' response to the controversy over the turbidity of the North Korean Changgeuk community in the 1950s was confirmed. Until 1959, there were voices of criticism and support surrounding Taksung in North Korea. Shin Woo-sun, who showed consistent performance in removing turbidity with clear, high-pitched vocal sounds, Gong Gi-nam, who did not actively remove turbidity depending on the situation, Cho Sang-sun, who accepted some of the vocalization required by the party, while maintaining his original tone. On the other hand, Cho Sang-seon and Lim So-hyang were guaranteed roles to continue their sounds, but the selection/exclusion patterns of Moranbong was independently linked to the guidelines for removing turbidity that the Gugak musicians who crossed to North Korea had been asked for.

Anura Call Monitoring Data Collection and Quality Management through Citizen Participation (시민참여형 무미목 양서류 음성신호 수집 및 품질관리 방안)

  • Kyeong-Tae Kim;Hyun-Jung Lee;Won-Kyong Song
    • Korean Journal of Environment and Ecology
    • /
    • v.38 no.3
    • /
    • pp.230-245
    • /
    • 2024
  • Amphibians, sensitive to external environmental changes, serve as bioindicator species for assessing alterations or disturbances in local ecosystems. It is known that one-third of amphibian species within the order Anura are at risk of extinction due to anthropogenic threats such as habitat destruction and fragmentation caused by urbanization. To develop effective protection and conservation strategies for anuran amphibians, species surveys that account for population characteristics are essential. This study aimed to investigate the potential for citizen participation in ecological monitoring using the mating calls of anura species. We also proposed suitable quality control measures to mitigate errors and biases, ensuring the extraction of reliable species occurrence data. The Citizen Science project was carried out nationwide from April 1 to August 31, 2022, targeting 12 species of anura amphibians in Korea. Citizens voluntarily participated in voice signal monitoring, where they listened to anura species' mating calls and recorded them using a mobile application. Additionally, we established a quality control process to extract reliable species occurrence data, categorizing errors and biases from citizen-collected data into three levels: omission, commission, and incorrect identification. A total of 6,808 observations were collected during the citizen participation in anura species vocalization monitoring. Through the quality control process, errors and biases were identified in 1,944 (28.55%) of the 6,808 data. The most common type of error was omission, accounting for 922 cases (47.43%), followed by incorrect identification with 540 cases (27.78%), and commission with 482 cases (24.79%). During the Citizen Science project, we successfully recorded the mating calls of 10 out of the 12 anuran amphibian species in Korea, excluding the Asian toads (Bufo gargarizans Cantor), Korean brown frog (Rana coreana). Difficulties in collecting mating calls were primarily attributed to challenges in observing due to population decline or discrepancies between the breeding season of non-emergent individuals and the timing of the citizen science project. This study represents the first investigation of distribution status and species emergence data collection through mating calls of anura species in Korea based on citizen participation. It can serve as a foundation for designing future bioacoustic monitoring that incorporates citizen science and quality control measures for citizen science data.