• Title/Abstract/Keywords: vocal process

75 search results (processing time: 0.021 s)

음성 하모닉스 스펙트럼의 피크-피팅을 이용한 피치검출에 관한 연구 (A Study on the Pitch Detection of Speech Harmonics by the Peak-Fitting)

  • 김종국;조왕래;배명진
    • 음성과학
    • /
    • Vol. 10, No. 2
    • /
    • pp.85-95
    • /
    • 2003
  • In speech signal processing, accurate pitch detection is very important for speech recognition, synthesis, and analysis. If the pitch of a speech signal is detected accurately, it can be used in analysis to obtain the vocal tract parameters properly, in synthesis to change or maintain naturalness and intelligibility easily, and in recognition to remove speaker personality for speaker independence. In this paper, we propose a new pitch detection algorithm. First, in the time domain, positive center clipping is applied using the slope of the speech signal in order to emphasize the pitch period with a glottal component from which the vocal tract characteristic has been removed. Next, a rough formant envelope is computed by peak-fitting the spectrum of the original speech signal in the frequency domain. A smoothed formant envelope is then obtained from the rough envelope by linear interpolation. The flattened harmonics waveform is obtained as the algebraic difference between the spectrum of the original speech signal and the smoothed formant envelope. Applying the inverse fast Fourier transform (IFFT) to this flattened harmonics spectrum finally yields a residual signal from which the vocal tract component has been removed. The performance was compared with the LPC, cepstrum, and ACF methods. The proposed algorithm improved the accuracy of pitch detection and reduced the gross error rate in voiced speech regions and in transition regions between phonemes.
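
The frequency-domain steps in the abstract above (peak-fitted envelope, linear interpolation, subtraction, IFFT) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function name, the band width used to pick peaks (`band_hz`), the use of the log-magnitude spectrum (which makes the result cepstrum-like), and the search range `fmin` to `fmax` are all assumptions made for the example, and the time-domain center clipping step is omitted.

```python
import numpy as np

def flattened_spectrum_pitch(frame, fs, fmin=60.0, fmax=400.0,
                             n_fft=1024, band_hz=500.0):
    """Estimate pitch (Hz) of one voiced frame from a flattened log spectrum."""
    windowed = frame * np.hanning(len(frame))
    log_mag = np.log(np.abs(np.fft.rfft(windowed, n_fft)) + 1e-9)

    # Rough envelope: keep the largest spectral peak in each band
    # (band_hz is an assumed value, chosen wider than fmax).
    band_bins = max(2, int(round(band_hz * n_fft / fs)))
    starts = np.arange(0, len(log_mag), band_bins)
    knots = np.array([s + np.argmax(log_mag[s:s + band_bins]) for s in starts])

    # Smoothed envelope: linear interpolation between the fitted peaks.
    envelope = np.interp(np.arange(len(log_mag)), knots, log_mag[knots])

    # Flattened harmonics: spectrum minus envelope, then inverse FFT.
    flattened = log_mag - envelope
    residual = np.fft.irfft(flattened, n_fft)

    # The strongest peak in the plausible lag range marks the pitch period.
    lo, hi = int(fs / fmax), int(fs / fmin)
    lag = lo + int(np.argmax(residual[lo:hi + 1]))
    return fs / lag
```

Calling `flattened_spectrum_pitch(frame, fs)` on a voiced frame of a few tens of milliseconds returns an estimate whose resolution is limited to whole-sample lags, so a real system would usually add interpolation around the peak and a voiced/unvoiced decision.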

Improvement of Vocal Detection Accuracy Using Convolutional Neural Networks

  • You, Shingchern D.;Liu, Chien-Hung;Lin, Jia-Wei
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • Vol. 15, No. 2
    • /
    • pp.729-748
    • /
    • 2021
  • Vocal detection is one of the fundamental steps in music information retrieval. Typically, the detection process consists of feature extraction and classification steps. Recently, neural networks have been shown to outperform traditional classifiers. In this paper, we report our study on how to improve detection accuracy further by carefully choosing the parameters of the deep network model. Through experiments, we conclude that a feature-classifier model is still better than an end-to-end model. The recommended model uses a spectrogram as the input plane, and the classifier is an 18-layer convolutional neural network (CNN). With this arrangement, the proposed model improves the accuracy on the Jamendo dataset from the 91.8% reported in the existing literature to 94.1%. Because the baseline accuracy on this dataset already exceeds 90%, the improvement of 2.3 percentage points is difficult to achieve and valuable. If even higher accuracy is required, ensemble learning may be used. The recommended setting is a majority vote among seven of the proposed models. Doing so increases the accuracy by about 1.1% on the Jamendo dataset.
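
The majority vote mentioned at the end of the abstract above can be illustrated with a short sketch. The function name and the prediction values are made up for the example; the abstract does not say how the seven models' outputs are represented, so binary per-frame labels are assumed here.

```python
import numpy as np

def majority_vote(model_predictions):
    """Combine binary per-frame labels (models x frames) by simple majority."""
    votes = np.asarray(model_predictions, dtype=int)
    n_models = votes.shape[0]
    # A frame is labeled vocal (1) when more than half of the models say so.
    return (2 * votes.sum(axis=0) > n_models).astype(int)

# Hypothetical outputs of seven models on four audio frames (1 = vocal).
predictions = [
    [1, 0, 1, 0],
    [1, 0, 0, 0],
    [1, 1, 1, 0],
    [0, 0, 1, 0],
    [1, 0, 0, 1],
    [1, 0, 1, 0],
    [0, 1, 1, 0],
]
print(majority_vote(predictions))  # [1 0 1 0]
```

An odd number of voters, such as the seven used here, avoids ties for binary labels.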

A Study on Comparison of Pronunciation Accuracy of Soprano Singers

  • Song, Uk-Jin;Park, Hyungwoo;Bae, Myung-Jin
    • International journal of advanced smart convergence
    • /
    • Vol. 6, No. 2
    • /
    • pp.59-64
    • /
    • 2017
  • There are three types of female singing voice, transliterated as soprano, mezzo-soprano, and contralto. Among them, the soprano has the highest vocal range. Since the voice is generated through the human vocal tract, as described by the voice generation model, it is greatly influenced by the vocal tract. The structure of the vocal organs differs from person to person, and the formant characteristics of vocalization differ accordingly. A formant characteristic is one in which a specific frequency band appears distinctly because of resonance occurring in the vocal tract during the vocal process. Formant characteristics include the personality that arises in the throat, jaw, lips, and teeth, as well as the phonological properties of phonemes. The first formant is attributed to the throat, the second to the jaw, and the third and fourth to resonance in the lips and teeth. Pronunciation is therefore influenced not only by phonological information but also by the jaw, lips, and teeth. When the mouth opening is small or the jaw is stiff, pronunciation becomes unclear. Accordingly, the more accurate the pronunciation, the more clearly the formant characteristics appear in the grammar spectrum. However, many soprano singers cannot open their mouths fully because their jaws, lips, teeth, and facial muscles are held rigid to maintain high tones when singing, which makes the pronunciation unclear and blurs the formant characteristics. In this paper, to examine the pronunciation accuracy of soprano singers, five Korean soprano singers (A, B, C, D, and E) were selected as the experimental group; their grammar spectra were analyzed and a MOS test for pronunciation recognition was conducted. As a result, soprano singer B showed clearly recognizable formants from F1 to F5 and obtained the highest recognition score in the MOS test, 4.6 points. Soprano singers A, C, and D showed formants from F1 to F3, but formants above 2 kHz were difficult to find. Finally, for soprano singer E the formants were difficult to find overall, and the MOS test showed the lowest recognition score, 2.1 points. Therefore, we confirmed that soprano singer B, who exhibits the most distinct formant characteristics in the grammar spectrum, has the best pronunciation accuracy.

피치 검출을 위한 스펙트럼 평탄화 기법 (Flattening Techniques for Pitch Detection)

  • 김종국;조왕래;배명진
    • 대한전자공학회:학술대회논문집
    • /
    • Proceedings of the 2002 Summer Conference of the Institute of Electronics Engineers of Korea (4)
    • /
    • pp.381-384
    • /
    • 2002
  • In speech signal processing, accurate pitch detection is very important for speech recognition, synthesis, and analysis. However, detecting pitch from a speech signal is very difficult because of the effects of formants and transition amplitude. Therefore, in this paper, we propose a pitch detection method using spectrum flattening techniques. Spectrum flattening eliminates the effects of formants and transition amplitude. In the time domain, positive center clipping is applied in order to emphasize the pitch period with a glottal component from which the vocal tract characteristic has been removed. A rough formant envelope is then computed by peak-fitting the spectrum of the original speech signal in the frequency domain. As a result, the flattened harmonics waveform is obtained as the algebraic difference between the spectrum of the original speech signal and the smoothed formant envelope. Finally, we obtain a residual signal from which the vocal tract component has been removed. The performance was compared with the LPC, cepstrum, and ACF methods. The proposed algorithm improved the accuracy of pitch detection and reduced the gross error rate in voiced speech regions and in transition regions between phonemes.
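
The time-domain step in the abstract above, positive center clipping, can be sketched as below, paired here with a plain autocorrelation (ACF) peak search so that the effect on pitch estimation is visible. This is only an illustration under assumptions: the clipping threshold is a fixed fraction of the frame maximum (`ratio`), which is not taken from the paper, and the ACF search stands in for the authors' full method.

```python
import numpy as np

def positive_center_clip(frame, ratio=0.3):
    """Keep only the part of the waveform above a positive threshold."""
    threshold = ratio * np.max(frame)  # assumed rule: fraction of the peak
    return np.where(frame > threshold, frame - threshold, 0.0)

def acf_pitch(frame, fs, fmin=60.0, fmax=400.0):
    """Estimate pitch (Hz) from the autocorrelation of the clipped frame."""
    clipped = positive_center_clip(frame)
    # Keep non-negative lags of the full autocorrelation.
    acf = np.correlate(clipped, clipped, mode="full")[len(clipped) - 1:]
    # Search only lags that correspond to plausible pitch periods.
    lo, hi = int(fs / fmax), int(fs / fmin)
    lag = lo + int(np.argmax(acf[lo:hi + 1]))
    return fs / lag
```

Clipping removes most of the low-amplitude formant ripple between glottal pulses, so the autocorrelation is dominated by peaks at multiples of the pitch period.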

낮은 차원의 벡터 변환을 통한 음성 변환 (Voice conversion using low dimensional vector mapping)

  • 이기승;도원;윤대희
    • 전자공학회논문지S
    • /
    • Vol. 35S, No. 4
    • /
    • pp.118-127
    • /
    • 1998
  • In this paper, we propose a voice personality transformation method that makes one person's voice sound like another person's voice. To transform the voice personality, the vocal tract transfer function is used as the transformation parameter. Compared with previous methods, the proposed method can obtain high-quality transformed speech with low computational complexity. Conversion between the vocal tract transfer functions is implemented by a linear mapping based on soft clustering. In this process, the mean LPC cepstrum coefficients and the mean-removed LPC cepstrum modeled by a low-dimensional vector are used as transformation parameters. To evaluate the performance of the proposed method, mapping rules are generated from 61 Korean words uttered by two male speakers and one female speaker. These rules are then applied to 9 sentences uttered by the same speakers, and objective evaluation and subjective listening tests are performed on the transformed speech.
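
A linear mapping based on soft clustering, as named in the abstract above, is commonly written as a weighted sum of per-cluster linear transforms. The sketch below shows that general idea only; the weighting rule (a softmax over squared distances to cluster centroids), the parameter names, and the `sharpness` factor are assumptions for illustration and are not taken from the paper.

```python
import numpy as np

def soft_cluster_map(x, centroids, matrices, offsets, sharpness=1.0):
    """Map a source feature vector with a weighted sum of linear maps.

    x: (D,) source features, centroids: (K, D),
    matrices: (K, D, D), offsets: (K, D).
    """
    # Soft membership of x in each cluster (assumed: softmax of -distance^2).
    dist2 = np.sum((centroids - x) ** 2, axis=1)
    logits = -sharpness * dist2
    weights = np.exp(logits - logits.max())
    weights /= weights.sum()

    # Apply every cluster's linear map, then blend by membership.
    per_cluster = matrices @ x + offsets  # shape (K, D)
    return weights @ per_cluster
```

Because the weights vary smoothly with the input, the converted features change continuously across cluster boundaries, unlike a hard assignment to the nearest cluster.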

제 3 형 갑상연골성형술에 의한 변성발성장애의 치험 1례 (A Case of Mutational Dysphonia Treated with Type III Thyroplasty)

  • 최홍식;조창현;김광문
    • 대한후두음성언어의학회지
    • /
    • Vol. 6, No. 1
    • /
    • pp.43-45
    • /
    • 1995
  • Type III thyroplasty is a useful surgical procedure that reduces the tension of the vocal cords by removing a vertical strip of the anterior thyroid cartilage and resuturing the cut ends. One of the indications for this procedure is mutational dysphonia, a disorder in which men retain a childlike vocal pattern even after puberty. We have experienced one case of mutational dysphonia treated with type III thyroplasty. The patient had had a high-pitched voice since middle school age, and his preoperative fundamental frequency was 272.35 Hz. Two months after the surgery, the fundamental frequency was 129.58 Hz, and the patient was also subjectively satisfied with his lower-pitched voice.

뮤지컬 보컬 코치의 기능과 역할 (Functions and Roles of Musical Vocal Coach)

  • 임지현;민경원
    • 한국콘텐츠학회논문지
    • /
    • Vol. 18, No. 1
    • /
    • pp.642-650
    • /
    • 2018
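  • A musical is made by many specialists, from production, direction, writers, composers, and lyricists to choreographers, music directors, and actors. For a musical to succeed, people in all of these fields must exercise their creativity together. Unless it is a licensed musical, the basic framework of a work is first created by the writer, composer, and lyricist. These are called the Creative Team, while the director, choreographer, actors, staff, and others are called the Production Team. Together, the two teams are called the Creative Staff. Then, depending on the scale of the production, secondary creative staff join and actual rehearsals begin, forming the detailed composition of each team, such as the music director, stage designer, and sound designer. The music-related staff in a musical's creative team in fact originate with the music supervisor, who determines the musical color and genre of the work, and are then subdivided and specialized. In Korea, however, the composer or music director takes on all of these roles. This study analyzes cases of overseas musical production systems to establish the role and concept of the vocal coach according to the subdivision of the work process, and examines the role and necessity of the vocal coach in the Korean musical industry. It examines the similarities and differences between the roles and functions of general voice teachers and musical vocal coaches, and looks at cases of Korean musical vocal coaches through interviews. In addition, it considers the music creative team system of Korean musicals.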

Application Of Innovative Technologies In Higher Education Institutions Of Ukraine: Forms And Methods

  • Dovgal, Olena;Havrylova, Olena;Potryvaieva, Natalia;Tolstova, Natalia;Ostapchuk, Taras;Onyshchenko, Nataliіa
    • International Journal of Computer Science & Network Security
    • /
    • Vol. 21, No. 5
    • /
    • pp.43-47
    • /
    • 2021
  • This article considers and analyzes the concept of "innovation", which is treated not only as an object, something new, but also as a process: the process of introducing something new into life and, in this case, into the educational process. Innovative educational technologies are varied and plentiful. The article covers the most commonly used of them, including the use of ICT, game techniques, the portfolio method, personality-oriented technologies, information support of the learning process, and educational and health-saving technologies.

보컬 녹음에 필요한 이펙트의 개념과 사용법에 관한 제언 - Reverb를 중심으로 - (Suggestions on the Concept and Usage of Effects Needed for Vocal Recording -With focus on reverb-)

  • 조태선;최원준
    • 한국산학기술학회논문지
    • /
    • Vol. 19, No. 2
    • /
    • pp.380-386
    • /
    • 2018
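  • One of the most difficult parts to resolve on the technical side of mixing is the vocal (voice). Unlike instruments, vocals differ so much in tone, or color, from singer to singer that it is hard to apply common values, and several effects must be blended appropriately, which has made this very demanding work. This paper addresses the concept, current state, and usage of reverb, the most representative vocal effect, and offers suggestions for the effective use of vocal reverb using Wave Renaissance Reverb, one of the effects most commonly used by students. The key issue in mixing Korean popular songs is how the vocalist's voice is shaped. Because a sense of space in sound makes music more beautiful, the role of reverb as a vocal effect is critical. Computer technology has made music easier to produce, but it has also had the side effect of lowering individual technical skill, for example by encouraging reliance on fixed presets. Musicians' study of reverb through more meticulous effort will ultimately lead to good music.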

1930년대 조선성악연구회(朝鮮聲樂硏究會)의 창극적 상상력과 식민성 (Ch'anggŭk Imagination and Coloniality of Chosŏn Sŏngak Yŏn'guhoe in the 1930s)

  • 김향
    • 공연문화연구
    • /
    • No. 39
    • /
    • pp.357-392
    • /
    • 2019
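  • This paper re-examines the formation of ch'anggŭk in the 1930s, focusing on ch'anggŭk gramophone records and the ch'anggŭk repertoire of the Chosŏn Sŏngak Yŏn'guhoe (朝鮮聲樂硏究會). It examines the concrete activities of the Chosŏn Sŏngak Yŏn'guhoe, which was at the center of ch'anggŭk formation in the 1930s, and discusses their significance and limitations. The 'ipch'ech'ang' (literally, 'three-dimensional singing') and 'narrator' roles realized on ch'anggŭk gramophone records are viewed as the realization of a ch'anggŭk imagination through which members of the society came to recognize a 'dramatic space and stage' distinct from p'ansori. The paper also discusses the differences between, and meanings of, the 'new ch'anggŭk' concepts of Sŏ Hangsŏk and Song Sŏkha that arose from the formation of ch'anggŭk. Stage ch'anggŭk performances in the 1930s were accompanied by a repeated argument that they should reach the goal of 'kagŭk' (opera); the paper traces how the term 'kagŭk' was replaced by 'ch'anggŭk' and argues that the ch'anggŭk stage took on its full form in that process. The ch'anggŭk-making of the society's members may be an important achievement in the history of ch'anggŭk, but because of the 'refinement' and 'exclusion' imposed by Japanese colonial cultural policy, it inevitably showed limitations in terms of form. The musicality of p'ansori was heightened, but because it could not embody the spirit of the times or diversity, it remained at the level of a rudimentary ch'anggŭk imagination. Ch'anggŭk was a genre with inherent limitations from its birth, but these are being overcome over time, which will be addressed in a follow-up paper.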