• Title/Summary/Keyword: Pitch Contour

Search Result 68, Processing Time 0.022 seconds

The Study on Korean Prosody Generation using Artificial Neural Networks (인공 신경망의 한국어 운율 발생에 관한 연구)

  • Min Kyung-Joong;Lim Un-Cheon
    • Proceedings of the Acoustical Society of Korea Conference / spring / pp.337-340 / 2004
  • Accurately reproduced prosody is one of the key factors affecting the naturalness of speech synthesized by a TTS system. In general, prosodic rules have been gathered either from linguistic knowledge or by analyzing the prosodic information in natural speech. Such rules cannot be perfect, however, and some may be incorrect. We therefore propose artificial neural networks (ANNs) that can be trained to learn the prosody of natural speech and generate it. In the learning phase, a string of phonemes from a sentence is applied to the ANNs, which learn the pitch and energy contours of the center phoneme: the output pattern is compared with the target pattern, and the weights are adjusted to minimize the mean square error between them. In the test phase, the estimation rates were computed. The results show that ANNs can generate the prosody of a sentence.

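The learning phase described in the abstract above (a phoneme window in, pitch and energy of the center phoneme out, weights adjusted to reduce the mean square error) can be sketched roughly as follows. This is a minimal illustration, not the authors' network: the one-hidden-layer architecture, the one-hot encoding, the phoneme inventory, the learning rate, and the toy target values are all assumptions made for the example.

```python
import math
import random

def one_hot_window(phonemes, inventory):
    """Concatenate one-hot codes for every phoneme in the context window."""
    vec = []
    for p in phonemes:
        vec.extend(1.0 if p == q else 0.0 for q in inventory)
    return vec

class TinyMLP:
    """One hidden tanh layer, linear outputs, trained by plain gradient descent."""

    def __init__(self, n_in, n_hidden, n_out, seed=0):
        rng = random.Random(seed)
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hidden)]
        self.b1 = [0.0] * n_hidden
        self.w2 = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden)] for _ in range(n_out)]
        self.b2 = [0.0] * n_out

    def forward(self, x):
        h = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
             for row, b in zip(self.w1, self.b1)]
        y = [sum(w * hi for w, hi in zip(row, h)) + b
             for row, b in zip(self.w2, self.b2)]
        return h, y

    def train_step(self, x, target, lr=0.1):
        h, y = self.forward(x)
        err = [yi - ti for yi, ti in zip(y, target)]  # output minus target
        # Backpropagate to the hidden layer before the output weights change.
        dh = [(1 - hj * hj) * sum(err[k] * self.w2[k][j] for k in range(len(err)))
              for j, hj in enumerate(h)]
        for k, ek in enumerate(err):
            for j, hj in enumerate(h):
                self.w2[k][j] -= lr * ek * hj
            self.b2[k] -= lr * ek
        for j, dj in enumerate(dh):
            for i, xi in enumerate(x):
                self.w1[j][i] -= lr * dj * xi
            self.b1[j] -= lr * dj

def mean_squared_error(net, data):
    total = 0.0
    for x, t in data:
        _, y = net.forward(x)
        total += sum((yi - ti) ** 2 for yi, ti in zip(y, t)) / len(t)
    return total / len(data)

# Toy, made-up data: (left, center, right) phonemes -> normalized (pitch, energy).
INVENTORY = ["a", "n", "k", "#"]  # hypothetical inventory; "#" marks a boundary
SAMPLES = [(("#", "n", "a"), (0.2, 0.5)), (("n", "a", "k"), (0.8, 0.9)),
           (("a", "k", "a"), (0.4, 0.1)), (("k", "a", "#"), (0.6, 0.7))]
DATA = [(one_hot_window(w, INVENTORY), list(t)) for w, t in SAMPLES]

net = TinyMLP(n_in=12, n_hidden=6, n_out=2)
mse_before = mean_squared_error(net, DATA)
for _ in range(500):
    for x, t in DATA:
        net.train_step(x, t)
mse_after = mean_squared_error(net, DATA)
```

Comparing `mse_before` with `mse_after` shows the error between output and target patterns falling as the weights are adjusted. A real system would need a far larger phoneme inventory and corpus than this toy set.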

A Comparative Analysis of Content-based Music Retrieval Systems (내용기반 음악검색 시스템의 비교 분석)

  • Ro, Jung-Soon
    • Journal of the Korean Society for Information Management / v.30 no.3 / pp.23-48 / 2013
  • This study compared and analyzed 15 content-based music retrieval (CBMR) systems accessible on the web in terms of database size and type, query type, access points, input and output types, and search functions. It also reviewed the features of music information and the techniques used in CBMR systems for transforming or transcribing music sources, extracting and segmenting melodies, extracting and indexing musical features, and matching. The analysis found that text information retrieval techniques such as inverted indexing, N-gram indexing, Boolean search, truncation, keyword and phrase search, normalization, filtering, browsing, exact matching, similarity measurement using edit distance, and sorting are applied to enhance CBMR; that efforts are being made to increase database size and usability; and that problems remain in extracting melodies, deleting stop notes in queries, and using solfege as pitch information.
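
One of the text-retrieval techniques named above, similarity measurement using edit distance, can be illustrated with a short sketch. The representation used here (melodies as MIDI note numbers reduced to pitch intervals so that transposition does not matter) and the normalization are assumptions for the example, not a description of any of the 15 surveyed systems.

```python
def to_intervals(midi_pitches):
    """Reduce a melody to successive pitch intervals (transposition-invariant)."""
    return [b - a for a, b in zip(midi_pitches, midi_pitches[1:])]

def edit_distance(a, b):
    """Levenshtein distance between two sequences (unit costs)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]

def melodic_similarity(query, candidate):
    """Similarity in [0, 1]: 1.0 means identical interval sequences."""
    qi, ci = to_intervals(query), to_intervals(candidate)
    longest = max(len(qi), len(ci))
    if longest == 0:
        return 1.0
    return 1.0 - edit_distance(qi, ci) / longest
```

With this sketch, a query and the same tune transposed up a whole tone score 1.0, while a single wrong note changes two adjacent intervals and lowers the score accordingly.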

Improved Harmonic-CELP Speech Coder with Dual Bit-Rates(2.4/4.0 kbps) (이중 전송률(2.4/4.0 kbps)을 갖는 개선된 하모닉-CELP 음성부호화기)

  • 김경민;윤성완;최용수;박영철;윤대희;강태익
    • The Journal of Korean Institute of Communications and Information Sciences / v.28 no.3C / pp.239-247 / 2003
  • This paper presents a dual-rate (2.4/4.0 kbps) Improved Harmonic-CELP (IHC) speech coder based on the Efficient Harmonic-CELP (EHC) coder previously presented by the authors. The proposed IHC employs harmonic coding for voiced segments and CELP for unvoiced segments. In the IHC, an initial voiced/unvoiced (V/UV) estimate is obtained from the pitch gain and energy, and the final V/UV mode is then decided using the frame energy contour. A new harmonic estimation method combining peak picking and delta adjustment is more reliable than that of the EHC. In addition, a noise mixing scheme used in conjunction with an improved band voicing measurement improves the naturalness of the synthesized speech. To demonstrate its performance, the proposed IHC coder was implemented and compared with the 2.0/4.0 kbps HVXC (Harmonic Vector eXcitation Coding) coder standardized in MPEG-4. Subjective evaluation showed that the proposed IHC coder produces better speech quality than the HVXC with only 40% of its complexity.

Improved CycleGAN for underwater ship engine audio translation (수중 선박엔진 음향 변환을 위한 향상된 CycleGAN 알고리즘)

  • Ashraf, Hina;Jeong, Yoon-Sang;Lee, Chong Hyun
    • The Journal of the Acoustical Society of Korea / v.39 no.4 / pp.292-302 / 2020
  • Machine learning algorithms have made immense contributions in various fields, including sonar and radar applications. The recently developed Cycle-Consistency Generative Adversarial Network (CycleGAN), a variant of the GAN, has been used successfully for unpaired image-to-image translation. We present a modified CycleGAN for translating underwater ship engine sounds with high perceptual quality. The proposed network is composed of an improved generator trained to translate underwater audio from one vessel type to another, an improved discriminator that identifies data as real or fake, and a modified cycle-consistency loss function. Quantitative and qualitative analyses of the proposed CycleGAN are performed on the publicly available underwater dataset ShipsEar by evaluating Mel-cepstral distortion, pitch contour matching, nearest neighbor comparison, and mean opinion score and comparing them with those of existing algorithms. The results demonstrate the effectiveness of the proposed network.
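
Of the metrics listed above, Mel-cepstral distortion (MCD) has a widely used closed form, sketched below. The abstract does not say which variant the authors computed, so the choices here are assumptions: the two frame sequences are taken to be already time-aligned, the 0th (energy) coefficient is excluded, and the result is averaged over frames.

```python
import math

def mel_cepstral_distortion(ref_frames, gen_frames):
    """Mean MCD in dB between two aligned sequences of mel-cepstral vectors.

    Each frame is a list of coefficients; index 0 is skipped (assumption:
    it carries overall energy rather than spectral shape).
    """
    if not ref_frames or len(ref_frames) != len(gen_frames):
        raise ValueError("sequences must be non-empty and equally long")
    scale = 10.0 / math.log(10.0) * math.sqrt(2.0)
    total = 0.0
    for ref, gen in zip(ref_frames, gen_frames):
        squared = sum((r - g) ** 2 for r, g in zip(ref[1:], gen[1:]))
        total += scale * math.sqrt(squared)
    return total / len(ref_frames)
```

Lower values mean the translated audio is spectrally closer to the reference. Sequences of different lengths would first need alignment, for example by dynamic time warping, which this sketch leaves out.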

Computational Fluid Dynamics of the aerodynamic characteristics for Flying Wing configuration with Flaperon (플래퍼론이 전개된 플라잉윙 형상의 공력 특성에 대한 전산유동해석)

  • Ko, Arim;Chang, Kyoungsik;Park, Changhwan;Sheen, Dongjin
    • Journal of Aerospace System Engineering / v.13 no.5 / pp.32-38 / 2019
  • A flying wing configuration with high sweep angles and a rounded leading edge exhibits complex flow structures caused by the leading-edge vortex. Flaperons are used to control the tailless flying wing configuration, which is directionally unstable. In this study, we conducted numerical simulations of a non-slender flying wing configuration with a rounded leading edge and analyzed the effects of the angle of sideslip (AoS) and the flaperon. Aerodynamic coefficient analysis showed that the effect of AoS on the lift and drag coefficients was minimal, whereas the side force and moment coefficients were markedly influenced by AoS. As the sideslip angle increased, the pitch break, which is related to the pitching moment coefficient, was delayed. Stability analysis showed that the flaperon increased the directional and lateral static stability of the flying wing configuration. The structure and behavior of the leading-edge vortex were also analyzed by observing contours of the pressure coefficient and skin friction lines.

A Study on Smart Factory System Design for Screw Machining Management (나사 가공 관리를 위한 스마트팩토리 시스템 설계에 관한 연구)

  • Lee, Eun-Kyu;Kim, Dong-Wan;Lee, Sang-Wan;Kim, Jae-joong
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2018.10a / pp.329-331 / 2018
  • In this paper, we propose a monitoring system based on Smart Factory technology for a threading process that starts with the supply of raw materials, machines them on a lathe, and has a robot automatically check the products for defects through assembly and disassembly. Completion against the production order and the ordered quantity is confirmed by checking the production status: a displacement sensor checks whether the raw material is worn, and the pitch and contour of the machined female and male threads are checked to judge each part OK or NG. The robotic system relays the loading and unloading of raw materials, pallet transfer, and the overall process, acting as an intermediary that keeps the process running organically. Location information for the threaded products is collected using a non-contact wireless tag and the energy-saving system. Production efficiency and utilization rate were checked. An environmental sensor collects air-conditioning environment data (temperature and humidity) to check the quality of product processing, and the system monitors hazardous operating conditions (overheating, humidity) for the product. The system also controls the CNC and the robot module PLC as a heterogeneous system.


Prosodic Phrasing and Focus in Korean

  • Baek, Judy Yoo-Kyung
    • Proceedings of the KSPS conference / 1996.10a / pp.246-246 / 1996
  • Purpose: Some properties of prosodic phrasing and some acoustic and phonological effects of contrastive focus on the tonal pattern of Seoul Korean are explored, based on a brief experiment analyzing the fundamental frequency (=F0) contour of the author's speech. Data Base and Analysis Procedures: The examples were chosen to contain mostly nasal and liquid consonants, since it is difficult to track the formants of stops and fricatives during their consonantal intervals, and the burst of a stop into the following vowel may cause an unwanted increase in the F0 value. All examples were recorded three times, and the spectrum of the most stable repetition was generated. From it the F0 contour of each sentence was obtained, with peaks higher than 250 Hz interpreted as a high tone (=H). The result is then discussed within the prosodic hierarchy framework of Selkirk (1986) and compared with the tonal pattern of the Northern Kyungsang (N.K.) dialect of Korean reported in Kenstowicz & Sohn (1996). Prosodic Phrasing: In N.K. Korean, H never appears on both the object and the verb in a neutral sentence, which indicates that the object and the verb form a single Phonological Phrase (=$\phi$), given that there is only one pitch peak per $\phi$. In Seoul Korean, however, both the object and the verb have an H of their own, indicating that they are not contained in one $\phi$. This violates the Optimality-theoretic constraint Wrap-XP (=Enclose a lexical head and its arguments in one $\phi$), whereas N.K. Korean obeys the constraint by grouping a VP in a single $\phi$. This asymmetry can be resolved through a constraint that favors the separate grouping of each lexical category and is ranked higher than Wrap-XP in Seoul Korean but lower in N.K. Korean: Align-$X^{lex}$ (=Align the left edge of a lexical category with that of a $\phi$).
(1) nuna-ka manIl-Il mEk-nIn-ta ('sister-NOM garlic-ACC eat-PRES-DECL')
a. (LLH) (LLH) (HLL) ---- Seoul Korean
b. (LLH) (LLL LHL) ---- N.K. Korean
Focus and Phrasing: Two major effects of contrastive focus on phonological phrasing are found in Seoul Korean: (a) the peak of an Intonational Phrase (=IP) falls on the focused element; and (b) focus has the effect of deleting all the following prosodic structures. A focused element always attracts the peak of the IP, showing an increase of approximately 30 Hz compared with the peak of a non-focused IP. When a subject is focused, no H appears on either the object or the verb, and a focused object is never followed by a verb with H. The post-focus deletion of prosodic boundaries is forced through the interaction of Stress-Focus (=If F is a focus and DF is its semantic domain, the highest prominence in DF will be within F) and Rightmost-IP (=The peak of an IP projects from the rightmost $\phi$). First, Stress-Focus requires the peak of the IP to fall on the focused element. Then, to avoid violating Rightmost-IP, all the boundaries after the focused element must delete, minimizing the number of $\phi$'s intervening from the right edge of the IP. (2) (omitted) Conclusion: In general, there seem to be no direct alignment constraints between the syntactically focused element and the edge of $\phi$ determined in the phonology; all the alignment effects come from a single requirement that the peak of the IP project from the rightmost $\phi$, as proposed in Truckenbrodt (1995).

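The labeling rule stated in the abstract above, where F0 peaks higher than 250 Hz are interpreted as a high tone (H), can be sketched as a simple local-maximum search. Only the 250 Hz threshold comes from the abstract. The frame-based contour, the use of `None` for unvoiced frames, and the plateau handling are assumptions for illustration, and the sample values are made up.

```python
def high_tone_peaks(f0_hz, threshold_hz=250.0):
    """Return (frame_index, f0) for local F0 maxima above the threshold.

    `f0_hz` is a list of per-frame F0 values in Hz, with None marking
    unvoiced frames (an assumed convention). On a flat plateau only the
    first frame is reported.
    """
    peaks = []
    for i in range(1, len(f0_hz) - 1):
        prev, cur, nxt = f0_hz[i - 1], f0_hz[i], f0_hz[i + 1]
        if prev is None or cur is None or nxt is None:
            continue  # skip frames next to unvoiced stretches
        if cur > threshold_hz and cur > prev and cur >= nxt:
            peaks.append((i, cur))
    return peaks

# Made-up contour: two peaks exceed 250 Hz, a third (248 Hz) does not.
contour = [200, 230, 262, 240, 210, None, 220, 255, 255, 230, 245, 248, 244]
h_tones = high_tone_peaks(contour)
```

Each returned frame would then be assigned to the syllable it falls in to derive tonal patterns such as (LLH), a step this sketch does not attempt.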

A Study on Prospective Plan Comparison using DVH-index in Tomotherapy Planning (토모 테라피 치료 시 선량 체적 히스토그램 표지자를 이용한 치료계획 비교에 관한 연구)

  • Kim, Joo-Ho;Cho, Jeong-Hee;Lee, Sang-Kyoo;Jeon, Byeong-Chul;Yoon, Jong-Won;Kim, Dong-Wook
    • The Journal of Korean Society for Radiation Therapy / v.19 no.2 / pp.113-122 / 2007
  • Purpose: We propose a method that uses dose-volume histogram (DVH) indexes to compare prospective plan trials during tomotherapy planning optimization. Materials and Methods: For three patients with targets in the cranial, thoracic, and abdominal regions, we acquired computed tomography images with a PQ 5000. We then delineated the target structures and normal organ contours with Pinnacle version 7.6c. After transferring the data to the tomotherapy planning system (Hi-Art system version 2.0), we optimized three plan trials for each case using different beam widths, pitches, and importance values. We analyzed the three plan trials in each region using isodose distributions, dose-volume histograms, and dose statistics. We also evaluated the three plan trials with specialized DVH indexes: the dose homogeneity index in the target organ, the conformity index around the target structure, and the dose gradient index in non-target structures. Results: We compared the best plan trial selected using isodose distributions, dose-volume histograms, and dose statistics with the one selected using DVH indexes. Both approaches selected the same plan trial as the best in every case. Conclusion: In some cases, the DVH-index comparison of tomotherapy plan trials gave slightly different results depending on the specific treatment goal. However, because the DVH indexes represent both the dose distribution in the target structure and the high-dose risk to normal tissue, they offer a reasonable method for comparing multiple plan trials before tomotherapy treatment.

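The abstract above names a dose homogeneity index and a conformity index without defining them. The sketch below uses two commonly cited definitions as stand-ins, which may differ from those used in the paper: the ICRU Report 83 homogeneity index, (D2% - D98%) / D50%, and the RTOG-style conformity index, the volume receiving at least the prescription dose divided by the target volume. Equal-volume voxels and the function names are assumptions for the example, and the dose gradient index is not covered.

```python
import math

def dose_at_volume(doses, percent):
    """D_x%: minimum dose received by the hottest `percent` of the voxels.

    Assumes every voxel has the same volume.
    """
    ordered = sorted(doses, reverse=True)
    count = max(1, math.ceil(len(ordered) * percent / 100.0))
    return ordered[count - 1]

def homogeneity_index(target_doses):
    """ICRU 83 style HI = (D2% - D98%) / D50%; 0 means perfectly uniform."""
    d2 = dose_at_volume(target_doses, 2)
    d98 = dose_at_volume(target_doses, 98)
    d50 = dose_at_volume(target_doses, 50)
    return (d2 - d98) / d50

def conformity_index(all_doses, target_voxel_count, prescription_dose):
    """RTOG style CI = (voxels at or above prescription dose) / (target voxels)."""
    covered = sum(1 for d in all_doses if d >= prescription_dose)
    return covered / target_voxel_count
```

Computing such indexes for each plan trial gives a single number per criterion, which is what makes a side-by-side ranking of trials possible.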