Search | Korea Science

Time-Scale Modification of Polyphonic Audio Signals Using Sinusoidal Modeling (정현파 모델링을 이용한 폴리포닉 오디오 신호의 시간축 변화)

장호근;박주성
- The Journal of the Acoustical Society of Korea, v.20 no.2, pp.77-85, 2001
This paper proposes a method for time-scale modification of polyphonic audio signals based on a sinusoidal model. The signals are modeled as a sinusoidal component plus a noise component. A multiresolution filter bank is designed that splits the input signal into six octave-spaced subbands without aliasing, and sinusoidal modeling is applied to each subband signal. To alleviate the smearing of transients in time-scale modification, a dynamic segmentation method is applied to the subbands; it adaptively determines the analysis-synthesis frame size to fit the time-frequency characteristics of each subband signal. To extract the sinusoidal components and calculate their parameters, a matching pursuit algorithm is applied to each analysis frame of each subband signal. Alongside the spectrum analysis, a psychoacoustic model implementing the effect of frequency masking is incorporated into matching pursuit to provide a reasonable stopping condition for the iteration and to reduce the number of sinusoids. The noise component, obtained by subtracting the signal synthesized from the sinusoidal components from the original signal, is modeled by a line-segment model of the short-time spectral envelope. Simulation results for various polyphonic audio signals show that the proposed sinusoidal modeling can synthesize the original signal without loss of perceptual quality and, because it represents transients without perceptual loss, performs more robust and higher-quality time-scale modification for large scale factors.
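The sinusoid-extraction step described in this abstract can be illustrated with a minimal sketch of matching pursuit over a dictionary of sinusoids. It is not the authors' implementation: the function name, the candidate frequency grid, and the fixed residual-energy floor are assumptions made for illustration, and the energy floor merely stands in for the psychoacoustic masking criterion that the paper uses as its stopping condition. The amplitude and phase estimates are exact only when a candidate frequency completes a whole number of cycles in the frame.

```python
import cmath
import math


def matching_pursuit_sinusoids(frame, freqs, max_iters=10, energy_floor=1e-3):
    """Greedily extract sinusoids from one analysis frame.

    frame: list of real samples.
    freqs: candidate frequencies in cycles per sample (hypothetical grid).
    energy_floor: assumed stop threshold on residual energy; a stand-in for
        a psychoacoustic masking test.
    Returns (picked, residual), where picked holds (freq, amplitude, phase).
    """
    n = len(frame)
    residual = list(frame)
    picked = []
    for _ in range(max_iters):
        # Stop once the residual carries negligible energy.
        if sum(x * x for x in residual) <= energy_floor:
            break
        # Select the atom with the largest inner product with the residual.
        best_f, best_c = None, 0j
        for f in freqs:
            c = sum(residual[t] * cmath.exp(-2j * math.pi * f * t)
                    for t in range(n))
            if best_f is None or abs(c) > abs(best_c):
                best_f, best_c = f, c
        amp = 2.0 * abs(best_c) / n
        phase = cmath.phase(best_c)
        # Subtract the selected sinusoid from the residual.
        for t in range(n):
            residual[t] -= amp * math.cos(2 * math.pi * best_f * t + phase)
        picked.append((best_f, amp, phase))
    return picked, residual
```

Under these assumptions, the residual left after the loop corresponds to the noise component that the abstract models separately.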

A Basic Study on the System of Converting Color Image into Sound (컬러이미지-소리 변환 시스템에 관한 기초연구)

Kim, Sung-Ill;Jung, Jin-Seung
- Journal of the Korean Institute of Intelligent Systems, v.20 no.2, pp.251-256, 2010
This paper aims to develop an intelligent robot that emulates human synesthetic skills by associating a color image with sound, so that an application system can be built on the principle of mutual conversion between color image and sound. As a first step, this study realizes a basic system for color-image-to-sound conversion. It describes a new method for converting a color image into sound, based on the similarity in physical frequency information between light and sound. In addition, we present a method of converting a color image into sound using color model conversion together with histograms in the converted color model. On the basis of the proposed method, we built a basic system using Microsoft Visual C++ (ver. 6.0). The simulation results show that the hue, saturation, and intensity elements of an input color image were converted into the F0, harmonic, and octave elements of a sound, respectively. The converted sound elements were synthesized to generate a sound source in WAV file format using the Csound toolkit.
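The hue-to-F0, saturation-to-harmonic, and intensity-to-octave correspondence reported in this abstract can be sketched as follows. The abstract does not give the actual mapping functions, so everything numeric here is an assumption: the base frequency, the number of octaves, the harmonic count range, and the use of the standard library's HSV conversion as a stand-in for the paper's HSI color model. The function name is hypothetical.

```python
import colorsys


def color_to_sound_params(r, g, b, base_hz=261.63, max_harmonics=8, octaves=4):
    """Map one RGB color (components in 0..1) to illustrative sound parameters.

    Assumed mapping, not taken from the paper:
      hue        -> position of F0 within one octave above base_hz
      saturation -> number of harmonics (1..max_harmonics)
      value      -> octave index (0..octaves-1), used in place of intensity
    """
    h, s, v = colorsys.rgb_to_hsv(r, g, b)
    octave = min(int(v * octaves), octaves - 1)
    f0 = base_hz * (2.0 ** h) * (2 ** octave)
    harmonics = 1 + round(s * (max_harmonics - 1))
    return {"f0_hz": f0, "harmonics": harmonics, "octave": octave}
```

For a whole image, such a function would presumably be applied to dominant colors taken from the histograms mentioned in the abstract rather than to every pixel, but that step is not specified in the source.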
https://doi.org/10.5391/JKIIS.2010.20.2.251

Sound Enhancement of low Sample rate Audio Using LMS in DWT Domain (DWT영역에서 LMS를 이용한 저 샘플링 비율 오디오 신호의 음질 향상)

백수진;윤원중;박규식
- The Journal of the Acoustical Society of Korea, v.23 no.1, pp.54-60, 2004
To mitigate the storage space and network bandwidth demands of full CD-quality audio, current digital audio is typically restricted in sampling rate and bandwidth. This restriction normally results in low-sample-rate audio or calls for a data compression scheme such as MP3. However, by the Nyquist sampling theorem, such audio can reproduce only a lower frequency range than regular CD-quality audio, and consequently it loses the rich spatial information embedded in the high frequencies. The purpose of this paper is to propose an efficient high-frequency enhancement of low-sample-rate audio using adaptive filtering and DWT analysis and synthesis. The proposed algorithm uses the LMS adaptive algorithm to estimate the missing high-frequency content in the DWT domain and then reconstructs the spectrally enhanced audio using the DWT synthesis procedure. Several experiments with real speech and audio are performed and compared with another algorithm. From the spectrograms and listening tests, we confirm that the proposed algorithm outperforms the other algorithm and works reasonably well in most audio cases.
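The LMS adaptive algorithm named in this abstract can be sketched in its plain time-domain form. This is only the generic LMS weight update; how the paper applies it to DWT subband coefficients to estimate the missing high-frequency content is not described in the abstract, so that part is not shown. The function name, tap count, and step size are assumptions.

```python
def lms_filter(x, d, num_taps=4, mu=0.05):
    """Adapt FIR weights so that the filtered input x tracks the desired signal d.

    x, d: equal-length lists of samples.
    num_taps, mu: assumed filter length and step size.
    Returns (weights, outputs, errors).
    """
    w = [0.0] * num_taps
    outputs, errors = [], []
    for n in range(len(x)):
        # Most recent num_taps inputs, newest first, zero-padded at the start.
        u = [x[n - k] if n - k >= 0 else 0.0 for k in range(num_taps)]
        y = sum(wk * uk for wk, uk in zip(w, u))
        e = d[n] - y
        # LMS update: move the weights along the instantaneous error gradient.
        w = [wk + mu * e * uk for wk, uk in zip(w, u)]
        outputs.append(y)
        errors.append(e)
    return w, outputs, errors
```

With a noise-free desired signal produced by a short FIR system, the adapted weights approach that system's coefficients, which is a convenient sanity check before using the update in a larger pipeline.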

A Study on the Timbre of Pyeon-gyoung (국악 타악기 편경의 음색연구)

Yoon, Ji-Won;Kim, Jun
- Journal of Korea Multimedia Society, v.13 no.11, pp.1728-1738, 2010
The Pyeon-gyeong, similar to the Chinese Bianqing, is a traditional Korean lithophone with multiple stone chimes. Because its material, pumice stone, is insensitive to temperature and humidity, the instrument provides a highly stable pitch and has therefore played a key role in traditional Korean court music. Because of its absolute pitch, it is important to research on the standard pitch and scale system of traditional Korean music; however, studies of its sound characteristics and value as an instrument have not made satisfactory progress to date. This research is an analysis for physical modeling synthesis of the Pyeon-gyeong. Through this study, we determine the original characteristics of the timbre of the Pyeon-gyeong as a unique traditional Korean percussion instrument and investigate these characteristics objectively through scientific analysis grounded in musical acoustics. Furthermore, this study will serve as important basic material for physical modeling synthesis of the Pyeon-gyeong and, by improving understanding of its timbre, will contribute to its cultural application across various fields of artistic creation.

Synthesis and Characterization of Molybdenum Complexes with Schiff-Bases(II), Dioxobis(N-aryl-3-methoxysalicylaldiminato) Molybdenum(VI) Complexes (몰리브덴의 시프-염기착물의 합성과 그 성질 (제2보). 다이옥소비스(질소-아릴-3-메톡시살리실알디미나토)몰리브데늄(VI) 착물)

O, Sang O;Gu, Bon Gwon
- Journal of the Korean Chemical Society, v.29 no.3, pp.257-264, 1985
Dioxobis(3-methoxysalicylaldehydato)molybdenum(VI) complex has been synthesized by the reaction of 3-methoxysalicylaldehyde and ammonium paramolybdate in methanol solution. With an appropriate primary amine, the resulting complex gave Schiff-base complexes, MoO$_2$(CH$_3$O-sal-N-R)$_2$, in which the C=O oxygen ligands had been replaced by nitrogen. The properties and possible molecular structures of these complexes were discussed on the basis of elemental analysis, spectroscopic studies, and electrical conductivity measurements. The Mo(VI) complexes were found to contain a cis-MoO$_2$ group, since their infrared spectra show two Mo=O bands at about 900 cm$^{-1}$ and the MoO$_2$-to-ligand combining ratio is 1:2. In addition, the electronic spectra of the molybdenyl complexes were assigned to ligand-to-metal charge-transfer transitions. All of these complexes are yellow or orange depolar compounds and are slightly soluble in alcohol, dichloromethane, chloroform, and N,N-dimethylformamide.

Design and Implementation of Vocal Sound Variation Rules for Korean Language (한국어 음운 변동 처리 규칙의 설계 및 구현)

Lee, Gye-Young
- The Transactions of the Korea Information Processing Society, v.5 no.3, pp.851-861, 1998
The Korean language is characterized by rich vocal sound variation. To increase the probability of correct vocal sound recognition and to provide natural vocal sound synthesis, systematic and thorough research into the characteristics of the Korean language, including its vocal sound variation rules, is required. This paper addresses an effective approach to vocal sound recognition and synthesis by presenting the design and implementation of Korean vocal sound variation rules. The regulation we followed in designing the rules is the Phonetic Standard (Section 30. Chapter 7) of the Korean Orthographic Standards. We first factored out rules for each regulation, then grouped them into 27 groups by final consonant. The Phonological Change Processing System proposed in the paper processes vocal sound variation quickly through a single application of the rules. The rules also cover the processing of information attached to words or to the stems of inflected words. We believe that the Phonological Change Processing System will facilitate sentence-level vocal sound recognition and synthesis. This system may also serve as an example for similar research areas.

An acoustic Doppler-based silent speech interface technology using generative adversarial networks (생성적 적대 신경망을 이용한 음향 도플러 기반 무 음성 대화기술)

Lee, Ki-Seung
- The Journal of the Acoustical Society of Korea, v.40 no.2, pp.161-168, 2021
In this paper, a Silent Speech Interface (SSI) technology is proposed in which the Doppler frequency shifts of the reflected signal are used to synthesize speech signals when a 40 kHz ultrasonic signal is incident on the speaker's mouth region. In SSI, mapping rules from features derived from non-speech signals to those derived from audible speech signals are constructed, and speech signals are then synthesized from non-speech signals using these mapping rules. In conventional SSI methods, the mapping rules are built by minimizing the overall error between the estimated and true speech parameters. In the present study, the mapping rules are constructed using Generative Adversarial Networks (GAN) so that the distribution of the estimated parameters is similar to that of the true parameters. Experimental results using 60 Korean words showed that, both objectively and subjectively, the proposed method outperformed conventional neural-network-based methods.
https://doi.org/10.7776/ASK.2021.40.2.161

Quantifying the Urgency Perception of Voice Alarm Generated by Concatenative Synthesizer (연결형 합성음성을 이용한 경보음의 주관적 위급도 정량화)

Jang, Pil-Sik;Lee, Gyeong-Tae
- Journal of the Ergonomics Society of Korea, v.25 no.2, pp.63-70, 2006
This paper presents an experimental study of the factors modulating the urgency perception of voice alarms generated by concatenative synthesizers. Four experiments were conducted using a psychophysical approach in which 105 participants made magnitude estimations of the perceived urgency of various voice alarm stimuli. Experiment 1 identified six acoustic and non-acoustic factors modulating the perceived urgency of synthesized voice alarms. Experiments 2, 3, and 4 quantified the relations between objective changes in each of the quantifiable parameters and subjective changes in urgency perception. This research has implications for the design and implementation of synthesized voice alarm systems where urgency mapping is required.
https://doi.org/10.5143/JESK.2006.25.2.063

A Study on Objective Quality Assessment for Synthesized speech by Rule (규칙합성음의 객관적 품질평가에 관한 연구)

홍진우;김순협
- Journal of the Korean Institute of Telematics and Electronics B, v.30B no.10, pp.42-49, 1993
In this paper, we evaluate the quality of speech synthesized by rule using the LPC CD as an objective measure, and then compare the test results with subjective ones. The speech used for the test consists of 108 words, selected by a word construction method using Korean attributes and frequency distribution and synthesized by a demi-syllable rule. By evaluating the quality of speech synthesized by rule objectively, we have tried to resolve problems that arise with subjective measures, such as long evaluation times, expansion of the test scale, and variability in analysis results. We have also demonstrated the validity of the objective test using the LPC CD by comparing intelligibility, which is the index for the subjective quality evaluation of speech synthesized by rule, with MOS. From these results, we can provide a guide for quality assessment that would be useful in the R&D of synthesis methods and commercial products using synthesized speech.
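Assuming that "LPC CD" in this abstract denotes the LPC cepstral distance, the objective measure can be sketched with the commonly used definition in decibels, CD = (10 / ln 10) * sqrt(2 * sum_k (c_k - c'_k)^2), taken over the cepstral coefficients of a reference frame and a test frame. The abstract does not state which exact variant or cepstral order the authors used, so this is an illustration only, and the function name is hypothetical.

```python
import math


def cepstral_distance_db(c_ref, c_test):
    """Cepstral distance in dB between two equal-length coefficient vectors.

    c_ref, c_test: cepstral coefficients c_1..c_p for one frame each
    (the zeroth, gain-related coefficient is assumed to be excluded).
    """
    if len(c_ref) != len(c_test):
        raise ValueError("coefficient vectors must have the same length")
    squared = sum((a - b) ** 2 for a, b in zip(c_ref, c_test))
    return (10.0 / math.log(10.0)) * math.sqrt(2.0 * squared)
```

An utterance-level score would typically be the average of this value over aligned frames, which is the kind of figure that could then be compared against MOS.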

Break Strength Prediction Using Maximum a Posteriori Probability (MAP 확률을 이용한 끊어 읽기 강도 예측)

Kim Sanghun;Park Jun;Lee Youngjik
- Proceedings of the Acoustical Society of Korea Conference, spring, pp.75-78, 2000
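This paper concerns break strength prediction for natural-sounding synthesized speech: given the part-of-speech (POS) sequence of a sentence, the break strength that maximizes the posterior probability is predicted by Viterbi decoding. The training data consist of 2,100 sentences uttered by one female speaker; break strengths were assigned at two levels from the speech data according to pause length, and morphological analysis and tagging of the text were performed using 30 POS tag symbols. With the observation probability defined as the probability of occurrence of three consecutive POS tags and the break strength transition probability modeled as a bigram, performance was evaluated by cross validation. The evaluation showed a prediction accuracy of $89.7\%$ on the training data and $84.9\%$ on the test data.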
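The decoding step in this abstract can be sketched as a generic log-domain Viterbi search over break strength states with bigram transitions. The state names, observation symbols, and probability tables are hypothetical inputs supplied by the caller; in the paper each observation would correspond to a window of three consecutive POS tags, and the actual probabilities would be estimated from the 2,100-sentence corpus.

```python
def viterbi(observations, states, log_init, log_trans, log_emit):
    """Return the most probable state sequence for the observations.

    log_init[s]: log prior of starting in state s.
    log_trans[p][s]: log probability of moving from state p to s (bigram).
    log_emit[s][o]: log probability of observing o in state s.
    """
    # best[t][s] is the log score of the best path that ends in s at time t.
    best = [{s: log_init[s] + log_emit[s][observations[0]] for s in states}]
    back = [{}]
    for t in range(1, len(observations)):
        best.append({})
        back.append({})
        for s in states:
            prev = max(states, key=lambda p: best[t - 1][p] + log_trans[p][s])
            best[t][s] = (best[t - 1][prev] + log_trans[prev][s]
                          + log_emit[s][observations[t]])
            back[t][s] = prev
    # Trace the best final state back to the start.
    path = [max(states, key=lambda s: best[-1][s])]
    for t in range(len(observations) - 1, 0, -1):
        path.append(back[t][path[-1]])
    path.reverse()
    return path
```

Working in log probabilities avoids numerical underflow on long sentences, which is why the tables are expected in log form.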
