Search | Korea Science

Pornographic Content Detection Scheme Using Bi-directional Relationships in Audio Signals (음향 신호의 양방향적 연관성을 고려한 유해 콘텐츠 검출 기법)

Song, KwangHo;Kim, Yoo-Sung
- The Journal of the Korea Contents Association / v.20 no.5 / pp.1-10 / 2020
In this paper, we propose a new pornographic content detection scheme that uses bi-directional relationships between neighboring auditory signals to accurately detect sound-centered obscene content, which is spreading rapidly via the Internet. To capture the bi-directional relationships between neighboring signals, we design a multilayered bi-directional dilated-causal convolution network by stacking several dilated-causal convolution blocks, each of which performs bi-directional dilated-causal convolution operations. To verify the performance of the proposed scheme, we compare its accuracy with that of two previous schemes: one that feeds simple auditory feature vectors to a support vector machine, and one that uses only the forward relationships in audio signals through a stack of dilated-causal convolution layers. The proposed scheme achieves an accuracy of up to 84.38%, which exceeds the two comparison schemes by up to 25.80%.
https://doi.org/10.5392/JKCA.2020.20.05.001
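
As an illustration of the operation named in the abstract above, the following is a minimal NumPy sketch of a single bi-directional dilated-causal convolution on a 1-D signal. It is not the authors' network: the function names, filter weights, and the choice to stack the two directions as separate channels are assumptions made for this example, and nonlinearities, residual connections, and training are omitted.

```python
import numpy as np

def dilated_causal_conv(x, w, dilation):
    """Causal dilated convolution: y[t] = sum_k w[k] * x[t - k*dilation].

    Samples before the start of the signal are treated as zero, so the
    output at time t depends only on the current and earlier inputs.
    """
    x = np.asarray(x, dtype=float)
    y = np.zeros_like(x)
    for k, wk in enumerate(w):
        shift = k * dilation
        if shift < len(x):
            # Add the input delayed by `shift` samples, weighted by w[k].
            y[shift:] += wk * x[:len(x) - shift]
    return y

def bidirectional_block(x, w_fwd, w_bwd, dilation):
    """Hypothetical bi-directional block (an assumption for illustration).

    The forward branch looks at past samples; the backward branch runs the
    same causal operation on the time-reversed signal, so after reversing
    the result back it looks at future samples.
    """
    x = np.asarray(x, dtype=float)
    fwd = dilated_causal_conv(x, w_fwd, dilation)
    bwd = dilated_causal_conv(x[::-1], w_bwd, dilation)[::-1]
    # Keep the two directions as separate channels: shape (2, len(x)).
    return np.stack([fwd, bwd])
```

Stacking such blocks with increasing dilation (for example 1, 2, 4, ...) widens the span of neighboring samples that each output can see in both directions, which is the general idea behind dilated-convolution stacks.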

A Study on PLU (Phone-Likely Unit) for Korean Continuous Speech Recognition (강건한 한국어 연속음성인식을 위한 유사음소단일에 대한 연구)

Seo Jun-Bae;Kim Joo-Gon;Kim Min-Jung;Jung Ho-Youl;Chung Hyun-Yeol
- Proceedings of the Acoustical Society of Korea Conference / spring / pp.37-40 / 2004
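This paper studies how many context-dependent acoustic models are efficient for Korean continuous speech recognition, comparing and evaluating recognition performance as a function of the number of phone-like units (PLUs). Our laboratory has so far used 48 phonemes as the basic recognition units. In continuous speech recognition, however, context-dependent models are used, and these already cover phonemes that account for allophonic variation, so reducing the basic phoneme set can be expected to lower the computational load and improve recognition performance. We therefore compare the existing 48-phoneme set with a reduced 39-phoneme set in recognition experiments. To this end, we merged databases from various tasks to supplement the missing contexts before running the experiments. The results confirm that the number of allophones can be reduced without degrading recognition performance, and that for continuous speech the 39-phoneme set improves recognition performance by about 10%.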

A comparison study of Padé approximation and Model Order Reduction scheme for efficient acoustic analysis (효율적인 음향 해석을 위한 Padé 근사법과 모델차수축소법의 비교연구)

Goo, Seongyeol;Kook, Junghwan;Hyun, Jaeyub;Wang, Semyung
- Proceedings of the Korean Society for Noise and Vibration Engineering Conference / 2012.10a / pp.731-732 / 2012

Combustion Instability Prediction Using 1D Thermoacoustic Model in a Gas Turbine Combustor (가스터빈 연소기에서 1D 열음향 모델을 이용한 연소불안정 예측)

Kim, Jin Ah;Kim, Daesik
- Journal of ILASS-Korea / v.20 no.4 / pp.241-246 / 2015
The objective of the current study is to develop a 1D thermoacoustic model for predicting the basic characteristics of combustion instability and to investigate how key parameters, such as flame geometry and acoustic boundary conditions, affect the instabilities. The paper also focuses on limit cycle prediction. To improve the model accuracy, the 1D model was modified to account for the actual flame location and flame length (i.e., the distribution of time delay). The results show that the reflection coefficients have a strong effect on the growth rate of the instabilities. In addition, the instability characteristics are shown to depend strongly on the fuel composition.
https://doi.org/10.15435/JILASSKR.2015.20.4.241

Optimal Design and Analysis of a Class IV Flextensional Transducer (Class IV Flextensional 트랜스듀서의 최적설계 및 특성해석)

강국진;노용래
- The Journal of the Acoustical Society of Korea / v.19 no.4 / pp.69-76 / 2000
In this research, we used the finite element method (FEM) to analyze how the sound pressure and thermal distribution of a Class IV Flextensional transducer vary with its material properties and structure. Based on the results, we determined the optimal structure of a Class IV Flextensional transducer with maximum sound pressure, minimum thermal distribution, and a resonance frequency of 1 kHz. The sound pressure of the optimal structure is twice that of the basic structure, and its thermal distribution is much lower. The results of this work can be used to design Class IV Flextensional transducers with various resonance frequencies, maximum sound pressure, and minimum thermal distribution.

Performance Improvement of Speech Recognizer in Noisy Environments Based on Auditory Modeling (청각 구조를 이용한 잡음 음성의 인식 성능 향상)

Jung, Ho-Young;Kim, Do-Yeong;Un, Chong-Kwan;Lee, Soo-Young
- The Journal of the Acoustical Society of Korea / v.14 no.5 / pp.51-57 / 1995
In this paper, we study a noise-robust feature extraction method for speech signals based on auditory modeling. The auditory model consists of a basilar membrane model, a hair cell model, and a spectrum output stage. The basilar membrane model describes the response of the membrane to vibration in the speech wave and is represented as a band-pass filter bank. The hair cell model describes the neural transduction driven by displacements of the basilar membrane. It responds adaptively to relative values of the input and plays an important role in noise robustness. The spectrum output stage constructs a mean rate spectrum from the average firing rate of each channel, and feature vectors are extracted from this mean rate spectrum. Simulation results show that auditory-based feature extraction improves speech recognition performance in noisy environments compared with other feature extraction methods.

Analysis on Thermal Structural Characteristics of Thermal Protection System Panel for a High-speed Vehicle (초고속 비행체 열방어 시스템 패널의 열구조 특성 분석)

Lee, Heesoo;Kim, Yongha;Park, Jungsun;Goo, Namseo;Kim, Jaeyoung
- Proceedings of the Korean Society of Propulsion Engineers Conference / 2017.05a / pp.942-944 / 2017
High-speed vehicles are subjected to complex loads, such as acoustic pressure from the engine at launch and aerodynamic heating and aerodynamic pressure during flight. A thermal protection system panel is required to protect internal systems, such as the vehicle's fuel tank, from the external environment. This study defines analytical models for the heat transfer and thermal structural characteristics of the thermal protection system panel. It also performs a parametric analysis aimed at achieving thermal structural integrity while reducing the panel's weight.

A Study on Recognition of Korean Continuous Speech using Discrete Duration CHMM. (이산 시간 제어 CHMM을 이용한 한국어 연속 음성 인식에 관한 연구)

김상범
- Proceedings of the Acoustical Society of Korea Conference / 1994.06c / pp.368-372 / 1994
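We built a Korean continuous speech recognition system using hidden Markov models (HMMs), which are probabilistic models. As training models we used a continuous-output-density HMM (CHMM), whose continuous output probability densities avoid quantization error, and a discrete-duration-controlled CHMM (DDCHMM), which adds duration probability parameters to compensate for the CHMM's inability to adequately represent the temporal structure of transient and steady-state segments. The recognition algorithm is a form of DP matching that accounts for nonlinear warping of the time-series pattern along the time axis, using a DP-based method that detects syllable boundaries automatically. The continuous speech data used in the experiments consisted of four-digit connected strings and 10 continuous speech sentences. In the recognition experiments, the CHMM achieved a recognition rate of 80.7% and the DDCHMM 92.9% on the four-digit strings; on the continuous speech sentences excerpted from newspaper editorials, the CHMM achieved 54.2% and the DDCHMM 68.9%. The DDCHMM, which takes duration control into account, thus achieved higher recognition rates than the CHMM.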

A CELP Speech Coder Using Secondary Long Term Prediction with Multi-Band Pass Filtered Multi-Pulses (다중 펄스와 다중 대역 이차 장구간 예측을 이용한 CELP 음성 부호화기)

서정태;최용수;강홍구;윤대희
- The Journal of the Acoustical Society of Korea / v.17 no.1 / pp.9-16 / 1998
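This paper proposes a method for improving the performance of the long-term predictor in a low-bit-rate CELP speech coder. Lowering the bit rate requires a longer analysis frame, which degrades the long-term predictor; a substantial quasi-periodic component then remains after long-term prediction and is difficult to model with a stochastic codebook composed of white noise alone. The proposed method filters the signal once more using multi-band filters and multi-pulse trains (secondary long-term prediction), so that the signal after long-term prediction takes a white-noise-like form suited to the stochastic codebook. To evaluate the proposed method, we quantized at a bit rate of 4.8 kbps and compared it with the previously proposed MBCELP and DoD-CELP coders at the same rate. Experimental results show that the proposed method outperforms these coders in both subjective and objective speech quality.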

Statistical Analysis of Korean Phonological Variations Using a Grapheme-to-phoneme System (발음열 자동 생성기를 이용한 한국어 음운 변화 현상의 통계적 분석)

이경님;정민화
- The Journal of the Acoustical Society of Korea / v.21 no.7 / pp.656-664 / 2002
We present a statistical analysis of Korean phonological variations using a Grapheme-to-Phoneme (GTP) system. The GTP system used in the experiments generates pronunciation variants by applying rules that model obligatory and optional phonemic changes and allophonic changes. These rules are derived from morphophonological analysis and the government's standard pronunciation rules. The GTP system is optimized for continuous speech recognition: it generates phonetic transcriptions for training and constructs a pronunciation dictionary for recognition. In this paper, we describe Korean phonological variations by analyzing the statistics of phonemic change rule applications for the 60,000 sentences in the Samsung PBS Speech DB. Our results show that the most frequent obligatory phonemic variations are, in order, liaison, tensification, aspiration, and nasalization of obstruents, and that the most frequent optional phonemic variations are, in order, initial consonant h-deletion, insertion of a final consonant with the same place of articulation as the next consonant, and deletion of a final consonant with the same place of articulation as the next consonant. These statistics can be used to improve the performance of speech recognition systems.

Search Result 110, Processing Time 0.029 seconds
