Search | Korea Science

An Optimization on the Psychoacoustic Model for MPEG-2 AAC Encoder (MPEG-2 AAC Encoder의 심리음향 모델 최적화)

Park, Jong-Tae;Moon, Kyu-Sung;Rhee, Kang-Hyeon
- Journal of the Institute of Electronics Engineers of Korea CI, v.38 no.2, pp.33-41, 2001
Compression is currently one of the most important technologies in the multimedia society. Audio files are spreading rapidly across the internet. Among the formats, the best known is MP3 (MPEG-1 Layer 3), which achieves CD-quality sound at 128 kbps, but its sound quality drops sharply below 64 kbps. MPEG-2 AAC (Advanced Audio Coding) is not compatible with MPEG-1, but it compresses about 1.4 times more than MP3 and supports up to 7.1 channels and a 96 kHz sampling rate. In this paper, we propose an algorithm that reduces the computational load of AAC encoding and increases processing speed by optimizing the psychoacoustic model, which accounts for an enormous amount of the computation in an MPEG-2 AAC encoder. The optimized psychoacoustic model was implemented in C++. In the experiment, the psychoacoustic model performs a 3048-point FFT (Fast Fourier Transform) at a 44.1 kHz sampling rate to obtain the SMR (Signal-to-Masking Ratio), and each entropy value is passed to the subband filters to control the encoder block. The proposed psychoacoustic model runs fast because the computation of the unpredictability value is optimized. In addition, when the unpredictability value is converted into a tonality index, processing speed is further increased by a tonality index optimized for the high-frequency range.
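To illustrate the conversion from an unpredictability value to a tonality index mentioned above, the sketch below uses the clamped logarithmic mapping from the standard MPEG psychoacoustic model 2. It shows the textbook mapping only, not the paper's optimized version; the function name and the handling of non-positive input are assumptions made for this example.

```python
import math

def tonality_index(unpredictability: float) -> float:
    """Map a partition's unpredictability (0 < c <= 1) to a tonality index in [0, 1].

    Low unpredictability (a predictable, tone-like signal) gives an index near 1;
    high unpredictability (noise-like) gives an index near 0.
    """
    if unpredictability <= 0.0:
        # Assumption for this sketch: treat a perfectly predictable partition as fully tonal.
        return 1.0
    raw = -0.299 - 0.43 * math.log(unpredictability)
    return min(1.0, max(0.0, raw))  # clamp to the valid range

for c in (0.01, 0.05, 1.0):
    print(c, round(tonality_index(c), 3))
```

Because the mapping saturates at 0 and 1, partitions whose unpredictability falls outside a narrow band need no logarithm at all, which is one plausible place for an encoder to save work.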

Iris Region Masking based on Blurring Technique (블러링기법 기반의 홍채영역 마스킹 방법)

Lee, Gi Seong;Kim, Soo Hyung
- Smart Media Journal, v.11 no.2, pp.25-30, 2022
With recent improvements in the performance of devices such as smartphones, cameras, and video cameras, it has become possible to extract human biometric information from images and photos. A German hacker group obtained human iris information from high-definition photos and demonstrated that it could be used to hack the iris scanners on smartphones. As high-quality images and photos can be obtained with such advanced devices, the need for a suitable security system is also emerging. Therefore, in this paper, we propose a method of automatically masking human iris information in images and photos using Haar Cascades and blur models from OpenCV. The method recognizes a person's eyes in a photo or video, automatically masks the iris information, and returns the result. If this technology is used in devices and applications such as smartphones and Zoom, it is expected to provide better security services to users.
https://doi.org/10.30693/SMJ.2022.11.2.25

Peridynamic Impact Fracture Analysis of Multilayered Glass with Nonlocal Ghost Interlayer Model (비국부 층간 결합 모델을 고려한 다중적층 유리의 페리다이나믹 충돌 파괴 해석)

Ha, Youn Doh;An, Tae Sick
- Journal of the Computational Structural Engineering Institute of Korea, v.31 no.6, pp.373-380, 2018
We present a peridynamic dynamic fracture analysis of multilayered glass struck by a high-velocity object. In most practical multilayered glass structures, the main layers are bonded by thin elastic masking films. It is therefore difficult and expensive to construct a numerical model for such a multilayered structure. In this paper, we model multilayered structures efficiently with a nonlocal ghost interlayer model, in which ghost particles are distributed between the main layers and interact with each other in a peridynamic manner. We also consider a simple nonlocal contact condition in the peridynamic framework to handle the impact and penetration of the high-velocity impactor into the multilayered structure. Finally, we confirm the fracture-modeling capability of the method using a multilayered glass model in which 7 glass layers and a single elastic backing layer are bonded by polyvinyl butyral films.
https://doi.org/10.7734/COSEIK.2018.31.6.373

Training Techniques for Data Bias Problem on Deep Learning Text Summarization (딥러닝 텍스트 요약 모델의 데이터 편향 문제 해결을 위한 학습 기법)

Cho, Jun Hee;Oh, Hayoung
- Journal of the Korea Institute of Information and Communication Engineering, v.26 no.7, pp.949-955, 2022
Deep learning-based text summarization models are not independent of the datasets they are trained on. For example, a summarization model trained on a news summarization dataset is not good at summarizing other types of text such as internet posts and papers. In this study, we define this phenomenon as the Data Bias Problem (DBP) and propose two training methods for solving it. The first is 'proper nouns masking', which masks proper nouns. The second is 'length variation', which randomly inflates or deflates the length of the text. Experiments show that our methods are efficient at solving DBP. In addition, we analyze the results of the experiments and present future development directions. Our contributions are as follows: (1) We discovered DBP and defined it for the first time. (2) We proposed two efficient training methods and conducted actual experiments. (3) Our methods can be applied to all summarization models and are easy to implement, so they are highly practical.
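The two training-time transformations named above can be sketched as simple preprocessing steps. The abstract does not say how proper nouns are detected or how the length is changed, so everything concrete here is an assumption: the `[MASK]` token, the given set of proper nouns (in practice this might come from a tagger), and inflating or deflating by duplicating or dropping one sentence.

```python
import random

MASK = "[MASK]"  # hypothetical placeholder token

def mask_proper_nouns(tokens, proper_nouns):
    """Replace every token found in `proper_nouns` with the mask token."""
    return [MASK if tok in proper_nouns else tok for tok in tokens]

def vary_length(sentences, rng, mode):
    """Inflate (duplicate one random sentence) or deflate (drop one) a document."""
    sentences = list(sentences)  # work on a copy
    if mode == "deflate" and len(sentences) > 1:
        del sentences[rng.randrange(len(sentences))]
    elif mode == "inflate" and sentences:
        i = rng.randrange(len(sentences))
        sentences.insert(i, sentences[i])
    return sentences

rng = random.Random(0)
doc = ["Alice met Bob in Seoul .", "They discussed the budget .", "The meeting ended early ."]
names = {"Alice", "Bob", "Seoul"}
masked = [" ".join(mask_proper_nouns(s.split(), names)) for s in doc]
print(masked[0])  # [MASK] met [MASK] in [MASK] .
print(len(vary_length(doc, rng, "inflate")), len(vary_length(doc, rng, "deflate")))  # 4 2
```

Both functions operate on the input text only, which is consistent with the claim that the methods can be applied to any summarization model.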
https://doi.org/10.6109/jkiice.2022.26.7.949

Speech Enhancement Based on Psychoacoustic Model (심리음향모델에 근거한 음성개선)

Lee Jingeol
- Proceedings of the Acoustical Society of Korea Conference, spring, pp.337-338, 2000
The perceptual filter for speech enhancement was analytically derived so that the frequency content of the noisy input signal matches that of the estimated clean signal in the auditory domain. However, the analytical derivation relies on the deconvolution associated with the spreading function in the psychoacoustic model, which results in an ill-conditioned problem. To cope with this problem, we propose a novel psychoacoustic-model-based speech enhancement filter whose principle is the same as that of the perceptual filter; however, the filter is derived by a constrained optimization, which provides solutions to the ill-conditioned problem.

Speech enhancement using psychoacoustics model (사이코어쿠스틱스 모델을 이용한 음성 향상)

Kwon, Chul-Hyun;Shin, Dae-Kyu;Park, Sang-Hui
- Proceedings of the KIEE Conference, 1999.11c, pp.748-750, 1999
In this study, a speech enhancement method is presented that exploits a well-known auditory mechanism, noise masking. The approach adopted here is to derive a modifier that achieves audible noise suppression. This modification selectively affects the perceptually significant spectral values and is therefore less prone to introducing unwanted distortions than methods that affect the complete STSA, and it produces better-enhanced results at low SNR as well as at high SNR. The method needs an exact estimate of the minimum spectral value per critical band because it uses only that value. For this purpose, it uses a modified spectral subtraction that is more flexible than power spectral subtraction. As a result, the experiments showed a better SNR than before.

Feature Compensation with Model-based Estimation for Noise Masking (잡음마스킹을 이용한 환경보상기법)

Kim, Young-Joon;Kim, Nam-Soo;Lee, Yun-Gun
- Proceedings of the KSPS conference, 2006.11a, pp.7-10, 2006
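This paper presents a probabilistic method, based on a speech model, for measuring the degree of noise masking. We describe how to compute the 'noise masking probability' as a criterion for the degree of noise masking and examine its properties. We then apply the noise masking probability to improve the performance of speech recognition feature vectors in noisy environments. The proposed method was evaluated on the Aurora2 database, which ETSI provides as a standard speech recognition experiment. As a result, it achieved a 16.58% performance improvement over the existing algorithm.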

Noise suppressor Using Psychoacoustic Model and Wavelet Packet Transform (심리음향 모델과 웨이블릿 패킷 변환을 이용한 잡음제거기)

Kim, Mi-Seon;Kim, Young-Ju;Lee, In-Sung
- Proceedings of the IEEK Conference, 2006.06a, pp.345-346, 2006
In this paper, we propose a noise suppressor based on the psychoacoustic model and the wavelet packet transform. The objective of the scheme is to enhance speech corrupted by colored or non-stationary noise. If the corrupting noise is colored, a subband approach is more efficient than a full-band one. To avoid serious residual noise and speech distortion, the wavelet coefficient threshold (WCT) must be adjusted. In this paper, the subbands are designed to match the critical bands, and the WCT is adapted using the noise masking threshold (NMT) and the segmental signal-to-noise ratio (seg_SNR). Consequently, this work improves the PESQ-MOS by about 0.23 in the case of coded speech.
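As a rough illustration of applying a wavelet coefficient threshold in one subband, the sketch below uses soft thresholding. The abstract does not state whether soft or hard thresholding is used, nor how the threshold is computed from the NMT and seg_SNR, so the thresholding rule and the plain `threshold` parameter are assumptions for this example.

```python
def soft_threshold(coeffs, threshold):
    """Shrink each wavelet coefficient toward zero by `threshold`.

    Coefficients whose magnitude is at or below the threshold are treated as
    noise and set to zero; larger ones keep their sign and lose `threshold`
    in magnitude.
    """
    out = []
    for w in coeffs:
        mag = abs(w) - threshold
        if mag <= 0:
            out.append(0.0)
        else:
            out.append(mag if w > 0 else -mag)
    return out

# One subband of hypothetical coefficients with a per-band threshold of 1.0
print(soft_threshold([3.0, -0.5, -2.0, 0.2], 1.0))  # [2.0, 0.0, -1.0, 0.0]
```

In a scheme like the one described, each critical-band subband would get its own threshold, raised where noise dominates and lowered where speech must be preserved.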

An efficient multipath propagation prediction using improved vector representation (효율적 다중경로 전파 예측을 위한 Ray-Tracing의 개선된 벡터 표현법)

이상호;강선미;고한석
- The Journal of Korean Institute of Communications and Information Sciences, v.24 no.12A, pp.1974-1984, 1999
In this paper, we introduce a highly efficient data structure that captures the multipath phenomenon needed for accurate propagation modeling and fast propagation prediction. The proposed object representation, called 'circular representation (CR)', describes microwave masking objects such as buildings and improves on the conventional vector representation (VR) in fast ray tracing. The proposed CR encapsulates a building in a circle represented by a center point and a radius. In this configuration, the CR functions as the basic building block for higher geometric structures and improves efficiency beyond what VR achieves alone. The simulation results indicate that the proposed CR scheme reduces the computational load in proportion to the number of potential scattering objects, while its hierarchical structure achieves about a 50% reduction in computational load in the hierarchical octree structure.
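The idea of enclosing a building in a circle so that a ray can be rejected cheaply can be sketched in 2D as follows. This is only an illustration of the general technique: the abstract does not say how the center is chosen, so the vertex centroid used here is an assumption, and the function names are hypothetical.

```python
import math

def bounding_circle(vertices):
    """Enclose a building footprint in a circle (center point and radius).

    Assumption: the center is the mean of the vertices and the radius is the
    largest vertex distance. This encloses the footprint but is not minimal.
    """
    cx = sum(x for x, _ in vertices) / len(vertices)
    cy = sum(y for _, y in vertices) / len(vertices)
    radius = max(math.hypot(x - cx, y - cy) for x, y in vertices)
    return (cx, cy), radius

def ray_hits_circle(origin, direction, center, radius):
    """Return True if the ray from `origin` along `direction` touches the circle."""
    norm = math.hypot(direction[0], direction[1])
    dx, dy = direction[0] / norm, direction[1] / norm
    fx, fy = center[0] - origin[0], center[1] - origin[1]
    dist_sq = fx * fx + fy * fy
    if dist_sq <= radius * radius:
        return True                 # ray starts inside the circle
    t = fx * dx + fy * dy           # projection of the center onto the ray
    if t < 0:
        return False                # circle lies behind the ray origin
    return dist_sq - t * t <= radius * radius

center, radius = bounding_circle([(0, 0), (2, 0), (2, 2), (0, 2)])
print(ray_hits_circle((-5, 1), (1, 0), center, radius))  # True
print(ray_hits_circle((-5, 5), (1, 0), center, radius))  # False
```

Only rays that pass this test would need the more expensive check against the building's individual walls, which is where the saving per potential scattering object comes from.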

A Study on Searching for Export Candidate Countries of the Korean Food and Beverage Industry Using Node2vec Graph Embedding and Light GBM Link Prediction (Node2vec 그래프 임베딩과 Light GBM 링크 예측을 활용한 식음료 산업의 수출 후보국가 탐색 연구)

Lee, Jae-Seong;Jun, Seung-Pyo;Seo, Jinny
- Journal of Intelligence and Information Systems, v.27 no.4, pp.73-95, 2021
This study uses the Node2vec graph embedding method and Light GBM link prediction to explore undeveloped export candidate countries for Korea's food and beverage industry. Node2vec addresses a limitation in representing the structural equivalence of a network, a known relative weakness of existing link prediction methods based on the number of common neighbors. The method is therefore known to perform well in both community detection and structural equivalence of the network. The vector values obtained by embedding the network in this way are generated under the condition of a constant length from an arbitrarily designated starting node. This has the advantage that the sequence of nodes is easy to use as input to models for downstream tasks such as Logistic Regression, Support Vector Machine, and Random Forest. Based on these features of Node2vec graph embedding, this study applied the method to international trade information for the Korean food and beverage industry. Through this, we intend to contribute to extensive margin diversification for Korea in the industry's global value chain relationships. The optimal predictive model derived in this study recorded a precision of 0.95, a recall of 0.79, and an F1 score of 0.86, showing excellent performance. This was superior to the binary classifier based on Logistic Regression set as the baseline model, which recorded a precision of 0.95, a recall of 0.73, and an F1 score of 0.83. In addition, the Light GBM-based optimal prediction model showed better performance than the link prediction model of a previous study, which is set as the benchmarking model in this study.
The predictive model of the previous study recorded a recall of only 0.75, whereas the proposed model achieved a recall of 0.79. The difference in prediction performance between the benchmarking model and the model of this study is due to the model learning strategy. In this study, trades were grouped by trade value scale, and prediction models were trained differently for these groups. The specific methods are (1) randomly masking trades and training a model on all trades without setting specific conditions on trade value, (2) randomly masking some of the trades with an average trade value or higher and training the model, and (3) randomly masking some of the trades in the top 25% or higher by trade value and training the model. The experiment confirmed that the model trained by randomly masking some of the trades with above-average trade value performed best and most stably. Additional investigation found that most of the potential export candidates for Korea derived from this model appeared appropriate. Taken together, this study suggests the practical utility of a link prediction method that applies Node2vec and Light GBM. In addition, useful implications were derived for weight update strategies that can yield better link prediction while training the model. This study also has policy utility because it applies the approach to trade transactions, which have rarely been addressed in research on link prediction based on graph embedding. The results support a rapid response to changes in the global value chain such as the recent US-China trade conflict or Japan's export regulations, and we believe they are sufficiently useful as a tool for policy decision-making.
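The reported F1 scores can be checked against the reported precision and recall, since F1 is the harmonic mean of the two. The short sketch below does that arithmetic; the function name is a hypothetical choice for this example.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall; 0.0 if both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Figures quoted in the abstract above
print(round(f1_score(0.95, 0.79), 2))  # proposed model: 0.86
print(round(f1_score(0.95, 0.73), 2))  # logistic regression baseline: 0.83
```

Both values agree with the abstract, so the gap between the two models comes entirely from recall (0.79 versus 0.73) at equal precision.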
https://doi.org/10.13088/jiis.2021.27.4.073
