Search | Korea Science

A study on speech enhancement using complex-valued spectrum employing Feature map Dependent attention gate (특징 맵 중요도 기반 어텐션을 적용한 복소 스펙트럼 기반 음성 향상에 관한 연구)

Jaehee Jung;Wooil Kim
- The Journal of the Acoustical Society of Korea / v.42 no.6 / pp.544-551 / 2023
Speech enhancement, which improves the perceptual quality and intelligibility of noisy speech, has progressed from methods based on the magnitude spectrum to methods based on the complex-valued spectrum, which can improve both magnitude and phase. This paper studies how to apply an attention mechanism to complex-valued spectrum-based speech enhancement systems to further improve the intelligibility and quality of noisy speech. The attention is based on additive attention and computes the attention weights in consideration of the complex-valued spectrum. In addition, global average pooling is used to account for the importance of each feature map. Complex-valued spectrum-based speech enhancement was performed with the Deep Complex U-Net (DCUNET) model, and additive attention was applied, following the proposed method, in the Attention U-Net model. Experiments on noisy speech in a living room environment showed that the proposed method improved on the baseline model in Source-to-Distortion Ratio (SDR), Perceptual Evaluation of Speech Quality (PESQ), and Short-Time Objective Intelligibility (STOI), and that it improved performance consistently across various background noise environments and low Signal-to-Noise Ratio (SNR) conditions. These results demonstrate the effectiveness of the proposed speech enhancement system in improving the intelligibility and quality of noisy speech.
https://doi.org/10.7776/ASK.2023.42.6.544

The Applications of Viscoelastic Dampers for Vibration control (고층건물의 진동제어를 위한 점탄성 감쇠기의 활용)

김진구;홍성일;이경아;이동근
- Journal of the Earthquake Engineering Society of Korea / v.4 no.1 / pp.77-88 / 2000
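The complex mode superposition method can accurately predict the dynamic behavior of a non-proportionally damped system fitted with viscoelastic dampers, but for high-rise buildings with many degrees of freedom the eigenvalue analysis and the mode superposition require considerable time and effort. In this paper, the rigid diaphragm assumption and a matrix condensation technique are applied for efficient modeling, and a complex-mode response participation factor is proposed for selecting the dominant modes that govern the vibration of the structure, which improves the efficiency of the complex mode superposition method. In addition, the response spectrum is reconstructed to account for damping in the non-proportionally damped system, the selected dominant modes are superposed, and dampers are installed where the maximum inter-story drift occurs. In this procedure, dampers are installed successively while only the eigenvalue analysis is repeated for the damper-equipped structure, until the maximum inter-story drift reaches a satisfactory level. Analysis of example structures to check the accuracy and efficiency of the proposed method showed that the analysis time was greatly reduced while the accuracy of the response was maintained.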
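As a rough illustration of the eigenvalue step behind complex mode superposition, the sketch below extracts complex modes of a small non-proportionally damped system from its state-space matrix. It is a minimal sketch, not the procedure of the paper: the function name `complex_modes`, the two-story example, and all numerical values are assumptions chosen for illustration, and neither matrix condensation nor the proposed participation factor is implemented.

```python
import numpy as np

def complex_modes(M, C, K):
    """Return natural frequencies, damping ratios, and complex mode shapes.

    Solves the first-order (state-space) eigenproblem of
    M x'' + C x' + K x = 0, which remains valid when C is not
    proportional to M and K.
    """
    n = M.shape[0]
    Minv = np.linalg.inv(M)
    # State matrix for the state vector [x, x'].
    A = np.block([[np.zeros((n, n)), np.eye(n)],
                  [-Minv @ K, -Minv @ C]])
    eigvals, eigvecs = np.linalg.eig(A)
    # Keep one member of each complex-conjugate pair.
    # Overdamped (purely real) modes are dropped by this filter.
    keep = eigvals.imag > 0
    lam = eigvals[keep]
    order = np.argsort(np.abs(lam))
    lam = lam[order]
    omega = np.abs(lam)          # natural frequencies in rad/s
    zeta = -lam.real / omega     # modal damping ratios
    shapes = eigvecs[:n, keep][:, order]  # displacement part of each mode
    return omega, zeta, shapes

# Hypothetical two-story shear building with a damper in the first story only,
# which makes the damping matrix non-proportional.
M = np.eye(2)
K = np.array([[200.0, -100.0], [-100.0, 100.0]])
C = np.array([[2.0, 0.0], [0.0, 0.0]])
omega, zeta, shapes = complex_modes(M, C, K)
```

Because the modal damping ratios come directly from the eigenvalues, a loop that adds a damper and repeats only this eigenvalue analysis is inexpensive for a small condensed model.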

Magneto-impedance and Magnetic Relaxation in Electrodeposited Cu/Ni₈₀Fe₂₀ Core/Shell Composite Wire (전기도금 된 Cu/Ni₈₀Fe₂₀ 코어/쉘 복합 와이어에서 자기임피던스 및 자기완화)

Yoon, Seok Soo;Cho, Seong Eon;Kim, Dong Young
- Journal of the Korean Magnetics Society / v.25 no.1 / pp.10-15 / 2015
A model for the magneto-impedance of composite wires composed of a highly conductive nonmagnetic metal core and a soft magnetic shell was derived from Maxwell's equations. A Cu (100 μm diameter)/Ni₈₀Fe₂₀ (15 μm thickness) core/shell composite wire was fabricated by electrodeposition. The impedance spectra of the Cu/Ni₈₀Fe₂₀ core/shell composite wire were measured in the frequency range of 10 kHz to 10 MHz under a longitudinal dc magnetic field of 0 Oe to 200 Oe. The spectra of the complex permeability in the circumferential direction were extracted from the impedance spectra using the derived model. The extracted complex permeability spectra showed relaxation-type dispersion that is well fitted by the Debye equation with a single relaxation frequency. Analysis of the magnetic field dependence of the complex permeability spectra verified that the composite wire has magnetic anisotropy in the longitudinal direction and that the single relaxation process originates from magnetization rotation in the circumferential direction.
https://doi.org/10.4283/JKMS.2015.25.1.010 KSCI
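For readers unfamiliar with the Debye form mentioned in this abstract, a single-relaxation spectrum can be written as μ(f) = μ∞ + (μs − μ∞)/(1 + j f/f_r), with μ = μ′ − jμ″. The sketch below evaluates it over the measured frequency range. The function name and the parameter values (static permeability 1000, relaxation frequency 1 MHz) are assumptions for illustration only, not values reported in the paper.

```python
import numpy as np

def debye_permeability(freq_hz, mu_static, f_relax_hz, mu_inf=1.0):
    """Single-relaxation Debye spectrum, returned as mu' - j*mu''."""
    return mu_inf + (mu_static - mu_inf) / (1.0 + 1j * freq_hz / f_relax_hz)

# Frequency range quoted in the abstract: 10 kHz to 10 MHz.
freqs = np.logspace(4, 7, 301)

# Hypothetical parameters, chosen only to show the shape of the curve.
mu = debye_permeability(freqs, mu_static=1000.0, f_relax_hz=1.0e6)
mu_real = mu.real     # mu' falls from mu_static toward mu_inf
mu_loss = -mu.imag    # mu'' peaks at the relaxation frequency
f_peak = freqs[np.argmax(mu_loss)]
```

In a fit to measured data, `mu_static` and `f_relax_hz` would be the free parameters; the loss peak marks the relaxation frequency.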

A study on loss combination in time and frequency for effective speech enhancement based on complex-valued spectrum (효과적인 복소 스펙트럼 기반 음성 향상을 위한 시간과 주파수 영역 손실함수 조합에 관한 연구)

Jung, Jaehee;Kim, Wooil
- The Journal of the Acoustical Society of Korea / v.41 no.1 / pp.38-44 / 2022
Speech enhancement is performed to improve the intelligibility and quality of noise-corrupted speech. In this paper, speech enhancement performance is compared across different loss functions in the time and frequency domains. The study proposes a combination of loss functions that exploits the advantages of each domain by considering both the details of the spectrum and the speech waveform. Scale-Invariant Source-to-Noise Ratio (SI-SNR) is used as the time-domain loss function, and Mean Squared Error (MSE), calculated over the complex-valued spectrum and the magnitude spectrum, is used in the frequency domain. The phase loss is obtained using the sine function. Speech enhancement results are evaluated using Source-to-Distortion Ratio (SDR), Perceptual Evaluation of Speech Quality (PESQ), and Short-Time Objective Intelligibility (STOI). To confirm the results, the resulting spectrograms are also compared. Experiments on the TIMIT database show the highest performance when a combination of the SI-SNR and magnitude loss functions is used.
https://doi.org/10.7776/ASK.2022.41.1.038 KSCI
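The combination reported as best in this abstract, a time-domain SI-SNR term plus a magnitude-spectrum MSE term, can be sketched as follows. This is a minimal NumPy sketch under stated assumptions: the STFT settings, the weighting factor `alpha`, and all function names are hypothetical, and the paper's exact formulation (including its phase loss) is not reproduced.

```python
import numpy as np

def si_snr(estimate, target, eps=1e-8):
    """Scale-invariant SNR in dB between two 1-D waveforms."""
    estimate = estimate - estimate.mean()
    target = target - target.mean()
    # Project the estimate onto the target to obtain the scaled target part.
    scale = np.dot(estimate, target) / (np.dot(target, target) + eps)
    s_target = scale * target
    e_noise = estimate - s_target
    ratio = (np.dot(s_target, s_target) + eps) / (np.dot(e_noise, e_noise) + eps)
    return 10.0 * np.log10(ratio)

def stft(wave, n_fft=256, hop=128):
    """Very small Hann-windowed STFT (assumed settings, no padding)."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(wave) - n_fft) // hop
    frames = np.stack([wave[i * hop:i * hop + n_fft] * window
                       for i in range(n_frames)])
    return np.fft.rfft(frames, axis=1)

def magnitude_mse(est_wave, ref_wave):
    """MSE between the magnitude spectra of two waveforms."""
    diff = np.abs(stft(est_wave)) - np.abs(stft(ref_wave))
    return float(np.mean(diff ** 2))

def combined_loss(est_wave, ref_wave, alpha=1.0):
    """Negative SI-SNR plus alpha-weighted magnitude MSE (alpha is assumed)."""
    return -si_snr(est_wave, ref_wave) + alpha * magnitude_mse(est_wave, ref_wave)
```

SI-SNR is negated because a higher ratio is better while a loss is minimized. In training, the same arithmetic would be written in an automatic-differentiation framework rather than NumPy.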

Determination of the complex refractive index and thickness of MNA/PMMA thin film (MNA/PMMA 고분자박막의 복소굴절율 및 두께결정)

김상열
- Korean Journal of Optics and Photonics / v.7 no.4 / pp.357-362 / 1996
The thickness and the complex refractive index spectrum, in the region 1.5 eV to 4.5 eV, of an MNA/PMMA thin film fabricated by spin casting are determined. The film thickness and the refractive index in the transparent region are calculated by modeling the spectroscopic ellipsometry data. The extinction coefficient spectrum is obtained from the absorption spectrum in the non-transparent region. The best-fit oscillator parameters of the classical Lorentz oscillator and of a quantum mechanical oscillator are found, and the complex refractive index spectra given by these oscillators are compared. The present technique can be applied to obtain the thickness and the complex refractive index of unknown polymer films, and thus will be useful in the optical characterization of such films.
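As an illustration of the classical Lorentz oscillator mentioned above, the dielectric function of a single oscillator can be written as ε(E) = ε∞ + A/(E₀² − E² − iΓE), with the complex refractive index n + ik = √ε. The sketch below evaluates it over the 1.5 eV to 4.5 eV range of the abstract. All parameter values and the function name are assumptions for illustration, not the fitted values from the paper.

```python
import numpy as np

def lorentz_index(energy_ev, eps_inf, amplitude, e0, gamma):
    """Complex refractive index n + ik of a single Lorentz oscillator.

    energy_ev : photon energy in eV
    eps_inf   : high-frequency dielectric constant
    amplitude : oscillator strength in eV^2
    e0, gamma : resonance energy and broadening in eV
    """
    eps = eps_inf + amplitude / (e0 ** 2 - energy_ev ** 2 - 1j * gamma * energy_ev)
    # The principal square root gives n > 0 and k >= 0 for an absorbing medium.
    return np.sqrt(eps)

# Photon energy range quoted in the abstract.
energy = np.linspace(1.5, 4.5, 301)

# Hypothetical oscillator parameters.
index = lorentz_index(energy, eps_inf=2.2, amplitude=2.0, e0=3.5, gamma=0.3)
n = index.real   # refractive index
k = index.imag   # extinction coefficient
```

Far below the resonance `k` is small and the film behaves as transparent, which is the region where thickness and refractive index are fitted first.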

Complex refractive index of PECVD grown DLC thin films and density variation versus growth condition (PECVD 방법으로 성장시킨 DLC 박막의 복소굴절율 및 성장조건에 따른 박막상수 변화)

김상준;방현용;김상열;김성화;이상현;김성영
- Korean Journal of Optics and Photonics / v.8 no.4 / pp.277-282 / 1997
The complex refractive index of Diamond-like Carbon (DLC) thin films, which can be applied to optical or electrical devices, has been determined using optical methods. DLC thin films are grown on Si(100) substrates and on vitreous silica substrates using plasma enhanced chemical vapor deposition (PECVD). The spectroscopic ellipsometry data (ψ, Δ) and the transmission spectra of these DLC films are obtained. These optical spectra are analyzed with the help of the Sellmeier dispersion relation and a quantum mechanically derived dispersion relation. Using the spectroscopic ellipsometry data in the transparent region, the refractive index and the effective thickness of the DLC films on vitreous silica are model-calculated. The transmission spectra are then inverted to yield the extinction coefficient spectra k(λ) in the absorbing region. These spectra are fitted to the quantum mechanical dispersion relation, and the best-fit dispersion constants are determined. The complex refractive indices are easily calculated from these constants. The spectroscopic ellipsometry data in the absorbing region are model-calculated to give the packing densities and the degrees of surface microroughness of the DLC films. The results are discussed in relation to the growth conditions of the DLC films.

A study on skip-connection with time-frequency self-attention for improving speech enhancement based on complex-valued spectrum (복소 스펙트럼 기반 음성 향상의 성능 향상을 위한 time-frequency self-attention 기반 skip-connection 기법 연구)

Jaehee Jung;Wooil Kim
- The Journal of the Acoustical Society of Korea / v.42 no.2 / pp.94-101 / 2023
A deep neural network composed of encoders and decoders, such as the U-Net used for speech enhancement, concatenates encoder features to the decoder through skip-connections. A skip-connection helps reconstruct the enhanced spectrum and complements lost information, but the encoder and decoder features it connects are incompatible with each other. In this paper, for complex-valued spectrum-based speech enhancement, a Self-Attention (SA) method is applied to the skip-connection to transform the encoder features so that they are compatible with the decoder features. SA is a technique that, when generating an output sequence in a sequence-to-sequence task, uses a weighted average of the input to put attention on subsets of the input; applied to speech enhancement, it has been shown to eliminate noise effectively. Three models that use encoder and decoder features to apply SA to the skip-connection are studied. In experiments on the TIMIT database, the proposed methods improve on all evaluation metrics compared to the Deep Complex U-Net (DCUNET) with skip-connection only.
https://doi.org/10.7776/ASK.2023.42.2.094
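The "weighted average of the input" that this abstract attributes to self-attention can be sketched with the standard scaled dot-product form. This is a generic, real-valued, single-head sketch under stated assumptions; it is not the time-frequency self-attention of the paper, and the projection matrices and shapes are hypothetical.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax along one axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X, Wq, Wk, Wv):
    """Scaled dot-product self-attention over the rows of X.

    X          : (steps, features) input sequence
    Wq, Wk, Wv : (features, dim) projection matrices (hypothetical here)
    Returns the attended output and the attention weights.
    """
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[-1])
    weights = softmax(scores, axis=-1)   # each row sums to 1
    return weights @ V, weights          # output rows are weighted averages of V

rng = np.random.default_rng(0)
X = rng.standard_normal((5, 4))          # e.g. 5 time frames, 4 features
Wq, Wk, Wv = (rng.standard_normal((4, 3)) for _ in range(3))
out, attn = self_attention(X, Wq, Wk, Wv)
```

In a skip-connection setting, one plausible variant would take queries from decoder features and keys and values from encoder features; whether that matches any of the three models studied is not stated in the abstract.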

Complex nested U-Net-based speech enhancement model using a dual-branch decoder (이중 분기 디코더를 사용하는 복소 중첩 U-Net 기반 음성 향상 모델)

Seorim Hwang;Sung Wook Park;Youngcheol Park
- The Journal of the Acoustical Society of Korea / v.43 no.2 / pp.253-259 / 2024
This paper proposes a new speech enhancement model based on a complex nested U-Net with a dual-branch decoder. The proposed model uses a complex nested U-Net to estimate the magnitude and phase components of the speech signal simultaneously, and its decoder has a dual-branch structure in which one branch performs spectral mapping and the other performs time-frequency masking. Compared to a single-branch decoder, the dual-branch decoder removes noise effectively while minimizing the loss of speech information. Experiments were conducted on the VoiceBank + DEMAND database, which is commonly used for training speech enhancement models, and were evaluated with various objective metrics. The complex nested U-Net-based speech enhancement model with a dual-branch decoder increased the Perceptual Evaluation of Speech Quality (PESQ) score by about 0.13 over the baseline and achieved higher objective scores than recently proposed speech enhancement models.
https://doi.org/10.7776/ASK.2024.43.2.253

Performance comparison evaluation of real and complex networks for deep neural network-based speech enhancement in the frequency domain (주파수 영역 심층 신경망 기반 음성 향상을 위한 실수 네트워크와 복소 네트워크 성능 비교 평가)

Hwang, Seo-Rim;Park, Sung Wook;Park, Youngcheol
- The Journal of the Acoustical Society of Korea / v.41 no.1 / pp.30-37 / 2022
This paper compares and evaluates model performance from two perspectives, the learning target and the network structure, for training Deep Neural Network (DNN)-based speech enhancement models in the frequency domain. Spectrum mapping and Time-Frequency (T-F) masking were used as learning targets, and a real network and a complex network were used as network structures. The performance of the speech enhancement models was evaluated with two objective metrics, Perceptual Evaluation of Speech Quality (PESQ) and Short-Time Objective Intelligibility (STOI), as a function of dataset scale. The test results show that the appropriate size of the training data differs depending on the type of network and the type of dataset. They also show that, when the total number of parameters is considered, a real network may be the more realistic solution in some cases, because it performs relatively better than the complex network depending on the size of the data and the learning target.
https://doi.org/10.7776/ASK.2022.41.1.030 KSCI

Refractive index change of nonlinear polymer thin films induced by corona poling and quantitative evaluation of poling effect (코로나 극성배향이 비선형 고분자박막의 복소굴절율에 미치는 영향 및 배향효과의 정량화)

길현옥;김상준;방현용;김상열
- Korean Journal of Optics and Photonics / v.10 no.3 / pp.181-187 / 1999
We prepared side-chain type nonlinear optical NPP (N-(6-nitrophenyl)-(L)-prolinol) polymer films by spin coating. Ellipsometric spectra were collected in situ with a spectroscopic phase-modulated ellipsometer while the NPP polymer films were being corona poled at a temperature above the glass transition. We calculated the film thickness and the refractive index dispersion by modeling the spectro-ellipsometry data in the transparent region. We also calculated the refractive index and the extinction coefficient of the polymer films by numerically inverting the spectro-ellipsometry data in the absorbing region, using the previously determined film thickness. The extinction coefficient spectra determined independently from the transmission spectra were compared with those from spectro-ellipsometry, and the two showed excellent agreement. From the analysis of the change in the complex refractive index of the NPP polymer thin films induced by corona poling, we could determine the vertical and the horizontal complex refractive indices separately. Using the volume fraction of the vertical component f⊥, the degree of poling of the poled NPP polymer films was addressed quantitatively. The present method can be used to quantify the degree of poling in an absolute manner and to depth-profile the poled fraction of thick polymer films. It will be useful for understanding the structural change of polymer films, and hence the poling mechanism, during the poling process.
