Search | Korea Science

SVM Based Speaker Verification Using Sparse Maximum A Posteriori Adaptation

Kim, Younggwan;Roh, Jaeyoung;Kim, Hoirin
- IEIE Transactions on Smart Processing and Computing / v.2 no.5 / pp.277-281 / 2013
Modern speaker verification systems based on support vector machines (SVMs) use Gaussian mixture model (GMM) supervectors as their input feature vectors, and maximum a posteriori (MAP) adaptation is the conventional method for generating speaker-dependent GMMs by adapting a universal background model (UBM). MAP adaptation requires a sufficient amount of input speech because of the number of model parameters to be estimated. With limited utterances, MAP adaptation becomes unreliable and introduces adaptation noise, even though the Bayesian priors used in MAP adaptation smooth the movements between the UBM and the speaker-dependent GMMs. This paper proposes a sparse MAP adaptation method, which is known to perform well in automatic speech recognition. Introducing sparse MAP adaptation into the GMM-SVM-based speaker verification system mitigates the adaptation noise effectively. The proposed method uses the L0 norm as a regularizer to induce sparsity. Experimental results on the TIMIT database showed that the sparse MAP-based GMM-SVM speaker verification system yields a 42.6% relative reduction in the equal error rate with few additional computations.
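As a rough illustration of the conventional step this abstract builds on, the sketch below applies relevance-factor MAP adaptation to the means of a diagonal-covariance UBM and stacks the adapted means into a supervector. The optional sparsification uses hard thresholding of the mean offsets, which is one simple way to approximate an L0 penalty; it is an assumption made for illustration and not the optimization described in the paper. The function names, the relevance factor of 16, and the threshold are hypothetical.

```python
import math

def map_adapt_means(ubm_weights, ubm_means, ubm_vars, frames, relevance=16.0):
    """Relevance-factor MAP adaptation of GMM means (diagonal covariances)."""
    K, D = len(ubm_means), len(ubm_means[0])
    n = [0.0] * K                       # soft frame counts per component
    s = [[0.0] * D for _ in range(K)]   # posterior-weighted sums of frames
    for x in frames:
        # log(weight * Gaussian density) for every component
        logp = []
        for k in range(K):
            ll = math.log(ubm_weights[k])
            for d in range(D):
                v = ubm_vars[k][d]
                ll -= 0.5 * (math.log(2 * math.pi * v)
                             + (x[d] - ubm_means[k][d]) ** 2 / v)
            logp.append(ll)
        top = max(logp)
        post = [math.exp(l - top) for l in logp]
        z = sum(post)
        for k in range(K):
            g = post[k] / z             # component posterior for this frame
            n[k] += g
            for d in range(D):
                s[k][d] += g * x[d]
    adapted = []
    for k in range(K):
        if n[k] == 0.0:
            adapted.append(list(ubm_means[k]))  # no data: keep the prior mean
            continue
        alpha = n[k] / (n[k] + relevance)       # data-dependent mixing weight
        adapted.append([alpha * s[k][d] / n[k] + (1 - alpha) * ubm_means[k][d]
                        for d in range(D)])
    return adapted

def sparsify_offsets(ubm_means, adapted_means, threshold):
    """Illustrative assumption: zero out small offsets from the UBM means."""
    return [[m + ((h - m) if abs(h - m) > threshold else 0.0)
             for m, h in zip(mu, mu_hat)]
            for mu, mu_hat in zip(ubm_means, adapted_means)]

def supervector(means):
    """Concatenate component means into a single GMM supervector."""
    return [v for mu in means for v in mu]
```

With few frames, `alpha` stays small and the adapted means remain close to the UBM, which is the smoothing effect of the prior mentioned in the abstract; the thresholding step then discards the small, presumably noisy, offsets that remain.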

Korean Phoneme Recognition Using Self-Organizing Feature Map (SOFM 신경회로망을 이용한 한국어 음소 인식)

Jeon, Yong-Koo;Yang, Jin-Woo;Kim, Soon-Hyob
- The Journal of the Acoustical Society of Korea / v.14 no.2 / pp.101-112 / 1995
Constructing a feature-map-based phoneme classification system for speech recognition usually requires two procedures: clustering and labeling. In this paper, we present a phoneme classification system that uses Kohonen's Self-Organizing Feature Map (SOFM) as both clusterer and labeler. The SOFM is known to perform a self-organizing process that produces an optimal local topographical mapping of the signal space and yields reasonably high accuracy in recognition tasks. Consequently, the SOFM can be applied effectively to the recognition of phonemes. In addition, to improve the performance of the phoneme classification system, we propose a learning algorithm that combines the SOFM with the classical K-means clustering algorithm in the fine-tuning stage. To evaluate the proposed phoneme classification algorithm, we use a total of 43 phonemes to construct six intra-class feature maps for six different phoneme classes. In speaker-dependent phoneme classification tests using these six feature maps, we obtain a recognition rate of $87.2\%$ and confirm that the proposed algorithm is an efficient method for improving recognition performance and convergence speed.
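The two-stage idea (coarse SOFM training followed by K-means fine-tuning of the codebook) can be sketched as follows. This is a minimal illustration on a one-dimensional chain of units with a linearly decaying learning rate and neighborhood radius; the schedule, the map topology, and all names are assumptions, not details taken from the paper.

```python
import random

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest(codebook, x):
    """Index of the codebook vector closest to x."""
    return min(range(len(codebook)), key=lambda i: sq_dist(codebook[i], x))

def train_sofm(data, n_units, epochs=20, lr0=0.5, radius0=2.0, seed=0):
    """Coarse training of a 1-D chain of SOFM units."""
    rng = random.Random(seed)
    codebook = [list(rng.choice(data)) for _ in range(n_units)]
    total, step = epochs * len(data), 0
    for _ in range(epochs):
        for x in rng.sample(data, len(data)):
            frac = 1.0 - step / total            # decays from 1 towards 0
            lr = lr0 * frac
            radius = max(radius0 * frac, 0.5)
            win = nearest(codebook, x)
            for i, w in enumerate(codebook):
                if abs(i - win) <= radius:       # winner and its neighbors
                    for d in range(len(w)):
                        w[d] += lr * (x[d] - w[d])
            step += 1
    return codebook

def kmeans_finetune(data, codebook, iters=10):
    """Fine-tuning stage: Lloyd iterations started from the SOFM codebook."""
    codebook = [list(w) for w in codebook]
    for _ in range(iters):
        groups = [[] for _ in codebook]
        for x in data:
            groups[nearest(codebook, x)].append(x)
        for i, g in enumerate(groups):
            if g:                                # empty units keep their weights
                codebook[i] = [sum(col) / len(g) for col in zip(*g)]
    return codebook

def distortion(data, codebook):
    """Mean squared quantization error of the data under the codebook."""
    return sum(sq_dist(codebook[nearest(codebook, x)], x) for x in data) / len(data)
```

Because each Lloyd iteration cannot increase the quantization error, the fine-tuned codebook is never worse than the SOFM codebook in terms of `distortion`, although the topological ordering of the map is no longer enforced during fine-tuning.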

Query-based Visual Attention Algorithm for Object Recognition of A Mobile Robot (이동로봇의 물체인식을 위한 질의 기반 시각 집중 알고리즘)

Ryu, Gwang-Geun;Lee, Sang-Hoon;Suh, Il-Hong
- Journal of the Institute of Electronics Engineers of Korea SC / v.44 no.1 / pp.50-58 / 2007
In this paper, we propose a query-based visual attention algorithm that helps a vision-based mobile robot find objects effectively. The algorithm extends conventional bottom-up visual attention algorithms. In the proposed algorithm, various conspicuity maps are merged into a saliency map, with weighting values determined by query-dependent object properties. The saliency map is then used to find candidate attention locations for the queried object. To validate the proposed algorithm, several objects are used to compare its performance with that of conventional bottom-up approaches. Here, color is used as an example of a query-dependent object property.
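The merging step described above can be sketched as a weighted sum of normalized conspicuity maps, followed by picking the most salient location. The map names, the min-max normalization, and the weight values are hypothetical; the abstract does not specify how the weights are derived from the query.

```python
def normalize(m):
    """Min-max normalize a 2-D map to the range [0, 1]."""
    lo = min(min(row) for row in m)
    hi = max(max(row) for row in m)
    span = hi - lo
    return [[(v - lo) / span if span > 0 else 0.0 for v in row] for row in m]

def saliency_map(conspicuity, weights):
    """Merge conspicuity maps using query-dependent weights (assumed positive sum)."""
    total = sum(weights.values())
    first = next(iter(conspicuity.values()))
    h, w = len(first), len(first[0])
    out = [[0.0] * w for _ in range(h)]
    for name, cmap in conspicuity.items():
        wgt = weights.get(name, 0.0) / total
        norm = normalize(cmap)
        for i in range(h):
            for j in range(w):
                out[i][j] += wgt * norm[i][j]
    return out

def most_salient(sal):
    """Return the (row, col) of the highest saliency value."""
    cells = [(i, j) for i in range(len(sal)) for j in range(len(sal[0]))]
    return max(cells, key=lambda p: sal[p[0]][p[1]])

# A query for a strongly colored object puts most of the weight on the color map.
maps = {"color": [[0, 1], [0, 0]], "intensity": [[0, 0], [1, 0]]}
focus = most_salient(saliency_map(maps, {"color": 0.8, "intensity": 0.2}))
```

A purely bottom-up system would correspond to fixed, query-independent weights; changing the weights per query is what shifts attention toward the queried object.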

Korean Phoneme Recognition by Combining Self-Organizing Feature Map with K-means clustering algorithm

Jeon, Yong-Ku;Lee, Seong-Kwon;Yang, Jin-Woo;Lee, Hyung-Jun;Kim, Soon-Hyob
- Proceedings of the Acoustical Society of Korea Conference / 1994.06a / pp.1046-1051 / 1994
Because the SOFM is known to create topographically organized maps of the various features of input signals effectively, it can be applied effectively to the recognition of Korean phonemes. However, the SOFM algorithm does not guarantee that the network is sufficiently trained. To solve this problem, we propose a learning algorithm that combines the SOFM with the conventional K-means clustering algorithm in the fine-tuning stage. To evaluate the proposed algorithm, we performed speaker-dependent recognition experiments using six phoneme classes. Comparing the performance of Kohonen's algorithm with that of the proposed algorithm, we show that the proposed algorithm performs better than the conventional SOFM algorithm.

A study on speech enhancement using complex-valued spectrum employing Feature map Dependent attention gate (특징 맵 중요도 기반 어텐션을 적용한 복소 스펙트럼 기반 음성 향상에 관한 연구)

Jaehee Jung;Wooil Kim
- The Journal of the Acoustical Society of Korea / v.42 no.6 / pp.544-551 / 2023
Speech enhancement, which aims to improve the perceptual quality and intelligibility of noisy speech, has progressed from methods that use the magnitude spectrum to methods that use the complex-valued spectrum, which can improve both magnitude and phase. In this paper, we study how to apply an attention mechanism to complex-valued spectrum-based speech enhancement systems to further improve the intelligibility and quality of noisy speech. The attention is based on additive attention and allows the attention weights to be calculated with the complex-valued spectrum taken into account. In addition, global average pooling is used to account for the importance of each feature map. Complex-valued spectrum-based speech enhancement was performed with the Deep Complex U-Net (DCUNET) model, and additive attention was carried out with the proposed method within the Attention U-Net model. Experiments on noisy speech in a living room environment showed that the proposed method improves performance over the baseline model according to evaluation metrics such as Source to Distortion Ratio (SDR), Perceptual Evaluation of Speech Quality (PESQ), and Short-Time Objective Intelligibility (STOI), and that it consistently improves performance across various background noise environments and low Signal-to-Noise Ratio (SNR) conditions. These results demonstrate the effectiveness of the proposed speech enhancement system in improving the intelligibility and quality of noisy speech.
https://doi.org/10.7776/ASK.2023.42.6.544

Defection Detection Analysis Based on Time-Dependent Data

Song, Hee-Seok;Kim, Jae-Kyeong;Chae, Kyung-Hee
- Proceedings of the Korea Intelligent Information System Society Conference / 2002.11a / pp.445-453 / 2002
Past and current customer behavior is the best predictor of future customer behavior. This paper introduces a procedure for personalized defection detection and prevention for an online game site. The basic idea is adopted from the observation that potential defectors tend to take a couple of months or weeks to gradually change their behavior (i.e., trim their usage volume) before their eventual withdrawal. For this purpose, we suggest a SOM (Self-Organizing Map) based procedure to determine the possible states of customer behavior from past behavior data. Based on this representation of the state of behavior, potential defectors are detected by comparing their monitored trajectories of behavior states with frequent and confident trajectories of past defectors. A key feature of this study is a defection prevention procedure that recommends the desirable behavior state for the next period so as to lower the likelihood of defection. The defection prevention procedure can be used to design a marketing campaign on an individual basis because it provides desirable behavior patterns for the next period. The experiments demonstrate that our approach is effective for defection prevention and efficient for defection detection because it predicts potential defectors without deterioration of prediction accuracy compared to that of the MLP (Multi-Layer Perceptron) neural network.

Adaptive Reconstruction of Harmonic Time Series Using Point-Jacobian Iteration MAP Estimation and Dynamic Compositing: Simulation Study

Lee, Sang-Hoon
- Korean Journal of Remote Sensing / v.24 no.1 / pp.79-89 / 2008
Irregular temporal sampling is a common feature of geophysical and biological time series in remote sensing. This study proposes an on-line system for reconstructing observed image series contaminated by noise resulting from mechanical problems or environmental sensing conditions. There is also a high likelihood that, during the data acquisition periods, the target site corresponding to any given pixel is covered by fog or cloud, resulting in bad or missing observations. The surface parameters associated with the land usually depend on the climate, and many physical processes displayed in images sensed from the land therefore exhibit temporal variation with seasonal periodicity. The feedback system proposed in this study reconstructs a sequence of images remotely sensed from a land surface whose physical processes have seasonal periodicity. A harmonic model is used to track seasonal variation through time, and a Gibbs random field (GRF) is used to represent the spatial dependency of digital image processes. The experimental results of this simulation study show the potential of the proposed system to reconstruct image series observed with imperfect sensing technology in environments that are frequently affected by bad weather. This study provides fundamental information on the elements of the proposed system for its proper use in applications.
https://doi.org/10.7780/kjrs.2008.24.1.79

A Study on the Development of Embedded Serial Multi-modal Biometrics Recognition System (임베디드 직렬 다중 생체 인식 시스템 개발에 관한 연구)

Kim, Joeng-Hoon;Kwon, Soon-Ryang
- Journal of the Korean Institute of Intelligent Systems / v.16 no.1 / pp.49-54 / 2006
Recent fingerprint recognition systems have unstable factors, such as the copying of fingerprint patterns and the hacking of fingerprint feature points, which may cause significant system errors. Thus, in this research, we used the fingerprint as the main recognition device and implemented a serial multi-biometric recognition system that also uses speech recognition, which has been widely used recently. In this multi-biometric recognition system, the fingerprint recognition process runs once the speech has been successfully recognized. Among the existing speech recognition algorithms (VQ, DTW, HMM, NN), the speaker-dependent DTW (Dynamic Time Warping) algorithm is used for effective real-time processing, while the KSOM (Kohonen Self-Organizing feature Map) algorithm, an artificial intelligence method, is applied to fingerprint recognition because of its computational load. In experiments, the multi-biometric recognition system implemented in this research showed 2 to $7\%$ lower FRR (False Rejection Ratio) than single recognition systems using either fingerprints or voice, but zero FAR (False Acceptance Ratio), which is the most important factor in a recognition system. Moreover, there is almost no difference in recognition time (1.5 seconds on average) compared with existing single biometric recognition systems; the experiments therefore indicate that the implemented multi-biometric recognition system is a more efficient security system than single recognition systems.
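For readers unfamiliar with DTW, the sketch below computes the classic dynamic-programming alignment cost between two one-dimensional feature sequences and uses it in a serial decision in which the fingerprint stage runs only after the speech stage passes. The threshold, the scalar features, and the `fingerprint_check` callback are hypothetical placeholders; the paper's actual features and matching rules are not given in the abstract.

```python
def dtw_distance(a, b):
    """Dynamic Time Warping cost between two sequences of numbers."""
    inf = float("inf")
    n, m = len(a), len(b)
    # D[i][j] is the best cost of aligning a[:i] with b[:j]
    D = [[inf] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            D[i][j] = cost + min(D[i - 1][j],      # a advances, b repeats
                                 D[i][j - 1],      # b advances, a repeats
                                 D[i - 1][j - 1])  # both advance
    return D[n][m]

def serial_accept(utterance, template, speech_threshold, fingerprint_check):
    """Serial decision: run the fingerprint stage only if speech matches."""
    if dtw_distance(utterance, template) > speech_threshold:
        return False
    return bool(fingerprint_check())
```

A serial arrangement accepts a user only when both stages pass, which tends to lower false acceptances at the cost of some additional rejections, and it skips the second stage entirely when the first one fails.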
https://doi.org/10.5391/JKIIS.2006.16.1.049

A Study on Determinants of Commercial Land Values in Gwangju City (광주시 상업지 지가의 형성요인에 관한 연구)

Lee, Hyun-Wook
- Journal of the Korean association of regional geographers / v.2 no.2 / pp.159-171 / 1996
The aim of this study is to identify which factors affect commercial land values in Gwangju city and how they act on them, using the distribution of commercial land values and multiple regression analysis. The major findings of this study are as follows: (1) From the changes in the distribution of commercial land values in $1989{\sim}1996$, we see that the commercial area of higher land values extends along the main arterial roads. This is related to urbanization in the urban fringe, while commercial land values decline in the city center, which has a long history as a commercial region. The decline is due to poor adaptation to rapid changes in the commercial environment because of fragmented lots, old buildings, traffic congestion, etc. (2) The regions where commercial land values rose greatly are the west, where the newly planned city center of Sangmu-dong was constructed, and the southwest, where the rise is related to the expansion of high-density apartments and the location of big discount stores. (3) From the changes in the commercial land value distribution map, the road map, and the topographical map, we find that commercial land values are related to various factors, namely distance from the CBD, traffic convenience, reputation of the commercial district, road conditions, size of supplementary, degree of commercial land use, etc. (4) From the above related factors, six variables are extracted by operational definition: the spatial distance from the city center, the walking distance to a stop, the road width, the amount of bus traffic, the amount of pedestrian traffic, and the number of shops. (5) Data on the seven variables are collected at the highest-value point of each dong. We apply multiple regression analysis with commercial land value as the dependent variable and the six extracted variables as independent variables. (6) As a result of the multiple regression on the determinants of commercial land values, the variables most strongly related to commercial land values are the amount of pedestrian traffic and the spatial distance from the city center. These two variables explain 65% of the variance in commercial land values. (7) To examine the unexplained 35%, we carry out a residual analysis. As a result, we see small estimates in the downtown area and large estimates in the urban fringe. This feature is due to the simple core structure of Gwangju city and the limits of this regression model.
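The statement that two variables explain 65% of the variance refers to the coefficient of determination of the fitted regression. The sketch below shows how such a fit, its R-squared value, and its residuals are computed by ordinary least squares on the normal equations. The helper names are hypothetical, and any data passed in would be the reader's own; none of the study's data are reproduced here.

```python
def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(n):
            if r != c:
                f = M[r][c] / M[c][c]
                for j in range(c, n + 1):
                    M[r][j] -= f * M[c][j]
    return [M[i][n] / M[i][i] for i in range(n)]

def ols_fit(X, y):
    """Least-squares coefficients [intercept, b1, b2, ...]."""
    rows = [[1.0] + list(x) for x in X]
    p = len(rows[0])
    XtX = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    Xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(p)]
    return solve(XtX, Xty)

def predict(beta, x):
    return beta[0] + sum(b * v for b, v in zip(beta[1:], x))

def r_squared(X, y, beta):
    """Share of the variance of y explained by the fitted model."""
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - predict(beta, x)) ** 2 for x, yi in zip(X, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot

def residuals(X, y, beta):
    """Observed minus predicted values, as used in a residual analysis."""
    return [yi - predict(beta, x) for x, yi in zip(X, y)]
```

An R-squared of 0.65 would correspond to the 65% figure in finding (6), and mapping the output of `residuals` by location is the kind of residual analysis described in finding (7).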

Visible and SWIR Satellite Image Fusion Using Multi-Resolution Transform Method Based on Haze-Guided Weight Map (Haze-Guided Weight Map 기반 다중해상도 변환 기법을 활용한 가시광 및 SWIR 위성영상 융합)

Taehong Kwak;Yongil Kim
- Korean Journal of Remote Sensing / v.39 no.3 / pp.283-295 / 2023
With the development of sensor and satellite technology, numerous high-resolution and multi-spectral satellite images have become available. Due to their wavelength-dependent reflection, transmission, and scattering characteristics, multi-spectral satellite images can provide complementary information for earth observation. In particular, the short-wave infrared (SWIR) band can penetrate certain types of atmospheric aerosols owing to the reduced Rayleigh scattering effect, which allows a clearer view and more detailed information to be captured from hazed surfaces than with the visible band. In this study, we propose a multi-resolution transform-based image fusion method to combine visible and SWIR satellite images. The purpose of the fusion method is to generate a single integrated image that incorporates complementary information, such as detailed background information from the visible band and land cover information in the haze region from the SWIR band. For this purpose, this study applied the Laplacian pyramid-based multi-resolution transform method, which is a representative image decomposition approach for image fusion. Additionally, we modified the multi-resolution fusion method by combining it with a haze-guided weight map based on the prior knowledge that SWIR bands contain more information in pixels from the haze region. The proposed method was validated using very high-resolution satellite images from Worldview-3, containing multi-spectral visible and SWIR bands. Experimental data including hazed areas with limited visibility caused by smoke from wildfires were used to validate the penetration properties of the proposed fusion method. Both quantitative and visual evaluations were conducted using image quality assessment indices. The results showed that the bright features from the SWIR bands in the hazed areas were successfully fused into the integrated feature maps without any loss of detailed information from the visible bands.
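The fusion scheme can be illustrated with a deliberately simplified one-dimensional Laplacian pyramid, in which a per-pixel weight in [0, 1] (standing in for the haze-guided weight map) decides how much of the SWIR signal enters each pyramid level. Pair averaging and sample repetition replace the usual Gaussian filtering, the input length must be divisible by 2 to the power of `levels`, and all names are assumptions for illustration rather than the authors' implementation.

```python
def down(x):
    """Halve the resolution by averaging neighboring pairs."""
    return [(x[i] + x[i + 1]) / 2.0 for i in range(0, len(x), 2)]

def up(x):
    """Double the resolution by repeating each sample."""
    return [v for v in x for _ in range(2)]

def laplacian_pyramid(x, levels):
    """Detail bands from fine to coarse, followed by the coarsest residual."""
    pyr, cur = [], list(x)
    for _ in range(levels):
        low = down(cur)
        pyr.append([c - u for c, u in zip(cur, up(low))])
        cur = low
    pyr.append(cur)
    return pyr

def reconstruct(pyr):
    """Invert laplacian_pyramid by adding the details back, coarse to fine."""
    cur = pyr[-1]
    for lap in reversed(pyr[:-1]):
        cur = [l + u for l, u in zip(lap, up(cur))]
    return cur

def fuse(visible, swir, haze_weight, levels=2):
    """Blend the two pyramids level by level using the downsampled weight map."""
    pv = laplacian_pyramid(visible, levels)
    ps = laplacian_pyramid(swir, levels)
    weights = [list(haze_weight)]
    for _ in range(levels):
        weights.append(down(weights[-1]))
    fused = [[w * s + (1.0 - w) * v for v, s, w in zip(lv, ls, lw)]
             for lv, ls, lw in zip(pv, ps, weights)]
    return reconstruct(fused)
```

Where the weight is 0 the result follows the visible signal, and where it is 1 it follows the SWIR signal; blending per level rather than per pixel in the original image is what keeps the transition between hazy and clear regions smooth.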
https://doi.org/10.7780/kjrs.2023.39.3.3
