Search | Korea Science

A Hardware Implementation of Support Vector Machines for Speaker Verification System (에스 브이 엠을 이용한 화자인증 알고리즘의 하드웨어 구현 연구)

최우용;황병희;이경희;반성범;정용화;정상화
- Journal of the Institute of Electronics Engineers of Korea SP / v.41 no.3 / pp.175-182 / 2004
There is growing interest in speaker verification, which verifies a person's identity by his or her voice. Many speaker verification algorithms exist, such as HMM and DTW. However, these algorithms cannot be applied to memory-limited applications because they need a large number of feature vectors to register or verify users. In this paper, we introduce a speaker verification system using SVM, which needs little memory and computation time. We also propose a hardware architecture for SVM. Experiments were conducted with a Korean database consisting of four-digit strings. Although the error rate of SVM is slightly higher than that of HMM, SVM requires much less computation time and a smaller model size.
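The memory argument can be illustrated with the standard kernel SVM decision function, in which only the support vectors, one weight per support vector, and a bias need to be stored. The following is a minimal sketch under stated assumptions: the RBF kernel, the `gamma` value, the threshold, and the toy model are all hypothetical, since the abstract does not specify the kernel, the features, or the hardware design.

```python
import math

def rbf_kernel(x, z, gamma=0.5):
    """Gaussian (RBF) kernel between two equal-length feature vectors."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, z)))

def svm_score(x, support_vectors, weights, bias, gamma=0.5):
    """Standard SVM decision value: sum_i w_i * K(sv_i, x) + bias,
    where w_i = alpha_i * y_i is the only value stored per support vector."""
    return sum(w * rbf_kernel(sv, x, gamma)
               for sv, w in zip(support_vectors, weights)) + bias

def verify(x, model, threshold=0.0):
    """Accept the claimed speaker when the decision value exceeds the threshold."""
    support_vectors, weights, bias = model
    return svm_score(x, support_vectors, weights, bias) > threshold

# Hypothetical toy model: one "target speaker" and one "impostor" support vector.
model = ([[0.0, 0.0], [2.0, 2.0]], [1.0, -1.0], 0.0)
print(verify([0.1, 0.0], model))  # input close to the target support vector
print(verify([2.0, 1.9], model))  # input close to the impostor support vector
```

The stored model grows with the number of support vectors rather than with the number of enrollment frames, which is the property the abstract relies on.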

Research on data augmentation algorithm for time series based on deep learning

Shiyu Liu;Hongyan Qiao;Lianhong Yuan;Yuan Yuan;Jun Liu
- KSII Transactions on Internet and Information Systems (TIIS) / v.17 no.6 / pp.1530-1544 / 2023
Data monitoring is an important foundation of modern science. In most cases, monitoring data is time-series data, which has high application value. Deep learning algorithms have a strong nonlinear fitting capability, which enables them to recognize time series by capturing anomalous information in them. Research on deep-learning-based time series recognition is therefore especially important for data monitoring. Deep learning algorithms require a large amount of training data. However, abnormal samples are rare in time series, and the resulting class imbalance can seriously degrade the accuracy of recognition algorithms. To increase the number of abnormal samples, a data augmentation method called GANBATS (GAN-based Bi-LSTM and Attention for Time Series) is proposed. In GANBATS, a Bi-LSTM extracts the timing features and passes them to the generator network. GANBATS also modifies the discriminator network by adding an attention mechanism to achieve global attention over the time series. At the end of the discriminator, GANBATS adds an average-pooling layer, which merges temporal features to improve operational efficiency. In this paper, four time series datasets and five data augmentation algorithms are used for comparison experiments. The generated data are measured by PRD (Percent Root Mean Square Difference) and DTW (Dynamic Time Warping). The experimental results show that GANBATS reduces the PRD metric by up to 26.22 and the DTW metric by up to 9.45. In addition, this paper uses the different algorithms to reconstruct the datasets and compares them by classification accuracy. The classification accuracy is improved by 6.44%-12.96% on the four time series datasets.
https://doi.org/10.3837/tiis.2023.06.002
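For readers unfamiliar with the PRD metric named in the abstract, a minimal sketch follows. It assumes the commonly used definition, 100 times the square root of the squared-error energy divided by the reference energy; the abstract does not state the exact formula the authors used, and the function name `prd` is illustrative.

```python
import math

def prd(reference, generated):
    """Percent Root Mean Square Difference between two equal-length series.

    Assumed (common) definition:
        PRD = 100 * sqrt( sum((x - y)^2) / sum(x^2) )
    where x is the reference series and y is the generated series.
    """
    if len(reference) != len(generated):
        raise ValueError("series must have the same length")
    energy = sum(x * x for x in reference)
    if energy == 0:
        raise ValueError("reference series has zero energy")
    error = sum((x - y) ** 2 for x, y in zip(reference, generated))
    return 100.0 * math.sqrt(error / energy)

print(prd([1.0, 1.0, 1.0, 1.0], [1.0, 1.0, 1.0, 0.0]))  # one sample of four is wrong
```

A lower PRD means the generated series stays closer to the reference, so a reduction in PRD indicates more faithful generated data.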

A Novel Query-by-Singing/Humming Method by Estimating Matching Positions Based on Multi-layered Perceptron

Pham, Tuyen Danh;Nam, Gi Pyo;Shin, Kwang Yong;Park, Kang Ryoung
- KSII Transactions on Internet and Information Systems (TIIS) / v.7 no.7 / pp.1657-1670 / 2013
The growing number of music files on smartphones and MP3 players makes it difficult to find the music people want. Query-by-Singing/Humming (QbSH) systems have therefore been developed to retrieve music from a user's humming or singing, without requiring detailed information such as the title or singer of the song. Most previous research on QbSH has used musical instrument digital interface (MIDI) files as reference songs. However, producing MIDI files is time-consuming, and more and more music is being published as the music market develops. Consequently, using the more common MPEG-1 audio layer 3 (MP3) files as reference songs is considered an alternative. There is little previous research on QbSH with MP3 files, however, because the waveform of an MP3 file differs from that of a humming/singing query due to background music and multiple (polyphonic) melodies. To overcome these problems, we propose a new QbSH method using MP3 files on a mobile device. This research is novel in four ways. First, it is the first research on QbSH using MP3 files as reference songs. Second, the start and end positions on the MP3 file to be matched are estimated with a multi-layered perceptron (MLP) before matching against the humming/singing query file. Third, for more accurate results, four MLPs are used, which produce the start and end positions for the dynamic time warping (DTW) matching algorithm and for the chroma-based DTW algorithm, respectively. Fourth, the two matching scores from the DTW and chroma-based DTW algorithms are combined using the PRODUCT rule, which yields higher matching accuracy. Experimental results with the AFA MP3 database show that the accuracy of the proposed method (Top 1 accuracy of 98%, with an MRR of 0.989) is much higher than that of other methods. We also show the effectiveness of the proposed system on a consumer mobile device.
https://doi.org/10.3837/tiis.2013.07.008
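The score fusion and the MRR figure quoted in the abstract can be illustrated with a small sketch. It assumes that both matchers output distance-like scores where smaller is better and that the scores are already on comparable scales; the abstract does not describe the normalization, and the song names and score values below are invented for illustration.

```python
def product_fusion(scores_a, scores_b):
    """PRODUCT rule: multiply the two scores obtained for each reference song."""
    return {song: scores_a[song] * scores_b[song] for song in scores_a}

def rank_songs(fused):
    """Songs sorted best-first, assuming a smaller fused distance is better."""
    return sorted(fused, key=fused.get)

def mean_reciprocal_rank(rankings, truths):
    """Average of 1/rank of the correct song over all queries (rank starts at 1)."""
    total = sum(1.0 / (ranking.index(truth) + 1)
                for ranking, truth in zip(rankings, truths))
    return total / len(truths)

# Hypothetical distances for one query against three reference songs.
dtw_scores = {"song_a": 0.2, "song_b": 0.5, "song_c": 0.4}
chroma_scores = {"song_a": 0.6, "song_b": 0.1, "song_c": 0.5}

ranking = rank_songs(product_fusion(dtw_scores, chroma_scores))
print(ranking)
```

Top 1 accuracy counts only the queries whose correct song is ranked first, while MRR also gives partial credit when the correct song appears lower in the list.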

Segmental Analysis of Curved Non-Prismatic Prestressed Concrete Box Girder Bridges (시공단계를 고려환 곡선변단면 프리스트레스트 콘크리트 박스거더교량의 해석)

Park, Chan Min;Kang, Young Jin
- KSCE Journal of Civil and Environmental Engineering Research / v.14 no.1 / pp.71-81 / 1994
A method is presented for the analysis of curved segmentally erected prestressed concrete box girder bridges including time-dependent effects due to load history, temperature history, creep, shrinkage, aging of concrete and relaxation of prestressing steel. The segments can be either precast or cast-in-place. Thin-walled beam theory and finite element method are combined to develop a curved nonprismatic thin-walled box beam element. The element consists of three nodes and each node has eight displacement degrees of freedom, including transverse distortion and longitudinal warping of the cross section.

Implementation of Sound Recognition for Security Camera (보안카메라에서 소리인식 구현)

Yun, Tae-In;Ku, Ha-Neul;Kim, Do-Eun;Jang, Won-Serk;Kwon, Soon-Kak;Kwon, Oh-Jun
- Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2012.05a / pp.491-493 / 2012
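Sound recognition is the process of taking in all the sounds we hear, comparing the values of the sound with stored data values, and deriving a recognition result. Although security cameras are now installed in many places, security blind spots still exist, and covering them requires installing a great many additional cameras to film multiple directions. This raises installation costs, and the sheer number of cameras places a psychological burden on people. This paper designs and implements a system that installs a microphone on a security camera, recognizes the incoming sound, and judges the situation that is occurring. On this basis, the blind spots of security cameras can be addressed through sound recognition, which reduces the cost of installing additional cameras.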

Implementation of Speaker Verification Security System Using DSP Processor(TMS320C32) (DSP Processor(TMS320C32)를 이용한 화자인증 보안시스템의 구현)

Haam, Young-Jun;Kwon, Hyuk-Jae;Choi, Soo-Young;Jeong, Ik-Joo
- Journal of Industrial Technology / v.21 no.B / pp.107-116 / 2001
Speech carries various kinds of information when a person communicates with others: linguistic content, speaker identity, emotion, health condition, the utterance environment, and so on. The technologies that process this speech for use in real life are collectively called speech technology. Among them is the technology that uses the speaker information contained in speech, which is known as speaker recognition. DTW (Dynamic Time Warping) is a speaker recognition technique that uses dynamic programming to measure the degree of similarity between a reference speech pattern and an input speech signal. In this study, we implement DTW on a TMS320C32 DSP processor and build a security system with it.
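The dynamic-programming idea behind DTW can be sketched as follows. This is the textbook recurrence on one-dimensional sequences with an absolute-difference local cost, not the authors' DSP implementation; real systems compare sequences of feature vectors, and the acceptance threshold shown here is a hypothetical parameter.

```python
def dtw_distance(a, b):
    """Textbook DTW: minimum accumulated cost of aligning sequence a to sequence b."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = best accumulated cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of: insertion, deletion, or a diagonal match.
            cost[i][j] = local + min(cost[i - 1][j],
                                     cost[i][j - 1],
                                     cost[i - 1][j - 1])
    return cost[n][m]

def accept(utterance, reference, threshold):
    """Accept the speaker when the warped distance is within a (hypothetical) threshold."""
    return dtw_distance(utterance, reference) <= threshold

# The same contour spoken more slowly still aligns with zero cost.
print(dtw_distance([1, 2, 3], [1, 2, 2, 3]))
```

The table has (n+1) by (m+1) entries, so memory and time both grow with the product of the two sequence lengths, which matters on a small DSP.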

Study on the pronunciation correction in English Learning (영어 학습 시의 발성 교정 기술에 관한 연구)

Kim Jae-Min;Beack Seung-Kwon;Hahn Minsoo
- Proceedings of the Acoustical Society of Korea Conference / spring / pp.119-122 / 2000
In this paper, we implement an elementary system to correct the accent, pronunciation, and intonation of English spoken by non-native speakers. For accent evaluation, energy and pitch information are used to find stressed syllables, and we then extract the segment information of input patterns using a dynamic time warping method to discriminate and evaluate the accent position. For pronunciation evaluation, we use the segment information obtained with the same algorithm as in accent evaluation and calculate a spectral distance measure for each phoneme between input and reference. For intonation evaluation, we propose nine slope patterns to estimate the pitch contour, and then grade test sentences by the accumulated error obtained from the distance measure and the estimated slope. Our results show that 98 percent of accent evaluations and 71 percent of pronunciation evaluations agree with the perceptual measure. In the intonation evaluation, the system produces a grade ordering similar to that of perceptual evaluation for four sentences with different intonation patterns.

Noise Reduction and Estimating the Similarity of Ambulatory ECG Signals (이동형 심전도 신호의 잡음 제거 및 유사도 평가)

Shin, Seung-Won;Lee, Jeong-Whan;Lee, Kang-Hwi;Kim, Dong-Jun;Kim, Kyeong-Seop
- The Transactions of The Korean Institute of Electrical Engineers / v.57 no.3 / pp.507-513 / 2008
In this study, we develop an ambulatory ECG acquisition system by implementing a patch-style wireless electrode. To alleviate the inherently noisy characteristics of the mobile signal, we apply a matched filter and concurrently detect R-peak values. Moreover, a shape-distance measure is computed to estimate the relative similarity between two ECG signals and to decide whether abnormal characteristics exist in the ECG.
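The matched-filter step can be sketched as a sliding correlation between the signal and a template of the expected R-wave shape, followed by thresholded peak picking. This is an illustrative sketch only: the template, the threshold, and the toy signal are assumptions, and the abstract does not describe the authors' filter design or detection rule.

```python
def matched_filter(signal, template):
    """Correlate the template with the signal at every valid offset."""
    k = len(template)
    return [sum(signal[i + j] * template[j] for j in range(k))
            for i in range(len(signal) - k + 1)]

def detect_peaks(response, threshold):
    """Offsets where the filter response is a local maximum above the threshold."""
    peaks = []
    for i in range(1, len(response) - 1):
        if (response[i] > threshold
                and response[i] >= response[i - 1]
                and response[i] > response[i + 1]):
            peaks.append(i)
    return peaks

# Hypothetical zero-mean spike template and a toy signal with two "R peaks".
template = [-1, 2, -1]
signal = [0, 0, 0, 5, 0, 0, 0, 0, 5, 0, 0]

offsets = detect_peaks(matched_filter(signal, template), threshold=5)
# Each offset marks where the template starts; its centre is one sample later.
r_peak_samples = [i + len(template) // 2 for i in offsets]
print(r_peak_samples)
```

Because the template sums to zero, slow baseline drift contributes little to the response, while sharp deflections shaped like the template produce large values.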

Effect On-line Automatic Signature Verification by Improved DTW (개선된 DTW를 통한 효과적인 서명인식 시스템의 제안)

Dong-uk Cho;Gun-hee Han
- Journal of the Korea Academia-Industrial cooperation Society / v.4 no.2 / pp.87-95 / 2003
Dynamic Programming Matching (DPM) is a mathematical optimization technique for sequentially structured problems, which has, over the years, played a major role in providing primary algorithms in pattern recognition fields. Most practical applications of this method in signature verification have been based on the practical implementation proposed by Sakoe and Chiba [9], and it is usually applied with slope constraint p = 0. We found that, in this case, a modified version of DPM using a heuristic (forward-seeking) implementation is more efficient, offering significantly reduced processing complexity as well as slightly improved verification performance.

Genetic Algorithm for Speaker Adaptation in Speech Recognition (유전자 알고리듬을 이용한 화자 적응적 음성인식)

임동철
- Proceedings of the Acoustical Society of Korea Conference / 1998.06c / pp.107-110 / 1998
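This paper presents a method that uses a GA (Genetic Algorithm) to generate better-adapted vector sequences for the reference patterns used in DTW (Dynamic Time Warping)-based speech recognition. The need for this work is as follows. DTW is used as one of the main engines of speech recognition [1]. DTW finds the optimal path between the reference patterns and the test patterns in order to identify the most similar pattern. However, even for the same pronunciation, speech appears in varied patterns depending on the length of the utterance, the condition of the speaker's throat, and so on, and the same word from the same speaker also changes with time and environment. A method that adapts to these dynamic characteristics of speech is therefore needed. As a solution to this problem, this paper develops a method that uses a GA to generate more suitable and adaptive reference patterns.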

Search Result 292, Processing Time 0.02 seconds
