Search | Korea Science

A Novel Video Image Text Detection Method

Zhou, Lin;Ping, Xijian;Gao, Haolin;Xu, Sen
- KSII Transactions on Internet and Information Systems (TIIS), v.6 no.4, pp.1140-1152, 2012
A novel, general-purpose method for detecting text in video images is proposed, following a coarse-to-fine strategy. First, the spectral clustering (SC) method is adopted to coarsely detect text regions based on the stationary wavelet transform (SWT). To make full use of the available information, the SC method employs a multi-parameter kernel function that combines feature similarity information with spatial adjacency information. Second, 28-dimensional classification features are proposed, and a support vector machine (SVM) is used to separate text regions from non-text regions. Experimental results on video images show encouraging performance for the proposed algorithm and classification features.
https://doi.org/10.3837/tiis.2012.04.011
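
The abstract above does not give the form of the multi-parameter kernel. As a rough illustration only, one common way to combine feature similarity with spatial adjacency is a product of two Gaussian terms, each with its own scale parameter. The function names, the Gaussian form, and the parameters `sigma_feat` and `sigma_pos` are assumptions made for this sketch, not the authors' definition.

```python
import math


def combined_affinity(feat_a, feat_b, pos_a, pos_b, sigma_feat=1.0, sigma_pos=1.0):
    """Hypothetical affinity: high only if two blocks are similar in features AND close in space."""
    d_feat = sum((x - y) ** 2 for x, y in zip(feat_a, feat_b))  # squared feature distance
    d_pos = sum((x - y) ** 2 for x, y in zip(pos_a, pos_b))     # squared spatial distance
    return math.exp(-d_feat / (2 * sigma_feat ** 2)) * math.exp(-d_pos / (2 * sigma_pos ** 2))


def affinity_matrix(features, positions, sigma_feat=1.0, sigma_pos=1.0):
    """Symmetric affinity matrix that a spectral clustering step could take as input."""
    n = len(features)
    return [
        [
            combined_affinity(features[i], features[j], positions[i], positions[j],
                              sigma_feat, sigma_pos)
            for j in range(n)
        ]
        for i in range(n)
    ]
```

With two separate scales, the weight given to feature similarity and to spatial closeness can be tuned independently, which is one plausible reading of "multi-parameter".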

Development of Electrocardiogram Identification Algorithm using SVM classifier (SVM분류기를 이용한 심전도 개인인식 알고리즘 개발)

Lee, Sang-Joon;Lee, Myoung-Ho
- The Transactions of The Korean Institute of Electrical Engineers, v.60 no.3, pp.654-661, 2011
This paper presents a personal identification algorithm based on the ECG, a topic that a few researchers have studied recently. Previously published algorithms fall into two groups: those that analyze ECG features and those that analyze ECG morphology. The proposed algorithm belongs to the feature-analysis group. It adopts DSTW (Down Slope Trace Wave) to extract ECG features and applies an SVM (Support Vector Machine) classifier for training and testing. We chose 18 ECG files from the MIT-BIH Normal Sinus Rhythm Database to estimate the algorithm's performance. The algorithm extracts 100 heartbeats from each ECG file and uses 40 heartbeats for training and 60 heartbeats for testing. The proposed algorithm performs well on all ECG data, achieving a 93.89% heartbeat recognition rate and a 100% ECG recognition rate.
https://doi.org/10.5370/KIEE.2011.60.3.654

Robust Speech Recognition by Utilizing Class Histogram Equalization (클래스 히스토그램 등화 기법에 의한 강인한 음성 인식)

Suh, Yung-Joo;Kim, Hor-Rin;Lee, Yun-Keun
- MALSORI, no.60, pp.145-164, 2006
This paper proposes class histogram equalization (CHEQ) to compensate noisy acoustic features for robust speech recognition. CHEQ aims to compensate for the acoustic mismatch between training and test speech recognition environments and to reduce the limitations of conventional histogram equalization (HEQ). In contrast to HEQ, CHEQ adopts multiple class-specific distribution functions for the training and test environments and equalizes the features using their class-specific training and test distributions. Depending on how the class information is extracted, CHEQ takes one of two forms: hard-CHEQ, based on vector quantization, and soft-CHEQ, based on the Gaussian mixture model. Experiments on the Aurora 2 database confirmed the effectiveness of CHEQ, which produced a relative word error reduction of 61.17% over the baseline mel-cepstral features and 19.62% over conventional HEQ.
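
For orientation, conventional HEQ maps each test feature value through the test-environment CDF and then through the inverse of the training-environment CDF. The sketch below shows this for a single feature dimension using empirical distributions; it illustrates plain HEQ only, not CHEQ, and the function name and the quantile convention are assumptions of the sketch. CHEQ as described above would apply a mapping of this kind with class-specific training and test distributions.

```python
import math
from bisect import bisect_right


def histogram_equalize(test_values, train_values):
    """Map test values onto the training distribution via empirical CDF matching."""
    train_sorted = sorted(train_values)
    test_sorted = sorted(test_values)
    n_test, n_train = len(test_sorted), len(train_sorted)
    equalized = []
    for x in test_values:
        # Empirical CDF of x under the test distribution, in (0, 1].
        cdf = bisect_right(test_sorted, x) / n_test
        # Inverse empirical CDF of the training distribution at that quantile.
        idx = min(n_train - 1, max(0, math.ceil(cdf * n_train) - 1))
        equalized.append(train_sorted[idx])
    return equalized


# A test set shifted by +10 relative to training is mapped back onto the training range.
print(histogram_equalize([12, 10, 13, 11], [0, 1, 2, 3]))  # prints [2, 0, 3, 1]
```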

Action Recognition with deep network features and dimension reduction

Li, Lijun;Dai, Shuling
- KSII Transactions on Internet and Information Systems (TIIS), v.13 no.2, pp.832-854, 2019
Action recognition has been studied in the computer vision field for years. We present an effective approach to recognizing actions in which dimension reduction is a crucial step, applied after feature extraction to reduce the dimensionality of the feature descriptors. We modify the standard Local Fisher Discriminant Analysis (LFDA) with a sparse matrix and a randomized kd-tree, yielding a modified LFDA (mLFDA) method that greatly reduces the required memory and accelerates the standard LFDA. For feature encoding, we propose mix encoding, which combines Fisher vector encoding and locality-constrained linear coding to obtain the final video representations. To add more meaningful features to the action recognition process, a convolutional neural network is combined with mix encoding to produce the deep network feature. Experimental results show that our algorithm, when all these methods are combined, is competitive on the KTH, HMDB51, and UCF101 datasets.
https://doi.org/10.3837/tiis.2019.02.019

Crime amount prediction based on 2D convolution and long short-term memory neural network

Dong, Qifen;Ye, Ruihui;Li, Guojun
- ETRI Journal, v.44 no.2, pp.208-219, 2022
Crime amount prediction is crucial for optimizing the arrangement of police patrols in each region of a city. First, we analyze spatiotemporal correlations in the crime data and the relationships between crime and related auxiliary data, including points of interest (POI), public service complaints, and demographics. Then, we propose a crime amount prediction model based on 2D convolution and a long short-term memory neural network (2DCONV-LSTM). The proposed model captures the spatiotemporal correlations in the crime data, and the crime-related auxiliary data are used to enhance the regional spatial features. Extensive experiments on real-world datasets demonstrate that capturing both temporal and spatial correlations in crime data and using auxiliary data to extract regional spatial features improve the prediction performance. In the best-case scenario, the proposed model reduces the prediction error by at least 17.8% and 8.2% compared with support vector regression (SVR) and LSTM, respectively. Moreover, excessive auxiliary data reduce model performance because of the presence of redundant information.
https://doi.org/10.4218/etrij.2021-0396

Speech Emotion Recognition with SVM, KNN and DSVM

Hadhami Aouani;Yassine Ben Ayed
- International Journal of Computer Science & Network Security, v.23 no.8, pp.40-48, 2023
Speech emotion recognition has become an active research topic in speech processing and in applications based on human-machine interaction. Our system takes a two-stage approach: feature extraction followed by a classification engine. First, two feature sets are investigated. The first uses only 13 Mel-frequency Cepstral Coefficients (MFCC) extracted from emotional speech samples, and the second fuses the MFCC features with three others: Zero Crossing Rate (ZCR), Teager Energy Operator (TEO), and Harmonic to Noise Rate (HNR). Second, we use two classification techniques, the Support Vector Machine (SVM) and the k-Nearest Neighbor (k-NN), and compare their performance. In addition, we investigate the importance of recent advances in machine learning, including deep kernel learning. A large set of experiments is conducted on the Surrey Audio-Visual Expressed Emotion (SAVEE) dataset for seven emotions. The experimental results show good accuracy compared with previous studies.
https://doi.org/10.22937/IJCSNS.2023.23.8.6
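
Of the features named in the abstract above, the zero crossing rate is the simplest to state: the fraction of adjacent sample pairs in a frame whose signs differ. The sketch below is a minimal illustration. The function name, the treatment of zero as non-negative, and the normalization by the number of adjacent pairs are assumptions of the sketch (other conventions exist), and nothing here is taken from the paper's implementation.

```python
def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs in the frame whose signs differ."""
    if len(frame) < 2:
        return 0.0
    crossings = sum(
        1 for a, b in zip(frame, frame[1:])
        if (a >= 0) != (b >= 0)  # zero is treated as non-negative in this sketch
    )
    return crossings / (len(frame) - 1)


print(zero_crossing_rate([1, -1, 1, -1]))  # prints 1.0
print(zero_crossing_rate([1, 2, 3]))       # prints 0.0
```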

Data Mining-Aided Automatic Landslide Detection Using Airborne Laser Scanning Data in Densely Forested Tropical Areas

Mezaal, Mustafa Ridha;Pradhan, Biswajeet
- Korean Journal of Remote Sensing, v.34 no.1, pp.45-74, 2018
Landslides are a natural hazard that threatens lives and property in many areas around the world. They are difficult to recognize, particularly in rainforest regions. Thus, an accurate, detailed, and updated inventory map is required for landslide susceptibility, hazard, and risk analyses. The inconsistency in the results obtained with different feature selection techniques in the literature has highlighted the importance of evaluating these techniques. Thus, in this study, six feature selection techniques were evaluated. Very-high-resolution LiDAR point clouds and orthophotos were acquired simultaneously by airborne laser scanning in a rainforest area of Cameron Highlands, Malaysia. A fuzzy-based segmentation parameter optimizer (FbSP optimizer) was used to optimize the segmentation parameters. Samples were drawn using a stratified random sampling method, with 70% used as training samples. Two machine-learning algorithms, namely, Support Vector Machine (SVM) and Random Forest (RF), were used to evaluate the performance of each feature selection algorithm. The overall accuracies of the SVM and RF models revealed that three of the six algorithms ranked higher in landslide detection. Results indicated that the classification accuracies of the RF classifier were higher than those of the SVM classifier using either all features or only the optimal features. The proposed techniques performed well in detecting landslides in a rainforest area of Malaysia, and these techniques can be easily extended to similar regions.
https://doi.org/10.7780/kjrs.2018.34.1.4
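
The abstract above mentions stratified random sampling with a 70% training share. A minimal sketch of such a split follows: each class is shuffled and cut separately, so the class proportions are preserved in both parts. The function name, the fixed seed, and the rounding rule are assumptions of the sketch, and the paper's actual sampling procedure may differ.

```python
import random
from collections import defaultdict


def stratified_split(samples, labels, train_fraction=0.7, seed=0):
    """Split (sample, label) pairs so every class keeps the same train/test ratio."""
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    by_class = defaultdict(list)
    for sample, label in zip(samples, labels):
        by_class[label].append(sample)

    train, test = [], []
    for label, items in by_class.items():
        shuffled = items[:]
        rng.shuffle(shuffled)
        cut = round(train_fraction * len(shuffled))  # per-class training count
        train += [(s, label) for s in shuffled[:cut]]
        test += [(s, label) for s in shuffled[cut:]]
    return train, test
```

For example, 10 segments labelled "landslide" and 20 labelled "non-landslide" would yield 7 and 14 training samples respectively, with the remaining 3 and 6 held out for testing.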

Identification of Individuals using Single-Lead Electrocardiogram Signal (단일 리드 심전도를 이용한 개인 식별)

Lim, Seohyun;Min, Kyeongran;Lee, Jongshill;Jang, Dongpyo;Kim, Inyoung
- Journal of Biomedical Engineering Research, v.35 no.3, pp.42-49, 2014
We propose an individual identification method using a single-lead electrocardiogram signal. Lead I ECG is measured from subjects in various physical and psychological states. As a preprocessing stage, we apply noise reduction to the lead I signal, and the denoised signal is ensemble-averaged to obtain a representative beat waveform for each individual. Features for identifying individuals are extracted from the P-QRS-T waves: 19 from duration and amplitude information, and 16 from the QRS complex obtained by applying the Pan-Tompkins algorithm to the ensemble-averaged waveform. To analyze the effect of each feature and to improve efficiency while maintaining performance, the Relief-F algorithm is used to select features from the 35 extracted. Some or all of these 35 features were used in support vector machine (SVM) training and testing. The classification accuracy using the entire feature set was 98.34%. Experimental results show that a person can be identified from features extracted from the limb lead I signal alone.
https://doi.org/10.9718/JBER.2014.35.3.42
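
The ensemble average mentioned in the abstract above can be pictured as a sample-by-sample mean over beats that have already been segmented and aligned. The sketch below assumes equal-length, pre-aligned beats given as plain lists. The function name and these preconditions are assumptions of the sketch; beat detection and alignment are outside its scope.

```python
def ensemble_average(beats):
    """Sample-wise mean of aligned, equal-length beats: a representative beat waveform."""
    if not beats:
        raise ValueError("at least one beat is required")
    length = len(beats[0])
    if any(len(beat) != length for beat in beats):
        raise ValueError("all beats must have the same length")
    # zip(*beats) groups the i-th sample of every beat together.
    return [sum(column) / len(beats) for column in zip(*beats)]


print(ensemble_average([[0, 2, 0], [0, 4, 0]]))  # prints [0.0, 3.0, 0.0]
```

Averaging over many beats attenuates noise that is uncorrelated from beat to beat, while the shape common to all beats is retained.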

A Study on Feature Projection Methods for a Real-Time EMG Pattern Recognition (실시간 근전도 패턴인식을 위한 특징투영 기법에 관한 연구)

Chu, Jun-Uk;Kim, Shin-Ki;Mun, Mu-Seong;Moon, In-Hyuk
- Journal of Institute of Control, Robotics and Systems, v.12 no.9, pp.935-944, 2006
EMG pattern recognition is essential for the control of a multifunction myoelectric hand. The main goal of this study is to develop an efficient feature projection method for EMG pattern recognition. To this end, we propose a linear supervised feature projection that utilizes linear discriminant analysis (LDA). We first perform a wavelet packet transform (WPT) to extract the feature vector from four-channel EMG signals. For dimensionality reduction and clustering of the WPT features, the LDA incorporates class information into the learning procedure and finds a linear matrix that maximizes the class separability of the projected features. Finally, a multilayer perceptron classifies the LDA-reduced features into nine hand motions. To evaluate the performance of LDA for the WPT features, we compare LDA with three other feature projection methods. Through visualization and quantitative comparison, we show that LDA achieves better class separability and that the LDA-projected features improve the classification accuracy with a short processing time. We implemented a real-time pattern recognition system for a multifunction myoelectric hand. In experiments, we show that the proposed method achieves 97.2% recognition accuracy and that all processes, including the generation of control commands for the myoelectric hand, are completed within 97 msec. These results confirm that our method is applicable to real-time EMG pattern recognition for myoelectric hand control.
https://doi.org/10.5302/J.ICROS.2006.12.9.935
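
To make the idea of an LDA projection concrete, the sketch below computes the classic two-class Fisher direction w = Sw^{-1}(m_a - m_b) for 2-D points, where Sw is the pooled within-class scatter matrix. This is a deliberately reduced illustration: the paper above works with nine motion classes and WPT feature vectors, whereas the two classes, the two dimensions, and all names here are assumptions of the sketch.

```python
def fisher_direction(class_a, class_b):
    """Two-class Fisher LDA direction for 2-D points: w = Sw^{-1} (mean_a - mean_b)."""

    def mean(points):
        n = len(points)
        return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)

    def scatter(points, m):
        # Entries (sxx, sxy, syy) of the symmetric 2x2 scatter matrix around mean m.
        sxx = sum((p[0] - m[0]) ** 2 for p in points)
        sxy = sum((p[0] - m[0]) * (p[1] - m[1]) for p in points)
        syy = sum((p[1] - m[1]) ** 2 for p in points)
        return sxx, sxy, syy

    mean_a, mean_b = mean(class_a), mean(class_b)
    sa, sb = scatter(class_a, mean_a), scatter(class_b, mean_b)
    sxx, sxy, syy = sa[0] + sb[0], sa[1] + sb[1], sa[2] + sb[2]  # within-class scatter Sw

    det = sxx * syy - sxy * sxy
    if det == 0:
        raise ValueError("within-class scatter matrix is singular")

    dx, dy = mean_a[0] - mean_b[0], mean_a[1] - mean_b[1]
    # Closed-form inverse of the 2x2 matrix Sw applied to the mean difference.
    return ((syy * dx - sxy * dy) / det, (-sxy * dx + sxx * dy) / det)


def project(point, w):
    """Project a 2-D point onto direction w, giving a single scalar feature."""
    return point[0] * w[0] + point[1] * w[1]
```

Projecting onto this direction reduces each 2-D point to one number while keeping the two classes as far apart as possible relative to their spread, which is the property the abstract refers to as class separability.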

A Video Expression Recognition Method Based on Multi-mode Convolution Neural Network and Multiplicative Feature Fusion

Ren, Qun
- Journal of Information Processing Systems, v.17 no.3, pp.556-570, 2021
Existing video expression recognition methods focus mainly on extracting spatial features from video expression images and tend to ignore the dynamic features of video sequences. To solve this problem, a multi-mode convolution neural network method is proposed to improve the performance of facial expression recognition in video. First, OpenFace 2.0 is used to detect face images in video, and two deep convolution neural networks are used to extract spatiotemporal expression features: a spatial convolution neural network extracts the spatial information features of each static expression image, and a temporal convolution neural network extracts the dynamic information features from the optical flow information of multiple expression images. Then, the spatiotemporal features learned by the two deep convolution neural networks are fused by multiplication. Finally, the fused features are input into a support vector machine to perform the facial expression classification. Experimental results show that the recognition accuracy of the proposed method reaches 64.57% and 60.89% on the RML and BAUM-1s datasets, respectively, which is better than that of the other compared methods.
https://doi.org/10.3745/JIPS.02.0156
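
The abstract above says only that the two feature sets are "fused by multiplication". One simple reading is an element-wise product of two equal-length feature vectors, sketched below. The element-wise interpretation and the function name are assumptions of the sketch, and the paper may define the fusion differently.

```python
def multiplicative_fusion(spatial, temporal):
    """Element-wise product of a spatial and a temporal feature vector."""
    if len(spatial) != len(temporal):
        raise ValueError("feature vectors must have the same length")
    return [s * t for s, t in zip(spatial, temporal)]


print(multiplicative_fusion([1.0, 2.0, 3.0], [2.0, 0.0, 0.5]))  # prints [2.0, 0.0, 1.5]
```

Under this reading, a fused component is large only when both streams respond strongly, and a zero in either stream suppresses it, which differs from concatenation or averaging.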
