Search | Korea Science

Musical Genre Classification Based on Deep Residual Auto-Encoder and Support Vector Machine

Xue Han;Wenzhuo Chen;Changjian Zhou
- Journal of Information Processing Systems / v.20 no.1 / pp.13-23 / 2024
Music brings pleasure and relaxation to people. Therefore, it is necessary to classify musical genres based on scenes. Identifying favorite musical genres from massive music data is a time-consuming and laborious task. Recent studies have suggested that machine learning algorithms are effective in distinguishing between various musical genres. However, meeting the actual requirements in terms of accuracy or timeliness is challenging. In this study, a hybrid machine learning model that combines a deep residual auto-encoder (DRAE) and support vector machine (SVM) for musical genre recognition was proposed. Eight manually extracted features from the Mel-frequency cepstral coefficients (MFCC) were employed in the preprocessing stage as the hybrid music data source. During the training stage, DRAE was employed to extract feature maps, which were then used as input for the SVM classifier. The experimental results indicated that this method achieved a 91.54% F1-score and 91.58% top-1 accuracy, outperforming existing approaches. This novel approach leverages deep architecture and conventional machine learning algorithms and provides a new horizon for musical genre classification tasks.
https://doi.org/10.3745/JIPS.04.0300
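
The abstract above reports an F1-score and a top-1 accuracy. As a rough illustration of how such figures are typically computed from predicted genre labels, here is a minimal sketch. It assumes macro-averaged F1 (the abstract does not say which averaging was used), and the function name and the toy labels are hypothetical, not taken from the paper.

```python
def accuracy_and_macro_f1(y_true, y_pred):
    """Return (top-1 accuracy, macro-averaged F1) for two equal-length label lists.

    Macro averaging is an assumption made for this sketch.
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("label lists must be non-empty and of equal length")

    labels = sorted(set(y_true) | set(y_pred))
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)

    f1_scores = []
    for label in labels:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != label and p == label)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == label and p != label)
        # F1 = 2TP / (2TP + FP + FN); the denominator is positive because
        # every label occurs at least once in y_true or y_pred.
        f1_scores.append(2 * tp / (2 * tp + fp + fn))

    return correct / len(y_true), sum(f1_scores) / len(f1_scores)


# Toy labels, invented for illustration only.
y_true = ["rock", "rock", "jazz", "jazz", "pop", "pop"]
y_pred = ["rock", "jazz", "jazz", "jazz", "pop", "rock"]
accuracy, macro_f1 = accuracy_and_macro_f1(y_true, y_pred)
print(f"accuracy={accuracy:.4f} macro_f1={macro_f1:.4f}")
```

With a weighted average instead of a macro average, the per-class scores would be weighted by class frequency before summing.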

A Comparison of the Land Cover Data Sets over Asian Region: USGS, IGBP, and UMd (아시아 지역 지면피복자료 비교 연구: USGS, IGBP, 그리고 UMd)

Kang, Jeon-Ho;Suh, Myoung-Seok;Kwak, Chong-Heum
- Atmosphere / v.17 no.2 / pp.159-169 / 2007
A comparison of three land cover data sets (United States Geological Survey: USGS, International Geosphere Biosphere Programme: IGBP, and University of Maryland: UMd), derived from 1992-1993 Advanced Very High Resolution Radiometer (AVHRR) data, was performed over the Asian continent. Preprocessing steps, such as unifying the map projection and the land cover definitions, were applied before comparing the three data sets. Overall, the agreement among the three land cover data sets was relatively high (>45%) for land covers that have a distinct phenology, such as urban, open shrubland, mixed forest, and bare ground. The ratios of triple agreement (TA), couple agreement (CA) and total disagreement (TD) among the three land cover data sets are 30.99%, 57.89% and 8.91%, respectively. The agreement ratio between USGS and IGBP (about 80%) is much greater than that between USGS and UMd or between IGBP and UMd (about 32%). The main reasons for the relatively low agreement among the three land cover data sets are differences in 1) the number of land cover categories, 2) the basic input data sets used for the classification, 3) the classification (or clustering) methodologies, and 4) the level of preprocessing. The numbers of categories for USGS, IGBP and UMd are 24, 17 and 14, respectively. USGS and IGBP used only the 12 monthly normalized difference vegetation index (NDVI) data sets, whereas UMd used the 12 monthly NDVI data sets and 29 other auxiliary data sets derived from the 5 AVHRR channels. USGS and IGBP used an unsupervised clustering method, whereas UMd used a supervised technique, a decision tree trained with ground truth data derived from high-resolution Landsat data. The insufficient preprocessing in USGS and IGBP compared with UMd resulted in spatial discontinuity and misclassification.
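
To make the triple agreement (TA), couple agreement (CA) and total disagreement (TD) ratios in the abstract above concrete, the following is a small sketch of how they could be computed per pixel. It assumes the three maps have already been brought to a common projection and a common set of class names; the function name and the four-pixel sample maps are hypothetical.

```python
from collections import Counter


def agreement_ratios(map_a, map_b, map_c):
    """Return the percentage of pixels with triple agreement (TA),
    couple agreement (CA) and total disagreement (TD).

    Each map is a flat sequence of class labels, one per pixel, and the
    labels are assumed to use one unified legend.
    """
    if not (len(map_a) == len(map_b) == len(map_c)) or not map_a:
        raise ValueError("maps must be non-empty and cover the same pixels")

    counts = Counter()
    for a, b, c in zip(map_a, map_b, map_c):
        distinct = len({a, b, c})
        if distinct == 1:
            counts["TA"] += 1  # all three data sets agree
        elif distinct == 2:
            counts["CA"] += 1  # exactly two data sets agree
        else:
            counts["TD"] += 1  # all three differ

    total = len(map_a)
    return {key: 100.0 * counts[key] / total for key in ("TA", "CA", "TD")}


# Invented sample maps, for illustration only.
usgs = ["forest", "urban", "crop", "water"]
igbp = ["forest", "urban", "crop", "shrub"]
umd = ["forest", "urban", "grass", "bare"]
print(agreement_ratios(usgs, igbp, umd))
```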

Detection of Traffic Light using Color after Morphological Preprocessing (형태학적 전처리 후 색상을 이용한 교통 신호의 검출)

Kim, Chang-dae;Choi, Seo-hyuk;Kang, Ji-hun;Ryu, Sung-pil;Kim, Dong-woo;Ahn, Jae-hyeong
- Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2015.05a / pp.367-370 / 2015
This paper proposes a method that improves the detection performance of traffic lights for autonomous cars. Earlier detection methods adopted color thresholding, template matching, and machine learning-based approaches, but they suffer from problems such as reduced recognition rates and slow processing. The proposed method uses both a detection mask and morphological preprocessing. First, input color images are converted to YCbCr images to make the method more robust to illumination, and horizontal edge components are extracted in the Y channel. Second, the region of interest is detected according to the morphological characteristics of traffic lights. Finally, the traffic signal is detected based on color distributions. Experiments showed that the proposed method improved the detection rate and processing time over the conventional algorithm in several surrounding environments.
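
The first step described above, converting a color image to YCbCr, can be sketched per pixel as follows. The abstract does not state which YCbCr variant the authors used; this sketch assumes the full-range BT.601 coefficients used by JPEG/JFIF, and the function name is hypothetical.

```python
def rgb_to_ycbcr(r, g, b):
    """Convert one 8-bit RGB pixel to full-range YCbCr (JFIF / BT.601).

    Y carries luminance; Cb and Cr carry chroma centred on 128.
    The choice of this variant is an assumption, not taken from the paper.
    """
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128.0 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128.0 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return y, cb, cr


# A saturated red lamp pixel has a high Cr value, while a grey pixel
# keeps both chroma channels at the neutral value of 128.
print(rgb_to_ycbcr(255, 0, 0))
print(rgb_to_ycbcr(128, 128, 128))
```

Separating luminance from chroma this way lets edge extraction work on Y alone while color tests use Cb and Cr.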

A Motion Detection Approach based on UAV Image Sequence

Cui, Hong-Xia;Wang, Ya-Qi;Zhang, FangFei;Li, TingTing
- KSII Transactions on Internet and Information Systems (TIIS) / v.12 no.3 / pp.1224-1242 / 2018
Motion detection from images is essential for motion analysis and compensation. However, motion detection and tracking in low-altitude images obtained from an unmanned aerial system pose many challenges because image quality is degraded by platform motion, image instability and illumination fluctuation. This research tackles these challenges by proposing a modified joint transform correlation algorithm that includes two preprocessing strategies. In the spatial domain, a modified fuzzy edge detection method is proposed for preprocessing the input images. In the frequency domain, to eliminate the disturbance of self-correlation terms, the cross-correlation terms are extracted from the joint power spectrum output plane. The effectiveness and accuracy of the algorithm were tested and evaluated on both simulated and real datasets. The simulation experiments show that the proposed approach derives satisfactory cross-correlation peaks and detects displacement vectors with an error of no more than 0.03 pixel for image pairs with displacements smaller than 20 pixels, under added motion blur in the range of 0-10 pixels and additive Gaussian noise with a variance of 0.002. Moreover, this paper proposes a quantitative analysis approach using tri-image pairs from real datasets, and the experimental results show that sub-pixel detection accuracy can be achieved even if the sampling frequency reaches only 50 frames per second.
https://doi.org/10.3837/tiis.2018.03.014
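
The core idea behind correlation-based displacement detection, locating the cross-correlation peak between two frames, can be illustrated in one dimension. This is a plain brute-force circular cross-correlation, not the modified joint transform correlation algorithm of the paper, and it only resolves whole-sample shifts rather than the sub-pixel accuracy reported above. The function name and sample signals are hypothetical.

```python
def estimate_shift(reference, moved):
    """Return the circular shift s that maximises
    sum(reference[i] * moved[(i + s) % n]) over all i.

    Brute force, O(n^2); an FFT would be used for real image data.
    """
    n = len(reference)
    if n == 0 or n != len(moved):
        raise ValueError("signals must be non-empty and of equal length")

    best_shift, best_score = 0, float("-inf")
    for shift in range(n):
        score = sum(reference[i] * moved[(i + shift) % n] for i in range(n))
        if score > best_score:
            best_shift, best_score = shift, score
    return best_shift


# Invented signal: one bright feature, then the same feature moved
# two samples to the right (circularly).
reference = [0, 0, 1, 3, 1, 0, 0, 0]
moved = reference[-2:] + reference[:-2]
print(estimate_shift(reference, moved))
```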

Method of Human Detection using Edge Symmetry and Feature Vector (에지 대칭과 특징 벡터를 이용한 사람 검출 방법)

Byun, Oh-Sung
- Journal of the Korea Society of Computer and Information / v.16 no.8 / pp.57-66 / 2011
This paper proposes an algorithm that detects humans efficiently in real time by extracting features from a single input image, using edge symmetry and gradient direction characteristics. The proposed algorithm is composed of three stages: preprocessing, region partition of human candidates, and verification of candidate regions. The preprocessing stage makes the image robust to the intensity and brightness of the surrounding environment, and detects contours with human characteristics by considering shape features such as the size and condition of a human. The region partition stage separates regions in the detected contours according to human edge symmetry and size, and then selects first-stage candidate regions by applying the AdaBoost algorithm. Finally, the candidate region verification stage reduces false detections by verifying each candidate region with a classifier and a gradient feature vector computed over divided local areas. In simulations, the proposed algorithm improved processing speed by approximately 1.7 times and improved the FNR (False Negative Rate) by 3% compared with the conventional single-structure algorithm.
https://doi.org/10.9708/jksci.2011.16.8.057

Multiple Model Fuzzy Prediction Systems with Adaptive Model Selection Based on Rough Sets and its Application to Time Series Forecasting (러프 집합 기반 적응 모델 선택을 갖는 다중 모델 퍼지 예측 시스템 구현과 시계열 예측 응용)

Bang, Young-Keun;Lee, Chul-Heui
- Journal of the Korean Institute of Intelligent Systems / v.19 no.1 / pp.25-33 / 2009
Recently, TS fuzzy models, which include linear equations in the consequent part, have been widely used for time series forecasting, and their prediction performance depends somewhat on characteristics of the time series such as stationarity. This paper therefore suggests a new prediction method that is especially effective for nonstationary time series. First, data preprocessing is introduced to extract the patterns and regularities of the time series, and then multiple TS fuzzy predictors are constructed. Next, an appropriate model is chosen for each input by an adaptive model selection mechanism based on rough sets, and the prediction is carried out. Finally, an error compensation procedure is added to improve performance by decreasing the prediction error. Computer simulations are performed on typical cases to verify the effectiveness of the proposed method. The method may be very useful for predicting time series with uncertainty and/or nonstationarity because it better handles and reflects the characteristics of the data.
https://doi.org/10.5391/JKIIS.2009.19.1.025

Design of Face Recognition algorithm Using PCA＆LDA combined for Data Pre-Processing and Polynomial-based RBF Neural Networks (PCA와 LDA를 결합한 데이터 전 처리와 다항식 기반 RBFNNs을 이용한 얼굴 인식 알고리즘 설계)

Oh, Sung-Kwun;Yoo, Sung-Hoon
- The Transactions of The Korean Institute of Electrical Engineers / v.61 no.5 / pp.744-752 / 2012
In this study, Polynomial-based Radial Basis Function Neural Networks (pRBFNNs) are proposed as the recognition part of an overall face recognition system that consists of two parts: a preprocessing part and a recognition part. The design methodology and procedure of the proposed pRBFNNs are presented to solve high-dimensional pattern recognition problems. In the data preprocessing part, Principal Component Analysis (PCA), which is generally used in face recognition, is useful for expressing classes with reduced dimensionality, since it maintains the recognition rate while reducing the amount of data. However, because PCA uses the whole face image, it cannot guarantee the detection rate under changes of viewpoint and of the whole image. Thus, to compensate for these defects, Linear Discriminant Analysis (LDA) is used to enhance the separation of different classes. In this paper, we combine the PCA and LDA algorithms and design optimized pRBFNNs for the recognition module. The proposed pRBFNNs architecture consists of three functional modules, the condition part, the conclusion part, and the inference part, expressed as fuzzy rules in 'If-then' format. In the condition part of the fuzzy rules, the input space is partitioned with Fuzzy C-Means clustering. In the conclusion part of the rules, the connection weights of the pRBFNNs are represented as two kinds of polynomials: constant and linear. The coefficients of the connection weights are identified with back-propagation using the gradient descent method. The output of the pRBFNNs model is obtained by the fuzzy inference method in the inference part of the fuzzy rules. The essential design parameters of the networks (including the learning rate, momentum coefficient and fuzzification coefficient) are optimized by means of Differential Evolution. The proposed pRBFNNs are applied to face image datasets (e.g., Yale and AT&T) and evaluated in terms of output performance and recognition rate.
https://doi.org/10.5370/KIEE.2012.61.5.744

A Fast Shortest Path Algorithm Between Two Points inside a Segment-Visible Polygon (선분가시 다각형 내부에 있는 두 점 사이의 최단 경로를 구하는 빠른 알고리즘)

Kim, Soo-Hwan
- Journal of the Korea Institute of Information and Communication Engineering / v.14 no.2 / pp.369-374 / 2010
The shortest path between two points inside a simple polygon P is a minimum-length path among all paths connecting them that do not pass through the exterior of P. A linear-time algorithm for computing the shortest path in a general simple polygon requires triangulating the polygon as preprocessing. Linear-time triangulation is known to be very complex to understand and implement. It is also inefficient unless the input is very large, because its time complexity has a large constant factor. In this paper, we present a customized shortest path algorithm for a segment-visible polygon, which is a simple polygon weakly visible from an internal line segment. Our algorithm does not require triangulation as preprocessing and consists of simple procedures such as the construction of convex hulls, so it is easy to implement and runs very fast in linear time.
https://doi.org/10.6109/jkiice.2010.14.2.369
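
The abstract above names convex hull construction as one of the simple building blocks of the algorithm. As a sketch of that building block only, and not of the authors' shortest path algorithm, here is Andrew's monotone chain method. Because it sorts a general point set it runs in O(n log n) time, whereas the paper's setting allows linear time. The function names are hypothetical.

```python
def cross(o, a, b):
    """Z-component of (a - o) x (b - o); positive for a left turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points):
    """Return the convex hull vertices in counter-clockwise order,
    starting from the lexicographically smallest point.

    Collinear points on the hull boundary are dropped.
    """
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    lower = []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)

    upper = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)

    # The last point of each chain is the first point of the other.
    return lower[:-1] + upper[:-1]


# A square with one interior point and one point on the bottom edge.
print(convex_hull([(0, 0), (2, 0), (2, 2), (0, 2), (1, 1), (1, 0)]))
```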

Fast Algorithms for Computing the Shortest Path between Two Points inside a Simple Polygon (다각형 내부에 있는 두 점 사이의 최단 경로를 구하는 빠른 알고리즘)

Kim, Soo-Hwan;Lim, Intaek;Choi, Jinoh;Choi, Jinho
- Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2009.10a / pp.807-810 / 2009
In this paper, we consider shortest path problems in a simple polygon. The shortest path between two points inside a polygon P is a minimum-length path among all paths connecting them that do not pass through the exterior of P. A linear-time algorithm for computing the shortest path in a general simple polygon requires triangulating the polygon as preprocessing. Linear-time triangulation is known to be very complex to understand and implement. It is also inefficient unless the input is very large. We therefore present customized shortest path algorithms for specific polygon classes such as star-shaped polygons, edge-visible polygons, and monotone polygons. These algorithms do not need triangulation as preprocessing, so they are simple and run very fast in linear time.

Extraction of Skin Regions through Filtering-based Noise Removal (필터링 기반의 잡음 제거를 통한 피부 영역의 추출)

Jang, Seok-Woo
- Journal of the Korea Academia-Industrial cooperation Society / v.21 no.12 / pp.672-678 / 2020
Ultra-high-speed images that accurately depict the minute movements of objects have become common as low-cost and high-performance cameras that can film at high speeds have emerged. In this paper, the proposed method removes unexpected noise contained in images after input at high speed, and then extracts an area of interest that can represent personal information, such as skin areas, from the image in which noise has been removed. In this paper, noise generated by abnormal electrical signals is removed by applying bilateral filters. A color model created through pre-learning is then used to extract the area of interest that represents the personal information contained within the image. Experimental results show that the introduced algorithms remove noise from high-speed images and then extract the area of interest robustly. The approach presented in this paper is expected to be useful in various applications related to computer vision, such as image preprocessing, noise elimination, tracking and monitoring of target areas, etc.
https://doi.org/10.5762/KAIS.2020.21.12.672
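
The bilateral filtering step mentioned in the abstract above smooths noise while preserving edges by weighting each neighbor by both its spatial distance and its intensity difference. The following one-dimensional sketch shows the principle only; the function name, parameter names and default values are assumptions for illustration, and the paper applies the filter to two-dimensional images.

```python
import math


def bilateral_filter_1d(signal, radius=2, sigma_space=1.0, sigma_range=10.0):
    """Apply a 1-D bilateral filter and return the filtered values.

    Each output sample is a weighted mean of its neighbors, where the
    weight is the product of a spatial Gaussian (distance in samples)
    and a range Gaussian (difference in intensity).
    """
    n = len(signal)
    filtered = []
    for i in range(n):
        weighted_sum = 0.0
        weight_total = 0.0
        for j in range(max(0, i - radius), min(n, i + radius + 1)):
            spatial = math.exp(-((i - j) ** 2) / (2 * sigma_space ** 2))
            tonal = math.exp(-((signal[i] - signal[j]) ** 2) / (2 * sigma_range ** 2))
            weight = spatial * tonal
            weighted_sum += weight * signal[j]
            weight_total += weight
        # weight_total is at least 1 because the centre sample has weight 1.
        filtered.append(weighted_sum / weight_total)
    return filtered


# Small fluctuations are averaged out, while the large step between the
# dark and bright halves is kept, because neighbors across the step
# receive a near-zero range weight.
noisy_step = [0, 2, 0, 1, 100, 102, 100, 101]
print([round(v, 2) for v in bilateral_filter_1d(noisy_step)])
```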
