• Title/Summary/Keyword: Improved Support Vector Machine

Asymmetric Semi-Supervised Boosting Scheme for Interactive Image Retrieval

  • Wu, Jun;Lu, Ming-Yu
    • ETRI Journal / v.32 no.5 / pp.766-773 / 2010
  • Support vector machine (SVM) active learning plays a key role in the interactive content-based image retrieval (CBIR) community. However, regular SVM active learning is challenged by what we call "the small example problem" and "the asymmetric distribution problem." This paper attempts to integrate the merits of semi-supervised learning, ensemble learning, and active learning into interactive CBIR. Concretely, unlabeled images are exploited to facilitate boosting by helping to increase the diversity among base SVM classifiers, and the learned ensemble model is then used to identify the most informative images for active learning. In particular, a bias-weighting mechanism is developed to guide the ensemble model to pay more attention to positive images than to negative images. Experiments on 5000 Corel images show that the proposed method improves mean average precision by 0.16 over regular SVM active learning and is more effective than some existing improved variants of SVM active learning.
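
The selection step in SVM active learning is commonly implemented as uncertainty sampling: the samples whose decision values lie closest to the separating boundary are presented for labeling. The sketch below illustrates only that generic step, not the paper's boosting or bias-weighting scheme. The function name `most_informative` and the example decision values are hypothetical.

```python
def most_informative(decision_values, k):
    """Return indices of the k samples closest to the decision boundary.

    decision_values: signed classifier outputs f(x) for unlabeled samples
    (assumed to come from some already trained SVM or ensemble).
    """
    ranked = sorted(range(len(decision_values)),
                    key=lambda i: abs(decision_values[i]))
    return ranked[:k]


# Hypothetical decision values for five unlabeled images.
scores = [1.8, -0.1, 0.4, -2.3, 0.05]
query = most_informative(scores, k=2)
print(query)
```

Ranking by absolute decision value treats positive and negative candidates alike; a bias-weighted variant would scale the two sides differently before ranking.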

Generating of Pareto frontiers using machine learning (기계학습을 이용한 파레토 프런티어의 생성)

  • Yun, Yeboon;Jung, Nayoung;Yoon, Min
    • Journal of the Korean Data and Information Science Society / v.24 no.3 / pp.495-504 / 2013
  • Evolutionary algorithms have been applied to multi-objective optimization problems through approximation methods based on computational intelligence. These methods have been gradually improved in order to generate many approximate Pareto optimal solutions more accurately. This paper introduces a new method that uses a support vector machine to find an approximate Pareto frontier in multi-objective optimization problems. Moreover, it applies an evolutionary algorithm to the proposed method in order to generate approximate Pareto frontiers more accurately. Decision making with two or three objective functions can then be performed easily on the basis of the Pareto frontiers visualized by the proposed method. Finally, a few examples demonstrate the effectiveness of the proposed method.
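
To make the notion of a Pareto frontier concrete, the sketch below filters a set of candidate solutions down to the non-dominated ones. It is a plain dominance filter, not the SVM-based approximation the paper proposes, and it assumes that every objective is minimized. The names `dominates`, `pareto_front`, and the candidate values are hypothetical.

```python
def dominates(a, b):
    """True if a is no worse than b in every objective and better in one.

    All objectives are assumed to be minimized.
    """
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))


def pareto_front(points):
    """Return the non-dominated points, preserving input order."""
    return [p for p in points
            if not any(dominates(q, p) for q in points)]


# Hypothetical two-objective values (f1, f2), both minimized.
candidates = [(1, 5), (2, 3), (3, 4), (4, 1), (5, 2)]
front = pareto_front(candidates)
print(front)
```

The filter compares every pair of points, so its cost grows quadratically with the number of candidates, which is acceptable for the small sets typically plotted in two or three objectives.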

Topic Extraction and Classification Method Based on Comment Sets

  • Tan, Xiaodong
    • Journal of Information Processing Systems / v.16 no.2 / pp.329-342 / 2020
  • In recent years, emotional text classification has become one of the essential research topics in natural language processing. It has been widely used in the sentiment analysis of reviews of commodities such as hotels and of other commentary corpora. This paper proposes an improved W-LDA (weighted latent Dirichlet allocation) topic model to address the shortcomings of traditional LDA topic models. In the Gibbs sampling of topic words and the calculation of the expected word distribution in the W-LDA topic model, an average weighted value is adopted to keep topic-related words from being submerged by high-frequency words and so improve the distinction between topics. The method further integrates a support vector machine classifier that operates on the extracted high-quality document-topic distribution and topic-word vectors. Finally, an efficient integrated method is constructed for the analysis and extraction of emotional words, topic distribution calculation, and sentiment classification. Tests on real teaching evaluation data and on a test set from a public comment set show that the proposed method has distinct advantages over two other typical algorithms in terms of topic differentiation, classification precision, and F1-measure.

Local Binary Pattern Based Defocus Blur Detection Using Adaptive Threshold

  • Mahmood, Muhammad Tariq;Choi, Young Kyu
    • Journal of the Semiconductor & Display Technology / v.19 no.3 / pp.7-11 / 2020
  • Numerous methods have been proposed for the detection and segmentation of blurred and non-blurred regions in images. Because limited information is available about the blur type, the scenario, and the level of blurriness, detection and segmentation are challenging tasks. Hence, the performance of blur measure operators is an essential factor and needs improvement. In this paper, we propose an effective blur measure based on the local binary pattern (LBP) with an adaptive threshold for blur detection. The sharpness metric developed on the basis of LBP uses a fixed threshold irrespective of the blur type and level, which may not be suitable for images with large variations in imaging conditions, blur type, and blur level. In contrast, the proposed measure uses an adaptive threshold for each image, based on the image and blur properties, to generate an improved sharpness metric. The adaptive threshold is computed from a model learned with a support vector machine (SVM). The performance of the proposed method is evaluated on a well-known dataset and compared with five state-of-the-art methods. The comparative analysis shows that the proposed method performs significantly better, both qualitatively and quantitatively, than all of the compared methods.
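
The role of the threshold is easier to see on a single pixel. The sketch below computes a basic 8-neighbor LBP code for the center of a 3x3 patch, setting a bit only when a neighbor exceeds the center by at least the threshold. It is an illustration under stated assumptions, not the paper's sharpness metric: here the threshold is simply a parameter, whereas the paper derives it per image from an SVM-learned model. The function name `lbp_code` and the pixel values are hypothetical.

```python
def lbp_code(patch, threshold=0):
    """Compute the 8-bit LBP code of the center pixel of a 3x3 patch.

    A neighbor contributes a 1 bit when it exceeds the center by at least
    `threshold`. Neighbors are read clockwise from the top-left corner.
    """
    center = patch[1][1]
    # (row, col) positions of the eight neighbors, clockwise from top-left.
    order = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0), (1, 0)]
    code = 0
    for bit, (r, c) in enumerate(order):
        if patch[r][c] - center >= threshold:
            code |= 1 << bit
    return code


# Hypothetical grayscale patch.
patch = [[52, 60, 61],
         [49, 55, 70],
         [40, 58, 90]]
print(lbp_code(patch, threshold=0))
print(lbp_code(patch, threshold=10))
```

Raising the threshold suppresses bits caused by small intensity differences, which is why a single fixed value can behave differently across images with different blur levels.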

A Learning-based Visual Inspection System for Part Verification in a Panorama Sunroof Assembly Line using the SVM Algorithm (SVM 학습 알고리즘을 이용한 자동차 썬루프의 부품 유무 비전검사 시스템)

  • Kim, Giseok;Lee, Saac;Cho, Jae-Soo
    • Journal of Institute of Control, Robotics and Systems / v.19 no.12 / pp.1099-1104 / 2013
  • This paper presents a learning-based visual inspection method that addresses the need for improved adaptability of visual inspection systems for parts verification in panorama sunroof assembly lines. It is essential to ensure that the many required parts (bolts, nuts, etc.) are properly installed in the PLC sunroof manufacturing process. In place of human inspectors, a visual inspection system can automatically perform parts verification to ensure that parts are properly installed while rejecting any that are improperly assembled. The proposed visual inspection method is able to adapt to changing inspection tasks and environmental conditions through an efficient learning process. The proposed system consists of two major modules: a learning mode and a test mode. The SVM (Support Vector Machine) learning algorithm is employed to implement part learning and verification. The proposed method is very robust to changing environmental conditions, and various experimental results show its effectiveness.

A Study on the Extraction of Psychological Distance Embedded in Company's SNS Messages Using Machine Learning (머신 러닝을 활용한 회사 SNS 메시지에 내포된 심리적 거리 추출 연구)

  • Seongwon Lee;Jin Hyuk Kim
    • Information Systems Review / v.21 no.1 / pp.23-38 / 2019
  • Social network services (SNSs) are an important marketing channel, so many companies actively exploit them by posting messages with content and style appropriate for their customers. In this paper, we focus on the psychological distance embedded in SNS messages and develop a method to measure it by combining traditional content analysis, natural language processing (NLP), and machine learning. Through traditional content analysis by human coding, the psychological distance was extracted from each SNS message, and the coding results were used as input data for NLP and machine learning. With NLP, word embedding was performed and a bag-of-words representation was created. A support vector machine (SVM), a machine learning technique, was then trained and tested to predict the psychological distance of SNS messages. Initially, the sensitivity and precision of the SVM predictions were very low because of the extreme skewness of the dataset. We improved the performance of the SVM by balancing the class ratio with an upsampling technique and by using data coded with the same value in the first content analysis. All performance indices then exceeded 70%, which shows that psychological distance can be measured well.
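
The balancing step mentioned in the abstract can be sketched as random upsampling: minority-class samples are duplicated at random until every class is as large as the largest one. This is one simple form of upsampling and may differ from the procedure the authors used. The function name `upsample`, the fixed seed, and the toy labels are assumptions made for illustration.

```python
import random
from collections import Counter


def upsample(samples, labels, seed=0):
    """Randomly duplicate minority-class samples until classes are balanced."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_samples, out_labels = list(samples), list(labels)
    for cls, n in counts.items():
        # All original samples of this class are candidates for duplication.
        pool = [s for s, y in zip(samples, labels) if y == cls]
        extra = [rng.choice(pool) for _ in range(target - n)]
        out_samples.extend(extra)
        out_labels.extend([cls] * len(extra))
    return out_samples, out_labels


# Hypothetical skewed dataset: six "far" messages, two "near" messages.
msgs = ["m1", "m2", "m3", "m4", "m5", "m6", "m7", "m8"]
dist = ["far"] * 6 + ["near"] * 2
bal_msgs, bal_dist = upsample(msgs, dist)
print(Counter(bal_dist))
```

Duplication of this kind is normally applied to the training split only, so that copies of a test sample do not leak into training.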

An investigation into the effects of lime-stabilization on soil-geosynthetic interface behavior

  • Khadije Mahmoodi;Nazanin Mahbubi Motlagh;Ahmad-Reza Mahboubi Ardakani
    • Geomechanics and Engineering / v.38 no.3 / pp.231-247 / 2024
  • The use of lime stabilization and geosynthetic reinforcement is a common approach to improve the performance of fine-grained soils in geotechnical applications. However, the impact of this combination on the soil-geosynthetic interaction remains unclear. This study addresses this gap by evaluating the interface efficiency and soil-geosynthetic interaction parameters of lime-stabilized clay (2%, 4%, 6%, and 8% lime content) reinforced with geotextile or geogrid using direct shear tests at various curing times (1, 7, 14, and 28 days). Additionally, machine learning algorithms (Support Vector Machine and Artificial Neural Network) were employed to predict soil shear strength. Findings revealed that lime stabilization significantly increased soil shear strength and interaction parameters, particularly at the optimal lime content (4%). Notably, stabilization improved the performance of soil-geogrid interfaces but had an adverse effect on soil-geotextile interfaces. Furthermore, machine learning algorithms effectively predicted soil shear strength, with sensitivity analysis highlighting lime percentage and geosynthetic type as the most significant influencing factors.

The prediction of the stock price movement after IPO using machine learning and text analysis based on TF-IDF (증권신고서의 TF-IDF 텍스트 분석과 기계학습을 이용한 공모주의 상장 이후 주가 등락 예측)

  • Yang, Suyeon;Lee, Chaerok;Won, Jonggwan;Hong, Taeho
    • Journal of Intelligence and Information Systems / v.28 no.2 / pp.237-262 / 2022
  • There has been a growing interest in IPOs (Initial Public Offerings) due to the profitable returns that IPO stocks can offer to investors. However, IPOs can be speculative investments that may involve substantial risk as well because shares tend to be volatile, and the supply of IPO shares is often highly limited. Therefore, it is crucially important that IPO investors are well informed of the issuing firms and the market before deciding whether to invest or not. Unlike institutional investors, individual investors are at a disadvantage since there are few opportunities for individuals to obtain information on the IPOs. In this regard, the purpose of this study is to provide individual investors with the information they may consider when making an IPO investment decision. This study presents a model that uses machine learning and text analysis to predict whether an IPO stock price would move up or down after the first 5 trading days. Our sample includes 691 Korean IPOs from June 2009 to December 2020. The input variables for the prediction are three tone variables created from IPO prospectuses and quantitative variables that are either firm-specific, issue-specific, or market-specific. The three prospectus tone variables indicate the percentage of positive, neutral, and negative sentences in a prospectus, respectively. We considered only the sentences in the Risk Factors section of a prospectus for the tone analysis in this study. All sentences were classified into 'positive', 'neutral', and 'negative' via text analysis using TF-IDF (Term Frequency - Inverse Document Frequency). Measuring the tone of each sentence was conducted by machine learning instead of a lexicon-based approach due to the lack of sentiment dictionaries suitable for Korean text analysis in the context of finance. 
For this reason, the training set was created by randomly selecting 10% of the sentences from each prospectus, and these sentences were labeled manually after reading each one. Then, based on the training set, a Support Vector Machine model was used to predict the tone of the sentences in the test set. Finally, the machine learning model calculated the percentages of positive, neutral, and negative sentences in each prospectus. To predict the price movement of an IPO stock, four machine learning techniques were applied: Logistic Regression, Random Forest, Support Vector Machine, and Artificial Neural Network. According to the results, models that combine quantitative variables using technical analysis with prospectus tone variables show higher accuracy than models that use only quantitative variables. More specifically, the prediction accuracy improved by 1.45 percentage points in the Random Forest model, 4.34 percentage points in the Artificial Neural Network model, and 5.07 percentage points in the Support Vector Machine model. Among the tested techniques, the Artificial Neural Network model using both quantitative variables and prospectus tone variables achieved the highest prediction accuracy, 61.59%. The results indicate that the tone of a prospectus is a significant factor in predicting the price movement of an IPO stock. In addition, the McNemar test was used to verify whether the difference between the models is statistically significant. The model using only quantitative variables was compared with the model using both quantitative variables and prospectus tone variables, and the predictive performance was confirmed to improve significantly at the 1% significance level.
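
The TF-IDF weighting that turns each sentence into features for the tone classifier can be sketched as follows. The sketch uses tf = count / sentence length and idf = log(N / df), which is one common variant among several; the abstract does not say which variant the authors used. The function name `tfidf` and the toy English tokens are hypothetical stand-ins for tokenized Korean prospectus sentences.

```python
import math
from collections import Counter


def tfidf(documents):
    """Return one {term: weight} dict per tokenized document.

    Uses tf = count / document length and idf = log(N / df), which is
    one common variant among several.
    """
    n_docs = len(documents)
    # Document frequency: number of documents containing each term.
    df = Counter(term for doc in documents for term in set(doc))
    weights = []
    for doc in documents:
        counts = Counter(doc)
        weights.append({t: (c / len(doc)) * math.log(n_docs / df[t])
                        for t, c in counts.items()})
    return weights


# Hypothetical tokenized sentences standing in for prospectus text.
sentences = [["risk", "of", "loss"],
             ["growth", "of", "sales"],
             ["risk", "of", "default", "risk"]]
vectors = tfidf(sentences)
print(vectors[2])
```

A term that appears in every sentence, such as "of" here, receives a weight of zero, so the classifier sees mainly the terms that distinguish one sentence from another.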

A Machine Learning-Based Vocational Training Dropout Prediction Model Considering Structured and Unstructured Data (정형 데이터와 비정형 데이터를 동시에 고려하는 기계학습 기반의 직업훈련 중도탈락 예측 모형)

  • Ha, Manseok;Ahn, Hyunchul
    • The Journal of the Korea Contents Association / v.19 no.1 / pp.1-15 / 2019
  • One of the biggest difficulties in the vocational training field is the dropout problem. A large number of students drop out during the training process, which wastes the state budget and hampers improvement of the youth employment rate. Previous studies have mainly analyzed the causes of dropout. The purpose of this study is to propose a machine learning-based model that predicts dropout in advance by using various information about learners. In particular, this study aims to improve the accuracy of the prediction model by taking into consideration not only structured data but also unstructured data. The unstructured data were analyzed using Word2vec and a Convolutional Neural Network (CNN), which are the most popular text analysis technologies. Applying the proposed model to actual data from a domestic vocational training institute improved the prediction accuracy by up to 20%. In addition, the support vector machine-based prediction model using both structured and unstructured data showed high prediction accuracy, in the upper 90% range.

Forecasting Day-ahead Electricity Price Using a Hybrid Improved Approach

  • Hu, Jian-Ming;Wang, Jian-Zhou
    • Journal of Electrical Engineering and Technology / v.12 no.6 / pp.2166-2176 / 2017
  • Electricity price prediction plays a crucial part in scheduling and risk management for participants in the competitive electricity market. However, it is a difficult task owing to the nonlinearity, non-stationarity, and uncertainty of the price series. This study proposes a hybrid improved strategy that combines data preprocessing components with a forecasting engine to enhance the forecasting accuracy of the electricity price. In the developed forecasting procedure, the Seasonal Adjustment (SA) method and the Ensemble Empirical Mode Decomposition (EEMD) technique together form the data preprocessing component; the Coupled Simulated Annealing (CSA) optimization method and the Least Squares Support Vector Regression (LSSVR) algorithm form the prediction engine. The proposed hybrid approach is verified with electricity price data sampled from the power market of New South Wales, Australia. The simulation results show that the proposed hybrid approach achieves a noticeable improvement in forecasting accuracy compared with other approaches, which suggests that the combined approach has better predictive ability and sufficient precision.