• Title/Summary/Keyword: Machine learning algorithm

An Online Review Mining Approach to a Recommendation System (고객 온라인 구매후기를 활용한 추천시스템 개발 및 적용)

  • Cho, Seung-Yean;Choi, Jee-Eun;Lee, Kyu-Hyun;Kim, Hee-Woong
    • Information Systems Review, v.17 no.3, pp.95-111, 2015
  • A recommendation system analyzes customers' past behavior and automatically suggests the items they are expected to purchase. Such systems have been applied to many e-commerce businesses, where they improve both user convenience and company revenue. Existing recommendation systems have several limitations, however: they do not reflect the specific criteria customers use to evaluate products or the factors that affect buying decisions. This research therefore proposes a collaborative recommendation algorithm that uses each customer's online product reviews. The study applies topic modeling to mine customer opinions and adopts a kernel-based machine learning approach, selecting kernels that capture similarities between individuals based on their purchase histories and online reviews. A multiple kernel learning algorithm then integrates the kernels into a combined model for predicting product ratings, and the model is validated on a data set (purchased items, product ratings, and online reviews) from BestBuy, an online consumer electronics store. The theoretical contribution of the study is a new method for online recommendation systems: collaborative recommendation using topic modeling and kernel-based learning.
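
As a rough illustration of the kernel-combination idea in the abstract above, the following Python sketch computes one similarity from purchase histories and one from review topic proportions, then mixes them with fixed weights. The function names, the choice of a linear and an RBF kernel, and the weights are assumptions made for illustration only; in multiple kernel learning the weights would be learned from data, and the paper's actual kernels are not given in the abstract.

```python
import math

def linear_kernel(a, b):
    """Dot product of two equal-length vectors (e.g. purchase indicators)."""
    return sum(x * y for x, y in zip(a, b))

def rbf_kernel(a, b, gamma=1.0):
    """Gaussian (RBF) similarity of two vectors (e.g. review topic proportions)."""
    squared_distance = sum((x - y) ** 2 for x, y in zip(a, b))
    return math.exp(-gamma * squared_distance)

def combined_kernel(purchases_a, purchases_b, topics_a, topics_b,
                    weights=(0.5, 0.5)):
    """Weighted sum of the two base kernels.

    The weights are fixed here (an assumption); a multiple kernel
    learning algorithm would fit them to the rating data instead.
    """
    w_purchase, w_topic = weights
    return (w_purchase * linear_kernel(purchases_a, purchases_b)
            + w_topic * rbf_kernel(topics_a, topics_b))

# Hypothetical customers: purchase indicators and topic proportions.
similarity = combined_kernel([1, 0], [1, 0], [0.2, 0.8], [0.2, 0.8])
print(similarity)
```

A non-negative weighted sum of valid kernels is itself a valid kernel, which is why the combined similarity can be passed to any kernel-based learner.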

A Novel of Data Clustering Architecture for Outlier Detection to Electric Power Data Analysis (전력데이터 분석에서 이상점 추출을 위한 데이터 클러스터링 아키텍처에 관한 연구)

  • Jung, Se Hoon;Shin, Chang Sun;Cho, Young Yun;Park, Jang Woo;Park, Myung Hye;Kim, Young Hyun;Lee, Seung Bae;Sim, Chun Bo
    • KIPS Transactions on Software and Data Engineering, v.6 no.10, pp.465-472, 2017
  • In the past, researchers mainly analyzed power data with supervised machine learning techniques and identified patterns through data mining. These older classification and analysis techniques have reached their limits, however, now that the volume of electric power data has grown and data can be provided in real time. This study therefore proposes a clustering architecture for analyzing large-scale electric power data. The proposed clustering process addresses shortcomings of the K-means algorithm, an unsupervised learning technique, and can automate the entire process from the collection of electric power data to their analysis. Power data are categorized and analyzed at three levels: the raw data level, the clustering level, and the user interface level. In addition, the study determines K, the ideal number of clusters, based on principal component analysis and the normal distribution, and proposes a modified K-means algorithm that reduces the data categorized as ideal points in order to increase clustering efficiency.
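
For readers unfamiliar with the baseline that the abstract above builds on, here is a minimal sketch of plain K-means (Lloyd's algorithm) in Python. It is not the paper's modified algorithm: the initial centroids are supplied by the caller, K is fixed rather than derived from principal component analysis, and the function name and sample points are hypothetical.

```python
def kmeans(points, centroids, iterations=10):
    """Plain K-means: alternate nearest-centroid assignment and mean update."""
    clusters = []
    for _ in range(iterations):
        # Assignment step: put each point in the cluster of its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(
                range(len(centroids)),
                key=lambda i: sum((a - b) ** 2 for a, b in zip(p, centroids[i])),
            )
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its members;
        # an empty cluster keeps its previous centroid.
        centroids = [
            tuple(sum(axis) / len(members) for axis in zip(*members))
            if members else centroids[i]
            for i, members in enumerate(clusters)
        ]
    return centroids, clusters

# Hypothetical 2-D readings forming two well-separated groups.
readings = [(0, 0), (0, 2), (10, 10), (10, 12)]
centers, groups = kmeans(readings, centroids=[(0, 0), (10, 10)])
print(centers)
```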

Forensic Decision of Median Filtering by Pixel Value's Gradients of Digital Image (디지털 영상의 픽셀값 경사도에 의한 미디언 필터링 포렌식 판정)

  • RHEE, Kang Hyeon
    • Journal of the Institute of Electronics and Information Engineers, v.52 no.6, pp.79-84, 2015
  • The distribution of digital images altered by forgers is a serious problem. To address it, this paper proposes a median filtering (MF) forensic decision algorithm that uses a feature vector derived from pixel value gradients. In the proposed algorithm, AR (autoregressive) coefficients are computed from the pixel value gradients of the original image, and the 1st- to 6th-order coefficients form a six-dimensional feature vector. A reconstructed image is then produced by solving Poisson's equation with the gradients. From the difference image between the original and its reconstruction, a four-dimensional feature vector (average value, maximum value, and the coordinates i, j of the maximum value) is extracted. The two feature vectors are combined into a 10-dimensional feature vector, which is used to train an SVM (support vector machine) classifier that detects median filtering in altered images. Compared with the MFR (median filter residual) scheme, which uses a feature vector of the same 10 dimensions, the proposed algorithm performs better on unaltered, average-filtered ($3{\times}3$), and JPEG (QF=90) images and worse on Gaussian-filtered ($3{\times}3$) images. Across all measured items, however, the AUC (area under the curve) of sensitivity versus 1-specificity approaches 1. The proposed algorithm is thus confirmed to be rated 'Excellent (A)'.
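
The four difference-image features named in the abstract above (average, maximum, and the coordinates of the maximum) can be sketched as follows. This is only an illustration under stated assumptions: images are nested lists of grayscale values, the difference is taken as an absolute value, and the reconstructed image is simply given as input rather than obtained by solving Poisson's equation. The function name and sample values are hypothetical.

```python
def difference_features(original, reconstructed):
    """Return [average, maximum, row of maximum, column of maximum]
    of the absolute difference between two equally sized images."""
    cells = [
        (abs(o - r), i, j)
        for i, (row_o, row_r) in enumerate(zip(original, reconstructed))
        for j, (o, r) in enumerate(zip(row_o, row_r))
    ]
    average = sum(value for value, _, _ in cells) / len(cells)
    # On ties, max() keeps the first cell in row-major order.
    maximum, max_i, max_j = max(cells, key=lambda cell: cell[0])
    return [average, maximum, max_i, max_j]

# Hypothetical 2x2 grayscale images.
features = difference_features([[10, 20], [30, 40]], [[10, 18], [35, 40]])
print(features)  # [1.75, 5, 1, 0]
```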

Development on Identification Algorithm of Risk Situation around Construction Vehicle using YOLO-v3 (YOLO-v3을 활용한 건설 장비 주변 위험 상황 인지 알고리즘 개발)

  • Shim, Seungbo;Choi, Sang-Il
    • Journal of the Korea Academia-Industrial cooperation Society, v.20 no.7, pp.622-629, 2019
  • The government has recently been taking new approaches to reduce the high share of industrial accidents and accident deaths attributable to the construction industry. In particular, it is investing heavily in construction technology combined with ICT, in line with the current trend of the 4th Industrial Revolution. Against this background, this paper proposes a concept for recognizing and sharing work situation information between construction machine drivers and nearby workers to enhance safety where construction machines operate. To realize part of the concept, we applied camera-based image processing technology using artificial intelligence to earth-moving work. Specifically, through experiments with compaction equipment, we implemented an image processing algorithm based on YOLO-v3 that recognizes the circumstances of nearby workers and identifies risk situations. The algorithm processes video at 15.06 frames per second and recognizes dangerous situations around construction machines with an accuracy of 90.48%. We expect this technology to contribute to preventing safety accidents at construction sites.

Artificial Intelligence to forecast new nurse turnover rates in hospital (인공지능을 이용한 신규간호사 이직률 예측)

  • Choi, Ju-Hee;Park, Hye-Kyung;Park, Ji-Eun;Lee, Chang-Min;Choi, Byung-Gwan
    • Journal of the Korea Convergence Society, v.9 no.9, pp.431-440, 2018
  • In this study, the authors predicted the probability of resignation of newly employed nurses using TensorFlow, an open source software library for numerical computation and machine learning developed by Google, and suggested a strategic human resources management plan. Data on 1,018 nurses who resigned between 2010 and 2017 at a single university hospital were collected. After the data were randomly shuffled, 80% were used for machine learning and the remaining 20% for testing. We used a multilayer neural network with one input layer, one output layer, and 3 hidden layers. The machine learning algorithm correctly predicted 88.7% of resignations of nursing staff within one year of employment and 79.8% of those within 3 years of employment. Most of the nurses who resigned were in their late 20s and 30s. The leading causes of resignation were marriage, childbirth, childcare, and personal affairs. However, the most common causes of resignation within one year of employment were maladaptation to the work and problems in interpersonal relationships.
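
The shuffle-then-split step described in the abstract above can be sketched in a few lines of Python. The function name and the fixed seed are assumptions for illustration, and integer indices stand in for the 1,018 nurse records; how the authors rounded the 80% boundary is not stated, so truncation is assumed here.

```python
import random

def shuffle_split(records, train_fraction=0.8, seed=0):
    """Shuffle a copy of the records, then cut it into training and test parts."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)  # seeded for reproducibility
    cut = int(len(shuffled) * train_fraction)  # truncates toward zero
    return shuffled[:cut], shuffled[cut:]

# Placeholder records: indices 0..1017 instead of real nurse data.
train, test = shuffle_split(range(1018))
print(len(train), len(test))  # 814 204
```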

Data analysis by Integrating statistics and visualization: Visual verification for the prediction model (통계와 시각화를 결합한 데이터 분석: 예측모형 대한 시각화 검증)

  • Mun, Seong Min;Lee, Kyung Won
    • Design Convergence Study, v.15 no.6, pp.195-214, 2016
  • Predictive analysis is based on probabilistic learning algorithms known as pattern recognition or machine learning. Users who want to extract more information from the data therefore need a high level of statistical knowledge, and it is difficult to discover patterns and characteristics in the data. This study conducted statistical and visual data analyses to compensate for these weaknesses of predictive analysis, and it yielded implications not found in previous studies. First, data patterns emerged when the data selection was adjusted according to the splitting criteria of the decision tree method. Second, we could identify which types of data were included in the final prediction model. In the statistical analysis, we found relationships among multiple variables and derived a prediction model for high box office performance. In the visual analysis, we proposed a visual analysis method with various interactive functions. Finally, we verified the final prediction model and suggested an analysis method that extracts a variety of information from the data.

A Time Series Graph based Convolutional Neural Network Model for Effective Input Variable Pattern Learning : Application to the Prediction of Stock Market (효과적인 입력변수 패턴 학습을 위한 시계열 그래프 기반 합성곱 신경망 모형: 주식시장 예측에의 응용)

  • Lee, Mo-Se;Ahn, Hyunchul
    • Journal of Intelligence and Information Systems, v.24 no.1, pp.167-181, 2018
  • Over the past decade, deep learning has been in the spotlight among machine learning algorithms. In particular, the CNN (convolutional neural network), known as an effective solution for recognizing and classifying images or voices, has been widely applied to classification and prediction problems. In this study, we investigate how to apply CNNs to business problem solving. Specifically, we propose applying a CNN to stock market prediction, one of the most challenging tasks in machine learning research. Because CNNs are strong at interpreting images, the proposed model uses a CNN as a binary classifier that predicts stock market direction (upward or downward) from time series graphs. That is, we build a machine learning algorithm that mimics the experts called 'technical analysts', who examine graphs of past price movements to predict future price movements. The proposed model, named 'CNN-FG (Convolutional Neural Network using Fluctuation Graph)', consists of five steps. In step 1, it divides the dataset into intervals of 5 days. In step 2, it creates time series graphs for the divided dataset. Each graph is drawn in an image of $40(pixels){\times}40(pixels)$, with each independent variable drawn in a different color. In step 3, the model converts the images into matrices. Each image is converted into a combination of three matrices that express the color values on the R (red), G (green), and B (blue) scales. In step 4, it splits the dataset of graph images into training and validation datasets; we used 80% of the total dataset for training and the remaining 20% for validation. In step 5, CNN classifiers are trained on the images of the training dataset.
Regarding the parameters of CNN-FG, we adopted two convolution filters ($5{\times}5{\times}6$ and $5{\times}5{\times}9$) in the convolution layer. In the pooling layer, a $2{\times}2$ max pooling filter was used. The numbers of nodes in the two hidden layers were set to 900 and 32, respectively, and the number of nodes in the output layer was set to 2 (one for the prediction of an upward trend and the other for a downward trend). The activation function for the convolution layer and the hidden layers was ReLU (rectified linear unit), and the one for the output layer was the softmax function. To validate CNN-FG, we applied it to the prediction of KOSPI200 for 2,026 days over eight years (from 2009 to 2016). To balance the proportions of the two groups in the dependent variable (i.e., tomorrow's stock market movement), we selected 1,950 samples by random sampling. We then built the training dataset from 80% of the total dataset (1,560 samples) and the validation dataset from 20% (390 samples). The independent variables of the experimental dataset included twelve technical indicators widely used in previous studies, including Stochastic %K, Stochastic %D, Momentum, ROC (rate of change), LW %R (Larry Williams' %R), A/D oscillator (accumulation/distribution oscillator), OSCP (price oscillator), CCI (commodity channel index), and so on. To confirm the superiority of CNN-FG, we compared its prediction accuracy with that of other classification models. Experimental results showed that CNN-FG outperforms LOGIT (logistic regression), ANN (artificial neural network), and SVM (support vector machine) with statistical significance. These empirical results imply that converting time series business data into graphs and building CNN-based classification models on these graphs can be effective from the perspective of prediction accuracy.
Thus, this paper sheds light on how to apply deep learning techniques to the domain of business problem solving.
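
Step 3 of CNN-FG, converting each graph image into three color matrices, can be illustrated with a short Python sketch. It assumes the image is already available as rows of (R, G, B) tuples; the function name and the tiny 2×2 sample are hypothetical, whereas the paper's images are 40×40 pixels.

```python
def split_rgb(image):
    """Split an image given as rows of (R, G, B) tuples into three
    matrices, one per color channel."""
    red = [[pixel[0] for pixel in row] for row in image]
    green = [[pixel[1] for pixel in row] for row in image]
    blue = [[pixel[2] for pixel in row] for row in image]
    return red, green, blue

# Hypothetical 2x2 image: a red and a green pixel above a blue and a white one.
image = [
    [(255, 0, 0), (0, 255, 0)],
    [(0, 0, 255), (255, 255, 255)],
]
red, green, blue = split_rgb(image)
print(red)  # [[255, 0], [0, 255]]
```

Stacking the three matrices gives the three-channel input that a convolution layer expects, which is why drawing each variable in a different color lets one image carry several series at once.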

A Design of Fuzzy Classifier with Hierarchical Structure (계층적 구조를 가진 퍼지 패턴 분류기 설계)

  • Ahn, Tae-Chon;Roh, Seok-Beom;Kim, Yong Soo
    • Journal of the Korean Institute of Intelligent Systems, v.24 no.4, pp.355-359, 2014
  • In this paper, we propose a new fuzzy pattern classifier that hierarchically combines several fuzzy models with simple consequent parts. Because its basic component is a fuzzy model with a simple consequent part, the complexity of the proposed hierarchical classifier is not high. To analyze and divide the input space, we use the Fuzzy C-Means clustering algorithm. In addition, we use the Conditional Fuzzy C-Means clustering algorithm to analyze the subspaces produced by Fuzzy C-Means clustering. In each clustered region, we apply a fuzzy model with a simple consequent part, building up the fuzzy pattern classifier with hierarchical structure. Because of this hierarchical structure, the data distribution of the input space can be analyzed from both macroscopic and microscopic points of view. Finally, machine learning data sets are used to evaluate the classification ability of the proposed pattern classifier.
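
To make the Fuzzy C-Means step in the abstract above more concrete, the sketch below computes the standard FCM membership degrees of one point with respect to fixed cluster centers. It covers only the membership update of ordinary Fuzzy C-Means, not the conditional variant or the hierarchical classifier; the function name, the fuzzifier value m = 2, and the sample points are assumptions for illustration.

```python
import math

def fcm_memberships(point, centers, m=2.0):
    """Membership of `point` in each cluster under standard Fuzzy C-Means:
    u_i = 1 / sum_j (d_i / d_j) ** (2 / (m - 1)), with d_i the Euclidean
    distance from the point to center i. Requires m > 1."""
    dists = [
        math.sqrt(sum((p - c) ** 2 for p, c in zip(point, center)))
        for center in centers
    ]
    if 0.0 in dists:
        # The point sits exactly on one or more centers: share full
        # membership among those centers and give none to the rest.
        hits = dists.count(0.0)
        return [1.0 / hits if d == 0.0 else 0.0 for d in dists]
    power = 2.0 / (m - 1.0)
    return [1.0 / sum((d_i / d_j) ** power for d_j in dists) for d_i in dists]

# A point one unit from the first center and three units from the second.
print(fcm_memberships((1.0, 0.0), [(0.0, 0.0), (4.0, 0.0)]))
```

Unlike K-means, every point belongs to every cluster to some degree, and the degrees for one point always sum to 1.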

Video Based Fall Detection Algorithm Using Hidden Markov Model (은닉 마르코프 모델을 이용한 동영상 기반 낙상 인식 알고리듬)

  • Kim, Nam Ho;Yu, Yun Seop
    • Journal of the Institute of Electronics and Information Engineers, v.50 no.8, pp.232-237, 2013
  • A newly developed fall detection algorithm using an HMM (Hidden Markov Model) on features extracted from video is introduced. The HMM machine learning algorithm is used to distinguish falls, whose patterns differ from person to person, from normal activities of daily living (ADL). To obtain the fall feature vector from video, PCA (Principal Component Analysis) is applied to the motion vectors from optical flow. Combinations of the angle, the ratio of the long to short axis, and the velocity obtained from the PCA results form the new fall feature parameters. These parameters were applied to the HMM, and the results were compared and analyzed. Among the newly proposed fall parameters, the angle of movement showed the best results. The results show that this parameter can distinguish various types of falls from ADLs with 91.5% sensitivity and 88.01% specificity.
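
The two metrics reported in the abstract above follow directly from confusion-matrix counts, as this small Python sketch shows. The counts in the example are made up for illustration and are not the paper's data; the function name is hypothetical.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Sensitivity = TP / (TP + FN): share of real falls that were detected.
    Specificity = TN / (TN + FP): share of normal activities left unflagged."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity, specificity

# Made-up counts: 9 of 10 falls detected, 8 of 10 normal activities passed.
print(sensitivity_specificity(tp=9, fn=1, tn=8, fp=2))  # (0.9, 0.8)
```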

Binary classification by the combination of Adaboost and feature extraction methods (특징 추출 알고리즘과 Adaboost를 이용한 이진분류기)

  • Ham, Seaung-Lok;Kwak, No-Jun
    • Journal of the Institute of Electronics Engineers of Korea CI, v.49 no.4, pp.42-53, 2012
  • In the pattern recognition and machine learning community, classification is a classical problem and the most widely researched area. Adaptive boosting, also known as Adaboost, has been successfully applied to binary classification problems. It is a boosting algorithm that constructs a strong classifier through a weighted combination of weak classifiers. Meanwhile, the PCA and LDA algorithms are the most popular linear feature extraction methods, used mainly for dimensionality reduction. This paper proposes combining Adaboost with feature extraction methods for efficient classification of two-class data. Conventionally, the roles of feature extraction and classification have been distinct: a feature extraction method and a classifier are applied sequentially to classify input variables into several categories. In this paper, the two steps are combined into one, resulting in good classification performance. More specifically, each projection vector is treated as a weak classifier in the Adaboost algorithm to build a strong classifier for binary classification problems. The proposed algorithm was applied to UCI and FRGC datasets and showed better recognition rates than the sequential application of feature extraction and classification methods.
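
The "weighted combination of weak classifiers" mentioned in the abstract above rests on a per-round update that can be sketched as follows. This shows one round of standard discrete AdaBoost with labels in {-1, +1}; the weak classifier's predictions are simply passed in, so the paper's use of projection vectors as weak classifiers is not modeled. The function name and sample data are hypothetical.

```python
import math

def adaboost_round(weights, predictions, labels):
    """One round of discrete AdaBoost.

    Returns the weak classifier's vote weight (alpha) and the
    renormalized sample weights. Assumes the input weights sum to 1
    and the weighted error lies strictly between 0 and 1.
    """
    error = sum(w for w, p, y in zip(weights, predictions, labels) if p != y)
    alpha = 0.5 * math.log((1.0 - error) / error)
    # Misclassified samples gain weight, correctly classified ones lose it.
    updated = [
        w * math.exp(alpha if p != y else -alpha)
        for w, p, y in zip(weights, predictions, labels)
    ]
    total = sum(updated)
    return alpha, [w / total for w in updated]

# Four equally weighted samples; the weak classifier misses the last one.
alpha, new_weights = adaboost_round(
    weights=[0.25, 0.25, 0.25, 0.25],
    predictions=[1, 1, -1, -1],
    labels=[1, 1, -1, 1],
)
print(alpha, new_weights)
```

After the update the misclassified samples hold half of the total weight, so the next weak classifier is pushed toward the examples the current one got wrong.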