• Title/Summary/Keyword: Conventional machine learning

Computer Vision-Based Measurement Method for Wire Harness Defect Classification

  • Yun Jung Hong;Geon Lee;Jiyoung Woo
    • Journal of the Korea Society of Computer and Information / v.29 no.1 / pp.77-84 / 2024
  • In this paper, we propose a method for accurately and rapidly detecting defects in wire harnesses by using computer vision to calculate six crucial measurement values: the length of crimped terminals, the dimensions (width) of terminal ends, and the width of crimped sections (wire and core portions). We employ Harris corner detection to locate object positions from two types of data. We then generate reference points for extracting measurement values by using features specific to each measurement area and exploiting the contrast in shading between the background and objects, thus reflecting the slope of each sample. Next, we introduce a method that uses the Euclidean distance and correction coefficients to predict values, so that measurements can be predicted regardless of changes in the wire's position. We achieve high accuracy for each measurement type (99.1%, 98.7%, 92.6%, 92.5%, 99.9%, and 99.7%), for an overall average accuracy of 97% across all measurements. This inspection method not only addresses the limitations of conventional visual inspections but also yields excellent results with a small amount of data. Moreover, because it relies solely on image processing, it is expected to be more cost-effective and to require less data than deep learning methods.
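
The measurement step described above can be illustrated with a minimal sketch. The abstract does not say how the correction coefficients are applied, so this example assumes a simple multiplicative scale (millimetres per pixel); the function name `measure_mm`, the coordinates, and the coefficient value are all hypothetical.

```python
import math

def measure_mm(p1, p2, correction):
    """Estimate a physical length from two reference points given in pixels.

    `correction` is a hypothetical coefficient (mm per pixel) that would be
    calibrated from samples with known dimensions; this is an assumption,
    not the paper's stated procedure.
    """
    pixel_distance = math.dist(p1, p2)  # Euclidean distance in pixels
    return pixel_distance * correction

# The same part photographed at two different positions in the image.
straight = measure_mm((10.0, 20.0), (40.0, 60.0), correction=0.05)
shifted = measure_mm((110.0, 220.0), (140.0, 260.0), correction=0.05)
print(straight, shifted)
```

Because the Euclidean distance depends only on the offset between the two reference points, the estimate does not change when the wire is shifted within the frame.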

Indoor Positioning System using Geomagnetic Field with Recurrent Neural Network Model (순환신경망을 이용한 자기장 기반 실내측위시스템)

  • Bae, Han Jun;Choi, Lynn;Park, Byung Joon
    • The Journal of Korean Institute of Next Generation Computing / v.14 no.6 / pp.57-65 / 2018
  • Conventional RF signal-based indoor localization techniques, such as BLE or Wi-Fi based fingerprinting, show considerable localization errors even in small-scale indoor environments because the received signal strength (RSS) of RF signals is unstable. Therefore, it is difficult to apply existing RF-based fingerprinting techniques to large-scale indoor environments such as airports and department stores. In this paper, instead of RF signals, we use the geomagnetic sensor signal for indoor localization, whose signal strength is more stable than RF RSS. Although similar geomagnetic field values recur within an indoor space, a moving object experiences a unique sequence of geomagnetic field signals as its movement continues. We use a deep neural network model called the recurrent neural network (RNN), which is effective in recognizing time-varying sequences of sensor data, to track the user's location and movement path. To evaluate the performance of the proposed geomagnetic field based indoor positioning system (IPS), we constructed a magnetic field map for a campus testbed of about 94 m × 26 m and trained the RNN using various potential movement paths and their location data extracted from the magnetic field map. By adjusting various hyperparameters, we achieved an average localization error of 1.20 meters in the testbed.
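
To illustrate why a sequence of readings carries more information than a single reading, here is a minimal sketch of a simple (Elman-style) RNN forward pass. It is not the authors' model: the weights are made-up and untrained, the readings are assumed to be scalar field magnitudes, and the names `rnn_forward`, `w_in`, `w_rec`, and `w_out` are hypothetical.

```python
import math

def rnn_forward(sequence, w_in, w_rec, bias, w_out):
    """Run a tiny recurrent network over scalar readings; return an (x, y) estimate."""
    hidden = [0.0] * len(bias)
    for reading in sequence:
        # Each new hidden state mixes the current reading with the previous state.
        hidden = [
            math.tanh(
                w_in[i] * reading
                + sum(w_rec[i][j] * hidden[j] for j in range(len(hidden)))
                + bias[i]
            )
            for i in range(len(hidden))
        ]
    return tuple(sum(w * h for w, h in zip(row, hidden)) for row in w_out)

# Untrained, hand-picked weights (assumption, for illustration only).
w_in = [0.5, -0.3]
w_rec = [[0.1, 0.2], [0.0, 0.4]]
bias = [0.0, 0.0]
w_out = [[1.0, 0.0], [0.0, 1.0]]

# Two paths that end on the same reading (0.4) but arrive by different routes.
est_a = rnn_forward([0.2, 0.5, 0.4], w_in, w_rec, bias, w_out)
est_b = rnn_forward([0.6, 0.1, 0.4], w_in, w_rec, bias, w_out)
print(est_a, est_b)
```

The two estimates differ even though the final reading is identical, because the hidden state summarizes the whole path. A trained model would learn the weights from paths extracted from the magnetic field map.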

TeGCN:Transformer-embedded Graph Neural Network for Thin-filer default prediction (TeGCN:씬파일러 신용평가를 위한 트랜스포머 임베딩 기반 그래프 신경망 구조 개발)

  • Seongsu Kim;Junho Bae;Juhyeon Lee;Heejoo Jung;Hee-Woong Kim
    • Journal of Intelligence and Information Systems / v.29 no.3 / pp.419-437 / 2023
  • As the number of thin filers in Korea surpasses 12 million, there is growing interest in assessing their credit default risk more accurately in order to generate additional revenue. Specifically, researchers are actively developing default prediction models using machine learning and deep learning algorithms, in contrast to traditional statistical default prediction methods, which struggle to capture nonlinearity. Among these efforts, the Graph Neural Network (GNN) architecture is noteworthy for predicting default when data on thin filers are limited, because it can incorporate network information between borrowers alongside conventional credit-related data. However, prior research employing graph neural networks has been limited in its ability to handle the diverse categorical variables present in credit information. In this study, we introduce the Transformer embedded Graph Convolutional Network (TeGCN), which aims to address these limitations and enable effective default prediction for thin filers. TeGCN combines the TabTransformer, which extracts contextual information from categorical variables, with the Graph Convolutional Network, which captures network information between borrowers. Our TeGCN model surpasses the baseline model's performance on both the general borrower dataset and the thin filer dataset. In particular, our model achieves outstanding results in thin filer default prediction. This study achieves high default prediction accuracy through a model structure tailored to the characteristics of credit information containing numerous categorical variables, especially in the context of thin filers with limited data. Our study can contribute to resolving the financial exclusion issues faced by thin filers and facilitate additional revenue within the financial industry.
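
The graph-convolution half of the architecture can be sketched as a single propagation step. This is the widely used symmetric-normalized GCN layer, written in plain Python under stated assumptions: the borrower graph, the feature rows (which in TeGCN would come from TabTransformer embeddings), and the weight matrix are hand-made placeholders, and `gcn_layer` is a hypothetical name.

```python
import math

def gcn_layer(adjacency, features, weights):
    """One graph-convolution step: ReLU(D^-1/2 (A + I) D^-1/2 · H · W)."""
    n = len(adjacency)
    # Add self-loops so each borrower keeps its own features.
    a_hat = [[adjacency[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
             for i in range(n)]
    degree = [sum(row) for row in a_hat]
    norm = [[a_hat[i][j] / math.sqrt(degree[i] * degree[j]) for j in range(n)]
            for i in range(n)]
    # Aggregate neighbor features, then apply the linear map and ReLU.
    aggregated = [[sum(norm[i][k] * features[k][f] for k in range(n))
                   for f in range(len(features[0]))] for i in range(n)]
    return [[max(0.0, sum(row[f] * weights[f][o] for f in range(len(weights))))
             for o in range(len(weights[0]))] for row in aggregated]

# Three borrowers; borrowers 0 and 1 are linked, borrower 2 is isolated.
adjacency = [[0, 1, 0], [1, 0, 0], [0, 0, 0]]
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # placeholder embeddings
weights = [[1.0, 0.0], [0.0, 1.0]]               # identity, for readability
output = gcn_layer(adjacency, features, weights)
print(output)
```

With identity weights, the two linked borrowers end up with the average of each other's features, while the isolated borrower keeps its own. That sharing is how network information can compensate for sparse individual data.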

A Hybrid Collaborative Filtering-based Product Recommender System using Search Keywords (검색 키워드를 활용한 하이브리드 협업필터링 기반 상품 추천 시스템)

  • Lee, Yunju;Won, Haram;Shim, Jaeseung;Ahn, Hyunchul
    • Journal of Intelligence and Information Systems / v.26 no.1 / pp.151-166 / 2020
  • A recommender system recommends products or services that best meet the preferences of each customer using statistical or machine learning techniques. Collaborative filtering (CF) is the most commonly used algorithm for implementing recommender systems. However, in most cases it uses only purchase history or customer ratings, even though customers provide many other kinds of data. E-commerce customers frequently use a search function to find the products they are interested in among the vast array of products offered. Such search keyword data may be a very useful source of information for modeling customer preferences, yet it is rarely used in recommender systems. In this paper, we propose a novel hybrid CF model based on the Doc2Vec algorithm using search keywords and purchase history data of online shopping mall customers. To validate the applicability of the proposed model, we empirically tested its performance using real-world online shopping mall data from Korea. As the number of recommended products increased, the recommendation performance of the proposed hybrid CF (based on customers' search keywords) improved, whereas the performance of a conventional CF gradually decreased. As a result, we found that search keyword data effectively represent customer preferences and might contribute to an improvement in conventional CF recommender systems.
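
One way to combine the two data sources is to blend a purchase-based similarity with a keyword-based similarity when finding neighbors. The abstract does not give the combination rule, so the weighted sum below, the weight `alpha`, and the small vectors standing in for Doc2Vec keyword embeddings are all assumptions.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def hybrid_similarity(purchases_a, purchases_b, keywords_a, keywords_b, alpha=0.5):
    """Blend purchase-history similarity with search-keyword similarity.

    `alpha` is a hypothetical mixing weight, not a value from the paper.
    """
    return (alpha * cosine(purchases_a, purchases_b)
            + (1 - alpha) * cosine(keywords_a, keywords_b))

# Two customers with no purchases in common but identical search interests.
purchases_a, purchases_b = [1, 0, 1, 0], [0, 1, 0, 1]
keywords_a, keywords_b = [0.2, 0.9], [0.2, 0.9]  # placeholder keyword embeddings

purchase_only = cosine(purchases_a, purchases_b)
hybrid = hybrid_similarity(purchases_a, purchases_b, keywords_a, keywords_b)
print(purchase_only, hybrid)
```

A purchase-only CF would treat these two customers as unrelated, while the blended score still recognizes their shared interests.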

Safety Verification Techniques of Privacy Policy Using GPT (GPT를 활용한 개인정보 처리방침 안전성 검증 기법)

  • Hye-Yeon Shim;MinSeo Kweun;DaYoung Yoon;JiYoung Seo;Il-Gu Lee
    • Journal of the Korea Institute of Information Security & Cryptology / v.34 no.2 / pp.207-216 / 2024
  • As big data has accumulated with the 4th Industrial Revolution, personalized services have increased rapidly. As a result, the amount of personal information collected by online services has grown, along with concerns about leakage of users' personal information and privacy infringement. Online service providers publish privacy policies to address these concerns, but privacy policies are often misused because they are long and complex, which makes it difficult for users to identify risk items directly. Therefore, there is a need for a method that can automatically check whether a privacy policy is safe. However, conventional blacklist-based and machine learning-based safety verification techniques for privacy policies are difficult to extend or have low accessibility. In this paper, to solve this problem, we propose a safety verification technique for privacy policies using the GPT-3.5 API, a generative artificial intelligence. The technique can perform classification even in a new environment, and it shows that members of the general public without expertise could easily inspect a privacy policy. In the experiment, we measured how accurately the blacklist-based and GPT-based techniques classify safe and unsafe sentences, as well as the time spent on classification. According to the experimental results, the proposed technique showed 10.34% higher accuracy on average than the conventional blacklist-based sentence safety verification technique.
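
The conventional baseline that the paper compares against can be sketched as a phrase-matching classifier plus an accuracy measure. The phrases, sentences, and labels below are invented for illustration and are not taken from the paper's dataset; `is_unsafe` and `accuracy` are hypothetical names.

```python
# Hypothetical risk phrases; a real blacklist would be curated by experts.
BLACKLIST = ("third party", "without consent", "sell")

def is_unsafe(sentence, blacklist=BLACKLIST):
    """Flag a sentence as unsafe if it contains any blacklisted phrase."""
    text = sentence.lower()
    return any(phrase in text for phrase in blacklist)

def accuracy(sentences, labels, classifier):
    """Fraction of sentences whose predicted label matches the given label."""
    correct = sum(classifier(s) == label for s, label in zip(sentences, labels))
    return correct / len(sentences)

# Invented sample sentences (True = unsafe).
sentences = [
    "We may sell your data to advertisers.",
    "We encrypt all stored passwords.",
    "Your information can be shared with our partners.",
]
labels = [True, False, True]

acc = accuracy(sentences, labels, is_unsafe)
print(acc)
```

The third sentence is missed because its wording is not on the list. This shows the extensibility problem that motivates using a language model instead of fixed phrases.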

A Study on the Prediction Model of Stock Price Index Trend based on GA-MSVM that Simultaneously Optimizes Feature and Instance Selection (입력변수 및 학습사례 선정을 동시에 최적화하는 GA-MSVM 기반 주가지수 추세 예측 모형에 관한 연구)

  • Lee, Jong-sik;Ahn, Hyunchul
    • Journal of Intelligence and Information Systems / v.23 no.4 / pp.147-168 / 2017
  • Accurate stock market forecasting has long been studied in academia, and many forecasting models based on a variety of techniques now exist. Recently, many attempts have been made to predict the stock index using machine learning methods, including deep learning. Although both fundamental analysis and technical analysis are used in traditional stock investment, technical analysis is more useful for short-term trading prediction and for applying statistical and mathematical techniques. Most studies using technical indicators have built models that predict stock prices through binary classification (rising or falling) of future market movements, usually for the next trading day. However, this binary classification has many drawbacks for predicting trends, identifying trading signals, or signaling portfolio rebalancing. In this study, we predict the stock index trend by extending the existing binary scheme to a multi-class scheme with three trend states (upward trend, boxed, downward trend). To solve this multi-class classification problem, rather than techniques such as Multinomial Logistic Regression Analysis (MLOGIT), Multiple Discriminant Analysis (MDA), or Artificial Neural Networks (ANN), we propose an optimization model that uses a Genetic Algorithm as a wrapper to improve the performance of Multi-classification Support Vector Machines (MSVM), which have proved to be superior in prediction performance. In particular, the proposed model, named GA-MSVM, is designed to maximize model performance by simultaneously optimizing the kernel function parameters of MSVM, the selection of input variables (feature selection), and instance selection.
To verify the performance of the proposed model, we applied it to real data. The results show that the proposed method outperforms the conventional MSVM, which has been known to show the best prediction performance so far, as well as existing artificial intelligence and data mining techniques such as MDA, MLOGIT, and CBR. In particular, instance selection was confirmed to play a very important role in predicting the stock index trend and to contribute more to model improvement than the other factors. To verify the usefulness of GA-MSVM, we applied it to trend forecasting for Korea's KOSPI200 stock index. Our research is primarily aimed at predicting trend segments in order to capture trading signals or short-term trend transition points. The experimental data set covers 2004 to 2017 and includes technical indicators such as the price and volatility index of the KOSPI200 stock index, together with macroeconomic data (interest rate, exchange rate, S&P 500, etc.). Using a variety of statistical methods, including one-way ANOVA and stepwise MDA, 15 indicators were selected as candidate independent variables. The dependent variable, trend classification, took one of three states: 1 (upward trend), 0 (boxed), and -1 (downward trend). For each class, 70% of the data was used for training and the remaining 30% for verification. To verify the performance of the proposed model, comparative experiments were conducted with MDA, MLOGIT, CBR, ANN, and MSVM. MSVM adopted the One-Against-One (OAO) approach, which is known as the most accurate among the various MSVM approaches. Although there are some limitations, the final experimental results demonstrate that the proposed model, GA-MSVM, performs at a significantly higher level than all comparative models.
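
The One-Against-One (OAO) approach mentioned above trains one binary classifier per pair of classes and lets them vote. The sketch below shows only the voting step for the three trend states. The pairwise classifiers are hypothetical threshold rules on a single made-up feature, standing in for trained binary SVMs.

```python
from collections import Counter
from itertools import combinations

CLASSES = (1, 0, -1)  # upward trend, boxed, downward trend

def oao_predict(sample, pairwise_classifiers):
    """Return the class that wins the most pairwise contests.

    Ties, if any, are broken by the order in which classes first receive votes.
    """
    votes = Counter()
    for pair in combinations(CLASSES, 2):
        votes[pairwise_classifiers[pair](sample)] += 1
    return votes.most_common(1)[0][0]

# Hypothetical stand-ins for trained binary SVMs (assumption, illustration only).
pairwise_classifiers = {
    (1, 0): lambda x: 1 if x > 0.5 else 0,
    (1, -1): lambda x: 1 if x > 0.0 else -1,
    (0, -1): lambda x: 0 if x > -0.5 else -1,
}

predictions = [oao_predict(x, pairwise_classifiers) for x in (0.8, 0.1, -0.9)]
print(predictions)
```

With three classes there are three pairwise classifiers. In GA-MSVM, the genetic algorithm would additionally search over kernel parameters, input variables, and training instances before classifiers like these are fitted.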

A Study on the Turbidity Estimation Model Using Data Mining Techniques in the Water Supply System (데이터마이닝 기법을 이용한 상수도 시스템 내의 탁도 예측모형 개발에 관한 연구)

  • Park, No-Suk;Kim, Soonho;Lee, Young Joo;Yoon, Sukmin
    • Journal of Korean Society of Environmental Engineers / v.38 no.2 / pp.87-95 / 2016
  • Turbidity is a key indicator to users of the 'Discolored Water' phenomenon, which is known to be caused by corrosion of pipelines in the water supply system. 'Discolored Water' is defined as a state in which turbidity is high enough for the user to recognize it visually. Therefore, this study used data mining techniques to estimate turbidity changes in the water supply system. Among data mining techniques, decision tree analysis was applied to develop estimation models for turbidity changes. The pH and residual chlorine datasets were used as variables of the turbidity estimation model. As a result, the model using both variables (pH and residual chlorine) produced more reasonable estimates than models using either variable alone. However, the estimation model developed in this study underestimated the peak observed values. To overcome this disadvantage, a high-pass filter method was introduced as a pretreatment for the estimation model. The modified model using the high-pass filter predicted the peak observed values more accurately and showed better overall prediction performance than the conventional model.
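
The effect of a high-pass pretreatment can be sketched with a first-order discrete filter. The abstract does not specify the filter design, so the recurrence below, the smoothing factor `alpha`, and the sample series are assumptions chosen only to show how a slowly varying baseline is suppressed while a sudden peak is kept.

```python
def high_pass(series, alpha=0.8):
    """First-order high-pass filter: y[i] = alpha * (y[i-1] + x[i] - x[i-1]).

    The output starts at 0 so that a constant baseline maps to 0.
    `alpha` is a hypothetical setting, not a value from the paper.
    """
    filtered = [0.0]
    for i in range(1, len(series)):
        filtered.append(alpha * (filtered[-1] + series[i] - series[i - 1]))
    return filtered

# Made-up input signal: flat baseline with one sudden spike.
signal = [1.0, 1.0, 1.0, 5.0, 1.0]
filtered = high_pass(signal)
print(filtered)
```

The flat baseline is removed and the spike stands out, which is the kind of emphasis on abrupt changes that could help a model that otherwise underestimates peaks.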

A Study of Statistical Learning as a CRM's Classifier Functions (CRM의 기능 분류를 위한 통계적 학습에 관한 연구)

  • Jang, Geun;Lee, Jung-Bae;Lee, Byung-Soo
    • The KIPS Transactions:PartB / v.11B no.1 / pp.71-76 / 2004
  • Recent ERP and CRM systems are mostly focused on performing conventional functions. However, the rapid progress of the internet and e-commerce has changed the market in the recent business environment. Business is largely becoming e-business, expanding through the development of relationships with cooperating companies, the rapid progress of relationships with customers, and stronger competitiveness gained by improving business processes within the organization. CRM (customer relationship management) is a marketing process that forms, manages, and strengthens the relationship between customers and companies in order to manage acquired customers and increase their value to the company. It requires a system base that analyzes customer information, since it operates on various information about customers and is linked to business areas such as production, marketing, and decision making. Since ERP is extending its functions to SCM, CRM, and SEM (Strategic Enterprise Management), 21st-century ERP is developing into a strategic tool for e-business and, as a means to this end, will subdivide the functions of CRM effectively through the analogic study of data. In addition, one feature of the system is an agent that applies machine learning so that file classification work previously performed manually by the user can be carried out automatically and more efficiently.

A Method of Detecting the Aggressive Driving of Elderly Driver (노인 운전자의 공격적인 운전 상태 검출 기법)

  • Koh, Dong-Woo;Kang, Hang-Bong
    • KIPS Transactions on Software and Data Engineering / v.6 no.11 / pp.537-542 / 2017
  • Aggressive driving is a major cause of car accidents. Previous studies have mainly analyzed the aggressive driving tendencies of young drivers, and they relied only on pure clustering or classification techniques from machine learning. However, since elderly people have different driving habits due to their fragile physical conditions, a new method, such as one that enhances the characteristics of driving data, is needed to properly analyze the aggressive driving of elderly drivers. In this study, acceleration data collected from a smartphone in a moving vehicle are analyzed by a newly proposed ECA (Enhanced Clustering method for Acceleration data) technique, coupled with conventional clustering techniques (K-means clustering and the expectation-maximization (EM) algorithm). ECA selects high-intensity data from the cluster groups detected through K-means and EM across all subjects' data and models the characteristic data through a scaled value. Using this method, unlike the pure clustering method, the aggressive driving data of all young and elderly participants were collected. We further found that K-means clustering has higher detection efficiency than the EM method. Also, the K-means results show that young drivers have a driving strength 1.29 times higher than that of elderly drivers. In conclusion, the proposed method can detect aggressive driving maneuvers from the data of elderly drivers, who have low operating intensity, and can be used to construct a customized safe driving system for elderly drivers. In the future, it will be possible to detect abnormal driving conditions and to use the collected data for early warnings to drivers.
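
The clustering stage that ECA builds on can be sketched with a two-cluster K-means over acceleration magnitudes, after which the higher-centered cluster is taken as the high-intensity group. The scaling step of ECA is not described in the abstract and is omitted. The sample values and the min/max initialization are assumptions.

```python
def kmeans_1d(values, iterations=20):
    """Two-cluster K-means on scalar values, initialized at the min and max."""
    centers = [min(values), max(values)]
    clusters = [[], []]
    for _ in range(iterations):
        clusters = [[], []]
        for v in values:
            # Assign each value to the nearer center.
            nearest = 0 if abs(v - centers[0]) <= abs(v - centers[1]) else 1
            clusters[nearest].append(v)
        # Recompute each center as its cluster mean (keep it if the cluster is empty).
        centers = [sum(c) / len(c) if c else centers[i]
                   for i, c in enumerate(clusters)]
    return centers, clusters

# Made-up acceleration magnitudes: mostly calm driving with a few hard maneuvers.
accelerations = [0.2, 0.3, 0.25, 0.4, 2.1, 2.4, 0.35, 2.2]
centers, clusters = kmeans_1d(accelerations)
high_intensity = clusters[1]  # the cluster that started from the maximum value
print(centers, high_intensity)
```

A per-driver scaling of the selected values, as the abstract suggests, would then let low-intensity elderly driving data be compared on the same footing as data from younger drivers.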

Load Fidelity Improvement of Piecewise Integrated Composite Beam by Construction Training Data of k-NN Classification Model (k-NN 분류 모델의 학습 데이터 구성에 따른 PIC 보의 하중 충실도 향상에 관한 연구)

  • Ham, Seok Woo;Cheon, Seong S.
    • Composites Research / v.33 no.3 / pp.108-114 / 2020
  • A Piecewise Integrated Composite (PIC) beam is composed of different stacking sequences, each chosen for the loading type at its location. The aim of the current study is to assign stacking sequences that are robust against external loading to every corresponding part of the PIC beam, based on the value of stress triaxiality at generated reference points, using k-NN (k-Nearest Neighbor) classification, one of the representative machine learning techniques, in order to obtain superior bending characteristics. The stress triaxiality at reference points is obtained by three-point bending analysis of an Al beam, with training data categorizing the type of external loading, i.e., tension, compression, or shear. Loading types of each plane of the beam were classified by an independent plane scheme as well as a total beam scheme. Also, loading fidelities were calibrated for each case by varying the hyperparameters. The most effective stacking sequences were mapped onto the PIC beam based on the k-NN classification model with the highest loading fidelity. FE analysis results show that the PIC beam has superior external loading resistance and energy absorption compared to a conventional beam.
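
The classification step can be sketched as follows. Stress triaxiality is computed as mean stress divided by von Mises equivalent stress, which gives about 1/3 for uniaxial tension, 0 for pure shear, and about -1/3 for uniaxial compression. The training values, the choice `k=3`, and the function names are hypothetical, and the paper's actual training data come from finite element analysis.

```python
import math
from collections import Counter

def triaxiality(s1, s2, s3):
    """Stress triaxiality from principal stresses: mean stress / von Mises stress."""
    mean = (s1 + s2 + s3) / 3.0
    von_mises = math.sqrt(0.5 * ((s1 - s2) ** 2 + (s2 - s3) ** 2 + (s3 - s1) ** 2))
    return mean / von_mises

def knn_classify(value, training, k=3):
    """Majority label among the k training points closest to `value`."""
    nearest = sorted(training, key=lambda item: abs(item[0] - value))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

# Hypothetical labeled reference points (triaxiality, loading type).
training = [
    (0.33, "tension"), (0.30, "tension"), (0.36, "tension"),
    (0.00, "shear"), (0.03, "shear"), (-0.04, "shear"),
    (-0.33, "compression"), (-0.30, "compression"), (-0.35, "compression"),
]

tension = knn_classify(triaxiality(100.0, 0.0, 0.0), training)
shear = knn_classify(triaxiality(50.0, 0.0, -50.0), training)
compression = knn_classify(triaxiality(0.0, 0.0, -100.0), training)
print(tension, shear, compression)
```

Each region of the beam would then receive the stacking sequence suited to its predicted loading type. The value of `k` is one of the hyperparameters that the study varies when calibrating loading fidelity.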