• Title/Summary/Keyword: nearest neighbor (최근접 이웃)


Personalized Size Recommender System for Online Apparel Shopping: A Collaborative Filtering Approach

  • Dongwon Lee
    • Journal of the Korea Society of Computer and Information / v.28 no.8 / pp.39-48 / 2023
  • This study was conducted to provide a solution to the problem of sizing errors occurring in online purchases due to discrepancies and non-standardization in clothing sizes. This paper discusses an implementation approach for a machine learning-based recommender system capable of providing personalized sizes to online consumers. We trained multiple validated collaborative filtering algorithms including Non-Negative Matrix Factorization (NMF), Singular Value Decomposition (SVD), k-Nearest Neighbors (KNN), and Co-Clustering using purchasing data derived from online commerce and compared their performance. As a result of the study, we were able to confirm that the NMF algorithm showed superior performance compared to other algorithms. Despite the characteristic of purchase data that includes multiple buyers using the same account, the proposed model demonstrated sufficient accuracy. The findings of this study are expected to contribute to reducing the return rate due to sizing errors and improving the customer experience on e-commerce platforms.

Adaptive Operation of Boryeong Dam Water Supply Adjustment Standards against Multi-year Droughts (다년 가뭄 대비 보령댐 용수공급 조정기준의 적응형 운영방안)

  • Kim, Gi Joo;Lee, Jae Hwang;Lee, Joohyung;Kim, Young-Oh
    • Proceedings of the Korea Water Resources Association Conference / 2022.05a / pp.373-373 / 2022
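  • Worldwide, climate change is increasing the frequency and severity of multi-year droughts lasting three years or longer, and the resulting damage is growing as well. Reflecting this, the study proposes a way to improve the current water supply adjustment standards, which serve as the primary drought response measure at all multipurpose dams and water supply dams nationwide. First, an ARFIMA (Autoregressive Fractionally Integrated Moving Average) model, a time series model capable of representing long-term memory, was used to generate annual inflows with long-term memory of varying strength. The annual inflows were then disaggregated into 10-day inflows using an allocation tool based on the k-nearest neighbor method, and these served as input variables for generating alternative water supply adjustment standards. The new standards were applied adaptively to reservoir operation together with the current standards, using information updated at every time step. The adaptive reservoir operation derived from inflows reflecting multi-year droughts was then analyzed under observed inflows in terms of failure frequency and magnitude. The frequency of severe failures (water shortage ratio of 30% or more) improved from 6.14% under the current standards to 2.99% under adaptive operation, although reliability over the entire period was better under current operation (41.19%) than under adaptive operation (26.42%). Because these results are consistent with the fundamental purpose of water supply adjustment standards, which is to reduce the frequency and magnitude of severe failures, the study confirms that the proposed adaptive operation for multi-year droughts can be used as a reservoir operation policy under prolonged drought conditions.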
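
The k-nearest neighbor disaggregation step described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' allocation tool: the function name `knn_disaggregate`, the 1/rank weighting, and the deterministic weighted average of neighbor patterns are all assumptions (k-NN resampling schemes often draw one neighbor at random instead). Historical annual totals are assumed to be positive.

```python
def knn_disaggregate(annual_total, history, k=3):
    """Split one annual inflow into sub-annual inflows (hypothetical sketch).

    history: list of past years, each a list of sub-annual inflows
             (for example 36 ten-day values) with a positive sum.
    """
    if not history:
        raise ValueError("history must contain at least one year")
    # Rank historical years by how close their annual total is to the target.
    ranked = sorted(history, key=lambda year: abs(sum(year) - annual_total))
    neighbors = ranked[:k]
    # Assumed kernel: weight 1/rank, normalized so the weights sum to 1.
    raw = [1.0 / (rank + 1) for rank in range(len(neighbors))]
    weights = [w / sum(raw) for w in raw]
    # Weighted average of each neighbor's within-year shares.
    n_periods = len(neighbors[0])
    shares = [0.0] * n_periods
    for w, year in zip(weights, neighbors):
        total = sum(year)
        for t, inflow in enumerate(year):
            shares[t] += w * inflow / total
    # The shares sum to 1, so the output preserves the annual total.
    return [annual_total * s for s in shares]
```

Because every neighbor's shares sum to one and the weights are normalized, the disaggregated values always add back up to the annual inflow that was passed in.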


Evaluation of Classification Models of Mild Left Ventricular Diastolic Dysfunction by Tei Index (Tei Index를 이용한 경도의 좌심실 이완 기능 장애 분류 모델 평가)

  • Su-Min Kim;Soo-Young Ye
    • Journal of the Korean Society of Radiology / v.17 no.5 / pp.761-766 / 2023
  • In this paper, the Tei index (TI) was measured to classify the presence or absence of mild left ventricular diastolic dysfunction. Of the 306 samples in total, 206 were used as training data and 100 as test data, and SVM and KNN were used as the classification models. The results confirmed that SVM showed relatively higher accuracy than KNN and was more useful in diagnosing the presence of left ventricular diastolic dysfunction. In future research, classification performance is expected to improve further by adding indicators of cardiac function other than TI and by securing more data. Furthermore, the results are expected to serve as basic data for predicting and classifying other diseases and for addressing the shortage of medical personnel relative to the increasing number of examinations.

A Study on Research Trends of Library Science and Information Science Through Analyzing Subject Headings of Doctoral Dissertations Recently Published in the U.S. (학위논문 분석을 통한 미국 도서관학 및 정보과학 최근 연구 동향에 관한 연구)

  • Kim, Hyunjung
    • Journal of the Korean Society for Information Management / v.35 no.3 / pp.11-39 / 2018
  • The study examines the research trends of doctoral dissertations in Library Science and Information Science published in the U.S. over the last 5 years. Data collected from PQDT Global include 1,016 doctoral dissertations containing "Library Science" or "Information Science" as subject headings, and keywords extracted from those dissertations were used for a network analysis, which helps identify the intellectual structure of the dissertations. The analysis, using 103 subject heading keywords, produced various centrality measures, including triangle betweenness centrality and nearest neighbor centrality, as well as 26 clusters of associated subject headings. The most frequently studied subjects include computer-related, education-related, and communication-related subjects. A cluster with information science as the most central subject contains most of the computer-related keywords, while a cluster with library science as the most central subject contains many of the education-related keywords. Other related subjects include various user groups for user studies, and subjects related to information systems such as management, economics, geography, and biomedical engineering.

Predicting the Response of Segmented Customers for the Promotion Using Data Mining (데이터마이닝을 이용한 세분화된 고객집단의 프로모션 고객반응 예측)

  • Hong, Tae-Ho;Kim, Eun-Mi
    • Information Systems Review / v.12 no.2 / pp.75-88 / 2010
  • This paper proposes a method that segments customers using a SOM (Self-Organizing Map) and predicts the response to a marketing promotion for each customer segment. Whereas most studies predict the response of all customers at once, the proposed method predicts responses separately for each segment. We applied logistic regression, neural networks, and support vector machines to predict customer response, a binary classification problem, and used an integrated approach to improve the performance of the prediction model. Sample data covering 600 customers, with 45 variables on demographics, transactions, and promotion activities, were applied to the proposed method, and the results are presented as classification matrices together with a comparative analysis of the data mining techniques. Applying the proposed method to the sample data yielded several meaningful promotion strategies for the segmented customers.

Cancer Diagnosis System using Genetic Algorithm and Multi-boosting Classifier (Genetic Algorithm과 다중부스팅 Classifier를 이용한 암진단 시스템)

  • Ohn, Syng-Yup;Chi, Seung-Do
    • Journal of the Korea Society for Simulation / v.20 no.2 / pp.77-85 / 2011
  • It is believed that anomalies or diseases of human organs can be identified through the analysis of patterns. This paper proposes a new classification technique for identifying cancer using the proteome patterns obtained from two-dimensional polyacrylamide gel electrophoresis (2-D PAGE). In the new method, three classifiers, namely the support vector machine (SVM), multi-layer perceptron (MLP), and k-nearest neighbor (k-NN), are extended by multi-boosting into an array of subclassifiers, and the results of the subclassifiers are merged by an ensemble method. A genetic algorithm was applied to obtain the optimal feature set for each subclassifier. We applied the method to an empirical data set from cancer research, and it showed better accuracy and more stable performance than a single classifier.

A study on the imputation solution for missing speed data on UTIS by using adaptive k-NN algorithm (적응형 k-NN 기법을 이용한 UTIS 속도정보 결측값 보정처리에 관한 연구)

  • Kim, Eun-Jeong;Bae, Gwang-Soo;Ahn, Gye-Hyeong;Ki, Yong-Kul;Ahn, Yong-Ju
    • The Journal of The Korea Institute of Intelligent Transport Systems / v.13 no.3 / pp.66-77 / 2014
  • UTIS (Urban Traffic Information System) directly collects link travel times in urban areas using probe vehicles, so it can estimate link travel speeds more accurately than other traffic detection systems. However, UTIS data contain missing values caused by a lack of probe vehicles and RSEs on the road network, system failures, and other factors. In this study, we propose a new model based on the k-NN algorithm for imputing missing data in order to provide more accurate travel time information. The new imputation model is an adaptive k-NN that flexibly adjusts the number of nearest neighbors (NN) depending on the distribution of candidate objects. The evaluation results indicate that the new model successfully imputed missing speed data and significantly reduced the imputation error compared with other models (ARIMA, etc.). We plan to use the new imputation model to improve the traffic information service by applying it at the UTIS Central Traffic Information Center.
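
One way to read "adjusting the number of nearest neighbors depending on the distribution of candidate objects" is sketched below. The abstract does not give the actual rule, so everything here is an assumption for illustration: the function name `adaptive_knn_impute`, the idea of setting k to the number of candidates within a fixed `radius` (clamped to `k_min` and `k_max`), and the inverse-distance weighting.

```python
import math


def adaptive_knn_impute(query, candidates, radius=5.0, k_min=2, k_max=10):
    """Impute a missing speed from similar historical records (hypothetical sketch).

    query:      feature vector describing the current state, e.g. recent speeds.
    candidates: list of (features, speed) pairs from historical data.
    """
    if not candidates:
        raise ValueError("candidates must not be empty")
    # Distance from the query to every candidate, closest first.
    dists = sorted((math.dist(query, feats), speed) for feats, speed in candidates)
    # Assumed adaptive rule: k follows how many candidates lie near the query,
    # bounded below by k_min and above by k_max.
    within = sum(1 for d, _ in dists if d <= radius)
    k = min(len(dists), max(k_min, min(k_max, within)))
    nearest = dists[:k]
    # Inverse-distance weights; the small constant avoids division by zero.
    weights = [1.0 / (d + 1e-9) for d, _ in nearest]
    return sum(w * s for w, (_, s) in zip(weights, nearest)) / sum(weights)
```

With a dense cluster of candidates the estimate averages over more records, while in sparse regions it falls back to the `k_min` closest ones instead of pulling in distant, dissimilar records.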

Exploring the Performance of Synthetic Minority Over-sampling Technique (SMOTE) to Predict Good Borrowers in P2P Lending (P2P 대부 우수 대출자 예측을 위한 합성 소수집단 오버샘플링 기법 성과에 관한 탐색적 연구)

  • Costello, Francis Joseph;Lee, Kun Chang
    • Journal of Digital Convergence / v.17 no.9 / pp.71-78 / 2019
  • This study aims to identify good borrowers in the context of P2P lending. P2P lending is a growing platform that allows individuals to lend money to and borrow money from each other. Borrower credit risk is inherent in any loan and needs to be considered before lending. In P2P lending specifically, traditional models fall short, so this study aimed to address that shortcoming and to explore the class imbalance problem seen in credit risk data sets. The study implemented an over-sampling technique known as the Synthetic Minority Over-sampling Technique (SMOTE). To test the approach, we implemented five benchmark classifiers: support vector machines, logistic regression, k-nearest neighbor, random forest, and a deep neural network. The data sample was retrieved from the publicly available LendingClub dataset. The proposed SMOTE approach yielded significantly improved results in comparison with the benchmark classifiers. These results should help actors engaged in P2P lending make better-informed decisions when selecting potential borrowers, mitigating the higher risks present in P2P lending.
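
The core idea of SMOTE, generating synthetic minority samples by interpolating between a minority sample and one of its nearest minority neighbors, can be sketched as follows. This is a minimal pure-Python illustration rather than the implementation used in the study; the function name `smote` and its parameters are hypothetical, and features are assumed to be numeric tuples.

```python
import math
import random


def smote(minority, n_synthetic, k=5, seed=0):
    """Create synthetic minority-class points by neighbor interpolation (sketch)."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_synthetic):
        base = rng.choice(minority)
        # k nearest minority neighbors of the chosen sample, excluding itself.
        others = [p for p in minority if p is not base]
        neighbors = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbor = rng.choice(neighbors)
        # Pick a random point on the segment between base and neighbor.
        gap = rng.random()
        synthetic.append(tuple(b + gap * (n - b) for b, n in zip(base, neighbor)))
    return synthetic
```

The synthetic points are added only to the training set; each one lies on a line segment between two real minority samples, so the minority region is filled in rather than duplicated.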

Key Word Network Analysis to Identify the Trends of Research in Social Welfare for Disabled People (장애인복지연구의 동향에 관한 주제어 연결망 분석)

  • Kam, Jeong Ki;Oh, Bong Hee
    • 재활복지 / v.21 no.1 / pp.1-26 / 2017
  • The main purpose of this paper is to identify research trends in the field of social welfare for disabled people. It introduces an alternative analytic approach, key word network analysis, that may overcome the limitations of existing studies, which have relied mainly on descriptive analysis. For this purpose, the authors constructed a database of the key words and research methods of the source studies. The sources are doctoral theses and papers in the Korean Journal of Social Welfare and the Journal of Rehabilitation Research published during the 20 years from 1996 to 2015. A total of 1,034 theses and papers were analyzed. The results are presented in three parts: first, the trends in the research methods of the selected studies; second, the lists of disability types, population groups, and issues of interest; and third, the intellectual structure of the research. Based on the findings, suggestions are offered on desirable directions for future research with respect to perspectives, methods, targets, and subjects.

Using collaborative filtering techniques Mobile ad recommendation system (협업필터링 기법을 이용한 모바일 광고 추천 시스템)

  • Kim, Eun-suk;Yoon, Sung-dae
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2012.10a / pp.3-6 / 2012
  • Owing to the recent rapid growth of the mobile market, people increasingly use mobile content as a means of obtaining desired information quickly, free of the various constraints of a computer. The wide range of recommended content, however, makes selection time-consuming. To resolve this issue, a system that predicts the content a user wants and makes accurate recommendations is necessary. In this paper, in order to provide content in line with user demands, we propose a method that uses collaborative filtering to increase the number of recommendations that users select. In the first step, categories are organized into super-classes, the similarity between the target customer and other users is computed, and nearest neighbors are formed to predict preferences for the super-classes; the super-class with the highest predicted value is recommended to the target customer. In the second step, preference predictions for sub-classes are computed, and the sub-class with the highest value is recommended to the target customer. In the experiment, mobile content was recommended first through super-class-based collaborative filtering and then through sub-class-based collaborative filtering, and the sub-class-based method was verified to yield a higher number of selections.
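
The two-step procedure in this abstract can be sketched as user-based collaborative filtering applied twice. The abstract does not specify the similarity measure or the prediction formula, so cosine similarity, the similarity-weighted average, and the restriction of step two to sub-classes of the recommended super-class are assumptions; the names `predict_best`, `two_step_recommend`, and `hierarchy` are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two sparse preference dicts."""
    dot = sum(u[c] * v[c] for c in set(u) & set(v))
    nu = math.sqrt(sum(x * x for x in u.values()))
    nv = math.sqrt(sum(x * x for x in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0


def predict_best(target, prefs, candidates, k=2):
    """Return the candidate class with the highest predicted preference."""
    # Form the neighborhood: the k users most similar to the target.
    sims = sorted(
        ((cosine(prefs[target], p), u) for u, p in prefs.items() if u != target),
        reverse=True,
    )[:k]
    best, best_score = None, float("-inf")
    for cat in candidates:
        rated = [(s, prefs[u][cat]) for s, u in sims if cat in prefs[u]]
        den = sum(s for s, _ in rated)
        # Similarity-weighted average of the neighbors' preferences.
        score = sum(s * r for s, r in rated) / den if den else 0.0
        if score > best_score:
            best, best_score = cat, score
    return best


def two_step_recommend(target, super_prefs, sub_prefs, hierarchy, k=2):
    """Step 1 picks a super-class; step 2 picks a sub-class inside it (assumed)."""
    unseen = [c for c in hierarchy if c not in super_prefs[target]]
    best_super = predict_best(target, super_prefs, unseen, k)
    if best_super is None:
        return None, None
    return best_super, predict_best(target, sub_prefs, hierarchy[best_super], k)
```

Here `hierarchy` maps each super-class to its list of sub-classes, and the two preference tables map each user to scores per super-class and per sub-class respectively.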
