• Title/Summary/Keyword: learning-based filtering technique (학습 기반 필터링 기법)

Search Result 66, Processing Time 0.027 seconds

Scalable Collaborative Filtering Technique based on Adaptive Clustering (적응형 군집화 기반 확장 용이한 협업 필터링 기법)

  • Lee, O-Joun;Hong, Min-Sung;Lee, Won-Jin;Lee, Jae-Dong
    • Journal of Intelligence and Information Systems / v.20 no.2 / pp.73-92 / 2014
  • This paper proposes an Adaptive Clustering-based Collaborative Filtering Technique to address the fundamental problems of collaborative filtering: cold start, scalability, and data sparsity. Previous collaborative filtering techniques recommend items based on a user's predicted preference for a particular item, computed from a similar-item subset and a similar-user subset that are built from users' preferences for items. For this reason, when the density of the user preference matrix is low, the similar-item and similar-user subsets become harder to build and the reliability of the recommendation system drops rapidly. In addition, as the scale of the service grows, the time needed to build these subsets increases geometrically, and the response time of the recommendation system increases with it. To solve these problems, this paper suggests a collaborative filtering technique that actively adapts its model to conditions and adopts concepts from context-based filtering. The technique consists of four main methods. First, items and users are clustered according to their feature vectors, and an inter-cluster preference between each item cluster and user cluster is then estimated. This reduces the run-time needed to build a similar-item or similar-user subset, makes the recommendation system more reliable than one that builds those subsets from user preference information alone, and partially solves the cold-start problem. Second, recommendations are made using the previously built item and user clusters and the inter-cluster preference between each item cluster and user cluster.
In this phase, a list of items is built for each user by examining the item clusters in order of the magnitude of the inter-cluster preference of the user cluster to which the user belongs, and then selecting and ranking items according to the predicted or recorded user preference information. With this method, the model-creation phase bears most of the recommendation system's load, which minimizes the load at run-time. The scalability problem is therefore mitigated, and a large-scale recommendation system can be run with highly reliable collaborative filtering. Third, missing user preference information is predicted using the item and user clusters, which mitigates the problem caused by the low density of the user preference matrix. Existing studies used either item-based or user-based prediction. This paper improves on Hao Ji's idea of using both. The reliability of the recommendation service can be improved by combining the predicted values of both techniques according to the condition of the recommendation model. Because user preference is predicted from the item or user clusters, the time required for prediction is reduced, and missing user preferences can be predicted at run-time. Fourth, the item and user feature vectors are made to learn from incoming user feedback. This phase applies normalized user feedback to the item and user feature vectors. It mitigates the problems that come with adopting concepts of context-based filtering, such as item and user feature vectors derived from the user profile and item properties. Those problems stem from the limits of quantifying the qualitative features of items and users.
Therefore, the elements of the user and item feature vectors are made to match one to one, and when user feedback on a particular item is obtained, it is applied to each feature vector using the corresponding vector on the other side. The method was verified by comparing its performance with existing hybrid filtering techniques on two measures: mean absolute error (MAE) and response time. The MAE results confirmed that the technique improves the reliability of the recommendation system, and the response-time results showed that it is suitable for large-scale recommendation systems. This paper suggested an Adaptive Clustering-based Collaborative Filtering Technique with high reliability and low time complexity, but it has some limitations. Because the technique focused on reducing time complexity, an improvement in reliability was not expected. Future work will improve the technique with rule-based filtering.
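
The cluster-level prediction described in this abstract can be sketched in a few lines. This is a minimal illustration, not the authors' algorithm: it assumes the user and item cluster labels are already computed, takes the inter-cluster preference to be the plain mean of the observed ratings between two clusters, and uses hypothetical names (`inter_cluster_preference`, `predict`) and toy data throughout.

```python
from collections import defaultdict


def inter_cluster_preference(ratings, user_cluster, item_cluster):
    """Average the observed ratings for each (user cluster, item cluster) pair."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for (user, item), rating in ratings.items():
        key = (user_cluster[user], item_cluster[item])
        sums[key] += rating
        counts[key] += 1
    return {key: sums[key] / counts[key] for key in sums}


def predict(user, item, ratings, user_cluster, item_cluster, pref, default=3.0):
    """Return the recorded rating if present, else the inter-cluster mean.

    `default` is an assumed fallback for cluster pairs with no observations.
    """
    if (user, item) in ratings:
        return ratings[(user, item)]
    return pref.get((user_cluster[user], item_cluster[item]), default)


def mean_absolute_error(predicted, actual):
    """MAE, one of the two measures the abstract uses for verification."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)


# Toy data (hypothetical): cluster labels are assumed to be precomputed.
ratings = {("u1", "i1"): 5, ("u2", "i1"): 4, ("u1", "i2"): 1, ("u3", "i2"): 2}
user_cluster = {"u1": 0, "u2": 0, "u3": 1}
item_cluster = {"i1": 0, "i2": 1, "i3": 0}

pref = inter_cluster_preference(ratings, user_cluster, item_cluster)
# u2 never rated i3, so the estimate falls back to the cluster-pair mean.
estimate = predict("u2", "i3", ratings, user_cluster, item_cluster, pref)
```

Because the cluster-pair means are computed once when the model is built, a run-time lookup costs a dictionary access, which matches the abstract's point that the model-creation phase carries most of the load.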

Improving Orbit Determination Precision of Satellite Optical Observation Data Using Deep Learning (심층 학습을 이용한 인공위성 광학 관측 데이터의 궤도결정 정밀도 향상)

  • Hyeon-man Yun;Chan-Ho Kim;In-Soo Choi;Soung-Sub Lee
    • Journal of Advanced Navigation Technology / v.28 no.3 / pp.262-271 / 2024
  • This paper applies deep learning to angle information, the optical observation data generated when an observatory observes a satellite, in order to learn and predict the range (the distance from the observatory) and thereby increase the precision of satellite orbit determination. To this end, we generated observation data with GMAT, reduced the deep learning training error by preprocessing the generated data, and trained the model in MATLAB. Using the range predicted by the trained model, orbit determination was performed in GMAT with an extended Kalman filter, one of the filtering techniques used for orbit determination. The reliability of the model was verified by comparing orbit determination from angle information alone with orbit determination that also uses the range predicted by the model.

Image Filtering Method for an Effective Inverse Tone-mapping (효과적인 역 톤 매핑을 위한 영상 필터링 기법)

  • Kang, Rahoon;Park, Bumjun;Jeong, Jechang
    • Proceedings of the Korean Society of Broadcast Engineers Conference / 2018.11a / pp.55-58 / 2018
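  • This paper proposes an algorithm that uses a guided image filter to improve the results of inverse tone-mapping (iTMO) based on a convolutional neural network (CNN). Inverse tone-mapping techniques that allow existing low dynamic range (LDR) images to be shown on high dynamic range (HDR) displays have long been proposed. Recently, many algorithms have been studied that use a CNN to convert a single LDR image into an HDR image with a wide dynamic range. Among existing algorithms, one that uses a trained CNN to restore the pixel information lost in saturated regions is effective, but it cannot remove noise outside the saturated regions and cannot restore detail within them. The proposed algorithm applies a weighted guided image filter to the input image to remove noise in non-saturated regions and restore detail in saturated regions, then feeds the result to the CNN, which improves the quality of the output image. In experiments, the proposed algorithm achieved higher quantitative image-quality scores than the existing algorithm and was confirmed to restore details more effectively.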
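
To make the filtering step more concrete, the sketch below implements a plain guided filter on a 1-D signal. It is a simplified stand-in under stated assumptions: the abstract describes a weighted guided image filter on 2-D images, whereas this version is unweighted and one-dimensional, and the names `box_mean` and `guided_filter_1d` and the values of `radius` and `eps` are illustrative choices, not taken from the paper.

```python
def box_mean(values, radius):
    """Mean over a sliding window, truncated at the signal borders."""
    n = len(values)
    means = []
    for i in range(n):
        window = values[max(0, i - radius):min(n, i + radius + 1)]
        means.append(sum(window) / len(window))
    return means


def guided_filter_1d(guide, src, radius=1, eps=0.01):
    """Fit src ~ a * guide + b in each window, then average a and b."""
    mean_g = box_mean(guide, radius)
    mean_s = box_mean(src, radius)
    mean_gs = box_mean([g * s for g, s in zip(guide, src)], radius)
    mean_gg = box_mean([g * g for g in guide], radius)

    a, b = [], []
    for mg, ms, mgs, mgg in zip(mean_g, mean_s, mean_gs, mean_gg):
        variance = mgg - mg * mg          # local variance of the guide
        covariance = mgs - mg * ms        # local covariance of guide and source
        slope = covariance / (variance + eps)  # eps controls the smoothing strength
        a.append(slope)
        b.append(ms - slope * mg)

    mean_a = box_mean(a, radius)
    mean_b = box_mean(b, radius)
    return [ma * g + mb for ma, mb, g in zip(mean_a, mean_b, guide)]


# Hypothetical noisy, nearly flat signal, filtered with itself as the guide.
noisy = [1.0, 1.2, 0.8, 1.1, 0.9, 1.0]
smoothed = guided_filter_1d(noisy, noisy, radius=1, eps=1.0)
```

A large `eps` relative to the local variance pushes the output toward the local mean, which suppresses noise in flat regions. A small `eps` keeps the slope near 1 where the guide has strong structure, which is why this family of filters preserves edges.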

Token-Based Classification and Dataset Construction for Detecting Modified Profanity (변형된 비속어 탐지를 위한 토큰 기반의 분류 및 데이터셋)

  • Sungmin Ko;Youhyun Shin
    • The Transactions of the Korea Information Processing Society / v.13 no.4 / pp.181-188 / 2024
  • Traditional profanity detection methods have limitations in identifying intentionally altered profanities. This paper introduces a new method based on Named Entity Recognition, a subfield of Natural Language Processing. We developed a profanity detection technique that uses sequence labeling, constructed a dataset for it by labeling some of the profanities in Korean malicious comments, and conducted experiments. Additionally, to enhance the model's performance, we augmented the dataset by labeling parts of a Korean hate speech dataset with ChatGPT, a large language model, and trained on the result. During this process, we confirmed that human filtering of the LLM-generated dataset could, on its own, improve performance. This suggests that human oversight is still necessary in the dataset augmentation process.
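
Sequence labeling, as used in this abstract, assigns one tag per token, and the flagged spans are read back from the tags. The sketch below shows only that decoding step. It assumes a BIO tagging scheme with a hypothetical `PROF` label, since the abstract does not state the tag set, and the tokens are placeholders rather than data from the paper.

```python
def extract_spans(tokens, tags, label="PROF"):
    """Collect token spans tagged B-<label> followed by any I-<label> tags."""
    spans = []
    current = []
    for token, tag in zip(tokens, tags):
        if tag == f"B-{label}":
            # A new span starts; close any span that is still open.
            if current:
                spans.append(" ".join(current))
            current = [token]
        elif tag == f"I-{label}" and current:
            # Continue the open span.
            current.append(token)
        else:
            # "O" (or a stray I- tag with no open span) ends the current span.
            if current:
                spans.append(" ".join(current))
            current = []
    if current:
        spans.append(" ".join(current))
    return spans


# Placeholder tokens standing in for an obfuscated expression.
tokens = ["that", "was", "a", "b@d", "w0rd", "today"]
tags = ["O", "O", "O", "B-PROF", "I-PROF", "O"]
flagged = extract_spans(tokens, tags)
```

Labeling at the token level lets a model flag an altered spelling it has never seen as a whole word, which a fixed word list cannot do.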

Conditional Generative Adversarial Network based Collaborative Filtering Recommendation System (Conditional Generative Adversarial Network(CGAN) 기반 협업 필터링 추천 시스템)

  • Kang, Soyi;Shin, Kyung-shik
    • Journal of Intelligence and Information Systems / v.27 no.3 / pp.157-173 / 2021
  • With the development of information technology, the amount of available information increases daily. However, having access to so much information makes it difficult for users to find the information they seek. Users want a visualized system that reduces information retrieval and learning time, saving them from personally reading and judging all available information. As a result, recommendation systems are an increasingly important technology that is essential to business. Collaborative filtering is used in various fields with excellent performance because recommendations are made based on similar user interests and preferences. However, limitations do exist. Sparsity, which occurs when user-item preference information is insufficient, is the main limitation of collaborative filtering. The evaluation values in the user-item matrix may be distorted depending on the popularity of the product, or there may be new users who have not yet rated anything. The lack of historical data to identify consumer preferences is referred to as data sparsity, and various methods have been studied to address these problems. However, most attempts to solve the sparsity problem are not optimal because they can only be applied when additional data such as users' personal information, social networks, or characteristics of items are included. Another problem is that real-world score data are mostly biased toward high scores, resulting in severe imbalances. One cause of this imbalanced distribution is purchasing bias: only users who rate a product highly purchase it, so those with low ratings are less likely to purchase and thus do not leave negative reviews. Due to these characteristics, unlike most users' actual preferences, reviews by users who purchase products are more likely to be positive.
Therefore, because of these biased characteristics, actual rating data is over-learned in the high-incidence classes, distorting the market. Applying collaborative filtering to such imbalanced data leads to poor recommendation performance due to excessive learning of the biased classes. Traditional oversampling techniques for this problem are likely to cause overfitting because they repeat the same data, which acts as noise during learning and reduces recommendation performance. In addition, most existing pre-processing methods for data imbalance are designed and used for binary classes. Binary class imbalance techniques are difficult to apply to multi-class problems because they cannot model multi-class cases such as objects at cross-class boundaries or objects overlapping multiple classes. To solve this problem, research has been conducted on converting multi-class problems into binary class problems. However, simplifying multi-class problems can cause classification errors when the results of classifiers learned from different sub-problems are combined, resulting in the loss of important information about relationships beyond the selected items. Therefore, more effective methods are needed to address multi-class imbalance problems. We propose a collaborative filtering model that uses CGAN to generate realistic virtual data to populate the empty user-item matrix. The conditional vector y identifies the distributions of minority classes, and data reflecting their characteristics is generated. Collaborative filtering then maximizes the performance of the recommendation system via hyperparameter tuning. This process should improve the accuracy of the model by addressing the sparsity problem of collaborative filtering while mitigating the data imbalance that arises in real data. Our model shows better recommendation performance than existing oversampling techniques and than the original real-world data with its sparsity.
SMOTE, Borderline SMOTE, SVM-SMOTE, ADASYN, and GAN were used as comparison models, and our model achieved the highest prediction accuracy on the RMSE and MAE evaluation measures. This study suggests that deep learning-based oversampling can further improve the performance of recommendation systems that use actual data and can be used to build business recommendation systems.
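
The rating imbalance this abstract describes, and the RMSE measure it reports, can both be computed directly. The sketch below is illustrative only: the rating list is made-up toy data, and `class_distribution` and `imbalance_ratio` are hypothetical helper names. It does not implement CGAN or any of the oversampling baselines.

```python
import math
from collections import Counter


def class_distribution(ratings):
    """Share of each rating class, keyed by score."""
    counts = Counter(ratings)
    return {score: counts[score] / len(ratings) for score in sorted(counts)}


def imbalance_ratio(ratings):
    """Size of the largest rating class divided by the smallest."""
    counts = Counter(ratings)
    return max(counts.values()) / min(counts.values())


def rmse(predicted, actual):
    """Root mean squared error between predicted and actual ratings."""
    squared = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared) / len(squared))


# Toy ratings skewed toward high scores, as the abstract says real data tends to be.
observed = [5, 5, 5, 5, 4, 4, 4, 3, 2, 1]
shares = class_distribution(observed)
ratio = imbalance_ratio(observed)
```

An imbalance ratio well above 1 signals that the minority rating classes are the ones an oversampler has to synthesize, which is the role the abstract assigns to the conditional vector.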

Spam Classification by Analyzing Characteristics of a Single Web Document (단일 문서의 특징 분석을 이용한 스팸 분류 방법)

  • Sim, Sangkwon;Lee, Soowon
    • Proceedings of the Korea Information Processing Society Conference / 2014.11a / pp.845-848 / 2014
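  • Blogs are an important means of expressing personal information and opinions and of forming communities on the Internet, but spam blogs are also created and abused for purposes such as attracting advertising, boosting page rank, and generating junk data. To address this problem, this study proposes a spam detection technique that uses features appearing in web documents. First, a feature vector is generated for each document from features that can be extracted from a single web document, such as the length of the blog body, the tag ratio, the number of tags, the number of images, and the number of links; a model is then trained by machine learning to identify spam blogs. To evaluate performance, comparative experiments between the proposed method and existing spam classification research were conducted on blog post data. Compared with prior work that uses Bayesian filtering, the proposed method achieved better accuracy while also being efficient in feature extraction speed and memory use.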
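
Extracting such per-document features can be sketched with Python's standard `html.parser` module. The feature definitions here are assumptions made for illustration: the abstract does not say how the tag ratio is computed, so this version divides the tag count by the text length, and `FeatureParser` and `extract_features` are hypothetical names.

```python
from html.parser import HTMLParser


class FeatureParser(HTMLParser):
    """Counts tags, images, and links, and collects the visible text."""

    def __init__(self):
        super().__init__()
        self.tag_count = 0
        self.image_count = 0
        self.link_count = 0
        self.text_parts = []

    def handle_starttag(self, tag, attrs):
        self.tag_count += 1
        if tag == "img":
            self.image_count += 1
        elif tag == "a":
            self.link_count += 1

    def handle_data(self, data):
        self.text_parts.append(data)


def extract_features(html):
    """Return [text length, tag count, tag ratio, image count, link count]."""
    parser = FeatureParser()
    parser.feed(html)
    parser.close()
    text_length = len("".join(parser.text_parts).strip())
    # Assumed definition: tags per character of text (guarding against empty text).
    tag_ratio = parser.tag_count / max(text_length, 1)
    return [text_length, parser.tag_count, tag_ratio,
            parser.image_count, parser.link_count]


sample_html = '<p>Buy now</p><a href="http://example.com">click</a><img src="x.png">'
features = extract_features(sample_html)
```

The resulting vector needs only a single pass over one document and no vocabulary, which is consistent with the abstract's claim of fast feature extraction and low memory use compared with a word-based Bayesian filter.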

A Fashion Design Recommender Agent System using Collaborative Filtering and Sensibilities related to Textile Design Factors (텍스타일 기반의 협력적 필터링 기술과 디자인 요소에 따른 감성 분석을 이용한 패션 디자인 추천 에이전트 시스템)

  • 정경용;나영주;이정현
    • Journal of KIISE:Computing Practices and Letters / v.10 no.2 / pp.174-188 / 2004
  • In a living environment changed not only by the quality and price of products but also by material abundance, investigating consumers' sensibility and degree of preference is the most crucial factor in product sales strategy. From this perspective, products must be designed and merchandised to match each consumer's sensibility and needs as well as their functional aspects. In this paper, we propose the Fashion Design Recommender Agent System (FDRAS-pro) for textile design, which applies a collaborative filtering personalization technique as one method of material development centered on consumers' sensibility and preference. For textile-based collaborative filtering, Representative-Attribute Neighborhood is adopted to determine the number of neighbors used for preference estimation. Pearson's correlation coefficient is used to calculate similarity weights among users. We build a database of sensibility adjectives for developing textile designs by extracting representative sensibility adjectives from users' sensibility and preferences regarding textile designs. FDRAS-pro recommends textile designs to a customer who has a similar propensity regarding textiles. To investigate sensibility and emotion according to the effect of design factors, textile designs were analyzed in terms of 9 design factors: motif source, motif-background ratio, motif variation, motif interpretation, motif arrangement, motif articulation, hue contrast, value contrast, and chroma contrast. Finally, we plan to conduct empirical applications to verify the adequacy and validity of our system.
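
The similarity weight mentioned in this abstract, Pearson's correlation coefficient between two users, can be sketched as follows. This is one common variant that takes means over co-rated items only. The abstract does not specify which variant FDRAS-pro uses, and the function name and toy ratings are hypothetical.

```python
import math


def pearson_similarity(ratings_a, ratings_b):
    """Pearson correlation over the items both users have rated."""
    common = sorted(set(ratings_a) & set(ratings_b))
    if len(common) < 2:
        return 0.0  # not enough overlap to measure correlation
    mean_a = sum(ratings_a[i] for i in common) / len(common)
    mean_b = sum(ratings_b[i] for i in common) / len(common)
    numerator = sum(
        (ratings_a[i] - mean_a) * (ratings_b[i] - mean_b) for i in common
    )
    spread_a = math.sqrt(sum((ratings_a[i] - mean_a) ** 2 for i in common))
    spread_b = math.sqrt(sum((ratings_b[i] - mean_b) ** 2 for i in common))
    if spread_a == 0 or spread_b == 0:
        return 0.0  # a user with constant ratings has no defined correlation
    return numerator / (spread_a * spread_b)


# Toy ratings of textile designs t1..t4 (hypothetical).
alice = {"t1": 1, "t2": 2, "t3": 3}
bob = {"t1": 2, "t2": 4, "t3": 6, "t4": 5}
carol = {"t1": 3, "t2": 2, "t3": 1}

similar = pearson_similarity(alice, bob)
opposite = pearson_similarity(alice, carol)
```

Because the coefficient subtracts each user's mean, two users who rank designs the same way score close to 1 even if one of them rates everything higher, which suits sensibility ratings where personal scales differ.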

CBIR-based Data Augmentation and Its Application to Deep Learning (CBIR 기반 데이터 확장을 이용한 딥 러닝 기술)

  • Kim, Sesong;Jung, Seung-Won
    • Journal of Broadcast Engineering / v.23 no.3 / pp.403-408 / 2018
  • Deep learning generally requires a large data set for training. However, because large data sets are not easy to create, many techniques enlarge small data sets through data augmentation such as rotation, flipping, and filtering. These simple techniques are limited in extensibility because they can hardly escape the features the data already possesses. To solve this problem, we propose a method for acquiring new image data using existing data: similar images are retrieved and acquired by using existing image data as queries for content-based image retrieval (CBIR). Finally, we compare the performance of the base model with that of the model using CBIR.
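
The simple augmentations this abstract contrasts with CBIR can be sketched on a tiny image stored as a list of rows. The helper names are hypothetical and the 2x2 "image" is toy data. The point is that every variant is a rearrangement of the same pixels, which is the limitation the abstract describes.

```python
def flip_horizontal(image):
    """Mirror each row left to right."""
    return [row[::-1] for row in image]


def rotate_90_clockwise(image):
    """Rotate the image a quarter turn clockwise."""
    return [list(row) for row in zip(*image[::-1])]


def augment(image):
    """Return the original plus a flipped and a rotated variant."""
    return [image, flip_horizontal(image), rotate_90_clockwise(image)]


tiny = [[1, 2],
        [3, 4]]
variants = augment(tiny)
```

Each variant contains exactly the pixel values of the original, so no new visual features enter the training set. Retrieving similar but distinct images, as the CBIR approach proposes, is what adds features the data did not already have.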

A Study on the Performance of Deep learning-based Automatic Classification of Forest Plants: A Comparison of Data Collection Methods (데이터 수집방법에 따른 딥러닝 기반 산림수종 자동분류 정확도 변화에 관한 연구)

  • Kim, Bomi;Woo, Heesung;Park, Joowon
    • Journal of Korean Society of Forest Science / v.109 no.1 / pp.23-30 / 2020
  • The use of increased computing power, machine learning, and deep learning techniques has increased dramatically in various sectors. In particular, image detection algorithms are broadly used in forestry and remote sensing to identify forest types and tree species. However, in South Korea, machine learning has rarely, if ever, been applied to forestry image detection, especially to classify tree species. This study integrates machine learning with forest image detection; specifically, we compared two data collection methods for machine learning, namely image data captured by forest experts (D1) and web-crawling (D2), in terms of their ability to automate the classification of five tree species. In addition, two methods of characterization to train/test the system were investigated. The results indicated a significant difference in classification accuracy between D1 and D2: the classification accuracy of D1 was higher than that of D2. To increase the classification accuracy of D2, additional data filtering techniques were required to reduce the noise in the uncensored image data.

WDENet: Wavelet-based Detail Enhanced Image Denoising Network (Wavelet 기반의 영상 디테일 향상 잡음 제거 네트워크)

  • Zheng, Jun;Wee, Seungwoo;Jeong, Jechang
    • Journal of Broadcast Engineering / v.26 no.6 / pp.725-737 / 2021
  • Although camera performance is gradually improving, digital images acquired from cameras still contain noise, which is an obstacle to obtaining high-resolution images. Filtering methods have traditionally been used for denoising, and convolutional neural networks (CNNs), one of the deep learning techniques, now perform better than traditional methods in image denoising, but image details can be lost during the learning process. In this paper, we present a CNN for image denoising that improves image details by learning them on the basis of the wavelet transform. The proposed network uses two subnetworks, one for detail enhancement and one for noise extraction. Experiments were conducted with Gaussian noise and real-world noise. We confirmed that the proposed method solves the detail loss problem more effectively than conventional algorithms, and we verified that it shows excellent results in both objective quality evaluation and subjective quality comparison.
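
To illustrate why a wavelet transform is a natural place to look for image detail, the sketch below applies one level of an unnormalized Haar transform (pairwise average and difference) to a 1-D signal. The choice of Haar is an assumption made for simplicity, since the abstract does not name the wavelet WDENet uses, and the function names and sample signal are hypothetical.

```python
def haar_forward(signal):
    """Split an even-length signal into pairwise averages and differences."""
    if len(signal) % 2 != 0:
        raise ValueError("signal length must be even")
    approx = [(signal[i] + signal[i + 1]) / 2 for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / 2 for i in range(0, len(signal), 2)]
    return approx, detail


def haar_inverse(approx, detail):
    """Rebuild the signal from its averages and differences."""
    signal = []
    for a, d in zip(approx, detail):
        signal.extend([a + d, a - d])
    return signal


# A flat pair followed by an abrupt change (an "edge").
row = [4, 4, 2, 6]
approx, detail = haar_forward(row)
restored = haar_inverse(approx, detail)
```

The detail band is zero over the flat pair and nonzero at the edge, so edges and texture are isolated in coefficients that a detail-enhancement subnetwork can target, while the inverse transform shows that no information is lost by the split.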