• Title/Abstract/Keywords: real-world dataset

Search results: 148 (processing time: 0.025 s)

The Effect of the Products' Review on Consumers' Response

  • Feng, Zhou
    • 산경연구논집
    • Vol. 7, No. 2
    • pp.13-20
    • 2016
  • Purpose - The purpose of this research is to discover whether the presence of a product's average rating introduces biases or changes the way people perceive information. We posit that a review's overall rating has a predisposition effect on consumers' perception of detailed review information. Research design, data, and methodology - To test these hypotheses, we conducted an empirical study in a real-world setting of an online shopping platform. We chose the Amazon website to test our hypotheses. The data we use were collected by the Stanford Network Analysis Project (McAuley et al., 2013). Results - First, with a dataset containing reviews of seven product categories from amazon.com, our findings could possess more generalizability, as they are produced on a typical and influential online market. Second, as our research provides alternative views of consumers' shopping behavior, it is better to test our hypotheses with data from the same source. Conclusions - Our study reveals the impact of the collective rating's presence on consumers' diagnosticity perception and sheds light on some of the conflicting results in prior studies. Our research generates implications for both theory and business practice, and suggests future directions for the research question.

Dessert Ateliers Recommendation Methods for Dessert E-commerce Services

  • 손연빈;장태우;최예림
    • 인터넷정보학회논문지
    • Vol. 21, No. 1
    • pp.111-117
    • 2020
  • Dessert Ateliers (DA) are small shops that sell high-end homemade desserts such as macarons, cakes, and cookies, and their popularity is increasing with the emergence of the small-luxury trend. Even though DAs sell the same kinds of desserts, they are differentiated by the personality of their pastry chefs; thus, customers need to purchase online desserts that they cannot see and purchase offline, and dessert e-commerce has emerged in response. However, it is impossible for customers to take in all the information about each DA and to clearly understand their own preferences when buying desserts through dessert e-commerce. When a dessert e-commerce service provides a DA recommendation service, customers can reduce the time they hesitate before making a decision. Therefore, this paper proposes two kinds of DA recommendation methods: a clustering-based recommendation method that calculates the similarity between customers' content and DAs, and a dynamic weighting-based recommendation method that learns the importance of decision factors considering customer preferences. Various experiments were conducted using a real-world dataset to evaluate the performance of the proposed methods, and the results were satisfactory.

Deep Image Annotation and Classification by Fusing Multi-Modal Semantic Topics

  • Chen, YongHeng;Zhang, Fuquan;Zuo, WanLi
    • KSII Transactions on Internet and Information Systems (TIIS)
    • Vol. 12, No. 1
    • pp.392-412
    • 2018
  • Due to the semantic gap between different modalities, automatic retrieval from multimedia information remains a major challenge. It is desirable to provide an effective joint model that bridges the gap and organizes the relationships between the modalities. In this work, we develop a model for deep image annotation and classification by fusing multi-modal semantic topics (DAC_mmst), which can find visual and non-visual topics by jointly modeling the image and loosely related text for deep image annotation while simultaneously learning and predicting the class label. More specifically, DAC_mmst relies on a non-parametric Bayesian model to estimate the number of visual topics that best explains the image. To evaluate the effectiveness of the proposed algorithm, we collect a real-world dataset and conduct various experiments. The experimental results show that the proposed DAC_mmst performs favorably in perplexity, image annotation, and classification accuracy compared to several state-of-the-art methods.

Locally-Weighted Polynomial Neural Network for Daily Short-Term Peak Load Forecasting

  • Yu, Jungwon;Kim, Sungshin
    • International Journal of Fuzzy Logic and Intelligent Systems
    • Vol. 16, No. 3
    • pp.163-172
    • 2016
  • Electric load forecasting is essential for effective power system planning and operation. Complex and nonlinear relationships exist between electric loads and their exogenous factors. In addition, time-series load data has non-stationary characteristics, such as trend, seasonality, and anomalous day effects, which make it difficult to predict future loads. This paper proposes a locally-weighted polynomial neural network (LWPNN), which combines a polynomial neural network (PNN) and locally-weighted regression (LWR), for daily short-term peak load forecasting. Model over-fitting can be prevented effectively because PNN has an automatic structure identification mechanism for nonlinear system modeling. LWR, which is applied to optimize the regression coefficients of LWPNN, uses only the locally weighted learning data points in the neighborhood of the current query point instead of all data points. LWPNN is very effective and suitable for predicting an electric load series with nonlinear and non-stationary characteristics. To confirm its effectiveness, the proposed LWPNN, standard PNN, support vector regression, and an artificial neural network are applied to a real-world daily peak load dataset from Korea. The proposed LWPNN achieves significantly better prediction accuracy than the other methods.
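The locally-weighted regression idea mentioned in this abstract can be illustrated with a small sketch. This is plain one-dimensional LWR with a straight-line local model, not the authors' LWPNN; the Gaussian kernel, the `bandwidth` parameter, and the function name `lwr_predict` are assumptions made for illustration.

```python
import math


def lwr_predict(xs, ys, query, bandwidth=1.0):
    """Predict y at `query` with a locally weighted straight-line fit.

    Assumption: a Gaussian kernel decides how much each training point
    counts; the abstract does not say which kernel the paper uses.
    """
    # Points near the query get weights close to 1, distant points close to 0.
    ws = [math.exp(-((x - query) ** 2) / (2 * bandwidth ** 2)) for x in xs]

    # Weighted sums needed for the closed-form weighted least-squares line.
    sw = sum(ws)
    swx = sum(w * x for w, x in zip(ws, xs))
    swy = sum(w * y for w, y in zip(ws, ys))
    swxx = sum(w * x * x for w, x in zip(ws, xs))
    swxy = sum(w * x * y for w, x, y in zip(ws, xs, ys))

    denom = sw * swxx - swx ** 2
    if abs(denom) < 1e-12:
        # Degenerate case (e.g. all x equal): fall back to the weighted mean.
        return swy / sw

    slope = (sw * swxy - swx * swy) / denom
    intercept = (swy - slope * swx) / sw
    return intercept + slope * query
```

Because the coefficients are refitted for every query point, the model can follow local structure in the series, at the cost of one small regression per prediction.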

사용자 간 신뢰·불신 관계 네트워크 분석 기반 추천 알고리즘에 관한 연구 (A Study on the Recommendation Algorithm based on Trust/Distrust Relationship Network Analysis)

  • 노희룡;안현철
    • Journal of Information Technology Applications and Management
    • Vol. 24, No. 1
    • pp.169-185
    • 2017
  • This study proposes a novel recommendation algorithm that reflects the results of trust/distrust network analysis in order to enhance the prediction accuracy of recommender systems. The algorithm is based on memory-based collaborative filtering (CF), which is the most popular recommendation algorithm. Unlike conventional CF, however, the proposed algorithm considers not only the correlation of rating patterns between users but also the results of trust/distrust relationship network analysis (e.g., who are the most trusted or distrusted users, and whom does the target user trust or distrust?) when calculating the similarity between users. To validate the performance of the proposed algorithm, we applied it to a real-world dataset that contained the trust/distrust relationships among users as well as their numeric ratings of movies. As a result, we found that the proposed algorithm outperformed conventional CF with statistical significance. We also found that the distrust relationship was more important than the trust relationship in measuring similarities between users. This implies that negative relationships need more careful attention than positive ones when tracking and managing social relationships among users.

A Max-Flow-Based Similarity Measure for Spectral Clustering

  • Cao, Jiangzhong;Chen, Pei;Zheng, Yun;Dai, Qingyun
    • ETRI Journal
    • Vol. 35, No. 2
    • pp.311-320
    • 2013
  • In most spectral clustering approaches, the Gaussian kernel-based similarity measure is used to construct the affinity matrix. However, such a similarity measure does not work well on a dataset with a nonlinear and elongated structure. In this paper, we present a new similarity measure to deal with the nonlinearity issue. The maximum flow between data points is computed as the new similarity, which satisfies the requirement for similarity in the clustering method. Additionally, the new similarity carries both the global and local relations between data points. We apply it to spectral clustering and compare the proposed similarity measure with other state-of-the-art methods on both synthetic and real-world data. The experimental results show the superiority of the new similarity: 1) the max-flow-based similarity measure can significantly improve the performance of spectral clustering; 2) it is robust and not sensitive to the parameters.

퍼지 클러스터 기반 디지털 유방 X선 영상 진단 시스템 (Fuzzy Cluster Based Diagnosis System for Digital Mammogram)

  • 이현숙;윤석민
    • 정보처리학회논문지B
    • Vol. 16B, No. 2
    • pp.165-172
    • 2009
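  • According to recent ACS figures, breast cancer is the most commonly occurring cancer among women and causes the second-highest number of cancer deaths. Because masses and calcification lesions in mammograms are known to be the most important clues for diagnosis, research on computer processing of digital mammograms for the early diagnosis of breast cancer is under way. This paper proposes a diagnosis system based on a fuzzy cluster knowledge base. The proposed system processes two kinds of feature data in a dual OFUN-NET and reports the diagnosis result together with its possibility. Experiments use mass and calcification lesion data obtained from DDSM, a publicly available mammogram database collected from real-world medical institutions. The experimental results show that the proposed system achieves higher classification accuracy than existing methods and produces valid results that can support experts' decision-making as a mammogram diagnosis system.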

트리플 데이터베이스 단축 경로 이득 함수와 구성 인자 실험 분석 (Empirical Analysis on the Shortcut Benefit Function and its Factors for Triple Database)

  • 강승석;심준호
    • 한국전자거래학회지
    • Vol. 19, No. 1
    • pp.131-143
    • 2014
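  • Query processing in a triple database, which consists of three-column triple tables, is costly, and shortcuts are known as a way to reduce that cost. Which shortcuts to select and build is a key problem, and the shortcut benefit is usually computed from query frequency. However, this approach does not properly reflect additions or changes to the triple data. This paper addresses a benefit model that considers not only the reduction in query processing time but also the cost of building and maintaining the shortcuts. The benefit model is designed as a benefit function and applied to a shortcut selection technique. The influence of the factors that make up the benefit function is analyzed experimentally using real-world triple data.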

A New Perspective to Stable Marriage Problem in Profit Maximization of Matrimonial Websites

  • Bhatnagar, Aniket;Gambhir, Varun;Thakur, Manish Kumar
    • Journal of Information Processing Systems
    • Vol. 14, No. 4
    • pp.961-979
    • 2018
  • For many years, matching in a bipartite graph has been widely used in various assignment problems, such as the stable marriage problem (SMP). As an application of bipartite matching, the stable marriage problem is defined over equally sized sets of men and women to identify a stable matching in which each person is assigned a partner of the opposite gender according to their preferences. The classical SMP proposed by Gale and Shapley uses a preference list for each individual (man or woman), which is infeasible in real-world applications with a large population of men and women, such as matrimonial websites. In this paper, we propose an enhancement to the SMP that computes a weighted score for the users registered at matrimonial websites. The proposed enhancement is formulated as profit maximization for matrimonial websites in terms of their ability to provide a suitable match for their users. This formulation leads to a combinatorial optimization problem, which we solve with greedy and genetic algorithm based approaches. We show that the proposed genetic algorithm based approaches outperform the existing Gale-Shapley algorithm on a dataset crawled from matrimonial websites.
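For reference, the classical Gale-Shapley procedure that this abstract uses as its baseline can be sketched as follows. This is a minimal textbook version that assumes complete preference lists and equally sized sides; it is not the authors' weighted-score enhancement, and the function name and dictionary-based input format are assumptions made for illustration.

```python
def gale_shapley(proposer_prefs, receiver_prefs):
    """Return a stable matching as {proposer: receiver}.

    Both arguments map each person to a complete preference list over the
    other side, most preferred first (an assumption of this sketch).
    """
    # rank[r][p] = position of proposer p in receiver r's list (lower is better).
    rank = {r: {p: i for i, p in enumerate(prefs)}
            for r, prefs in receiver_prefs.items()}
    next_choice = {p: 0 for p in proposer_prefs}  # next list index to propose to
    engaged_to = {}                               # receiver -> current proposer
    free = list(proposer_prefs)                   # proposers without a partner

    while free:
        p = free.pop()
        r = proposer_prefs[p][next_choice[p]]
        next_choice[p] += 1
        current = engaged_to.get(r)
        if current is None:
            engaged_to[r] = p          # receiver was free: accept
        elif rank[r][p] < rank[r][current]:
            engaged_to[r] = p          # receiver prefers the new proposer
            free.append(current)
        else:
            free.append(p)             # rejected: p tries the next choice later

    return {p: r for r, p in engaged_to.items()}
```

The need for a full preference list per person is visible in the inputs, which is the practical limitation for large sites that the abstract points out.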

아파치 스파크에서의 PARAFAC 분해 기반 텐서 재구성을 이용한 추천 시스템 (PARAFAC Tensor Reconstruction for Recommender System based on Apache Spark)

  • 임어진;용환승
    • 한국멀티미디어학회논문지
    • Vol. 22, No. 4
    • pp.443-454
    • 2019
  • In recent years, there has been active research on recommender systems that consider three or more inputs in addition to users and goods, which turns the data into a multi-dimensional array, also known as a tensor. The main issue with using a tensor is that it has many missing values, which makes it sparse. To address this, the tensor can be decomposed by a tensor decomposition algorithm into lower-dimensional arrays called factor matrices. The tensor is then reconstructed from the factor matrices so that the originally empty cells are filled with predicted values. This is called tensor reconstruction. In this paper, we propose a user-based Top-K recommender system that uses normalized PARAFAC tensor reconstruction. The method factorizes a tensor into factor matrices and then reconstructs the tensor. Before decomposition, the original tensor is normalized along each dimension to reduce overfitting. Using a real-world dataset, this paper demonstrates the processing of a large amount of data and implements a recommender system based on Apache Spark. In addition, the study confirms that recommendation performance improves through normalization of the tensor.
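The reconstruction step described in this abstract can be sketched in a few lines. The sketch assumes a three-way user x item x context tensor and factor matrices that are already given; it does not perform the PARAFAC decomposition, the normalization, or the Spark distribution from the paper, and the names `cp_reconstruct` and `top_k_items` are hypothetical.

```python
def cp_reconstruct(A, B, C):
    """Rebuild a 3-way tensor from PARAFAC (CP) factor matrices.

    A, B, C are lists of rows that share the same number of columns
    (the rank). Entry (i, j, k) is the sum over r of
    A[i][r] * B[j][r] * C[k][r].
    """
    rank = len(A[0])
    return [[[sum(A[i][r] * B[j][r] * C[k][r] for r in range(rank))
              for k in range(len(C))]
             for j in range(len(B))]
            for i in range(len(A))]


def top_k_items(tensor, user, context, k):
    """Return the k item indices with the highest reconstructed score."""
    scores = [tensor[user][j][context] for j in range(len(tensor[user]))]
    order = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return order[:k]
```

Every cell of the reconstructed tensor has a value, including cells that were empty in the original, which is what lets the reconstructed scores serve as predictions for a Top-K list.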