• Title/Summary/Keyword: Transaction-Based Data

Search Result 535, Processing Time 0.029 seconds

Customer Behavior Prediction of Binary Classification Model Using Unstructured Information and Convolution Neural Network: The Case of Online Storefront (비정형 정보와 CNN 기법을 활용한 이진 분류 모델의 고객 행태 예측: 전자상거래 사례를 중심으로)

  • Kim, Seungsoo;Kim, Jongwoo
    • Journal of Intelligence and Information Systems
    • /
    • v.24 no.2
    • /
    • pp.221-241
    • /
    • 2018
  • Deep learning is getting attention recently. The deep learning technique which had been applied in competitions of the International Conference on Image Recognition Technology(ILSVR) and AlphaGo is Convolution Neural Network(CNN). CNN is characterized in that the input image is divided into small sections to recognize the partial features and combine them to recognize as a whole. Deep learning technologies are expected to bring a lot of changes in our lives, but until now, its applications have been limited to image recognition and natural language processing. The use of deep learning techniques for business problems is still an early research stage. If their performance is proved, they can be applied to traditional business problems such as future marketing response prediction, fraud transaction detection, bankruptcy prediction, and so on. So, it is a very meaningful experiment to diagnose the possibility of solving business problems using deep learning technologies based on the case of online shopping companies which have big data, are relatively easy to identify customer behavior and has high utilization values. Especially, in online shopping companies, the competition environment is rapidly changing and becoming more intense. Therefore, analysis of customer behavior for maximizing profit is becoming more and more important for online shopping companies. In this study, we propose 'CNN model of Heterogeneous Information Integration' using CNN as a way to improve the predictive power of customer behavior in online shopping enterprises. In order to propose a model that optimizes the performance, which is a model that learns from the convolution neural network of the multi-layer perceptron structure by combining structured and unstructured information, this model uses 'heterogeneous information integration', 'unstructured information vector conversion', 'multi-layer perceptron design', and evaluate the performance of each architecture, and confirm the proposed model based on the results. In addition, the target variables for predicting customer behavior are defined as six binary classification problems: re-purchaser, churn, frequent shopper, frequent refund shopper, high amount shopper, high discount shopper. In order to verify the usefulness of the proposed model, we conducted experiments using actual data of domestic specific online shopping company. This experiment uses actual transactions, customers, and VOC data of specific online shopping company in Korea. Data extraction criteria are defined for 47,947 customers who registered at least one VOC in January 2011 (1 month). The customer profiles of these customers, as well as a total of 19 months of trading data from September 2010 to March 2012, and VOCs posted for a month are used. The experiment of this study is divided into two stages. In the first step, we evaluate three architectures that affect the performance of the proposed model and select optimal parameters. We evaluate the performance with the proposed model. Experimental results show that the proposed model, which combines both structured and unstructured information, is superior compared to NBC(Naïve Bayes classification), SVM(Support vector machine), and ANN(Artificial neural network). Therefore, it is significant that the use of unstructured information contributes to predict customer behavior, and that CNN can be applied to solve business problems as well as image recognition and natural language processing problems. It can be confirmed through experiments that CNN is more effective in understanding and interpreting the meaning of context in text VOC data. And it is significant that the empirical research based on the actual data of the e-commerce company can extract very meaningful information from the VOC data written in the text format directly by the customer in the prediction of the customer behavior. Finally, through various experiments, it is possible to say that the proposed model provides useful information for the future research related to the parameter selection and its performance.

A Study on the Performance Evaluation of G2B Procurement Process Innovation by Using MAS: Korea G2B KONEPS Case (멀티에이전트시스템(MAS)을 이용한 G2B 조달 프로세스 혁신의 효과평가에 관한 연구 : 나라장터 G2B사례)

  • Seo, Won-Jun;Lee, Dae-Cheor;Lim, Gyoo-Gun
    • Journal of Intelligence and Information Systems
    • /
    • v.18 no.2
    • /
    • pp.157-175
    • /
    • 2012
  • It is difficult to evaluate the performance of process innovation of e-procurement which has large scale and complex processes. The existing evaluation methods for measuring the effects of process innovation have been mainly done with statistically quantitative methods by analyzing operational data or with qualitative methods by conducting surveys and interviews. However, these methods have some limitations to evaluate the effects because the performance evaluation of e-procurement process innovation should consider the interactions among participants who are active either directly or indirectly through the processes. This study considers the e-procurement process as a complex system and develops a simulation model based on MAS(Multi-Agent System) to evaluate the effects of e-procurement process innovation. Multi-agent based simulation allows observing interaction patterns of objects in virtual world through relationship among objects and their behavioral mechanism. Agent-based simulation is suitable especially for complex business problems. In this study, we used Netlogo Version 4.1.3 as a MAS simulation tool which was developed in Northwestern University. To do this, we developed a interaction model of agents in MAS environment. We defined process agents and task agents, and assigned their behavioral characteristics. The developed simulation model was applied to G2B system (KONEPS: Korea ON-line E-Procurement System) of Public Procurement Service (PPS) in Korea and used to evaluate the innovation effects of the G2B system. KONEPS is a successfully established e-procurement system started in the year 2002. KONEPS is a representative e-Procurement system which integrates characteristics of e-commerce into government for business procurement activities. KONEPS deserves the international recognition considering the annual transaction volume of 56 billion dollars, daily exchanges of electronic documents, users consisted of 121,000 suppliers and 37,000 public organizations, and the 4.5 billion dollars of cost saving. For the simulation, we analyzed the e-procurement of process of KONEPS into eight sub processes such as 'process 1: search products and acquisition of proposal', 'process 2 : review the methods of contracts and item features', 'process 3 : a notice of bid', 'process 4 : registration and confirmation of qualification', 'process 5 : bidding', 'process 6 : a screening test', 'process 7 : contracts', and 'process 8 : invoice and payment'. For the parameter settings of the agents behavior, we collected some data from the transactional database of PPS and some information by conducting a survey. The used data for the simulation are 'participants (government organizations, local government organizations and public institutions)', 'the number of bidding per year', 'the number of total contracts', 'the number of shopping mall transactions', 'the rate of contracts between bidding and shopping mall', 'the successful bidding ratio', and the estimated time for each process. The comparison was done for the difference of time consumption between 'before the innovation (As-was)' and 'after the innovation (As-is).' The results showed that there were productivity improvements in every eight sub processes. The decrease ratio of 'average number of task processing' was 92.7% and the decrease ratio of 'average time of task processing' was 95.4% in entire processes when we use G2B system comparing to the conventional method. Also, this study found that the process innovation effect will be enhanced if the task process related to the 'contract' can be improved. This study shows the usability and possibility of using MAS in process innovation evaluation and its modeling.

Effect of Market Basket Size on the Accuracy of Association Rule Measures (장바구니 크기가 연관규칙 척도의 정확성에 미치는 영향)

  • Kim, Nam-Gyu
    • Asia pacific journal of information systems
    • /
    • v.18 no.2
    • /
    • pp.95-114
    • /
    • 2008
  • Recent interests in data mining result from the expansion of the amount of business data and the growing business needs for extracting valuable knowledge from the data and then utilizing it for decision making process. In particular, recent advances in association rule mining techniques enable us to acquire knowledge concerning sales patterns among individual items from the voluminous transactional data. Certainly, one of the major purposes of association rule mining is to utilize acquired knowledge in providing marketing strategies such as cross-selling, sales promotion, and shelf-space allocation. In spite of the potential applicability of association rule mining, unfortunately, it is not often the case that the marketing mix acquired from data mining leads to the realized profit. The main difficulty of mining-based profit realization can be found in the fact that tremendous numbers of patterns are discovered by the association rule mining. Due to the many patterns, data mining experts should perform additional mining of the results of initial mining in order to extract only actionable and profitable knowledge, which exhausts much time and costs. In the literature, a number of interestingness measures have been devised for estimating discovered patterns. Most of the measures can be directly calculated from what is known as a contingency table, which summarizes the sales frequencies of exclusive items or itemsets. A contingency table can provide brief insights into the relationship between two or more itemsets of concern. However, it is important to note that some useful information concerning sales transactions may be lost when a contingency table is constructed. For instance, information regarding the size of each market basket(i.e., the number of items in each transaction) cannot be described in a contingency table. It is natural that a larger basket has a tendency to consist of more sales patterns. Therefore, if two itemsets are sold together in a very large basket, it can be expected that the basket contains two or more patterns and that the two itemsets belong to mutually different patterns. Therefore, we should classify frequent itemset into two categories, inter-pattern co-occurrence and intra-pattern co-occurrence, and investigate the effect of the market basket size on the two categories. This notion implies that any interestingness measures for association rules should consider not only the total frequency of target itemsets but also the size of each basket. There have been many attempts on analyzing various interestingness measures in the literature. Most of them have conducted qualitative comparison among various measures. The studies proposed desirable properties of interestingness measures and then surveyed how many properties are obeyed by each measure. However, relatively few attentions have been made on evaluating how well the patterns discovered by each measure are regarded to be valuable in the real world. In this paper, attempts are made to propose two notions regarding association rule measures. First, a quantitative criterion for estimating accuracy of association rule measures is presented. According to this criterion, a measure can be considered to be accurate if it assigns high scores to meaningful patterns that actually exist and low scores to arbitrary patterns that co-occur by coincidence. Next, complementary measures are presented to improve the accuracy of traditional association rule measures. By adopting the factor of market basket size, the devised measures attempt to discriminate the co-occurrence of itemsets in a small basket from another co-occurrence in a large basket. Intensive computer simulations under various workloads were performed in order to analyze the accuracy of various interestingness measures including traditional measures and the proposed measures.

The Prediction of Purchase Amount of Customers Using Support Vector Regression with Separated Learning Method (Support Vector Regression에서 분리학습을 이용한 고객의 구매액 예측모형)

  • Hong, Tae-Ho;Kim, Eun-Mi
    • Journal of Intelligence and Information Systems
    • /
    • v.16 no.4
    • /
    • pp.213-225
    • /
    • 2010
  • Data mining has empowered the managers who are charge of the tasks in their company to present personalized and differentiated marketing programs to their customers with the rapid growth of information technology. Most studies on customer' response have focused on predicting whether they would respond or not for their marketing promotion as marketing managers have been eager to identify who would respond to their marketing promotion. So many studies utilizing data mining have tried to resolve the binary decision problems such as bankruptcy prediction, network intrusion detection, and fraud detection in credit card usages. The prediction of customer's response has been studied with similar methods mentioned above because the prediction of customer's response is a kind of dichotomous decision problem. In addition, a number of competitive data mining techniques such as neural networks, SVM(support vector machine), decision trees, logit, and genetic algorithms have been applied to the prediction of customer's response for marketing promotion. The marketing managers also have tried to classify their customers with quantitative measures such as recency, frequency, and monetary acquired from their transaction database. The measures mean that their customers came to purchase in recent or old days, how frequent in a period, and how much they spent once. Using segmented customers we proposed an approach that could enable to differentiate customers in the same rating among the segmented customers. Our approach employed support vector regression to forecast the purchase amount of customers for each customer rating. Our study used the sample that included 41,924 customers extracted from DMEF04 Data Set, who purchased at least once in the last two years. We classified customers from first rating to fifth rating based on the purchase amount after giving a marketing promotion. Here, we divided customers into first rating who has a large amount of purchase and fifth rating who are non-respondents for the promotion. Our proposed model forecasted the purchase amount of the customers in the same rating and the marketing managers could make a differentiated and personalized marketing program for each customer even though they were belong to the same rating. In addition, we proposed more efficient learning method by separating the learning samples. We employed two learning methods to compare the performance of proposed learning method with general learning method for SVRs. LMW (Learning Method using Whole data for purchasing customers) is a general learning method for forecasting the purchase amount of customers. And we proposed a method, LMS (Learning Method using Separated data for classification purchasing customers), that makes four different SVR models for each class of customers. To evaluate the performance of models, we calculated MAE (Mean Absolute Error) and MAPE (Mean Absolute Percent Error) for each model to predict the purchase amount of customers. In LMW, the overall performance was 0.670 MAPE and the best performance showed 0.327 MAPE. Generally, the performances of the proposed LMS model were analyzed as more superior compared to the performance of the LMW model. In LMS, we found that the best performance was 0.275 MAPE. The performance of LMS was higher than LMW in each class of customers. After comparing the performance of our proposed method LMS to LMW, our proposed model had more significant performance for forecasting the purchase amount of customers in each class. In addition, our approach will be useful for marketing managers when they need to customers for their promotion. Even if customers were belonging to same class, marketing managers could offer customers a differentiated and personalized marketing promotion.

Effect of Information Security Incident on Outcome of Investment by Type of Investors: Case of Personal Information Leakage Incident (정보보안사고가 투자주체별 투자성과에 미치는 영향: 개인정보유출사고 중심으로)

  • Eom, Jae-Ha;Kim, Min-Jeong
    • Journal of the Korea Institute of Information Security & Cryptology
    • /
    • v.26 no.2
    • /
    • pp.463-474
    • /
    • 2016
  • As IT environment has changed, paths of information security in financial environment which is based on IT have become more diverse and damage caused by information leakage has been more serious. Among security incidents, personal information leakage incident is liable to give the greatest damage. Personal information leakage incident is more serious than any other types of information leakage incidents in that it may lead to secondary damage. The purpose of this study is to find how much personal information leakage incident influences corporate value by analyzing 21 cases of personal information leakage incident for the last 15 years 1,899 listing firm through case research method and inferring investors' response of to personal information leakage incident surveying a change in transaction before and after personal information leakage incident. This study made a quantitative analysis of what influence personal information leakage incident has on outcome of investment by types of investors by classifying types of investors into foreign investors, private investors and institutional investors. This study is significant in that it helps improve awareness of importance of personal information security by providing data that personal information leakage incident can have a significant influence on outcome of investment as well as corporate value in Korea stock market.

Analysis the Types of Consumer Damages Incurred by Using a Digital Contents (디지털콘텐츠 소비자 피해유형 분석)

  • Nam, Su-Jung;Lee, Eun-Hee;Park, Sang-Mi
    • Korean Journal of Human Ecology
    • /
    • v.16 no.6
    • /
    • pp.1197-1209
    • /
    • 2007
  • The advance of digital contents industry shifts the focus of consumptions; from analogue to digital ones. It gives significant impact on individual life as well as overall society and culture, and it leads to the increased consumption of digital contents. Nevertheless, current digital contents industry fails to secure the sufficient consumer protection systems including relevant rules and laws which regulate the distribution, use, and other transaction activities of digital contents and the efforts, on the part of contents providers, to provide information to consumers and to protect them. Digital contents, by its nature, is different from the existing products so that its nature is likely to cause unique consumer problems totally different from the offline transactions and the electrical transactions of existing products. This study, therefore, aims to identify the possible problems which may be incurred by consumers in their use of digital contents, specify the types of consumer damages, and provide the underlying materials to improve the systems related to digital contents and take legally complementary measures for consumer protection. To identify the types of consumer damages, this study analyzed the results from consumer counselling cases, experts opinion survey, and FGI. For consumer damage cases, this study analyzed the consumer complaints received by open consumer counselling sites of the Korea Consumer Agency and Seoul Electronic Commerce Center. For experts opinion survey, it conducted questionnaire survey of the group of experts from digital contents manufacturers or providers, and those who treated consumer damages directly. For FGI analysis, it organized a panel of students and employees who had used digital contents to understand the types of consumer damages. The results of this study can be summed up as follows. Based on the results from consumer counselling cases, experts opinion survey, and FGI analysis, the consumer damages related to digital contents can be classified, in their nature, into economic or financial damages (25 cases), emotional or psychological ones (15 cases), time-related ones (7 cases), physical ones (4 cases), and privacy-related ones (i.e. leakage of personal data)(3 cases). More specifying the types of damages, damages can be subdivided into contract-, charge-, maintenance-, use-, individual-related ones and other ones. Among them, both contract- and charge-related damages appeared only in the economic or financial damages, whereas user-specific individual damages appeared only in physical and emotional or psychological ones. On the other hand, maintenance- and use-related damages and other ones were observed in both categories of economical or financial damages and time-related ones. Use- and privacy-related damages, in particular, caused emotional or psychological damages.

Influencing Factors on the Lending Intention of Online Peer-to-Peer Lending: Lessons from Renrendai.com (온라인 P2P 대출의도의 영향요인에 관한 연구: 런런다이 사례를 중심으로)

  • Yang, Qin;Lee, Young-Chan
    • The Journal of Information Systems
    • /
    • v.25 no.2
    • /
    • pp.79-110
    • /
    • 2016
  • Purpose Online Peer-to-peer lending (hereafter P2P lending), is a new method of lending money to unrelated individuals through an online financial intermediary. Usually in the online P2P transaction, individuals who would like to borrow money (hereafter borrowers) and those who would like to lend money (hereafter lenders) have no previous relationship. Based on enormous previous studies, this study develops an integrated model, particularly for the online P2P lending environment in China, to better understand the critical factors that influence lenders' intention to lend money through the online P2P lending platform. Design/methodology/approach In order to verify the hypotheses, we develop a questionnaire with 42 survey items. We measured all the items on a five-point Likert-type scale. We use Sojump.com to collect questionnaire and gather 246 valid responses from registered members of Renrendai.com. We analyzed the main survey data by using SPSS 18.0 and AMOS 20.0. We first estimated the reliability, validity, composite reliability and AVE and then conduct common method bias test. The mediating role of trust in platform and in borrower has been tested. Last we tested the hypotheses through the structural model. Findings The results reveal that service quality, information quality, structural assurance, awareness and reputation significantly impact lenders' trust in the online P2P lending platform. Second, awareness, reputation and perceived risk significantly impact lenders' trust in borrower and lending intention. Third, trust propensity has a positive effect on lenders' trust on borrower. Last, awareness, reputation, perceived risk, platform trust and borrower trust can directly impact lenders' lending intention.

A Study on Improvement for Service Proliferation Based on Blockchain (블록체인 기반 서비스 확산을 위한 개선 방안 연구)

  • Yoo, Soonduck;Kim, Kiheung
    • The Journal of the Institute of Internet, Broadcasting and Communication
    • /
    • v.18 no.1
    • /
    • pp.185-194
    • /
    • 2018
  • This study investigates the limitations of blockchain technology and the ways to improve it by using Delphi technique. Limit factors and improvement measures are classified into technology, service, and legal system. First, from a technical point of view, lack of standardization of the technology, insufficiency of integration, lack of scalability, unclear cancellation or correction policy, excessive cost of transaction verification, insufficient personal information protection and not enough to respond to hacking defense were the limiting factors. In order to improve these, the followings; ensuring standardization, securing integration and scalability, establishing cancellation of each applicable data, establishment of correction policy, efficiency of verification cost, the protection of personal information and countermeasure against hacking are provided. The related technology development and countermeasures must be established to effectively introduce the blockchain technology to the market. Second, in the early stage of blockchain service, it showed lack of utilization of the blockchain, security threat, shortage of skilled workers, and lack of legal liability. As a solution to these problems, it is necessary to suggest various applications, against security threat, training professional manpower, and securing legal responsibility. It should also provide a foundation for providing institutionally stable services. Third, from as legal system point of view, inadequate legal compliance, lack of relevant regulation, and uncertainty in the regulation were the limiting factors. Therefore establishing a legal system, which is the most important area for activating the service, should be accompanied by the provision of legal countermeasures, clearness of regulations and measures to be taken by relevant governmental authorities. This study will contribute as a reference for a research, related to the blockchain.

An Efficient Recovery System for Spatial Main Memory DBMS (공간 메인 메모리 DBMS를 위한 효율적인 회복 시스템)

  • Kim, Joung-Joon;Ju, Sung-Wan;Kang, Hong-Koo;Hong, Dong-Sook;Han, Ki-Joon
    • Journal of Korea Spatial Information System Society
    • /
    • v.8 no.3
    • /
    • pp.1-14
    • /
    • 2006
  • Recently, to efficiently support the real-time requirements of LBS and Telematics services, interest in the spatial main memory DBMS is rising. In the spatial main memory DBMS, because all spatial data can be lost when the system failure happens, the recovery system is very important for the stability of the database. Especially, disk I/O in executing the log and the checkpoint becomes the bottleneck of letting down the total system performance. Therefore, it is urgently necessary to research about the recovery system to reduce disk I/O in the spatial main memory DBMS. In this paper, we study an efficient recovery system for the spatial main memory DBMS. First, the pre-commit log method is used for the decrement of disk I/O and the improvement of transaction concurrency. In addition, we propose the fuzzy-shadow checkpoint method for the recovery system of the spatial main memory DBMS. This method can solve the problem of duplicated disk I/O on the same page of the existing fuzzy-pingpong checkpoint method for the improvement of the whole system performance. Finally, we also report the experimental results confirming the benefit of the proposed recovery system.

  • PDF

A Study on Nonlinear Dynamic Adjustment of Spot Prices of Major Crude Oils (주요 원유 현물가격간의 비선형 동적조정에 관한 연구)

  • Park, Haesun;Lee, Sangjik
    • Environmental and Resource Economics Review
    • /
    • v.24 no.4
    • /
    • pp.657-677
    • /
    • 2015
  • We employ a 3 regime-threshold vector error correction models (TVECM) to investigate the nonlinear dynamic adjustments of three marker crude oil prices such as WTI (West Texas Intermediate), Brent and Dubai. Especially we deal with 3 combinations of oil prices including WTI-Brent, WTI-Dubai and Brent-Dubai in order to analyze the dynamic adjustments of the prices based on the effects of the price spreads among these crude oil prices. Our daily spot prices data run from 2001.1.3 to 2014.12.31. We found that each combination is cointegrated over the period. WTI had dropped significantly in 2010 which had affected the movements of the spreads. To accomodate this fact, we divide the period into two sub-periods: 2000.1.3-2009.12.31 and 2010.1.1-2014.12.31. It is found that each combination is cointegrated in both sub-periods. Moroever, in the first sub-period, all three oil prices are shown to follow nonlinear dynamic adjustments. In the second sub-period, however, TVECM is better than VECM(vector error correction model) for WTI-Dubai and Brent-Dubai while VECM performs better for WTI-Brent. The transaction costs are estimated to be reduced for the second sub-period for WTI-Dubai and Brent-Dubai compared to the first sub-period.