• Title/Summary/Keyword: conversion algorithm

Analysis of shopping website visit types and shopping pattern (쇼핑 웹사이트 탐색 유형과 방문 패턴 분석)

  • Choi, Kyungbin;Nam, Kihwan
    • Journal of Intelligence and Information Systems / v.25 no.1 / pp.85-107 / 2019
  • Online consumers either browse products within a particular product line or brand with the intent to purchase, or navigate widely and leave without purchasing. Research on the behavior and purchases of online consumers has progressed steadily, and services and applications based on consumer behavior data have been developed in practice. In recent years, advances in big data technology have enabled customization strategies and recommendation systems for consumers, and attempts are being made to optimize users' shopping experience. Even so, the likelihood that an online consumer who visits a website actually proceeds to the purchase stage remains very low. This is because online consumers do not visit a website only to purchase products; they use and browse websites differently according to their shopping motives and purposes. To understand the behavior of online consumers, it is therefore important to analyze the various types of visits, not only visits that end in a purchase. In this study, we applied session-level clustering analysis to the clickstream data of an e-commerce company in order to explain the diversity and complexity of online consumers' search behavior and to classify that behavior into types. For the analysis, we converted more than 8 million page-level data points into visit-level sessions, yielding more than 500,000 website visit sessions. For each visit session, 12 characteristics such as page views, duration, search diversity, and page type concentration were extracted for clustering. Considering the size of the data set, we used the Mini-Batch K-means algorithm, which offers advantages in learning speed and efficiency while maintaining clustering performance similar to that of standard K-means.
The optimal number of clusters was found to be four, and differences in session-level characteristics and purchase rates were identified for each cluster. Online consumers visit a website several times, learn about a product, and then decide whether to purchase. To analyze the purchasing process across several visits, we constructed visit sequence data for each consumer based on the navigation patterns derived from the clustering analysis. Each visit sequence contains the series of visits leading up to one purchase, and the items in a sequence are the cluster labels derived above. We constructed separate sequence data for consumers who made purchases and for consumers who only explored products without purchasing during the same period. Sequential pattern mining was then applied to extract frequent patterns from each set of sequence data. The minimum support was set to 10%, and each frequent pattern consists of a sequence of cluster labels. Some patterns are common to both sets of sequence data, while others are frequent in only one of them. Comparing the extracted frequent patterns, we found that consumers who made purchases showed visit patterns in which they repeatedly searched for a specific product before deciding to purchase it. The implication of this study is that it analyzes the search types of online consumers using large-scale clickstream data and analyzes their patterns to explain the purchasing process from a data-driven perspective. Most studies on typologies of online consumers have focused on the characteristics of each type and on which factors are key in distinguishing the types.
In this study, we classified the behavior of online consumers into types and further analyzed the order in which these types follow one another to form a series of search patterns. In addition, online retailers can try to improve their purchase conversion through marketing strategies and recommendations tailored to the various visit types, and can evaluate the effect of a strategy through changes in consumers' visit patterns.
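
The sequential pattern mining step in the abstract above can be illustrated with a minimal sketch. It assumes that each consumer's visits are already reduced to a list of cluster labels (the integers 0 to 3 here are hypothetical), and it uses a simple level-wise search in place of whatever mining algorithm the authors used, which the abstract does not name. The toy data, the `max_len` cap, and the 50% support in the example are illustrative only; the study itself reports a minimum support of 10%.

```python
def is_subsequence(pattern, sequence):
    """True if pattern occurs in sequence in order (gaps allowed)."""
    it = iter(sequence)
    return all(item in it for item in pattern)


def frequent_patterns(sequences, min_support=0.10, max_len=3):
    """Level-wise search for label sequences whose support >= min_support.

    Support is the fraction of input sequences that contain the pattern
    as a subsequence. Only frequent patterns are extended, since any
    extension of an infrequent pattern is also infrequent.
    """
    labels = sorted({label for seq in sequences for label in seq})
    n = len(sequences)
    found = {}
    frontier = [()]
    for _ in range(max_len):
        next_frontier = []
        for prefix in frontier:
            for label in labels:
                candidate = prefix + (label,)
                support = sum(is_subsequence(candidate, s) for s in sequences) / n
                if support >= min_support:
                    found[candidate] = support
                    next_frontier.append(candidate)
        frontier = next_frontier
    return found


# Hypothetical visit sequences (one list of cluster labels per consumer).
buyers = [[0, 1, 1], [1, 1], [0, 1], [2, 1, 1]]
browsers = [[0], [0, 2], [2], [0, 0]]

buyer_patterns = frequent_patterns(buyers, min_support=0.5, max_len=2)
browser_patterns = frequent_patterns(browsers, min_support=0.5, max_len=2)

# Patterns that are frequent among buyers but not among browsers.
only_buyers = set(buyer_patterns) - set(browser_patterns)
print(sorted(only_buyers))
```

Comparing the two pattern sets with a set difference mirrors the abstract's comparison of patterns found in only one of the two groups.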

User-Perspective Issue Clustering Using Multi-Layered Two-Mode Network Analysis (다계층 이원 네트워크를 활용한 사용자 관점의 이슈 클러스터링)

  • Kim, Jieun;Kim, Namgyu;Cho, Yoonho
    • Journal of Intelligence and Information Systems / v.20 no.2 / pp.93-107 / 2014
  • In this paper, we report what we have observed with regard to user-perspective issue clustering based on multi-layered two-mode network analysis. This work is significant in the context of data collection by companies about customer needs. Most companies have failed to uncover such needs for products or services properly in terms of demographic data such as age, income levels, and purchase history. Because of excessive reliance on limited internal data, most recommendation systems do not provide decision makers with appropriate business information for current business circumstances. However, part of the problem is the increasing regulation of personal data gathering and privacy. This makes demographic or transaction data collection more difficult, and is a significant hurdle for traditional recommendation approaches because these systems demand a great deal of personal data or transaction logs. Our motivation for presenting this paper to academia is our strong belief, and evidence, that most customers' requirements for products can be effectively and efficiently analyzed from unstructured textual data such as Internet news text. In order to derive users' requirements from textual data obtained online, the proposed approach in this paper attempts to construct double two-mode networks, such as a user-news network and news-issue network, and to integrate these into one quasi-network as the input for issue clustering. One of the contributions of this research is the development of a methodology utilizing enormous amounts of unstructured textual data for user-oriented issue clustering by leveraging existing text mining and social network analysis. In order to build multi-layered two-mode networks of news logs, we need some tools such as text mining and topic analysis. We used not only SAS Enterprise Miner 12.1, which provides a text miner module and cluster module for textual data analysis, but also NetMiner 4 for network visualization and analysis. 
Our approach for user-perspective issue clustering is composed of six main phases: crawling, topic analysis, access pattern analysis, network merging, network conversion, and clustering. In the first phase, we collect visit logs for news sites using a crawler. After the unstructured news article data are gathered, the topic analysis phase extracts issues from each news article in order to build an article-issue network. For simplicity, 100 topics are extracted from 13,652 articles. In the third phase, a user-article network is constructed from access patterns derived from web transaction logs. The two two-mode networks are then merged into a user-issue quasi-network. Finally, in the user-oriented issue-clustering phase, we classify issues through structural equivalence and compare the results with clustering results from statistical tools and network analysis. An experiment with a large dataset was performed to build a multi-layered two-mode network, after which we compared the issue-clustering results from SAS with those from network analysis. The experimental dataset came from a website ranking site and from the biggest portal site in Korea. The sample dataset contains 150 million transaction logs and 13,652 news articles for 5,000 panel members over one year. User-article and article-issue networks are constructed and merged into a user-issue quasi-network using NetMiner. Our issue-clustering results, obtained with the Partitioning Around Medoids (PAM) algorithm and Multidimensional Scaling (MDS), are consistent with the results from SAS clustering. In spite of extensive efforts to provide user information with recommendation systems, most projects are successful only when companies have sufficient data about users and transactions. Our proposed methodology, user-perspective issue clustering, can provide practical support for decision-making in companies because it enriches user-related data with information from unstructured textual data.
To overcome the problem of insufficient data from traditional approaches, our methodology infers customers' real interests by utilizing web transaction logs. In addition, we suggest topic analysis and issue clustering as a practical means of issue identification.
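
The network merging phase described above can be sketched as a projection of two two-mode networks onto one. This is a minimal illustration, assuming that the user-article network stores visit counts and the article-issue network stores topic weights; the abstract does not specify the edge weights or the merge rule, so the multiplication used here, the function name `merge_two_mode`, and the sample data are all hypothetical.

```python
from collections import defaultdict


def merge_two_mode(user_article, article_issue):
    """Merge user-article and article-issue networks into a user-issue one.

    Equivalent to the matrix product (users x articles) @ (articles x issues):
    a user's tie to an issue is the sum, over the articles the user visited,
    of visit count times the article's weight on that issue.
    """
    user_issue = defaultdict(lambda: defaultdict(float))
    for user, articles in user_article.items():
        for article, visits in articles.items():
            for issue, weight in article_issue.get(article, {}).items():
                user_issue[user][issue] += visits * weight
    return {user: dict(issues) for user, issues in user_issue.items()}


# Hypothetical sample data: visit counts and topic weights.
user_article = {
    "u1": {"a1": 2, "a2": 1},
    "u2": {"a2": 3},
}
article_issue = {
    "a1": {"economy": 1.0},
    "a2": {"economy": 0.5, "sports": 0.5},
}

merged = merge_two_mode(user_article, article_issue)
print(merged)
```

The resulting user-issue weights could then serve as input to a clustering step such as PAM, which is outside the scope of this sketch.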

Development of Convertor supporting Multi-languages for Mobile Network (무선전용 다중 언어의 번역을 지원하는 변환기의 구현)

  • Choe, Ji-Won;Kim, Gi-Cheon
    • The KIPS Transactions:PartC / v.9C no.2 / pp.293-296 / 2002
  • UP Link is a commercial product that converts HTML to HDML in order to show Internet WWW content in mobile environments. When the UP browser accesses HTML pages, the agent in UP Link directs the converter to change the HTML to HDML. I-Mode, which was developed by NTT DoCoMo of Japan, offers a large amount of content through its long and stable commercial service. Micro Explorer (ME), which was developed by the Stinger project, also has many additional functions. In this paper, we designed and implemented a WAP converter that accepts C-HTML content and mHTML content. The C-HTML format used by I-Mode is a subset of HTML, and the mHTML format used by ME is similar to C-HTML, so content providers can develop C-HTML content more easily than WAP or other content. Since C-HTML, mHTML, and WML are all used in mobile environments, the limit on the transmission capacity of one page is also similar. A match table is made first, and the conversion algorithm is then applied to it. If no matching element is found, we arrange tags that are supported only by WML so that the content is displayed as well as possible. As a result, we can convert over 90% of the content.
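
A match-table conversion of the kind described above might look like the following minimal sketch. The table entries are hypothetical and are not the paper's actual table; unmatched tags are simply dropped while their text is kept, which is a simplification of the paper's fallback of arranging WML-only tags. The sketch also does not enforce WML rules such as requiring text to sit inside a paragraph element.

```python
from html.parser import HTMLParser
from xml.sax.saxutils import escape, quoteattr

# Hypothetical match table: C-HTML tag -> WML tag (not the paper's table).
MATCH_TABLE = {
    "p": "p", "br": "br", "a": "a",
    "b": "b", "i": "i", "u": "u",
    "h1": "big", "h2": "big",
}
VOID_TAGS = {"br"}  # emitted as self-closing elements in WML


class CHtmlToWml(HTMLParser):
    """Rewrites matched tags and records tags with no table entry."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []
        self.unmatched = set()

    def handle_starttag(self, tag, attrs):
        wml = MATCH_TABLE.get(tag)
        if wml is None:
            self.unmatched.add(tag)  # no match: drop the tag, keep its text
        elif wml in VOID_TAGS:
            self.out.append(f"<{wml}/>")
        elif wml == "a" and dict(attrs).get("href") is not None:
            self.out.append(f"<a href={quoteattr(dict(attrs)['href'])}>")
        else:
            self.out.append(f"<{wml}>")

    def handle_endtag(self, tag):
        wml = MATCH_TABLE.get(tag)
        if wml is not None and wml not in VOID_TAGS:
            self.out.append(f"</{wml}>")

    def handle_data(self, data):
        self.out.append(escape(data))


def convert(chtml):
    """Return (WML deck as a string, set of unmatched source tags)."""
    parser = CHtmlToWml()
    parser.feed(chtml)
    parser.close()
    body = "".join(parser.out)
    return f'<wml><card id="main">{body}</card></wml>', parser.unmatched


wml, unmatched = convert(
    '<center><h1>News</h1></center><p>Hi<br><a href="/n">more</a></p>'
)
print(wml)
print(unmatched)
```

Keeping the set of unmatched tags makes it possible to measure what share of a page's elements the table covers, in the spirit of the conversion rate reported in the abstract.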

Impact of Lambertian Cloud Top Pressure Error on Ozone Profile Retrieval Using OMI (램버시안 구름 모델의 운정기압 오차가 OMI 오존 프로파일 산출에 미치는 영향)

  • Nam, Hyeonshik;Kim, Jae Hawn;Shin, Daegeun;Baek, Kanghyun
    • Korean Journal of Remote Sensing / v.35 no.3 / pp.347-358 / 2019
  • The Lambertian cloud model is a simplified cloud model used to retrieve the vertical ozone distribution efficiently in an atmosphere where clouds exist. In the Lambertian cloud model, the optical characteristics of clouds required for radiative transfer simulation are parametrized by the Optical Centroid Cloud Pressure (OCCP) and the Effective Cloud Fraction (ECF), and the accuracy of each parameter greatly affects the accuracy of the radiation simulation. However, it is very difficult to generalize the vertical ozone error caused by the OCCP error because it varies with the radiation environment and the algorithm settings. It is also difficult to isolate the effect of the OCCP error because it is mixed with other errors that occur in the vertical ozone calculation process. This study analyzed the ozone retrieval error caused by the OCCP error using two methods. First, we simulated the impact of the OCCP error on ozone retrieval based on Optimal Estimation. Using the LIDORT radiation model, the radiation error caused by the OCCP error is calculated. To convert the radiation error into an ozone calculation error, the radiation error is substituted into the conversion equation of the optimal estimation method. The results show that when the OCCP error is 100 hPa, the total ozone is overestimated by 2.7%. Second, a case analysis is carried out to find the ozone retrieval error caused by the OCCP error. For the case analysis, the ozone retrieval error is simulated assuming an OCCP error and compared with the ozone error in cases from PROFOZ 2005-2006, an OMI ozone profile product. To define the ozone error in each case, we made an idealized assumption. Taking albedo and the horizontal variation of ozone into account so that the assumption is satisfied, 49 cases were selected. As a result, 27 of the 49 cases (about 55%) showed a correlation of 0.5 or more.
This result shows that the OCCP error has a significant influence on the accuracy of ozone profile calculation.
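
The conversion from radiation error to ozone error described above can be written out under the assumption that it follows the standard linearized optimal estimation formulation. The abstract does not state the equation, so the symbols below are the conventional ones and are not taken from the paper.

```latex
\Delta \hat{\mathbf{x}} = \mathbf{G}\,\Delta \mathbf{y},
\qquad
\mathbf{G} = \left( \mathbf{K}^{T} \mathbf{S}_{\epsilon}^{-1} \mathbf{K} + \mathbf{S}_{a}^{-1} \right)^{-1} \mathbf{K}^{T} \mathbf{S}_{\epsilon}^{-1}
```

Here $\Delta \mathbf{y}$ is the radiation error caused by the OCCP error, $\Delta \hat{\mathbf{x}}$ is the resulting error in the retrieved ozone profile, $\mathbf{K}$ is the weighting function (Jacobian) matrix, $\mathbf{S}_{\epsilon}$ is the measurement error covariance, and $\mathbf{S}_{a}$ is the a priori covariance. Under this assumption, the gain matrix $\mathbf{G}$ maps a simulated radiation error directly to a profile error.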