• Title/Summary/Keyword: Decision System


A Study on the Effect of Network Centralities on Recommendation Performance (네트워크 중심성 척도가 추천 성능에 미치는 영향에 대한 연구)

  • Lee, Dongwon
    • Journal of Intelligence and Information Systems
    • /
    • v.27 no.1
    • /
    • pp.23-46
    • /
    • 2021
  • Collaborative filtering, often used in personalized recommendation, is recognized as a very useful technique for finding similar customers and recommending products to them based on their purchase history. However, the traditional collaborative filtering technique has difficulty calculating similarity for new customers or products, because it computes similarities based on direct connections and common features among customers. For this reason, hybrid techniques were designed to use content-based filtering together with collaborative filtering. In parallel, efforts have been made to solve these problems by applying the structural characteristics of social networks. This approach calculates similarity indirectly through similar customers placed between two customers: a customer network is created from purchasing data, and the similarity between two customers is calculated from the features of the network that indirectly connects them. Such similarity can be used as a measure to predict whether the target customer will accept a recommendation. The centrality metrics of networks can be utilized to calculate these similarities. Different centrality metrics are significant in that they may affect recommendation performance differently; furthermore, the effect of these metrics on recommendation performance may vary depending on the recommender algorithm. In addition, recommendation techniques using network analysis can be expected to increase recommendation performance when applied not only to new customers or products but also to all customers or products. By treating a customer's purchase of an item as a link generated between the customer and the item on the network, predicting user acceptance of a recommendation becomes a problem of predicting whether a new link will be created between them.
As classification models fit the binary problem of whether a link is created or not, decision tree, k-nearest neighbors (KNN), logistic regression, artificial neural network, and support vector machine (SVM) were selected for the research. The data for performance evaluation were order data collected from an online shopping mall over four years and two months. The first three years and eight months of records were organized into a social network, and the remaining four months' records were used to train and evaluate the recommender models. Experiments applying the centrality metrics to each model show that the recommendation acceptance rates of the centrality metrics differ for each algorithm at a meaningful level. This work analyzed only the four most commonly used centrality metrics: degree centrality, betweenness centrality, closeness centrality, and eigenvector centrality. Eigenvector centrality recorded the lowest performance in all models except the support vector machine. Closeness centrality and betweenness centrality showed similar performance across all models. Degree centrality ranked moderately across the models, while betweenness centrality always ranked higher than degree centrality. Finally, closeness centrality was characterized by distinct performance differences according to the model: it ranked first in logistic regression, artificial neural network, and decision tree with numerically high performance, but recorded very low rankings with low performance in the support vector machine and KNN models. As the experimental results reveal, network centrality metrics over a subnetwork connecting two nodes can effectively predict the connectivity between those nodes in a social network within a classification model. Furthermore, each metric performs differently depending on the classification model type.
This result implies that choosing appropriate metrics for each algorithm can lead to higher recommendation performance. In general, betweenness centrality can guarantee a high level of performance in any model, while closeness centrality can be considered to obtain higher performance in certain models.
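The four centrality metrics the abstract compares can be computed directly with standard network-analysis tools. The sketch below is illustrative only: the toy customer network, the node names, and the use of networkx are assumptions for demonstration, not the study's actual data or code.

```python
import networkx as nx

# Toy customer network built from purchase data (edges are assumed links
# between customers with overlapping purchase histories; names are invented).
G = nx.Graph()
G.add_edges_from([("A", "B"), ("B", "C"), ("A", "C"), ("C", "D"), ("D", "E")])

# The four centrality metrics the study compares as similarity features.
degree = nx.degree_centrality(G)
betweenness = nx.betweenness_centrality(G)
closeness = nx.closeness_centrality(G)
eigenvector = nx.eigenvector_centrality(G, max_iter=1000)

# Each node's scores could then feed a classifier (decision tree, KNN,
# logistic regression, neural network, SVM) that predicts link formation.
features = {n: (degree[n], betweenness[n], closeness[n], eigenvector[n])
            for n in G.nodes}
```

Because each classifier weighs these features differently, the same network can yield different acceptance-rate rankings per algorithm, which is the effect the study measures.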

Suggestion of Urban Regeneration Type Recommendation System Based on Local Characteristics Using Text Mining (텍스트 마이닝을 활용한 지역 특성 기반 도시재생 유형 추천 시스템 제안)

  • Kim, Ikjun;Lee, Junho;Kim, Hyomin;Kang, Juyoung
    • Journal of Intelligence and Information Systems
    • /
    • v.26 no.3
    • /
    • pp.149-169
    • /
    • 2020
  • "The Urban Renewal New Deal project", one of the government's major national projects, aims to develop underdeveloped areas by investing 50 trillion won in 100 locations in the first year and 500 over the next four years. The project is drawing keen attention from the media and local governments. However, the project model fails to reflect the original characteristics of each area, as it divides project areas into only five categories: "Our Neighborhood Restoration, Housing Maintenance Support Type, General Neighborhood Type, Central Urban Type, and Economic Base Type." Judging from the keywords for successful urban regeneration in Korea ("resident participation," "regional specialization," "ministerial cooperation," and "public-private cooperation"), when local governments propose urban regeneration projects to the government, it is most important to accurately understand the characteristics of the city and push the projects forward in a way that suits those characteristics, with the help of local residents and private companies. In addition, considering gentrification, one of the side effects of urban regeneration projects, it is important to select and implement urban regeneration types suited to the characteristics of the area. In order to supplement the limitations of the "Urban Regeneration New Deal Project" methodology, this study proposes a system that recommends urban regeneration types suited to urban regeneration sites by utilizing various machine learning algorithms, referring to the urban regeneration types of the "2025 Seoul Metropolitan Government Urban Regeneration Strategy Plan" promoted based on regional characteristics. There are four types of urban regeneration in Seoul: "Low-use Low-Level Development, Abandonment, Deteriorated Housing, and Specialization of Historical and Cultural Resources" (Shon and Park, 2017).
In order to identify regional characteristics, approximately 100,000 text records were collected for 22 regions where projects of the four urban regeneration types were carried out. Using the collected data, we extracted key keywords for each region according to the type of urban regeneration and conducted topic modeling to explore whether there were differences between types. As a result, it was confirmed that many topics related to real estate and the economy appeared in old residential areas, while in declining and underdeveloped areas, topics reflected the characteristics of areas where industrial activity had been active in the past. In the case of historical and cultural resource areas, which contain traces of the past, many keywords related to government appeared, and political and cultural topics resulting from various events could be confirmed. Finally, in low-use and underdeveloped areas, many topics concerned real estate and accessibility, indicating good accessibility; these areas mainly had the characteristics of regions where development is planned or likely to occur. Furthermore, a model was implemented that proposes urban regeneration types tailored to regional characteristics for regions other than Seoul. Machine learning was used to implement the model, with training and test data randomly split at an 8:2 ratio. To compare performance across models, the input variables were prepared in two ways, Count Vector and TF-IDF Vector, and five classifiers were applied: SVM (Support Vector Machine), Decision Tree, Random Forest, Logistic Regression, and Gradient Boosting, yielding a performance comparison across a total of 10 models. The model with the highest performance was Gradient Boosting using TF-IDF Vector input data, with an accuracy of 97%.
Therefore, the recommendation system proposed in this study is expected to recommend urban regeneration types based on the regional characteristics of new project sites in the process of carrying out urban regeneration projects.
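The 2-vectorizer-by-5-classifier comparison described above can be sketched with scikit-learn. This is a minimal illustration under stated assumptions: the toy documents and labels are invented stand-ins for the regional text data, and only two of the five classifiers are shown for brevity.

```python
# Hedged sketch of the study's model-comparison setup: Count vs. TF-IDF
# vectorizers crossed with classifiers, on invented placeholder text.
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

docs = ["old housing repair estate", "factory decline industry past",
        "heritage culture history site", "transit access development plan"] * 25
labels = [0, 1, 2, 3] * 25  # four urban-regeneration types (toy encoding)

# 8:2 random split, as in the study.
X_tr, X_te, y_tr, y_te = train_test_split(
    docs, labels, test_size=0.2, random_state=42, stratify=labels)

results = {}
for vec_name, vec in [("count", CountVectorizer()), ("tfidf", TfidfVectorizer())]:
    for clf_name, clf in [("gbm", GradientBoostingClassifier()),
                          ("logreg", LogisticRegression(max_iter=1000))]:
        Xtr, Xte = vec.fit_transform(X_tr), vec.transform(X_te)
        clf.fit(Xtr, y_tr)
        results[(vec_name, clf_name)] = accuracy_score(y_te, clf.predict(Xte))
```

Extending the inner list with SVC, DecisionTreeClassifier, and RandomForestClassifier reproduces the full 10-model grid the abstract reports.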

A Study for Strategy of On-line Shopping Mall: Based on Customer Purchasing and Re-purchasing Pattern (시스템 다이내믹스 기법을 활용한 온라인 쇼핑몰의 전략에 관한 연구 : 소비자의 구매 및 재구매 행동을 중심으로)

  • Lee, Sang-Gun;Min, Suk-Ki;Kang, Min-Cheol
    • Asia pacific journal of information systems
    • /
    • v.18 no.3
    • /
    • pp.91-121
    • /
    • 2008
  • Electronic commerce, commonly known as e-commerce, has become a major business trend these days. The amount of trade conducted electronically has grown extraordinarily with the development of Internet technology. Most electronic commerce is conducted between businesses and customers; therefore, research on e-commerce has mainly sought to identify customers' needs and behaviors through statistical methods. However, such statistical research, mostly based on questionnaires, is static: it cannot capture the dynamic relationship between initial purchasing and repurchasing. Therefore, this study proposes a dynamic research model for analyzing the causes of initial purchasing and repurchasing. This paper is based on system dynamics theory, using a powerful simulation model with some restrictions. The restrictions are based on TAM (Technology Acceptance Model), PAM, and TPB (Theory of Planned Behavior). This article investigates not only customers' purchasing and repurchasing behavior over time but also their interactive effects on one another. The research model has six scenarios and three steps for analyzing customer behavior. The first step examines purchasing situations; the second examines repurchasing situations; and the third studies the relationship between initial purchasing and repurchasing. The purpose of the six scenarios is to find customers' purchasing patterns under environmental changes. We set six variables in these scenarios by (1) changing the number of products; (2) changing the number of contents in online shopping malls; (3) including multimedia files or not on the shopping mall web sites; (4) grading online communities; (5) changing the quality of products; and (6) changing the customer's degree of confidence in products.
The first three variables are applied to the study of purchasing behavior, and the others to the study of repurchasing behavior. Through the simulation study, this paper presents some interrelational results about customer purchasing behavior. For example, active community participation is not a factor that directly increases purchasing, but it does increase the word-of-mouth effect. Additionally, the higher the product quality, the more the word-of-mouth effect increases. The number of products and the number of contents on the web sites have the same influence on buying behavior. The simulations in this paper not only display the result of each scenario but also show how the factors affect one another. Hence, an electronic commerce firm can build a more realistic marketing strategy around consumer behavior through this dynamic simulation research. Moreover, the dynamic analysis method can predict results that support marketing strategy decisions by using time-line graphs. Consequently, this dynamic simulation analysis could be a useful research model for building a firm's competitive advantage. However, this simulation model needs further study. With respect to reality, it has some limitations: several factors that affect customers' buying behavior are missing from the model. The first missing factor is the customer's degree of brand recognition. The second is the degree of customer satisfaction. The third is the power of word of mouth in a specific region; generally, word of mouth significantly affects a region's culture and even people's buying behavior. The last missing factor is the user interface environment of the Internet or other online shopping tools. To obtain more realistic results, these factors should be essential considerations in future studies.
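The stock-and-flow style of system-dynamics simulation the paper uses can be illustrated with a very small loop: potential customers convert to purchasers, and word of mouth (boosted by product quality) accelerates conversion. Every parameter and equation below is an invented illustration of the general technique, not the paper's TAM/PAM/TPB-based model.

```python
# Illustrative stock-and-flow loop in the spirit of a system-dynamics model.
# All rates and strengths are hypothetical placeholders.
def simulate(periods=24, potential=1000.0, quality=0.8,
             base_rate=0.02, wom_strength=0.15):
    purchasers = 0.0
    history = []
    for _ in range(periods):
        # Word-of-mouth effect grows with the installed base and quality.
        wom = wom_strength * quality * purchasers / (potential + purchasers + 1e-9)
        flow = (base_rate + wom) * potential
        flow = min(flow, potential)  # cannot convert more people than remain
        potential -= flow
        purchasers += flow
        history.append(purchasers)
    return history

traj = simulate()
```

Varying a parameter such as `quality` across runs mirrors the paper's scenario approach: the resulting time-line trajectories can be compared to see how each factor shifts purchasing behavior.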

A Study on Case for Localization of Korean Enterprises in India (인도 진출 한국기업의 현지화에 관한 사례 연구)

  • Seo, Min-Kyo;Kim, Hee-Jun
    • International Commerce and Information Review
    • /
    • v.16 no.4
    • /
    • pp.409-437
    • /
    • 2014
  • The purpose of this study is to present specific ways to achieve successful localization by analyzing success and failure cases within the framework of strategic models of localization, based on a theoretical background. The strategic models of localization are divided by management aspect: localization of product and sourcing, localization of human resources, localization of marketing, localization of R&D, harmony with the local community, and delegation of authority between headquarters and local subsidiaries. The results of comparing and analyzing the success and failure cases of individual companies operating in India indicate that, in terms of localization of product and sourcing, the successful companies procure components locally and produce models that local consumers prefer, while the failed companies cannot meet local consumers' needs. In the case of localization of human resources, most companies recognize the importance of this area and aggressively employ superior human resources through related education. In the case of localization of marketing, the successful companies perform pre-market research and management, build effective marketing skills and an after-sales service network, select local business partners with technical skills, and carry out business activities, customer support, and complaint handling with their own organizations. In terms of localization of R&D, the successful major companies establish and operate R&D centers to develop models suited to local customers. Regarding harmony with the local community, companies that achieved successful localization understand the cultural environment and contribute to the community through CSR. In the aspect of delegation of authority between headquarters and local subsidiaries, most Korean companies are very weak: decisions tend to be made by the head office rather than by local subsidiaries. The implication of this thesis is that Korean enterprises in India should carry forward the localization of products and components; foster local human resources who understand the company's management and systems and take part in voluntary market strategy decisions; pursue wholly owned subsidiaries; establish and operate R&D centers; understand the local culture and system; fulfill corporate social responsibility; and grant autonomy in management.


Increasing Accuracy of Classifying Useful Reviews by Removing Neutral Terms (중립도 기반 선택적 단어 제거를 통한 유용 리뷰 분류 정확도 향상 방안)

  • Lee, Minsik;Lee, Hong Joo
    • Journal of Intelligence and Information Systems
    • /
    • v.22 no.3
    • /
    • pp.129-142
    • /
    • 2016
  • Customer product reviews have become one of the important factors in purchase decision making. Customers believe that reviews written by others who have already experienced the product offer more reliable information than that provided by sellers. However, because there are so many products and reviews, the advantage of e-commerce can be overwhelmed by increasing search costs. Reading all of the reviews to find out the pros and cons of a certain product can be exhausting. To help users find the most useful information about products without much difficulty, e-commerce companies provide various ways for customers to write and rate product reviews. To assist potential customers, online stores have devised various ways to provide useful customer reviews, and different methods have been developed to classify and recommend useful reviews, primarily using feedback provided by customers about the helpfulness of reviews. Most shopping websites provide customer reviews and offer the following information: the average preference for a product, the number of customers who have participated in preference voting, and the preference distribution. Most information on the helpfulness of product reviews is collected through a voting system. Amazon.com asks customers whether a review of a certain product is helpful, and it places the most helpful favorable review and the most helpful critical review at the top of the list of product reviews. Some companies also predict the usefulness of a review based on attributes including its length, author(s), and the words used, publishing only reviews that are likely to be useful. Text mining approaches have been used to classify useful reviews in advance. To apply a text mining approach based on all reviews for a product, we need to build a term-document matrix: we extract all words from the reviews and build a matrix of the number of occurrences of each term in each review.
Since there are many reviews, the term-document matrix becomes very large, which makes it difficult to apply text mining algorithms. Thus, researchers delete some terms based on sparsity, since sparse words have little effect on classification or prediction. The purpose of this study is to suggest a better way of building the term-document matrix by deleting useless terms for review classification. In this study, we propose a neutrality index to select words to be deleted. Many words appear in both classes, useful and not useful, and these words have little or even negative effect on classification performance. Thus, we defined such words as neutral terms and deleted those that appear similarly in both classes. After deleting sparse words, we selected words to be deleted in terms of neutrality. We tested our approach with Amazon.com review data from five product categories: Cellphones & Accessories, Movies & TV, Automotive, CDs & Vinyl, and Clothing, Shoes & Jewelry. We used reviews that received more than four votes, with a 60% ratio of useful votes among total votes as the threshold for classifying useful and not-useful reviews. We randomly selected 1,500 useful reviews and 1,500 not-useful reviews for each product category, then applied Information Gain and Support Vector Machine algorithms to classify the reviews and compared classification performance in terms of precision, recall, and F-measure. Though performance varies by product category and data set, deleting terms by sparsity and neutrality showed the best F-measure for both classification algorithms. However, deleting terms by sparsity only showed the best recall for Information Gain, and using all terms showed the best precision for SVM.
Thus, care is needed when selecting term-deletion methods and classification algorithms for given data sets.
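The idea of a neutrality index can be sketched as follows. The paper's exact formula is not reproduced here; the index below (one minus the normalized class-frequency difference) and the toy reviews are illustrative assumptions showing the general technique of dropping class-balanced terms.

```python
# Hedged sketch of neutrality-based term removal on invented toy reviews.
from collections import Counter

useful = ["great battery works great", "battery life great"]
not_useful = ["battery bad", "bad bad screen"]

def term_freqs(docs):
    # Relative frequency of each term within one class.
    c = Counter()
    for d in docs:
        c.update(d.split())
    total = sum(c.values())
    return {t: n / total for t, n in c.items()}

fu, fn = term_freqs(useful), term_freqs(not_useful)
vocab = set(fu) | set(fn)

def neutrality(t):
    pu, pn = fu.get(t, 0.0), fn.get(t, 0.0)
    # 1.0 = the term appears with equal relative frequency in both classes.
    return 1.0 - abs(pu - pn) / (pu + pn)

# Drop terms whose class distribution is near-balanced (highly neutral).
kept = {t for t in vocab if neutrality(t) < 0.8}
```

Here "battery" appears at similar rates in both classes and is removed, while class-distinctive words such as "great" and "bad" survive; the surviving vocabulary then defines the columns of a smaller term-document matrix.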

Comparison of the Science Education Curriculum and the Environmental Education Curriculum for Promoting Environmental Education (환경교육 활성화를 위한 과학과 교육과정과 환경과 교육과정 비교)

  • Yoon, Jin-A;Nam, Younkyeong
    • Journal of the Korean earth science society
    • /
    • v.41 no.2
    • /
    • pp.155-175
    • /
    • 2020
  • The purpose of this study is to analyze the 2015 revised Science Curriculum and Environment Curriculum and to provide implications regarding the correlation and complementarity between the two standards. For the analysis, the content systems of the two standards were reorganized based on four categories drawn from previous studies: knowledge, attitude, inquiry, and participation and practice, which are common literacy elements of science and the environment. Content analysis was performed on content elements and detailed performance expectations. The study found differences in core competencies and goals: the Environment Curriculum emphasizes sustainable development and social participation, while the Science Curriculum emphasizes scientific inquiry and literacy. The results of the content system analysis according to the four literacy factors are as follows. First, in terms of attitude, both standards deal with research ethics, but the Environment Curriculum values learners' views on the environment more, whereas the Science Curriculum emphasizes scientific attitudes as investigators. Second, there was a serious problem in the knowledge linkage between the two standards: within the same grade groups, the level of content knowledge dealt with in the two standards was not consistent. Third, in the inquiry aspect, the Environment Curriculum deals with interdisciplinary topics in a purposefully designed inquiry unit, whereas the Science Curriculum presents various research activities based on related science concepts in every unit. Fourth, in the participation and practice aspect, the Environment Curriculum focuses on participation and practice, while the Science Curriculum focuses on sustainable development and improvement of science and technology, scientific interest, and decision-making ability.
This study provides implications for education for sustainable development (ESD) by identifying the complementary potential between the Science Curriculum and the Environment Curriculum.

Predicting Regional Soybean Yield using Crop Growth Simulation Model (작물 생육 모델을 이용한 지역단위 콩 수량 예측)

  • Ban, Ho-Young;Choi, Doug-Hwan;Ahn, Joong-Bae;Lee, Byun-Woo
    • Korean Journal of Remote Sensing
    • /
    • v.33 no.5_2
    • /
    • pp.699-708
    • /
    • 2017
  • The present study aimed to develop an approach for predicting soybean yield using a crop growth simulation model at the regional level, where detailed, site-specific information on cultivation management practices is not easily accessible for model input. The CROPGRO-Soybean model included in the Decision Support System for Agrotechnology Transfer (DSSAT) was employed, and Illinois, a major soybean production region of the USA, was selected as the study region. As a first step, genetic coefficients representative of each soybean maturity group (MG I-VI) were estimated through sowing-date experiments using domestic and foreign cultivars of diverse maturity at Seoul National University Farm (37.27°N, 126.99°E) over two years. The model using the representative genetic coefficients simulated the developmental stages of cultivars within each maturity group fairly well. Soybean yields for 10 km × 10 km grids in Illinois were simulated from 2000 to 2011 with weather data under 18 simulation conditions, combining three maturity groups, three seeding dates, and two irrigation regimes. Planting dates and maturity groups were assigned differently to three sub-regions divided longitudinally. The yearly state yields estimated by averaging all grid yields simulated under non-irrigated and fully irrigated conditions differed substantially from the statistical yields and did not explain the annual trend of yield increase due to improved cultivation technologies. Using the grain yield data of nine agricultural districts in Illinois, observed and estimated from the grid yields simulated under the 18 conditions, a multiple regression model was constructed to estimate soybean yield at the agricultural district level. A year variable was also added to this model to reflect the yearly yield trend.
This model explained the yearly and district yield variation fairly well, with a determination coefficient of R² = 0.61 (n = 108). Yearly state yields, calculated by weighting the model-estimated yearly average district yields by the cultivation area of each district, corresponded very closely (R² = 0.80) to the yearly statistical state yields. Furthermore, the model predicted the state yield fairly well for 2012, a year whose data were not used in model construction and in which severe yield reduction was recorded due to drought.
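The regression step described above can be sketched generically: regress observed yields on simulated yields plus a year term that absorbs the technology trend. The numbers below are synthetic placeholders, not the Illinois data, and the design (intercept, simulated yield, year) is an assumption about the general form of such a model.

```python
# Minimal sketch: observed yield ~ simulated yield + year trend, fit by
# ordinary least squares on synthetic data.
import numpy as np

rng = np.random.default_rng(0)
years = np.arange(2000, 2012, dtype=float)
simulated = 2.5 + 0.1 * rng.standard_normal(years.size)   # model output (t/ha)
observed = (0.8 * simulated + 0.04 * (years - 2000) + 2.0
            + 0.05 * rng.standard_normal(years.size))

# Design matrix: intercept, simulated yield, year trend.
X = np.column_stack([np.ones_like(years), simulated, years - 2000])
coef, *_ = np.linalg.lstsq(X, observed, rcond=None)
pred = X @ coef
r2 = 1 - np.sum((observed - pred) ** 2) / np.sum((observed - observed.mean()) ** 2)
```

The fitted year coefficient plays the role of the paper's trend term, letting the regression track yield gains that the crop model alone cannot explain.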

Reproducibility of Adenosine Tc-99m sestaMIBI SPECT for the Diagnosis of Coronary Artery Disease (관동맥질환의 진단을 위한 아데노신 Tc-99m sestaMIBI SPECT의 재현성)

  • Lee, Duk-Young;Bae, Jin-Ho;Lee, Sang-Woo;Chun, Kyung-Ah;Yoo, Jeong-Soo;Ahn, Byeong-Cheol;Ha, Jeoung-Hee;Chae, Shung-Chull;Lee, Kyu-Bo;Lee, Jae-Tae
    • The Korean Journal of Nuclear Medicine
    • /
    • v.39 no.6
    • /
    • pp.473-480
    • /
    • 2005
  • Purpose: Adenosine myocardial perfusion SPECT has proven useful in detecting coronary artery disease, in following up the success of various therapeutic regimens, and in assessing the prognosis of coronary artery disease. The purpose of this study was to define the reproducibility of myocardial perfusion SPECT using adenosine stress testing between two consecutive Tc-99m sestaMIBI (MIBI) SPECT studies in the same subjects. Methods: Thirty patients with suspected coronary artery disease in stable condition underwent sequential Tc-99m MIBI SPECT studies using intravenous adenosine. The gamma camera, acquisition, and processing protocols used for the two tests were identical, and no invasive procedures were performed between the tests. The mean interval between the two tests was 4.1 days (range: 2-11 days). The left ventricular wall was divided into 18 segments, and the degree of myocardial tracer uptake was graded with a four-point scoring system by visual analysis. Images were interpreted by two independent nuclear medicine physicians, and a consensus was reached for the final decision when segmental scores disagreed. Results: Hemodynamic responses to adenosine did not differ between the two consecutive studies. There were no serious side effects requiring the adenosine infusion to be stopped, and the side-effect profiles did not differ. When myocardial uptake was divided into normal and abnormal, 481 of 540 segments were concordant (agreement rate 89%, kappa index 0.74). With the four-grade scoring system, exact agreement was 81.3% (439 of 540 segments, tau b = 0.73). One- and two-grade differences were observed in 97 segments (18%) and 4 segments (0.7%), respectively, but a three-grade difference was not observed in any segment. Extent and severity scores did not differ between the two studies.
The extent and severity scores of the perfusion defect showed excellent positive correlation between the two tests (r = 0.982 and 0.965 for percentage extent and severity score, respectively; p < 0.001). Conclusion: Hemodynamic responses and side-effect profiles did not differ between two consecutive adenosine stress tests in the same subjects. Adenosine Tc-99m sestaMIBI SPECT is highly reproducible and could be used to assess temporal changes in myocardial perfusion in individual patients.
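The agreement-rate and kappa statistics used above can be computed in a few lines. The segment labels below are invented toy data (0 = normal uptake, 1 = abnormal), not the study's readings; they only show how the two reproducibility measures are derived from paired calls.

```python
# Illustrative agreement and Cohen's kappa for two repeated reads of the
# same segments, with invented toy labels.
from sklearn.metrics import cohen_kappa_score

test1 = [0] * 40 + [1] * 10 + [0] * 3 + [1] * 2   # first study's segment calls
test2 = [0] * 40 + [1] * 10 + [1] * 3 + [0] * 2   # second study's (5 disagree)

agreement = sum(a == b for a, b in zip(test1, test2)) / len(test1)
kappa = cohen_kappa_score(test1, test2)
```

Kappa discounts the agreement expected by chance, which is why a raw agreement near 90% can correspond to a kappa in the 0.7 range, as in the study's binary analysis.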

The Application of Operations Research to Librarianship : Some Research Directions (운영연구(OR)의 도서관응용 -그 몇가지 잠재적응용분야에 대하여-)

  • Choi Sung Jin
    • Journal of the Korean Society for Library and Information Science
    • /
    • v.4
    • /
    • pp.43-71
    • /
    • 1975
  • Operations research has developed rapidly since its origins in World War II. Practitioners of O.R. have contributed to almost every aspect of government and business. More recently, a number of operations researchers have turned their attention to library and information systems, and the author believes that significant research has resulted. It is the purpose of this essay to introduce the library audience to some of these accomplishments, to present some of the author's hypotheses on the subject of library management to which he believes O.R. has great potential, and to suggest some future research directions. Some problem areas in librarianship where O.R. may play a part are discussed and summarized below. (1) Library location. It is usually necessary to strike a balance between accessibility and cost in location problems. Many mathematical methods are available for identifying optimal locations once the balance between these two criteria has been decided. The major difficulties lie in relating cost to size and in taking future change into account when discriminating among possible solutions. (2) Planning new facilities. Standard approaches to using mathematical models for simple investment decisions are well established. If the problem is one of choosing the most economical way of achieving a certain objective, one may compare the alternatives using one of the discounted cash flow techniques. In other situations it may be necessary to use a cost-benefit approach. (3) Allocating library resources. In order to allocate resources to best advantage, the librarian needs to know how the effectiveness of the services he offers depends on the way he deploys his resources. The O.R. approach is to construct a model representing effectiveness as a mathematical function of the levels of different inputs (e.g., numbers of people in different jobs, acquisitions of different types, physical resources). (4) Long term planning.
Resource allocation problems are generally concerned with up to one and a half years ahead. The longer term certainly offers both greater freedom of action and greater uncertainty, so it is difficult to generalize about long term planning problems. In other fields, however, O.R. has made a significant contribution to long range planning, and it is likely to make one in librarianship as well. (5) Public relations. It is generally accepted that actual and potential users are too ignorant both of the range of library services provided and of how to make use of them. How should services be brought to the attention of potential users? The answer seems to lie in obtaining empirical evidence from controlled experiments in which a group of libraries participate. (6) Acquisition policy. In comparing alternative policies for the acquisition of materials, one needs to know, first, the implications of each policy for the services which depend on the stock, and second, the relative importance to be ascribed to each service for each class of user. By settling the first, formal models will allow the librarian to concentrate his attention upon the value judgements necessary for the second. (7) Loan policy. The approach to choosing between loan policies is much the same as the previous approach. (8) Manpower planning. For large library systems one should consider constructing models which permit comparison of the skills necessary in the future with predictions of the skills that will be available, so as to allow informed decisions. (9) Management information systems for libraries. A great deal of data is available in libraries as a by-product of all recording activities. It is particularly tempting, when procedures are computerized, to make summary statistics available as a management information system. The value of information to particular decisions that may have to be taken in the future is best assessed in terms of a model of the relevant problem. (10) Management gaming.
One of the most common uses of a management game is as a means of developing staff's ability to take decisions. The value of such exercises depends upon the validity of the computerized model. If the model were sufficiently simple to take the form of a mathematical equation, decision-makers would probably be able to learn adequately from a graph. More complex situations require simulation models. (11) Diagnostic tools. Libraries are sufficiently complex systems that it would be useful to have available simple means of telling whether performance can be regarded as satisfactory, means which, if it cannot, would also provide pointers to what is wrong. (12) Data banks. It would appear to be worth considering establishing a bank for certain types of data. If certain items on questionnaires were to take a standard form, a greater pool of data would be available for various analyses. (13) Effectiveness measures. The meaning of a library performance measure is not readily interpreted. Each measure must itself be assessed in relation to the corresponding measures for earlier periods of time and to a standard that may be a corresponding measure in another library, the 'norm', the 'best practice', or user expectations.
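The discounted cash flow comparison mentioned under point (2) can be sketched in a few lines. This is an illustrative example only, not from the essay: the cash flows, five-period horizon, and 8% discount rate are all invented.

```python
# Hypothetical sketch of comparing two ways of achieving the same
# objective with a discounted cash flow (net present value) test.
# All figures below are invented for illustration.

def npv(rate, cashflows):
    # cashflows[0] occurs now; each later entry is discounted one more period
    return sum(cf / (1.0 + rate) ** i for i, cf in enumerate(cashflows))

# Option A: high up-front cost, low running cost; Option B: the reverse.
option_a = [-10000.0] + [-500.0] * 5
option_b = [-4000.0] + [-2000.0] * 5

rate = 0.08  # assumed discount rate
# The option with the higher (less negative) NPV is the more economical one.
cheaper = "A" if npv(rate, option_a) > npv(rate, option_b) else "B"
```

With these made-up figures, discounting favors option B, whose costs fall later; at a different discount rate the ranking could reverse, which is exactly the trade-off the technique is meant to expose.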

Development of Market Growth Pattern Map Based on Growth Model and Self-organizing Map Algorithm: Focusing on ICT products (자기조직화 지도를 활용한 성장모형 기반의 시장 성장패턴 지도 구축: ICT제품을 중심으로)

  • Park, Do-Hyung;Chung, Jaekwon;Chung, Yeo Jin;Lee, Dongwon
    • Journal of Intelligence and Information Systems
    • /
    • v.20 no.4
    • /
    • pp.1-23
    • /
    • 2014
  • Market forecasting aims to estimate the sales volume of a product or service that will be sold to consumers over a specific selling period. From the perspective of the enterprise, accurate market forecasting assists in determining the timing of new product introduction and product design, and in establishing production plans and marketing strategies, enabling a more efficient decision-making process. Moreover, accurate market forecasting enables governments to organize national budgets efficiently. This study aims to generate market growth curves for ICT (information and communication technology) goods using past time series data; to categorize products showing similar growth patterns; to understand markets in the industry; and to forecast the future outlook of such products. The study suggests a useful and meaningful process (or methodology) for identifying market growth patterns with a quantitative growth model and a data mining algorithm. The study employs the following methodology. At the first stage, past time series data are collected for the target products or services of the categorized industry. The data, such as the volume of sales and domestic consumption for a specific product or service, are collected from the relevant government ministry, the National Statistical Office, and other relevant government organizations. For collected data that cannot be analyzed as they stand, owing to a lack of past data or the alteration of code names, pre-processing work should be performed. At the second stage of the process, an optimal model for market forecasting should be selected. The choice of model can vary with the characteristics of each categorized industry. As this study is focused on the ICT industry, in which new technologies appear frequently and change the market structure, the Logistic, Gompertz, and Bass models are selected. A hybrid model that combines different models can also be considered. 
The hybrid model considered for use in this study estimates the size of the market potential through the Logistic and Gompertz models, and those figures are then used in the Bass model. The third stage of the process is to evaluate which model most accurately explains the data. To do this, the parameters are estimated from the collected past time series data, the models' predicted values are generated, and the root-mean-squared error (RMSE) is calculated. The model that shows the lowest average RMSE value across every product type is taken as the best model. At the fourth stage, based on the parameter values estimated by the best model, a market growth pattern map is constructed with a self-organizing map algorithm. A self-organizing map is trained with the market pattern parameters for all products or services as input data, and the products or services are organized onto an N×N map. The number of clusters is increased from 2 to M, depending on the characteristics of the nodes on the map. The clusters are divided into zones, and the clusters that provide the most meaningful explanation are selected. Based on the final selection of clusters, the boundaries between the nodes are set and, ultimately, the market growth pattern map is completed. The last step is to determine the final characteristics of the clusters as well as the market growth curve. The average of the market growth pattern parameters in each cluster is taken as a representative figure; using it, a growth curve is drawn for each cluster and its characteristics are analyzed. Also, taking into consideration the product types in each cluster, their characteristics can be described qualitatively. We expect that the process and system this paper suggests can be used as a tool for forecasting demand in the ICT and other industries.
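The model-selection stage (fit each candidate growth model, compute its RMSE, keep the lowest) can be sketched as follows. This is not the authors' code: the sales series is synthetic, and the "fit" is a coarse grid search over assumed parameter values rather than the estimation procedure used in the study.

```python
import math

# Illustrative sketch of the model-selection stage: fit each candidate
# growth model to a sales series, compute the RMSE of its predictions,
# and keep the model with the lowest error. The data and parameter grid
# below are invented for illustration.

def logistic(t, m, a, b):
    # m: market potential; a, b: shape/timing parameters
    return m / (1.0 + a * math.exp(-b * t))

def gompertz(t, m, a, b):
    return m * math.exp(-a * math.exp(-b * t))

def rmse(ys, yhats):
    return math.sqrt(sum((y - yh) ** 2 for y, yh in zip(ys, yhats)) / len(ys))

ts = list(range(12))
observed = [logistic(t, 100.0, 20.0, 0.8) for t in ts]  # synthetic "sales"

def fit(model):
    # Grid-search (m, a, b) and return the lowest RMSE found.
    best = float("inf")
    for m in (80.0, 100.0, 120.0):
        for a in (10.0, 20.0, 30.0):
            for b in (0.4, 0.8, 1.2):
                err = rmse(observed, [model(t, m, a, b) for t in ts])
                best = min(best, err)
    return best

scores = {"Logistic": fit(logistic), "Gompertz": fit(gompertz)}
best_model = min(scores, key=scores.get)  # lowest RMSE wins
```

Here the Logistic model wins trivially because the synthetic series was generated from it; with real data, the RMSE averaged over all product types would decide, as the abstract describes, and the winning model's parameters would then feed the self-organizing map.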