• Title/Summary/Keyword: process mining


Reformability evaluation of blasting-enhanced permeability in in situ leaching mining of low-permeability sandstone-type uranium deposits

  • Wei Wang;Xuanyu Liang;Qinghe Niu;Qizhi Wang;Jinyi Zhuo;Xuebin Su;Genmao Zhou;Lixin Zhao;Wei Yuan;Jiangfang Chang;Yongxiang Zheng;Jienan Pan;Zhenzhi Wang;Zhongmin Ji
    • Nuclear Engineering and Technology
    • /
    • v.55 no.8
    • /
    • pp.2773-2784
    • /
    • 2023
  • It is essential to evaluate the feasibility of blasting-enhanced permeability (BEP) for low-permeability sandstone-type uranium deposits. In this work, the mineral composition, reservoir physical properties, and rock mechanical properties of samples from sandstone-type uranium deposits were first measured. The reformability evaluation method was then established using the analytic hierarchy process-entropy weight method (AHP-EWM) and fuzzy mathematics. Finally, the evaluation results were verified by split Hopkinson pressure bar (SHPB) experiments and permeability tests. The results show that medium sandstone, argillaceous sandstone, and siltstone exhibit excellent reformability, followed by coarse sandstone and fine sandstone, while the reformability of sandy mudstone is poor and unsuitable for BEP reservoir stimulation. The permeability improvement and the distribution of damage fractures before and after the SHPB experiment confirm the validity of the evaluation results. This research provides a reformability evaluation method for BEP in low-permeability sandstone-type uranium deposits, which aids selection of the appropriate region and stratigraphic horizon for BEP and for enhanced in situ leaching (ISL) of such deposits.
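As an illustration of the entropy-weight half of AHP-EWM (the AHP pairwise-comparison and fuzzy-evaluation steps are omitted), objective criterion weights can be derived from a decision matrix as follows; the function name and the toy sample values are invented, not the paper's measurements:

```python
import numpy as np

def entropy_weights(X):
    """Entropy weight method (EWM): criteria whose values are more
    dispersed across alternatives receive larger objective weights."""
    X = np.asarray(X, dtype=float)
    m, _ = X.shape
    P = X / X.sum(axis=0)            # column-normalize to proportions p_ij
    k = 1.0 / np.log(m)
    logs = np.where(P > 0, np.log(np.where(P > 0, P, 1.0)), 0.0)
    e = -k * (P * logs).sum(axis=0)  # entropy of each criterion
    d = 1.0 - e                      # degree of divergence
    return d / d.sum()               # normalized weights

# Toy decision matrix: 4 rock samples x 3 criteria
X = [[0.9, 10, 5],
     [0.8, 12, 5],
     [0.1, 11, 5],
     [0.5,  9, 5]]
w = entropy_weights(X)
```

A criterion that is identical for all samples (the third column) carries no discriminating information, so its entropy is 1 and its weight collapses to zero.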

A Study on Market Size Estimation Method by Product Group Using Word2Vec Algorithm (Word2Vec을 활용한 제품군별 시장규모 추정 방법에 관한 연구)

  • Jung, Ye Lim;Kim, Ji Hui;Yoo, Hyoung Sun
    • Journal of Intelligence and Information Systems
    • /
    • v.26 no.1
    • /
    • pp.1-21
    • /
    • 2020
  • With the rapid development of artificial intelligence technology, various techniques have been developed to extract meaningful information from unstructured text data, which constitutes a large portion of big data. Over the past decades, text mining technologies have been applied practically in various industries. In the field of business intelligence, text mining has been employed to discover new market and technology opportunities and to support rational decision making by business participants. Market information such as market size, market growth rate, and market share is essential for setting a company's business strategies, and there has been continuous demand across fields for market information at the specific-product level. However, such information has generally been provided at the industry level or in broad categories based on classification standards, making specific, relevant information difficult to obtain. In this regard, we propose a new methodology that can estimate the market sizes of product groups at more detailed levels than previously offered. We applied the Word2Vec algorithm, a neural-network-based semantic word embedding model, to enable automatic market size estimation from individual companies' product information in a bottom-up manner. The overall process is as follows. First, data related to product information are collected, refined, and restructured into a form suitable for the Word2Vec model. Next, the preprocessed data are embedded into a vector space by Word2Vec, and product groups are derived by extracting similar product names based on cosine similarity. Finally, the sales of the extracted products are summed to estimate the market size of each product group. As experimental data, product names from Statistics Korea's microdata (345,103 cases) were mapped into a multidimensional vector space by Word2Vec training.
We optimized the training parameters and applied a vector dimension of 300 and a window size of 15 in further experiments. We employed index words of the Korean Standard Industry Classification (KSIC) as the product-name dataset to cluster product groups more efficiently. Product names similar to KSIC indexes were extracted based on cosine similarity, and the market size of each extracted product category was calculated from individual companies' sales data. The market sizes of 11,654 specific product lines were estimated automatically by the proposed model. For performance verification, the results were compared with the actual market sizes of some items; the Pearson correlation coefficient was 0.513. Our approach has several advantages over previous studies. First, text mining and machine learning techniques were applied to market size estimation for the first time, overcoming the limitations of traditional methods that rely on sampling or multiple assumptions. In addition, the level of market category can be adjusted easily and efficiently according to the purpose of the information by changing the cosine-similarity threshold. Furthermore, the approach has high potential for practical application, since it can resolve unmet needs for detailed market-size information in the public and private sectors: it can be used in technology evaluation and technology commercialization support programs run by governmental institutions, as well as in business strategy consulting and market analysis reports published by private firms. A limitation of our study is that the presented model needs to be improved in accuracy and reliability. The semantic word-embedding module could be advanced by imposing a proper order on the preprocessed dataset or by combining another measure, such as Jaccard similarity, with Word2Vec.
The product-group clustering could also be replaced with another type of unsupervised machine learning algorithm. Our group is currently working on follow-up studies, and we expect them to further improve the performance of the basic model proposed conceptually in this study.
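The grouping-and-summation step of the bottom-up pipeline described above can be sketched as follows; the toy vectors stand in for trained Word2Vec embeddings (dimension 300 in the paper), and all product names, sales figures, and the 0.9 threshold are illustrative assumptions:

```python
import numpy as np

# Toy "trained" embeddings standing in for Word2Vec output.
vecs = {
    "led lamp":   np.array([0.90, 0.10, 0.00]),
    "led bulb":   np.array([0.85, 0.20, 0.05]),
    "steel pipe": np.array([0.05, 0.90, 0.30]),
}
# Hypothetical per-product sales from company microdata.
sales = {"led lamp": 120.0, "led bulb": 80.0, "steel pipe": 200.0}

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def market_size(index_term, threshold=0.9):
    """Group products whose name vector is within the cosine-similarity
    threshold of the index term, then sum their sales."""
    q = vecs[index_term]
    group = [p for p, v in vecs.items() if cosine(q, v) >= threshold]
    return group, sum(sales[p] for p in group)

group, size = market_size("led lamp")
```

Raising or lowering `threshold` widens or narrows the product group, which is the knob the abstract describes for adjusting the level of market category.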

Efficient Topic Modeling by Mapping Global and Local Topics (전역 토픽의 지역 매핑을 통한 효율적 토픽 모델링 방안)

  • Choi, Hochang;Kim, Namgyu
    • Journal of Intelligence and Information Systems
    • /
    • v.23 no.3
    • /
    • pp.69-94
    • /
    • 2017
  • Recently, growing demand for big data analysis has driven the vigorous development of related technologies and tools. At the same time, advances in IT and the increasing penetration of smart devices are producing large amounts of data. As a result, data analysis technology is rapidly becoming popular, attempts to acquire insights through data analysis continue to increase, and big data analysis will only become more important across industries for the foreseeable future. Big data analysis has generally been performed by a small number of experts and delivered to those who request it. However, growing interest in big data analysis has spurred computer programming education and the development of many analysis tools. Accordingly, the entry barriers are gradually lowering and data analysis technology is spreading, so big data analysis is increasingly expected to be performed by the demanders themselves. Alongside this, interest in various kinds of unstructured data, especially text, is continually growing. The emergence of new web platforms and techniques has brought mass production of text data and active attempts to analyze it, and the results of text analysis are utilized in many fields. Text mining is a concept that embraces the various theories and techniques for text analysis; among them, topic modeling is one of the most widely used and studied. Topic modeling extracts the major issues from a large set of documents, identifies the documents corresponding to each issue, and provides the identified documents as clusters. It is evaluated as very useful because it reflects the semantic elements of documents.
Traditional topic modeling is based on the distribution of key terms across the entire corpus, so the entire corpus must be analyzed at once to identify each document's topic. This makes analysis slow when topic modeling is applied to many documents and creates a scalability problem: processing time grows sharply with the number of analysis objects. The problem is particularly noticeable when documents are distributed across multiple systems or regions. To overcome it, a divide-and-conquer approach can be applied to topic modeling: a large document set is divided into sub-units, and topics are derived by running topic modeling repeatedly on each unit. This allows topic modeling over many documents with limited system resources and improves processing speed. It can also significantly reduce analysis time and cost, since documents can be analyzed in each location without first being combined. Despite these advantages, the approach has two major problems. First, the relationship between the local topics derived from each unit and the global topics derived from the entire corpus is unclear: local topics can be identified in each unit, but global topics cannot. Second, a method for measuring the accuracy of such a methodology must be established; that is, taking the global topics as the ideal answer, the deviation of the local topics from them needs to be measured. Because of these difficulties, this approach has been studied far less than other topic modeling methods. In this paper, we propose a topic modeling approach that solves both problems.
First, we divide the entire document cluster (the global set) into sub-clusters (local sets) and generate a reduced global set (RGS) consisting of delegate documents extracted from each local set. We address the first problem by mapping RGS topics to local topics. We then verify the accuracy of the proposed methodology by checking whether documents are assigned to the same topic in the global and local results. Using 24,000 news articles, we conduct experiments to evaluate the practical applicability of the proposed methodology. An additional experiment confirms that the proposed methodology can provide results similar to topic modeling over the entire corpus, and we propose a reasonable method for comparing the results of the two approaches.
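The local-to-global topic mapping at the core of the proposal can be sketched roughly as follows, using LDA over a shared vocabulary; the tiny corpus, the two-way split, and the mapping-by-cosine step are a simplified stand-in (assumed, not the paper's RGS construction or data):

```python
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

docs = ["stock market price trading", "market price stock invest",
        "soccer goal match player", "player match soccer league"]
local_sets = [docs[:2], docs[2:]]     # hypothetical split across two nodes

vec = CountVectorizer().fit(docs)     # shared vocabulary keeps topic
                                      # vectors comparable across sets

def topic_term(texts, k):
    """Fit LDA and return normalized topic-term distributions."""
    lda = LatentDirichletAllocation(n_components=k, random_state=0)
    lda.fit(vec.transform(texts))
    t = lda.components_
    return t / t.sum(axis=1, keepdims=True)

global_topics = topic_term(docs, k=2)                     # whole corpus
local_topics = np.vstack([topic_term(s, k=1) for s in local_sets])

def cos(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Map each local topic to its most similar global topic.
mapping = [int(np.argmax([cos(l, g) for g in global_topics]))
           for l in local_topics]
```

In the divide-and-conquer setting, only the small topic-term vectors need to travel between nodes for this mapping, not the documents themselves.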

Geology of Athabasca Oil Sands in Canada (캐나다 아사바스카 오일샌드 지질특성)

  • Kwon, Yi-Kwon
    • The Korean Journal of Petroleum Geology
    • /
    • v.14 no.1
    • /
    • pp.1-11
    • /
    • 2008
  • As conventional oil and gas reservoirs become depleted, interest in oil sands has increased rapidly in the last decade. Oil sands are a mixture of bitumen, water, and host sediments of sand and clay; most oil sand is unconsolidated sand held together by bitumen. Bitumen has an in situ hydrocarbon viscosity of >10,000 centipoise (cP) at reservoir conditions and an API gravity between 8 and 14°. The largest oil sand deposits are in Alberta and Saskatchewan, Canada, with reserves estimated at 1.7 trillion barrels of initial oil-in-place and 173 billion barrels of remaining established reserves. Alberta's oil sands deposits are grouped into three development areas - Athabasca, Cold Lake, and Peace River - with the largest current bitumen production from Athabasca. The principal deposits are the McMurray Formation and Wabiskaw Member in the Athabasca area, the Gething and Bluesky formations in the Peace River area, and the relatively thin multi-reservoir deposits of the McMurray, Clearwater, and Grand Rapid formations in the Cold Lake area. The reservoir sediments were deposited in the foreland basin (Western Canada Sedimentary Basin) formed by collision between the Pacific and North American plates and the subsequent thrusting movements in the Mesozoic. The deposits are underlain by basement rocks of Paleozoic carbonates with highly variable topography and were formed during the Early Cretaceous transgression that occurred along the Cretaceous Interior Seaway of North America. The oil-sands-hosting McMurray and Wabiskaw deposits in the Athabasca area consist of lower fluvial and upper estuarine-offshore sediments, reflecting the broad overall transgression, and are characterized by facies heterogeneity of channelized reservoir sands and non-reservoir muds.
The main reservoir bodies of the McMurray Formation are fluvial and estuarine channel-point bar complexes interbedded with fine-grained deposits formed in floodplain, tidal flat, and estuarine bay settings. The Wabiskaw deposits (the basal member of the Clearwater Formation) commonly comprise sheet-shaped offshore muds and sands but occasionally incise deeply into the McMurray deposits, forming channelized reservoir sand bodies. In Canada, bitumen is produced from oil sands deposits by surface mining or by in situ thermal recovery processes. Bitumen sands recovered by surface mining are converted into synthetic crude oil through extraction and upgrading, whereas bitumen produced by in situ thermal recovery is transported to the refinery after only a blending process. In situ thermal recovery is represented by Steam-Assisted Gravity Drainage and Cyclic Steam Stimulation, both of which inject steam into bitumen sand reservoirs to raise the in situ temperature and bitumen mobility. In oil sands reservoirs, the efficiency of steam propagation is controlled mainly by reservoir geology. Accordingly, understanding the geological factors and characteristics of oil sands reservoir deposits is a prerequisite for well-designed development planning and effective bitumen production. As significant geological factors and characteristics of oil sands reservoir deposits, this study suggests (1) pay and connectivity of bitumen sands, (2) bitumen content and saturation, (3) geologic structure, (4) distribution of mud baffles and plugs, (5) thickness and lateral continuity of mud interbeds, (6) distribution of water-saturated sands, (7) distribution of gas-saturated sands, (8) direction of lateral accretion of point bars, (9) distribution of diagenetic layers and nodules, and (10) texture and fabric changes within reservoir sand bodies.


The Efficiency Analysis of CRM System in the Hotel Industry Using DEA (DEA를 이용한 호텔 관광 서비스 업계의 CRM 도입 효율성 분석)

  • Kim, Tai-Young;Seol, Kyung-Jin;Kwak, Young-Dai
    • Journal of Intelligence and Information Systems
    • /
    • v.17 no.1
    • /
    • pp.91-110
    • /
    • 2011
  • This paper analyzes cases in which hotels have expanded their services and improved their work processes through IT solutions to cope with computerization and globalization, and cases in which domestic hotels use CRM solutions internally to respond effectively to customer requests, deepen customer analysis, and build marketing strategies. In particular, this study evaluates the efficiency of CRM adoption for sales and marketing services using DEA (Data Envelopment Analysis). First, the relative efficiency of L Company's sites was compared with the CCR model; then the effectiveness of L Company's restaurants and facilities was compared through the BCC model. L Company concluded that it is important to create and manage precisely the sales data that serve as the raw input for CRM, so it arranged for sales data generated by its POS systems to be saved to a database for each sales performance. To that end, it newly established an Oracle POS system and a LORIS POS system covering rooms as well as food-and-beverage outlets, making it possible to generate and manage sales data stably. Moreover, it set up a composite database that comprehensively controls work-process results over a given period by collecting customer registration information, enabling systematic control of sales performance information. By unifying the databases and managing them comprehensively, data integrity was greatly enhanced and the problem of asymmetric data was thoroughly resolved. Using the data accumulated in the comprehensive database, sales data can be analyzed, categorized, and classified through the data mining engine embedded in Polaris CRM, and the results can be organized in a data mart and provided as CRM application data.
By transforming raw sales data into easy-to-handle forms and saving them separately in the data mart, well-organized data could be obtained easily for various marketing operations, morning meetings, and decision making. Using the summarized data in the data mart, marketing operations such as telemarketing, direct mailing, internet marketing, and service product development for identified customers could be carried out; moreover, information on customer perceptions - one of CRM's end products - could be fed back into the comprehensive database. This research was undertaken to find out how effectively CRM has been employed by comparing and analyzing the management performance of each enterprise site and store after CRM was introduced, using the DEA technique. According to the results, efficiency for each site was calculated from input and output factors to compare CRM system usage efficiency across L Company's four sites; across stores, workforce size and budget allocation differ greatly, and so does each store's efficiency. Using the CCR model for each site, the DEA technique could assess which sites have comparatively high efficiency in IT project outcomes such as CRM introduction. Using the BCC model, the outcome of CRM usage at each store of site A, which is representative of L Company, could be evaluated comparatively, identifying which stores use CRM efficiently and which do not. In short, the study analyzed the cases of CRM introduction at L Company, a hotel enterprise, and evaluated them precisely through DEA.
By introducing CRM, L Company built a customer analysis system and managed to provide customers identified through the analysis data with one-to-one tailored services. It could also plan differentiated services for returning customers by assessing the customer discernment rate. As future work, research is required on process analysis that can lead to concrete outcomes, such as increased sales volumes, by carrying out test marketing and target marketing with CRM. Research is also needed on efficiency evaluation with respect to linkages between the CRM system and other IT solutions such as ERP.
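The CCR model at the heart of the site comparison can be sketched as a small linear program; this is a generic input-oriented CCR envelopment formulation, not the paper's actual data or software, and the site inputs/outputs below are invented:

```python
import numpy as np
from scipy.optimize import linprog

def ccr_efficiency(X, Y, o):
    """Input-oriented CCR envelopment model for DMU `o`:
    minimize theta  s.t.  X^T lam <= theta * x_o,  Y^T lam >= y_o,  lam >= 0.
    An efficiency of 1.0 means the unit lies on the efficient frontier."""
    X, Y = np.asarray(X, float), np.asarray(Y, float)
    n = X.shape[0]                        # number of DMUs (sites)
    c = np.r_[1.0, np.zeros(n)]           # variables: [theta, lam_1..lam_n]
    A_in = np.c_[-X[o], X.T]              # X^T lam - theta * x_o <= 0
    A_out = np.c_[np.zeros(Y.shape[1]), -Y.T]   # -Y^T lam <= -y_o
    res = linprog(c,
                  A_ub=np.vstack([A_in, A_out]),
                  b_ub=np.r_[np.zeros(X.shape[1]), -Y[o]],
                  bounds=[(None, None)] + [(0, None)] * n)
    return res.fun

# Toy data: 3 sites, one input (staff) and one output (sales)
X = [[2.0], [4.0], [3.0]]
Y = [[2.0], [2.0], [3.0]]
eff = [ccr_efficiency(X, Y, o) for o in range(3)]
```

Here the second site produces the same output as the first with twice the input, so its CCR score comes out at 0.5 while the other two sit on the frontier.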

Proposals on How to Research Iron Manufacture Relics (제철유적 조사연구법 시론)

  • Kim, Kwon Il
    • Korean Journal of Heritage: History & Science
    • /
    • v.43 no.3
    • /
    • pp.144-179
    • /
    • 2010
  • Investigation of iron manufacture relics has been active since the 1970s and accelerated across the country in the 1990s. Recent consideration of the importance of production-site relics has drawn attention to iron manufacture relics. Methodological studies of their investigation, however, have lagged behind those of tombs, dwellings, or swampy-place relics, because the iron manufacture process is too complicated to understand easily and requires professional knowledge of metallurgical engineering. Recognizing these problems, this research sets out how to excavate, rearrange, classify, and examine iron manufacture relics, based on an understanding of the nature of iron, the iron production process, and the metallurgical features of related relics such as slag and iron lumps. The research classifies iron manufacture relics into seven types according to the production process - mining, smelting, refining, tempering, melting, steelmaking, and others - and then arranges survey methods for each stage of field study, trial digging, and excavation. It also explains how to classify and examine excavated relics, which fields of natural science can reveal their features, and what efforts have been made to reconstruct furnaces and what their problems were, making full use of examples, drawings, and photographs. Despite the lack of in-depth discussion on applying and developing various investigation methods, it concludes that iron manufacture relics can be classified by production process, that the natural sciences should be applied alongside archaeological knowledge to understand relics comprehensively, and that efforts to reconstruct furnaces should continue from the perspective of experimental archaeology.

UX Methodology Study by Data Analysis: Focusing on Deriving Personas through Customer Segment Classification (데이터 분석을 통한 UX 방법론 연구 고객 세그먼트 분류를 통한 페르소나 도출을 중심으로)

  • Lee, Seul-Yi;Park, Do-Hyung
    • Journal of Intelligence and Information Systems
    • /
    • v.27 no.1
    • /
    • pp.151-176
    • /
    • 2021
  • As the information technology industry develops, various kinds of data are being created, and processing them for industrial use is now essential. Analyzing and utilizing the various digital data collected online and offline is necessary to provide an appropriate experience for customers. To create new businesses, products, and services, customer data collected in various ways must be used to deeply understand potential customers' needs and to analyze behavior patterns for hidden signals of desire. However, data analysis and UX methodology, which should be conducted in parallel for effective service development, are in practice conducted separately, and examples of their combined use in industry are lacking. In this work, we construct a single process by combining data analysis methods with UX methodologies. This study is valuable in that it applies methodologies that are actively used in practice, making the results highly likely to be used. We conducted a survey to identify and cluster the associations between factors, establish a customer classification, and select target customers. The research methods are as follows. First, we conduct factor and regression analyses to determine the associations between factors in a happiness survey. Respondents are grouped according to the survey results, identifying the relationships among 34 questions covering psychological stability, family life, relational satisfaction, health, economic satisfaction, work satisfaction, daily-life satisfaction, and residential-environment satisfaction. Second, we form clusters based on the factors affecting happiness and extract the optimal number of clusters; based on the results, we cross-analyze the characteristics of each cluster. Third, for service definition, we analyze correlations with keywords related to happiness.
We leverage keyword analysis of the thumb trend to derive ideas based on the interest in, and associations of, each keyword. We also collected approximately 11,000 news articles based on the top three keywords most related to happiness, derived issues between keywords through text mining analysis in SAS, and used them in defining services after ideation. Fourth, based on the characteristics identified through data analysis, we selected segmentation and targeting appropriate for service discovery: the factor characteristics were grouped into four segments, profiles were drawn up, and the main target customers were selected. Fifth, based on the main target customers' characteristics, interviewees were selected and in-depth interviews were conducted to discover causes of happiness, causes of unhappiness, and needs for services. Sixth, we derive customer behavior patterns from the segment results and detailed interviews and specify the objectives associated with each characteristic. Seventh, a typical persona from the qualitative survey and a persona built from the data were both produced, and the two were compared to analyze their characteristics, pros, and cons. Existing market segmentation classifies customers by purchasing factors, while UX methodology measures users' behavior variables to establish criteria and redefine user classes. Applying these segment classification methods to the UX process of producing user classifications and personas can yield more accurate customer classification schemes. The significance of this study is twofold. First, the idea of using data to create a variety of services was linked to the UX methodology used to plan IT services. Second, user classification is further enhanced by applying segment analysis methods that are not currently well used in UX methodologies.
To provide a consistent experience in creating a single service, from large decisions to small, customers with common goals must be defined. To this end, personas must be derived and various stakeholders persuaded. Under these circumstances, designing a consistent experience from beginning to end through fast, concrete user descriptions is a very effective way to produce a successful service.
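The second step described above - extracting the optimal number of clusters - is commonly done with a silhouette criterion; a minimal sketch, assuming k-means on numeric survey scores (the synthetic data and the k-means/silhouette choice are assumptions, not the paper's exact procedure):

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

rng = np.random.default_rng(0)
# Hypothetical survey scores (e.g. psychological stability vs. economic
# satisfaction) with three planted respondent groups.
X = np.vstack([rng.normal(loc, 0.3, size=(30, 2))
               for loc in ([0, 0], [3, 0], [0, 3])])

# Pick the cluster count with the best silhouette score, then each
# segment's mean answers become the basis for a data-driven persona.
scores = {}
for k in range(2, 6):
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X)
    scores[k] = silhouette_score(X, labels)
best_k = max(scores, key=scores.get)
```

With clearly separated groups, the silhouette peaks at the planted number of segments, which is then used for profiling and target selection.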

Case Study on Revising the Curriculum of an Industrial High School through Analysis of Manufacturing Workforce Demand, Focused on Chungnam Province in Korea (지역 기반 산업의 인력 수요 분석을 통한 공업 계열 특성화 고등학교의 교육과정 개편 사례 연구)

  • Yi, Sangbong;Choi, Jiyeon
    • 대한공업교육학회지
    • /
    • v.38 no.1
    • /
    • pp.221-238
    • /
    • 2013
  • The purpose of this study was to revise and reorganize the departments of ○○ Industrial High School through an analysis of manufacturing status and workforce demand in Chungnam Province, focused on the Geumsan area. The school's status and actual conditions were identified through interviews, a literature review, and data analysis. Surveys of the school's teachers, parents, and students were conducted to investigate awareness of the renaming and reorganization of school departments and the revision of the school curriculum. Statistical data were collected and analyzed to characterize the manufacturing industry and its workforce demand in Chungnam Province. The findings were as follows. Small and medium manufacturing enterprises have developed considerably in the Geumsan area. Four major industries - (1) automobile parts, (2) electronic and information equipment, (3) cutting-edge culture, and (4) agriculture-livestock and bio - are intensively fostered as regional strategic industries in Chungnam Province. Manufacturing accounts for 33.6 percent of total employment in the Geumsan area, and the service, mining, and manufacturing sectors together account for 80.0 percent, so industrial workforce demand in the area is expected to come from the manufacturing and service-mining sectors. For the school curriculum revision, the following is recommended: (1) a focus on mechanical control in revising the applied computer machinery department, (2) a focus on automated electrical equipment in revising the electric control department, and (3) a focus on food process control in revising the bio-food industry department.
It would help the school establish its identity as an industrial specialized high school - an institution of vocational education at the secondary level - by supplying a qualified workforce to the manufacturing industry in Chungnam Province.

A Study on the CRM Strategy for Small and Medium-Sized Distributors (중소유통업체의 CRM 도입방안에 관한 연구)

  • Kim, Gi-Pyoung
    • Journal of Distribution Science
    • /
    • v.8 no.3
    • /
    • pp.37-47
    • /
    • 2010
  • CRM refers to the operating activities that maintain and promote good relationships with customers to ultimately maximize the company's profits: understanding the value of customers to meet their demands, establishing a strategy that maximizes customer lifetime value, and operating the business by integrating the customer management processes. In Korea, many big businesses are introducing CRM proactively as part of their marketing strategy; however, most small and medium-sized companies do not understand CRM clearly or find it difficult to introduce because of the large investment required. This study presents a CRM promotion strategy and activity plan fit for small and medium-sized companies by analyzing the success factors of leading companies that have already executed CRM, so that distributors - an industry in close contact with consumers - can overcome their weakness in scale and strengthen their competitiveness in a rapidly changing and fiercely competitive market. There are five stages in building CRM: recognizing the need for CRM, establishing an integrated CRM database, establishing a customer-analysis and marketing strategy through data mining, putting that customer analysis to practical use, and implementing response analysis and a closed-loop process. The case studies of leading companies show that CRM is needed in businesses that constantly contact their customers. To meet customers' needs, these companies assertively analyze their customer information and develop CRM programs personalized for their customers to provide high-quality services. For profitable customers, a VIP marketing strategy is conducted to keep them from breaking off their relationships with the company. CRM should be executed through continuous management.
In other words, profitability per customer should be maximized through customer segmentation; this maximization is the key to CRM. The success factors of CRM for distributors in Korea are as follows. First, top management's commitment to customer satisfaction management is needed. Second, a company-wide culture of respecting customers should be created. Third, specialized customer-management and CRM staff should be trained. Fourth, CRM behaviors should be developed for all staff members. Fifth, CRM should be carried out through systematic cooperation between related departments. To make use of these case studies, a company should understand its customers, establish customer-management programs, set an optimal CRM strategy, and pursue it continuously according to a long-term plan. Customers should be segmented according to the collected information and customer data, and a responsive customer system should be designed with strategies differentiated by customer class. Looking ahead, integrated CRM - in which customer information is gathered in one place - is essential. As customer expectations rise sharply, effective ways of meeting them must be pursued. With the rapid improvement of IT, RFID (Radio Frequency Identification) has appeared, through which information about products and customers can be obtained massively in real time. A strategy for successful CRM promotion should improve the organizations in charge of customer contact, re-plan the customer-management processes, and establish a system integrated with the marketing strategy to maintain good customer relations according to a long-term plan, with methods suited to market conditions, run as a company-wide program.
In addition, the CRM program should be continuously improved and complemented to fit the company's characteristics. In particular, a successful CRM strategy for small and medium-sized distributors should be as follows. First, they should change their existing perception of CRM and care for customers in depth. Second, they should benchmark CRM techniques from leading companies and identify success points to adopt. Third, they should seek the methods best suited to their particular conditions by combining their own strengths with marketing ideas. Fourth, a CRM model should be developed that promotes relationships with individual customers, as in the precedents of small businesses in Switzerland, through small but noticeable events.


Optimization of Support Vector Machines for Financial Forecasting (재무예측을 위한 Support Vector Machine의 최적화)

  • Kim, Kyoung-Jae;Ahn, Hyun-Chul
    • Journal of Intelligence and Information Systems
    • /
    • v.17 no.4
    • /
    • pp.241-254
    • /
    • 2011
  • Financial time-series forecasting is one of the most important issues in finance because it is essential to the risk management of financial institutions. Researchers have therefore tried to forecast financial time series using various data mining techniques such as regression, artificial neural networks, decision trees, and k-nearest neighbor. Recently, support vector machines (SVMs) have been widely applied in this research area because they do not require huge amounts of training data and have a low risk of overfitting. However, a user must determine several design factors heuristically in order to use an SVM: the major ones are the selection of an appropriate kernel function with its parameters and proper feature subset selection. Beyond these factors, proper selection of an instance subset may also improve the forecasting performance of an SVM by eliminating irrelevant or distorting training instances. Nonetheless, few studies have applied instance selection to SVMs, especially in the domain of stock market prediction. Instance selection chooses a proper subset of instances from the original training data; it may be considered a method of knowledge refinement that maintains the instance base. This study proposes a novel instance selection algorithm for SVMs, which uses a genetic algorithm (GA) to optimize the instance selection process and the kernel parameters simultaneously. We call this model ISVM (SVM with Instance Selection). Experiments on stock market data are carried out with ISVM, in which the GA searches for optimal or near-optimal kernel parameter values and relevant instances for the SVM. The GA chromosome accordingly contains two sets of codes: one for the kernel parameters and one for instance selection.
As the controlling parameters of the GA search, the population size is set to 50 organisms, the crossover rate to 0.7, and the mutation rate to 0.1; as the stopping condition, 50 generations are permitted. The application data consist of technical indicators and the direction of change in the daily Korea Composite Stock Price Index (KOSPI), for a total of 2,218 trading days. The whole data set is separated into training, test, and hold-out subsets of 1,056, 581, and 581 samples, respectively. This study compares ISVM with several comparative models: logistic regression (Logit), backpropagation neural networks (ANN), nearest neighbor (1-NN), a conventional SVM (SVM), and an SVM whose kernel parameters are optimized by the genetic algorithm (PSVM). The experimental results show that ISVM outperforms 1-NN by 15.32%, ANN by 6.89%, Logit and SVM by 5.34%, and PSVM by 4.82% on the hold-out data, and ISVM uses only 556 of the 1,056 original training instances to produce this result. In addition, a two-sample test for proportions is used to examine whether ISVM significantly outperforms the comparative models: ISVM outperforms ANN and 1-NN at the 1% statistical significance level, and performs better than Logit, SVM, and PSVM at the 5% level.
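The GA-based simultaneous optimization of kernel parameters and instance selection described in this abstract can be sketched as follows. This is a minimal illustrative reimplementation on toy data, not the authors' code: the chromosome encoding, parameter bounds, helper names (`decode`, `fitness`), and the smaller population/generation counts (versus the paper's 50 organisms and 50 generations) are all assumptions made for brevity.

```python
# Minimal sketch of GA-based instance selection + kernel-parameter
# optimization for an SVM (an "ISVM"-style model). Illustrative only.
import numpy as np
from sklearn.svm import SVC
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Toy stand-in for the KOSPI technical-indicator data set.
X, y = make_classification(n_samples=400, n_features=10, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

N = len(X_tr)
POP, GENS, CX, MUT = 20, 10, 0.7, 0.1  # smaller than the paper's settings, for speed

def decode(chrom):
    # Genes 0-1 encode the RBF kernel parameters C and gamma (log scale);
    # the remaining N genes, thresholded at 0.5, form the instance mask.
    C = 10 ** (chrom[0] * 4 - 2)        # C in [1e-2, 1e2]
    gamma = 10 ** (chrom[1] * 4 - 3)    # gamma in [1e-3, 1e1]
    mask = chrom[2:] > 0.5
    return C, gamma, mask

def fitness(chrom):
    C, gamma, mask = decode(chrom)
    if mask.sum() < 10:                 # guard: too few instances selected
        return 0.0
    clf = SVC(C=C, gamma=gamma).fit(X_tr[mask], y_tr[mask])
    return clf.score(X_te, y_te)        # test accuracy as the fitness value

pop = rng.random((POP, 2 + N))
for gen in range(GENS):
    fit = np.array([fitness(c) for c in pop])
    pop = pop[np.argsort(fit)[::-1]]    # rank population by fitness
    children = []
    while len(children) < POP - 2:      # elitism: carry the top 2 forward
        p1, p2 = pop[rng.integers(0, POP // 2, 2)]
        child = np.where(rng.random(2 + N) < 0.5, p1, p2) if rng.random() < CX else p1.copy()
        flip = rng.random(2 + N) < MUT  # mutation: resample flagged genes
        child[flip] = rng.random(flip.sum())
        children.append(child)
    pop = np.vstack([pop[:2], children])

best = max(pop, key=fitness)
C, gamma, mask = decode(best)
print(f"selected {mask.sum()} of {N} training instances, test acc = {fitness(best):.3f}")
```

Note that the fitness of each chromosome is evaluated by training the SVM only on the masked instances, so instance selection and kernel-parameter tuning compete and co-adapt within one search, which is the core idea of the paper's ISVM design.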