Search | Korea Science

Knowledge Extraction Methodology and Framework from Wikipedia Articles for Construction of Knowledge-Base (지식베이스 구축을 위한 한국어 위키피디아의 학습 기반 지식추출 방법론 및 플랫폼 연구)

Kim, JaeHun;Lee, Myungjin
- Journal of Intelligence and Information Systems, v.25 no.1, pp.43-61, 2019
The development of artificial intelligence technologies has accelerated with the Fourth Industrial Revolution, and AI research has been actively conducted in fields such as autonomous vehicles, natural language processing, and robotics. Since the 1950s, this research has focused on solving cognitive problems related to human intelligence, such as learning and problem solving. Owing to recent interest in the technology and research on various algorithms, the field has achieved more technological advances than ever. The knowledge-based system is a sub-domain of artificial intelligence that aims to enable AI agents to make decisions using machine-readable and processable knowledge constructed from complex, informal human knowledge and rules in various fields. A knowledge base is used to optimize information collection, organization, and retrieval, and it has recently been combined with statistical artificial intelligence such as machine learning. More recently, knowledge bases have also been used to express, publish, and share knowledge on the web by describing and connecting web resources such as pages and data. These knowledge bases support intelligent processing in various fields of artificial intelligence, such as the question answering systems of smart speakers. However, building a useful knowledge base is time-consuming and still requires considerable expert effort. In recent years, much research and technology in knowledge-based artificial intelligence has used DBpedia, one of the largest knowledge bases, which aims to extract structured content from Wikipedia. DBpedia contains various information extracted from Wikipedia, such as titles, categories, and links, but its most useful knowledge comes from Wikipedia infoboxes, which present user-created summaries of some unifying aspect of an article.
This knowledge is created by the mapping rules between infobox structures and the DBpedia ontology schema defined in the DBpedia Extraction Framework. Because it generates knowledge from semi-structured infobox data created by users, DBpedia can expect high reliability in terms of accuracy. However, since only about 50% of all wiki pages in Korean Wikipedia contain an infobox, DBpedia is limited in terms of knowledge scalability. This paper proposes a method to extract knowledge from text documents according to the ontology schema using machine learning. To demonstrate the appropriateness of this method, we describe a knowledge extraction model that follows the DBpedia ontology schema and is trained on Wikipedia infoboxes. Our knowledge extraction model consists of three steps: classifying documents into ontology classes, classifying the sentences suitable for triple extraction, and selecting values and transforming them into RDF triples. The structure of Wikipedia infoboxes is defined by infobox templates, which provide standardized information across related articles, and the DBpedia ontology schema can be mapped to these infobox templates. Based on these mapping relations, we classify the input document according to infobox categories, which correspond to ontology classes. After determining the class of the input document, we classify the appropriate sentences according to the attributes belonging to that class. Finally, we extract knowledge from the sentences classified as appropriate and convert it into triples. To train the models, we generated a training data set from a Wikipedia dump by adding BIO tags to sentences, and we trained models for about 200 classes and about 2,500 relations. Furthermore, we conducted comparative experiments with CRF and Bi-LSTM-CRF for the knowledge extraction process.
Through this proposed process, it is possible to utilize structured knowledge by extracting knowledge according to the ontology schema from text documents. In addition, this methodology can significantly reduce the effort of the experts to construct instances according to the ontology schema.
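The BIO-tagging and triple-conversion steps described in this abstract can be illustrated with a minimal Python sketch. It assumes whitespace tokenization and exact matching between an infobox value and the sentence; the function names (`bio_tag`, `tags_to_triple`), the example sentence, and the `birthPlace` relation label are illustrative assumptions and are not taken from the paper, whose models are CRF and Bi-LSTM-CRF taggers trained on labels of this kind.

```python
from typing import List, Optional, Tuple


def bio_tag(tokens: List[str], value_tokens: List[str], relation: str) -> List[str]:
    """Tag the first occurrence of value_tokens with B-/I- labels; everything else is O."""
    tags = ["O"] * len(tokens)
    n = len(value_tokens)
    if n == 0:
        return tags
    for start in range(len(tokens) - n + 1):
        if tokens[start:start + n] == value_tokens:
            tags[start] = f"B-{relation}"
            for i in range(start + 1, start + n):
                tags[i] = f"I-{relation}"
            break  # only the first match is labeled in this sketch
    return tags


def tags_to_triple(subject: str, tokens: List[str],
                   tags: List[str]) -> Optional[Tuple[str, str, str]]:
    """Collect the first tagged span and return (subject, relation, value)."""
    relation: Optional[str] = None
    value: List[str] = []
    for tok, tag in zip(tokens, tags):
        if tag.startswith("B-") and relation is None:
            relation, value = tag[2:], [tok]
        elif tag.startswith("I-") and relation == tag[2:]:
            value.append(tok)
        elif relation is not None:
            break  # the span has ended
    if relation is None:
        return None
    return (subject, relation, " ".join(value))


# Hypothetical example: an infobox lists the value "Hanseong" for a birth-place attribute.
tokens = "Yi Sun-sin was born in Hanseong in 1545".split()
tags = bio_tag(tokens, ["Hanseong"], "birthPlace")
triple = tags_to_triple("Yi Sun-sin", tokens, tags)
print(list(zip(tokens, tags)))
print(triple)
```

In a trained pipeline the tags would come from the sequence model rather than from string matching; matching against infobox values is only how training labels could be produced under these assumptions.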
https://doi.org/10.13088/jiis.2019.25.1.043

A Study on the Determinants of Blockchain-oriented Supply Chain Management (SCM) Services (블록체인 기반 공급사슬관리 서비스 활용의 결정요인 연구)

Kwon, Youngsig;Ahn, Hyunchul
- Knowledge Management Research, v.22 no.2, pp.119-144, 2021
As market competition evolves from competition among companies to competition among their supply chains, companies are striving to enhance their supply chain management (hereinafter SCM). In particular, as blockchain technology, with its various technical advantages, is combined with SCM, many domestic manufacturing and distribution companies are now considering the adoption of blockchain-oriented SCM (BOSCM) services. Thus, examining the factors that affect the use of blockchain-oriented SCM is an important academic topic. However, most prior studies on blockchain and SCM have designed their research models based on the Technology Acceptance Model (TAM) or the Unified Theory of Acceptance and Use of Technology (UTAUT), which are suited to explaining individuals' acceptance of information technology rather than companies'. Against this background, this study presents a novel acceptance model for blockchain-oriented SCM based on the Technology-Organization-Environment (TOE) framework, which takes companies as the unit of analysis. In addition, the Value-based Adoption Model (VAM) is applied to the research model in order to consider comprehensively the benefits and sacrifices caused by a new information system. To validate the proposed research model, survey responses were collected from 126 companies, and the model was verified by applying PLS-SEM (Partial Least Squares Structural Equation Modeling) to data from 122 of them. As a result, 'business innovation', 'tracking and tracing', 'security enhancement', and 'cost' from the technology viewpoint are found to significantly affect 'perceived value', which in turn affects 'intention to use blockchain-oriented SCM'. Also, 'organization readiness' is found to affect 'intention to use' with statistical significance. However, 'complexity' and 'regulation environment' are found to have little impact on 'perceived value' and 'intention to use', respectively.
It is expected that the findings of this study contribute to preparing practical and policy alternatives for facilitating blockchain-oriented SCM adoption in Korean firms.
https://doi.org/10.15813/kmr.2021.22.2.007

Factors Influencing Leisure Satisfaction Among Elderly with Economic Burden and Health Problems: Focusing on Leisure Activities (경제적 부담과 건강 문제를 겪는 노인들의 여가만족 요인에 관한 연구: 여가활동을 중심으로)

Hong, Seokho
- 한국노년학 (Journal of the Korea Gerontological Society), v.40 no.1, pp.197-216, 2020
This study aimed to suggest leisure activities and policy-level support suited to the characteristics and needs of the elderly by examining the factors that constrain their leisure activities. Data on 3,887 elderly people aged 65 and above with economic burden and health problems, drawn from the 6th Korean Retirement and Income Study, were used for the statistical analyses. Hierarchical linear models were tested by entering factors stepwise: demographic factors (age, gender, marital status, single household, region, living expenses, health status) in the first step, leisure factors (leisure time, leisure motivation) in the second step, and leisure activity factors (desired leisure activities, undesired leisure activities) in the third step. The results were as follows. First, the major factors constraining the leisure activities of the elderly were financial burden and health problems. Second, there were significant differences among the three groups (financial constraint, health constraint, and financial-and-health constraint). The financial constraint group had the highest level of leisure satisfaction but the shortest leisure time. The major reason the financial constraint group engaged in leisure activities was to maintain relationships with family and friends. In terms of desired leisure activities, the health constraint group wanted resting, the financial constraint group wanted hobbies and entertainment, and the financial-and-health constraint group wanted social activities. Third, the financial constraint group showed higher levels of leisure activity satisfaction when they wanted to take care of pets or gardens; however, they showed lower levels when they wanted domestic trips as their desired leisure activity. The health constraint group showed lower levels of leisure activity satisfaction whether or not they wanted resting activities such as watching TV or listening to the radio.
They also showed higher levels of leisure activity satisfaction when they wanted social activities such as participation in religious or social gathering organizations. The financial-and-health constraint group showed lower levels of leisure activity satisfaction when they wanted walking, watching TV, or domestic trips as desired leisure activities, but higher levels when they wanted entertainment such as playing Go or chess, hobbies such as hiking, and social activities. Based on the study results, practice- and policy-level suggestions were made for offering leisure activities that meet the needs of the elderly.
https://doi.org/10.31888/JKGS.2020.40.1.197

Geomagnetic Paleosecular Variation in the Korean Peninsula during the First Six Centuries (기원후 600년간 한반도 지구 자기장 고영년변화)

Park, Jong kyu;Park, Yong-Hee
- The Journal of Engineering Geology, v.32 no.4, pp.611-625, 2022
One application of geomagnetic paleosecular variation (PSV) is the age dating of archeological remains (i.e., the archeomagnetic dating technique). This application requires a local PSV model that reflects non-dipole fields with regional differences. Until now, the tentative Korean paleosecular variation (t-KPSV), calculated from JPSV (SW Japanese PSV), has been applied as a reference curve for individual archeomagnetic directions in Korea. However, it is less reliable because of regional differences in the non-dipole magnetic field. Here, we present PSV curves for AD 1 to 600, corresponding to the Korean Three Kingdoms (including the Proto Three Kingdoms) Period, using the results of archeomagnetic studies in the Korean Peninsula and published research data. We then compare our PSV with global geomagnetic prediction models and t-KPSV. A total of 49 reliable archeomagnetic directional data from 16 regions were compiled for our PSV. In detail, each datum showed statistical consistency (N > 6, 𝛼₉₅ < 7.8°, and k > 57.8) and had a radiocarbon or archeological age in the range of AD 1 to 600 with an error range of less than ±200 years. The compiled PSV for the first six centuries (KPSV0.6k) showed declination and inclination in the ranges of 341.7° to 20.1° and 43.5° to 60.3°, respectively. Compared to the t-KPSV, our curve revealed different variation patterns in both declination and inclination. On the other hand, KPSV0.6k and the global geomagnetic prediction models (ARCH3K.1, CALS3K.4, and SED3K.1) revealed consistent variation trends during the first six centuries. In particular, ARCH3K.1 showed the best fit with our KPSV0.6k. These results indicate that the contribution of the non-dipole field in Korea and Japan is quite different, despite their geographical proximity. Moreover, compiling archeomagnetic data from Korean territory is essential to building a reliable PSV curve as an age dating tool.
Lastly, we double-check the reliability of our KPSV0.6k by showing a good fitting of newly acquired age-controlled archeomagnetic data on our curve.
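The selection criteria quoted in this abstract (N > 6, α₉₅ < 7.8°, k > 57.8) refer to standard Fisher statistics for a set of directions. The following Python sketch shows how such values are commonly computed from declination/inclination pairs; the sample directions are invented for illustration, and the function names (`fisher_mean`, `passes_criteria`) are assumptions rather than anything from the paper.

```python
import math
from typing import List, Tuple


def fisher_mean(directions: List[Tuple[float, float]]) -> Tuple[float, float, float, float]:
    """Return (mean declination, mean inclination, k, alpha95) in degrees.

    directions holds (declination, inclination) pairs in degrees.
    Requires at least two directions that are not all identical.
    """
    n = len(directions)
    x = y = z = 0.0
    for dec, inc in directions:
        d, i = math.radians(dec), math.radians(inc)
        x += math.cos(i) * math.cos(d)  # north component
        y += math.cos(i) * math.sin(d)  # east component
        z += math.sin(i)                # down component
    r = math.sqrt(x * x + y * y + z * z)  # resultant vector length
    mean_dec = math.degrees(math.atan2(y, x)) % 360.0
    mean_inc = math.degrees(math.asin(z / r))
    k = (n - 1) / (n - r)  # Fisher precision parameter estimate
    cos_a95 = 1.0 - (n - r) / r * ((1.0 / 0.05) ** (1.0 / (n - 1)) - 1.0)
    alpha95 = math.degrees(math.acos(max(-1.0, min(1.0, cos_a95))))
    return mean_dec, mean_inc, k, alpha95


def passes_criteria(n: int, k: float, alpha95: float) -> bool:
    """Thresholds as quoted in the abstract: N > 6, alpha95 < 7.8 degrees, k > 57.8."""
    return n > 6 and alpha95 < 7.8 and k > 57.8


# Invented specimen directions for one hypothetical site.
site = [(0, 50), (2, 51), (358, 49), (1, 50), (359, 50), (0, 52), (3, 49)]
mean_dec, mean_inc, k, a95 = fisher_mean(site)
print(round(mean_dec, 1), round(mean_inc, 1), passes_criteria(len(site), k, a95))
```

A tightly clustered set such as this one yields a large k and a small α₉₅, which is what the thresholds are meant to select for.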
https://doi.org/10.9720/kseg.2022.4.611

Analysis and Forecast of Venture Capital Investment on Generative AI Startups: Focusing on the U.S. and South Korea (생성 AI 스타트업에 대한 벤처투자 분석과 예측: 미국과 한국을 중심으로)

Lee, Seungah;Jung, Taehyun
- Asia-Pacific Journal of Business Venturing and Entrepreneurship, v.18 no.4, pp.21-35, 2023
Expectations surrounding generative AI technology and its profound ramifications are sweeping across various industrial domains. Given the anticipated pivotal role of the startup ecosystem in the utilization and advancement of generative AI technology, it is imperative to cultivate a deeper comprehension of the present state and distinctive attributes characterizing venture capital (VC) investments within this domain. The current investigation delves into South Korea's landscape of VC investment deals and prognosticates the projected VC investments by juxtaposing these against the United States, the frontrunner in the generative AI industry and its associated ecosystem. For analytical purposes, a compilation of 286 investment deals originating from 117 U.S. generative AI startups spanning the period from 2008 to 2023, as well as 144 investment deals from 42 South Korean generative AI startups covering the years 2011 to 2023, was amassed to construct new datasets. The outcomes of this endeavor reveal an upward trajectory in the count of VC investment deals within both the U.S. and South Korea during recent years. Predominantly, these deals have been concentrated within the early-stage investment realm. Noteworthy disparities between the two nations have also come to light. Specifically, in the U.S., in contrast to South Korea, the quantum of recent VC deals has escalated, marking an augmentation ranging from 285% to 488% in the corresponding developmental stage. While the interval between disparate investment stages demonstrated a slight elongation in South Korea relative to the U.S., this discrepancy did not achieve statistical significance. Furthermore, the proportion of VC investments channeled into generative AI enterprises, relative to the aggregate number of deals, exhibited a higher quotient in South Korea compared to the U.S. 
Upon a comprehensive sectoral breakdown of generative AI, it was discerned that within the U.S., 59.2% of total deals were concentrated in the text and model sectors, whereas in South Korea, 61.9% of deals centered around the video, image, and chat sectors. Through forecasting, the anticipated VC investments in South Korea from 2023 to 2029 were derived via four distinct models, culminating in an estimated average requirement of 3.4 trillion Korean won (ranging from at least 2.408 trillion won to a maximum of 5.919 trillion won). This research bears pragmatic significance as it methodically dissects VC investments within the generative AI domain across both the U.S. and South Korea, culminating in the presentation of an estimated VC investment projection for the latter. Furthermore, its academic significance lies in laying the groundwork for prospective scholarly inquiries by dissecting the current landscape of generative AI VC investments, a sphere that has hitherto remained void of rigorous academic investigation supported by empirical data. Additionally, the study introduces two innovative methodologies for the prediction of VC investment sums. Upon broader integration, application, and refinement of these methodologies within diverse academic explorations, they stand poised to enhance the prognosticative capacity pertaining to VC investment costs.

An Empirical Study on the Influencing Factors for Big Data Intented Adoption: Focusing on the Strategic Value Recognition and TOE Framework (빅데이터 도입의도에 미치는 영향요인에 관한 연구: 전략적 가치인식과 TOE(Technology Organizational Environment) Framework을 중심으로)

Ka, Hoi-Kwang;Kim, Jin-soo
- Asia pacific journal of information systems, v.24 no.4, pp.443-472, 2014
To survive in a globally competitive environment, enterprises should be able to solve various problems and find optimal solutions effectively. Big data is perceived as a tool for solving enterprise problems effectively and improving competitiveness through its varied problem-solving and advanced predictive capabilities. Owing to its remarkable performance, the implementation of big data systems has increased in many enterprises around the world. Big data is currently called the 'crude oil' of the 21st century and is expected to provide competitive superiority. Big data is in the limelight because, while conventional IT has fallen far short of its potential, big data has gone beyond technological possibility and can be used to create new value, such as business optimization and new business creation, through data analysis. However, because big data has been introduced too hastily, without considering how strategic value is derived and achieved through it, enterprises face difficulties in deriving strategic value and utilizing data. According to a survey of 1,800 IT professionals from 18 countries worldwide, only 28% of corporations were utilizing big data well, and many respondents said they were having difficulties deriving strategic value from big data and operating it. To introduce big data, the strategic value should be derived and environmental aspects, such as regulations and systems internal and external to the corporation, should be considered, but these factors have not been well reflected. The cause of failure turned out to be that big data was introduced in response to IT trends and the surrounding environment, hastily and before the conditions for introduction were in place.
To introduce big data successfully, the strategic value obtainable through big data should be clearly understood and a systematic environmental analysis of its applicability is essential; however, because corporations consider only partial achievements and technological aspects of big data, successful introduction is not being achieved. Previous studies show that most big data research focuses on concepts, cases, and practical suggestions without empirical study. The purpose of this study is to provide a theoretically and practically useful implementation framework and strategies for big data systems by conducting a comprehensive literature review, identifying the factors that influence successful big data system implementation, and analyzing empirical models. To do this, the factors that can affect the intention to introduce big data were derived by reviewing information system success factors, strategic value perception factors, factors to consider in the information system introduction environment, and the big data literature, and a structured questionnaire was developed. The questionnaire was then administered to the people in charge of big data within corporations, and statistical analysis was performed. According to the statistical analysis, the strategic value perception factors and the intra-industry environmental factors positively affected the intention to introduce big data. The theoretical, practical, and policy implications derived from the study results are as follows.
The first theoretical implication is that this study theoretically proposed the factors that affect the intention to introduce big data by reviewing strategic value perception, environmental factors, and prior big data studies, and proposed variables and measurement items that were empirically analyzed and verified. The study is meaningful in that it measured the influence of each variable on introduction intention by verifying the relationships between the independent and dependent variables through a structural equation model. Second, this study defined the independent variables (strategic value perception, environment), the dependent variable (introduction intention), and the moderating variables (type of business and corporate size) for big data introduction intention, and it laid a theoretical base for future empirical studies in the big data field by developing measurement items with established reliability and validity. Third, by verifying the significance of the strategic value perception factors and environmental factors proposed in prior studies, this study can aid future empirical studies of the factors affecting big data introduction. The operational implications are as follows. First, this study laid an empirical base for the big data field by investigating the causal relationship between the strategic value perception and environmental factors and introduction intention, and by proposing measurement items with established reliability and validity. Second, this study found that the strategic value perception factor positively affects big data introduction intention, which is meaningful in that it demonstrates the importance of strategic value perception.
Third, the study proposed that a corporation introducing big data should do so on the basis of a precise analysis of the industry's internal environment. Fourth, by presenting how the factors affecting big data introduction differ with corporate size and type of business, this study proposed that the size and type of business of the corporation should be considered when introducing big data. The policy implications are as follows. First, big data needs to be utilized in more varied ways. The strategic value of big data can be approached in various ways, in the product and service field, the productivity field, the decision-making field, and so on, and can therefore be utilized in all business fields, but major domestic corporations are considering only parts of the product and service fields. Accordingly, when introducing big data, it will be necessary to review the utilization aspect in detail and to design the big data system in a form that maximizes the utilization rate. Second, the study points to the cost burden of system introduction, the difficulty of utilizing the system, and the lack of credibility of supplier corporations as obstacles in the big data introduction phase. Since global IT corporations dominate the big data market, domestic corporations cannot help but depend on foreign corporations when introducing big data. Considering that Korea has no global IT corporations even though it is a world IT powerhouse, big data can be seen as an opportunity to foster world-class corporations. Accordingly, the government needs to foster star corporations through active policy support. Third, corporations lack the internal and external professional manpower needed to introduce and operate big data.
In big data, how valuable data can be derived from data matters more than the construction of the system itself. For this, talent equipped with academic knowledge and experience in various fields such as IT, statistics, strategy, and management is needed, and manpower training should be implemented through systematic education for such talent. By identifying and verifying the main variables that affect big data introduction intention, this study laid a theoretical base for empirical studies in big data related fields, and by empirically analyzing that theoretical base it is expected to offer useful guidelines for corporations and policy developers who are considering big data implementation.
https://doi.org/10.14329/apjis.2014.24.4.443

FAMILY DYNAMICS OF INCEST PERCEIVED BY ADOLESCENTS (청소년이 지각한 근친상간의 가족역동)

Kim, Hun-Soo;Shin, Hwa-Sik
- Journal of the Korean Academy of Child and Adolescent Psychiatry, v.6 no.1, pp.56-64, 1995
The family is a primary unit of the socialization process for children. Among family members, parents are among the most important figures from whom children and adolescents acquire a wide variety of behavior patterns, attitudes, values, and norms. The organization of family members produces the structural functioning of the family. An abnormal family structure is one of the most important reference models in the learning of antisocial patterns of behavior. Therefore, incest and child sexual abuse, along with spouse abuse, elder abuse, and neglect, occur in abnormal family structural settings. In particular, incest, a specific form of sexual abuse, was once thought to be a phenomenon of great rarity, but our clinical experience, especially over the past decade, has made us aware that incest and child sexual abuse are not rare and are increasing. Therefore, the aim of this study was to determine the family problems and dynamics of incest families and the character patterns of post-incest adolescent victims in Korea. A total of 1,838 adolescents from middle and high schools (1,237) and juvenile correctional institutes (601) were studied, sampled from the Korean student population and the delinquent adolescent population confined in juvenile correctional institutes using proportional stratified random sampling. The subjects' ages ranged from 12 to 21 years. Data were collected through a questionnaire survey. Data analysis was performed on an IBM PC at the Behavior Science Center of Korea University using the SAS program. The statistical methods employed included the chi-square test, principal component analysis, and the t-test. The results of this study were as follows: 1) Of 1,071 subjects, 40 (3.7%) reported incest experiences (sibling incest: 1.6%; other types of incest: 2.1%) in their family setting. 2) Post-incest adolescent victims were more socially maladjusted, immature, impulsive, rigid, anxious, and dependent than non-incest adolescents.
They also showed some problems in academic performance and assertiveness. 3) The other family members of incest families showed more psychological and behavioral problems, such as depression, alcoholism, psychotic disorder, and criminal acts, than those of non-incest families, even though there was no evidence of a relationship between them. 4) The family dynamics of incest families tended to be dysfunctional compared with those of non-incest families. Psychological instability of family members, parental rejection of their children, coldness and indifference among family members, and marital discord between the parents were significantly correlated with incest.

A Study on the Improvement of Recommendation Accuracy by Using Category Association Rule Mining (카테고리 연관 규칙 마이닝을 활용한 추천 정확도 향상 기법)

Lee, Dongwon
- Journal of Intelligence and Information Systems, v.26 no.2, pp.27-42, 2020
Traditional companies with offline stores were unable to secure large display space because of cost. This limitation meant that only a limited range of products could be displayed on the shelves, depriving consumers of the opportunity to experience various items. Taking advantage of the virtual space of the Internet, online shopping goes beyond the physical space limitations of offline shopping and can display numerous products on web pages, satisfying consumers with a variety of needs. Paradoxically, however, this can also make it difficult for consumers to compare and evaluate too many alternatives in their purchase decision-making process. To address this side effect, various kinds of purchase decision support systems have been studied, such as keyword-based item search services and recommender systems. These systems can reduce search time for items, prevent consumers from leaving while browsing, and contribute to increased sales for sellers. Among those systems, recommender systems based on association rule mining techniques can effectively detect interrelated products from transaction data such as orders. The association between products obtained by statistical analysis provides clues for predicting how interested consumers will be in another product. However, since the algorithm is based on the number of transactions, products that have not yet sold enough in the early days after launch may not be included in the list of recommendations even though they are highly likely to sell. Such missing items may not get sufficient exposure to consumers to record sufficient sales, and then fall into a vicious cycle of declining sales and omission from the recommendation list.
This is an inevitable outcome when recommendations are made based on past transaction histories rather than on potential future sales. This study started from the idea that reflecting a means of identifying this potential indirectly would help in selecting products that are worth recommending. Given that the attributes of a product affect consumers' purchasing decisions, this study was conducted to reflect them in recommender systems. In other words, consumers who visit a product page have shown interest in the attributes of the product and would also be interested in other products with the same attributes. On this assumption, the recommender system can use these attributes to select recommended products with a higher acceptance rate. Given that a category is one of the main attributes of a product, it can be a good indicator of not only direct associations between two items but also potential associations that have yet to be revealed. Based on this idea, the study devised a recommender system that reflects not only associations between products but also associations between categories. Through regression analysis, the two kinds of associations were combined to form a model that could predict the hit rate of recommendations. To evaluate the performance of the proposed model, another regression model was developed based only on associations between products. Comparative experiments were designed to resemble the environment in which products are actually recommended in online shopping malls. First, association rules for all possible combinations of antecedent and consequent items were generated from the order data. Then, the hit rate for each association rule was predicted by each model from the rule's support and confidence.
The comparative experiments using order data collected from an online shopping mall show that the recommendation accuracy can be improved by further reflecting not only the association between products but also categories in the recommendation of related products. The proposed model showed a 2 to 3 percent improvement in hit rates compared to the existing model. From a practical point of view, it is expected to have a positive effect on improving consumers' purchasing satisfaction and increasing sellers' sales.
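As a rough illustration of the inputs this abstract describes, the Python sketch below computes support and confidence for product-level rules and, after mapping items to categories, for category-level rules. The order data, the category mapping, and the function names (`rule_metrics`, `to_categories`) are made up for illustration. The regression that combines the two kinds of association into a hit-rate prediction is not reproduced here, since the abstract does not give its form or coefficients.

```python
from collections import Counter
from itertools import permutations
from typing import Dict, Hashable, Iterable, List, Set, Tuple

Rule = Tuple[Hashable, Hashable]


def rule_metrics(transactions: Iterable[Set[Hashable]]) -> Dict[Rule, Tuple[float, float]]:
    """Return {(antecedent, consequent): (support, confidence)} for all ordered pairs."""
    transactions = [set(t) for t in transactions]
    n = len(transactions)
    single: Counter = Counter()
    pair: Counter = Counter()
    for items in transactions:
        single.update(items)
        pair.update(permutations(sorted(items), 2))  # ordered pairs (a, b) with a != b
    # support = share of orders containing both; confidence = P(b in order | a in order)
    return {(a, b): (c / n, c / single[a]) for (a, b), c in pair.items()}


def to_categories(transactions: Iterable[Set[str]],
                  category_of: Dict[str, str]) -> List[Set[str]]:
    """Replace each item by its category, so rules are mined between categories."""
    return [{category_of[item] for item in t} for t in transactions]


# Hypothetical order data and category mapping.
orders = [
    {"tent", "lantern"},
    {"tent", "stove"},
    {"lantern", "novel"},
    {"tent", "lantern", "stove"},
]
category_of = {"tent": "camping", "lantern": "camping", "stove": "camping", "novel": "books"}

item_rules = rule_metrics(orders)
category_rules = rule_metrics(to_categories(orders, category_of))
print(item_rules[("tent", "lantern")])
print(category_rules[("books", "camping")])
```

One simplification to note: because each order is reduced to a set of categories, associations between two items of the same category are not represented at the category level in this sketch. A newly launched item with few sales would still inherit its category's rule metrics, which is the intuition the abstract relies on.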
https://doi.org/10.13088/jiis.2020.26.2.027

Self-optimizing feature selection algorithm for enhancing campaign effectiveness (캠페인 효과 제고를 위한 자기 최적화 변수 선택 알고리즘)

Seo, Jeoung-soo;Ahn, Hyunchul
- Journal of Intelligence and Information Systems
- /
- v.26 no.4
- /
- pp.173-198
- /
- 2020
Academic studies on predicting the success of customer campaigns have been conducted for a long time, and prediction models applying various techniques are still being studied. Recently, as campaign channels have expanded with the rapid growth of online business, companies are running campaigns of various types on a scale that cannot be compared to the past. However, as fatigue from duplicate exposure increases, customers tend to perceive campaigns as spam. From a corporate standpoint, the effectiveness of campaigns is also decreasing: the cost of investing in campaigns is rising while the actual campaign success rate is low. Accordingly, various studies are ongoing to improve campaign effectiveness in practice. The ultimate purpose of a campaign system is to increase the success rate of campaigns by collecting and analyzing customer-related data and using it for campaigns. In particular, there have been recent attempts to predict campaign responses using machine learning. Because campaign data has many varied features, selecting appropriate features is very important. If all of the input data are used when classifying a large amount of data, learning time grows as the number of classes expands, so a minimal input data set must be extracted from the entire data. In addition, when a model is trained with too many features, prediction accuracy may be degraded by overfitting or by correlation between features. Therefore, to improve accuracy, a feature selection technique that removes features close to noise should be applied; feature selection is a necessary step in analyzing a high-dimensional data set.
Among greedy algorithms, SFS (Sequential Forward Selection), SBS (Sequential Backward Selection), and SFFS (Sequential Floating Forward Selection) are widely used as traditional feature selection techniques. However, these techniques are limited in that, when there are many risks and many features, classification performance is poor and learning takes a long time. Therefore, this study proposes an improved feature selection algorithm to enhance the effectiveness of existing campaigns. The purpose of the study is to improve the existing SFFS sequential method for searching feature subsets, which are the basis for improving machine learning model performance, by using statistical characteristics of the data processed in the campaign system. Features with a strong influence on performance are derived first, features with a negative effect are removed, and the sequential method is then applied, which increases search efficiency and enables generalized prediction. The proposed model was confirmed to show better search and prediction performance than the traditional greedy algorithms. Campaign success prediction performance was higher than with the original data set, the greedy algorithms, a genetic algorithm (GA), and recursive feature elimination (RFE). In addition, the improved feature selection algorithm was found to help in analyzing and interpreting the prediction results by providing the importance of the derived features. These include features such as age, customer rating, and sales, which were already known to be statistically important.
Contrary to what campaign planners had expected, features such as the combined product name, the average 3-month data consumption rate, and the last 3-month wireless data usage, which planners rarely used to select campaign targets, were selected as important features for campaign response. It was confirmed that base attributes can also be very important features depending on the type of campaign. This makes it possible to analyze and understand the important characteristics of each campaign type.
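For readers unfamiliar with the greedy baselines named in this abstract, the following is a minimal sketch of plain SFS (Sequential Forward Selection). It is not the authors' improved SFFS-based algorithm. The scoring function, the feature names, and the utility values are hypothetical stand-ins for a real model evaluation such as cross-validated accuracy.

```python
def sequential_forward_selection(features, score, max_features=None):
    """Greedy SFS: repeatedly add the single feature that most improves the score."""
    selected = []
    remaining = list(features)
    best_score = float("-inf")
    limit = len(remaining) if max_features is None else max_features
    while remaining and len(selected) < limit:
        # Evaluate every candidate feature added to the current subset.
        candidate_score, candidate = max(
            ((score(selected + [f]), f) for f in remaining),
            key=lambda pair: pair[0],
        )
        if candidate_score <= best_score:
            break  # no remaining feature improves the score
        selected.append(candidate)
        remaining.remove(candidate)
        best_score = candidate_score
    return selected, best_score


# Hypothetical per-feature utilities (illustration only, not results from the paper).
utility = {
    "age": 0.30,
    "customer_rating": 0.25,
    "data_usage_3m": 0.20,
    "noise": -0.05,
}


def toy_score(subset):
    """Toy objective: summed utility minus a small penalty per selected feature."""
    return sum(utility[f] for f in subset) - 0.02 * len(subset)


selected, best = sequential_forward_selection(list(utility), toy_score)
print(selected, round(best, 2))
```

SFFS extends this loop with a conditional backward step that drops a previously selected feature whenever doing so improves the score; that floating step is omitted here to keep the sketch short.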
https://doi.org/10.13088/jiis.2020.26.4.173

