Search | Korea Science

Development of New Variables Affecting Movie Success and Prediction of Weekly Box Office Using Them Based on Machine Learning (영화 흥행에 영향을 미치는 새로운 변수 개발과 이를 이용한 머신러닝 기반의 주간 박스오피스 예측)

Song, Junga;Choi, Keunho;Kim, Gunwoo
- Journal of Intelligence and Information Systems, v.24 no.4, pp.67-83, 2018
The Korean film industry grew significantly every year and finally surpassed 200 million cumulative admissions in 2013. However, starting in 2015 the industry entered a period of low growth, and in 2016 it experienced negative growth. To overcome this difficulty, stakeholders such as production companies, distribution companies, and multiplex chains have attempted to maximize market returns by predicting market changes and responding to them immediately. Because a film is an experiential product, it is not easy to predict its box office record and initial audience size before release. The number of audiences also fluctuates with a variety of factors after release. Production and distribution companies therefore try to secure a guaranteed number of screens from multiplex chains at the opening of a newly released film. However, multiplex chains tend to open the screening schedule for only one week at a time and then determine the number of screenings for the forthcoming week based on the box office record and audience evaluations. Many previous studies have dealt with the prediction of box office records. Early studies attempted to identify factors affecting the box office record. More recent studies have applied various analytic techniques to the previously identified factors in order to improve prediction accuracy and to explain the effect of each factor, rather than identifying new factors. However, most previous studies are limited in that they used the total number of audiences from opening to the end of the run as the target variable, which makes it difficult to predict and respond to dynamically changing market demand.
Therefore, the purpose of this study is to predict the weekly number of audiences of a newly released film so that stakeholders can respond flexibly to changes in its audience numbers. To that end, we considered the factors affecting box office used in previous studies and developed new factors not used before, such as the order of opening of movies and the dynamics of sales. With this comprehensive set of factors, we used machine learning methods such as Random Forest, Multilayer Perceptron, Support Vector Machine, and Naive Bayes to predict the cumulative number of visitors from the first week after release to the third week. At the first and second weeks, we predicted the cumulative number of visitors of the forthcoming week for a released film. At the third week, we predicted the total number of visitors of the film. In addition, we predicted the total number of cumulative visitors at both the first and second weeks using the same factors. As a result, we found that the accuracy of predicting the number of visitors of the forthcoming week was higher than that of predicting the total number in all three weeks, and that Random Forest had the highest accuracy among the machine learning methods we used. This study has implications in that it 1) comprehensively considered various factors that affect the box office record but were rarely addressed by previous research, such as the weekly audience rating after release, the weekly rank of the film after release, and the weekly sales share after release, and 2) tried to predict and respond to dynamically changing market demand by suggesting models that predict the weekly number of audiences of newly released films.
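As an illustration of the target variables described in the abstract above, the following minimal sketch builds, from a film's weekly audience counts, the cumulative audience through the forthcoming week (at weeks 1 and 2) and the total audience (at week 3). The function name `build_targets`, the input layout, and the sample numbers are assumptions made for illustration; the abstract does not give the study's actual features, class definitions, or data.

```python
from itertools import accumulate


def build_targets(weekly_audience):
    """Return prediction targets at the end of weeks 1, 2, and 3.

    weekly_audience: list of weekly audience counts, week 1 first
    (hypothetical input layout; needs at least three weeks).
    """
    if len(weekly_audience) < 3:
        raise ValueError("need at least three weeks of data")
    cumulative = list(accumulate(weekly_audience))  # running totals per week
    total = cumulative[-1]                          # audience over the whole run
    return {
        # at week 1: cumulative audience through week 2, plus the total
        1: {"next_week_cumulative": cumulative[1], "total": total},
        # at week 2: cumulative audience through week 3, plus the total
        2: {"next_week_cumulative": cumulative[2], "total": total},
        # at week 3: only the total audience is predicted
        3: {"total": total},
    }


# Made-up weekly counts for one film
targets = build_targets([500_000, 300_000, 150_000, 50_000])
```

A classifier such as Random Forest would then be trained on features observed up to each week against these targets; that step is omitted here because the abstract does not specify it.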
https://doi.org/10.13088/jiis.2018.24.4.067

Study on the effect of small and medium-sized businesses being selected as suitable business types, on the franchise industry (중소기업적합업종선정이 프랜차이즈산업에 미치는 영향에 관한 연구)

Kang, Chang-Dong;Shin, Geon-Chel;Jang, Jae Nam
- Journal of Distribution Research, v.17 no.5, pp.1-23, 2012
As the conflict between major corporations and small and medium-sized businesses worsens, the trickle-down effect fails to work properly, and controversy over the effectiveness of the business limiting system continues, the suitable business type designation system for small and medium-sized businesses has been proposed to protect the business domain of these businesses, resolve their polarization from large corporations, and protect small family-run stores. At present, the system receives applications from small and medium-sized manufacturing businesses for 234 items among the suitable business types and items, and the consultative group's items are then selected by analyzing and investigating actual conditions. Designation in the service industry will give priority to business types that are experiencing social conflict. Three major classifications of the service industry related to the livelihood of small and medium-sized businesses will be designated first, and the scope will then be expanded sequentially. However, there is concern that designation as a suitable business type or item will weaken the growth motive of small and medium-sized businesses and decrease consumer welfare. It is also highly likely to operate as a prior regulation, cause side effects by systematically limiting competition, and violate the main regulations of the FTA system. Moreover, it is pointed out that the system does not sufficiently reflect the reverse discrimination against large corporations.
Because the conflict between small to medium-sized businesses and large corporations results from corporations expanding into service industries unrelated to their key industry, it is necessary to introduce an advanced contract method such as a master franchise or local franchise system and to develop local small to medium-sized businesses through a franchise system in order to protect these businesses and dealers. However, although this method may help strengthen small to medium-sized franchise businesses by advancing their competitiveness and operational methods a step further, it also has many negative aspects. First, as revealed by the Ministry of Knowledge Economy, the franchise industry contributes to strengthening competitiveness through economies of scale by organizing existing individual proprietors and increasing the success rate of new businesses. It is also presented as a government measure to stabilize the economy of ordinary people, is emphasized as a 'useful way' to revitalize the service industry and improve the competitiveness of individual proprietors, and has contributed to creating jobs and expanding the domestic market by providing various services to consumers. From this viewpoint, franchises fit the purpose of the suitable business type system and do not run against it. Second, designation as a suitable business type may decrease investment in overseas expansion, R&D, and food safety, as well as negatively affect the expansion of overseas corporations that have entered the domestic market, due to the contraction and low morale of large domestic franchise corporations that are internationally competitive.
Also, because domestic franchise businesses are hard pressed to compete with the multinational franchise corporations operating in Korea, the system may make it difficult for domestic franchise businesses to secure international competitiveness and may result in reverse discrimination in relation to these overseas franchise corporations. Third, designating suitable business types and items can limit the choices of consumers who have used those products up to now and can reduce consumer welfare. Also, because the range of consumer choice may narrow when a few small to medium-sized businesses monopolize the market, causing reverse discrimination among these businesses, the role of determining the utility of products must be left to the consumer, not the government. Lastly, because fair trade is already secured by the enforcement of the franchise trade law and the best trade standard of the Fair Trade Commission, it is desirable to proceed by supplementing deficient parts in the future. Overlapping regulation through suitable business type designation is an excessive restriction on the franchise industry. It is now necessary to establish an environment in the domestic franchise industry in which a global franchise corporation that spreads Korean culture around the world can grow, and active government support is needed. Therefore, systems that do not consider the process or background of the growth of franchise businesses and that harm these businesses solely because they are large corporations must be removed. Inhibiting the growth of franchise enterprises may decrease the sales of franchise stores, in some cases even bankrupt them, and cause other problems.
Therefore, the suitable business type system should not hinder large corporations. Because small dealers and small to medium-sized businesses both aim at improved competitiveness and shared growth, the system, which rests on mutual cooperation among large corporations, small dealers, and small to medium-sized businesses, should not cover franchise corporations that maintain business relations with them.

A Study on the Critical Success Factors of Social Commerce through the Analysis of the Perception Gap between the Service Providers and the Users: Focused on Ticket Monster in Korea (서비스제공자와 사용자의 인식차이 분석을 통한 소셜커머스 핵심성공요인에 대한 연구: 한국의 티켓몬스터 중심으로)

Kim, Il Jung;Lee, Dae Chul;Lim, Gyoo Gun
- Asia pacific journal of information systems, v.24 no.2, pp.211-232, 2014
Recently, interest in social commerce using SNS (Social Networking Services) has been growing, and its market is expanding due to the popularization of smartphones, tablet PCs, and other smart devices. Accordingly, various studies have been attempted, but most previous studies have been conducted from the perspective of users. The purpose of this study is to derive user-centered CSFs (Critical Success Factors) of social commerce from previous studies and to analyze the CSF perception gap between social commerce service providers and users. The CSF perception gap between the two groups shows that there is a difference between the ideal image the service providers hope for and the actual image the users have of social commerce companies. This study provides directions for improvement for social commerce companies by presenting current business problems and plans for solving them. To this end, this study selected Ticket Monster, Korea's representative social commerce business, as the target service provider. Ticket Monster is dominant in sales and staff size and gained strong funding power in August 2011 through an M&A by stock exchange with the US social commerce business Living Social, in which Amazon.com is a shareholder. We gathered questionnaires from both service providers and users from October 22 to October 31, 2012, to conduct an empirical analysis. We surveyed 160 service providers at Ticket Monster and 160 social commerce users who had experience using the Ticket Monster service. Of the 320 questionnaires, 20 that were unfit or unreliable were discarded. The remaining 300 (150 service providers, 150 users) were used for this empirical study. The statistics were analyzed using SPSS 12.0.
The implications of the empirical analysis are as follows. First, the two groups rank the importance of the social commerce CSFs differently. While service providers regard Price Economic as the most important CSF influencing purchase intention, users regard 'Trust' as the most important. This means that service providers have to use the unique strength of social commerce to earn customers' trust rather than just focusing on selling products at a discounted price. Service providers need to communicate more effectively through SNS and play a vital role as trusted advisers who provide curation services and explain the value of products through information filtering. They also need to pay attention to preventing consumer damage from deceptive and false advertising, and to create a detailed compensation system for consumer damage caused by such problems. This can build strong ties with customers. Second, both service providers and users tend to consider Price Economic, Utility, Trust, and Word of Mouth Effect to be the social commerce CSFs influencing purchase intention. Accordingly, users expect price and economic benefits when using social commerce, and service providers should be able to offer individualized discount benefits through diverse methods using social network services. In terms of usefulness, service providers need to make users aware of the time savings, efficiency, and convenience of social commerce. It is therefore necessary to increase the usefulness of social commerce by introducing new management strategies, such as strengthening the website's search engine, easing payment through a shopping basket, and package distribution.
Trust, as mentioned before, is the most important variable in consumers' minds, so it must be managed for sustainable management. If trust in social commerce falls because of consumer damage caused by false or exaggerated advertising and forgeries, it could harm the image of the social commerce industry in general. Instead of advertising with famous celebrities and spending excessive amounts on marketing, the social commerce industry should use the word of mouth effect between users by making use of social network services, the major marketing method of early social commerce. The word of mouth effect that arises when consumers spontaneously act as marketers can not only reduce advertising costs for a service provider but also provide the basis for offering discounted prices to consumers; in this context, the word of mouth effect should be managed as a CSF of social commerce. Third, trade safety was not derived as one of the CSFs. Recently, as e-commerce such as social commerce and Internet shopping has grown in a variety of forms, the importance of trade safety on the Internet has also increased, but in this study neither group evaluated trade safety as a CSF of social commerce. This study judges that this is because both the service provider group and the user group perceive that a reliable PG (Payment Gateway) handles e-payment for Internet transactions. Accordingly, both groups appear to feel that social commerce companies can establish a corporate identity through their websites and differentiate their products and services, but see little difference between businesses in their e-payment systems. In other words, trade safety should be perceived as a natural, basic, universal service.
Fourth, service providers should intensify communication with users by making use of social network services, the major marketing method of social commerce, and should be able to use the word of mouth effect between users. The word of mouth effect that arises when consumers spontaneously act as marketers can not only reduce advertising costs for a service provider but also provide the basis for offering discounted prices to consumers. In this context, the word of mouth effect should be managed as a CSF of social commerce. In this paper, the characteristics of social commerce are limited to five independent variables; a further study with a wider range of independent variables could yield more in-depth results. In addition, this research targets social commerce service providers and users; however, considering that social commerce is a two-sided market, deriving CSFs through an analysis of the perception gap between social commerce service providers and their advertising clients would be worth addressing in a follow-up study.
https://doi.org/10.14329/apjis.2014.24.2.211

How Enduring Product Involvement and Perceived Risk Affect Consumers' Online Merchant Selection Process: The 'Required Trust Level' Perspective (지속적 관여도 및 인지된 위험이 소비자의 온라인 상인선택 프로세스에 미치는 영향에 관한 연구: 요구신뢰 수준 개념을 중심으로)

Hong, Il-Yoo B.;Lee, Jung-Min;Cho, Hwi-Hyung
- Asia pacific journal of information systems, v.22 no.1, pp.29-52, 2012
Consumers differ in the way they make a purchase. An audiophile would willingly make a bold, yet serious, decision to buy a top-of-the-line home theater system, while having no interest in replacing his two-decade-old shabby car. In contrast, an automobile enthusiast wouldn't mind spending forty thousand dollars on a new Jaguar convertible, yet cares little about his junky component system. It is product involvement that helps explain such differences in purchase style among individuals. Product involvement refers to the extent to which a product is perceived to be important to a consumer (Zaichkowsky, 2001). Product involvement is an important factor that strongly influences the consumer's purchase decision-making process, and thus has been of prime interest to consumer behavior researchers. Furthermore, researchers have found that involvement is closely related to perceived risk (Dholakia, 2001). While abundant research addresses how product involvement relates to overall perceived risk, little attention has been paid to the relationship between involvement and different types of perceived risk in an electronic commerce setting. Given that perceived risk can be a substantial barrier to online purchase (Jarvenpaa, 2000), research addressing this issue will offer useful implications about which specific types of perceived risk an online firm should focus on mitigating if it is to increase sales to its fullest potential. Meanwhile, past research has focused on consumer responses such as information search and dissemination as consequences of involvement, neglecting other behavioral responses such as online merchant selection. For example, will a consumer seriously considering the purchase of a pricey Guzzi bag perceive a great degree of risk associated with online buying and therefore choose to buy it from a digital storefront rather than from an online marketplace to mitigate risk?
Will a consumer require greater trust on the part of the online merchant when the perceived risk of online buying is rather high? We intend to find answers to these research questions through an empirical study. This paper explores the impact of enduring product involvement and perceived risks on required trust level, and further on online merchant choice. For the purpose of the research, five types or components of perceived risk are taken into consideration, including financial, performance, delivery, psychological, and social risks. A research model has been built around the constructs under consideration, and 12 hypotheses have been developed based on the research model to examine the relationships between enduring involvement and five components of perceived risk, between five components of perceived risk and required trust level, between enduring involvement and required trust level, and finally between required trust level and preference toward an e-tailer. To attain our research objectives, we conducted an empirical analysis consisting of two phases of data collection: a pilot test and main survey. The pilot test was conducted using 25 college students to ensure that the questionnaire items are clear and straightforward. Then the main survey was conducted using 295 college students at a major university for nine days between December 13, 2010 and December 21, 2010. The measures employed to test the model included eight constructs: (1) enduring involvement, (2) financial risk, (3) performance risk, (4) delivery risk, (5) psychological risk, (6) social risk, (7) required trust level, (8) preference toward an e-tailer. The statistical package, SPSS 17.0, was used to test the internal consistency among the items within the individual measures. Based on the Cronbach's ${\alpha}$ coefficients of the individual measure, the reliability of all the variables is supported. 
Meanwhile, the Amos 18.0 package was employed to perform a confirmatory factor analysis designed to assess the unidimensionality of the measures. The goodness of fit of the measurement model was satisfactory. Unidimensionality was tested using convergent, discriminant, and nomological validity, and the statistical evidence showed that all three types of validity were satisfied. The structural equation modeling technique was then used to analyze the individual paths among the research constructs. The results indicated that enduring involvement has significant positive relationships with all five components of perceived risk, while only performance risk is significantly related to the trust level required by consumers for purchase. It can be inferred from the findings that product performance problems are most likely to occur when a merchant behaves in an opportunistic manner. Positive relationships were also found between involvement and required trust level and between required trust level and online merchant choice. Enduring involvement is concerned with the pleasure a consumer derives from a product class and/or with the desire for knowledge about the product class, and thus is likely to motivate the consumer to look for ways of mitigating perceived risk by requiring a higher level of trust on the part of the online merchant. Likewise, a consumer requiring a high level of trust from the merchant will choose a digital storefront rather than an e-marketplace, since a digital storefront is believed to be more trustworthy than an e-marketplace, as it fulfills orders by itself rather than acting as an intermediary. The findings of the present research provide both academic and practical implications. The first academic implication is that enduring product involvement is a strong motivator of consumer responses, especially the selection of a merchant, in the context of electronic shopping.
Secondly, academics are advised to pay attention to the finding that an individual component or type of perceived risk can be used as an important research construct, since it allows one to pinpoint the specific types of risk that are influenced by antecedents or that influence consequents. Meanwhile, our research provides implications useful for online merchants (both online storefronts and e-marketplaces). First, merchants may develop strategies to attract consumers by managing the perceived performance risk involved in purchase decisions, since it was found to have a significant positive relationship with the level of trust a consumer requires on the part of the merchant. One way to manage performance risk would be to thoroughly examine the product before shipping to ensure that it has no deficiencies or flaws. Second, digital storefronts are advised to focus on symbolic goods (e.g., cars, cell phones, fashion outfits, and handbags), in which consumers are relatively more involved than in other goods, whereas e-marketplaces should put their emphasis on non-symbolic goods (e.g., drinks, books, MP3 players, and bike accessories).
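The abstract above reports that reliability was assessed with Cronbach's $\alpha$. As a hedged illustration of that statistic (the study itself used SPSS 17.0), the sketch below computes $\alpha = \frac{k}{k-1}\left(1 - \frac{\sum_i s_i^2}{s_T^2}\right)$ for $k$ items, where $s_i^2$ is the sample variance of item $i$ and $s_T^2$ the sample variance of respondents' total scores. The function name, data layout, and scores are made up for illustration and are not the study's data.

```python
from statistics import variance


def cronbach_alpha(items):
    """Cronbach's alpha for a list of item-score columns.

    items: list of k lists, each holding one questionnaire item's scores
    across the same respondents (hypothetical layout).
    """
    k = len(items)
    if k < 2:
        raise ValueError("need at least two items")
    # total score per respondent across all items
    totals = [sum(scores) for scores in zip(*items)]
    item_var_sum = sum(variance(col) for col in items)  # sample variances
    total_var = variance(totals)
    return k / (k - 1) * (1 - item_var_sum / total_var)


# Made-up scores: 3 items answered by 5 respondents
scores = [
    [4, 5, 3, 4, 2],
    [4, 4, 3, 5, 2],
    [5, 5, 2, 4, 3],
]
alpha = cronbach_alpha(scores)
```

Values of $\alpha$ closer to 1 indicate that the items of a measure vary together; the threshold a study accepts is a convention and is not stated in the abstract.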

Effects of firm strategies on customer acquisition of Software as a Service (SaaS) providers: A mediating and moderating role of SaaS technology maturity (SaaS 기업의 차별화 및 가격전략이 고객획득성과에 미치는 영향: SaaS 기술성숙도 수준의 매개효과 및 조절효과를 중심으로)

Chae, SeongWook;Park, Sungbum
- Journal of Intelligence and Information Systems, v.20 no.3, pp.151-171, 2014
Firms today seek management effectiveness and efficiency by utilizing information technologies (IT). Numerous firms outsource specific information systems functions to cope with a shortage of information resources or IT experts, or to reduce capital costs. Recently, Software-as-a-Service (SaaS), a new type of information system, has become one of the most powerful outsourcing alternatives. SaaS is software deployed as a hosted service and accessed over the Internet. It embodies the ideas of on-demand, pay-per-use, and utility computing and is now being applied to support the core competencies of clients in areas ranging from individual productivity to vertical industries and e-commerce. In this study, therefore, we seek to quantify the value of SaaS for business performance by examining the relationships among firm strategies, SaaS technology maturity, and the business performance of SaaS providers. We begin by drawing on prior literature on SaaS, technology maturity, and firm strategy. SaaS technology maturity is classified into three phases: application service providing (ASP), Web-native application, and Web-service application. Firm strategies are operationalized as low-cost strategy and differentiation strategy. Finally, we consider customer acquisition as the measure of business performance. The specific objectives of this study are as follows. First, we examine the relationships between customer acquisition performance and both the low-cost and differentiation strategies of SaaS providers. Second, we investigate the mediating and moderating effects of SaaS technology maturity on those relationships. For this purpose, the study collects data on SaaS providers and their lines of applications registered in the database of CNK (Commerce net Korea), using a questionnaire administered by a professional research institution.
The unit of analysis in this study is the SBU (strategic business unit) within the software provider. A total of 199 SBUs are used for analyzing and testing our hypotheses. To measure firm strategy, we use three items for differentiation strategy: application uniqueness (whether an application aims to differentiate within just one or a small number of target industries), supply channel diversification (whether the SaaS vendor has diversified its supply chain), and the number of specialized expertise. We use two items for low-cost strategy: subscription fee and initial set-up fee. We employ hierarchical regression analysis to test the moderating effects of SaaS technology maturity and follow Baron and Kenny's procedure to determine whether firm strategies affect customer acquisition through technology maturity. The empirical results reveal, first, that when a differentiation strategy is applied to attain business performance such as customer acquisition, the effect of the strategy is moderated by the technology maturity level of the SaaS provider. In other words, securing a higher level of SaaS technology maturity is essential for higher business performance. For instance, firms that implement application uniqueness or distribution channel diversification as a differentiation strategy can acquire more customers when their level of SaaS technology maturity is higher. Second, the results indicate that pursuing either a differentiation strategy or a low-cost strategy works effectively for acquiring customers, which means that continuously differentiating their service from others or lowering their service fees (subscription fee or initial set-up fee) helps SaaS providers succeed in acquiring customers. Lastly, the results show that the level of SaaS technology maturity mediates the relationship between low-cost strategy and customer acquisition.
That is, based on our research design, customers usually perceive the real value of a low subscription fee or initial set-up fee only through the SaaS service provided by the vendor, and this in turn affects their decision on whether to subscribe.
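The two tests named in the abstract above, hierarchical regression for moderation and Baron and Kenny's procedure for mediation, are commonly written as the regression models below. This is a generic textbook sketch, not the authors' specification: the symbols $Y$ (customer acquisition), $X$ (a firm strategy measure), and $M$ (SaaS technology maturity) are labels assumed here for illustration, and control variables are omitted.

```latex
% Moderation via hierarchical regression (illustrative symbols):
% step 1 enters the main effects, step 2 adds the interaction term.
\begin{align}
Y &= \beta_0 + \beta_1 X + \beta_2 M + \varepsilon \\
Y &= \beta_0' + \beta_1' X + \beta_2' M + \beta_3' (X \times M) + \varepsilon'
\end{align}

% Mediation via Baron and Kenny's three regressions:
\begin{align}
Y &= i_1 + c\,X + e_1 \\
M &= i_2 + a\,X + e_2 \\
Y &= i_3 + c'\,X + b\,M + e_3
\end{align}
```

Under this reading, moderation is supported when $\beta_3'$ is significant (equivalently, when adding the interaction term significantly increases $R^2$), and mediation is supported when $c$, $a$, and $b$ are significant and $c'$ is smaller in magnitude than $c$.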
https://doi.org/10.13088/jiis.2014.20.3.151

A Study on the Regional Characteristics of Broadband Internet Termination by Coupling Type using Spatial Information based Clustering (공간정보기반 클러스터링을 이용한 초고속인터넷 결합유형별 해지의 지역별 특성연구)

Park, Janghyuk;Park, Sangun;Kim, Wooju
- Journal of Intelligence and Information Systems, v.23 no.3, pp.45-67, 2017
According to the Internet Usage Research performed in 2016, the number of Internet users and Internet usage have been increasing. The smartphone, compared with the computer, is taking a more dominant role as an Internet access device. As the number of smart devices increases, some argue that demand for high-speed Internet will decrease; however, the high-speed Internet market is expected to grow slightly for a while due to the speedup of Giga Internet and the growth of the IoT market. As the broadband Internet market saturates, telecom operators are competing excessively to win new customers, but if they knew the causes of customer exit, they could be expected to reduce marketing costs through more effective marketing. In this study, we analyzed the relationship between the cancellation rates of telecommunication products and the factors affecting them by combining a telecommunication company's data on three cities, Anyang, Gunpo, and Uiwang, with regional data from KOSIS (Korean Statistical Information Service). In particular, we assumed that neighboring areas affect the distribution of cancellation rates by coupling type, so we conducted spatial cluster analysis on the three types of cancellation rates of each region using the spatial analysis tool SaTScan and analyzed the relationships between the cancellation rates and the regional data. In the analysis phase, we first summarized the characteristics of the clusters derived by combining spatial information and the cancellation data. Next, based on the results of the cluster analysis, analysis of variance, correlation analysis, and regression analysis were used to analyze the relationship between the cancellation rate data and the regional data. Based on the results, we proposed appropriate marketing methods for each region.
Unlike previous studies on regional characteristics, this study is academically distinctive in that it performs clustering based on spatial information so that adjacent regions with similar cancellation types are grouped together. In addition, few previous studies on the determinants of subscription to high-speed Internet services have considered regional characteristics; in this study, we analyzed the relationship between the clusters and the regional characteristics data, assuming that different factors operate depending on the region. We also sought more efficient marketing methods that consider the characteristics of each region in new subscription and customer management for high-speed Internet. The analysis of variance confirmed that there were significant differences in regional characteristics among the clusters, the correlation analysis showed stronger correlations within the clusters than across all regions, and regression analysis was used to analyze the relationship between the cancellation rate and the regional characteristics. As a result, we found that the cancellation rate differs depending on the regional characteristics and that differentiated target marketing is possible for each region. The biggest limitation of this study is that it was difficult to obtain enough data to carry out the analysis. In particular, it is difficult to find variables that represent regional characteristics at the Dong (neighborhood) level. Most of the data were disclosed at the city level rather than the Dong level, which limited detailed analysis. Data such as income, card usage information, and telecommunications company policies or characteristics that could affect cancellation were not available at the time. The most urgent requirement for a more sophisticated analysis is to obtain Dong-level data on regional characteristics.
Future studies should address target marketing based on these results. It would also be meaningful to analyze the effect of marketing by comparing results before and after target marketing. Using clusters based on new subscription data as well as cancellation data would also be effective.
https://doi.org/10.13088/jiis.2017.23.3.045

Efficient Topic Modeling by Mapping Global and Local Topics (전역 토픽의 지역 매핑을 통한 효율적 토픽 모델링 방안)

Choi, Hochang;Kim, Namgyu
- Journal of Intelligence and Information Systems
- /
- v.23 no.3
- /
- pp.69-94
- /
- 2017
Recently, the growing demand for big data analysis has been driving the vigorous development of related technologies and tools. In addition, the development of IT and the increased penetration rate of smart devices are producing a large amount of data. As a result, data analysis technology is rapidly becoming popular, and attempts to acquire insights through data analysis have been continuously increasing. This means that big data analysis will become more important in various industries for the foreseeable future. Big data analysis has generally been performed by a small number of experts and delivered to those who request the analysis. However, growing interest in big data analysis has stimulated computer programming education and the development of many programs for data analysis. Accordingly, the entry barriers to big data analysis are gradually lowering and data analysis technology is spreading. As a result, big data analysis is expected to be performed by those who need the analysis themselves. Along with this, interest in various unstructured data is continually increasing, and a lot of attention is focused on using text data in particular. The emergence of new platforms and techniques using the web has brought about mass production of text data and active attempts to analyze it, and the results of text analysis have been utilized in various fields. Text mining is a concept that embraces various theories and techniques for text analysis. Among the many text mining techniques utilized for various research purposes, topic modeling is one of the most widely used and studied. Topic modeling is a technique that extracts the major issues from a large number of documents, identifies the documents that correspond to each issue, and provides the identified documents as a cluster. It is regarded as a very useful technique in that it reflects the semantic elements of the documents.
Traditional topic modeling is based on the distribution of key terms across the entire document set. Thus, it is essential to analyze the entire document set at once to identify the topic of each document. This condition makes the analysis process slow when topic modeling is applied to a large number of documents. In addition, it causes a scalability problem: processing time increases exponentially with the number of analysis objects. This problem is particularly noticeable when the documents are distributed across multiple systems or regions. To overcome these problems, a divide and conquer approach can be applied to topic modeling. This means dividing a large number of documents into sub-units and deriving topics by repeating topic modeling on each unit. This method can be used for topic modeling on a large number of documents with limited system resources, and can improve the processing speed of topic modeling. It can also significantly reduce analysis time and cost because documents can be analyzed in each location without first being combined. However, despite many advantages, this method has two major problems. First, the relationship between the local topics derived from each unit and the global topics derived from the entire document set is unclear. That is, local topics can be identified for each document, but global topics cannot. Second, a method for measuring the accuracy of such a methodology needs to be established. That is to say, assuming that the global topic is the ideal answer, the deviation of a local topic from the global topic needs to be measured. Because of these difficulties, this approach has not been studied sufficiently compared with other studies dealing with topic modeling. In this paper, we propose a topic modeling approach to solve the above two problems.
First of all, we divide the entire document cluster (global set) into sub-clusters (local sets), and generate a reduced entire document cluster (RGS, reduced global set) that consists of representative documents extracted from each local set. We try to solve the first problem by mapping RGS topics and local topics. Along with this, we verify the accuracy of the proposed methodology by checking whether each document is assigned to the same topic in the results of the global and local sets. Using 24,000 news articles, we conduct experiments to evaluate the practical applicability of the proposed methodology. In addition, through an additional experiment, we confirmed that the proposed methodology can provide results similar to topic modeling on the entire set. We also proposed a reasonable method for comparing the results of both methods.
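The mapping step between local topics and RGS topics can be illustrated with a minimal sketch. The abstract does not state how the mapping is computed, so the cosine similarity over term weights used here is an assumption, and the names `cosine`, `map_local_to_global`, and the toy topics are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two topics given as {term: weight} dicts."""
    dot = sum(w * b.get(term, 0.0) for term, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def map_local_to_global(local_topics, global_topics):
    """Map each local topic to the most similar global (RGS) topic.

    Assumption: similarity is cosine over term weights; the paper's
    actual mapping criterion is not given in the abstract.
    """
    mapping = {}
    for local_name, local_terms in local_topics.items():
        best = max(global_topics,
                   key=lambda g: cosine(local_terms, global_topics[g]))
        mapping[local_name] = best
    return mapping


# Toy, made-up topics for illustration only.
local_topics = {
    "L1": {"stock": 0.6, "market": 0.4},
    "L2": {"election": 0.7, "vote": 0.3},
}
global_topics = {
    "G_economy": {"market": 0.5, "stock": 0.3, "bank": 0.2},
    "G_politics": {"vote": 0.5, "election": 0.4, "party": 0.1},
}
mapping = map_local_to_global(local_topics, global_topics)
```

With such a mapping, a document's local topic can be translated into a global topic and compared against the topic it receives when the entire set is modeled at once.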
https://doi.org/10.13088/jiis.2017.23.3.069

Rough Set Analysis for Stock Market Timing (러프집합분석을 이용한 매매시점 결정)

Huh, Jin-Nyung;Kim, Kyoung-Jae;Han, In-Goo
- Journal of Intelligence and Information Systems
- /
- v.16 no.3
- /
- pp.77-97
- /
- 2010
Market timing is an investment strategy used to obtain excess returns from the financial market. In general, detection of market timing means determining when to buy and sell to get excess returns from trading. In many market timing systems, trading rules have been used as an engine to generate trading signals. On the other hand, some researchers have proposed rough set analysis as a proper tool for market timing because, by using the control function, it does not generate a trading signal when the pattern of the market is uncertain. Numeric data must be discretized for rough set analysis because the rough set only accepts categorical data. Discretization searches for proper "cuts" for numeric data that determine intervals. All values that lie within each interval are transformed into the same value. In general, there are four methods for data discretization in rough set analysis: equal frequency scaling, expert's knowledge-based discretization, minimum entropy scaling, and naïve and Boolean reasoning-based discretization. Equal frequency scaling fixes a number of intervals and examines the histogram of each variable, then determines cuts so that approximately the same number of samples fall into each of the intervals. Expert's knowledge-based discretization determines cuts according to the knowledge of domain experts obtained through literature review or interviews with experts. Minimum entropy scaling implements an algorithm based on recursively partitioning the value set of each variable so that a local measure of entropy is optimized. Naïve and Boolean reasoning-based discretization first finds categorical values by naïve scaling of the data, then finds the optimized discretization thresholds through Boolean reasoning.
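Of the four methods, equal frequency scaling is the simplest to illustrate. The following is a minimal sketch, not the implementation used in the paper: the function names are hypothetical, and placing each cut at the midpoint between neighbouring sorted values is an assumption.

```python
from bisect import bisect_right


def equal_frequency_cuts(values, n_intervals):
    """Return n_intervals - 1 cuts so that roughly the same number of
    samples falls into each interval.

    Assumption: each cut is the midpoint between the two sorted values
    on either side of the boundary. Heavy ties can produce repeated cuts.
    """
    if n_intervals < 2:
        raise ValueError("n_intervals must be at least 2")
    ordered = sorted(values)
    n = len(ordered)
    if n < n_intervals:
        raise ValueError("need at least one sample per interval")
    cuts = []
    for k in range(1, n_intervals):
        idx = k * n // n_intervals  # first index of the k-th interval
        cuts.append((ordered[idx - 1] + ordered[idx]) / 2)
    return cuts


def discretize(values, cuts):
    """Replace each numeric value with the index of its interval."""
    return [bisect_right(cuts, v) for v in values]


# Made-up indicator values for illustration only.
samples = [5, 1, 9, 3, 7, 11, 2, 8, 4, 10, 6, 12]
cuts = equal_frequency_cuts(samples, 3)
codes = discretize(samples, cuts)
```

All values inside one interval receive the same integer code, which is the categorical form that rough set analysis requires.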
Although rough set analysis is promising for market timing, there is little research on the impact of the various data discretization methods on trading performance using rough set analysis. In this study, we compare stock market timing models using rough set analysis with various data discretization methods. The research data used in this study are the KOSPI 200 from May 1996 to October 1998. The KOSPI 200 is the underlying index of the KOSPI 200 futures, which is the first derivative instrument in the Korean stock market. The KOSPI 200 is a market value weighted index which consists of 200 stocks selected by criteria on liquidity and their status in the corresponding industry, including manufacturing, construction, communication, electricity and gas, distribution and services, and financing. The total number of samples is 660 trading days. In addition, this study uses popular technical indicators as independent variables. The experimental results show that the most profitable method for the training sample is naïve and Boolean reasoning-based discretization, but expert's knowledge-based discretization is the most profitable method for the validation sample. In addition, expert's knowledge-based discretization produced robust performance for both the training and validation samples. We also compared rough set analysis with a decision tree, using C4.5 for the comparison. The results show that rough set analysis with expert's knowledge-based discretization produced more profitable rules than C4.5.

NFC-based Smartwork Service Model Design (NFC 기반의 스마트워크 서비스 모델 설계)

Park, Arum;Kang, Min Su;Jun, Jungho;Lee, Kyoung Jun
- Journal of Intelligence and Information Systems
- /
- v.19 no.2
- /
- pp.157-175
- /
- 2013
Since the Korean government announced its 'Smartwork promotion strategy' in 2010, Korean firms and government organizations have started to adopt smartwork. However, smartwork has been implemented only in a few large enterprises and government organizations rather than in SMEs (small and medium enterprises). In the USA, both Yahoo! and Best Buy have stopped their flexible work programs because of reported low productivity and job loafing problems. In addition, from the literature on smartwork, we identified obstacles to smartwork adoption and categorized them into three types: institutional, organizational, and technological. The first category, institutional obstacles, includes the difficulty of defining smartwork performance evaluation metrics, the lack of readiness of organizational processes, the limitation of smartwork types and models, the lack of employee participation in the smartwork adoption procedure, the high cost of building a smartwork system, and insufficient government support. The second category, organizational obstacles, includes the limitation of the organizational hierarchy, wrong perceptions of employees and employers, difficulty in close collaboration, low productivity with remote coworkers, insufficient understanding of remote working, and the lack of training about smartwork. The third category, technological obstacles, includes security concerns about mobile work, the lack of specialized solutions, and the lack of adoption and operation know-how. To overcome the current problems of smartwork in practice and the obstacles reported in the literature, we suggest a novel smartwork service model based on NFC (Near Field Communication). The NFC-based smartwork service model is composed of an NFC-based smartworker networking service and an NFC-based smartwork space management service. The NFC-based smartworker networking service comprises an NFC-based communication/SNS service and an NFC-based recruiting/job-seeking service.
The NFC-based communication/SNS service model supplements the key shortcomings of the existing smartwork service model. By connecting to a company's existing legacy system through NFC tags and systems, the low productivity and the difficulty of collaboration and attendance management can be overcome, since managers can get work processing, work time, and work space information about employees, and employees can communicate with coworkers in real time and get their location information. In short, this service model has features such as affordable system cost, provision of location-based information, and the possibility of knowledge accumulation. The NFC-based recruiting/job-seeking service provides new value by linking the NFC tag service and sharing economy sites. This service model has features such as ease of service attachment and removal, efficient space-based work provision, easy search of location-based recruiting/job-seeking information, and system flexibility. It combines the advantages of sharing economy sites with the advantages of NFC. Through cooperation with sharing economy sites, the model can provide recruiters with human resources seeking not only long-term but also short-term work. Additionally, SMEs (small and medium-sized enterprises) can easily find job seekers by attaching NFC tags to any space where qualified human resources may be located. In short, this service model supports efficient distribution of human resources by providing the locations of job hunters and job applicants. The NFC-based smartwork space management service can promote smartwork by linking NFC tags attached to the work space with the existing smartwork system. This service has features such as low cost, provision of indoor and outdoor location information, and customized service. In particular, this model can help small companies adopt a smartwork system because it is lightweight and cost-effective compared to existing smartwork systems.
This paper proposes the scenarios of the service models, the roles and incentives of the participants, and a comparative analysis. The superiority of the NFC-based smartwork service model is shown by comparing the new service models with the existing service models. The service model can expand the scope of enterprises and organizations that adopt smartwork and the scope of employees who take advantage of smartwork.
https://doi.org/10.13088/jiis.2013.19.2.157

The Prediction of Export Credit Guarantee Accident using Machine Learning (기계학습을 이용한 수출신용보증 사고예측)

Cho, Jaeyoung;Joo, Jihwan;Han, Ingoo
- Journal of Intelligence and Information Systems
- /
- v.27 no.1
- /
- pp.83-102
- /
- 2021
The government recently announced various policies for developing the big-data and artificial intelligence fields, providing a great opportunity to the public through the disclosure of high-quality data held by public institutions. KSURE (Korea Trade Insurance Corporation) is a major public institution for financial policy in Korea, and thus the company is strongly committed to backing export companies with various systems. Nevertheless, there are still few cases of realized business models based on big-data analyses. In this situation, this paper aims to develop a new business model which can be applied to ex-ante prediction of the likelihood of a credit guarantee insurance accident. We utilize internal data from KSURE, which supports export companies in Korea, and apply machine learning models. Then, we compare performance among predictive models including Logistic Regression, Random Forest, XGBoost, LightGBM, and DNN (Deep Neural Network). For decades, many researchers have tried to find better models to help predict bankruptcy, since ex-ante prediction is crucial for corporate managers, investors, creditors, and other stakeholders. The prediction of financial distress or bankruptcy originated with Smith (1930), Fitzpatrick (1932), and Merwin (1942). One of the most famous models is Altman's Z-score model (Altman, 1968), which is based on multiple discriminant analysis and is still widely used in both research and practice. Altman's score model utilizes five key financial ratios to predict the probability of bankruptcy in the next two years. Ohlson (1980) introduces the logit model to address some limitations of previous models. Furthermore, Elmer and Borowski (1988) develop and examine a rule-based, automated system which conducts the financial analysis of savings and loans.
Since the 1980s, researchers in Korea have studied the prediction of financial distress or bankruptcy. Kim (1987) analyzes financial ratios and develops a prediction model. Also, Han et al. (1995, 1996, 1997, 2003, 2005, 2006) construct prediction models using various techniques including artificial neural networks. Yang (1996) introduces multiple discriminant analysis and the logit model. Besides, Kim and Kim (2001) utilize artificial neural network techniques for ex-ante prediction of insolvent enterprises. Since then, many scholars have tried to predict financial distress or bankruptcy more precisely based on diverse models such as Random Forest or SVM. One major distinction of our research from previous research is that we examine the predicted probability of default for each sample case, rather than only the classification accuracy of each model for the entire sample. Most predictive models in this paper show a classification accuracy of about 70% on the entire sample. Specifically, the LightGBM model shows the highest accuracy of 71.1% and the Logit model the lowest accuracy of 69%. However, we confirm that these results are open to multiple interpretations. In the business context, more emphasis must be placed on minimizing type 2 error, which causes more harmful operating losses for the guaranty company. Thus, we also compare the classification accuracy by splitting the predicted probability of default into ten equal intervals. When we examine the classification accuracy for each interval, the Logit model has the highest accuracy of 100% for the 0~10% interval of the predicted probability of default, but a relatively lower accuracy of 61.5% for the 90~100% interval.
On the other hand, Random Forest, XGBoost, LightGBM, and DNN indicate more desirable results, since they show a higher level of accuracy for both the 0~10% and 90~100% intervals of the predicted probability of default but a lower level of accuracy around 50%. Regarding the distribution of samples across the predicted probability of default, both the LightGBM and XGBoost models place a relatively large number of samples in the 0~10% and 90~100% intervals. Although the Random Forest model has an advantage in classification accuracy on a small number of cases, LightGBM or XGBoost could be a more desirable model since they classify a large number of cases into the two extreme intervals of the predicted probability of default, even allowing for their relatively low classification accuracy. Considering the importance of type 2 error and total prediction accuracy, XGBoost and DNN show superior performance. Random Forest and LightGBM follow with good results, while logistic regression shows the worst performance. However, each predictive model has a comparative advantage under different evaluation standards. For instance, the Random Forest model shows almost 100% accuracy for samples which are expected to have a high probability of default. Collectively, we can construct more comprehensive ensemble models which contain multiple machine learning classification models and conduct majority voting to maximize overall performance.
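The per-interval comparison described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function name, the 0.5 decision threshold, and the toy data are assumptions.

```python
def accuracy_by_interval(probs, labels, n_bins=10, threshold=0.5):
    """Classification accuracy within each equal-width interval of the
    predicted probability of default.

    Assumptions: a case is predicted as default (1) when its probability
    is at least `threshold`; an interval with no cases yields None.
    """
    correct = [0] * n_bins
    total = [0] * n_bins
    for p, y in zip(probs, labels):
        b = min(int(p * n_bins), n_bins - 1)  # p == 1.0 goes in the last bin
        predicted = 1 if p >= threshold else 0
        total[b] += 1
        correct[b] += int(predicted == y)
    return [c / t if t else None for c, t in zip(correct, total)]


# Made-up predictions and true labels (1 = default), for illustration only.
probs = [0.05, 0.08, 0.95, 0.92, 0.55, 0.45]
labels = [0, 0, 1, 0, 1, 1]
acc = accuracy_by_interval(probs, labels)
```

Reading the first and last entries of `acc` side by side for each model gives the kind of comparison reported for the 0~10% and 90~100% intervals.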
https://doi.org/10.13088/jiis.2021.27.1.083
