• Title/Summary/Keyword: Big Business

Analysis of News Agenda Using Text mining and Semantic Network Analysis: Focused on COVID-19 Emotions (텍스트 마이닝과 의미 네트워크 분석을 활용한 뉴스 의제 분석: 코로나 19 관련 감정을 중심으로)

  • Yoo, So-yeon;Lim, Gyoo-gun
    • Journal of Intelligence and Information Systems / v.27 no.1 / pp.47-64 / 2021
  • The global spread of COVID-19 has affected not only many parts of daily life but also many other areas, including the economy and society. As the number of confirmed cases and deaths increases, medical staff and the public are said to be experiencing psychological problems such as anxiety, depression, and stress. The collective tragedy that accompanies the epidemic raises fear and anxiety, which are known to cause enormous disruptions to the behavior and psychological well-being of many people. Long-term negative emotions can reduce people's immunity and destroy their physical balance, so it is essential to understand people's psychological state during COVID-19. This study suggests a method of monitoring media news that reflects the current situation, in which the prolonged pandemic requires striving for psychological as well as physical quarantine. It also shows how an easier method of analyzing social media networks applies to such cases. The aim of this study is to assist health policymakers in fast and complex decision-making processes. News plays a major role in setting the policy agenda. Among the various major media, news headlines are considered important in the field of communication science because they summarize the core content that the media want to convey to their audiences. The news data used in this study were collected using "Bigkinds", which was created by integrating big data technology. With the collected news data, keywords were classified through text mining, and the relationships between words were visualized through semantic network analysis between keywords. Using the KrKwic program, a Korean semantic network analysis tool, text mining was performed and word frequencies were calculated to identify keywords easily.
The frequency of words appearing in the keywords of articles related to COVID-19 emotions was checked and visualized in a word cloud. 'China', 'anxiety', 'situation', 'mind', 'social', and 'health' appeared with high frequency in relation to COVID-19 emotions. In addition, UCINET, a specialized social network analysis program, was used to analyze connection centrality and perform cluster analysis, and the resulting graph was visualized using NetDraw. The analysis of connection centrality showed that the most central keywords in the keyword-centric network were 'psychology', 'COVID-19', 'blue', and 'anxiety'. The co-occurrence frequency network of the keywords appearing in news headlines was visualized as a graph. The thickness of a line on the graph is proportional to the co-occurrence frequency, so two words that frequently appear together are connected by a thick line. The 'COVID-blue' pair is displayed with the boldest line, and the 'COVID-emotion' and 'COVID-anxiety' pairs are displayed with relatively thick lines. 'Blue' in relation to COVID-19 means depression, and it was confirmed that COVID-19 and depression are keywords that deserve attention now. The research methodology used in this study has the advantage of quickly measuring social phenomena and changes while reducing costs. By analyzing news headlines, we were able to identify people's feelings and perceptions on issues related to COVID-19 depression and to identify the main agendas to be analyzed by deriving important keywords. By presenting and visualizing the subjects and important keywords related to COVID-19 emotions at once, the study can offer medical policy managers a variety of perspectives when identifying and researching the phenomenon.
The results are expected to serve as basic data for support, treatment, and service development for psychological quarantine issues related to COVID-19.
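
The co-occurrence network described in this abstract can be illustrated with a minimal Python sketch. It is not the KrKwic/UCINET workflow the authors used: the headline keywords are hypothetical, and "connection centrality" is assumed here to mean normalized degree centrality.

```python
from collections import Counter
from itertools import combinations

# Hypothetical keyword lists, one per headline (not data from the study).
headlines = [
    ["COVID", "blue", "anxiety"],
    ["COVID", "blue", "psychology"],
    ["COVID", "emotion"],
    ["COVID", "blue"],
    ["COVID", "anxiety"],
]

def cooccurrence(docs):
    """Count how many headlines contain each unordered keyword pair."""
    pairs = Counter()
    for words in docs:
        for a, b in combinations(sorted(set(words)), 2):
            pairs[(a, b)] += 1
    return pairs

def degree_centrality(pairs):
    """Normalized degree: number of neighbors divided by (n - 1)."""
    neighbors = {}
    for a, b in pairs:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)
    n = len(neighbors)
    return {w: len(nb) / (n - 1) for w, nb in neighbors.items()}

pairs = cooccurrence(headlines)            # edge weight = line thickness
centrality = degree_centrality(pairs)
strongest_pair, weight = pairs.most_common(1)[0]
print(strongest_pair, weight)
print(sorted(centrality.items(), key=lambda kv: -kv[1]))
```

In this toy data the heaviest edge is the pair that co-occurs in the most headlines, which corresponds to the thickest line in the visualized graph.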

A Study on Industry-specific Sustainability Strategy: Analyzing ESG Reports and News Articles (산업별 지속가능경영 전략 고찰: ESG 보고서와 뉴스 기사를 중심으로)

  • WonHee Kim;YoungOk Kwon
    • Journal of Intelligence and Information Systems / v.29 no.3 / pp.287-316 / 2023
  • As the global energy crisis and the COVID-19 pandemic have emerged as social issues, there is a growing demand for companies to move away from profit-centric business models and embrace sustainable management that balances environmental, social, and governance (ESG) factors. ESG activities of companies vary across industries, and industry-specific weights are applied in ESG evaluations. Therefore, it is important to develop strategic management approaches that reflect the characteristics of each industry and the importance of each ESG factor. Additionally, as the focus on ESG disclosures strengthens, specific guidelines are needed to identify and report on the sustainable management activities of domestic companies. To understand corporate sustainability strategies, analyzing ESG reports and news articles by industry can help identify strategic characteristics in specific industries. However, each company has its own unique strategies and report structures, making it difficult to grasp detailed trends or action items. In our study, we analyzed ESG reports (2019-2021) and news articles (2019-2022) of six companies in the 'Finance,' 'Manufacturing,' and 'IT' sectors to examine the sustainability strategies of leading domestic ESG companies. Text mining techniques such as keyword frequency analysis and topic modeling were applied to identify industry-specific, ESG element-specific management strategies and issues. The analysis revealed that in the 'Finance' sector, customer-centric management strategies and efforts to promote an inclusive culture within and outside the company were prominent. Strategies addressing climate change, such as carbon neutrality and expanding green finance, were also emphasized. In the 'Manufacturing' sector, the focus was on creating sustainable communities through occupational health and safety issues, sustainable supply chain management, low-carbon technology development, and eco-friendly investments to achieve carbon neutrality.
In the 'IT' sector, there was a tendency to focus on technological innovation and digital responsibility to enhance social value through technology. Furthermore, the key issues identified in the ESG factors were as follows: under the 'Environmental' element, issues such as greenhouse gas and carbon emission management, industry-specific eco-friendly activities, and green partnerships were identified. Under the 'Social' element, key issues included social contribution activities through stakeholder engagement, supporting the growth and coexistence of members and partner companies, and enhancing customer value through stable service provision. Under the 'Governance' element, key issues were identified as strengthening board independence through the appointment of outside directors, risk management and communication for sustainable growth, and establishing transparent governance structures. The exploration of the relationship between ESG disclosures in reports and ESG issues in news articles revealed that the sustainability strategies disclosed in reports were aligned with the issues related to ESG disclosed in news articles. However, there was a tendency to strengthen ESG activities for prevention and improvement after negative media coverage that could have a negative impact on corporate image. Additionally, environmental issues were mentioned more frequently in news articles compared to ESG reports, with environmental-related keywords being emphasized in the 'Finance' sector in the reports. Thus, ESG reports and news articles shared some similarities in content due to the sharing of information sources. However, the impact of media coverage influenced the emphasis on specific sustainability strategies, and the extent of mentioning environmental issues varied across documents. Based on our study, the following contributions were derived. 
From a practical perspective, companies need to consider their characteristics and establish sustainability strategies that align with their capabilities and situations. From an academic perspective, unlike previous studies on ESG strategies, we present a subdivided methodology through analysis considering the industry-specific characteristics of companies.

The Mediating Effect of Experiential Value on Customers' Perceived Value of Digital Content: China's Anti-virus Program Market (경험개치대소비자대전자내용적인지개치적중개영향(经验价值对消费者对电子内容的认知价值的中介影响): 중국살독연건시장(中国杀毒软件市场))

  • Jia, Weiwei;Kim, Sae-Bum
    • Journal of Global Scholars of Marketing Science / v.20 no.2 / pp.219-230 / 2010
  • Digital content is making big changes to our daily lives while bringing opportunities and challenges for companies. Creative firms integrate pictures, text, video, audio, and data through digitalization to develop new products or services and create digital experiences that promote their brands. Most articles on digital content in the literature address its basic concepts or the development of its marketing. Compared with traditional value chains for common products or services, the digital content industry seems to have more potential value. Because quite a bit of digital content is free to the consumer, price is not necessarily perceived as an indicator of the quality or value of information (Rowley 2008). It is evident that a current theme in digital content is the issue of "value," and research on customers' perceived value of digital content is a necessity. This article argues that experiential value has an advantage in customers' evaluations of digital content. Two different but related contributions to the understanding of the "value" of digital content are made here. First, based on a comparison of digital content with products and services, the article proposes two key characteristics that make an experiential strategy available for digital content: intangibility and near-zero reproduction cost. Second, based on a discussion of the gap between a company's idealized value and a customer's perceived value, the article emphasizes that the prices and pricing of digital content differ from those of products and services. As a result of intangibility, prices may not reflect customer value. In addition, the cost of digital content in the development stage may be very high while reproduction costs shrink dramatically. Because of the value gap mentioned above, pricing policies also vary across different kinds of digital content.
For example, a flat price policy is generally used for movies and music (Magiera 2001; Netherby 2002), while for digital content in continuous demand, such as online games and anti-virus programs, pricing involves a more complicated matter of utility and competitive price levels. Digital content companies have to explore various kinds of strategies to overcome this gap. Rethinking marketing solutions such as advertisements, images, and word-of-mouth and their effect on customers' perceived value becomes essential. China's digital content industry is becoming more and more globalized and is drawing special attention from different countries and regions that have their respective competitive advantages. The 2008-2009 Annual Report on the Development of China's Digital Content Industry (CCIDConsulting 2009) indicates that, driven by domestic demand and governmental policy support, the country's digital content industry maintained fast growth of some 30 percent in 2008, clearly indicating the initial stage of industry expansion. In China, anti-virus programs and other software programs that need to be updated use a quarter-based pricing policy. Customers can download a trial version for free and use it for six months or a year. If they want to use it longer, continuous payment is needed. They examine the excellence of the digital content during this trial period and decide whether to pay for continued usage. In China's music and movie industries, which are at an initial stage of development, experiential strategy has not been applied much, even though firms in other countries offer trial experiences and explore important strategies (such as letting customers listen to music for several seconds for free before downloading it). For these reasons, anti-virus programs may be representative of the digital content industry in China, and an exploratory study of the advantage of experiential value in customers' perceived value of digital content is conducted in China's anti-virus market.
In order to enhance the reliability of the survey data, this study focused on people who were experienced users of anti-virus programs. The empirical results revealed that experiential value has a positive effect on customers' perceived value of digital content. In other words, because digital content is intangible and the reproduction costs are nearly zero, customers' evaluations are based heavily on their experience. Moreover, image and word-of-mouth do not have a positive effect on perceived value, only on experiential value. That is to say, a digital content value chain is different from that of a general product or service. Experiential value has a notable advantage and mediates the effect of image and word-of-mouth on perceived value. The results of this study help provide an understanding of why free digital content downloads exist in developing countries. Customers can perceive the value of digital content only by using and experiencing it. This is also why such governments support the development of digital content. Other developing countries whose digital content business is also in the beginning stage can make use of the suggestions here. Moreover, based on the advantage of experiential strategy, companies should make more of an effort to invest in customers' experience. As a result of the characteristics and value gap of digital content, customers perceive more value in the intangible digital content only by experiencing what they really want. Moreover, because of the near-zero reproduction costs, companies can perhaps use experiential strategy to enhance customer understanding of digital content.

A Study on the Influence of IT Education Service Quality on Educational Satisfaction, Work Application Intention, and Recommendation Intention: Focusing on the Moderating Effects of Learner Position and Participation Motivation (IT교육 서비스품질이 교육만족도, 현업적용의도 및 추천의도에 미치는 영향에 관한 연구: 학습자 직위 및 참여동기의 조절효과를 중심으로)

  • Kang, Ryeo-Eun;Yang, Sung-Byung
    • Journal of Intelligence and Information Systems / v.23 no.4 / pp.169-196 / 2017
  • The fourth industrial revolution represents a revolutionary change in the business environment and its ecosystem, which is a fusion of Information Technology (IT) and other industries. In line with these recent changes, the Ministry of Employment and Labor of South Korea announced 'the Fourth Industrial Revolution Leader Training Program,' which includes five key support areas: (1) smart manufacturing, (2) Internet of Things (IoT), (3) big data including Artificial Intelligence (AI), (4) information security, and (5) bio innovation. This program gives a glimpse of the South Korean government's efforts and willingness to cultivate leading human resources with advanced IT knowledge in various fusion technology-related and newly emerging industries. To nurture excellent IT manpower in preparation for the fourth industrial revolution, the role of educational institutions capable of providing high-quality IT education services is of utmost importance. However, most IT educational institutions currently have difficulty providing customized IT education services that meet the needs of consumers (i.e., learners), because they have not broken away from the traditional framework of providing supplier-oriented education services. Previous studies have found that the provision of customized, learner-centered education services leads to high learner satisfaction, and that higher satisfaction increases not only task performance and the possibility of business application but also learners' recommendation intention. However, since research that comprehensively considers both the antecedent and consequent factors of learner satisfaction has not yet been conducted, more empirical research on this is highly desirable.
With the advent of the fourth industrial revolution, a rising interest in various convergence technologies utilizing information technology (IT) has brought with it a growing realization of the important role played by IT-related education services. However, research on the role of IT education service quality in the context of IT education is relatively scarce, even though research on general education service quality and satisfaction has been actively conducted in various contexts. In this study, therefore, the five dimensions of IT education service quality (i.e., tangibles, reliability, responsiveness, assurance, and empathy) are derived from the context of IT education, based on the SERVPERF model and related previous studies. In addition, the effects of these detailed IT education service quality factors on learners' educational satisfaction and their work application/recommendation intentions are examined. Furthermore, the moderating roles of learner position (i.e., practitioner group vs. manager group) and participation motivation (i.e., voluntary participation vs. involuntary participation) in the relationships between IT education service quality factors and learners' educational satisfaction, work application intention, and recommendation intention are also investigated. In an analysis using the structural equation model (SEM) technique based on a questionnaire given to 203 participants of IT education programs in an 'M' IT educational institution in Seoul, South Korea, tangibles, reliability, and assurance were found to have a significant effect on educational satisfaction. This educational satisfaction was found to have a significant effect on both work application intention and recommendation intention. Moreover, learner position and participation motivation were found to have a partial moderating impact on the relationship between IT education service quality factors and educational satisfaction.
This study holds academic implications in that it is one of the first studies to apply the SERVPERF model (rather than the SERVQUAL model, which has been widely adopted by prior studies) to demonstrate the influence of IT education service quality on learners' educational satisfaction, work application intention, and recommendation intention in an IT education environment. The results of this study are expected to provide practical guidance for IT education service providers who wish to enhance learners' educational satisfaction and service management efficiency.

A Study on Commodity Asset Investment Model Based on Machine Learning Technique (기계학습을 활용한 상품자산 투자모델에 관한 연구)

  • Song, Jin Ho;Choi, Heung Sik;Kim, Sun Woong
    • Journal of Intelligence and Information Systems / v.23 no.4 / pp.127-146 / 2017
  • Services using artificial intelligence have begun to emerge in daily life. Artificial intelligence is applied to consumer electronics and communications products such as artificial intelligence refrigerators and speakers. In the financial sector, Goldman Sachs improved its stock trading process using Kensho's artificial intelligence technology. For example, two stock traders could handle the work of 600 stock traders, and analytical work that took 15 people 4 weeks could be processed in 5 minutes. In particular, big data analysis through machine learning, one of the artificial intelligence fields, is actively applied throughout the financial industry. Stock market analysis and investment modeling through machine learning theory are also actively studied. The limitations of linearity in financial time series studies are overcome by using machine learning approaches such as artificial intelligence prediction models. Quantitative studies based on past stock market-related numerical data widely use artificial intelligence to forecast future movements of stock prices or indices. Various other studies have been conducted to predict the future direction of the market or the stock prices of companies by learning from large amounts of text data, such as news and comments related to the stock market. Investing in commodity assets, one class of alternative assets, is usually used to enhance the stability and safety of a traditional stock and bond portfolio. There is relatively little research on investment models for commodity assets compared with mainstream assets such as equities and bonds. Recently, machine learning techniques have been widely applied in finance, especially to stock and bond investment models, producing better trading models and changing the whole financial area. In this study, we built an investment model using the Support Vector Machine (SVM), one of the machine learning models.
There are some studies on commodity assets that focus on predicting the price of a specific commodity, but it is hard to find research on investment models that treat commodities as an asset allocation target using machine learning. We propose a method of forecasting four major commodity indices, a portfolio made of commodity futures, and individual commodity futures using an SVM model. The four major commodity indices are the Goldman Sachs Commodity Index (GSCI), Dow Jones UBS Commodity Index (DJUI), Thomson Reuters/Core Commodity CRB Index (TRCI), and Rogers International Commodity Index (RI). We selected two individual futures from each of three sectors (energy, agriculture, and metals) that are actively traded on the CME market and have enough liquidity. They are Crude Oil, Natural Gas, Corn, Wheat, Gold, and Silver Futures. We built an equally weighted portfolio of the six commodity futures for comparison with the commodity indices. We set 19 macroeconomic indicators, including stock market indices, export and import trade data, labor market data, and composite leading indicators, as the input data of the model because commodity assets are very closely related to macroeconomic activity. They are 14 US economic indicators, two Chinese economic indicators and two Korean economic indicators. The data period is from January 1990 to May 2017. We set the former 195 monthly data as training data and the latter 125 monthly data as test data. In this study, we verified that the performance of the equally weighted commodity futures portfolio rebalanced by the SVM model is better than that of the commodity indices. The prediction accuracy of the model for the commodity indices does not exceed 50% regardless of the SVM kernel function. On the other hand, the prediction accuracy of the equally weighted commodity futures portfolio is 53%. The prediction accuracy of the individual commodity futures model is better than that of the commodity indices model, especially in the agriculture and metal sectors.
The individual commodity futures portfolio excluding the energy sector outperformed the individual commodity futures portfolio covering all three sectors. To verify the validity of the model, the analysis results should be similar despite variations in the data period. We therefore also used the odd-numbered-year data as training data and the even-numbered-year data as test data, and confirmed that the analysis results are similar. As a result, when we allocate commodity assets to a traditional portfolio composed of stocks, bonds, and cash, we can obtain more effective investment performance by investing in commodity futures rather than in commodity indices. In particular, we can obtain better performance with a rebalanced commodity futures portfolio designed by the SVM model.
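
The two quantities reported in this abstract, directional prediction accuracy and the return of an equally weighted portfolio rebalanced by model signals, can be sketched as follows. This is a minimal illustration, not the authors' model: the SVM itself is omitted, the signals and returns are hypothetical, and the long-or-flat rebalancing rule is an assumption, since the abstract does not state how positions were taken.

```python
def hit_rate(signals, returns):
    """Share of (period, asset) cells where the predicted direction matches the realized sign."""
    cells = [(s, r) for srow, rrow in zip(signals, returns) for s, r in zip(srow, rrow)]
    hits = sum(1 for s, r in cells if (s > 0) == (r > 0))
    return hits / len(cells)

def equal_weight_returns(signals, returns):
    """Per-period return of a 1/N portfolio that holds an asset only when its signal is +1.

    Assumed rule: long when the forecast is up, flat (zero exposure) otherwise.
    """
    out = []
    for srow, rrow in zip(signals, returns):
        held = [r if s > 0 else 0.0 for s, r in zip(srow, rrow)]
        out.append(sum(held) / len(rrow))
    return out

# Hypothetical monthly direction forecasts (+1 up, -1 down) and realized returns
# for two futures contracts; in the study these forecasts would come from the SVM.
signals = [[1, -1], [1, 1], [-1, 1]]
returns = [[0.02, -0.01], [-0.01, 0.03], [0.01, 0.02]]

accuracy = hit_rate(signals, returns)
portfolio = equal_weight_returns(signals, returns)
print(round(accuracy, 3), [round(x, 4) for x in portfolio])
```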

Construction of Consumer Confidence index based on Sentiment analysis using News articles (뉴스기사를 이용한 소비자의 경기심리지수 생성)

  • Song, Minchae;Shin, Kyung-shik
    • Journal of Intelligence and Information Systems / v.23 no.3 / pp.1-27 / 2017
  • It is known that the economic sentiment index and macroeconomic indicators are closely related because economic agents' judgments and forecasts of business conditions affect economic fluctuations. For this reason, consumer sentiment or confidence provides steady fodder for business and is treated as an important piece of economic information. In Korea, the consumer sentiment index is highly relevant to both private consumption and GDP, which makes it a very important economic indicator for evaluating and forecasting the domestic economic situation. However, despite offering relevant insights into private consumption and GDP, the traditional survey-based approach to measuring consumer confidence has several limits. One possible weakness is that it takes considerable time to research, collect, and aggregate the data. If certain urgent issues arise, timely information will not be announced until the end of each month. In addition, the survey only contains information derived from questionnaire items, which means it can be difficult to capture the direct effects of newly arising issues. The survey also faces potential declines in response rates and erroneous responses. Therefore, it is necessary to find a way to complement it. For this purpose, we construct and assess an index designed to measure consumer economic sentiment using sentiment analysis. Unlike the survey-based measures, our index relies on textual analysis to extract sentiment from economic and financial news articles. In particular, text data such as news articles and SNS are timely and cover a wide range of issues; because such sources can quickly capture the economic impact of specific economic issues, they have great potential as economic indicators. There are two main approaches to the automatic extraction of sentiment from a text; we apply the lexicon-based approach, using sentiment lexicon dictionaries of words annotated with their semantic orientations.
In creating the sentiment lexicon dictionaries, we enter the semantic orientation of individual words manually, though we do not attempt a full linguistic analysis (one that involves analysis of word senses or argument structure); this is a limitation of our research, and further work in that direction remains possible. In this study, we generate a time series index of economic sentiment in the news. The construction of the index consists of three broad steps: (1) collecting a large corpus of economic news articles on the web, (2) applying lexicon-based methods for sentiment analysis of each article to score the article in terms of sentiment orientation (positive, negative, and neutral), and (3) constructing an economic sentiment index of consumers by aggregating monthly time series for each sentiment word. In line with existing scholarly assessments of the relationship between the consumer confidence index and macroeconomic indicators, any new index should be assessed for its usefulness. We examine the new index's usefulness by comparing other economic indicators to the CSI. To check the usefulness of the new index based on sentiment analysis, trend and cross-correlation analyses are carried out to analyze the relations and lagged structure. Finally, we analyze the forecasting power using one-step-ahead out-of-sample prediction. As a result, the news sentiment index correlates strongly with related contemporaneous key indicators in almost all experiments. We also find that news sentiment shocks predict future economic activity in most cases; in head-to-head comparisons, the news sentiment measures outperform survey-based sentiment indices such as the CSI. Policy makers want to understand consumer or public opinions about existing or proposed policies.
Monitoring various web media, SNS, or news articles for such opinions enables relevant government decision-makers to respond quickly. Textual data, such as news articles and social networks (Twitter, Facebook, and blogs), are generated at high speed and cover a wide range of issues; because such sources can quickly capture the economic impact of specific economic issues, they have great potential as economic indicators. Although research using unstructured data in economic analysis is in its early stages, the utilization of such data is expected to increase greatly once its usefulness is confirmed.
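
The three-step construction described in this abstract (collect articles, score each with a sentiment lexicon, aggregate by month) can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' index: the lexicon, the articles, and the net-balance aggregation formula are all hypothetical.

```python
from collections import defaultdict

# Hypothetical lexicon: word -> semantic orientation (+1 positive, -1 negative).
LEXICON = {"growth": 1, "recovery": 1, "recession": -1, "unemployment": -1}

def article_score(text):
    """Sum the orientations of the lexicon words found in the article."""
    return sum(LEXICON.get(tok, 0) for tok in text.lower().split())

def label(score):
    """Map a numeric score to positive, negative, or neutral."""
    return "positive" if score > 0 else "negative" if score < 0 else "neutral"

def monthly_index(articles):
    """Assumed aggregation: (positive - negative) / total articles, per month."""
    counts = defaultdict(lambda: {"positive": 0, "negative": 0, "neutral": 0})
    for month, text in articles:
        counts[month][label(article_score(text))] += 1
    return {
        m: (c["positive"] - c["negative"]) / sum(c.values())
        for m, c in sorted(counts.items())
    }

# Hypothetical (month, headline) pairs standing in for the collected corpus.
articles = [
    ("2017-01", "Exports drive growth and recovery"),
    ("2017-01", "Unemployment rises"),
    ("2017-01", "Central bank holds meeting"),
    ("2017-02", "Recession fears as unemployment climbs"),
]

index = monthly_index(articles)
print(index)
```

A series built this way could then be compared with a survey-based index through trend and cross-correlation analysis, as the abstract describes.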

Development of New Variables Affecting Movie Success and Prediction of Weekly Box Office Using Them Based on Machine Learning (영화 흥행에 영향을 미치는 새로운 변수 개발과 이를 이용한 머신러닝 기반의 주간 박스오피스 예측)

  • Song, Junga;Choi, Keunho;Kim, Gunwoo
    • Journal of Intelligence and Information Systems / v.24 no.4 / pp.67-83 / 2018
  • The Korean film industry, which had grown significantly every year, finally exceeded a cumulative audience of 200 million in 2013. However, starting in 2015 the Korean film industry entered a period of low growth and eventually experienced negative growth in 2016. To overcome this difficulty, stakeholders such as production companies, distribution companies, and multiplexes have attempted to maximize market returns by predicting market changes and responding to them immediately. Since a film is classified as an experiential product, it is not easy to predict its box office record and initial audience numbers before it is released. In addition, audience numbers fluctuate with a variety of factors after the film is released. Production and distribution companies therefore try to secure a guaranteed number of screens from multiplex chains when a new film opens. However, multiplex chains tend to publish the screening schedule only one week at a time and then determine the number of screenings for the forthcoming week based on the box office record and audience evaluations. Many previous studies have dealt with the prediction of box office records. Early studies attempted to identify the factors affecting the box office record. More recent studies have applied various analytic techniques to the previously identified factors in order to improve prediction accuracy and explain the effect of each factor, rather than identifying new factors. However, most previous studies are limited in that they used the total audience from opening to the end of the run as the target variable, which makes it difficult to predict and respond to dynamically changing market demand.
Therefore, the purpose of this study is to predict the weekly audience of a newly released film so that stakeholders can respond flexibly to changes in audience numbers. To that end, we considered the factors affecting box office used in previous studies and developed new factors not used before, such as the opening order of movies and the dynamics of sales. With this comprehensive set of factors, we used machine learning methods such as Random Forest, Multilayer Perceptron, Support Vector Machine, and Naive Bayes to predict the cumulative number of visitors from the first week after a film's release to the third week. At the first and second weeks, we predicted the cumulative number of visitors for the forthcoming week of a released film. At the third week, we predicted the total number of visitors to the film. In addition, we predicted the total cumulative number of visitors at both the first and second weeks using the same factors. As a result, we found that the accuracy of predicting the number of visitors for the forthcoming week was higher than that of predicting the total number in all three weeks, and that Random Forest achieved the highest accuracy among the machine learning methods we used. This study has implications in that it 1) comprehensively considered various factors that affect the box office record but were rarely addressed by previous research, such as the weekly audience rating after release, the weekly rank of the film after release, and the weekly sales share after release, and 2) sought to predict and respond to dynamically changing market demand by suggesting models that predict the weekly audience of newly released films, so that stakeholders can respond flexibly to changes in audience numbers.
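
The weekly prediction targets described in this abstract can be made concrete with a small sketch. It only shows how the targets could be laid out from weekly audience counts; the counts are hypothetical, the film is assumed to run for at least three weeks, and no predictive model is fitted.

```python
from itertools import accumulate

def weekly_targets(weekly_audience):
    """Build (week, cumulative audience so far, prediction target) rows for weeks 1-3.

    Weeks 1 and 2 target the cumulative audience of the forthcoming week;
    week 3 targets the final total, following the abstract's description.
    """
    cumulative = list(accumulate(weekly_audience))
    total = cumulative[-1]
    rows = []
    for week in (1, 2, 3):
        seen = cumulative[week - 1]
        target = cumulative[week] if week < 3 else total
        rows.append((week, seen, target))
    return rows

# Hypothetical weekly audience counts for one film (weeks 1 to 4).
rows = weekly_targets([100, 60, 30, 10])
print(rows)
```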

Clickstream Big Data Mining for Demographics based Digital Marketing (인구통계특성 기반 디지털 마케팅을 위한 클릭스트림 빅데이터 마이닝)

  • Park, Jiae;Cho, Yoonho
    • Journal of Intelligence and Information Systems
    • /
    • v.22 no.3
    • /
    • pp.143-163
    • /
    • 2016
  • The demographics of Internet users are the most basic and important source for target marketing and personalized advertising on digital marketing channels, which include email, mobile, and social media. However, it has gradually become difficult to collect the demographics of Internet users because their activities are anonymous in many cases. Although a marketing department can obtain demographics through online or offline surveys, these approaches are expensive, slow, and likely to include false statements. Clickstream data is the record an Internet user leaves behind while visiting websites. As the user clicks anywhere on a webpage, the activity is logged in semi-structured website log files. Such data shows which pages users visited, how long they stayed, how often they visited, when they usually visited, which sites they prefer, what keywords they used to find a site, whether they purchased anything, and so forth. For this reason, some researchers have tried to infer the demographics of Internet users from their clickstream data. They derived various independent variables likely to be correlated with the demographics. The variables include search keywords, frequency and intensity by time, day, and month, the variety of websites visited, text information from the web pages visited, and so on. The demographic attributes to be predicted also vary from paper to paper and cover gender, age, job, location, income, education, marital status, and presence of children. A variety of data mining methods, such as LSA, SVM, decision trees, neural networks, logistic regression, and k-nearest neighbors, were used to build prediction models. However, previous research has not yet identified which data mining method is appropriate for predicting each demographic variable. Moreover, the independent variables studied so far need to be reviewed, combined as needed, and evaluated in order to build the best prediction model. 
The objective of this study is to choose the clickstream attributes most likely to be correlated with the demographics from the results of previous research, and then to identify which data mining method is best suited to predicting each demographic attribute. Among the demographic attributes, this paper focuses on predicting gender, age, marital status, residence, and job. From the results of previous research, 64 clickstream attributes are applied to predict the demographic attributes. The overall process of predictive model building is composed of 4 steps. In the first step, we create user profiles which include 64 clickstream attributes and 5 demographic attributes. The second step performs dimension reduction of the clickstream variables to address the curse of dimensionality and the overfitting problem. We utilize three approaches, based on decision trees, PCA, and cluster analysis. In the third step, we build alternative predictive models for each demographic variable. SVM, neural network, and logistic regression are used for modeling. The last step evaluates the alternative models in terms of model accuracy and selects the best model. For the experiments, we used clickstream data which represents 5 demographics and 16,962,705 online activities for 5,000 Internet users. IBM SPSS Modeler 17.0 was used for our prediction process, and 5-fold cross validation was conducted to enhance the reliability of our experiments. The experimental results verify that there is a specific data mining method well suited to each demographic variable. For example, age prediction performs best when using decision tree based dimension reduction with a neural network, whereas the prediction of gender and marital status is most accurate when applying SVM without dimension reduction. 
We conclude that the online behaviors of Internet users, captured through clickstream data analysis, can be used to predict their demographics and can thereby be applied to digital marketing.
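
The 5-fold cross validation mentioned in the abstract above can be sketched in a few lines. The study itself used IBM SPSS Modeler 17.0, so the following is only an illustrative stand-in: the helper names (`k_fold_indices`, `cross_validate`, `fit_majority`, `predict_majority`) are hypothetical, and a trivial majority-class baseline takes the place of the SVM, neural network, and logistic regression models.

```python
import random
from collections import Counter


def k_fold_indices(n_samples, k=5, seed=0):
    """Shuffle sample indices and deal them into k disjoint folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def cross_validate(features, labels, fit, predict, k=5, seed=0):
    """Return mean held-out accuracy over k folds (assumes n_samples >= k)."""
    scores = []
    for held_out in k_fold_indices(len(labels), k, seed):
        held = set(held_out)
        train_idx = [i for i in range(len(labels)) if i not in held]
        # Fit on the k-1 training folds only.
        model = fit([features[i] for i in train_idx],
                    [labels[i] for i in train_idx])
        # Score on the fold that was left out.
        correct = sum(predict(model, features[i]) == labels[i]
                      for i in held_out)
        scores.append(correct / len(held_out))
    return sum(scores) / k


def fit_majority(train_x, train_y):
    """Baseline model: remember the most frequent training label."""
    return Counter(train_y).most_common(1)[0][0]


def predict_majority(model, x):
    """Baseline prediction: always the remembered label."""
    return model


if __name__ == "__main__":
    # Made-up profiles: one clickstream feature, gender as the label.
    xs = [[v] for v in range(10)]
    ys = ["F", "M", "F", "F", "M", "F", "F", "M", "F", "F"]
    print(cross_validate(xs, ys, fit_majority, predict_majority))
```

Because each candidate model is scored only on data it was not fitted to, comparing the mean accuracies across models gives a fairer basis for selecting the best one per demographic attribute.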

Factors Affecting International Transfer Pricing of Multinational Enterprises in Korea (외국인투자기업의 국제이전가격 결정에 영향을 미치는 환경 및 기업요인)

  • Jun, Tae-Young;Byun, Yong-Hwan
    • Korean small business review
    • /
    • v.31 no.2
    • /
    • pp.85-102
    • /
    • 2009
  • With the continued globalization of world markets, transfer pricing has become one of the dominant sources of controversy in international taxation. Transfer pricing is the process by which a multinational corporation calculates a price for goods and services that are transferred to affiliated entities. Consider a Korean electronics enterprise that buys supplies from its own subsidiary located in China. How much the Korean parent company pays its subsidiary will determine how much profit the Chinese unit reports for local taxes. If the parent company pays above normal market prices, it may appear to have a poor profit, even if the group as a whole shows a respectable profit margin. In this way, transfer prices affect the taxable income reported in each country in which the multinational enterprise operates. Its importance lies in the fact that around 60% of international trade involves transactions between two related parts of multinationals, according to the OECD. Multinational enterprises (hereafter MEs) put much effort into utilizing organizational advantages to make global investments. MEs wish to minimize their tax burden, so they spend a fortune on economists and accountants to justify transfer prices that suit their tax needs. In contrast, local governments are not prepared to cope with MEs' powerful financial instruments. Tax authorities in each country wish to ensure that the tax base of any ME is divided fairly. Thus, both tax authorities and MEs have a vested interest in the way in which a transfer price is determined, and this is why MEs' international transfer prices are at the center of taxation disputes. Transfer pricing issues and practices are sometimes difficult for regulators to control because the tax administration does not have enough staff with the knowledge and resources necessary to understand them. 
The authors examine transfer pricing practices to provide policy makers with resources useful in designing tax incentives and regulation schemes. This study focuses on identifying the relevant business and environmental factors that could influence the international transfer pricing of MEs. From this perspective, we empirically investigate how management's perception of related variables influences the choice of international transfer pricing methods. We believe that this research is particularly useful in the design of tax policy because, with its assistance, a tax administration with a limited budget can concentrate on a few selected factors. The data consist of questionnaire responses from foreign firms in Korea with investment balances exceeding one million dollars at the end of 2004. We mailed questionnaires to 861 managers in charge of the accounting departments of each company, resulting in 121 valid responses. Seventy-six percent of the sample firms are classified as small and medium sized enterprises with assets below 100 billion Korean won. Among the transfer pricing methods, cost-based transfer pricing is the most popular, adopted by 60 firms. The market-based method is used by 31 firms, and 13 firms reported the resale-pricing method. Regarding the nationalities of foreign investors, Japanese and American investors constitute most of the sample. Logistic regressions were performed for statistical analysis. The dependent variable is binary, indicating whether the method of international transfer pricing is market-based or cost-based. This binary classification is founded on the belief that the market-based method is a relatively objective way of pricing compared with the cost-based methods. Cost-based pricing is assumed to give managers flexibility in transfer pricing decisions. 
Therefore, local regulatory agencies are thought to prefer market-based pricing over cost-based pricing. The independent variables comprise eight factors: corporate tax rate, tariffs, relations with local tax authorities, tax audit, equity ratios of local investors, volume of internal trade, sales volume, and product life cycle. The first four variables are included in the model because taxation lies at the center of transfer pricing disputes, so identifying the impact of these variables in the Korean business environment is much needed. Equity ratio is included to represent the interest of local partners. Volume of internal trade was sometimes employed in previous research to check the pricing behavior of managers, so we have followed that practice in this paper. Product life cycle is used as a surrogate for competition in local markets. The control variables are firm size and nationality of foreign investors. Firm size is controlled using a dummy variable indicating whether or not the firm is small or medium sized. This is because some researchers report that big firms behave differently from small and medium sized firms in transfer pricing. The other control variable is also expressed as a dummy variable indicating whether or not the entrepreneur is American. That is because some prior studies conclude that the American management style is different in that it limits branch managers' freedom of decision. Reviewing the statistical results, we found that managers prefer the cost-based method over the market-based method as the importance of corporate taxes and tariffs increases. This result means that managers need flexibility to lessen the tax burden when they feel taxes are important. They also prefer the cost-based method as the product life cycle matures, which means that they support subsidiaries in local market competition using cost-based transfer pricing. 
In contrast, as the relationship with local tax authorities becomes more important, managers prefer the market-based method. That is because market-based pricing is a better way to maintain good relations with tax officials. Other variables, namely tax audit, volume of internal transactions, sales volume, and local equity ratio, showed only insignificant influence. Additionally, we replaced the two tax variables (corporate taxes and tariffs) with data on the top marginal tax rate and mean tariff rate of each country, and performed another regression to see whether the results would differ from the former one. We found a different result for mean tariffs, which showed only an insignificant influence on the dependent variable. We conjecture that each company in the sample pays tariffs at a specific rate applied only to that company, which could be far from the mean tariff rate. Therefore, we conclude that more detailed data showing the tariffs paid by each company are needed to check the role of this variable. Considering that the present paper relied heavily on questionnaires, an effort to build a reliable database is needed to enhance the reliability of the research.
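
The logistic regression with a binary dependent variable described in the abstract above can be sketched as follows. This is a minimal illustration, not the authors' model: the data are invented, the coding (1 = market-based, 0 = cost-based) and the single predictor (a hypothetical score for the perceived importance of corporate tax) are assumptions, and plain batch gradient descent stands in for whatever statistical package the study used.

```python
import math


def sigmoid(z):
    """Logistic function mapping a linear score to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(rows, labels, lr=0.1, epochs=2000):
    """Fit weights and bias by batch gradient descent on the log-loss."""
    n_features = len(rows[0])
    weights = [0.0] * n_features
    bias = 0.0
    n = len(rows)
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(rows, labels):
            p = sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))
            err = p - y  # derivative of the log-loss w.r.t. the linear score
            for j, xi in enumerate(x):
                grad_w[j] += err * xi
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


def predict_proba(weights, bias, x):
    """Probability that the firm uses the market-based method (label 1)."""
    return sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))


if __name__ == "__main__":
    # Invented data: perceived tax importance (1..6) and chosen method.
    tax_importance = [[1], [2], [3], [4], [5], [6]]
    market_based = [1, 1, 0, 1, 0, 0]
    w, b = fit_logistic(tax_importance, market_based)
    print(w, b, predict_proba(w, b, [1]), predict_proba(w, b, [6]))
```

In this coding, a negative coefficient on the tax variable corresponds to the pattern the abstract reports: the more important taxes are perceived to be, the less likely the market-based method is chosen.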

The Pattern Analysis of Financial Distress for Non-audited Firms using Data Mining (데이터마이닝 기법을 활용한 비외감기업의 부실화 유형 분석)

  • Lee, Su Hyun;Park, Jung Min;Lee, Hyoung Yong
    • Journal of Intelligence and Information Systems
    • /
    • v.21 no.4
    • /
    • pp.111-131
    • /
    • 2015
  • Only a handful of studies have been conducted on the pattern analysis of corporate distress, compared with research on bankruptcy prediction. The few that exist mainly focus on audited firms because financial data collection is easier for these firms. But in reality, corporate financial distress is a far more common and critical phenomenon for non-audited firms, which mainly comprise small and medium sized firms. The purpose of this paper is to classify non-audited firms under distress according to their financial ratios using a data mining technique, the Self-Organizing Map (SOM). SOM is a type of artificial neural network that is trained using unsupervised learning to produce a lower dimensional, discretized representation of the input space of the training samples, called a map. SOM differs from other artificial neural networks in that it applies competitive learning as opposed to error-correction learning such as backpropagation with gradient descent, and in that it uses a neighborhood function to preserve the topological properties of the input space. It is one of the most popular and successful clustering algorithms. In this study, we classify the types of financially distressed firms, specifically non-audited firms. In the empirical test, we collect 10 financial ratios of 100 non-audited firms under distress in 2004 for the previous two years (2002 and 2003). Using these financial ratios and the SOM algorithm, five distinct patterns were distinguished. In pattern 1, financial distress was very serious in almost all financial ratios. 12% of the firms are included in this pattern. In pattern 2, financial distress was weak in almost all financial ratios. 14% of the firms are included in pattern 2. In pattern 3, the growth ratio was the worst among all patterns. It is speculated that the firms of this pattern may be under distress due to severe competition in their industries. Approximately 30% of the firms fell into this group. 
In pattern 4, the growth ratio was higher than in any other pattern, but the cash ratio and profitability ratio were not at the level of the growth ratio. It is concluded that the firms of this pattern came under distress while pursuing business expansion. About 25% of the firms were in this pattern. Last, pattern 5 encompassed very solvent firms. Perhaps firms of this pattern were distressed due to a bad short-term strategic decision or due to problems with the firms' entrepreneurs. Approximately 18% of the firms were in this pattern. This study makes both an academic and an empirical contribution. Academically, it uses a data mining technique (Self-Organizing Map) to classify non-audited companies, which tend to go bankrupt easily and have unstructured or easily manipulated financial data, rather than large audited firms, which have well prepared and reliable financial data. Empirically, even though the analysis relies on the financial data of non-audited firms, it is useful for finding the first-order symptoms of financial distress, which helps to forecast the firms' bankruptcy and to manage early warning signals. These are the academic and empirical contributions of this study. A limitation of this research is that it analyzes only 100 firms because of the difficulty of collecting financial data for non-audited firms, which makes it hard to conduct the analysis by category or by firm size. Also, non-financial qualitative data is crucial for the analysis of bankruptcy, so non-financial qualitative factors should be taken into account in future studies. This study sheds some light on distress prediction for non-audited small and medium sized firms in the future.
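
The competitive learning and neighborhood function that the abstract above attributes to the SOM can be illustrated with a minimal one-dimensional map. This is a sketch under stated assumptions, not the authors' configuration: the function names, the map size of five units (chosen only to echo the five reported patterns), the Gaussian neighborhood, the linear decay schedules, and the toy data are all hypothetical.

```python
import math
import random


def best_matching_unit(units, x):
    """Competitive step: index of the unit closest to x (squared distance)."""
    distances = [sum((u - v) ** 2 for u, v in zip(unit, x)) for unit in units]
    return distances.index(min(distances))


def train_som(samples, n_units=5, epochs=200, lr0=0.5, radius0=2.0, seed=0):
    """Train a 1-D SOM; returns the list of unit weight vectors."""
    rng = random.Random(seed)
    dim = len(samples[0])
    units = [[rng.random() for _ in range(dim)] for _ in range(n_units)]
    for epoch in range(epochs):
        frac = epoch / epochs
        lr = lr0 * (1.0 - frac)                       # decaying learning rate
        radius = max(radius0 * (1.0 - frac), 0.5)     # shrinking neighborhood
        for x in rng.sample(samples, len(samples)):   # shuffled pass
            bmu = best_matching_unit(units, x)
            for i, unit in enumerate(units):
                # Gaussian neighborhood: units near the winner on the map
                # move more, which preserves the topology of the input space.
                influence = math.exp(-((i - bmu) ** 2) / (2.0 * radius ** 2))
                for d in range(dim):
                    unit[d] += lr * influence * (x[d] - unit[d])
    return units


if __name__ == "__main__":
    # Invented, already-scaled "financial ratio" vectors for six firms.
    firms = [[0.1, 0.2], [0.15, 0.25], [0.8, 0.9],
             [0.85, 0.8], [0.5, 0.1], [0.45, 0.15]]
    som = train_som(firms)
    print([best_matching_unit(som, f) for f in firms])
```

After training, each firm is assigned to the pattern of its best matching unit; no error signal or class label is used at any point, which is the contrast with backpropagation drawn in the abstract.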