• Title/Summary/Keyword: Social Media Performance

Search Result 250, Processing Time 0.026 seconds

Reliability Analysis of VOC Data for Opinion Mining (오피니언 마이닝을 위한 VOC 데이타의 신뢰성 분석)

  • Kim, Dongwon;Yu, Song Jin
    • Journal of Intelligence and Information Systems
    • /
    • v.22 no.4
    • /
    • pp.217-245
    • /
    • 2016
  • The purpose of this study is to verify how 7 sentiment domains extracted through sentiment analysis from social media have an influence on business performance. It consists of three phases. In phase I, we constructed the sentiment lexicon after crawling 45,447 pieces of VOC (Voice of the Customer) on 26 auto companies from the car community and extracting the POS information and built a seven-sensitive domains. In phase II, in order to retain the reliability of experimental data, we examined auto-correlation analysis and PCA. In phase III, we investigated how 7 domains impact on the market share of three major (GM, FCA, and VOLKSWAGEN) auto companies by using linear regression analysis. The findings from the auto-correlation analysis proved auto-correlation and the sequence of the sentiments, and the results from PCA reported the 7 sentiments connected with positivity, negativity and neutrality. As a result of linear regression analysis on model 1, we indentified that the sentimental factors have a significant influence on the actual market share. In particular, not only posotive and negative sentiment domains, but neutral sentiment had significantly impacted on auto market share. As we apply the availability of data to the market, and take advantage of auto-correlation of the market-related information and the sentiment, the findings will be a huge contribution to other researches on sentiment analysis as well as actual business performances in various ways.

Politics of "Imagined Ethnicity" in World Music (월드뮤직에서 "상상된 민족"의 정치학)

  • Kim, Hee-sun
    • (The) Research of the performance art and culture
    • /
    • no.22
    • /
    • pp.223-252
    • /
    • 2011
  • If we remember that modern world history has built systems of meaning through the concepts "difference," "different," and "other-ness" and has constructed new identity based on opposing hierarchy, music anthropology which tried to build "difference" between the west and the non-west was thoroughly west -centered, in the sense that it has perceived the heterogeneous symbolic systems among nations, as well as the barrier between the two cultures. On the other hand, world music, which has emerged as the most attractive field in culture industry and concert-art-market by crossing over global capitals, markets, and barriers, can be considered the most post-modernist and glocal. However, it is interesting to note that world music, which has been described as post-modern and glocal, has "difference" and "different" in its basis, just like the precepts for modern music anthropology (Meintjes 1990; Guilbault 1993; Taylor 1997; Frith 2000; Feld 1988). Furthermore, one can understand that the "different" and "difference," generally termed as being "non-western," are fundamentally based on ethnic or national imagination. In this sense it is interesting and important to examine such ethnic imagination in the "non-western ethnic musics" in music anthropology and in world music. Notwithstanding the attention paid and research made by music anthropologists, they have failed to elevate the "non-western ethnic musics" to become universally communicative, and these ethnic musics were reborn as "global" and "world music," through the process of "acculturation," "derivation," and "hybridization," with the west as major site for production and consumption. Meanwhile, the audience for world music, which did not exist before the birth of world music as a term, was now born as world music emerged. They are global populace who consume the musical "difference" and "imagined ethnicity," who through their consumption are constructing new social meanings including ethnicity, race, nation, and class identity. This study, by examining current discourse, performance, and process for the world music through media and field studies and scholarly debates, attempts to understand the production and consumption of "imagined ethnicity." This will also shed light on how "ethnicity" is created and consumed, and how this is involved in the process of world music.

Investigating Dynamic Mutation Process of Issues Using Unstructured Text Analysis (비정형 텍스트 분석을 활용한 이슈의 동적 변이과정 고찰)

  • Lim, Myungsu;Kim, Namgyu
    • Journal of Intelligence and Information Systems
    • /
    • v.22 no.1
    • /
    • pp.1-18
    • /
    • 2016
  • Owing to the extensive use of Web media and the development of the IT industry, a large amount of data has been generated, shared, and stored. Nowadays, various types of unstructured data such as image, sound, video, and text are distributed through Web media. Therefore, many attempts have been made in recent years to discover new value through an analysis of these unstructured data. Among these types of unstructured data, text is recognized as the most representative method for users to express and share their opinions on the Web. In this sense, demand for obtaining new insights through text analysis is steadily increasing. Accordingly, text mining is increasingly being used for different purposes in various fields. In particular, issue tracking is being widely studied not only in the academic world but also in industries because it can be used to extract various issues from text such as news, (SocialNetworkServices) to analyze the trends of these issues. Conventionally, issue tracking is used to identify major issues sustained over a long period of time through topic modeling and to analyze the detailed distribution of documents involved in each issue. However, because conventional issue tracking assumes that the content composing each issue does not change throughout the entire tracking period, it cannot represent the dynamic mutation process of detailed issues that can be created, merged, divided, and deleted between these periods. Moreover, because only keywords that appear consistently throughout the entire period can be derived as issue keywords, concrete issue keywords such as "nuclear test" and "separated families" may be concealed by more general issue keywords such as "North Korea" in an analysis over a long period of time. This implies that many meaningful but short-lived issues cannot be discovered by conventional issue tracking. Note that detailed keywords are preferable to general keywords because the former can be clues for providing actionable strategies. To overcome these limitations, we performed an independent analysis on the documents of each detailed period. We generated an issue flow diagram based on the similarity of each issue between two consecutive periods. The issue transition pattern among categories was analyzed by using the category information of each document. In this study, we then applied the proposed methodology to a real case of 53,739 news articles. We derived an issue flow diagram from the articles. We then proposed the following useful application scenarios for the issue flow diagram presented in the experiment section. First, we can identify an issue that actively appears during a certain period and promptly disappears in the next period. Second, the preceding and following issues of a particular issue can be easily discovered from the issue flow diagram. This implies that our methodology can be used to discover the association between inter-period issues. Finally, an interesting pattern of one-way and two-way transitions was discovered by analyzing the transition patterns of issues through category analysis. Thus, we discovered that a pair of mutually similar categories induces two-way transitions. In contrast, one-way transitions can be recognized as an indicator that issues in a certain category tend to be influenced by other issues in another category. For practical application of the proposed methodology, high-quality word and stop word dictionaries need to be constructed. In addition, not only the number of documents but also additional meta-information such as the read counts, written time, and comments of documents should be analyzed. A rigorous performance evaluation or validation of the proposed methodology should be performed in future works.

A Study on Antecedents of Ethical Leadership of Power Retailers, : Focusing on the Relationship between Discount Stores and Their Suppliers (대형 유통업체 윤리적 리더십의 선행변수에 관한 연구 : 할인점과 공급업체 간 관계를 중심으로)

  • Kim, Sang-Deok
    • Journal of Distribution Research
    • /
    • v.17 no.3
    • /
    • pp.59-92
    • /
    • 2012
  • With accumulated research evidence, there is little doubt that leadership behavior is related to a wide variety of positive individual and organizational outcomes. Indeed, leadership behavior has been empirically linked to increased employee satisfaction, organizational commitment, extra effort, turnover intention, organizational citizenship behavior, and overall employee performance. Although leadership behavior has been linked to a number of positive organizational outcomes, research regarding the antecedents of such behavior is limited. Especially there is little research dealing with the antecedents of inter-organizational leadership behavior. This study interests in inter-organizational ethical leadership among marketing channel members. In both the mass media and the academic association, there has been a surge in interest in the ethical and unethical behavior of leaders. Although the corporate scandals in recent years may explain much of the mass media and popular focus, academics' interest has been limited by evidence that ethical leadership behavior is associated with both positive and negative inter-organizational processes and performances. This study tried to contribute to this body of knowledge by examining antecedents of ethical leadership. Ethical leadership is defined "the demonstration of normatively appropriate conduct through personal actions and interpersonal relationships, and the promotion of such conduct to followers through two-way communication, reinforcement, and decision-making." Ethical leaders not only inform individuals of the behefits of ethical behavior and the cost of inappropriate behavior, such leaders also set clear standards and use rewards and fair and balanced punishment to hold followers accountable for their ethical conduct. Despite the assume importance and prominence of ethical leadership among organizations, there are still many questions relating to its antecedents and consequences. One is whether the likelihood of an leading organization being perceived as an ethical leader among other following organizations in marketing channels can be predicted using its characteristics and inter-organizational relationship maintenance skills. Identifying trait and skill antecedents will aid in the development of strategies for selecting and developing ethical leaders and determining the best means to reinforce ethical behaviors. The purpose of this study is to investigate the effects of three categorized variables on ethical leadership of channel leader. To be concrete, this study develops a model of the antecedents of three conceptually distinct forms of channel leader characteristics, such as organizational traits, inter-organizational relationship maintenance strategies, and supplier management strategies, and tests the hypothesized differential effects on ethical leadership of marketing channel leaders. The reason why this study deals with discount store channel is that there is very strong inter-dependence between a discount store and its suppliers. Their strong inter-dependence makes their relationship as the relationship between a leader and suppliers and creates an atmosphere that leadership occur without difficulty. The research model is as follows. For the purpose of empirical testing, 295 respondents of suppliers of discount store channel in Korea were surveyed. The procedures included scale reliability, and discriminant and convergent validity were used to validate measures. Also, the reliability measurements traditionally employed, such as the Cronbach's alpha, were used. All the reliabilities were greater than .70. This study conducted confirmatory factor analyses to assess the validity of our measurements. All items loaded significantly on their respective constructs(with the lowest t-value being 15.2), providing support for convergent validity. We then examined composite reliability and average variance extracted(AVE). The composite reliability of each construct was greater than .70. The AVE of each construct was greater than .50. This study tested research model using Partial Least Square(PLS). The estimation of the structural equation model revealed an acceptable fit of the model to the data($r^2$=.851). Thus, This study concluded that the model fit was considered acceptable. The results of PLS are as follows. The results indicated that conscientiousness, openness, conflict management, social networks, training, fair reward had positive effects on ethical leadership of channel leaders. On the other hand, emotional insecure had negative effect and agreeableness, assurance, and inter-organizational communication had no significant effect on supply chain leadership.

  • PDF

A Folksonomy Ranking Framework: A Semantic Graph-based Approach (폭소노미 사이트를 위한 랭킹 프레임워크 설계: 시맨틱 그래프기반 접근)

  • Park, Hyun-Jung;Rho, Sang-Kyu
    • Asia pacific journal of information systems
    • /
    • v.21 no.2
    • /
    • pp.89-116
    • /
    • 2011
  • In collaborative tagging systems such as Delicious.com and Flickr.com, users assign keywords or tags to their uploaded resources, such as bookmarks and pictures, for their future use or sharing purposes. The collection of resources and tags generated by a user is called a personomy, and the collection of all personomies constitutes the folksonomy. The most significant need of the folksonomy users Is to efficiently find useful resources or experts on specific topics. An excellent ranking algorithm would assign higher ranking to more useful resources or experts. What resources are considered useful In a folksonomic system? Does a standard superior to frequency or freshness exist? The resource recommended by more users with mere expertise should be worthy of attention. This ranking paradigm can be implemented through a graph-based ranking algorithm. Two well-known representatives of such a paradigm are Page Rank by Google and HITS(Hypertext Induced Topic Selection) by Kleinberg. Both Page Rank and HITS assign a higher evaluation score to pages linked to more higher-scored pages. HITS differs from PageRank in that it utilizes two kinds of scores: authority and hub scores. The ranking objects of these pages are limited to Web pages, whereas the ranking objects of a folksonomic system are somewhat heterogeneous(i.e., users, resources, and tags). Therefore, uniform application of the voting notion of PageRank and HITS based on the links to a folksonomy would be unreasonable, In a folksonomic system, each link corresponding to a property can have an opposite direction, depending on whether the property is an active or a passive voice. The current research stems from the Idea that a graph-based ranking algorithm could be applied to the folksonomic system using the concept of mutual Interactions between entitles, rather than the voting notion of PageRank or HITS. The concept of mutual interactions, proposed for ranking the Semantic Web resources, enables the calculation of importance scores of various resources unaffected by link directions. The weights of a property representing the mutual interaction between classes are assigned depending on the relative significance of the property to the resource importance of each class. This class-oriented approach is based on the fact that, in the Semantic Web, there are many heterogeneous classes; thus, applying a different appraisal standard for each class is more reasonable. This is similar to the evaluation method of humans, where different items are assigned specific weights, which are then summed up to determine the weighted average. We can check for missing properties more easily with this approach than with other predicate-oriented approaches. A user of a tagging system usually assigns more than one tags to the same resource, and there can be more than one tags with the same subjectivity and objectivity. In the case that many users assign similar tags to the same resource, grading the users differently depending on the assignment order becomes necessary. This idea comes from the studies in psychology wherein expertise involves the ability to select the most relevant information for achieving a goal. An expert should be someone who not only has a large collection of documents annotated with a particular tag, but also tends to add documents of high quality to his/her collections. Such documents are identified by the number, as well as the expertise, of users who have the same documents in their collections. In other words, there is a relationship of mutual reinforcement between the expertise of a user and the quality of a document. In addition, there is a need to rank entities related more closely to a certain entity. Considering the property of social media that ensures the popularity of a topic is temporary, recent data should have more weight than old data. We propose a comprehensive folksonomy ranking framework in which all these considerations are dealt with and that can be easily customized to each folksonomy site for ranking purposes. To examine the validity of our ranking algorithm and show the mechanism of adjusting property, time, and expertise weights, we first use a dataset designed for analyzing the effect of each ranking factor independently. We then show the ranking results of a real folksonomy site, with the ranking factors combined. Because the ground truth of a given dataset is not known when it comes to ranking, we inject simulated data whose ranking results can be predicted into the real dataset and compare the ranking results of our algorithm with that of a previous HITS-based algorithm. Our semantic ranking algorithm based on the concept of mutual interaction seems to be preferable to the HITS-based algorithm as a flexible folksonomy ranking framework. Some concrete points of difference are as follows. First, with the time concept applied to the property weights, our algorithm shows superior performance in lowering the scores of older data and raising the scores of newer data. Second, applying the time concept to the expertise weights, as well as to the property weights, our algorithm controls the conflicting influence of expertise weights and enhances overall consistency of time-valued ranking. The expertise weights of the previous study can act as an obstacle to the time-valued ranking because the number of followers increases as time goes on. Third, many new properties and classes can be included in our framework. The previous HITS-based algorithm, based on the voting notion, loses ground in the situation where the domain consists of more than two classes, or where other important properties, such as "sent through twitter" or "registered as a friend," are added to the domain. Forth, there is a big difference in the calculation time and memory use between the two kinds of algorithms. While the matrix multiplication of two matrices, has to be executed twice for the previous HITS-based algorithm, this is unnecessary with our algorithm. In our ranking framework, various folksonomy ranking policies can be expressed with the ranking factors combined and our approach can work, even if the folksonomy site is not implemented with Semantic Web languages. Above all, the time weight proposed in this paper will be applicable to various domains, including social media, where time value is considered important.

Korean Word Sense Disambiguation using Dictionary and Corpus (사전과 말뭉치를 이용한 한국어 단어 중의성 해소)

  • Jeong, Hanjo;Park, Byeonghwa
    • Journal of Intelligence and Information Systems
    • /
    • v.21 no.1
    • /
    • pp.1-13
    • /
    • 2015
  • As opinion mining in big data applications has been highlighted, a lot of research on unstructured data has made. Lots of social media on the Internet generate unstructured or semi-structured data every second and they are often made by natural or human languages we use in daily life. Many words in human languages have multiple meanings or senses. In this result, it is very difficult for computers to extract useful information from these datasets. Traditional web search engines are usually based on keyword search, resulting in incorrect search results which are far from users' intentions. Even though a lot of progress in enhancing the performance of search engines has made over the last years in order to provide users with appropriate results, there is still so much to improve it. Word sense disambiguation can play a very important role in dealing with natural language processing and is considered as one of the most difficult problems in this area. Major approaches to word sense disambiguation can be classified as knowledge-base, supervised corpus-based, and unsupervised corpus-based approaches. This paper presents a method which automatically generates a corpus for word sense disambiguation by taking advantage of examples in existing dictionaries and avoids expensive sense tagging processes. It experiments the effectiveness of the method based on Naïve Bayes Model, which is one of supervised learning algorithms, by using Korean standard unabridged dictionary and Sejong Corpus. Korean standard unabridged dictionary has approximately 57,000 sentences. Sejong Corpus has about 790,000 sentences tagged with part-of-speech and senses all together. For the experiment of this study, Korean standard unabridged dictionary and Sejong Corpus were experimented as a combination and separate entities using cross validation. Only nouns, target subjects in word sense disambiguation, were selected. 93,522 word senses among 265,655 nouns and 56,914 sentences from related proverbs and examples were additionally combined in the corpus. Sejong Corpus was easily merged with Korean standard unabridged dictionary because Sejong Corpus was tagged based on sense indices defined by Korean standard unabridged dictionary. Sense vectors were formed after the merged corpus was created. Terms used in creating sense vectors were added in the named entity dictionary of Korean morphological analyzer. By using the extended named entity dictionary, term vectors were extracted from the input sentences and then term vectors for the sentences were created. Given the extracted term vector and the sense vector model made during the pre-processing stage, the sense-tagged terms were determined by the vector space model based word sense disambiguation. In addition, this study shows the effectiveness of merged corpus from examples in Korean standard unabridged dictionary and Sejong Corpus. The experiment shows the better results in precision and recall are found with the merged corpus. This study suggests it can practically enhance the performance of internet search engines and help us to understand more accurate meaning of a sentence in natural language processing pertinent to search engines, opinion mining, and text mining. Naïve Bayes classifier used in this study represents a supervised learning algorithm and uses Bayes theorem. Naïve Bayes classifier has an assumption that all senses are independent. Even though the assumption of Naïve Bayes classifier is not realistic and ignores the correlation between attributes, Naïve Bayes classifier is widely used because of its simplicity and in practice it is known to be very effective in many applications such as text classification and medical diagnosis. However, further research need to be carried out to consider all possible combinations and/or partial combinations of all senses in a sentence. Also, the effectiveness of word sense disambiguation may be improved if rhetorical structures or morphological dependencies between words are analyzed through syntactic analysis.

An Analysis of the Internal Marketing Impact on the Market Capitalization Fluctuation Rate based on the Online Company Reviews from Jobplanet (직원을 위한 내부마케팅이 기업의 시가 총액 변동률에 미치는 영향 분석: 잡플래닛 기업 리뷰를 중심으로)

  • Kichul Choi;Sang-Yong Tom Lee
    • Information Systems Review
    • /
    • v.20 no.2
    • /
    • pp.39-62
    • /
    • 2018
  • Thanks to the growth of computing power and the recent development of data analytics, researchers have started to work on the data produced by users through the Internet or social media. This study is in line with these recent research trends and attempts to adopt data analytical techniques. We focus on the impact of "internal marketing" factors on firm performance, which is typically studied through survey methodologies. We looked into the job review platform Jobplanet (www.jobplanet.co.kr), which is a website where employees and former employees anonymously review companies and their management. With web crawling processes, we collected over 40K data points and performed morphological analysis to classify employees' reviews for internal marketing data. We then implemented econometric analysis to see the relationship between internal marketing and market capitalization. Contrary to the findings of extant survey studies, internal marketing is positively related to a firm's market capitalization only within a limited area. In most of the areas, the relationships are negative. Particularly, female-friendly environment and human resource development (HRD) are the areas exhibiting positive relations with market capitalization in the manufacturing industry. In the service industry, most of the areas, such as employ welfare and work-life balance, are negatively related with market capitalization. When firm size is small (or the history is short), female-friendly environment positively affect firm performance. On the contrary, when firm size is big (or the history is long), most of the internal marketing factors are either negative or insignificant. We explain the theoretical contributions and managerial implications with these results.

Stock Price Prediction by Utilizing Category Neutral Terms: Text Mining Approach (카테고리 중립 단어 활용을 통한 주가 예측 방안: 텍스트 마이닝 활용)

  • Lee, Minsik;Lee, Hong Joo
    • Journal of Intelligence and Information Systems
    • /
    • v.23 no.2
    • /
    • pp.123-138
    • /
    • 2017
  • Since the stock market is driven by the expectation of traders, studies have been conducted to predict stock price movements through analysis of various sources of text data. In order to predict stock price movements, research has been conducted not only on the relationship between text data and fluctuations in stock prices, but also on the trading stocks based on news articles and social media responses. Studies that predict the movements of stock prices have also applied classification algorithms with constructing term-document matrix in the same way as other text mining approaches. Because the document contains a lot of words, it is better to select words that contribute more for building a term-document matrix. Based on the frequency of words, words that show too little frequency or importance are removed. It also selects words according to their contribution by measuring the degree to which a word contributes to correctly classifying a document. The basic idea of constructing a term-document matrix was to collect all the documents to be analyzed and to select and use the words that have an influence on the classification. In this study, we analyze the documents for each individual item and select the words that are irrelevant for all categories as neutral words. We extract the words around the selected neutral word and use it to generate the term-document matrix. The neutral word itself starts with the idea that the stock movement is less related to the existence of the neutral words, and that the surrounding words of the neutral word are more likely to affect the stock price movements. And apply it to the algorithm that classifies the stock price fluctuations with the generated term-document matrix. In this study, we firstly removed stop words and selected neutral words for each stock. And we used a method to exclude words that are included in news articles for other stocks among the selected words. Through the online news portal, we collected four months of news articles on the top 10 market cap stocks. We split the news articles into 3 month news data as training data and apply the remaining one month news articles to the model to predict the stock price movements of the next day. We used SVM, Boosting and Random Forest for building models and predicting the movements of stock prices. The stock market opened for four months (2016/02/01 ~ 2016/05/31) for a total of 80 days, using the initial 60 days as a training set and the remaining 20 days as a test set. The proposed word - based algorithm in this study showed better classification performance than the word selection method based on sparsity. This study predicted stock price volatility by collecting and analyzing news articles of the top 10 stocks in market cap. We used the term - document matrix based classification model to estimate the stock price fluctuations and compared the performance of the existing sparse - based word extraction method and the suggested method of removing words from the term - document matrix. The suggested method differs from the word extraction method in that it uses not only the news articles for the corresponding stock but also other news items to determine the words to extract. In other words, it removed not only the words that appeared in all the increase and decrease but also the words that appeared common in the news for other stocks. When the prediction accuracy was compared, the suggested method showed higher accuracy. The limitation of this study is that the stock price prediction was set up to classify the rise and fall, and the experiment was conducted only for the top ten stocks. The 10 stocks used in the experiment do not represent the entire stock market. In addition, it is difficult to show the investment performance because stock price fluctuation and profit rate may be different. Therefore, it is necessary to study the research using more stocks and the yield prediction through trading simulation.

A Methodology for Automatic Multi-Categorization of Single-Categorized Documents (단일 카테고리 문서의 다중 카테고리 자동확장 방법론)

  • Hong, Jin-Sung;Kim, Namgyu;Lee, Sangwon
    • Journal of Intelligence and Information Systems
    • /
    • v.20 no.3
    • /
    • pp.77-92
    • /
    • 2014
  • Recently, numerous documents including unstructured data and text have been created due to the rapid increase in the usage of social media and the Internet. Each document is usually provided with a specific category for the convenience of the users. In the past, the categorization was performed manually. However, in the case of manual categorization, not only can the accuracy of the categorization be not guaranteed but the categorization also requires a large amount of time and huge costs. Many studies have been conducted towards the automatic creation of categories to solve the limitations of manual categorization. Unfortunately, most of these methods cannot be applied to categorizing complex documents with multiple topics because the methods work by assuming that one document can be categorized into one category only. In order to overcome this limitation, some studies have attempted to categorize each document into multiple categories. However, they are also limited in that their learning process involves training using a multi-categorized document set. These methods therefore cannot be applied to multi-categorization of most documents unless multi-categorized training sets are provided. To overcome the limitation of the requirement of a multi-categorized training set by traditional multi-categorization algorithms, we propose a new methodology that can extend a category of a single-categorized document to multiple categorizes by analyzing relationships among categories, topics, and documents. First, we attempt to find the relationship between documents and topics by using the result of topic analysis for single-categorized documents. Second, we construct a correspondence table between topics and categories by investigating the relationship between them. Finally, we calculate the matching scores for each document to multiple categories. The results imply that a document can be classified into a certain category if and only if the matching score is higher than the predefined threshold. For example, we can classify a certain document into three categories that have larger matching scores than the predefined threshold. The main contribution of our study is that our methodology can improve the applicability of traditional multi-category classifiers by generating multi-categorized documents from single-categorized documents. Additionally, we propose a module for verifying the accuracy of the proposed methodology. For performance evaluation, we performed intensive experiments with news articles. News articles are clearly categorized based on the theme, whereas the use of vulgar language and slang is smaller than other usual text document. We collected news articles from July 2012 to June 2013. The articles exhibit large variations in terms of the number of types of categories. This is because readers have different levels of interest in each category. Additionally, the result is also attributed to the differences in the frequency of the events in each category. In order to minimize the distortion of the result from the number of articles in different categories, we extracted 3,000 articles equally from each of the eight categories. Therefore, the total number of articles used in our experiments was 24,000. The eight categories were "IT Science," "Economy," "Society," "Life and Culture," "World," "Sports," "Entertainment," and "Politics." By using the news articles that we collected, we calculated the document/category correspondence scores by utilizing topic/category and document/topics correspondence scores. The document/category correspondence score can be said to indicate the degree of correspondence of each document to a certain category. As a result, we could present two additional categories for each of the 23,089 documents. Precision, recall, and F-score were revealed to be 0.605, 0.629, and 0.617 respectively when only the top 1 predicted category was evaluated, whereas they were revealed to be 0.838, 0.290, and 0.431 when the top 1 - 3 predicted categories were considered. It was very interesting to find a large variation between the scores of the eight categories on precision, recall, and F-score.

A Study on Interpreting People's Enjoyment under Cherry Blossom in Modern Times (벚꽃을 통해 본 근대 행락문화의 해석)

  • Kim, Hai Gyoung
    • Journal of the Korean Institute of Traditional Landscape Architecture
    • /
    • v.29 no.4
    • /
    • pp.124-136
    • /
    • 2011
  • In landscape architecture, plants play an important role in realizing the intention of the architect and user- behavior as well as an ecology and appearance of the space for them. However, it is true that many researches have focused on ecological characteristics of plants, their cultivation environment and symbolic meanings in traditional terms, while relatively few for the analysis of the aspects of each period through plants. For this, cherry trees that we often see around are selected and their introduction, propagation, development and symbolism from the view of chronicle are studied and the results are followings; Firstly, three-year seedlings of 1,500 pieces of cherry tree from Osaka and Tokyo were planted for the first time in Oieseongdae, Namsan Park, Seoul. Since then, they had been widely planted at traditional sites, modern parks, newly-constructed roads for street trees, and for this, the Japanese Government-General of Chosun had actively supported by its direct cultivation and selling of cherry trees. The spread of cherry trees planted raised the question of whether or not Prunus yedoensis is originated from Jeju Island. Secondly, such massive and artificial planting of them had become attractions over the time and mass media at that time also had actively promoted it. And such trend made the day and night picnic under the cherry blossoms one of the most representative cultures of enjoying spring in Seoul. Thirdly, although general people enjoyed cherry blossoms, but they had dual view and attitude for cherry trees, which were well expressed in their use of them: for example, cherry blossoms, aeng and sakura were used altogether for same meaning, but night aeng or night picnic under cherry blossoms were especially used instead of yojakura when mentioning just pleasure, which meant some saw night enjoying cherry blossoms a low culture. Fourth, symbolic space of Chosun had been transformed into the space for enjoyment and consumption. Anyone who paid entrance fee could enjoy performance of revugirl, cinema and entertainment along with enjoying cherry blossoms. The still-existing strict differentiation of enjoyment culture by social status, class and ethnicity was dismantled from that trend and brought about a kind of disorder. From this, we could find that cherry blossoms had made a great contribution to the change of traditional enjoyment culture over the Japanese colonial period and become a popular spring enjoyment.