A Proposal of a Keyword Extraction System for Detecting Social Issues (사회문제 해결형 기술수요 발굴을 위한 키워드 추출 시스템 제안)

  • Jeong, Dami;Kim, Jaeseok;Kim, Gi-Nam;Heo, Jong-Uk;On, Byung-Won;Kang, Mijung
    • Journal of Intelligence and Information Systems / v.19 no.3 / pp.1-23 / 2013
  • To discover significant social issues that modern society urgently needs to solve, such as unemployment, economic crisis, and social welfare, researchers in the existing approach usually collect opinions from professional experts and scholars through online or offline surveys. However, this method is not always effective. Because of cost, a large number of survey replies is seldom gathered, and in some cases it is hard to find experts who deal with a specific social issue. Thus, the sample set is often small and may be biased. Furthermore, several experts may reach totally different conclusions about the same social issue because each has a subjective point of view and a different background. In this case, it is considerably hard to figure out what the current social issues are and which of them are really important. To overcome the shortcomings of the current approach, in this paper we develop a prototype system that semi-automatically detects keywords representing social issues and problems from about 1.3 million news articles published by about 10 major domestic news outlets in Korea from June 2009 to July 2012. Our proposed system consists of (1) collecting news articles and extracting their text, (2) identifying only the news articles related to social issues, (3) analyzing the lexical items of Korean sentences, (4) finding a set of topics regarding social keywords over time based on probabilistic topic modeling, (5) matching relevant paragraphs to a given topic, and (6) visualizing social keywords for easy understanding. In particular, we propose a novel matching algorithm relying on generative models. The goal of our proposed matching algorithm is to best match paragraphs to each topic. 
Technically, using a topic model such as Latent Dirichlet Allocation (LDA), we can obtain a set of topics, each of which has relevant terms and their probability values. In our problem, given a set of text documents (e.g., news articles), LDA produces a set of topic clusters, and each topic cluster is then labeled by human annotators, where each topic label stands for a social keyword. For example, suppose there is a topic (e.g., Topic1 = {(unemployment, 0.4), (layoff, 0.3), (business, 0.3)}) and a human annotator labels Topic1 as "Unemployment Problem". From this label alone, it is hard to understand what has happened regarding the unemployment problem in our society. In other words, by looking only at social keywords, we have no idea of the detailed events occurring in our society. To tackle this matter, we develop a matching algorithm that computes the probability value of a paragraph given a topic, relying on (i) topic terms and (ii) their probability values. Specifically, given a set of text documents, we segment each text document into paragraphs. In the meantime, using LDA, we extract a set of topics from the text documents. Based on our matching process, each paragraph is assigned to the topic that it best matches. As a result, each topic has several best-matched paragraphs. For example, suppose there are a topic (e.g., Unemployment Problem) and its best-matched paragraph (e.g., "Up to 300 workers lost their jobs in XXX company at Seoul"). In this case, we can grasp the detailed information behind the social keyword, such as "300 workers", "unemployment", "XXX company", and "Seoul". In addition, our system visualizes social keywords over time. Therefore, through our matching process and keyword visualization, researchers will be able to detect social issues easily and quickly. 
Through this prototype system, we have detected various social issues appearing in our society and have shown the effectiveness of our proposed methods through experimental results. Our proof-of-concept system is available at http://dslab.snu.ac.kr/demo.html.
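
The paragraph-to-topic matching step described in this abstract can be illustrated with a small sketch. The abstract does not give the scoring formula, so the version below is only an assumption: it scores a paragraph by the average log-probability of its tokens under a topic's term distribution and assigns the paragraph to the highest-scoring topic. The function names, the `floor` smoothing value, and the "Social Welfare" topic are hypothetical; only Topic1's terms and weights come from the abstract.

```python
import math
from typing import Dict, List

Topic = Dict[str, float]  # term -> probability within the topic


def paragraph_score(tokens: List[str], topic: Topic, floor: float = 1e-6) -> float:
    """Average log-probability of a paragraph's tokens under one topic.

    Terms missing from the topic receive a small floor probability
    (an assumption of this sketch) so unseen words do not zero out the score.
    """
    if not tokens:
        return float("-inf")
    return sum(math.log(topic.get(t, floor)) for t in tokens) / len(tokens)


def best_topic(tokens: List[str], topics: Dict[str, Topic]) -> str:
    """Return the label of the topic under which the paragraph scores highest."""
    return max(topics, key=lambda label: paragraph_score(tokens, topics[label]))


topics = {
    # Topic1 from the abstract, with its human-assigned label.
    "Unemployment Problem": {"unemployment": 0.4, "layoff": 0.3, "business": 0.3},
    # Hypothetical second topic, added only so there is something to compare against.
    "Social Welfare": {"welfare": 0.5, "pension": 0.3, "care": 0.2},
}

paragraph = "layoff at business raised unemployment".split()
assigned = best_topic(paragraph, topics)
print(assigned)
```

Averaging over the paragraph length keeps long paragraphs from being penalized simply for containing more words; a real system would tokenize with a Korean morphological analyzer rather than `str.split`.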

Clustering Method based on Genre Interest for Cold-Start Problem in Movie Recommendation (영화 추천 시스템의 초기 사용자 문제를 위한 장르 선호 기반의 클러스터링 기법)

  • You, Tithrottanak;Rosli, Ahmad Nurzid;Ha, Inay;Jo, Geun-Sik
    • Journal of Intelligence and Information Systems / v.19 no.1 / pp.57-77 / 2013
  • Social media has become one of the most popular media in web and mobile applications. In 2011, social networks and blogs were still the top destination of online users, according to a study from the Nielsen Company. In that study, nearly 4 in 5 active users visit social networks and blogs. Social network and blog sites dominate Americans' Internet time, accounting for 23 percent of time spent online. Facebook is the social network on which U.S. internet users spend more time than on other services such as Yahoo, Google, AOL Media Network, Twitter, and LinkedIn. As a recent trend, most companies promote their products on Facebook by creating a "Facebook Page" dedicated to a specific product. The "Like" option allows users to subscribe to a page and receive updates on their interests from it. Film makers, who produce a great many films around the world, also market and promote their films by exploiting the advantages of the "Facebook Page". In addition, a great number of streaming service providers allow users to subscribe to their services to watch movies and TV programs. Subscribers can instantly watch movies and TV programs over the internet on PCs, Macs, and TVs. Netflix alone, as the world's leading subscription service, has more than 30 million streaming members in the United States, Latin America, the United Kingdom, and the Nordics. As a matter of fact, a million movies and TV programs of different genres are offered to subscribers. On the other hand, users need to spend a lot of time finding the movies that match the genres they are interested in. In recent years, many researchers have proposed methods to improve the prediction of ratings or preferences so as to recommend the most relevant items, such as books, music, or movies, to a target user or to a group of users who share an interest in particular items. 
One of the most popular methods for building a recommendation system is traditional Collaborative Filtering (CF). The method computes the similarity between the target user and other users, who are then clustered by shared interest in items according to the items they have rated. The method then predicts other items from the same group of users to recommend to users in that group. There are many kinds of items that could be studied for recommendation, such as books, music, movies, news, and videos. However, in this paper we focus only on movies as the items to recommend. In addition, CF faces many challenges. The first is the "sparsity problem", which occurs when there is not enough user preference information. The recommendation accuracy is then lower than for neighbors who have a large number of ratings. The second is the "cold-start problem", which occurs whenever new users or items are added to the system, each with no rating or only a few ratings. For instance, no personalized predictions can be made for a new user without any ratings on record. In this research, we propose a clustering method based on users' genre interest extracted from a social network service (SNS) and on users' movie rating information to solve the "cold-start problem." Our proposed method clusters the target user together with other users by combining user genre interest and rating information. Facebook Graph holds a huge amount of interesting and useful user information, from which we can extract information from the "Facebook Pages" that users have "Liked". Moreover, we use the Internet Movie Database (IMDb) as the main dataset. IMDb is an online database that consists of a large amount of information related to movies, TV programs, and actors. 
This dataset is used not only to provide movie information in our Movie Rating System, but also as a resource for the movie genre information extracted from the "Facebook Page". Users must first log in to the Movie Rating System with their Facebook account; at the same time, our system collects their genre interest from the "Facebook Page". We conduct experiments comparing our method with other methods. First, we compare our proposed method in the case of normal recommendation to see how our system improves the recommendation result. Then we test the method in the case of the cold-start problem. Our experiments show that our method outperforms the other methods in both cases.
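
To make the cold-start idea in this abstract concrete, here is a minimal sketch of how genre interest can stand in for missing ratings. It is not the authors' clustering method, which the abstract does not specify; it swaps in a simpler nearest-neighbour scheme that finds the existing users whose genre vectors are most similar to the new user's and averages their ratings, weighted by similarity. The genre axes, user profiles, movie titles, and the `predict_rating` function are all hypothetical.

```python
import math
from typing import Dict, List, Optional

GENRES = ["action", "comedy", "drama"]  # hypothetical genre axes for the vectors below


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity between two genre-interest vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def predict_rating(new_user_genres: List[float], users: Dict[str, dict],
                   movie: str, k: int = 2) -> Optional[float]:
    """Similarity-weighted mean rating from the k most genre-similar users who rated `movie`."""
    scored = [
        (cosine(new_user_genres, profile["genres"]), profile["ratings"][movie])
        for profile in users.values()
        if movie in profile["ratings"]
    ]
    neighbours = sorted(scored, reverse=True)[:k]
    weight = sum(sim for sim, _ in neighbours)
    if weight == 0:
        return None  # nobody similar has rated this movie
    return sum(sim * rating for sim, rating in neighbours) / weight


# Existing users: genre interest (e.g. derived from liked pages) plus known ratings.
users = {
    "u1": {"genres": [1.0, 0.0, 0.0], "ratings": {"Movie A": 5.0}},
    "u2": {"genres": [0.0, 1.0, 0.0], "ratings": {"Movie A": 1.0}},
    "u3": {"genres": [0.0, 0.0, 1.0], "ratings": {"Movie B": 4.0}},
}

# A new user with no ratings at all, only an interest in action movies.
prediction = predict_rating([1.0, 0.0, 0.0], users, "Movie A")
print(prediction)
```

The point of the sketch is that the new user needs no rating history: the genre vector alone decides whose ratings count, which is the gap the cold-start problem leaves in plain CF.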

A Study on the Effect of Technological Innovation Capability and Technology Commercialization Capability on Business Performance in SMEs of Korea (우리나라 중소기업의 기술혁신능력과 기술사업화능력이 경영성과에 미치는 영향연구)

  • Lee, Dongsuk;Chung, Lakchae
    • Korean small business review / v.32 no.1 / pp.65-87 / 2010
  • With the advent of the knowledge-based society, the revitalization of technological-innovation-type SMEs, termed "inno-biz" hereafter, has been globally recognized as a primary concern of government policymakers in strengthening national competitiveness, and much effort is being put into establishing policies to boost start-ups and the innovation capability of SMEs. The growth and development of inno-biz is especially significant because it vitalizes the national economy by widening world markets with superior technology and thereby taking the initiative in extremely competitive world markets. In the case of Korea, since the late 1990s the government has maintained policies of stimulating the growth of SMEs and of building various infrastructures to foster SME start-ups such as high-technology venture businesses. In addition, since the enactment of the "Innovation Promotion Law for SMEs" in 2001, the government has been accelerating policies that prioritize the growth and development of inno-biz. For the sound growth and development of Korean inno-biz, this paper therefore intends to offer effective management strategies for SMEs and to suggest proper policies for the government by examining the effect of technological innovation capability and technology commercialization capability, as primary business resources, on business performance in Korean SMEs in the light of market information orientation. The research is carried out on Korean companies characterized as inno-biz. On the basis of the Oslo Manual and prior studies, the research categorizes their status. 
R&D capability, technology accumulation capability and technological innovation system are categorized into technological innovation capability; product development capability, manufacturing capability and marketing capability into technology commercialization capability; and increase in product competitiveness and merits for new technology and/or product development into business performance. Then the effect of each component on business performance is analyzed. In addition, the effect of technological innovation capability and technology commercialization capability on business performance is examined with market information orientation as a mediating variable. The following hypotheses are proposed. H1: Technological innovation capability will positively influence business performance. H1-1: R&D capability will positively influence product competitiveness. H1-2: R&D capability will positively influence merits for new technology and/or product development. H1-3: Technology accumulation capability will positively influence product competitiveness. H1-4: Technology accumulation capability will positively influence merits for new technology and/or product development. H1-5: Technological innovation system will positively influence product competitiveness. H1-6: Technological innovation system will positively influence merits for new technology and/or product development. H2: Technology commercialization capability will positively influence business performance. H2-1: Product development capability will positively influence product competitiveness. H2-2: Product development capability will positively influence merits for new technology and/or product development. H2-3: Manufacturing capability will positively influence product competitiveness. 
H2-4: Manufacturing capability will positively influence merits for new technology and/or product development. H2-5: Marketing capability will positively influence product competitiveness. H2-6: Marketing capability will positively influence merits for new technology and/or product development. H3: Technological innovation capability will positively influence market information orientation. H3-1: R&D capability will positively influence information generation. H3-2: R&D capability will positively influence information diffusion. H3-3: R&D capability will positively influence information response. H3-4: Technology accumulation capability will positively influence information generation. H3-5: Technology accumulation capability will positively influence information diffusion. H3-6: Technology accumulation capability will positively influence information response. H3-7: Technological innovation system will positively influence information generation. H3-8: Technological innovation system will positively influence information diffusion. H3-9: Technological innovation system will positively influence information response. H4: Technology commercialization capability will positively influence market information orientation. H4-1: Product development capability will positively influence information generation. H4-2: Product development capability will positively influence information diffusion. H4-3: Product development capability will positively influence information response. H4-4: Manufacturing capability will positively influence information generation. H4-5: Manufacturing capability will positively influence information diffusion. H4-6: Manufacturing capability will positively influence information response. H4-7: Marketing capability will positively influence information generation. H4-8: Marketing capability will positively influence information diffusion. 
H4-9: Marketing capability will positively influence information response. H5: Market information orientation will positively influence business performance. H5-1: Information generation will positively influence product competitiveness. H5-2: Information generation will positively influence merits for new technology and/or product development. H5-3: Information diffusion will positively influence product competitiveness. H5-4: Information diffusion will positively influence merits for new technology and/or product development. H5-5: Information response will positively influence product competitiveness. H5-6: Information response will positively influence merits for new technology and/or product development. H6: Market information orientation will mediate the relationship between technological innovation capability and business performance. H7: Market information orientation will mediate the relationship between technology commercialization capability and business performance. The research results are as follows: First, as for the effect of technological innovation capability on business performance, technology accumulation capability and technological innovation system have a positive effect on increase in product competitiveness and merits for new technology and/or product development, while R&D capability has little effect on business performance. Second, as for the effect of technology commercialization capability on business performance, the effect of manufacturing capability is relatively greater than that of merits for new technology and/or product development. Third, the mediation effect of market information orientation is found to exist partially in information generation, information diffusion and information response. 
Judging from these results, the following analysis can be made: Increase in product competitiveness, which is directly related to successful technology commercialization, is relatively strongly affected by management capability, including technological innovation system, manufacturing capability and marketing capability. Merits for new technology and/or product development, on the other hand, are relatively strongly affected by capability in the technological aspect, including R&D capability, technology accumulation capability and product development capability. Besides, in the case of market information orientation, the level of information diffusion within an organization plays an important role in new technology and/or product development. Also, for commercial success such as an increase in product competitiveness, the level of information response is primarily required. Accordingly, the following policies are suggested: First, as the effect of technological innovation capability and technology commercialization capability on business performance differs among SMEs, the government has to establish micro-level policies that meet the needs and characteristics of individual SMEs so that they can secure competitiveness. In particular, SMEs lacking capital and labor need to map out management strategies that focus their resources primarily on their strengths. And the government needs to set up policies for SMEs not from a macro-scale standpoint, but from a selective and concentrated one that meets the needs and characteristics of the respective SMEs. Second, systematic infrastructures that lead technological success to commercial success are urgently required. 
Namely, as technological merits at the level of individual SMEs do not always guarantee commercial success, the government should make an effort to build systematic infrastructures, including encouragement of M&A or technology trade, systematic support for protecting intellectual property, promotion of business incubation and of industrial clusters for strengthening academic-industrial networks, and revitalization of technology financing, in order to turn technological success into successful commercialization. Finally, the effort to innovate technology, R&D for example, is essential to future national competitiveness, but its results often take a long time to appear. So the government needs to maintain continuous interest in and funding for basic science in order to maximize technological innovation capability. Indeed, the government needs to examine continuously whether technological innovation capability or technological success leads satisfactorily to commercial success in the market economic system, because when that transition fails, responsibility for it falls to the government.