• Title/Abstract/Keyword: News

Search results: 10,880 (processing time: 0.08 seconds)

Twitter Issue Tracking System by Topic Modeling Techniques (토픽 모델링을 이용한 트위터 이슈 트래킹 시스템)

  • Bae, Jung-Hwan;Han, Nam-Gi;Song, Min
    • Journal of Intelligence and Information Systems / v.20 no.2 / pp.109-122 / 2014
  • People nowadays create a tremendous amount of data on Social Network Services (SNS). In particular, the incorporation of SNS into mobile devices has resulted in massive amounts of data generation, thereby greatly influencing society. This is an unmatched phenomenon in history, and we now live in the Age of Big Data. SNS data satisfies the defining conditions of Big Data: the amount of data (volume), the speed of data input and output (velocity), and the range of data types (variety). If the trend of an issue can be discovered in SNS Big Data, that information can serve as an important new source for the creation of value, because it covers the whole of society. In this study, a Twitter Issue Tracking System (TITS) is designed and built to meet the need for analyzing SNS Big Data. TITS extracts issues from Twitter texts and visualizes them on the web. The proposed system provides the following four functions: (1) provide the topic keyword set that corresponds to the daily ranking; (2) visualize the daily time-series graph of a topic over the course of a month; (3) present the importance of a topic through a treemap based on a score system and frequency; (4) visualize the daily time-series graph of a keyword through keyword search. The present study analyzes the Big Data generated by SNS in real time. SNS Big Data analysis requires various natural language processing techniques, including stop-word removal and noun extraction, for processing various unrefined forms of unstructured data. In addition, such analysis requires the latest big data technology to process large amounts of real-time data rapidly, such as the Hadoop distributed system or NoSQL, an alternative to relational databases. We built TITS on Hadoop to optimize the processing of big data, because Hadoop is designed to scale from single-node computing to thousands of machines. Furthermore, we use MongoDB, which is classified as a NoSQL database. MongoDB is an open-source, document-oriented database that provides high performance, high availability, and automatic scaling. Unlike existing relational databases, MongoDB has no schemas or tables, and its most important goals are data accessibility and data processing performance. In the Age of Big Data, visualization is especially attractive to the Big Data community because it helps analysts examine data easily and clearly. Therefore, TITS uses the d3.js library as a visualization tool. This library is designed for creating Data-Driven Documents that bind the document object model (DOM) to arbitrary data; the resulting interaction is well suited to managing real-time data streams with smooth animation. In addition, TITS uses Bootstrap, a set of pre-configured style sheets and JavaScript plug-ins, to build the web system. The TITS graphical user interface (GUI) is designed with these libraries and makes it possible to detect issues on Twitter in an easy and intuitive manner. The proposed work demonstrates the superiority of our issue detection techniques by matching detected issues with corresponding online news articles. The contributions of the present study are threefold. First, we suggest an alternative approach to real-time big data analysis, which has become an extremely important issue. Second, we apply a topic modeling technique that is used in various research areas, including Library and Information Science (LIS).
Based on this, we can confirm the utility of storytelling and time series analysis. Third, we develop a web-based system, and make the system available for the real-time discovery of topics. The present study conducted experiments with nearly 150 million tweets in Korea during March 2013.
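
As a rough illustration of the topic-keyword step in a TITS-like pipeline, the sketch below groups a handful of made-up, pre-processed tweets by day and extracts daily topic keyword sets with LDA. It is a minimal sketch assuming scikit-learn; the Hadoop/MongoDB storage layer and the d3.js visualization used by the actual system are left out, and the tweet texts and dates are hypothetical.

```python
# Minimal sketch of the daily topic-keyword step (illustrative only; the paper's full
# system runs on Hadoop/MongoDB and visualizes results with d3.js).
from collections import defaultdict
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

# Hypothetical pre-processed tweets: (date, text) pairs with stop words already removed.
tweets = [
    ("2013-03-01", "election candidate debate policy"),
    ("2013-03-01", "candidate policy pledge economy"),
    ("2013-03-01", "baseball season opening game"),
    ("2013-03-02", "baseball game home run"),
    ("2013-03-02", "election poll candidate approval"),
]

by_day = defaultdict(list)
for day, text in tweets:
    by_day[day].append(text)

for day, docs in sorted(by_day.items()):
    vectorizer = CountVectorizer()
    dtm = vectorizer.fit_transform(docs)                      # document-term matrix
    lda = LatentDirichletAllocation(n_components=2, random_state=0).fit(dtm)
    vocab = vectorizer.get_feature_names_out()
    for k, weights in enumerate(lda.components_):             # term weights per topic
        top_terms = [vocab[i] for i in weights.argsort()[::-1][:3]]
        print(day, f"topic {k}:", top_terms)
```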

A Study on Industries's Leading at the Stock Market in Korea - Gradual Diffusion of Information and Cross-Asset Return Predictability- (산업의 주식시장 선행성에 관한 실증분석 - 자산간 수익률 예측 가능성 -)

  • Kim Jong-Kwon
    • Proceedings of the Safety Management and Science Conference / 2004.11a / pp.355-380 / 2004
  • I test the hypothesis that the gradual diffusion of information across asset markets leads to cross-asset return predictability in Korea. Using thirty-six industry portfolios and the broad market index as test assets, I establish several key results. First, a number of industries such as semiconductor, electronics, metal, and petroleum lead the stock market by up to one month. In contrast, the market, which is widely followed, leads only a few industries. Importantly, an industry's ability to lead the market is correlated with its propensity to forecast various indicators of economic activity such as industrial production growth. Consistent with the hypothesis, these findings indicate that the market reacts with a delay to information in industry returns about its fundamentals because information diffuses only gradually across asset markets. Traditional theories of asset pricing assume that investors have unlimited information-processing capacity. However, this assumption does not hold for many traders, even the most sophisticated ones. Many economists recognize that investors are better characterized as being only boundedly rational (see Shiller (2000), Sims (2001)). Even from casual observation, few traders can pay attention to all sources of information, much less understand their impact on the prices of the assets that they trade. Indeed, a large literature in psychology documents the extent to which attention is a precious cognitive resource (see, e.g., Kahneman (1973), Nisbett and Ross (1980), Fiske and Taylor (1991)). A number of papers have explored the implications of limited information-processing capacity for asset prices; I review this literature in Section II. For instance, Merton (1987) develops a static model of multiple stocks in which investors only have information about a limited number of stocks and only trade those they have information about. Related models of limited market participation include Brennan (1975) and Allen and Gale (1994). As a result, stocks that are less recognized by investors have a smaller investor base (neglected stocks) and trade at a greater discount because of limited risk sharing. More recently, Hong and Stein (1999) develop a dynamic model of a single asset in which information gradually diffuses across the investment public and investors are unable to perform the rational-expectations trick of extracting information from prices. My hypothesis is that the gradual diffusion of information across asset markets leads to cross-asset return predictability. This hypothesis relies on two key assumptions. The first is that valuable information that originates in one asset reaches investors in other markets only with a lag, i.e., news travels slowly across markets. The second is that, because of limited information-processing capacity, many (though not necessarily all) investors may not pay attention or be able to extract the information from the asset prices of markets that they do not participate in. These two assumptions taken together lead to cross-asset return predictability. The hypothesis appears very plausible for a few reasons. To begin with, as pointed out by Merton (1987) and the subsequent literature on segmented markets and limited market participation, few investors trade all assets. Put another way, limited participation is a pervasive feature of financial markets. Indeed, even among equity money managers, there is specialization along industries, such as sector or market-timing funds.
Some reasons for this limited market participation include tax, regulatory, or liquidity constraints. More plausibly, investors have to specialize because they have their hands full trying to understand the markets that they do participate in.
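
The lead-lag claim above reduces to a predictive regression of next month's market return on this month's industry return. The sketch below shows one possible version of such a regression on simulated data, assuming statsmodels and a Newey-West (HAC) covariance; the variable names, lag length, and simulated sample are illustrative and not the paper's specification or Korean data.

```python
# Minimal sketch of a cross-asset predictive regression of the assumed form:
# market return at month t+1 on an industry's return at month t, controlling for
# the lagged market return. Data are simulated, not the paper's sample.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
T = 240                                     # months
industry = rng.normal(0, 0.05, T)           # e.g. a semiconductor portfolio's returns
market = 0.3 * np.roll(industry, 1) + rng.normal(0, 0.05, T)  # market reacts with a lag
market[0] = rng.normal(0, 0.05)

y = market[1:]                                                       # R_market(t+1)
X = sm.add_constant(np.column_stack([industry[:-1], market[:-1]]))   # [R_industry(t), R_market(t)]
fit = sm.OLS(y, X).fit(cov_type="HAC", cov_kwds={"maxlags": 3})
print(fit.params)    # a positive, significant industry coefficient = the industry leads the market
print(fit.tvalues)
```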


Analysis of Twitter for 2012 South Korea Presidential Election by Text Mining Techniques (텍스트 마이닝을 이용한 2012년 한국대선 관련 트위터 분석)

  • Bae, Jung-Hwan;Son, Ji-Eun;Song, Min
    • Journal of Intelligence and Information Systems / v.19 no.3 / pp.141-156 / 2013
  • Social media is a representative form of Web 2.0 that shapes changes in users' information behavior by allowing users to produce their own content without any expert skills. In particular, as a new communication medium, it has a profound impact on social change by enabling users to communicate their opinions and thoughts to the masses and to acquaintances. Social media data plays a significant role in the emerging Big Data arena. A variety of research areas, such as social network analysis and opinion mining, have therefore paid attention to discovering meaningful information from the vast amounts of data buried in social media. Social media has recently become a main focus of Information Retrieval and Text Mining because it not only produces massive unstructured textual data in real time but also serves as an influential channel for opinion leading. However, most previous studies have adopted broad-brush and limited approaches, which have made it difficult to find and analyze new information. To overcome these limitations, we developed a real-time Twitter trend mining system that captures trends by processing big streaming Twitter datasets in real time. The system offers term co-occurrence retrieval, visualization of Twitter users by query, similarity calculation between two users, topic modeling to keep track of changes in topical trends, and mention-based user network analysis. In addition, we conducted a case study on the 2012 Korean presidential election. We collected 1,737,969 tweets containing candidates' names and election-related terms on Twitter in Korea (http://www.twitter.com/) over one month in 2012 (October 1 to October 31). The case study shows that the system provides useful information and detects societal trends effectively. The system also retrieves the list of terms co-occurring with given query terms. We compare the results of term co-occurrence retrieval using the influential candidates' names 'Geun Hae Park', 'Jae In Moon', and 'Chul Su Ahn' as query terms. General terms related to the presidential election, such as 'Presidential Election', 'Proclamation in Support', and 'Public opinion poll', appear frequently. The results also show specific terms that differentiate each candidate, such as 'Park Jung Hee' and 'Yuk Young Su' for the query 'Geun Hae Park', 'a single candidacy agreement' and 'Time of voting extension' for the query 'Jae In Moon', and 'a single candidacy agreement' and 'down contract' for the query 'Chul Su Ahn'. Our system not only extracts 10 topics along with related terms but also shows topics' dynamic changes over time by employing the multinomial Latent Dirichlet Allocation technique. Each topic can show one of two types of patterns, a rising tendency or a falling tendency, depending on the change of its probability distribution. To determine the relationship between topic trends on Twitter and social issues in the real world, we compare topic trends with related news articles. We find that Twitter can track issues faster than other media such as newspapers. The user network in Twitter differs from those of other social media because of the distinctive way relationships are formed in Twitter: users build relationships by exchanging mentions. We visualize and analyze the mention-based networks of 136,754 users, using the three candidates' names, 'Geun Hae Park', 'Jae In Moon', and 'Chul Su Ahn', as query terms.
The results show that Twitter users mention all candidates' names regardless of their political tendencies. This case study discloses that Twitter can be an effective tool to detect and predict dynamic changes in social issues, and that mention-based user networks can show different aspects of user behavior as a type of network unique to Twitter.
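
To make the term co-occurrence retrieval function concrete, here is a minimal sketch assuming tokenized, stop-word-filtered tweets; the tiny hand-made tweet list and the romanized query terms are hypothetical stand-ins for the study's 1.7 million Korean tweets.

```python
# Minimal sketch of term co-occurrence retrieval (illustrative; the study's system also
# offers topic modeling, user similarity, and mention-based network analysis).
from collections import Counter

# Hypothetical tokenized tweets with stop words removed.
tweets = [
    ["park_geun_hye", "presidential_election", "public_opinion_poll"],
    ["moon_jae_in", "single_candidacy_agreement", "presidential_election"],
    ["ahn_cheol_soo", "single_candidacy_agreement", "proclamation_in_support"],
    ["park_geun_hye", "proclamation_in_support", "public_opinion_poll"],
]

def co_occurring_terms(query, docs, top_n=5):
    """Return terms that appear in the same tweets as `query`, ranked by frequency."""
    counts = Counter()
    for doc in docs:
        if query in doc:
            counts.update(t for t in doc if t != query)
    return counts.most_common(top_n)

print(co_occurring_terms("park_geun_hye", tweets))
```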

Efficient Topic Modeling by Mapping Global and Local Topics (전역 토픽의 지역 매핑을 통한 효율적 토픽 모델링 방안)

  • Choi, Hochang;Kim, Namgyu
    • Journal of Intelligence and Information Systems / v.23 no.3 / pp.69-94 / 2017
  • Recently, increasing demand for big data analysis has been driving the vigorous development of related technologies and tools. In addition, the development of IT and the increased penetration of smart devices are producing a large amount of data. As a result, data analysis technology is rapidly becoming popular, and attempts to acquire insights through data analysis have been continuously increasing. This means that big data analysis will become more important in various industries for the foreseeable future. Big data analysis is generally performed by a small number of experts and delivered to each party who requests the analysis. However, growing interest in big data analysis has stimulated computer programming education and the development of many programs for data analysis. Accordingly, the entry barriers to big data analysis are gradually being lowered and data analysis technology is spreading. As a result, big data analysis is expected to be performed by those who need the analysis themselves. Along with this, interest in various kinds of unstructured data is continually increasing, and a great deal of attention is focused on using text data. The emergence of new web-based platforms and techniques has brought about the mass production of text data and active attempts to analyze it, and the results of text analysis have been utilized in various fields. Text mining is a concept that embraces various theories and techniques for text analysis. Among the many text mining techniques utilized for various research purposes, topic modeling is one of the most widely used and studied. Topic modeling is a technique that extracts the major issues from a large number of documents, identifies the documents that correspond to each issue, and provides the identified documents as clusters. It is regarded as a very useful technique in that it reflects the semantic elements of documents. Traditional topic modeling is based on the distribution of key terms across the entire document set; thus, it is essential to analyze all documents at once to identify the topic of each document. This requirement makes the analysis time-consuming when topic modeling is applied to a large number of documents. In addition, it has a scalability problem: processing time increases exponentially with the number of analysis targets. This problem is particularly noticeable when the documents are distributed across multiple systems or regions. To overcome these problems, a divide-and-conquer approach can be applied to topic modeling, that is, dividing a large number of documents into sub-units and deriving topics by repeatedly applying topic modeling to each unit. This method can be used for topic modeling on a large number of documents with limited system resources and can improve the processing speed of topic modeling. It can also significantly reduce analysis time and cost by analyzing documents at each location without first combining all the documents to be analyzed. However, despite these advantages, the method has two major problems. First, the relationship between local topics derived from each unit and global topics derived from the entire document set is unclear: local topics can be identified in each unit, but global topics cannot. Second, a method for measuring the accuracy of the proposed methodology needs to be established; that is, assuming that the global topics are the ideal answer, the deviation of the local topics from the global topics needs to be measured.
Because of these difficulties, this approach has not been studied sufficiently compared with other topic modeling research. In this paper, we propose a topic modeling approach that solves the above two problems. First, we divide the entire document cluster (global set) into sub-clusters (local sets) and generate a reduced global set (RGS) consisting of delegate documents extracted from each local set. We address the first problem by mapping RGS topics to local topics. We then verify the accuracy of the proposed methodology by checking whether each document is assigned to the same topic in the global and local results. Using 24,000 news articles, we conduct experiments to evaluate the practical applicability of the proposed methodology. Through an additional experiment, we also confirmed that the proposed methodology can provide results similar to topic modeling over the entire set, and we proposed a reasonable method for comparing the results of the two approaches.
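
A minimal sketch of the divide-and-conquer idea follows: run LDA on each local set, build a reduced global set from one delegate document per local set, and map local topics to RGS topics by cosine similarity of their topic-word distributions. The toy corpus, the number of topics, and the way delegates are chosen here are illustrative assumptions using scikit-learn, not the paper's exact procedure.

```python
# Minimal sketch: local LDA per sub-cluster, then mapping local topics to topics of a
# reduced global set (RGS) by cosine similarity of topic-word distributions.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.metrics.pairwise import cosine_similarity

docs = [
    "stock market index falls on rate fears", "central bank raises interest rate",
    "team wins championship final game", "star player scores in overtime",
    "new phone launch boosts chip demand", "semiconductor exports hit record high",
]
local_sets = [docs[:3], docs[3:]]                 # pretend these sit on two separate nodes

vectorizer = CountVectorizer().fit(docs)          # shared vocabulary keeps topics comparable

def topic_word_distributions(doc_subset, k):
    lda = LatentDirichletAllocation(n_components=k, random_state=0)
    lda.fit(vectorizer.transform(doc_subset))
    return lda.components_ / lda.components_.sum(axis=1, keepdims=True)

# RGS: one delegate document per local set (here simply the first one, as an assumption).
rgs_docs = [subset[0] for subset in local_sets]
global_topics = topic_word_distributions(rgs_docs, k=2)

for i, subset in enumerate(local_sets):
    local_topics = topic_word_distributions(subset, k=2)
    mapping = cosine_similarity(local_topics, global_topics).argmax(axis=1)
    print(f"local set {i}: local topic -> RGS topic:", dict(enumerate(mapping)))
```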

Sentiment Analysis of Movie Review Using Integrated CNN-LSTM Model (CNN-LSTM 조합모델을 이용한 영화리뷰 감성분석)

  • Park, Ho-yeon;Kim, Kyoung-jae
    • Journal of Intelligence and Information Systems / v.25 no.4 / pp.141-154 / 2019
  • Internet technology and social media are growing rapidly, and data mining technology has evolved to enable unstructured document representations in a variety of applications. Sentiment analysis is an important technology that can distinguish low-quality from high-quality content in product-related text data, and it has proliferated with the growth of text mining. Sentiment analysis mainly analyzes people's opinions in text data by assigning predefined categories such as positive and negative. It has been studied in various directions with respect to accuracy, from simple rule-based approaches to dictionary-based approaches using predefined labels. In fact, sentiment analysis is one of the most active research topics in natural language processing and is widely studied in text mining. Real online reviews are openly available to others, so such information is not only easy to collect but also affects business. In marketing, real-world information from customers is gathered from websites rather than surveys. Whether a website's posts are positive or negative is reflected in customer response and sales, so companies try to identify this information. However, many reviews on a website are of uneven quality and are difficult to classify. Earlier studies in this research area used review data from the Amazon.com shopping mall, but recent studies use data on stock market trends, blogs, news articles, weather forecasts, IMDB, Facebook, and so on. However, a lack of accuracy is recognized because sentiment calculations change according to the subject, paragraph, sentiment lexicon direction, and sentence strength. This study aims to classify sentiment polarity into positive and negative categories and to increase the prediction accuracy of polarity analysis using the IMDB review dataset. First, for the text classification algorithms related to sentiment analysis, we adopt popular machine learning algorithms such as NB (naive Bayes), SVM (support vector machines), XGBoost, RF (random forests), and gradient boosting as comparative models. Second, deep learning has demonstrated the ability to extract complex, discriminative features from data. Representative algorithms are CNN (convolutional neural networks), RNN (recurrent neural networks), and LSTM (long short-term memory). CNN can be used similarly to BoW when processing a sentence in vector format, but it does not consider sequential data attributes. RNN handles ordered data well because it takes the temporal information of the data into account, but it suffers from long-term dependency problems in memory. To solve the problem of long-term dependence, LSTM is used. For comparison, CNN and LSTM were chosen as simple deep learning models. In addition to the classical machine learning algorithms, CNN, LSTM, and the integrated models were analyzed. Although the algorithms have many parameters, we examined the relationship between parameter values and precision to find the optimal combination, and we tried to determine how well the models work for sentiment analysis and how they work. This study proposes an integrated CNN-LSTM algorithm to extract the positive and negative features in text analysis. The reasons for combining these two algorithms are as follows. CNN can automatically extract features for classification by applying convolution layers and is amenable to massively parallel processing, whereas LSTM is not capable of highly parallel processing.
Like faucets, the LSTM has input, output, and forget gates that can be opened and closed at the desired time. These gates are attached to memory blocks on the hidden nodes. The memory block of the LSTM may not store all the data, but it can address the long-term dependency problem. Furthermore, when LSTM is combined with CNN's pooling layer, the model has an end-to-end structure, so that spatial and temporal features can be modeled simultaneously. The combined CNN-LSTM achieved 90.33% accuracy; it is slower than CNN but faster than LSTM, and the presented model was more accurate than the other models. In addition, the word embedding layer can be improved as the kernels are trained step by step. CNN-LSTM can compensate for the weaknesses of each individual model, and the end-to-end structure has the advantage of improving learning layer by layer. For these reasons, this study seeks to enhance the classification accuracy of movie reviews using the integrated CNN-LSTM model.
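
The sketch below is a minimal Keras version of such an integrated CNN-LSTM classifier on the IMDB review dataset: an embedding layer, a 1D convolution and pooling step for local feature extraction, and an LSTM over the pooled feature maps. The layer sizes, epochs, and other hyperparameters are illustrative assumptions, not the paper's tuned configuration that produced the reported 90.33% accuracy.

```python
# Minimal sketch of an integrated CNN-LSTM sentiment classifier on the Keras IMDB data.
# Hyperparameters below are illustrative assumptions, not the paper's settings.
from tensorflow.keras.datasets import imdb
from tensorflow.keras.utils import pad_sequences
from tensorflow.keras import layers, models

vocab_size, max_len = 10000, 200
(x_train, y_train), (x_test, y_test) = imdb.load_data(num_words=vocab_size)
x_train = pad_sequences(x_train, maxlen=max_len)
x_test = pad_sequences(x_test, maxlen=max_len)

model = models.Sequential([
    layers.Embedding(vocab_size, 128),         # word embedding layer
    layers.Conv1D(64, 5, activation="relu"),   # CNN extracts local n-gram features
    layers.MaxPooling1D(4),                    # pooled feature maps feed the LSTM
    layers.LSTM(64),                           # LSTM models the remaining sequence
    layers.Dense(1, activation="sigmoid"),     # positive / negative polarity
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.fit(x_train, y_train, epochs=2, batch_size=128, validation_split=0.2)
print("test loss / accuracy:", model.evaluate(x_test, y_test, verbose=0))
```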

Analysis of Metadata Standards of Record Management for Metadata Interoperability From the viewpoint of the Task model and 5W1H (메타데이터 상호운용성을 위한 기록관리 메타데이터 표준 분석 5W1H와 태스크 모델의 관점에서)

  • Baek, Jae-Eun;Sugimoto, Shigeo
    • The Korean Journal of Archival Studies / no.32 / pp.127-176 / 2012
  • Metadata is well recognized as one of the foundational factors in archiving and long-term preservation of digital resources. There are several metadata standards for records management, archives, and preservation, e.g., ISAD(G), EAD, AGRkMS, PREMIS, and OAIS. Careful consideration is important in selecting appropriate metadata standards in order to design metadata schemas that meet the requirements of a particular archival system, and interoperability of metadata with other systems should be considered in schema design. In our previous research, we presented a feature analysis of metadata standards by identifying the primary resource lifecycle stages where each standard is applied, and we clarified that no single metadata standard can cover the whole records lifecycle for archiving and preservation. Through this feature analysis, we analyzed the features of metadata across the whole records lifecycle and clarified the relationships between the metadata standards and the stages of the lifecycle; more detailed analysis was left for future study. This paper proposes to analyze the metadata schemas from the viewpoint of tasks performed in the lifecycle. Metadata schemas are primarily defined to describe properties of a resource in accordance with the purposes of description, e.g., finding aids, records management, preservation, and so forth. In other words, the metadata standards are resource- and purpose-centric, and the resource lifecycle is not explicitly reflected in the standards. There are no systematic methods for mapping between different metadata standards in accordance with the lifecycle. This paper proposes a method for mapping between metadata standards based on the tasks contained in the resource lifecycle. We first propose a Task Model to clarify the tasks applied to resources in each stage of the lifecycle. This model is created as a task-centric model to identify features of metadata standards and to create mappings among elements of those standards. It is important to categorize the elements in order to limit the semantic scope of mapping among elements and to decrease the number of element combinations for mapping. This paper proposes to use the 5W1H (Who, What, Why, When, Where, How) model to categorize the elements. The 5W1H categories are generally used for describing events, e.g., in news articles. As performing a task on a resource causes an event and metadata elements are used in that event, we consider the 5W1H categories adequate for categorizing the elements. Using these categories, we determine the features of every element of the metadata standards AGLS, AGRkMS, PREMIS, EAD, and OAIS, as well as an attribute set extracted from the DPC decision flow. Then we perform element mapping between the standards and find the relationships between them. In this study, we defined a set of terms for each of the 5W1H categories, which typically appear in the definition of an element, and used those terms to categorize the elements. For example, if the definition of an element includes terms such as person and organization, which denote an agent that contributes to creating or modifying a resource, the element is categorized into the Who category. A single element can be categorized into one or more 5W1H categories. Thus, we categorized every element of the metadata standards using the 5W1H model and then carried out mapping among the elements in each category.
We conclude that the Task Model provides a new viewpoint on metadata schemas and is useful for understanding the features of metadata standards for records management and archives. The 5W1H model, which is defined based on the Task Model, provides a core set of categories for semantically classifying metadata elements from the viewpoint of the event caused by a task.
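
As a minimal sketch of the keyword-based categorization step, the snippet below assigns metadata elements to 5W1H categories when their definitions contain category-typical terms. The term lists and the element definitions are hypothetical illustrations, not the paper's actual term sets or the official definitions from the standards.

```python
# Minimal sketch of 5W1H categorization of metadata elements by definition keywords.
# Both the term lists and the element definitions below are hypothetical examples.
CATEGORY_TERMS = {
    "Who":   {"person", "organization", "agent", "creator"},
    "What":  {"content", "title", "format", "identifier"},
    "Why":   {"purpose", "function", "rights"},
    "When":  {"date", "time", "event"},
    "Where": {"location", "place", "repository"},
    "How":   {"method", "process", "procedure", "fixity"},
}

elements = {  # hypothetical (element name -> definition) pairs in the style of EAD/PREMIS-like schemas
    "origination": "the person or organization responsible for creating the records",
    "unitdate":    "the date of the creation of the described materials",
    "eventType":   "a categorization of the nature of the event or process applied to the object",
}

def categorize(definition):
    """Return every 5W1H category whose term list intersects the definition's words."""
    words = set(definition.lower().replace(",", " ").split())
    cats = [c for c, terms in CATEGORY_TERMS.items() if words & terms]
    return cats or ["(uncategorized)"]

for name, definition in elements.items():
    print(name, "->", categorize(definition))
```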

Structural features and Diffusion Patterns of Gartner Hype Cycle for Artificial Intelligence using Social Network analysis (인공지능 기술에 관한 가트너 하이프사이클의 네트워크 집단구조 특성 및 확산패턴에 관한 연구)

  • Shin, Sunah;Kang, Juyoung
    • Journal of Intelligence and Information Systems / v.28 no.1 / pp.107-129 / 2022
  • It is important to preempt new technologies because technological competition is getting much tougher, so stakeholders continuously conduct exploration activities to secure new technologies at the right time. Gartner's Hype Cycle has significant implications for such stakeholders. The Hype Cycle is an expectation graph for new technologies that combines the technology life cycle (S-curve) with the hype level. Stakeholders such as R&D investors, CTOs (Chief Technology Officers), and technical personnel are very interested in Gartner's Hype Cycle for new technologies, because high expectations for new technologies can bring opportunities to maintain investment by securing the legitimacy of R&D investment. However, contrary to the high interest of the industry, preceding research has faced limitations in terms of empirical methods and source data (news, academic papers, search traffic, patents, etc.). In this study, we focused on two research questions. The first was 'Is there a difference in the characteristics of the network structure at each stage of the hype cycle?'. To answer it, the structural characteristics of each stage were examined through component cohesion size. The second research question was 'Is there a pattern of diffusion at each stage of the hype cycle?'. This question was addressed through the centralization index and network density. The centralization index is a variance-like concept: a higher centralization index means that a small number of nodes are central in the network, and concentration on a small number of nodes implies a star network structure. In network terms, the star structure is centralized and shows better diffusion performance than a decentralized network (circle structure), because the nodes at the center of information transfer can judge useful information and deliver it to other nodes the fastest. We therefore computed the out-degree and in-degree centralization indices for each stage. For this purpose, we examined the structural features of the communities and the expectation diffusion patterns using Social Network Service (SNS) data for the 'Gartner Hype Cycle for Artificial Intelligence, 2021'. Twitter data for 30 technologies (excluding four technologies) listed in the 'Gartner Hype Cycle for Artificial Intelligence, 2021' were analyzed. The analysis was performed using the R program (version 4.1.1) and Cyram NetMiner. From October 31, 2021 to November 9, 2021, 6,766 tweets were collected through the Twitter API and converted into relationships between a user's tweet (source) and other users' retweets (target); as a result, 4,124 edge lists were analyzed. We confirmed the structural features and diffusion patterns by analyzing component cohesion size, degree centralization, and density. Through this study, we confirmed that the groups at each stage increased in number of components as time passed, while density decreased. Also, the 'Innovation Trigger' group, which is interested in new technologies in the manner of early adopters in innovation diffusion theory, had a high out-degree centralization index, while the other groups had higher in-degree than out-degree centralization indices. It can be inferred that the 'Innovation Trigger' group has the greatest influence and that diffusion gradually slows down in the subsequent groups. In this study, network analysis was conducted using social network service data, unlike the methods of preceding research.
This is significant in that it provides an idea for expanding the methods of analysis when analyzing Gartner's Hype Cycle in the future. In addition, the application of innovation diffusion theory to the stages of the Gartner Hype Cycle for artificial intelligence can be evaluated positively, because the Gartner Hype Cycle has repeatedly been criticized for its theoretical weakness. It is also expected that this study will provide stakeholders with a new perspective on technology investment decision-making.
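
The two diffusion measures used above are easy to reproduce on a toy retweet edge list. The sketch below computes network density and a Freeman-style in/out-degree centralization with networkx instead of the study's R and NetMiner toolchain; the edge list, the (n-1)^2 normalization convention, and the star-shaped example are illustrative assumptions.

```python
# Minimal sketch of density and in/out-degree centralization on a toy retweet network.
# The star-shaped example mimics an 'Innovation Trigger'-like group around one hub author.
import networkx as nx

# Hypothetical edges: (tweet author -> retweeting user)
edges = [("hub", "u1"), ("hub", "u2"), ("hub", "u3"), ("hub", "u4"), ("u1", "u2")]
G = nx.DiGraph(edges)
n = G.number_of_nodes()

def degree_centralization(degrees):
    """Freeman-style centralization, normalized by (n-1)^2, the value of a perfect star
    under one common convention for directed degree centralization."""
    values = [d for _, d in degrees]
    return sum(max(values) - v for v in values) / ((n - 1) ** 2)

print("density:", nx.density(G))
print("out-degree centralization:", degree_centralization(G.out_degree()))
print("in-degree centralization:", degree_centralization(G.in_degree()))
```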

Study on the effect of small and medium-sized businesses being selected as suitable business types, on the franchise industry (중소기업적합업종선정이 프랜차이즈산업에 미치는 영향에 관한 연구)

  • Kang, Chang-Dong;Shin, Geon-Chel;Jang, Jae Nam
    • Journal of Distribution Research / v.17 no.5 / pp.1-23 / 2012
  • The conflict between major corporations and small and medium-sized businesses is being aggravated, the trickle-down effect is not working properly, and controversy surrounding the effectiveness of the business-limiting system continues to swirl. The plan proposed to protect the business domain of small and medium-sized businesses, resolve the polarization between these businesses and large corporations, and protect small family-run stores is the suitable business type designation system for small and medium-sized businesses. Carrying out this system currently involves receiving applications for 234 items among the suitable business types and items from small and medium-sized businesses in manufacturing, and then selecting the items of the consultative group by analyzing and investigating actual conditions. Suitable business type designation in the service industry will give priority to business types that are experiencing social conflict: three major classifications of the service industry, related to the livelihood of small and medium-sized businesses, will be designated first, and the designation will subsequently be expanded sequentially. However, there is concern that designation as a suitable business type or item will hinder the growth motive of small to medium-sized businesses, and that designation may also cause a decrease in consumer welfare. It is also highly likely that the system will operate as a prior regulation, cause side effects by systematically limiting competition, and violate the main provisions of the FTA system. Moreover, it is pointed out that the system does not sufficiently reflect the reverse discrimination factor against large corporations. Because the conflict between small to medium-sized businesses and large corporations results from the expansion of corporations into the service industry, which is unrelated to their key industries, it is necessary to introduce advanced contract methods such as the master franchise or local franchise system and to develop local small to medium-sized businesses through a franchise system in order to protect these businesses and dealers. This method may contribute to stronger competitiveness of small to medium-sized franchise businesses by advancing their competitiveness and operational methods a step further, but it also has many negative aspects. First, as revealed by the Ministry of Knowledge Economy, the franchise industry contributes to the strengthening of competitiveness through economies of scale by organizing existing individual proprietors and increasing the success rate of new businesses. It has also proved to be a measure by which the government stabilizes the economy of ordinary people, is emphasized as a 'useful way' to revitalize the service industry and improve the competitiveness of individual proprietors, and has contributed to creating jobs and expanding the domestic market by providing various services to consumers. From this viewpoint, franchises fit the purpose of the suitable business type system and are not something that runs against it.
Second, designation as a suitable business type may decrease investment in overseas expansion, R&D, and food safety, and may negatively affect the expansion of overseas corporations that have entered the domestic market, owing to the contraction and low morale of large domestic franchise corporations that are internationally competitive. Also, because domestic franchise businesses are hard pressed to secure competitiveness against multinational overseas franchise corporations operating in Korea, the system may make it difficult for domestic franchise businesses to secure international competitiveness and may also result in reverse discrimination relative to these overseas franchise corporations. Third, the designation of suitable business types and items can limit the opportunity of selection for consumers who have up to now used those products and can thereby reduce consumer welfare. Also, because the range of consumer selection may be reduced when a few small to medium-sized businesses monopolize the market, causing reverse discrimination among these businesses, the role of determining the utility of products must be left to the consumer, not the government. Lastly, it is desirable that the system be carried out only after its deficient parts are supplemented in the future, because fair trade is already secured by the enforcement of the franchise trade law and the model trade standards of the Fair Trade Commission; overlapping regulation through suitable business type designation is an excessive restriction on the franchise industry. It is now necessary to establish in the domestic franchise industry an environment in which global franchise corporations, which spread Korean culture around the world, can grow, and active support by the government is needed. Therefore, systems that do not consider the process or background of the growth of franchise businesses and that harm these businesses for the sole reason that they are large corporations must be removed. Inhibiting the growth of franchise enterprises may decrease the sales of franchise stores, in some cases even bankrupt them, and cause other problems. Therefore, the suitable business type system should not hinder large corporations; since small dealers and small to medium-sized businesses both aim at improved competitiveness and shared growth, franchise corporations that maintain business relations with them on the basis of mutual cooperation should not be included in this system.


Clustering Method based on Genre Interest for Cold-Start Problem in Movie Recommendation (영화 추천 시스템의 초기 사용자 문제를 위한 장르 선호 기반의 클러스터링 기법)

  • You, Tithrottanak;Rosli, Ahmad Nurzid;Ha, Inay;Jo, Geun-Sik
    • Journal of Intelligence and Information Systems / v.19 no.1 / pp.57-77 / 2013
  • Social media has become one of the most popular media in web and mobile applications. In 2011, social networks and blogs were still the top destinations of online users, according to a study from the Nielsen Company: nearly 4 in 5 active users visit social networks and blogs, and social network and blog sites rule Americans' internet time, accounting for 23 percent of time spent online. Facebook is the main social network on which U.S. internet users spend more time than on other services such as Yahoo, Google, AOL Media Network, Twitter, LinkedIn, and so on. As a recent trend, most companies promote their products on Facebook by creating a "Facebook Page" for a specific product. The "Like" option allows users to subscribe to a page and receive the updates they are interested in. Film makers, who produce a great number of films around the world, also market and promote their films by exploiting the advantages of the "Facebook Page". In addition, a great number of streaming service providers allow users to subscribe to their services to watch and enjoy movies and TV programs; users can instantly watch movies and TV programs over the internet on PCs, Macs, and TVs. Netflix alone, as the world's leading subscription service, has more than 30 million streaming members in the United States, Latin America, the United Kingdom, and the Nordics. In fact, a vast number of movies and TV programs of different genres are offered to subscribers; in contrast, users need to spend a lot of time finding the right movies related to their genres of interest. In recent years, many researchers have proposed methods to improve the prediction of ratings or preferences in order to recommend the most relevant items, such as books, music, or movies, to a target user or a group of users with the same interest in particular items. One of the most popular methods for building a recommendation system is traditional Collaborative Filtering (CF). The method computes the similarity between the target user and other users, who are then clustered by shared interest according to the items they have rated; it then predicts ratings for other items from the same group of users in order to make recommendations. Many kinds of items, such as books, music, movies, news, and videos, can be studied for recommendation; however, in this paper we focus only on movies as the items to recommend to users. In addition, the CF task faces many challenges. The first is the "sparsity problem," which occurs when user preference information is insufficient; recommendation accuracy is lower than for neighborhoods composed of a large number of ratings. The second is the "cold-start problem," which occurs whenever new users or items are added to the system, each with no ratings or only a few. For instance, no personalized predictions can be made for a new user without any ratings on record. In this research, we propose a clustering method based on users' genre interests extracted from a social network service (SNS) and users' movie rating information to solve the cold-start problem. Our proposed method clusters the target user together with other users by combining genre interest and rating information. A huge amount of interesting and useful user information can be obtained from the Facebook Graph; we can extract information from the "Facebook Pages" that users have "Liked".
Moreover, we use the Internet Movie Database (IMDb) as the main dataset. IMDb is an online database that contains a large amount of information related to movies, TV programs, and actors. This dataset is used not only to provide movie information in our Movie Rating System but also as a resource for the movie genre information extracted from the "Facebook Pages". First, the user must log in to the Movie Rating System with their Facebook account; at the same time, our system collects their genre interests from the "Facebook Pages" they have liked. We conduct experiments to see how our method performs and compare it with other methods. First, we compared our proposed method in the normal recommendation case to see how our system improves the recommendation results; then we tested the method in the cold-start case. Our experiments show that our method outperforms the other methods, producing better results in both cases.
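
A minimal sketch of the clustering idea, assuming scikit-learn: each user is represented by a vector that concatenates genre-interest counts (as might be derived from liked Facebook Pages) with movie ratings, existing users are clustered with k-means, and a cold-start user with no ratings is assigned to a cluster from the genre part alone. The user vectors, weighting factor, and cluster count are illustrative assumptions, not the paper's data or tuned settings.

```python
# Minimal sketch of genre-interest-based clustering for the cold-start case.
# All numbers below are hypothetical; alpha is an assumed weight between feature blocks.
import numpy as np
from sklearn.cluster import KMeans

genres = ["action", "comedy", "drama", "horror"]
# Rows = users; genre-interest counts from liked pages, then ratings for 3 movies (0 = unrated).
genre_interest = np.array([[5, 0, 1, 0], [4, 1, 0, 0], [0, 5, 2, 0], [0, 4, 3, 0]], float)
ratings        = np.array([[5, 0, 1], [4, 0, 2], [1, 5, 0], [0, 4, 5]], float)

alpha = 0.5                                    # assumed weight balancing genre vs. rating features
features = np.hstack([alpha * genre_interest / genre_interest.max(),
                      (1 - alpha) * ratings / 5.0])
km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(features)
print("existing users' clusters:", km.labels_)

# Cold-start user: genre interest only, no ratings yet.
new_user = np.hstack([alpha * np.array([0, 5, 1, 0]) / genre_interest.max(),
                      (1 - alpha) * np.zeros(3)])
print("cold-start user assigned to cluster:", km.predict(new_user.reshape(1, -1))[0])
```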

A study on the Greeting's Types of Ganchal in Joseon Dynasty (간찰(簡札)의 안부인사(安否人事)에 대한 유형(類型) 연구(硏究))

  • Jeon, Byeong-yong
    • (The)Study of the Eastern Classic / no.57 / pp.467-505 / 2014
  • I have been working for many years on a series of Korean linguistic studies targeting Ganchal (old letters in Korea), and this study, as part of that series, develops a typology of the [Safety Expression]. For this purpose, [Safety Expressions] were divided into formal types and semantic types, targeting Chinese Ganchal and Hangul Ganchal of the Early Modern Korean period (16th to 19th centuries). Formal types can be divided based on whether the expression is in its normal position, whether elements are omitted, whether the letter is a sending letter or a reply, and whether the relationship between sender and receiver is one of higher and lower status. The normal-position, complete form was made the first type, which best reveals the typicality of the [Safety Expression]; the original-position form with [Own Safety] omitted was made the second type; the original-position form with [Opposite Safety] omitted the third type; the original-position form with the [Safety Expression] omitted the fourth type; and the inversion type, the most severe solecism in [Safety Expressions], was made the fifth type. The first type refers to the original-position, complete type in which [Opposite Safety] precedes [Own Safety] and all semantic elements are present. This type can be regarded as the most typical and normative in that it is equipped with all components of the [Safety Expression]. The second type is one in which the [Safety Expression] is composed of only [Opposite Safety]. Although this type is less complete than the first as a set pattern, it is never outdone in frequency of appearance: it is used because, as long as [Opposite Safety] is asked faithfully, omitting [Own Safety] does not greatly deviate from politeness and makes the Ganchal easier to write. The third type is the original-position type showing the configuration [Opposite Safety + Own Safety], but with [Opposite Safety] omitted. The fourth type is an original-position type showing the configuration [Opposite Safety + Own Safety], but with the [Safety Expression] omitted; it is divided into A, in which the [Safety Expression] is entirely omitted, and B, in which a conventional expression such as 'saving trouble' replaces the [Safety Expression]. The fifth type is the inversion type, which shows the structure [Own Safety + Opposite Safety], unlike the original-position types. This type is the most severe solecism, and real examples are very rare, because putting [Own Safety] first and asking [Opposite Safety] later to save face offends against common decency. In addition, it can be divided into the direct type, in which [Opposite Safety] and [Own Safety] are directly connected, and the indirect type, in which they are separated by the [story]. The semantic types of the [Safety Expression] can be classified based on whether the letter is a sending letter or a reply, fast or slow, intimate or not, and isolated or not. For the sending letter, the [Safety Expression] consists of [Opposite Safety (Climate + Inquiry after health + Mental state) + Own Safety (Status + Inquiry after health + Mental state)]. In [Opposite Safety], [Climate] can be subdivided into [Season] information and [Climate (weather)] information, and [Mental state] is divided into the receiver's [Family Safety Mental state] and [Individual Safety Mental state]. In [Own Safety], [Status] is divided into [Recent condition], the situation up to now, and [Present condition], the ongoing situation. [Inquiry after health] is likewise subdivided into [Family Safety] and [Individual Safety], and [Safety] into [Family Safety] and [Individual Safety]. Likewise, [Inquiry after health] and [Safety] are usually used as pairs, in the dimensions of [Family] and [Individual].
This phenomenon seems to have arisen from the extended family system, in which one takes care of one's parents or grandparents. As for the written reply, the [Safety Expression] consists of [Opposite Safety (Reception + Inquiry after health + Mental state) + Own Safety (Status + Inquiry after health + Mental state)], and only in [Opposite Safety] does the semantic structure differ from that of the sending letter. In [Opposite Safety], [Reception] is divided into [Letter], a Ganchal received directly, and [Message], news received indirectly from other people. [Safety] is divided into [Family Safety] and [Individual Safety], and [Mental state] likewise into [Family Safety Mental state] and [Individual Safety Mental state].