• Title/Summary/Keyword: internet based information system


Multi-Dimensional Analysis Method of Product Reviews for Market Insight (마켓 인사이트를 위한 상품 리뷰의 다차원 분석 방안)

  • Park, Jeong Hyun;Lee, Seo Ho;Lim, Gyu Jin;Yeo, Un Yeong;Kim, Jong Woo
    • Journal of Intelligence and Information Systems / v.26 no.2 / pp.57-78 / 2020
  • With the development of the Internet, consumers can easily check product information through E-Commerce. Product reviews used in the purchasing process are based on user experience, allowing consumers to act as producers of information as well as to consult it. This can make purchasing decisions more efficient from the consumer's perspective, and from the seller's point of view it can help develop products and strengthen their competitiveness. However, it takes a lot of time and effort to read the vast number of reviews that E-Commerce offers for the products a consumer wants to compare, and to understand both the overall assessment and the assessment dimensions that the consumer considers important. This is because product reviews are unstructured information, so the sentiment and the assessment dimension of a review are difficult to read immediately. For example, consumers who want to purchase a laptop would like to check how the candidate products are assessed on each dimension, such as performance, weight, delivery, speed, and design. Therefore, in this paper, we propose a method to automatically generate multi-dimensional product assessment scores from the reviews of the products to be compared. The method presented in this study consists largely of two phases: a pre-preparation phase and an individual product scoring phase. In the pre-preparation phase, a dimensioned classification model and a sentiment analysis model are created from reviews of the large category product group. By combining word embedding and association analysis, the dimensioned classification model compensates for a limitation of existing studies, in which word embedding methods for finding the relevance between dimensions and words consider only the distance between words in sentences. 
Sentiment analysis models generate CNN models by organizing learning data tagged with positives and negatives on a phrase unit for accurate polarity detection. Through this, the individual product scoring phase applies the models pre-prepared for the phrase unit review. Multi-dimensional assessment scores can be obtained by aggregating them by assessment dimension according to the proportion of reviews organized like this, which are grouped among those that are judged to describe a specific dimension for each phrase. In the experiment of this paper, approximately 260,000 reviews of the large category product group are collected to form a dimensioned classification model and a sentiment analysis model. In addition, reviews of the laptops of S and L companies selling at E-Commerce are collected and used as experimental data, respectively. The dimensioned classification model classified individual product reviews broken down into phrases into six assessment dimensions and combined the existing word embedding method with an association analysis indicating frequency between words and dimensions. As a result of combining word embedding and association analysis, the accuracy of the model increased by 13.7%. The sentiment analysis models could be seen to closely analyze the assessment when they were taught in a phrase unit rather than in sentences. As a result, it was confirmed that the accuracy was 29.4% higher than the sentence-based model. Through this study, both sellers and consumers can expect efficient decision making in purchasing and product development, given that they can make multi-dimensional comparisons of products. In addition, text reviews, which are unstructured data, were transformed into objective values such as frequency and morpheme, and they were analysed together using word embedding and association analysis to improve the objectivity aspects of more precise multi-dimensional analysis and research. 
This makes it an attractive analysis model: it enables more effective service deployment in an evolving, fiercely competitive E-Commerce market, and it serves both sellers and consumers.
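
The scoring phase described in this abstract can be illustrated with a minimal sketch. The abstract does not give the exact aggregation formula, so the code below assumes that each phrase has already been labeled with one dimension and one polarity by the two models, and that a dimension's score is simply the share of its phrases tagged positive. The function name `dimension_scores` and the labels `"pos"`/`"neg"` are hypothetical.

```python
from collections import defaultdict


def dimension_scores(phrases):
    """Aggregate phrase-level labels into one score per assessment dimension.

    phrases: iterable of (dimension, polarity) pairs, polarity in {"pos", "neg"}.
    Assumption: score = positive phrases / all phrases for that dimension.
    """
    counts = defaultdict(lambda: [0, 0])  # dimension -> [positive, total]
    for dimension, polarity in phrases:
        counts[dimension][1] += 1
        if polarity == "pos":
            counts[dimension][0] += 1
    return {d: pos / total for d, (pos, total) in counts.items()}


# Hypothetical phrase-level output of the classification and sentiment models.
sample = [
    ("performance", "pos"),
    ("performance", "pos"),
    ("performance", "neg"),
    ("weight", "neg"),
    ("design", "pos"),
]
scores = dimension_scores(sample)
```

Two products can then be compared dimension by dimension by computing one such dictionary per product.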

An Exploratory Study for Identifying Key Factors in Online Games Development Strategy Utilizing Web Community (온라인게임 개발전략에 관한 탐색적 연구 : 게임 커뮤니티 활용을 중심으로)

  • Jung, Jai-Jin;Chang, Chung-Moo;Kim, Tae-Ung
    • The KIPS Transactions:PartD / v.11D no.4 / pp.991-1002 / 2004
  • The online game business has emerged as the most lucrative entertainment industry, with over 20 million players. The popularity of online games can be attributed to the presence of numerous PC Bangs around the country, which have pushed online games into mainstream culture, while broadband internet services have facilitated online game play. The age distribution of online game players is expanding, and a variety of new games are under development to target certain age groups. While the online game market continues to expand, with many new online game publishers entering the market, relatively little is known about which factors are strategically important for the successful development of online games. A conceptual framework is proposed, and a structural equation model for identifying the factors affecting the market success of online games is developed. The concepts of online game community, idea generation, systematic development strategy, flexible development process, use of demo versions, outsourcing, etc., are all introduced into the model as the independent variables affecting the success level of online games directly and indirectly. Based on data collected from a questionnaire survey, the validity of the model has been tested, and interesting conclusions have been drawn concerning the relationships between these variables. Statistical results show that utilizing the online game community and a systematic development strategy are the keys to successful online game development. Other interesting results concerning game development strategy are also provided. It is hoped that these results might provide useful guidelines for developing successful online game content. With a better understanding of key success factors, online game developers should be able to make adjustments in their development and marketing plans, providing them with a sustainable advantage over their competition.

Color-related Query Processing for Intelligent E-Commerce Search (지능형 검색엔진을 위한 색상 질의 처리 방안)

  • Hong, Jung A;Koo, Kyo Jung;Cha, Ji Won;Seo, Ah Jeong;Yeo, Un Yeong;Kim, Jong Woo
    • Journal of Intelligence and Information Systems / v.25 no.1 / pp.109-125 / 2019
  • As interest in intelligent search engines increases, various studies have been conducted to extract and utilize product-related features intelligently. In particular, when users search for goods in e-commerce search engines, the 'color' of a product is an important feature that describes the product. Therefore, it is necessary to deal with the synonyms of color terms in order to produce accurate results for users' color-related queries. Previous studies have suggested a dictionary-based approach to processing synonyms for color features. However, the dictionary-based approach has the limitation that it cannot handle unregistered color-related terms in user queries. In order to overcome the limitation of the conventional methods, this research proposes a model which extracts RGB values from an internet search engine in real time and outputs similar color names based on the designated color information. First, a color term dictionary was constructed for the basic color search; it includes color names and the R, G, B values of each color from the Korean color standard digital palette program and the Wikipedia color list. The dictionary was made more robust by adding 138 color names converted from English color names to foreign words in Korean, with corresponding RGB values. As a result, the final color dictionary includes a total of 671 color names and corresponding RGB values. The method proposed in this research starts with the specific color that a user searched for. Then, the presence of the searched color in the built-in color dictionary is checked. If the color exists in the dictionary, the RGB values of the color in the dictionary are used as the reference values of the retrieved color. If the searched color does not exist in the dictionary, the top-5 Google image search results for the searched color are crawled, and average RGB values are extracted from a certain middle area of each image. 
To extract the RGB values in images, a variety of different ways was attempted since there are limits to simply obtain the average of the RGB values of the center area of images. As a result, clustering RGB values in image's certain area and making average value of the cluster with the highest density as the reference values showed the best performance. Based on the reference RGB values of the searched color, the RGB values of all the colors in the color dictionary constructed aforetime are compared. Then a color list is created with colors within the range of ${\pm}50$ for each R value, G value, and B value. Finally, using the Euclidean distance between the above results and the reference RGB values of the searched color, the color with the highest similarity from up to five colors becomes the final outcome. In order to evaluate the usefulness of the proposed method, we performed an experiment. In the experiment, 300 color names and corresponding color RGB values by the questionnaires were obtained. They are used to compare the RGB values obtained from four different methods including the proposed method. The average euclidean distance of CIE-Lab using our method was about 13.85, which showed a relatively low distance compared to 3088 for the case using synonym dictionary only and 30.38 for the case using the dictionary with Korean synonym website WordNet. The case which didn't use clustering method of the proposed method showed 13.88 of average euclidean distance, which implies the DBSCAN clustering of the proposed method can reduce the Euclidean distance. This research suggests a new color synonym processing method based on RGB values that combines the dictionary method with the real time synonym processing method for new color names. This method enables to get rid of the limit of the dictionary-based approach which is a conventional synonym processing method. 
This research can contribute to improving the intelligence of e-commerce search systems, especially their color search feature.
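
The final matching step in this abstract (keep dictionary colors within ±50 of the reference on each of R, G, and B, then rank by Euclidean distance and keep up to five) can be sketched as follows. This is only an illustration under assumptions: the function name `similar_colors`, the tiny `palette`, and the reference value are hypothetical, and the crawling and clustering steps that produce the reference RGB values are not shown.

```python
import math


def similar_colors(reference, dictionary, window=50, top_k=5):
    """Return up to top_k color names closest to the reference RGB triple.

    A color is a candidate only if each of its R, G, B values lies within
    +/- window of the reference; candidates are ranked by Euclidean distance.
    """
    candidates = [
        (name, rgb)
        for name, rgb in dictionary.items()
        if all(abs(c - c0) <= window for c, c0 in zip(rgb, reference))
    ]
    candidates.sort(key=lambda item: math.dist(item[1], reference))
    return [name for name, _ in candidates[:top_k]]


# Hypothetical miniature color dictionary (the paper's has 671 entries).
palette = {
    "red": (255, 0, 0),
    "crimson": (220, 20, 60),
    "tomato": (255, 99, 71),
    "navy": (0, 0, 128),
}
result = similar_colors((230, 10, 40), palette)
```

The per-channel window acts as a cheap pre-filter, so the distance ranking only runs on colors that are already plausibly similar.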

SKU recommender system for retail stores that carry identical brands using collaborative filtering and hybrid filtering (협업 필터링 및 하이브리드 필터링을 이용한 동종 브랜드 판매 매장간(間) 취급 SKU 추천 시스템)

  • Joe, Denis Yongmin;Nam, Kihwan
    • Journal of Intelligence and Information Systems / v.23 no.4 / pp.77-110 / 2017
  • Recently, the diversification and individualization of consumption patterns through the web and mobile devices based on the Internet have been rapid. As this happens, the efficient operation of the offline store, which is a traditional distribution channel, has become more important. In order to raise both the sales and profits of stores, stores need to supply and sell the most attractive products to consumers in a timely manner. However, there is a lack of research on which SKUs, out of many products, can increase sales probability and reduce inventory costs. In particular, if a company sells products through multiple in-store stores across multiple locations, it would be helpful to increase sales and profitability of stores if SKUs appealing to customers are recommended. In this study, the recommender system (recommender system such as collaborative filtering and hybrid filtering), which has been used for personalization recommendation, is suggested by SKU recommendation method of a store unit of a distribution company that handles a homogeneous brand through a plurality of sales stores by country and region. We calculated the similarity of each store by using the purchase data of each store's handling items, filtering the collaboration according to the sales history of each store by each SKU, and finally recommending the individual SKU to the store. In addition, the store is classified into four clusters through PCA (Principal Component Analysis) and cluster analysis (Clustering) using the store profile data. The recommendation system is implemented by the hybrid filtering method that applies the collaborative filtering in each cluster and measured the performance of both methods based on actual sales data. Most of the existing recommendation systems have been studied by recommending items such as movies and music to the users. In practice, industrial applications have also become popular. 
In the meantime, there has been little research on recommending SKUs for each store by applying these recommendation systems, which have been mainly dealt with in the field of personalization services, to the store units of distributors handling similar brands. If the recommendation method of the existing recommendation methodology was 'the individual field', this study expanded the scope of the store beyond the individual domain through a plurality of sales stores by country and region and dealt with the store unit of the distribution company handling the same brand SKU while suggesting a recommendation method. In addition, if the existing recommendation system is limited to online, it is recommended to apply the data mining technique to develop an algorithm suitable for expanding to the store area rather than expanding the utilization range offline and analyzing based on the existing individual. The significance of the results of this study is that the personalization recommendation algorithm is applied to a plurality of sales outlets handling the same brand. A meaningful result is derived and a concrete methodology that can be constructed and used as a system for actual companies is proposed. It is also meaningful that this is the first attempt to expand the research area of the academic field related to the existing recommendation system, which was focused on the personalization domain, to a sales store of a company handling the same brand. From 05 to 03 in 2014, the number of stores' sales volume of the top 100 SKUs are limited to 52 SKUs by collaborative filtering and the hybrid filtering method SKU recommended. We compared the performance of the two recommendation methods by totaling the sales results. 
The reason for comparing the two recommendation methods is that the recommendation method of this study is defined as the reference model in which offline collaborative filtering is applied to demonstrate higher performance than the existing recommendation method. The results of this model are compared with those of the hybrid filtering method, which is a model that reflects the characteristics of the offline store view. The proposed method showed higher performance than the existing recommendation method, as demonstrated with the actual sales data of large Korean apparel companies. In this study, we propose a method to extend the recommendation system from the individual level to the group level and to approach it efficiently, which is of great value in addition to the theoretical framework.
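
As a rough illustration of store-level collaborative filtering, the sketch below treats each store as a "user" and each SKU as an "item". The abstract does not state the similarity measure or scoring rule, so cosine similarity over per-SKU sales and a similarity-weighted sum are assumptions, as are the function names and the sample data.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length sales vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def recommend_skus(sales, target, top_n=2):
    """Recommend SKU indices the target store does not yet sell.

    sales: store name -> list of unit sales per SKU (same SKU order everywhere).
    Each candidate SKU is scored by the sales of other stores, weighted by
    how similar those stores are to the target store.
    """
    target_row = sales[target]
    scores = [0.0] * len(target_row)
    for store, row in sales.items():
        if store == target:
            continue
        sim = cosine(target_row, row)
        for i, qty in enumerate(row):
            scores[i] += sim * qty
    unsold = [i for i, qty in enumerate(target_row) if qty == 0]
    return sorted(unsold, key=lambda i: scores[i], reverse=True)[:top_n]


# Hypothetical sales matrix: three stores, four SKUs.
sales = {
    "store_a": [5, 3, 0, 0],
    "store_b": [4, 2, 6, 0],
    "store_c": [0, 0, 1, 7],
}
recs = recommend_skus(sales, "store_a", top_n=1)
```

A hybrid variant in the spirit of the abstract would first group stores into clusters from profile data and then run the same function inside each cluster.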

Effects of firm strategies on customer acquisition of Software as a Service (SaaS) providers: A mediating and moderating role of SaaS technology maturity (SaaS 기업의 차별화 및 가격전략이 고객획득성과에 미치는 영향: SaaS 기술성숙도 수준의 매개효과 및 조절효과를 중심으로)

  • Chae, SeongWook;Park, Sungbum
    • Journal of Intelligence and Information Systems / v.20 no.3 / pp.151-171 / 2014
  • Firms today have sought management effectiveness and efficiency utilizing information technologies (IT). Numerous firms are outsourcing specific information systems functions to cope with their short of information resources or IT experts, or to reduce their capital cost. Recently, Software-as-a-Service (SaaS) as a new type of information system has become one of the powerful outsourcing alternatives. SaaS is software deployed as a hosted and accessed over the internet. It is regarded as the idea of on-demand, pay-per-use, and utility computing and is now being applied to support the core competencies of clients in areas ranging from the individual productivity area to the vertical industry and e-commerce area. In this study, therefore, we seek to quantify the value that SaaS has on business performance by examining the relationships among firm strategies, SaaS technology maturity, and business performance of SaaS providers. We begin by drawing from prior literature on SaaS, technology maturity and firm strategy. SaaS technology maturity is classified into three different phases such as application service providing (ASP), Web-native application, and Web-service application. Firm strategies are manipulated by the low-cost strategy and differentiation strategy. Finally, we considered customer acquisition as a business performance. In this sense, specific objectives of this study are as follows. First, we examine the relationships between customer acquisition performance and both low-cost strategy and differentiation strategy of SaaS providers. Secondly, we investigate the mediating and moderating effects of SaaS technology maturity on those relationships. For this purpose, study collects data from the SaaS providers, and their line of applications registered in the database in CNK (Commerce net Korea) in Korea using a questionnaire method by the professional research institution. 
The unit of analysis in this study is the SBUs (strategic business unit) in the software provider. A total of 199 SBUs is used for analyzing and testing our hypotheses. With regards to the measurement of firm strategy, we take three measurement items for differentiation strategy such as the application uniqueness (referring an application aims to differentiate within just one or a small number of target industry), supply channel diversification (regarding whether SaaS vendor had diversified supply chain) as well as the number of specialized expertise and take two items for low cost strategy like subscription fee and initial set-up fee. We employ a hierarchical regression analysis technique for testing moderation effects of SaaS technology maturity and follow the Baron and Kenny's procedure for determining if firm strategies affect customer acquisition through technology maturity. Empirical results revealed that, firstly, when differentiation strategy is applied to attain business performance like customer acquisition, the effects of the strategy is moderated by the technology maturity level of SaaS providers. In other words, securing higher level of SaaS technology maturity is essential for higher business performance. For instance, given that firms implement application uniqueness or a distribution channel diversification as a differentiation strategy, they can acquire more customers when their level of SaaS technology maturity is higher rather than lower. Secondly, results indicate that pursuing differentiation strategy or low cost strategy effectively works for SaaS providers' obtaining customer, which means that continuously differentiating their service from others or making their service fee (subscription fee or initial set-up fee) lower are helpful for their business success in terms of acquiring their customers. Lastly, results show that the level of SaaS technology maturity mediates the relationships between low cost strategy and customer acquisition. 
That is, based on our research design, customers usually perceive the real value of a low subscription fee or initial set-up fee only through the SaaS service provided by the vendor and, in turn, this affects their decision on whether or not to subscribe.

NUI/NUX of the Virtual Monitor Concept using the Concentration Indicator and the User's Physical Features (사용자의 신체적 특징과 뇌파 집중 지수를 이용한 가상 모니터 개념의 NUI/NUX)

  • Jeon, Chang-hyun;Ahn, So-young;Shin, Dong-il;Shin, Dong-kyoo
    • Journal of Internet Computing and Services / v.16 no.6 / pp.11-21 / 2015
  • With growing interest in Human-Computer Interaction (HCI), research on HCI has been actively conducted. Along with that, research on Natural User Interface/Natural User eXperience (NUI/NUX), which uses the user's gestures and voice, has also been actively conducted. NUI/NUX needs recognition algorithms such as gesture recognition or voice recognition. However, these recognition algorithms have weaknesses: their implementation is complex, and training takes a lot of time because they have to go through steps including preprocessing, normalization, and feature extraction. Recently, Microsoft launched Kinect as an NUI/NUX development tool; it has attracted people's attention, and studies using Kinect have been conducted. In a previous study, the authors of this paper implemented a hand-mouse interface with outstanding intuitiveness using the physical features of a user. However, it had weaknesses such as unnatural mouse movement and low accuracy of mouse functions. In this study, we designed and implemented a hand-mouse interface that introduces a new concept called the 'Virtual monitor', extracting the user's physical features through Kinect in real time. The virtual monitor is a virtual space that can be controlled by the hand mouse. A coordinate on the virtual monitor can be accurately mapped onto a coordinate on the real monitor. The hand-mouse interface based on the virtual monitor concept maintains the outstanding intuitiveness that was the strength of the previous study and enhances the accuracy of mouse functions. Further, we increased the accuracy of the interface by recognizing the user's unnecessary actions using the concentration indicator from his or her electroencephalogram (EEG) data. In order to evaluate the intuitiveness and accuracy of the interface, we tested it with 50 people ranging in age from their teens to their 50s. In the intuitiveness experiment, 84% of subjects learned how to use it within 1 minute. 
Also, in the accuracy experiment, the accuracy of the mouse functions was 80.4% for drag, 80% for click, and 76.7% for double-click. The intuitiveness and accuracy of the proposed hand-mouse interface were verified through experiments, and it is expected to be a good example of an interface for controlling a system by hand in the future.
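
The mapping from a virtual-monitor coordinate to a real-monitor coordinate mentioned in this abstract might look like the linear mapping below. The abstract does not describe how the virtual monitor is sized from the user's physical features, so the rectangle (`origin`, `size`), the units, the clamping behavior, and the function name `map_to_screen` are all assumptions made for illustration.

```python
def map_to_screen(hand, origin, size, screen=(1920, 1080)):
    """Linearly map a hand position inside the virtual monitor to a pixel.

    hand:   (x, y) hand position in sensor coordinates.
    origin: (x, y) of the virtual monitor's top-left corner (assumed).
    size:   (width, height) of the virtual monitor (assumed).
    Positions outside the virtual monitor are clamped to the screen edge.
    """
    (hx, hy), (ox, oy), (w, h) = hand, origin, size
    u = min(max((hx - ox) / w, 0.0), 1.0)  # normalized horizontal position
    v = min(max((hy - oy) / h, 0.0), 1.0)  # normalized vertical position
    return round(u * (screen[0] - 1)), round(v * (screen[1] - 1))


# Hypothetical 0.5 x 0.5 virtual monitor anchored at the sensor origin.
top_left = map_to_screen((0.0, 0.0), origin=(0.0, 0.0), size=(0.5, 0.5))
clamped = map_to_screen((1.0, 1.0), origin=(0.0, 0.0), size=(0.5, 0.5))
```

Because the mapping is normalized first, the same hand movement covers the whole screen regardless of screen resolution.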

User-Perspective Issue Clustering Using Multi-Layered Two-Mode Network Analysis (다계층 이원 네트워크를 활용한 사용자 관점의 이슈 클러스터링)

  • Kim, Jieun;Kim, Namgyu;Cho, Yoonho
    • Journal of Intelligence and Information Systems / v.20 no.2 / pp.93-107 / 2014
  • In this paper, we report what we have observed with regard to user-perspective issue clustering based on multi-layered two-mode network analysis. This work is significant in the context of data collection by companies about customer needs. Most companies have failed to uncover such needs for products or services properly in terms of demographic data such as age, income levels, and purchase history. Because of excessive reliance on limited internal data, most recommendation systems do not provide decision makers with appropriate business information for current business circumstances. However, part of the problem is the increasing regulation of personal data gathering and privacy. This makes demographic or transaction data collection more difficult, and is a significant hurdle for traditional recommendation approaches because these systems demand a great deal of personal data or transaction logs. Our motivation for presenting this paper to academia is our strong belief, and evidence, that most customers' requirements for products can be effectively and efficiently analyzed from unstructured textual data such as Internet news text. In order to derive users' requirements from textual data obtained online, the proposed approach in this paper attempts to construct double two-mode networks, such as a user-news network and news-issue network, and to integrate these into one quasi-network as the input for issue clustering. One of the contributions of this research is the development of a methodology utilizing enormous amounts of unstructured textual data for user-oriented issue clustering by leveraging existing text mining and social network analysis. In order to build multi-layered two-mode networks of news logs, we need some tools such as text mining and topic analysis. We used not only SAS Enterprise Miner 12.1, which provides a text miner module and cluster module for textual data analysis, but also NetMiner 4 for network visualization and analysis. 
Our approach for user-perspective issue clustering is composed of six main phases: crawling, topic analysis, access pattern analysis, network merging, network conversion, and clustering. In the first phase, we collect visit logs for news sites with a crawler. After gathering unstructured news article data, the topic analysis phase extracts issues from each news article in order to build an article-issue network. For simplicity, 100 topics are extracted from 13,652 articles. In the third phase, a user-article network is constructed with access patterns derived from web transaction logs. The double two-mode networks are then merged into a user-issue quasi-network. Finally, in the user-oriented issue-clustering phase, we classify issues through structural equivalence, and compare these with the clustering results from statistical tools and network analysis. An experiment with a large dataset was performed to build a multi-layer two-mode network. After that, we compared the results of issue clustering from SAS with those of network analysis. The experimental dataset was from a website ranking site and the biggest portal site in Korea. The sample dataset contains 150 million transaction logs and 13,652 news articles of 5,000 panels over one year. User-article and article-issue networks are constructed and merged into a user-issue quasi-network using NetMiner. Our issue-clustering results applied the Partitioning Around Medoids (PAM) algorithm and Multidimensional Scaling (MDS), and are consistent with the results from SAS clustering. In spite of extensive efforts to provide user information with recommendation systems, most projects are successful only when companies have sufficient data about users and transactions. Our proposed methodology, user-perspective issue clustering, can provide practical support to decision-making in companies because it enhances user-related data from unstructured textual data. 
To overcome the problem of insufficient data from traditional approaches, our methodology infers customers' real interests by utilizing web transaction logs. In addition, we suggest topic analysis and issue clustering as a practical means of issue identification.
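
The network-merging phase can be sketched as composing the two two-mode networks through their shared article nodes. The abstract does not specify how edge weights are combined, so multiplying visit counts by article-issue weights and summing over articles (equivalent to a matrix product) is an assumption, as are the function name and the sample data.

```python
def merge_two_mode(user_article, article_issue):
    """Compose user-article and article-issue networks into user-issue weights.

    user_article:  user -> {article: visit count}
    article_issue: article -> {issue: weight}
    Assumption: user-issue weight = sum over articles of visits * weight.
    """
    user_issue = {}
    for user, articles in user_article.items():
        row = {}
        for article, visits in articles.items():
            for issue, weight in article_issue.get(article, {}).items():
                row[issue] = row.get(issue, 0) + visits * weight
        user_issue[user] = row
    return user_issue


# Hypothetical miniature networks.
user_article = {"u1": {"a1": 2, "a2": 1}, "u2": {"a2": 3}}
article_issue = {"a1": {"economy": 1}, "a2": {"economy": 1, "sports": 1}}
quasi = merge_two_mode(user_article, article_issue)
```

The resulting user-issue table is the kind of input on which issues could then be clustered, for example by comparing the columns of users who reach each issue.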

The Building Plan of Online ADR Model related to the International Commercial Transaction Dispute Resolution (국제상거래 분쟁해결을 위한 온라인 ADR 모델 구축방안)

  • Kim Sun-Kwang;Kim Jong-Rack;Hong Sung-Kyu
    • Journal of Arbitration Studies / v.15 no.2 / pp.3-35 / 2005
  • The meaning of Online ADR lies in the prompt and economical resolution of disputes by applying the information/communication element (Internet) to existing ADR. However, if promptness and economic efficiency are overemphasized, the fairness and appropriateness of dispute resolution may be compromised, and consequently Online ADR will be belittled and criticized as a second-class trial. In addition, as communication in Online ADR is mostly text-based, it is difficult to investigate cases, to create atmosphere, and to induce the dynamic feelings that are possible when disputes are resolved through face-to-face contact. Despite such difficulties, Online ADR is expanding not only online but also offline due to its advantages such as promptness, low expenses, and improved resolution methods, and it is expected to develop rapidly as the electronic government has decided to adopt it in the future. Accordingly, the following points must be focused on for the continuous development of Online ADR. First, in the legal and institutional aspects, it is necessary to establish a framework law on ADR. A framework law on ADR comprehending existing mediation and arbitration should be established, and it must cover Online ADR, which utilizes electronic means of communication. However, it is too early to establish a separate law for Online ADR because Online ADR must develop based on the theoretical system of ADR. Second, although Online ADR is expanding rapidly, it may take time for it to become established as a tool of dispute resolution. Additionally, as discussed earlier, if the amount of money in dispute is large or the dispute is complicated, Online ADR may have a negative effect on the resolution of the dispute. Thus, it is necessary to apply Online ADR to trivial or domestic cases in the early stage, accumulating experience and correcting errors. 
Moreover, in order to settle numerous disputes effectively, Online ADR cases should be analyzed systematically and classified by type so that similar disputes may be settled automatically. What is more, these requirements should be reflected in the development of Online ADR systems. Third, the application of Online ADR is being expanded to consumer disputes, domain name disputes, commercial disputes, legal disputes, etc.; millions of cases are settled through Online ADR, and 115 Online ADR sites are in operation throughout the world. Thus Online ADR requires not temporary but continuous attention, and mediators and arbitrators participating in Online ADR should be more intensively educated on negotiation and information technologies. In particular, government-led research projects should be promoted to establish an Online ADR model, and these projects should be supported by comprehensive research on mediation, arbitration, and Online ADR. Fourth, what is most important in the continuous development and expansion of Online ADR is to secure confidence in Online ADR and to publicize it to users. For this, incentives and rewards should be given to specialists such as lawyers when they participate in Online ADR as mediators and arbitrators in order to improve their expertise. What is more, from the early stage, the government and public institutions should take the initiative in promoting Online ADR so that parties involved in disputes recognize the substantial contribution of Online ADR to dispute resolution. Lastly, dispute resolution through Online ADR is performed by organizations such as the Korea Institute for Electronic Commerce and the Korea Consumer Protection Board, and partially by the Korean Commercial Arbitration Board. Online ADR is expected to expand to offline commercial disputes in the future. In response to this, the Korean Commercial Arbitration Board, which is an organization for commercial dispute resolution, needs to be restructured.


Earthquake Monitoring : Future Strategy (지진관측 : 미래 발전 전략)

  • Chi, Heon-Cheol;Park, Jung-Ho;Kim, Geun-Young;Shin, Jin-Soo;Shin, In-Cheul;Lim, In-Seub;Jeong, Byung-Sun;Sheen, Dong-Hoon
    • Geophysics and Geophysical Exploration / v.13 no.3 / pp.268-276 / 2010
  • The Earthquake Hazard Mitigation Law came into force in March 2009. Under the law, the obligation to monitor the effects of earthquakes on facilities was extended to many organizations, such as gas companies and local governments. Based on the estimate of the National Emergency Management Agency (NEMA), the number of free-surface acceleration stations would be expanded to more than 400. The advent of the internet protocol and simplified operation have allowed the quick and easy installation of seismic stations. In addition, the dynamic range of seismic instruments has been continuously improved, enough to evaluate damage intensity and to issue alarms directly for earthquake hazard mitigation. For direct visualization of damage intensity and area, Real Time Intensity COlor Mapping (RTICOM) is explained in detail. RTICOM would be used to retrieve the essential information for damage evaluation, Peak Ground Acceleration (PGA). Destructive earthquake damage is usually due to surface waves, which follow just after the S wave. The peak amplitude of the surface wave can be pre-estimated from the amplitude and frequency content of the first-arrival P wave. An Earthquake Early Warning (EEW) system is conventionally defined as one that estimates local magnitude from the P wave. The status of EEW is reviewed, and the application of EEW to the Odesan earthquake is illustrated with ShakeMap in order to make its appearance clear. In terms of rapidity, the earthquake announcements of the Korea Meteorological Agency (KMA) might be dramatically improved by the adoption of EEW. In order to realize hazard mitigation, EEW should be applied to crucial local facilities such as nuclear power plants and fragile semiconductor plants. Distributed EEW is introduced with the application example of the Uljin earthquake. For both nationwide and locally distributed EEW applications, all relevant information needs to be shared in real time. 
The plan for extending the Korea Integrated Seismic System (KISS) is briefly explained with a view to future cooperation in data sharing and utilization.
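
As a rough illustration of the PGA-based color mapping described in this abstract, the following minimal sketch takes PGA as the largest absolute value in an acceleration trace and maps it to a display color. The function names, the unit (gal, i.e. cm/s²), and the color bin boundaries are all assumptions made for this example; the abstract does not specify the thresholds RTICOM actually uses.

```python
def peak_ground_acceleration(samples):
    """Return the largest absolute acceleration in a trace.

    `samples` is any non-empty iterable of acceleration values;
    the result is in the same unit as the input.
    """
    return max(abs(a) for a in samples)


# Hypothetical color bins (upper bound in gal, color name).
# These boundaries are illustrative only, not RTICOM's real scale.
HYPOTHETICAL_BINS = ((1.0, "green"), (10.0, "yellow"), (100.0, "orange"))


def color_for_pga(pga_gal, bins=HYPOTHETICAL_BINS):
    """Map a PGA value to the first bin whose upper bound exceeds it."""
    for upper, color in bins:
        if pga_gal < upper:
            return color
    return "red"  # anything at or above the last bound


# Example: a short made-up trace from one station.
trace = [0.2, -3.5, 1.1]
pga = peak_ground_acceleration(trace)
station_color = color_for_pga(pga)
```

In a real deployment the trace would be a continuous stream per station, and the mapping would follow whatever intensity scale the operating agency has adopted.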

A Methodology for Automatic Multi-Categorization of Single-Categorized Documents (단일 카테고리 문서의 다중 카테고리 자동확장 방법론)

  • Hong, Jin-Sung;Kim, Namgyu;Lee, Sangwon
    • Journal of Intelligence and Information Systems
    • /
    • v.20 no.3
    • /
    • pp.77-92
    • /
    • 2014
  • Recently, numerous documents containing unstructured data and text have been created due to the rapid increase in the use of social media and the Internet. Each document is usually assigned a specific category for the convenience of users. In the past, categorization was performed manually. However, manual categorization cannot guarantee accuracy, and it requires a large amount of time and cost. Many studies have been conducted on the automatic creation of categories to overcome the limitations of manual categorization. Unfortunately, most of these methods cannot be applied to complex documents with multiple topics because they assume that one document can be assigned to only one category. To overcome this limitation, some studies have attempted to categorize each document into multiple categories. However, these are also limited in that their learning process requires a multi-categorized document set for training. These methods therefore cannot be applied to the multi-categorization of most documents unless multi-categorized training sets are provided. To remove the requirement for a multi-categorized training set imposed by traditional multi-categorization algorithms, we propose a new methodology that can extend the category of a single-categorized document to multiple categories by analyzing the relationships among categories, topics, and documents. First, we find the relationship between documents and topics by using the result of topic analysis for single-categorized documents. Second, we construct a correspondence table between topics and categories by investigating the relationship between them. Finally, we calculate the matching scores of each document against multiple categories. 
The results imply that a document can be classified into a certain category if and only if its matching score is higher than the predefined threshold. For example, we can classify a certain document into three categories whose matching scores exceed the predefined threshold. The main contribution of our study is that our methodology can improve the applicability of traditional multi-category classifiers by generating multi-categorized documents from single-categorized documents. Additionally, we propose a module for verifying the accuracy of the proposed methodology. For performance evaluation, we performed intensive experiments with news articles. News articles are clearly categorized by theme, and they contain less vulgar language and slang than other ordinary text documents. We collected news articles from July 2012 to June 2013. The number of articles varies widely across categories. This is because readers have different levels of interest in each category, and also because events occur with different frequencies in each category. To minimize the distortion of the result caused by the differing number of articles per category, we extracted 3,000 articles from each of the eight categories. Therefore, the total number of articles used in our experiments was 24,000. The eight categories were "IT Science," "Economy," "Society," "Life and Culture," "World," "Sports," "Entertainment," and "Politics." Using the news articles that we collected, we calculated the document/category correspondence scores from the topic/category and document/topic correspondence scores. The document/category correspondence score indicates the degree to which each document corresponds to a certain category. As a result, we could present two additional categories for each of 23,089 documents. 
Precision, recall, and F-score were 0.605, 0.629, and 0.617, respectively, when only the top 1 predicted category was evaluated, whereas they were 0.838, 0.290, and 0.431 when the top 1 to 3 predicted categories were considered. Notably, precision, recall, and F-score varied widely across the eight categories.
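
The scoring steps in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the document/category score is a weighted sum of topic/category scores over the document's topics, and all function names, topic labels, weights, and thresholds are hypothetical. The last function is the standard F-score (harmonic mean of precision and recall), which is consistent with the figures reported above.

```python
def category_scores(doc_topics, topic_categories):
    """Combine document/topic and topic/category scores.

    doc_topics:       {topic: weight of that topic in the document}
    topic_categories: {topic: {category: correspondence score}}
    Returns {category: document/category score} as a weighted sum
    (an assumed combination rule for this sketch).
    """
    scores = {}
    for topic, weight in doc_topics.items():
        for category, strength in topic_categories.get(topic, {}).items():
            scores[category] = scores.get(category, 0.0) + weight * strength
    return scores


def assign_categories(scores, threshold):
    """Return categories scoring above the threshold, best first."""
    chosen = [c for c, s in scores.items() if s > threshold]
    return sorted(chosen, key=lambda c: -scores[c])


def f_score(precision, recall):
    """Harmonic mean of precision and recall (F1)."""
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy data: one document spread over two topics.
doc_topics = {"t1": 0.5, "t2": 0.5}
topic_categories = {
    "t1": {"Economy": 1.0},
    "t2": {"Economy": 0.5, "Politics": 0.5},
}

scores = category_scores(doc_topics, topic_categories)
labels = assign_categories(scores, threshold=0.2)

# Reproduce the reported F-scores from the reported precision and recall.
f_top1 = round(f_score(0.605, 0.629), 3)
f_top3 = round(f_score(0.838, 0.290), 3)
```

Raising the threshold in this toy example to 0.5 leaves only "Economy", which mirrors the trade-off in the reported results: admitting more predicted categories changes precision and recall in opposite directions.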