• Title/Summary/Keyword: TextMining


Information types and characteristics within the Wireless Emergency Alert in COVID-19: Focusing on Wireless Emergency Alerts in Seoul (코로나 19 하에서 재난문자 내의 정보유형 및 특성: 서울특별시 재난문자를 중심으로)

  • Yoon, Sungwook;Nam, Kihwan
    • Journal of Intelligence and Information Systems / v.28 no.1 / pp.45-68 / 2022
  • The central and local governments of the Republic of Korea used wireless emergency alerts (WEAs) to provide the information needed for disaster response during the rapid spread of COVID-19. Among all channels for delivering disaster information, the wireless emergency alert is the most efficient: because it uses the Cell Broadcast Service (CBS) method, which broadcasts directly to mobile phones, recipients can access disaster information on their phones without having to search for it. In this study, the characteristics of the wireless emergency alerts sent in Seoul over one year and one month (January 2020 to January 2021) were derived through various text mining methodologies, and the types of information contained in the alerts were analyzed. In addition, the influence of the alerts on people's movement behavior was examined using population mobility by age in the districts of Seoul. After the key words and information included in each message were classified, a document cluster analysis technique based on the included words was applied so that each individual message could serve as a unit of analysis. The number of WEAs sent in Seoul has grown dramatically since the spread of COVID-19. In January 2020, only 10 WEAs were sent in Seoul, but the number increased 5 times in March, and 7.7 times over the previous months. Because basic and regional local governments were authorized to send wireless emergency alerts independently, sending behavior differs from one local government to another. 
Although most basic local governments sent more WEAs as the number of confirmed COVID-19 cases increased, the rate at which WEAs increased relative to confirmed cases differed by region. Using a structured econometric model, the effect of the disaster information in wireless emergency alerts on population mobility was measured separately as a baseline effect and an accumulating effect. Six types of disaster information (date, order, online URL, symptom, location, and normative guidance) were identified in WEAs and analyzed through econometric modelling. The types of information that significantly change population mobility were found to differ by age. Population mobility of people in their 60s and 70s decreased when wireless emergency alerts included information related to date and order. Because date and order information appears in WEAs that report confirmed COVID-19 cases, these results show that the mobility of older people decreased in response to messages reporting confirmed cases. Online information (URL) decreased the population mobility of people in their 20s, and information related to symptoms reduced the population mobility of people in their 30s. On the other hand, normative words encouraging compliance with quarantine policies did not cause significant changes in population mobility at any age. This means that only meaningful information that is useful for disaster response should be included in wireless emergency alerts. Repeated sending of wireless emergency alerts reduces the magnitude of the impact of disaster information on population mobility. This indirectly shows that, under the prolonged pandemic, people grew tired of receiving repetitive WEAs with similar content and began to react less. 
In order to use WEAs effectively for quarantine and for overcoming disaster situations, it is necessary to reduce the fatigue of recipients by sending WEAs only when necessary, and to raise awareness of WEAs.

A Literature Review and Classification of Recommender Systems on Academic Journals (추천시스템관련 학술논문 분석 및 분류)

  • Park, Deuk-Hee;Kim, Hyea-Kyeong;Choi, Il-Young;Kim, Jae-Kyeong
    • Journal of Intelligence and Information Systems / v.17 no.1 / pp.139-152 / 2011
  • Recommender systems have become an important research field since the emergence of the first paper on collaborative filtering in the mid-1990s. In general, recommender systems are defined as supporting systems that help users find information, products, or services (such as books, movies, music, digital products, web sites, and TV programs) by aggregating and analyzing suggestions from other users (that is, reviews from various authorities) and user attributes. Although academic research on recommender systems has increased significantly over the last ten years, more research is required before it can be applied in real-world situations, because the research field is still wide and less mature than other research fields. Accordingly, the existing articles on recommender systems need to be reviewed with a view to the next generation of recommender systems. However, given the nature of recommender system research, it is not easy to confine it to specific disciplines. We therefore reviewed all articles on recommender systems from 37 journals published from 2001 to 2010. The 37 journals were selected from the top 125 journals of the MIS Journal Rankings. The literature search was based on the descriptors "Recommender system", "Recommendation system", "Personalization system", "Collaborative filtering" and "Contents filtering". The full text of each article was reviewed to eliminate articles that were not actually related to recommender systems. Conference papers, master's and doctoral dissertations, textbooks, unpublished working papers, non-English publications, and news items were excluded as unfit for our research. We classified articles by year of publication, journal, recommendation field, and data mining technique. 
The recommendation fields and data mining techniques of 187 articles were reviewed and classified into eight recommendation fields (book, document, image, movie, music, shopping, TV program, and others) and eight data mining techniques (association rule, clustering, decision tree, k-nearest neighbor, link analysis, neural network, regression, and other heuristic methods). The results presented in this paper have several significant implications. First, based on previous publication rates, interest in recommender system research will grow significantly in the future. Second, 49 articles are related to movie recommendation, whereas image and TV program recommendation are identified in only 6 articles. This result is attributable to the ease of use of the MovieLens data set, so data sets for other fields need to be prepared. Third, social network analysis has recently been used in various applications, but studies on recommender systems using social network analysis are scarce. We expect that new recommendation approaches using social network analysis will be developed, and evaluating recommender system research that uses social network analysis will be an interesting area for further research. This review presents trends in recommender system research by examining the published literature, and provides practitioners and researchers with insight and future direction on recommender systems. We hope that this research helps anyone interested in recommender systems research to gain insight for future research.

A New Approach to Automatic Keyword Generation Using Inverse Vector Space Model (키워드 자동 생성에 대한 새로운 접근법: 역 벡터공간모델을 이용한 키워드 할당 방법)

  • Cho, Won-Chin;Rho, Sang-Kyu;Yun, Ji-Young Agnes;Park, Jin-Soo
    • Asia pacific journal of information systems / v.21 no.1 / pp.103-122 / 2011
  • Recently, numerous documents have been made available electronically. Internet search engines and digital libraries commonly return query results containing hundreds or even thousands of documents. In this situation, it is virtually impossible for users to examine complete documents to determine whether they might be useful. For this reason, some on-line documents are accompanied by a list of keywords specified by the authors to guide users and facilitate the filtering process. A set of keywords is thus often considered a condensed version of the whole document and plays an important role in document retrieval, Web page retrieval, document clustering, summarization, text mining, and so on. Since many academic journals ask authors to provide a list of five or six keywords on the first page of an article, keywords are most familiar in the context of journal articles. However, many other types of documents could also benefit from the use of keywords, including Web pages, email messages, news reports, magazine articles, and business papers. Although the potential benefit is large, implementation is the obstacle: manually assigning keywords to all documents is a daunting, even impractical, task because it is extremely tedious and time-consuming and requires a certain level of domain knowledge. Therefore, it is highly desirable to automate the keyword generation process. There are two main approaches to achieving this aim: the keyword assignment approach and the keyword extraction approach. Both approaches use machine learning methods and require, for training purposes, a set of documents with keywords already attached. In the former approach, there is a given vocabulary, and the aim is to match its terms to the texts. In other words, the keyword assignment approach seeks to select the words from a controlled vocabulary that best describe a document. 
Although this approach is domain dependent and is not easy to transfer and expand, it can generate implicit keywords that do not appear in a document. In the latter approach, by contrast, the aim is to extract keywords according to their relevance in the text, without a prior vocabulary. Here, automatic keyword generation is treated as a classification task, and keywords are commonly extracted using supervised learning techniques. Thus, keyword extraction algorithms classify candidate keywords in a document into positive or negative examples. Several systems, such as Extractor and Kea, were developed using the keyword extraction approach. The most indicative words in a document are selected as keywords for that document; as a result, keyword extraction is limited to terms that appear in the document and cannot generate implicit keywords that are not included in it. According to the experimental results of Turney, about 64% to 90% of keywords assigned by the authors can be found in the full text of an article. Conversely, 10% to 36% of the keywords assigned by the authors do not appear in the article and cannot be generated by keyword extraction algorithms. Our preliminary experiment also shows that 37% of keywords assigned by the authors are not included in the full text. This is why we decided to adopt the keyword assignment approach. In this paper, we propose a new approach to automatic keyword assignment, namely IVSM (Inverse Vector Space Model). The model is based on the vector space model, a conventional information retrieval model that represents documents and queries as vectors in a multidimensional space. IVSM generates an appropriate keyword set for a specific document by measuring the distance between the document and the keyword sets. 
The keyword assignment process of IVSM is as follows: (1) calculating the vector length of each keyword set based on each keyword weight; (2) preprocessing and parsing a target document that does not have keywords; (3) calculating the vector length of the target document based on the term frequency; (4) measuring the cosine similarity between each keyword set and the target document; and (5) generating keywords that have high similarity scores. Two keyword generation systems were implemented using IVSM: an IVSM system for a Web-based community service and a stand-alone IVSM system. The first is implemented in a community service for sharing knowledge and opinions on current trends such as fashion, movies, social problems, and health information. The stand-alone IVSM system is dedicated to generating keywords for academic papers and has been tested on a number of academic papers, including those published by the Korean Association of Shipping and Logistics, the Korea Research Academy of Distribution Information, the Korea Logistics Society, the Korea Logistics Research Association, and the Korea Port Economic Association. We measured the performance of IVSM by the number of matches between the IVSM-generated keywords and the author-assigned keywords. In our experiment, the precision of IVSM applied to the Web-based community service and to academic journals was 0.75 and 0.71, respectively. The performance of both systems is much better than that of baseline systems that generate keywords based on simple probability. IVSM also shows performance comparable to Extractor, a representative keyword extraction system developed by Turney. As the number of electronic documents increases, we expect that the IVSM proposed in this paper can be applied to many electronic documents in Web-based communities and digital libraries.
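
The five steps listed in the abstract can be illustrated with a minimal Python sketch. This is not the authors' IVSM implementation: the abstract does not say how keyword sets are weighted or how the document is preprocessed, so the weighted term profiles (`profiles`), whitespace tokenization, and the `top_n` cut-off below are all assumptions made for illustration.

```python
import math
from collections import Counter


def vector_length(weights):
    """Euclidean length of a sparse term-weight vector (steps 1 and 3)."""
    return math.sqrt(sum(w * w for w in weights.values()))


def cosine_similarity(a, b):
    """Cosine similarity between two sparse vectors (step 4)."""
    len_a, len_b = vector_length(a), vector_length(b)
    if len_a == 0 or len_b == 0:
        return 0.0
    dot = sum(w * b.get(term, 0.0) for term, w in a.items())
    return dot / (len_a * len_b)


def assign_keywords(document, keyword_profiles, top_n=2):
    """Return the top_n keywords whose profiles are most similar to the document."""
    # Step 2 (assumed): lowercase and split on whitespace, then count term frequency.
    term_freq = Counter(document.lower().split())
    scored = [
        (cosine_similarity(profile, term_freq), keyword)
        for keyword, profile in keyword_profiles.items()
    ]
    # Step 5: highest similarity first; ties broken alphabetically.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [keyword for score, keyword in scored[:top_n] if score > 0]


# Hypothetical keyword profiles: keyword -> {term: weight}.
profiles = {
    "text mining": {"text": 2.0, "mining": 2.0, "document": 1.0},
    "logistics": {"port": 2.0, "shipping": 2.0, "cargo": 1.0},
    "information retrieval": {"document": 2.0, "query": 2.0, "retrieval": 1.0},
}
doc = "query retrieval over document collections uses text mining of each document"
print(assign_keywords(doc, profiles))
```

Because the candidates come from the profile dictionary rather than from the document itself, a keyword can be assigned even when its exact phrase never appears in the text, which is the property the abstract attributes to keyword assignment.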

A Method of Analyzing Sentiment Polarity of Multilingual Social Media: A Case of Korean-Chinese Languages (다국어 소셜미디어에 대한 감성분석 방법 개발: 한국어-중국어를 중심으로)

  • Cui, Meina;Jin, Yoonsun;Kwon, Ohbyung
    • Journal of Intelligence and Information Systems / v.22 no.3 / pp.91-111 / 2016
  • For social media based marketing, it is crucial to perform sentiment analysis on the unstructured data written by potential consumers of a company's products and services. In particular, companies interested in global business must collect and analyze data from social media in multinational settings (e.g., YouTube, Instagram). Because the texts in this case are multilingual, the sentences are usually translated into a target language before sentiment analysis is conducted. However, because cultural differences are not reflected and high-quality data dictionaries are lacking, translated sentences often fail to convey the true meaning, which decreases the quality of sentiment analysis. Hence, this study proposes a method for multilingual sentiment analysis, focusing on Korean-Chinese cases, that avoids language translation. To show the feasibility of the proposed idea, we compare the performance of the proposed method with that of legacy methods that rely on language translators. The results suggest that our method outperforms the legacy methods in terms of RMSE and can be applied by global business institutions.
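
For readers unfamiliar with the metric named in the abstract, the following short Python sketch shows how root mean squared error (RMSE) is computed between predicted and reference sentiment scores. The function name and the sample scores are hypothetical; the abstract does not describe the score scale or the data used.

```python
import math


def rmse(predicted, actual):
    """Root mean squared error between two equal-length score sequences."""
    if len(predicted) != len(actual) or len(predicted) == 0:
        raise ValueError("sequences must be non-empty and of equal length")
    squared_errors = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical sentiment scores; lower RMSE means predictions are closer to the reference.
reference = [1.0, -1.0, 0.5]
method_a = [0.8, -0.9, 0.4]
method_b = [0.2, 0.1, -0.3]
print(rmse(method_a, reference) < rmse(method_b, reference))
```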

A Study on the Analysis of Centrality and Brokerage Measures of Journal Citation Network - Focusing on KCI Journals - (학술지 인용 네트워크의 중심성과 중개성 분석에 관한 연구 - KCI 등재 학술지를 중심으로 -)

  • Lee, Soo-Sang
    • Journal of Korean Library and Information Science Society / v.50 no.4 / pp.77-100 / 2019
  • This study aims to analyze and compare centrality and brokerage measures of a journal citation network, focusing on text mining research. The analytic sample was 193 academic articles collected from 136 KCI journals published in 2018. The journal citation network was constructed from citation relations, and the characteristics, centralities, and brokerages of the network were analyzed. The network consisted of 136 nodes and 413 directed, weighted links. According to five types of centrality (out-degree, in-degree, out-closeness, in-closeness, betweenness), journals in the social sciences, engineering, and interdisciplinary research showed higher centrality. These journals also showed higher brokerage in the brokerage analysis, which identifies five types of brokerage roles (coordinator, gatekeeper, representative, consultant, liaison). The centralities and brokerages of journals are positively correlated. This study shows how to construct a journal citation network from articles on a particular topic. It is meaningful in that it conducts a brokerage analysis and compares it with centrality in the journal citation network.
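
The five brokerage roles named in the abstract are commonly defined following Gould and Fernandez: a node b brokers an ordered pair (a, c) when a links to b, b links to c, and a does not link to c directly, and the role depends on which groups a, b, and c belong to. The Python sketch below counts roles under that definition. It is an illustration only: the abstract does not state the exact procedure or software used, link weights are ignored, and the journals, subject groups, and edges are hypothetical.

```python
from collections import Counter


def classify(group_a, group_b, group_c):
    """Brokerage role of b on the path a -> b -> c, by group membership."""
    if group_a == group_b == group_c:
        return "coordinator"      # all three in the same group
    if group_a == group_c:
        return "consultant"       # endpoints share a group, broker is outside
    if group_b == group_c:
        return "gatekeeper"       # source is outside the broker's group
    if group_a == group_b:
        return "representative"   # target is outside the broker's group
    return "liaison"              # all three in different groups


def brokerage_roles(edges, group):
    """Count brokerage roles per node in a directed, unweighted network."""
    edge_set = set(edges)
    successors, predecessors = {}, {}
    for source, target in edge_set:
        successors.setdefault(source, set()).add(target)
        predecessors.setdefault(target, set()).add(source)
    roles = {}
    for b in group:
        counts = Counter()
        for a in predecessors.get(b, ()):
            for c in successors.get(b, ()):
                # Skip degenerate paths and pairs that are already directly linked.
                if a == c or a == b or c == b or (a, c) in edge_set:
                    continue
                counts[classify(group[a], group[b], group[c])] += 1
        roles[b] = counts
    return roles


# Hypothetical journals and subject groups; an edge (s, t) means "s cites t".
group = {"A": "social", "B": "social", "C": "engineering", "D": "interdisciplinary"}
edges = [("A", "B"), ("B", "C"), ("C", "D")]
roles = brokerage_roles(edges, group)
print(roles["B"], roles["C"])
```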

Content Analysis of Food and Nutrition unit in Middle School Textbooks of Home Economics - Focus on the National Curriculums from 1st to 2009 revised (중학교 가정(기술·가정)교과 식생활 영역의 핵심 교육내용 분석 - 제1차 교육과정부터 2009개정 교육과정의 교과서 내용을 중심으로 -)

  • Jang, Yoon-Mi;Kim, Yoo Kyeong
    • Journal of Korean Home Economics Education Association / v.30 no.4 / pp.93-112 / 2018
  • We analysed middle school Home Economics textbooks from the 1st to the 2009 revised curriculum to investigate the contents and the proportion of the Food and Nutrition section. Key words were generated with a word cloud technique using text mining, and the proportion of the Food and Nutrition section was expressed as a ratio of pages. The core key words of the Food and Nutrition section across the curriculums were 'raw food', 'food', and 'diet'. In the 1st and 2nd curriculums, the main key words related to food materials, condiments, and nutrients, such as 'vitamin' and 'protein'. Words such as 'nutrition', 'eating', and 'requirement' newly appeared in the 3rd curriculum, 'portion' in the 6th, and 'diet' and 'adolescence' in the 7th. The mean ratio of the Food and Nutrition section in Home Economics was 24.3%. While the proportion was as high as 31.8% in the 7th curriculum, it was strikingly reduced to 15.2% in the 2009 revised curriculum. In addition, the Food and Nutrition section was composed of 10 middle-level units during the 2nd and 3rd curriculums but was reduced to 2 small units with no middle-level category in the 2009 revised curriculum. Although the contents of the Food and Nutrition section have been developed and adapted to the needs of society across the curriculums, its proportion in Home Economics has been reduced, especially in the 2009 revised curriculum, which could raise concerns about the health of individuals and communities.

Consumers Perceptions on Sodium Saccharin in Social Media (소셜미디어 분석을 통한 삭카린나트륨 소비자 인식 조사)

  • Lee, Sooyeon;Lee, Wonsung;Moon, Il-Chul;Kwon, Hoonjeong
    • Journal of Food Hygiene and Safety / v.30 no.4 / pp.329-342 / 2015
  • The purpose of this study was to investigate consumers' perceptions of sodium saccharin in social media. Data were collected from Naver blogs and Naver web communities (Naver is a representative Korean web portal) and from media reports, including comment sections, on the Yonhap News website (Yonhap is Korea's largest news agency). The results from Naver blogs and web communities showed that the main topics were products with no added sodium saccharin, the properties of sodium saccharin, and methods of reducing sodium saccharin in food. When the media reported the expansion of food categories permitted to use sodium saccharin, search volume for sodium saccharin increased in both PC and mobile search engines. Comments below the news articles mainly expressed distrust of the government, criticism of food product prices, and distrust of food companies. Labels on products without added sodium saccharin emphasized "no added-sodium saccharin". These results suggest that consumers are interested in sodium saccharin, especially when the media report the expansion of food categories permitted to use it. Through social media, consumers were able to find various information on sodium saccharin, except for its safety or acceptable daily intake. Therefore, the media or the competent authority should report on sodium saccharin with information on safety and acceptable daily intake, grounded in scientific evidence, references, or expert interviews, so that consumers can obtain reliable information.

Pattern Analysis for Civil Complaints of Local Governments Using a Text Mining (텍스트마이닝에 의한 지자체 민원청구 패턴 분석)

  • Won, Tae Hong;Yoo, Hwan Hee
    • Journal of the Korean Society of Surveying, Geodesy, Photogrammetry and Cartography / v.34 no.3 / pp.319-327 / 2016
  • Korea faces a wide range of problems in areas such as safety, environment, and traffic due to rapid economic development and urbanization. Despite local governments' efforts to handle electronic civil complaints and solve urban problems, civil complaints have increased year by year. In this study, we collected civil complaint data over the last six years from a small and medium-sized city, Jinju-si. To analyze spatial distribution patterns, we classified the reasons for the civil complaints, mapped their locations through geocoding, and then extracted the location data of the complaint occurrence spots in order to analyze the correlation between electronic civil complaints and land use. The results demonstrated that electronic civil complaints in Jinju-si were clustered in residential, central commercial, and residential-industrial mixed-use areas, that is, areas within the city center where land development had been completed. Analysis of the civil complaints by land use revealed that complaints about illegal parking were the most frequent. Analysis of facility distribution within a 50 m radius of the complaint locations showed that civil complaints were frequent in detached housing areas located within the commercial and residential-industrial mixed-use areas. In residential areas (old downtown), civil complaints were concentrated in areas with many ordinary restaurants. This research explored civil complaints in terms of urban space and is expected to be useful in finding solutions to civil complaints.
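
The 50 m radius facility analysis mentioned in the abstract can be sketched in Python as a simple distance filter over geocoded points. The abstract does not say which GIS tool or distance method the authors used; the haversine great-circle distance, the function names, and the coordinates below are assumptions chosen for illustration.

```python
import math
from collections import Counter

EARTH_RADIUS_M = 6371000.0  # mean Earth radius in metres


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    h = math.sin(d_phi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(h))


def facilities_within(complaint, facilities, radius_m=50.0):
    """Count facility types located within radius_m of a complaint location."""
    lat, lon = complaint
    return Counter(
        kind
        for kind, f_lat, f_lon in facilities
        if haversine_m(lat, lon, f_lat, f_lon) <= radius_m
    )


# Hypothetical geocoded points (latitude, longitude in degrees).
complaint = (35.1800, 128.1000)
facilities = [
    ("restaurant", 35.1802, 128.1000),      # roughly 22 m north
    ("detached house", 35.1800, 128.1003),  # roughly 27 m east
    ("restaurant", 35.1810, 128.1000),      # roughly 111 m north, outside the radius
]
print(facilities_within(complaint, facilities))
```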

Analysis of Research Trends in SIAM Journal on Applied Mathematics Using Topic Modeling (토픽모델링을 활용한 SIAM Journal on Applied Mathematics의 연구 동향 분석)

  • Kim, Sung-Yeun
    • Journal of the Korea Academia-Industrial cooperation Society / v.21 no.7 / pp.607-615 / 2020
  • The purpose of this study was to analyze the research status and trends related to industrial mathematics using text mining techniques, with a sample of 4,910 papers published in the SIAM Journal on Applied Mathematics from 1970 to 2019. The R program was used to collect titles, abstracts, and key words from the papers and to perform topic modeling based on the LDA algorithm. Based on coherence scores for the collected papers, 20 topics were determined to be optimal using Gibbs sampling. The main results were as follows. First, studies on industrial mathematics were conducted in a variety of mathematics fields, including computational mathematics, geometry, mathematical modeling, topology, discrete mathematics, and probability and statistics, with a focus on analysis and algebra. Second, 5 hot topics (mathematical biology, nonlinear partial differential equation, discrete mathematics, statistics, topology) and 1 cold topic (probability theory) were found based on time series regression analysis. Third, among the fields not reflected in the 2015 revised mathematics curriculum, numeral system, matrix, vector in space, and complex numbers were extracted as content to be covered in the high school mathematics curriculum. Finally, this study suggested strategies to activate industrial mathematics in Korea, described the study limitations, and proposed directions for future research.
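
The hot and cold topic labels in the abstract come from a time series regression of topic prevalence. The Python sketch below shows one simple way such a labeling could work: fit an ordinary least squares line to a topic's share over time and label it by the sign of the slope. The abstract does not give the authors' criteria (a significance test is typical); the tolerance `tol`, the function names, and the topic shares are assumptions for illustration.

```python
def ols_slope(xs, ys):
    """Slope of the ordinary least squares line fitted to (xs, ys)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


def label_trend(years, shares, tol=1e-3):
    """Label a topic 'hot', 'cold', or 'neutral' from the slope of its share over time.

    tol is a hypothetical threshold standing in for a proper significance test.
    """
    slope = ols_slope(years, shares)
    if slope > tol:
        return "hot"
    if slope < -tol:
        return "cold"
    return "neutral"


# Hypothetical topic shares (fraction of papers per decade).
years = [1970, 1980, 1990, 2000, 2010]
rising = [0.02, 0.04, 0.06, 0.08, 0.10]
falling = [0.10, 0.08, 0.06, 0.04, 0.02]
flat = [0.05, 0.05, 0.05, 0.05, 0.05]
print(label_trend(years, rising), label_trend(years, falling), label_trend(years, flat))
```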

Strategic Behavioral Characteristics of Co-opetition in the Display Industry (디스플레이 산업에서의 협력-경쟁(co-opetition) 전략적 행동 특성)

  • Jung, Hyo-jung;Cho, Yong-rae
    • Journal of Korea Technology Innovation Society / v.20 no.3 / pp.576-606 / 2017
  • Cooperation, even among competitors, is especially salient in the high-tech industry as a way to respond promptly to changes in product architecture. In this sense, 'co-opetition,' a portmanteau of 'cooperation' and 'competition,' is a new business term in strategic management that represents the two concepts co-existing simultaneously. From this view, this study set the following research purposes: 1) investigating the managerial and technological behavioral characteristics of firms in the co-opetition of the global display industry; 2) verifying the factors that emerge during co-opetition behavior; and 3) suggesting a strategic direction focused on the behavioral characteristics of co-opetition. To this end, this study used co-word network analysis to understand the structure of co-opetition at the context level. To understand the topics in each network, we clustered the keywords with a community detection algorithm based on modularity and labeled each cluster. The results show increasing patterns of competition rather than cooperation. In particular, litigation for mutual control against Korean firms became more severe and more frequent over time. Viewed from the perspective of technological evolution, these network structures show that there was already active cooperation and competition among firms in the early 2000s around OLED-related technology development. From the middle of the 2000s, firm behavior focused on accelerating existing technologies and developing futuristic displays. In other words, firms have competed to take leadership of innovation at the level of final products, such as TVs and smartphones, that apply display panel products. 
This study provides not only a better understanding of the context of the display industry but also an analytical framework for anticipating the direction of innovation by analyzing managerial and technological factors. The methods can also support CTOs and technology planning practitioners who must consider those factors when making decisions about strategic technology management and product development.
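
The co-word network and modularity ideas in the abstract can be illustrated with a small Python sketch: keywords that appear in the same document are linked, link weights count co-occurrences, and modularity scores how well a given grouping of keywords separates the network. The abstract does not name the community detection algorithm used, so this sketch only evaluates the modularity of a hand-chosen partition; the documents, keywords, and partition are hypothetical.

```python
from collections import Counter
from itertools import combinations


def coword_edges(documents):
    """Weighted undirected co-word edges: (kw1, kw2) -> number of shared documents."""
    edges = Counter()
    for keywords in documents:
        for a, b in combinations(sorted(set(keywords)), 2):
            edges[(a, b)] += 1
    return edges


def modularity(edges, community):
    """Modularity Q = sum over communities of (intra weight / m) - (degree sum / 2m)^2."""
    m = sum(edges.values())
    if m == 0:
        return 0.0
    intra, degree = Counter(), Counter()
    for (a, b), weight in edges.items():
        degree[community[a]] += weight
        degree[community[b]] += weight
        if community[a] == community[b]:
            intra[community[a]] += weight
    return sum(intra[c] / m - (degree[c] / (2 * m)) ** 2 for c in degree)


# Hypothetical keyword lists, one per document.
documents = [
    ["OLED", "patent"],
    ["OLED", "patent"],
    ["TV", "smartphone"],
    ["patent", "TV"],
]
edges = coword_edges(documents)
partition = {"OLED": 0, "patent": 0, "TV": 1, "smartphone": 1}
print(edges[("OLED", "patent")], modularity(edges, partition))
```

A modularity-based community detection algorithm searches for the partition that maximizes this score; the sketch shows only the quantity being optimized.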