• Title/Summary/Keyword: TEXT-MINING


A View from the Bottom: Project-Oriented Risk Mining Approach for Overseas Construction Projects

  • Lee, JeeHee;Son, JeongWook;Yi, June-Seong
    • International conference on construction engineering and project management
    • /
    • 2015.10a
    • /
    • pp.97-100
    • /
    • 2015
  • Analysis of construction tender documents in overseas projects is a very important issue from a risk management point of view. Unfortunately, the majority of construction firms focus on winning contracts without analyzing tender documents in depth. As a result, many contractors have incurred losses in overseas projects. Although many risk analysis techniques have been introduced, most of them focus on a project's external, unexpected risks, such as country conditions and the owner's financial standing. However, because those external risks are difficult to control or to address preemptively, we need to concentrate on risks inherent to the project. Based on this premise, this paper proposes a project-oriented risk mining approach that can automatically detect and extract project risk factors before they materialize, and then assess them. This study presents a methodology for extracting potential risks from the owner's project requirements and project tender documents using state-of-the-art data analysis methods such as text mining, data mining, and information visualization. The project-oriented risk mining approach is expected to reflect project characteristics effectively in project risk management and could provide construction firms with valuable business intelligence.


Quantitative Text Mining for Social Science: Analysis of Immigrant in the Articles (사회과학을 위한 양적 텍스트 마이닝: 이주, 이민 키워드 논문 및 언론기사 분석)

  • Yi, Soo-Jeong;Choi, Doo-Young
    • The Journal of the Korea Contents Association
    • /
    • v.20 no.5
    • /
    • pp.118-127
    • /
    • 2020
  • The paper introduces trends and methodological challenges in quantitative Korean text analysis, using case studies of academic and news media articles on "migration" and "immigration" published between 2017 and 2019. Quantitative text analysis based on natural language processing (NLP) technology has become an essential tool for social science. It is a part of data science that converts documents into structured data and uses that data for hypothesis discovery, verification, and visualization. Furthermore, we examined the social scientific statistical models commonly applied in quantitative text analysis, using NLP with R programming and Quanteda.
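
The conversion of documents into structured data mentioned above can be pictured as building a document-term matrix. The sketch below is a minimal illustration in standard-library Python, not the R and Quanteda workflow the paper uses. The function names, the sample sentences, and the letters-only tokenizer are all assumptions made for the example, and Korean text would need a morphological tokenizer instead.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and keep runs of ASCII letters (a deliberate simplification)."""
    return re.findall(r"[a-z]+", text.lower())


def document_term_matrix(docs):
    """Return (vocabulary, rows) where rows[i][j] counts vocabulary[j] in docs[i]."""
    counts = [Counter(tokenize(d)) for d in docs]
    vocabulary = sorted(set().union(*counts))
    rows = [[c[term] for term in vocabulary] for c in counts]
    return vocabulary, rows


# Hypothetical documents, invented for illustration only.
docs = [
    "Migration policy and migration research",
    "News coverage of immigration policy",
]
vocab, dtm = document_term_matrix(docs)
for term, column in zip(vocab, zip(*dtm)):
    print(term, list(column))
```

Each row of `dtm` is one document and each column one term, which is the structured form that frequency counts and statistical models can then work on.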

Prediction of Physical Examination Demand Using Text Mining (텍스트 마이닝을 이용한 건강검진 수요 예측)

  • Park, Kyungbo;Kim, Mi Ryang
    • Journal of Information Technology Services
    • /
    • v.21 no.5
    • /
    • pp.95-106
    • /
    • 2022
  • Recently, physical examinations have become an important strategy for reducing costs for individuals and society. Counseling before the examination is important for an effective physical examination. However, counseling is often incomplete because the demand for physical examinations is not predicted. In this study, therefore, the demand for physical examinations was predicted using text mining and stepwise regression. The analysis showed that the most recent text data had high explanatory power for the demand for physical examinations, as did larger amounts of data. In addition, a high frequency of the term "health food" was found to reduce the number of health examination customers, and the higher the frequency of the word "food", the lower the number of physical examination customers. However, when the term "wild ginseng" appeared frequently on Twitter, the number of physical examination customers visiting hospitals increased. In other words, customers consume efficiently by comparing the price of a health examination with the price of consumer goods. The proposed research framework can help predict demand in other industries.

A Pilot Study on Applying Text Mining Tools to Analyzing Steel Industry Trends : A Case Study of the Steel Industry for the Company "P" (철강산업 트렌드 분석을 위한 텍스트 마이닝 도입 연구 : P사(社) 사례를 중심으로)

  • Min, Ki Young;Kim, Hoon Tae;Ji, Yong Gu
    • The Journal of Society for e-Business Studies
    • /
    • v.19 no.3
    • /
    • pp.51-64
    • /
    • 2014
  • As uncertainty increases ever faster, the ability to predict the future becomes more and more important for business survival. Besides traditional quantitative analyses, text mining tools are one of the main candidates for predicting the future, but those efforts are still in their infancy. This paper introduces one such effort using the case of company "P" in the steel industry. Even with only a four-month pilot study, we found strong possibilities, though not yet robustly tested, for predicting future industrial trends using text mining tools. For these text mining case studies, we categorized steel industry trend keywords into ten categories to study ten different subjects for each category. Whenever we found a meaningful change in a trend, we investigated in more detail what happened and how. To make the approach more robust, we first need to define the purpose of text mining analyses more clearly. We then need to categorize industry trend keywords more systematically using systems thinking models. With these improvements, we are confident that applying text mining tools to industry trend analysis will contribute to predicting future industry trends as well as to identifying trends that would otherwise remain unseen.

Application Development for Text Mining: KoALA (텍스트 마이닝 통합 애플리케이션 개발: KoALA)

  • Byeong-Jin Jeon;Yoon-Jin Choi;Hee-Woong Kim
    • Information Systems Review
    • /
    • v.21 no.2
    • /
    • pp.117-137
    • /
    • 2019
  • In the Big Data era, data science has grown popular as large volumes of data are produced in various domains, and the ability to use data has become a competitive advantage. There is growing interest in unstructured data, which accounts for more than 80% of the world's data. With the everyday use of social media, most unstructured data takes the form of text and plays an important role in areas such as marketing, finance, and distribution. However, text mining using social media is harder to access and harder to use than data mining using numerical data. Thus, this study aims to develop the Korean Natural Language Application (KoALA) as an integrated application for easy and handy social media text mining that does not rely on programming languages or on high-end hardware or solutions. KoALA is a specialized application for social media text mining. It is an integrated application that can analyze both Korean and English. KoALA handles the entire process from data collection to preprocessing, analysis, and visualization. This paper describes the process of designing, implementing, and applying KoALA using the design science methodology. Lastly, we discuss the practical use of KoALA through a blockchain business case. Through this paper, we hope to popularize social media text mining and see it used for practical and academic purposes in various domains.

Analysis of key words published with the Korea Society of Emergency Medical Services journal using text mining (텍스트마이닝을 이용한 한국응급구조학회지 중심단어 분석)

  • Kwon, Chan-Yang;Yang, Hyun-Mo
    • The Korean Journal of Emergency Medical Services
    • /
    • v.24 no.1
    • /
    • pp.85-92
    • /
    • 2020
  • Purpose: The purpose of this study was to analyze the English abstract key words found within the Korea Society of Emergency Medical Services journal using text mining techniques to determine how well these terms adhere to Medical Subject Headings (MeSH) and to identify key word trends. Methods: We analyzed 212 papers published from 2012 to 2019. Web scraping and frequency analysis of key words were conducted in R using its basic and text mining packages. Additionally, the Word Clouds package was used for visualization. Results: The average number of key words used per study was 3.9. Word cloud visualization revealed that CPR was most prominent in the first half of the period and emergency medical technician was most frequently used during the second half. A total of 542 (64.9%) key words exactly matched MeSH listed words, and a total of 293 (35%) key words did not. Conclusion: Researchers should follow submission rules. Further, journals should update their respective submission rules. MeSH key words that are frequently cited should be suggested for use.
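
The two measurements described in this abstract, key word frequency and the share of key words that exactly match a controlled vocabulary, can be sketched as follows. This is a minimal standard-library Python illustration and not the authors' R code. The function names, the sample key word lists, and the three-entry `mesh_like` vocabulary are hypothetical stand-ins, not real MeSH data.

```python
from collections import Counter


def keyword_frequencies(papers):
    """Count how often each normalized key word appears across all papers."""
    return Counter(kw.strip().lower() for kws in papers for kw in kws)


def match_rate(papers, vocabulary):
    """Return (matched, total, percent) for exact matches against a controlled vocabulary."""
    vocab = {term.lower() for term in vocabulary}
    keywords = [kw.strip().lower() for kws in papers for kw in kws]
    matched = sum(1 for kw in keywords if kw in vocab)
    total = len(keywords)
    percent = round(100 * matched / total, 1) if total else 0.0
    return matched, total, percent


# Hypothetical key word lists and a tiny stand-in vocabulary (not real MeSH data).
papers = [
    ["CPR", "Emergency medical technician", "Defibrillation"],
    ["CPR", "Paramedic education"],
]
mesh_like = ["CPR", "Defibrillation", "Emergency Medical Technicians"]

freq = keyword_frequencies(papers)
matched, total, percent = match_rate(papers, mesh_like)
print(freq.most_common(1), matched, total, percent)
```

Exact matching is strict: the singular "Emergency medical technician" does not match the plural vocabulary entry. Applied to the counts in the abstract, the same percentage formula gives 542 out of 542 + 293 = 835 key words, or 64.9%.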

Methodology for Applying Text Mining Techniques to Analyzing Online Customer Reviews for Market Segmentation (온라인 고객리뷰 분석을 통한 시장세분화에 텍스트마이닝 기술을 적용하기 위한 방법론)

  • Kim, Keun-Hyung;Oh, Sung-Ryoel
    • The Journal of the Korea Contents Association
    • /
    • v.9 no.8
    • /
    • pp.272-284
    • /
    • 2009
  • In this paper, we propose a methodology for analyzing online customer reviews using text mining technologies. We introduce market segmentation into the methodology because online customers can be analyzed more efficiently and effectively when they are grouped into segments of customers likely to share similar opinions and experiences. That is, the methodology uses the categorization and information extraction functions of text mining technologies, matched with the concept of market segmentation. In particular, the methodology also uses cross-tabulation analysis, a traditional statistical analysis function, to derive rigorous results. To confirm the validity of the methodology, we applied it to online customer reviews related to tourism.
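
The cross-tabulation step can be illustrated by counting how often each customer segment mentions each review topic. The sketch below uses standard-library Python. The segment labels, topic labels, and records are invented for the example and do not come from the paper's tourism data.

```python
from collections import Counter


def cross_tabulate(records):
    """Count (row, column) pairs and return row labels, column labels, and the count table."""
    counts = Counter(records)
    rows = sorted({r for r, _ in records})
    cols = sorted({c for _, c in records})
    table = [[counts[(r, c)] for c in cols] for r in rows]
    return rows, cols, table


# Hypothetical (segment, topic) pairs extracted from reviews.
records = [
    ("family", "price"), ("family", "service"), ("family", "price"),
    ("solo", "scenery"), ("solo", "price"), ("solo", "scenery"),
]
rows, cols, table = cross_tabulate(records)
for label, counts_row in zip(rows, table):
    print(label, dict(zip(cols, counts_row)))
```

A table like this is the input to which a test of independence, such as a chi-square test, would normally be applied to check whether segments differ in what they write about.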

Evaluation of Vulnerability on Rural Emergency Relief Service using Text Mining (Text Mining 기법을 활용한 농촌마을 긴급구호서비스 접근 취약성 평가)

  • Woo, Jaehyeong;Park, Jinseon;Yoon, Seongsoo
    • Journal of Korean Society of Rural Planning
    • /
    • v.24 no.1
    • /
    • pp.67-74
    • /
    • 2018
  • Rural areas are large residential spaces with fewer people than urban areas, which leaves them vulnerable with respect to social services such as health care and security. This research analyzed the vulnerability of emergency relief services in rural villages through text mining and calculated weighting values. According to the calculated statistics, police facilities are the most important, while fire-fighting and hospital facilities are also important. In addition, the distance from each emergency relief service facility to the rural village was determined using an Open API. By combining these results, the vulnerability of rural villages with respect to emergency relief service facilities was calculated and classified into 5 levels. Among rural areas, 33 places fell into the 1st class, followed by 1,179 in the 2nd class, 199 in the 3rd class, 17 in the 4th class, and 8 in the 5th class. To better support villages that are vulnerable in terms of emergency relief services, geographical relocation of emergency relief service facilities and a policy approach are necessary.

Competitive intelligence in Korean Ramen Market using Text Mining and Sentiment Analysis

  • Kim, Yoosin;Jeong, Seung Ryul
    • Journal of Internet Computing and Services
    • /
    • v.19 no.1
    • /
    • pp.155-166
    • /
    • 2018
  • These days, online media such as blogospheres, online communities, and social networking sites provide vast amounts of user-generated content (UGC) from which market intelligence and business insight can be discovered. Businesses are interested in consumers and constantly need ways to identify consumers' opinions and their own competitive advantage in a competing market. Analyzing consumers' opinions about oneself and one's rivals can help decision makers gain an in-depth and fine-grained understanding of the human and social behavioral dynamics underlying the competition. To compare rival products and companies, we conducted a competitive analysis using text mining on online UGC for two popular, competing ramen products, a market leader and a market follower, in the Korean instant noodle market. Furthermore, to overcome the lack of a Korean sentiment lexicon, we developed a domain-specific sentiment dictionary of Korean texts. We gathered 19,386 blog posts and forum messages, developed the Korean sentiment dictionary, and defined a taxonomy for categorization. In our study, we employed sentiment analysis to present consumers' opinions and statistical analysis to demonstrate the differences between the competitors. Our results show that the sentiment revealed by text mining clearly differentiates the two rival noodles and convincingly confirms that one is a market leader and the other is a follower. In this regard, we expect this comparison to help business decision makers understand the rich, in-depth competitive intelligence hidden in social media.
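
Dictionary-based sentiment analysis of the kind described here can be sketched as summing word polarities from a domain lexicon and comparing averages across products. The following is a minimal standard-library Python illustration under stated assumptions. The English lexicon, its weights, the product labels "A" and "B", and the sample posts are all invented, and the paper's actual Korean dictionary and scoring rules are not reproduced.

```python
def sentiment_score(text, lexicon):
    """Sum lexicon polarities over whitespace tokens; unknown words count as 0."""
    return sum(lexicon.get(tok, 0) for tok in text.lower().split())


def mean_sentiment_by_product(posts, lexicon):
    """Average the per-post sentiment score for each product label."""
    totals, counts = {}, {}
    for product, text in posts:
        totals[product] = totals.get(product, 0) + sentiment_score(text, lexicon)
        counts[product] = counts.get(product, 0) + 1
    return {p: totals[p] / counts[p] for p in totals}


# Hypothetical domain lexicon and posts, invented for illustration only.
lexicon = {"spicy": 1, "chewy": 1, "tasty": 2, "bland": -2, "salty": -1}
posts = [
    ("A", "tasty and chewy noodles"),
    ("A", "a bit salty but tasty"),
    ("B", "bland soup"),
    ("B", "spicy but bland and salty"),
]
means = mean_sentiment_by_product(posts, lexicon)
print(means)
```

A domain-specific lexicon matters because a word such as "spicy" may be positive for instant noodles while being neutral or negative in other product categories.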

Hangeul Stem Extraction Algorithm for Text Mining Based on Natural Language Processing (자연어 처리 기반 텍스트 마이닝을 위한 한글 어간 추출 알고리즘)

  • Choi, Ki-won;Choi, Seong-hun;Jo, Sang-hyeon;Kim, Hee-cheol
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference
    • /
    • 2017.05a
    • /
    • pp.718-721
    • /
    • 2017
  • Natural language processing, which is the basis of text mining, differs depending on the type of language. In particular, Hangeul, which has relatively high freedom of expression compared to other languages, has various word forms depending on the ending used. The part that does not change across these word forms is called the stem. For effective text mining, it is essential to extract words and unify their various forms. Therefore, this paper proposes a stem extraction algorithm for Hangeul words for effective text mining of Hangeul documents.
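
To make the idea of a stem concrete, the sketch below strips a known ending from a conjugated word so that different surface forms map to the same stem. It is a naive longest-suffix illustration in standard-library Python and is not the algorithm proposed in the paper. The `ENDINGS` list is a small hypothetical sample, and contracted or irregular conjugations are not handled.

```python
# A small, hypothetical sample of verb endings; a real list would be far larger.
ENDINGS = ("었습니다", "습니다", "었다", "으니", "으면", "고", "다")


def extract_stem(word, endings=ENDINGS):
    """Strip the longest matching ending, keeping at least one character as the stem."""
    for ending in sorted(endings, key=len, reverse=True):
        if word.endswith(ending) and len(word) > len(ending):
            return word[: -len(ending)]
    return word  # No known ending found; return the word unchanged.


# Different forms of the verb "to eat" share the same stem.
forms = ["먹다", "먹고", "먹으니", "먹었다", "먹습니다", "먹었습니다"]
stems = [extract_stem(w) for w in forms]
print(stems)
```

Trying longer endings first prevents a short ending such as "다" from being stripped when a longer one such as "었습니다" applies.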
