• Title/Summary/Keyword: Text Mining Method

Search Result 447, Processing Time 0.025 seconds

Using Text Mining Techniques for Intrusion Detection Problem in Computer Network (텍스트 마이닝 기법을 이용한 컴퓨터 네트워크의 침입 탐지)

  • Oh Seung-Joon;Won Min-Kwon
    • Journal of the Korea Society of Computer and Information / v.10 no.5 s.37 / pp.27-32 / 2005
  • Recently there has been much interest in applying data mining to computer network intrusion detection. A new approach, based on the k-Nearest Neighbour (kNN) classifier, is used to classify program behaviour as normal or intrusive. Each system call is treated as a word, and the collection of system calls over each program execution is treated as a document. These documents are then classified using the kNN classifier, a popular method in text mining. A simple example illustrates the proposed procedure.
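
The idea in the abstract above (system calls as words, one program execution as a document, kNN as the classifier) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the traces, the labels, the use of raw term counts with cosine similarity, and the function names `to_vector`, `cosine`, and `knn_classify` are all assumptions made for the example.

```python
from collections import Counter
from math import sqrt


def to_vector(trace):
    """Turn one program execution (a list of system-call names) into term counts."""
    return Counter(trace)


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def knn_classify(trace, training, k=3):
    """Label a trace by majority vote among its k most similar training traces."""
    query = to_vector(trace)
    scored = sorted(
        ((cosine(query, to_vector(t)), label) for t, label in training),
        key=lambda pair: pair[0],
        reverse=True,
    )
    votes = Counter(label for _, label in scored[:k])
    return votes.most_common(1)[0][0]


# Hypothetical toy traces; real ones would come from system audit logs.
TRAINING = [
    (["open", "read", "read", "close"], "normal"),
    (["open", "read", "write", "close"], "normal"),
    (["open", "mmap", "read", "close"], "normal"),
    (["execve", "chmod", "setuid", "execve"], "intrusive"),
    (["setuid", "execve", "chmod", "open"], "intrusive"),
    (["chmod", "chmod", "setuid", "execve"], "intrusive"),
]

print(knn_classify(["open", "read", "close"], TRAINING))
print(knn_classify(["setuid", "execve", "execve"], TRAINING))
```

A weighting scheme such as tf-idf could replace the raw counts without changing the structure of the sketch.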

A Convergent Study on the Narration of Novel through Text-mining (소설 내러티브의 변화: 텍스트마이닝 기반 장르별 내러티브 분석)

  • Park, Jungsik;Park, Mi Sun
    • English & American cultural studies / v.17 no.1 / pp.81-106 / 2017
  • Using recently emerging quantitative methods, this article provides a comparative study of diachronic changes in the narration of novels, history, and science from the early 18th century to the 20th century. To trace narrative changes across genres, the article discusses how text-mining methodology can be introduced into literary studies. As a pilot study, we compared traces of narrative in three genres (novel, history, and science) using three major grammatical elements of narrative: pronouns, subordinating conjunctions, and action verbs in the past tense. The data-mining results show that the use of pronouns and action verbs increased in the novel toward the 20th century, while history and science developed less story-like writing styles.

A Content Analysis for Website Usefulness Evaluation: Utilizing Text Mining Technique

  • Kwon, Do Young;Jeong, Seung Ryul
    • Journal of Internet Computing and Services / v.16 no.4 / pp.71-81 / 2015
  • With the increasing influence of online media, company websites have become important communication channels between companies and customers. Companies use their websites as a marketing tool for a variety of purposes, including enhancing their image and selling products or services. Many researchers have examined the criteria, methods, and tools for website evaluation, but most have focused on usability. Prior content analyses have focused not on text content but on website components, an approach likely to produce subjective evaluations. This study attempts to objectively evaluate company websites by utilizing text mining. We analyze the usefulness of company websites by presenting visualized outputs from a business perspective, allowing practitioners to easily understand the results of the website evaluation and use them in decision making. To demonstrate our method empirically, we selected a company with a number of affiliates in Korea and analyzed the text content of their websites to assess their usefulness using natural language processing and graphics packages in R. Practitioners can easily employ our objective evaluation method, and researchers can use it to gain a new perspective on website evaluation.

A View from the Bottom: Project-Oriented Risk Mining Approach for Overseas Construction Projects

  • Lee, JeeHee;Son, JeongWook;Yi, June-Seong
    • International conference on construction engineering and project management / 2015.10a / pp.97-100 / 2015
  • Analysis of construction tender documents in overseas projects is a very important issue from a risk management point of view. Unfortunately, the majority of construction firms are preoccupied with winning contracts and skip in-depth analysis of tender documents. As a result, many contractors have incurred losses in overseas projects. Although many risk analysis techniques have been introduced, most of them focus on a project's external, unexpected risks such as country conditions and the owner's financial standing. However, because those external risks are difficult to control or to address preemptively, we need to concentrate on risks inherent to the project. Based on this premise, this paper proposes a project-oriented risk mining approach that can automatically detect and extract project risk factors before they materialize, and then assess them. This study presents a methodology for extracting potential risks in the owner's project requirements and project tender documents using state-of-the-art data analysis methods such as text mining, data mining, and information visualization. The project-oriented risk mining approach is expected to reflect project characteristics effectively in project risk management and could provide construction firms with valuable business intelligence.

An Investigation on the Periodical Transition of News related to North Korea using Text Mining (텍스트마이닝을 활용한 북한 관련 뉴스의 기간별 변화과정 고찰)

  • Park, Chul-Soo
    • Journal of Intelligence and Information Systems / v.25 no.3 / pp.63-88 / 2019
  • The goal of this paper is to investigate changes in North Korea's domestic and foreign policies through automated text analysis of how North Korea is represented in South Korean mass media. Based on that data, we then analyze the status of text mining research, using a text mining technique to find the topics, methods, and trends of text mining research. We also investigate the characteristics and analysis methods of the text mining techniques, as confirmed by analysis of the data. In this study, R, free software for statistical computing and graphics, was used to apply the text mining technique. Text mining methods make it possible to highlight the most frequently used keywords in a body of text, for example as a word cloud, also referred to as a text cloud or tag cloud. This study proposes a procedure for finding meaningful tendencies based on a combination of word clouds and co-occurrence networks. This study aims to explore more objectively the images of North Korea represented in South Korean newspapers by quantitatively reviewing the patterns of language use related to North Korea in newspaper big data from November 1, 2016 to May 23, 2019. We divided this span into three periods in light of recent inter-Korean relations. The period before January 1, 2018 was designated the Before Phase of Peace Building. The period from January 1, 2018 to February 24, 2019 was designated the Peace Building Phase. Kim Jong-un's New Year's message and the PyeongChang Olympics created an atmosphere of peace on the Korean peninsula. The third period, after the Hanoi peace summit, was marked by silence in the relationship between North Korea and the United States and was therefore called the Depression Phase of Peace Building. This study analyzes news articles related to North Korea from the Korea Press Foundation database (www.bigkinds.or.kr) through text mining to investigate the characteristics of the Kim Jong-un regime's South Korea policy and unification discourse.
    The main results of this study show that trends in the North Korean national policy agenda can be discovered using clustering and visualization algorithms. In particular, the study examines changes in international circumstances, domestic conflicts, living conditions in North Korea, the South's aid projects for the North, conflicts between the two Koreas, the North Korean nuclear issue, and the North Korean refugee problem through co-occurrence word analysis. It also offers an analysis of the South Korean mentality toward North Korea in terms of semantic prosody. In the Before Phase of Peace Building, the analysis yielded, in order, 'Missiles', 'North Korea Nuclear', 'Diplomacy', 'Unification', and 'South-North Korean'. In the Peace Building Phase, it yielded, in order, 'Panmunjom', 'Unification', 'North Korea Nuclear', 'Diplomacy', and 'Military'. In the Depression Phase of Peace Building, it yielded, in order, 'North Korea Nuclear', 'North and South Korea', 'Missile', 'State Department', and 'International'. There are 16 words adopted in all three periods. The order is as follows: 'missile', 'North Korea Nuclear', 'Diplomacy', 'Unification', 'North and South Korea', 'Military', 'Kaesong Industrial Complex', 'Defense', 'Sanctions', 'Denuclearization', 'Peace', 'Exchange and Cooperation', and 'South Korea'. We expect that the results of this study will contribute to analyzing trends in news content about North Korea associated with North Korea's provocations. Future research on North Korean trends will be conducted based on the results of this study. We will continue to study model development for North Korea risk measurement that can anticipate and respond to North Korea's behavior in advance. We expect that text mining analysis methods and scientific data analysis techniques will be applied to the field of North Korea and unification research.
    Through these academic studies, I hope to see many studies that make important contributions to the nation.
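
The keyword counts behind a word cloud and the edge weights of a co-occurrence network, both mentioned in the abstract above, can be sketched in a few lines. This is an illustrative sketch only: the study itself used R, and the articles below are hypothetical, already-tokenized keyword lists rather than data from the BIGKinds database. The function names `keyword_counts` and `cooccurrence` are likewise assumptions.

```python
from collections import Counter
from itertools import combinations


def keyword_counts(docs):
    """Count how often each keyword occurs across all documents (word-cloud sizes)."""
    return Counter(word for doc in docs for word in doc)


def cooccurrence(docs):
    """Count, for each keyword pair, the number of documents containing both."""
    pairs = Counter()
    for doc in docs:
        # sorted(set(...)) gives each unordered pair one canonical key per document.
        for a, b in combinations(sorted(set(doc)), 2):
            pairs[(a, b)] += 1
    return pairs


# Hypothetical keyword lists, one per news article.
DOCS = [
    ["missile", "nuclear", "sanctions"],
    ["nuclear", "diplomacy", "summit"],
    ["missile", "nuclear", "defense"],
    ["peace", "summit", "diplomacy"],
]

print(keyword_counts(DOCS).most_common(3))
print(cooccurrence(DOCS).most_common(2))
```

The pair counts can serve as edge weights in a network graph, and the per-period rankings reported in the abstract would correspond to running the same counts on each period's articles separately.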

In-depth Analysis of Soccer Game via Webcast and Text Mining (웹 캐스트와 텍스트 마이닝을 이용한 축구 경기의 심층 분석)

  • Jung, Ho-Seok;Lee, Jong-Uk;Yu, Jae-Hak;Lee, Han-Sung;Park, Dai-Hee
    • The Journal of the Korea Contents Association / v.11 no.10 / pp.59-68 / 2011
  • As the role of the soccer analyst, who analyzes games and creates winning strategies, gains emphasis, the IT-based soccer broadcasting research community needs high-level analysis that goes beyond procedural tasks such as main event detection. In this paper, we propose a novel approach that generates high-level, in-depth analysis results from real-time text-based soccer Webcasts and text mining. The proposed method creates metadata such as attributes, actions, and events, builds an index, and then generates usable knowledge from the Webcast and domain knowledge via text mining techniques such as association rule mining, event growth index, and pathfinder network analysis. We carried out a feasibility experiment on the proposed technique using Webcast text from the Spanish team's 2010 World Cup games.

Analysis of Prevention Methods by Type of Construction Disaster Using Text Mining Techniques (텍스트마이닝을 활용한 건설현장 재해 유형별 예방 대책 분석)

  • Gyu Pil Jo;Myungdo Lee;Yoon-seok Shin;Baek-Joong Kim
    • Journal of the Society of Disaster Information / v.20 no.1 / pp.13-19 / 2024
  • Purpose: This study provides prevention methods by type of construction disaster using text mining techniques. Method: Based on a database of analyzed cases of critical disasters in the domestic construction sector, preventive measures and causes are analyzed with text mining techniques, and the results are presented visually. Result: The visual data present the measures for preventing critical disasters in each process according to their importance. Conclusion: The results are expected to help identify factors to consider when preparing preventive measures for serious accidents in construction.

A Multilevel Project-Oriented Risk-Mining Framework for Overseas Construction Projects

  • Son, JeongWook;Lee, JeeHee;Yi, June-Seong
    • International conference on construction engineering and project management / 2015.10a / pp.39-40 / 2015
  • As the international construction market grows, the importance of risk management in international construction projects is emphasized. Unfortunately, current risk management practice does not deal sufficiently with project risks. Although many risk analysis techniques have been introduced, most of them focus on a project's external, unexpected risks such as country conditions and the owner's financial standing. However, because those external risks are difficult to manage or to address preemptively, we need to concentrate on risks inherent to the project. Based on this premise, this paper proposes a project-oriented risk mining approach that can automatically detect and extract project risk factors before they materialize. This study presents a methodology for extracting potential risks in the owner's project requirements and project tender documents using state-of-the-art data analysis methods such as text mining. The project-oriented risk mining approach is expected to reflect project characteristics effectively in project risk management and could provide construction firms with valuable business intelligence.

Finding Meaningful Pattern of Key Words in IIE Transactions Using Text Mining (텍스트마이닝을 활용한 산업공학 학술지의 논문 주제어간 연관관계 연구)

  • Cho, Su-Gon;Kim, Seoung-Bum
    • Journal of Korean Institute of Industrial Engineers / v.38 no.1 / pp.67-73 / 2012
  • Identification of meaningful patterns and trends in large volumes of text data is an important task in various research areas. In the present study, we crawled the keywords from the abstracts in IIE Transactions, one of the representative journals in the field of industrial engineering, from 1969 to 2011. We applied a low-dimensional embedding method, clustering analysis, association rules, and social network analysis to find meaningful associative patterns among keywords that appear frequently in the papers.
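
Of the techniques listed in the abstract above, association rules are the easiest to show compactly. The sketch below computes support and confidence for keyword pairs, treating each paper's keyword set as one transaction. It is an assumption-laden illustration: the keyword sets are invented rather than taken from IIE Transactions, the thresholds are arbitrary, only pairwise rules are considered, and `support` and `pair_rules` are hypothetical names.

```python
from itertools import combinations


def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def pair_rules(transactions, min_support=0.4, min_confidence=0.6):
    """Return (lhs, rhs, support, confidence) for pairwise rules lhs -> rhs."""
    items = sorted(set().union(*transactions))
    rules = []
    for a, b in combinations(items, 2):
        s = support({a, b}, transactions)
        if s < min_support:
            continue
        for lhs, rhs in ((a, b), (b, a)):
            # confidence(lhs -> rhs) = support(lhs and rhs) / support(lhs)
            conf = s / support({lhs}, transactions)
            if conf >= min_confidence:
                rules.append((lhs, rhs, s, conf))
    return rules


# Hypothetical keyword sets, one per paper.
PAPERS = [
    {"scheduling", "heuristics"},
    {"scheduling", "heuristics", "simulation"},
    {"inventory", "simulation"},
    {"scheduling", "heuristics"},
    {"quality control", "simulation"},
]

for lhs, rhs, s, conf in pair_rules(PAPERS):
    print(f"{lhs} -> {rhs}: support={s:.2f}, confidence={conf:.2f}")
```

For a large keyword vocabulary, an Apriori-style search that prunes infrequent itemsets would be preferable to enumerating every pair.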

Text Mining and Sentiment Analysis for Predicting Box Office Success

  • Kim, Yoosin;Kang, Mingon;Jeong, Seung Ryul
    • KSII Transactions on Internet and Information Systems (TIIS) / v.12 no.8 / pp.4090-4102 / 2018
  • Since the emergence of online communication, text mining and sentiment analysis have frequently been applied to analyzing electronic word-of-mouth. This study aims to develop a domain-specific sentiment analysis lexicon for predicting box office success in the Korean film market and to validate the feasibility of the lexicon. Natural language processing, a machine learning algorithm, and a lexicon-based sentiment classification method are employed. To create a movie-domain sentiment lexicon, 233,631 reviews of 147 movies with popularity ratings were collected using an XML crawling package in R. We achieved 81.69% accuracy in sentiment classification with the Korean sentiment dictionary, which includes 706 negative words and 617 positive words. The results showed a stronger positive relationship between box office success and consumers' sentiment, as well as a significant positive effect in the linear regression used for the prediction model. In addition, the results suggest that emotion in user-generated content can be a more accurate clue for predicting business success.
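
A lexicon-based sentiment classifier of the kind described in the abstract above can be sketched as follows. The sketch is only illustrative: the study built a Korean movie-domain dictionary (706 negative and 617 positive words) in R, whereas the tiny English word lists, the simple count-difference score, and the names `score`, `classify`, and `accuracy` here are assumptions made for the example.

```python
import re

# Hypothetical stand-ins for a domain sentiment lexicon.
POSITIVE = {"great", "moving", "fun", "brilliant"}
NEGATIVE = {"boring", "dull", "weak", "disappointing"}


def score(review):
    """Number of positive lexicon hits minus number of negative hits."""
    tokens = re.findall(r"[a-z']+", review.lower())
    return sum(t in POSITIVE for t in tokens) - sum(t in NEGATIVE for t in tokens)


def classify(review):
    """Map the lexicon score to a sentiment label."""
    s = score(review)
    if s > 0:
        return "positive"
    if s < 0:
        return "negative"
    return "neutral"


def accuracy(labelled):
    """Share of (review, label) pairs whose predicted label matches."""
    correct = sum(classify(text) == label for text, label in labelled)
    return correct / len(labelled)


# Hypothetical labelled reviews used to check the lexicon.
LABELLED = [
    ("Great fun from start to finish", "positive"),
    ("Dull and boring", "negative"),
    ("Fun, but disappointing and weak overall", "positive"),
]

print(classify("A great and moving film"))
print(accuracy(LABELLED))
```

The third review shows a typical limitation of plain hit counting: mixed reviews are decided by whichever polarity has more lexicon entries, which is one reason a domain-specific lexicon matters.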