• Title/Summary/Keyword: 출현 패턴 마이닝

Search Result 18, Processing Time 0.025 seconds

Analysis of Research Trends in Elder Abuse Using Text Mining : Academic Papers from 2004 to 2021. (텍스트 마이닝 분석을 통한 노인학대 관련 연구 동향 분석 : 2004년~2021년까지 발행된 국내 학술논문을 중심으로)

  • Youn, Ki-Hyok
    • Journal of Internet of Things and Convergence
    • /
    • v.8 no.4
    • /
    • pp.25-40
    • /
    • 2022
  • This study aimed to understand the increasing number of elder abuses in South Korea, where entry into the super-aged society is imminent, by implementing text mining analysis. Korean Academic journals were obtained from 2004, the establishment year of the senior care agency, to 2021. We performed natural language processing of the titles, keywords, and abstracts and divided them into three segments of periods to identify latent meanings in the data. The results illustrated that the first section included 81 papers, the second 64, and the third 104 respectively, averaging 13.8 annually, which increased its numbers from 2014 until the decrease below the annual average in 2020. Word frequency demonstrated that the common keywords of the entire segments were 'elder abuse,' 'elders,' 'influences,' 'factors,' 'recognition,' 'family,' 'society,' 'prevention plans,' 'experiences,' 'abused elders,' 'abuse prevention,' 'depression,' etc., in consecutive order. TF-IDF indicated that 'influences,' 'recognition,' 'society,' 'prevention plans,' 'abuse prevention,' 'experiences,' 'depression,' etc., were the common keywords of all divisions. Network text analysis displayed that the commonly represented keywords were 'elder abuse,' 'elders,' 'influences,' 'factors,' 'characteristics,' 'recognition,' 'family,' 'prevention plans,' 'society,' 'abuse prevention,' and 'experiences' in the entire sections. concor analysis presented that the first segment consisted of 5 groups, the second 7, and the third 6. We suggest future directions for elder abuse research based on the results.

Knowledge Trend Analysis of Uncertainty in Biomedical Scientific Literature (생의학 학술 문헌의 불확실성 기반 지식 동향 분석에 관한 연구)

  • Heo, Go Eun;Song, Min
    • Journal of the Korean Society for information Management
    • /
    • v.36 no.2
    • /
    • pp.175-199
    • /
    • 2019
  • Uncertainty means incomplete stages of knowledge of propositions due to the lack of consensus of information and existing knowledge. As the amount of academic literature increases exponentially over time, new knowledge is discovered as research develops. Although the flow of time may be an important factor to identify patterns of uncertainty in scientific knowledge, existing studies have only identified the nature of uncertainty based on the frequency in a particular discipline, and they did not take into consideration of the flow of time. Therefore, in this study, we identify and analyze the uncertainty words that indicate uncertainty in the scientific literature and investigate the stream of knowledge. We examine the pattern of biomedical knowledge such as representative entity pairs, predicate types, and entities over time. We also perform the significance testing using linear regression analysis. Seven pairs out of 17 entity pairs show the significant decrease pattern statistically and all 10 representative predicates decrease significantly over time. We analyze the relative importance of representative entities by year and identify entities that display a significant rising and falling pattern.

The Stream of Uncertainty in Scientific Knowledge using Topic Modeling (토픽 모델링 기반 과학적 지식의 불확실성의 흐름에 관한 연구)

  • Heo, Go Eun
    • Journal of the Korean Society for information Management
    • /
    • v.36 no.1
    • /
    • pp.191-213
    • /
    • 2019
  • The process of obtaining scientific knowledge is conducted through research. Researchers deal with the uncertainty of science and establish certainty of scientific knowledge. In other words, in order to obtain scientific knowledge, uncertainty is an essential step that must be performed. The existing studies were predominantly performed through a hedging study of linguistic approaches and constructed corpus with uncertainty word manually in computational linguistics. They have only been able to identify characteristics of uncertainty in a particular research field based on the simple frequency. Therefore, in this study, we examine pattern of scientific knowledge based on uncertainty word according to the passage of time in biomedical literature where biomedical claims in sentences play an important role. For this purpose, biomedical propositions are analyzed based on semantic predications provided by UMLS and DMR topic modeling which is useful method to identify patterns in disciplines is applied to understand the trend of entity based topic with uncertainty. As time goes by, the development of research has been confirmed that uncertainty in scientific knowledge is moving toward a decreasing pattern.

An Efficient Web Search Method Based on a Style-based Keyword Extraction and a Keyword Mining Profile (스타일 기반 키워드 추출 및 키워드 마이닝 프로파일 기반 웹 검색 방법)

  • Joo, Kil-Hong;Lee, Jun-Hwl;Lee, Won-Suk
    • The KIPS Transactions:PartD
    • /
    • v.11D no.5
    • /
    • pp.1049-1062
    • /
    • 2004
  • With the popularization of a World Wide Web (WWW), the quantity of web information has been increased. Therefore, an efficient searching system is needed to offer the exact result of diverse Information to user. Due to this reason, it is important to extract and analysis of user requirements in the distributed information environment. The conventional searching method used the only keyword for the web searching. However, the searching method proposed in this paper adds the context information of keyword for the effective searching. In addition, this searching method extracts keywords by the new keyword extraction method proposed in this paper and it executes the web searching based on a keyword mining profile generated by the extracted keywords. Unlike the conventional searching method which searched for information by a representative word, this searching method proposed in this paper is much more efficient and exact. This is because this searching method proposed in this paper is searched by the example based query included content information as well as a representative word. Moreover, this searching method makes a domain keyword list in order to perform search quietly. The domain keyword is a representative word of a special domain. The performance of the proposed algorithm is analyzed by a series of experiments to identify its various characteristic.

Emerging Patterns Mining for Classifying Non-Safe Electrical Sections in Power Distribution System (전력배전 시스템에서의 취약 선로 분류를 위한 출현 패턴 마이닝)

  • Khalid E.K. Saeed;Minghao Piao;Heon Gyu Lee;Jin-Ho Shin;Keun Ho Ryu
    • Proceedings of the Korea Information Processing Society Conference
    • /
    • 2008.11a
    • /
    • pp.325-327
    • /
    • 2008
  • In electrical industry, classification methodology has been an important issue for analyzing power consumption patterns. It has many applications including decisions on energy purchasing, load switching as well as helping in infrastructure development. Our aim in this work is to classify the electrical section and find potentially non-safe electrical sections. For this purpose, we use Emerging Patterns based classification. The classification method uses the aggregate score of emerging patterns to build classifier. The proposed methodology was applied to a set of electrical section data of the Korea power. The test data and relational electricity information and knowledge are supported by Korea Electric Power Research Institute (KEPRI).

Sensor Selection Strategies for Activity Recognition in a Smart Environment (스마트 환경에서 행위 인식을 위한 센서 선정 기법)

  • Gu, Sungdo;Sohn, Kyung-Ah
    • Journal of KIISE
    • /
    • v.42 no.8
    • /
    • pp.1031-1038
    • /
    • 2015
  • The recent emergence of smart phones, wearable devices, and even the IoT concept made it possible for various objects to interact one another anytime and anywhere. Among many of such smart services, a smart home service typically requires a large number of sensors to recognize the residents' activities. For this reason, the ideas on activity recognition using the data obtained from those sensors are actively discussed and studied these days. Furthermore, plenty of sensors are installed in order to recognize activities and analyze their patterns via data mining techniques. However, if many of these sensors should be installed for IoT smart home service, it raises the issue of cost and energy consumption. In this paper, we proposed a new method for reducing the number of sensors for activity recognition in a smart environment, which utilizes the principal component analysis and clustering techniques, and also show the effect of improvement in terms of the activity recognition by the proposed method.

Automatic Extraction of Opinion Words from Korean Product Reviews Using the k-Structure (k-Structure를 이용한 한국어 상품평 단어 자동 추출 방법)

  • Kang, Han-Hoon;Yoo, Seong-Joon;Han, Dong-Il
    • Journal of KIISE:Software and Applications
    • /
    • v.37 no.6
    • /
    • pp.470-479
    • /
    • 2010
  • In relation to the extraction of opinion words, it may be difficult to directly apply most of the methods suggested in existing English studies to the Korean language. Additionally, the manual method suggested by studies in Korea poses a problem with the extraction of opinion words in that it takes a long time. In addition, English thesaurus-based extraction of Korean opinion words leaves a challenge to reconsider the deterioration of precision attributed to the one to one mismatching between Korean and English words. Studies based on Korean phrase analyzers may potentially fail due to the fact that they select opinion words with a low level of frequency. Therefore, this study will suggest the k-Structure (k=5 or 8) method, which may possibly improve the precision while mutually complementing existing studies in Korea, in automatically extracting opinion words from a simple sentence in a given Korean product review. A simple sentence is defined to be composed of at least 3 words, i.e., a sentence including an opinion word in ${\pm}2$ distance from the attribute name (e.g., the 'battery' of a camera) of a evaluated product (e.g., a 'camera'). In the performance experiment, the precision of those opinion words for 8 previously given attribute names were automatically extracted and estimated for 1,868 product reviews collected from major domestic shopping malls, by using k-Structure. The results showed that k=5 led to a recall of 79.0% and a precision of 87.0%; while k=8 led to a recall of 92.35% and a precision of 89.3%. Also, a test was conducted using PMI-IR (Pointwise Mutual Information - Information Retrieval) out of those methods suggested in English studies, which resulted in a recall of 55% and a precision of 57%.

Investigating Dynamic Mutation Process of Issues Using Unstructured Text Analysis (비정형 텍스트 분석을 활용한 이슈의 동적 변이과정 고찰)

  • Lim, Myungsu;Kim, Namgyu
    • Journal of Intelligence and Information Systems
    • /
    • v.22 no.1
    • /
    • pp.1-18
    • /
    • 2016
  • Owing to the extensive use of Web media and the development of the IT industry, a large amount of data has been generated, shared, and stored. Nowadays, various types of unstructured data such as image, sound, video, and text are distributed through Web media. Therefore, many attempts have been made in recent years to discover new value through an analysis of these unstructured data. Among these types of unstructured data, text is recognized as the most representative method for users to express and share their opinions on the Web. In this sense, demand for obtaining new insights through text analysis is steadily increasing. Accordingly, text mining is increasingly being used for different purposes in various fields. In particular, issue tracking is being widely studied not only in the academic world but also in industries because it can be used to extract various issues from text such as news, (SocialNetworkServices) to analyze the trends of these issues. Conventionally, issue tracking is used to identify major issues sustained over a long period of time through topic modeling and to analyze the detailed distribution of documents involved in each issue. However, because conventional issue tracking assumes that the content composing each issue does not change throughout the entire tracking period, it cannot represent the dynamic mutation process of detailed issues that can be created, merged, divided, and deleted between these periods. Moreover, because only keywords that appear consistently throughout the entire period can be derived as issue keywords, concrete issue keywords such as "nuclear test" and "separated families" may be concealed by more general issue keywords such as "North Korea" in an analysis over a long period of time. This implies that many meaningful but short-lived issues cannot be discovered by conventional issue tracking. Note that detailed keywords are preferable to general keywords because the former can be clues for providing actionable strategies. To overcome these limitations, we performed an independent analysis on the documents of each detailed period. We generated an issue flow diagram based on the similarity of each issue between two consecutive periods. The issue transition pattern among categories was analyzed by using the category information of each document. In this study, we then applied the proposed methodology to a real case of 53,739 news articles. We derived an issue flow diagram from the articles. We then proposed the following useful application scenarios for the issue flow diagram presented in the experiment section. First, we can identify an issue that actively appears during a certain period and promptly disappears in the next period. Second, the preceding and following issues of a particular issue can be easily discovered from the issue flow diagram. This implies that our methodology can be used to discover the association between inter-period issues. Finally, an interesting pattern of one-way and two-way transitions was discovered by analyzing the transition patterns of issues through category analysis. Thus, we discovered that a pair of mutually similar categories induces two-way transitions. In contrast, one-way transitions can be recognized as an indicator that issues in a certain category tend to be influenced by other issues in another category. For practical application of the proposed methodology, high-quality word and stop word dictionaries need to be constructed. In addition, not only the number of documents but also additional meta-information such as the read counts, written time, and comments of documents should be analyzed. A rigorous performance evaluation or validation of the proposed methodology should be performed in future works.