• Title/Summary/Keyword: keyword extraction method (키워드 추출 방법)

The Main Path Analysis of Korean Studies Using Text Mining: Based on SCOPUS Literature Containing 'Korea' as a Keyword (텍스트 마이닝을 활용한 한국학 주경로(Main Path) 분석: '한국'을 키워드로 포함하는 SCOPUS 문헌을 대상으로)

  • Kim, Hea-Jin
    • Journal of the Korean Society for Information Management / v.37 no.3 / pp.253-274 / 2020
  • In this study, text mining and main path analysis (MPA) were applied to understand the origins and development paths of research areas that make up the mainstream of Korean studies. To this end, a quantitative analysis was attempted based on digital texts rather than the traditional humanities research methodology, and the main paths of Korean studies were extracted by collecting documents related to Korean studies including citation information using a citation database, and establishing a direct citation network. As a result of the main path analysis, two main path clusters (Korean ancient agricultural culture (history, culture, archeology) and Korean acquisition of English (linguistics)) were found in the key-route search for the Humanities field of Korean studies. In the field of Korean Studies Humanities and Social Sciences, four main path clusters were discovered: (1) Korea regional/spatial development, (2) Korean economic development (Economic aid/Soft power), (3) Korean industry (Political economics), and (4) population of Korea (Sex selection) & North Korean economy (Poverty, South-South cooperation).
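The main path extraction described above relies on traversal weights such as SPC (Search Path Count) over a direct citation network. As a rough illustration only (not the paper's SCOPUS data or its key-route search), a minimal sketch using networkx and a greedy forward search might look like this:

```python
# A toy sketch of SPC-weighted main path extraction with networkx; the
# citation network below is illustrative, not the paper's SCOPUS data.
import networkx as nx

# Direct citation network as a DAG: an edge u -> v means u is cited by v.
G = nx.DiGraph([("d1", "d3"), ("d2", "d3"), ("d3", "d4"),
                ("d3", "d5"), ("d4", "d6"), ("d5", "d6")])
order = list(nx.topological_sort(G))

# Number of source-to-node paths (sources have no predecessors).
from_source = {n: 1 if G.in_degree(n) == 0 else 0 for n in G}
for n in order:
    for p in G.predecessors(n):
        from_source[n] += from_source[p]

# Number of node-to-sink paths (sinks have no successors).
to_sink = {n: 1 if G.out_degree(n) == 0 else 0 for n in G}
for n in reversed(order):
    for s in G.successors(n):
        to_sink[n] += to_sink[s]

# SPC of an edge = number of source-to-sink paths passing through it.
spc = {(u, v): from_source[u] * to_sink[v] for u, v in G.edges}

# Greedy forward (local) search: start from the best source edge and keep
# following the highest-SPC outgoing edge until a sink is reached.
start = max(((u, v) for u, v in G.edges if G.in_degree(u) == 0), key=spc.get)
path = [start[0], start[1]]
while G.out_degree(path[-1]) > 0:
    path.append(max(G.successors(path[-1]), key=lambda v: spc[(path[-1], v)]))
print("SPC weights:", spc)
print("main path:", path)
```

A key-route search, as used in the paper, would instead start from the highest-weight edges and extend the path in both directions rather than only forward.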

A study on the efficient extraction method of SNS data related to crime risk factor (범죄발생 위험요소와 연관된 SNS 데이터의 효율적 추출 방법에 관한 연구)

  • Lee, Jong-Hoon;Song, Ki-Sung;Kang, Jin-A;Hwang, Jung-Rae
    • Journal of the Korea Society of Computer and Information / v.20 no.1 / pp.255-263 / 2015
  • In this paper, we suggest a plan to use SNS data to proactively identify information on crime risk factors and to prevent crime. Recently, SNS (Social Network Service) data have been used to build proactive prevention systems in a variety of fields. However, when users collect SNS data with a simple keyword search, the results contain a large amount of unrelated data, which can reduce accuracy and cause confusion in the data analysis. We therefore present a method for extracting the relevant data efficiently by improving search accuracy through text mining analysis of the SNS data.
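One way to realize the filtering idea above, sketched here under the assumption of scikit-learn and illustrative seed phrases rather than the authors' actual pipeline, is to rank keyword-matched posts by TF-IDF similarity to crime-risk seed text and discard low-scoring posts:

```python
# A minimal sketch (not the paper's method) of filtering keyword-matched SNS
# posts by TF-IDF similarity to crime-risk seed phrases, using scikit-learn.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

seed_phrases = ["dark alley with broken streetlight",
                "drunk people fighting near the station at night"]
posts = ["the streetlight near my alley has been broken for weeks",
         "great pasta at the new restaurant downtown",
         "a fight broke out in front of the station last night"]

vectorizer = TfidfVectorizer()
matrix = vectorizer.fit_transform(seed_phrases + posts)
seed_vecs, post_vecs = matrix[:len(seed_phrases)], matrix[len(seed_phrases):]

# Keep only posts whose best similarity to any seed phrase clears a threshold.
scores = cosine_similarity(post_vecs, seed_vecs).max(axis=1)
relevant = [p for p, s in zip(posts, scores) if s > 0.2]
print(relevant)
```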

Analysis of research trends on mobile health intervention for Korean patients with chronic disease using text mining (텍스트마이닝을 이용한 국내 만성질환자 대상 모바일 헬스 중재연구 동향 분석)

  • Son, Youn-Jung;Lee, Soo-Kyoung
    • Journal of Digital Convergence / v.17 no.4 / pp.211-217 / 2019
  • Given the widespread use of mobile health interventions among Korean patients with chronic disease, research trends in mobile health interventions for chronic care need to be identified using text mining techniques. This secondary data analysis investigated the characteristics and main research topics of intervention studies published from 2005 to 2018, covering a total of 20 peer-reviewed articles. Microsoft Excel and Text Analyzer were used for data analysis. Mobile health interventions were mainly applied to hypertension, diabetes, stroke, and coronary artery disease. The most common type of intervention was the development of a mobile application. Recently, 'feasibility', 'mobile health', and 'outcome measure' have appeared frequently. Larger future studies are needed to identify the relationships among key terms and the effectiveness of mobile health interventions using social network analysis.
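The study's keyword analysis was done with Microsoft Excel and Text Analyzer; as a loose illustration only, a term-frequency count over abstracts grouped by year (the data and stopword list below are hypothetical) could be sketched as follows:

```python
# A minimal sketch of term-frequency trend counting over study abstracts,
# assuming plain-text abstracts grouped by year (illustrative data only).
import re
from collections import Counter

abstracts_by_year = {
    2017: ["mobile application for diabetes self management"],
    2018: ["feasibility of a mobile health outcome measure for hypertension"],
}
stopwords = {"a", "of", "for", "the"}

for year, texts in sorted(abstracts_by_year.items()):
    tokens = [t for text in texts for t in re.findall(r"[a-z]+", text.lower())
              if t not in stopwords]
    print(year, Counter(tokens).most_common(3))
```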

Implementation of Anti-Porn Spam System based on Hyperlink Analysis Technique's of the Web Robot Agent (웹 로봇 에이전트의 하이퍼링크 분석기법을 이용한 음란메일 차단 시스템의 구현)

  • Lee, Seung-Man;Jung, Hui-Sok;Han, Sang;Song, Woo-Seok;Lee, Do-Han;Hong, Ji-Young;Ban, Eui-Hwan;Yang, Joon-Young
    • Proceedings of the Korean Information Science Society Conference / 2007.06c / pp.332-335 / 2007
  • Email is one of the most important means of communication on the Internet because it lets anyone exchange information easily. However, the flood of spam mail, which is not a means of genuine communication, causes serious side effects because it is sent indiscriminately not only to adults but also to children and adolescents. This paper proposes a new pornographic spam blocking system to protect adolescents from increasingly sophisticated new types of pornographic spam. Most existing spam blocking systems either require the user to manually register keywords for mail judged to be pornographic, or extract only the text from the mail body and classify it by pattern matching. To address the problems of these existing methods, this paper uses a Human Detection algorithm based on the skin-color distribution within images together with the hyperlink analysis technique of a web robot agent. In performance measurements, combining morphological analysis with the Human Detection algorithm yielded an F-measure of about 90%, while additionally applying the web robot agent's hyperlink analysis technique raised the F-measure to over 97%, demonstrating that a highly reliable pornographic spam blocking system can be implemented.
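The skin-color component of this approach can be approximated, purely as an illustration and not as the authors' Human Detection algorithm, by measuring the proportion of skin-colored pixels with a fixed YCrCb threshold (OpenCV assumed; the file name is hypothetical):

```python
# A minimal sketch of the skin-color ratio idea (not the authors' detector),
# assuming OpenCV and a YCrCb threshold commonly used for skin segmentation.
import cv2
import numpy as np

def skin_pixel_ratio(image_path: str) -> float:
    img = cv2.imread(image_path)            # BGR image, None if unreadable
    if img is None:
        return 0.0
    ycrcb = cv2.cvtColor(img, cv2.COLOR_BGR2YCrCb)
    mask = cv2.inRange(ycrcb,
                       np.array([0, 133, 77], dtype=np.uint8),
                       np.array([255, 173, 127], dtype=np.uint8))
    return float(np.count_nonzero(mask)) / mask.size

# A mail attachment might be flagged when the ratio is high, with the web
# robot agent's hyperlink analysis serving as an additional signal.
print(skin_pixel_ratio("attachment.jpg") > 0.4)
```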

Analysis of Journal of Dental Hygiene Science Research Trends Using Keyword Network Analysis (키워드 네트워크 분석을 활용한 치위생과학회지 연구동향 분석)

  • Kang, Yong-Ju;Yoon, Sun-Joo;Moon, Kyung-Hui
    • Journal of Dental Hygiene Science / v.18 no.6 / pp.380-388 / 2018
  • This research team extracted keywords from 953 papers published in the Journal of Dental Hygiene Science from 2001 to 2018 for keyword and centrality analyses using the keyword network analysis method. Data were analyzed using Excel 2016 and NetMiner version 4.4.1. By analyzing the keywords both overall and by time period, we arrived at the following conclusions. Over the 17 years covered by this study, the most frequently used words were "Health," "Oral," "Hygiene," and "Hygienist." The words with the highest degree centrality, which connect the major terms in the journal, were "Health," "Dental," "Oral," "Hygiene," and "Hygienist." The words with the highest betweenness centrality were "Dental," "Health," "Oral," "Hygiene," and "Student." Analysis of degree centrality per period revealed "Health" (0.227), "Dental" (0.136), and "Hygiene" (0.136) for period 1; "Health" (0.242), "Dental" (0.177), and "Hygiene" (0.113) for period 2; "Health" (0.200), "Dental" (0.176), and "Oral" (0.082) for period 3; and "Dental" (0.235), "Health" (0.206), and "Oral" (0.147) for period 4. Analysis of betweenness centrality per period revealed "Oral" (0.281) and "Health" (0.199) for period 1; "Dental" (0.205) and "Health" (0.169) for period 2, with the weight then dispersing to "Hygiene" (0.112), "Hygienist" (0.054), and "Oral" (0.053); "Health" (0.258) and "Dental" (0.246) for period 3; and "Oral" (0.364), "Health" (0.353), and "Dental" (0.333) for period 4. Based on these results, we hope that further studies will be conducted with diverse study subjects.
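The degree and betweenness centrality figures above come from NetMiner; a minimal sketch of the same kind of computation, assuming networkx and a tiny illustrative keyword co-occurrence network instead, looks like this:

```python
# A minimal sketch of keyword co-occurrence centrality, assuming networkx
# in place of NetMiner and a tiny illustrative keyword list.
from itertools import combinations
import networkx as nx

paper_keywords = [
    ["dental", "hygiene", "health"],
    ["oral", "health", "hygienist"],
    ["dental", "oral", "student"],
]

G = nx.Graph()
for keywords in paper_keywords:
    # Every pair of keywords appearing in the same paper becomes an edge.
    G.add_edges_from(combinations(sorted(set(keywords)), 2))

print("degree centrality:", nx.degree_centrality(G))
print("betweenness centrality:", nx.betweenness_centrality(G))
```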

A Study on Automatic Classification of Newspaper Articles Based on Unsupervised Learning by Departments (비지도학습 기반의 행정부서별 신문기사 자동분류 연구)

  • Kim, Hyun-Jong;Ryu, Seung-Eui;Lee, Chul-Ho;Nam, Kwang Woo
    • Journal of the Korea Academia-Industrial cooperation Society / v.21 no.9 / pp.345-351 / 2020
  • Administrative agencies today are paying keen attention to big data analysis to improve their policy responsiveness. Of all the big data, news articles can be used to understand public opinion regarding policy and policy issues. The amount of news output has increased rapidly because of the emergence of new online media outlets, which calls for the use of automated bots or automatic document classification tools. There are, however, limits to the automatic collection of news articles related to specific agencies or departments based on the existing news article categories and keyword search queries. Thus, this paper proposes a method to process articles using classification glossaries that take into account each agency's different work features. To this end, classification glossaries were developed by extracting the work features of different departments using Word2Vec and topic modeling techniques from news articles related to different agencies. As a result, the automatic classification of newspaper articles for each department yielded approximately 71% accuracy. This study is meaningful in making academic and practical contributions because it presents a method of extracting the work features for each department, and it is an unsupervised learning-based automatic classification method for automatically classifying news articles relevant to each agency.
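A minimal sketch of the glossary-building idea, assuming gensim's Word2Vec and illustrative seed terms and sentences (not the paper's corpus or its topic-modeling step), might look like this:

```python
# A minimal sketch of building per-department classification glossaries with
# Word2Vec; the sentences, seed terms, and department names are illustrative.
from gensim.models import Word2Vec

sentences = [
    ["city", "road", "construction", "budget"],
    ["road", "pavement", "construction", "safety"],
    ["library", "reading", "program", "children"],
    ["children", "library", "book", "program"],
]
model = Word2Vec(sentences, vector_size=32, window=3, min_count=1, epochs=50)

# Expand each department's seed terms with their nearest neighbors.
seed_terms = {"construction_dept": ["road"], "culture_dept": ["library"]}
glossaries = {
    dept: set(seeds) | {w for s in seeds
                        for w, _ in model.wv.most_similar(s, topn=3)}
    for dept, seeds in seed_terms.items()
}

# Classify an article by the overlap between its tokens and each glossary.
article = ["new", "road", "pavement", "budget"]
best = max(glossaries, key=lambda d: len(glossaries[d] & set(article)))
print(glossaries)
print("assigned department:", best)
```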

VOC Summarization and Classification based on Sentence Understanding (구문 의미 이해 기반의 VOC 요약 및 분류)

  • Kim, Moonjong;Lee, Jaean;Han, Kyouyeol;Ahn, Youngmin
    • KIISE Transactions on Computing Practices / v.22 no.1 / pp.50-55 / 2016
  • To understand customers' opinions of or demands regarding a company's products or services, it is important to analyze VOC (Voice of Customer) data; however, it is difficult to understand context in VOC because of segmented and duplicated sentences and the variety of dialog contexts. In this article, POS (part of speech) tags and morphemes were selected as language resources because of their semantic importance in documents, and based on these we defined LSPs (Lexico-Semantic-Patterns) to understand the structure and semantics of sentences and to extract summaries composed of key sentences; furthermore, the LSPs were used to connect segmented sentences and to remove contextual repetition. We also defined LSPs per category and classified documents into the categories whose main sentences they matched. In the experiment, we summarized and classified VOC documents and compared the results with previous methodologies.
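As a rough sketch of pattern-based key-sentence selection and category assignment, the following uses plain regular expressions as stand-ins for the paper's lexico-semantic patterns (the categories and patterns are illustrative):

```python
# A minimal sketch of lexical-pattern matching for key-sentence selection and
# category assignment; the patterns are illustrative stand-ins for LSPs.
import re

category_patterns = {
    "refund_request": [r"\brefund\b", r"\bmoney back\b"],
    "delivery_issue": [r"\bnot (yet )?delivered\b", r"\bdelivery .* late\b"],
}

def classify_voc(sentences):
    key_sentences, votes = [], {}
    for sent in sentences:
        for category, patterns in category_patterns.items():
            if any(re.search(p, sent.lower()) for p in patterns):
                key_sentences.append(sent)          # sentence matched an LSP
                votes[category] = votes.get(category, 0) + 1
                break
    top = max(votes, key=votes.get) if votes else "other"
    return key_sentences, top

voc = ["I ordered last week.", "The package was not delivered.",
       "Please give me a refund."]
print(classify_voc(voc))
```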

A Collecting Model of Public Opinion on Social Disaster in Twitter: A Case Study in 'Humidifier Disinfectant' (사회적 재난에 대한 트위터 여론 수렴 모델: '가습기 살균제' 사건을 중심으로)

  • Park, JunHyeong;Ryu, Pum-Mo;Oh, Hyo-Jung
    • KIPS Transactions on Software and Data Engineering / v.6 no.4 / pp.177-184 / 2017
  • Recently, social disasters have been occurring frequently in an increasingly complicated social structure, and the scale of damage has also grown. Accordingly, there is a need for a way to prevent further damage by responding rapidly to social disasters. Twitter is attracting attention as a countermeasure against disasters because of its immediacy and expandability. In particular, collecting public opinion on Twitter can be a useful tool for preventing disasters through quick response. This study proposes a method for collecting Twitter public opinion through keyword analysis, issue topic tweet detection, and time trend analysis. Furthermore, we show its feasibility using the case of the humidifier disinfectant incident, a recent social issue.
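The time trend step can be illustrated, under the assumption of pandas and toy timestamps rather than the actual Twitter collection, by counting keyword-matched tweets per day and flagging days that exceed a simple volume threshold:

```python
# A minimal sketch of the time-trend step: counting keyword-matched tweets per
# day and flagging volume spikes, using pandas and illustrative data.
import pandas as pd

tweets = pd.DataFrame({
    "time": pd.to_datetime(["2016-05-01", "2016-05-01", "2016-05-02",
                            "2016-05-03", "2016-05-03", "2016-05-03"]),
    "text": ["humidifier disinfectant victims", "disinfectant recall",
             "nice weather today", "humidifier disinfectant hearing",
             "disinfectant lung injury", "disinfectant investigation"],
})

keyword = "disinfectant"
matched = tweets[tweets["text"].str.contains(keyword)]
daily = matched.set_index("time").resample("D").size()

# Flag days whose volume exceeds mean + 1 standard deviation as issue peaks.
threshold = daily.mean() + daily.std()
print(daily[daily > threshold])
```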

Semantic-based Scene Retrieval Using Ontologies for Video Server (비디오 서버에서 온톨로지를 이용한 의미기반 장면 검색)

  • Jung, Min-Young;Park, Sung-Han
    • Journal of the Institute of Electronics Engineers of Korea CI / v.45 no.5 / pp.32-37 / 2008
  • To ensure access to rapidly growing video collections, video indexing is becoming more and more important. In this paper, a video ontology system for retrieving video data at the scene level is proposed. The proposed system creates a semantic scene as the basic unit of video retrieval and limits the domain of retrieval through the subject of that scene. The content of a semantic scene is defined using the relationships between the objects and events contained in the key frames of its shots. The semantic gap between low-level and high-level features is bridged through the scene ontology to enable semantic-based retrieval.
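A minimal sketch of ontology-backed scene retrieval, assuming rdflib and a toy scene/object/event vocabulary rather than the paper's ontology, could look like this:

```python
# A minimal sketch of semantic scene retrieval over a toy ontology, assuming
# rdflib; the scene, object, and event vocabulary is illustrative only.
from rdflib import Graph, Namespace

EX = Namespace("http://example.org/video#")
g = Graph()
g.add((EX.scene1, EX.hasObject, EX.ball))
g.add((EX.scene1, EX.hasEvent, EX.goal))
g.add((EX.scene2, EX.hasObject, EX.car))
g.add((EX.scene2, EX.hasEvent, EX.crash))

# Retrieve scenes whose key frames contain a 'goal' event.
query = """
SELECT ?scene WHERE {
    ?scene <http://example.org/video#hasEvent> <http://example.org/video#goal> .
}
"""
for row in g.query(query):
    print(row.scene)
```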

An Analysis of Civil Complaints about Traffic Policing Using the LDA Model (토픽모델링을 활용한 교통경찰 민원 분석)

  • Lee, Sangyub
    • The Journal of The Korea Institute of Intelligent Transport Systems / v.20 no.4 / pp.57-70 / 2021
  • This study aims to investigate security demands regarding traffic policing by analyzing civil complaints. Latent Dirichlet Allocation (LDA) was applied to extract key topics from 2,062 civil complaint records related to traffic policing from e-People, and an additional analysis was conducted for violation reports, which accounted for a high proportion of the complaints. In this process, the consistency and convergence of keywords and representative documents were considered together. As a result of the analysis, complaints related to the traffic police could be classified into 41 topics, including traffic safety facilities, passing through intersections (signals), provisional impoundment of vehicle plates, and personal mobility. It is necessary to strengthen crackdowns on violations at intersections and violations by motorcycles, and to take preemptive measures for the installation and operation of unmanned traffic control equipment, crosswalks, and traffic lights. In addition, it is necessary to publicize recently amended laws and newly implemented policies, such as e-fines and post-crackdown procedures.
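A minimal sketch of the LDA step, assuming gensim and a tiny illustrative complaint corpus rather than the e-People data, could look like this:

```python
# A minimal sketch of LDA topic extraction over complaint texts, assuming
# gensim and a tiny illustrative corpus rather than the e-People data.
from gensim import corpora
from gensim.models import LdaModel

docs = [
    ["crosswalk", "signal", "broken", "intersection"],
    ["motorcycle", "sidewalk", "crackdown", "request"],
    ["parking", "fine", "appeal", "procedure"],
    ["intersection", "signal", "violation", "report"],
]
dictionary = corpora.Dictionary(docs)
corpus = [dictionary.doc2bow(d) for d in docs]

# Fit a small LDA model and print the top words of each extracted topic.
lda = LdaModel(corpus=corpus, id2word=dictionary, num_topics=2,
               random_state=42, passes=20)
for topic_id, words in lda.print_topics(num_words=4):
    print(topic_id, words)
```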