• Title/Abstract/Keywords: Keyword clustering

Search results: 85 items (processing time: 0.107 s)

A Study on the Deduction of Social Issues Applying Word Embedding: With an Emphasis on News Articles related to the Disabled (단어 임베딩(Word Embedding) 기법을 적용한 키워드 중심의 사회적 이슈 도출 연구: 장애인 관련 뉴스 기사를 중심으로)

  • Choi, Garam;Choi, Sung-Pil
    • Journal of the Korean Society for Information Management / Vol. 35, No. 1 / pp.231-250 / 2018
  • In this paper, we propose a new methodology for extracting and formalizing subjective topics at a specific time using a set of keywords extracted automatically from online news articles. To do this, we first extracted a set of keywords by applying the TF-IDF method, which was selected through a series of comparative experiments on various statistical weighting schemes that measure the importance of individual words in a large set of texts. To effectively calculate the semantic relations between the extracted keywords, a set of word embedding vectors was constructed from about 1,000,000 separately collected news articles. The extracted keywords were represented as numerical vectors and clustered with the K-means algorithm. A qualitative in-depth analysis of the resulting keyword clusters showed that most clusters were appropriate topics with sufficient semantic concentration for labels to be assigned to them easily.
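
The pipeline outlined in this abstract (TF-IDF keyword selection, embedding lookup, K-means clustering) can be sketched roughly as below. This is a minimal illustration and not the authors' implementation: the toy documents, the two-dimensional `EMBEDDINGS` table, and the function names `tfidf_keywords` and `kmeans` are all hypothetical, and a real system would train embeddings on a large news corpus.

```python
import math
from collections import Counter

def tfidf_keywords(docs, top_n):
    """Rank terms by their highest TF-IDF score in any document."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    best = {}
    for doc in docs:
        for term, count in Counter(doc).items():
            score = (count / len(doc)) * math.log(n / df[term])
            best[term] = max(best.get(term, 0.0), score)
    # Ties are broken alphabetically so the result is deterministic.
    return sorted(best, key=lambda t: (-best[t], t))[:top_n]

def kmeans(points, k, iters=10):
    """Plain Lloyd's algorithm; the first k points seed the centroids."""
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [min(range(k), key=lambda c: math.dist(p, centroids[c]))
                  for p in points]
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(x) / len(members) for x in zip(*members)]
    return labels

# Hypothetical tokenized articles.
docs = [
    ["welfare", "policy", "welfare", "budget"],
    ["employment", "quota", "employment", "policy"],
    ["subway", "elevator", "access", "policy"],
]
keywords = tfidf_keywords(docs, top_n=4)

# Made-up 2-D vectors standing in for trained word embeddings.
EMBEDDINGS = {
    "employment": (1.0, 0.0), "welfare": (0.0, 1.0),
    "access": (0.9, 0.2), "budget": (0.1, 0.8),
}
labels = kmeans([EMBEDDINGS[k] for k in keywords], k=2)
print(dict(zip(keywords, labels)))
```

Note that "policy" appears in every toy document, so its inverse document frequency is zero and it is never selected as a keyword.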

Design and Implementation of Real-Time Research Trend Analysis System Using Author Keyword of Articles (논문의 저자 키워드를 이용한 실시간 연구동향 분석시스템 설계 및 구현)

  • Kim, Young-Chan;Jin, Byoung-Sam;Bae, Young-Chul
    • The Journal of the Korea Institute of Electronic Communication Sciences / Vol. 13, No. 1 / pp.141-146 / 2018
  • Author keywords are the most important elements characterizing the content of a paper; by analyzing them in real time and providing the results to users, it is possible to grasp research trends. The unstructured data of journal papers is built into a database, which is then used to create an index data structure that can be searched in real time. Papers containing a specific keyword are retrieved from the index, their author keywords are extracted and clustered, and the result is presented to the user as a word cloud in which each word is sized according to its weight. In this way we designed a method for visualizing research trends. We also present the results of the research trend analysis for the keywords "virus" and "iris recognition" in the implemented system.
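
The index-then-weight flow described above might look something like the following sketch. The paper records in `PAPERS`, the function names, and the choice of raw co-listing counts as word-cloud weights are assumptions made for illustration; the abstract does not specify the actual index structure or weighting.

```python
from collections import Counter, defaultdict

# Hypothetical paper records: paper id -> author keywords.
PAPERS = {
    "p1": ["virus", "vaccine", "rna"],
    "p2": ["virus", "vaccine", "epidemiology"],
    "p3": ["iris recognition", "biometrics"],
    "p4": ["virus", "rna"],
}

def build_index(papers):
    """Inverted index: keyword -> set of paper ids that list it."""
    index = defaultdict(set)
    for pid, kws in papers.items():
        for kw in kws:
            index[kw].add(pid)
    return index

def cloud_weights(query, papers, index):
    """Count how often each other keyword is listed alongside the query."""
    counts = Counter()
    for pid in index.get(query, ()):
        counts.update(kw for kw in papers[pid] if kw != query)
    return dict(counts)

index = build_index(PAPERS)
print(cloud_weights("virus", PAPERS, index))
```

A word-cloud renderer could then map each count to a font size.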

Experimental Studies on the Skin Barrier Improvement and Anti-inflammatory Activity based on a Bibliometric Network Map

  • Eunsoo Sohn;Sung Hyeok Kim;Chang Woo Ha;Sohee Jang;Jung Hun Choi;Hyo Yeon Son;Cheol-Joo Chae;Hyun Jung Koo;Eun-Hwa Sohn
    • Proceedings of the Plant Resources Society of Korea Conference / Plant Resources Society of Korea 2023 Extraordinary General Meeting and Spring Conference / pp.40-40 / 2023
  • Atopic dermatitis is a chronic inflammatory skin disease caused by skin barrier dysfunction. Allium victorialis var. platyphyllum (AVP) is a perennial plant used as a vegetable and herbal medicine. The purpose of this study was to propose AVP as a new cosmetic material by examining its effects on the skin barrier and inflammatory response. A bibliometric network analysis was performed through keyword co-occurrence analysis of author keywords extracted from 69 articles retrieved from Scopus. The clustering and mapping results of the network visualization analysis, performed with the VOSviewer software tool, drew our attention to anti-inflammatory activity. HPLC-UV analysis showed that AVP contains 0.12 ± 0.02 mg/g of chlorogenic acid and 0.10 ± 0.01 mg/g of gallic acid. AVP at 100 μg/mL increased the mRNA levels of filaggrin and involucrin, which are related to skin barrier function, by 1.50-fold and 1.43-fold, respectively. In the scratch assay, AVP at concentrations of 100 μg/mL and 200 μg/mL significantly increased the cell migration rate and narrowed the scratch area. In addition, AVP suppressed the increase of the inflammation-related factors COX-2 and NO and decreased the release of β-hexosaminidase. This study suggests that AVP can be developed as a functional cosmetic material for atopy management through its skin barrier protection, anti-inflammatory, and anti-itch effects.
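
Keyword co-occurrence analysis of the kind mentioned above starts from a table of how often two author keywords appear on the same article. The sketch below shows only that counting step on made-up keyword lists; it is not VOSviewer and does not reproduce the study's data, and the function name `cooccurrence` is hypothetical.

```python
from collections import Counter
from itertools import combinations

def cooccurrence(keyword_lists):
    """Count unordered keyword pairs that appear on the same article."""
    pairs = Counter()
    for kws in keyword_lists:
        # Sorting makes (a, b) and (b, a) map to the same key.
        for a, b in combinations(sorted(set(kws)), 2):
            pairs[(a, b)] += 1
    return pairs

# Hypothetical author keywords from three articles.
articles = [
    ["atopic dermatitis", "skin barrier", "inflammation"],
    ["skin barrier", "filaggrin"],
    ["inflammation", "skin barrier"],
]
pairs = cooccurrence(articles)
print(pairs.most_common(1))  # [(('inflammation', 'skin barrier'), 2)]
```

The resulting pair counts are the edge weights of the network that a mapping tool would then cluster and draw.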


Query Expansion based on Word Sense Community (유사 단어 커뮤니티 기반의 질의 확장)

  • Kwak, Chang-Uk;Yoon, Hee-Geun;Park, Seong-Bae
    • Journal of KIISE / Vol. 41, No. 12 / pp.1058-1065 / 2014
  • To assist users who are in the process of executing a search, a query expansion method suggests keywords related to the input query. Recently, several studies have suggested keywords by finding domains through clustering of the retrieved documents. However, clustering is ill-suited to presenting diverse domains because the number of clusters must be fixed in advance. This paper proposes a method that suggests keywords by finding the various domains related to an input query with a community detection algorithm. The proposed method extracts words from the top 30 retrieved documents and builds communities from the resulting word graph. Keywords representing each community are then derived, and these representative keywords are used for query expansion. To evaluate the proposed method, we compared our results with two baselines: searches performed with the Google search engine and keyword recommendation using TF-IDF over the search results. The evaluation indicates that the proposed method outperforms the baselines with respect to diversity.
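
The abstract does not name the community detection algorithm it uses. As one possible illustration, the sketch below runs a simple deterministic label propagation over a weighted word graph; label propagation is a stand-in chosen for brevity, and the toy graph, weights, and function name are hypothetical. The point it demonstrates is that the number of communities emerges from the graph instead of being fixed in advance.

```python
from collections import defaultdict

def label_propagation(edges, max_iters=20):
    """edges: {(u, v): weight}. Returns {node: community label}."""
    nbrs = defaultdict(dict)
    for (u, v), w in edges.items():
        nbrs[u][v] = w
        nbrs[v][u] = w
    labels = {node: node for node in nbrs}  # every word starts alone
    for _ in range(max_iters):
        changed = False
        for node in sorted(nbrs):  # fixed visiting order
            votes = defaultdict(float)
            for other, w in nbrs[node].items():
                votes[labels[other]] += w
            # Heaviest label wins; ties go to the alphabetically first label.
            best = min(votes, key=lambda lab: (-votes[lab], lab))
            if best != labels[node]:
                labels[node] = best
                changed = True
        if not changed:
            break
    return labels

# Hypothetical word graph for the ambiguous query "apple".
edges = {
    ("fruit", "juice"): 3, ("fruit", "pie"): 3, ("juice", "pie"): 3,
    ("ios", "iphone"): 3, ("ios", "mac"): 3, ("iphone", "mac"): 3,
    ("juice", "mac"): 1,  # weak link between the two senses
}
communities = label_propagation(edges)
print(communities)
```

On this toy graph the food-related words and the device-related words settle into two separate communities, each of which could supply one expansion keyword.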

Decomposition of a Text Block into Words Using Projection Profiles, Gaps and Special Symbols (투영 프로파일, GaP 및 특수 기호를 이용한 텍스트 영역의 어절 단위 분할)

  • Jeong Chang Bu;Kim Soo Hyung
    • Journal of KIISE: Software and Applications / Vol. 31, No. 9 / pp.1121-1130 / 2004
  • This paper proposes a method for line and word segmentation of machine-printed text blocks. To separate a text region into lines, it analyzes the horizontal projection profile and performs a recursive projection profile cut. For word segmentation, between-word gaps are identified by a hierarchical clustering method after gaps in the text line are found using connected component analysis. In addition, a special symbol detection technique is applied to find two types of special symbols that tie adjacent words together, using their morphological features. An experiment with 84 text regions from English and Korean documents shows that the proposed method achieves 99.92% word segmentation accuracy, while the commercial OCR software Armi 6.0 Pro™ achieves 97.58%.
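
The line-splitting idea behind a horizontal projection profile can be shown on a tiny binary image. The sketch below makes a single pass over the profile and cuts at empty rows; it is a simplified assumption-laden illustration (ink pixels are 1, background is 0, function names are hypothetical) and does not implement the recursive cut or the gap clustering described in the abstract.

```python
def horizontal_profile(image):
    """Number of ink pixels in each row of a binary image."""
    return [sum(row) for row in image]

def split_lines(image):
    """Return (start, end) row ranges, end exclusive, of non-empty runs."""
    lines, start = [], None
    for y, ink in enumerate(horizontal_profile(image)):
        if ink and start is None:
            start = y                      # a text line begins
        elif not ink and start is not None:
            lines.append((start, y))       # an empty row ends the line
            start = None
    if start is not None:
        lines.append((start, len(image)))  # line runs to the bottom edge
    return lines

# Toy 6x4 image with two text lines separated by an empty row.
image = [
    [0, 0, 0, 0],
    [1, 1, 0, 1],
    [1, 0, 0, 1],
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]
print(split_lines(image))  # [(1, 3), (4, 5)]
```

Applying the same function to the transposed rows of one line gives a vertical profile, whose empty runs are the candidate gaps between characters and words.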

A Study on the Macro Analysis of Knowledge Structure of the Domestic Korean Studies for Identifying the Research Fields (국내 한국학 분야의 연구 영역 식별을 위한 거시적 지식구조 분석 연구)

  • Song, Min-Sun;Ko, Young Man
    • Journal of the Korean Society for Information Management / Vol. 32, No. 3 / pp.221-236 / 2015
  • The purpose of this study is to analyze the research fields constituting the knowledge structure of Korean Studies by applying a hierarchical clustering method to domestic journal papers in Korean Studies. We analyzed 3,800 papers containing Korean author keywords that were published in 2004-2013 in 14 Korean Studies journals with an average impact factor above 0.5 in 2011-2013. The results show that the central research fields are subjects related to political and social problems based on Confucian ideas, focusing on Neo-Confucianism (Seonglihak) and the Realist School of Confucianism (Silhak); to the political situation associated with the territorial division of the Korean peninsula; and to history from the period of Japanese colonialism to the modern and contemporary eras. It was also found that the temporal backgrounds of research in domestic Korean Studies were related to modern times and the Joseon Dynasty period rather than to ancient and contemporary times.

Design of WWW IR System Based on Keyword Clustering Architecture (색인어 말뭉치 처리를 기반으로 한 웹 정보검색 시스템의 설계)

  • 송점동;이정현;최준혁
    • The Journal of Information Technology / Vol. 1, No. 1 / pp.13-26 / 1998
  • In general information retrieval systems, improper keywords are often extracted and search results diverge from the user's intent because the systems use only term frequency information to select keywords and do not consider their meanings. This implies that improvement in precision is limited unless the semantics of keywords are considered, because recall and precision are inversely related. In this paper, we design and implement a system that improves precision without decreasing recall by introducing a client user module that sends feedback reflecting the user's intention to the server. For this purpose, keywords are selected using relative term frequency and inverse document frequency, and co-occurring words are extracted from the original documents. The keywords are then clustered by their semantics using the calculated mutual information. The system can reject inappropriate documents using the segmented semantic information according to feedback from the client user module. Consequently, the precision of the system is improved without decreasing recall.
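
The abstract does not give the exact mutual information formula it uses. One common choice for scoring keyword pairs is pointwise mutual information over document-level co-occurrence, $\mathrm{PMI}(x, y) = \log_2 \frac{p(x, y)}{p(x)\,p(y)}$, sketched below as an assumption. The toy documents and the function name `pmi` are hypothetical.

```python
import math

def pmi(docs, x, y):
    """Pointwise mutual information of two terms over a list of term sets."""
    n = len(docs)
    px = sum(x in d for d in docs) / n
    py = sum(y in d for d in docs) / n
    pxy = sum(x in d and y in d for d in docs) / n
    if pxy == 0:
        return float("-inf")  # the terms never co-occur
    return math.log2(pxy / (px * py))

# Hypothetical documents, each reduced to its set of index terms.
docs = [
    {"java", "coffee"},
    {"java", "coffee"},
    {"java", "language"},
    {"python", "language"},
]
print(pmi(docs, "python", "language"))  # 1.0
print(pmi(docs, "coffee", "language"))  # -inf
```

Pairs with high scores are candidates for the same semantic cluster, while a pair that never co-occurs, such as "coffee" and "language" here, signals two different senses of an ambiguous term like "java".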


Event Detection System Using Twitter Data (트위터를 이용한 이벤트 감지 시스템)

  • Park, Tae Soo;Jeong, Ok-Ran
    • Journal of Internet Computing and Services / Vol. 17, No. 6 / pp.153-158 / 2016
  • As the number of social network users increases, information on events receiving attention in each region, such as social issues and disasters, is posted promptly and in large volumes through social media sites in real time, and its social ripple effect becomes huge. This study proposes a method for detecting events that draw attention from users in a specific region at a specific time by using Twitter data with regional information. To collect Twitter data, we use the Twitter Streaming API. After collecting the data, we implemented an event detection system that analyzes the frequency of keywords contained in tweets in a particular time period and clusters keywords describing the same event by exploiting the keywords' co-occurrence graph. Finally, we evaluate the validity of our method through experiments.
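
The two steps named in this abstract, frequency analysis within a time window and clustering over a keyword co-occurrence graph, can be approximated as below. This sketch assumes the tweets have already been collected, filtered to one region and time window, and reduced to keyword lists; the frequency threshold `min_count`, the use of connected components as the clustering rule, and the sample data are all illustrative assumptions.

```python
from collections import Counter
from itertools import combinations

def detect_events(tweets, min_count=2):
    """Group frequent keywords that co-occur in the same tweets."""
    freq = Counter(kw for t in tweets for kw in set(t))
    hot = {kw for kw, c in freq.items() if c >= min_count}
    parent = {kw: kw for kw in hot}  # union-find forest over hot keywords

    def find(kw):
        while parent[kw] != kw:
            parent[kw] = parent[parent[kw]]  # path halving
            kw = parent[kw]
        return kw

    for t in tweets:
        # Each co-occurring pair of hot keywords is an edge in the graph.
        for a, b in combinations(sorted(set(t) & hot), 2):
            parent[find(a)] = find(b)

    groups = {}
    for kw in hot:
        groups.setdefault(find(kw), set()).add(kw)
    return sorted(sorted(g) for g in groups.values())

# Hypothetical keyword lists from one region and one time window.
tweets = [
    ["earthquake", "gyeongju", "lunch"],
    ["earthquake", "gyeongju"],
    ["concert", "seoul"],
    ["concert", "seoul", "traffic"],
    ["coffee"],
]
events = detect_events(tweets)
print(events)  # [['concert', 'seoul'], ['earthquake', 'gyeongju']]
```

Keywords mentioned only once are dropped before the graph is built, so incidental words do not glue unrelated events together.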

Realtime Word Filtering System against Variations of Censored Words in Korean (변형된 한글 금칙어에 대한 실시간 필터링 시스템)

  • Kim, ChanWoo;Sung, Mee Young
    • Journal of Korea Multimedia Society / Vol. 22, No. 6 / pp.695-705 / 2019
  • The psychological damage caused by verbal abuse among victims of cyberbullying is very serious. This paper introduces a system that determines the level of sanctions against chat messages in real time using automatic prohibited-word filtering based on an artificial neural network. We propose a keyword filtering method that detects modified prohibited words and determines in real time whether the corresponding chat should be sanctioned, along with a real-time chat screening system that uses it. The accuracy of filtering through machine learning was improved by preprocessing the data with coding techniques that place consonants and vowels of similar pronunciation at close distances. In a comparison of Mahalanobis-based clustering algorithms and artificial neural network-based algorithms, the algorithms that use artificial neural networks showed high performance. Applied to Internet chat, comments, or online games, the method is expected to filter more effectively than existing methods and to ease the communication inconvenience caused by existing indiscriminate filtering.
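
Any coding scheme over Korean consonants and vowels first needs each syllable split into its jamo. The sketch below shows that prerequisite step using the standard Unicode arithmetic for precomposed Hangul syllables (19 initials, 21 medials, 28 finals starting at U+AC00). The abstract's actual pronunciation-aware codes are not given, so the raw indices returned here are only a stand-in, and the function name `decompose` is hypothetical.

```python
HANGUL_BASE = 0xAC00          # code point of the first syllable, "가"
NUM_INITIAL, NUM_MEDIAL, NUM_FINAL = 19, 21, 28

def decompose(syllable):
    """Return (initial, medial, final) jamo indices of one Hangul syllable."""
    code = ord(syllable) - HANGUL_BASE
    if not 0 <= code < NUM_INITIAL * NUM_MEDIAL * NUM_FINAL:
        raise ValueError("not a precomposed Hangul syllable")
    initial, rest = divmod(code, NUM_MEDIAL * NUM_FINAL)
    medial, final = divmod(rest, NUM_FINAL)
    return initial, medial, final

print(decompose("한"))  # (18, 0, 4)
print(decompose("시"), decompose("씨"))  # (9, 20, 0) (10, 20, 0)
```

The last line shows why a jamo-level encoding helps with altered spellings: "시" and "씨" differ only in the initial consonant, so a distance defined over jamo codes can treat them as near neighbors.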

Analysis of Research Trends in Homomorphic Encryption Using Bibliometric Analysis (서지통계학적 분석을 이용한 동형 암호의 연구경향 분석)

  • Akihiko Yamada;Eunsang Lee
    • Journal of the Korea Institute of Information Security & Cryptology / Vol. 33, No. 4 / pp.601-608 / 2023
  • Homomorphic encryption is a promising technology that has been extensively researched in recent years. It allows computations to be performed on encrypted data without the need to decrypt it. In this paper, we perform a bibliometric analysis to objectively and quantitatively analyze the research trends of homomorphic encryption technology using 6,047 homomorphic encryption papers from the Scopus database. Specifically, we analyze the number of papers by year, keyword co-occurrence, topic clustering, changes in related keywords over time, and the countries of homomorphic encryption research institutions. Our analysis results provide strategic directions for the research and application of homomorphic encryption and can be of great help for subsequent research and industrial applications.