• Title/Summary/Keyword: Extract Keywords

Search Result 127, Processing Time 0.021 seconds

Patent Technology Trends of Oral Health: Application of Text Mining

  • Hee-Kyeong Bak;Yong-Hwan Kim;Han-Na Kim
    • Journal of dental hygiene science
    • /
    • v.24 no.1
    • /
    • pp.9-21
    • /
    • 2024
  • Background: The purpose of this study was to utilize text network analysis and topic modeling to identify interconnected relationships among keywords present in patent information related to oral health, and subsequently extract latent topics and visualize them. By examining key keywords and specific subjects, this study sought to comprehend the technological trends in oral health-related innovations. Furthermore, it aims to serve as foundational material, suggesting directions for technological advancement in dentistry and dental hygiene. Methods: The data utilized in this study consisted of information registered over a 20-year period until July 31st, 2023, obtained from the patent information retrieval service, KIPRIS. A total of 6,865 patent titles related to keywords, such as "dentistry," "teeth," and "oral health," were collected through the searches. The research tools included a custom-designed program coded specifically for the research objectives based on Python 3.10. This program was used for keyword frequency analysis, semantic network analysis, and implementation of Latent Dirichlet Allocation for topic modeling. Results: Upon analyzing the centrality of connections among the top 50 frequently occurring words, "method," "tooth," and "manufacturing" displayed the highest centrality, while "active ingredient" had the lowest. Regarding topic modeling outcomes, the "implant" topic constituted the largest share at 22.0%, while topics concerning "devices and materials for oral health" and "toothbrushes and oral care" exhibited the lowest proportions at 5.5% each. Conclusion: Technologies concerning methods and implants are continually being researched in patents related to oral health, while there is comparatively less technological development in devices and materials for oral health. This study is expected to be a valuable resource for uncovering potential themes from a large volume of patent titles and suggesting research directions.

Method of Related Document Recommendation with Similarity and Weight of Keyword (키워드의 유사도와 가중치를 적용한 연관 문서 추천 방법)

  • Lim, Myung Jin;Kim, Jae Hyun;Shin, Ju Hyun
    • Journal of Korea Multimedia Society
    • /
    • v.22 no.11
    • /
    • pp.1313-1323
    • /
    • 2019
  • With the development of the Internet and the increase of smart phones, various services considering user convenience are increasing, so that users can check news in real time anytime and anywhere. However, online news is categorized by media and category, and it provides only a few related search terms, making it difficult to find related news related to keywords. In order to solve this problem, we propose a method to recommend related documents more accurately by applying Doc2Vec similarity to the specific keywords of news articles and weighting the title and contents of news articles. We collect news articles from Naver politics category by web crawling in Java environment, preprocess them, extract topics using LDA modeling, and find similarities using Doc2Vec. To supplement Doc2Vec, we apply TF-IDF to obtain TC(Title Contents) weights for the title and contents of news articles. Then we combine Doc2Vec similarity and TC weight to generate TC weight-similarity and evaluate the similarity between words using PMI technique to confirm the keyword association.

A Study on Automatic Extraction of Core Sentences from Document using Word Cooccurrence Graph (단어의 공기 관계 그래프를 이용한 문서의 핵심 문장 추출에 관한 연구)

  • Ryu, Je;Han, Kwang-Rok;Sohn, Seok-Won;Rim, Kee-Wook
    • The Transactions of the Korea Information Processing Society
    • /
    • v.7 no.11
    • /
    • pp.3427-3437
    • /
    • 2000
  • In this paper,we propose an method of core sciences extractionusing word cooccrrence graph in order to summarize a document. For automatic extraction of core sentenees, we construct a mean cluster from word cooccurrence graph, and find insistence which corresponds a porposed of author. And then we extract keywords by using relationship between mean cluster and isistence. Finally, core senrences are sclected based on keywords and insitances. The esults are evaluated by comparing with manual extraction, and show that the extraction performance is improved about 10%.

  • PDF

Automatic Single Document Text Summarization Using Key Concepts in Documents

  • Sarkar, Kamal
    • Journal of Information Processing Systems
    • /
    • v.9 no.4
    • /
    • pp.602-620
    • /
    • 2013
  • Many previous research studies on extractive text summarization consider a subset of words in a document as keywords and use a sentence ranking function that ranks sentences based on their similarities with the list of extracted keywords. But the use of key concepts in automatic text summarization task has received less attention in literature on summarization. The proposed work uses key concepts identified from a document for creating a summary of the document. We view single-word or multi-word keyphrases of a document as the important concepts that a document elaborates on. Our work is based on the hypothesis that an extract is an elaboration of the important concepts to some permissible extent and it is controlled by the given summary length restriction. In other words, our method of text summarization chooses a subset of sentences from a document that maximizes the important concepts in the final summary. To allow diverse information in the summary, for each important concept, we select one sentence that is the best possible elaboration of the concept. Accordingly, the most important concept will contribute first to the summary, then to the second best concept, and so on. To prove the effectiveness of our proposed summarization method, we have compared it to some state-of-the art summarization systems and the results show that the proposed method outperforms the existing systems to which it is compared.

Necessary Factors for Space for Community Behavior of Elderly People Living Alone in Public Rental Housing - Using by Laddering Survey Method and KJ Rule - (임대주택단지 독거노인의 커뮤니티행위를 위한 공간의 필요요소 - Laddering조사방법과 KJ법을 통한 분류 -)

  • Kim, Dong-Sook;Kwon, Oh-Jung;Lee, Ok-Kyung
    • Korean Institute of Interior Design Journal
    • /
    • v.25 no.2
    • /
    • pp.112-122
    • /
    • 2016
  • The study aims to furnish the data for the study of invigorating communities of various ages including old people living alone. Furthermore, it seems that the laddering method and kj method can be useful for the study of various attitude survey. This study attempted to extract necessary factors for space for community behavior of elderly people living alone using laddering survey method and KJ rule on a sample of elderly people who live alone in three different rental housings. The keywords that were drawn from the perception survey through laddering were divided into small, middle and large classification using KJ rule. Mostly identical keywords were extracted in the three complexes, implying that there is no difference of perception among the three complexes in this survey. According to the large classification where KJ rule was applied, there were six necessary factors for space for community activity of elderly people living alone in rental housing including spatial consideration, communication, resting space, nature-friendliness, health promotion and Secure identity, which were the same for all of the three complexes.

Conflict Analysis in Construction Project with Unstructured Data: A Case Study of Jeju Naval Base Project in South Korea

  • Baek, Seungwon;Han, Seung Heon;Lee, Changjun;Jang, Woosik;Ock, Jong Ho
    • International conference on construction engineering and project management
    • /
    • 2017.10a
    • /
    • pp.291-296
    • /
    • 2017
  • Infrastructure development as national project suffers from social conflict which is one of main risk to be managed. Social conflicts have a negative impact on not only the social integration but also the national economy as they require enormous social costs to be solved. Against this backdrop, this study analyzes social conflict using articles published by online news media based on web-crawling and natural language processing (NLP) techniques. As an illustrative case, the Jeju Naval Base (JNB) project which is one of representative conflict case in South Korea is analyzed. Total of 21,788 articles and representative keywords are identified annually. Additionally, comparative analysis is conducted between the extracted keywords and actual events occurred during the project. The authors explain actual events in the JNB project based on the extracted words by the year. This study contributes to analyze social conflict and to extract meaningful information from unstructured data.

  • PDF

Perception and Trend Differences between Korea, China, and the US on Vegan Fashion -Using Big Data Analytics- (빅데이터를 이용한 비건 패션 쟁점의 분석 -한국, 중국, 미국을 중심으로-)

  • Jiwoon Jeong;Sojung Yun
    • Journal of the Korean Society of Clothing and Textiles
    • /
    • v.47 no.5
    • /
    • pp.804-821
    • /
    • 2023
  • This study examines current trends and perceptions of veganism and vegan fashion in Korea, China, and the United States. Using big data tools Textom and Ucinet, we conducted cluster analysis between keywords. Further, frequency analysis using keyword extraction and CONCOR analysis obtained the following results. First, the nations' perceptions of veganism and vegan fashion differ significantly. Korea and the United States generally share a similar understanding of vegan fashion. Second, the industrial structures, such as products and businesses, impacted how Korea perceived veganism. Third, owing to its ongoing sociopolitical tensions, the United States views veganism as an ethical consumption method that ties into activism. In contrast, China views veganism as a healthy diet rather than a lifestyle and associates it with Buddhist vegetarianism. This perception is because of their religious history and culinary culture. Fundamentally, this study is meaningful for using big data to extract keywords related to vegan fashion in Korea, China, and the United States. This study deepens our understanding of vegan fashion by comparing perceptions across nations.

GPT-enabled SNS Sentence writing support system Based on Image Object and Meta Information (이미지 객체 및 메타정보 기반 GPT 활용 SNS 문장 작성 보조 시스템)

  • Dong-Hee Lee;Mikyeong Moon;Bong-Jun, Choi
    • Journal of the Institute of Convergence Signal Processing
    • /
    • v.24 no.3
    • /
    • pp.160-165
    • /
    • 2023
  • In this study, we propose an SNS sentence writing assistance system that utilizes YOLO and GPT to assist users in writing texts with images, such as SNS. We utilize the YOLO model to extract objects from images inserted during writing, and also extract meta-information such as GPS information and creation time information, and use them as prompt values for GPT. To use the YOLO model, we trained it on form image data, and the mAP score of the model is about 0.25 on average. GPT was trained on 1,000 blog text data with the topic of 'restaurant reviews', and the model trained in this study was used to generate sentences with two types of keywords extracted from the images. A survey was conducted to evaluate the practicality of the generated sentences, and a closed-ended survey was conducted to clearly analyze the survey results. There were three evaluation items for the questionnaire by providing the inserted image and keyword sentences. The results showed that the keywords in the images generated meaningful sentences. Through this study, we found that the accuracy of image-based sentence generation depends on the relationship between image keywords and GPT learning contents.

Improvement of a Product Recommendation Model using Customers' Search Patterns and Product Details

  • Lee, Yunju;Lee, Jaejun;Ahn, Hyunchul
    • Journal of the Korea Society of Computer and Information
    • /
    • v.26 no.1
    • /
    • pp.265-274
    • /
    • 2021
  • In this paper, we propose a novel recommendation model based on Doc2vec using search keywords and product details. Until now, a lot of prior studies on recommender systems have proposed collaborative filtering (CF) as the main algorithm for recommendation, which uses only structured input data such as customers' purchase history or ratings. However, the use of unstructured data like online customer review in CF may lead to better recommendation. Under this background, we propose to use search keyword data and product detail information, which are seldom used in previous studies, for product recommendation. The proposed model makes recommendation by using CF which simultaneously considers ratings, search keywords and detailed information of the products purchased by customers. To extract quantitative patterns from these unstructured data, Doc2vec is applied. As a result of the experiment, the proposed model was found to outperform the conventional recommendation model. In addition, it was confirmed that search keywords and product details had a significant effect on recommendation. This study has academic significance in that it tries to apply the customers' online behavior information to the recommendation system and that it mitigates the cold start problem, which is one of the critical limitations of CF.

Analysis of Keywords in national river occupancy permits by region using text mining and network theory (텍스트 마이닝과 네트워크 이론을 활용한 권역별 국가하천 점용허가 키워드 분석)

  • Seong Yun Jeong
    • Smart Media Journal
    • /
    • v.12 no.11
    • /
    • pp.185-197
    • /
    • 2023
  • This study was conducted using text mining and network theory to extract useful information for application for occupancy and performance of permit tasks contained in the permit contents from the permit register, which is used only for the simple purpose of recording occupancy permit information. Based on text mining, we analyzed and compared the frequency of vocabulary occurrence and topic modeling in five regions, including Seoul, Gyeonggi, Gyeongsang, Jeolla, Chungcheong, and Gangwon, as well as normalization processes such as stopword removal and morpheme analysis. By applying four types of centrality algorithms, including stage, proximity, mediation, and eigenvector, which are widely used in network theory, we looked at keywords that are in a central position or act as an intermediary in the network. Through a comprehensive analysis of vocabulary appearance frequency, topic modeling, and network centrality, it was found that the 'installation' keyword was the most influential in all regions. This is believed to be the result of the Ministry of Environment's permit management office issuing many permits for constructing facilities or installing structures. In addition, it was found that keywords related to road facilities, flood control facilities, underground facilities, power/communication facilities, sports/park facilities, etc. were at a central position or played a role as an intermediary in topic modeling and networks. Most of the keywords appeared to have a Zipf's law statistical distribution with low frequency of occurrence and low distribution ratio.