• Title/Summary/Keyword: Text Mining Method

Research on R&D Planning Through NLP Analysis of Patent Information: Focusing on Display Technology (특허정보의 NLP 분석을 통한 R&D 계획수립 방안 연구: 디스플레이 기술 분석을 중심으로)

  • Kim, Jung-Heui;Kim, Young-Min
    • Journal of the Korean Society of Industry Convergence / v.25 no.5 / pp.817-826 / 2022
  • Patent information records the history of technological progress in a field, so it can be used effectively to identify trends in technological development and change and to establish R&D strategies. This study proposes a method to identify the needs and problems of technology development at the planning stage of the R&D process and to analyze core technologies through patent analysis using Natural Language Processing (NLP) technology. As a big data source, we collected patent documents registered in Google Patents for foldable technology, the latest technology in the display industry, and then extracted keywords using an NLP analyzer. By classifying the extracted keywords into needs and problems for technology development and into developed technologies and materials, we identified the needs of the market and customers and analyzed the technologies being researched and developed. Unlike previous patent analysis studies, this methodology can quickly and conveniently analyze the latest technology trends from patent big data even without specialized knowledge and skills in text mining. This study contributes to the digitalization of the R&D process based on data analysis.

Changes in the Cultural Trend of Use by Type of Green Infrastructure Before and After COVID-19 Using Blog Text Mining in Seoul

  • Chae, Jinhae;Cho, MinJoon
    • Journal of People, Plants, and Environment / v.24 no.4 / pp.415-427 / 2021
  • Background and objective: This study examined the changes in the cultural trend of use for green infrastructure in Seoul due to the COVID-19 pandemic. Methods: The subjects of this study are 8 sites of green infrastructure selected by type: forested green infrastructure, watershed green infrastructure, park green infrastructure, and walkway green infrastructure. The data used for analysis were blog posts covering a total of four years, from August 1, 2016 to July 31, 2020. The analysis consisted of keyword frequency analysis, topic modeling, and related keyword analysis. Results: The results of this study are as follows. First, the number of posts on green infrastructure has increased since COVID-19, especially for forested green infrastructure and watershed green infrastructure, which offer abundant naturalness and high openness. Second, the cultural trend keywords before and after COVID-19 changed from large-scale to small-scale activities, from community-based to individual-based activities, and from non-daily to daily culture. Third, after COVID-19, topics and keywords related to coronavirus showed that the cultural trends were reflected in appreciation, activities, and dailiness based on natural resources. In sum, interest in green infrastructure in Seoul has increased after COVID-19. The change in green infrastructure use also represents increased demand for experiences that reflect the need and expectation for nature. Conclusion: The new trend of green infrastructure in the pandemic era should be considered in terms of individual relaxation and activities.

Text mining-based Data Preprocessing and Accident Type Analysis for Construction Accident Analysis (건설사고 분석을 위한 텍스트 마이닝 기반 데이터 전처리 및 사고유형 분석)

  • Yoon, Young Geun;Lee, Jae Yun;Oh, Tae Keun
    • Journal of the Korean Society of Safety / v.37 no.2 / pp.18-27 / 2022
  • Construction accidents are difficult to prevent because several different types of activities occur simultaneously. The current method of accident analysis only indicates the number of occurrences for one or two variables, and accidents have not decreased as a result of safety measures that focus solely on individual variables. Even if accident data are analyzed to establish appropriate safety measures, it is difficult to derive significant results because of the large number of data variables, elements, and qualitative records. In this study, to simplify the analysis and approach this complex problem logically, data preprocessing techniques such as latent class cluster analysis (LCCA) and predictor importance were used to discover the most influential variables. Finally, the correlation was analyzed using an alluvial flow diagram consisting of seven variables and fourteen elements based on accident data. The alluvial diagram analysis using reduced variables and elements enabled accident trends to be grouped into four categories. The findings of this study demonstrate that complex and diverse construction accident data can yield relevant analysis results, assisting in the prevention of accidents.

Rating Individual Food Items of Restaurant Menu based on Online Customer Reviews using Text Mining Technique (신뢰성있는 온라인 고객 리뷰 텍스트 마이닝 기반 식당 개별 음식 아이템 평가)

  • Syed, Muzamil Hussain;Chung, Sun-Tae
    • Proceedings of the Korea Information Processing Society Conference / 2020.05a / pp.389-392 / 2020
  • The growth in social media, blogs and restaurant listing directories has led to an increasing number of customer reviews on the internet about restaurants, the quality of their food items, and their services. These user reviews offer a massive amount of valuable information that can be used for various decision-making purposes. Currently, most food recommendation sites provide recommendation scores for restaurants rather than for the food items of a restaurant, and the scores may be biased since they are calculated only from user reviews listed on their own sites. Usually, people want a reliable recommendation about foods, not restaurants. In this paper, we present a reliable rating method for Korean food items. We first extract food items by applying an NER technique to restaurant reviews collected from many Korean restaurant recommendation web sites, blogs and web data. Then, we apply lexicon-based sentiment analysis to the collected user reviews and predict people's opinions as sentiment polarity scores (+1 for positive; -1 for negative; 0 for neutral). Finally, by taking the average of all calculated polarity scores for a food item, we obtain a rating for each individual menu item of the restaurant. The proposed food item rating is more reliable since it does not depend on reviews from only one site.
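The last two steps of this abstract (polarity scoring, then averaging per food item) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the English toy lexicon, the sample reviews, and the names `LEXICON`, `polarity`, and `rate_items` are all hypothetical, and the NER step is assumed to have already attached a food item to each review.

```python
from collections import defaultdict

# Hypothetical toy lexicon (assumption); a real system would use a curated
# Korean sentiment lexicon.
LEXICON = {"delicious": 1, "great": 1, "fresh": 1,
           "bland": -1, "salty": -1, "terrible": -1}


def polarity(review: str) -> int:
    """Return +1, -1, or 0 from the sign of the summed lexicon scores."""
    total = sum(LEXICON.get(tok.strip(".,!?").lower(), 0) for tok in review.split())
    return (total > 0) - (total < 0)


def rate_items(reviews):
    """Average the polarity scores per food item.

    `reviews` is an iterable of (food_item, review_text) pairs; the food item
    is assumed to be extracted already (the abstract uses NER for that step).
    """
    scores = defaultdict(list)
    for item, text in reviews:
        scores[item].append(polarity(text))
    return {item: sum(vals) / len(vals) for item, vals in scores.items()}


# Made-up reviews, pooled from several imaginary sites.
reviews = [
    ("bibimbap", "Delicious and fresh!"),
    ("bibimbap", "A bit bland."),
    ("kimchi stew", "Terrible, far too salty."),
    ("kimchi stew", "It was okay."),
]
ratings = rate_items(reviews)
print(ratings)
```

Because each review contributes exactly one score in {+1, 0, -1}, the resulting rating always lies between -1 and +1, and pooling reviews from several sites simply means adding more pairs to the input.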

Topic Modeling of Korean Newspaper Articles on Aging via Latent Dirichlet Allocation

  • Lee, So Chung
    • Asian Journal for Public Opinion Research / v.10 no.1 / pp.4-22 / 2022
  • The purpose of this study is to explore the structure of social discourse on aging in Korea by analyzing newspaper articles on aging. The analysis is composed of three steps: first, data collection and preprocessing; second, identifying the latent topics; and third, observing yearly dynamics of topics. In total, 1,472 newspaper articles that included the word "aging" in the title were collected from 10 major newspapers between 2006 and 2019. The underlying topic structure was analyzed using Latent Dirichlet Allocation (LDA), a topic modeling method widely adopted by text mining academics and researchers. Seven latent topics were generated from the LDA model, defined as social issues, death, private insurance, economic growth, national debt, labor market innovation, and income security. The topic loadings demonstrated a clear increase in public interest in topics such as national debt and labor market innovation in recent years. This study concludes that media discourse on aging has shifted towards issues related to productivity and efficiency, requiring older people to be productive citizens. Such subjectivation connotes a decreased role for the government and society, shifting responsibility to individuals who are unable to adapt successfully as productive citizens within the labor market.

A Study on the Research Trends in Int'l Trade Using Topic modeling (토픽모델링을 활용한 무역분야 연구동향 분석)

  • Jee-Hoon Lee;Jung-Suk Kim
    • Korea Trade Review / v.45 no.3 / pp.55-69 / 2020
  • This study examines the research trends and knowledge structure of international trade studies using topic modeling, one of the main methodologies of text mining. We collected and analyzed the English abstracts of 1,868 papers published from 2003 to 2019 in three major Korean journals in the area of international trade. We used Latent Dirichlet Allocation (LDA), an unsupervised machine learning algorithm, to extract the latent topics from the large quantity of research abstracts. Twenty topics were identified without any prior human judgement. The topics reveal topographical maps of research in international trade and are representative and meaningful in the sense that most of them correspond to previously established sub-topics in trade studies. We then conducted a regression analysis on the document-topic distributions generated by LDA to identify hot and cold topics. We discovered 2 hot topics (internationalization capacity and performance of export companies; economic effect of trade) and 2 cold topics (exchange rate and current account; trade finance). Trade studies are characterized as an interdisciplinary study of three agendas (i.e. international economy, international business, and trade practice), and the 20 topics identified can be grouped into these 3 agendas. From the estimated results of the study, we find that the Korean government's active pursuit of FTAs and the consequent need for capacity building in Korean export firms lie behind the popularity of these topics among Korean researchers in the area of international trade.
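One common way to label hot and cold topics is to regress each topic's yearly share on the year and look at the sign of the slope. The sketch below assumes that reading; the abstract does not spell out the regression setup, so the slope threshold, the function names `ols_slope` and `classify_topics`, the topic names, and all numbers are illustrative assumptions. A real analysis would normally test the slope for statistical significance rather than use a fixed cutoff.

```python
def ols_slope(xs, ys):
    """Ordinary least squares slope of ys regressed on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


def classify_topics(yearly_share, threshold=0.005):
    """Label each topic 'hot', 'cold', or 'stable' from its trend slope.

    `yearly_share` maps topic name -> {year: mean topic proportion}.
    `threshold` is an arbitrary cutoff chosen for this sketch (assumption).
    """
    labels = {}
    for topic, series in yearly_share.items():
        years = sorted(series)
        slope = ols_slope(years, [series[y] for y in years])
        if slope > threshold:
            labels[topic] = "hot"
        elif slope < -threshold:
            labels[topic] = "cold"
        else:
            labels[topic] = "stable"
    return labels


# Made-up yearly mean topic proportions, for illustration only.
yearly_share = {
    "topic_a": {2015: 0.04, 2016: 0.05, 2017: 0.06, 2018: 0.07, 2019: 0.08},
    "topic_b": {2015: 0.09, 2016: 0.08, 2017: 0.07, 2018: 0.06, 2019: 0.05},
    "topic_c": {2015: 0.05, 2016: 0.05, 2017: 0.05, 2018: 0.05, 2019: 0.05},
}
labels = classify_topics(yearly_share)
print(labels)
```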

Perception and Trend Differences between Korea, China, and the US on Vegan Fashion -Using Big Data Analytics- (빅데이터를 이용한 비건 패션 쟁점의 분석 -한국, 중국, 미국을 중심으로-)

  • Jiwoon Jeong;Sojung Yun
    • Journal of the Korean Society of Clothing and Textiles / v.47 no.5 / pp.804-821 / 2023
  • This study examines current trends and perceptions of veganism and vegan fashion in Korea, China, and the United States. Using the big data tools Textom and Ucinet, we conducted cluster analysis between keywords. Frequency analysis based on keyword extraction and CONCOR analysis yielded the following results. First, the nations' perceptions of veganism and vegan fashion differ significantly, although Korea and the United States generally share a similar understanding of vegan fashion. Second, industrial structures, such as products and businesses, shaped how Korea perceived veganism. Third, owing to its ongoing sociopolitical tensions, the United States views veganism as an ethical consumption method that ties into activism. In contrast, China views veganism as a healthy diet rather than a lifestyle and associates it with Buddhist vegetarianism, a perception rooted in its religious history and culinary culture. This study is meaningful in that it uses big data to extract keywords related to vegan fashion in Korea, China, and the United States, and it deepens our understanding of vegan fashion by comparing perceptions across nations.

A Study on the Development of LDA Algorithm-Based Financial Technology Roadmap Using Patent Data

  • Koopo KWON;Kyounghak LEE
    • Korean Journal of Artificial Intelligence / v.12 no.3 / pp.17-24 / 2024
  • This study aims to derive a technology development roadmap for related fields by utilizing patent documents on financial technology. To this end, patent documents were retrieved using technical keywords drawn from prior research and related reports on financial technology. The TF-IDF (Term Frequency-Inverse Document Frequency) technique, a text mining technique, was applied to the extracted patent documents to identify keywords, and the Latent Dirichlet Allocation (LDA) algorithm was applied to identify the topics of the core technologies of financial technology. Based on the proportion of topics by year, which is the result of LDA, promising technology fields and convergence fields were identified through trend analysis and similarity analysis between topics. A first-stage technology development roadmap for technology field development and a second-stage technology development roadmap for convergence were derived through network analysis of the identified financial technology fields: the technology data-based integrated management system of the high-dimensional payment system using RF and intelligent cards, and the security processing methodology for data information and network payment. The proposed method can serve as a sound basis for developing financial technology R&D strategies and technology roadmaps.
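The "proportion of topics by year" used for the trend analysis can be obtained by averaging the per-document topic distributions within each year. The sketch below assumes that definition, which the abstract does not state explicitly; the function name `topic_share_by_year` and the sample data are hypothetical.

```python
from collections import defaultdict


def topic_share_by_year(doc_years, doc_topics):
    """Average document-topic distributions per year.

    `doc_years[i]` is the year of document i (e.g. a patent filing year) and
    `doc_topics[i]` is its topic distribution (a list of proportions that
    sums to 1), as an LDA model would produce.
    """
    sums = {}
    counts = defaultdict(int)
    for year, dist in zip(doc_years, doc_topics):
        if year not in sums:
            sums[year] = [0.0] * len(dist)
        sums[year] = [s + p for s, p in zip(sums[year], dist)]
        counts[year] += 1
    return {y: [s / counts[y] for s in sums[y]] for y in sorted(sums)}


# Made-up distributions over two topics for three documents.
doc_years = [2020, 2020, 2021]
doc_topics = [[0.8, 0.2], [0.4, 0.6], [0.1, 0.9]]
shares = topic_share_by_year(doc_years, doc_topics)
print(shares)
```

A topic whose yearly share rises steadily would be a candidate "promising" field under this reading; the similarity and network analyses mentioned in the abstract are not covered by this sketch.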

A Time Series Analysis of Urban Park Behavior Using Big Data (빅데이터를 활용한 도시공원 이용행태 특성의 시계열 분석)

  • Woo, Kyung-Sook;Suh, Joo-Hwan
    • Journal of the Korean Institute of Landscape Architecture / v.48 no.1 / pp.35-45 / 2020
  • This study focused on the park as a space that supports the behavior of urban citizens in modern society. Modern city parks are not spaces that play one specific role but are used by many people, so their function and meaning may change depending on users' behavior. In addition, current online data may influence which parks people choose to visit and how they use them. Therefore, this study analyzed the change of behavior in Yeouido Park, Yeouido Hangang Park, and Yangjae Citizens' Forest from 2000 to 2018 by utilizing a time series analysis. The analysis used Big Data techniques such as text mining and social network analysis. The summary of the study is as follows. First, the usage behavior of Yeouido Park changed over time from "Ride" (Dynamic Behavior) in the first period (I), to "Take" (Information Communication Service Behavior) in the second period (II), "See" (Communicative Behavior) in the third period (III), and "Eat" (Energy Source Behavior) in the fourth period (IV). In the case of Yangjae Citizens' Forest, the usage behavior changed from "Walk" (Dynamic Behavior) in the first, second, and third periods (I), (II), (III) to "Play" (Dynamic Behavior) in the fourth period (IV). As for the factors affecting behavior, Yeouido Park had more varied factors related to sports, leisure, culture, art, and spare time than Yangjae Citizens' Forest, whereas the main usage behavior in Yangjae Citizens' Forest was affected by various elements of natural resources. Second, the behavior in the target areas was found to concentrate on certain main behaviors over time, which played a role in selecting or limiting future behaviors. These results indicate that the space and facilities of the target areas had not been utilized evenly: various behaviors did not occur, and instead a certain main behavior appeared in each target area. This study is significant in that it analyzed the usage of urban parks using Big Data techniques and determined that urban parks have been transformed into play spaces where consumption takes place, beyond their role of rest and walking. The behavior occurring in modern urban parks is changing in quantity and content. Therefore, through various types of discussions based on the behavior results collected through Big Data, we can better understand how citizens are using city parks. This study also found that static behavior in both parks had a great impact on other behaviors.

Text Mining and Association Rules Analysis to a Self-Introduction Letter of Freshman at Korea National College of Agricultural and Fisheries (1) (한국농수산대학 신입생 자기소개서의 텍스트 마이닝과 연관규칙 분석 (1))

  • Joo, J.S.;Lee, S.Y.;Kim, J.S.;Shin, Y.K.;Park, N.B.
    • Journal of Practical Agriculture & Fisheries Research / v.22 no.1 / pp.113-129 / 2020
  • In this study, we performed topic analysis and correlation analysis using text mining to extract meaningful information or rules from the self-introduction letters of freshmen at Korea National College of Agriculture and Fisheries in 2020. The items analyzed were those related to 'academic' and 'in-school activities' during high school. In the text mining results, the keywords of the 'academic' item were 'study', 'thought', 'effort', 'problem', and 'friend', and the keywords of the 'in-school activities' item were 'activity', 'thought', 'friend', 'club', and 'school', in that order. In the correlation analysis, the keywords 'thinking', 'studying', 'effort', and 'time' played a central role in the 'academic' item, and the central keywords of the 'in-school activities' item were 'thought', 'activity', 'school', 'time', and 'friend'. The results of the frequency analysis and association analysis were visualized with word clouds and correlation graphs to make them easier to understand. In the next study, TF-IDF (Term Frequency-Inverse Document Frequency) analysis, which combines keyword frequency with inverse document frequency, will be performed as a method of extracting keywords from a large number of documents.
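For reference, the TF-IDF weighting mentioned at the end of this abstract can be sketched as below. This uses one common textbook variant (relative term frequency times log(N / document frequency)); the authors' planned formula is not given, and the function name `tf_idf` and the toy token lists are assumptions that merely borrow keywords from the abstract.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Compute TF-IDF weights for tokenized documents.

    tf  = term count / document length
    idf = log(number of documents / number of documents containing the term)
    This is one common variant; libraries often add smoothing terms.
    """
    n_docs = len(docs)
    doc_freq = Counter()
    for doc in docs:
        doc_freq.update(set(doc))  # count each term once per document
    weights = []
    for doc in docs:
        counts = Counter(doc)
        weights.append({
            term: (count / len(doc)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights


# Toy tokenized "letters" (made up for illustration).
docs = [
    ["study", "effort", "friend", "study"],
    ["activity", "club", "friend"],
    ["study", "club", "school"],
]
weights = tf_idf(docs)
top_term = max(weights[0], key=weights[0].get)
print(top_term, weights[0])
```

In the first toy document, 'study' is the most frequent word, yet 'effort' receives the highest weight because it appears in no other document. This is the effect that distinguishes TF-IDF from the plain frequency counts reported in the abstract.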