• Title/Summary/Keyword: word cloud analysis (워드 클라우드 분석)

A Study on Word Cloud Techniques for Analysis of Unstructured Text Data (비정형 텍스트 테이터 분석을 위한 워드클라우드 기법에 관한 연구)

  • Lee, Won-Jo
    • The Journal of the Convergence on Culture Technology / v.6 no.4 / pp.715-720 / 2020
  • In big data analysis, text data is mostly unstructured and high-volume, and analysis has been difficult because analysis techniques were not well established. This study therefore examines the potential for commercialization by verifying the usefulness and problems of applying the big data word cloud technique, one of the text data analysis techniques. The limitations and problems of the technique are derived through a visualization analysis of the "President UN Speech" using the word cloud technique in R. An improved model is then proposed to address these problems, offering an efficient method for practical application of the word cloud technique.

A Study on the Use of Stopword Corpus for Cleansing Unstructured Text Data (비정형 텍스트 데이터 정제를 위한 불용어 코퍼스의 활용에 관한 연구)

  • Lee, Won-Jo
    • The Journal of the Convergence on Culture Technology / v.8 no.6 / pp.891-897 / 2022
  • In big data analysis, raw text data mostly exists in various unstructured forms, so it becomes structured, analyzable data only after heuristic pre-processing and computer-based post-processing cleansing. In this study, unnecessary elements are removed from the collected raw data in pre-processing so that the wordcloud technique of R, one of the text data analysis techniques, can be applied, and stopwords are removed in post-processing. A case study of wordcloud analysis is then conducted, which calculates word frequencies and presents high-frequency words as key issues. To improve on the problems of the existing "nested stopword source code" method of stopword processing in R's word cloud technique, we propose using a "general stopword corpus" and a "user-defined stopword corpus" and conduct a case analysis. The advantages and disadvantages of the proposed "unstructured data cleansing process model" are compared and verified, and the practical application of word cloud visualization analysis using the "proposed external corpus cleansing technique" is presented.
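
The cleansing-then-counting flow described in this abstract can be sketched roughly as follows. This is a minimal illustration in Python rather than the R code the paper uses. The two stopword sets, the sample sentence, and the function name `word_frequencies` are hypothetical stand-ins, not the author's corpora or implementation.

```python
import re
from collections import Counter

# Hypothetical stand-ins for the "general" and "user-defined" stopword corpora.
GENERAL_STOPWORDS = {"the", "a", "of", "and", "to", "in", "is"}
USER_STOPWORDS = {"speech", "today"}


def word_frequencies(text, stopwords):
    """Lowercase and tokenize the text, drop stopwords, and count the rest."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return Counter(t for t in tokens if t not in stopwords)


# Assumed sample input; real raw data would first go through manual cleansing.
raw = "The peace of the world is the goal of the speech today, and peace is possible."
freq = word_frequencies(raw, GENERAL_STOPWORDS | USER_STOPWORDS)
top_words = freq.most_common(2)  # highest-frequency words, candidates for the cloud
```

Keeping the stopword lists as external sets, rather than hard-coding them inside the counting logic, is the design idea the abstract attributes to the corpus-based approach: the lists can be swapped or extended without touching the analysis code.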

A Study on Data Cleansing Techniques for Word Cloud Analysis of Text Data (텍스트 데이터 워드클라우드 분석을 위한 데이터 정제기법에 관한 연구)

  • Lee, Won-Jo
    • The Journal of the Convergence on Culture Technology / v.7 no.4 / pp.745-750 / 2021
  • In big data visualization analysis of unstructured text data, raw data is mostly high-volume, and analysis techniques cannot be applied to it without first cleansing it. Therefore, unnecessary data is removed from the collected raw data through a first, heuristic cleansing process, and stopwords are removed through a second, machine cleansing process. The frequency of each word is then calculated and visualized using the word cloud technique, and key issues are extracted, turned into information, and analyzed. In this study, we propose a new stopword cleansing technique using an external stopword set (DB) with Python word cloud, and derive the problems and effectiveness of this technique through practical case analysis. Based on this verification, the utility of applying word cloud analysis with the proposed cleansing technique in practice is presented.

Security of Password Vaults of Password Managers (패스워드 매니저의 패스워드 저장소 보안 취약점 분석)

  • Jeong, Hyera;So, Jaewoo
    • Journal of the Korea Institute of Information Security & Cryptology / v.28 no.5 / pp.1047-1057 / 2018
  • As the number of services offered on the Internet increases exponentially, password managers, which store multiple passwords in an encrypted database (or password vault), are becoming increasingly popular. Browser-integrated and locally installed password managers store the password vault on the user's device. Although a web-based password manager stores the password vault on a cloud server, a user can store the master password used to sign in to the cloud server on her device. An attacker who steals the encrypted vault stored on the victim's device can mount an offline attack and, if successful, all the passwords in the vault will be exposed to the attacker. This paper investigates the vulnerability of the password vault stored on the device and develops attack programs to verify it.

Analysis of VR Game Trends using Text Mining and Word Cloud -Focusing on STEAM review data- (텍스트마이닝과 워드 클라우드를 활용한 VR 게임 트렌드 분석 -스팀(steam) 리뷰 데이터를 중심으로-)

  • Na, Ji Young
    • Journal of Korea Game Society / v.22 no.1 / pp.87-98 / 2022
  • With the development of technologies related to the fourth industrial revolution and increased demand for non-face-to-face services, VR games are attracting attention. This study collected VR game review data from the online game platform STEAM and analyzed chronological trends using text mining and word cloud analysis. According to the results, the major trends were experience and perceived cost from 2016 to 2017, increased demand for FPS and rhythm games from 2018 to 2019, and story and immersion from 2020 to 2021. The study aims to contribute to expanding the base of VR games by identifying, period by period, the keywords that VR users are interested in.

A Study on the Analysis of Consultation Needs of SMEs through Big-Data (빅데이터 분석을 활용한 중소기업의 상담요구 분석)

  • Lee, Bong-Cheol;You, Yen-Yoo
    • Journal of Digital Convergence / v.16 no.7 / pp.27-34 / 2018
  • This study was conducted to identify the major consulting needs of SMEs using big data and to suggest ways to improve operational efficiency. The subjects of the study were counseling cases posted on the website of the Business Support Center of the Ministry of SMEs and Startups. We crawled about 7,000 counseling cases posted from 2009 to March 2018 and then performed word cloud analysis centered on effective keywords. The main results were as follows. First, counseling cases were most frequent in the fields of establishment, management strategy, human resources, and finance, in that order. Second, in the word cloud analysis, the most frequent keywords related to counseling demand were small businesses, exports, methods, procedures, registration, and authentication. The results suggest that conducting big data analysis on public policy can improve policy efficiency in real time from a new point of view.

Coocurrence Relation Analysis and Visualization in Tweet for Food Safety Domain (식품안전 관련 트위터 정보의 연관 관계 분석 및 시각화)

  • So, Hyun-Su;Kang, Seung-Shik;Oh, Se-Wook
    • Annual Conference on Human and Language Technology (한국어정보학회:학술대회논문집) / 2016.10a / pp.305-306 / 2016
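  • To minimize the problem of people consuming a food before they learn of a food safety incident through news or internet articles, we analyze tweets in real time to quickly identify the food safety keywords currently occurring and the regions where they occur, and use a keyword association analysis program to extract accurate information. In addition, to grasp information extracted from SNS and other diverse sources clearly and concisely, we present the information visually using data visualization techniques such as word clouds. This technique could be applied not only to food safety but also to problems such as the recent cholera outbreak.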
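
One simple way to approximate the keyword association step in this abstract is to count how often two keywords appear in the same tweet. The sketch below assumes the tweets are already tokenized. The sample tweets and the function name `cooccurrence_counts` are hypothetical, and the authors' actual analysis program may work differently.

```python
from collections import Counter
from itertools import combinations

# Hypothetical, already-tokenized tweets; a real pipeline would first run a
# Korean morphological analyzer, which is outside the scope of this sketch.
tweets = [
    ["food", "poisoning", "seoul", "school"],
    ["food", "poisoning", "busan"],
    ["cholera", "busan", "seafood"],
]


def cooccurrence_counts(docs):
    """Count how many documents each unordered keyword pair appears in together."""
    pairs = Counter()
    for tokens in docs:
        # sorted(set(...)) removes repeats and gives each pair a canonical order
        for a, b in combinations(sorted(set(tokens)), 2):
            pairs[(a, b)] += 1
    return pairs


pairs = cooccurrence_counts(tweets)
strongest = pairs.most_common(1)[0]  # the most frequently co-occurring pair
```

Pairs that link a food safety keyword with a place name, such as `("busan", "cholera")` in this toy data, are the kind of association the abstract describes using to locate where an incident is occurring.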

A Study on Perceptions of Xiaomi in Korea Using Big Data (빅데이터를 활용한 국내 샤오미에 관한 인식 연구)

  • Jae-Young Moon;Eun-Ji Lee
    • Proceedings of the Korean Society of Computer Information Conference / 2023.07a / pp.343-344 / 2023
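  • This paper uses big data analysis to examine the keyword "Xiaomi", a smart-device company and manufacturer that has recently become a topic of discussion. Xiaomi ranked first in the world among smartphone manufacturers in 2021 and entered the Global 100 brands list (2022) for the first time at 84th, making it one of the fastest-growing companies. With its market share gradually increasing in Korea as well, we seek to examine Korean consumers' perceptions of the company and its future standing in the domestic market. Using data on the keyword "Xiaomi" collected from Korean portals and SNS channels, we conduct keyword analysis, word cloud analysis, topic modeling, and other analyses to present recent perceptions of Xiaomi in Korea and its future direction.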

A Study on Trend Analysis in Convergence Research Applying Word Cloud in Korea (워드 클라우드 기법을 이용한 국내 융복합 학술연구 트렌드 분석)

  • Kim, Joon-Hwan;Mun, Hyung-Jin;Lee, Hang
    • Journal of Digital Convergence / v.19 no.2 / pp.33-38 / 2021
  • The convergence trend is at the core of the 4th industrial revolution, and because of the expectations and possibilities it raises, responses are being sought in diverse fields. This study conducted a quantitative analysis to identify trends in convergence research over the past 10 years. Specifically, major research keywords were extracted and visualized using word cloud techniques to identify trends in academic research on convergence. To this end, research papers published in the Journal of Digital Convergence from 2012 to 2020 were investigated. The analysis period was divided into two periods, the former 4 years (2012-2015) and the latter 4 years (2016-2019), to confirm differences in research trends. In addition, the research papers of 2020 were analyzed to understand more clearly how research trends changed in the most recent year due to COVID-19. The results of this study are significant in that they can serve as useful basic data for future research and help in understanding research trends in the field of convergence through keywords.
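
The period-by-period keyword comparison in this abstract could be sketched as below. The period boundaries come from the abstract. The `papers` records and the function name `keyword_counts_by_period` are hypothetical, since the study's data and code are not shown here.

```python
from collections import Counter

# Hypothetical (year, keywords) records; the study's actual data is not shown.
papers = [
    (2013, ["convergence", "smart", "education"]),
    (2015, ["convergence", "education"]),
    (2017, ["big data", "convergence"]),
    (2019, ["big data", "ai"]),
    (2020, ["covid-19", "ai"]),
]

# Period boundaries taken from the abstract (inclusive on both ends).
PERIODS = {"2012-2015": (2012, 2015), "2016-2019": (2016, 2019), "2020": (2020, 2020)}


def keyword_counts_by_period(records, periods):
    """Return one keyword Counter per named period."""
    counts = {name: Counter() for name in periods}
    for year, keywords in records:
        for name, (start, end) in periods.items():
            if start <= year <= end:
                counts[name].update(keywords)
    return counts


by_period = keyword_counts_by_period(papers, PERIODS)
```

Each per-period `Counter` can then be passed to a word cloud renderer, so that differences between periods show up as differences in which keywords are drawn largest.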