Search | Korea Science

A Study on Word Cloud Techniques for Analysis of Unstructured Text Data (비정형 텍스트 테이터 분석을 위한 워드클라우드 기법에 관한 연구)

Lee, Won-Jo
- The Journal of the Convergence on Culture Technology / v.6 no.4 / pp.715-720 / 2020
In big data analysis, text data is mostly unstructured and large in volume, and analysis has been difficult because analysis techniques were not well established. This study therefore examines the possibility of commercialization by verifying the usefulness and problems of applying the big data word cloud technique, one of the text data analysis techniques. The limitations and problems of the technique are derived through a visualization analysis of the "President UN Speech" using the word cloud technique in R. In addition, an improved model is proposed to solve these problems, offering an efficient method for practical application of the word cloud technique.
https://doi.org/10.17703/JCCT.2020.6.4.715

A Study on Data Cleansing Techniques for Word Cloud Analysis of Text Data (텍스트 데이터 워드클라우드 분석을 위한 데이터 정제기법에 관한 연구)

Lee, Won-Jo
- The Journal of the Convergence on Culture Technology / v.7 no.4 / pp.745-750 / 2021
In big data visualization analysis of unstructured text data, raw data is mostly large in volume, and analysis techniques cannot be applied to it in unstructured form without cleansing. Therefore, unnecessary data is removed from the collected raw data through a first, heuristic cleansing process, and stopwords are removed through a second, machine cleansing process. The frequency of each word is then calculated and visualized using the word cloud technique, key issues are extracted and turned into information, and the results are analyzed. In this study, we propose a new stopword cleansing technique that uses an external stopword set (DB) with Python word cloud, and derive the problems and effectiveness of this technique through practical case analysis. Based on this verification, the utility of practically applying word cloud analysis with the proposed cleansing technique is presented.
https://doi.org/10.17703/JCCT.2021.7.4.745
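
The second cleansing stage and the frequency count described in this abstract can be sketched roughly as follows. This is a minimal illustration and not the paper's implementation: the function name `word_frequencies`, the tiny `external_stopwords` set, and the sample sentence are all hypothetical, and the regex tokenizer assumes English text (Korean text would normally need a morphological analyzer instead).

```python
import re
from collections import Counter


def word_frequencies(text, stopwords):
    """Tokenize text, drop stopwords, and count the remaining words."""
    # Assumption: simple lowercase word tokenization is good enough here.
    tokens = re.findall(r"[a-z']+", text.lower())
    return Counter(t for t in tokens if t not in stopwords)


# Hypothetical external stopword set; in practice it might be loaded
# from a file or database rather than written inline.
external_stopwords = {"the", "a", "of", "and", "is", "in", "to"}

sample = "The analysis of text data is the first step in the analysis of big data."
freqs = word_frequencies(sample, external_stopwords)

# The most frequent words are the candidates a word cloud would draw largest.
print(freqs.most_common(3))
```

Keeping the stopword set outside the counting function is what makes it "external": the same code can be rerun with a different stopword list without being edited.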

A Study on the Use of Stopword Corpus for Cleansing Unstructured Text Data (비정형 텍스트 데이터 정제를 위한 불용어 코퍼스의 활용에 관한 연구)

Lee, Won-Jo
- The Journal of the Convergence on Culture Technology / v.8 no.6 / pp.891-897 / 2022
In big data analysis, raw text data mostly exists in various unstructured forms, so it becomes structured data that can be analyzed only after heuristic pre-processing and computer post-processing cleansing. Therefore, in this study, unnecessary elements are removed through pre-processing of the collected raw data in order to apply the wordcloud of the R program, one of the text data analysis techniques, and stopwords are removed in post-processing. A case study of wordcloud analysis was then conducted, which calculates word frequencies and presents high-frequency words as key issues. To improve on the problems of the "nested stopword source code" method, the existing stopword processing method in R's word cloud technique, we propose the use of a "general stopword corpus" and a "user-defined stopword corpus" and conduct a case analysis. The advantages and disadvantages of the proposed "unstructured data cleansing process model" are comparatively verified, and the practical application of word cloud visualization analysis using the proposed "external corpus cleansing technique" is presented.
https://doi.org/10.17703/JCCT.2022.8.6.891

A Study on the Analysis of Consultation Needs of SMEs through Big-Data (빅데이터 분석을 활용한 중소기업의 상담요구 분석)

Lee, Bong-Cheol;You, Yen-Yoo
- Journal of Digital Convergence / v.16 no.7 / pp.27-34 / 2018
This study was conducted to identify the major consulting needs of SMEs using big data and to suggest how operations could be made more efficient. The subjects of the study were counseling cases posted on the website of the Business Support Center of the Ministry of SMEs and Startups. We crawled about 7,000 counseling cases from 2009 to March 2018 and then performed word cloud analysis centered on effective keywords. The main results were as follows. First, counseling cases were most frequent in the fields of establishment, management strategy, human resources, and finance, in that order. Second, in the word cloud analysis, the most frequent keywords related to counseling demand were small businesses, exports, methods, procedures, registration, and authentication. By applying big data analysis to public policy, this study obtained results suggesting that policy efficiency can be improved in real time from a new point of view.
https://doi.org/10.14400/JDC.2018.16.7.027

A Study on Perceptions of Xiaomi in Korea Using Big Data (빅데이터를 활용한 국내 샤오미에 관한 인식 연구)

Jae-Young Moon;Eun-Ji Lee
- Proceedings of the Korean Society of Computer Information Conference / 2023.07a / pp.343-344 / 2023
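In this paper, we use big data analysis to examine the keyword "Xiaomi", a smart-device company and manufacturer that has recently drawn attention. Xiaomi ranked first in the world among smartphone manufacturers in 2021 and entered the Global 100 Brands list (2022) for the first time at 84th place, making it one of the fastest-growing companies. In particular, as its market share in Korea gradually grows, we seek to understand Korean consumers' perceptions of the company and its future position in the domestic market. Using data on the keyword "Xiaomi" collected from domestic portals and SNS channels, we carry out keyword analysis, word cloud analysis, topic modeling, and related analyses to present recent perceptions of Xiaomi in Korea and suggest future directions.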

Security of Password Vaults of Password Managers (패스워드 매니저의 패스워드 저장소 보안 취약점 분석)

Jeong, Hyera;So, Jaewoo
- Journal of the Korea Institute of Information Security & Cryptology / v.28 no.5 / pp.1047-1057 / 2018
As the number of services offered on the Internet increases exponentially, password managers, which store multiple passwords in an encrypted database (or password vault), are becoming increasingly popular. Browser-integrated and locally installed password managers store the password vault on the user's device. Although a web-based password manager stores the password vault on a cloud server, a user can store the master password used to sign in to the cloud server on her device. An attacker who steals the encrypted vault stored on the victim's device can mount an offline attack and, if successful, all the passwords in the vault will be exposed to the attacker. This paper investigates the vulnerability of the password vault stored on the device and develops attack programs to verify that vulnerability.
https://doi.org/10.13089/JKIISC.2018.28.5.1047

Analysis of VR Game Trends using Text Mining and Word Cloud -Focusing on STEAM review data- (텍스트마이닝과 워드 클라우드를 활용한 VR 게임 트렌드 분석 -스팀(steam) 리뷰 데이터를 중심으로-)

Na, Ji Young
- Journal of Korea Game Society / v.22 no.1 / pp.87-98 / 2022
With the development of technologies related to the fourth industrial revolution and increased demand for non-face-to-face services, VR games are attracting attention. This study collected VR game review data from the online game platform STEAM and analyzed chronological trends using text mining and word cloud analysis. According to the results, the major trends were experience and perceived cost from 2016 to 2017, increased demand for FPS and rhythm games from 2018 to 2019, and story and immersion from 2020 to 2021. The study aims to contribute to expanding the base of VR games by identifying, by period, the keywords that VR users are interested in.
https://doi.org/10.7583/JKGS.2022.22.1.87

A Study on the Analysis of Accident Types in Public and Private Construction Using Web Scraping and Text Mining (웹 스크래핑과 텍스트마이닝을 이용한 공공 및 민간공사의 사고유형 분석)

Yoon, Younggeun;Oh, Taekeun
- The Journal of the Convergence on Culture Technology / v.8 no.5 / pp.729-734 / 2022
Various studies using accident cases are being conducted to identify the causes of accidents in the construction industry, but few address the differences between public and private construction. In this study, web scraping and text mining were applied to analyze the causes of accidents by order type. Statistical analysis and word cloud analysis of more than 10,000 collected structured and unstructured data records confirmed that the types and causes of accidents differ between public and private construction. In addition, identifying the correlations between major accident causes can contribute to establishing safety management measures in the future.
https://doi.org/10.17703/JCCT.2022.8.5.729

A Study on Unstructured text data Post-processing Methodology using Stopword Thesaurus (불용어 시소러스를 이용한 비정형 텍스트 데이터 후처리 방법론에 관한 연구)

Won-Jo Lee
- The Journal of the Convergence on Culture Technology / v.9 no.6 / pp.935-940 / 2023
Most text data collected through web scraping for artificial intelligence and big data analysis is large and unstructured, so a cleansing process is required before analysis. The data becomes structured data that can be analyzed through a heuristic pre-processing refining step and a post-processing machine refining step. In this study, in the post-processing machine refining step, a Korean dictionary and a stopword dictionary are used to extract vocabulary for the frequency analysis behind word cloud analysis. "User-defined stopwords" are then used to efficiently remove stopwords that were not removed in that step. We propose a methodology for applying a "thesaurus" and examine the pros and cons of the proposed refining method through a case analysis using the "user-defined stopword thesaurus" technique, which is proposed to complement the problems of the existing "stopword dictionary" method with R's word cloud technique. We present a comparative verification and suggest the effectiveness of practically applying the proposed methodology.
https://doi.org/10.17703/JCCT.2023.9.6.935

Coocurrence Relation Analysis and Visualization in Tweet for Food Safety Domain (식품안전 관련 트위터 정보의 연관 관계 분석 및 시각화)

So, Hyun-Su;Kang, Seung-Shik;Oh, Se-Wook
- 한국어정보학회:학술대회논문집 / 2016.10a / pp.305-306 / 2016
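When a food safety incident occurs, people sometimes consume the affected food before learning about the incident through news or internet articles. To minimize this problem, real-time tweet analysis is used to quickly identify the food safety keywords currently arising and the regions where the incidents occurred, and a keyword association analysis program is used to extract accurate information. In addition, so that information extracted from various sources such as SNS can be grasped concisely, data visualization techniques such as word clouds are used to present it visually. This technique could be applied not only to food safety but also to problems such as the recent cholera outbreak.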
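
One simple way to approximate the keyword association analysis mentioned in this abstract is to count how often pairs of keywords appear in the same tweet. The sketch below is only an illustration under stated assumptions: the function `cooccurrence_counts`, the keyword set, and the three sample tweets are invented for the example, and the paper's actual program may work quite differently.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(documents, keywords):
    """Count how many documents contain each unordered pair of keywords."""
    counts = Counter()
    for doc in documents:
        # Keywords present in this document, deduplicated and sorted so
        # that each pair always appears in the same order.
        present = sorted({w for w in doc.lower().split() if w in keywords})
        counts.update(combinations(present, 2))
    return counts


# Made-up tweets and keywords, for illustration only.
tweets = [
    "cholera outbreak reported in busan",
    "busan restaurant linked to cholera",
    "food poisoning reported in seoul",
]
keywords = {"cholera", "busan", "seoul", "poisoning"}

pairs = cooccurrence_counts(tweets, keywords)
for pair, n in pairs.most_common():
    print(pair, n)
```

Pairs with high counts, such as a hazard keyword and a place name, are the kind of association that could then be passed on to a visualization step.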
