• Title/Summary/Keyword: text frequency analysis (텍스트 빈도 분석)

The Trend and Tasks of Meister High School Research: Network Text Analysis and Content Analysis (마이스터고 연구의 동향과 과제: 네트워크 텍스트 분석 및 내용분석)

  • Bae, Sang Hoon;Jang, Chang Seong;Lee, Tae Hee;Cho, Sung Bum
    • Journal of vocational education research / v.33 no.3 / pp.83-104 / 2014
  • This study examined trends in research on Meister high schools in Korea. It also investigated differences in research interests between university faculty and graduate students, who are the future researchers in this field. A total of 56 research articles were analyzed using network text analysis and content analysis. The results showed that 56% of the studies sought to identify the characteristics that distinguish Meister high school students and teachers from their counterparts in vocational schools. 17.6% of the studies addressed school curriculum, and 14.0% addressed school organization and operation. Only 12.3% evaluated school performance. Quantitative studies outnumbered qualitative ones. Based on these results, the study suggests implications for policy and future research on Meister high schools.

Case Study on Public Document Classification System That Utilizes Text-Mining Technique in BigData Environment (빅데이터 환경에서 텍스트마이닝 기법을 활용한 공공문서 분류체계의 적용사례 연구)

  • Shim, Jang-sup;Lee, Kang-wook
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2015.10a / pp.1085-1089 / 2015
  • In the past, text mining was difficult to implement as an analysis algorithm because of the complexity of text and the degrees of freedom of the variables it contains. The algorithms demanded considerable effort to produce meaningful results, and mechanical text analysis took longer than analysis by humans. With advances in hardware and analysis algorithms, however, big data technology has emerged. It has resolved the problems mentioned above, and analysis through text mining is now recognized as valuable. Nevertheless, applying text mining to Korean text is still at an early stage because of the linguistic characteristics of the Korean language. If text mining makes analysis possible in addition to data search, the savings in human and material resources required for text analysis will lead to more efficient resource use in many areas of public work. In this paper, we therefore compare and evaluate manual classification of public documents against classification in a big data environment that uses text-mining-based word frequency (TF-IDF) and the cosine similarity between documents.
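
The TF-IDF weighting and cosine similarity named in this abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the toy documents are invented, the documents are assumed to be already tokenized, and the weighting uses relative term frequency with an unsmoothed logarithmic IDF, which may differ from the variant the authors used.

```python
import math
from collections import Counter


def tf_idf_vectors(docs):
    """Return one sparse {term: weight} TF-IDF vector per tokenized document."""
    n_docs = len(docs)
    # Document frequency: the number of documents in which each term appears.
    doc_freq = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        vectors.append({
            term: (count / len(doc)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return vectors


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse vectors stored as dicts."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical, already-tokenized documents (not taken from the paper).
docs = [
    "budget report tax revenue".split(),
    "tax revenue forecast budget".split(),
    "road repair schedule notice".split(),
]
vectors = tf_idf_vectors(docs)
sim_01 = cosine_similarity(vectors[0], vectors[1])  # share three terms
sim_02 = cosine_similarity(vectors[0], vectors[2])  # share no terms
```

A new document could then be assigned to the class of its most similar labeled document, which is one simple way to turn the similarity scores into a classification.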

A Comparative Study between Ubiquitous City Comprehensive Plan and Ubiquitous City Plan - Focusing on U-Service Plan (유비쿼터스도시종합계획과 유비쿼터스도시계획 비교 연구 -U-서비스 계획을 중심으로-)

  • Yoo, Ji Song;Jeong, Da Woon;Yi, Mi Sook;Min, Kyung Ju
    • Spatial Information Research / v.23 no.2 / pp.83-93 / 2015
  • U-Services offered by local governments under their Ubiquitous City Plans focus only on facility and urban management services, and citizen-oriented U-Services exist only as plans. This study aims to derive implications for providing citizen-oriented U-Services by comparing the U-Service plans of the 'Ubiquitous City Comprehensive Plan' and the 'Ubiquitous City Plan' through network text analysis and word frequency analysis. Important keywords were extracted from the service plan contents of the 'Ubiquitous City Comprehensive Plan' and the 'Ubiquitous City Plan' of four local governments, and network text analysis and keyword frequency analysis were performed on the derived keywords. Based on the results, citizens' awareness of U-City can be expected to increase if the next Ubiquitous City Comprehensive Plan promotes the discovery of citizen-oriented U-Services in a variety of sectors through additional services and financial support policies.

Text Analysis of Software Test Report (소프트웨어 시험성적서에 대한 텍스트 분석)

  • Jung, Hye-Jung;Han, Gun-Hee
    • Journal of the Korea Convergence Society / v.11 no.11 / pp.25-31 / 2020
  • This study examines a method of applying weights to quality characteristics in software test evaluation. The method analyzes the text of the test report and uses the ratio of term frequencies as the weight for each quality characteristic in the software test score. The validity of the results was reviewed by comparing the frequency analysis results with those of a questionnaire survey in which developers and users evaluated the importance of software. When quality is measured on the eight quality characteristics presented in ISO/IEC 25023, the proposed method yields a quality measurement that reflects the characteristics of the software, whereas the existing approach yields a measurement that applies the same weight to every characteristic.
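
The idea of using frequency ratios as weights can be sketched as follows. Everything concrete here is an assumption made for illustration: the keyword lists, the three characteristics shown (instead of eight), the report text, and the scores are invented, and the paper may map report text to characteristics differently.

```python
from collections import Counter

# Hypothetical keyword lists per quality characteristic (not from the paper).
KEYWORDS = {
    "functional suitability": {"function", "feature"},
    "reliability": {"failure", "recovery"},
    "usability": {"interface", "menu"},
}


def frequency_weights(tokens, keywords):
    """Weight each characteristic by its share of keyword occurrences."""
    counts = Counter(tokens)
    hits = {
        name: sum(counts[word] for word in words)
        for name, words in keywords.items()
    }
    total = sum(hits.values())
    if total == 0:
        # No keyword found: fall back to equal weights.
        return {name: 1 / len(keywords) for name in keywords}
    return {name: count / total for name, count in hits.items()}


def weighted_score(scores, weights):
    """Combine per-characteristic scores into one weighted quality score."""
    return sum(scores[name] * weights[name] for name in weights)


# Hypothetical tokenized test report and per-characteristic scores.
report = "function failure function interface feature recovery function menu".split()
weights = frequency_weights(report, KEYWORDS)
scores = {"functional suitability": 90.0, "reliability": 70.0, "usability": 80.0}
overall = weighted_score(scores, weights)
```

With equal weights the same scores would average to 80.0, so the frequency-based weights shift the result toward the characteristic the report discusses most.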

Knowledge Structure Analysis on Defense Research Using Text Network Analysis (텍스트 네트워크분석을 활용한 국방분야 연구논문 지식구조 분석)

  • Lee, Yong-Kyu;Yoon, Soung-woong;Lee, Sang-Hoon
    • Proceedings of the Korean Society of Computer Information Conference / 2018.07a / pp.526-529 / 2018
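  • This study used text network analysis to analyze the core keywords and research topics of defense research and thereby identify its overall knowledge structure. To this end, we assessed the state of defense research and constructed a knowledge structure from the degree theses of the Korea National Defense University from 2010 to 2017. We analyzed the abstracts of 710 theses accumulated over eight years and extracted a total of 6,883 words. From these, 270 words were selected by applying the Pareto principle (top 20%) to the frequency with which each word appears across theses and to the number of links between words, and component analysis yielded a final set of 170 core keywords. Centrality analysis and cohesive structure analysis of these core keywords produced a total of six knowledge structure groups for the defense field.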
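
The top-20% selection step can be sketched as below. The abstract does not say how the two criteria are combined, so this sketch makes an assumption: a word is kept only if it ranks in the top 20% both by the number of documents it appears in and by its number of distinct co-occurrence links. The function names and toy data are hypothetical.

```python
import math
from collections import Counter
from itertools import combinations


def top_fraction(scores, fraction=0.2):
    """Return the keys ranked in the top `fraction` by score (ties broken by key)."""
    keep = max(1, math.ceil(len(scores) * fraction))
    ranked = sorted(scores, key=lambda key: (-scores[key], key))
    return set(ranked[:keep])


def select_keywords(docs, fraction=0.2):
    """Keep words in the top fraction by document frequency AND by link count."""
    doc_freq = Counter()
    neighbors = {}
    for doc in docs:
        words = set(doc)
        doc_freq.update(words)
        # Two words are linked if they appear in the same document.
        for a, b in combinations(sorted(words), 2):
            neighbors.setdefault(a, set()).add(b)
            neighbors.setdefault(b, set()).add(a)
    link_count = {word: len(neighbors.get(word, ())) for word in doc_freq}
    return top_fraction(doc_freq, fraction) & top_fraction(link_count, fraction)


# Hypothetical tokenized abstracts (not from the paper).
docs = [
    ["defense", "policy", "budget"],
    ["defense", "strategy"],
    ["defense", "policy", "training"],
]
selected = select_keywords(docs)
```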

Text Mining Analysis Technique on ECDIS Accident Report (텍스트 마이닝 기법을 활용한 ECDIS 사고보고서 분석)

  • Lee, Jeong-Seok;Lee, Bo-Kyeong;Cho, Ik-Soon
    • Journal of the Korean Society of Marine Environment & Safety / v.25 no.4 / pp.405-412 / 2019
  • SOLAS requires ECDIS to be installed on ships of more than 500 gross tonnage engaged in international navigation by the first inspection after July 1, 2018. Several accidents related to the use of ECDIS have occurred since its introduction as a major new navigation instrument. Twelve incident reports issued by MAIB, BSU, BEAmer, DMAIB, and DSB were analyzed, and the causes of the accidents were found to be related to operation by the navigator and to the ECDIS system. The text was analyzed with the R program to quantify the words related to the causes of the accidents. We used text mining techniques such as Wordcloud, Wordnetwork, and Wordweight to represent the importance of words according to their frequency of occurrence. Wordcloud uses the N-gram model to express the frequency of words in cloud form. In the uni-gram analysis, the word "ECDIS" occurred most often, and in the bi-gram analysis, "Safety Contour" was the most frequent. Based on the bi-gram analysis, the causative words were classified into those related to the officer and those related to the ECDIS system, and the related words were represented by Wordnetwork. Finally, the words related to the officer and to the ECDIS system were compiled into word corpora, and Wordweight was applied to analyze the change in corpus frequency by year. Trend-line analysis of the corpus variation showed that the officer corpus has recently decreased while the ECDIS system corpus is gradually increasing.
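
The uni-gram and bi-gram counts behind such an analysis can be sketched as follows. The study itself used R; this Python version is only an illustration, and the sample sentence is invented rather than taken from any incident report.

```python
from collections import Counter


def ngram_counts(tokens, n):
    """Count contiguous n-grams in a token list, joined with spaces."""
    # zip over n shifted copies of the token list yields each n-gram once.
    grams = zip(*(tokens[i:] for i in range(n)))
    return Counter(" ".join(gram) for gram in grams)


# Hypothetical report text, lowercased and split on whitespace.
tokens = (
    "the ecdis safety contour was not set and "
    "the ecdis safety contour alarm was off"
).split()

unigrams = ngram_counts(tokens, 1)
bigrams = ngram_counts(tokens, 2)
top_bigrams = bigrams.most_common(3)
```

The resulting counts are what a word cloud would scale its words by; in practice stopwords such as "the" would be removed first.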

A Study on Monitoring Method of Citizen Opinion based on Big Data : Focused on Gyeonggi Local Currency (Gyeonggi Money) (빅데이터 기반 시민의견 모니터링 방안 연구 : "경기지역화폐"를 중심으로)

  • Ahn, Soon-Jae;Lee, Sae-Mi;Ryu, Seung-Ei
    • Journal of Digital Convergence / v.18 no.7 / pp.93-99 / 2020
  • Text mining is a big data analysis method that extracts meaningful information from large-scale unstructured text data. In this study, text mining was used to monitor citizens' opinions on policies and systems being implemented. We collected 5,108 newspaper articles and 748 online cafe posts related to 'Gyeonggi Local Currency' and performed frequency analysis, TF-IDF analysis, association analysis, and word tree visualization analysis. The results showed that many newspaper articles dealt with the purpose of introducing the local currency, the benefits provided, and the method of use, whereas the online cafe posts dealt with the actual use of the local currency. The news acted as an informant promoting the local currency in order to revitalize it, while the online cafe posts consisted of the opinions of citizens who use the local currency. SNS and text mining are expected to help activate various policies effectively, not only local currency.

A Study on the Use of Stopword Corpus for Cleansing Unstructured Text Data (비정형 텍스트 데이터 정제를 위한 불용어 코퍼스의 활용에 관한 연구)

  • Lee, Won-Jo
    • The Journal of the Convergence on Culture Technology / v.8 no.6 / pp.891-897 / 2022
  • In big data analysis, raw text data mostly exists in various unstructured forms, so it becomes structured data that can be analyzed only after heuristic pre-processing and computer-based post-processing cleansing. In this study, unnecessary elements are removed from the collected raw data in pre-processing so that the wordcloud of the R program, one of the text data analysis techniques, can be applied, and stopwords are removed in post-processing. A wordcloud case study was then conducted, which calculates word frequencies and presents high-frequency words as key issues. To address the problems of the "nested stopword source code" method, the existing way of handling stopwords with R's word cloud technique, we propose the use of a "general stopword corpus" and a "user-defined stopword corpus" and conduct a case analysis. The advantages and disadvantages of the proposed "unstructured data cleansing process model" are comparatively verified and presented, along with the practical application of word cloud visualization analysis using the proposed "external corpus cleansing technique".
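
The two-corpus approach can be sketched as below: a general stopword list and a user-defined one are kept outside the counting code and merged at call time. The study worked in R; this Python sketch is an illustration only, and the stopword sets, the sample text, and the simple ASCII tokenizer are all assumptions.

```python
import re
from collections import Counter

# Hypothetical stopword corpora; in practice each would be loaded from its own file.
GENERAL_STOPWORDS = {"the", "a", "of", "and", "is", "in", "to"}
USER_STOPWORDS = {"study", "data"}


def word_frequencies(text, *stopword_sets):
    """Count words in `text`, skipping any word found in the given stopword sets."""
    stopwords = set().union(*stopword_sets)
    # Simplifying assumption: lowercase ASCII words only.
    tokens = re.findall(r"[a-z]+", text.lower())
    return Counter(token for token in tokens if token not in stopwords)


text = "The study of text data: text cleansing is the key to text analysis."
freq = word_frequencies(text, GENERAL_STOPWORDS, USER_STOPWORDS)
```

Keeping the lists separate from the code means a domain-specific list can be swapped or extended without touching the counting logic.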

Exploration of Emotional Labor Research Trends in Korea through Keyword Network Analysis (주제어 네트워크 분석(network analysis)을 통한 국내 감정노동의 연구동향 탐색)

  • Lee, Namyeon;Kim, Joon-Hwan;Mun, Hyung-Jin
    • Journal of Convergence for Information Technology / v.9 no.3 / pp.68-74 / 2019
  • The purpose of this study was to identify research trends in 892 domestic articles (2009-2018) on emotional labor using text mining and network analysis. To this end, the keywords of these papers were collected, coded, and converted into 871 nodes and 2,625 links for network text analysis. First, network text analysis revealed that the top four keywords by co-occurrence frequency were, in order, burnout, turnover intention, job stress, and job satisfaction, and that the frequency and degree centrality of these four core keywords were all relatively high. Second, ego network analysis was conducted on the top four core keywords by degree centrality, and the keywords forming the connection centroid of each network were presented.
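
A keyword co-occurrence network and its degree centrality can be sketched as follows. The keyword lists are invented for illustration, and the centrality is the standard normalized form (number of neighbors divided by n - 1), which may not match the software settings used in the study.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_network(keyword_lists):
    """Count how often each unordered keyword pair appears in the same paper."""
    edges = Counter()
    for keywords in keyword_lists:
        for a, b in combinations(sorted(set(keywords)), 2):
            edges[(a, b)] += 1
    return edges


def degree_centrality(edges):
    """Normalized degree: distinct neighbors divided by (number of nodes - 1)."""
    neighbors = {}
    for a, b in edges:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)
    n = len(neighbors)
    if n < 2:
        return {node: 0.0 for node in neighbors}
    return {node: len(adj) / (n - 1) for node, adj in neighbors.items()}


# Hypothetical author keywords for three papers (not from the study's data).
papers = [
    ["emotional labor", "burnout", "turnover intention"],
    ["emotional labor", "burnout", "job stress"],
    ["emotional labor", "job satisfaction"],
]
edges = cooccurrence_network(papers)
centrality = degree_centrality(edges)
```

An ego network for a keyword would then be the subgraph made up of that keyword, its neighbors, and the links among them.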

Latent class model for mixed variables with applications to text data (혼합모드 잠재범주모형을 통한 텍스트 자료의 분석)

  • Shin, Hyun Soo;Seo, Byungtae
    • The Korean Journal of Applied Statistics / v.32 no.6 / pp.837-849 / 2019
  • Latent class models (LCM) are useful tools for extracting hidden information from categorical data. The model can also be interpreted as a mixture model with multinomial component distributions. In some cases, however, an available dataset may contain both categorical and count or continuous data. In such cases, the LCM can be extended to a mixture model with both multinomial and other component distributions, such as normal and Poisson distributions. In this paper, we consider an LCM for data containing categorical and count variables to analyze the Drug Review dataset, which contains categorical responses and text reviews. Through this analysis, we show that more specific hidden information can be obtained than from an LCM with categorical responses only.
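
One common way to write such an extended model is shown below for the categorical-plus-count case. The notation is introduced here for illustration and is not taken from the paper: $K$ latent classes with mixing proportions $\pi_k$, categorical variables $x_1, \dots, x_J$ where $x_j$ takes values in $\{1, \dots, C_j\}$, count variables $y_1, \dots, y_M$, and the usual assumption that the variables are independent within a class.

```latex
f(\mathbf{x}, \mathbf{y})
  = \sum_{k=1}^{K} \pi_k
    \left[ \prod_{j=1}^{J} \prod_{c=1}^{C_j} \theta_{kjc}^{\,I(x_j = c)} \right]
    \left[ \prod_{m=1}^{M} \frac{e^{-\lambda_{km}} \lambda_{km}^{\,y_m}}{y_m!} \right],
\qquad
\sum_{k=1}^{K} \pi_k = 1, \quad
\sum_{c=1}^{C_j} \theta_{kjc} = 1 .
```

Here $\theta_{kjc}$ is the probability that variable $x_j$ takes category $c$ in class $k$, $\lambda_{km}$ is the Poisson mean of count $y_m$ in class $k$, and $I(\cdot)$ is the indicator function. Dropping the Poisson factor recovers the ordinary LCM for categorical data.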