• Title/Summary/Keyword: text frequency analysis (텍스트 빈도 분석)

Exploring Teaching Method for Productive Knowledge of Scientific Concept Words through Science Textbook Quantitative Analysis (과학교과서 텍스트의 계량적 분석을 이용한 과학 개념어의 생산적 지식 교육 방안 탐색)

  • Yun, Eunjeong
    • Journal of The Korean Association For Science Education
    • /
    • v.40 no.1
    • /
    • pp.41-50
    • /
    • 2020
  • From a linguistic perspective on the understanding of scientific concepts, it is very important for students to develop a deep and sophisticated understanding of the words used in scientific concepts, as well as the ability to use them correctly. Noting that the foundation for teaching productive knowledge of scientific words is not well established, this study intends to provide a basis for such teaching by exploring ways to teach the relationships among the words that constitute a scientific concept productively and effectively. To this end, we first extracted the relationships among the words that make up scientific concepts from the text of a science textbook using quantitative text analysis methods; second, we qualitatively examined the meaning of the word relationships extracted by each method; and third, we proposed writing activities to help improve productive knowledge of scientific concept words. We analyzed the text of the "Force and motion" unit of a first-grade science textbook using four methods of quantitative linguistic analysis: word cluster, co-occurrence, text network analysis, and word embedding. As a result, this study suggests four writing activities: a sentence-completion activity using the results of word cluster analysis, a fill-in-the-blanks activity using the results of co-occurrence analysis, material-oriented writing activities using the results of text network analysis, and a list of important words compiled from the results of word embedding.
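
As a rough illustration of the co-occurrence analysis mentioned in the abstract above, the sketch below counts how often two words appear in the same sentence. It is only a minimal example under stated assumptions: the function name `cooccurrence_counts`, the whitespace tokenization, and the toy sentences are hypothetical and are not taken from the study, which analyzed Korean textbook text and would need a proper morphological analyzer.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(sentences):
    """Count how often each unordered word pair appears in the same sentence."""
    counts = Counter()
    for sentence in sentences:
        # A sorted set counts each pair once per sentence, in a stable order.
        words = sorted(set(sentence.lower().split()))
        counts.update(combinations(words, 2))
    return counts


# Hypothetical toy sentences, not taken from the textbook analyzed in the study.
sentences = [
    "force changes motion",
    "force has direction",
    "motion has direction",
]
pairs = cooccurrence_counts(sentences)
```

Pairs with high counts are candidates for a fill-in-the-blanks item, since one word of the pair tends to predict the other.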

Analysis of Information Education Related Theses Using R Program (R을 활용한 정보교육관련 논문 분석)

  • Park, SunJu
    • Journal of The Korean Association of Information Education
    • /
    • v.21 no.1
    • /
    • pp.57-66
    • /
    • 2017
  • Academic interest in big data analysis and social networks has risen markedly in recent years. Various academic fields are taking part in this trend of social-network-based research; social networks are actively used as a research topic in the social sciences as well as in the natural sciences. Accordingly, this paper applies text analysis, followed by social network analysis, to master's theses and doctoral dissertations. The results indicate that certain words had a high frequency throughout the entire period, while the frequency of other words fluctuated from period to period. In detail, words with a high frequency had a higher betweenness centrality, and each period appears to have a distinctive research flow. It was therefore found that the subjects of master's theses and doctoral dissertations changed sensitively in response to the development of IT and to changes in the information curriculum of elementary, middle, and high schools. Research related to smart, mobile, smartphone, SNS, application, storytelling, multicultural, and STEAM, which increased in frequency in period 4, is predicted to continue. Moreover, the topics of robots, programming, coding, algorithms, creativity, interaction, and privacy will also be studied steadily.

Exploring 'Tradition' Terminology Trends based on Keyword Analysis (1920~2017) (키워드 분석 기반 '전통' 용어의 트렌드 분석 (1920~2017))

  • Kim, Min-Jeong;Kim, Chul Joo
    • The Journal of the Korea Contents Association
    • /
    • v.18 no.12
    • /
    • pp.421-431
    • /
    • 2018
  • The purpose of this study is to analyze trends in 'tradition' terminology in Korea. We focus on an empirical investigation of how media reports convey 'tradition' terminology in our society by applying text mining and social network analysis techniques. The analysis covered 2,481,143 news articles related to 'tradition' terminology that have appeared in the media since the 1920s. Frequency analysis, association analysis, and social network analysis were applied, decade by decade, to articles related to 'tradition' terminology from 1920 to 2017. By applying these data science techniques, we can grasp the meaning of sociocultural phenomena related to 'tradition' from an objective and value-neutral position and understand the social symbolism that the tradition of each era carries.

Frequency and Social Network Analysis of the Bible Data using Big Data Analytics Tools R (R을 이용한 성경 데이터의 빈도와 소셜 네트워크 분석)

  • Ban, ChaeHoon;Ha, JongSoo
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference
    • /
    • 2018.10a
    • /
    • pp.93-96
    • /
    • 2018
  • Big data analytics technology, which can store and analyze data and obtain new knowledge from it, has gained importance in many fields of society. Big data is emerging as an important issue in the field of information and communication technology, and interest in the technology continues to rise. R, a tool for analyzing big data, is a language and environment for statistics-based information analysis. In this paper, we use R to analyze Bible data: we investigate the frequency distribution of the text and analyze the Bible through social network analysis.

Correspondence Analysis of Reports and Persuasives based on a Newspaper Corpus (접속 부사의 사용에 따른 설득문과 보도문의 대응 분석)

  • Kim, Hye-Young;Kang, Beom-Mo
    • Annual Conference on Human and Language Technology
    • /
    • 2013.10a
    • /
    • pp.175-180
    • /
    • 2013
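  • This paper analyzes the use of conjunctive adverbs in editorials and news reports published in the Dong-A, Chosun, JoongAng, and Hankyoreh newspapers from 2000 to 2011. Specifically, it discusses conjunctive adverbs that function as markers revealing text structure. Taking as its target the high-frequency conjunctive adverbs that appeared over the 12 years, '그러나' (however), '하지만' (but), '그런데' (by the way), '그리고' (and), '따라서' (therefore), '그래서' (so), '그렇지만' (nevertheless), '그러면' (then), '그러므로' (hence), and '하물며' (much less), the study analyzed frequency changes in news reports and in editorials objectively, statistically, and diachronically through correspondence analysis and cluster analysis. The results show that, in enumerative structures, news reports prefer '그리고' while editorials prefer '하물며'; as markers of contrast, news reports prefer '하지만' while editorials prefer '그러나' and '그렇지만'. To signal a change of topic, news reports use '그러면' whereas editorials use '그런데'; and when presenting the outcome of an issue, news reports tend to use '그러므로' and '그래서' more, whereas editorials tend to use '따라서' more.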

Multi-Dimensional Keyword Search and Analysis of Hotel Review Data Using Multi-Dimensional Text Cubes (다차원 텍스트 큐브를 이용한 호텔 리뷰 데이터의 다차원 키워드 검색 및 분석)

  • Kim, Namsoo;Lee, Suan;Jo, Sunhwa;Kim, Jinho
    • Journal of Information Technology and Architecture
    • /
    • v.11 no.1
    • /
    • pp.63-73
    • /
    • 2014
  • With the advance of the WWW, unstructured data including text is attracting more and more user interest. Because such unstructured data created by WWW users expresses their subjective opinions, analyzing it appropriately can yield very useful information, such as users' personal tastes or perspectives. In this paper, we efficiently provide various analyses of unstructured text documents by taking advantage of OLAP (On-Line Analytical Processing) multidimensional cube technology. OLAP cubes have been widely used for multidimensional analysis of structured data, such as simple alphabetic and numeric data, but they have not been used for unstructured data consisting of long texts. To provide multidimensional analysis of unstructured text data, the Text Cube model was recently proposed. It incorporates term frequency and the inverted index, which play key roles in information retrieval, as measures for searching and analyzing text databases. The primary goal of this paper is to apply the text cube model to a real data set from an Internet site that shares hotel information and to provide multidimensional analysis of users' written hotel reviews. To achieve this goal, we first build text cubes for the hotel review data. Using the text cubes, we design and implement a system that provides multidimensional keyword search features for searching and analyzing review texts along various dimensions. This system can help users easily obtain valuable summary information reflecting guests' subjective opinions. Furthermore, this paper evaluates the proposed system through various experiments, which demonstrate its effectiveness.
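
The idea of a text cube whose cells carry term frequency and an inverted index as measures can be sketched roughly as follows. This is a simplified illustration, not the system described in the paper: the dimensions (city, rating), the names `build_text_cube` and `keyword_search`, and the sample reviews are all hypothetical assumptions.

```python
from collections import Counter, defaultdict

# Hypothetical reviews: (review_id, city, rating, text). Not data from the paper.
REVIEWS = [
    (1, "Seoul", 5, "clean room friendly staff"),
    (2, "Seoul", 2, "noisy room"),
    (3, "Busan", 5, "friendly staff great view"),
]


def build_text_cube(reviews):
    """Group reviews by (city, rating); each cell stores TF and an inverted index."""
    cube = defaultdict(lambda: {"tf": Counter(), "inverted": defaultdict(set)})
    for review_id, city, rating, text in reviews:
        cell = cube[(city, rating)]
        for term in text.split():
            cell["tf"][term] += 1                   # term frequency measure
            cell["inverted"][term].add(review_id)   # inverted index measure
    return cube


def keyword_search(cube, term, city=None):
    """Return ids of reviews containing `term`, optionally restricted to one city."""
    hits = set()
    for (cell_city, _rating), cell in cube.items():
        if city is None or cell_city == city:
            hits |= cell["inverted"].get(term, set())
    return hits


cube = build_text_cube(REVIEWS)
```

Calling `keyword_search(cube, "room", city="Seoul")` restricts the search to one slice of the cube, while omitting `city` corresponds to rolling up over that dimension.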

Analysis of the Research Trends by Environmental Spatial-Information Using Text-Mining Technology (텍스트 마이닝 기법을 활용한 환경공간정보 연구 동향 분석)

  • OH, Kwan-Young;LEE, Moung-Jin;PARK, Bo-Young;LEE, Jung-Ho;YOON, Jung-Ho
    • Journal of the Korean Association of Geographic Information Studies
    • /
    • v.20 no.1
    • /
    • pp.113-126
    • /
    • 2017
  • This study aimed to quantitatively analyze trends in environmental research that uses environmental geospatial information through text mining, one of the big data analysis technologies. The analysis was conducted on a total of 869 papers published in the Republic of Korea, which were collected from the National Digital Science Library (NDSL). On the basis of the classification scheme, the keywords extracted from the papers were recategorized into 10 environmental fields, including "general environment", "climate", and "air quality", and 20 environmental geospatial information fields, including "satellite image", "numerical map", and "disaster". Using the recategorized keywords, their frequency levels and time series changes in the collected papers were analyzed, as well as the association rules between keywords. First, the frequency analysis showed that "general environment" (40.85%) and "satellite image" (24.87%) had the highest frequency levels among environmental fields and environmental geospatial information fields, respectively. Second, the time series analysis of environmental fields showed that the share of "climate" was high between 1996 and 2000, but that of "general environment" has increased since 2001. Among environmental geospatial information fields, the demand for "satellite image" was highest throughout the period analyzed, and its utilization share has also gradually increased. Third, a total of 80 association rules were generated between environmental fields and environmental geospatial information fields. Among environmental fields, "general environment" generated the highest number of association rules (17) with environmental geospatial information fields such as "satellite image" and "digital map".
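
To make the notion of an association rule between keyword categories more concrete, the sketch below computes support, confidence, and lift for a single rule. These are the standard measures for association rules, but the abstract does not state which measures or thresholds the authors used, so treat this as an assumption; the function name `rule_metrics` and the four sample "papers" are hypothetical.

```python
def rule_metrics(transactions, antecedent, consequent):
    """Return (support, confidence, lift) for the rule antecedent -> consequent."""
    n = len(transactions)
    n_antecedent = sum(1 for t in transactions if antecedent in t)
    n_consequent = sum(1 for t in transactions if consequent in t)
    n_both = sum(1 for t in transactions if antecedent in t and consequent in t)

    support = n_both / n                                   # P(A and C)
    confidence = n_both / n_antecedent if n_antecedent else 0.0   # P(C | A)
    lift = confidence / (n_consequent / n) if n_consequent else 0.0  # P(C | A) / P(C)
    return support, confidence, lift


# Hypothetical keyword categories per paper; not data from the study.
papers = [
    {"general environment", "satellite image"},
    {"general environment", "digital map"},
    {"climate", "satellite image"},
    {"general environment", "satellite image"},
]
support, confidence, lift = rule_metrics(
    papers, "general environment", "satellite image"
)
```

A lift above 1 would suggest the two categories appear together more often than expected if they were independent; in this toy data the lift is below 1.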

The Fourth Industrial Revolution Core Technology Association Analysis Using Text Mining (텍스트 마이닝을 활용한 4차 산업혁명 핵심기술 연관분석)

  • Ryu, Jae-Han;You, Yen-Yoo
    • Journal of Digital Convergence
    • /
    • v.16 no.8
    • /
    • pp.129-136
    • /
    • 2018
  • This study analyzed the technology application fields and technology transfer types related to the Fourth Industrial Revolution using frequency, visualization, and association analysis, which are text mining techniques for big data. The analysis used the transfer technology database registered with the NTB of KIAT over the last three years (2015-2017). The results are as follows. First, many of the transfer technologies regarded as core technologies of the Fourth Industrial Revolution concern robots, 3D, autonomous driving, and wearables. Second, registrations of transfer technologies such as IoT, cloud, and VR are increasing year by year. Third, the association analysis of technology transfer types showed that IoT and VR favored technology trading and licensing, autonomous driving favored technology trading, wearables favored licensing, and robots favored technology cooperation, licensing, and technology trading.

Research Trends in Record Management Using Unstructured Text Data Analysis (비정형 텍스트 데이터 분석을 활용한 기록관리 분야 연구동향)

  • Deokyong Hong;Junseok Heo
    • Journal of Korean Society of Archives and Records Management
    • /
    • v.23 no.4
    • /
    • pp.73-89
    • /
    • 2023
  • This study aims to analyze the frequency of keywords used in Korean abstracts, which are unstructured text data in the domestic record management research field, using text mining techniques, and to identify domestic record management research trends through distance analysis between keywords. To this end, 1,157 keywords of 77,578 journals were visualized by extracting 1,157 articles from 7 journal types (28 types) searched by major category (complex study) and middle category (literature informatics) from the institutional statistics (registered site, candidate site) of the Korean Citation Index (KCI). The analysis used Word2vec together with t-Distributed Stochastic Neighbor Embedding (t-SNE) and Scattertext. As a result of the analysis, first, it was confirmed that keywords such as "record management" (889 times), "analysis" (888 times), "archive" (742 times), "record" (562 times), and "utilization" (449 times) were treated as significant topics by researchers. Second, Word2vec analysis generated vector representations of the keywords, and the similarity distances between them were investigated and visualized using t-SNE and Scattertext. In the visualization results, the research area of record management was divided into two groups, with keywords such as "archiving," "national record management," "standardization," "official documents," and "record management systems" occurring frequently in the first group (past). In contrast, keywords such as "community," "data," "record information service," "online," and "digital archives" in the second group (current) are attracting substantial attention.

Analysis of Seasonal Importance of Construction Hazards Using Text Mining (텍스트마이닝을 이용한 건설공사 위험요소의 계절별 중요도 분석)

  • Park, Kichang;Kim, Hyoungkwan
    • KSCE Journal of Civil and Environmental Engineering Research
    • /
    • v.41 no.3
    • /
    • pp.305-316
    • /
    • 2021
  • Construction accidents occur for a number of reasons; worker carelessness, non-adoption of safety equipment, and failure to comply with safety rules are some examples. Because much construction work is done outdoors, weather conditions can also be a factor in accidents. Past construction accident data are useful for accident prevention, but because such data are often in a text format consisting of natural language, extracting construction hazards from them can take a lot of time and entail extra cost. Therefore, in this study, we extracted construction hazards from 2,026 domestic construction accident reports using text mining and performed a seasonal analysis of construction hazards through frequency analysis and centrality analysis. Of the 254 construction hazards defined by Korea's Ministry of Land, Infrastructure, and Transport, we extracted 51 risk factors from the construction accident data. The results showed that a significant hazard was "Formwork" in spring and autumn, "Scaffold" in summer, and "Crane" in winter. The proposed method would enable construction safety managers to prepare better safety measures against outdoor construction accidents according to weather, season, and climate.