• Title/Summary/Keyword: 동시출현단어 분석

Search Result 112, Processing Time 0.025 seconds

Microplastics Intellectual Network Analysis based on Bigdata (빅데이터 기반한 미세플라스틱 지적네트워크 분석)

  • Kim, Younghee;Chang, Kwanjong
    • Journal of Convergence for Information Technology
    • /
    • v.12 no.4
    • /
    • pp.239-259
    • /
    • 2022
  • Since 2019, research on microplastics has been actively conducted around the world, so analyzing the differences between domestic and foreign microplastics research can be a milestone in establishing the direction of domestic research. In this study, microplastic papers from KCI and WoS were extracted and the differences between domestic and foreign studies were analyzed using a network analysis methodology based on big data such as author keyword co-occurrence word analysis, thesis co-citation analysis, and author co-citation analysis. As a result of the analysis, the analysis of the research topic confirmed that studies that could affect the human body and the treatment of microplastics in daily life were additionally needed in Korea. In the analysis of the depth of thesis citation that examines the quality of research, it was found that Korea was still insufficient at 2.25 overseas and 1.39 in Korea. In the analysis of the composition of the joint research front, where various researchers participate and share information, 3 out of 22 clusters in Korea are Star type. In the case of overseas, all 19 clusters have a mesh structure, so it was confirmed that information flow and sharing were insufficient in specific research fields in Korea. These research results confirmed the need to expand the research topic of microplastics, improve the quality of research, and improve the research promotion system in which various researchers participate. In addition, if the automation program is developed based on topic modeling, it will be possible to build a system capable of real-time analysis.

An Investigation on Scientific Data for Data Journal and Data Paper (Scientific Data 학술지 분석을 통한 데이터 논문 현황에 관한 연구)

  • Chung, EunKyung
    • Journal of the Korean Society for information Management
    • /
    • v.36 no.1
    • /
    • pp.117-135
    • /
    • 2019
  • Data journals and data papers have grown and considered an important scholarly practice in the paradigm of open science in the context of data sharing and data reuse. This study investigates a total of 713 data papers published in Scientific Data in terms of author, citation, and subject areas. The findings of the study show that the subject areas of core authors are found as the areas of Biotechnology and Physics. An average number of co-authors is 12 and the patterns of co-authorship are recognized as several closed sub-networks. In terms of citation status, the subject areas of cited publications are highly similar to the areas of data paper authors. However, the citation analysis indicates that there are considerable citations on the journals specialized on methodology. The network with authors' keywords identifies more detailed areas such as marine ecology, cancer, genome, database, and temperature. This result indicates that biology oriented-subjects are primary areas in the journal although Scientific Data is categorized in multidisciplinary science in Web of Science database.

Introducing Keyword Bibliographic Coupling Analysis (KBCA) for Identifying the Intellectual Structure (지적구조 규명을 위한 키워드서지결합분석 기법에 관한 연구)

  • Lee, Jae Yun;Chung, EunKyung
    • Journal of the Korean Society for information Management
    • /
    • v.39 no.1
    • /
    • pp.309-330
    • /
    • 2022
  • Intellectual structure analysis, which quantitatively identifies the structure, characteristics, and sub-domains of fields, has rapidly increased in recent years. Analysis techniques traditionally used to conduct intellectual structure analysis research include bibliographic coupling analysis, co-citation analysis, co-occurrence analysis, and author bibliographic coupling analysis. This study proposes a novel intellectual structure analysis method, Keyword Bibliographic Coupling Analysis (KBCA). The Keyword Bibliographic Coupling Analysis (KBCA) is a variation of the author bibliographic coupling analysis, which targets keywords instead of authors. It calculates the number of references shared by two keywords to the degree of coupling between the two keywords. A set of 1,366 articles in the field of 'Open Data' searched in the Web of Science were collected using the proposed KBCA technique. A total of 63 keywords that appeared more than 7 times, extracted from 1,366 article sets, were selected as core keywords in the open data field. The intellectual structure presented by the KBCA technique with 63 key keywords identified the main areas of open government and open science and 10 sub-areas. On the other hand, the intellectual structure network of co-occurrence word analysis was found to be insufficient in the overall structure and detailed domain structure. This result can be considered because the KBCA sufficiently measures the relationship between keywords using the degree of bibliographic coupling.

Examining Suicide Tendency Social Media Texts by Deep Learning and Topic Modeling Techniques (딥러닝 및 토픽모델링 기법을 활용한 소셜 미디어의 자살 경향 문헌 판별 및 분석)

  • Ko, Young Soo;Lee, Ju Hee;Song, Min
    • Journal of the Korean BIBLIA Society for library and Information Science
    • /
    • v.32 no.3
    • /
    • pp.247-264
    • /
    • 2021
  • This study aims to create a deep learning-based classification model to classify suicide tendency by suicide corpus constructed for the present study. Also, to analyze suicide factors, the study classified suicide tendency corpus into detailed topics by using topic modeling, an analysis technique that automatically extracts topics. For this purpose, 2,011 documents of the suicide-related corpus collected from social media naver knowledge iN were directly annotated into suicide-tendency documents or non-suicide-tendency documents based on suicide prevention education manual issued by the Central Suicide Prevention Center, and we also conducted the deep learning model(LSTM, BERT, ELECTRA) performance evaluation based on the classification model, using annotated corpus data. In addition, one of the topic modeling techniques, LDA identified suicide factors by classifying thematic literature, and co-word analysis and visualization were conducted to analyze the factors in-depth.

A Study on Analysis of Research Trends and Intellectual Structure of Cataloging Field (목록 분야 연구동향 및 지적구조 분석)

  • Lee, Ji Won
    • Journal of the Korean Society for information Management
    • /
    • v.36 no.4
    • /
    • pp.279-300
    • /
    • 2019
  • This study aims to analyze and to demonstrate the research trends and intellectual structure in the field of catalog in the 2000s and 2010s through co-word analysis. The field of catalog had firmly established its own research area and Many differences were found in research trends and intellectual structures in the 2000s and 2010s. First, the average number of articles decreased by 4.2 in the 2010s compared to the 2000s, but the number of author keywords was not significantly different. Only 22.2% of keywords appeared more than three times in both periods, and 77.8% of keywords appeared more than three times in one period. Second, in terms of intellectual structure, the 2000s, represented by three-level clusters, formed a more complex network than the 2010s, represented by two-level clusters. Third, as a result of examining the changes in the characteristics of each cluster, there were some research topics with few changes, but many research topics were more actively progressed or subdivided, and decreased. The results of this study are meaningful in that they can visually grasp the intellectual structure along with the trend of the age of catalogue, and can prepare for related education and research by predicting the future.

Exploration of Intellectual Structure of Artificial Intelligence Field Using Co-word Analysis (동시출현 단어 분석을 통한 지식 구조의 파악 : 인공지능 분야를 대상으로)

  • 이미경;정영미
    • Proceedings of the Korean Society for Information Management Conference
    • /
    • 2003.08a
    • /
    • pp.245-251
    • /
    • 2003
  • 이 연구에서는 통제된 색인어를 이용하여 파악한 지식 구조와 통제되지 않은 키워드를 이용한 지식 구조를 비교하여 두 구조가 어떤 차이점을 보이는지를 살펴보았다. 또한 색인효과가 어떻게 나타나는지, 비통제어를 사용한 경우가 실제적으로 더 상세한 하위 영역을 표현하는지를 확인하고자 하였다. 실험 결과 통제된 색인어인 주제명표목을 사용한 영역지도와 비통제 색인어인 키워드를 사용한 영역지도 둘 다 인공지능 분야의 주요 분야들을 비슷하게 나타냈지만, 주제명표목을 사용한 경우에 색인효과가 일부 나타났다. 그리고 대체적으로 주제명표목에 기반한 영역지도보다는 키워드에 기반한 영역지도가 더 상세하게 나타났다.

  • PDF

Time Series Analysis of Intellectual Structure and Research Trend Changes in the Field of Library and Information Science: 2003 to 2017 (문헌정보학 분야의 지적구조 및 연구 동향 변화에 대한 시계열 분석: 2003년부터 2017년까지)

  • Choi, Hyung Wook;Choi, Ye-Jin;Nam, So-Yeon
    • Journal of the Korean Society for information Management
    • /
    • v.35 no.2
    • /
    • pp.89-114
    • /
    • 2018
  • Research on changes in research trends in academic disciplines is a method that enables observation of not only the detailed research subject and structure of the field but also the state of change in the flow of time. Therefore, in this study, in order to observe the changes of research trend in library and information science field in Korea, co-word analysis was conducted with Korean author keywords from three types of journals which were listed in the Korea Citation Index(KCI) and have top citation impact factor were selected. For the time series analysis, the 15-year research period was accumulated in 5-years units, and divided into 2003~2007, 2003~2012, and 2003~2017. The keywords which limited to the frequency of appearance 10 or more, respectively, were analyzed and visualized. As a result of the analysis, during the period from 2003 to 2007, the intellectual structure composed with 25 keywords and 8 areas was confirmed, and during the period from 2003 to 2012, the structure composed by 3 areas 17 sub-areas with 76 keywords was confirmed. Also, the intellectual structure during the period from 2003 to 2017 was crowded into 6 areas 32 consisting of a total of 132 keywords. As a result of comprehensive period analysis, in the field of library and information science in Korea, over the past 15 years, new keywords have been added for each period, and detailed topics have also been subdivided and gradually segmented and expanded.

A Content Analysis of Journal Articles Using the Language Network Analysis Methods (언어 네트워크 분석 방법을 활용한 학술논문의 내용분석)

  • Lee, Soo-Sang
    • Journal of the Korean Society for information Management
    • /
    • v.31 no.4
    • /
    • pp.49-68
    • /
    • 2014
  • The purpose of this study is to perform content analysis of research articles using the language network analysis method in Korea and catch the basic point of the language network analysis method. Six analytical categories are used for content analysis: types of language text, methods of keyword selection, methods of forming co-occurrence relation, methods of constructing network, network analytic tools and indexes. From the results of content analysis, this study found out various features as follows. The major types of language text are research articles and interview texts. The keywords were selected from words which are extracted from text content. To form co-occurrence relation between keywords, there use the co-occurrence count. The constructed networks are multiple-type networks rather than single-type ones. The network analytic tools such as NetMiner, UCINET/NetDraw, NodeXL, Pajek are used. The major analytic indexes are including density, centralities, sub-networks, etc. These features can be used to form the basis of the language network analysis method.

Building and Analyzing Panic Disorder Social Media Corpus for Automatic Deep Learning Classification Model (딥러닝 자동 분류 모델을 위한 공황장애 소셜미디어 코퍼스 구축 및 분석)

  • Lee, Soobin;Kim, Seongdeok;Lee, Juhee;Ko, Youngsoo;Song, Min
    • Journal of the Korean Society for information Management
    • /
    • v.38 no.2
    • /
    • pp.153-172
    • /
    • 2021
  • This study is to create a deep learning based classification model to examine the characteristics of panic disorder and to classify the panic disorder tendency literature by the panic disorder corpus constructed for the present study. For this purpose, 5,884 documents of the panic disorder corpus collected from social media were directly annotated based on the mental disease diagnosis manual and were classified into panic disorder-prone and non-panic-disorder documents. Then, TF-IDF scores were calculated and word co-occurrence analysis was performed to analyze the lexical characteristics of the corpus. In addition, the co-occurrence between the symptom frequency measurement and the annotated symptom was calculated to analyze the characteristics of panic disorder symptoms and the relationship between symptoms. We also conducted the performance evaluation for a deep learning based classification model. Three pre-trained models, BERT multi-lingual, KoBERT, and KcBERT, were adopted for classification model, and KcBERT showed the best performance among them. This study demonstrated that it can help early diagnosis and treatment of people suffering from related symptoms by examining the characteristics of panic disorder and expand the field of mental illness research to social media.

Analyzing Self-Introduction Letter of Freshmen at Korea National College of Agricultural and Fisheries by Using Semantic Network Analysis : Based on TF-IDF Analysis (언어네트워크분석을 활용한 한국농수산대학 신입생 자기소개서 분석 - TF-IDF 분석을 기초로 -)

  • Joo, J.S.;Lee, S.Y.;Kim, J.S.;Kim, S.H.;Park, N.B.
    • Journal of Practical Agriculture & Fisheries Research
    • /
    • v.23 no.1
    • /
    • pp.89-104
    • /
    • 2021
  • Based on the TF-IDF weighted value that evaluates the importance of words that play a key role, the semantic network analysis(SNA) was conducted on the self-introduction letter of freshman at Korea National College of Agriculture and Fisheries(KNCAF) in 2020. The top three words calculated by TF-IDF weights were agriculture, mathematics, study (Q. 1), clubs, plants, friends (Q. 2), friends, clubs, opinions, (Q. 3), mushrooms, insects, and fathers (Q. 4). In the relationship between words, the words with high betweenness centrality are reason, high school, attending (Q. 1), garbage, high school, school (Q. 2), importance, misunderstanding, completion (Q.3), processing, feed, and farmhouse (Q. 4). The words with high degree centrality are high school, inquiry, grades (Q. 1), garbage, cleanup, class time (Q. 2), opinion, meetings, volunteer activities (Q.3), processing, space, and practice (Q. 4). The combination of words with high frequency of simultaneous appearances, that is, high correlation, appeared as 'certification - acquisition', 'problem - solution', 'science - life', and 'misunderstanding - concession'. In cluster analysis, the number of clusters obtained by the height of cluster dendrogram was 2(Q.1), 4(Q.2, 4) and 5(Q. 3). At this time, the cohesion in Cluster was high and the heterogeneity between Clusters was clearly shown.