• Title/Summary/Keyword: Co-Word Analysis

Search results: 198

Building and Analyzing Panic Disorder Social Media Corpus for Automatic Deep Learning Classification Model (딥러닝 자동 분류 모델을 위한 공황장애 소셜미디어 코퍼스 구축 및 분석)

  • Lee, Soobin;Kim, Seongdeok;Lee, Juhee;Ko, Youngsoo;Song, Min
    • Journal of the Korean Society for information Management / v.38 no.2 / pp.153-172 / 2021
  • This study builds a deep learning based classification model to examine the characteristics of panic disorder and to classify documents that show a panic disorder tendency, using a panic disorder corpus constructed for the study. For this purpose, 5,884 documents collected from social media were manually annotated based on a diagnostic manual for mental disorders and classified into panic-disorder-prone and non-panic-disorder documents. TF-IDF scores were then calculated and word co-occurrence analysis was performed to analyze the lexical characteristics of the corpus. In addition, symptom frequency was measured and the co-occurrence of annotated symptoms was calculated to analyze the characteristics of panic disorder symptoms and the relationships between them. We also evaluated the performance of deep learning based classification models. Three pre-trained models, BERT multi-lingual, KoBERT, and KcBERT, were adopted for classification, and KcBERT showed the best performance among them. By examining the characteristics of panic disorder, this study shows that such analysis can support early diagnosis and treatment of people suffering from related symptoms and can extend mental illness research to social media.
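
The TF-IDF step mentioned in the abstract above can be illustrated with a minimal sketch. The abstract does not say which TF-IDF variant was used, so the weighting below (relative term frequency times log(N / document frequency)) is an assumption, and the toy documents and the `tf_idf` helper are hypothetical and are not data or code from the study.

```python
import math
from collections import Counter

# Hypothetical, already-tokenized toy documents (not from the study's corpus).
docs = [
    ["panic", "attack", "heart", "racing"],
    ["panic", "disorder", "breathing", "heart"],
    ["coffee", "morning", "heart"],
]

def tf_idf(documents):
    """Return one {term: score} dict per document.

    Assumed variant: (count / document length) * log(N / document frequency).
    """
    n_docs = len(documents)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for doc in documents for term in set(doc))
    scores = []
    for doc in documents:
        counts = Counter(doc)
        scores.append({
            term: (count / len(doc)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return scores

scores = tf_idf(docs)
# Terms that occur in every document (here "heart") get a score of zero,
# while terms unique to one document score highest within it.
```

Under this weighting, a word shared by all documents carries no discriminating weight, which is why TF-IDF is useful for surfacing vocabulary characteristic of one class of documents.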

An Investigation on Intellectual Structure of Social Sciences Research by Analysing the Publications of ICPSR Data Reuse (ICPSR 데이터 재이용 저작물 분석을 통한 사회과학 분야의 지적구조 분석)

  • Chung, EunKyung
    • Journal of the Korean Society for Library and Information Science / v.52 no.1 / pp.341-357 / 2018
  • Driven by the open science paradigm and advances in digital information technology, data sharing and reuse are actively practiced, and a wide variety of disciplines are now considered data-intensive. This study investigates the intellectual structure portrayed by research products that reuse data sets from ICPSR. A total of 570 research products published in 2017 were collected from the ICPSR site and analyzed in two ways. First, the authors and publications of those research products were analyzed to show trends in research using ICPSR data. Authors tend to be affiliated with universities or research institutes in the United States. The subject areas of the journals fall into Social Sciences, Health, and Psychology. Second, a network analysis with clustering was conducted using co-word occurrence in the titles of the research products. The results show 12 clusters of research areas: mental health; tobacco effect; disorder in school, childhood, and adolescence; sexual risk; child injuries; physical activity; violent behavior; survey; family role; women; problem behavior; and gender differences. The structure portrayed by ICPSR data reuse demonstrates that a substantial number of studies in Medicine have been conducted from a social science perspective.

The Trends of Artiodactyla Researches in Korea, China and Japan using Text-mining and Co-occurrence Analysis of Words (텍스트마이닝과 동시출현단어분석을 이용한 한국, 중국, 일본의 우제목 연구 동향 분석)

  • Lee, Byeong-Ju;Kim, Baek-Jun;Lee, Jae Min;Eo, Soo Hyung
    • Korean Journal of Environment and Ecology / v.33 no.1 / pp.9-15 / 2019
  • Artiodactyla, the even-toed ungulates, are distributed worldwide. In recent years, wild Artiodactyla species have attracted public attention because of the rapid increase in crop damage and road-kill caused by species such as water deer and wild boar, and the decline of some species such as long-tailed goral and musk deer. Despite this attention, there have been few studies on Artiodactyla in Korea, and none have analyzed research trends on Artiodactyla, which makes it difficult to understand the actual problems. Many recent trend studies have used text-mining and co-occurrence analysis to classify research subjects more objectively by extracting keywords from the literature and quantifying the relevance between words. In this study, we analyzed texts from research articles of three countries (Korea, China, and Japan) through text-mining and co-occurrence analysis and compared the research subjects in each country. We extracted 199 words from 665 articles related to Artiodactyla in the three countries through text-mining. Co-occurrence analysis of the extracted words produced three word clusters. We determined that cluster1 was related to "habitat condition and ecology", cluster2 to "disease", and cluster3 to "conservation genetics and molecular ecology". Comparing the rates of occurrence of each word cluster by country showed that they were relatively even in China and Japan, whereas in Korea cluster2, related to "disease", prevailed (69%). In the regression analysis of the number of words per year in each cluster, the number of words increased evenly across clusters in both China and Japan, while in Korea the rate of increase of cluster2 was five times that of the other clusters. The results indicate that Korean research on Artiodactyla has tended to focus on diseases more than research in China and Japan, and that few researchers have considered other subjects such as habitat characteristics, behavior, and molecular ecology. To control the damage caused by Artiodactyla and to establish a reasonable policy for the protection of endangered species, it is necessary to accumulate basic ecological data through more research on wild Artiodactyla.

Identifying Top K Persuaders Using Singular Value Decomposition

  • Min, Yun-Hong;Chung, Ye-Rim
    • Journal of Distribution Science / v.14 no.9 / pp.25-29 / 2016
  • Purpose - Finding the top K persuaders in a consumer network is an important problem in marketing. Recently, a new method of computing persuasion scores, interpreted as a fixed point or stable distribution for given persuasion probabilities, was proposed. Top K persuaders are chosen according to the computed scores. This research proposes a new definition of persuasion scores that relaxes some conditions on the matrix of probabilities, and a method to identify the top K persuaders based on the defined scores. Research design, data, and methodology - The top K persuaders are computed by singular value decomposition (SVD) of the matrix that represents persuasion probabilities between entities. Results - A test on a randomly generated instance shows that the proposed method is essentially different from the previous study that shares a similar idea. Conclusions - The proposed method is shown to be valid with respect to both theoretical analysis and empirical test. However, the method is limited to the category of persuasion scores that rely on the matrix form of persuasion probabilities. In addition, the strength of the method should be evaluated through additional experiments, e.g., using real instances, different benchmark methods, efficient numerical methods for SVD, and other decomposition methods such as NMF.
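
The abstract above does not state how persuasion scores are derived from the SVD, so the following is only an illustrative sketch under a loud assumption: each entity is scored by the magnitude of its entry in the leading left singular vector of the persuasion-probability matrix. The matrix `P`, the helper names, and the use of power iteration in place of a full SVD routine are all hypothetical choices made for this example and are not the authors' method.

```python
import math

def leading_left_singular_vector(P, iters=200):
    """Approximate the leading left singular vector of P by power iteration.

    The left singular vectors of P are the eigenvectors of P * P^T, so
    repeatedly applying u <- P (P^T u) and normalizing converges to the
    leading one when the top singular value is strictly largest.
    """
    n_rows, n_cols = len(P), len(P[0])
    u = [1.0] * n_rows
    for _ in range(iters):
        v = [sum(P[i][j] * u[i] for i in range(n_rows)) for j in range(n_cols)]  # P^T u
        w = [sum(P[i][j] * v[j] for j in range(n_cols)) for i in range(n_rows)]  # P v
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            return u  # zero matrix: no direction to converge to
        u = [x / norm for x in w]
    return u

def top_k_persuaders(P, k):
    """Rank entities by |entry| in the leading left singular vector (assumed score)."""
    u = leading_left_singular_vector(P)
    order = sorted(range(len(P)), key=lambda i: abs(u[i]), reverse=True)
    return order[:k]

# Hypothetical matrix: P[i][j] is the probability that entity i persuades entity j.
P = [
    [0.0, 0.8, 0.7, 0.6],
    [0.1, 0.0, 0.2, 0.1],
    [0.3, 0.4, 0.0, 0.5],
    [0.0, 0.1, 0.1, 0.0],
]
top2 = top_k_persuaders(P, 2)
```

In this toy matrix, entities 0 and 2 have the strongest outgoing persuasion probabilities, and they are the two that the assumed score ranks highest.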

Archeological Consideration of DNA Typing (유전자 분석의 고고학적 고찰)

  • Lee, Kyu-Sik;Seo, Min-Seok;Chung, Yong-Jae
    • Korean Journal of Heritage: History & Science / v.35 / pp.120-137 / 2002
  • The word 'DNA' has only recently become familiar to us. Advances in biology have brought so many benefits that the 21st century is called the 'age of biology'. Archeology, too, has only recently come into contact with biology. With the development of biology, DNA typing analysis has been carried out extensively since the 1990s. Mitochondrial DNA sequencing has shown that Neanderthals are not the ancestors of the human race and that ancient humans set out from Africa. Biological techniques such as the polymerase chain reaction (PCR) and electrophoresis made these results possible. Through these developments, biology, and genetics in particular, is coming into contact with archeology. If genetics is combined with an archeological foundation, we can learn not only about ancient populations on the Korean Peninsula but also about the origin of the human race. This field is called 'DNA Archeology'. As in forensic science, it helps with identifying individuals and determining parent-child relationships. Every effort should be made to realize the great possibilities of combining the two fields, and above all both fields need a change in perception.

Analysis of News Agenda Using Text mining and Semantic Network Analysis: Focused on COVID-19 Emotions (텍스트 마이닝과 의미 네트워크 분석을 활용한 뉴스 의제 분석: 코로나 19 관련 감정을 중심으로)

  • Yoo, So-yeon;Lim, Gyoo-gun
    • Journal of Intelligence and Information Systems / v.27 no.1 / pp.47-64 / 2021
  • The global spread of COVID-19 has affected not only many parts of daily life but also many areas of the economy and society. As the number of confirmed cases and deaths increases, medical staff and the public are said to be experiencing psychological problems such as anxiety, depression, and stress. The collective tragedy that accompanies the epidemic raises fear and anxiety, which is known to cause enormous disruption to the behavior and psychological well-being of many people. Long-term negative emotions can reduce people's immunity and destroy their physical balance, so it is essential to understand the psychological state associated with COVID-19. This study suggests a method of monitoring media news that reflects the current situation, in which the prolonged COVID-19 pandemic requires psychological as well as physical quarantine. It also shows how a simpler method of analyzing social media networks applies to such cases. The aim of this study is to assist health policymakers in fast and complex decision-making processes. News plays a major role in setting the policy agenda. Among the various major media, news headlines are considered important in communication science as a summary of the core content that the media wants to convey to its audience. The news data used in this study was collected using "Bigkinds", a service built on big data technology. With the collected news data, keywords were classified through text mining, and the relationships between keywords were visualized through semantic network analysis. Text mining was performed with the KrKwic program, a Korean semantic network analysis tool, and word frequencies were calculated to identify keywords. The frequency of words appearing in the keywords of articles related to COVID-19 emotions was checked and visualized as a word cloud. 'China', 'anxiety', 'situation', 'mind', 'social', and 'health' appeared frequently in relation to COVID-19 emotions. In addition, UCINET, a specialized social network analysis program, was used for connection centrality and cluster analysis, and the graph was visualized using Net Draw. The connection centrality analysis found that the most central keywords in the keyword-centric network were 'psychology', 'COVID-19', 'blue', and 'anxiety'. The network of co-occurrence frequencies among the keywords appearing in news headlines was visualized as a graph. The thickness of a line on the graph is proportional to the frequency of co-occurrence, so two words that frequently appear together are joined by a thick line. The 'COVID-blue' pair is drawn with the thickest line, and the 'COVID-emotion' and 'COVID-anxiety' pairs with relatively thick lines. 'Blue' in relation to COVID-19 means depression, and it was confirmed that COVID-19 and depression are keywords that deserve attention now. The research methodology used in this study makes it possible to measure social phenomena and changes quickly while reducing costs. By analyzing news headlines, we were able to identify people's feelings and perceptions on issues related to COVID-19 depression and to identify the main agendas to be analyzed by deriving important keywords. By presenting and visualizing the subjects and important keywords related to COVID-19 emotions at a glance, the study can offer medical policy managers a variety of perspectives when identifying and researching the phenomenon. The results are expected to serve as basic data for support, treatment, and service development for psychological quarantine issues related to COVID-19.
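
The co-occurrence network described in the abstract above can be sketched in a few lines. The study itself used KrKwic, UCINET, and Net Draw; the plain-Python version below is only an illustration. The toy headlines are hypothetical, and "connection centrality" is assumed here to mean normalized degree centrality (the share of other keywords a keyword is linked to).

```python
from collections import Counter
from itertools import combinations

# Hypothetical tokenized headlines (not data from the study).
headlines = [
    ["covid", "blue", "anxiety"],
    ["covid", "blue", "psychology"],
    ["covid", "anxiety"],
    ["covid", "blue"],
]

# Edge weight = number of headlines in which both keywords appear together.
edge_weights = Counter()
for words in headlines:
    for a, b in combinations(sorted(set(words)), 2):
        edge_weights[(a, b)] += 1

# Build each keyword's neighbour set from the weighted edges.
neighbours = {}
for a, b in edge_weights:
    neighbours.setdefault(a, set()).add(b)
    neighbours.setdefault(b, set()).add(a)

# Normalized degree centrality: linked keywords / (total keywords - 1).
n_nodes = len(neighbours)
degree_centrality = {w: len(nb) / (n_nodes - 1) for w, nb in neighbours.items()}

# The pair that would be drawn with the thickest line.
thickest_edge = max(edge_weights, key=edge_weights.get)
```

When such a graph is drawn, line width can be set proportional to `edge_weights`, which mirrors the abstract's description of frequently co-occurring pairs being joined by thick lines.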

Web Site Keyword Selection Method by Considering Semantic Similarity Based on Word2Vec (Word2Vec 기반의 의미적 유사도를 고려한 웹사이트 키워드 선택 기법)

  • Lee, Donghun;Kim, Kwanho
    • The Journal of Society for e-Business Studies / v.23 no.2 / pp.83-96 / 2018
  • Extracting keywords that represent documents is important because they can be used for automated services such as document search, classification, and recommendation, as well as for conveying document information quickly. However, keyword extraction based on the frequency of words in a web site's documents, or on graph algorithms that use word co-occurrence, has two problems: the web page structure may contain various words unrelated to the topic, and the limited performance of Korean tokenizers makes it difficult to extract semantic keywords. In this paper, we propose a method that selects candidate keywords based on semantic similarity, addressing both the failure to extract semantic keywords and the poor accuracy of Korean tokenizer analysis. A filtering process then removes inconsistent keywords to produce the final semantic keywords. Experiments on real web pages of small businesses show that the proposed method improves performance by 34.52% over a keyword selection technique based on statistical similarity. This confirms that keyword extraction from documents improves when semantic similarity between words is considered and inconsistent keywords are removed.

Analysis on Topics of Digital Preservation Researches and Courses (디지털 보존 관련 학술연구 및 교과 주제분석)

  • Jeong, Uiyeon;Choi, Sanghee
    • Journal of the Korean Society for Library and Information Science / v.53 no.3 / pp.25-43 / 2019
  • Interest in digital preservation and digital curation has grown recently with the rapid increase in digital resources. This study investigates the research topics and course topics related to digital preservation and digital curation. Course information was collected from the curricula of library and information science departments and archival science departments in leading countries such as the US, England, Ireland, Canada, and New Zealand. Title keyword profiling and network analysis were adopted to discover core research and education areas. The key topics in the abstracts of research papers and the contents of the courses were also illustrated by these methods. In the research analysis, archival systems are the largest area of research related to digital preservation and digital curation. The course analysis shows that digital curation education and process is an important area of education. The content analysis shows that plan and strategy is a notable research topic and that the record management process is a major course topic for digital preservation and digital curation. In addition, the format of digital resources is an important topic for both research and courses.

Knowledge Structure of Posttraumatic Growth Research: A Network Analysis (네트워크 분석을 통한 외상 후 성장 지식구조 연구)

  • Shin, JooYeon;Kwon, Sunyoung;Bae, Ka Ryeong
    • Journal of Industrial Convergence / v.20 no.10 / pp.61-69 / 2022
  • The literature on posttraumatic growth has been expanding rapidly in multiple academic disciplines. The purpose of this study is to examine the knowledge structure of posttraumatic growth using network analysis. Papers published between 1996 and 2018 were searched on the Web of Science, focusing on terms related to posttraumatic growth. A total of 1,659 keywords appeared 6,343 times in 1,780 papers; of these, 322 keywords (5,195 appearances) were selected for the final analysis. The network analysis and network visualization tools used were NodeXL and PFnet, respectively. The most frequent keyword was "Posttraumatic growth," followed by "Posttraumatic Stress Disease," "Cancer," and "Trauma." The 322 nodes were reduced to 175 nodes and divided into five groups: "Posttraumatic Growth in Cancer, Chronic/Serious Illness, and Disability," "Posttraumatic Growth-related Psychological Variables and Psychotherapy," "Posttraumatic Growth in the Context of Death," "Cognitive Mechanisms of Posttraumatic Growth," and "Vicarious Posttraumatic Growth." This study provides a systematic overview of the knowledge structure of posttraumatic growth through quantitative network analysis.

Analysis Study on Trends of Library Development Plan by Using Big Data Analysis (빅데이터 분석 기법을 활용한 도서관발전종합계획 동향 분석 연구)

  • Kim, Dongseok;Noh, Younghee
    • Journal of the Korean BIBLIA Society for library and Information Science / v.29 no.2 / pp.85-108 / 2018
  • This study analyzed media reports on the Comprehensive Library Advancement Plan using big data analysis in order to determine trends and implications by period. To do so, related data from 2009 to 2017 were collected from major domestic web portal sites. Words in the collected data were refined through text mining, and frequency, centrality, and structural equivalence analyses were performed. The results confirmed that, over the first and second phases of the Comprehensive Library Advancement Plan, the focus of library policy shifted from external growth to strengthening internal stability and advancing library operations, and that media coverage was limited to specific policies such as the expansion of library facilities. The findings will serve as useful material for understanding how the national library policy represented by the Comprehensive Library Advancement Plan is perceived.