• Title/Summary/Keyword: document analysis

Search Result 1,202, Processing Time 0.025 seconds

A Study of Development and Application of an Inland Water Body Training Dataset Using Sentinel-1 SAR Images in Korea (Sentinel-1 SAR 영상을 활용한 국내 내륙 수체 학습 데이터셋 구축 및 알고리즘 적용 연구)

  • Eu-Ru Lee;Hyung-Sup Jung
    • Korean Journal of Remote Sensing
    • /
    • v.39 no.6_1
    • /
    • pp.1371-1388
    • /
    • 2023
  • Floods are becoming more severe and frequent due to global warming-induced climate change. Water disasters are rising in Korea due to severe rainfall and wet seasons. This makes preventive climate change measures and efficient water catastrophe responses crucial, and synthetic aperture radar satellite imagery can help. This research created 1,423 water body learning datasets for individual water body regions along the Han and Nakdong waterways to reflect domestic water body properties discovered by Sentinel-1 satellite radar imagery. We created a document with exact data annotation criteria for many situations. After the dataset was processed, U-Net, a deep learning model, analyzed water body detection results. The results from applying the learned model to water body locations not involved in the learning process were studied to validate soil water body monitoring on a national scale. The analysis showed that the created water body area detected water bodies accurately (F1-Score: 0.987, Intersection over Union [IoU]: 0.955). Other domestic water body regions not used for training and evaluation showed similar accuracy (F1-Score: 0.941, IoU: 0.89). Both outcomes showed that the computer accurately spotted water bodies in most areas, however tiny streams and gloomy areas had problems. This work should improve water resource change and disaster damage surveillance. Future studies will likely include more water body attribute datasets. Such databases could help manage and monitor water bodies nationwide and shed light on misclassified regions.

A Study on the Current Status and Utilization of Old Map in Library and Museums in Korea (국내 도서관·박물관 소장 고지도의 현황 및 활용에 관한 연구)

  • Gi Young Kim
    • Journal of the Korean BIBLIA Society for library and Information Science
    • /
    • v.35 no.1
    • /
    • pp.97-125
    • /
    • 2024
  • The purpose of this study is to increase access to information on old maps and to discuss efficient ways to utilize old maps, such as providing services and information using old maps. To this end, the information search system of domestic institutions that provide old map information was investigated, and methods of searching for old map data and accessing information were searched on the website. In addition, the current status of the collection of old maps in domestic libraries and museums was analyzed by referring to the homepage, book, research book, and publication of each institution. As a result of the analysis, about 2,200 old maps were housed in 76 institutions, including national, public, and university libraries and museums nationwide. Each institution in the collection of old maps was carrying out publication business, such as publication of English manuscripts, exhibitions and books, publication of research document edits such as lists and summaries. However, reading and using of original documents are limited due to the rare nature of old maps and the data characteristics of the only one. In order to effectively utilize old maps, first, it is necessary to improve access to old map information services and expand academic information services. Second, it is proposed to use old maps as data for archival construction that reflects the identity of the region. Third, it is necessary to cultivate professional manpower who selects and provides information based on knowledge of old map data and humanities literacy.

A Study on the Goal-Orientation of QI Performers in the Medical Centers (의료기관 QI 담당자의 목표추구몰입에 관한 연구)

  • Kim, Mi-Sook;Park, Jae-Sung
    • The Korean Journal of Health Service Management
    • /
    • v.2 no.1
    • /
    • pp.105-124
    • /
    • 2008
  • The purpose of this research is to provide the data base for the activation of Quality Improvement operation through investigating the status of Quality Improvement operation, and finding out factors influencing on the goal-orientation of QI performers in the medical centers of more than one hundred beds where are practicing Quality Improvement operation. In order to reach the purpose, document study was carried out grounded on the proceeding researches and formulated statistical data in relation with the status of Quality Improvement performers, and proof study was carried out through questionnaire survey. The subjects of the survey were the Quality Improvement performers working in seventy three medical centers in Pusan-Gyeongnam, Daegu-Gyeongbuk, and Ulsan. Among eighty three Quality Improvement performers, fifty, five were questionnaire surveyed, on the result of which Reliability Analysis, Factor Analysis, and Multiple Regression Analysis were made, using statistical program. The the results of the proof analysis on this research are as follows. First, in the factors influencing the devoting to goal pursuit of QI performers, organization-goal contribution(0.44) had significant positive effects, while organization conflict(-0.25) had significant negative effects. In other words, the higher the organization-goal contribution was, the higher the devoting to goal pursuit was, while the less the organization conflict was, the higher the devoting to goal pursuit was, which was statistically significant.(p<0.05). Second, in the aspect of goal performance types of QI performers, the process-centered type showed high level of the devoting to goal pursuit, which was statistically significant.(p<0.05). Third, in the aspect of QI performance degree, the higher the devoting to goal pursuit was, the higher the QI performance degree was, which was statistically significant.(p<0.05). In addition, the performers who perceived their workplaces organic structure showed much higher QI performance degree, which statistically significant.(p<0.05). Generalizing the results of this research, it is possible to offer a few suggestions as follows. First, as the competition among the medical centers is more severe recently owing to medical center evaluation system, medical centers are practicing various Quality Improvement operation in all of medical services such as clinical performance and management performance, to reach the purpose of both cost-cutting and medical quality improvement. Thus in order to practice Quality Improvement operation more efficiently in medical centers, it is essential to nuke use of problem-solving methods and statistical members. This as the willingness of chief executives and positive attitude and recognition of organization members. This requires the installation of divisions in charge and disposition of persons in charge, not to speak of persistent training of Quality Improvement. Second, the divisions in charge of QI carry out Quality Improvement operation at the medical center level, and take the role of generalizing and adjusting QI performances of various departments. Owing to this role, the division in charge of QI is considered indispensable organization in the QI operation of medical centers along with medical QI committee, while it contributes to the government's goal of reducing quality level gaps among medical centers. Therefore it is necessary for government and QI organizations to give institutional support and resources for the sake of QI operation of medical centers, besides to supply systematic trainning and informations to the divisions and persons in charge of QI. Third, it is certain that disposition of persons in charge should be determined in view of the scale and the scope of QI operation in medical centers.

  • PDF

Discovering Promising Convergence Technologies Using Network Analysis of Maturity and Dependency of Technology (기술 성숙도 및 의존도의 네트워크 분석을 통한 유망 융합 기술 발굴 방법론)

  • Choi, Hochang;Kwahk, Kee-Young;Kim, Namgyu
    • Journal of Intelligence and Information Systems
    • /
    • v.24 no.1
    • /
    • pp.101-124
    • /
    • 2018
  • Recently, most of the technologies have been developed in various forms through the advancement of single technology or interaction with other technologies. Particularly, these technologies have the characteristic of the convergence caused by the interaction between two or more techniques. In addition, efforts in responding to technological changes by advance are continuously increasing through forecasting promising convergence technologies that will emerge in the near future. According to this phenomenon, many researchers are attempting to perform various analyses about forecasting promising convergence technologies. A convergence technology has characteristics of various technologies according to the principle of generation. Therefore, forecasting promising convergence technologies is much more difficult than forecasting general technologies with high growth potential. Nevertheless, some achievements have been confirmed in an attempt to forecasting promising technologies using big data analysis and social network analysis. Studies of convergence technology through data analysis are actively conducted with the theme of discovering new convergence technologies and analyzing their trends. According that, information about new convergence technologies is being provided more abundantly than in the past. However, existing methods in analyzing convergence technology have some limitations. Firstly, most studies deal with convergence technology analyze data through predefined technology classifications. The technologies appearing recently tend to have characteristics of convergence and thus consist of technologies from various fields. In other words, the new convergence technologies may not belong to the defined classification. Therefore, the existing method does not properly reflect the dynamic change of the convergence phenomenon. Secondly, in order to forecast the promising convergence technologies, most of the existing analysis method use the general purpose indicators in process. This method does not fully utilize the specificity of convergence phenomenon. The new convergence technology is highly dependent on the existing technology, which is the origin of that technology. Based on that, it can grow into the independent field or disappear rapidly, according to the change of the dependent technology. In the existing analysis, the potential growth of convergence technology is judged through the traditional indicators designed from the general purpose. However, these indicators do not reflect the principle of convergence. In other words, these indicators do not reflect the characteristics of convergence technology, which brings the meaning of new technologies emerge through two or more mature technologies and grown technologies affect the creation of another technology. Thirdly, previous studies do not provide objective methods for evaluating the accuracy of models in forecasting promising convergence technologies. In the studies of convergence technology, the subject of forecasting promising technologies was relatively insufficient due to the complexity of the field. Therefore, it is difficult to find a method to evaluate the accuracy of the model that forecasting promising convergence technologies. In order to activate the field of forecasting promising convergence technology, it is important to establish a method for objectively verifying and evaluating the accuracy of the model proposed by each study. To overcome these limitations, we propose a new method for analysis of convergence technologies. First of all, through topic modeling, we derive a new technology classification in terms of text content. It reflects the dynamic change of the actual technology market, not the existing fixed classification standard. In addition, we identify the influence relationships between technologies through the topic correspondence weights of each document, and structuralize them into a network. In addition, we devise a centrality indicator (PGC, potential growth centrality) to forecast the future growth of technology by utilizing the centrality information of each technology. It reflects the convergence characteristics of each technology, according to technology maturity and interdependence between technologies. Along with this, we propose a method to evaluate the accuracy of forecasting model by measuring the growth rate of promising technology. It is based on the variation of potential growth centrality by period. In this paper, we conduct experiments with 13,477 patent documents dealing with technical contents to evaluate the performance and practical applicability of the proposed method. As a result, it is confirmed that the forecast model based on a centrality indicator of the proposed method has a maximum forecast accuracy of about 2.88 times higher than the accuracy of the forecast model based on the currently used network indicators.

Consumer Responses to Retailer's Location-based Mobile Shopping Service : Focusing on PAD Emotional State Model and Information Relevance (유통업체의 위치기반 모바일 쇼핑서비스 제공에 대한 소비자 반응 : PAD 감정모델과 정보의 상황관련성을 중심으로)

  • Lee, Hyun-Hwa;Moon, Hee-Kang
    • Journal of Distribution Research
    • /
    • v.17 no.2
    • /
    • pp.63-92
    • /
    • 2012
  • This study investigated consumer intention to use a location-based mobile shopping service (LBMSS) that integrates cognitive and affective responses. Information relevancy was integrated into pleasure-arousal-dominance (PAD) emotional state model in the present study as a conceptual framework. The results of an online survey of 335 mobile phone users in the U.S. indicated the positive effects of arousal and information relevancy on pleasure. In addition, there was a significant relationship between pleasure and intention to use a LBMSS. However, the relationship between dominance and pleasure was not statistically significant. The results of the present study provides insight to retailers and marketers as to what factors they need to consider to implement location-based mobile shopping services to improve their business performance. Extended Abstract : Location aware technology has expanded the marketer's reach by reducing space and time between a consumer's receipt of advertising and purchase, offering real-time information and coupons to consumers in purchasing situations (Dickenger and Kleijnen, 2008; Malhotra and Malhotra, 2009). LBMSS increases the relevancy of SMS marketing by linking advertisements to a user's location (Bamba and Barnes, 2007; Malhotra and Malhotra, 2009). This study investigated consumer intention to use a location-based mobile shopping service (LBMSS) that integrates cognitive and affective response. The purpose of the study was to examine the relationship among information relevancy and affective variables and their effects on intention to use LBMSS. Thus, information relevancy was integrated into pleasure-arousal-dominance (PAD) model and generated the following hypotheses. Hypothesis 1. There will be a positive influence of arousal concerning LBMSS on pleasure in regard to LBMSS. Hypothesis 2. There will be a positive influence of dominance in LBMSS on pleasure in regard to LBMSS. Hypothesis 3. There will be a positive influence of information relevancy on pleasure in regard to LBMSS. Hypothesis 4. There will be a positive influence of pleasure about LBMSS on intention to use LBMSS. E-mail invitations were sent out to a randomly selected sample of three thousand consumers who are older than 18 years old and mobile phone owners, acquired from an independent marketing research company. An online survey technique was employed utilizing Dillman's (2000) online survey method and follow-ups. A total of 335 valid responses were used for the data analysis in the present study. Before the respondents answer any of the questions, they were told to read a document describing LBMSS. The document included definitions and examples of LBMSS provided by various service providers. After that, they were exposed to a scenario describing the participant as taking a saturday shopping trip to a mall and then receiving a short message from the mall. The short message included new product information and coupons for same day use at participating stores. They then completed a questionnaire containing various questions. To assess arousal, dominance, and pleasure, we adapted and modified scales used in the previous studies in the context of location-based mobile shopping service, each of the five items from Mehrabian and Russell (1974). A total of 15 items were measured on a seven-point bipolar scale. To measure information relevancy, four items were borrowed from Mason et al. (1995). Intention to use LBMSS was captured using two items developed by Blackwell, and Miniard (1995) and one items developed by the authors. Data analyses were conducted using SPSS 19.0 and LISREL 8.72. A total of usable 335 data were obtained after deleting the incomplete responses, which results in a response rate of 11.20%. A little over half of the respondents were male (53.9%) and approximately 60% of respondents were married (57.4%). The mean age of the sample was 29.44 years with a range from 19 to 60 years. In terms of the ethnicity there were European Americans (54.5%), Hispanic American (5.3%), African-American (3.6%), and Asian American (2.9%), respectively. The respondents were highly educated; close to 62.5% of participants in the study reported holding a college degree or its equivalent and 14.5% of the participants had graduate degree. The sample represents all income categories: less than $24,999 (10.8%), $25,000-$49,999 (28.34%), $50,000-$74,999 (13.8%), and $75,000 or more (10.23%). The respondents of the study indicated that they were employed in many occupations. Responses came from all 42 states in the U.S. To identify the dimensions of research constructs, Exploratory Factor Analysis (EFA) using a varimax rotation was conducted. As indicated in table 1, these dimensions: arousal, dominance, relevancy, pleasure, and intention to use, suggested by the EFA, explained 82.29% of the total variance with factor loadings ranged from .74 to .89. As a next step, CFA was conducted to validate the dimensions that were identified from the exploratory factor analysis and to further refine the scale. Table 1 exhibits the results of measurement model analysis and revealed a chi-square of 202.13 with degree-of-freedom of 89 (p =.002), GFI of .93, AGFI = .89, CFI of .99, NFI of .98, which indicates of the evidence of a good model fit to the data (Bagozzi and Yi, 1998; Hair et al., 1998). As table 1 shows, reliability was estimated with Cronbach's alpha and composite reliability (CR) for all multi-item scales. All the values met evidence of satisfactory reliability in multi-item measure for alpha (>.91) and CR (>.80). In addition, we tested the convergent validity of the measure using average variance extracted (AVE) by following recommendations from Fornell and Larcker (1981). The AVE values for the model constructs ranged from .74 through .85, which are higher than the threshold suggested by Fornell and Larcker (1981). To examine discriminant validity of the measure, we again followed the recommendations from Fornell and Larcker (1981). The shared variances between constructs were smaller than the AVE of the research constructs and confirm discriminant validity of the measure. The causal model testing was conducted using LISREL 8.72 with a maximum-likelihood estimation method. Table 2 shows the results of the hypotheses testing. The results for the conceptual model revealed good overall fit for the proposed model. Chi-square was 342.00 (df = 92, p =.000), NFI was .97, NNFI was .97, GFI was .89, AGFI was .83, and RMSEA was .08. All paths in the proposed model received significant statistical support except H2. The paths from arousal to pleasure (H1: ${\ss}$=.70; t = 11.44), from information relevancy to intention to use (H3 ${\ss}$ =.12; t = 2.36), from information relevancy to pleasure (H4 ${\ss}$ =.15; t = 2.86), and pleasure to intention to use (H5: ${\ss}$=.54; t = 9.05) were significant. However, the path from dominance to pleasure was not supported. This study investigated consumer intention to use a location-based mobile shopping service (LBMSS) that integrates cognitive and affective responses. Information relevancy was integrated into pleasure-arousal-dominance (PAD) emotional state model as a conceptual framework. The results of the present study support previous studies indicating that emotional responses as well as cognitive responses have a strong impact on accepting new technology. The findings of this study suggest potential marketing strategies to mobile service developers and retailers who are considering the implementation of LBMSS. It would be rewarding to develop location-based mobile services that integrate information relevancy and which cause positive emotional responses.

  • PDF

Development of Standard Process for Private Information Protection of Medical Imaging Issuance (개인정보 보호를 위한 의료영상 발급 표준 업무절차 개발연구)

  • Park, Bum-Jin;Yoo, Beong-Gyu;Lee, Jong-Seok;Jeong, Jae-Ho;Son, Gi-Gyeong;Kang, Hee-Doo
    • Journal of radiological science and technology
    • /
    • v.32 no.3
    • /
    • pp.335-341
    • /
    • 2009
  • Purpose : The medical imaging issuance is changed from conventional film method to Digital Compact Disk solution because of development on IT technology. However other medical record department's are undergoing identification check through and through whereas medical imaging department cannot afford to do that. So, we examine present applicant's recognition of private intelligence safeguard, and medical imaging issuance condition by CD & DVD medium toward various medical facility and then perform comparative analysis associated with domestic and foreign law & recommendation, lastly suggest standard for medical imaging issuance and process relate with internal environment. Materials and methods : First, we surveyed issuance process & required documents when situation of medical image issuance in the metropolitan medical facility by wire telephone between 2008.6.1$\sim$2008.7.1. in accordance with the medical law Article 21$\sim$clause 2, suggested standard through applicant's required documents occasionally - (1) in the event of oneself $\rightarrow$ verifying identification, (2) in the event of family $\rightarrow$ verifying applicant identification & family relations document (health insurance card, attested copy, and so on), (3) third person or representative $\rightarrow$ verifying applicant identification & letter of attorney & certificate of one's seal impression. Second, also checked required documents of applicant in accordance with upper standard when situation of medical image issuance in Kyung-hee university medical center during 3 month 2008.5.1$\sim$2008.7.31. Third, developed a work process by triangular position of issuance procedure for situation when verifying required documents & management of unpreparedness. Result : Look all over the our manufactured output in the hospital - satisfy the all conditions $\rightarrow$ 4 place(12%), possibly request everyone $\rightarrow$ 4 place(12%), and apply in the clinic section $\rightarrow$ 9 place(27%) that does not medical imaging issuance office, so we don't know about required documents condition. and look into whether meet or not the applicant's required documents on upper 3month survey - satisfy the all conditions $\rightarrow$ 629 case(49%), prepare a one part $\rightarrow$ 416 case(33%), insufficiency of all document $\rightarrow$ 226case(18%). On the authority of upper research result, we are establishing the service model mapping for objective reception when image export situation through triangular position of issuance procedure and reduce of friction with patient and promote the patient convenience. Conclusion : The PACS is classified under medical machinery that mean indicates about higher importance of medical information therefore medical information administrator's who already received professional education & mind, are performer about issuance process only and also have to provide under ID checking process exhaustively.

  • PDF

Directions of Implementing Documentation Strategies for Local Regions (지역 기록화를 위한 도큐멘테이션 전략의 적용)

  • Seol, Moon-Won
    • The Korean Journal of Archival Studies
    • /
    • no.26
    • /
    • pp.103-149
    • /
    • 2010
  • Documentation strategy has been experimented in various subject areas and local regions since late 1980's when it was proposed as archival appraisal and selection methods by archival communities in the United States. Though it was criticized to be too ideal, it needs to shed new light on the potentialities of the strategy for documenting local regions in digital environment. The purpose of this study is to analyse the implementation issues of documentation strategy and to suggest the directions for documenting local regions of Korea through the application of the strategy. The documentation strategy which was developed more than twenty years ago in mostly western countries gives us some implications for documenting local regions even in current digital environments. They are as follows; Firstly, documentation strategy can enhance the value of archivists as well as archives in local regions because archivist should be active shaper of history rather than passive receiver of archives according to the strategy. It can also be a solution for overcoming poor conditions of local archives management in Korea. Secondly, the strategy can encourage cooperation between collecting institutions including museums, libraries, archives, cultural centers, history institutions, etc. in each local region. In the networked environment the cooperation can be achieved more effectively than in traditional environment where the heavy workload of cooperative institutions is needed. Thirdly, the strategy can facilitate solidarity of various groups in local region. According to the analysis of the strategy projects, it is essential to collect their knowledge, passion, and enthusiasm of related groups to effectively implement the strategy. It can also provide a methodology for minor groups of society to document their memories. This study suggests the directions of documenting local regions in consideration of current archival infrastructure of Korean as follows; Firstly, very selective and intensive documentation should be pursued rather than comprehensive one for documenting local regions. Though it is a very political problem to decide what subject has priority for documentation, interests of local community members as well as professional groups should be considered in the decision-making process seriously. Secondly, it is effective to plan integrated representation of local history in the distributed custody of local archives. It would be desirable to implement archival gateway for integrated search and representation of local archives regardless of the location of archives. Thirdly, it is necessary to try digital documentation using Web 2.0 technologies. Documentation strategy as the methodology of selecting and acquiring archives can not avoid subjectivity and prejudices of appraiser completely. To mitigate the problems, open documentation system should be prepared for reflecting different interests of different groups. Fourth, it is desirable to apply a conspectus model used in cooperative collection management of libraries to document local regions digitally. Conspectus can show existing documentation strength and future documentation intensity for each participating institution. Using this, documentation level of each subject area can be set up cooperatively and effectively in the local regions.

Improvement of Certification Criteria based on Analysis of On-site Investigation of Good Agricultural Practices(GAP) for Ginseng (인삼 GAP 인증기준의 현장실천평가결과 분석에 따른 인증기준 개선방안)

  • Yoon, Deok-Hoon;Nam, Ki-Woong;Oh, Soh-Young;Kim, Ga-Bin
    • Journal of Food Hygiene and Safety
    • /
    • v.34 no.1
    • /
    • pp.40-51
    • /
    • 2019
  • Ginseng has a unique production system that is different from those used for other crops. It is subject to the Ginseng Industry Act., requires a long-term cultivation period of 4-6 years, involves complicated cultivation characteristics whereby ginseng is not produced in a single location, and many ginseng farmers engage in mixed-farming. Therefore, to bring the production of Ginseng in line with GAP standards, it is necessary to better understand the on-site practices of Ginseng farmers according to established control points, and to provide a proper action plan for improving efficiency. Among ginseng farmers in Korea who applied for GAP certification, 77.6% obtained it, which is lower than the 94.1% of farmers who obtained certification for other products. 13.7% of the applicants were judged to be unsuitable during document review due to their use of unregistered pesticides and soil heavy metals. Another 8.7% of applicants failed to obtain certification due to inadequate management results. This is a considerably higher rate of failure than the 5.3% incompatibility of document inspection and 0.6% incompatibility of on-site inspection, which suggests that it is relatively more difficult to obtain GAP certification for ginseng farming than for other crops. Ginseng farmers were given an average of 2.65 points out of 10 essential control points and a total 72 control points, which was slightly lower than the 2.81 points obtained for other crops. In particular, ginseng farmers were given an average of 1.96 points in the evaluation of compliance with the safe use standards for pesticides, which was much lower than the average of 2.95 points for other crops. Therefore, it is necessary to train ginseng farmers to comply with the safe use of pesticides. In the other essential control points, the ginseng farmers were rated at an average of 2.33 points, lower than the 2.58 points given for other crops. Several other areas of compliance in which the ginseng farmers also rated low in comparison to other crops were found. These inclued record keeping over 1 year, record of pesticide use, pesticide storages, posts harvest storage management, hand washing before and after work, hygiene related to work clothing, training of workers safety and hygiene, and written plan of hazard management. Also, among the total 72 control points, there are 12 control points (10 required, 2 recommended) that do not apply to ginseng. Therefore, it is considered inappropriate to conduct an effective evaluation of the ginseng production process based on the existing certification standards. In conclusion, differentiated certification standards are needed to expand GAP certification for ginseng farmers, and it is also necessary to develop programs that can be implemented in a more systematic and field-oriented manner to provide the farmers with proper GAP management education.

A study on the classification of research topics based on COVID-19 academic research using Topic modeling (토픽모델링을 활용한 COVID-19 학술 연구 기반 연구 주제 분류에 관한 연구)

  • Yoo, So-yeon;Lim, Gyoo-gun
    • Journal of Intelligence and Information Systems
    • /
    • v.28 no.1
    • /
    • pp.155-174
    • /
    • 2022
  • From January 2020 to October 2021, more than 500,000 academic studies related to COVID-19 (Coronavirus-2, a fatal respiratory syndrome) have been published. The rapid increase in the number of papers related to COVID-19 is putting time and technical constraints on healthcare professionals and policy makers to quickly find important research. Therefore, in this study, we propose a method of extracting useful information from text data of extensive literature using LDA and Word2vec algorithm. Papers related to keywords to be searched were extracted from papers related to COVID-19, and detailed topics were identified. The data used the CORD-19 data set on Kaggle, a free academic resource prepared by major research groups and the White House to respond to the COVID-19 pandemic, updated weekly. The research methods are divided into two main categories. First, 41,062 articles were collected through data filtering and pre-processing of the abstracts of 47,110 academic papers including full text. For this purpose, the number of publications related to COVID-19 by year was analyzed through exploratory data analysis using a Python program, and the top 10 journals under active research were identified. LDA and Word2vec algorithm were used to derive research topics related to COVID-19, and after analyzing related words, similarity was measured. Second, papers containing 'vaccine' and 'treatment' were extracted from among the topics derived from all papers, and a total of 4,555 papers related to 'vaccine' and 5,971 papers related to 'treatment' were extracted. did For each collected paper, detailed topics were analyzed using LDA and Word2vec algorithms, and a clustering method through PCA dimension reduction was applied to visualize groups of papers with similar themes using the t-SNE algorithm. A noteworthy point from the results of this study is that the topics that were not derived from the topics derived for all papers being researched in relation to COVID-19 (

    ) were the topic modeling results for each research topic (
    ) was found to be derived from For example, as a result of topic modeling for papers related to 'vaccine', a new topic titled Topic 05 'neutralizing antibodies' was extracted. A neutralizing antibody is an antibody that protects cells from infection when a virus enters the body, and is said to play an important role in the production of therapeutic agents and vaccine development. In addition, as a result of extracting topics from papers related to 'treatment', a new topic called Topic 05 'cytokine' was discovered. A cytokine storm is when the immune cells of our body do not defend against attacks, but attack normal cells. Hidden topics that could not be found for the entire thesis were classified according to keywords, and topic modeling was performed to find detailed topics. In this study, we proposed a method of extracting topics from a large amount of literature using the LDA algorithm and extracting similar words using the Skip-gram method that predicts the similar words as the central word among the Word2vec models. The combination of the LDA model and the Word2vec model tried to show better performance by identifying the relationship between the document and the LDA subject and the relationship between the Word2vec document. In addition, as a clustering method through PCA dimension reduction, a method for intuitively classifying documents by using the t-SNE technique to classify documents with similar themes and forming groups into a structured organization of documents was presented. In a situation where the efforts of many researchers to overcome COVID-19 cannot keep up with the rapid publication of academic papers related to COVID-19, it will reduce the precious time and effort of healthcare professionals and policy makers, and rapidly gain new insights. We hope to help you get It is also expected to be used as basic data for researchers to explore new research directions.

  • Analysis of media trends related to spent nuclear fuel treatment technology using text mining techniques (텍스트마이닝 기법을 활용한 사용후핵연료 건식처리기술 관련 언론 동향 분석)

    • Jeong, Ji-Song;Kim, Ho-Dong
      • Journal of Intelligence and Information Systems
      • /
      • v.27 no.2
      • /
      • pp.33-54
      • /
      • 2021
    • With the fourth industrial revolution and the arrival of the New Normal era due to Corona, the importance of Non-contact technologies such as artificial intelligence and big data research has been increasing. Convergent research is being conducted in earnest to keep up with these research trends, but not many studies have been conducted in the area of nuclear research using artificial intelligence and big data-related technologies such as natural language processing and text mining analysis. This study was conducted to confirm the applicability of data science analysis techniques to the field of nuclear research. Furthermore, the study of identifying trends in nuclear spent fuel recognition is critical in terms of being able to determine directions to nuclear industry policies and respond in advance to changes in industrial policies. For those reasons, this study conducted a media trend analysis of pyroprocessing, a spent nuclear fuel treatment technology. We objectively analyze changes in media perception of spent nuclear fuel dry treatment techniques by applying text mining analysis techniques. Text data specializing in Naver's web news articles, including the keywords "Pyroprocessing" and "Sodium Cooled Reactor," were collected through Python code to identify changes in perception over time. The analysis period was set from 2007 to 2020, when the first article was published, and detailed and multi-layered analysis of text data was carried out through analysis methods such as word cloud writing based on frequency analysis, TF-IDF and degree centrality calculation. Analysis of the frequency of the keyword showed that there was a change in media perception of spent nuclear fuel dry treatment technology in the mid-2010s, which was influenced by the Gyeongju earthquake in 2016 and the implementation of the new government's energy conversion policy in 2017. Therefore, trend analysis was conducted based on the corresponding time period, and word frequency analysis, TF-IDF, degree centrality values, and semantic network graphs were derived. Studies show that before the 2010s, media perception of spent nuclear fuel dry treatment technology was diplomatic and positive. However, over time, the frequency of keywords such as "safety", "reexamination", "disposal", and "disassembly" has increased, indicating that the sustainability of spent nuclear fuel dry treatment technology is being seriously considered. It was confirmed that social awareness also changed as spent nuclear fuel dry treatment technology, which was recognized as a political and diplomatic technology, became ambiguous due to changes in domestic policy. This means that domestic policy changes such as nuclear power policy have a greater impact on media perceptions than issues of "spent nuclear fuel processing technology" itself. This seems to be because nuclear policy is a socially more discussed and public-friendly topic than spent nuclear fuel. Therefore, in order to improve social awareness of spent nuclear fuel processing technology, it would be necessary to provide sufficient information about this, and linking it to nuclear policy issues would also be a good idea. In addition, the study highlighted the importance of social science research in nuclear power. It is necessary to apply the social sciences sector widely to the nuclear engineering sector, and considering national policy changes, we could confirm that the nuclear industry would be sustainable. However, this study has limitations that it has applied big data analysis methods only to detailed research areas such as "Pyroprocessing," a spent nuclear fuel dry processing technology. Furthermore, there was no clear basis for the cause of the change in social perception, and only news articles were analyzed to determine social perception. Considering future comments, it is expected that more reliable results will be produced and efficiently used in the field of nuclear policy research if a media trend analysis study on nuclear power is conducted. Recently, the development of uncontact-related technologies such as artificial intelligence and big data research is accelerating in the wake of the recent arrival of the New Normal era caused by corona. Convergence research is being conducted in earnest in various research fields to follow these research trends, but not many studies have been conducted in the nuclear field with artificial intelligence and big data-related technologies such as natural language processing and text mining analysis. The academic significance of this study is that it was possible to confirm the applicability of data science analysis technology in the field of nuclear research. Furthermore, due to the impact of current government energy policies such as nuclear power plant reductions, re-evaluation of spent fuel treatment technology research is undertaken, and key keyword analysis in the field can contribute to future research orientation. It is important to consider the views of others outside, not just the safety technology and engineering integrity of nuclear power, and further reconsider whether it is appropriate to discuss nuclear engineering technology internally. In addition, if multidisciplinary research on nuclear power is carried out, reasonable alternatives can be prepared to maintain the nuclear industry.


    (34141) Korea Institute of Science and Technology Information, 245, Daehak-ro, Yuseong-gu, Daejeon
    Copyright (C) KISTI. All Rights Reserved.