• Title/Summary/Keyword: Co-word Analysis

A Study on the Retrieval Effectiveness of KoreaMed using MeSH Search Filter and Word-Proximity Search (검색용 MeSH 필터와 단어인접탐색 기법을 활용한 KoreaMed 검색 효율성 향상 연구)

  • Jeong, So-Na;Jeong, Ji-Na
    • Journal of the Korea Academia-Industrial cooperation Society / v.18 no.5 / pp.596-607 / 2017
  • This study examined a method for adding terms related to "stomach neoplasms" as search filters to the Medical Subject Headings (MeSH), as well as a method for improving search efficiency through a word-proximity search that measures the distance between co-occurring terms. The titles of 8,625 articles published between 2007 and 2016 with the major topic term "stomach neoplasms" were downloaded from PubMed, and the vocabulary to be added to the search MeSH was analyzed. Search efficiency was verified with 277 articles in KoreaMed that had "Stomach Neoplasms" indexed as a MEDLINE MeSH term. As a result, 973 terms were selected as candidate vocabulary. "Gastric Cancer" (2,780 appearances) was the most frequent term, and 7,376 compound words (88.51%) combined histological terms for "stomach" and "neoplasm", such as "gastric adenocarcinoma" and "gastric MALT lymphoma". A total of 5,234 compound words (70.95%) had a co-occurrence distance of two words. The MEDLINE MeSH and the KoreaMed MeSH Indexer matched for 209 articles (75.5%). Search efficiency improved to 263 articles (94.9%) when the search filters were added, and to 268 articles (96.7%) when the 13 word-proximity search technique of the co-occurring terms was applied. This study showed that using a thesaurus to improve search efficiency in natural language searches can retain the advantages of a controlled vocabulary. Search accuracy can be improved by using a word-proximity search instead of a Boolean search.
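
The difference between a Boolean AND search and a word-proximity search can be sketched in a few lines of Python. This is a minimal illustration, not the procedure used in the study: the function names are hypothetical, and "distance" is assumed here to be the difference between token positions (adjacent words have distance 1), which may differ from the definition used by the authors.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def boolean_and_match(text, term_a, term_b):
    """Boolean AND: both terms appear anywhere in the text."""
    tokens = set(tokenize(text))
    return term_a in tokens and term_b in tokens


def proximity_match(text, term_a, term_b, max_distance):
    """Proximity search: the terms appear within max_distance token positions.

    Assumption: distance is the absolute difference of token indices.
    """
    tokens = tokenize(text)
    pos_a = [i for i, tok in enumerate(tokens) if tok == term_a]
    pos_b = [i for i, tok in enumerate(tokens) if tok == term_b]
    return any(abs(i - j) <= max_distance for i in pos_a for j in pos_b)


# Made-up titles for illustration only.
near = "Gastric MALT lymphoma in adults"
far = "Gastric surgery outcomes for patients with lymphoma"
print(boolean_and_match(far, "gastric", "lymphoma"))   # Boolean AND accepts the distant pair
print(proximity_match(near, "gastric", "lymphoma", 2))
print(proximity_match(far, "gastric", "lymphoma", 2))
```

The Boolean search accepts both titles, while the proximity search accepts only the one in which the two terms form a compound-like phrase, which is the intuition behind the reported gain in accuracy.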

A Bibliometric Approach for Department-Level Disciplinary Analysis and Science Mapping of Research Output Using Multiple Classification Schemes

  • Gautam, Pitambar
    • Journal of Contemporary Eastern Asia / v.18 no.1 / pp.7-29 / 2019
  • This study describes an approach for comparative bibliometric analysis of scientific publications related to (i) individual or several departments comprising a university, and (ii) broader integrated subject areas using multiple disciplinary schemes. It uses a custom dataset of scientific publications (ca. 15,000 articles and reviews, published during 2009-2013 and recorded in the Web of Science Core Collection) with author affiliations to the research departments of a comprehensive university that are dedicated to science, technology, engineering, mathematics, and medicine (STEMM). The dataset was first subjected to department-level and discipline-level analyses using the newly available KAKEN-L3 classification (based on the MEXT/JSPS Grants-in-Aid system), hierarchical clustering and correspondence analysis to decipher the major departmental and disciplinary clusters, and visualization of the department-discipline relationships using two-dimensional stacked bar diagrams. The next step involved the creation of subsets covering integrated subject areas and a comparative analysis of departmental contributions to a specific area (medical, health and life science) using several disciplinary schemes: Essential Science Indicators (ESI) 22 research fields, SCOPUS 27 subject areas, OECD Frascati 38 subordinate research fields, and KAKEN-L3 66 subject categories. To illustrate the effective use of science mapping techniques, the same subset for the medical, health and life science area was subjected to network analyses of keyword co-occurrences, bibliographic coupling of the publication sources, and co-citation of sources in the reference lists. The science mapping approach demonstrates ways to extract information on the prolific research themes, the journals most frequently used for publishing research findings, and the knowledge base underlying the research activities covered by the publications concerned.

A Study on the Methodology of Traceability Analysis and Visualization between Non-standardized documents (비정형화된 문서간 추적성 분석 및 그 가시화 방안 제시)

  • Kim, EunHee;Song, Duck Yong;Hwang, Jin Sang;Jung, Jea Cheon
    • Journal of the Korean Society of Systems Engineering / v.10 no.1 / pp.57-64 / 2014
  • We propose a methodology to automatically extract requirements from documents and check the traceability between them. The documents include not only text files but also PDF and image files. We also suggest a method to visualize the results with maps, numbers, and graphs. By comparing the results with those of expert reviews, we show that a knowledge-based method should be used in the future instead of the word-based method to improve reliability. The results are more valuable when applied to existing documents than to those of newly developed products.

Text Mining Driven Content Analysis of Ebola on News Media and Scientific Publications (텍스트 마이닝을 이용한 매체별 에볼라 주제 분석 - 바이오 분야 연구논문과 뉴스 텍스트 데이터를 이용하여 -)

  • An, Juyoung;Ahn, Kyubin;Song, Min
    • Journal of the Korean Society for Library and Information Science / v.50 no.2 / pp.289-307 / 2016
  • Infectious diseases such as Ebola virus disease become social issues and draw public attention, making them major topics in news and research. As a result, there have been many studies on infectious diseases using text-mining techniques. However, no research has compared the content of two media channels that have distinct characteristics. Accordingly, in this study, we conduct a topic analysis of news (representing a social perspective) and academic research papers (representing the perspectives of bio-professionals). As text-mining techniques, topic modeling is applied to extract the topics of each type of material, and a word co-occurrence map based on selected bio entities is used to compare the perspectives of the materials in detail. For network analysis, a topic map is built using Gephi. These approaches uncovered the differences in topics between the two types of material and the characteristics of each. In the word co-occurrence map, however, most entities are shared by both. These results indicate that there are both differences and commonalities between social and academic materials.
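
A word co-occurrence map of the kind mentioned above starts from pairwise co-occurrence counts. The sketch below is only an illustration under simple assumptions: the function name is hypothetical, entities are matched as lowercase substrings, and two entities co-occur when they appear in the same document. The resulting weighted edge list could then be loaded into a tool such as Gephi.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_edges(documents, entities):
    """Count how many documents mention each pair of entities together.

    Assumption: an entity is "mentioned" if its lowercase form is a
    substring of the lowercased document, which is a deliberately naive match.
    """
    edges = Counter()
    for doc in documents:
        text = doc.lower()
        present = sorted(e for e in entities if e in text)
        # Each unordered pair of entities in the document adds one to its edge weight.
        edges.update(combinations(present, 2))
    return edges


# Toy, made-up sentences for illustration only.
docs = [
    "Ebola virus outbreak in Guinea",
    "Ebola vaccine trial in Guinea",
    "Virus vaccine research",
]
entities = {"ebola", "guinea", "vaccine", "virus"}
for (a, b), weight in sorted(cooccurrence_edges(docs, entities).items()):
    print(a, b, weight)
```

Building one such edge list per corpus (news and research papers) and comparing the shared and unshared pairs is one simple way to contrast the two perspectives.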

WordNet-Based Category Utility Approach for Author Name Disambiguation (저자명 모호성 해결을 위한 개념망 기반 카테고리 유틸리티)

  • Kim, Je-Min;Park, Young-Tack
    • The KIPS Transactions:PartB / v.16B no.3 / pp.225-232 / 2009
  • Author name disambiguation is essential for improving the performance of document indexing, retrieval, and web search. It resolves the conflict that arises when multiple authors share the same name label. This paper introduces a novel approach that exploits ontologies and a WordNet-based category utility for author name disambiguation. Our method uses author knowledge in the form of a populated ontology with various types of properties: titles, abstracts, and co-authors of papers, and authors' affiliations. The author ontology was constructed semi-automatically for the artificial intelligence and semantic web areas using the OWL API and heuristics. Author name disambiguation determines the correct author among the candidate authors in the populated author ontology. Candidate authors are evaluated using the proposed WordNet-based category utility to resolve the ambiguity. Category utility is a tradeoff between intra-class similarity and inter-class dissimilarity of author instances, where author instances are described in terms of attribute-value pairs. The WordNet-based category utility exploits concept information in WordNet for semantic analysis during disambiguation. In experiments, the WordNet-based category utility increased the number of disambiguations by about 10% compared with the plain category utility, and increased the overall accuracy by around 98%.
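
The tradeoff between intra-class similarity and inter-class dissimilarity can be made concrete with the standard category utility over attribute-value pairs. The sketch below implements only that plain measure, not the WordNet-based variant proposed in the paper; the function name and the toy author instances are assumptions for illustration.

```python
from collections import Counter


def category_utility(clusters):
    """Standard category utility for a partition of instances.

    clusters: list of non-empty clusters, each a list of dicts that map
    attribute names to values.
    CU = (1/K) * sum_k P(C_k) * (sum_ij P(A_i=V_ij | C_k)^2 - sum_ij P(A_i=V_ij)^2)
    """
    instances = [inst for cluster in clusters for inst in cluster]
    n = len(instances)

    def sum_squared_probs(insts):
        # Sum over all attribute-value pairs of the squared relative frequency.
        counts = Counter((a, v) for inst in insts for a, v in inst.items())
        return sum((c / len(insts)) ** 2 for c in counts.values())

    baseline = sum_squared_probs(instances)
    total = sum(
        len(cluster) / n * (sum_squared_probs(cluster) - baseline)
        for cluster in clusters
    )
    return total / len(clusters)


# Made-up author instances: a clean split by research area versus a mixed split.
clean = [[{"area": "AI"}, {"area": "AI"}], [{"area": "Web"}, {"area": "Web"}]]
mixed = [[{"area": "AI"}, {"area": "Web"}], [{"area": "AI"}, {"area": "Web"}]]
print(category_utility(clean), category_utility(mixed))
```

A partition that groups instances with the same attribute values scores higher than one that mixes them, so a candidate author assignment can be preferred when it yields the higher utility.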

Maximum Likelihood-based Automatic Lexicon Generation for AI Assistant-based Interaction with Mobile Devices

  • Lee, Donghyun;Park, Jae-Hyun;Kim, Kwang-Ho;Park, Jeong-Sik;Kim, Ji-Hwan;Jang, Gil-Jin;Park, Unsang
    • KSII Transactions on Internet and Information Systems (TIIS) / v.11 no.9 / pp.4264-4279 / 2017
  • In this paper, maximum likelihood-based automatic lexicon generation using mixed syllables is proposed for an unlimited-vocabulary voice interface for East Asian languages (e.g., Korean, Chinese, and Japanese) in AI assistant-based interaction with mobile devices. The conventional lexicon has two inevitable problems: 1) tedious, repeated addition of out-of-lexicon units to the lexicon, and 2) the propagation of errors during morpheme analysis and space segmentation. The proposed method provides an automatic framework to solve these problems. It produces a level of overall accuracy similar to that of one of the previous methods when a sentence contains one out-of-lexicon word, but it provides superior results, with absolute improvements in word accuracy of 1.62%, 5.58%, and 10.09% when the number of out-of-lexicon words in a sentence is two, three, and four, respectively.

Issues on Articles Covering Outstanding Management of Apartment Complexes - Content Analysis of Newspaper Reports with Lexical Statistics - (우수 아파트단지 취재기사에서의 관리상의 논점 - 탐방기사를 이용한 언어통계학적 내용분석 -)

  • Choi Jung-Min;Kang Soon-Joo
    • Journal of the Korean housing association / v.17 no.4 / pp.131-143 / 2006
  • Nowadays, various mass media outlets identify and report on cases of outstanding apartment complex management to encourage vigorous competition among builders and active participation of residents in apartment management. This study used lexical statistics to analyze the management issues of outstanding apartment complexes reported by the mass media and to examine the characteristics of their exemplary management. The key issues of outstanding apartment management are summarized as: efficient management of convenience facilities for residents, community activities based on residents' participation, and maintenance of pleasant living environments through transparent management. In addition, the arrangement of relations among co-occurring words in a social network analysis included three key concepts of multi-family housing management (Maintenance Management, Operating Management, and Community Life Management), with emphasis on 'residents' and 'apartment complexes.' However, Operating Management was relatively deemphasized.

An Analysis of the Intellectual Structure of Venture-Creation Studies to build an Entrepreneurship Ontology (창업 온톨로지 구축을 위한 벤처창업 연구의 지식구조 분석)

  • Sim, Jae-Hu;Choi, Myeonggil
    • Knowledge Management Research / v.14 no.4 / pp.75-86 / 2013
  • Interest in and research on entrepreneurship, which is considered a potential way to address the continuing economic recession of the 21st century, have grown. However, the processes and methodologies of this research have not been systematically organized, and its results have rarely been applied to increasing the success rate of new businesses. This study adopted a corpus methodology to analyze the knowledge structure of entrepreneurship research and to derive the essential concepts and constituent domains of venture research. Based on the results of the analysis, this study constructs the knowledge structure of venture research in the form of a knowledge ontology. The results could serve as a foundation for entrepreneurship research and inform the construction of an entrepreneurship knowledge ontology.

An Analysis of the Current State of Marine Sports through the Analysis of Social Big Data: Use of the Social MaxtixTM Method (소셜 빅 데이터분석을 통한 해양스포츠 현황 분석 : 소셜매트릭스TM 기법의 활용)

  • PARK, Tae-Seung
    • Journal of Fisheries and Marine Sciences Education / v.29 no.2 / pp.593-606 / 2017
  • This study aims to provide preliminary data that can suggest an initial direction by understanding consumer awareness through an analysis of social big data on marine sports from social networking services (SNS). Windsurfing, yachting, jet skiing, scuba diving, and sea fishing were selected as research subjects. The frequency of mentions, associated words, and other indicators on SNS (Twitter and blogs) were analyzed for a period of one month, from January 22 through February 22, 2017, using the Social MatrixTM service of Daumsoft Co., Ltd. The results were as follows. First, the most frequently mentioned marine sport was yachting, with 3,273 mentions on Twitter and 2,199 on blogs. Second, the words most often associated with marine sports were attributes expressing the unique characteristics of marine sports, with 6,261 cases in total.

100 Article Paper Text Minning Data Analysis and Visualization in Web Environment (웹 환경에서 100 논문에 대한 텍스트 마이닝, 데이터 분석과 시각화)

  • Li, Xiaomeng;Li, Jiapei;Lee, HyunChang;Shin, SeongYoon
    • Proceedings of the Korean Institute of Information and Commucation Sciences Conference / 2017.10a / pp.157-158 / 2017
  • Big data from articles can be analyzed and text-mined using the Python language. Python is a programming language that is easy to use. We study and use Python to create a Web environment in which the results of the analysis are shown directly in the browser. This paper uses 100 articles from Altmetric, which tracks a range of sources. Big data must be collected and analyzed with an effective method. Once the results are obtained, Python wordcloud is used to produce an image that shows the most frequent words.
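
The word-frequency step that underlies a word cloud can be sketched with the Python standard library alone. This is a minimal illustration, not the authors' code: the function name, the small stopword list, and the sample strings are assumptions, and rendering the image itself would be left to a word cloud library.

```python
import re
from collections import Counter

# A tiny, illustrative stopword list; a real analysis would use a fuller one.
STOPWORDS = frozenset({"a", "an", "and", "in", "of", "the", "to"})


def top_words(texts, n=10):
    """Return the n most frequent non-stopword tokens across all texts."""
    counts = Counter()
    for text in texts:
        words = re.findall(r"[a-z]+", text.lower())
        counts.update(w for w in words if w not in STOPWORDS)
    return counts.most_common(n)


# Made-up snippets standing in for article text.
sample = ["Text mining", "Text mining of the text", "Data"]
print(top_words(sample, 2))
```

The resulting (word, count) pairs are the input a word cloud needs, with larger counts drawn in larger type.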