• Title/Abstract/Keywords: Text Databases

Search results: 195 items (processing time: 0.029 s)

Natural language processing techniques for bioinformatics

  • Tsujii, Jun-ichi
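    • Korean Society for Bioinformatics: Conference Proceedings
    • /
    • Korean Society for Bioinformatics and Systems Biology, Proceedings of the 2nd Annual Conference, 2003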
    • /
    • pp.3-3
    • /
    • 2003
  • With biomedical literature expanding so rapidly, there is an urgent need to discover and organize knowledge extracted from texts. Although factual databases contain crucial information, the overwhelming amount of new knowledge remains in textual form (e.g. MEDLINE). In addition, new terms are constantly coined to describe the relationships linking new genes, drugs, proteins, etc. As the biomedical literature grows, more systems are applying a variety of methods to automate the process of knowledge acquisition and management. In my talk, I focus on GENIA, a project of our group at the University of Tokyo, whose objective is to construct an information extraction system for protein-protein interactions from MEDLINE abstracts. The talk covers (1) techniques we use for named entity recognition: (1-a) SOHMM (Self-organized HMM), (1-b) Maximum Entropy Model, (1-c) Lexicon-based Recognizer; (2) treatment of term variants and acronym finders; (3) event extraction using a full parser; and (4) linguistic resources for text mining (GENIA corpus): (4-a) Semantic Tags, (4-b) Structural Annotations, (4-c) Co-reference tags, (4-d) GENIA ontology. I will also talk about a possible extension of our work that links the findings of molecular biology with clinical findings, and claim that text-based or concept-based biology would be a viable alternative to systems biology, which tends to emphasize the role of simulation models in bioinformatics.
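As a rough illustration of what a lexicon-based recognizer (item 1-c) can look like, the sketch below tags token spans by greedy longest-match lookup in a dictionary. It is not the GENIA implementation: the function name `tag_entities`, the toy `LEXICON`, and its labels are all hypothetical and chosen only for this example.

```python
def tag_entities(tokens, lexicon):
    """Greedy longest-match lookup of token spans in a lexicon.

    tokens  : list of str
    lexicon : dict mapping a tuple of lowercase tokens to an entity label
    Returns a list of (start, end, label) tuples, with end exclusive.
    """
    max_len = max((len(key) for key in lexicon), default=0)
    spans = []
    i = 0
    while i < len(tokens):
        match = None
        # Try the longest candidate span first so multi-token terms win.
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            key = tuple(t.lower() for t in tokens[i:i + n])
            if key in lexicon:
                match = (i, i + n, lexicon[key])
                break
        if match:
            spans.append(match)
            i = match[1]  # continue after the matched span
        else:
            i += 1
    return spans


# Hypothetical toy lexicon; a real one would be built from curated term lists.
LEXICON = {
    ("il-2",): "protein",
    ("nf-kappa", "b"): "protein",
    ("t", "cells"): "cell_type",
}

tokens = "IL-2 activates NF-kappa B in T cells".split()
print(tag_entities(tokens, LEXICON))
```

Exact dictionary lookup like this misses spelling variants and acronyms, which is presumably why the talk lists the treatment of term variants and acronym finders as a separate topic.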

StrokeMed: an integrated literature database for stroke and the differentiation of stroke syndrome

  • Kim, Young-Uk;Kim, Jin-Ho;Park, Young-Kyu;Kim, Young-Joo
    • Interdisciplinary Bio Central
    • /
    • Vol. 2, No. 2
    • /
    • pp.2.1-2.4
    • /
    • 2010
  • Complex diseases, such as stroke and cancer, have two or more genetic influences and are affected by environmental factors, which complicate them. Due to the complex characteristics of these diseases, we must search and study comprehensive literature-based article resources. Some disease-related literature databases have been developed through specialized journal issues or major websites. Most of them, however, are scattered throughout a website, and users encounter difficulties in finding accurate and comprehensive information easily and quickly. We developed StrokeMed, an integrated literature database for stroke and the differentiation of stroke syndrome. The system allows users to explore PubMed search results, categorized by MeSH (Medical Subject Headings), and the differentiation of stroke syndrome in Oriental medicine. StrokeMed collects data from important sites, such as PubMed, Scirus, and Scopus, automatically to maintain higher-quality and updated content. Currently, the system indexes more than 20,000 PubMed abstracts that are related to stroke, stroke etiology, and Oriental medicine. The system provides valuable literature information to the scientific and medical fields in stroke.

A Study on the Open Access Model for Scholarly Communication

  • 정경희
    • Journal of the Korean Society for Information Management
    • /
    • Vol. 19, No. 4
    • /
    • pp.384-399
    • /
    • 2002
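  • The central problem of journal-based scholarly communication can be described as a copyright problem. When commercial organizations take exclusive transfer of the copyright in articles and build full-text databases, rising prices restrict use and preservation becomes a problem. As a way to resolve these problems, this study presents a conceptual model of information-sharing scholarly communication. In this model, authors retain copyright but adopt a sharing license under which the work may be used freely for scholarly, non-commercial purposes. Libraries can therefore build full-text databases of scholarly articles and make them freely available, and the problem of preserving scholarly information can also be resolved.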

A Study on the Reclassification of Author Keywords for Automatic Assignment of Descriptors

  • 김판준;이재윤
    • Journal of the Korean Society for Information Management
    • /
    • Vol. 29, No. 2
    • /
    • pp.225-246
    • /
    • 2012
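  • This study explored the possibility of automatically assigning descriptors (controlled keywords) by reclassifying the author keywords (uncontrolled keywords) provided in the search services of major Korean scholarly databases. First, experiments comparing the characteristics of the main machine-learning-based classifiers were conducted to select the optimal classifier and parameters for reclassification. Next, articles from Korean journals in the field of reading were reclassified according to the results of learning the author keywords assigned to them, and additional keywords were assigned in this way. We also examined whether the documents newly added by this reclassification showed the same vocabulary-control effect as descriptors, that is, gathering articles on the same topic. The results confirmed that reclassifying author keywords can achieve the effect of automatically assigning descriptors.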

IMPLEMENTATION OF SUBSEQUENCE MAPPING METHOD FOR SEQUENTIAL PATTERN MINING

  • Trang, Nguyen Thu;Lee, Bum-Ju;Lee, Heon-Gyu;Ryu, Keun-Ho
    • Korean Society of Remote Sensing: Conference Proceedings
    • /
    • Korean Society of Remote Sensing, 2006, Proceedings of ISRS 2006 PORSEC Volume II
    • /
    • pp.627-630
    • /
    • 2006
  • Sequential pattern mining addresses the problem of discovering the maximal frequent sequences in a given database. In daily and scientific life, sequential data are everywhere, in forms such as text, weather data, satellite data streams, business transactions, telecommunications records, experimental runs, DNA sequences, and medical record histories. Discovering sequential patterns can help users and scientists predict upcoming activities, interpret recurring phenomena, or extract similarities. To that end, the core of sequential pattern mining is finding the sequences that occur frequently across the data sequences. Besides discovering frequent itemsets, sequential pattern mining requires arranging those itemsets into sequences and determining which of those sequences are frequent. So before mining sequences, the main task is checking whether one sequence is a subsequence of another sequence in the database. In this paper, we implement the subsequence matching method as the preprocessing step for sequential pattern mining. The sequences matched in our implementation are normalized sequences in the form of number chains. The result of this method is a report of the matching information between the input mapped sequences.
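The subsequence check described in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the helper names (`normalize`, `is_subsequence`, `support`) and the toy data are hypothetical, and mapping each item to an integer id is only one possible reading of "normalized sequences in the form of number chains".

```python
def normalize(sequences):
    """Map every distinct item to an integer id (assumed normalization step).

    sequences : list of sequences, each a list of itemsets (iterables of items)
    Returns (mapped_sequences, ids), where each itemset becomes a frozenset of ints.
    """
    ids = {}
    mapped = []
    for seq in sequences:
        mapped_seq = []
        for itemset in seq:
            # sorted() keeps id assignment deterministic within an itemset
            mapped_seq.append(
                frozenset(ids.setdefault(item, len(ids) + 1) for item in sorted(itemset))
            )
        mapped.append(mapped_seq)
    return mapped, ids


def is_subsequence(candidate, sequence):
    """True if every itemset of candidate is contained, in order, in a distinct itemset of sequence."""
    pos = 0
    for itemset in candidate:
        # Advance to the earliest remaining itemset that contains this one.
        while pos < len(sequence) and not itemset <= sequence[pos]:
            pos += 1
        if pos == len(sequence):
            return False
        pos += 1
    return True


def support(candidate, database):
    """Fraction of data sequences that contain candidate as a subsequence."""
    return sum(is_subsequence(candidate, seq) for seq in database) / len(database)


# Toy database, invented for illustration.
raw = [
    [("a",), ("b", "c"), ("d",)],
    [("a",), ("c",)],
    [("b",), ("a",), ("d",)],
]
db, ids = normalize(raw)
pattern = [frozenset({ids["a"]}), frozenset({ids["d"]})]  # the sequence <a, d>
print(ids)
print(support(pattern, db))
```

A mining algorithm would call `support` for each candidate sequence and keep those above a minimum support threshold; the greedy earliest-match scan is sufficient for deciding containment.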

The Future Past of Humanities Research: Musing Methodology in the Digital Convergence Era

  • Kim, Jiyun
    • International journal of advanced smart convergence
    • /
    • Vol. 9, No. 3
    • /
    • pp.161-168
    • /
    • 2020
  • Over the last half-century, computer science has radically changed the landscape of humanities research. This digital shift in research methodology extends from the brainstorming process to preserving, constructing, collecting, visualizing, and even analyzing materials. This transformation has given birth to a new field of study: Digital Humanities (DH). DH has undeniably eliminated many physical chores and provided a new angle for interpreting texts, and so has risen meteorically as a promising future for the humanities. Given such innovation, electronic circuitry with ever-improving interfaces can seem poised to replace the imagination that detects relationships and significance in research data. However, despite the technological development to date, the thousands-year-old essence of the traditional liberal arts, human creativity, remains the heart of humanities research and always will. This paper starts by demonstrating this proposition through a comparison of old and new liberal arts research methods, focusing on literary studies. It also thoroughly investigates how digitalized bibliographies, search engines, databases, and digital projects provide the most useful data preservation and a virtual experience of browsing in the library, along with their limitations due to the intrinsic quality of humanities research data. Finally, it probes the differences between traditional and digital data analysis in current methods of literary studies, ultimately presenting the ideal direction for humanities development in the era of digital convergence.

An Integrative Review on Family-Centered Rounds for Hospitalized Children Caring

  • 임미해;오진아
    • Child Health Nursing Research
    • /
    • Vol. 22, No. 2
    • /
    • pp.107-116
    • /
    • 2016
  • Purpose: Involving families in rounds is one strategy for implementing patient- and family-centered care; it helps families get clear information about their child and be actively involved in decision making. The purpose of this paper was to identify the major concepts of family-centered rounds for hospitalized children. Methods: We searched five electronic databases for relevant articles and used Whittemore and Knafl's integrative review methods to synthesize the literature. Articles published between June 2003 and January 2016 were reviewed, and through full-text screening 24 peer-reviewed articles were found that met the selection criteria for this review. Results: Through in-depth discussion and investigation of the relevant literature, four overarching components emerged: (a) cognition of parents and medical staff, (b) effective communication, (c) collaboration of family and medical staff, and (d) coaching of medical staff. Conclusion: For successful family-centered rounds, positive cognition is important. Appropriate communication skills and consideration of multicultural families can lead to effective communication. Offering consistent and transparent information is important for collaboration between family and medical staff. Prior education on family-centered rounds is also important. Four major components have been identified as basic standards for implementing family-centered rounds for hospitalized children.

Review of Cognitive Behavioral Therapy as an Intervention Program for Internet Addicts: A Theoretical Framework and Implications with Physiological Perspectives

  • 김나현;홍승희
    • Journal of Korean Biological Nursing Science
    • /
    • Vol. 17, No. 3
    • /
    • pp.219-227
    • /
    • 2015
  • Purpose: This study was conducted to review physiological mechanisms of internet addiction and to construct a theoretical framework for cognitive behavioral therapy for internet addicts. Methods: We searched for relevant literature in the PubMed and RISS databases using the terms "internet addiction", "internet game addiction", "internet abuser", and "online game". Only English, full-text articles published from 2000 to 2015 were included in this review of physiological indicators of internet addiction. Finally, 12 articles were selected for review. Results: The theoretical framework developed based on the review proposes that excessive internet use itself may induce physiological stress responses with an increase of stress-related hormones and neurotransmitters. Prolonged abnormal responses of these physiological features produce negative structural and functional changes in the prefrontal cortex, which is mainly involved in cognitive and executive functions. These changes may result in decreased cognitive function. As a stressor, excessive internet use leads to transforming voluntary use into involuntary, habitual use and thus promotes the development of internet addiction. Conclusion: The proposed theoretical framework encompasses cognitive processes that may contribute to the effects of internet use-induced physiological stress on internet addiction. We believe that this framework has important implications for developing cognitive behavioral strategies for internet addicts.

An Efficient Keyword Search Method on RDF Data

  • 김진하;송인철;김명호
    • Journal of KIISE: Databases
    • /
    • Vol. 35, No. 6
    • /
    • pp.495-504
    • /
    • 2008
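  • Recently, research has actively sought to support search not only over documents and web pages but also over structured data such as relational data, XML data, and RDF data. This paper proposes an efficient search method for RDF data. To improve search performance by reducing the size of the RDF data, and to return related information together in the search results, the proposed method first groups related nodes and edges in the RDF data to generate a new RDF graph. In addition, when assigning keyword relevance scores to the nodes and edges of the RDF data graph in order to rank the search results, it exploits the characteristics of RDF ontology data and thereby returns results that better match the user's intent. Performance comparisons on real RDF data show that the proposed method reduces the size of the RDF data by up to a factor of two and is up to five times faster than existing methods.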

How to Promote the Korean Journal of Child Studies to an International Journal

  • 허선
    • Korean Journal of Child Studies
    • /
    • Vol. 37, No. 1
    • /
    • pp.7-16
    • /
    • 2016
  • Objective: This study aimed to propose a strategy for promoting the Korean Journal of Child Studies to an international journal, based on the style and format of scholarly journals and on journal metrics. Methods: Both the print and online versions of the journal were reviewed from the perspective of style and format. Total citations and the impact factor were manually calculated from the Web of Science Core Collection. Results: More professional manuscript editing is required to maintain consistency of style and format. The verso page and back matter should be improved to an international level. The journal homepage should be rebuilt by adopting digital standards for journals, including Journal Article Tag Suite, CrossMark, FundRef, ORCID, and text and data mining. To become an international journal, conversion to an English-language journal and deposit in PubMed Central are mandatory. Conclusion: Since the performance of the editor and society members is top-notch, it will be possible to promote the journal to an international level soon. The society should guarantee the editor a sufficiently long term and support her with full funding and complete consent.
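The abstract does not say how the impact factor was calculated by hand. One plausible reading is the standard two-year definition: citations received in a given year to items published in the two preceding years, divided by the number of citable items published in those two years. The sketch below assumes that definition; the function name and all counts are invented for illustration and are not data from the journal.

```python
def two_year_impact_factor(citations, citable_items, year):
    """Two-year impact factor for `year` (standard definition, assumed here).

    citations     : dict {publication_year: citations received in `year`}
    citable_items : dict {publication_year: number of citable items published}
    """
    window = (year - 1, year - 2)
    cited = sum(citations.get(y, 0) for y in window)
    items = sum(citable_items.get(y, 0) for y in window)
    if items == 0:
        raise ValueError("no citable items in the two preceding years")
    return cited / items


# Invented counts for illustration only.
citations_2015 = {2014: 12, 2013: 18}
items_published = {2014: 60, 2013: 60}
print(two_year_impact_factor(citations_2015, items_published, 2015))
```

With these made-up numbers, 30 citations over 120 citable items give a value of 0.25.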