• Title/Summary/Keyword: automatic annotation

COVID-19 recommender system based on an annotated multilingual corpus

  • Barros, Marcia;Ruas, Pedro;Sousa, Diana;Bangash, Ali Haider;Couto, Francisco M.
    • Genomics & Informatics / v.19 no.3 / pp.24.1-24.7 / 2021
  • Tracking the most recent advances in Coronavirus disease 2019 (COVID-19)-related research is essential, given the disease's novelty and its impact on society. However, with the publication pace speeding up, researchers and clinicians require automatic approaches to keep up with the incoming information regarding this disease. A solution to this problem requires the development of text mining pipelines, whose efficiency strongly depends on the availability of curated corpora. However, COVID-19-related corpora are scarce, especially in languages other than English. This project's main contribution was the annotation of a multilingual parallel corpus and the generation of a recommendation dataset (EN-PT and EN-ES) covering relevant entities, their relations, and recommendations, provided to the community to improve text mining research on COVID-19-related literature. This work was developed during the 7th Biomedical Linked Annotation Hackathon (BLAH7).

An Analytical Study on Automatic Classification of Domestic Journal articles Using Random Forest (랜덤포레스트를 이용한 국내 학술지 논문의 자동분류에 관한 연구)

  • Kim, Pan Jun
    • Journal of the Korean Society for Information Management / v.36 no.2 / pp.57-77 / 2019
  • Random Forest (RF), a representative ensemble technique, was applied to the automatic classification of journal articles in the field of library and information science. In particular, I ran experiments on the main factors affecting the performance of automatically assigning class labels to domestic journal articles: the number of trees, feature selection, and learning set size. Through these experiments, I explored ways to optimize RF performance on imbalanced datasets in real environments. The results suggest that, for the automatic classification of domestic journal articles, RF can be expected to perform best with a tree number in the interval 100~1000 (C), a small feature set (10%) selected by the chi-square statistic (CHI), and the largest learning sets (9-10 years).
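
The chi-square feature selection mentioned in this abstract can be sketched as follows. This is a minimal illustration, not the author's implementation: it assumes documents are represented as sets of terms, scores each term by its best per-class chi-square value, and keeps a given fraction of the vocabulary. The function names `chi_square` and `select_top_terms` and the toy data are hypothetical.

```python
def chi_square(n11, n10, n01, n00):
    """Chi-square statistic for a 2x2 term/class contingency table.

    n11: documents in the class that contain the term
    n10: documents outside the class that contain the term
    n01: documents in the class that lack the term
    n00: documents outside the class that lack the term
    """
    n = n11 + n10 + n01 + n00
    denom = (n11 + n01) * (n10 + n00) * (n11 + n10) * (n01 + n00)
    if denom == 0:
        return 0.0
    return n * (n11 * n00 - n10 * n01) ** 2 / denom


def select_top_terms(docs, labels, fraction=0.10):
    """Rank terms by their highest per-class chi-square and keep the top fraction."""
    classes = set(labels)
    vocab = {term for doc in docs for term in doc}
    scores = {}
    for term in vocab:
        best = 0.0
        for c in classes:
            n11 = sum(1 for d, y in zip(docs, labels) if term in d and y == c)
            n10 = sum(1 for d, y in zip(docs, labels) if term in d and y != c)
            n01 = sum(1 for d, y in zip(docs, labels) if term not in d and y == c)
            n00 = len(docs) - n11 - n10 - n01
            best = max(best, chi_square(n11, n10, n01, n00))
        scores[term] = best
    k = max(1, round(len(vocab) * fraction))  # always keep at least one term
    # Sort by descending score; break ties alphabetically for determinism.
    ranked = sorted(scores, key=lambda t: (-scores[t], t))
    return ranked[:k]


# Toy data (hypothetical): each document is a set of terms.
docs = [
    {"library", "catalog"},
    {"library", "index"},
    {"retrieval", "index"},
    {"retrieval", "query"},
]
labels = ["lib", "lib", "ir", "ir"]
selected = select_top_terms(docs, labels, fraction=0.4)
```

The selected terms would then be the only features passed to the classifier; the abstract reports that a 10% feature set worked best in its setting.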

Deciphering FEATURE for Novel Protein Data Analysis and Functional Annotation (단백질 구조 및 기능 분석을 위한 FEATURE 시스템 개선)

  • Yu, Seung-Hak;Yoon, Sung-Roh
    • Journal of IKEEE / v.13 no.3 / pp.18-23 / 2009
  • FEATURE is a computational method to recognize functional and structural sites for automatic protein function prediction. By profiling physicochemical properties around residues, FEATURE can characterize and predict functional and structural sites in 3D protein structures in a high-throughput manner. Despite its effectiveness, it has been challenging to apply FEATURE to novel protein data due to limited customization support. To address this problem, we thoroughly analyze the internal modules of FEATURE and propose a methodology to customize FEATURE so that it can be used for new protein data for automatic functional annotations.

A Video Annotation System with Automatic Human Detection from Video Surveillance Data (비디오 감시 데이터로부터 사람의 자동 인식을 통한 비디오 주석 시스템)

  • Kim, Joo-Sung;Kim, Hak-Il;Kim, Yoo-Sung
    • Proceedings of the Korean Information Science Society Conference / 2012.06a / pp.340-342 / 2012
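  • To recognize person-related incidents in real time or to quickly secure evidence related to an incident, it must be possible to find person-related information quickly in large volumes of video surveillance data. In existing systems, however, an annotator has to manually extract and index the relevant information from every frame, which requires a great deal of annotation time. In this paper, to support fast retrieval of person-related incident information from large volumes of security surveillance video, we developed a video annotation system that extracts key-frame intervals from the whole video based on the appearance and departure of people, extracts person-related information only from the key frames, and automatically indexes the main person-related information in XML Schema form. In addition, we implemented an XPath query interface so that the indexed XML data can be searched easily and quickly using structure- and content-based queries.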
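
The idea of querying XML-indexed key-frame intervals with XPath can be sketched as below. The abstract does not give the paper's schema, so the element and attribute names (`video`, `segment`, `person`, `clothing`, `start`, `end`) are assumptions made for illustration, and the query uses the limited XPath subset supported by Python's standard `xml.etree.ElementTree`.

```python
import xml.etree.ElementTree as ET

# Hypothetical annotation document: one segment per appearance/departure interval.
ANNOTATIONS = """
<video id="cam01">
  <segment start="120" end="310">
    <person id="p1" clothing="red"/>
  </segment>
  <segment start="900" end="1040">
    <person id="p2" clothing="blue"/>
    <person id="p3" clothing="red"/>
  </segment>
</video>
"""

root = ET.fromstring(ANNOTATIONS)


def segments_with(root, attribute, value):
    """Return (start, end) frame ranges of segments containing a matching person."""
    hits = []
    for seg in root.iterfind(".//segment"):        # structure-based part of the query
        match = seg.find(f"person[@{attribute}='{value}']")  # content-based part
        if match is not None:
            hits.append((int(seg.get("start")), int(seg.get("end"))))
    return hits


red_segments = segments_with(root, "clothing", "red")
blue_segments = segments_with(root, "clothing", "blue")
```

Because only key-frame intervals are indexed, a query like this touches a small XML document rather than every frame of the video.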

Deep-Learning Approach for Text Detection Using Fully Convolutional Networks

  • Tung, Trieu Son;Lee, Gueesang
    • International Journal of Contents / v.14 no.1 / pp.1-6 / 2018
  • Text, as one of the most influential inventions of humanity, has played an important role in human life since ancient times. The rich and precise information embodied in text is very useful in a wide range of vision-based applications: text extracted from images can support automatic annotation, indexing, language translation, and assistance systems for impaired persons. Natural-scene text detection is therefore an important and active research topic in computer vision and document analysis. Previous methods perform poorly because they produce numerous false-positive and false-negative regions. In this paper, a supervised fully-convolutional-network (FCN)-based method is used to localize textual regions. The model was trained directly on images, with pixel values as inputs and binary ground truth as labels. The method was evaluated on the ICDAR-2013 dataset and proved comparable to other feature-based methods. It could expedite future research on text detection using deep-learning-based approaches.

Improving accessibility and distinction between negative results in biomedical relation extraction

  • Sousa, Diana;Lamurias, Andre;Couto, Francisco M.
    • Genomics & Informatics / v.18 no.2 / pp.20.1-20.4 / 2020
  • Accessible negative results are relevant for researchers and clinicians not only to limit their search space but also to prevent the costly re-exploration of research hypotheses. However, most biomedical relation extraction datasets do not seek to distinguish between a false and a negative relation between two biomedical entities. Furthermore, datasets created using distant supervision techniques also contain some false negative relations that are in fact undocumented/unknown relations (missing from a knowledge base). We propose to improve the distinction between these concepts by revising a subset of the relations marked as false in the phenotype-gene relations corpus and taking the first steps toward automatically distinguishing between false (F), negative (N), and unknown (U) results. Our work resulted in a sample of 127 manually annotated FNU relations and a weighted-F1 of 0.5609 for their automatic distinction. This work was developed during the 6th Biomedical Linked Annotation Hackathon (BLAH6).
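
For readers unfamiliar with the weighted-F1 metric reported here, the following sketch shows how it is computed for a three-class F/N/U labelling: the per-class F1 scores are averaged with weights proportional to each class's number of gold instances. The function name `weighted_f1` and the toy labels are hypothetical and are not taken from the paper's data.

```python
from collections import Counter


def weighted_f1(gold, pred):
    """Support-weighted average of per-class F1 scores."""
    support = Counter(gold)  # number of gold instances per class
    total = 0.0
    for label in sorted(support):
        tp = sum(1 for g, p in zip(gold, pred) if g == label and p == label)
        fp = sum(1 for g, p in zip(gold, pred) if g != label and p == label)
        fn = sum(1 for g, p in zip(gold, pred) if g == label and p != label)
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        total += f1 * support[label]
    return total / len(gold)


# Toy example (hypothetical labels): F = false, N = negative, U = unknown.
gold = ["F", "F", "N", "U"]
pred = ["F", "N", "N", "U"]
score = weighted_f1(gold, pred)
```

In this toy case the per-class F1 values are 2/3 (F), 2/3 (N), and 1 (U), so the weighted score is (2·2/3 + 2/3 + 1) / 4 = 0.75.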

The POS Elderly: Semi-automatic annotation tool for Historical Korean (형태소 깎는 노인: 국어사 자료를 위한 형태분석 보조기)

  • Kim, Migyeong;Park, Suzi;Lee, Sana
    • 한국어정보학회:학술대회논문집 / 2016.10a / pp.39-43 / 2016
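  • 'The POS Elderly' (형태소 깎는 노인) is a morphological analysis assistant developed to support researchers who perform morphological analysis manually, at a time when the development of a high-performance automatic morphological analyzer for historical Korean materials is facing difficulties. Its distinguishing features are that it minimizes human fatigue through a division of labor between human and machine, and that it can reliably suggest the correct answer for simple, repetitive forms. Because historical Korean materials lack the lexical dictionary needed for Korean language processing, we built a dictionary of grammatical morphemes and used it as a cue to separate the particle/ending part from the stem part. We expect that the small morphologically analyzed corpora built this way will, in the long term, help improve the performance of automatic morphological analyzers.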
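
One simple way to use a grammatical-morpheme dictionary as a cue for separating the stem from the particle/ending is longest-suffix matching, sketched below. This is an assumption about how such a split could work, not the tool's actual algorithm; the function name `split_stem_ending` is hypothetical, and the tiny dictionary uses modern Korean particles purely for illustration.

```python
def split_stem_ending(word, endings):
    """Split a word into (stem, ending) using the longest ending found in the dictionary.

    Returns (word, "") when no dictionary ending matches, leaving the
    decision to the human annotator.
    """
    # Smaller i means a longer candidate suffix, so the first hit is the longest.
    # Starting at 1 guarantees a non-empty stem.
    for i in range(1, len(word)):
        if word[i:] in endings:
            return word[:i], word[i:]
    return word, ""


# Hypothetical grammatical-morpheme dictionary (illustrative modern particles).
ENDINGS = {"에서", "서", "을", "를", "은", "는"}

school = split_stem_ending("학교에서", ENDINGS)
person = split_stem_ending("사람을", ENDINGS)
unknown = split_stem_ending("하늘", ENDINGS)
```

Falling back to an unsplit word when nothing matches fits the division of labor the abstract describes: the machine proposes answers for regular forms and the researcher handles the rest.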

Design of Mobile Agent System for Remote Electric Safety Education (원격 전기안전 교육을 위한 모바일 에이전트 시스템 설계)

  • Cho, Hyun-Seob;Ryu, In-Ho;Jang, Sung-Whan;Rheu, Ki-Soo
    • Proceedings of the KIEE Conference / 2006.07d / pp.1951-1952 / 2006
  • To effectively deal with video data, a semantic-based retrieval scheme that allows for processing diverse user queries and saving them on the database is required. In this regard, this paper proposes a semantic-based video retrieval system that allows the user to search diverse meanings of video data for electrical safety-related educational purposes by means of automatic annotation processing. If the user inputs a keyword to search video data for electrical safety-related educational purposes, the mobile agent of the proposed system extracts the features of the video data that are afterwards learned in a continuous manner, and detailed information on electrical safety education is saved on the database. The proposed system is designed to enhance video data retrieval efficiency for electrical safety-related educational purposes.

A Video Retrieval System for Electric Safety Education based on Mobile Agent (전기 안전 교육을 위한 모바일 에이전트 기반 비디오 검색 시스템)

  • Cho, Hyun-Seob;Lee, Keun-Wang;Kim, Hee-Sook
    • Proceedings of the KIEE Conference / 2005.07d / pp.2830-2832 / 2005
  • Recently, retrieval of various video data has become an important issue as more and more multimedia content services are being provided. To effectively deal with video data, a semantic-based retrieval scheme that allows for processing diverse user queries and saving them on the database is required. In this regard, this paper proposes a semantic-based video retrieval system that allows the user to search diverse meanings of video data for electrical safety-related educational purposes by means of automatic annotation processing. If the user inputs a keyword to search video data for electrical safety-related educational purposes, the mobile agent of the proposed system extracts the features of the video data that are afterwards learned in a continuous manner, and detailed information on electrical safety education is saved on the database. The proposed system is designed to enhance video data retrieval efficiency for electrical safety-related educational purposes.

Automatic Generation of RDF Metadata for Semantic Search in Semantic Web (시맨틱 웹에서 의미 검색을 위한 RDF 메타데이타 자동 생성)

  • 강상구;양재영;양승섭;최원종;최중민
    • Proceedings of the Korea Intelligent Information System Society Conference / 2002.11a / pp.311-320 / 2002
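  • The aim of the Semantic Web is to enable computers to process the meaning of web documents as humans understand it. However, as the amount of information has grown rapidly with advances in information and communication technologies such as the Internet, it is difficult to search these information resources effectively. To solve this problem, this paper proposes a method for automatically generating RDF metadata for papers using an annotation editor. When a user annotates a paper, the system extracts features of the document and classifies it using an ontology interface. With the implemented system, the user can view the extracted metadata in a metadata view and can manually edit the metadata through an HTML view. The metadata can be stored in an RDF Repository, and the generated RDF metadata can be checked through the annotation view. The RDF metadata generated this way lets web robots easily grasp the meaning of the content and its category information. This paper focuses on a method that enables efficient search using only the RDF metadata, rather than the full content, when searching for papers through a search engine.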
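
To make the notion of RDF metadata for a paper concrete, the sketch below serializes a few extracted fields as RDF/XML. The abstract does not specify the vocabulary used, so the choice of Dublin Core properties (`dc:title`, `dc:creator`, `dc:subject`), the function name `paper_to_rdf`, and the example URI and values are all assumptions for illustration.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("rdf", RDF)
ET.register_namespace("dc", DC)


def paper_to_rdf(uri, title, creators, subject):
    """Serialize extracted paper metadata as an RDF/XML string."""
    root = ET.Element(f"{{{RDF}}}RDF")
    desc = ET.SubElement(root, f"{{{RDF}}}Description", {f"{{{RDF}}}about": uri})
    ET.SubElement(desc, f"{{{DC}}}title").text = title
    for name in creators:
        ET.SubElement(desc, f"{{{DC}}}creator").text = name
    # The category assigned by the classification step goes in dc:subject.
    ET.SubElement(desc, f"{{{DC}}}subject").text = subject
    return ET.tostring(root, encoding="unicode")


# Hypothetical example values.
rdf_xml = paper_to_rdf(
    "http://example.org/papers/1",
    "A Sample Paper",
    ["First Author", "Second Author"],
    "Semantic Web",
)
parsed = ET.fromstring(rdf_xml)
```

A search engine or crawler could then read the title, authors, and category from this small record instead of processing the full text of the paper.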