• Title/Summary/Keyword: Bioinformatics data

Establishment of Search Portal on Biodiversity Data (생물다양성데이터 검색포탈 구축)

  • Ahn, Sung-Soo; Park, Hyung-Seon; Kwon, Chang-Hyuk; Ahn, Bu-Young
    • Proceedings of the Korea Information Processing Society Conference / 2005.05a / pp.561-564 / 2005
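  • This paper introduces the domestic and international biodiversity data standard formats and protocols needed to build a biodiversity data network, and explains how geographically distributed domestic biodiversity data can be searched in an integrated manner and how a search portal for domestic biodiversity data was built. It then describes the data standards, data exchange protocol, system architecture, and software components used to build the portal. Finally, it describes the activities required of data-holding institutions for the portal to operate smoothly, along with the results and expected benefits of building the biodiversity data search portal.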

Design of Metadata Schema for Biodiversity Data Exchange (생물다양성 데이터교환을 위한 메타데이터 스키마 설계)

  • Ahn Bu-young; Cho Hee-hyung; Ahn Sung-soo; Park Hyung-seon
    • Proceedings of the Korean Information Science Society Conference / 2005.11b / pp.91-93 / 2005
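  • Biodiversity refers to the diversity of living organisms from all sources, including terrestrial ecosystems, marine and other aquatic ecosystems, and the ecological complexes of which they are part; it includes diversity within species, between species, and of ecosystems. Just as the organisms on Earth are highly diverse, the data used to represent biodiversity are also highly varied. In this paper, we first review the data standards and data exchange protocols proposed by international biodiversity organizations for sharing and exchanging biodiversity data. Based on these standards and protocols, we then design and present a biodiversity metadata schema for sharing and exchanging domestic biodiversity data, divided broadly into species information and reference information about the species information.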

Stream Data Processing Prototype Development for Automated Prediction of Myocardial Ischemia (심근허혈 질환 진단을 위한 스트림 데이터 처리)

  • Park, Jin Hyoung; Saeed, Khalid E.K.; Lee, Jong Bum; Lee, Heon Gyu; Ryu, Keun Ho
    • Proceedings of the Korea Information Processing Society Conference / 2009.04a / pp.360-363 / 2009
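  • We implemented a prototype for stream processing and data mining of electrocardiogram (ECG) signals for the real-time diagnosis of heart disease. ECG signals transmitted from body-worn sensors were preprocessed to extract diagnostic indicators of heart disease, and an emerging pattern mining algorithm for real-time diagnosis was implemented and applied. On this basis, we implemented a biosignal stream data processing and analysis prototype capable of real-time automatic diagnosis and prediction of cardiovascular disease.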

Toxicoinformatics: The Master Key for Toxicogenomics

  • Lee, Wan-Sun; Kim, Yang-Seok
    • Molecular & Cellular Toxicology / v.1 no.1 / pp.13-16 / 2005
  • The current vision of toxicogenomics is to develop methods or platforms that predict the toxicity of uncharacterized chemicals by using '-omics' information at the pre-clinical stage. Because each chemical has a different ADME (absorption, distribution, metabolism, excretion) profile and experimental animals vary considerably, precisely predicting a chemical's toxicity from '-omics' information and the toxicity data of known chemicals is a very difficult problem. Bioinformatics is therefore emphasized more in toxicogenomics than in other functional genomics studies, because these problems cannot be solved with experiments alone. Toxicoinformatics thus covers all information-based analytical methods, from gene expression (bioinformatics) to chemical structures (cheminformatics), and also deals with the integration of a wide range of experimental data for further extensive analyses. In this review, the overall strategy of toxicoinformatics is discussed.

AN ANOMALY DETECTION METHOD BY ASSOCIATIVE CLASSIFICATION

  • Lee, Bum-Ju; Lee, Heon-Gyu; Ryu, Keun-Ho
    • Proceedings of the KSRS Conference / 2005.10a / pp.301-304 / 2005
  • To detect intrusions based on anomalies in a user's activities, previous work has concentrated on statistical techniques or frequent episode mining to analyze audit data. However, because these approaches mainly analyze the average behaviour of a user's activities, some anomalies are detected inaccurately. We therefore propose an anomaly detection method that uses associative classification to model intrusion detection. Finally, we show through experimental results that a prediction model built with the associative classification method yields better accuracy than one built with traditional methods.

Deep Learning in Genomic and Medical Image Data Analysis: Challenges and Approaches

  • Yu, Ning; Yu, Zeng; Gu, Feng; Li, Tianrui; Tian, Xinmin; Pan, Yi
    • Journal of Information Processing Systems / v.13 no.2 / pp.204-214 / 2017
  • Artificial intelligence, especially deep learning technology, is penetrating the majority of research areas, including the field of bioinformatics. However, deep learning has some limitations, such as the complexity of parameter tuning and architecture design. In this study, we analyze these issues and challenges with regard to its applications in bioinformatics, particularly genomic analysis and medical image analytics, and give the corresponding approaches and solutions. Although these solutions are mostly rules of thumb, they can effectively handle the issues connected to training learning machines. We also explore trends in deep learning technology by examining several directions, such as automation, scalability, individuality, mobility, integration, and intelligence warehousing.

CLUSTER ANALYSIS FOR REGION ELECTRIC LOAD FORECASTING SYSTEM

  • Park, Hong-Kyu; Kim, Young-Il; Park, Jin-Hyoung; Ryu, Keun-Ho
    • Proceedings of the KSRS Conference / 2007.10a / pp.591-593 / 2007
  • This paper clusters AMR (Automatic Meter Reading) data. The load survey system has been applied in KEPRI AMR to record the power consumption of a sample of contract types. The effect of a change in contract type on customer power consumption is determined by clustering the load survey results. Power can then be supplied to customers according to the usage patterns found by the cluster analysis. Korea has fewer classes of electricity supply type than other countries, because the Korean electricity market has a single electricity provider. The supply types need to be divided further for more efficient supply. We found patterns that differ from the supply type assigned to the customer. Our experiments use Clementine, a data mining tool.

Statistical Analysis of Gene Expression Data

  • 박태성
    • Proceedings of the Korean Society for Bioinformatics Conference / 2001.10a / pp.97-115 / 2001
  • cDNA microarray technology allows the expression levels of thousands of genes to be monitored simultaneously. Many statistical analysis tools have become widely applicable to the analysis of cDNA microarray data. In this talk, we consider a two-way ANOVA model to distinguish genes that have high variability from those that do not. Using this model, we detect genes whose expression profiles differ among experimental groups. The two-way ANOVA model is illustrated using cDNA microarrays of 3,800 genes obtained in an experiment to search for changes in gene expression profiles during neuronal differentiation of cortical stem cells.

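For orientation, a generic two-way ANOVA model with interaction can be written as below. The choice of factors (gene $i$, experimental group $j$, replicate $k$) and all symbols are assumptions for illustration; the abstract does not give the exact parameterization used in the talk.

```latex
y_{ijk} = \mu + \alpha_i + \beta_j + (\alpha\beta)_{ij} + \varepsilon_{ijk},
\qquad \varepsilon_{ijk} \sim N(0, \sigma^2)
```

Under this reading, $\alpha_i$ is the gene effect, $\beta_j$ the group effect, and a gene whose interaction terms $(\alpha\beta)_{ij}$ vary across groups would be one whose expression profile differs among experimental groups.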

FASIM: Fragments Assembly Simulation using Biased-Sampling Model and Assembly Simulation for Microbial Genome Shotgun Sequencing

  • Hur Cheol-Goo; Kim Sunny; Kim Chang-Hoon; Yoon Sung-Ho; In Yong-Ho; Kim Cheol-Min; Cho Hwan-Gue
    • Journal of Microbiology and Biotechnology / v.16 no.5 / pp.683-688 / 2006
  • We have developed a program for generating shotgun data sets from known genome sequences. Generating synthetic data sets by computer program is a useful alternative to real data, to which students and researchers have limited access. The uniformly distributed sampling of clones adopted by previous programs cannot account for the real situation, in which sampled reads tend to come from particular regions of the target genome. To reflect this situation, a probabilistic model for a biased sampling distribution was developed using an experimental data set derived from a microbial genome project. Among the experimental parameters tested (varied fragment or read lengths, chimerism, and sequencing error), the extent of sequencing error was the most critical factor hampering sequence assembly. We propose that an optimum sequencing strategy employing different insert lengths and redundancy can be established by performing a variety of simulations.
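
As a rough illustration of the biased-sampling idea in the abstract above, the following Python sketch draws read start positions from a weighted distribution instead of a uniform one and optionally injects substitution errors. It is not FASIM's actual model: the function name `simulate_reads`, the per-position `weights`, and the substitution-only error model are all assumptions made for this example.

```python
import random


def simulate_reads(genome, n_reads, read_len, weights=None, error_rate=0.0, seed=0):
    """Sample (start, read) pairs from a known genome sequence.

    weights: hypothetical per-start-position sampling weights; None means
             uniform sampling, a non-uniform list gives biased sampling.
    error_rate: probability that each base is replaced by a different base
                (a simplified stand-in for sequencing error).
    """
    rng = random.Random(seed)
    n_starts = len(genome) - read_len + 1
    if n_starts <= 0:
        raise ValueError("read_len must not exceed the genome length")
    if weights is None:
        weights = [1.0] * n_starts  # uniform sampling
    if len(weights) != n_starts:
        raise ValueError("weights must have one entry per possible start position")

    # Draw start positions according to the (possibly biased) weights.
    starts = rng.choices(range(n_starts), weights=weights, k=n_reads)

    reads = []
    for start in starts:
        bases = list(genome[start:start + read_len])
        for i, base in enumerate(bases):
            if rng.random() < error_rate:
                # Substitute with a different nucleotide.
                bases[i] = rng.choice([b for b in "ACGT" if b != base])
        reads.append((start, "".join(bases)))
    return reads


# Toy usage: a random 200-base genome in which start positions 50..99
# are assumed to be five times as likely to be sampled as the rest.
demo_rng = random.Random(1)
genome = "".join(demo_rng.choice("ACGT") for _ in range(200))
weights = [5.0 if 50 <= s < 100 else 1.0 for s in range(len(genome) - 20 + 1)]
reads = simulate_reads(genome, n_reads=10, read_len=20, weights=weights, error_rate=0.01)
```

Varying `read_len`, `error_rate`, and the number of reads in such a sketch mirrors, in a very simplified way, the kind of parameter sweep the abstract describes.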

Prediction of subcellular localization of proteins using pairwise sequence alignment and support vector machine

  • Kim, Jong-Kyoung; Raghava, G. P. S.; Kim, Kwang-S.; Bang, Sung-Yang; Choi, Seung-Jin
    • Proceedings of the Korean Society for Bioinformatics Conference / 2004.11a / pp.158-166 / 2004
  • Predicting the destination of a protein in a cell gives valuable information for annotating the function of the protein. Recent technological breakthroughs have led us to develop more accurate methods for predicting the subcellular localization of proteins. The most important factor in determining the accuracy of these methods is the way useful features are extracted from protein sequences. We propose a new method that extracts appropriate features from the sequence data alone by computing pairwise sequence alignment scores. A support vector machine (SVM) is used as the classifier. The overall prediction accuracy evaluated by the jackknife validation technique reaches 94.70% for the eukaryotic non-plant data set and 92.10% for the eukaryotic plant data set, which is the highest prediction accuracy among methods reported so far on these data sets. Our numerical experimental results confirm that our feature extraction method based on pairwise sequence alignment is useful for this classification problem.

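The feature-extraction idea in the abstract above, representing each protein by its pairwise alignment scores against a set of other sequences, can be sketched as follows. This is a minimal illustration, not the authors' implementation: the Smith-Waterman scoring with fixed match, mismatch, and linear gap values, the function names, and the toy sequences are all assumptions. A real setup would likely use a substitution matrix and pass the resulting vectors to an SVM library.

```python
def local_alignment_score(a, b, match=2, mismatch=-1, gap=-2):
    """Smith-Waterman local alignment score with a linear gap penalty.

    The scoring values are illustrative assumptions, not taken from the paper.
    """
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Local alignment: a cell never drops below zero.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


def alignment_features(seq, references):
    """Feature vector: one alignment score per reference sequence."""
    return [local_alignment_score(seq, ref) for ref in references]


# Toy, hypothetical protein fragments used only to show the data flow.
references = ["MKVLAAGIVG", "MSTNPKPQRK", "MALWMRLLPL"]
query = "MKVLGAGIVA"
features = alignment_features(query, references)
# `features` has one entry per reference and could serve as SVM input.
```

With N reference sequences, every protein becomes an N-dimensional vector, so sequences of different lengths end up in a common fixed-size feature space that a standard classifier can consume.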