• Title/Abstract/Keywords: Bioinformatics Software

A Study on the Bio-Cell Image Segmentation

  • 전병태;이형구;조수현;정연구;박선희
    • Korea Information Processing Society: Conference Proceedings / Proceedings of the Korea Information Processing Society 2002 Fall Conference (Part 1) / pp.743-746 / 2002
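  • The need to build cell-based assay systems, one area of bioinformatics, has recently emerged. To detect how cells change over time after a particular reagent or test substance is administered to them, the cell image must first be segmented into regions. In this paper, we propose a method that extracts the common cell region from the whole image and then segments the cell regions from the extracted common region using the snake technique.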

Developing Virtual Screening Program for Lead Identification

  • Nam, Ky-Youb;Cho, Yong-Kee;Lee, Chang-Joon;Shin, Jae-Hong;Choi, Jung-Won;Gil, Joon-Min;Park, Hark-Soo;Hwang, Il-Sun;No, Kyoung-Tai
    • Korean Society for Bioinformatics: Conference Proceedings / Korean Society for Bioinformatics and Systems Biology, 2004, The 3rd Annual Conference for The Korean Society for Bioinformatics Association of Asian Societies for Bioinformatics 2004 Symposium / pp.181-190 / 2004
  • Docking and in silico ligand screening procedures can select small sets of lead-like candidates from large libraries of commercially or synthetically available compounds; however, the vast number of such molecules makes the potential size of this task enormous. To accelerate the discovery of drugs that inhibit several targets, we have exploited massively distributed computing to screen compound libraries virtually. The Korea@HOME project was launched in Feb. 2002, and one year later more than 1200 PCs had been recruited. This has created a 31-gigaflop machine that has already provided more than 1400 hours of CPU time. It has allowed databases of millions of compounds to be screened against protein targets in a matter of days. BMD has now developed virtual screening software suited to distributed environments. It has been evaluated in terms of the accuracy of the scoring function and of the search algorithm for the correct binding mode.

BJRNAFold: Prediction of RNA Secondary Structure Based on Constraint Parameters

  • Li, Wuju;Ying, Xiaomin
    • Korean Society for Bioinformatics: Conference Proceedings / Korean Society for Bioinformatics and Systems Biology, 2005, BIOINFO 2005 / pp.287-293 / 2005
  • Predicting RNA secondary structure as accurately as possible is very important in the functional analysis of RNA molecules. However, different prediction methods and related parameters, including the terminal GU pair of helices, the minimum length of helices, and free energy systems, often give different prediction results for the same RNA sequence. Which structure, then, should be preferred over the others; that is, which combination of method and parameters is optimal? To investigate these questions, first, three prediction methods, namely random stacking of helical regions (RS), helical regions distribution (HD), and Zuker's minimum free energy algorithm (ZMFE), were compared on 1139 tRNA sequences from the Rfam database under different combinations of parameters, and the optimal parameters were derived. Second, Zuker's dynamic programming method for predicting RNA secondary structure was revised using these optimal parameters, and the related software, BJRNAFold, was developed. Third, the effects of short-range interaction were studied. The results indicated that prediction accuracy would improve considerably if a proper short-range factor were introduced, but the optimal short-range factor was difficult to determine. A user-adjustable parameter for the short-range factor was therefore introduced in the BJRNAFold software.
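
    To make concrete how constraint parameters such as GU pairing and a minimum loop length change a dynamic-programming prediction, here is a minimal sketch. It uses the simpler Nussinov base-pair maximization, not Zuker's free-energy algorithm and not BJRNAFold's actual code; the function names and the `min_loop` and `allow_gu` parameters are assumptions made for illustration only.

```python
def can_pair(a: str, b: str, allow_gu: bool = True) -> bool:
    """Return True if bases a and b may pair (Watson-Crick, optionally GU wobble)."""
    pairs = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}
    if allow_gu:
        pairs |= {("G", "U"), ("U", "G")}
    return (a, b) in pairs


def max_base_pairs(seq: str, min_loop: int = 3, allow_gu: bool = True) -> int:
    """Nussinov-style DP: maximum number of nested base pairs in seq.

    min_loop is the minimum number of unpaired bases enclosed by any pair
    (a hypothetical stand-in for the constraint parameters in the abstract).
    """
    n = len(seq)
    if n == 0:
        return 0
    # dp[i][j] = best pair count for the subsequence seq[i..j]
    dp = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            best = dp[i][j - 1]  # case 1: base j stays unpaired
            # case 2: base j pairs with some k, leaving at least min_loop bases between
            for k in range(i, j - min_loop):
                if can_pair(seq[k], seq[j], allow_gu):
                    left = dp[i][k - 1] if k > i else 0
                    best = max(best, left + 1 + dp[k + 1][j - 1])
            dp[i][j] = best
    return dp[0][n - 1]
```

    For the toy sequence "GGGAAAUCC", allowing GU pairs yields three nested pairs, while disallowing them yields two, which is the kind of parameter sensitivity the abstract describes.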

Developing a Bioinformatics Tool for Peptide Nucleic Acid (PNA) Antisense Technique Utilizing Parallel Computing System

  • 김성조;전호상;홍승표;김현창;김한집;민철기
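    • Korean Institute of Information Scientists and Engineers: Conference Proceedings / Proceedings of the Korea Computer Congress 2006, Korean Institute of Information Scientists and Engineers, Vol.33 No.1 (A) / pp.43-45 / 2006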
  • Unlike RNA interference, whose use is limited to eukaryotic cells, the Peptide Nucleic Acid (PNA) technique is applicable to both eukaryotic and prokaryotic cells. PNA has been proven to be an effective agent for blocking gene expression and has several advantages over other antisense techniques. Here we developed parallel computing software that provides the ideal sequences for designing PNA oligos that avoid off-target effects. We applied a new approach in our location-finding algorithm, which finds a target gene in the whole genome sequence. The Message Passing Interface (MPI) was used for parallel computing in order to reduce the calculation time. The software will help biologists design more accurate and effective antisense PNA by minimizing the chance of off-target effects.
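
    The off-target idea can be sketched as follows: a candidate target site is kept only if it occurs exactly once in the genome, counting both strands. This is a single-process illustration under stated assumptions, not the authors' location-finding algorithm; the MPI parallelization is omitted, and the function names and exact-match criterion are hypothetical.

```python
def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA string over A, C, G, T."""
    comp = {"A": "T", "T": "A", "G": "C", "C": "G"}
    return "".join(comp[base] for base in reversed(dna))


def count_overlapping(text: str, pattern: str) -> int:
    """Count occurrences of pattern in text, including overlapping ones."""
    count = 0
    start = text.find(pattern)
    while start != -1:
        count += 1
        start = text.find(pattern, start + 1)
    return count


def count_occurrences(genome: str, site: str) -> int:
    """Count exact matches of site on both strands of genome."""
    rc = reverse_complement(site)
    total = count_overlapping(genome, site)
    if rc != site:  # avoid double-counting palindromic sites
        total += count_overlapping(genome, rc)
    return total


def unique_target_sites(genome: str, gene_start: int, gene_end: int, length: int):
    """List (position, site) windows inside the gene that are unique genome-wide.

    gene_start is inclusive and gene_end exclusive (0-based), an assumed convention.
    """
    sites = []
    for pos in range(gene_start, gene_end - length + 1):
        site = genome[pos:pos + length]
        if count_occurrences(genome, site) == 1:
            sites.append((pos, site))
    return sites
```

    In a parallel version, each worker could scan a different slice of candidate positions, since every window is checked independently of the others.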

Preprocessing Model for Operon Prediction Using Relative Distance of Genes and COG Distance

  • Chun, Bong-Kyung;Jang, Chul-Jin;Kang, Eun-Mi;Cho, Hwan-Gue
    • Korean Society for Bioinformatics: Conference Proceedings / Korean Society for Bioinformatics and Systems Biology, Proceedings of the 2nd Annual Conference, 2003 / pp.210-219 / 2003
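  • An operon is a group of multiple adjacent genes, usually found in microorganisms, that is transcribed as a unit from a common promoter as if it were a single gene. Because the genes that make up an operon tend to be functionally similar or to lie on the same metabolic pathway, they are biologically significant, and predicting which genes form operons is an important part of microbial genome analysis. Previous approaches to operon prediction include methods that use known operon characteristics such as intergenic distance or the average number of genes per operon, methods based on microarray expression experiments, methods that use conserved gene clusters across whole genomes, and methods that use metabolic pathways. In this paper, we build a preprocessing model for operon prediction using COG function distance, intergenic distance, codon usage, and a combination of COG function distance and intergenic distance. When the preprocessing model was applied to the whole E. coli genome, it covered about 90% of the known operons. The preprocessing model presented here can therefore serve as a good tool for subsequent operon prediction.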
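
    As a rough illustration of the intergenic-distance criterion alone, the sketch below groups adjacent genes on the same strand whose gap is at most a threshold. The 50 bp default, the `Gene` record, and the gene names are hypothetical; the COG function distance and codon usage features mentioned in the abstract are not modeled here.

```python
from typing import List, NamedTuple


class Gene(NamedTuple):
    name: str
    start: int   # 1-based start coordinate
    end: int     # inclusive end coordinate
    strand: str  # "+" or "-"


def candidate_operons(genes: List[Gene], max_gap: int = 50) -> List[List[str]]:
    """Group neighbouring same-strand genes whose intergenic gap is <= max_gap.

    max_gap is an assumed threshold chosen for illustration, not a value
    taken from the paper.
    """
    ordered = sorted(genes, key=lambda g: g.start)
    groups: List[List[Gene]] = []
    for gene in ordered:
        if groups:
            prev = groups[-1][-1]
            same_strand = gene.strand == prev.strand
            close_enough = gene.start - prev.end <= max_gap
            if same_strand and close_enough:
                groups[-1].append(gene)
                continue
        groups.append([gene])  # start a new candidate group
    return [[g.name for g in group] for group in groups]
```

    A preprocessing step like this only proposes candidate groups; a later prediction stage would still have to decide which groups are real operons.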

Biology Oriented Target Specific Literature Mining for GPCR Pathway Extraction

  • Kim, Eun-Ju;Jung, Seol-Kyoung;Yi, Eun-Ji;Lee, Gary-Geunbae;Park, Soo-Jun
    • Korean Society for Bioinformatics: Conference Proceedings / Korean Society for Bioinformatics and Systems Biology, Proceedings of the 2nd Annual Conference, 2003 / pp.86-94 / 2003
  • Electronically available biological literature has accumulated exponentially over time, so research on automatically acquiring knowledge from this vast body of data with text mining technology has become increasingly active. However, most previous research is technology oriented and not well focused on a practical extraction target, which results in low performance and makes the systems inconvenient for bio-researchers to actually use. In this paper, we propose a more biology-oriented, target-domain-specific text mining system, the POSTECH bio-text mining system (POSBIOTM), for signal transduction pathway extraction, especially for the G protein-coupled receptor (GPCR) pathway. To reflect more domain knowledge, we specify the concrete target for pathway extraction and define a minimal pathway domain ontology. Under this conceptual model, POSBIOTM extracts the interactions and entities of pathways from full biological articles using a machine-learning-oriented extraction method and visualizes the pathways using the JDesigner module provided in the Systems Biology Workbench (SBW) [14].

SVM-based Protein Name Recognition using Edit-Distance Features Boosted by Virtual Examples

  • Yi, Eun-Ji;Lee, Gary-Geunbae;Park, Soo-Jun
    • Korean Society for Bioinformatics: Conference Proceedings / Korean Society for Bioinformatics and Systems Biology, Proceedings of the 2nd Annual Conference, 2003 / pp.95-100 / 2003
  • In this paper, we propose solutions to the problem of many spelling variants and the problem of a lack of annotated corpora for training, two of the main difficulties in named entity recognition in the biomedical domain. To resolve the problem of spelling variants, we propose using edit distance as a feature for the SVM. To resolve the lack-of-corpus problem, we propose using virtual examples to expand the annotated corpus automatically. Using virtual examples, the annotated corpus can be extended in a fast, efficient and easy way. The experimental results show that introducing edit distance produces some improvement in protein name recognition performance, and that the model trained with the corpus expanded by virtual examples outperforms the model trained with the original corpus. With the proposed methods, we finally achieve an F-measure of 75.80 (71.89% precision, 80.15% recall) in the protein name recognition experiment on the GENIA corpus (ver.3.0).
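
    Two quantities in this abstract are easy to make concrete. The sketch below computes the standard Levenshtein edit distance, which could feed a spelling-variant feature, and the balanced F-measure from precision and recall. How the authors actually encode edit distance as an SVM feature is not stated here, so the function names and the example strings are illustrative assumptions.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum insertions, deletions, substitutions."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete ca
                curr[j - 1] + 1,     # insert cb
                prev[j - 1] + cost,  # substitute or match
            ))
        prev = curr
    return prev[-1]


def f_measure(precision: float, recall: float) -> float:
    """Balanced F-measure: harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Spelling variants of one protein name are only a few edits apart.
variant_gap = edit_distance("NF-kappaB", "NF kappa B")

# The reported precision and recall reproduce the reported F-measure.
reported_f = f_measure(71.89, 80.15)
```

    With the reported 71.89% precision and 80.15% recall, the harmonic mean comes to about 75.80, consistent with the F-measure quoted in the abstract.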

The Development of Meta-Information System for Microbial Genome Resources

  • Chung, Won-Hyong;Yu, Jae-Woo;Sohn, Tae-Kwon;Park, Yong-Ha;Kim, Hong-Ik
    • Korean Society for Bioinformatics: Conference Proceedings / Korean Society for Bioinformatics and Systems Biology, Proceedings of the 2nd Annual Conference, 2003 / pp.245-250 / 2003
  • There are currently about 6000 bacterial species with validly published names, but scientists assume that these may be less than 1% of the bacterial species present on the earth. Microbial resources are among the most important bioresources in bioindustry and provide high economic value. To find the missing species, studies of the metagenome, metabolome, and proteome of microbes have recently started in developed countries. We constructed an information system that integrates information on microbial genome resources and manages it to support efficient research on microbial genome applications, and named this system the 'Bio-Meta Information System (Bio-MIS)'. Bio-MIS consists of an integrated microbial genome resources database, a microbial genome resources input system, an integrated microbial genome resources search engine, a microbial resources on-line distribution system, and portal service and management via the internet. In the future, we will add connections to public databases and implement useful bioinformatics software for analyzing microbial genome resources. The web site is accessible at http://biomis.probionic.com

Gene Set Analyses of Genome-Wide Association Studies on 49 Quantitative Traits Measured in a Single Genetic Epidemiology Dataset

  • Kim, Jihye;Kwon, Ji-Sun;Kim, Sangsoo
    • Genomics & Informatics / Vol. 11, No. 3 / pp.135-141 / 2013
  • Gene set analysis is a powerful tool for interpreting the results of a genome-wide association study and is gaining popularity. Comparing the gene sets obtained for a variety of traits measured in a single genetic epidemiology dataset may give insights into the biological mechanisms underlying these traits. Based on the previously published single nucleotide polymorphism (SNP) genotype data on 8,842 individuals enrolled in the Korea Association Resource project, we performed a series of systematic genome-wide association analyses for 49 quantitative traits covering basic epidemiological, anthropometric, or blood chemistry parameters. Each analysis result was subjected to subsequent gene set analysis based on Gene Ontology (GO) terms using the gene set analysis software GSA-SNP, identifying a set of GO terms significantly associated with each trait ($p_{corr}$ < 0.05). Pairwise comparison of the traits in terms of the semantic similarity of their GO sets revealed surprising cases where phenotypically uncorrelated traits showed high similarity in terms of biological pathways. For example, the pH level was related to 7 other traits that showed low phenotypic correlations with it. A literature survey implies that these traits may be regulated partly by common pathways that involve neuronal or nervous systems.

The Construction of Regulatory Network for Insulin-Mediated Genes by Integrating Methods Based on Transcription Factor Binding Motifs and Gene Expression Variations

  • Jung, Hyeim;Han, Seonggyun;Kim, Sangsoo
    • Genomics & Informatics / Vol. 13, No. 3 / pp.76-80 / 2015
  • Type 2 diabetes mellitus is a complex metabolic disorder associated with multiple genetic, developmental and environmental factors. Recent advances in gene expression microarray technologies and in network-based analysis methodologies provide groundbreaking opportunities to study type 2 diabetes mellitus. In the present study, we used previously published gene expression microarray datasets of human skeletal muscle samples, collected from 20 insulin-sensitive individuals before and after insulin treatment, to construct an insulin-mediated regulatory network. Based on a motif discovery method implemented by iRegulon, a Cytoscape app, we identified 25 candidate regulons whose motifs were enriched among the promoters of 478 up-regulated genes and 82 down-regulated genes. We then looked for a hierarchical network of the candidate regulators, such that the conditional combination of their expression changes may explain those of their target genes. Using Genomica, a software tool for regulatory network construction, we obtained a hierarchical network of eight regulons that were used to map the insulin downstream signaling network. Taken together, the results illustrate the benefits of combining completely different methods, such as motif-based regulatory factor discovery and expression-level-based construction of the regulatory network of their target genes, in understanding insulin-induced biological processes and signaling pathways.