• Title/Abstract/Keywords: BLAST search

213 search results (processing time: 0.022 s)

Performance Improvement of BLAST using Grid Computing and Implementation of Genome Sequence Analysis System

  • 김동욱;최한석
    • The Journal of the Korea Contents Association
    • /
    • Vol. 10, No. 7
    • /
    • pp.81-87
    • /
    • 2010
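  • This paper analyzes the problems of BLAST, currently the most widely used tool in bioinformatics research, and proposes G-BLAST (a Basic Local Alignment Search Tool built on grid computing) as a solution. The G-BLAST-based system is an integrated sequence analysis software package that can run in heterogeneous distributed environments, and it strengthens BLAST searching by improving search performance, a weak point of existing sequence analysis services. We also implemented a database and a genome sequence analysis service system so that users can easily manage and analyze BLAST results. To verify the performance of G-BLAST, we applied parallel computing performance testing techniques, compared the implemented system with conventional BLAST in terms of speed and efficiency, and confirmed the improvement. The system also provides the data needed to analyze sequence results from the user's perspective.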
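
The speed and efficiency comparison mentioned in this abstract is commonly expressed with the standard parallel-computing definitions: speedup S = T_serial / T_parallel and efficiency E = S / p for p workers. The sketch below shows only those textbook formulas; whether the paper used exactly these measures is an assumption, and the function names and timings are hypothetical rather than taken from the paper.

```python
def speedup(serial_seconds: float, parallel_seconds: float) -> float:
    """Speedup S = T_serial / T_parallel."""
    if serial_seconds <= 0 or parallel_seconds <= 0:
        raise ValueError("run times must be positive")
    return serial_seconds / parallel_seconds


def efficiency(serial_seconds: float, parallel_seconds: float, workers: int) -> float:
    """Parallel efficiency E = S / p, where p is the number of workers."""
    if workers < 1:
        raise ValueError("workers must be at least 1")
    return speedup(serial_seconds, parallel_seconds) / workers


if __name__ == "__main__":
    # Hypothetical timings, NOT results from the paper:
    # 120 s for a single BLAST run, 20 s when spread over 8 grid nodes.
    print(speedup(120.0, 20.0))
    print(efficiency(120.0, 20.0, 8))
```

An efficiency near 1.0 would mean the added nodes are almost fully used; lower values indicate overhead from distribution and communication.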

Identification of Cervus elaphus Species by Sequencing Analysis and BLAST Search

  • 서정철;김민정;이찬;임강현
    • The Korea Journal of Herbology
    • /
    • Vol. 21, No. 2
    • /
    • pp.129-133
    • /
    • 2006
  • Objectives: Cervus elaphus species are among the most medicinally important animals in Oriental medicine. This study was performed to determine whether Cervus elaphus species could be identified by sequencing analysis and to verify the Basic Local Alignment Search Tool (BLAST) search used for genetic identification. Methods: DNA from Cervus elaphus species was extracted, amplified by PCR, and sequenced. The resulting sequences were identified by BLAST search on the web. Results: By BLAST search, one of the Cervus elaphus species was identified as Cervus elaphus sibericus and the other as Cervus elaphus nelsoni. This work showed that identification can be performed efficiently by BLAST search. Conclusion: These results suggest that sequencing followed by BLAST search may allow identification of Cervus elaphus species.


Development of Local Animal BLAST Search System Using Bioinformatics Tools

  • 김병우;이근우;김효선;노승희;이윤호;김시동;전진태;이지웅;조용민;정일정;이정규
    • Bioinformatics and Biosystems
    • /
    • Vol. 1, No. 2
    • /
    • pp.99-102
    • /
    • 2006
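  • BLAST (Basic Local Alignment Search Tool) is the most widely used program for searching sequence databases. Rather than computing an optimal global alignment over entire sequences, it finds regions of local similarity and aligns those. Researchers typically run BLAST online through a web browser connected to NCBI for sequence homology searches, but this approach runs into several problems, such as delays and limits on search speed that depend on each user's network environment and the amount of input data, and it carries a risk of leaking sequence data that must be kept confidential. The need is therefore growing for a local BLAST search system that can run BLAST homology searches on large volumes of sequence data quickly and securely. In this study, we used the animal expressed sequence tags (ESTs) published in NCBI GenBank to extract only the useful genes associated with economic traits of cattle, pigs, chickens, and other livestock, built a new database consisting only of these, and developed a new search system for using them. We extracted the required data by livestock species with an in-house Perl script to build the new DB, which contains 650,046 ESTs for cattle, 368,120 for pigs, and 693,005 for chickens. We also built the Local Animal BLAST Web search system (http://bioinfo.kohost.net) for analyzing these DBs and linked it to a high-performance parallel PC cluster system, and we expect the system to contribute to more efficient bioinformatics research.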
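
The per-species extraction step described in this abstract can be sketched roughly as follows. The authors used their own Perl script, which is not shown in the abstract, so this Python version is only an illustration: it assumes FASTA input whose header lines contain the organism's scientific name, and the `SPECIES_KEYWORDS` table and `split_by_species` function are hypothetical names introduced here.

```python
from collections import defaultdict
from typing import Dict, List, Optional

# Hypothetical mapping from a livestock label to the text looked for in headers.
SPECIES_KEYWORDS = {
    "cattle": "Bos taurus",
    "pig": "Sus scrofa",
    "chicken": "Gallus gallus",
}


def split_by_species(fasta_text: str) -> Dict[str, str]:
    """Group FASTA records by species, dropping records that match no keyword."""
    groups: Dict[str, List[str]] = defaultdict(list)
    current_key: Optional[str] = None
    for line in fasta_text.splitlines():
        if line.startswith(">"):
            # A new record begins: decide which species (if any) it belongs to.
            current_key = None
            for key, name in SPECIES_KEYWORDS.items():
                if name in line:
                    current_key = key
                    break
        if current_key is not None and line.strip():
            groups[current_key].append(line)
    return {key: "\n".join(lines) + "\n" for key, lines in groups.items()}
```

Each returned string could then be written to its own file and turned into a separate local BLAST database, which is one plausible way to obtain per-species databases like those the abstract describes.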


Study on MPI-based parallel sequence similarity search in the LINUX cluster

  • 홍창범;차정호;이성훈;신승우;박근준;박근용
    • Journal of the Korea Society of Computer and Information
    • /
    • Vol. 11, No. 6
    • /
    • pp.69-78
    • /
    • 2006
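  • In bioinformatics research, finding similarity or homology among amino acid or nucleotide sequences is the basis for predicting gene function and protein structure. With the introduction of computers, such sequence data is growing very rapidly. Because search speed is therefore a critical factor, SMP (Symmetric Multi-Processor) computers or clusters are used to handle large volumes of sequence information. This paper proposes nBLAST, a parallelization of BLAST (Basic Local Alignment Search Tool), which is used for sequence searching, in a cluster environment to improve its speed. nBLAST parallelizes by partitioning the queries with the MPI (Message Passing Interface) parallel library without modifying the existing BLAST source code, so the BLAST algorithm can be parallelized easily without complicated steps such as environment configuration. Experiments running nBLAST on a 28-node Linux cluster confirmed that performance improves as the number of nodes increases.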
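
The query-partitioning idea in this abstract can be illustrated with a small sketch. It is not the nBLAST implementation and it does not call MPI: the worker rank is just a list index, and the round-robin assignment is an assumption made for illustration, since the abstract does not say how queries are divided. `parse_fasta` and `partition_queries` are hypothetical helper names.

```python
from typing import List, Tuple

Record = Tuple[str, str]  # (header without ">", sequence)


def parse_fasta(text: str) -> List[Record]:
    """Parse FASTA text into (header, sequence) pairs."""
    records: List[Record] = []
    header = None
    seq_lines: List[str] = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(seq_lines)))
            header = line[1:]
            seq_lines = []
        elif header is not None:
            seq_lines.append(line)
    if header is not None:
        records.append((header, "".join(seq_lines)))
    return records


def partition_queries(records: List[Record], n_workers: int) -> List[List[Record]]:
    """Assign queries to workers round-robin; element i is worker i's share."""
    if n_workers < 1:
        raise ValueError("n_workers must be at least 1")
    return [records[rank::n_workers] for rank in range(n_workers)]
```

In an MPI setting, each process would take the share matching its own rank, run an unmodified BLAST on it, and send the results back to be merged. That matches the abstract's point that the BLAST source code itself does not need to change.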


A Pattern Summary System Using BLAST for Sequence Analysis

  • Choi, Han-Suk;Kim, Dong-Wook;Ryu, Tae-W.
    • Genomics & Informatics
    • /
    • Vol. 4, No. 4
    • /
    • pp.173-181
    • /
    • 2006
  • Pattern finding is one of the important tasks in protein or DNA sequence analysis. Alignment is a widely used technique for finding patterns in sequence analysis. BLAST (Basic Local Alignment Search Tool) is one of the most popular tools in bioinformatics for exploring available DNA or protein sequence databases. BLAST may generate a huge output for large sequence data that contains various sequence patterns. However, BLAST does not provide a tool to summarize and analyze the patterns or matched alignments in the BLAST output file, and it lacks general and robust parsing tools for extracting the essential information from its output. This paper presents a pattern summary system, a comprehensive tool for discovering pattern structures in the huge amounts of sequence data in BLAST output. The pattern summary system can identify clusters of patterns, extract the cluster pattern sequences from the BLAST subject database, and display the clusters graphically to show their distribution in the subject database.

Gene Sequences Clustering for the Prediction of Functional Domain

  • 한상일;이성근;허보경;변윤섭;황규석
    • Journal of Institute of Control, Robotics and Systems
    • /
    • Vol. 12, No. 10
    • /
    • pp.1044-1049
    • /
    • 2006
  • Multiple sequence alignment is a method for comparing two or more DNA or protein sequences. Most multiple sequence alignment tools rely on pairwise alignment and the Smith-Waterman algorithm to generate an alignment hierarchy. Therefore, in existing multiple alignment methods, the runtime increases exponentially as the number of sequences increases. To remedy this problem, we adopted a parallel-processing suffix tree algorithm that can search for common subsequences in a single pass without pairwise alignment. Cross-matching subsequences that trigger inexact matching may also be produced among the common subsequences found, so this paper proposes a cross-matching masking process. To identify the function of the clusters generated by suffix tree clustering, BLAST and CDD (Conserved Domain Database) searches were combined with a clustering tool. Our clustering and annotation tool consists of constructing the suffix tree, overlapping common subsequences, clustering gene sequences, and annotating gene clusters by BLAST and CDD search. The system was successfully evaluated with 36 gene sequences in the pentose phosphate pathway, producing 10 clusters, finding representative common subsequences, and finally identifying functional domains by searching the CDD.

Predicting blast-induced ground vibrations at limestone quarry from artificial neural network optimized by randomized and grid search cross-validation, and comparative analyses with blast vibration predictor models

  • Salman Ihsan;Shahab Saqib;Hafiz Muhammad Awais Rashid;Fawad S. Niazi;Mohsin Usman Qureshi
    • Geomechanics and Engineering
    • /
    • Vol. 35, No. 2
    • /
    • pp.121-133
    • /
    • 2023
  • The demand for cement and crushed limestone materials has increased manyfold due to the tremendous increase in construction activities in Pakistan during the past few decades. The number of cement production industries has increased correspondingly, and so have the rock-blasting operations at the limestone quarry sites. However, the safety procedures warranted at these sites for blast-induced ground vibrations (BIGV) have not been adequately developed and/or implemented. Proper prediction and monitoring of BIGV are necessary to ensure the safety of structures in the vicinity of these quarry sites. In this paper, an attempt has been made to predict BIGV using an artificial neural network (ANN) at three selected limestone quarries in Pakistan. The ANN has been developed in Python using Keras with a sequential model and dense layers. The hyperparameters and the number of neurons in each layer have been optimized using randomized and grid search methods. The input parameters for the model include distance, maximum charge per delay (MCPD), depth of hole, burden, spacing, and number of blast holes, whereas peak particle velocity (PPV) is taken as the only output parameter. A total of 110 blast vibration datasets were recorded from three different limestone quarries. The dataset has been divided into 85% for neural network training and 15% for testing of the network. A five-layer ANN is trained with the Rectified Linear Unit (ReLU) activation function, the Adam optimization algorithm with a learning rate of 0.001, and a batch size of 32, with a topology of 6-32-32-256-1. The blast datasets were utilized to compare the performance of the ANN, multivariate regression analysis (MVRA), and empirical predictors. The performance was evaluated using the coefficient of determination (R2), mean absolute error (MAE), mean squared error (MSE), mean absolute percentage error (MAPE), and root mean squared error (RMSE) for predicted and measured PPV. To determine the relative influence of each parameter on the PPV, sensitivity analyses were performed for all input parameters. The analyses reveal that the ANN outperforms MVRA and the other empirical predictors, and that 83% of the PPV is affected by distance and MCPD, while hole depth, number of blast holes, burden, and spacing account for the remaining 17%. This research provides valuable insights into improving safety measures and ensuring the structural integrity of buildings near limestone quarry sites.
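
The five evaluation measures named in this abstract have standard definitions, sketched below in plain Python. This is not the authors' code (they report using Keras, which this sketch does not depend on), and the function name `regression_metrics` and the sample PPV values are hypothetical.

```python
import math
from typing import Dict, Sequence


def regression_metrics(measured: Sequence[float], predicted: Sequence[float]) -> Dict[str, float]:
    """Return R2, MAE, MSE, MAPE (in percent), and RMSE for paired values."""
    if len(measured) != len(predicted) or not measured:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(measured)
    errors = [p - m for m, p in zip(measured, predicted)]
    mae = sum(abs(e) for e in errors) / n
    ss_res = sum(e * e for e in errors)
    mse = ss_res / n
    rmse = math.sqrt(mse)
    # MAPE is undefined when a measured value is zero; this sketch assumes none are.
    mape = 100.0 * sum(abs(e / m) for e, m in zip(errors, measured)) / n
    mean_m = sum(measured) / n
    # R2 is undefined when all measured values are identical (ss_tot == 0).
    ss_tot = sum((m - mean_m) ** 2 for m in measured)
    r2 = 1.0 - ss_res / ss_tot
    return {"R2": r2, "MAE": mae, "MSE": mse, "MAPE": mape, "RMSE": rmse}


if __name__ == "__main__":
    # Hypothetical measured vs. predicted PPV values, not data from the paper.
    print(regression_metrics([2.0, 4.0], [1.0, 5.0]))
```

Reporting several measures together is useful because they penalize errors differently: MSE and RMSE weight large misses more heavily, while MAPE expresses error relative to the measured magnitude.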

KUGI: A Database and Search System for Korean Unigene and Pathway Information

  • Yang, Jin-Ok;Hahn, Yoon-Soo;Kim, Nam-Soon;Yu, Ung-Sik;Woo, Hyun-Goo;Chu, In-Sun;Kim, Yong-Sung;Yoo, Hyang-Sook;Kim, Sang-Soo
    • Korean Society for Bioinformatics: Conference Proceedings
    • /
    • Korean Society for Bioinformatics and Systems Biology, BIOINFO 2005 (2005)
    • /
    • pp.407-411
    • /
    • 2005
  • The KUGI (Korean UniGene Information) database contains annotation information for cDNA sequences obtained from samples of diseases prevalent in Koreans. A total of about 157,000 5'-EST high-throughput sequences collected from cDNA libraries of stomach, liver, and some cancer tissues or established cell lines from Korean patients were clustered into about 35,000 contigs. From each cluster, a representative clone having the longest high-quality sequence or the start codon was selected. We stored the sequences of the representative clones and the clustered contigs in the KUGI database, together with the information obtained by running BLAST against the RefSeq, human mRNA, and UniGene databases from NCBI. We provide a web-based search engine for the KUGI database with two types of user interface: attribute-based search and sequence similarity search. For attribute-based search we use DBMS technology, while for similarity search we use BLAST, which supports various similarity search options. The search system allows not only multiple queries but also various query types. The results returned are as follows: 1) information on clones and libraries, 2) accession keys, location on the genome, gene ontology, and pathways in public databases, 3) links to external programs, and 4) sequence information for contigs and the 5'-ends of clones. We believe that the KUGI database and search system may provide very useful information for studies elucidating the causes of diseases that are prevalent in Koreans.


An Experimental Study on Freezing-Thawing Resistance of Concrete Using Ground Granulated Blast-Furnace Slag

  • 남용혁;최세규;김동신;김생빈
    • Korea Concrete Institute: Conference Proceedings
    • /
    • Korea Concrete Institute, Proceedings of the 1996 Fall Conference
    • /
    • pp.148-153
    • /
    • 1996
  • Concrete with ground granulated blast-furnace slag can be affected by frost attack because the hydration reaction is slow at early ages. In this study, therefore, freezing and thawing tests were carried out to investigate the freezing-thawing resistance of concrete with ground granulated blast-furnace slag. The tests were performed on concrete made with a blended cement in which ground granulated blast-furnace slag was substituted at four ratios (none, 20%, 40%, and 60%). Concrete of the same mix proportions with an added AE agent was also tested to examine the improvement in resistance. As a result, freezing-thawing resistance tended to decrease as the substitution ratio increased. For non-AE concrete, freezing-thawing resistance was very poor, with a durability index of less than 5.8%. For AE concrete, freezing-thawing resistance was excellent, with a durability index of more than 80.9%.


Modelling the dynamic response and failure modes of reinforced concrete structures subjected to blast and impact loading

  • Ngo, Tuan;Mendis, Priyan
    • Structural Engineering and Mechanics
    • /
    • Vol. 32, No. 2
    • /
    • pp.269-282
    • /
    • 2009
  • In response to the threat of terrorist attacks around the world, numerous studies have been conducted to search for new methods of vulnerability assessment and protective technologies for critical infrastructure under extreme bomb blasts or high-velocity impacts. In this paper, a two-dimensional behavioral rate-dependent lattice model (RDLM) capable of analyzing reinforced concrete members subjected to blast and impact loading is presented. The model inherently takes into account several major influencing factors: the progressive cracking of concrete in tension, the inelastic response in compression, the yielding of reinforcing steel, and the strain rate sensitivity of both concrete and steel. A computer code using the explicit algorithm was developed based on the proposed lattice model. The explicit code, along with the proposed numerical model, was validated using experimental test results from the Woomera blast trial.