• Title/Abstract/Keywords: sequence alignment

Search results: 351 items; processing time: 0.027 s

SNPchaser: A Web-based Program for Detecting SNPs Substitution and Heterozygosity Existence

  • 장진우;이현철;이명훈;최연식;추동원;박기정;이대상
    • KSBB Journal
    • /
    • Vol. 24, No. 4
    • /
    • pp.410-414
    • /
    • 2009
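  • Single-nucleotide polymorphisms (SNPs) are differences between the genetic sequences of individuals at the nucleotide level, and they have recently attracted attention in the field of personalized medicine. The method most commonly used to check for the presence of SNPs is to extract DNA sequences from the output files of high-throughput sequencing instruments such as the ABI automated DNA sequencer and then run a homology search such as BLAST. In this paper, we developed SNPchaser, a program that takes a reference sequence, AB1 files, and the positions of bases that may harbor SNPs as user input, and checks the bases at those positions for SNP substitution and heterozygosity. When the user already knows the positions of the bases that show SNPs within a particular gene sequence, SNPchaser determines the presence of SNPs by examining only the bases at the reported positions, without surveying the entire gene sequence, and provides the user with the chromatogram information for the bases in those regions. SNPchaser also lets users easily determine the heterozygosity of bases in SNP regions in organisms with diploid chromosomes, such as humans. SNPchaser is available at http://www.bioinformatics.ac.kr/SNPchaser.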
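
The position-based check described in this abstract can be illustrated with a short sketch. It is not SNPchaser's actual implementation: the function name `classify_site`, the per-base peak-height dictionary standing in for chromatogram data, and the 0.3 secondary-peak ratio cutoff are all assumptions introduced here for illustration.

```python
def classify_site(ref_base, peaks, het_ratio=0.3):
    """Classify one reported SNP position from chromatogram peak heights.

    ref_base  -- base in the reference sequence at this position
    peaks     -- dict mapping 'A', 'C', 'G', 'T' to the peak height here
    het_ratio -- hypothetical cutoff: a secondary peak at least this
                 fraction of the primary peak counts as a second allele
    """
    # Rank the four bases by peak height, tallest first.
    ranked = sorted(peaks.items(), key=lambda kv: kv[1], reverse=True)
    (primary, p_height), (secondary, s_height) = ranked[0], ranked[1]
    if p_height <= 0:
        # No signal at all: nothing can be called.
        return {"call": None, "heterozygous": False, "substituted": False}
    heterozygous = s_height / p_height >= het_ratio
    alleles = {primary, secondary} if heterozygous else {primary}
    return {
        "call": "/".join(sorted(alleles)),
        "heterozygous": heterozygous,
        # True when any called allele differs from the reference base.
        "substituted": alleles != {ref_base},
    }


# Illustrative peak heights (made-up numbers, not real AB1 data).
homo = classify_site("A", {"A": 50, "C": 20, "G": 900, "T": 10})
het = classify_site("A", {"A": 480, "C": 15, "G": 520, "T": 5})
```

Here `homo` is a homozygous substitution (A to G), while `het` is a heterozygous A/G site. A real tool would read the peak heights from the AB1 trace and would probably need a noise-aware cutoff rather than a fixed ratio.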

A Simple Java Sequence Alignment Editing Tool for Resolving Complex Repeat Regions

  • Ham, Seong-Il;Lee, Kyung-Eun;Park, Hyun-Seok
    • Genomics & Informatics
    • /
    • Vol. 7, No. 1
    • /
    • pp.46-48
    • /
    • 2009
  • Finishing is the most time-consuming step in sequencing, and many genome projects are left unfinished due to complex repeat regions. Here, we have developed BACContigEditor, a prototype shotgun sequence finishing tool. It is essentially an editor that visualizes assemblies of shotgun sequence fragment reads as gapped multiple alignments. The program offers some flexibility that is needed to rapidly resolve complex regions within a working session. The sole purpose of the release is to promote collaborative creation of extensible software for fragment assembly editors, foster collaborative development, and reduce barriers to initial tool development effort. We describe our software architecture and identify current challenges. The program is available under an Open Source license.

A Primer for Disease Gene Prioritization Using Next-Generation Sequencing Data

  • Wang, Shuoguo;Xing, Jinchuan
    • Genomics & Informatics
    • /
    • Vol. 11, No. 4
    • /
    • pp.191-199
    • /
    • 2013
  • High-throughput next-generation sequencing (NGS) technology produces a tremendous amount of raw sequence data. The challenges for researchers are to process the raw data, to map the sequences to the genome, to discover variants that differ from the reference genome, and to prioritize or rank the variants for the question of interest. The recent development of many computational algorithms and programs has vastly improved the ability to translate sequence data into valuable information for disease gene identification. However, NGS data analysis is complex and can be overwhelming for researchers who are not familiar with the process. Here, we outline the analysis pipeline and describe some of the most commonly used principles and tools for analyzing NGS data for disease gene identification.

NBLAST: a graphical user interface-based two-way BLAST software with a dot plot viewer

  • Choi, Beom-Soon;Choi, Seon Kang;Kim, Nam-Soo;Choi, Ik-Young
    • Genomics & Informatics
    • /
    • Vol. 20, No. 3
    • /
    • pp.36.1-36.6
    • /
    • 2022
  • BLAST, a basic bioinformatics tool for searching local sequence similarity, has been one of the most widely used bioinformatics programs since its introduction in 1990. Users generally use the web-based NCBI-BLAST program for BLAST analysis. However, users with large sequence data often run into the upload size limit of the web-based BLAST program. This is inconvenient because scientists often want to run BLAST on their own data, such as transcriptome or whole-genome sequences. To overcome this issue, we developed NBLAST, a graphical user interface-based BLAST program that employs a two-way system, allowing input sequences to be used as either "query" or "target" in the BLAST analysis. NBLAST is also equipped with a dot plot viewer, allowing researchers to create a custom database for BLAST and run a dot plot similarity analysis within a single program. NBLAST is available at http://nbitglobal.com/nblast.
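
The general idea behind a dot plot similarity analysis can be shown with a simple exact word-match sketch. This is not NBLAST's code, and the abstract does not say how its viewer computes the plot; the function names `dot_plot` and `render` and the word size of 3 are assumptions made for illustration.

```python
def dot_plot(query, target, word=3):
    """Return (i, j) pairs where query[i:i+word] == target[j:j+word]."""
    # Index every word of the target by its start position.
    index = {}
    for j in range(len(target) - word + 1):
        index.setdefault(target[j:j + word], []).append(j)
    # Look up every word of the query in that index.
    hits = []
    for i in range(len(query) - word + 1):
        for j in index.get(query[i:i + word], []):
            hits.append((i, j))
    return hits


def render(query, target, hits):
    """Draw the hit start positions as a text grid (rows: query, columns: target)."""
    cells = set(hits)
    return "\n".join(
        "".join("*" if (i, j) in cells else "." for j in range(len(target)))
        for i in range(len(query))
    )


hits = dot_plot("ACGTAC", "TACGTA", word=3)
print(render("ACGTAC", "TACGTA", hits))
```

Runs of hits along a diagonal indicate a shared segment; either sequence can play the "query" or "target" role by swapping the arguments.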

Transposition of IntAs into the Conserved Regions of IS3 Family Elements

  • Han, Chang-Gyun
    • Journal of Microbiology
    • /
    • Vol. 42, No. 1
    • /
    • pp.56-59
    • /
    • 2004
  • Together with previous reports, my computer survey revealed that several bacteria contain six copies of the group II intron IntA. Sequence analysis of the IntAs showed a high level of homology in the nucleotide sequence (91.9-99.8%). A consensus sequence 2,270 base pairs long was derived from the nucleotide sequences of all IntA members. The open reading frame intA is 502 amino acids long and is homologous to reverse transcriptase-like proteins encoded within group II introns. It was previously reported that EPEC.IntA and Sf.IntA were inserted into IS911 and IS629, respectively. The sequences of the regions flanking IntA were analyzed here. The data show the insertion of EC.IntA into IS629, of EHEC.IntA into IS3, of Yp.IntA into an IS904-like sequence, and of EK12.IntA into IS911. Interestingly, the IS elements carrying the inserted IntAs were all members of the IS3 family. The sequences of the IS3 members correspond to the OrfB with the DDE motif conserved in retroviral integrases. Alignment of the flanking sequences of the IntAs revealed that the flanking regions from -25 to +10 of the insertion sites, which are generally believed to be required for retrohoming, were not strongly conserved. The data presented here suggest that the retrohoming pathway of IntA differs from those of other group II introns.

Cloning, Expression, and Characterization of DNA Polymerase from Hyperthermophilic Bacterium Aquifex pyrophilus

  • Choi, Jeong-Jin;Kwon, Suk-Tae
    • Journal of Microbiology and Biotechnology
    • /
    • Vol. 14, No. 5
    • /
    • pp.1022-1030
    • /
    • 2004
  • The gene encoding Aquifex pyrophilus (Apy) DNA polymerase was cloned and sequenced. The Apy DNA polymerase gene consists of 1,725 bp coding for a protein with 574 amino acid residues. The deduced amino acid sequence of Apy DNA polymerase showed high sequence homology to Escherichia coli DNA polymerase I-like DNA polymerases. Amino acid sequence alignment indicated that Apy DNA polymerase, like the Klenow fragment, has only two domains, the $3'{\rightarrow}5'$ exonuclease domain and the $5'{\rightarrow}3'$ polymerase domain, both containing the characteristic motifs. The Apy DNA polymerase gene was expressed under the control of the T7lac promoter on the expression vector pET-22b(+) in E. coli. The expressed enzyme was purified by heat treatment followed by Cibacron blue 3GA and $UNO^{TM}$ Q column chromatography. The optimum pH of the purified enzyme was 7.5, and the optimal concentrations of KCl and $Mg^{2+}$ were 20 mM and 3 mM, respectively. Apy DNA polymerase contained a double strand-dependent $3'{\rightarrow}5'$ proofreading exonuclease activity but lacked any detectable $5'{\rightarrow}3'$ exonuclease activity, which is consistent with its amino acid sequence. The thermostability of Apy DNA polymerase, which is somewhat low relative to the growth temperature of A. pyrophilus, was analyzed by comparing amino acid composition and pressure effect.

Rapid and Efficient Isolation of Genes for Biosynthesis of Peptide Antibiotics from Gram-positive Bacterial Strains

  • Lee, Soon-Youl;Rhee, Sang-Ki;Kim, Chul-Ho;Suh, Joo-Won
    • Journal of Microbiology and Biotechnology
    • /
    • Vol. 8, No. 4
    • /
    • pp.310-317
    • /
    • 1998
  • Peptide synthetases are large multifunctional enzyme complexes that catalyze the nonribosomal synthesis of a structurally diverse family of peptide antibiotics. These enzymes are composed of functionally independent domains with independent enzymatic activities. Their specific linkage order of domains forms the protein template that defines the sequence of the incorporated amino acids. Within each domain, several motifs of highly conserved sequences have been identified from the sequence alignment of the various peptide synthetases [30]. Taking advantage of the conserved nucleotide sequence of Core 1 and Core 2, we designed PCR primers to amplify the peptide synthetase genes from three different gram-positive bacterial strains. Nucleotide sequence analysis of the amplified PCR products from those three strains showed significant homology to various peptide synthetase genes, suggesting that the PCR products are parts of peptide synthetase genes. Therefore, this rapid and efficient PCR technique can be used for the isolation of peptide synthetase genes from various strains.

Identification of Viral Taxon-Specific Genes (VTSG): Application to Caliciviridae

  • Kang, Shinduck;Kim, Young-Chang
    • Genomics & Informatics
    • /
    • Vol. 16, No. 4
    • /
    • pp.23.1-23.5
    • /
    • 2018
  • Virus taxonomy was initially determined by clinical experiments based on phenotype. However, with the development of sequence analysis methods, genotype-based classification was also applied. With the development of genome sequence analysis technology, there is an increasing demand for virus taxonomy to be extended from in vivo and in vitro to in silico. In this study, we verified the consistency of the current International Committee on Taxonomy of Viruses taxonomy using an in silico approach, aiming to identify the specific sequence for each virus. We applied this approach to norovirus in Caliciviridae, which causes 90% of gastroenteritis cases worldwide. Based on the dogma "protein structure determines its function," we hypothesized that the specific sequence can be identified by the specific structure. First, we extracted the coding regions (CDS). Second, the CDS protein sequences of each genus were annotated by a conserved domain database (CDD) search. Finally, the conserved domains of each genus in Caliciviridae were classified by RPS-BLAST with CDD. The analysis showed that the Caliciviridae share sequences that include RNA helicase. In Norovirus, the Calicivirus coat protein C-terminal and viral polyprotein N-terminal domains appear as specific domains; they are not found in the other genera of Caliciviridae. Specific conserved domains detected by this method can be used as classification keywords based on protein functional structure. Once the specific protein domains are determined, their sequences can be converted to gene sequences, which could then be reused as viral biomarkers.
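
The last step of this workflow, separating domains shared across a family from domains unique to one genus, can be sketched as plain set operations over the per-genus annotation results. This is a minimal sketch, not the authors' pipeline: the function name `split_domains` is hypothetical, and only the Norovirus and RNA helicase entries come from the abstract. The other genus names and domain names are placeholders, not real annotation results.

```python
def split_domains(domains_by_genus):
    """Return (domains common to every genus, domains unique to each genus)."""
    sets = {genus: set(d) for genus, d in domains_by_genus.items()}
    # Domains annotated in every genus of the family.
    common = set.intersection(*sets.values())
    specific = {}
    for genus, domains in sets.items():
        # Union of the domains seen in all other genera.
        others = set().union(*(s for g, s in sets.items() if g != genus))
        specific[genus] = domains - others
    return common, specific


# Toy input: "genus_B", "genus_C" and the placeholder domains are invented.
annotations = {
    "Norovirus": [
        "RNA helicase",
        "Calicivirus coat protein C-terminal",
        "Viral polyprotein N-terminal",
    ],
    "genus_B": ["RNA helicase", "placeholder_domain_1"],
    "genus_C": ["RNA helicase", "placeholder_domain_1", "placeholder_domain_2"],
}
common, specific = split_domains(annotations)
```

In practice the input lists would be parsed from RPS-BLAST output against CDD, and hits would likely be filtered by an E-value cutoff before the set comparison.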

Phylogenetic Analysis of Dendropanax morbifera Using Nuclear Ribosomal DNA Internal Transcribed Spacer (ITS) Region Sequences

  • 신용국
    • Journal of Life Science (생명과학회지)
    • /
    • Vol. 26, No. 11
    • /
    • pp.1341-1344
    • /
    • 2016
  • Dendropanax morbifera growing on Bogil Island was purchased and induced to form callus, and the nucleotide sequence of the internal transcribed spacer (ITS) region of its nuclear ribosomal DNA (nrDNA) was then determined. Analysis of the ITS region of D. morbifera from Bogil Island yielded a total of 689 bases. Of these 689 bases, ITS1 accounted for 222 bases, 5.8S rDNA for 160 bases, and ITS2 for 233 bases. Thirty-three Dendropanax sequences registered in GenBank/EMBL/DDBJ were collected using the GenBank BLAST program (http://www.ncbi.nlm.nih.BLAST) and subjected to multiple alignment. The similarity ranged from 99.7% (D. chevalieri) to 92.6% (Dendropanax arboreus), and the similarity to the Japanese species D. trifidus was 99.4%.
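
Similarity percentages like those reported here are typically computed from pairs of aligned sequences. The sketch below shows one common definition, percent identity over ungapped columns. The abstract does not state which definition the author used, so the gap handling is an assumption, as are the function name `percent_identity` and the toy sequences.

```python
def percent_identity(a, b, gap="-"):
    """Percent identity between two rows of an alignment.

    Assumption: columns where either sequence has a gap are ignored.
    """
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    compared = matches = 0
    for x, y in zip(a, b):
        if x == gap or y == gap:
            continue  # skip gapped columns
        compared += 1
        matches += x.upper() == y.upper()
    return 100.0 * matches / compared if compared else 0.0


# Toy aligned pair (not real ITS data): 8 matches over 9 ungapped columns.
similarity = percent_identity("ACGT-ACGTA", "ACGTTACGAA")
```

Other conventions count gaps as mismatches or normalize by the shorter sequence length, which would give slightly different percentages for the same alignment.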

Observability Analysis of a Lever Arm Error for Velocity Matching in Transfer Alignment

  • 양철관;심덕선
    • Journal of the Institute of Electronics and Information Engineers (전자공학회논문지)
    • /
    • Vol. 50, No. 1
    • /
    • pp.276-284
    • /
    • 2013
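  • In this paper, we analyze the observability of the lever arm error for the velocity matching algorithm used in the transfer alignment of an inertial navigation system. To this end, we model the Kalman filter state variables, including the lever arm error, and construct the measurement equation. We use the SOM for the observability analysis and carry it out for a variety of vehicle maneuvering conditions. Whereas previous observability analyses that include the lever arm error have relied mainly on simulation, this paper analytically derives the vehicle maneuvering conditions under which the state variables are fully observable. Simulations were then performed to verify the analysis results.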
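
The kind of rank test that an observability analysis rests on can be illustrated with a deliberately simplified scalar toy model. It is not the model from the paper. The assumptions here are: two constant states, a velocity error and a lever arm error, with zero dynamics; a velocity matching measurement z = dv + w * dr, where w is the angular rate; and piecewise-constant segments that each contribute one row [1, w] to a stacked observability matrix. The function names are hypothetical.

```python
from fractions import Fraction


def rank(rows):
    """Matrix rank by exact Gaussian elimination over rationals."""
    m = [[Fraction(v) for v in row] for row in rows]
    ncols = len(m[0]) if m else 0
    r = 0  # number of pivots found so far
    for c in range(ncols):
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            f = m[i][c] / m[r][c]
            m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        r += 1
    return r


def stacked_observability(angular_rates):
    """One measurement row H_j = [1, w_j] per segment (toy model, F = 0)."""
    return [[1, w] for w in angular_rates]


# Constant angular rate: the two toy states cannot be separated.
rank_constant = rank(stacked_observability([2, 2]))
# Angular rate changes between segments: both states become observable.
rank_changing = rank(stacked_observability([2, 5]))
```

In this toy model the states are fully observable (rank 2) only when the angular rate differs between segments, which shows how an observability condition can be expressed as a maneuvering condition. The paper's actual state vector and conditions are three-dimensional and are not reproduced here.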