• Title/Abstract/Keywords: genome annotation

179 search results (processing time: 0.022 s)

HExDB: Human EXon DataBase for Alternative Splicing Pattern Analysis

  • Park, Junghwan;Lee, Minho;Bhak, Jong
    • Genomics & Informatics
    • /
    • Vol. 3, No. 3
    • /
    • pp.80-85
    • /
    • 2005
  • HExDB is a database for analyzing exon and splicing pattern information in Homo sapiens. HExDB is useful for two specific purposes: 1) designing primers for exon amplification from cDNA and 2) understanding how alternative splicing changes ORFs. HExDB was constructed by integrating data from AltExtron, a database of computationally predicted exons, Ensembl cDNA annotation, and the recently published Affymetrix genome tile data. Although it may contain false positives, HExDB is a good starting point because of its sensitivity. At present, as many as 2,046,519 exons are stored in HExDB. We found that $16.8\%$ of the exons in the database were constitutive exons and $83.1\%$ were novel gene exons.

NOGSEC: A NOnparametric method for Genome SEquence Clustering

  • 이영복;김판규;조환규
    • Korean Journal of Microbiology
    • /
    • Vol. 39, No. 2
    • /
    • pp.67-75
    • /
    • 2003
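  • A major topic in comparative genomics is classifying gene sequences and predicting protein function. Approaches include predicting protein structure, consensus sequences, and binding sites, as well as computational methods that search for homologous genes by analyzing similarity graphs derived from whole-genome sequences. Similarity-graph methods have the advantage of not depending on prior knowledge of the sequences, but the drawback of requiring subjective thresholds such as a lower bound on similarity. This paper generalizes an earlier method that iteratively decomposes the graph, and proposes a similarity-graph-based methodology for clustering gene sequences together with an objective and stable way of computing the parameter thresholds. The proposed method was used to analyze known microbial genome sequences, and the results were compared with those of the earlier BAG algorithm.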
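
The threshold sensitivity described in this abstract can be illustrated with a minimal sketch. It is neither NOGSEC nor BAG: it simply keeps similarity-graph edges at or above a cutoff and reports the connected components, using union-find. The gene names and scores are hypothetical.

```python
from collections import defaultdict


def cluster_by_threshold(similarities, threshold):
    """Cluster sequences as connected components of a thresholded similarity graph.

    similarities: dict mapping (seq_a, seq_b) pairs to a similarity score.
    threshold: edges with a score below this value are ignored.
    """
    parent = {}

    def find(x):
        # Union-find lookup with path halving.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for (a, b), score in similarities.items():
        root_a, root_b = find(a), find(b)  # registers both nodes
        if score >= threshold and root_a != root_b:
            parent[root_a] = root_b

    clusters = defaultdict(set)
    for node in parent:
        clusters[find(node)].add(node)
    return sorted((sorted(c) for c in clusters.values()), key=lambda c: c[0])


# Hypothetical pairwise similarity scores between five genes.
scores = {
    ("g1", "g2"): 0.9,
    ("g2", "g3"): 0.8,
    ("g3", "g4"): 0.3,
    ("g4", "g5"): 0.7,
}

low = cluster_by_threshold(scores, 0.2)   # permissive cutoff
high = cluster_by_threshold(scores, 0.5)  # stricter cutoff
print(low)
print(high)
```

With these toy scores, the permissive cutoff merges all five genes into one cluster, while the stricter cutoff splits them in two. That dependence on a hand-picked cutoff is the subjectivity the abstract says its threshold calculation is meant to remove.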

A Base-Calling Error Detection Program for Use in Microbial Genome Projects

  • 이대상;박기정
    • Korean Journal of Microbiology
    • /
    • Vol. 43, No. 4
    • /
    • pp.317-320
    • /
    • 2007
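  • We developed a program that lists genes or nucleotide sequences suspected of containing base-calling errors that arise in the course of a microbial genome project. Its modules detect candidate bases suspected of base-calling errors at the major stages of a genome project. The program consists of three modules: in the early stage, a module that detects base-calling errors using the contig assembly information in Phrap files; in the intermediate stage, a module that assesses whether frameshift mutations are genuine, based on homology search results; and in the final stage, when the project targets a strain of the same species as an already published microbial genome, a module that uses comparative genomic analysis to extract candidate sequences with a high likelihood of base-calling errors and lets the genome researcher view the chromatogram files for those sequences.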

Comparative genome characterization of Leptospira interrogans from mild and severe leptospirosis patients

  • Anuntakarun, Songtham;Sawaswong, Vorthon;Jitvaropas, Rungrat;Praianantathavorn, Kesmanee;Poomipak, Witthaya;Suputtamongkol, Yupin;Chirathaworn, Chintana;Payungporn, Sunchai
    • Genomics & Informatics
    • /
    • Vol. 19, No. 3
    • /
    • pp.31.1-31.9
    • /
    • 2021
  • Leptospirosis is a zoonotic disease caused by spirochetes of the genus Leptospira. In Thailand, Leptospira interrogans is a major cause of leptospirosis. Leptospirosis patients present with a wide range of clinical manifestations, from asymptomatic or mild infections to severe illness involving organ failure. To better understand the differences between Leptospira isolates causing mild and severe leptospirosis, Illumina sequencing was used to sequence genomic DNA from both serotypes. DNA of Leptospira isolated from two patients, one with mild and another with severe symptoms, was included in this study. Adapters were removed from the paired-end reads, which were then trimmed at a Q30 quality threshold using Trimmomatic. Trimmed reads were assembled into contigs and scaffolds using SPAdes. Cross-contamination of scaffolds was evaluated with ContEst16S. Prokka, a tool for bacterial annotation, was used to annotate sequences from both Leptospira isolates. Predicted amino acid sequences from Prokka were searched against the EggNOG and DAVID gene ontology databases to characterize gene ontology. In addition, sequences from the mild and severe isolates that passed the criterion of e-value < 10e-5 in a BLASTP search against the virulence factor database were compared using a Venn diagram. We found 13 and 12 genes that were unique to the isolates from mild and severe patients, respectively. The 12 genes in the severe isolate might be virulence factor genes that affect disease severity. However, these genes should be validated in further study.
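
The final comparison step in this abstract (filter BLASTP hits by e-value, then split them into Venn diagram regions) can be sketched as follows. This is an illustration under stated assumptions, not the authors' script: the function names, the hit tuples, and the gene names are hypothetical. The default cutoff reproduces the abstract's `10e-5` literally, which equals 1e-4; it is exposed as a parameter because the intended value may have been 1e-5.

```python
def significant_hits(hits, max_evalue=10e-5):
    """Return the names of database genes hit with an e-value below the cutoff.

    hits: iterable of (gene_name, evalue) pairs, e.g. parsed from BLASTP output.
    """
    return {gene for gene, evalue in hits if evalue < max_evalue}


def venn_regions(mild_hits, severe_hits, max_evalue=10e-5):
    """Split significant hits into the three regions of a two-set Venn diagram."""
    mild = significant_hits(mild_hits, max_evalue)
    severe = significant_hits(severe_hits, max_evalue)
    return {
        "mild_only": mild - severe,
        "shared": mild & severe,
        "severe_only": severe - mild,
    }


# Hypothetical BLASTP results for the two isolates.
mild_hits = [("vfA", 1e-30), ("vfB", 1e-8), ("vfC", 0.5)]
severe_hits = [("vfB", 1e-12), ("vfD", 1e-6), ("vfC", 1e-2)]

regions = venn_regions(mild_hits, severe_hits)
for name, genes in regions.items():
    print(name, sorted(genes))
```

In this toy input, `vfC` falls outside the cutoff in both isolates and so appears in no region. The `mild_only` and `severe_only` sets correspond to the isolate-specific genes the abstract counts.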

Complete Chloroplast Genome assembly and Annotation of Milk Thistle (Silybum marianum) and Phylogenetic Analysis

  • Hwajin Jung;Yedomon Ange Bovys Zoclanclounon;Jeongwoo Lee;Taeho Lee;Jeonggu Kim;Guhwang Park;Keunpyo Lee;Kwanghoon An;Jeehyoung Shim;Joonghyoun Chin;Suyoung Hong
    • Proceedings of the Korean Society of Crop Science Conference
    • /
    • Korean Society of Crop Science 2022 Autumn Conference
    • /
    • pp.210-210
    • /
    • 2022
  • Silybum marianum is an annual or biennial plant of the Asteraceae family. It can grow in low-nutrient soil and under drought conditions, making it easy to cultivate. The seed produces a specialized plant metabolite called silymarin (a flavonolignan complex), which is known to protect the liver against damage from hepatitis and toxins. To infer the phylogenetic placement of a Korean milk thistle, we assembled and annotated its chloroplast genome and compared it with the existing Chinese reference genome (NC_028027). The chloroplast genome structure was highly similar, with assembly sizes of 152,642 bp and 153,202 bp for the Korean and Chinese milk thistle, respectively. There were also similarities at the gene level: coding sequences (n = 82), transfer RNAs (n = 31), and ribosomal RNAs (n = 4). Phylogenetic inference from the full set of coding sequences placed the Korean cultivar in the milk thistle clade, corroborating the expected tree. Moreover, a tree based only on the ycf1 gene recovered the same topology, suggesting that ycf1 is a potential marker for DNA barcoding and population diversity studies in the milk thistle genus. Overall, the data provided are a valuable resource for population genomics and species-level identification, since several species have been reported in the genus Silybum.

In-silico characterization and structure-based functional annotation of a hypothetical protein from Campylobacter jejuni involved in propionate catabolism

  • Mazumder, Lincon;Hasan, Mehedi;Rus’d, Ahmed Abu;Islam, Mohammad Ariful
    • Genomics & Informatics
    • /
    • Vol. 19, No. 4
    • /
    • pp.43.1-43.12
    • /
    • 2021
  • Campylobacter jejuni is one of the most prevalent organisms associated with foodborne illness across the globe, causing campylobacteriosis and gastritis. Many proteins of C. jejuni are still unidentified. The purpose of this study was to determine the structure and function of a non-annotated hypothetical protein (HP) from C. jejuni. Several properties of the HP (accession No. CAG2129885.1), including its physicochemical characteristics, 3D structure, and functional annotation, were predicted using various bioinformatics tools, followed by further validation and quality assessment. Moreover, the protein-protein interactions and active site were obtained from the STRING and CASTp servers, respectively. The hypothetical protein possesses various characteristics, including an acidic pH, thermal stability, water solubility, and cytoplasmic distribution. Alpha-helices and random coils are its most prominent structural components. The 3D model was of the expected quality and was found to be novel. This study has identified a potential role of the HP in the 2-methylcitric acid cycle and propionate catabolism. Furthermore, protein-protein interactions revealed several significant functional partners. The in-silico characterization of this protein will help clarify its molecular mechanism of action. The methodology of this study could also serve as a basis for additional research into proteomic and genomic data to identify functional potential.

Identification of long non-coding RNA-mRNA interactions and genome-wide lncRNA annotation in animal transcriptome profiling

  • Yoon-Been Park;Jun-Mo Kim
    • Journal of Animal Science and Technology
    • /
    • Vol. 65, No. 2
    • /
    • pp.293-310
    • /
    • 2023
  • Analysis of protein-coding mRNA has been used extensively to determine the function of various traits in animals. Non-coding RNA (ncRNA), once considered non-functional because it is not translated into protein, has been re-examined as studies have shown that it does have functions. One type of ncRNA, long non-coding RNA (lncRNA), is known to regulate mRNA expression, and its importance is increasingly recognized. lncRNAs are therefore being used to understand the traits of various animals as well as human diseases. However, studies on lncRNA annotation and function are still lacking in most animals other than humans and mice. lncRNAs have unique characteristics and interact with mRNA through various mechanisms. To annotate lncRNAs in animals in the future, it is essential to understand these characteristics and the mechanisms by which lncRNAs function. This understanding will also allow lncRNAs to be used for a wider variety of traits in a wider range of animals, and integrated analysis with other biological information is expected to become possible.

Visualizing the phenotype diversity: a case study of Alexander disease

  • Dohi, Eisuke;Bangash, Ali Haider
    • Genomics & Informatics
    • /
    • Vol. 19, No. 3
    • /
    • pp.28.1-28.4
    • /
    • 2021
  • Since only a small number of patients have any given rare disease, it is difficult to identify all of the features of these diseases. This is especially true for patients with uncommon presentations of rare diseases. It can also be difficult for patients, their families, and even clinicians to know which of several disease phenotypes a patient is exhibiting. To address this issue, during Biomedical Linked Annotation Hackathon 7 (BLAH7), we tried to extract Alexander disease patient data from documents in Portable Document Format. We then visualized the phenotypic diversity of those Alexander disease patients with uncommon presentations. This led us to identify several issues that we need to overcome in our future work.

Current status and prospects of kiwifruit (Actinidia chinensis) genomics

  • 김성철;김호방;좌재호;송관정
    • Journal of Plant Biotechnology
    • /
    • Vol. 42, No. 4
    • /
    • pp.342-349
    • /
    • 2015
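  • Kiwifruit is a new fruit crop that has been commercialized worldwide since the 1970s and whose cultivation has recently been expanding rapidly; cultivation and consumption are also increasing sharply in Korea. Kiwifruit is a dioecious, deciduous vine with hairy fruit skin, diverse flesh colors, and varied ploidy levels, yet the commercial cultivar composition is very simple. Driven by evolutionary and biological interest in its distinctive botanical traits, as well as by the need to develop diverse cultivars efficiently, research on genome analysis and its applications has recently become active. The kiwifruit draft genome sequence and chloroplast sequence were decoded on the Illumina HiSeq platform in 2013 and 2015, respectively, and gene annotation work is ongoing. Transcriptome analysis has shifted from the earlier EST-based approach to RNA-seq, with research focusing on the regulation of gene expression and on gene discovery related to ascorbic acid biosynthesis in the fruit, flesh color development and ripening, and resistance of the tree to bacterial canker. In molecular marker development and genetic map construction aimed at improving the efficiency of conventional breeding, work has moved away from RFLP-, RAPD-, and AFLP-based studies toward developing SSR- and SNP-based molecular markers linked to agriculturally important traits and constructing high-density genetic maps from NGS-based genome and transcriptome data. However, research in Korea is still proceeding at a limited level. Kiwifruit genome and transcriptome research is expected to be applied in practice to molecular breeding in the near future.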

A replication study of genome-wide CNV association for hepatic biomarkers identifies nine genes associated with liver function

  • Kim, Hyo-Young;Byun, Mi-Jeong;Kim, Hee-Bal
    • BMB Reports
    • /
    • Vol. 44, No. 9
    • /
    • pp.578-583
    • /
    • 2011
  • Aspartate aminotransferase (AST) and alanine aminotransferase (ALT) are biochemical markers used to test for liver diseases. Copy number variation (CNV) plays an important role in determining complex traits and is an emerging area in the study of various diseases. We performed a genome-wide association study of the liver function biomarkers AST and ALT in 407 unrelated Koreans. We assayed genome-wide variation on an Affymetrix Genome-Wide 6.0 array, and CNVs were analyzed using HelixTree. Using simple linear regression, 32 and 42 CNVs were significantly associated with AST and ALT, respectively (P value < 0.05). We compared CNV-based genes between the current study (KARE2; AST-140, ALT-172) and KARE1 (AST-1885, ALT-773) using NetBox. Nine genes (CIDEB, DFFA, PSMA3, PSMC5, PSMC6, PSMD12, PSMF1, SDC4, and SIAH1) overlapped for AST, but no overlapping genes were found for ALT. Functional gene annotation analysis highlighted the proteasome pathway, Wnt signaling pathway, programmed cell death, and protein binding.