• Title/Summary/Keyword: Whole genome

Prediction of an Essential Gene with Potential Drug Target Property in Streptococcus suis Using Comparative Genomics

  • Zaman, Aubhishek
    • Interdisciplinary Bio Central / v.4 no.4 / pp.11.1-11.8 / 2012
  • Genes that are indispensable for survival are referred to as essential genes. Because of their importance to cellular activity, they are potential drug targets. In this study, an essential gene of Streptococcus suis was predicted using statistical analysis and a computational genome comparison method. First, a whole-genome protein scatter plot was generated, and a reference genome was then chosen on the basis of statistical significance. The criteria for selecting the reference genome were that the query (Streptococcus suis) and subject genomes must belong to the same genus yet differ to a substantial degree. Streptococcus pneumoniae was found to be suitable as the reference genome. A whole-genome comparison was performed between the reference (Streptococcus pneumoniae) and the query (Streptococcus suis), and 14 conserved proteins were screened for potential essential gene properties. Among those 14, only one essential gene showed a high similarity score between reference and query. This gene encodes a type of 'Clp protease'. Clp proteases play major roles in degrading misfolded proteins. These results should help in formulating a drug against Streptococcus suis, which is responsible for mild to severe clinical conditions in humans. However, as with many other computational studies, the findings must be further validated through in vitro assays.

Verifying Orthologous Paralogenes using Whole Genome Alignment

  • Chan, P.Y.;Lam, T.W.;Yiu, S.M.
    • Proceedings of the Korean Society for Bioinformatics Conference / 2005.09a / pp.109-112 / 2005
  • Identifying orthologous paralogenes is a fundamental problem in comparative genomics and can facilitate the study of the evolutionary history of species. Existing approaches for locating paralogs use local-alignment-based algorithms such as BLAST. However, there are cases in which genes with high alignment scores are not paralogenes. Whole genome alignment tools, on the other hand, are designed to locate orthologs. Most of these tools rely on unique substrings (called anchors) in the corresponding orthologous pair to identify them. Intuitively, these tools may not seem useful for identifying orthologous paralogenes, because paralogenes are very similar and there may not be enough unique anchors. However, our study shows that this is not true. Although paralogenes are similar, they have undergone different mutations, so there are enough unique anchors to identify them. Our contributions are as follows. Based on this counter-intuitive finding, we propose employing whole genome alignment tools to help verify paralogenes. Our experiments on five pairs of human-mouse chromosomes show that our approach is effective and can identify most of the misclassified paralog groups (more than 80%). We also verify, through a simulation study, our finding that whole genome alignment tools are able to locate orthologous paralogenes. The result of the simulation confirms our finding.
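
The anchor idea mentioned in this abstract can be illustrated with a minimal sketch. It assumes an anchor is a k-mer that occurs exactly once in each of two sequences; this is a simplification for illustration, not the tool or algorithm used by the authors, and the function names `unique_kmers` and `find_anchors` are hypothetical.

```python
from collections import Counter


def unique_kmers(seq: str, k: int) -> dict[str, int]:
    """Map each k-mer that occurs exactly once in seq to its start position."""
    kmers = [seq[i:i + k] for i in range(len(seq) - k + 1)]
    counts = Counter(kmers)
    return {kmer: i for i, kmer in enumerate(kmers) if counts[kmer] == 1}


def find_anchors(a: str, b: str, k: int) -> list[tuple[int, int, str]]:
    """Return (position in a, position in b, k-mer) for k-mers unique in both."""
    unique_a = unique_kmers(a, k)
    unique_b = unique_kmers(b, k)
    shared = unique_a.keys() & unique_b.keys()
    return sorted((unique_a[kmer], unique_b[kmer], kmer) for kmer in shared)


if __name__ == "__main__":
    # Toy sequences: "ACGT" is repeated in the first one, so it cannot anchor.
    for anchor in find_anchors("ACGTACGTTTGA", "GGACGTTTGACC", k=4):
        print(anchor)
```

In this toy case the repeated k-mer is excluded, while k-mers in the diverged region still serve as anchors, which mirrors the abstract's point that similar copies can remain distinguishable once they have accumulated different mutations.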

Current status of whole-genome sequences of Korean angiosperms

  • Jongsun PARK;Yunho YUN;Hong XI;Woochan KWON;Janghyuk SON
    • Korean Journal of Plant Taxonomy / v.53 no.3 / pp.181-200 / 2023
  • Owing to the rapid development of sequencing technologies, more than 1,000 plant genomes have been sequenced and released. Among them, 69 Korean plant taxa (85 genome sequences) contain at least one whole-genome sequence despite the fact that some samples were not collected in Korea. The sequencing-by-synthesis method (next-generation sequencing) and the PacBio (third-generation sequencing) method were the most commonly used in studies appearing in 65 publications. Several scaffolding methods, such as the Hi-C and 10x types, have also been used for pseudo-chromosomal assembly. The most abundant families among the 69 taxa are Rosaceae (10 taxa), Brassicaceae (7 taxa), Fabaceae (7 taxa), and Poaceae (7 taxa). Due to the rapid release of plant genomes, it is necessary to assemble the current understanding of Korean plant species not only to understand their whole genomes as our own plant resources but also to establish new tools for utilizing plant resources efficiently with various analysis pipelines, including AI-based engines.

Development of InDel markers to identify Capsicum disease resistance using whole genome resequencing

  • Karna, Sandeep;Ahn, Yul-Kyun
    • Journal of Plant Biotechnology / v.45 no.3 / pp.228-235 / 2018
  • In this study, two pepper varieties, PRH1 (powdery mildew resistance line) and Saengryeg (powdery mildew resistance line), were resequenced using next generation sequencing technology in order to develop InDel markers. Genome-wide discovery of InDel variation was performed by comparing the whole-genome resequencing data of the two pepper varieties to the Capsicum annuum cv. CM334 reference genome. A total of 334,236 and 318,256 InDels were identified in PRH1 and Saengryeg, respectively. The greatest number of homozygous InDels was discovered on chromosome 1 in PRH1 (24,954) and on chromosome 10 in Saengryeg (29,552). Among these homozygous InDels, 19,094 and 4,885 InDels were distributed in the genic regions of PRH1 and Saengryeg, respectively, and 198,570 and 183,468 InDels were distributed in the intergenic regions. We identified 197,821 polymorphic InDels between PRH1 and Saengryeg. A total of 11,697 primer sets were generated, resulting in the discovery of four polymorphic InDel markers. These new markers will be used to identify disease resistance genotypes in breeding populations. Our results therefore represent a step forward in whole genome resequencing and add genetic resource datasets for pepper breeding research.

Complete Genome Sequence of Enterococcus faecalis CAUM157 Isolated from Raw Cow's Milk

  • Elnar, Arxel G.;Lim, Sang-Dong;Kim, Geun-Bae
    • Journal of Dairy Science and Biotechnology / v.38 no.3 / pp.142-145 / 2020
  • Enterococcus faecalis CAUM157, isolated from raw cow's milk, is a Gram-positive, facultatively anaerobic, and non-spore-forming bacterium capable of inhabiting a wide range of environmental niches. E. faecalis CAUM157 was observed to produce a two-peptide bacteriocin that had a wide range of activity against several pathogens, including Listeria monocytogenes, Staphylococcus aureus, and periodontitis-causing bacteria. The whole genome of E. faecalis CAUM157 was sequenced using the PacBio RS II platform, revealing a genome size of 2,972,812 bp with a G+C ratio of 37.44%, assembled into two contigs. Annotation analysis revealed 2,830 coding sequences, 12 rRNAs, and 61 tRNAs. Further, in silico analysis of the genome identified a single bacteriocin gene cluster.
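
A G+C figure such as the one reported here is a simple ratio over the assembled sequence. The sketch below shows one common way to compute it, assuming ambiguous bases (such as N) are excluded from the denominator; the function name `gc_percent` is hypothetical and the abstract does not state which convention the authors' tooling used.

```python
def gc_percent(seq: str) -> float:
    """Percentage of G and C among the unambiguous bases (A, C, G, T) in seq."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    if gc + at == 0:
        raise ValueError("sequence contains no A, C, G or T bases")
    return 100.0 * gc / (gc + at)


if __name__ == "__main__":
    # Toy sequence, not taken from any genome in this listing.
    print(round(gc_percent("ATGCGCNNAT"), 2))
```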

Complete genome sequence of Streptococcus hyointestinalis B19, a strain producing bacteriocin, isolated from chicken feces

  • Lee, Ju-Eun;Heo, Sunhak;Kim, Geun-Bae
    • Journal of Animal Science and Technology / v.62 no.3 / pp.420-422 / 2020
  • Streptococcus hyointestinalis B19 was isolated from chicken feces collected from a local farm in Anseong, Korea. S. hyointestinalis B19 was shown to produce bacteriocin-like compounds with inhibitory activity against several pathogens, including strains of Clostridium perfringens and Listeria monocytogenes. The whole genome of S. hyointestinalis B19 was sequenced using the PacBio RS II platform. The genome comprised four contigs with a total size of 2,217,061 bp. The DNA G + C content was 42.95 mol%. Annotation revealed 2,266 coding sequences (CDSs), 18 rRNAs, and 61 tRNA genes. Genome analysis showed that strain B19 possesses various genes associated with bacteriocin synthesis, modification, and transport.

Complete genome sequence of Bacillus coagulans CACC834 isolated from canine

  • Kim, Jung-Ae;Kim, Dae-Hyuk;Kim, Yangseon
    • Journal of Animal Science and Technology / v.63 no.6 / pp.1464-1467 / 2021
  • Bacillus coagulans CACC 834 was isolated from canine feces, and its potential probiotic properties were characterized by functional genome analysis. Whole-genome sequencing of B. coagulans CACC 834 was performed using the PacBio RSII platform. The complete genome assembly consisted of one circular chromosome (3.1 Mb) with a guanine (G) + cytosine (C) content of 47.1%. Annotation revealed 3,181 protein-coding sequences (CDSs), 30 rRNAs, and 83 tRNAs. About 11% of the genes were involved in replication, recombination, and repair. We also annotated various stress-related, acid resistance, bile salt resistance, and adhesion-related domains in this strain, which likely support its probiotic action by enabling survival in the gastrointestinal tract. These results add to our understanding of B. coagulans and suggest potential mammal-related industrial applications.

High Resolution Whole Genome Multilocus Sequence Typing (wgMLST) Schemes for Salmonella enterica Weltevreden Epidemiologic Investigations

  • Tadee, Pakpoom;Tadee, Phacharaporn;Hitchings, Matthew D.;Pascoe, Ben;Sheppard, Samuel K.;Patchanee, Prapas
    • Microbiology and Biotechnology Letters / v.46 no.2 / pp.162-170 / 2018
  • Non-typhoidal Salmonella is one of the main pathogens causing food-borne illness in humans, with up to 20% of cases resulting from consumption of pork products. Beyond gastroenteritis, multidrug-resistant Salmonella has emerged. In this study, phenotypically pan-susceptible strains of Salmonella enterica serotype Weltevreden recovered from the pig production chain in Chiang Mai, Thailand during 2012-2014 were chosen for analysis. The aim was to use whole genome sequencing (WGS) data, with an emphasis on antimicrobial resistance genes, to assess their pathogenic potential, and to determine genetic diversity based on whole genome Multilocus Sequence Typing (wgMLST), in order to expand epidemiological knowledge and provide additional guidance for disease control. Analysis of the WGS data using ResFinder 3.0 found that one of the phenotypically pan-susceptible strains carried five classes of resistance genes: aminoglycoside, beta-lactam, phenicol, sulfonamide, and tetracycline associated genes. Twenty-four and 36 locus differences were detected by core genome Multilocus Sequence Typing (cgMLST) and pan genome Multilocus Sequence Typing (pgMLST), respectively, in two pairs of strains (44/13 vs A543057 and A543056 vs 204/13) that had initially been assigned as matching by conventional MLST and Pulsed-field Gel Electrophoresis (PFGE). One hundred percent discriminatory power can be achieved using the wgMLST technique. WGS is currently the ultimate molecular technique for various in-depth studies. As these findings indicate, a new era of a "gold standard typing method" for routine genome studies is beginning.
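
The locus differences reported in this abstract come from comparing allelic profiles between strains. As a rough sketch of that kind of comparison, assume each profile maps a locus name to an allele number, with `None` marking a locus that could not be called. The function name `locus_differences`, the locus names, and the toy profiles are all hypothetical, and real cgMLST or wgMLST schemes may treat missing loci differently.

```python
from typing import Optional

Profile = dict[str, Optional[int]]


def locus_differences(profile_a: Profile, profile_b: Profile) -> int:
    """Count loci called in both profiles whose allele numbers differ."""
    shared = profile_a.keys() & profile_b.keys()
    return sum(
        1
        for locus in shared
        if profile_a[locus] is not None
        and profile_b[locus] is not None
        and profile_a[locus] != profile_b[locus]
    )


if __name__ == "__main__":
    # Toy profiles, not real typing data from the study.
    strain_x = {"locus1": 1, "locus2": 4, "locus3": 2, "locus4": None}
    strain_y = {"locus1": 1, "locus2": 5, "locus3": 3, "locus4": 7, "locus5": 9}
    print(locus_differences(strain_x, strain_y))
```

Because only loci present in both profiles are compared, adding more loci to the scheme (core genome, then pan genome) can only reveal more differences between two strains, never fewer.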

Analysis of genome variants in dwarf soybean lines obtained in F6 derived from cross of normal parents (cultivated and wild soybean)

  • Roy, Neha Samir;Ban, Yong-Wook;Yoo, Hana;Ramekar, Rahul Vasudeo;Cheong, Eun Ju;Park, Nam-Il;Na, Jong Kuk;Park, Kyong-Cheul;Choi, Ik-Young
    • Genomics & Informatics / v.19 no.2 / pp.19.1-19.9 / 2021
  • Plant height is an important component of plant architecture and significantly affects crop breeding practices and yield. We studied DNA variations derived from F5 recombinant inbred lines (RILs) with 96.8% homozygous genotypes. Here, we report DNA variations between the normal and dwarf members of four lines harvested from a single seed parent in an F6 RIL population derived from a cross between Glycine max var. Peking and Glycine soja IT182936. Whole genome sequencing was carried out, and the DNA variations in the whole genome were compared between the normal and dwarf samples. We found a large number of DNA variations in both the dwarf and semi-dwarf lines, with one single nucleotide polymorphism (SNP) per at least 3.68 kb in the dwarf lines and 1 SNP per 11.13 kb of the whole genome. This value is 2.18 times higher than the expected DNA variation in the F6 population. A total of 186 SNPs and 241 SNPs were discovered in the coding regions of the dwarf lines 1282 and 1303, respectively, and we discovered 33 homogeneous nonsynonymous SNPs that occurred at the same loci in each set of dwarf and normal soybean. Of them, five SNPs were in the same positions between lines 1282 and 1303. Our results provide important information for improving our understanding of the genetics of soybean plant height and crop breeding. These polymorphisms could be useful genetic resources for plant breeders, geneticists, and biologists for future molecular biology and breeding projects.
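
For context on the homozygosity figure quoted in this abstract, the textbook expectation under repeated self-fertilization can be sketched as follows. The sketch assumes an idealized model: a fully heterozygous F1 at the loci of interest, strict selfing in every generation, and no selection. The function name `expected_homozygosity` is hypothetical, and this is not the authors' calculation.

```python
def expected_homozygosity(generation: int) -> float:
    """Expected fraction of homozygous loci in the F<generation> of a selfed cross.

    Heterozygosity halves with each round of selfing, starting from 1.0 in the F1.
    """
    if generation < 1:
        raise ValueError("generation must be 1 (F1) or greater")
    return 1.0 - 0.5 ** (generation - 1)


if __name__ == "__main__":
    for n in range(1, 7):
        print(f"F{n}: {expected_homozygosity(n):.5f}")
```

Under this model the remaining heterozygous fraction, which is the part still free to segregate within a line, shrinks by half per generation.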