• Title/Summary/Keyword: Genome-to-genome distance


A note on the distance distribution paradigm for Mosaab-metric to process segmented genomes of influenza virus

  • Daoud, Mosaab
    • Genomics & Informatics / v.18 no.1 / pp.7.1-7.7 / 2020
  • In this paper, we present a few technical notes on the distance distribution paradigm for the Mosaab-metric, using 1-, 2-, and 3-gram feature extraction techniques to analyze composite data points in high-dimensional feature spaces. This technical analysis will help specialists in bioinformatics and biotechnology to explore in depth the biodiversity of the influenza virus genome as a composite data point. Various technical examples are presented, and the integrated statistical learning pipeline for processing segmented genomes of influenza virus is illustrated as a sequential-parallel computational pipeline.
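The n-gram feature extraction step mentioned in this abstract can be illustrated with a minimal sketch. It assumes a DNA alphabet of A, C, G, and T, relative-frequency features, and one feature vector per genome segment; the function names and the example segments are hypothetical, and the sketch does not implement the Mosaab-metric itself.

```python
from collections import Counter
from itertools import product

ALPHABET = "ACGT"  # assumption: unambiguous DNA bases only


def ngram_features(sequence, n):
    """Return the relative frequency of every possible n-gram, in a fixed order."""
    sequence = sequence.upper()
    counts = Counter(sequence[i:i + n] for i in range(len(sequence) - n + 1))
    keys = ["".join(p) for p in product(ALPHABET, repeat=n)]
    total = sum(counts[k] for k in keys)  # n-grams with other symbols are ignored
    return [counts[k] / total if total else 0.0 for k in keys]


def segmented_genome_features(segments, n):
    """Treat a segmented genome as a composite data point: one vector per segment."""
    return [ngram_features(segment, n) for segment in segments]


# Made-up segments, for illustration only.
genome = ["ATGGAGAGAATAAAAG", "ATGGATGTCAATCCGA"]
features = segmented_genome_features(genome, 2)
```

With n = 1, 2, or 3 the vectors have 4, 16, or 64 components, which is why the feature space grows quickly with n.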

Genome-based identification of strain KCOM 1265 isolated from subgingival plaque at the species level

  • Park, Soon-Nang;Lim, Yun Kyong;Kook, Joong-Ki
    • International Journal of Oral Biology / v.45 no.2 / pp.70-75 / 2020
  • The aim of this study was to identify strain KCOM 1265, isolated from subgingival plaque, at the species level by comparing 16S ribosomal RNA gene (16S rDNA) and genome sequences. The whole genome of strain KCOM 1265 was extracted using the phenol-chloroform extraction method. 16S rDNA was amplified using polymerase chain reaction and sequenced using the dideoxy chain termination method. Pairwise genome comparison was performed using average nucleotide identity (ANI) and genome-to-genome distance (GGD) analyses. The 16S rDNA sequence of strain KCOM 1265 showed 99.6% similarity to those of Fusobacterium polymorphum ATCC 10953T and Fusobacterium hwasookii KCOM 1249T. The ANI values of strain KCOM 1265 with F. polymorphum ATCC 10953T and F. hwasookii KCOM 1249T were 95.8% and 93.0%, respectively. The GGD values of strain KCOM 1265 with F. polymorphum ATCC 10953T and F. hwasookii KCOM 1249T were 63.9% and 49.6%, respectively. These results indicate that strain KCOM 1265 belongs to F. polymorphum.
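The pairwise comparison reported in this abstract can be read as a simple ranking, sketched below. The ANI and GGD values are those given in the abstract; the data layout and the function name are hypothetical, and the abstract does not state which cutoff values, if any, the authors applied, so none are assumed here.

```python
# ANI and GGD values (percent) of strain KCOM 1265 against each type strain,
# as reported in the abstract.
comparisons = {
    "F. polymorphum ATCC 10953T": {"ani": 95.8, "ggd": 63.9},
    "F. hwasookii KCOM 1249T": {"ani": 93.0, "ggd": 49.6},
}


def closest_reference(comparisons):
    """Return the reference ranked first by both ANI and GGD, or None if they disagree."""
    best_by_ani = max(comparisons, key=lambda name: comparisons[name]["ani"])
    best_by_ggd = max(comparisons, key=lambda name: comparisons[name]["ggd"])
    return best_by_ani if best_by_ani == best_by_ggd else None


best = closest_reference(comparisons)
print(best)
```

Both measures rank the same type strain first, which is consistent with the conclusion stated in the abstract.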

Complete genome sequence of Fusobacterium vincentii KCOM 2931 isolated from a human periodontitis lesion (사람 치주염 병소에서 분리된 Fusobacterium vincentii KCOM 2931의 유전체 염기서열 해독)

  • Park, Soon-Nang;Lim, Yun Kyong;Shin, Ja Young;Roh, Hanseong;Kook, Joong-Ki
    • Korean Journal of Microbiology / v.54 no.1 / pp.74-76 / 2018
  • Recently, Fusobacterium nucleatum subsp. vincentii was reclassified as Fusobacterium vincentii based on average nucleotide identity and genome-to-genome distance analyses. F. vincentii is a Gram-negative, anaerobic, filament-shaped bacterium. It is a member of the normal flora of the human oral cavity and plays a role in periodontal diseases. F. vincentii KCOM 2931 was isolated from a periodontitis lesion. Here, we present the complete genome sequence of F. vincentii KCOM 2931.

Draft genome sequence of Fusobacterium polymorphum KCOM 1001 isolated from a human subgingival dental plaque of gingivitis lesion (사람 치은염 병소 치은연하치면 세균막에서 분리된 Fusobacterium polymorphum KCOM 1001의 유전체 염기서열 해독)

  • Park, Soon-Nang;Lim, Yun Kyong;Shin, Ja Young;Roh, Hanseong;Kook, Joong-Ki
    • Korean Journal of Microbiology / v.54 no.1 / pp.71-73 / 2018
  • Recently, Fusobacterium nucleatum subsp. polymorphum was reclassified as Fusobacterium polymorphum based on average nucleotide identity and genome-to-genome distance analyses. F. polymorphum is a Gram-negative, anaerobic, filament-shaped bacterium. It is part of the normal flora of the oral cavity and a causative agent of periodontal diseases. F. polymorphum KCOM 1001 (= ChDC F119) was isolated from human subgingival plaque of a gingivitis lesion. Here, we present the complete genome sequence of F. polymorphum KCOM 1001.

Detecting outliers in segmented genomes of flu virus using an alignment-free approach

  • Daoud, Mosaab
    • Genomics & Informatics / v.18 no.1 / pp.2.1-2.11 / 2020
  • In this paper, we propose a new approach to detecting outliers in a set of segmented genomes of the flu virus, a data set with a heterogeneous set of sequences. The approach has the following computational phases: feature extraction, which maps the sequences into a feature space; an alignment-free distance measure, which quantifies the distance between any two segmented genomes; and a mapping into distance space to analyze a quantum of distance values. The approach is implemented in supervised and unsupervised learning modes. The experiments show that it is robust in detecting outliers among segmented genomes of the flu virus.
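The three phases described in this abstract (feature extraction, alignment-free distance, analysis in distance space) can be sketched in an unsupervised form as follows. This is an illustrative stand-in and not the paper's method: it assumes base composition as the feature map, a sum of per-segment Euclidean distances as the genome distance, and a median-plus-MAD rule on each genome's mean distance to the others. All names and the example genomes are hypothetical.

```python
import math
import statistics


def composition(segment):
    """Feature extraction: relative frequency of A, C, G, T in one segment."""
    segment = segment.upper()
    return [segment.count(base) / len(segment) for base in "ACGT"]


def genome_distance(genome_a, genome_b):
    """Alignment-free distance: sum of Euclidean distances over paired segments."""
    return sum(
        math.dist(composition(a), composition(b))
        for a, b in zip(genome_a, genome_b)
    )


def find_outliers(genomes, k=3.0):
    """Flag genomes whose mean distance to the others exceeds median + k * MAD."""
    scores = []
    for i, genome in enumerate(genomes):
        others = [genome_distance(genome, h) for j, h in enumerate(genomes) if j != i]
        scores.append(sum(others) / len(others))
    median = statistics.median(scores)
    mad = statistics.median([abs(s - median) for s in scores])
    return [i for i, s in enumerate(scores) if s > median + k * mad]


# Made-up two-segment genomes; the last one has a very different composition.
genomes = [
    ["AACCGGTT", "ACGTACGT"],
    ["AACCGGTA", "ACGTACGA"],
    ["AACCGGTC", "ACGTACGT"],
    ["AAAAAAAA", "AAAAAAAT"],
]
outliers = find_outliers(genomes)
```

The multiplier `k` is a tuning assumption; a smaller value flags more genomes.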

The complete mitochondrial genome sequence of the indigenous I pig (Sus scrofa) in Vietnam

  • Nguyen, Hieu Duc;Bui, Tuan Anh;Nguyen, Phuong Thanh;Kim, Oanh Thi Phuong;Vo, Thuy Thi Bich
    • Asian-Australasian Journal of Animal Sciences / v.30 no.7 / pp.930-937 / 2017
  • Objective: The I pig is a long-established breed in Vietnam and represents an excellent indigenous genetic resource. However, after the 1970s, the I pig population became small because of decreasing farming areas and increasing pressure from foreign breeds with a high growth rate. Thus, the I pig breed is now at risk of disappearing. The aim of this study was to classify and identify the genetic origin of the I pig and to supply molecular markers for conservation activities. Methods: This study sequenced the complete mitochondrial genome and used the sequencing result to analyze the phylogenetic relationship of the I pig with Asian and European domestic pigs and wild boars. The full sequence was annotated, and the secondary structures of the tRNAs were predicted. Results: The I pig mitochondrial genome (accession number KX094894) was 16,731 base pairs long and comprised two rRNA (12S and 16S), 22 tRNA, and 13 mRNA genes. The annotated structure did not differ from that of other pig breeds. Composition indexes such as AT content, GC, and AT skew were calculated; the AT content (60.09%) was lower than in other pigs. Phylogenetic trees were built from the full sequence and the D-loop sequence using the Bayesian method. The results showed that the I pig, Banna mini, wild boar (WB) Vietnam, and WB Hainan or WB Korea, WB Japan formed a cluster. They formed a group within the Asian clade distinct from Chinese pigs and other Asian breeds in both phylogenetic trees (0.0004 and 0.0057, respectively). Conclusion: These results were similar to those of a previous phylogenetic study of Vietnamese pigs and showed the genetic distinctness of the I pig from other Asian domestic pigs.
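The composition indexes named in the Results can be computed as in the sketch below. It uses the common definitions AT skew = (A - T) / (A + T) and GC skew = (G - C) / (G + C); the abstract does not say whether "GC" refers to GC content or GC skew, so both are returned. The function name and the toy sequence are hypothetical.

```python
def composition_indexes(sequence):
    """Return AT/GC content (percent) and AT/GC skew for a DNA sequence."""
    seq = sequence.upper()
    a, c, g, t = (seq.count(base) for base in "ACGT")
    total = a + c + g + t  # ambiguous symbols such as N are ignored
    if total == 0:
        raise ValueError("sequence contains no A, C, G, or T")
    return {
        "at_content": 100.0 * (a + t) / total,
        "gc_content": 100.0 * (g + c) / total,
        # Skews are undefined when the denominator is zero; report 0.0 in that case.
        "at_skew": (a - t) / (a + t) if (a + t) else 0.0,
        "gc_skew": (g - c) / (g + c) if (g + c) else 0.0,
    }


# Toy sequence for illustration, not the KX094894 record.
indexes = composition_indexes("AAATTGC")
```

Applied to a full mitochondrial sequence, `at_content` is the quantity reported as 60.09% in the abstract.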

Genome-Wide Association Study of Orthostatic Hypotension and Supine-Standing Blood Pressure Changes in Two Korean Populations

  • Hong, Kyung-Won;Kim, Sung Soo;Kim, Yeonjung
    • Genomics & Informatics / v.11 no.3 / pp.129-134 / 2013
  • Orthostatic hypotension (OH) is defined by a 20-mm Hg difference of systolic blood pressure (dtSBP) and/or a 10-mm Hg difference of diastolic blood pressure (dtDBP) between supine and standing, and OH is associated with a failure of the cardiovascular reflex to maintain blood pressure on standing from a supine position. To understand the underlying genetic factors for OH traits (OH, dtSBP, and dtDBP), genome-wide association studies (GWASs) using 333,651 single nucleotide polymorphisms (SNPs) were conducted separately for two population-based cohorts, Ansung (n = 3,173) and Ansan (n = 3,255). We identified 8 SNPs (5 SNPs for dtSBP and 3 SNPs for dtDBP) that were repeatedly associated in both the Ansung and Ansan cohorts and had p-values of < $1 \times 10^{-5}$ in the meta-analysis. Unfortunately, the SNPs of the OH case-control GWAS did not pass our p-value criteria. Four of the 8 SNPs were located in the intergenic region of chromosome 2, and the nearest gene (CTNNA2) was located at a distance of 1 Mb. CTNNA2 is a linker between cadherin adhesion receptors and the actin cytoskeleton and is essential for stabilizing dendritic spines in rodent hippocampal neurons. Although its function in blood pressure regulation has not been reported, hippocampal neurons interact primarily with the autonomic nervous system and might be related to OH. The remaining SNPs, rs7098785 of the dtSBP trait and rs6892553, rs16887217, and rs4959677 of the dtDBP trait, were located in the PIK3AP1 intron, ACTBL2-3' flanking, STAR intron, and intergenic region, respectively, but there was no clear functional link to blood pressure regulation.
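The OH definition at the start of this abstract can be written as a small rule, sketched below. It assumes that dtSBP and dtDBP are computed as supine minus standing pressure (a drop on standing) and that the 20 and 10 mm Hg thresholds are inclusive; neither detail is spelled out in the abstract, and the function name is hypothetical.

```python
def has_orthostatic_hypotension(supine_sbp, standing_sbp, supine_dbp, standing_dbp):
    """Apply the 20 mm Hg systolic and/or 10 mm Hg diastolic difference rule."""
    dt_sbp = supine_sbp - standing_sbp  # assumed sign convention: drop on standing
    dt_dbp = supine_dbp - standing_dbp
    return dt_sbp >= 20 or dt_dbp >= 10


# Illustrative readings in mm Hg, not study data.
print(has_orthostatic_hypotension(130, 105, 80, 78))  # systolic drop of 25
print(has_orthostatic_hypotension(120, 115, 80, 75))  # drops of 5 and 5
```

The same differences, kept as continuous values, correspond to the dtSBP and dtDBP traits analyzed in the GWAS.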

Subspecific Status of the Korean Tiger Inferred by Ancient DNA Analysis

  • Lee, Mu-Yeong;Hyun, Jee-Yun;Lee, Seo-Jin;An, Jung-Hwa;Lee, Eun-Ok;Min, Mi-Sook;Kimura, Junpei;Kawada, Shin-Ichiro;Kurihara, Nozomi;Luo, Shu-Jin;O'Brien, Stephen J.;Johnson, Warren E.;Lee, Hang
    • Animal Systematics, Evolution and Diversity / v.28 no.1 / pp.48-53 / 2012
  • The tiger population that once inhabited the Korean peninsula was initially considered a unique subspecies (Panthera tigris coreensis), distinct from the Amur tiger of the Russian Far East (P. t. altaica). However, in the following decades, the population of P. t. coreensis was classified as P. t. altaica, and the two populations have since been considered the same subspecies. From an ecological point of view, the classification of the Korean tiger population as P. t. altaica is a plausible conclusion. Historically, there were no major dispersal barriers between the Korean peninsula and the habitat of Amur tigers in Far Eastern Russia and northeastern China that might prevent gene flow, especially for a large carnivore with long-distance dispersal abilities. However, there has yet to be a genetic study to confirm the subspecific status of the Korean tiger. Bone samples from four tigers originally caught in the Korean peninsula were collected from two museums in Japan and the United States. Eight mitochondrial gene fragments were sequenced and compared to previously published mtDNA sequences of tiger subspecies to assess the phylogenetic relationship of the Korean tiger. Three individuals shared an identical haplotype with the Amur tigers. One specimen grouped with Malayan tigers, perhaps due to misidentification or mislabeling of the sample. Our results support the conclusion that the Korean tiger should be classified as P. t. altaica, which has important implications for the conservation and reintroduction of Korean tigers.

Draft Genome Sequences of a Unique t324-ST541-V Methicillin-Resistant Staphylococcus aureus Strain from a Pig

  • Moon, Dong Chan;Kim, Byung-Yong;Nam, Hyang-Mi;Jang, Geum-Chan;Jung, Suk-Chan;Lee, Hee-Soo;Park, Yong-Ho;Lim, Suk-Kyung
    • Journal of Microbiology and Biotechnology / v.26 no.4 / pp.799-805 / 2016
  • Methicillin-resistant Staphylococcus aureus (MRSA), the major causative agent of nosocomial infection, has also been reported from non-human sources. A sequence type (ST) 541 MRSA isolate designated K12PJN53 was isolated from a healthy pig in 2012. The genome of K12PJN53 consists of 44 contiguous sequences (contigs), totalling 2,880,108 bases with 32.88% GC content. Among the annotated contigs, 14, 17, and 18 contained genes related to antimicrobial resistance, adherence, and toxins, respectively. In terms of genomic distance, strain K12PJN53 was close to the ST398 strains. This is the first report of the draft genome sequence of a novel livestock-associated MRSA ST541 strain.

Predicting the Accuracy of Breeding Values Using High Density Genome Scans

  • Lee, Deuk-Hwan;Vasco, Daniel A.
    • Asian-Australasian Journal of Animal Sciences / v.24 no.2 / pp.162-172 / 2011
  • In this paper, simulation was used to determine the accuracies of genomic breeding values for polygenic traits associated with many thousands of markers obtained from high-density genome scans. The statistical approach was based on stochastically simulating a pedigree with a specified base population and a specified set of population parameters, including the effective and noneffective marker distances and generation time. For this population, marker and quantitative trait locus (QTL) genotypes were generated using either a single linkage group or a multiple linkage group model. Single nucleotide polymorphisms (SNPs) were simulated for an entire bovine genome (except for the sex chromosome, n = 29), including linkage and recombination. Individuals drawn from the simulated population with specified marker and QTL genotypes were randomly mated for ten generations to establish appropriate levels of linkage disequilibrium. Phenotype and genomic SNP data sets were obtained from individuals starting after two generations. Genetic prediction was accomplished by statistically modeling the genomic relationship matrix and standard BLUP methods. The effect of the number of linkage groups was also investigated to determine its influence on the accuracy of breeding values for genomic selection. When high-density scan data (0.08 cM marker distance) were used, accuracies of breeding values for juveniles of 0.60 and 0.82 were obtained for a low-heritability trait (0.10) and a high-heritability trait (0.50), respectively, in the single linkage group model. Estimates of 0.38 and 0.60 were obtained for the same cases in the multiple linkage group models. Unexpectedly, the use of BLUP regression methods across many chromosomes was found to reduce accuracy in breeding value determination. The reasons for this remain a target for further research, but Mendelian sampling may play a fundamental role in producing this effect.
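The genomic relationship matrix mentioned in this abstract can be built from SNP genotypes as in the sketch below. The abstract does not say which formulation the authors used; this sketch assumes a widely used one (VanRaden's first method), with genotypes coded as 0, 1, or 2 copies of an allele, columns centered by twice the allele frequency, and the cross-product scaled by 2 Σ p(1 - p). The function name and the toy genotypes are hypothetical.

```python
def genomic_relationship_matrix(genotypes):
    """Build G = Z Z' / (2 * sum(p * (1 - p))) from 0/1/2 genotype codes.

    genotypes: list of rows, one per individual, one column per SNP.
    Assumes at least one SNP is polymorphic, so the scaling term is non-zero.
    """
    n = len(genotypes)
    m = len(genotypes[0])
    # Allele frequency of each SNP estimated from the sample itself.
    freqs = [sum(row[j] for row in genotypes) / (2 * n) for j in range(m)]
    scale = 2 * sum(p * (1 - p) for p in freqs)
    # Center each genotype by its expected value 2p.
    centered = [[row[j] - 2 * freqs[j] for j in range(m)] for row in genotypes]
    return [
        [sum(a * b for a, b in zip(centered[i], centered[k])) / scale for k in range(n)]
        for i in range(n)
    ]


# Toy data: 3 individuals by 3 SNPs, for illustration only.
toy_genotypes = [[0, 1, 2], [1, 1, 0], [2, 0, 1]]
G = genomic_relationship_matrix(toy_genotypes)
```

In a BLUP setting, a matrix of this kind takes the place of the pedigree-based relationship matrix when predicting breeding values.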