• Title/Summary/Keyword: Long reads


Trimming conditions for DADA2 analysis in QIIME2 platform

  • Lee, Seo-Young;Yu, Yeuni;Chung, Jin;Na, Hee Sam
    • International Journal of Oral Biology
    • /
    • v.46 no.3
    • /
    • pp.146-153
    • /
    • 2021
  • Accurate identification of microbes facilitates the prediction, prevention, and treatment of human diseases. To increase the accuracy of microbiome data analysis, a long region of the 16S rRNA is commonly sequenced via paired-end sequencing. In paired-end sequencing, a sufficiently long overlapping region is required for effective joining of the reads, and high-quality sequencing reads are needed in the overlapping region. Trimming the distal part of each read, beyond the point where sequencing quality drops below a specific threshold, enhances the joining process. In this study, we examined the effect of trimming conditions on the number of reads that remained after quality control and chimera removal in Illumina paired-end reads of the V3-V4 hypervariable region. We also examined the alpha diversity and the taxa assigned under each trimming condition. Optimum quality trimming increased the number of good reads and the number of operational taxonomic units assigned. The pre-analysis trimming step has a great influence on further microbiome analysis, and optimized trimming conditions should be applied for Divisive Amplicon Denoising Algorithm 2 (DADA2) analysis on the QIIME2 platform.
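The trade-off described in this abstract (cutting low-quality read tails while keeping enough overlap to join the pair) can be illustrated with a minimal Python sketch. It is not the DADA2 or QIIME2 implementation; the function names, the Phred threshold of 20, and the example lengths are assumptions chosen only for illustration.

```python
def truncation_point(quals, threshold=20):
    """Index of the first base whose Phred score is below `threshold`.

    Everything from that index onward is treated as the low-quality
    distal part of the read. Returns len(quals) if no base falls below.
    """
    for i, q in enumerate(quals):
        if q < threshold:
            return i
    return len(quals)


def trim_read(seq, quals, threshold=20):
    """Keep only the bases before the first quality drop."""
    keep = truncation_point(quals, threshold)
    return seq[:keep], quals[:keep]


def expected_overlap(len_forward, len_reverse, amplicon_len):
    """Overlap (in bases) left between trimmed forward and reverse reads.

    A non-positive value means the pair can no longer be joined.
    """
    return len_forward + len_reverse - amplicon_len


if __name__ == "__main__":
    seq, quals = trim_read("ACGTACGT", [30, 30, 30, 30, 12, 30, 30, 30])
    print(seq, quals)
    # Hypothetical numbers: 270 + 220 bases kept on a 460-bp amplicon.
    print(expected_overlap(270, 220, 460))
```

Trimming more aggressively raises read quality but shrinks the value returned by `expected_overlap`, which is why the trimming condition has to be tuned rather than maximized.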

Birth of an 'Asian cool' reference genome: AK1

  • Kim, Changhoon
    • BMB Reports
    • /
    • v.49 no.12
    • /
    • pp.653-654
    • /
    • 2016
  • The human reference genome, maintained by the Genome Reference Consortium, is conceivably the most complete genome assembly ever, since its first construction. It has continually been improved by incorporating corrections made to the previous assemblies, thanks to various technological advances. Many currently-ongoing population sequencing projects have been based on this reference genome, heightening hopes of the development of useful medical applications of genomic information, thanks to the recent maturation of high-throughput sequencing technologies. However, just one reference genome does not fit all the populations across the globe, because of the large diversity in genomic structures and technical limitations inherent to short read sequencing methods. The recent success in de novo construction of the highly contiguous Asian diploid genome AK1, by combining single molecule technologies with routine sequencing data without resorting to traditional clone-by-clone sequencing and physical mapping, reveals the nature of genomic structure variation by detecting thousands of novel structural variations and by finally filling in some of the prior gaps which had persistently remained in the current human reference genome. Now it is expected that the AK1 genome, soon to be paired with more upcoming de novo assembled genomes, will provide a chance to explore what it is really like to use ancestry-specific reference genomes instead of hg19/hg38 for population genomics. This is a major step towards the furthering of genetically-based precision medicine.

Design of HMI software Interoperable with OPC-DA Server (OPC Server와 연동되는 HMI 소프트웨어 설계)

  • Cha, Jae-Pil;Jang, Dong-Wook;Jo, Sang-Hyun;Sun, Bok-Keun;Kim, Su-Hee;Han, Kwang-Rok
    • Proceedings of the IEEK Conference
    • /
    • 2006.06a
    • /
    • pp.801-802
    • /
    • 2006
  • This study aims to develop HMI software that runs on the Windows CE .NET platform without being bound to specific HMI equipment and that accesses data in PLC equipment through interoperation with an OPC-DA server. Because the OPC-DA server reads data in the PLC equipment, the HMI system does not need to be configured differently for different equipment. In addition, when the interface environment of specific equipment changes, neither new equipment nor a change of communication protocol is required. Because the HMI system runs on the Windows CE .NET platform, it can be configured using common embedded devices based on that platform. The HMI software reads data in PLC equipment through an RS-232C communication interface. In addition, because it connects to an OPC-DA server through an Ethernet communication interface, it can access data in PLC equipment as long as Ethernet is usable.


Storing Digital Information in Long-Read DNA

  • Ahn, TaeJin;Ban, Hamin;Park, Hyunsoo
    • Genomics & Informatics
    • /
    • v.16 no.4
    • /
    • pp.30.1-30.6
    • /
    • 2018
  • There is an urgent need for effective and cost-efficient data storage, as the worldwide requirement for data storage is rapidly growing. DNA has been introduced as a new tool for storing digital information. Recent studies have successfully stored digital information, such as text and GIF animation. Previous studies tackled technical hurdles due to errors from DNA synthesis and sequencing. Studies have also focused on a strategy that makes use of 100-150-bp read sizes in both synthesis and sequencing. In this paper, we suggest a novel data encoding/decoding scheme that makes use of long-read DNA (~1,000 bp). This enables accurate recovery of stored digital information with a smaller number of reads than the previous approach. This approach also reduces sequencing time.
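The abstract does not describe the authors' encoding scheme, so the following is only a generic illustration of the idea of mapping digital data onto nucleotides. It assumes the simplest possible code, two bits per base with no error correction; the mapping table and function names are hypothetical.

```python
BASES = "ACGT"  # assumed mapping: 00->A, 01->C, 10->G, 11->T


def encode(data: bytes) -> str:
    """Encode bytes as a DNA string, four bases per byte (2 bits per base)."""
    out = []
    for byte in data:
        for shift in (6, 4, 2, 0):  # most significant bit pair first
            out.append(BASES[(byte >> shift) & 0b11])
    return "".join(out)


def decode(dna: str) -> bytes:
    """Invert `encode`; the length must be a multiple of four bases."""
    if len(dna) % 4 != 0:
        raise ValueError("DNA length must be a multiple of 4")
    out = bytearray()
    for i in range(0, len(dna), 4):
        byte = 0
        for base in dna[i:i + 4]:
            byte = (byte << 2) | BASES.index(base)
        out.append(byte)
    return bytes(out)


if __name__ == "__main__":
    strand = encode(b"Hi")
    print(strand)
    print(decode(strand))
```

Under this naive code a ~1,000-bp molecule carries about 250 bytes, against roughly 25 to 37 bytes for a 100-150-bp fragment, which gives a rough sense of why longer molecules need fewer reads to recover the same payload.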

Analysis of unmapped regions associated with long deletions in Korean whole genome sequences based on short read data

  • Lee, Yuna;Park, Kiejung;Koh, Insong
    • Genomics & Informatics
    • /
    • v.17 no.4
    • /
    • pp.40.1-40.9
    • /
    • 2019
  • While studies aimed at detecting and analyzing indels or single nucleotide polymorphisms within human genomic sequences have been actively conducted, studies on detecting long insertions/deletions are not easy to orchestrate. For the last 10 years, the availability of long read data of human genomes from PacBio or Nanopore platforms has increased, which makes it easier to detect long insertions/deletions. However, because long read data have a critical disadvantage due to their relatively high cost, many next generation sequencing data are produced mainly by short read sequencing machines. Here, we constructed programs to detect so-called unmapped regions (UMRs, where no reads are mapped on the reference genome), scanned 40 Korean genomes to select UMR long deletion candidates, and compared the candidates with the long deletion break points within the genomes available from the 1000 Genomes Project (1KGP). An average of about 36,000 UMRs were found in the 40 Korean genomes tested, 284 UMRs were common across the 40 genomes, and a total of 37,943 UMRs were found. Compared with the 74,045 break points provided by the 1KGP, 30,698 UMRs overlapped. As the number of compared samples increased from 1 to 40, the number of UMRs that overlapped with the break points also increased. This eventually reached a peak of 80.9% of the total UMRs found in this study. As the total number of overlapped UMRs could probably grow to encompass 74,045 break points with the inclusion of more Korean genomes, this approach could be practically useful for studies on long deletions utilizing short read data.
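The UMR idea in this abstract (regions of the reference where no reads map, compared against known deletion break points) can be sketched in Python as follows. This is not the authors' program; the per-base depth list as input, the half-open coordinates, the `min_length` parameter, and the function names are all assumptions made for illustration.

```python
def find_unmapped_regions(depth, min_length=1):
    """Return half-open (start, end) intervals where read depth is zero.

    `depth` is a per-base coverage list for one reference sequence.
    Runs of zero coverage shorter than `min_length` are ignored.
    """
    regions = []
    start = None
    for pos, d in enumerate(depth):
        if d == 0:
            if start is None:
                start = pos  # a zero-coverage run begins here
        elif start is not None:
            if pos - start >= min_length:
                regions.append((start, pos))
            start = None
    # Close a run that extends to the end of the sequence.
    if start is not None and len(depth) - start >= min_length:
        regions.append((start, len(depth)))
    return regions


def overlaps_breakpoint(region, breakpoints):
    """True if any break point position falls inside the region."""
    start, end = region
    return any(start <= bp < end for bp in breakpoints)


if __name__ == "__main__":
    depth = [3, 0, 0, 0, 5, 0, 2, 0, 0]
    umrs = find_unmapped_regions(depth, min_length=2)
    print(umrs)
    print([overlaps_breakpoint(r, [2, 6]) for r in umrs])
```

Running the same scan over many genomes and pooling the regions is what lets the overlap with a break point catalogue grow as samples are added.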

Transcriptome-based identification of water-deficit stress responsive genes in the tea plant, Camellia sinensis

  • Tony, Maritim;Samson, Kamunya;Charles, Mwendia;Paul, Mireji;Richard, Muoki;Mark, Wamalwa;Stomeo, Francesca;Sarah, Schaack;Martina, Kyalo;Francis, Wachira
    • Journal of Plant Biotechnology
    • /
    • v.43 no.3
    • /
    • pp.302-310
    • /
    • 2016
  • A study aimed at identifying putative drought-responsive genes that confer tolerance to water-deficit stress in tea plants was conducted in a 'rain-out shelter' using potted plants. Eighteen-month-old drought-tolerant and drought-susceptible tea cultivars were each separately exposed to water stress or control conditions of 18 or 34% soil moisture content, respectively, for three months. After the treatment period, leaves were harvested from each treatment for isolation of RNA and cDNA synthesis. The cDNA libraries were sequenced on the Roche 454 high-throughput pyrosequencing platform to produce 232,853 reads. After quality control, the reads were assembled into 460 long transcripts (contigs). The annotated contigs showed similarity with proteins in the Arabidopsis thaliana proteome. The drought-related genes encoding heat shock proteins (HSP70), superoxide dismutase (SOD), catalase (cat), peroxidase (PoX), calmodulin-like protein (Cam7) and galactinol synthase (Gols4) were shown to be regulated differently in tea plants exposed to water stress. HSP70 and SOD were highly expressed in the drought-tolerant cultivar relative to the susceptible cultivar under drought conditions. The genes and pathways identified suggest efficient regulation leading to active adaptation as a basal defense response against water-deficit stress in tea. The knowledge generated can be further utilized to better understand the molecular mechanisms underlying stress tolerance in tea.

Generation of single stranded DNA with selective affinity to bovine spermatozoa

  • Vinod, Sivadasan Pathiyil;Vignesh, Rajamani;Priyanka, Mani;Tirumurugaan, Krishnaswamy Gopalan;Sivaselvam, Salem Nagalingam;Raj, Gopal Dhinakar
    • Animal Bioscience
    • /
    • v.34 no.10
    • /
    • pp.1579-1589
    • /
    • 2021
  • Objective: This study was conducted to generate single-stranded DNA oligonucleotides with selective affinity to bovine spermatozoa, assess their binding potential and explore their potential utility in trapping spermatozoa from suspensions. Methods: A combinatorial library of 94-mer oligonucleotides was used for systematic evolution of ligands by exponential enrichment (SELEX) with bovine spermatozoa. The amplicons from the sixth and seventh rounds of SELEX were sequenced, and the reads were clustered employing cluster database at high identity with tolerance (CD-HIT) and FASTAptamer. The enriched oligonucleotides were analyzed for secondary structures with Mfold and for motifs with Multiple Em for Motif Elicitation, and were 5' labelled with biotin/6-FAM to determine the binding potential and binding pattern. Results: We generated 14.1 and 17.7 million reads from the sixth and seventh rounds of SELEX against bovine spermatozoa, respectively. CD-HIT clustered 78,098 and 21,196 reads in the top ten clusters and FASTAptamer identified 2,195 and 4,405 unique sequences in the top three clusters from the sixth and seventh rounds, respectively. The identified oligonucleotides formed secondary structures with delta G values between -1.17 and -26.18 kcal/mol, indicating varied stability. Confocal imaging with the oligonucleotides from the seventh round revealed different patterns of binding to bovine spermatozoa (fluorescence of the whole head, spot of fluorescence in head and mid-piece and tail). Use of a 5'-biotin tagged oligonucleotide from the sixth round at 100 pmol with 4×10^6 spermatozoa could trap almost 80% from the suspension. Conclusion: The binding patterns and ability of the identified oligonucleotides confirm successful optimization of the SELEX process and generation of aptamers to bovine spermatozoa. These oligonucleotides provide a quick approach for selective capture of spermatozoa from complex samples. Future SELEX rounds with X- or Y-enriched sperm suspension will be used to generate oligonucleotides that bind to spermatozoa of a specific sex type.

Low-Power DTMB Deinterleaver Structure Using Buffer Transformation and Single-Pointer Register Structure (버퍼 변환과 단일 위치 레지스터 구조를 이용한 저전력 DTMB 디인터리버 구조)

  • Kang, Hyeong-Ju
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.15 no.5
    • /
    • pp.1135-1140
    • /
    • 2011
  • This paper proposes a DTMB deinterleaver structure that reduces SDRAM power consumption through buffer transformation and a single pointer-register structure. The DTMB deinterleaver, which uses deep interleaving for higher performance, consists of long delay buffers allocated in SDRAM. The conventional structure activates a new SDRAM row almost every time it reads or writes a datum. In the proposed structure, long buffers are transformed into several short buffers so that the number of row activations is reduced. The single pointer-register structure solves the problem of needing many pointer registers. The experimental results show that the SDRAM power consumption can be reduced to around 37% with a slight reduction in logic area.

A method of assisting small intestine capsule endoscopic lesion examination using artificial neural network (인공신경망을 이용한 소장 캡슐 내시경 병변 검사 보조 방법)

  • Wang, Tae-su;Kim, Minyoung;Jang, Jongwook
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference
    • /
    • 2022.10a
    • /
    • pp.2-5
    • /
    • 2022
  • Human organs have a complex structure. The small intestine in particular is about 7 m long, so conventional endoscopy is difficult and carries a high risk. The examination is currently performed with a capsule endoscope, and the examination time is very long. The doctor connects the retrieved storage device to a computer to store the patient's capsule endoscope images and reads them using a program, but a capsule endoscope examination produces a long image sequence that takes a lot of time to read. In addition, the small intestine has many curves due to villi, so occluded areas and light-and-shade variations are clearly visible in the images, and lesions and abnormal signs may be missed during the examination. In this paper, we provide a method of assisting small intestine capsule endoscopic lesion examination using artificial neural networks to shorten the doctor's image reading time and improve diagnostic reliability.


Toward Complete Bacterial Genome Sequencing Through the Combined Use of Multiple Next-Generation Sequencing Platforms

  • Jeong, Haeyoung;Lee, Dae-Hee;Ryu, Choong-Min;Park, Seung-Hwan
    • Journal of Microbiology and Biotechnology
    • /
    • v.26 no.1
    • /
    • pp.207-212
    • /
    • 2016
  • PacBio's long-read sequencing technologies can be successfully used for a complete bacterial genome assembly using recently developed non-hybrid assemblers in the absence of second-generation, high-quality short reads. However, standardized procedures that take into account multiple pre-existing second-generation sequencing platforms are scarce. In addition to Illumina HiSeq and Ion Torrent PGM-based genome sequencing results derived from previous studies, we generated further sequencing data, including from the PacBio RS II platform, and applied various bioinformatics tools to obtain complete genome assemblies for five bacterial strains. Our approach revealed that the hierarchical genome assembly process (HGAP) non-hybrid assembler resulted in nearly complete assemblies at a moderate coverage of ~75x, but that different versions produced non-compatible results requiring post processing. The other two platforms further improved the PacBio assembly through scaffolding and a final error correction.