• Title/Summary/Keyword: Next Generation Sequence

An Adaptive Workflow Scheduling Scheme Based on an Estimated Data Processing Rate for Next Generation Sequencing in Cloud Computing

  • Kim, Byungsang;Youn, Chan-Hyun;Park, Yong-Sung;Lee, Yonggyu;Choi, Wan
    • Journal of Information Processing Systems / v.8 no.4 / pp.555-566 / 2012
  • The cloud environment makes it possible to analyze large data sets on a scalable computing infrastructure. In the bioinformatics field, applications are composed of complex workflow tasks that require huge data storage as well as a computing-intensive parallel workload. Many distributed solutions have been introduced. However, they focus on static resource provisioning with a batch-processing scheme in a local computing farm and data storage. For a large-scale workflow system, it is inevitable and valuable to outsource all or part of its tasks to public clouds to reduce resource costs. Problems arise, however, from the transfer time for huge datasets and from the unbalanced completion times of different problem sizes. In this paper, we propose an adaptive resource-provisioning scheme that includes run-time data distribution and collection services for hiding the data transfer time. The proposed scheme optimizes the allocation ratio of computing elements to the different datasets in order to minimize the total makespan under resource constraints. We conducted experiments with a well-known sequence alignment algorithm, and the results showed that the proposed scheme is efficient for the cloud environment.
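The allocation idea in this abstract can be illustrated with a minimal sketch. It is not the authors' scheme: it assumes a simplified model in which every computing element processes data at the same estimated rate, a dataset's completion time is `size / (rate * elements)`, and data transfer time is ignored. The function name `allocate_elements` and its parameters are hypothetical.

```python
import heapq


def allocate_elements(sizes, rate, total_elements):
    """Greedily split computing elements across datasets to balance completion times.

    Assumed model (not from the paper): time_i = size_i / (rate * n_i).
    """
    if total_elements < len(sizes):
        raise ValueError("need at least one computing element per dataset")

    # Every dataset starts with one element.
    alloc = [1] * len(sizes)

    # Max-heap (negated keys) on the current estimated completion time.
    heap = [(-size / rate, i) for i, size in enumerate(sizes)]
    heapq.heapify(heap)

    # Hand each remaining element to the dataset that currently finishes last.
    for _ in range(total_elements - len(sizes)):
        _, i = heapq.heappop(heap)
        alloc[i] += 1
        heapq.heappush(heap, (-sizes[i] / (rate * alloc[i]), i))

    makespan = max(s / (rate * n) for s, n in zip(sizes, alloc))
    return alloc, makespan


if __name__ == "__main__":
    # Two datasets of 100 and 300 units, 10 units/s per element, 4 elements.
    print(allocate_elements([100, 300], rate=10, total_elements=4))
```

Under this model the larger dataset receives proportionally more elements, so both datasets finish at about the same time, which is the balancing effect the abstract describes.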

Complete mitochondrial genome of Nyctalus aviator and phylogenetic analysis of the family Vespertilionidae

  • Lee, Seon-Mi;Lee, Mu-Yeong;Kim, Sun-sook;Kim, Hee-Jong;Jeon, Hye Sook;An, Junghwa
    • Journal of Species Research / v.8 no.3 / pp.313-317 / 2019
  • Bats influence overall ecosystem health by regulating species diversity and being a major source of zoonotic viruses. Hence, there is a need to elucidate their migration, population structure, and phylogenetic relationships. The complete mitochondrial genome is widely used for studying the genome-level characteristics and phylogenetic relationships of various animals due to its high mutation rate, simple structure, and maternal inheritance. In this study, we determined the complete mitogenome sequence of the bird-like noctule (Nyctalus aviator) by Illumina next-generation sequencing. The sequences obtained were used to reconstruct a phylogenetic tree of Vespertilionidae to elucidate the phylogenetic relationships among its members. The mitogenome of N. aviator is 16,863 bp long with a typical vertebrate gene arrangement, consisting of 13 protein-coding genes (PCGs), 22 transfer RNA genes, 2 ribosomal RNA genes, and 1 putative control region. Overall, the nucleotide composition is as follows: 32.3% A, 24.2% C, 14.3% G, and 29.2% T, with a slight AT bias (61.5%). The base composition of the 13 PCGs is as follows: 30.3% A, 13.4% G, 31.0% T, and 25.2% C. The phylogenetic analysis, based on 13 concatenated PCG sequences, infers that N. aviator is closely related to N. noctula with a high bootstrap value (100%).
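The nucleotide composition and AT bias reported in this abstract are simple per-base percentages. A minimal sketch of that arithmetic follows; the helper names `base_composition` and `at_content` are hypothetical, and the example string is a toy sequence, not the N. aviator mitogenome.

```python
from collections import Counter


def base_composition(seq):
    """Return the percentage of A, C, G, and T in a DNA sequence."""
    counts = Counter(base for base in seq.upper() if base in "ACGT")
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no A/C/G/T bases")
    return {base: 100.0 * counts[base] / total for base in "ACGT"}


def at_content(composition):
    """AT content is the sum of the A and T percentages."""
    return composition["A"] + composition["T"]


if __name__ == "__main__":
    comp = base_composition("AAATTTCCGG")  # toy sequence
    print(comp, at_content(comp))
```

Applied to the reported figures, 32.3% A plus 29.2% T gives the stated AT content of 61.5%.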

Biodiversity and Enzyme Activity of Marine Fungi with 28 New Records from the Tropical Coastal Ecosystems in Vietnam

  • Pham, Thu Thuy;Dinh, Khuong V.;Nguyen, Van Duy
    • Mycobiology / v.49 no.6 / pp.559-581 / 2021
  • The coastal marine ecosystems of Vietnam are among the global biodiversity hotspots, but the biodiversity of their marine fungi is not well known. To fill this major gap in knowledge, we assessed the genetic diversity (ITS sequence) of 75 fungal strains isolated from 11 surface coastal marine and deeper waters in Nha Trang Bay and Van Phong Bay using a culture-dependent approach and 5 OTUs (Operational Taxonomic Units) of fungi in three representative sampling sites using next-generation sequencing. The two approaches agreed on the most abundant phylum (Ascomycota), genera (Candida and Aspergillus), and species (Candida blankii) but differed for less common taxa. Culturable fungal strains in this study belong to 3 phyla, 5 subdivisions, 7 classes, 12 orders, 17 families, 22 genera and at least 40 species, of which 29 species have been identified and several species are likely novel. Among identified species, 12 and 28 are new records in global and Vietnamese marine areas, respectively. The analysis of enzyme activity and the checklist of trophic mode and guild assignment provided valuable additional biological information and suggested the ecological function of planktonic fungi in the marine food web. This is the largest dataset of marine fungal biodiversity on morphology, phylogeny and enzyme activity in the tropical coastal ecosystems of Vietnam and Southeast Asia. Biogeographic aspects, ecological factors and human impact may structure mycoplankton communities in such aquatic habitats.

Development of PCR-based markers for discriminating Solanum berthaultii using its complete chloroplast genome sequence

  • Kim, Soojung;Cho, Kwang-Soo;Park, Tae-Ho
    • Journal of Plant Biotechnology / v.45 no.3 / pp.207-216 / 2018
  • Solanum berthaultii is a wild diploid Solanum species that is an excellent resource in potato breeding owing to its resistance to several important pathogens. However, sexual hybridization between S. berthaultii and S. tuberosum (potato) is limited because of their sexual incompatibility. Therefore, cell fusion can be used to introgress various novel traits from this wild species into cultivated potatoes. After cell fusion, it is crucial to identify fusion products with the aid of molecular markers. In this study, the chloroplast genome sequence of S. berthaultii obtained by next-generation sequencing technology was described and compared with those of five other Solanum species to develop S. berthaultii-specific markers. The total sequence length of the chloroplast genome is 155,533 bp. The structural organization of the chloroplast genome is similar to those of the five other Solanum species. Phylogenetic analysis with 25 other Solanaceae species revealed that S. berthaultii groups most closely with S. tuberosum. Additional comparison of the chloroplast genome sequence with those of the five Solanum species revealed 25 SNPs specific to S. berthaultii. Based on these SNPs, six PCR-based markers for differentiating S. berthaultii from other Solanum species were developed. These markers will facilitate the selection of fusion products and accelerate potato breeding using S. berthaultii.
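The step of finding SNPs specific to one species can be sketched as a column-by-column scan of an alignment. This is an illustration only, not the authors' pipeline: it assumes the sequences are already aligned to equal length, treats `-` as a gap, and uses a hypothetical function name `specific_snps` with toy sequences.

```python
def specific_snps(alignment, target):
    """List positions where the target's base occurs in no other aligned sequence.

    `alignment` maps a species name to its aligned sequence (equal lengths
    assumed). Returns (1-based position, target base, bases in other species).
    """
    ref = alignment[target]
    others = [seq for name, seq in alignment.items() if name != target]
    if any(len(seq) != len(ref) for seq in others):
        raise ValueError("aligned sequences must have equal length")

    hits = []
    for pos, base in enumerate(ref):
        column = {seq[pos] for seq in others}
        # Skip columns with gaps; this sketch reports substitutions only.
        if base != "-" and "-" not in column and base not in column:
            hits.append((pos + 1, base, "".join(sorted(column))))
    return hits


if __name__ == "__main__":
    toy = {"target": "ACGTA", "sp1": "ACCTA", "sp2": "ACCTG"}
    print(specific_snps(toy, "target"))
```

In the toy alignment only position 3 qualifies: at position 5 the target shares its base with one of the other sequences, so that site would not discriminate the target from every other species.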

Error Detection of Phase Offsets for Binary Sequences (이원부호의 위상오프셋 오류 검출)

  • Song, Young-Joon;Han, Young-Yearl
    • Journal of the Korean Institute of Telematics and Electronics S / v.36S no.9 / pp.27-35 / 1999
  • In this paper, we propose an error detection scheme for the phase offsets of binary sequences, including PN (Pseudo Noise) sequences, based on a number-theoretic approach. It is important to know the phase offsets of spreading sequences in CDMA (Code Division Multiple Access) mobile communication systems because phase offsets of the same spreading sequence are used to achieve acquisition and to distinguish each base station. When the period of the sequence is not very long, the relative phase offset between the sequence and its shifted replica can be found by comparing them, but as the period of the sequence increases, it becomes difficult to find the phase offset. The error detection failure probability of the proposed method is derived and confirmed by simulation results. We also discuss the circuit realization of the proposed method and show that it can be easily implemented.
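The comparison-based baseline mentioned in this abstract (finding the relative phase offset by comparing a sequence with its shifted replica) can be sketched as a brute-force search. This is not the proposed number-theoretic method; the function name `phase_offset` is hypothetical, and the example uses a short period-7 binary sequence.

```python
def phase_offset(reference, shifted):
    """Return k such that shifted[i] == reference[(i + k) % n], or None.

    Brute-force comparison over all cyclic shifts; the cost grows as n^2.
    """
    n = len(reference)
    if len(shifted) != n:
        raise ValueError("sequences must have the same period")
    for k in range(n):
        if all(shifted[i] == reference[(i + k) % n] for i in range(n)):
            return k
    return None


if __name__ == "__main__":
    pn = [1, 1, 1, 0, 1, 0, 0]   # short binary sequence of period 7
    replica = pn[3:] + pn[:3]    # cyclically shifted by 3 chips
    print(phase_offset(pn, replica))
```

Because every candidate shift requires a full pass over the period, the work grows quadratically with the period, which is consistent with the abstract's remark that comparison becomes difficult for long sequences.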

Application of Quail Model for Studying the Poultry Functional Genomics (가금 기능유전체 연구를 위한 메추리 모델의 활용)

  • Shin, Sangsu
    • Korean Journal of Poultry Science / v.44 no.2 / pp.103-111 / 2017
  • The quail (Coturnix japonica) has been used as a model animal in many research fields, and its application is still expanding into other fields. Compared to the chicken, the quail reaches sexual maturity more quickly, has short generation intervals, is easy to handle, requires less space and feed, and is sturdy. In addition, it produces many eggs, and the research tools developed for the chicken can be applied to the quail directly or with some modifications. Due to recent advances in next-generation sequencing, abundant sequence data for the quail genome and transcripts have been generated. These sequence data are valuable sources for studying functional genomics using the quail, which is one of the model animals used to investigate gene function and networks. Although some obstacles remain, the quail is the best-suited model for studying the functional genomics of poultry. In many research fields, functional genomics studies using the quail model will provide the best opportunity to understand the phenomena and principles of life. We review why, among many other birds, the quail is the best model for studying poultry functional genomics.

Current status of whole-genome sequences of Korean angiosperms

  • Jongsun PARK;Yunho YUN;Hong XI;Woochan KWON;Janghyuk SON
    • Korean Journal of Plant Taxonomy / v.53 no.3 / pp.181-200 / 2023
  • Owing to the rapid development of sequencing technologies, more than 1,000 plant genomes have been sequenced and released. Among them, 69 Korean plant taxa (85 genome sequences) contain at least one whole-genome sequence despite the fact that some samples were not collected in Korea. The sequencing-by-synthesis method (next-generation sequencing) and the PacBio (third-generation sequencing) method were the most commonly used in studies appearing in 65 publications. Several scaffolding methods, such as the Hi-C and 10x types, have also been used for pseudo-chromosomal assembly. The most abundant families among the 69 taxa are Rosaceae (10 taxa), Brassicaceae (7 taxa), Fabaceae (7 taxa), and Poaceae (7 taxa). Due to the rapid release of plant genomes, it is necessary to assemble the current understanding of Korean plant species not only to understand their whole genomes as our own plant resources but also to establish new tools for utilizing plant resources efficiently with various analysis pipelines, including AI-based engines.

Development of HRM Markers Based on Identification of SNPs from Next-Generation Sequencing of Sanguisorba officinalis, Sanguisorba tenuifolia f. alba (Trautv. & Mey.) Kitam and Sanguisorba tenuifolia Fisch. ex Link (오이풀, 흰오이풀, 긴오이풀의 NGS 기반 유전체 서열의 완전 해독 및 차세대 염기서열 재분석으로 탐색된 SNP 기반 HRM 분자표지 개발)

  • Sim, Mi-Ok;Jang, Ji Hun;Jung, Ho-Kyung;Hwang, Taeyeon;Kim, Sunyoung;Cho, Hyun-Woo
    • The Korea Journal of Herbology / v.34 no.6 / pp.91-97 / 2019
  • Objective : To establish a reliable tool for distinguishing the original plants of Sanguisorbae Radix, we analyzed the complete chloroplast genome sequences of Sanguisorbae Radix and identified single nucleotide polymorphisms (SNPs). Materials and methods : The chloroplast genome sequences of Sanguisorba officinalis, Sanguisorba tenuifolia f. alba (Trautv. & Mey.) Kitam and Sanguisorba tenuifolia Fisch. ex Link obtained using next-generation sequencing technology were described and compared with those of other species to develop specific markers. Candidate genetic markers for distinguishing the species were identified from the chloroplast sequences of each species using the Modified Phred Phrap Consed and CLC Genomics Workbench programs. Results : The assembled and verified chloroplast genome of each sample was circular, with a length of about 155 kbp. Through comparative analysis of the chloroplast sequences, we found 220 variant nucleotides, comprising 158 SNPs and 62 indels (insertions and/or deletions), that distinguish Sanguisorba officinalis, Sanguisorba tenuifolia f. alba (Trautv. & Mey.) Kitam and Sanguisorba tenuifolia Fisch. ex Link. Finally, 15 specific SNP genetic markers were selected for verification. Available primers for the dried herb, which is used as medicine, were used to develop the PCR amplification product of Sanguisorbae Radix to assess the applicability of PCR analysis. Conclusion : In this study, we found that Fendel-qPCR analysis based on the chloroplast DNA sequences can be an efficient tool for discriminating Sanguisorba officinalis, Sanguisorba tenuifolia f. alba (Trautv. & Mey.) Kitam and Sanguisorba tenuifolia Fisch. ex Link.

Intrusion Detection Method Using Unsupervised Learning-Based Embedding and Autoencoder (비지도 학습 기반의 임베딩과 오토인코더를 사용한 침입 탐지 방법)

  • Junwoo Lee;Kangseok Kim
    • KIPS Transactions on Software and Data Engineering / v.12 no.8 / pp.355-364 / 2023
  • As advanced cyber threats continue to increase, it is difficult to detect new types of cyber attacks with existing pattern- or signature-based intrusion detection methods. Therefore, research on anomaly detection methods using data learning-based artificial intelligence technology is increasing. However, supervised learning-based anomaly detection methods are difficult to use in real environments because they require sufficient labeled data for learning. Consequently, unsupervised learning-based methods that learn from normal data and detect anomalies by finding patterns in the data itself have been actively studied. This study aims to extract a latent vector that preserves useful sequence information from sequence log data and to develop an anomaly detection learning model using the extracted latent vector. Word2Vec was used to create a dense vector representation corresponding to the characteristics of each sequence, and an unsupervised autoencoder was developed to extract latent vectors from sequence data expressed as dense vectors. Three autoencoder models were developed: a denoising autoencoder based on a GRU (Gated Recurrent Unit) recurrent neural network, which is suitable for sequence data; an autoencoder based on a one-dimensional convolutional neural network, intended to solve the limited short-term memory problem that a GRU can have; and an autoencoder combining a GRU and one-dimensional convolution. The data used in the experiment were the time-series-based NGIDS (Next Generation IDS Dataset) data. In the experiment, the autoencoder combining a GRU and one-dimensional convolution performed better than the GRU-based autoencoder or the one-dimensional convolution-based autoencoder: it was efficient in terms of learning time for extracting useful latent patterns from training data and showed stable performance with smaller fluctuations in anomaly detection performance.

Test suite generation technique for protocols with nondeterminism (비결정성을 갖는 프로토콜을 위한 시험 스위트 생성방법)

  • 김병식;김우직
    • The Journal of Korean Institute of Communications and Information Sciences / v.22 no.9 / pp.1854-1866 / 1997
  • This paper proposes a new test case generation technique for nondeterministic finite state machines by improving the existing UIO sequence generation method. First, a new conformance relation is defined, which is one of the prerequisites for automatic test case generation. Because of the nondeterministic property of protocols, the output of the system under test is not known deterministically to the tester. Therefore, a tree-like test case generation method is introduced for adaptive testing, in which the next input is selected after observing the previous output. Since the test cases are generated by regarding the inputs and outputs as separate events and are represented in tree notation, they are easily converted into TTCN, the international standard test suite specification language.
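The adaptive, tree-like testing described in this abstract can be illustrated with a small sketch in which the tester chooses the next input only after observing the previous output. The tree encoding, the verdict strings, and the names `run_adaptive_test` and `scripted_system` are all assumptions made for illustration; they are not taken from the paper or from TTCN.

```python
def run_adaptive_test(tree, send):
    """Walk a test tree against a system under test.

    An inner node is a tuple (input, {observed_output: subtree});
    a leaf is a verdict string. `send` applies an input and returns the output.
    """
    node = tree
    while isinstance(node, tuple):
        stimulus, branches = node
        observed = send(stimulus)
        # An output without a branch falls outside the allowed behaviour.
        node = branches.get(observed, "fail")
    return node


def scripted_system(outputs):
    """Stand-in for a nondeterministic system: replays a fixed list of outputs."""
    replay = iter(outputs)
    return lambda stimulus: next(replay)


# After input "a" the system may answer "x" or "z"; the next input depends on which.
TREE = ("a", {
    "x": ("b", {"y": "pass"}),
    "z": ("c", {"y": "pass", "w": "pass"}),
})

if __name__ == "__main__":
    print(run_adaptive_test(TREE, scripted_system(["x", "y"])))
    print(run_adaptive_test(TREE, scripted_system(["x", "q"])))
```

Keeping inputs and outputs as separate steps in the tree mirrors the abstract's point that the tester cannot know the output in advance and must branch on what it observes.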