• Title/Summary/Keyword: next-generation sequencing analysis (차세대 염기서열 분석)

Efficient Processing of Next Generation Sequencing Reads Using Hitting Set Problem (Hitting Set 문제를 이용한 Next Generation Sequencing Read의 효율적인 처리)

  • Park, Tae-Won;Kim, So-Ra;Choi, Seok-Moon;Cho, Hwan-Gue;Lee, Do-Hoon
    • Proceedings of the Korean Information Science Society Conference / 2011.06b / pp.466-469 / 2011
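  • Next Generation Sequencing (NGS), which has emerged recently, refers to sequencing technology that produces large volumes of sequence data faster and at lower cost than traditional methods. Among the steps of NGS data analysis, the alignment step maps the large number of reads obtained from the sequencer onto a reference sequence, and it is the most basic and most essential step of the analysis. Alignment tools index a long reference sequence so that short reads can be mapped quickly. The alignment tools in common use today perform no separate preprocessing of the input data and have a simple structure that maps the listed reads one after another. This paper exploits one characteristic of NGS data, the redundancy among reads, to find common subsequences of reads efficiently. The relationship between reads and their possibly overlapping common subsequences is modeled as the Hitting Set problem from graph theory, and a method is proposed that improves the efficiency of the alignment step by using common subsequences contained in multiple reads.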
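The Hitting Set modeling described in this abstract can be illustrated with a small sketch. It assumes that the candidate common subsequences are fixed-length k-mers and that a greedy heuristic is acceptable; both are assumptions made here for illustration, and the function names are hypothetical. The abstract does not state which algorithm the authors used.

```python
from collections import defaultdict


def kmers(read, k):
    """Return the set of all length-k substrings of a read."""
    return {read[i:i + k] for i in range(len(read) - k + 1)}


def greedy_hitting_set(reads, k):
    """Greedily choose k-mers so that every read contains at least one.

    This is the standard greedy approximation for Hitting Set, used here
    as an assumed stand-in for the method in the paper.
    """
    # Map each k-mer to the indices of the reads that contain it.
    hits = defaultdict(set)
    for idx, read in enumerate(reads):
        for km in kmers(read, k):
            hits[km].add(idx)

    # Reads shorter than k have no k-mers and cannot be hit.
    uncovered = {i for i, r in enumerate(reads) if len(r) >= k}
    chosen = []
    while uncovered:
        # Pick the k-mer shared by the most still-uncovered reads
        # (ties broken alphabetically so the result is deterministic).
        best = max(sorted(hits), key=lambda km: len(hits[km] & uncovered))
        chosen.append(best)
        uncovered -= hits[best]
    return chosen
```

Each chosen k-mer would then need to be aligned only once, with the result shared by every read that contains it; how that sharing is implemented is outside this sketch.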

Identification and Characterization of Polymorphic Microsatellite DNA Markers Using Next-generation Sequencing in Parapristipoma trilineatum (차세대 염기서열 분석법을 사용한 벤자리(Parapristipoma trilineatum)의 microsatellite 마커의 개발 및 유전학적 특성 분석)

  • Chun Mae Dong;Mi-Nan Lee;Jae Koo Noh;Jin Woo Park;Young-Ok Kim;Eun-Mi Kim
    • Journal of Life Science / v.33 no.8 / pp.623-631 / 2023
  • This study was conducted to develop microsatellite markers in Parapristipoma trilineatum using next-generation sequencing. A total of 402,244,934 reads were generated on the Illumina HiSeq X Ten System, yielding 60,738,985,034 bp of sequence. The de novo assembly resulted in 1,320,995 contigs. A total of 952,326 contigs (0.016%) including 151 microsatellite loci were derived from the 1,320,995 contigs longer than 640 bp. A total of 34 primer sets were designed from the 151 microsatellite loci. As a result, 15 microsatellite loci were chosen and used for estimating population genetic parameters in the wild and farmed populations. The mean number of effective alleles was 12, ranging from 6 to 25. The observed heterozygosity (HO) ranged from 0.530 to 0.873, with an average of 0.750, and the expected heterozygosity (HE) ranged from 0.647 to 0.895, with an average of 0.793. According to these results, the developed set of 15 microsatellite markers is expected to be useful for analyzing the genetic characteristics of the P. trilineatum population in Korea. Further genetic information, fishery resource management, breeding guidelines, support for breed selection, and studies on the effects of release are now required, all of which will improve species conservation; future research aims to provide foundational genetic data toward that goal.
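Microsatellite loci of the kind screened in this abstract are short motifs repeated in tandem. The following is a minimal sketch of how such repeats might be located in an assembled contig, assuming a simple regular-expression scan. The function name, the thresholds, and the example sequence are hypothetical; this is not the pipeline used in the study.

```python
import re


def find_microsatellites(seq, min_motif=2, max_motif=6, min_repeats=4):
    """Return (start, motif, repeat_count) for each tandem repeat found.

    The lazy quantifier makes the regex try the shortest motif first,
    and the backreference \\1 requires the same motif to repeat.
    """
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_motif, max_motif, min_repeats - 1)
    )
    found = []
    for match in pattern.finditer(seq.upper()):
        motif = match.group(1)
        if len(set(motif)) == 1:
            # Skip runs of a single base such as AAAAAAAA.
            continue
        repeats = len(match.group(0)) // len(motif)
        found.append((match.start(), motif, repeats))
    return found
```

For example, a contig containing five copies of AC followed later by four copies of ATG would yield two entries, one per locus. Primer design around the flanking sequence is a separate step that this sketch does not cover.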

Identification and Characterization of Polymorphic Microsatellite Loci using Next Generation Sequencing in Quercus variabilis (차세대 염기서열 분석을 이용한 굴참나무(Quercus variabilis)의 microsatellite 마커 개발 및 특성 분석)

  • Baek, Seung-Hoon;Lee, Jei-Wan;Hong, Kyung-Nak;Lee, Seok-Woo;Ahn, Ji-Young;Lee, Min-Woo
    • Journal of Korean Society of Forest Science / v.105 no.2 / pp.186-192 / 2016
  • This study was conducted to develop microsatellite markers in Quercus variabilis using next-generation sequencing. A total of 305,771 reads (384 bp on average) were generated on a Roche GS-FLX system, yielding 117 Mbp of sequence. The de novo assembly resulted in 7,346 contigs. A total of 606 contigs (20.75%) including 911 microsatellite loci were derived from the 2,921 contigs longer than 500 bp. A total of 180 primer sets were designed from the 911 microsatellite loci and screened in eight Q. variabilis trees sampled from a natural stand to obtain polymorphic loci. As a result, thirteen polymorphic microsatellite loci were selected and used for estimating population genetic parameters in 54 individual trees. The mean number of effective alleles was 4.996, ranging from 2.439 to 7.515. The observed heterozygosity ranged from 0.731 to 1.000, with an average of 0.873, and the expected heterozygosity ranged from 0.590 to 0.867, with an average of 0.766. No null alleles were detected at any locus. No significant linkage disequilibrium was detected at any locus after Bonferroni correction. In the near future, these novel polymorphic microsatellite markers will be used to study the population and conservation genetics of Q. variabilis in Korea in more detail.
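The population genetic parameters reported in this abstract (observed heterozygosity, expected heterozygosity, and effective number of alleles) can be computed per locus from diploid genotypes. The sketch below uses the basic textbook definitions, HE = 1 - Σp² and Ne = 1 / Σp²; studies often apply a small-sample correction that is omitted here, and the function name and input layout are assumptions.

```python
from collections import Counter


def locus_stats(genotypes):
    """Compute (Ho, He, Ne) for one locus.

    genotypes: list of (allele_a, allele_b) tuples, one per individual.
    """
    n = len(genotypes)
    # Count every allele copy; each diploid individual contributes two.
    counts = Counter(allele for pair in genotypes for allele in pair)
    total = 2 * n
    # Sum of squared allele frequencies (expected homozygosity).
    homozygosity = sum((c / total) ** 2 for c in counts.values())
    ho = sum(1 for a, b in genotypes if a != b) / n  # observed heterozygosity
    he = 1.0 - homozygosity                          # expected heterozygosity
    ne = 1.0 / homozygosity                          # effective number of alleles
    return ho, he, ne
```

With two alleles at equal frequency, for instance, the effective number of alleles is 2 and the expected heterozygosity is 0.5.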

Current Status and Prospect of Wheat Functional Genomics using Next Generation Sequencing (차세대 염기서열분석을 통한 밀 기능유전체 연구의 현황과 전망)

  • Choi, Changhyun;Yoon, Young-Mi;Son, Jae-Han;Cho, Seong-Woo;Kang, Chon-Sik
    • Korean Journal of Breeding Science / v.50 no.4 / pp.364-377 / 2018
  • Hexaploid wheat (common wheat/bread wheat) is one of the most important cereal crops in the world and a model for research on allopolyploid plants with large, highly repetitive genomes. Variation in gene presence/absence plays an important role in the heritability of agronomic traits. However, there have been relatively few studies on variation in gene presence/absence in crop species, including common wheat. Recently, a reference genome sequence of common wheat has been fully annotated and published. In addition, advanced next-generation sequencing (NGS) technology provides high-quality genome sequences at continually decreasing cost, opening the way for full-scale functional genomic studies in common wheat and in other crops despite their large and complex genomes. In this review, we provide information about the available tools and methodologies for wheat functional genomics research supported by NGS technology. The use of NGS and functional genomics technology is expected to be a powerful strategy for selecting elite lines from a number of germplasms.

Toward The Fecal Microbiome Project (분변 미생물군집 프로젝트)

  • Unno, Tatsuya
    • Korean Journal of Microbiology / v.49 no.4 / pp.415-418 / 2013
  • Since the development of next-generation sequencing (NGS) technology, 16S rRNA gene sequencing has become a major tool for microbial community analysis. Recently, the Human Microbiome Project (HMP) was completed to identify microbes associated with human health and disease. The HMP characterized several diseases caused by bacteria, especially those in the human gut. While human intestinal bacteria have been well characterized, little is known about the intestinal bacteria of other animals. In this study, we surveyed the diversity of livestock fecal microbiota and discuss the importance of studying fecal microbiota. Here, we report the initiation of the fecal microbiome project in South Korea.

Improvement of SNPs detection efficient by reuse of sequences in Genotyping By Sequencing technology (유전체 서열 재사용을 이용한 Genotyping By Sequencing 기술의 단일 염기 다형성 탐지 효율 개선)

  • Baek, Jeong-Ho;Kim, Do-Wan;Kim, Junah;Lee, Tae-Ho
    • Journal of the Korea Institute of Information and Communication Engineering / v.19 no.10 / pp.2491-2499 / 2015
  • Recently, the most popular technique for determining the genotype, the genetic features of an individual organism, is genotyping by sequencing (GBS), which is based on SNPs from sequences determined by NGS. For analyzing sequences with GBS, TASSEL is the most widely used program for identifying genotypes. However, TASSEL has the limitation that it uses only part of the sequences obtained by NGS. To address this limitation, we tried to improve the efficiency of sequence use. We constructed new data sets by quality checking, filtering the unused sequences with an error rate below 0.1%, and clipping the sequences according to the location of the barcode and enzyme. As a result, SNP detection efficiency increased by approximately 17% or more. In this paper, we describe the method and the programs applied to detect more SNPs by using the discarded sequences.
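The 0.1% error-rate threshold mentioned in this abstract corresponds to a Phred quality score of 30, since the error probability is 10^(-Q/10). The sketch below shows one possible reading of the filtering and clipping steps: keep a read when its mean per-base error probability is below the threshold, then strip a leading barcode. The function names, the Phred+33 encoding, and the use of the mean are assumptions; the abstract does not specify the programs' exact criteria.

```python
def mean_error_rate(quality_string, offset=33):
    """Mean per-base error probability of a FASTQ quality string."""
    if not quality_string:
        return 1.0  # treat an empty read as unusable
    # Phred score Q = ord(char) - offset; error probability = 10^(-Q/10).
    probs = [10 ** (-(ord(ch) - offset) / 10) for ch in quality_string]
    return sum(probs) / len(probs)


def keep_read(quality_string, max_error=0.001):
    """True when the read's mean error rate is below max_error (0.1%)."""
    return mean_error_rate(quality_string) < max_error


def clip_barcode(sequence, barcode):
    """Remove a leading barcode if present; otherwise return the read as is."""
    if sequence.startswith(barcode):
        return sequence[len(barcode):]
    return sequence
```

A read whose quality string is all `I` (Q40) passes the filter, while one that is all `5` (Q20) does not.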

Comparative Microbiome Analysis of and Microbial Biomarker Discovery in Two Different Fermented Soy Products, Doenjang and Ganjang, Using Next-generation Sequencing (차세대 염기서열 분석법을 이용한 된장과 간장의 미생물 분포 및 바이오마커 분석)

  • Ha, Gwangsu;Jeong, Ho Jin;Noh, Yunjeong;Kim, JinWon;Jeong, Su-Ji;Jeong, Do-Youn;Yan, Hee-Jong
    • Journal of Life Science / v.32 no.10 / pp.803-811 / 2022
  • Despite the importance of traditional Korean fermented foods, little is known about the microbial communities and diversity of fermented soy products. To gain insight into the unexplored microbial communities of both Doenjang (DJ) and Ganjang (GJ) that may contribute to fermentation in Korean traditional foods, we carried out next-generation sequencing (NGS)-based analysis of the V3-V4 region of the 16S rDNA gene. The alpha diversity analysis revealed that both the Shannon and Simpson diversity indices differed significantly between the two groups, whereas the differences in the richness indices, including ACE, CHAO, and Jackknife, were not significant. Firmicutes was the most dominant phylum in both groups, but several taxa were found to be more abundant in DJ than in GJ. The proportions of Bacillus, Kroppenstedtia, Clostridium, and Pseudomonas and most halophiles and halotolerant bacteria, such as Tetragenococcus, Chromohalobacter, Lentibacillus, and Psychrobacter, were lower in DJ than in GJ. Linear discriminant analysis effect size (LEfSe) analysis was carried out to discover discriminative functional biomarkers. The biomarker discovery results identified Bacillus and Tetragenococcus as the most important features for classifying samples as DJ or GJ. Paired-permutational multivariate analysis of variance (PERMANOVA) further revealed that the bacterial community structure of the two groups was statistically different (p=0.001).
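The Shannon and Simpson indices named in this abstract are both computed from the relative abundances of taxa within a sample. A minimal sketch follows, assuming raw read counts per taxon as input and the Gini-Simpson form (1 - Σp²) for the Simpson index; other Simpson variants exist and the abstract does not say which one was used. The function names are hypothetical.

```python
import math


def shannon_index(counts):
    """Shannon diversity H = -sum(p * ln p) over taxa with nonzero counts."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def simpson_index(counts):
    """Gini-Simpson diversity 1 - sum(p^2); higher means more diverse."""
    total = sum(counts)
    return 1.0 - sum((c / total) ** 2 for c in counts)
```

For a sample with four equally abundant taxa, the Shannon index equals ln 4 and the Gini-Simpson index equals 0.75; a sample dominated by a single taxon scores close to 0 on both.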

Lung Adenocarcinoma Gene Mutation in Koreans: Detection Using Next Generation Sequence Analysis Technique and Analysis of Concordance with Existing Genetic Test Methods (한국인의 폐선암 유전자 돌연변이: 차세대 염기서열 분석법을 이용한 검출 및 기존 유전자 검사법과의 일치도 분석)

  • Jae Ha BAEK;Kyu Bong CHO
    • Korean Journal of Clinical Laboratory Science / v.55 no.1 / pp.16-28 / 2023
  • Lung adenocarcinoma accounts for about 40% of all lung cancers. With the recent development of gene profiling technology, studies on mutations in oncogenes and tumor suppressor genes, which are important for the development and growth of tumors, have been actively conducted. Companion diagnosis using next-generation sequencing helps improve survival with targeted therapy. In this study, formalin-fixed paraffin-embedded tissues of non-small cell lung cancer patients were subjected to hematoxylin and eosin staining to detect genetic mutations that induce lung adenocarcinoma in Koreans. Immunohistochemical staining was also performed to accurately classify lung adenocarcinoma tissues. Based on the results, next-generation sequencing was applied to analyze the types and patterns of genetic mutations, and the association with smoking was established as the most representative cause of lung cancer. The next-generation sequencing analysis confirmed single nucleotide variations, copy number variations, and gene rearrangements. To validate the reliability of next-generation sequencing, we additionally performed the existing genetic testing methods (polymerase chain reaction-epidermal growth factor receptor, immunohistochemistry-anaplastic lymphoma kinase (D5F3), and fluorescence in situ hybridization-receptor tyrosine kinase 1 tests) to confirm the concordance rates with the next-generation sequencing test results. This study demonstrates that next-generation sequencing of lung adenocarcinoma patients identifies multiple mutations simultaneously.

An Efficient Parallelization Mechanism for Preprocessing of Genome Sequence Data on HPC environment (고성능 클러스터와 분산 병렬 파일 시스템을 이용한 유전체데이터 전처리 작업의 효율적인 병렬화 기법)

  • Byun, Eun-Kyu;Mun, Ji-hyeob;Kwak, Jae-Hyuck
    • Proceedings of the Korea Information Processing Society Conference / 2018.10a / pp.50-53 / 2018
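  • Analyzing the raw genome data generated by next-generation sequencing on a single server in the conventional way can take tens of hours, and there are emergency situations in which this time must be reduced as much as possible. This study therefore proposes a method for parallelizing the preprocessing stage of genome data analysis that can greatly shorten analysis time by using a cluster of servers connected by a high-speed network and sharing a parallel file system. Building on existing, validated analysis tools, we developed techniques for process parallelization, data distribution, and parallel merging, and showed through experiments that performance can be improved.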
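The distribute, process, and merge pattern described in this abstract can be sketched on a single machine. In the sketch below, a thread pool stands in for the cluster nodes, the input is a list of FASTQ lines (four per record), and the per-chunk step is a placeholder that drops reads containing N. All of these are assumptions for illustration; the actual tools and the parallel file system layout used in the study are not given in the abstract.

```python
from concurrent.futures import ThreadPoolExecutor


def split_records(lines, n_chunks):
    """Split FASTQ lines into up to n_chunks lists of whole 4-line records."""
    records = [lines[i:i + 4] for i in range(0, len(lines), 4)]
    # Ceiling division, with a floor of 1 so an empty input is handled.
    size = max(1, -(-len(records) // n_chunks))
    return [records[i:i + size] for i in range(0, len(records), size)]


def preprocess(chunk):
    """Placeholder per-chunk step: drop records whose sequence contains N."""
    return [rec for rec in chunk if "N" not in rec[1]]


def run_parallel(lines, n_workers=4):
    """Distribute chunks to workers, then merge results in input order."""
    chunks = split_records(lines, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        # pool.map returns results in the same order as the input chunks.
        results = list(pool.map(preprocess, chunks))
    return [rec for part in results for rec in part]
```

Splitting on record boundaries keeps every worker's input self-contained, which is what allows the merge step to be a simple ordered concatenation.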

A Study on the Hierarchical Expression of Human Cell Lineage (인간 세포 Lineage 의 계층적 표현에 관한 연구)

  • Park, JaeSoon;Kwon, Seong Gyu;Oh, Ji Won;Lee, JongHyuk
    • Proceedings of the Korea Information Processing Society Conference / 2020.11a / pp.663-664 / 2020
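  • Next-generation sequencing technology has improved so much in performance and cost that the genes of multiple cells within a single individual can now be analyzed. Because the genes of cells from different tissues within one individual are not all identical, representing the lineage of cells from multiple tissues hierarchically and using it to assess the degree of variation between tissue cells could make it possible to predict events such as cancer mutations in advance. This paper proposes a method that analyzes variant-detection data with hierarchical clustering and visualizes the result in order to observe variation among tissues within one individual. Genes from eight actual tissue cell samples were analyzed, variants were detected, and the result was visualized as a dendrogram.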
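The hierarchical clustering step described in this abstract can be sketched with a small agglomerative procedure. The sketch assumes that each tissue is summarized as a tuple of 0/1 variant calls, that Hamming distance measures dissimilarity, and that average linkage is used; the abstract does not state the distance or linkage the authors chose. The tissue names in the usage note are hypothetical, and the nested tuple returned here is the tree that a dendrogram would draw.

```python
from itertools import combinations


def hamming(a, b):
    """Number of positions at which two variant profiles differ."""
    return sum(x != y for x, y in zip(a, b))


def cluster(profiles):
    """Average-linkage agglomerative clustering.

    profiles: dict mapping tissue name -> tuple of 0/1 variant calls.
    Returns the merge tree as nested tuples of tissue names.
    """
    # Each cluster is (tree, list of member names).
    clusters = [(name, [name]) for name in sorted(profiles)]

    def avg_dist(pair):
        (_, m1), (_, m2) = pair
        total = sum(hamming(profiles[a], profiles[b]) for a in m1 for b in m2)
        return total / (len(m1) * len(m2))

    while len(clusters) > 1:
        # Merge the two closest clusters (first pair wins on ties).
        c1, c2 = min(combinations(clusters, 2), key=avg_dist)
        clusters = [c for c in clusters if c is not c1 and c is not c2]
        clusters.append(((c1[0], c2[0]), c1[1] + c2[1]))
    return clusters[0][0]
```

For four hypothetical tissues in which liver and lung share most variants and blood and skin share most variants, the returned tree groups each pair first and joins the two pairs last.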