• Title/Summary/Keyword: next-generation sequencing (차세대 염기서열 분석)

Search Result 95, Processing Time 0.027 seconds

Feature selection and prediction modeling of drug responsiveness in Pharmacogenomics (약물유전체학에서 약물반응 예측모형과 변수선택 방법)

  • Kim, Kyuhwan;Kim, Wonkuk
    • The Korean Journal of Applied Statistics / v.34 no.2 / pp.153-166 / 2021
  • A main goal of pharmacogenomics studies is to predict an individual's drug responsiveness from high-dimensional genetic variables. Because the number of variables is large, feature selection is required to reduce it. The selected features are then used to construct a predictive model with machine learning algorithms. In the present study, we applied several hybrid feature selection methods, combining logistic regression, ReliefF, TuRF, random forest, and LASSO, to a next-generation sequencing data set of 400 epilepsy patients. We then fed the selected features to machine learning methods including random forest, gradient boosting, and support vector machine, as well as to a stacking ensemble method. Our results showed that the stacking model with a hybrid feature selection of random forest and ReliefF performed better than the other combinations of approaches. Based on a 5-fold cross-validation partition, the best model had a mean test accuracy of 0.727 and a mean test AUC of 0.761. The stacking models also outperformed the single machine learning predictive models when the same selected features were used.

Lung Adenocarcinoma Mutation Hotspot in Koreans: Oncogenic Mutation Potential of the TP53 P72R Single Nucleotide Polymorphism (한국인의 폐선암 돌연변이 핫스팟: TP53 P72R Single Nucleotide Polymorphism의 발암성 돌연변이 가능성)

  • Jae Ha BAEK;Kyu Bong CHO
    • Korean Journal of Clinical Laboratory Science / v.55 no.2 / pp.93-104 / 2023
  • This study aimed to identify new markers involved in causing lung adenocarcinoma by using next-generation sequencing (NGS) to analyze mutation hotspots in the five most frequently mutated genes in lung adenocarcinoma in Koreans. The association of TP53 mutation types and patterns with smoking, a major cause of lung cancer, was examined, and the clinicopathological characteristics of lung adenocarcinoma patients with the TP53 P72R SNP were analyzed. In Korean lung adenocarcinoma cases, regardless of smoking status, the TP53 P72R SNP was the most frequent mutational hotspot; it is a C-to-G transversion that substitutes arginine for proline at codon 72 of TP53. Analysis of the clinicopathological characteristics of lung adenocarcinoma cases with the TP53 P72R SNP revealed no significant correlation with patient age, gender, smoking status, or tumor differentiation, but a significant correlation with low stage (P = 0.026). Through NGS, this study found that mutations were more frequent in TP53 than in EGFR, which had been reported as the most frequently mutated gene in lung adenocarcinoma in Koreans. Among the TP53 mutations, the P72R SNP was the most frequent regardless of smoking status.

Parallelization of Genome Sequence Data Pre-Processing on Big Data and HPC Framework (빅데이터 및 고성능컴퓨팅 프레임워크를 활용한 유전체 데이터 전처리 과정의 병렬화)

  • Byun, Eun-Kyu;Kwak, Jae-Hyuck;Mun, Jihyeob
    • KIPS Transactions on Computer and Communication Systems / v.8 no.10 / pp.231-238 / 2019
  • Analyzing next-generation genome sequencing data in the conventional way on a single server may take several tens of hours, depending on the data size. To cope with emergencies in which results are needed within a few hours, the performance of a single genome analysis must be improved. In this paper, we propose a parallelized method for pre-processing genome sequence data that reduces the analysis time by using big data technology and a high-performance computing cluster that is connected to a high-speed network and shares a parallel file system. To preserve the reliability of the analytical data, we chose a strategy of parallelizing the existing analytical tools and algorithms for the new environment. We developed techniques for parallelized processing, data distribution, and parallel merging, and confirmed the performance improvements through experiments.

A comparison of the reproduction of two closely related species, tiger worm(Eisenia fetida) and red tiger worm(Eisenia andrei) when the organic sludge was suppied to them (유기성 슬러지 먹이에 대한 두 근연종인 줄지렁이(Eisenia fetida)와 붉은줄지렁이(Eisenia andrei)의 생식반응 비교)

  • Bae, Yoon-Hwan;Shin, Hyun-Gon
    • Journal of the Korea Organic Resources Recycling Association / v.29 no.3 / pp.27-33 / 2021
  • COI gene sequence analysis was applied to identify the species of earthworms used as test animals for toxicity tests at the Institute of Kyeongbook Agrochemicals and of earthworms used as vermicomposting agents on a farm in Youngdong province. In molecular terms, the former were identified as Eisenia fetida and the latter as Eisenia andrei. Eisenia fetida produced more cocoons than Eisenia andrei, and the number of adults that developed from eggs of Eisenia fetida was somewhat higher than the number that developed from eggs of Eisenia andrei. These results contradict previous reports on the two Eisenia spp. When Eisenia fetida was crossed with Eisenia andrei, hybrid eggs were produced and adults developed from them, but the numbers of cocoons and adults were much lower than those from non-crossed Eisenia fetida or Eisenia andrei. This indicates that the two Eisenia spp. are not distinctly different biological species, because there is no complete 'reproductive isolation' between Eisenia fetida and Eisenia andrei. However, it also means that Eisenia fetida and Eisenia andrei are already on the path to speciation.

Genomic epidemiology and surveillance of zoonotic viruses using targeted next-generation sequencing (표적화 차세대염기서열분석법을 이용한 인수공통 바이러스의 유전체 역학과 예찰)

  • Seonghyeon Lee;Seung-Hwan Baek;Shivani Rajoriya;Sara Puspareni;Won-Keun Kim
    • Korean Journal of Veterinary Service / v.46 no.1 / pp.93-106 / 2023
  • Emerging and re-emerging zoonotic viruses have become critical public health, economic, societal, and cultural burdens. The coronavirus disease 2019 (COVID-19) pandemic revealed the need for effective preparedness for and response to the emergence of variants and the next virus outbreak. Targeted next-generation sequencing (NGS) contributes significantly to the acquisition of viral genome sequences directly from clinical specimens. With this advanced NGS technology, genomic epidemiology and surveillance play a critical role during an outbreak in identifying the source and origin of infection, tracking transmission chains and virus evolution, characterizing virulence, and developing vaccines. In this review, we highlight the platforms and sample preparation used in targeted NGS for viral genomics. We also demonstrate how this strategy can be applied to the response to and prevention of emerging zoonotic viruses. This article provides broad and deep insights into preparedness for and response to the next zoonotic virus outbreak.

Prevalence of PERVs from Domestic Pigs in Korea (pol gene sequences) (국내 돼지에 존재하는 내인성 레트로 바이러스의 분포)

  • Kim, Y.B.;Yoo, J.Y.;Lee, J.Y.;Kim, G.W.;Park, H.Y.
    • Journal of Animal Science and Technology / v.46 no.3 / pp.307-314 / 2004
  • Xenotransplantation of porcine organs has the potential to overcome the severe shortage of human tissues and organs available for transplantation. Swine represent an ideal source of such organs because of their plentiful supply and their numerous anatomical and physiological similarities to humans. However, the procedure also carries a number of safety issues relating to zoonotic infections. Porcine endogenous retroviruses (PERVs), which are germ-line transmitted and persist without symptoms in pigs, are the zoonotic viruses of greatest concern. To analyze the prevalence of PERVs in domestic pigs, genomic DNA was isolated from the hair follicles of four pig breeds (Landrace, Berkshire, Yorkshire, and Duroc). PCR analysis was carried out to detect PERVs using subgroup A/B/C and E pol sequence primers. All 20 pigs tested had a high copy number of PERVs within their genomes. Subgroup A/B/C and E pol gene sequences from the 20 isolates were determined by direct sequencing. Sequence analysis showed that pol sequences are highly conserved within and between subspecies (99.1% and 98.8%, respectively). As the first report of PERV prevalence in Korean pigs, our data provide a basis for studies of PERV transmission in xenotransplantation.

Phylogenetic characteristics of actinobacterial population in bamboo (Sasa borealis) soil (조릿대 대나무림 토양 내 방선균군집의 계통학적 특성)

  • Lee, Hyo-Jin;Han, Song-Ih;Whang, Kyung-Sook
    • Korean Journal of Microbiology / v.52 no.1 / pp.59-64 / 2016
  • In this study, pyrosequencing was performed to assess the phylogenetic diversity of actinomycetes in bamboo (Sasa borealis) soil, as a baseline study for obtaining genetic resources of actinomycetes. The rhizosphere soil had the more diverse bacterial community, showing a diversity of 8.15 with 2,868 OTUs, while the litter layer showed a diversity of 7.55 with 2,588 OTUs. The bacterial community in the bamboo soil was composed of 35 phyla, and the predominant phyla were Proteobacteria (51-60%), Bacteroidetes (16-20%), Acidobacteria (4-16%), and Actinobacteria (4-14%). In particular, Actinobacteria, including Micromonosporaceae and Streptomycetaceae, were diversely distributed across six orders, 35 families, and 121 genera, and about 83% of the actinomycetes within Actinomycetales belonged to 28 families. Among the dominant actinobacterial populations, Micromonosporaceae, Pseudonocardiaceae, and Streptomycetaceae were the representative families in the bamboo soils.

Molecular phylogeny and the biogeographic origin of East Asian Isoëtes (Isoëtaceae) (동아시아 물부추속 식물의 분자계통 및 식물지리학적 기원에 대한 고찰)

  • CHOI, Hong-Keun;JUNG, Jongduk;NA, Hye-Ryun;KIM, Hojoon;KIM, Changkyun
    • Korean Journal of Plant Taxonomy / v.48 no.4 / pp.249-259 / 2018
  • Isoëtes L. (Isoëtaceae) is a cosmopolitan genus of heterosporous lycopods containing ca. 200 species found in lakes, streams, and wetlands of terrestrial habitats. Despite its ancient origin, worldwide distribution, and adaptation to diverse environments, species of Isoëtes show remarkable morphological simplicity and convergence. Allopolyploidy appears to be a significant speciation process in the genus. These characteristics have made it difficult to assess the phylogenetic relationships and biogeographic history of Isoëtes species. In recent years, these difficulties have been reduced somewhat by the use of multiple molecular markers. Here, we reconstruct the phylogenetic relationships among East Asian Isoëtes species. We also estimate their divergence times and biogeographic origin using a fossil-calibrated chronogram. East Asian Isoëtes species are divided into two clades: I. asiatica and the remaining species. Isoëtes asiatica from Hokkaido forms a clade with northeastern Russian and western North American Isoëtes species. In clade I, western North America is the source area for the dispersal of Isoëtes to Hokkaido and northeastern Russia via the Bering land bridge during the late Miocene. The remaining Isoëtes species from East Asia (I. sinensis, I. yunguiensis, I. hypsophila, I. orientalis, I. japonica, I. coreana, I. taiwanensis, I. jejuensis, I. hallasanensis) form a sister group to Papua New Guinean and Australian species. The biogeographic reconstruction suggests an Australian origin for the East Asian species, which arose through long-distance dispersal during the late Oligocene.

Construction of Genetic Linkage Map and Identification of Quantitative Trait Loci in Populus davidiana using Genotyping-by-sequencing (Genotyping-by-sequencing 기법을 이용한 사시나무(Populus davidiana) 유전연관지도 작성 및 양적형질 유전자좌 탐색)

  • Suvi Kim;Yang-gil Kim;Dayoung Lee;Hye-jin Lee;Kyu-Suk Kang
    • Journal of Korean Society of Forest Science / v.112 no.1 / pp.40-56 / 2023
  • Tree species within the Populus genus grow rapidly and have an excellent capacity to absorb carbon, conferring a substantial ability to purify the environment effectively. Poplar breeding can be achieved rapidly and efficiently if a genetic linkage map is constructed and quantitative trait loci (QTLs) are identified. Here, a high-density genetic linkage map was constructed for control-pollinated progeny using the genotyping-by-sequencing (GBS) technique, a next-generation sequencing method. A search was also performed for genes associated with quantitative traits located on the genetic linkage map by examining height, diameter at root collar, and resilience to insect damage. Height and diameter at root collar were measured directly, while the ability to recover from insect damage was scored, in a 4-year-old breeding population of aspen hybrids (Odae19 × Bonghyeon4 F1) established in the research forest of Seoul National University. After DNA extraction, paternity was confirmed using five microsatellite markers, and only individuals with confirmed paternity were used for the analysis. The DNA was cut with restriction enzymes, and the resulting fragments were used to prepare a GBS library, which was then sequenced. The sequencing results were aligned using Populus trichocarpa as the reference genome. Overall, 58,040 aligned single-nucleotide polymorphism (SNP) markers were identified, 17,755 of which were used for genetic linkage mapping. The genetic linkage map was divided into 19 linkage groups, with a total length of 2,129.54 cM. The analysis failed to identify any growth-related QTLs, but a gene presumed to be related to recovery from insect damage was identified on linkage group (chromosome) 4 through a genome-wide association study.

Analysis of Microbial Communities in Paddy Soil Under Organic and Conventional Farming Methods (유기 및 관행 영농법에 따른 논 토양 미생물 군집 분석)

  • Se yoon Jung;Yoon seok Kim;Ji hwan Kim;Hyuck soo Kim;Woon ki Moon;Eun mi Hong
    • Proceedings of the Korea Water Resources Association Conference / 2023.05a / pp.487-487 / 2023
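  • In agriculture, microorganisms play an important role in supplying soil nutrients, for example through nutrient solubilization and organic matter decomposition, and they have many potential applications in improving soil health, food security, and food health. Recently, growing interest in watershed environmental health, biodiversity conservation, and the efficient production of high-quality agricultural products has prompted comparative studies of the physicochemical and biological properties of soils under organic farming, a form of sustainable agriculture, and conventional farming. Microorganisms are one of the important elements of sustainable agricultural development, and greater microbial diversity is known to have a positive effect on soil fertility and crop growth. To provide baseline data on this topic, the present study compared the microbial community composition and alpha diversity (Chao1, Shannon, and Simpson indices) of paddy soils under organic and conventional farming. One organic and one conventional paddy site were selected in Yangpyeong-gun, Gyeonggi-do, and four field surveys were conducted from August to November. Microorganisms were analyzed by next-generation sequencing, targeting the 16S rRNA V3-4 region for bacteria and the ITS 3-4 region for fungi. Community composition did not differ greatly at the phylum level, but fungal community composition differed at the genus level. For example, the genus Ustilaginoidea, a pathogen that causes rice false smut, was found only in the conventional paddy soil, presumably because of excessive nitrogen fertilization. As for species diversity, bacterial diversity was higher in the conventional paddy soil, whereas fungal diversity was higher in the organic paddy soil. In conclusion, microbial communities can be regulated through systematic fertilization management, and conventional farming, with appropriate fertilization, may be able to achieve effects similar to those of organic farming in terms of soil health and food health.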
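
    The three alpha diversity indices named in this abstract (Chao1, Shannon, Simpson) can be illustrated with a minimal sketch. The abstract does not say which software or which variant of each index the authors used, so the formulas below are assumptions: the natural-log Shannon index, the Simpson index in its 1 - sum(p^2) form, and the bias-corrected Chao1 estimator. The function names and read counts are hypothetical and are not data from the study.

```python
import math

def shannon_index(counts):
    """Shannon index H' = -sum(p_i * ln p_i) over taxa with nonzero counts."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)

def simpson_index(counts):
    """Simpson diversity as 1 - sum(p_i^2); other conventions exist."""
    total = sum(counts)
    return 1.0 - sum((c / total) ** 2 for c in counts if c > 0)

def chao1(counts):
    """Bias-corrected Chao1 richness: S_obs + F1 * (F1 - 1) / (2 * (F2 + 1)),
    where F1 and F2 are the numbers of singleton and doubleton taxa."""
    s_obs = sum(1 for c in counts if c > 0)
    f1 = sum(1 for c in counts if c == 1)
    f2 = sum(1 for c in counts if c == 2)
    return s_obs + f1 * (f1 - 1) / (2 * (f2 + 1))

# Hypothetical read counts per taxon (illustrative only, not from the study).
even_sample = [10, 10, 10, 10]   # four equally abundant taxa
skewed_sample = [37, 1, 1, 2]    # one dominant taxon plus rare taxa

for name, sample in [("even", even_sample), ("skewed", skewed_sample)]:
    print(name, shannon_index(sample), simpson_index(sample), chao1(sample))
```

    With these made-up counts, the even sample scores higher on the Shannon and Simpson indices, while the skewed sample has the higher Chao1 estimate because its singletons suggest unobserved taxa. This is one way richness and evenness measures can disagree for the same pair of samples.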
