• Title/Abstract/Keywords: Informative genes

38 search results (processing time: 0.027 s)

De novo transcriptome sequencing and gene expression profiling with/without B-chromosome plants of Lilium amabile

  • Park, Doori;Kim, Jong-Hwa;Kim, Nam-Soo
    • Genomics & Informatics / Vol. 17, No. 3 / pp.27.1-27.9 / 2019
  • Supernumerary B chromosomes were found in Lilium amabile (2n = 2x = 24), an endemic Korean lily that grows in the wild throughout the Korean Peninsula. The extra B chromosomes do not affect the host-plant morphology; therefore, whole transcriptome analysis was performed in 0B and 1B plants to identify differentially expressed genes. A total of 154,810 transcripts were obtained from over 10 Gbp of data by de novo assembly. By mapping the raw reads to the de novo transcripts, we identified 7,852 differentially expressed genes (|log2FC| > 10), of which 4,059 and 3,794 were up- and down-regulated, respectively, in 1B plants compared to 0B plants. Functional enrichment analysis revealed that various differentially expressed genes were involved in cellular processes including the cell cycle, chromosome breakage and repair, and microtubule formation, all of which may be related to the occurrence and maintenance of B chromosomes. Our data provide insight into transcriptomic changes and the evolution of plant B chromosomes and deliver an informative database for future study of B chromosome transcriptomes in the Korean lily.
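The fold-change filter described in this abstract can be illustrated with a minimal Python sketch. It assumes one expression value per transcript in each of the 0B and 1B plants, and it adds a pseudocount so that zero counts do not break the logarithm. The function names, the pseudocount, and the toy values are illustrative assumptions, not the authors' pipeline.

```python
import math

def log2_fold_change(expr_1b, expr_0b, pseudocount=1.0):
    """log2 ratio of 1B over 0B expression; the pseudocount is an assumption."""
    return math.log2((expr_1b + pseudocount) / (expr_0b + pseudocount))

def split_degs(expression, threshold=10.0):
    """Split transcripts into up- and down-regulated sets using |log2FC| > threshold.

    `expression` maps a transcript id to a (0B value, 1B value) pair.
    """
    up, down = [], []
    for transcript_id, (expr_0b, expr_1b) in expression.items():
        lfc = log2_fold_change(expr_1b, expr_0b)
        if lfc > threshold:
            up.append(transcript_id)
        elif lfc < -threshold:
            down.append(transcript_id)
    return up, down

# Hypothetical toy values, not data from the study.
toy = {"t1": (0.0, 5000.0), "t2": (4000.0, 0.0), "t3": (100.0, 120.0)}
up, down = split_degs(toy)
print(up, down)
```

A real analysis would also test for statistical significance; this sketch covers only the fold-change cutoff.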

A hybrid method to compose an optimal gene set for multi-class classification using mRMR and modified particle swarm optimization

  • 이선호
    • The Korean Journal of Applied Statistics (응용통계연구) / Vol. 33, No. 6 / pp.683-696 / 2020
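  • The optimal gene set for predicting the multi-class phenotype of a sample is a collection of genes that predicts the phenotype accurately with only a small number of genes. Several statistics already exist for detecting differentially expressed genes, and combining them with K-means clustering makes it possible to select differentially expressed genes with low redundancy. Building on these, we propose a modified particle swarm optimization method that composes a gene set capable of accurate multi-class classification with few genes. Using the well-known ALL data (248 cases) and SRBCT data (83 cases), we showed that the proposed method can find an optimal gene set.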

Ovarian Cancer Microarray Data Classification System Using Marker Genes Based on Normalization

  • 박수영;정채영
    • Journal of the Korea Institute of Information and Communication Engineering (한국정보통신학회논문지) / Vol. 15, No. 9 / pp.2032-2037 / 2011
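  • A marker gene is a gene whose expression level characterizes a specific experimental condition. Such genes show significant differences in expression level between groups and are likely to be the genes that actually cause those differences, so they can be used in studies that look for marker genes related to a specific biological phenomenon. In this paper, we propose a system that first normalizes the data with the most widely used of the normalization methods proposed so far and then extracts marker genes by ranking the genes according to a statistic. A multilayer perceptron (MLP) neural network classifier was used to compare the performance of the normalization methods. Applying the MLP algorithm to the microarray data set containing the 8 marker genes selected by ANOVA after Lowess normalization gave the highest classification accuracy, 99.32%, and the lowest estimated prediction error.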
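The ranking step in this abstract (score each gene with a statistic, then keep the top few) can be sketched in Python using the one-way ANOVA F-statistic. This minimal sketch assumes the data are already normalized. The gene names, labels, and values are hypothetical, and the Lowess normalization and MLP classifier are outside its scope.

```python
def anova_f(values, labels):
    """One-way ANOVA F-statistic for one gene across labelled sample groups."""
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(label, []).append(value)
    n, k = len(values), len(groups)
    grand_mean = sum(values) / n
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2
                     for g in groups.values())
    ss_within = sum((v - sum(g) / len(g)) ** 2
                    for g in groups.values() for v in g)
    if ss_within == 0:
        # Degenerate case: no within-group variance at all.
        return float("inf") if ss_between > 0 else 0.0
    return (ss_between / (k - 1)) / (ss_within / (n - k))

def top_genes(matrix, labels, n_top):
    """Return the n_top gene names with the largest F-statistic."""
    scores = {gene: anova_f(values, labels) for gene, values in matrix.items()}
    return sorted(scores, key=scores.get, reverse=True)[:n_top]

# Hypothetical, already-normalized expression values.
labels = ["tumor", "tumor", "tumor", "normal", "normal", "normal"]
matrix = {
    "geneA": [5.1, 5.0, 4.9, 1.0, 1.1, 0.9],
    "geneB": [2.0, 2.1, 1.9, 2.0, 2.1, 1.9],
    "geneC": [3.0, 3.5, 2.5, 2.0, 2.4, 1.6],
}
selected = top_genes(matrix, labels, n_top=2)
print(selected)
```

The selected genes would then be the only inputs passed to the classifier.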

Transcriptional Profiles of Peripheral Blood Leukocytes Identify Patients with Cholangiocarcinoma and Predict Outcome

  • Subimerb, Chutima;Wongkham, Chaisiri;Khuntikeo, Narong;Leelayuwat, Chanvit;McGrath, Michael S.;Wongkham, Sopit
    • Asian Pacific Journal of Cancer Prevention / Vol. 15, No. 10 / pp.4217-4224 / 2014
  • Cholangiocarcinoma (CCA), a slow-growing but highly metastatic tumor, is highly prevalent in Northeast Thailand. Specific tests that predict the prognosis of CCA remain elusive. The present study was designed to investigate whether peripheral blood leukocyte (PBL) transcriptional profiles might be of use as a prognostic test in CCA patients. Gene expression profiles of PBLs from 9 CCA patients and 8 healthy subjects were obtained using the Affymetrix HG_U133 Plus 2.0 GeneChip. We identified informative PBL gene expression profiles that could reliably distinguish CCA patients from healthy subjects. Of these CCA-specific genes, 117 were upregulated and 60 were downregulated. The molecular and cellular functions predicted for these CCA-specific genes according to the Gene Ontology database indicated differential PBL expression of host immune response and tumor progression genes (EREG, TGF-$\beta$1, CXCL2, CXCL3, IL-8, and VEGFA). The expression levels of 9 differentially expressed genes were verified in 36 CCA patients vs 20 healthy subjects. A set of three tumor invasion-related genes (PLAU, CTSL and SERPINB2), combined into a "prognostic index", was found to be an independent and statistically significant predictor of CCA patient survival. The present study shows that CCA PBLs may serve as disease-predictive, clinically accessible surrogates for identifying expressed genes reflective of CCA disease severity.

Molecular Systematics of the Tephritoidea (Insecta: Diptera): Phylogenetic Signal in 16S and 28S rDNAs for Inferring Relationships Among Families

  • Han, Ho-Yeon;Ro, Kyung-Eui;Choi, Deuk-Soo;Kim, Sam-Kyu
    • Animal cells and systems / Vol. 6, No. 2 / pp.145-151 / 2002
  • Phylogenetic signal present in the mitochondrial 16S ribosomal RNA gene (16S rDNA) and the nuclear large subunit ribosomal RNA gene (28S rDNA) was explored to assess their utility in resolving family-level relationships of the superfamily Tephritoidea. These two genes were chosen because they appear to evolve at different rates and might help resolve both shallow and deeper phylogenetic branches within a highly diversified group. For the 16S rDNA data set, the number of aligned sites was 1,258 bp, but 1,204 bp were used for analysis after excluding sites of ambiguous alignment. Among these 1,204 sites, 662 sites were variable and 450 sites were informative for parsimony analysis. For the 28S rDNA data set, the number of aligned sites was 1,102 bp, but 1,000 bp were used for analysis after excluding sites of ambiguous alignment. Among these 1,000 sites, 235 sites were variable and 95 sites were informative for parsimony analysis. Our analyses suggest that: (1) while 16S rDNA is useful for resolving more recent phylogenetic divergences, 28S rDNA can be used to define much deeper phylogenetic branches; (2) the combined analysis of the 16S and 28S rDNAs enhances the overall resolution without losing phylogenetic signal from either single-gene analysis; and (3) additional genes that evolve at intermediate rates between the 16S and 28S rDNAs are needed to further resolve relationships among the tephritoid families.
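The counts of variable and parsimony-informative sites reported in this abstract follow the standard definitions. A site is variable if it shows at least two character states, and it is parsimony-informative if at least two states each occur in at least two sequences. The minimal Python sketch below assumes a gap-free toy alignment and simply ignores any character other than A, C, G, or T. The sequences are hypothetical.

```python
from collections import Counter

def classify_sites(alignment):
    """Count variable and parsimony-informative columns in an alignment.

    `alignment` is a list of equal-length sequences. Characters other than
    A, C, G, T (gaps, ambiguity codes) are ignored, which is an assumption.
    """
    variable = informative = 0
    for column in zip(*alignment):
        counts = Counter(base for base in column if base in "ACGT")
        if len(counts) >= 2:
            variable += 1
            # Informative: at least two states, each seen in two or more sequences.
            if sum(1 for c in counts.values() if c >= 2) >= 2:
                informative += 1
    return variable, informative

# Hypothetical four-taxon alignment.
toy_alignment = ["ACGTA", "ACGTA", "ATGCA", "ATGCT"]
variable, informative = classify_sites(toy_alignment)
print(variable, informative)
```

In the toy alignment the last column is variable but not informative, because the state T occurs in only one sequence.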

종양 분류를 위한 마이크로어레이 데이터 분류 모델 설계와 구현 (The Design and Implement of Microarry Data Classification Model for Tumor Classification)

  • 박수영;정채영
    • Journal of the Korea Institute of Information and Communication Engineering (한국정보통신학회논문지) / Vol. 11, No. 10 / pp.1924-1929 / 2007
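  • To achieve the ultimate goals of comprehensive research efforts such as the Human Genome Project, the large volumes of data obtained from them must be given new practical meaning. Microarray-based tumor classification can contribute to accurate classification by statistically identifying gene expression patterns that differ between tumor types. To classify tumors effectively with current microarray technology, it is therefore essential to select informative genes that are closely related to a specific tumor class. In this paper, we used cDNA microarray data for 3,840 genes from an epithelial stem cell differentiation experiment in cancer-bearing rats, normalized the data, extracted a separate list of informative genes to build a more accurate tumor classification model, and evaluated performance by comparing the results of the individual experiments. Classifying the genes selected with the Pearson product-moment correlation coefficient using a multilayer perceptron classifier gave an accuracy of 98.6%.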

Identification of Gene-based Potential Biomarkers for Cephalexin-induced Nephrotoxicity in Mice

  • Park, Han-Jin;Oh, Jung-Hwa;Hwang, Ji-Yoon;Lim, Jung-Sun;Jeong, Sun-Young;Kim, Yong-Bum;Yoon, Seok-Joo
    • Molecular & Cellular Toxicology / Vol. 2, No. 3 / pp.193-201 / 2006
  • Cephalexin, one of the most widely prescribed cephalosporins, has been reported to cause acute renal failure as a side effect in humans and experimental animals. Although numerous animal studies of cephalosporin nephrotoxicity have been reported, the molecular and cellular nephrotoxic mechanisms of cephalexin are still unknown. This investigation evaluated the time-dependent gene expression profile of the mouse kidney during cephalexin-induced nephrotoxicity. C57BL/6 female mice were administered either saline or 1,000 mg/kg cephalexin intraperitoneally. Mice were sacrificed at 3, 6, and 24 hr after administration. Blood biochemical and histopathological results indicated cephalexin-induced nephrotoxicity. Microarray experiments were carried out using the Affymetrix GeneChip®. There were 198 informative genes whose expression changed significantly, by >5-fold versus control, at 3, 6, and 24 hr (p<0.01), of which 156 and 42 were up- and down-regulated, respectively. Major classes of up-regulated genes at 3 and 6 hr included those involved in the MAPK/Jak-STAT signaling pathways and in immune response, such as cytokine-cytokine receptor interaction and the complement and coagulation cascades. At 24 hr, up-regulated genes were mainly involved in regeneration/repair and immune response; down-regulated genes were generally associated with transporters and intermediary metabolism. Among the up-regulated genes at 24 hr, several potential biomarkers of nephrotoxicity, such as Kim-1, Fga, Timp1, and Slc34a2, were clustered in the same category. In addition, Tnfrsf12a and Lcn2, which were consistently up-regulated (>5-fold), were also included as potential biomarkers. These results may provide clues for elucidating the mechanism of cephalexin-induced nephrotoxicity and for evaluating potential biomarkers to assess nephrotoxicity.

Identification of Functional and In silico Positional Differentially Expressed Genes in the Livers of High- and Low-marbled Hanwoo Steers

  • Lee, Seung-Hwan;Park, Eung-Woo;Cho, Yong-Min;Yoon, Duhak;Park, Jun-Hyung;Hong, Seong-Koo;Im, Seok-Ki;Thompson, J.M.;Oh, Sung-Jong
    • Asian-Australasian Journal of Animal Sciences / Vol. 20, No. 9 / pp.1334-1341 / 2007
  • This study identified hepatic differentially expressed genes (DEGs) affecting the marbling of muscle. Most dietary nutrients bypass the liver and produce plasma lipoproteins. These plasma lipoproteins transport free fatty acids to the target tissue, adipose tissue and muscle. We examined hepatic genes differentially expressed in a differential-display reverse transcription-polymerase chain reaction (ddRT-PCR) analysis comparing high- and low-marbled Hanwoo steers. Using 60 arbitrary primers, we found 13 candidate genes that were upregulated and five candidate genes that were downregulated in the livers of high-marbled Hanwoo steers compared to low-marbled individuals. A BLAST search for the 18 DEGs revealed that 14 were well characterized, while four were not annotated. We examined four DEGs: ATP synthase F0, complement component CD, insulin-like growth factor binding protein-3 (IGFBP3) and phosphatidylethanolamine binding protein (PEBP). Of these, only two genes (complement component CD and IGFBP3) were differentially expressed at p<0.05 between the livers of high- and low-marbled individuals. The mean mRNA levels of the PEBP and ATP synthase F0 genes did not differ significantly between the livers of high- and low-marbled individuals. Moreover, these DEGs showed very high inter-individual variation in expression. These informative DEGs were assigned to the bovine chromosome in a BLAST search of MS marker subsets and the bovine genome sequence. Genes related to energy metabolism (ATP synthase F0, ketohexokinase, electron-transfer flavoprotein-ubiquinone oxidoreductase and NADH hydrogenase) were assigned to BTA 1, 11, 17, and 22, respectively. Syntaxin, IGFBP3, decorin, the bax inhibitor gene and the PEBP gene were assigned to BTA 3, 4, 5, 5, and 17, respectively. In this study, the in silico physical maps provided information on the specific location of candidate genes associated with economic traits in cattle.

Rank-based Multiclass Gene Selection for Cancer Classification with Naive Bayes Classifiers based on Gene Expression Profiles

  • 홍진혁;조성배
    • Journal of KIISE: Computer Systems and Theory (한국정보과학회논문지:시스템및이론) / Vol. 35, No. 8 / pp.372-377 / 2008
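  • Multiclass cancer classification from gene expression data, an active area of recent research, determines the type of cancer by analyzing the large-scale gene information obtained from DNA microarrays. Because the collected gene expression data also contain genes unrelated to the target cancer, useful genes must be selected to obtain high classification performance. Conventional rank-based gene selection was designed for binary classes and relies on an ideal marker gene, so it is of limited use when applied directly to multiclass cancer classification. In this paper, we propose a rank-based multiclass gene selection method that analyzes the distribution of gene expression levels directly, without using an ideal marker gene. Gene expression levels are discretized, frequencies are computed from the training data to measure the discriminability between classes, and the selected genes are then used with a naive Bayes classifier to perform multiclass cancer classification. Applying the proposed method to several multiclass cancer classification data sets confirmed that it outperforms existing gene selection methods.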
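The discretize-then-count idea in this abstract can be sketched in Python. The abstract does not give the exact discriminability measure, so this sketch substitutes an assumed stand-in: equal-width binning, followed by the average total-variation distance between the class-conditional bin frequencies over all class pairs. The bin count, the score, and the toy data are illustrative assumptions rather than the authors' formula.

```python
from itertools import combinations

def discretize(values, n_bins=3):
    """Equal-width binning of one gene's expression levels (an assumption)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0] * len(values)
    width = (hi - lo) / n_bins
    return [min(int((v - lo) / width), n_bins - 1) for v in values]

def class_frequencies(bins, labels, n_bins=3):
    """Relative frequency of each bin within each class."""
    counts = {}
    for b, label in zip(bins, labels):
        counts.setdefault(label, [0] * n_bins)[b] += 1
    return {label: [c / sum(row) for c in row] for label, row in counts.items()}

def discriminability(values, labels, n_bins=3):
    """Mean total-variation distance between class pairs (assumed stand-in score)."""
    freq = class_frequencies(discretize(values, n_bins), labels, n_bins)
    pairs = list(combinations(sorted(freq), 2))
    total = sum(0.5 * sum(abs(p - q) for p, q in zip(freq[a], freq[b]))
                for a, b in pairs)
    return total / len(pairs)

# Hypothetical three-class toy data.
labels = ["A", "A", "B", "B", "C", "C"]
separating_gene = [1.0, 1.0, 5.0, 5.0, 9.0, 9.0]
uninformative_gene = [1.0, 9.0, 1.0, 9.0, 1.0, 9.0]
print(discriminability(separating_gene, labels),
      discriminability(uninformative_gene, labels))
```

Genes would be ranked by this score, and the top-ranked ones passed to the naive Bayes classifier, which can reuse the same discretized frequencies as its likelihood estimates.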

Feature Selection via Embedded Learning Based on Tangent Space Alignment for Microarray Data

  • Ye, Xiucai;Sakurai, Tetsuya
    • Journal of Computing Science and Engineering / Vol. 11, No. 4 / pp.121-129 / 2017
  • Feature selection has been widely established as an efficient technique for microarray data analysis. Feature selection aims to search for the most important feature/gene subset of a given dataset according to its relevance to the current target. Unsupervised feature selection is considered challenging because label information is lacking. In this paper, we propose a novel method for unsupervised feature selection, which incorporates embedded learning and $\ell_{2,1}$-norm sparse regression into a single framework to select genes in microarray data analysis. Local tangent space alignment is applied during embedded learning to preserve the local data structure. The $\ell_{2,1}$-norm sparse regression acts as a constraint that helps learn the gene weights jointly, so that the proposed method selects the informative genes that better capture the natural classes of samples. We provide an effective algorithm to solve the optimization problem in our method. Finally, to validate its efficacy, we evaluate the proposed method on real microarray gene expression datasets. The experimental results demonstrate that the proposed method achieves promising performance.
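The $\ell_{2,1}$-norm mentioned in this abstract is the sum of the Euclidean norms of the rows of a weight matrix, which drives entire rows towards zero. The minimal Python sketch below assumes a weight matrix `W` with one row per gene and ranks genes by row norm, a common convention in this family of methods. The matrix is a hypothetical example and the paper's optimization algorithm is not reproduced.

```python
import math

def row_norms(W):
    """Euclidean (l2) norm of each row of W."""
    return [math.sqrt(sum(w * w for w in row)) for row in W]

def l21_norm(W):
    """l2,1-norm: the sum of the l2 norms of the rows."""
    return sum(row_norms(W))

def rank_genes(W, k):
    """Indices of the k rows (genes) with the largest l2 norm."""
    norms = row_norms(W)
    return sorted(range(len(W)), key=lambda i: norms[i], reverse=True)[:k]

# Hypothetical learned weights: 3 genes x 2 embedding dimensions.
W = [[3.0, 4.0],
     [0.0, 0.0],
     [1.0, 0.0]]
print(l21_norm(W), rank_genes(W, k=2))
```

Rows that the sparse regression shrinks to zero, such as the second gene here, drop out of the selection.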