• Title/Abstract/Keyword: Gene Selection

Search Result 868, Processing Time 0.034 seconds

Simultaneous Optimization of Gene Selection and Tumor Classification Using Intelligent Genetic Algorithm and Support Vector Machine

  • Huang, Hui-Ling;Ho, Shinn-Ying
    • Proceedings of the Korean Society for Bioinformatics Conference / 2005.09a / pp.57-62 / 2005
  • Microarray gene expression profiling is one of the most important research topics in the clinical diagnosis of disease. Of thousands of genes, only a small number show strong correlation with a given phenotype. Identifying such an optimal subset from thousands of genes is intractable, yet it plays a crucial role in classifying multi-class gene expression models from tumor samples. This paper proposes an efficient classifier design method that simultaneously selects the most relevant genes using an intelligent genetic algorithm (IGA) and designs an accurate classifier using a Support Vector Machine (SVM). IGA, with an intelligent crossover operation based on orthogonal experimental design, can efficiently solve large-scale parameter optimization problems. Therefore, the parameters of the SVM and the binary parameters for gene selection are all encoded in one chromosome to achieve simultaneous optimization of gene selection and the associated SVM for accurate tumor classification. The effectiveness of the proposed IGA/SVM method is evaluated using four benchmark datasets. Computer simulation shows that IGA/SVM performs better than the existing method in terms of classification accuracy.
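The chromosome described in this abstract, which carries both the binary gene-selection flags and the SVM parameters, could be decoded along the following lines. This is only a minimal sketch: the bit layout, the 4-bit width per parameter, the log2 grid starting at exponent -5, and the names `decode`, `C` and `gamma` are illustrative assumptions, not details taken from the paper.

```python
def decode(chromosome, n_genes, bits_per_param=4, low_exponent=-5):
    """Split a flat list of 0/1 bits into a gene subset and two SVM parameters.

    Assumed (hypothetical) layout: [gene mask | C bits | gamma bits].
    """
    # The first n_genes bits flag which genes are selected.
    mask = chromosome[:n_genes]
    selected = [i for i, bit in enumerate(mask) if bit == 1]

    def bits_to_value(bits):
        # Read the bits as an unsigned integer and map it onto a log2 grid.
        as_int = int("".join(str(b) for b in bits), 2)
        return 2.0 ** (low_exponent + as_int)

    c_start = n_genes
    g_start = n_genes + bits_per_param
    C = bits_to_value(chromosome[c_start:g_start])
    gamma = bits_to_value(chromosome[g_start:g_start + bits_per_param])
    return selected, C, gamma


# Example: 5 candidate genes, then 4 bits for C and 4 bits for gamma.
chromosome = [1, 0, 1, 0, 0, 0, 1, 0, 1, 0, 0, 1, 1]
selected, C, gamma = decode(chromosome, n_genes=5)
```

Because one bit string carries both parts, a single crossover or mutation can change the gene subset and the classifier settings together, which is the sense in which the two are optimized simultaneously.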

Cancer-Subtype Classification Based on Gene Expression Data

  • Cho Ji-Hoon;Lee Dongkwon;Lee Min-Young;Lee In-Beum
    • Journal of Institute of Control, Robotics and Systems / v.10 no.12 / pp.1172-1180 / 2004
  • Gene expression data, a product of high-throughput technology, have recently become widely available, and the related studies (so-called bioinformatics) now occupy an important position in biological and medical research. The microarray is a revolutionary technology that enables us to monitor several thousand genes simultaneously and thus to gain insight into phenomena in the human body (e.g. the mechanism of cancer progression) at the molecular level. To obtain useful information from such gene expression measurements, it is essential to analyze the data with appropriate techniques. However, the high dimensionality of the data can cause problems such as the curse of dimensionality and singularity in matrix computations, which makes it difficult to apply conventional data analysis methods. Therefore, developing methods that can treat such data effectively has become a challenging issue in computational biology. This research focuses on gene selection and classification for cancer subtype discrimination based on gene expression (microarray) data.

Feature Selection with Ensemble Learning for Prostate Cancer Prediction from Gene Expression

  • Abass, Yusuf Aleshinloye;Adeshina, Steve A.
    • International Journal of Computer Science & Network Security / v.21 no.12spc / pp.526-538 / 2021
  • Machine and deep learning-based models are emerging techniques that are being used to address prediction problems in biomedical data analysis. DNA sequence prediction is a critical problem that has attracted a great deal of attention in the biomedical domain. Machine and deep learning-based models have been shown to provide more accurate results when compared to conventional regression-based models. The prediction of the gene sequence that leads to cancerous diseases, such as prostate cancer, is crucial. Identifying the most important features in a gene sequence is a challenging task. Extracting the components of the gene sequence that can provide an insight into the types of mutation in the gene is of great importance as it will lead to effective drug design and the promotion of the new concept of personalised medicine. In this work, we extracted the exons in the prostate gene sequences that were used in the experiment. We built a Deep Neural Network (DNN) and a Bi-directional Long Short-Term Memory (Bi-LSTM) model using a k-mer encoding for the DNA sequence and one-hot encoding for the class label. The models were evaluated using different classification metrics. Our experimental results show that the DNN model achieves a training accuracy of 99 percent and a validation accuracy of 96 percent. The Bi-LSTM model achieves a training accuracy of 95 percent and a validation accuracy of 91 percent.
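The two encodings named in this abstract can be illustrated with a short sketch. The abstract does not state the value of k, whether the k-mers overlap, or the class names, so the choice of k=3, a stride of one base, and the labels "normal" and "tumor" are assumptions made purely for illustration.

```python
def kmers(sequence, k=3):
    """Return the overlapping k-mers of a DNA string (stride of one base)."""
    return [sequence[i:i + k] for i in range(len(sequence) - k + 1)]


def one_hot(label, classes):
    """Encode a class label as a 0/1 vector over a fixed, ordered class list."""
    return [1 if c == label else 0 for c in classes]


# Hypothetical example input; not data from the paper.
tokens = kmers("ATGCA", k=3)
target = one_hot("tumor", ["normal", "tumor"])
```

The resulting k-mer tokens would typically be mapped to integer indices or embeddings before being fed to a DNN or Bi-LSTM.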

Identification of a Rice Gene (Bph 1) Conferring Resistance to Brown Planthopper (Nilaparvata lugens Stal) Using STS Markers

  • Kim, Suk-Man;Sohn, Jae-Keun
    • Molecules and Cells / v.20 no.1 / pp.30-34 / 2005
  • This study was carried out to identify a high-resolution marker for a gene conferring resistance to brown planthopper (BPH) biotype 1, using japonica-type resistant lines. Bulked segregant analyses were conducted using 520 RAPD primers to identify RAPD fragments linked to the BPH resistance gene. Eleven RAPDs were shown to be polymorphic amplicons between resistant and susceptible progeny. One of these primers, OPE 18, which amplified a 923 bp band tightly linked to resistance, was converted into a sequence-tagged-site (STS) marker. The STS marker, BpE18-3, was easily detectable as a dominant band with tight linkage (3.9 cM) to Bph1. It promises to be useful as a marker for assisted selection of resistant progeny in backcross breeding programs to introgress the resistance gene into elite japonica cultivars.

Characteristics of the Plasmid pCS100 Containing Nisin Resistant Gene from Lactococcus lactis subsp. lactis ATCC7962

  • 송종효;이형주;김정환;정대균
    • Microbiology and Biotechnology Letters / v.26 no.6 / pp.562-565 / 1998
  • The nisin-producing, nisin-resistant strain L. lactis subsp. lactis ATCC7962 harbored six plasmids. To find the plasmid containing a nisin resistance gene, these plasmids were transformed into L. lactis LM0230, a plasmid-free, nisin-sensitive strain. After screening on selection media containing nisin (150 μg/mL), several nisin-resistant transformants were obtained, and their level of nisin resistance was very similar to that of wild-type L. lactis subsp. lactis ATCC7962. A 26.5 kb plasmid, named pCS100, which confers resistance to nisin, was identified in the transformants. pCS100 was digested with EcoRI, and Southern blot hybridization was performed with a nisI probe to localize the nisin resistance gene. A 4 kb EcoRI fragment showed a strong positive signal, and it was cloned into pBluescript for potential use as a selection marker.

Training Molecularly Enabled Field Biologists to Understand Organism-Level Gene Function

  • Kang, Jin-Ho;Baldwin, Ian T.
    • Molecules and Cells / v.26 no.1 / pp.1-4 / 2008
  • A gene's influence on an organism's Darwinian fitness ultimately determines whether it will be lost, maintained or modified by natural selection, yet biologists have few gene expression systems in which to measure whole-organism gene function. In the Department of Molecular Ecology at the Max Planck Institute for Chemical Ecology we are training "molecularly enabled field biologists" to use transformed plants silenced in the expression of environmentally regulated genes and the plant's native habitats as "laboratories." Research done in these natural laboratories will, we hope, increase our understanding of the function of genes at the level of the organism. Examples of the role of threonine deaminase and RNA-directed RNA polymerases illustrate the process.

A Study on the Timetabling by Evolution Programs

  • 박유석;김용범;김병재;오충환;김복만
    • Journal of Korean Society of Industrial and Systems Engineering / v.19 no.38 / pp.43-50 / 1996
  • Evolution Programs, a form of Genetic Algorithms with a transformed chromosome representation, are applied to university timetabling, which is an NP-hard problem. When the algorithm is applied, each class is assigned to a specific category in the feasible solution space. At the same time, the existing gene used in the chromosome representation of Evolution Programs is modified into a gene carrying multiple pieces of information so that constraints are satisfied effectively. A new crossover method for faster operation in recombination is attempted. Roulette wheel selection and tournament selection are prepared.
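The two selection schemes mentioned at the end of this abstract can be sketched in their textbook form as follows. The abstract gives no implementation details, so the function names, the tournament size `k`, and the requirement that fitness values be non-negative are assumptions of this sketch rather than the authors' method.

```python
import random


def roulette_wheel_select(population, fitness, rng=random):
    """Pick one individual with probability proportional to its fitness.

    Assumes all fitness values are non-negative and at least one is positive.
    """
    total = sum(fitness)
    pick = rng.uniform(0, total)
    running = 0.0
    for individual, score in zip(population, fitness):
        running += score
        if pick < running:
            return individual
    # Fallback for the rare case where rounding puts pick at the upper bound.
    return population[-1]


def tournament_select(population, fitness, k=2, rng=random):
    """Draw k distinct individuals at random and return the fittest of them."""
    contenders = rng.sample(range(len(population)), k)
    best = max(contenders, key=lambda i: fitness[i])
    return population[best]


# Hypothetical timetables labelled by name, with made-up fitness scores.
population = ["timetable_a", "timetable_b", "timetable_c"]
fitness = [1.0, 3.0, 2.0]
parent_1 = roulette_wheel_select(population, fitness)
parent_2 = tournament_select(population, fitness, k=2)
```

Roulette wheel selection is sensitive to the scale of the fitness values, whereas tournament selection depends only on their ranking, which is one common reason to compare the two.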

ON THE LIMITING DIFFUSION OF SPECIAL DIPLOID MODEL IN POPULATION GENETICS

  • CHOI, WON
    • Bulletin of the Korean Mathematical Society / v.42 no.2 / pp.397-404 / 2005
  • In this note, we characterize the limiting diffusion of a diploid model by defining the discrete generator for the rescaled Markov chain. We conclude that this limiting diffusion model has an uncountable state space, mutation and selection, and a special 'mutation or gene conversion rate'.

Detection of copy number variation and selection signatures on the X chromosome in Chinese indigenous sheep with different types of tail

  • Zhu, Caiye;Li, Mingna;Qin, Shizhen;Zhao, Fuping;Fang, Suli
    • Asian-Australasian Journal of Animal Sciences / v.33 no.9 / pp.1378-1386 / 2020
  • Objective: Chinese indigenous sheep breeds can be classified into the following three categories by their tail morphology: fat-tailed, fat-rumped and thin-tailed sheep. The typical sheep breeds corresponding to fat-tailed, fat-rumped, and thin-tailed sheep are large-tailed Han, Altay, and Tibetan sheep, respectively. Detection of copy number variation (CNV) and selection signatures provides information on the genetic mechanisms underlying the phenotypic differences of the different sheep types. Methods: In this study, PennCNV software and F-statistics (FST) were implemented to detect CNV and selection signatures, respectively, on the X chromosome in three Chinese indigenous sheep breeds using ovine high-density 600K single nucleotide polymorphism arrays. Results: In large-tailed Han, Altay, and Tibetan sheep, respectively, a total of six, four and 22 CNV regions (CNVRs) with lengths of 1.23, 0.93, and 7.02 Mb were identified on the X chromosome. In addition, 49, 34, and 55 candidate selection regions with respective lengths of 27.49, 16.47, and 25.42 Mb were identified in large-tailed Han, Altay, and Tibetan sheep, respectively. The bioinformatics analysis results indicated several genes in these regions were associated with fat, including dehydrogenase/reductase X-linked, calcium voltage-gated channel subunit alpha1 F, and patatin like phospholipase domain containing 4. In addition, three other genes were identified from this analysis: the family with sequence similarity 58 member A gene was associated with energy metabolism, the serine/arginine-rich protein specific kinase 3 gene was associated with skeletal muscle development, and the interleukin 2 receptor subunit gamma gene was associated with the immune system. Conclusion: The results of this study indicated CNVRs and selection regions on the X chromosome of Chinese indigenous sheep contained several genes associated with various heritable traits.
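As a rough illustration of the fixation index (FST) used in this abstract to flag selection signatures, the sketch below computes Wright's FST for a single biallelic SNP from the allele frequencies in two populations. The abstract does not say which FST estimator or windowing scheme was used, so the formula FST = (HT - HS) / HT with equally weighted populations, and the function name `wright_fst`, are assumptions chosen for simplicity.

```python
def wright_fst(p1, p2):
    """Wright's FST for one biallelic SNP in two equally weighted populations.

    p1 and p2 are the frequencies of the same allele in each population.
    """
    # Mean expected heterozygosity within the populations.
    h_s = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2
    # Expected heterozygosity of the pooled population.
    p_bar = (p1 + p2) / 2
    h_t = 2 * p_bar * (1 - p_bar)
    if h_t == 0:
        # The SNP is monomorphic overall, so there is no differentiation to measure.
        return 0.0
    return (h_t - h_s) / h_t


# Hypothetical allele frequencies, not values from the study.
fst_same = wright_fst(0.5, 0.5)
fst_fixed = wright_fst(0.0, 1.0)
```

Values near 0 indicate similar allele frequencies in the two populations, while values near 1 indicate strong differentiation. Genomic regions with unusually high FST are the usual candidates for selection.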

Identification of chromosomal translocation causing inactivation of the gene encoding anthocyanidin synthase in white pomegranate (Punica granatum L.) and development of a molecular marker for genotypic selection of fruit colors

  • Jeong, Hyeon-ju;Park, Moon-Young;Kim, Sunggil
    • Horticulture, Environment, and Biotechnology : HEB / v.59 no.6 / pp.857-864 / 2018
  • Previous studies have not detected transcripts of the gene encoding anthocyanidin synthase (ANS) in white pomegranates (Punica granatum L.) and suggest that a large-sized insertion in the coding region of the ANS gene might be the causal mutation. To elucidate the identity of the putative insertion, 3887-bp 5' and 3392-bp 3' partial sequences of the insertion site were obtained by genome walking and a gene coding for an expansin-like protein was identified in these genome-walked sequences. An identical protein (GenBank accession OWM71963) isolated from pomegranate was identified from BLAST search. Based on information of OWM71963, a 5.8-Mb scaffold sequence with genes coding for the expansin-like protein and ANS were identified. The scaffold sequence assembled from a red pomegranate cultivar also contained all genome-walked sequences. Analysis of positions and orientations of these genes and genome-walked sequences revealed that the 27,786-bp region, including the 88-bp 5' partial sequences of the ANS gene, might be translocated into an approximately 22-kb upstream region in an inverted orientation. Borders of the translocated region were confirmed by PCR amplification and sequencing. Based on the translocation mutation, a simple PCR codominant marker was developed for efficient genotyping of the ANS gene. This molecular marker could serve as a useful tool for selecting desirable plants at young seedling stages in pomegranate breeding programs.