• Title/Summary/Keyword: optimal gene set

A hybrid method to compose an optimal gene set for multi-class classification using mRMR and modified particle swarm optimization (mRMR과 수정된 입자군집화 방법을 이용한 다범주 분류를 위한 최적유전자집단 구성)

  • Lee, Sunho
    • The Korean Journal of Applied Statistics / v.33 no.6 / pp.683-696 / 2020
  • The aim of this research is to find an optimal gene set that provides highly accurate multi-class classification with a minimum number of genes. A two-stage procedure is proposed. In the first stage, within the minimum redundancy and maximum relevance (mRMR) framework, the data are filtered using several statistics that rank differentially expressed genes and K-means clustering that reduces redundancy between genes. In the second stage, a modified particle swarm optimization selects a small subset of informative genes. Two well-known multi-class microarray data sets, ALL and SRBCT, are analyzed to demonstrate the effectiveness of this hybrid method.
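
The filtering idea in this abstract can be illustrated with a minimal greedy mRMR sketch. It is not the paper's procedure: the abstract does not say which ranking statistics are used, so this sketch assumes a one-way ANOVA F-statistic for relevance, mean absolute Pearson correlation for redundancy, and the quotient form of the criterion. The K-means step and the modified particle swarm stage are omitted, and all function names are hypothetical.

```python
from statistics import mean

def f_statistic(values, labels):
    """One-way ANOVA F-statistic of one gene's expression values across classes."""
    groups = {}
    for v, c in zip(values, labels):
        groups.setdefault(c, []).append(v)
    grand = mean(values)
    between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups.values())
    within = sum((v - mean(g)) ** 2 for g in groups.values() for v in g)
    df_between, df_within = len(groups) - 1, len(values) - len(groups)
    return (between / df_between) / (within / df_within + 1e-12)

def abs_corr(a, b):
    """Absolute Pearson correlation between two genes."""
    ma, mb = mean(a), mean(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    var_a = sum((x - ma) ** 2 for x in a)
    var_b = sum((y - mb) ** 2 for y in b)
    return abs(cov) / ((var_a * var_b) ** 0.5 + 1e-12)

def mrmr_select(genes, labels, n_select):
    """Greedy mRMR (assumed quotient form); genes is a list of per-gene value lists."""
    relevance = [f_statistic(g, labels) for g in genes]
    # Start from the single most relevant gene.
    selected = [max(range(len(genes)), key=relevance.__getitem__)]
    while len(selected) < min(n_select, len(genes)):
        def score(j):
            # Relevance divided by mean redundancy with the genes already chosen.
            redundancy = mean(abs_corr(genes[j], genes[s]) for s in selected)
            return relevance[j] / (redundancy + 1e-12)
        candidates = [j for j in range(len(genes)) if j not in selected]
        selected.append(max(candidates, key=score))
    return selected
```

With two nearly identical informative genes and a third that separates a different class, the sketch keeps only one of the near-duplicates and then picks the third, which is the behaviour the redundancy term is meant to produce.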

Optimized Protocols for Efficient Plant Regeneration and Gene Transfer in Pepper (Capsicum annuum L.)

  • Mihalka, Virag;Fari, Miklos;Szasz, Attila;Balazs, Ervin;Nagy, Istvan
    • Journal of Plant Biotechnology / v.2 no.3 / pp.143-149 / 2000
  • An efficient in vitro regeneration system and an optimized Agrobacterium-mediated transformation protocol are described, based on the use of young seedling cotyledons of Capsicum annuum L. Optimal regeneration efficiency can be obtained by cultivating cotyledon explants on media containing 4 mg/L benzyladenine and 0.1 mg/L indoleacetic acid. The effect of the antibiotics used to eliminate Agrobacteria, as well as the toxic levels of some commonly used selection agents (kanamycin, geneticin, hygromycin, phosphinothricin and methotrexate) in regenerating pepper tissues, were determined. To enable the comparison of different selection markers in an identical vector background, a set of binary vectors containing the marker genes for NPTII, HPT, DHFR and BAR, respectively, as well as the CaMV 35S promoter/enhancer-GUS chimaeric gene, was constructed and introduced into four different Agrobacterium host strains.

Identification of csp Homolog in Bradyrhizobium japonicum

  • No, Jae-Sang;Yu, Ji-Cheol;So, Jae-Seong
    • Proceedings of the Korean Society for Biotechnology and Bioengineering Conference (한국생물공학회:학술대회논문집) / 2001.11a / pp.602-605 / 2001
  • Low-temperature adaptation and protection against environmental stresses were studied in the gram-negative soil bacterium Bradyrhizobium japonicum 61A101c. B. japonicum was more resistant to alcohol, $H_2O_2$, heat and freezing following a pretreatment at $4^{\circ}C$, resulting in an approximately 10- to 1,000-fold increase in survival compared to mid-exponential-phase cells grown at the optimal temperature of $28^{\circ}C$. This phenomenon is related to the cold shock proteins expressed when cells are exposed to a downshift in temperature. To confirm the presence of cold shock protein genes in B. japonicum, a PCR strategy was employed using a degenerate primer set, which successfully amplified a putative csp gene fragment. Sequence analysis of the PCR product (200 bp) revealed csp-like sequences that were up to 96% identical to the csp gene of S. typhimurium.

Optimal Design of Fuzzy Set-based Polynomial Neural Networks Using Symbolic Gene Type and Information Granulation (유전 알고리즘의 기호코딩과 정보입자화를 이용한 퍼지집합 기반 다항식 뉴럴네트워크의 최적 설계)

  • Lee, In-Tae;Oh, Sung-Kwun
    • Proceedings of the KIEE Conference / 2006.10c / pp.217-219 / 2006
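  • This study proposes an optimal design of fuzzy set-based polynomial neural networks (IG based gFSPNN) using information granules and the symbolic coding of a genetic algorithm. Earlier optimal designs of Fuzzy Set-based Polynomial Neural Networks used the binary coding of a genetic algorithm. Binary coding has two inherent drawbacks: computation time increases sharply with string length, and the Hamming cliff makes the required abrupt bit changes difficult. To resolve the problems caused by string length and the Hamming cliff, this paper uses symbolic coding. To reflect the characteristics of the data in the model, Information Granulation (IG) combined with Hard C-Means (HCM) is used, which speeds up construction of the optimal model. The performance of the proposed model is evaluated through experimental examples.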

Prediction of Lung Cancer Based on Serum Biomarkers by Gene Expression Programming Methods

  • Yu, Zhuang;Chen, Xiao-Zheng;Cui, Lian-Hua;Si, Hong-Zong;Lu, Hai-Jiao;Liu, Shi-Hai
    • Asian Pacific Journal of Cancer Prevention / v.15 no.21 / pp.9367-9373 / 2014
  • In the diagnosis of lung cancer, rapid distinction between small cell lung cancer (SCLC) and non-small cell lung cancer (NSCLC) tumors is very important. Serum markers, including lactate dehydrogenase (LDH), C-reactive protein (CRP), carcino-embryonic antigen (CEA), neurone specific enolase (NSE) and Cyfra21-1, are reported to reflect lung cancer characteristics. In this study, lung tumors were classified based on biomarkers (measured in 120 NSCLC and 60 SCLC patients) by setting up optimal biomarker joint models with a computerized tool, gene expression programming (GEP). GEP is a learning algorithm that combines the advantages of genetic programming (GP) and genetic algorithms (GA). It focuses specifically on relationships between variables in sets of data and then builds models to explain these relationships, and it has been used successfully in formula finding and function mining. As a basis for defining a GEP environment for SCLC and NSCLC prediction, three explicit predictive models were constructed. CEA and NSE are frequently used lung cancer markers in clinical trials, and CRP, LDH and Cyfra21-1 are also significant in lung cancer. On the basis of CEA and NSE, we set up three GEP models: GEP1 (CEA, NSE, Cyfra21-1), GEP2 (CEA, NSE, LDH) and GEP3 (CEA, NSE, CRP). The best GEP classification result was obtained when CEA, NSE and Cyfra21-1 were combined: 128 of 135 subjects in the training set and 40 of 45 subjects in the test set were classified correctly, giving accuracy rates of 94.8% in the training set and 88.9% in the test set. With GEP2, the accuracy decreased significantly, by 1.5% and 6.6% in the training and test sets, respectively; with GEP3, it decreased by 0.82% and 4.45%, respectively. Serum Cyfra21-1 is a useful and sensitive serum biomarker for discriminating between NSCLC and SCLC. GEP modeling is a promising tool in the diagnosis of lung cancer.
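
The accuracy rates quoted in this abstract follow directly from the reported counts. The short check below recomputes them; the function name is hypothetical, and only the numbers come from the abstract.

```python
def accuracy_percent(correct, total):
    """Share of correctly classified subjects, as a percentage."""
    return 100.0 * correct / total

train_acc = accuracy_percent(128, 135)  # best model (CEA, NSE, Cyfra21-1), training set
test_acc = accuracy_percent(40, 45)     # same model, test set

# The two sets together cover all 120 NSCLC + 60 SCLC patients.
assert 135 + 45 == 120 + 60

print(f"training: {train_acc:.1f}%  test: {test_acc:.1f}%")
# prints "training: 94.8%  test: 88.9%"
```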

Optimization of a microarray for fission yeast

  • Kim, Dong-Uk;Lee, Minho;Han, Sangjo;Nam, Miyoung;Lee, Sol;Lee, Jaewoong;Woo, Jihye;Kim, Dongsup;Hoe, Kwang-Lae
    • Genomics & Informatics / v.17 no.3 / pp.28.1-28.9 / 2019
  • Bar-code (tag) microarrays of yeast gene-deletion collections facilitate the systematic identification of genes required for growth in any condition of interest. Anti-sense strands of amplified bar-codes hybridize with ~10,000 (5,000 each for up- and down-tags) different kinds of sense-strand probes on an array. In this study, we optimized the hybridization processes of an array for fission yeast. Compared to the first version of the array (11 µm, 100K), consisting of three sectors with probe pairs (perfect match and mismatch), the second version (11 µm, 48K) could represent ~10,000 up-/down-tags in quadruplicate, along with 1,508 negative controls in quadruplicate and a single set of 1,000 unique negative controls at randomly dispersed positions, without mismatch pairs. For PCR, the optimal annealing temperature (maximizing yield and minimizing extra bands) was 58℃ for both tags. Intriguingly, up-tags required 3× higher amounts of blocking oligonucleotides than down-tags. A 1:1 mix ratio between up- and down-tags was satisfactory. A lower temperature (25℃) was optimal for cultivation instead of the normal temperature (30℃) because of extra temperature-sensitive mutants in a subset of the deletion library. Activation of frozen pooled cells for >1 day showed better resolution of intensity than no activation. A tag intensity analysis showed that tag(s) of 4,316 of the 4,526 strains tested were represented at least once; 3,706 strains were represented by both tags, 4,072 strains by up-tags, and 3,950 strains by down-tags. The results indicate that this microarray will be a powerful analytical platform for elucidating currently unknown gene functions.
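
The strain counts in this abstract can be cross-checked with inclusion-exclusion. The sketch below assumes that 4,072 and 3,950 are the per-tag totals (strains detected by the up-tag and by the down-tag, each including the strains detected by both); under that reading the figures reproduce the reported 4,316. Variable names are illustrative.

```python
tested = 4526      # strains tested
up = 4072          # strains detected by the up-tag (assumed to include "both")
down = 3950        # strains detected by the down-tag (assumed to include "both")
both = 3706        # strains detected by both tags

# Inclusion-exclusion: |up OR down| = |up| + |down| - |up AND down|
at_least_one = up + down - both
up_exclusive = up - both      # detected by the up-tag but not the down-tag
down_exclusive = down - both  # detected by the down-tag but not the up-tag

print(at_least_one)                  # prints 4316
print(up_exclusive, down_exclusive)  # prints 366 244
print(f"{100 * at_least_one / tested:.1f}% of tested strains represented")
```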

Reference Gene Screening for Analyzing Gene Expression Across Goat Tissue

  • Zhang, Yu;Zhang, Xiao-Dong;Liu, Xing;Li, Yun-Sheng;Ding, Jian-Ping;Zhang, Xiao-Rong;Zhang, Yun-Hai
    • Asian-Australasian Journal of Animal Sciences / v.26 no.12 / pp.1665-1671 / 2013
  • Real-time quantitative PCR (qRT-PCR) is one of the important methods for investigating changes in mRNA expression levels in cells and tissues. Selection of proper reference genes is very important when calibrating the results of real-time quantitative PCR. Studies on the selection of reference genes in goat tissues are limited, despite the economic importance of goat meat and dairy products. We used real-time quantitative PCR to detect the expression levels of eight reference gene candidates (18S, TBP, HMBS, YWHAZ, ACTB, HPRT1, GAPDH and EEF1A2) in ten tissue types sourced from Boer goats. The optimal reference gene combination was selected according to the results of the geNorm, NormFinder and Bestkeeper software packages. The analyses showed that tissue is an important source of variability in gene expression stability. When all tissues were considered, 18S, TBP and HMBS were the optimal reference combination for calibrating quantitative PCR analysis of gene expression in goat tissues. When the data set was divided by tissue, ACTB was the most stable in stomach, small intestine and ovary, 18S in heart and spleen, HMBS in uterus and lung, TBP in liver, HPRT1 in kidney and GAPDH in muscle. Overall, this study provides valuable information about goat reference genes that can be used for proper normalisation in relative quantification by qRT-PCR.

DNA Sequence analysis and rfbM gene amplification using PCR for detect salmonella C1 serogroup (살모넬라 C1 serogroup 특이 rfbM 유전자 증폭과 염기서열 분석)

  • Lee, Sung-il;Jung, Suk-chan;Moon, Jin-san;Park, Yong-ho;Lee, John-wha;Kim, Byeong-su;Baek, Byeong-kirl
    • Korean Journal of Veterinary Research / v.36 no.1 / pp.109-118 / 1996
  • The Salmonella rfb gene encoding the biosynthesis of the oligosaccharide-repeating units of the O-antigenic determinants was cloned and sequenced. A set of nucleotide primers (a forward and a reverse) was selected to target a defined region of the guanosine diphospho-mannose (GDP-Man) pyrophosphorylase synthase gene, rfbM, of the Salmonella C serogroup. The primer set was used to develop a PCR-based rapid and specific detection system for the Salmonella C1 serogroup. Amplification bands of the predicted size (1,422 bp) were generated from 11 different Salmonella C1 isolates. The bands were verified to be specific for the C1 serogroup by Southern blot analysis using reference homologous DNA. Specificity was further confirmed by the lack of reactivity with heterologous DNA derived from non-Salmonella members of the family Enterobacteriaceae. A specificity of 100% was deduced, along with a very high sensitivity, shown by a detection limit of 1 fg of purified DNA template. The isolated DNA sequence was found to be 99.8% homologous to that of S. montevideo, but the related primers amplified bands of the predicted size from all the Salmonella C1 serogroup isolates tested. It is concluded that the PCR protocol based on the rfbM gene from S. choleraesuis is optimal, fast and specific for the detection of the Salmonella C1 serogroup, and that the corresponding probe is suitable for rapid detection of all Salmonella C1 serogroup DNA tested. This technology should facilitate the identification of contaminated pig products and of any other products contaminated with the Salmonella C1 serogroup. The immediate impact of this method will be in the area of food safety of pig products, with the potential for adaptation to other food inspection technologies.

Optimization of a Multiplex DNA Amplification of Three Short Tandem Repeat Loci for Genetic Identification

  • Ryu, Jae-Song;Noh, Jae-Sang;Koo, Yoon-Mo;Lee, Choul-Gyun;So, Jae-Seong
    • Journal of Microbiology and Biotechnology / v.10 no.6 / pp.873-876 / 2000
  • Short tandem repeat (STR) loci have been used in the field of forensic science. Hundreds of STR systems have been mapped throughout the human genome, and STR loci are found on almost every chromosome. They may be amplified using a variety of PCR primers. In this study, a DNA genotyping system based on the multiplex amplification of highly polymorphic STR loci was developed. Three STR loci with nonoverlapping allele size ranges were used in the multiplex amplification: the neurotensin receptor gene, D21S11, and the human tyrosine hydroxylase gene. The optimal condition for triplex PCR was obtained in a solution with a total volume of $25{\mu}l$ containing 2.0 U of Taq polymerase, 3 mM of $MgCl_2$, $300{\mu}M$ of dNTP and 10 pmol of each primer set, with an annealing temperature of $62^{\circ}C$ and 35 cycles. The optimized condition was successfully employed in a family paternity test.
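
The final concentrations in this abstract translate into pipetting volumes through the dilution relation C1·V1 = C2·V2. The sketch below is only an illustration: the stock concentrations (25 mM MgCl2, 10 mM dNTP, 5 U/µl Taq) are hypothetical and do not come from the abstract, and the function name is made up for the example.

```python
def stock_volume_ul(final_conc, stock_conc, total_volume_ul):
    """Volume of stock to add, from C1*V1 = C2*V2 (both concentrations in the same unit)."""
    return final_conc * total_volume_ul / stock_conc

TOTAL_UL = 25.0  # total reaction volume from the abstract

# Final concentrations are from the abstract; every stock concentration is an assumption.
mgcl2_ul = stock_volume_ul(3.0, 25.0, TOTAL_UL)  # 3 mM final from a hypothetical 25 mM stock
dntp_ul = stock_volume_ul(0.3, 10.0, TOTAL_UL)   # 300 uM = 0.3 mM final from a hypothetical 10 mM stock
taq_ul = 2.0 / 5.0                               # 2.0 U from a hypothetical 5 U/ul stock

print(f"MgCl2: {mgcl2_ul:.2f} ul, dNTP: {dntp_ul:.2f} ul, Taq: {taq_ul:.2f} ul")
```

With these assumed stocks the reaction takes 3.00 µl of MgCl2, 0.75 µl of dNTP and 0.40 µl of Taq; different stocks change the volumes but not the calculation.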

An enhanced feature selection filter for classification of microarray cancer data

  • Mazumder, Dilwar Hussain;Veilumuthu, Ramachandran
    • ETRI Journal / v.41 no.3 / pp.358-370 / 2019
  • The main aim of this study is to select the optimal set of genes from microarray cancer datasets that contribute to the prediction of specific cancer types. This study proposes the enhancement of the feature selection filter algorithm based on Joe's normalized mutual information and its use for gene selection. The proposed algorithm is implemented and evaluated on seven benchmark microarray cancer datasets, namely, central nervous system, leukemia (binary), leukemia (3 class), leukemia (4 class), lymphoma, mixed lineage leukemia, and small round blue cell tumor, using five well-known classifiers, including the naive Bayes, radial basis function network, instance-based classifier, decision-based table, and decision tree. An average increase in the prediction accuracy of 5.1% is observed on all seven datasets averaged over all five classifiers. The average reduction in training time is 2.86 seconds. The performance of the proposed method is also compared with those of three other popular mutual information-based feature selection filters, namely, information gain, gain ratio, and symmetric uncertainty. The results are impressive when all five classifiers are used on all the datasets.
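
The three baseline filters named in this abstract (information gain, gain ratio and symmetric uncertainty) can all be computed from entropies. The sketch below uses their standard textbook definitions on an already discretized feature; it does not implement the proposed filter based on Joe's normalized mutual information, and the function names and the discretization assumption are illustrative.

```python
from collections import Counter
from math import log2

def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * log2(c / n) for c in Counter(values).values())

def mi_filters(feature, labels):
    """Return (information gain, gain ratio, symmetric uncertainty) for one feature."""
    h_x, h_y = entropy(feature), entropy(labels)
    h_xy = entropy(list(zip(feature, labels)))  # joint entropy H(X, Y)
    info_gain = h_x + h_y - h_xy                # mutual information I(X; Y)
    gain_ratio = info_gain / h_x if h_x > 0 else 0.0
    sym_unc = 2 * info_gain / (h_x + h_y) if h_x + h_y > 0 else 0.0
    return info_gain, gain_ratio, sym_unc

# A feature that copies the class label is maximally informative;
# one that is independent of it carries no information.
labels = [0, 0, 1, 1]
print(mi_filters([0, 0, 1, 1], labels))
print(mi_filters([0, 1, 0, 1], labels))
```

Ranking genes by any of the three scores and keeping the top few gives a simple filter; the scores differ only in how the mutual information is normalized.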