• Title/Summary/Keyword: Sequence validation

Feature Selection with Ensemble Learning for Prostate Cancer Prediction from Gene Expression

  • Abass, Yusuf Aleshinloye;Adeshina, Steve A.
    • International Journal of Computer Science & Network Security / v.21 no.12spc / pp.526-538 / 2021
  • Machine and deep learning-based models are emerging techniques that are being used to address prediction problems in biomedical data analysis. DNA sequence prediction is a critical problem that has attracted a great deal of attention in the biomedical domain. Machine and deep learning-based models have been shown to provide more accurate results than conventional regression-based models. The prediction of the gene sequence that leads to cancerous diseases, such as prostate cancer, is crucial. Identifying the most important features in a gene sequence is a challenging task. Extracting the components of the gene sequence that can provide insight into the types of mutation in the gene is of great importance, as it will lead to effective drug design and promote the new concept of personalised medicine. In this work, we extracted the exons in the prostate gene sequences that were used in the experiment. We built a Deep Neural Network (DNN) and a Bidirectional Long Short-Term Memory (Bi-LSTM) model using a k-mer encoding for the DNA sequence and one-hot encoding for the class label. The models were evaluated using different classification metrics. Our experimental results show that the DNN model achieves a training accuracy of 99 percent and a validation accuracy of 96 percent. The Bi-LSTM model achieves a training accuracy of 95 percent and a validation accuracy of 91 percent.
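
The k-mer and one-hot encodings mentioned in the abstract above can be illustrated with a minimal sketch. The abstract does not give the value of k, the vocabulary, or the class names, so `k=3`, the count-vector representation, and the labels `"normal"`/`"tumor"` are assumptions chosen only for illustration; this is not the authors' pipeline.

```python
from itertools import product


def kmers(seq, k=3):
    """Split a DNA sequence into overlapping k-mers (sliding window, step 1)."""
    seq = seq.upper()
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def kmer_counts(seq, k=3, alphabet="ACGT"):
    """Encode a sequence as a fixed-length vector of k-mer counts.

    The vocabulary is every possible k-mer over the alphabet (4**k entries).
    K-mers containing other symbols (e.g. N) are skipped; that is an
    assumption of this sketch, not something the abstract specifies.
    """
    vocab = ["".join(p) for p in product(alphabet, repeat=k)]
    index = {kmer: i for i, kmer in enumerate(vocab)}
    counts = [0] * len(vocab)
    for kmer in kmers(seq, k):
        if kmer in index:
            counts[index[kmer]] += 1
    return counts


def one_hot(label, classes):
    """One-hot encode a class label against an ordered list of classes."""
    return [1 if c == label else 0 for c in classes]


# Hypothetical usage: a short sequence and made-up class names.
features = kmer_counts("ATGCA", k=3)
target = one_hot("tumor", ["normal", "tumor"])
```

A count vector is only one way to turn k-mers into model input; a sequence model such as a Bi-LSTM would more plausibly consume the ordered k-mer tokens from `kmers` directly.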

Sequence Validation for the Identification of the White-Rot Fungi Bjerkandera in Public Sequence Databases

  • Jung, Paul Eunil;Fong, Jonathan J.;Park, Myung Soo;Oh, Seung-Yoon;Kim, Changmu;Lim, Young Woon
    • Journal of Microbiology and Biotechnology / v.24 no.10 / pp.1301-1307 / 2014
  • White-rot fungi of the genus Bjerkandera are cosmopolitan and have shown potential for industrial application and bioremediation. When distinguishing morphological characters are no longer present (e.g., cultures or dried specimen fragments), characterizing true sequences of Bjerkandera is crucial for accurate identification and application of the species. To build a framework for molecular identification of Bjerkandera, we carefully identified specimens of B. adusta and B. fumosa from Korea based on morphological characters, followed by sequencing the internal transcribed spacer region and 28S nuclear ribosomal large subunit. The phylogenetic analysis of Korean Bjerkandera specimens showed clear genetic differentiation between the two species. Using this phylogeny as a framework, we examined the identification accuracy of sequences available in GenBank. Analyses revealed that many Bjerkandera sequences in the database are either misidentified or unidentified. This study provides robust reference sequences for sequence-based identification of Bjerkandera, and further demonstrates the presence and dangers of incorrect sequences in GenBank.

Sequence Group Validation based on Boundary Locking for Valid XML Documents (유효한 XML 문서에 대한 경계 로킹에 기반한 시퀀스 그룹 검증 기법)

  • Choi, Yoon-Sang;Park, Seog
    • Journal of KIISE:Databases / v.32 no.6 / pp.628-640 / 2005
  • XML is well accepted in several different Web application areas. As soon as many users and applications work concurrently on the same collection of XML documents, isolating the accesses and modifications of different transactions becomes an important issue. When an XML document conforms to the rules laid out in a DTD or XML schema, it is said to be valid. The validity of a valid XML document should still be guaranteed after the document is updated. Such validation, however, results in a lower degree of concurrency. To achieve a higher degree of concurrency and minimize the range of XML document validation, a new validation method based on a specific locking method is required. In this paper, we propose the sequence group validation method for minimizing the range of XML document validation. We also propose the boundary locking method for isolating the accesses and modifications of different transactions while preserving the validity of valid XML documents. Finally, experimental results show that the validation and locking methods increase the degree of transaction concurrency.

Design and Validation of MAC Protocol for B-WLL System (B-WLL 시스템 MAC 프로토콜의 설계 및 검증)

  • Back, Seung-Kwon;Kim, Eung-Bae;Han, Ki-Jun
    • Journal of KIISE:Computing Practices and Letters / v.8 no.4 / pp.468-478 / 2002
  • In this paper, we designed a B-WLL MAC (Media Access Control) protocol and validated its operation for the implementation of high-speed subscriber networks. Our MAC protocol was designed in SDL using the DAVIC specifications, based on the variable contention/reserved time slot allocation algorithm. To validate the protocol, syntax and semantic error checks were performed with the Simulation Builder of ObjectGeode and the MSC (Message Sequence Chart), respectively. The validation results showed that our B-WLL MAC protocol works correctly and can successfully support B-WLL services.

A retroviral insertion in the tyrosinase (TYR) gene is associated with the recessive white plumage color in the Yeonsan Ogye chicken

  • Cho, Eunjin;Kim, Minjun;Manjula, Prabuddha;Cho, Sung Hyun;Seo, Dongwon;Lee, Seung-Sook;Lee, Jun Heon
    • Journal of Animal Science and Technology / v.63 no.4 / pp.751-758 / 2021
  • The recessive white (locus c) phenotype observed in chickens is associated with three alleles (recessive white c, albino ca, and red-eyed white cre) and causative mutations in the tyrosinase (TYR) gene. The recessive white mutation (c) inhibits the transcription of TYR exon 5 due to a retroviral sequence insertion in intron 4. In this study, we genotyped and sequenced the insertion in TYR intron 4 to identify the mutation causing the unusual white plumage of Yeonsan Ogye chickens, which normally have black plumage. The white chickens had a homozygous recessive white genotype that matched the sequence of the recessive white type, and the inserted sequence exhibited 98% identity with the avian leukosis virus ev-1 sequence. In comparison, brindle and normal chickens had the homozygous color genotype, and their sequences were the same as the wild-type sequence, indicating that this phenotype is derived from other mutation(s). In conclusion, white chickens have a recessive white mutation allele. Because the sample size in this study was limited, further research with additional samples is needed to validate these results. After such validation, and if additional studies elucidate the specific phenotype-related genes in Yeonsan Ogye, a selection system for conserving the phenotypic characteristics and genetic diversity of the population could be established.

Setting an Initial Validation Gate based on Signal Intensity for Target Tracking in IR Image Sequences (적외선 영상에서 표적 추적을 위한 신호세기 기반 초기 유효게이트 설정 방법)

  • Yang, Yu Kyung;Kim, Jieun;Lee, Boohwan
    • Journal of the Korea Institute of Military Science and Technology / v.17 no.1 / pp.108-114 / 2014
  • This paper describes a method for setting an intensity-based initial validation gate for a tracking filter while preserving the ability to track a target moving at maximum speed. First, we collected a real data set of signal versus distance for an airplane target. At each data point, we computed the maximum distance the target can move. A function is then modeled to predict the maximum number of pixels the target can move in the lateral direction, based on the intensity of the detected target in the IR image sequence. The initial prediction error covariance can be computed using this function to decide the size of the initial validation gate. The simulation results show that the proposed method can set appropriate initial validation gates for tracking targets moving at maximum speed.
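
The link between a maximum displacement, an initial covariance, and a gate size can be sketched as follows. The abstract does not give the intensity-to-displacement function, so `max_shift_px` is simply taken as an input here; the 3-sigma mapping, the diagonal covariance, and the threshold `gamma` are assumptions of this sketch, and the ellipsoidal gate test is the standard textbook form rather than necessarily the authors' formulation.

```python
def initial_covariance(max_shift_px, n_sigma=3.0):
    """Build a 2x2 diagonal covariance from a maximum lateral shift in pixels.

    Assumption: the maximum shift is treated as an n_sigma bound, so
    sigma = max_shift_px / n_sigma on each image axis.
    """
    sigma = max_shift_px / n_sigma
    var = sigma * sigma
    return [[var, 0.0], [0.0, var]]


def in_gate(z, z_pred, cov, gamma=9.0):
    """Ellipsoidal gate test: accept z if its squared Mahalanobis distance
    from the predicted position z_pred is at most gamma.

    With gamma = n_sigma**2 the gate reaches exactly max_shift_px along
    each axis for the covariance built above.
    """
    dx = z[0] - z_pred[0]
    dy = z[1] - z_pred[1]
    (a, b), (c, d) = cov
    det = a * d - b * c
    # v^T * inverse(cov) * v, using the closed-form 2x2 inverse
    dist2 = (d * dx * dx - (b + c) * dx * dy + a * dy * dy) / det
    return dist2 <= gamma


# Hypothetical usage: a target that can move at most 6 pixels per frame.
cov0 = initial_covariance(6.0)
accepted = in_gate((106.0, 50.0), (100.0, 50.0), cov0)  # 6 px away
rejected = in_gate((107.0, 50.0), (100.0, 50.0), cov0)  # 7 px away
```

A brighter (closer) target would presumably map to a larger `max_shift_px` and hence a larger gate, which is the dependence the abstract describes.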

Sequence driven features for prediction of subcellular localization of proteins

  • Kim, Jong-Kyoung;Bang, Sung-Yang;Choi, Seung-Jin
    • Proceedings of the Korean Society for Bioinformatics Conference / 2005.09a / pp.237-242 / 2005
  • Predicting the cellular location of an unknown protein gives valuable information for inferring the possible function of the protein. For a more accurate prediction system, we need a good feature extraction method that transforms the raw sequence data into a numerical feature vector while minimizing information loss. In this paper, we propose new methods for extracting underlying features from the sequence data alone by computing pairwise sequence alignment scores. In addition, we use composition-based features to improve prediction accuracy. To construct an SVM ensemble from separately trained SVM classifiers, we propose specificity-based weighted majority voting. The overall prediction accuracy evaluated by 5-fold cross-validation reached 88.53% for the eukaryotic animal data set. By comparing the prediction accuracy of various feature extraction methods, we gained biological insight into the location of targeting information. Our numerical experiments confirm that our new feature extraction methods are very useful for predicting the subcellular localization of proteins.
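
One plausible reading of specificity-based weighted majority voting is sketched below: each classifier's vote for a class is weighted by that classifier's specificity for the class. The abstract does not define the weighting, so this scheme, the function name `weighted_vote`, and the class names and specificity values are all assumptions for illustration only.

```python
from collections import defaultdict


def weighted_vote(predictions, specificities):
    """Combine class predictions from several classifiers.

    predictions:   one predicted class label per classifier
    specificities: one dict per classifier mapping class label -> weight
                   (assumed here to be per-class specificity measured on
                   held-out data)
    Returns the label with the largest summed weight; ties go to the
    label that received a vote first.
    """
    scores = defaultdict(float)
    for label, spec in zip(predictions, specificities):
        scores[label] += spec.get(label, 0.0)
    return max(scores, key=scores.get)


# Hypothetical usage: two weak votes lose to one high-specificity vote.
votes = ["nuclear", "cytoplasmic", "cytoplasmic"]
weights = [
    {"nuclear": 0.95, "cytoplasmic": 0.60},
    {"nuclear": 0.70, "cytoplasmic": 0.40},
    {"nuclear": 0.70, "cytoplasmic": 0.50},
]
winner = weighted_vote(votes, weights)
```

Unlike plain majority voting, the outcome here depends on how trustworthy each classifier is for the class it predicts, not just on the number of votes.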

Text-driven Speech Animation with Emotion Control

  • Chae, Wonseok;Kim, Yejin
    • KSII Transactions on Internet and Information Systems (TIIS) / v.14 no.8 / pp.3473-3487 / 2020
  • In this paper, we present a new approach to creating speech animation with emotional expressions using a small set of example models. To generate realistic facial animation, two types of example models, called key visemes and key expressions, are used for lip-synchronization and facial expressions, respectively. The key visemes represent the lip shapes of phonemes such as vowels and consonants, while the key expressions represent the basic emotions of a face. Our approach utilizes a text-to-speech (TTS) system to create a phonetic transcript for the speech animation. Based on the phonetic transcript, a sequence of speech animation is synthesized by interpolating the corresponding sequence of key visemes. Using an input parameter vector, the key expressions are blended by a method of scattered data interpolation. During the synthesizing process, an importance-based scheme is introduced to combine both lip-synchronization and facial expressions into one animation sequence in real time (over 120 Hz). The proposed approach can be applied to diverse types of digital content and applications that use facial animation with high accuracy (over 90%) in speech recognition.

Prediction of subcellular localization of proteins using pairwise sequence alignment and support vector machine

  • Kim, Jong-Kyoung;Raghava, G. P. S.;Kim, Kwang-S.;Bang, Sung-Yang;Choi, Seung-Jin
    • Proceedings of the Korean Society for Bioinformatics Conference / 2004.11a / pp.158-166 / 2004
  • Predicting the destination of a protein in a cell gives valuable information for annotating the function of the protein. Recent technological breakthroughs have led us to develop more accurate methods for predicting the subcellular localization of proteins. The most important factor in determining the accuracy of these methods is the way useful features are extracted from protein sequences. We propose a new method for extracting appropriate features from the sequence data alone by computing pairwise sequence alignment scores. A support vector machine (SVM) is used as the classifier. The overall prediction accuracy evaluated by the jackknife validation technique reaches 94.70% for the eukaryotic non-plant data set and 92.10% for the eukaryotic plant data set, the highest prediction accuracy among methods reported so far on these data sets. Our numerical experimental results confirm that our feature extraction method based on pairwise sequence alignment is useful for this classification problem.

Selection of Signal Strength and Detection Threshold for Optimal Tracking with Nearest Neighbor Filter (NN 필터 추적을 위한 최적 신호 강도 및 검출 문턱값 선택)

  • Jeong, Yeong-Heon;Gwon, Il-Hwan;Hong, Sun-Mok
    • Journal of the Institute of Electronics Engineers of Korea SC / v.37 no.3 / pp.1-8 / 2000
  • In this paper, we formulate an optimal control problem to obtain the optimal signal strength and detection threshold for tracking with an NN filter. First, we predict the tracking performance of the NN filter by using the HYCA method. Based on this method, the predicted tracking performance is expressed as a function of signal strength and detection threshold. Using this relation, we find the optimal parameters for the following three examples: 1) the sequence of optimal detection thresholds that minimizes the sum of position estimation errors; 2) the sequence of optimal detection thresholds that minimizes the sum of validation gate volumes; and 3) the sequence of optimal signal strengths and detection thresholds that minimizes the sum of signal strengths.