• Title/Summary/Keyword: sequence data

Development of a Data Structure for Effective Monitoring of Power Plant Start-up Sequences (화력 발전소의 기동 시퀀스 진행 모니터링을 위한 자료구조 개발)

  • Lee, Seung-Chul;Han, Seung-Woo;Kim, Seung-Jin
    • Journal of the Korean Institute of Illuminating and Electrical Installation Engineers
    • /
    • v.23 no.12
    • /
    • pp.224-232
    • /
    • 2009
  • Power plant start-up is a complicated process involving hundreds of operations that must be performed either automatically or manually. Several major operations proceed in parallel, and each major operation is in turn broken down into detailed operations that must be carried out in a strict sequence. Even though most of the operations are automated, a substantial portion is still carried out manually, and the crew members must monitor the operational status in real time, which is a stressful task. In this paper, a data structure called an Event Sequence Monitoring Graph (ESMG) is proposed for monitoring the sequence of events involved in the power plant start-up process. The ESMG is currently being applied to a thermal power plant with a rated output of 500 MW. An application example based on the start-up process of the boiler feed water pump system is presented, and it shows good potential for future applications.

An Anomalous Sequence Detection Method Based on An Extended LSTM Autoencoder (확장된 LSTM 오토인코더 기반 이상 시퀀스 탐지 기법)

  • Lee, Jooyeon;Lee, Ki Yong
    • The Journal of Society for e-Business Studies
    • /
    • v.26 no.1
    • /
    • pp.127-140
    • /
    • 2021
  • Recently, sequence data containing time information, such as sensor measurement data and purchase histories, has been generated in various applications. Many methods have been proposed for finding sequences that differ significantly from the other sequences in a given set. However, most of them are limited in that they consider only the order of elements in the sequences. Therefore, in this paper, we propose a new anomalous sequence detection method that considers both the order of elements and the time interval between elements. The proposed method uses an extended LSTM autoencoder model, which has an additional layer that converts a sequence into a form that helps the model learn both the order of elements and the time interval between elements effectively. The proposed method learns the features of the given sequences with the extended LSTM autoencoder model, and then detects the sequences that the model does not reconstruct well as anomalous sequences. Using experiments on synthetic data that contains both normal and anomalous sequences, we show that the proposed method achieves an accuracy close to 100% when compared with a method that uses only the traditional LSTM autoencoder.

A Reranking Model for Korean Morphological Analysis Based on Sequence-to-Sequence Model (Sequence-to-Sequence 모델 기반으로 한 한국어 형태소 분석의 재순위화 모델)

  • Choi, Yong-Seok;Lee, Kong Joo
    • KIPS Transactions on Software and Data Engineering
    • /
    • v.7 no.4
    • /
    • pp.121-128
    • /
    • 2018
  • A Korean morphological analyzer can adopt a sequence-to-sequence (seq2seq) model, which can generate an output sequence whose length differs from that of the input. In general, a seq2seq-based Korean morphological analyzer takes a syllable-unit sequence as input and outputs a syllable-unit sequence. Syllable-based morphological analysis has the advantage that unknown words can be handled easily, but the disadvantage that morpheme-based information is ignored. In this paper, we propose a reranking model as a post-processor for the seq2seq model that can improve the accuracy of morphological analysis. The seq2seq-based morphological analyzer can generate K results by using a beam-search method. The reranking model exploits morpheme-unit embedding information as well as n-grams of morphemes in order to reorder the K results. The experimental results show that the reranking model improves the F1 score by 1.17% compared with the original seq2seq model.

Whole-genome sequence analysis through online web interfaces: a review

  • Gunasekara, A.W.A.C.W.R.;Rajapaksha, L.G.T.G.;Tung, T.L.
    • Genomics & Informatics
    • /
    • v.20 no.1
    • /
    • pp.3.1-3.10
    • /
    • 2022
  • The recent development of whole-genome sequencing technologies has paved the way for understanding the genomes of microorganisms. Every whole-genome sequencing (WGS) project requires considerable cost and massive effort to address the questions at hand. The final step of WGS is data analysis. The analysis of whole-genome sequences depends on highly sophisticated bioinformatics tools that research personnel have to buy. However, many laboratories and research institutions do not have the bioinformatics capabilities to analyze the genomic data and are therefore unable to take maximum advantage of whole-genome sequencing. In this respect, this study provides research personnel with a guide to a set of bioinformatics tools available online that can be used to analyze whole-genome sequence data of bacterial genomes. The web interfaces described here have many advantages and, in most cases, eliminate the need for costly analysis tools and intensive computing resources.

An Investigation on Expanding Traditional Sequential Analysis Method by Considering the Reversion of Purchase Realization Order (구매의도 생성 순서와 구매실현 순서의 역전 현상을 감안한 확장된 순차분석 방법론)

  • Kim, Minseok;Kim, Namgyu
    • The Journal of Information Systems
    • /
    • v.22 no.3
    • /
    • pp.25-42
    • /
    • 2013
  • Recently, various kinds of information technology services have been created, and the volume of data flow is increasing rapidly. In addition, the data patterns we deal with are gradually becoming more diverse. As a result, the demand for discovering meaningful knowledge and information through various mining analyses, such as linkage analysis, sequencing analysis, classification, and prediction, has been steadily increasing. However, solving business problems with data mining analysis does not always succeed; one of the major causes of this limitation is that some analyzed data cannot accurately reflect real-world phenomena. For example, even when the time gap between the purchases of two products is very short, traditional sequencing analysis still assigns a clear precedence relationship to the two products. In the real world, however, the precedence relationship of two purchases made within a very short time interval might not be well defined. Worse, the order in which the purchase intentions arose and the order in which the purchases were realized might be reversed. Therefore, this study proposes an expanded sequencing analysis methodology to reflect this situation. The proposed methodology identifies purchases made within a very short time interval, for which the purchase order might not be important, and extends the analysis by including both the original sequence and the reversed sequence. In addition, because what counts as a very short time interval must be defined, an experiment was carried out on actual data to determine how the results vary with the time interval.
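
The idea in the abstract above can be sketched roughly as follows. This is a minimal illustration, not the authors' method: the function name `precedence_pairs`, the `(item, time)` input format, and the `threshold` parameter are all assumptions introduced here. Purchases are ordered by time; any two purchases closer together than the threshold contribute both the original and the reversed precedence pair.

```python
from itertools import combinations


def precedence_pairs(purchases, threshold):
    """Return the set of (earlier_item, later_item) pairs for one customer.

    purchases: list of (item, time) tuples (hypothetical input format).
    threshold: gap at or below which the purchase order is treated as
               unreliable, so the reversed pair is added as well.
    """
    events = sorted(purchases, key=lambda p: p[1])  # order by purchase time
    pairs = set()
    for (a, ta), (b, tb) in combinations(events, 2):
        if a == b:
            continue  # ignore repeated purchases of the same item
        pairs.add((a, b))  # original order, as in traditional analysis
        if tb - ta <= threshold:
            pairs.add((b, a))  # reversed order for near-simultaneous purchases
    return pairs


example = [("A", 0), ("B", 1), ("C", 100)]
print(sorted(precedence_pairs(example, threshold=5)))
```

Varying `threshold` and recounting the pairs is one simple way to see how sensitive the resulting sequence patterns are to the chosen time interval.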

DSSS MODEM Design and Implementation for a Medium Speed Wireless Link (대중저속 무선 통신을 위한 DSSS 모뎀 설계 및 구현)

  • Won Hee-Seok;Kim Young-Sik
    • Journal of the Institute of Electronics Engineers of Korea TC
    • /
    • v.43 no.1 s.343
    • /
    • pp.121-126
    • /
    • 2006
  • This paper reports on the design and implementation of a 9.6 kbps DSSS CDMA modem for a medium-speed wireless link. The proposed modem provides a general-purpose I/O interface to a microprocessor. The I/O interface consists of an 8-bit data bus and chip enable, read/write, and interrupt pins. In the transmit block, the 8-bit data delivered from the I/O interface buffer is converted to 9.6 kbps serial data, which is spread to 76.8 kcps by the direct sequence method with an 8-bit PN code generated inside the modem. An 8-bit training sequence precedes the data in the frame for data synchronization in the receiver. In the receiver block, the PN code is synchronized from the received data spread to 76.8 kcps, and the data timing is found from the 8-bit training sequence. The early-late integration method is used. The modem has been implemented and verified on a Xilinx FPGA board and fabricated as an ASIC chip in a Hynix $0.25{\mu}m$ CMOS process. The multiple access method is DSSS CDMA.
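
The rate relationship in the abstract above (9.6 kbps data, 8 chips per bit, 76.8 kcps) can be illustrated with a small direct-sequence sketch. The PN code value, the XOR-based spreading, and the majority-vote despreader are assumptions made for illustration; the abstract does not specify them, and chip timing is assumed to be already synchronized.

```python
PN_CODE = [1, 0, 1, 1, 0, 0, 1, 0]  # hypothetical 8-chip PN code
DATA_RATE_BPS = 9600
CHIP_RATE_CPS = DATA_RATE_BPS * len(PN_CODE)  # 8 chips per data bit


def spread(bits, pn=PN_CODE):
    """Spread each data bit into len(pn) chips by XOR with the PN code."""
    return [b ^ c for b in bits for c in pn]


def despread(chips, pn=PN_CODE):
    """Recover data bits by comparing each chip block with the PN code."""
    n = len(pn)
    bits = []
    for i in range(0, len(chips), n):
        block = chips[i:i + n]
        # A block equal to the PN code means bit 0; an inverted block means bit 1.
        mismatches = sum(x ^ c for x, c in zip(block, pn))
        bits.append(1 if mismatches > n // 2 else 0)
    return bits


data = [1, 0, 1, 1, 0, 0, 0, 1]  # one 8-bit word from the I/O interface
chips = spread(data)
print(CHIP_RATE_CPS, len(chips), despread(chips) == data)
```

Because each bit is decided by a majority over 8 chips, a single corrupted chip within a block does not change the recovered bit in this sketch.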

Studies on the Organization and Expression of tRNA Genes in Aspergillus nidulans (V) The Molecular Structure of $tRNA^{Arg}$ in Aspergillus nidulans (Aspergillus nidulans의 tRNA유전자의 구조와 발현에 관한 연구 V Aspergillus nidulans의 $tRNA^{Arg}$ 분자구조)

  • 이병재;강현삼
    • Korean Journal of Microbiology
    • /
    • v.24 no.2
    • /
    • pp.79-85
    • /
    • 1986
  • We have partially determined the sequence of $tRNA^{Arg}$ of A. nidulans by an enzymatic rapid RNA sequencing technique. The sequence was 5'GGCCGGCUGGCCCAAXUGGCAAGGXUCUGAXUACGAAXCAGGAGAUUGCACXXXXXGAGCXXUXXGUCGGUCACCA3'. The cloverleaf structure was constructed from these data. As a result, the anticodon sequence was identified as ACG. This result was confirmed with a charging test. The complete sequence was proposed by supplementing this RNA sequence with the DNA sequence and by assigning the positions of minor bases.

Exploratory Approach for Fibonacci Numbers and Benford's Law (피보나치수와 벤포드법칙에 대한 탐색적 접근)

  • Jang, Dae-Heung
    • The Korean Journal of Applied Statistics
    • /
    • v.22 no.5
    • /
    • pp.1103-1113
    • /
    • 2009
  • It is known that the sequence of first digits of the Fibonacci numbers obeys Benford's law. For any sequence whose first two terms are arbitrary integers and which satisfies the recurrence relation $a_{n+2}=a_{n+1}+a_n$, we find that its sequence of first digits also obeys Benford's law. We also explore the structure of this first-digit sequence with exploratory data analysis tools.
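
The claim in the abstract above is easy to check numerically. The sketch below compares the observed first-digit frequencies of a sequence satisfying $a_{n+2}=a_{n+1}+a_n$ with the Benford probabilities $\log_{10}(1+1/d)$. The function names and the choice of 1000 terms are assumptions made for this illustration, not taken from the paper.

```python
import math
from collections import Counter


def first_digit_counts(a0, a1, n):
    """Count leading digits of the first n terms of a_{k+2} = a_{k+1} + a_k."""
    counts = Counter()
    a, b = a0, a1
    for _ in range(n):
        if a != 0:  # zero has no leading digit in Benford's sense
            counts[int(str(abs(a))[0])] += 1
        a, b = b, a + b
    return counts


def benford(d):
    """Benford's law probability that the leading digit equals d."""
    return math.log10(1 + 1 / d)


n = 1000
counts = first_digit_counts(1, 1, n)  # Fibonacci numbers
for d in range(1, 10):
    print(d, counts[d] / n, round(benford(d), 4))
```

Passing other starting values, for example `first_digit_counts(2, 1, n)` for the Lucas numbers, gives a quick view of the generalized case mentioned in the abstract.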

Molecular Cloning and Sequencing of Cell Wall Hydrolase Gene of an Alkalophilic Bacillus subtilis BL-29

  • Kim, Tae-Ho;Hong, Soon-Duck
    • Journal of Microbiology and Biotechnology
    • /
    • v.7 no.4
    • /
    • pp.223-228
    • /
    • 1997
  • A DNA fragment containing the gene for the cell wall hydrolase of alkalophilic Bacillus subtilis BL-29 was cloned into E. coli JM109 using pUC18 as a vector. Southern hybridization analysis showed that a recombinant plasmid, designated pCWL45B, contained a fragment originating from the chromosomal DNA of alkalophilic B. subtilis BL-29. The nucleotide sequence of a 1.6-kb HindIII fragment containing a cell wall hydrolase-encoding gene was determined. The nucleotide sequence revealed an open reading frame (ORF) of 900 bp with a consensus ribosome-binding site located 6 nucleotides upstream from the ATG start codon. The primary amino acid sequence deduced from the nucleotide sequence revealed a putative protein of 299 amino acid residues with a molecular weight of 33,206. A comparison of the amino acid sequence of the ORF with amino acid sequences in the GenBank database showed significant homology to the cell wall amidase of the PBSX bacteriophage of B. subtilis.
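
The figures in the abstract above are mutually consistent if the 900 bp ORF is assumed to include the stop codon, which the abstract does not state explicitly. A short worked check, with a hypothetical helper name:

```python
def protein_length_from_orf(orf_bp, includes_stop=True):
    """Number of amino acid residues encoded by an ORF of orf_bp base pairs."""
    if orf_bp % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    codons = orf_bp // 3
    # The stop codon is read but does not add a residue.
    return codons - 1 if includes_stop else codons


residues = protein_length_from_orf(900)
mean_residue_mass = 33206 / residues  # reported molecular weight per residue
print(residues, round(mean_residue_mass, 1))
```

Under that assumption, 900 bp gives 300 codons, of which 299 encode residues, matching the reported protein length.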

Could Decimal-binary Vector be a Representative of DNA Sequence for Classification?

  • Sanjaya, Prima;Kang, Dae-Ki
    • International journal of advanced smart convergence
    • /
    • v.5 no.3
    • /
    • pp.8-15
    • /
    • 2016
  • In recent years, the Deep Belief Network (DBN), a deep learning model formed by stacking restricted Boltzmann machines in a greedy fashion, has been widely used for classification and recognition. With its ability to extract features at a high level of abstraction and to deal with high-dimensional data structures, this model has achieved outstanding results in image and speech recognition. In this research, we assess the applicability of deep learning to DNA classification. Since the training phase of a DBN is costly, especially when it deals with DNA sequences with thousands of variables, we introduce a new encoding method that uses a decimal-binary vector to represent the sequence as input to the model, and compare it with one-hot vector encoding on two datasets. We evaluated our proposed model with different contrastive algorithms and achieved a significant improvement in training speed with comparable classification results. This result shows the potential of using decimal-binary vectors with a DBN for DNA sequences to solve other sequence problems in bioinformatics.