• Title/Summary/Keyword: sequence-to-sequence


Correlation Analysis between Regulatory Sequence Motifs and Expression Profiles by Kernel CCA

  • Rhee, Je-Keun;Joung, Je-Gun;Chang, Jeong-Ho;Zhang, Byoung-Tak
    • Proceedings of the Korean Society for Bioinformatics Conference
    • /
    • 2005.09a
    • /
    • pp.63-68
    • /
    • 2005
  • Transcription factors regulate gene expression by binding to the upstream regions of genes. Each transcription factor has a specific binding site in the promoter region. Analysis of gene upstream sequences is therefore necessary for understanding the regulatory mechanisms of genes, under the plausible assumption that DNA sequence motif profiles are closely related to the expression behaviors of the corresponding genes. Here, we present an effective approach to analyzing the relation between gene expression profiles and gene upstream sequences on the basis of kernel canonical correlation analysis (kernel CCA). Kernel CCA is a useful method for finding relationships underlying two different data sets. In an application to a yeast cell cycle data set, the gene upstream sequence profile is shown to be closely related to gene expression patterns in terms of canonical correlation scores. By further analyzing the contributing values, or weights, of sequence motifs in the construction of a pair of sequence motif profiles and expression profiles, we show that the proposed method can identify significant DNA sequence motifs involved in specific gene expression patterns during the yeast cell cycle, including both well-known and putative motifs.


T2 Relaxographic Mapping using 8-echo CPMG MRI Pulse Sequence

  • E-K. Jeong;Lee, S-H.;J-S. Suh;Y-Y wak;S-A. Shin;Y-K. Kwon;Y. Huh
    • Journal of the Korean Magnetic Resonance Society
    • /
    • v.1 no.1
    • /
    • pp.7-20
    • /
    • 1997
  • Pixel-by-pixel mapping of the spin-spin relaxation time T2 has been suggested as a quantitative diagnostic tool in medicine. Although the CPMG pulse sequence is known in NMR physics to be the best pulse sequence for T2 measurement, the pulse sequence supplied by the manufacturer of the MRI system could obtain a maximum of 4 CPMG images. Eight or more images with different echo times (TEs) are required to construct a reliable T2 map, so two or more acquisitions were required, which easily took more than 10 minutes. The 4-echo CPMG imaging pulse sequence was modified to generate a maximum of 8 MR images with evenly spaced TEs. In human MR imaging, since patients tend to move by at least several pixels between acquisitions, the 8-echo CPMG imaging sequence reduces the acquisition time and may remove misregistration of each pixel's signal in the T2 fit. The resultant T2 maps, obtained from theoretically simulated images and from MR images of the human brain, suggest that an 8-echo CPMG sequence with a short echo spacing such as 17∼20 msec can give a reliable T2 map.

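
The abstract above does not state how T2 is fitted for each pixel. As a minimal sketch, assuming the standard mono-exponential decay model S(TE) = S0 · exp(−TE / T2), a single pixel's T2 can be estimated by a least-squares line fit of ln S against TE. The function name `fit_t2`, the 20 ms echo spacing, and the synthetic signal values are illustrative assumptions, not taken from the paper.

```python
import math


def fit_t2(echo_times_ms, signals):
    """Estimate (S0, T2) for one pixel.

    Assumes S(TE) = S0 * exp(-TE / T2) and fits a straight line to
    ln(S) versus TE by ordinary least squares. Signals must be positive.
    """
    n = len(echo_times_ms)
    log_signals = [math.log(s) for s in signals]
    mean_te = sum(echo_times_ms) / n
    mean_log = sum(log_signals) / n
    sxx = sum((te - mean_te) ** 2 for te in echo_times_ms)
    sxy = sum((te - mean_te) * (y - mean_log)
              for te, y in zip(echo_times_ms, log_signals))
    slope = sxy / sxx                      # slope equals -1 / T2
    intercept = mean_log - slope * mean_te  # intercept equals ln(S0)
    return math.exp(intercept), -1.0 / slope


# Synthetic, noise-free pixel: 8 echoes spaced 20 ms apart, T2 = 80 ms
# (illustrative values only).
echo_times = [20.0 * k for k in range(1, 9)]
signals = [1000.0 * math.exp(-te / 80.0) for te in echo_times]
s0, t2 = fit_t2(echo_times, signals)
```

With eight echoes from a single acquisition, all eight points of the fit come from the same patient position, which is the misregistration benefit the abstract describes. On real, noisy data a weighted or nonlinear fit may be preferable to the log-linear fit used here.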

A Study on a Binary Random Sequence Generator with Two Characteristic Polynomials (두개의 특성 다항식으로 구성된 이진 난수열 발생기에 관한 연구)

  • 김대엽;주학수;임종인
    • Journal of the Korea Institute of Information Security & Cryptology
    • /
    • v.12 no.3
    • /
    • pp.77-85
    • /
    • 2002
  • Binary random sequence generators that use linear shift registers have been studied since the 1970s. These generators have been used in stream ciphers. In general, a binary random sequence generator consists of linear shift registers that generate sequences of maximum period and a nonlinear filter function or nonlinear combination function that generates a sequence of high linear complexity. Generating a sequence that has both a long period and high linear complexity is therefore an important factor in evaluating the security of a stream cipher. Usually, the period of the sequence generated by a linear feedback shift register with L registers is less than or equal to $2^L - 1$. In this paper, we propose a new binary random sequence generator that consists of L registers and 2 sub-characteristic polynomials. Depending on the initial state vector, the least period of the sequence generated by the proposed generator is equal to or longer than that of the sequence generated by a general linear feedback shift register, and its linear complexity is also increased.
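
To illustrate the period bound mentioned in the abstract above, here is a minimal sketch of a conventional Fibonacci linear feedback shift register. It is not the two-polynomial generator proposed in the paper. The function names and the tap convention (bit 0 is the output bit) are assumptions made for this example.

```python
from typing import Sequence


def lfsr_step(state: int, taps: Sequence[int], length: int) -> int:
    """Advance a Fibonacci LFSR by one step.

    `state` holds `length` bits; `taps` lists the bit positions
    (0 = least significant) XORed together to form the feedback bit.
    """
    feedback = 0
    for t in taps:
        feedback ^= (state >> t) & 1
    # Shift right by one and insert the feedback bit at the top position.
    return (state >> 1) | (feedback << (length - 1))


def lfsr_period(seed: int, taps: Sequence[int], length: int) -> int:
    """Count the steps until the register returns to `seed`.

    Terminates whenever bit 0 is among the taps, because the state
    update is then invertible and every state lies on a cycle.
    """
    state = lfsr_step(seed, taps, length)
    period = 1
    while state != seed:
        state = lfsr_step(state, taps, length)
        period += 1
    return period
```

With L = 4 and taps (0, 1), which correspond to the primitive polynomial x^4 + x + 1, `lfsr_period(0b0001, (0, 1), 4)` evaluates to 15 = 2^4 − 1. With taps (0, 2), corresponding to the non-primitive x^4 + x^2 + 1, the same seed gives a period of only 6, which shows why the choice of characteristic polynomial matters.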

Comparison and Sequence Analysis of the 3'-terminal Regions of RNA 1 of Barley Yellow Mosaic Virus

  • Lee, Kui-Jae
    • Plant Resources
    • /
    • v.1 no.2
    • /
    • pp.92-97
    • /
    • 1998
  • An isolate of barley yellow mosaic virus (BaYMV-HN) obtained from Haenam, Korea was compared with two BaYMV strains, BaYMV-Ⅱ-1 from Japan and BaYMV-G from Germany. The sequence of the 3'-terminal 3817 nucleotides [excluding the poly(A) tail] of RNA 1 of BaYMV-HN was determined and found to start within a long open reading frame coding for part of the NIa-VPg polymerase (26 amino acids), the NIa-Pro polymerase (343 amino acids), the NIb polymerase (528 amino acids) and the entire capsid protein (297 amino acids), which is followed by a noncoding region (NCR) of 235 nucleotides. In the partial ORFs, BaYMV-HN shows higher sequence homology with BaYMV-Ⅱ-1 (99.5%) than with BaYMV-G (92.7%). The 3' noncoding region of BaYMV-HN (235 nt) shows higher nucleotide sequence homology with BaYMV-G (235 nt) (99.6%) than with BaYMV-Ⅱ-1 (231 nt) (97.0%). The 3' NIa-Pro protein sequence of BaYMV-HN shows higher amino acid sequence homology with BaYMV-Ⅱ-1 (95.0%) than with BaYMV-G (93.6%), but the NIb protein sequence of BaYMV-HN is identical at all amino acid positions. The capsid protein sequence of BaYMV-HN (297 aa) is identical to that of BaYMV-Ⅱ-1, and shows higher nucleotide sequence homology with BaYMV-UK (from the United Kingdom) (97.3%) than with BaYMV-G (96.9%) and G2 (96.9%). The number of capsid protein amino acid differences was 0-9 among the isolates from Japan, the United Kingdom and Germany, and 2-6 among all Korean isolates. Many of the amino acid differences are located in the N-terminal region of the capsid protein, between amino acid positions 1 and 74.


A Management Technique for Protein Version Information based on Local Sequence Alignment and Trigger (로컬 서열 정렬과 트리거 기반의 단백질 버전 정보 관리 기법)

  • Jung Kwang-Su;Park Sung-Hee;Ryu Keun-Ho
    • The KIPS Transactions:PartD
    • /
    • v.12D no.1 s.97
    • /
    • pp.51-62
    • /
    • 2005
  • Once the function of an amino acid sequence has been determined, we can infer the function of other amino acid sequences that have a similar sequence composition. In addition, a protein of known function can be altered into a useful protein using genetic engineering methods. In this process, an original protein sequence gives rise to various protein sequences with different sequence compositions. A systematic technique is therefore needed to manage protein version sequences and the reference data of those sequences. In this paper, we propose a technique for managing protein version sequences based on local sequence alignment and a technique for managing protein historical reference data using triggers. This method automatically determines the similarity between an original sequence and each version sequence while the protein version sequences are stored in the database. When this technique is employed, the storage space required for protein sequences is also reduced. By storing the historical information of proteins and analyzing the changes in protein sequences, we expect that new useful proteins and drugs can be discovered based on the analysis of version sequences.
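
The abstract above relies on local sequence alignment to measure how similar a version sequence is to its original. As a minimal sketch of that general idea, the following computes a Smith-Waterman local alignment score. The scoring values (match +2, mismatch −1, linear gap −2) and the example sequences are assumptions for illustration; the paper's actual scoring scheme and database triggers are not reproduced here.

```python
def smith_waterman_score(a: str, b: str,
                         match: int = 2,
                         mismatch: int = -1,
                         gap: int = -2) -> int:
    """Return the best local alignment score between sequences a and b.

    Uses the Smith-Waterman recurrence with a linear gap penalty and
    keeps only two rows of the dynamic-programming table.
    """
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # A local alignment may restart anywhere, hence the floor at 0.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


# Hypothetical original sequence and a version with one substitution.
original = "MKTAYIAK"
version = "MKTAHIAK"
score = smith_waterman_score(original, version)
```

Here `score` is 13: seven matching residues (+14) and one substitution (−1). Dividing by the self-alignment score of the original (16 in this example) gives one possible normalized similarity, though that normalization is also an assumption of this sketch.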

Mining High Utility Sequential Patterns Using Sequence Utility Lists (시퀀스 유틸리티 리스트를 사용하여 높은 유틸리티 순차 패턴 탐사 기법)

  • Park, Jong Soo
    • KIPS Transactions on Software and Data Engineering
    • /
    • v.7 no.2
    • /
    • pp.51-62
    • /
    • 2018
  • High utility sequential pattern (HUSP) mining is considered an important research topic in data mining. Although several algorithms have been proposed for this topic, they suffer from a large search space for HUSPs. A tighter utility upper bound of a sequence can prune more unpromising patterns early in the search space. In this paper, we propose a sequence expected utility (SEU) as a new utility upper bound of each sequence, which is the maximum expected utility of a sequence and all its descendant sequences. A sequence utility list for each pattern is used as a new data structure to maintain essential information for mining HUSPs. We devise an algorithm, high sequence utility list-span (HSUL-Span), to identify HUSPs by employing SEU. Experimental results on both synthetic and real datasets from different domains show that HSUL-Span generates considerably fewer candidate patterns and outperforms other algorithms in terms of execution time.

A DNA Sequence Alignment Algorithm Using Quality Information and a Fuzzy Inference Method (품질 정보와 퍼지 추론 기법을 이용한 DNA 염기 서열 배치 알고리즘)

  • Kim, Kwang-Baek
    • Journal of Intelligence and Information Systems
    • /
    • v.13 no.2
    • /
    • pp.55-68
    • /
    • 2007
  • DNA sequence alignment algorithms in computational molecular biology have been improved by diverse methods. In this paper, we propose a DNA sequence alignment algorithm that uses quality information and a fuzzy inference method, exploiting the characteristics of DNA sequence fragments and a fuzzy logic system, in order to improve conventional alignment methods that use DNA sequence quality information. In conventional algorithms, DNA sequence alignment scores are calculated by the global sequence alignment algorithm of Needleman and Wunsch, applying the quality information of each DNA fragment. However, because the overall DNA sequence quality information is used, errors may occur in calculating alignment scores when the quality of the DNA fragment tips is low. In the proposed method, exact DNA sequence alignment can be achieved despite low quality at the fragment tips by improving conventional algorithms that use quality information. In addition, the mapping score parameters used to calculate alignment scores are dynamically adjusted by the fuzzy logic system according to the lengths of the DNA fragments and the frequencies of low-quality DNA bases in the fragments. In experiments on real genome data from NCBI (National Center for Biotechnology Information), the proposed method was more efficient than conventional algorithms using quality information in DNA sequence alignment.

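
The abstract above builds on the Needleman-Wunsch global alignment algorithm. As a minimal sketch of that baseline only, the following computes a global alignment score with fixed parameters. The scoring values (match +1, mismatch −1, gap −1) are assumptions for illustration; the quality weighting and the fuzzy adjustment of score parameters described in the paper are not modeled here.

```python
def needleman_wunsch_score(a: str, b: str,
                           match: int = 1,
                           mismatch: int = -1,
                           gap: int = -1) -> int:
    """Return the optimal global alignment score of sequences a and b.

    Standard Needleman-Wunsch recurrence with a linear gap penalty,
    keeping only two rows of the dynamic-programming table.
    """
    # Row 0: aligning the empty prefix of a against prefixes of b.
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap] + [0] * len(b)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            curr[j] = max(diag, prev[j] + gap, curr[j - 1] + gap)
        prev = curr
    return prev[len(b)]
```

For example, `needleman_wunsch_score("GATTACA", "GATTCA")` evaluates to 5: six matches and one gap. Unlike a local alignment, every base of both fragments contributes to the score, so low-quality bases at the fragment tips affect the result, which is the weakness the abstract sets out to address.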

M2M Transformation Rules for Automatic Test Case Generation from Sequence Diagram (시퀀스 다이어그램으로부터 테스트 케이스 자동 생성을 위한 M2M(Model-to-Model) 변환 규칙)

  • Kim, Jin-a;Kim, Su Ji;Seo, Yongjin;Cheon, Eunyoung;Kim, Hyeon Soo
    • KIISE Transactions on Computing Practices
    • /
    • v.22 no.1
    • /
    • pp.32-37
    • /
    • 2016
  • In model-based testing using sequence diagrams, test cases are automatically derived from the sequence diagrams. To generate test cases, the scenarios represented by a sequence diagram need to be identified, and test paths satisfying the test coverage need to be extracted. However, it is hard to automatically extract test paths from a sequence diagram because it represents loop, opt, and alt information using CombinedFragments. To resolve this problem, we propose a transformation process that transforms a sequence diagram into an activity diagram, which represents scenarios as control flows. In addition, we generate test cases from the activity diagram by applying a test coverage concept. Finally, we present a case study of test case generation from a sequence diagram.

Analysis on the nucleotide sequence of the signal region of Bacillus subtilis extracellular cellulase gene (Bacillus subtilis로 부터 분리한 cellulase 유전자의 조절부위에 대한 염기서열분석)

  • 서연수;이영호;백운화;강현삼
    • Korean Journal of Microbiology
    • /
    • v.24 no.3
    • /
    • pp.236-242
    • /
    • 1986
  • The nucleotide sequence of the genetic control site of the Bacillus subtilis gene for $(1-4)-{\beta}-D-glucan$ endoglucanase (cellulase) was determined by the dideoxy chain termination method (Sanger et al., 1977). The deduced amino acid sequence of this enzyme has a hydrophobic signal peptide at the $NH_2$ terminus similar to those found in fifteen other extracellular enzymes from Bacillus species. This is followed by a sequence resembling the Bacillus ribosome binding site 14 nucleotides before the first codon of the gene. The presumptive promoter sequence was located 92 base pairs upstream from the initiation codon. The homology region in the signal sequences was striking when all the signal sequences of the sixteen extracellular enzymes from Bacillus species compiled so far were compared.


Studies on the Organization and Expression of tRNA Genes in Aspergillus nidulans (V) The Molecular Structure of $tRNA^{Arg}$ in Aspergillus nidulans (Aspergillus nidulans의 tRNA유전자의 구조와 발현에 관한 연구 V Aspergillus nidulans의 $tRNA^{Arg}$ 분자구조)

  • 이병재;강현삼
    • Korean Journal of Microbiology
    • /
    • v.24 no.2
    • /
    • pp.79-85
    • /
    • 1986
  • We have partially determined the sequence of $tRNA^{Arg}$ of A. nidulans by an enzymatic rapid RNA sequencing technique. The sequence was 5'GGCCGGCUGGCCCAAXUGGCAAGGXUCUGAXUACGAAXCAGGAGAUUGCACXXXXXGAGCXXUXXGUCGGUCACCA3'. The cloverleaf structure was constructed from the above data. As a result, the anticodon sequence was identified as ACG. This result was confirmed by a charging test. The complete sequence was proposed by supplementing this RNA sequence with the DNA sequence and by assigning the positions of minor bases.
