• Title/Abstract/Keywords: exact sequences

104 search results (processing time: 0.024 s)

AllEC: An Implementation of Application for EC Numbers Prediction based on AEC Algorithm

  • Park, Juyeon; Park, Mingyu; Han, Sora; Kim, Jeongdong; Oh, Taejin; Lee, Hyun
    • International Journal of Advanced Culture Technology / Vol. 10, No. 2 / pp. 201-212 / 2022
  • With the development of sequencing technology, there is a growing need for methods that predict the function of a protein sequence. Enzyme Commission (EC) numbers are becoming markers that distinguish the function of a sequence, and many researchers are studying deep-learning-based methods for predicting the EC numbers of protein sequences. However, because the various methods can return different results, it is unclear which prediction for a given sequence is correct. To solve this problem, this paper proposes the All Enzyme Commission (AEC) algorithm, which runs several prediction methods and integrates their results. The algorithm gives more weight to a value when multiple methods return that same value. After weighting, the largest of the per-method prediction values is taken as the final prediction result. For the convenience of researchers, the algorithm is provided through the AllEC web service, so it can be used regardless of operating system, installation, or operating environment.
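
The weighting step described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact weighting rule, so the rule below (each method's score multiplied by the number of methods that return the same EC number) is an assumption, as are the function name `consensus_ec`, the method names, and the scores.

```python
from collections import defaultdict


def consensus_ec(predictions):
    """Combine per-method predictions; a hypothetical stand-in for the AEC idea.

    predictions: dict mapping method name -> (ec_number, score in [0, 1]).
    """
    # Count how many methods returned each EC number (the "duplicates").
    votes = defaultdict(int)
    for ec, _ in predictions.values():
        votes[ec] += 1

    # Assumed rule: weight each method's score by the number of agreeing methods.
    weighted = {
        method: (ec, score * votes[ec])
        for method, (ec, score) in predictions.items()
    }

    # The largest weighted value determines the final prediction.
    best_method = max(weighted, key=lambda m: weighted[m][1])
    return weighted[best_method]


# Invented example: two methods agree, one disagrees with a higher raw score.
example = {
    "method_a": ("1.1.1.1", 0.60),
    "method_b": ("1.1.1.1", 0.55),
    "method_c": ("2.7.11.1", 0.90),
}
result = consensus_ec(example)
```

In this invented example the two agreeing methods outweigh the single higher-scoring one, which is the behaviour the abstract attributes to duplicate-based weighting.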

ANALYSIS OF NEIGHBOR-JOINING BASED ON BOX MODEL

  • Cho, Jin-Hwan; Joe, Do-Sang; Kim, Young-Rock
    • Journal of applied mathematics & informatics / Vol. 25, No. 1-2 / pp. 455-470 / 2007
  • In phylogenetic tree construction, the neighbor-joining algorithm is the best-known method for constructing a trivalent tree from pairwise distance data measured from DNA sequences. The core of the algorithm is its cherry-picking criterion, which is based on the tree structure of each quartet. We give a generalized version of the criterion based on the exact box model of quartets, known as the tight span of a metric. We also show by experiment why neighbor-joining and the quartet consistency count method give similar performance.
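
For orientation, the standard neighbor-joining cherry-picking step can be sketched as follows. This is the textbook Q-criterion, Q(i, j) = (n - 2) d(i, j) - sum_k d(i, k) - sum_k d(j, k), not the generalized box-model criterion proposed in the paper; the function name `pick_cherry` and the distance matrix are illustrative assumptions.

```python
from itertools import combinations


def pick_cherry(dist):
    """Return the pair (i, j) minimizing the standard NJ Q-criterion.

    dist: symmetric matrix (list of lists) of pairwise distances, n >= 3 taxa.
    """
    n = len(dist)
    row_sums = [sum(row) for row in dist]
    best_pair, best_q = None, float("inf")
    for i, j in combinations(range(n), 2):
        # Q(i, j) = (n - 2) * d(i, j) - sum_k d(i, k) - sum_k d(j, k)
        q = (n - 2) * dist[i][j] - row_sums[i] - row_sums[j]
        if q < best_q:
            best_pair, best_q = (i, j), q
    return best_pair, best_q


# Illustrative 5-taxon distance matrix (not taken from the paper).
distances = [
    [0, 5, 9, 9, 8],
    [5, 0, 10, 10, 9],
    [9, 10, 0, 8, 7],
    [9, 10, 8, 0, 3],
    [8, 9, 7, 3, 0],
]
pair, q_value = pick_cherry(distances)
```

The selected pair is joined first. Note that it need not be the closest pair: taxa 3 and 4 have the smallest raw distance here, but the row-sum correction favours taxa 0 and 1.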

ALGEBRAIC STRUCTURES IN A PRINCIPAL FIBRE BUNDLE

  • Park, Joon-Sik
    • Journal of the Chungcheong Mathematical Society (충청수학회지) / Vol. 21, No. 3 / pp. 371-376 / 2008
  • Let $P(M,G,\pi)=:P$ be a principal fibre bundle with structure Lie group $G$ over a base manifold $M$. In this paper we obtain the following facts: 1. The tangent bundle $TG$ of the structure Lie group $G$ in $P(M,G,\pi)$ is a Lie group. 2. The Lie algebra $\mathfrak{g}=T_eG$ is a normal subgroup of the Lie group $TG$. 3. $TP(TM,TG,\pi_*)=:TP$ is a principal fibre bundle with structure Lie group $TG$ and projection $\pi_*$ over the base manifold $TM$, where $\pi_*$ is the differential map of the projection $\pi$ of $P$ onto $M$. 4. For a Lie group $H$, $TH=H\circ T_eH=T_eH\circ H$ and $H\cap T_eH=\{e\}$, but $H$ is not a normal subgroup of the group $TH$ in general.

Finding approximate occurrence of a pattern that contains gaps by the bit-vector approach

  • Lee, In-Bok; Park, Kun-Soo
    • Proceedings of the Korean Society for Bioinformatics Conference (한국생물정보학회:학술대회논문집) / 한국생물정보시스템생물학회, Proceedings of the 2nd Annual Conference, 2003 / pp. 193-199 / 2003
  • Applications of finding occurrences of a pattern that contains gaps include information retrieval, data mining, and computational biology. Because biological sequences may contain errors, it is important to find not only the exact occurrences of a pattern but also approximate ones. In this paper we present an $O(mnk_{\max}/w)$ time algorithm for the approximate gapped pattern matching problem, where $m$ is the length of the text, $n$ is the length of the pattern, $w$ is the word size of the target machine, and $k_{\max}$ is the greatest error bound for subpatterns.
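
The general bit-vector idea can be sketched with a Shift-And style search that allows up to k mismatches (in the spirit of Wu and Manber). This is not the paper's algorithm: it handles neither gaps nor per-subpattern error bounds, and it counts substitutions only. The function name `bitvector_search` and the sample strings are assumptions.

```python
def bitvector_search(text, pattern, k):
    """Return start indices where pattern matches text with at most k mismatches."""
    m = len(pattern)
    full = (1 << m) - 1  # keeps every state vector within m bits

    # masks[c] has bit i set when pattern[i] == c.
    masks = {}
    for i, c in enumerate(pattern):
        masks[c] = masks.get(c, 0) | (1 << i)

    # state[d], bit i: pattern[:i + 1] matches the text ending at the
    # current position with at most d mismatches.
    state = [0] * (k + 1)
    hits = []
    for pos, c in enumerate(text):
        char_mask = masks.get(c, 0)
        prev = state[:]
        state[0] = ((prev[0] << 1) | 1) & char_mask
        for d in range(1, k + 1):
            match = ((prev[d] << 1) | 1) & char_mask   # current character matches
            substitute = (prev[d - 1] << 1) | 1        # current character is a mismatch
            state[d] = (match | substitute) & full
        if state[k] & (1 << (m - 1)):
            hits.append(pos - m + 1)
    return hits


exact_hits = bitvector_search("abcxbcd", "abc", 0)
approx_hits = bitvector_search("abcxbcd", "abc", 1)
```

Python integers are unbounded, so the word size w does not appear in the sketch. On a real machine each state vector occupies about (pattern length)/w words, which is where the division by w in such time bounds comes from.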

Motion Boundary Detection and Motion Vector Estimation by Spatio-temporal Gradient Method using a New Spatial Gradient

  • 김이한; 김성대
    • Journal of the Korean Institute of Telematics and Electronics B (전자공학회논문지B) / Vol. 30B, No. 2 / pp. 59-68 / 1993
  • Motion vector estimation and motion boundary detection have been actively studied because they provide important clues for analyzing object structure and 3-D motion. The aim of such research is more exact estimation, but two main factors cause inaccuracy: erroneous measurement of the gradients of brightness values, and blurring of motion boundaries caused by the smoothness constraint. In this paper, we analyze the gradient measurement error of conventional methods and propose a new technique based on that analysis. When the proposed method is applied to Schunck's motion boundary detection and to Horn & Schunck's motion vector estimation, it performs much better than the conventional methods on several artificial and real image sequences.

Estimation of Gini-Simpson index for SNP data

  • Kang, Joonsung
    • Journal of the Korean Data and Information Science Society / Vol. 28, No. 6 / pp. 1557-1564 / 2017
  • We consider high-dimensional, low-sample-size (HDLSS) genomic sequences whose response categories have no ordering. When constructing an appropriate test statistic in this model, the classical multivariate analysis of variance (MANOVA) approach might not be useful owing to the very large number of parameters and the very small sample size. For these reasons, we present a pseudo-marginal model based on the Gini-Simpson index estimated via a Bayesian approach. In view of the small sample size, we consider the permutation distribution given by every possible $n!$ (equally likely) permutation of the joined sample observations across $G$ groups (of sizes $n_1, \ldots, n_G$). We simulate data and apply the false discovery rate (FDR) and positive false discovery rate (pFDR) procedures with the proposed test statistics to the data. We also analyze real SARS data and compute the FDR and pFDR. The FDR and pFDR procedures, along with the associated test statistics for each gene, control the FDR and pFDR respectively at any level $\alpha$ for the set of p-values by using the exact conditional permutation theory.
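
As a reference point, the Gini-Simpson index of a categorical sample is 1 minus the sum of squared category proportions. The sketch below computes the simple plug-in (sample-proportion) version, not the Bayesian estimate used in the paper; the function name `gini_simpson` and the sample data are assumptions.

```python
from collections import Counter


def gini_simpson(observations):
    """Plug-in Gini-Simpson index: 1 - sum_i p_i^2 over observed categories."""
    counts = Counter(observations)
    n = sum(counts.values())
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


# Hypothetical alleles observed at one position across four samples.
value = gini_simpson(["A", "A", "C", "G"])
```

The index is 0 when every observation falls in one category and approaches 1 - 1/K when K categories are equally frequent.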

An Efficient Video Indexing Method using Object Motion Map in Compressed Domain

  • 김소연; 노용만
    • The Transactions of the Korea Information Processing Society (한국정보처리학회논문지) / Vol. 7, No. 5 / pp. 1570-1578 / 2000
  • Object motion is an important feature of content in video sequences. Various methods for extracting features of object motion have been reported [1,2]. However, they are not suitable for indexing video by motion, because the indexing requires many bits and complex indexing parameters [3,4]. In this paper, we propose an object motion map that provides an efficient indexing method for object motion. The proposed object motion map holds both global and local motion information while an object is moving, and it requires only a small amount of memory for the indexing. To evaluate the performance of the proposed indexing technique, experiments are performed on a video database consisting of MPEG-1 video sequences from the MPEG-7 test set.

Distribution of Runs and Patterns in Four State Trials

  • Jungtaek Oh
    • Kyungpook Mathematical Journal / Vol. 64, No. 2 / pp. 287-301 / 2024
  • From the mathematical and statistical point of view, a segment of a DNA strand can be viewed as a sequence of four-state (A, C, G, T) trials. Herein, we consider the distributions of runs and patterns related to the run lengths of multi-state sequences, especially for four states (A, B, C, D). Let $X_1, X_2, \ldots$ be a sequence of four-state independent and identically distributed trials taking values in the set $\mathcal{G} = \{A, B, C, D\}$. In this study, we obtain exact formulas for the probability distribution function of the discrete distribution of runs of B's of order $k$. We obtain longest-run statistics and shortest-run statistics, and determine the distributions of waiting times and run lengths.
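
One quantity of this kind, the probability that a run of at least k consecutive B's occurs within n i.i.d. trials, can be computed exactly with a small dynamic program. This is a generic textbook-style sketch, not the closed-form formulas derived in the paper; the function name `prob_longest_run_at_least` and the default P(B) = 0.25 (four equally likely states) are assumptions.

```python
def prob_longest_run_at_least(n, k, p_b=0.25):
    """P(a run of B's of length >= k occurs within n i.i.d. trials), for k >= 1."""
    # alive[r]: probability that no run of length k has occurred so far and
    # the sequence currently ends in exactly r consecutive B's (0 <= r < k).
    alive = [0.0] * k
    alive[0] = 1.0
    for _ in range(n):
        nxt = [0.0] * k
        for r, prob in enumerate(alive):
            nxt[0] += prob * (1.0 - p_b)   # any non-B symbol resets the run
            if r + 1 < k:
                nxt[r + 1] += prob * p_b   # run grows but stays below k
            # otherwise the run reaches k and the probability mass is absorbed
        alive = nxt
    return 1.0 - sum(alive)


p_two_trials = prob_longest_run_at_least(2, 2)
p_three_trials = prob_longest_run_at_least(3, 2)
```

For n = 2 and k = 2 the result reduces to P(B)^2, and for n = 3 it equals 2 P(B)^2 - P(B)^3, which gives a quick hand check of the recursion.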

The Effect of Acoustic Correlates of Domain-initial Strengthening in Lexical Segmentation of English by Native Korean Listeners

  • Kim, Sa-Hyang; Cho, Tae-Hong
    • Phonetics and Speech Sciences (말소리와 음성과학) / Vol. 2, No. 3 / pp. 115-124 / 2010
  • The current study investigated the role of acoustic correlates of domain-initial strengthening in lexical segmentation of a non-native language. In a series of cross-modal identity-priming experiments, native Korean listeners heard English auditory stimuli and made lexical decisions on visual targets (i.e., written words). The auditory stimuli contained critical two-word sequences that created temporary lexical ambiguity (e.g., 'mill#company', with the competitor 'milk'). There was either an IP boundary or a word boundary between the two words in the critical sequences. The initial CV of the second word (e.g., [kʌ] in 'company') was spliced from another token of the sequence in IP- or Wd-initial position. The prime words were postboundary words (e.g., company) in Experiment 1 and preboundary words (e.g., mill) in Experiment 2. In both experiments, Korean listeners showed priming effects only in IP contexts, indicating that they can make use of English IP boundary cues in lexical segmentation of English. The acoustic correlates of domain-initial strengthening were also exploited by Korean listeners, but significant effects were found only for the segmentation of postboundary words. The results therefore indicate that L2 listeners can make use of prosodically driven phonetic detail in lexical segmentation of L2, as long as the direction of those cues is similar in their L1 and L2. The exact use of the cues by Korean listeners was, however, different from that found with native English listeners in Cho, McQueen, and Cox (2007). The differential use of the prosodically driven phonetic cues by native and non-native listeners is thus discussed.

Differentially Expressed Genes by Methylmercury in Neuroblastoma cell line using suppression subtractive hybridization (SSH) and cDNA Microarray

  • Kim, Youn-Jung; Chang, Suk-Tai; Yun, Hye-Jung; Ryu, Jae-Chun
    • Proceedings of the Korean Environmental Toxicology Society Conference (한국환경독성학회:학술대회논문집) / 한국환경독성학회, 2003 Spring Conference / p. 187 / 2003
  • Methylmercury (MeHg), a heavy metal compound, can cause severe damage to the central nervous system in humans. Many reports have shown that MeHg has been released into the environment and poisons the human body through contaminated food. Despite many studies on the pathogenesis of MeHg-induced central neuropathy, no useful mechanism of toxicity has been established so far. In this study, two methods, cDNA microarray and SSH, were used to assess the expression profile in response to MeHg and to identify genes differentially expressed by MeHg in a neuroblastoma cell line. TwinChip Human-8K (Digital Genomics) was used with total RNA from SH-SY5Y (a human neuroblastoma cell line) treated with solvent (DMSO) and with 6.25 µM (IC50) MeHg. We also performed forward and reverse SSH on mRNA derived from SH-SY5Y treated with DMSO and MeHg (6.25 µM). Differentially expressed cDNA clones were sequenced and screened by dot blot and ribonuclease protection assay to confirm that individual clones indeed represent differentially expressed genes. These sequences were identified by BLAST homology search against known genes or expressed sequence tags (ESTs). Analysis of these sequences may provide insight into the biological effects of MeHg in the pathogenesis of neurodegenerative disease and may make it possible to develop a more efficient and exact monitoring system for heavy metals as environmental pollutants.
