• Title/Summary/Keyword: sequence motif

Search results: 241 items (processing time: 0.029 s)

Protein Motif Extraction via Feature Interval Selection

  • Sohn, In-Suk;Hwang, Chang-Ha;Ko, Jun-Su;Chiu, David;Hong, Dug-Hun
    • Journal of the Korean Data and Information Science Society / Vol. 17, No. 4 / pp.1279-1287 / 2006
  • The purpose of this paper is to present a new algorithm for extracting the consensus pattern, or motif, from sequences belonging to the same family. Two methods are considered for feature interval partitioning: equal probability and equal width interval partitioning. C2H2 zinc finger protein and epidermal growth factor protein sequences are used to demonstrate the effectiveness of the proposed algorithm for motif extraction. For both protein families, the equal width interval partitioning method performs better than the equal probability interval partitioning method.
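
The two partitioning schemes named in this abstract can be illustrated with a minimal sketch. It is not the paper's algorithm: the feature values in `scores` and all function names are hypothetical, and "equal probability" is read here as equal-frequency (quantile) binning, which is an assumption.

```python
from bisect import bisect_right


def equal_width_edges(values, n_bins):
    """Interior cut points splitting [min, max] into n_bins intervals of equal width."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    return [lo + i * width for i in range(1, n_bins)]


def equal_probability_edges(values, n_bins):
    """Interior cut points chosen so each interval holds about the same number of values."""
    ordered = sorted(values)
    n = len(ordered)
    return [ordered[(i * n) // n_bins] for i in range(1, n_bins)]


def assign_bins(values, edges):
    """Map each value to the index of the interval it falls into."""
    return [bisect_right(edges, v) for v in values]


# Hypothetical feature values (assumption: any numeric per-sequence feature).
scores = [0.0, 1.0, 1.5, 2.0, 2.5, 3.0, 9.0, 12.0]

width_edges = equal_width_edges(scores, 4)       # [3.0, 6.0, 9.0]
prob_edges = equal_probability_edges(scores, 4)  # [1.5, 2.5, 9.0]

print(assign_bins(scores, width_edges))  # [0, 0, 0, 0, 0, 1, 3, 3]
print(assign_bins(scores, prob_edges))   # [0, 0, 1, 1, 2, 2, 3, 3]
```

With skewed values, equal-width intervals leave some bins empty, while equal-probability intervals spread the values evenly. Which scheme suits motif extraction better is an empirical question that the abstract answers only for its two protein families.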

MOTIF BASED PROTEIN FUNCTION ANALYSIS USING DATA MINING

  • Lee, Bum-Ju;Lee, Heon-Gyu;Ryu, Keun-Ho
    • Korean Society of Remote Sensing: Conference Proceedings / Korean Society of Remote Sensing, 2006, Proceedings of ISRS 2006 PORSEC Volume II / pp.812-815 / 2006
  • Proteins are essential agents for controlling, effecting, and modulating cellular functions. Proteins with similar sequences have diverged from a common ancestral gene and have similar structures and functions. Function prediction for unknown proteins remains one of the most challenging problems in bioinformatics. Recently, various computational approaches have been developed to identify short sequences that are conserved within a family of closely related protein sequences. Protein function is often correlated with highly conserved motifs. A motif is the smallest unit of protein structure and function and forms the core of a protein's structural and functional components. Therefore, prediction methods using data mining or machine learning have been developed. In this paper, we describe an approach to protein function prediction with motif-based models using data mining. Our work consists of three phases: we build training and test data sets, construct a classifier from the training set, and evaluate it experimentally against other classifiers in terms of classification accuracy.

Interpretation of Association Networks among Protein Sequence Motifs

  • Kam, Hye J.;Lee, Junehawk;Lee, Doheon;Lee, Kwang H.
    • Genomics & Informatics / Vol. 1, No. 2 / pp.75-79 / 2003
  • Every protein can be characterized by either a distinct motif or a combination of motifs. Nevertheless, little is known about the relationships among more than two motifs. Some proteins share motifs for evolutionary or other biological benefits: sharing saves the energy, time, and resources needed to control and manage a variety of proteins. For some motifs this sharing is quite common, and they can act as 'hub' motifs in a network of motif associations. The hubs are structurally and functionally important in themselves and are also important in disease-related mutations. They are expected to be highly resistant to mutation so that their functions are conserved. In the rare case that a mutation does occur, however, a mutation at a hub position can more easily cause fatal disease.
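
The idea of a 'hub' motif can be illustrated with a small co-occurrence network. This is only a sketch: the abstract does not say how the associations were computed, so linking two motifs whenever they occur in the same protein is an assumption, and the protein and motif identifiers are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def motif_association_degrees(protein_motifs):
    """Return, for each motif, how many other motifs it co-occurs with in some protein."""
    neighbors = defaultdict(set)
    for motifs in protein_motifs.values():
        # Every pair of distinct motifs in one protein becomes an edge.
        for a, b in combinations(sorted(set(motifs)), 2):
            neighbors[a].add(b)
            neighbors[b].add(a)
    return {motif: len(linked) for motif, linked in neighbors.items()}


# Hypothetical annotation table: protein ID -> motifs found in that protein.
protein_motifs = {
    "P1": ["M1", "M2"],
    "P2": ["M1", "M3"],
    "P3": ["M1", "M4"],
    "P4": ["M2", "M3"],
}

degrees = motif_association_degrees(protein_motifs)
hub = max(degrees, key=degrees.get)
print(degrees)  # {'M1': 3, 'M2': 2, 'M3': 2, 'M4': 1}
print(hub)      # M1
```

Here `M1` has the highest degree and would be the candidate hub. A real analysis would also need a significance test for each association, which this sketch omits.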

A Big Data Based Random Motif Frequency Method for Analyzing Human Proteins

  • 김은미;정종철;이배호
    • The Journal of the Korea Institute of Electronic Communication Sciences / Vol. 13, No. 6 / pp.1397-1404 / 2018
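  • Protein analysis based on three-dimensional structure has not advanced greatly because of the technical difficulty and high cost of generating 3D data. A motif is defined as segment information from a protein or gene sequence. Because of their simplicity, motifs are actively and widely applied in various fields. However, comprehensive understanding of, and research on, motifs themselves remain scant. The significance of this paper, as a method for analyzing human proteins with artificial intelligence techniques, lies in three aspects. (1) It is the first comprehensive, in-depth computational analysis of human proteins that links all human protein structures currently stored in the Protein Data Bank (PDB) with the corresponding Enzyme Commission (EC) database and the Structural Classification of Proteins (SCOP) database, and that analyzes the intrinsic properties of proteins with a new motif-based method. (2) Using hierarchical clustering of the new protein features generated from motifs, the study analyzes the intrinsic characteristics of proteins in three categories: pattern analysis, statistics, and protein function analysis. (3) It proposes a protein analysis methodology based on random motif frequency (RMF), which uses big data on the frequency of randomly generated motifs within proteins to vary the motif length while analyzing contact residues and protein function from multiple angles.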
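
The core quantity behind a random motif frequency (RMF) analysis, as the title and abstract describe it, is how often randomly generated motifs of a given length occur in protein sequences. The sketch below shows only that counting step under stated assumptions: motifs are drawn uniformly from the 20 standard amino acids, occurrences may overlap, and the sequences are toy strings rather than PDB entries. The paper's actual sampling and scoring scheme may differ.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def random_motif(length, rng):
    """Draw one motif of the given length, uniformly over the residue alphabet."""
    return "".join(rng.choice(AMINO_ACIDS) for _ in range(length))


def count_occurrences(motif, sequence):
    """Count occurrences of motif in sequence, including overlapping ones."""
    last_start = len(sequence) - len(motif)
    return sum(1 for i in range(last_start + 1) if sequence.startswith(motif, i))


def motif_frequencies(motifs, sequences):
    """Total occurrences of each motif across all sequences."""
    return {m: sum(count_occurrences(m, s) for s in sequences) for m in motifs}


# Toy sequences (hypothetical, not real proteins).
sequences = ["MGFGFG", "AAAA"]
print(motif_frequencies(["GFG", "AA"], sequences))  # {'GFG': 2, 'AA': 3}

# Varying the motif length, as the abstract describes; the seed is arbitrary.
rng = random.Random(0)
for length in (2, 3, 4):
    motifs = [random_motif(length, rng) for _ in range(5)]
    print(length, motif_frequencies(motifs, sequences))
```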

A New Esterase, Belonging to Hormone-Sensitive Lipase Family, Cloned from Rheinheimera sp. Isolated from Industrial Effluent

  • Virk, Antar Puneet;Sharma, Prince;Capalash, Neena
    • Journal of Microbiology and Biotechnology / Vol. 21, No. 7 / pp.667-674 / 2011
  • The gene for esterase (rEst1) was isolated from a new species of genus Rheinheimera by functional screening of E. coli cells transformed with the pSMART/HaeIII genomic library. E. coli cells harboring the esterase gene insert could grow and produce clear halo zones on tributyrin agar. The rEst1 ORF consisted of 1,029 bp, corresponding to 342 amino acid residues with a molecular mass of 37 kDa. The SignalP 3.0 program revealed the presence of a signal peptide of 25 amino acids. Esterase activity, however, was associated with a homotrimeric form of molecular mass 95 kDa and not with the monomeric form. The deduced amino acid sequence showed only 54% sequence identity with the closest lipase from Cellvibrio japonicus strain Ueda 107. Conserved domain search and multiple sequence alignment revealed the presence of an esterase/lipase conserved domain consisting of a GXSXG motif, HGGG motif (oxyanion hole) and HGF motif, typical of the class IV hormone-sensitive lipase family. On the basis of the sequence comparison with known esterases/lipases, rEst1 represents a new esterase belonging to the class IV family. The purified enzyme worked optimally at $50^{\circ}C$ and pH 8, utilized pNP esters of short chain lengths, and showed best catalytic activity with p-nitrophenyl butyrate ($C_4$), indicating that it was an esterase. The enzyme was completely inhibited by PMSF and DEPC and showed moderate organotolerance.
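
Two details in this abstract lend themselves to a short worked example: the ORF length arithmetic (1,029 bp giving 342 residues) and scanning for the GXSXG and HGGG motifs. The sketch assumes that the reported ORF length includes the stop codon, which the abstract does not state. The sequence `toy` is invented for illustration and is not the rEst1 sequence.

```python
import re


def orf_protein_length(orf_bp):
    """Residues encoded by an ORF whose length in bp includes the stop codon."""
    if orf_bp % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    return orf_bp // 3 - 1  # drop the stop codon


# X in GXSXG stands for any residue; the lookahead also finds overlapping hits.
GXSXG = re.compile(r"(?=(G.S.G))")


def find_motif(pattern, sequence):
    """Return (1-based position, matched text) for every hit of the pattern."""
    return [(m.start() + 1, m.group(1)) for m in pattern.finditer(sequence)]


print(orf_protein_length(1029))  # 342

toy = "MKHGGGFVLGDSAGGNLA"  # hypothetical fragment, for illustration only
print(find_motif(GXSXG, toy))  # [(10, 'GDSAG')]
print(toy.find("HGGG") + 1)    # 3 (1-based position of the HGGG motif)
```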

Cloning of a pore-forming subunit of ATP-sensitive potassium channel from Clonorchis sinensis

  • Hwang, Seung-Young;Han, Hye-Jin;Kim, So-Hee;Park, Sae-Gwang;Seog, Dae-Hyun;Kim, Na-Ri;Han, Jin;Chung, Joon-Yong;Kho, Weon-Gyu
    • Parasites, Hosts and Diseases / Vol. 41, No. 2 / pp.129-133 / 2003
  • A complete cDNA sequence encoding a pore-forming subunit (Kir6.2) of the ATP-sensitive potassium channel in the adult worm Clonorchis sinensis, termed CsKir6.2, was isolated from an adult cDNA library. The cDNA contained a single open reading frame of 333 amino acids, which has a structural motif (a GFG-motif) of the putative pore-forming loop of Kir6.2. Peculiarly, CsKir6.2 has a truncated structure in which 57 amino acids are deleted from its N-terminus. The predicted amino acid sequence was highly conserved relative to other known Kir6.2 subunits. The mRNA was weakly expressed in the adult worm.

Cloning and Characterization of the Putative Transferrin Receptor cDNA from the Olive Flounder (Paralichthys olivaceus)

  • Won, Kyoung-Mi;Park, Soo-Il
    • Fisheries and Aquatic Sciences / Vol. 6, No. 2 / pp.101-104 / 2003
  • A cDNA clone for the olive flounder (Paralichthys olivaceus) transferrin receptor (fTfR) was isolated from a leukocyte cDNA library. The fTfR gene consisted of 2,319 bp encoding 773 amino acid residues. The amino acid sequence alignment of the fTfR showed that its size and hydrophobic profile are similar to those of other TfRs. In addition, the Tyr-Thr-Arg-Phe (YTRF) motif, which is the recognition signal for high-efficiency endocytosis, is well conserved. This motif is important for the functional properties of TfR. The deduced amino acid sequence had $42.4-42.9\%$ identity with the previously reported TfRs of vertebrates. The fTfR was expressed in the blood, kidney, spleen, and liver of healthy olive flounder, as shown by Northern blot hybridization.
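
Percent identity figures such as the one quoted above are computed from a pairwise alignment. The sketch below uses one common convention, counting identical residues over the columns where neither sequence has a gap. The abstract does not say which convention or alignment tool the authors used, so this is an assumption, and the aligned strings are toy examples.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of identical residues over columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a, aligned_b):
        if x == gap or y == gap:
            continue  # skip gapped columns (assumed convention)
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


print(percent_identity("YTRF", "YTKF"))    # 75.0
print(percent_identity("YT-RF", "YTARF"))  # 100.0
```

Other conventions divide by the alignment length or by the shorter sequence length, so identities from different tools are not always directly comparable.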

Regulation of amyloid precursor protein processing by its KFERQ motif

  • Park, Ji-Seon;Kim, Dong-Hou;Yoon, Seung-Yong
    • BMB Reports / Vol. 49, No. 6 / pp.337-343 / 2016
  • Understanding the trafficking, processing, and degradation mechanisms of amyloid precursor protein (APP) is important because APP can be processed to produce β-amyloid (Aβ), a key pathogenic molecule in Alzheimer's disease (AD). Here, we found that APP contains a KFERQ motif at its C-terminus, a consensus sequence for chaperone-mediated autophagy (CMA) or microautophagy, which are other types of autophagy for degradation of pathogenic molecules in neurodegenerative diseases. Deletion of KFERQ in APP increased C-terminal fragments (CTFs) and secreted N-terminal fragments of APP and kept it away from lysosomes. KFERQ deletion did not abolish the interaction of APP or its cleaved products with heat shock cognate protein 70 (Hsc70), a protein necessary for CMA or microautophagy. These findings suggest that the KFERQ motif is important for normal processing and degradation of APP to preclude the accumulation of APP-CTFs, although it may not be important for CMA or microautophagy.

HPV-type Prediction System using SVM and Partial Sequential Pattern

  • 김진수
    • Journal of Digital Convergence / Vol. 12, No. 12 / pp.365-370 / 2014
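  • Existing systems generate patterns from whole sequences or from unaligned sequences, so the number of patterns grows exponentially and much time and cost are consumed. Rather than searching for patterns in the whole protein sequence, this paper proposes a system that uses multiple sequence alignment to generate partial sequence segments of proteins, generates sequential patterns for each segment, merges the generated patterns into an overall set of motif candidates, selects these as the SVM training set and trains on them, and finally predicts the HPV type of unknown or known protein sequences by applying the information learned by the SVM. At a minimum support of 30%, the proposed system showed improved accuracy and recall compared with existing systems.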
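
The segment-wise pattern generation described here can be sketched as follows. This is a simplified stand-in, not the paper's method: it swaps in contiguous k-mers for general sequential patterns, uses fixed-width segments of the alignment, and leaves out the SVM training step. The function name, parameters, and toy alignment are all hypothetical.

```python
from collections import Counter


def segment_patterns(aligned, seg_width, k, min_support):
    """For each alignment segment, keep k-mers present in >= min_support of sequences."""
    n_seqs = len(aligned)
    length = len(aligned[0])
    frequent_by_segment = {}
    for start in range(0, length, seg_width):
        counts = Counter()
        for seq in aligned:
            # Restrict to one segment and drop alignment gaps.
            segment = seq[start:start + seg_width].replace("-", "")
            # A set, so each sequence supports a pattern at most once.
            kmers = {segment[i:i + k] for i in range(len(segment) - k + 1)}
            counts.update(kmers)
        frequent = sorted(p for p, c in counts.items() if c / n_seqs >= min_support)
        if frequent:
            frequent_by_segment[start] = frequent
    return frequent_by_segment


# Toy alignment (hypothetical); "-" marks a gap.
aligned = ["MKVLAT", "MKVLGT", "MRVL-T"]
print(segment_patterns(aligned, seg_width=3, k=2, min_support=0.6))
# {0: ['KV', 'MK']}
```

Restricting the search to one segment at a time bounds the number of candidate patterns, which is the cost argument the abstract makes. The surviving patterns would then serve as candidate features for a classifier such as an SVM.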

Molecular Cloning and Nucleotide Sequence of Endo-Inulinase Gene from Xanthomonas oryzae #5

  • 김병우;김미랑;유동주
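    • Korean Society for Biotechnology and Bioengineering: Conference Proceedings / Korean Society for Biotechnology and Bioengineering, 2000 Fall Conference and Bio-Venture Fair / pp.655-659 / 2000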
  • From Xanthomonas oryzae #5, an endo-inulinase-producing strain isolated from soil, a transformant was isolated that carries a recombinant plasmid containing an 11.5 kb fragment with the endo-inulinase gene. Recombinant plasmids pDI 2 and pDI4, containing 8.6 kb and 4.1 kb fragments derived from the 11.5 kb fragment, were constructed, and both showed endo-inulinase activity. DNA sequencing with recombinant plasmid pDI 2 showed that the endo-inulinase gene has an ORF encoding 1,333 amino acids. The amino acid sequence showed high homology of about 72% with the CFTase of B. circulans MCI-2554, and amino acid comparison with other fructan hydrolases, inulinases, and levanases identified six similar regions, including a ${\beta}-fructofuranosidase$ motif.