• Title/Summary/Keyword: protein profile (단백질 프로파일)

Search Result 19, Processing Time 0.022 seconds

A Performance Comparison of Protein Profiles for the Prediction of Protein Secondary Structures (단백질 이차 구조 예측을 위한 단백질 프로파일의 성능 비교)

  • Chi, Sang-Mun
    • Journal of the Korea Institute of Information and Communication Engineering / v.22 no.1 / pp.26-32 / 2018
  • Protein secondary structure is important information for studying the evolution, structure, and function of proteins. Recently, deep learning methods have been actively applied to predict protein secondary structure from protein sequence information alone. In these methods, the most widely used input features are protein profiles derived from protein sequences. In this paper, to obtain effective protein profiles, we constructed profiles using protein sequence search methods such as PSI-BLAST and HHblits. We adjusted the similarity threshold that determines which homologous sequences are used to construct the profile, and the number of iterations of profile construction using the homologous sequence information. We used the protein profiles as inputs to convolutional neural networks and recurrent neural networks to predict secondary structures. The protein profile created by adding evolutionary information only once was effective.
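
The idea of a protein profile can be illustrated with a minimal sketch. It assumes the homologous sequences have already been found and aligned, and it computes plain per-column amino acid frequencies with a uniform pseudocount. PSI-BLAST and HHblits build their profiles with sequence weighting and more elaborate pseudocounts, so this is a simplification, and the function name `build_profile` and the toy alignment are hypothetical.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def build_profile(aligned_seqs, pseudocount=1.0):
    """Return one {amino acid: frequency} dict per alignment column.

    Simplified assumption: uniform pseudocounts and no sequence weighting.
    """
    if not aligned_seqs:
        raise ValueError("need at least one aligned sequence")
    length = len(aligned_seqs[0])
    if any(len(seq) != length for seq in aligned_seqs):
        raise ValueError("aligned sequences must have equal length")

    profile = []
    for col in range(length):
        # Count only standard residues; gaps ('-') and unknowns are skipped.
        counts = Counter(
            seq[col] for seq in aligned_seqs if seq[col] in AMINO_ACIDS
        )
        total = sum(counts.values()) + pseudocount * len(AMINO_ACIDS)
        profile.append(
            {aa: (counts[aa] + pseudocount) / total for aa in AMINO_ACIDS}
        )
    return profile


# Toy alignment (hypothetical data): query plus two homologs.
alignment = ["MKTA", "MKSA", "MR-A"]
profile = build_profile(alignment)
```

Each column of `profile` is a 20-dimensional vector, so a sequence of length L becomes an L x 20 feature matrix of the kind that could be fed to a convolutional or recurrent network.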

Application of Data Cube to Identify Differentially Expressed Proteins by Disease (질병 의존 단백질 도출을 위한 데이터 큐브의 응용)

  • 김단비;이원석
    • Proceedings of the Korean Information Science Society Conference / 2004.04b / pp.268-270 / 2004
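  • In proteomics research, which deals with the structural analysis of the protein profiles expressed in a given cell or tissue, identifying marker proteins for a disease is one of the key issues. The cells or tissues extracted from dozens of samples contain a large number of proteins, and database and data mining techniques are effective for analyzing how the expression levels of those proteins change with disease and how they are influenced by clinical characteristics. This paper proposes a way of applying an OLAP data cube to analyze changes in protein expression levels according to disease and clinical characteristics, proposes measures suited to the analysis of protein data, and demonstrates their validity.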
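
A data cube of this kind can be sketched in a few lines. The sketch below assumes each record carries a protein, a disease state, one clinical attribute, and an expression level, and it uses the mean expression level as the measure. The abstract does not specify the measures the paper proposes, so the mean, the field names, and the sample records are all hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def build_cube(records, dims, measure):
    """Aggregate the mean of `measure` over every subset of `dims`.

    A dimension that is rolled up is represented by the value "ALL".
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for rec in records:
        for k in range(len(dims) + 1):
            for kept in combinations(dims, k):
                cell = tuple(rec[d] if d in kept else "ALL" for d in dims)
                sums[cell] += rec[measure]
                counts[cell] += 1
    return {cell: sums[cell] / counts[cell] for cell in sums}


# Hypothetical expression records for one protein.
records = [
    {"protein": "P1", "disease": "tumor", "sex": "F", "level": 8.0},
    {"protein": "P1", "disease": "tumor", "sex": "M", "level": 6.0},
    {"protein": "P1", "disease": "normal", "sex": "F", "level": 2.0},
    {"protein": "P1", "disease": "normal", "sex": "M", "level": 4.0},
]
cube = build_cube(records, ("protein", "disease", "sex"), "level")

# Roll up over sex, then compare disease states for protein P1.
tumor_mean = cube[("P1", "tumor", "ALL")]
normal_mean = cube[("P1", "normal", "ALL")]
fold_change = tumor_mean / normal_mean
```

Precomputing every roll-up is what lets an analyst slice by disease alone, by a clinical attribute alone, or by both, without rescanning the raw records.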

Comparison of External Information Performance Predicting Subcellular Localization of Proteins (단백질의 세포내 위치를 예측하기 위한 외부정보의 성능 비교)

  • Chi, Sang-Mun
    • Journal of KIISE:Software and Applications / v.37 no.11 / pp.803-811 / 2010
  • Since protein subcellular location and biological function are highly correlated, predicting the subcellular localization of a protein can provide information about its function. To improve prediction performance, many studies actively exploit external information beyond the amino acid sequence. This paper compares the predictive power of amino acid sequence similarity, protein profiles, gene ontology, motifs, and textual information. In experiments on the PLOC dataset, whose proteins share less than 80% sequence similarity, sequence similarity and gene ontology were the most effective information, achieving a classification accuracy of 94.8%. In experiments on the BaCelLo IDS dataset, whose sequence similarity is below 30%, gene ontology gave the best prediction accuracies: 93.2% for animals and 86.6% for fungi.

Prediction of protein binding regions in RNA using random forest (Random forest를 이용한 RNA에서의 단백질 결합 영역 예측)

  • Choi, Daesik;Park, Byungkyu;Chae, Hanju;Lee, Wook;Han, Kyungsook
    • Proceedings of the Korea Information Processing Society Conference / 2016.10a / pp.583-586 / 2016
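  • As the volume of protein-RNA interaction data grows, many computational methods for predicting protein-RNA binding sites are being developed. However, most of these methods are limited to predicting the RNA-binding sites in proteins. This paper presents a method that uses the sequence information of both the RNA and the protein to predict the protein-binding sites in RNA, and discusses its results. The prediction model was built with WEKA random forest (http://www.cs.waikato.ac.nz/ml/weka/), using the sequence profile of the RNA sequence, the sequence composition, and properties of the partner protein as features. In cross validation, the random forest performed best with the 1:1 model, showing 92.4% sensitivity, 92.0% specificity, and 92.2% accuracy; in an independent test it showed 72.5% sensitivity, 90.0% specificity, and 2.1% accuracy.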
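
For reference, the three reported metrics follow from the confusion matrix of a binary classifier. The sketch below uses the standard definitions; the function name and the counts are hypothetical and are not taken from the paper.

```python
def binary_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)  # fraction of true binding sites recovered
    specificity = tn / (tn + fp)  # fraction of non-binding sites rejected
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy


# Hypothetical counts for illustration only.
sensitivity, specificity, accuracy = binary_metrics(tp=90, fn=10, tn=80, fp=20)
```

Because accuracy is an average of sensitivity and specificity weighted by the number of positive and negative examples, it always lies between the two.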

Automated Method of Landmark Extraction for Protein 2DE Images based on Multi-dimensional Clustering (다차원 클러스터링 기반의 단백질 2DE 이미지에서의 자동화된 기준점 추출 방법)

  • Shim, Jung-Eun;Lee, Won-Suk
    • The KIPS Transactions:PartD / v.12D no.5 s.101 / pp.719-728 / 2005
  • Two-dimensional electrophoresis (2DE) is a separation technique for identifying the proteins contained in a sample. However, the resulting gel image is very sensitive to the experimental conditions as well as to the quality of scanning. To compensate for the variation of spots in a particular image, a user must manually annotate landmark spots on each gel image so that the spots of different images can be analyzed together. This operation is tedious and error-prone. This thesis develops an automated method of extracting the landmark spots of an image based on a landmark profile. The landmark profile is created by clustering the previously identified landmarks of sample images of the same type, and it contains the properties of the clusters identified for each landmark. When the landmarks of a new image need to be found, all the candidate spots of each landmark are first identified by examining the properties of its clusters. Subsequently, all the landmark spots of the new image are found collectively by the well-known optimization algorithm $A^*$. The performance of this method is illustrated by experiments on real 2DE images of mouse brain tissue.

Inter-Process Synchronization by Large Scaled File (대용량 파일에 의한 프로세스간의 동기화)

  • 하성진;황선태;정갑주;이지수
    • Proceedings of the Korean Information Science Society Conference / 2002.10c / pp.322-324 / 2002
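  • Recently, GRID, which makes geographically distributed computing resources usable from anywhere, has attracted much attention. In particular, fields that demand very large amounts of computation, such as protein molecular simulation and high-energy physics, can obtain computing resources through GRID. To make good use of the computing power that GRID provides, the applications used in each field could be parallelized; however, it is also very important to use existing packages whose computational methods and results have already been validated, so another approach is to fill the GRID by spawning a very large number of serial or locally parallel processes from such packages. In general, such a package reads a parameter file at startup and writes its results to a very large file. This paper presents a solution to the problems that arise when processes must synchronize and communicate through large files. Because synchronization and communication must be handled together, the Linda concept is adopted; since conventional Linda cannot readily handle large files inside the Tuple Space, a solution to this limitation is proposed.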
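
The Linda coordination model mentioned above can be sketched as a small in-process tuple space with the classic `out`, `rd`, and `in` operations. This is an illustrative sketch only: the class, the `None` wildcard convention, and the idea of placing just a file name in the tuple are assumptions, not the design proposed in the paper, which the abstract does not describe.

```python
import threading


class TupleSpace:
    """Minimal in-process Linda-style tuple space (illustrative only)."""

    def __init__(self):
        self._tuples = []
        self._cond = threading.Condition()

    def out(self, tup):
        """Add a tuple and wake any waiting readers."""
        with self._cond:
            self._tuples.append(tuple(tup))
            self._cond.notify_all()

    def _find(self, pattern):
        # A pattern field of None matches any value (assumed convention).
        for tup in self._tuples:
            if len(tup) == len(pattern) and all(
                p is None or p == v for p, v in zip(pattern, tup)
            ):
                return tup
        return None

    def rd(self, pattern, timeout=None):
        """Block until a matching tuple exists, then return it without removing it."""
        with self._cond:
            found = self._cond.wait_for(
                lambda: self._find(pattern) is not None, timeout
            )
            if not found:
                raise TimeoutError("no matching tuple")
            return self._find(pattern)

    def in_(self, pattern, timeout=None):
        """Block until a matching tuple exists, then remove and return it."""
        with self._cond:
            found = self._cond.wait_for(
                lambda: self._find(pattern) is not None, timeout
            )
            if not found:
                raise TimeoutError("no matching tuple")
            tup = self._find(pattern)
            self._tuples.remove(tup)
            return tup


space = TupleSpace()


def worker():
    # Hypothetical job: announce completion and name the (large) output file.
    space.out(("done", "job-1", "job-1.out"))


thread = threading.Thread(target=worker)
thread.start()
result = space.in_(("done", "job-1", None), timeout=5)
thread.join()
```

A single blocking `in_` call both waits for the producer (synchronization) and delivers its message (communication), which is why one abstraction can cover both needs.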

Prediction of Protein Secondary Structure Using the Weighted Combination of Homology Information of Protein Sequences (단백질 서열의 상동 관계를 가중 조합한 단백질 이차 구조 예측)

  • Chi, Sang-mun
    • Journal of the Korea Institute of Information and Communication Engineering / v.20 no.9 / pp.1816-1821 / 2016
  • Protein secondary structure is important for studying the evolution, structure, and function of proteins, which play crucial roles in most biological processes. This paper tries to extract secondary structure information effectively from a large protein structure database in order to predict the secondary structure of a query protein sequence. To find more remote homologs of the query sequence in the protein database, we used PSI-BLAST, which performs gapped iterative searches and uses profiles built from the homologous sequences of the query protein. The secondary structures of the homologous sequences are combined into the prediction with weights that reflect their relative degree of similarity to the query sequence. When homologous sequences were used together with a neural network predictor, the accuracies were higher than those of current state-of-the-art techniques, achieving a Q3 accuracy of 92.28% and a Q8 accuracy of 88.79%.
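
One way to picture the weighted combination is per-residue weighted voting over the homologs' known secondary structures, scored with Q3 (the fraction of residues whose three-state label, H, E, or C, is predicted correctly). The sketch below is an assumption-laden illustration: the weights, the fallback to coil when no homolog covers a position, and the toy data are hypothetical, and the neural network predictor used in the paper is not shown.

```python
from collections import defaultdict


def combine_structures(homologs, length):
    """Weighted per-position vote over homolog secondary structures.

    homologs: list of (weight, structure string aligned to the query);
    '-' marks positions the homolog does not cover.
    """
    prediction = []
    for pos in range(length):
        votes = defaultdict(float)
        for weight, structure in homologs:
            if structure[pos] != "-":
                votes[structure[pos]] += weight
        # Assumed fallback: predict coil when no homolog covers the position.
        prediction.append(max(votes, key=votes.get) if votes else "C")
    return "".join(prediction)


def q3_accuracy(predicted, observed):
    """Fraction of residues whose three-state label matches."""
    if len(predicted) != len(observed):
        raise ValueError("sequences must have equal length")
    matches = sum(p == o for p, o in zip(predicted, observed))
    return matches / len(observed)


# Hypothetical homologs: (similarity-based weight, aligned structure).
homologs = [(0.9, "HHHEC"), (0.5, "HHEEC"), (0.3, "-CEEC")]
predicted = combine_structures(homologs, length=5)
score = q3_accuracy(predicted, "HHEEC")
```

In this toy case the most similar homolog outvotes the other two at the third residue, which shows how the weighting lets closer homologs dominate the prediction.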

Comparative untargeted metabolomic analysis of Korean soybean four varieties (Glycine max (L.) Merr.) based on liquid chromatography mass spectrometry (국내콩 4품종의 LC-MS 기반 비표적대사체 비교평가)

  • Eun-Ha Kim;Soo-Yun Park;Sang-Gu Lee;Hyoun-Min Park;Oh Suk Yu;Yun-Young Kang;Myeong Ji Kim;Jung-Won Jung;Seon-Woo Oh
    • Journal of Applied Biological Chemistry / v.65 no.4 / pp.439-446 / 2022
  • Soybean is a crop with high-quality protein and oil, and it is one of the most widely used genetically modified (GM) crops in the world today. In South Korea, Kwangan is the variety most often used as a parental line for GM soybean development. In this study, untargeted LC-MS metabolomic approaches were used to compare the metabolite profiles of Kwangan and three other commercial varieties cultivated in Gunwi and Jeonju in 2020. The metabolomic analysis showed that the four soybean varieties were distinct on the partial least squares-discriminant analysis (PLS-DA) score plots; 18 metabolites contributed to the distinction between varieties, including phenylalanine, isoflavones, and fatty acids. All varieties were also clearly differentiated by location on the PLS-DA score plot, indicating that the growing environment contributes to metabolite variability as well. In particular, isoflavone levels in Kwangan were significantly lower, and linolenic acid levels significantly higher, than those of the other three varieties. The authors suggest that, given the metabolic characteristics of Kwangan, more diverse conventional varieties may need to be included as comparators when assessing the substantial equivalence of biogenetically engineered soybeans in a Kwangan-variety background.

Effects of Sasa quelpaertensis Extract on mRNA and microRNA Profiles of SNU-16 Human Gastric Cancer Cells (SNU-16 위암 세포의 mRNA 및 miRNA 프로파일에 미치는 제주조릿대 추출물의 영향)

  • Jang, Mi Gyeong;Ko, Hee Chul;Kim, Se-Jae
    • Journal of Life Science / v.30 no.6 / pp.501-512 / 2020
  • Sasa quelpaertensis Nakai leaf has been used as a folk medicine for the treatment of gastric ulcer, dipsosis, and hematemesis based on its anti-inflammatory, antipyretic, and diuretic characteristics. We have previously reported the procedure for deriving a phytochemical-rich extract (PRE) from S. quelpaertensis and how PRE and its ethyl acetate fraction (EPRE) exhibit an anticancer effect by inducing apoptosis in various gastric cancer cells. To explore the molecular targets involved in this apoptosis, we investigated the mRNA and microRNA profiles of EPRE-treated SNU-16 human gastric cancer cells. In total, 2,875 differentially expressed genes were identified by RNA sequencing, and gene ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) analyses indicated that the EPRE-modulated genes are associated with apoptosis, mitogen-activated protein kinase, inflammatory response, tumor necrosis factor signaling, and cancer pathways. Subsequently, protein-protein interaction network analysis confirmed interactions among genes associated with cell death and apoptosis, and 27 differentially expressed microRNAs were identified by further sequencing. Here, GO and KEGG pathway analyses revealed that EPRE modified the expression of microRNAs associated with the cell cycle and cell death, as well as with signaling of tropomyosin-receptor-kinase receptor, transforming growth factor-β, nuclear factor κB, and cancer pathways. Taken together, these results provide insight into the mechanisms underlying the anticancer effect of EPRE.

Biological Roles of the Glycan in the Investigation of the Novel Disease Diagnosis and Treatment Methods (신개념 질병 진단 및 치료 연구에 있어서의 당사슬의 생물학적 역할)

  • Kim, Dong-Chan
    • Journal of Life Science / v.28 no.11 / pp.1379-1385 / 2018
  • Glycans are attached to proteins, as in glycoproteins and proteoglycans, and are found on the exterior surface of cells. O- and N-linked glycans are very common in eukaryotic cells but may also be found in prokaryotes. The interaction of cell surface glycans with complementary glycan-binding proteins located on neighboring cells, other cell types, or pathogens such as viruses and bacteria is crucial in biologically and biomedically important processes such as pathogen recognition, cell migration, cell-cell adhesion, development, and infection. Their involvement in pathological conditions suggests an important role for glycans as disease markers. In addition, a great amount of research has shown that appropriate glycosylation of a recombinant therapeutic protein is critical for product solubility, stability, pharmacokinetics and pharmacodynamics, bioactivity, and safety. Moreover, cancer-associated glycosylation changes often involve sialic acid in glycan branches, which plays important roles in cell-cell interaction, recognition, and immunological response. This review aims to give a comprehensive overview of the biological functions of glycans and to describe the relationships among glycosylation, disease diagnosis, and treatment methods. It also covers the high-throughput analytical methods available for measuring changes in glycan profiles in blood serum, as well as the possible underlying biochemical mechanisms.