• Title/Abstract/Keywords: ORF prediction

Search results: 12 items (processing time 0.022 s)

ORF Miner: a Web-based ORF Search Tool

  • Park, Sin-Gi;Kim, Ki-Bong
    • Genomics & Informatics / Vol. 7, No. 4 / pp. 217-219 / 2009
  • The primary clue for locating protein-coding regions is the open reading frame (ORF), and determining ORFs is the first step toward gene prediction, especially for prokaryotes. For this purpose, we have developed a web-based ORF search tool called ORF Miner. ORF Miner is a graphical analysis utility that determines all possible open reading frames of a selectable minimum size in an input sequence. The tool identifies all open reading frames using alternative genetic codes as well as the standard one, and reports a list of ORFs with the corresponding deduced amino acid sequences. ORF Miner can be employed for sequence annotation and provides a crucial clue for determining the actual protein-coding regions.
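
The kind of search this abstract describes (all ORFs above a selectable minimum size, in six frames, under a chosen genetic code) can be sketched as follows. This is a minimal illustration, not ORF Miner's actual implementation: the function names, the start-to-stop ORF definition, and the coordinate convention (0-based, end-exclusive, relative to the scanned strand) are all assumptions made for the example.

```python
from typing import Dict, List, Tuple

BASES = "TCAG"
# Standard genetic code (NCBI translation table 1), codons ordered TTT, TTC, TTA, TTG, TCT, ...
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
STANDARD_CODE: Dict[str, str] = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_orfs(
    seq: str,
    min_nt: int = 30,
    code: Dict[str, str] = STANDARD_CODE,
    starts: Tuple[str, ...] = ("ATG",),
) -> List[Tuple[str, int, int, int, str]]:
    """Scan all six frames and return (strand, frame, start, end, protein).

    An ORF here runs from the first start codon in a frame to the next
    in-frame stop codon (stop included in the nucleotide length).
    Coordinates refer to the strand being scanned.
    """
    seq = seq.upper()
    results = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            orf_start = None
            for pos in range(frame, len(s) - 2, 3):
                codon = s[pos:pos + 3]
                if orf_start is None:
                    if codon in starts:
                        orf_start = pos
                elif code.get(codon, "X") == "*":
                    end = pos + 3
                    if end - orf_start >= min_nt:
                        protein = "".join(
                            code.get(s[i:i + 3], "X")
                            for i in range(orf_start, pos, 3)
                        )
                        results.append((strand, frame + 1, orf_start, end, protein))
                    orf_start = None
    return results


# Alternative code example: NCBI table 4 reads TGA as Trp instead of stop.
TABLE4_CODE = dict(STANDARD_CODE, TGA="W")
```

For instance, `find_orfs("AAATGGCCTGGTGATT", min_nt=9)` returns `[('+', 3, 2, 14, 'MAW')]`, while the same call with `code=TABLE4_CODE` returns an empty list because TGA no longer terminates the frame.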

Helicobacter pylori cag Pathogenicity Island cagL and orf17 Genotypes Predict Risk of Peptic Ulcerations but not Gastric Cancer in Iran

  • Raei, Negin;Latifi-Navid, Saeid;Zahri, Saber
    • Asian Pacific Journal of Cancer Prevention / Vol. 16, No. 15 / pp. 6645-6650 / 2015
  • Background: Gastric cancer (GC) is the third leading cause of cancer mortality worldwide. The cag pathogenicity island (PAI) of Helicobacter pylori, which contains genes associated with a more aggressive phenotype, may be involved in the pathogenesis of gastrointestinal disease. We aimed to examine the associations of the cagH, cagL, orf17, and cagG genotypes of the H. pylori cag PAI with severe gastrointestinal disease. Materials and Methods: A total of 242 H. pylori strains were genotyped. Histopathological examination and classification of subjects were performed. Results: The frequencies of the cagH, cagL, cagG, and orf17 genotypes were 40/54 (74.1%), 53/54 (98.1%), 38/54 (70.4%), and 43/54 (79.6%), respectively, in patients with peptic ulceration (PU), while in the control group the frequencies were 87/147 (59.6%) for cagH, 121/146 (82.9%) for cagL, 109/146 (74.7%) for cagG, and 89/146 (61.0%) for orf17. Simple logistic regression analysis showed that the cagL and orf17 genotypes were significantly associated with an increased risk of PU but not GC; the ORs (95% CI) were 10.950 (1.446-82.935) and 2.504 (1.193-5.253), respectively. No significant association was found between the cagH and cagG genotypes and the risk of either PU or GC in Iran (P>0.05). Finally, multiple logistic regression analysis showed that the cagL genotype was independently and significantly associated with the age- and sex-adjusted risk of PU; the OR (95% CI) was 9.557 (1.219-17.185). Conclusions: We conclude that the orf17 and especially the cagL genotypes of the H. pylori cag PAI could be factors for risk prediction of PU, but not GC, in Iran.
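
The unadjusted odds ratios quoted above can be checked directly from the reported genotype counts. The abstract does not say how the confidence intervals were computed; the sketch below assumes a Wald interval on the log odds ratio (with z = 1.96), which for a single binary predictor is what simple logistic regression yields. The function name and argument layout are illustrative.

```python
import math
from typing import Tuple


def odds_ratio_ci(
    case_pos: int, case_total: int, ctrl_pos: int, ctrl_total: int, z: float = 1.96
) -> Tuple[float, float, float]:
    """Odds ratio and Wald confidence interval from a 2x2 table.

    case_pos / case_total: genotype-positive and total subjects among cases.
    ctrl_pos / ctrl_total: genotype-positive and total subjects among controls.
    """
    a, b = case_pos, case_total - case_pos      # cases: positive, negative
    c, d = ctrl_pos, ctrl_total - ctrl_pos      # controls: positive, negative
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Counts taken from the abstract: PU patients vs. controls.
print("cagL :", odds_ratio_ci(53, 54, 121, 146))
print("orf17:", odds_ratio_ci(43, 54, 89, 146))
```

Under these assumptions the cagL counts give an odds ratio of about 10.95 and the orf17 counts about 2.50, with intervals that closely match the reported ones. The very wide cagL interval reflects the single cagL-negative PU patient.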

Prediction of ORFs in Metagenome by Using Cis-acting Transcriptional and Translational Factors

  • 정대은;김근중
    • KSBB Journal / Vol. 25, No. 5 / pp. 490-496 / 2010
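  • Approximately $5 \times 10^{30}$ microbial cells exist on Earth, and they contain 350-550 Pg (1 Pg = $10^{15}$ g) of carbon, 85-130 Pg of nitrogen, and 9-14 Pg of phosphorus, a larger pool of these elements than that held by any other group of organisms. The relationships between these microbes and the other organisms and inorganic matter that make up ecosystems are continually being elucidated. Because the basic goal of such studies is to identify the factors important for these interactions (typically genes), searching for and confirming true ORFs on the chromosome is the most important basic tool. However, for environmental genomes composed of diverse microbes, the fraction that can be found using existing information cannot be inferred accurately, which causes many difficulties. Searching data with such unclear boundaries requires more information (more gene sequences for training or for defining the search space), and additional search methods and techniques need to be developed. As an alternative, a search method based on the conservation of transcriptional and translational factors present in microbial intergenic sequences could have a wide range of application, depending on how it is refined. Even at its current level, it has sufficient value in combined searches, that is, when used together with existing methods or to complement them. This assessment was confirmed by results ranging from no agreement at all with conventional ORF-centered discovery to more than 90% agreement, because many of the non-matching cases contain novel ORFs that cannot be found by BLAST searches.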
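
As a loose illustration of searching by conserved cis-acting signals rather than by ORF content alone, the sketch below flags candidate start codons that have a Shine-Dalgarno-like ribosome-binding motif a short distance upstream. The abstract does not specify which transcriptional or translational elements the authors used, so the motif set, the spacing window, and the function name are all assumptions chosen for the example.

```python
import re
from typing import List, Tuple

# Assumed Shine-Dalgarno-like cores, longest first (not taken from the paper).
SD_CORE = re.compile(r"AGGAGG|GGAGG|AGGAG|GAGG|AGGA")


def starts_with_rbs(
    seq: str,
    starts: Tuple[str, ...] = ("ATG", "GTG", "TTG"),
    window: Tuple[int, int] = (5, 20),
) -> List[Tuple[int, int, str]]:
    """Return (start_pos, motif_pos, motif) for start codons with an upstream motif.

    window = (near, far): the motif must begin within the slice that spans
    from `far` to `near` nucleotides upstream of the start codon (assumed spacing).
    Positions are 0-based on the given strand.
    """
    seq = seq.upper()
    near, far = window
    hits = []
    for pos in range(len(seq) - 2):
        if seq[pos:pos + 3] not in starts:
            continue
        lo = max(0, pos - far)
        hi = max(0, pos - near)
        match = SD_CORE.search(seq[lo:hi])
        if match:
            hits.append((pos, lo + match.start(), match.group()))
    return hits
```

Candidates found this way could then be intersected with, or used to complement, the output of a conventional ORF finder, which is the combined use the abstract describes.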

De-novo Hybrid Protein Design for Biodegradation of Organophosphate Pesticides

  • Awasthi, Garima;Yadav, Ruchi;Srivastava, Prachi
    • Microbiology and Biotechnology Letters (한국미생물·생명공학회지) / Vol. 47, No. 2 / pp. 278-288 / 2019
  • In the present investigation, we attempted to design a protocol for developing a hybrid protein with better bioremediation capacity. Using in silico approaches, a Hybrid Open Reading Frame (Hybrid ORF) was developed targeting the genes of microorganisms known to degrade organophosphates. Out of 21 genes identified through BLAST search, 8 structurally similar genes (opdA, opd, opaA, pte RO, pdeA, parC, mpd and phnE) involved in biodegradation were screened. Gene conservation analysis categorized these 8 organophosphate-degrading genes into 4 superfamilies, i.e., the Metallo-dependent hydrolases, Lactamase B, MPP, and TM_PBP2 superfamilies. The hybrid protein structure was modeled using multi-template homology modeling (3S07_A; 99%, 1P9E_A; 98%, 2ZO9_B; 33%, 2DXL_A; 33%) with the Schrödinger software suite version 10.4.018. Structural verification of the protein models was done using a Ramachandran plot, which showed 96.0% of residues in the favored region and was verified using RAMPAGE. The phosphotriesterase protein showed the highest structural similarity to the hybrid protein, with a raw score of 984. Five binding sites of the hybrid protein were identified through binding site prediction. The docking study shows that the hybrid protein potentially interacts with 10 different organophosphates. The results indicate that the designed hybrid protein is capable of degrading a wide range of organophosphate compounds.

Identification of Prostate Cancer LncRNAs by RNA-Seq

  • Hu, Cheng-Cheng;Gan, Ping;Zhang, Rui-Ying;Xue, Jin-Xia;Ran, Long-Ke
    • Asian Pacific Journal of Cancer Prevention / Vol. 15, No. 21 / pp. 9439-9444 / 2014
  • Purpose: To identify prostate cancer lncRNAs using a pipeline proposed in this study, which is applicable to the identification of lncRNAs that are differentially expressed in prostate cancer tissues but have negligible potential to encode proteins. Materials and Methods: We used two publicly available RNA-Seq datasets from normal prostate tissue and prostate cancer. Putative lncRNAs were predicted using the biological technology, then prostate cancer specific lncRNAs were found by differential expression analysis, and a co-expression network was constructed by weighted gene co-expression network analysis. Results: A total of 1,080 lncRNA transcripts were obtained from the RNA-Seq datasets. Three genes (PCA3, C20orf166-AS1 and RP11-267A15.1) showed significant differential expression in the prostate cancer tissues and were thus identified as prostate cancer specific lncRNAs. The brown and black modules had significant negative and positive correlations with prostate cancer, respectively. Conclusions: The pipeline proposed in this study is useful for the prediction of prostate cancer specific lncRNAs. Three genes (PCA3, C20orf166-AS1, and RP11-267A15.1) were identified as significantly differentially expressed in prostate cancer tissues. However, no published studies have demonstrated the specificity of RP11-267A15.1 in prostate cancer tissues. Thus, the results of this study can provide new theoretical insight into the identification of prostate cancer specific genes.

Gene Prediction Using Phylogenomics and COG

  • 신창진;강병철;박준형;신동훈;김철민
    • Korean Institute of Intelligent Systems: Conference Proceedings / Proceedings of the Korean Fuzzy Logic and Intelligent Systems Society 2004 Spring Conference, Vol. 14, No. 1 / pp. 255-258 / 2004
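  • To overcome the limitations of similarity searching and comparative genomics in gene function prediction, this study applied COG and phylogenetic methods to nine human herpesviruses with the aim of improving gene function prediction. Using the COG method, 114 HCOGs (Human Herpesvirus COGs) were constructed, and a gene content tree was built from the HCOGs. The tree showed that each HCOG belongs to one of the $\alpha$-specific, $\beta$-specific, or $\alpha$, $\beta$, $\gamma$-specific groups. To apply phylogenomics, a species tree was built using DNA polymerase, one of the ORFs in the $\alpha$, $\beta$, $\gamma$-specific group. Using the SDI (Speciation and Duplication) algorithm, 47 duplication points were predicted among 148 glycoproteins, and 7 ORFs that had been excluded from the initial HCOG construction were redefined as 5 glycoprotein-related HCOGs. This study confirmed that COG is an effective method for clustering ortholog groups and that comparative genomics can be used to complement it further. It also showed that gene function prediction can be improved by combining comparative genomic and phylogenomic methods.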

Improved Closure Approximation for Numerical Simulation of Fiber Orientation in Fiber-Reinforced Composite

  • D.H. Chung;T.H. Kwon
    • The Korean Journal of Rheology (유변학) / Vol. 10, No. 4 / pp. 202-216 / 1998
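  • ORW, an improved form of the existing 'orthotropic' closure approximations, was obtained numerically using new flow data. The existing orthotropic closure approximations ORF and ORL exhibit non-physical oscillations, particularly in shear flow when the interaction coefficient $C_I$ < 0.001. They also show non-physical oscillations in non-homogeneous flows such as the center-gated disk, and they underpredict the orientation state compared with the 'Distribution Function Calculation'. These phenomena were found to stem from the flow data used in the least-squares optimization. ORW, which was optimized using homogeneous flow data with small interaction coefficients, showed no non-physical oscillations and agreed well qualitatively in both homogeneous and non-homogeneous flows. The choice of the function used in the optimization had little effect on improving the closure approximation. Considering all eigenvalues of the orientation tensor could improve the approximation quantitatively, but choosing the functional form is an important and difficult problem. For comparison, numerical simulations of the injection-molding filling process, including coupling effects and in-plane velocity gradients, were carried out for a film-gated strip and a center-gated disk using ORW and several other closure approximations. ORW predicted results that were quantitatively close to the 'Distribution Function Calculation', but slight differences were found when compared with actual experimental data. Therefore, predicting fiber orientation more accurately requires changes to the modeling of the term that represents fiber-fiber interactions.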

A data management system for microbial genome projects

  • Ki-Bong Kim;Hyeweon Nam;Hwajung Seo;Kiejung Park
    • Korean Society for Bioinformatics: Conference Proceedings / Korean Society for Bioinformatics and Systems Biology, 2000 International Symposium on Bioinformatics / pp. 83-85 / 2000
  • Many microbial genome sequencing projects have been under way in genome centers around the world since the first genome, that of Haemophilus influenzae, was sequenced in 1995. The deluge of microbial genome sequence data demands a new and highly automated data flow system so that genome researchers can manage and analyze their own bulky sequence data from low level to high level. To this end, we developed an automatic data management system for microbial genome projects, which consists mainly of a local database, analysis programs, and a user-friendly interface. We designed and implemented the local database for large-scale sequencing projects; it makes systematic and consistent data management and retrieval possible and is tightly coupled with the analysis programs and the web-based user interface. That is, the results of the analysis programs can be parsed and stored in the local database, and users can retrieve the data at any stage of data processing through the web-based graphical user interface. Contig assembly, homology search, and ORF prediction, which are essential in genome projects, make up the analysis programs in our system. All but the contig assembly program are in the public domain. These programs are connected with each other by means of many utility programs. As a result, this system will maximize cost and time efficiency in genome research.

A genome-wide association study (GWAS) for pH value in the meat of Berkshire pigs

  • Park, Jun;Lee, Sang-Min;Park, Ja-Yeon;Na, Chong-Sam
    • Journal of Animal Science and Technology / Vol. 63, No. 1 / pp. 25-35 / 2021
  • The purpose of this study is to estimate single nucleotide polymorphism (SNP) effects for the pH values that affect Berkshire meat quality. A total of 39,603 SNPs from 1,978 pigs after quality control and 882 pH values were used to estimate SNP effects by the single-step genomic best linear unbiased prediction (ssGBLUP) method. The average physical distance between adjacent SNP pairs was 61.7 kbp, and the number and proportion of SNPs whose minor allele frequency was below 10% were 9,573 and 24.2%, respectively. The averages of observed heterozygosity and polymorphic information content were 0.32 ± 0.16 and 0.26 ± 0.11, respectively, and the estimate of average linkage disequilibrium was 0.40. The heritabilities of pH45m and pH24h were 0.10 and 0.15, respectively. SNPs with an absolute value more than 4 standard deviations from the mean were selected as threshold markers. Among the selected SNPs, protein-coding genes for pH45m and pH24h were detected in 6 and 4 SNPs, respectively. The distribution of coding genes were detected at pH45m and were detected at pH24h.
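
The marker-selection rule in this abstract (SNPs whose effect lies more than 4 standard deviations from the mean) can be sketched as below. The abstract does not state whether a sample or population standard deviation was used, nor how effects were stored; the population standard deviation, the dictionary layout, and the function name are assumptions for illustration.

```python
from statistics import mean, pstdev
from typing import Dict, List


def select_threshold_snps(effects: Dict[str, float], n_sd: float = 4.0) -> List[str]:
    """Return IDs of SNPs whose effect deviates from the mean by more than n_sd SDs.

    effects maps a SNP ID to its estimated effect (e.g. from ssGBLUP).
    Uses the population standard deviation (an assumption).
    """
    values = list(effects.values())
    mu = mean(values)
    sd = pstdev(values)
    return [snp for snp, e in effects.items() if abs(e - mu) > n_sd * sd]


# Toy data: 99 SNPs with zero effect and one clear outlier (hypothetical values).
toy_effects = {f"snp{i}": 0.0 for i in range(99)}
toy_effects["snp_big"] = 1.0
print(select_threshold_snps(toy_effects))
```

With a population standard deviation, a single value among n observations can deviate by at most sqrt(n - 1) standard deviations, so a 4 SD cutoff is only meaningful for panels with many markers, as is the case for the 39,603 SNPs reported here.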

Three Non-Aspartate Amino Acid Mutations in the ComA Response Regulator Receiver Motif Severely Decrease Surfactin Production, Competence Development, and Spore Formation in Bacillus subtilis

  • Wang, Xiaoyu;Luo, Chuping;Liu, Youzhou;Nie, Yafeng;Liu, Yongfeng;Zhang, Rongsheng;Chen, Zhiyi
    • Journal of Microbiology and Biotechnology / Vol. 20, No. 2 / pp. 301-310 / 2010
  • Bacillus subtilis strains produce a broad spectrum of bioactive peptides. The lipopeptide surfactin belongs to one well-known class, which includes amphiphilic membrane-active biosurfactants and peptide antibiotics. Both the srfA promoter and the ComP-ComA signal transduction system are important factors in the production of surfactin. Bs-M49, obtained by low-energy ion implantation of wild-type Bs-916, produced significantly lower levels of surfactin and had no obvious effect against R. solani. We also found that strain Bs-M49 showed decreased spore formation and competence development. BLAST comparison of the sequences from Bs-916 and M49 indicates that there is no difference in the srfA operon promoter PsrfA, but there are differences in the coding sequence of the comA gene. These differences result in three missense mutations within the M49 ComA protein. RT-PCR analysis showed that the expression levels of selected genes involved in competence and sporulation differed significantly between the wild-type Bs-916 and mutant M49 strains. When we integrated the comA ORF into the chromosome of M49 at the amyE locus, hemolytic and antifungal activity were restored in M49. HPLC analysis also showed that the comA-complemented strain had an ability to produce surfactin similar to that of wild-type strain Bs-916. These data suggest that the mutation of three key amino acids in ComA greatly affected the biological activity of Bacillus subtilis. ComA protein 3D structure prediction and motif search prediction indicated that ComA has two obvious motifs common to response regulator proteins: the N-terminal response regulator receiver motif and the C-terminal helix-turn-helix motif. The three residues in the N-terminal portion of ComA may be involved in the phosphorylation activation mechanism. These structural prediction results suggest that the three mutated residues in the ComA protein may play an important role in forming a salt bridge to the phosphoryl group, which keeps the protein in an active conformation for subsequent regulation of the expression of downstream genes.