• Title/Summary/Keyword: Bioinformatics data

TEST DB: The intelligent data management system for Toxicogenomics

  • Lee, Wan-Seon;Jeon, Ki-Seon;Um, Chan-Hwi;Hwang, Seung-Young;Jung, Jin-Wook;Kim, Seung-Jun;Kang, Kyung-Sun;Park, Joon-Suk;Hwang, Jae-Woong;Kang, Jong-Soo;Lee, Gyoung-Jae;Chon, Kum-Jin;Kim, Yang-Suk
    • Proceedings of the Korean Society for Bioinformatics Conference / 2003.10a / pp.66-72 / 2003
  • Toxicogenomics is emerging as one of the most important applications of genomics, because toxicity testing based on gene expression profiles is expected to be more precise and efficient than the current histopathological approach in the pre-clinical phase. One of the challenges in toxicogenomics is constructing an intelligent database management system that can handle very heterogeneous and complex data from many different experimental and information sources. Here we present a new toxicogenomics database developed as part of the 'Toxicogenomics for Efficient Safety Test (TEST)' project. The TEST database focuses on connecting heterogeneous data and on an intelligent query system that helps users gain insight from complex data sets. The database handles four kinds of information: compound information, histopathological information, gene expression information, and annotation information. Currently, the TEST database holds toxicogenomics information for 12 molecules in 4 efficacy classes: anticancer, antibiotic, hypotension, and gastric ulcer. Users can easily access detailed information about these compounds and, at the same time, check the confidence of the retrieved information by browsing the quality of the experimental data and the toxicity grade of each gene generated by our toxicology annotation system. The intelligent query system is designed for multiple comparisons of experimental data, because comparing experimental data by histopathological toxicity, compound, efficacy, and individual variation is crucial for finding common genetic characteristics. The system can be a good information source for studying toxicology mechanisms at the genome-wide level and can also be used for the design of toxicity test chips.

A Hybrid Efficient Feature Selection Model for High Dimensional Data Set based on KNHNAES (2013~2015)

  • Kwon, Tae il;Li, Dingkun;Park, Hyun Woo;Ryu, Kwang Sun;Kim, Eui Tak;Piao, Minghao
    • Journal of Digital Contents Society / v.19 no.4 / pp.739-747 / 2018
  • With large feature-space data, feature selection has become an extremely important step in the data mining process. However, traditional single-stage feature selection methods may no longer be adequate for this step. In this paper, we propose a hybrid, efficient feature selection model for high-dimensional data. We applied our model to the KNHNAES data set; the results show that it outperforms many existing methods, improving accuracy by at least 5%.

Design of Efficient Storage Exploiting Structural Similarity in Microarray Data

  • Yun, Jong-Han;Shin, Dong-Kyu;Shin, Dong-Il
    • The KIPS Transactions:PartD / v.16D no.5 / pp.643-650 / 2009
  • As one of the typical techniques for acquiring biological information, the microarray has contributed greatly to the development of bioinformatics. Although it is established as a core technology in bioinformatics, microarray data are difficult to share and store because experimental data are large and structurally complex. In this paper, we propose a new method that exploits the fact that microarray data in MAGE-ML, a standard data exchange format, frequently contain structurally similar patterns. The method constructs a compact database by simplifying the MAGE-ML schema, using inlining techniques together with newly proposed classification techniques based on the structural similarity of elements. As a result, the database structure becomes simpler, the number of table joins is reduced, and performance improves.

Confirming Single Nucleotide Polymorphisms from Expressed Sequence Tag Datasets Derived from Three Cattle cDNA Libraries

  • Lee, Seung-Hwan;Park, Eung-Woo;Cho, Yong-Min;Lee, Ji-Woong;Kim, Hyoung-Yong;Lee, Jun-Heon;Oh, Sung-Jong;Cheong, Il-Cheong;Yoon, Du-Hak
    • BMB Reports / v.39 no.2 / pp.183-188 / 2006
  • Using the Phred/Phrap/Polyphred/Consed pipeline established in the National Livestock Research Institute of Korea, we predicted candidate coding single nucleotide polymorphisms (cSNPs) from 7,600 expressed sequence tags (ESTs) derived from three cDNA libraries (liver, M. longissimus dorsi, and intermuscular fat) of Hanwoo (Korean native cattle) steers. From the 7,600 ESTs, 829 contigs comprising more than two EST reads were assembled using the Phrap assembler. Based on the contig analysis, 201 candidate cSNPs were identified in 129 contigs, in which transitions (69%) outnumbered transversions (31%). To verify whether the predicted cSNPs are real, 17 SNPs involved in lipid and energy metabolism were selected from the ESTs. Twelve of these were confirmed to be real while five were identified as artifacts, possibly due to expressed sequence tag sequence error. Further analysis of the 12 verified cSNPs was performed using the program BLASTX. Five were identified as nonsynonymous cSNPs, five were synonymous cSNPs, and two SNPs were located in 3'-UTRs. Our data indicated that a relatively high SNP prediction rate (71%) from a large EST database could produce abundant cSNPs rapidly, which can be used as valuable genetic markers in cattle.
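The transition/transversion split and the 71% confirmation rate reported in this abstract can be reproduced in a few lines. The following is a minimal Python sketch, not the authors' pipeline: the function name `classify_substitution` is hypothetical, and only the counts 12 and 17 are taken from the abstract.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_substitution(ref: str, alt: str) -> str:
    """Return 'transition' or 'transversion' for a single-base substitution.

    A transition swaps two purines (A<->G) or two pyrimidines (C<->T);
    every other single-base change is a transversion.
    """
    ref, alt = ref.upper(), alt.upper()
    bases = PURINES | PYRIMIDINES
    if ref not in bases or alt not in bases or ref == alt:
        raise ValueError(f"not a valid substitution: {ref}>{alt}")
    same_class = {ref, alt} <= PURINES or {ref, alt} <= PYRIMIDINES
    return "transition" if same_class else "transversion"


# Counts from the abstract: 12 of the 17 tested SNPs were confirmed.
confirmed, tested = 12, 17
confirmation_rate = confirmed / tested  # about 0.706, reported as 71%
```

Applied to every candidate cSNP, a tally of the two labels would give the kind of 69% to 31% split quoted above.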

Associations of Most Prevalent Risk Factors with Lung Cancer and Their Impact on Survival Length

  • Khan, Mohammad Haroon;Hussain, Shahid;Bano, Raisa;Jamshed-ul-Hassan, Hafiz;Aadil ur Rehman, Muhammad
    • Asian Pacific Journal of Cancer Prevention / v.17 no.sup3 / pp.65-70 / 2016
  • Lung cancer is one of the most common malignancies in the world, and its incidence and mortality rates are rising in Pakistan. However, epidemiological studies to identify common lung cancer determinants in the Pakistani population have been limited. In this study, data on 440 cases and 323 controls were collected from different hospitals in Peshawar and Islamabad, along with socio-demographic information including age, sex and smoking. Univariate and multifactorial analyses of the socio-demographic factors in association with each other were also performed. Overall survival analysis showed that, of the 440 patients in the lung cancer dataset, 204 were uncensored, with a median survival time of 13 months (95% CI = 12-18). There were 41 female and 399 male patients. Length of survival differed between males and females ($\chi^2_1$ = 6.1; p-value = 0.01). Gender was significantly related to survival (p-value < 0.01), with better survival in females (hazard ratio = 2). Cox regression was extended to adjust for the covariate age (z = 2.5; p-value = 0.02). Survival analysis was also performed by smoking group (current smokers, former smokers and never smokers) and smoking duration (more than 10 years, less than 10 years and never smoked). Smoking duration was significantly associated with survival (p-value < 0.01), with better survival in never smokers than in those who had smoked for either more or less than 10 years. Strong associations were observed for the group with smoking duration greater than 10 years: OR = 6.1 (3.9-9.5) on univariate analysis and OR = 11.3 (CI = 6.8-19.3) on multifactorial analysis.
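For readers unfamiliar with how a univariate odds ratio and its confidence interval are obtained from case-control counts, here is a minimal Python sketch using the standard Wald interval on the log odds ratio. The abstract does not give the underlying 2x2 table or name the interval method, so the function name `odds_ratio_ci` and the example counts are hypothetical and do not reproduce the study's figures.

```python
import math


def odds_ratio_ci(a: int, b: int, c: int, d: int, z: float = 1.96):
    """Odds ratio and Wald confidence interval for a 2x2 table.

    a: exposed cases      b: exposed controls
    c: unexposed cases    d: unexposed controls
    z: normal quantile (1.96 gives an approximate 95% interval)
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive")
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) is the root of the summed reciprocal counts.
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log)
    upper = math.exp(math.log(odds_ratio) + z * se_log)
    return odds_ratio, lower, upper


# Hypothetical counts for illustration only (NOT taken from the study).
odds_ratio, lower, upper = odds_ratio_ci(a=40, b=10, c=20, d=30)
```

A multifactorial (adjusted) odds ratio such as the 11.3 quoted above would instead come from a regression model with the other covariates included, which this sketch does not attempt.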

Individual Identification using The Multiplex PCR with Microsatellite Markers in Swine

  • Kim, Lee-Kung;Park, Chang-Min;Park, Sun-Ae;Kim, Seung-Chang;Chung, Hoyoung;Chai, Han-Ha;Jeong, Gyeong-Yong;Choi, Bong-Hwan
    • Reproductive and Developmental Biology / v.37 no.4 / pp.205-211 / 2013
  • The swine is one of the most widespread mammals in the world. Many studies of microsatellites in swine, especially domestic pigs, have been carried out to investigate general diversity patterns among populations or breeds. Until now, a great deal of time and effort has been spent on single PCR methods, but simpler and more rapid multiplex PCR methods have since been developed. The purpose of this study is to develop a robust set of microsatellite markers (MS markers) for traceability and individual identification. Using a multiplex PCR method with 23 MS markers divided into 2 sets, the alleles observed in 5 swine breeds (Berkshire, Landrace, Yorkshire, Duroc and Korean native pig) were used to determine allele frequency and heterozygosity. Four alleles were found at each of SW403, S0227, SWR414, SW1041 and SW1377; the most, 10 alleles, were found at SW1920. Heterozygosity was lowest at SWR414 (0.102) and highest at SW1920 (0.861). The markers were therefore judged to have allele frequencies appropriate for individual identification in swine. With the multiplex PCR method, the MS markers can serve as individual identification and breed-specific markers, allowing faster, more accurate and lower-cost analysis. These results provide a scientific basis for supplementing existing pedigree data with genetic information. Swine traceability is expected to be a very useful system and to be implemented nationwide in the future.
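Heterozygosity at a locus is commonly estimated from allele frequencies as H = 1 - Σ p_i². The abstract does not say whether the reported values are expected or observed heterozygosity, so the following Python sketch assumes the expected form; the function name `expected_heterozygosity` and the allele frequencies are hypothetical and chosen only to show why a locus with many evenly distributed alleles scores high and a locus dominated by one allele scores low.

```python
def expected_heterozygosity(allele_freqs):
    """Expected heterozygosity H = 1 - sum(p_i^2) for a single locus."""
    if not allele_freqs or abs(sum(allele_freqs) - 1.0) > 1e-6:
        raise ValueError("allele frequencies must be non-empty and sum to 1")
    return 1.0 - sum(p * p for p in allele_freqs)


# Hypothetical frequencies for illustration only (NOT from the study).
h_even = expected_heterozygosity([0.1] * 10)                   # ten equally common alleles
h_skewed = expected_heterozygosity([0.94, 0.02, 0.02, 0.02])   # one dominant allele
```

Markers with high heterozygosity are the more informative ones for telling individuals apart, which is why this statistic is reported per marker.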

Analysis of 16S rRNA gene sequencing data for the taxonomic characterization of the vaginal and the fecal microbial communities in Hanwoo

  • Choi, Soyoung;Cha, Jihye;Song, Minji;Son, JuHwan;Park, Mi-Rim;Lim, Yeong-jo;Kim, Tae-Hun;Lee, Kyung-Tai;Park, Woncheoul
    • Animal Bioscience / v.35 no.11 / pp.1808-1816 / 2022
  • Objective: Studies of Hanwoo (Korean native cattle) have mainly focused on meat quality and productivity. Microbiome research has recently grown dramatically, but information on the Hanwoo microbiome is still insufficient, especially regarding the relationship between the vagina and feces. The purpose of this study is therefore to characterize the microbial communities by analyzing 16S rRNA sequencing data from Hanwoo vagina and feces, and to assess the differences and correlation between vaginal and fecal microorganisms. Ultimately, the goal is to investigate whether the fecal microbiome can be used to predict the vaginal microbiome. Methods: A total of 31 clinically healthy Hanwoo in Cheongju, South Korea, that had delivered healthy calves more than once were enrolled in this study. During the breeding season, we collected vaginal and fecal samples and sequenced the V3-V4 hypervariable regions of the microbial 16S rRNA genes from the microbial DNA of the samples. Results: The phyla with the largest relative abundance were Firmicutes, Actinobacteria, Bacteroidetes, and Proteobacteria in the vagina, and Firmicutes, Bacteroidetes, and Spirochaetes in the feces. Alpha diversity, beta diversity, and effect size (LEfSe) analyses showed significant differences between the vaginal and fecal samples. We also identified the functions of these differentially abundant microorganisms by functional annotation analyses. However, there was no significant correlation between the vaginal and fecal microbiomes. Conclusion: The vaginal and fecal microbiomes differ significantly but are not significantly correlated. It is therefore difficult to infer the vaginal microbiome from the fecal microbiome in Hanwoo. Further studies using whole metagenome sequencing and meta-transcriptome analysis will be needed to clarify the genetic relationship between the full microbial communities of the vagina and feces.
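Alpha diversity describes diversity within one sample and beta diversity describes dissimilarity between samples. The abstract does not say which metrics were used, so the following Python sketch assumes two common choices, the Shannon index and Bray-Curtis dissimilarity, computed from per-taxon read counts; the function names are hypothetical.

```python
import math


def shannon_index(counts):
    """Shannon alpha diversity H' = -sum(p_i * ln p_i) over observed taxa."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("sample has no reads")
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def bray_curtis(counts_a, counts_b):
    """Bray-Curtis dissimilarity between two samples over the same taxa.

    Returns 0.0 for identical count vectors and 1.0 when no taxon is shared.
    """
    if len(counts_a) != len(counts_b):
        raise ValueError("both samples must list the same taxa in the same order")
    total = sum(counts_a) + sum(counts_b)
    if total == 0:
        raise ValueError("both samples are empty")
    return sum(abs(a - b) for a, b in zip(counts_a, counts_b)) / total
```

In a comparison like the one described, the first function would be applied to each vaginal and fecal sample separately, and the second to pairs of samples to build a distance matrix.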

Property-based Design of Ion-Channel-Targeted Library

  • Ahn, Ji-Young;Nam, Ky-Youb;Chang, Byung-Ha;Yoon, Jeong-Hyeok;Cho, Seung-Joo;Koh, Hun-Yeong;No, Kyoung-Tai
    • Proceedings of the Korean Society for Bioinformatics Conference / 2005.09a / pp.134-138 / 2005
  • The design of an ion-channel-targeted library is a valuable methodology that can aid in selecting and prioritizing compounds with potential ion-channel-likeness for ion-channel-targeted bio-screening from a large pool of commercially available chemicals. Differences in property profiles between 93 ion-channel-active compounds from the MDDR and CMC databases and the ACDSC compounds were classified using suitable descriptors calculated with the preADME software. Through PCA, clustering, and similarity analysis, compounds likely to have ion channel activity were identified in the ACDSC compound pool. The designed library tended to follow the property profile of ion-channel-active compounds and can improve the time and cost efficiency of ligand-based drug design or virtual high-throughput screening over an enormous small-molecule space.

AN ABSTRACTION MODEL FOR IN-SITU SENSOR DATA USING SENSORML

  • Lee Yang Koo;Jung Young Jin;Park Mi;Kim Hak Cheol;Lee Chung Ho;Ryu Keun Ho
    • Proceedings of the KSRS Conference / 2005.10a / pp.337-340 / 2005
  • Context-awareness techniques in ubiquitous computing environments provide various services to users who need information derived from the analysis of data collected by sensors in a spatial area. Interest in context-awareness has increased in ubiquitous computing, and it is applied in many different applications such as disaster management systems, intelligent robot systems, transportation management systems, shopping management systems, and digital home services. Much recent research has focused on services that provide users with appropriate information, collected over the Internet by different kinds of sensors, according to the context of their surrounding environment. In this paper, we propose an abstraction model for managing large-scale contextual information and its metadata, which are collected from different kinds of in-situ sensors in a spatial area and presented on the web. The model is composed of modules that express the functional elements of sensors using SensorML (Sensor Model Language), which is based on XML, and modules that manage the contextual information transmitted from the sensors.

Integrated Bioinformatics Approach Reveals Crosstalk Between Tumor Stroma and Peripheral Blood Mononuclear Cells in Breast Cancer

  • He, Lang;Wang, Dan;Wei, Na;Guo, Zheng
    • Asian Pacific Journal of Cancer Prevention / v.17 no.3 / pp.1003-1008 / 2016
  • Breast cancer is now the leading cause of cancer death in women worldwide. Cancer progression is driven not only by cancer-cell-intrinsic alterations and interactions with the tumor microenvironment, but also by systemic effects. Integrating multiple profiling datasets may provide insights into the molecular mechanisms underlying complex systemic processes. We performed a bioinformatic analysis of two publicly available microarray datasets for breast tumor stroma and peripheral blood mononuclear cells, integrating transcriptomics data, protein-protein interactions (PPIs) and protein subcellular localization, to identify genes and biological pathways that contribute to the dialogue between tumor stroma and the peripheral circulation. Genes of the integrin family as well as CXCR4 proved to be hub nodes of the crosstalk network and may play an important role in the response to stroma-derived chemoattractants. This study points to the potential for developing therapeutic strategies that target systemic signals travelling through the circulation and interdict tumor cell recruitment.