• Title/Summary/Keyword: Data enrichment

A Universal Analysis Pipeline for Hybrid Capture-Based Targeted Sequencing Data with Unique Molecular Indexes

  • Kim, Min-Jung;Kim, Si-Cho;Kim, Young-Joon
    • Genomics & Informatics
    • /
    • v.16 no.4
    • /
    • pp.29.1-29.5
    • /
    • 2018
  • Hybrid capture-based targeted sequencing is increasingly used for genomic variant profiling in tumor patients. Unique molecular index (UMI) technology has recently been developed and helps increase the accuracy of variant calling by minimizing polymerase chain reaction biases and sequencing errors. However, the analysis of UMI-adopted targeted sequencing data differs slightly from the methods used for other types of omics data, and variant-calling pipelines are still being optimized by individual study groups for their own purposes. Because tool usage is so group-specific, our group built an analysis pipeline intended for general application to targeted sequencing data generated with different methods. First, we generated hybrid capture-based data using genomic DNA extracted from tumor tissues of colorectal cancer patients. Sequencing libraries were prepared and pooled together, and an 8-plexed capture library was taken through the enrichment step before 150-bp paired-end sequencing on the Illumina HiSeq series. For the analysis, we evaluated several published tools, focusing mainly on the compatibility of the input and output of each tool. Finally, our laboratory built an analysis pipeline specialized for UMI-adopted data. Through this pipeline, we were able to estimate on-target rates and obtain filtered consensus reads for more accurate variant calling. These results suggest the potential of our analysis pipeline for precisely examining the quality and efficiency of conducted experiments.
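The two quantities named in this abstract, UMI consensus reads and the on-target rate, can be illustrated with a toy sketch. This is not the authors' pipeline: the read tuples, the majority-vote rule, and the function names `consensus_by_umi` and `on_target_rate` are assumptions made for illustration, and real pipelines work on aligned BAM files with dedicated tools.

```python
from collections import Counter, defaultdict


def consensus_by_umi(reads):
    """Collapse reads sharing (UMI, start position) into one consensus sequence.

    `reads` is an iterable of hypothetical (umi, position, sequence) tuples.
    Sequences within one group are assumed to have equal length.
    """
    groups = defaultdict(list)
    for umi, pos, seq in reads:
        groups[(umi, pos)].append(seq)

    consensus = {}
    for key, seqs in groups.items():
        # Majority vote at each base position across the group's reads.
        consensus[key] = "".join(
            Counter(column).most_common(1)[0][0] for column in zip(*seqs)
        )
    return consensus


def on_target_rate(read_positions, targets):
    """Fraction of read start positions inside any half-open (start, end) target."""
    if not read_positions:
        return 0.0
    hits = sum(any(s <= p < e for s, e in targets) for p in read_positions)
    return hits / len(read_positions)


# Hypothetical reads: three PCR duplicates of one molecule, one with an error.
reads = [
    ("AAC", 100, "ACGT"),
    ("AAC", 100, "ACGT"),
    ("AAC", 100, "ACTT"),
    ("GGT", 250, "TTTT"),
]
collapsed = consensus_by_umi(reads)
rate = on_target_rate([100, 250, 900], [(50, 300)])
```

In this toy input the single mismatching base is outvoted by the two agreeing duplicates, which is the intuition behind UMI-based error suppression.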

Knowledge Representation Using Fuzzy Ontologies: A Survey

  • V.Manikandabalaji;R.Sivakumar
    • International Journal of Computer Science & Network Security
    • /
    • v.23 no.12
    • /
    • pp.199-203
    • /
    • 2023
  • In recent decades, the growth of communication technology has resulted in an explosion of data and related information. Ontologies are increasingly required to integrate data and distinct functionalities. They are critical not only for transforming the traditional web into the semantic web but also for developing intelligent applications that use semantic enrichment and machine learning to transform data into smart data. To handle imprecise information, several researchers have focused on extending ontologies and semantic web technologies. Because many domain concepts lack clear-cut boundaries, the conceptual formalism supplied by traditional ontologies does not suffice to convey uncertain information about them. To deal with this ambiguity, fuzzy ontologies have been suggested. A fuzzy ontology introduces fuzzy logic rules for vague domain concepts such as darkness, heat, thickness, and creaminess in a machine-readable and compatible format. This survey aims to provide a brief and easily understandable study of the research directions taken in the ontology domain to deal with fuzzy information, to reconcile the various definitions found in the scientific literature, and to identify some of the domain's open research challenges. We hope this review will be of value to fuzzy ontology scholars. The paper concludes by reviewing current research and stating research gaps for fellow researchers.
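A vague concept such as "heat" can be given a graded membership instead of a yes/no boundary. The sketch below is only an illustration of that idea, not something taken from the survey: the triangular shape, the temperature breakpoints, and the names `triangular`, `warm`, and `hot` are all assumptions.

```python
def triangular(x, a, b, c):
    """Triangular membership function: 0 outside (a, c), rising to 1 at b.

    Assumes a < b < c.
    """
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


def warm(temp_c):
    # Hypothetical breakpoints in degrees Celsius.
    return triangular(temp_c, 15.0, 25.0, 35.0)


def hot(temp_c):
    # Hypothetical breakpoints in degrees Celsius.
    return triangular(temp_c, 25.0, 35.0, 45.0)


# Zadeh-style connectives: AND as minimum, OR as maximum.
t = 30.0
warm_and_hot = min(warm(t), hot(t))
warm_or_hot = max(warm(t), hot(t))
```

At 30 degrees the value belongs to both "warm" and "hot" with degree 0.5, which a crisp class boundary could not express.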

A Knowledge Graph on Japanese "Comfort Women": Interlinking Fragmented Digital Archival Resources (일본군 '위안부' 지식그래프: 파편화된 디지털 기록의 연결)

  • Park, Haram;Kim, Haklae
    • Journal of Korean Society of Archives and Records Management
    • /
    • v.21 no.3
    • /
    • pp.61-78
    • /
    • 2021
  • Records on Japanese "Comfort Women" have been individually managed by private sectors or institutions, and some are provided as digital archives on the Internet. However, records of digital archives differ in the composition and representation of metadata by individual institutions. Meanwhile, there is a lack of a consistent structure to describe the relationships between and among these records, leading to their fragmentation and disconnectedness. This paper proposes a knowledge model for interlinking the digital archival resources and builds a knowledge graph by integrating the records from distributed digital archives. It derives common elements by analyzing metadata from the diverse digital archives and expresses them in standard vocabularies to semantically describe multiple entities and relationships of the digital archival resources. In particular, the study includes the refinement of collected data to search and thread dispersed records and the enrichment of external data to provide significant contextual information of records. An evaluation of the knowledge graph is performed via a query measuring the (dis)connectivity between the distributed records. As a result, the knowledge graph is capable of interlinking and retrieving fragmented records, providing substantial contextual information on the records with external data enrichment, and searching accurately to match the user's intentions through semantic-based queries.

Development of Enrichment Semi-nested PCR for Clostridium botulinum types A, B, E, and F and Its Application to Korean Environmental Samples

  • Shin, Na-Ri;Yoon, So-Yeon;Shin, Ji-Hun;Kim, Yun Jeong;Rhie, Gi-eun;Kim, Bong Su;Seong, Won Keun;Oh, Hee-Bok
    • Molecules and Cells
    • /
    • v.24 no.3
    • /
    • pp.329-337
    • /
    • 2007
  • An enrichment semi-nested PCR procedure was developed for detection of Clostridium botulinum types A, B, E, and F. It was applied to sediment samples to examine the prevalence of C. botulinum in the Korean environment. The first pair of primers for the semi-nested PCR was designed using a region shared by the types A, B, E, and F neurotoxin gene sequences, and the second round employed four nested primers complementary to the BoNT/A, /B, /E, and /F encoding genes for simultaneous detection of the four serotypes. Positive results were obtained from the PCR analysis of five of 44 sediments (11%) collected from Yeong-am Lake in Korea; all were identified as deriving from type B neurotoxin (bontb) genes. Two of the C. botulinum type B organisms were isolated, and their bontb genes sequenced. The deduced amino acid sequences of BoNT/B showed 99.5 and 99.8% identity with the amino acid sequence of accession no. AB084152. Our data suggest that semi-nested PCR is a useful tool for detecting C. botulinum in sediments, and renders it practicable to conduct environmental surveys.

Tritium Concentrations in Surface Seawater around Korean Peninsula (한국 주변 해역 표층해수중 삼중수소 농도)

  • Kim, Chang-Kyu;Cho, Yong-Woo;Kim, Kye-Hun
    • Journal of Radiation Protection and Research
    • /
    • v.21 no.2
    • /
    • pp.107-115
    • /
    • 1996
  • An electrolytic enrichment technique was used to measure low levels of tritium in seawater around the Korean peninsula. Tritium concentrations were determined for surface seawater samples collected from the East Sea, the South Sea, and the Yellow Sea. The tritium concentrations in surface seawater samples from the study area ranged from $0.12$ to $1.50\ \mathrm{Bq\,L^{-1}}$ with a mean value of $0.60 \pm 0.35\ \mathrm{Bq\,L^{-1}}$. The means of the tritium concentration were $0.54 \pm 0.30\ \mathrm{Bq\,L^{-1}}$ for the East Sea, $0.48 \pm 0.35\ \mathrm{Bq\,L^{-1}}$ for the South Sea, and $0.77 \pm 0.32\ \mathrm{Bq\,L^{-1}}$ for the Yellow Sea. The tritium concentrations in the sea areas did not show much difference no matter where the samples were taken. Due to the limited number and distribution of sampling points, no systematic change in tritium levels with latitude was observed. Measured tritium levels were similar to those observed in other data collected near Japan, but higher than mid-Pacific Ocean measurements.

Study on Airborne Particulate Matter ($PM_{10}$) Monitoring in Urban and Rural Area by Using Gent SFU Sampler and Instrumental Neutron Activation Analysis (중성자 방사화분석법과 Gent SFU 샘플러를 이용한 도시의 농촌지역의 대기분지($PM_{10}$)관측 연구)

  • 정용삼;문종화;김선하;박광원;강상훈
    • Journal of Korean Society for Atmospheric Environment
    • /
    • v.16 no.5
    • /
    • pp.453-467
    • /
    • 2000
  • The aim of this research is to collect and characterize fine particles (FPM: $\leq 2.5\ \mu\mathrm{m}$) and coarse particles (CPM: $2.5$~$10\ \mu\mathrm{m}$) using a low-volume air sampler provided by the IAEA at an urban site (Taejon) and a rural site (Wonju) over a period of about two years (April 1996 to May 1998), and to promote the use of nuclear analytical techniques for air pollution studies. For the collection of airborne particulate matter ($PM_{10}$), the Gent stacked filter unit sampler and polycarbonate membrane filters were employed. The concentrations of trace elements in the collected APM samples were determined by instrumental neutron activation analysis. To validate the analytical data, internal quality control was implemented using both a comparison with the analytical results for a standard reference material (NIST SRM 1648) and an interlaboratory comparison for proficiency testing (NAT-3). The standard uncertainty was less than 15%, and the Z-scores of two samples were within $\pm 1$. The $PM_{10}$ mass concentration and elemental concentrations were monitored weekly. The average $PM_{10}$ mass concentrations in the urban and rural areas were $59.2 \pm 36.5\ \mu\mathrm{g/m^3}$ and $41.4 \pm 23.7\ \mu\mathrm{g/m^3}$, respectively. To investigate emission sources, enrichment factors were calculated for the fine and coarse particle fractions at the two sites, and the values were used to classify elements as anthropogenic or soil-derived.
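The enrichment factor used for this kind of source classification is commonly defined as the ratio of an element to a crustal reference element in the aerosol, divided by the same ratio in average crust. The abstract does not state which reference element, crustal abundances, or cut-off the authors used, so the sketch below treats all of those as assumptions; the cut-off of 10 is only a frequently quoted rule of thumb, and the numbers are made up.

```python
def enrichment_factor(c_elem, c_ref, crust_elem, crust_ref):
    """EF = (C_elem / C_ref)_aerosol / (C_elem / C_ref)_crust.

    c_elem, c_ref: concentrations of the element and the reference element
    in the aerosol sample; crust_elem, crust_ref: their crustal abundances.
    Units cancel as long as each pair uses the same unit.
    """
    return (c_elem / c_ref) / (crust_elem / crust_ref)


def classify_origin(ef, threshold=10.0):
    """Label an element by EF; the threshold is an assumed rule of thumb."""
    return "anthropogenic" if ef > threshold else "soil"


# Made-up numbers for illustration only (not data from the study).
ef_soil_like = enrichment_factor(4.0, 8.0, 1.0, 16.0)      # ratio close to crust
ef_enriched = enrichment_factor(50.0, 2.0, 1.0, 4.0)       # strongly enriched
labels = (classify_origin(ef_soil_like), classify_origin(ef_enriched))
```

An EF near 1 means the element occurs in roughly crustal proportion to the reference element, which is why low values are read as a soil signature.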

Elemental Composition of PM2.5 Particulate with a 3-Stage DRUM Sampler during Spring and Summer Seasons in Urban Area of Gwangju, Korea (3-Stage DRUM 샘플러를 이용한 광주 도심지역의 봄철과 여름철 PM2.5 원소적 조성 비교)

  • Ryu S.Y.;Kim Y.J.
    • Journal of Korean Society for Atmospheric Environment
    • /
    • v.21 no.6
    • /
    • pp.699-708
    • /
    • 2005
  • To characterize the elemental composition of fine particles in an urban area, $PM_{2.5}$ was collected with a 3-stage DRUM impactor in Gwangju during spring and summer. Time- and size-resolved concentrations of 19 trace elements were obtained by synchrotron X-ray fluorescence analysis. Trace elements in summer were distributed in a smaller size range than those in spring. The concentrations of almost all trace elements in fine particles increased markedly during Asian dust events. In spring, soil elements such as Si, K, Ca, Ti, and Fe had low enrichment factors, indicating the dominant influence of soil dust. In summer, however, all elements had high enrichment factors, implying that they could have been emitted from anthropogenic sources. Factor analysis was conducted on the elemental composition data to identify anthropogenic sources of aerosols in the urban area during spring and summer. Fine particles in spring had several sources, including soil dust originating from continental China, coal and oil combustion, biomass burning, sea salt, and ferrous and nonferrous metal sources. Fine particles in summer, on the other hand, were influenced by road dust and gasoline vehicles as well as coal and oil combustion, sea salt, and ferrous and nonferrous metal sources.

Developing a Parametric Method for Testing the Significance of Gene Sets in Microarray Data Analysis (마이크로어레이 자료분석에서 모수적 방법을 이용한 유전자군의 유의성 검정)

  • Lee, Sun-Ho;Lee, Seung-Kyu;Lee, Kwang-Hyun
    • Communications for Statistical Applications and Methods
    • /
    • v.16 no.3
    • /
    • pp.397-408
    • /
    • 2009
  • The development of microarray technology makes it possible to analyze many thousands of genes simultaneously. While it is important to test whether each gene shows changes in expression associated with a phenotype, human diseases are thought to occur through the interactions of multiple genes within the same functional category. Recent research aims to directly test the behavior of sets of functionally related genes instead of focusing on single genes. Gene set enrichment analysis (GSEA), significance analysis of microarray to gene-set analysis (SAM-GS), and parametric analysis of gene set enrichment (PAGE) have been widely applied as tools for gene-set analysis. We describe their problems and propose an alternative method that uses a parametric analysis after a normal score transformation of gene expression values. The performance of the newly derived method is compared with previous methods on three real microarray datasets.
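As a rough illustration of the two ingredients mentioned here, the sketch below applies a rank-based normal score transformation to a vector of per-gene values and then computes a PAGE-style z statistic for a gene set. It is not the method proposed in the paper, whose details the abstract does not give: applying the transform to a single per-gene score vector, the `(rank - 0.5) / n` plotting position, and the function names are all assumptions.

```python
from math import sqrt
from statistics import NormalDist, mean, pstdev


def normal_scores(values):
    """Rank-based inverse normal transform (ties get arbitrary adjacent ranks)."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    scores = [0.0] * n
    std_normal = NormalDist()
    for rank, i in enumerate(order, start=1):
        # Map rank to a quantile strictly inside (0, 1), then to a z value.
        scores[i] = std_normal.inv_cdf((rank - 0.5) / n)
    return scores


def gene_set_z(scores, set_indices):
    """PAGE-style statistic: Z = (mean_set - mu) * sqrt(m) / sigma."""
    mu = mean(scores)
    sigma = pstdev(scores)
    m = len(set_indices)
    set_mean = mean(scores[i] for i in set_indices)
    return (set_mean - mu) * sqrt(m) / sigma


# Made-up per-gene values (e.g. log fold changes); indices stand in for genes.
values = [0.1, 2.5, -1.0, 3.0, 0.4, -0.3]
scores = normal_scores(values)
z_up = gene_set_z(scores, [1, 3])      # the two largest values
z_down = gene_set_z(scores, [2, 5])    # the two smallest values
```

A two-sided p-value under the standard normal could then be obtained as `2 * (1 - NormalDist().cdf(abs(z)))`, under the usual assumption that the set mean is approximately normal.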

Analysis and Enrichment of Microbial Community Showing Reducing Ability toward indigo in the Natural Fermentation of Indigo-Plant (자연발효 과정에서 인디고에 환원력을 지닌 미생물 커뮤니티 분석과 농화배양)

  • Choi, Eun-Sil;Lee, Eun-Bin;Choi, Hyueong-An;Son, Kyunghee;Kim, Geun-Joong;Shin, Younsook
    • KSBB Journal
    • /
    • v.28 no.5
    • /
    • pp.295-302
    • /
    • 2013
  • Indigo is used in various industries, including textile dyeing, cosmetics, printing, and medicinal products, and its reduced form, leuco-indigo, is the form mainly used in these processes. In industry, chemical reducing agents (sodium dithionite, sodium sulfide, etc.) are preferred for forming leuco-indigo. In the traditional indigo fermentation process, microorganisms participate in the reduction of indigo, and the process is therefore known to reduce environmental pollution and noxious byproducts. However, fermentation methods using microorganisms are difficult to standardize for large-scale production because of low yield and poor reproducibility. In this study, we attempted to develop an indigo reduction process using microbial flora either isolated from a naturally fermented indigo vat or inferred by a metagenomic approach. Library analyses of PCR-amplified 16S rRNA genes from the traditional indigo fermentation vat sample (metagenome) confirmed that Alkalibacterium (71%) was distinctly dominant in the population. Some strains were identified after confirming that they grew as pure cultures in slightly modified nutrient media. Four strains were isolated in this process, and each showed an obvious ability to reduce indigo in a dyeing test. The results are expected to provide important data for standardizing the natural fermentation of indigo and for investigating the mechanism of indigo reduction.

An Importance and Performance Analysis regarding Classroom Assessment - Professional General Education and MSC curriculum in the Engineering College Enrichment Program - (학습 평가에 대한 중요도 및 수행도 분석(IPA) - 공과대학 심화프로그램의 전문교양 및 MSC 교과목을 중심으로 -)

  • Noh, Jin-Ah;Choi, Yu-Hyun
    • Journal of Engineering Education Research
    • /
    • v.17 no.3
    • /
    • pp.51-58
    • /
    • 2014
  • The main objective of this research is to estimate how much importance the educators of professional general education and MSC curriculum place on assessment, and then to identify what support should be provided for the efficient assessment of education. The subjects of this research are educators of professional general education and MSC curriculum in the engineering education enrichment program at 58 of the 72 universities where accreditation for engineering education is implemented. Questionnaires were distributed to these 58 universities, and 136 questionnaires were collected. The mean and the paired-sample t-test were used for data analysis. The following conclusions were drawn from the results. First, the means of importance and performance for the 'assessment activities' were relatively high; however, assessment performance was relatively lower than assessment importance. Second, the Importance-Performance Matrix for the 'assessment activities' was analyzed in two ways. When analyzed with the scale mean, the Keep up the Good Work (KGW) sector included all factors. When analyzed with the actual mean, the KGW sector included 5 factors (A, J, B, D, C), the Possible Overkill (PO) sector included factors C and I, the Low Priority (LP) sector included 4 factors (K, E, H, L), and the Concentrate Here (CH) sector included factor G.
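The difference between the two analyses of the Importance-Performance Matrix comes down to where the quadrant cut lines are placed. The sketch below is a minimal illustration under stated assumptions: the factor names `F1` to `F4` and their scores are made up (they are not the study's factors or data), a 5-point scale with midpoint 3.0 is assumed, and the quadrant labels follow the abbreviations used in the abstract.

```python
from statistics import mean


def ipa_quadrant(importance, performance, imp_cut, perf_cut):
    """Assign one factor to an IPA sector given the two cut lines."""
    if importance >= imp_cut:
        # High importance: doing well (KGW) or needing attention (CH).
        return "KGW" if performance >= perf_cut else "CH"
    # Low importance: over-served (PO) or simply low priority (LP).
    return "PO" if performance >= perf_cut else "LP"


# Hypothetical (importance, performance) means on an assumed 5-point scale.
factors = {
    "F1": (4.5, 4.2),
    "F2": (4.4, 3.4),
    "F3": (3.6, 4.0),
    "F4": (3.5, 3.3),
}

# Analysis 1: cut lines at the scale midpoint (assumed to be 3.0).
by_scale_mean = {
    name: ipa_quadrant(imp, perf, 3.0, 3.0) for name, (imp, perf) in factors.items()
}

# Analysis 2: cut lines at the observed grand means of the data.
imp_mean = mean(imp for imp, _ in factors.values())
perf_mean = mean(perf for _, perf in factors.values())
by_actual_mean = {
    name: ipa_quadrant(imp, perf, imp_mean, perf_mean)
    for name, (imp, perf) in factors.items()
}
```

With scores that all sit above the scale midpoint, the first analysis puts every factor in KGW, while cutting at the observed means spreads the factors across all four sectors; this mirrors the pattern the abstract reports.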