• Title/Summary/Keyword: Reference dataset


CEOP Annual Enhanced Observing Period Starts

  • Koike, Toshio
    • Proceedings of the KSRS Conference / 2002.10a / pp.343-346 / 2002
  • Toward more accurate determination of the water cycle in association with climate variability and change, as well as baseline data on the impacts of this variability on water resources, the Coordinated Enhanced Observing Period (CEOP) was launched on July 1, 2001. The preliminary data period, EOP-1, was implemented from July to September 2001. The first annual enhanced observing period, EOP-3, is going to start on October 1, 2002. CEOP is seeking to achieve a database of common measurements from both in situ and satellite remote sensing, model output, and four-dimensional data analyses (4DDA; including global and regional reanalyses) for a specified period. In this context, a number of carefully selected reference stations are linked closely with the existing network of observing sites involved in the GEWEX Continental Scale Experiments, which are distributed across the world. The initial step of CEOP is to develop a pilot global hydro-climatological dataset with global consistency under the climate variability that can be used to help validate satellite hydrology products and to evaluate, develop and eventually predict water and energy cycle processes in global and regional models. As the second step, based on this dataset, we will address the inter-comparison and inter-connectivity of the monsoon systems and regional water and energy budgets, and a path to down-scaling from the global climate to local water resources.


A HIERARCHICAL APPROACH TO HIGH-RESOLUTION HYPERSPECTRAL IMAGE CLASSIFICATION OF LITTLE MIAMI RIVER WATERSHED FOR ENVIRONMENTAL MODELING

  • Heo, Joon;Troyer, Michael;Lee, Jung-Bin;Kim, Woo-Sun
    • Proceedings of the KSRS Conference / v.2 / pp.647-650 / 2006
  • Compact Airborne Spectrographic Imager (CASI) hyperspectral imagery was acquired over the Little Miami River Watershed (1756 square miles) in Ohio, U.S.A., in one of the largest hyperspectral image acquisitions. For the development of a 4 m resolution land cover dataset, a hierarchical approach was employed using two different classification algorithms: 'Image Object Segmentation' for level-1 and 'Spectral Angle Mapper' for level-2. This classification scheme was developed to overcome the spectral inseparability of urban and rural features and to deal with radiometric distortions due to cross-track illumination. The land cover class members were lentic, lotic, forest, corn, soybean, wheat, dry herbaceous, grass, urban barren, rural barren, urban/built, and unclassified. The final phase of processing was completed after an extensive Quality Assurance and Quality Control (QA/QC) phase. With respect to the eleven land cover class members, the overall accuracy with a total of 902 reference points was 83.9% at 4 m resolution. The dataset is available for public research, and applications of this product will represent an improvement over more commonly utilized data of coarser spatial resolution such as National Land Cover Data (NLCD).
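
The level-2 step named in this abstract, Spectral Angle Mapper, can be illustrated with a minimal sketch. It is not the authors' implementation: the four-band reference spectra, the class subset, and the 0.1 rad cutoff are hypothetical values chosen for illustration. The angle between two spectra is unchanged when one of them is scaled uniformly, which is one commonly cited reason the method tolerates brightness differences.

```python
import math


def spectral_angle(pixel, reference):
    """Angle in radians between two spectra treated as vectors."""
    dot = sum(p * r for p, r in zip(pixel, reference))
    norm_p = math.sqrt(sum(p * p for p in pixel))
    norm_r = math.sqrt(sum(r * r for r in reference))
    if norm_p == 0 or norm_r == 0:
        # Assumption: an all-zero spectrum is treated as maximally dissimilar.
        return math.pi / 2
    # Clamp to [-1, 1] so rounding error cannot push acos out of its domain.
    cos_theta = max(-1.0, min(1.0, dot / (norm_p * norm_r)))
    return math.acos(cos_theta)


def classify(pixel, references, max_angle=0.1):
    """Return the label with the smallest angle, or 'unclassified'."""
    best_label, best_angle = "unclassified", max_angle
    for label, ref in references.items():
        angle = spectral_angle(pixel, ref)
        if angle < best_angle:
            best_label, best_angle = label, angle
    return best_label


# Hypothetical 4-band reference spectra (illustrative values only).
REFERENCES = {
    "forest": [0.04, 0.08, 0.05, 0.45],
    "urban/built": [0.20, 0.22, 0.24, 0.26],
}

# A pixel twice as bright as the forest reference has the same spectral shape.
bright_forest = [0.08, 0.16, 0.10, 0.90]
print(classify(bright_forest, REFERENCES))
```

A pixel whose angle to every reference exceeds the cutoff falls into the "unclassified" class, mirroring the class list in the abstract.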


Supervised Model for Identifying Differentially Expressed Genes in DNA Microarray Gene Expression Dataset Using Biological Pathway Information

  • Chung, Tae Su;Kim, Keewon;Kim, Ju Han
    • Genomics & Informatics / v.3 no.1 / pp.30-34 / 2005
  • Microarray technology makes it possible to measure the expression of tens of thousands of genes simultaneously under various experimental conditions. Identifying differentially expressed genes (DEGs) in each experimental condition is one of the most common first steps in microarray gene expression data analysis. Reasonable choices of thresholds for determining differentially expressed genes are used for the next-step analysis with suitable statistical significance. We present a supervised model for identifying DEGs using pathway information based on the global connectivity structure. Pathway information can be regarded as a collection of biological knowledge; thus, we try to determine the optimal threshold so that the resulting connectivity structure is as compatible as possible with the existing pathway information. The significant feature of our model is that it uses established knowledge as a reference to guide the analysis of a microarray dataset. In most previous work, only information intrinsic to the microarray is used for identifying DEGs. We hope that our proposed method can contribute to constructing biologically meaningful structure from microarray datasets.

Determining differentially expressed genes in a microarray expression dataset based on the global connectivity structure of pathway information

  • Chung, Tae-Su;Kim, Kee-Won;Lee, Hye-Won;Kim, Ju-Han
    • Proceedings of the Korean Society for Bioinformatics Conference / 2004.11a / pp.124-130 / 2004
  • Microarray expression datasets are accumulating continually with the aid of recent technological advances. One of the first steps in analyzing these data under various experimental conditions is determining differentially expressed genes (DEGs) in each condition. Reasonable choices of thresholds for determining differentially expressed genes are used for the next-step analysis with suitable statistical significance. We present a model for identifying DEGs using pathway information based on the global connectivity structure. Pathway information can be regarded as a collection of biological knowledge; thus, we try to determine the optimal threshold so that the resulting connectivity structure is as compatible as possible with the existing pathway information. The significant feature of our model is that it uses established knowledge as a reference to guide the analysis of a microarray dataset. In most previous work, only information intrinsic to the microarray is used for identifying DEGs. We hope that our proposed method can contribute to constructing biologically meaningful network structure from microarray datasets.


Development of Weather Forecast Models for a Short-term Building Load Prediction

  • Jeon, Byung-Ki;Lee, Kyung-Ho;Kim, Eui-Jong
    • Journal of the Korean Solar Energy Society / v.38 no.1 / pp.1-11 / 2018
  • In this work, we propose weather prediction models that estimate hourly outdoor temperatures and solar irradiance for the next day using forecast information. Hourly weather data predicted by the proposed models are useful for setting system operating strategies for the next day. The outdoor temperature prediction model uses 3-hourly temperatures forecast by the Korea Meteorological Administration. Hourly data are obtained by a simple interpolation scheme. The solar irradiance prediction is achieved by constructing a dataset of the observed cloudiness and corresponding solar irradiance during the last two weeks and then matching the forecast cloud factor for the next day with the solar irradiance values in the dataset. To verify the usefulness of the weather prediction models in predicting a short-term building load, the predicted data are input to a TRNSYS building model, and the results are compared with a reference case. Results show that the test case meets the acceptable error level defined by the ASHRAE guideline, with a CVRMSE of 8.8%, in spite of some inaccurate predictions of hourly weather data.
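
The two numerical steps mentioned in this abstract, filling hourly values from 3-hourly forecasts and scoring a prediction by CVRMSE, can be sketched as follows. The abstract does not specify the interpolation scheme, so linear interpolation is an assumption here. The CVRMSE below uses the plain form (RMSE divided by the mean of the reference values, with n in the denominator); guideline definitions may apply a degrees-of-freedom adjustment instead. The sample temperatures are made up.

```python
import math


def interpolate_hourly(hours, temps):
    """Linearly interpolate forecast values to every whole hour.

    `hours` are increasing integer forecast times, `temps` the matching values.
    Linear interpolation is an assumption, not the paper's stated scheme.
    """
    hourly = []
    for i in range(len(hours) - 1):
        h0, h1 = hours[i], hours[i + 1]
        t0, t1 = temps[i], temps[i + 1]
        for h in range(h0, h1):
            frac = (h - h0) / (h1 - h0)
            hourly.append(t0 + frac * (t1 - t0))
    hourly.append(temps[-1])  # include the final forecast time itself
    return hourly


def cvrmse(reference, predicted):
    """Coefficient of variation of the RMSE, in percent (plain n form)."""
    n = len(reference)
    rmse = math.sqrt(sum((r - p) ** 2 for r, p in zip(reference, predicted)) / n)
    return 100.0 * rmse / (sum(reference) / n)


# Made-up 3-hourly forecast for hours 0, 3 and 6.
hourly_temps = interpolate_hourly([0, 3, 6], [10.0, 13.0, 10.0])
print(hourly_temps)
print(cvrmse([10.0, 10.0], [9.0, 11.0]))
```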

The Chromatin Accessibility Landscape of Nonalcoholic Fatty Liver Disease Progression

  • Kang, Byeonggeun;Kang, Byunghee;Roh, Tae-Young;Seong, Rho Hyun;Kim, Won
    • Molecules and Cells / v.45 no.5 / pp.343-352 / 2022
  • The assay for transposase-accessible chromatin using sequencing (ATAC-seq) has shown great potential as a leading method for genome-wide profiling of chromatin accessibility. A comprehensive reference ATAC-seq dataset for disease progression is important for understanding the regulatory specificity caused by genetic or epigenetic changes. In this study, we present a genome-wide chromatin accessibility profile of 44 liver samples spanning the full histological spectrum of nonalcoholic fatty liver disease (NAFLD). We analyzed the ATAC-seq signal enrichment, fragment size distribution, and correlation coefficients according to the histological severity of NAFLD (healthy control vs steatosis vs fibrotic nonalcoholic steatohepatitis), demonstrating the high quality of the dataset. Consequently, 112,303 merged regions (genomic regions containing one or multiple overlapping peak regions) were identified. Additionally, we found differentially accessible regions (DARs) and performed transcription factor binding motif enrichment analysis and de novo motif analysis to determine new biomarker candidates. These data revealed the gene-regulatory interactions and noncoding factors that can affect NAFLD progression. In summary, our study provides a valuable resource for the human epigenome by applying an advanced approach to facilitate diagnosis and treatment through an understanding of the noncoding genome of NAFLD.

Labeling strategy to improve neutron/gamma discrimination with organic scintillator

  • Ali Hachem;Yoann Moline;Gwenole Corre;Bassem Ouni;Mathieu Trocme;Aly Elayeb;Frederick Carrel
    • Nuclear Engineering and Technology / v.55 no.11 / pp.4057-4065 / 2023
  • Organic scintillators are widely used for neutron/gamma detection. Pulse shape discrimination algorithms have been commonly used to discriminate the detected radiations. These algorithms have several limitations, in particular with plastic scintillators, which have lower discrimination ability than liquid scintillators. Recently, machine learning (ML) models have been explored to enhance discrimination performance. Nevertheless, obtaining an accurate ML model or evaluating any discrimination approach requires a reference neutron dataset. Preparing such a dataset is challenging because neutron sources are also gamma-ray emitters. Therefore, this paper proposes a pipeline to prepare clean labeled neutron/gamma datasets acquired by an organic scintillator. The method is mainly based on a Time of Flight (ToF) setup and the Tail-to-Total integral ratio (TTTratio) discrimination algorithm. In the presented case, an EJ276 plastic scintillator and a 252Cf source were used to implement the acquisition chain. The results showed that this process can identify and remove mislabeled samples in the entire ToF spectrum, including those that contribute to peak values. Furthermore, the process cleans the ToF dataset of pile-up events, which can significantly impact experimental results and the conclusions drawn from them.
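
The Tail-to-Total integral ratio named in this abstract can be sketched on a digitized pulse as below. This is a minimal illustration, not the paper's pipeline: the integration windows (total over the whole record, tail starting a fixed number of samples after the peak), the threshold, and the sample pulses are all assumptions. In organic scintillators, neutron-induced pulses typically carry a larger slow component, so a higher ratio is labeled as a neutron here.

```python
def tail_to_total(pulse, tail_offset, baseline=0.0):
    """Ratio of the tail integral to the total integral of one pulse.

    Assumed windows: total = all samples; tail = samples starting
    `tail_offset` samples after the peak.
    """
    samples = [s - baseline for s in pulse]
    total = sum(samples)
    if total <= 0:
        raise ValueError("pulse has no positive integral")
    peak = samples.index(max(samples))
    tail = sum(samples[peak + tail_offset:])
    return tail / total


def label_pulse(pulse, tail_offset, threshold):
    """Label a pulse by comparing its ratio to a hypothetical threshold."""
    ratio = tail_to_total(pulse, tail_offset)
    return "neutron" if ratio > threshold else "gamma"


# Made-up pulses: the second decays more slowly after the peak.
fast_pulse = [0, 10, 4, 1, 0, 0]
slow_pulse = [0, 10, 6, 4, 3, 2]

print(label_pulse(fast_pulse, tail_offset=2, threshold=0.2))
print(label_pulse(slow_pulse, tail_offset=2, threshold=0.2))
```

In a labeling pipeline of the kind described, a ratio like this would be cross-checked against the time-of-flight label, and events where the two disagree would be candidates for removal.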

Assessment of Historical and Future Climatic Trends in Seti-Gandaki Basin of Nepal: A Study Based on CMIP6 Projections

  • Bastola Shiksha;Cho Jaepil;Jung Younghun
    • Proceedings of the Korea Water Resources Association Conference / 2023.05a / pp.162-162 / 2023
  • Climate change is a complex phenomenon that affects diverse sectors. Temperature and precipitation are two of the most fundamental variables used to characterize climate, and changes in these variables can have significant impacts on ecosystems, agriculture, and human societies. This study evaluated the historical (1981-2010) and future (2011-2100) climatic trends in the Seti-Gandaki basin of Nepal based on a 5 km resolution Multi Model Ensemble (MME) of 18 Global Climate Models (GCMs) from the Coupled Model Intercomparison Project Phase 6 (CMIP6) for the SSP1-2.6, SSP2-4.5 and SSP5-8.5 scenarios. For this study, the ERA5 reanalysis dataset is used as the historical reference dataset instead of observational data because good observations are lacking in the study area. Results show that the basin has experienced continuous warming and increasing precipitation in the historical period, and this rising trend is projected to become more prominent in the future. The Seti basin hosts 13 operational hydropower projects of different sizes, with 10 more planned by the government. Consequently, the findings of this study could be leveraged to design adaptation measures for existing hydropower schemes and provide a framework for policymakers to formulate climate change policies in the region. Furthermore, the methodology employed in this research could be replicated in other parts of the country to generate precise climate projections and offer guidance to policymakers in devising sustainable development plans for sectors like irrigation and hydropower.


TET2DICOM-GUI: Graphical User Interface Based TET2DICOM Program to Convert Tetrahedral-Mesh-Phantom to DICOM-RT Dataset

  • Se Hyung Lee;Bo-Wi Cheon;Chul Hee Min;Haegin Han;Chan Hyeong Kim;Min Cheol Han;Seonghoon Kim
    • Progress in Medical Physics / v.33 no.4 / pp.172-179 / 2022
  • Recently, tetrahedral phantoms have been adopted as international standard mesh-type reference computational phantoms (MRCPs) by the International Commission on Radiological Protection, and a program has been developed to convert them to computed tomography images and DICOM-RT structure files for use in radiotherapy. Through this program, the tetrahedral standard phantom has become available in clinical practice, but its use has been difficult because various library dependencies require a lot of time and effort to install. To overcome this limitation, in this study TET2DICOM-GUI, a TET2DICOM program based on a graphical user interface (GUI), was developed using only the MATLAB language so that it can be used without additional library installation and configuration. The program runs in the same order as TET2DICOM and has been optimized to run on a personal computer in a GUI environment. A tetrahedron-based male international standard human phantom, MRCP-AM, was used to evaluate TET2DICOM-GUI. Conversion into a clinically applicable DICOM-RT dataset was confirmed to take about one hour on a personal computer. Also, the generated DICOM-RT dataset was confirmed to be effectively implemented in the radiotherapy planning system. The program developed in this study is expected to replace actual patient data in future studies.

A streamlined pipeline based on HmmUFOtu for microbial community profiling using 16S rRNA amplicon sequencing

  • Hyeonwoo Kim;Jiwon Kim;Ji Won Cho;Kwang-Sung Ahn;Dong-Il Park;Sangsoo Kim
    • Genomics & Informatics / v.21 no.3 / pp.40.1-40.11 / 2023
  • Microbial community profiling using 16S rRNA amplicon sequencing allows for taxonomic characterization of diverse microorganisms. While amplicon sequence variant (ASV) methods are increasingly favored for their fine-grained resolution of sequence variants, they often discard substantial portions of sequencing reads during quality control, particularly in datasets with large numbers of samples. We present a streamlined pipeline that integrates FastP for read trimming, HmmUFOtu for operational taxonomic unit (OTU) clustering, Vsearch for chimera checking, and Kraken2 for taxonomic assignment. To assess the pipeline's performance, we reprocessed two published stool datasets of normal Korean populations: one with 890 and the other with 1,462 independent samples. In the first dataset, HmmUFOtu retained 93.2% of over 104 million read pairs after quality trimming, discarding chimeric or unclassifiable reads, while DADA2, a commonly used ASV method, retained only 44.6% of the reads. Nonetheless, both methods yielded qualitatively similar β-diversity plots. For the second dataset, HmmUFOtu retained 89.2% of read pairs, while DADA2 retained a mere 18.4% of the reads. HmmUFOtu, being a closed-reference clustering method, facilitates merging separately processed datasets, with shared OTUs between the two datasets exhibiting a correlation coefficient of 0.92 in total abundance (log scale). While the first two dimensions of the β-diversity plot exhibited a cohesive mixture of the two datasets, the third dimension revealed the presence of a batch effect. Our comparative evaluation of ASV and OTU methods within this streamlined pipeline provides valuable insights into their performance when processing large-scale microbial 16S rRNA amplicon sequencing data. The strengths of HmmUFOtu and its potential for dataset merging are highlighted.