• Title/Summary/Keyword: Bayesian analysis


AptaCDSS - A Cardiovascular Disease Level Prediction and Clinical Decision Support System using Aptamer Biochip (AptaCDSS - 압타머칩을 이용한 심혈관질환 질환단계 예측 및 진단의사결정지원시스템)

  • Eom, Jae-Hong;Kim, Byoung-Hee;Lee, Je-Keun;Heo, Min-Oh;Park, Young-Jin;Kim, Min-Hyeok;Kim, Sung-Chun;Zhang, Byoung-Tak
    • Proceedings of the Korean Information Science Society Conference / 2006.10a / pp.28-32 / 2006
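  • Recent studies report that cardiovascular disease, including heart disease, is a leading cause of disease-related death in the United States and worldwide, regardless of sex. This paper addresses a clinical decision support system for diagnosing it more efficiently. The developed system uses patient chip data generated with an aptamer chip, a biochip that can measure the relative amounts of specific proteins in serum, and analyzes these data with four machine learning algorithms (Support Vector Machine, Neural Network, Decision Tree, and Bayesian Network) to predict the disease level and provide supporting information for diagnosis. The paper presents the disease-level classification performance of the initial system, measured on 3K aptamer chip data consisting of 135 samples, and discusses the elements needed to build a more useful clinical decision support system.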

A Study on Auction Mechanism for DMZ Conservation using the South-North Korean Economic Development Projects (남북경제협력에 따른 개발이익 경매와 DMZ 보전기금 확보)

  • Park, Hojeong;Kim, Joonsoon;Kim, Hyunhee
    • Environmental and Resource Economics Review / v.28 no.1 / pp.39-59 / 2019
  • The Korean Demilitarized Zone (DMZ) has a rich ecosystem because all human activities in the DMZ have been prohibited for over half a century. This ecosystem should be conserved even after the reunification of Korea, so a conservation plan should be established before reunification rather than after it. Conserving the DMZ requires a considerable budget for ecological resource management, restoration, and research. The objective of this paper is to analyze a fund-raising measure for DMZ conservation based on an economic incentive mechanism in which multiple developers participate in an auction for the right to develop North Korean regions, hold private information about their sunk costs, and pay part of their profits into the fund. First, we analyze a real option model to determine the optimal investment time. Second, we construct an auction, based on Bayesian Nash equilibrium, in which bidders do not misrepresent their private information.

Diabetes prediction mechanism using machine learning model based on patient IQR outlier and correlation coefficient (환자 IQR 이상치와 상관계수 기반의 머신러닝 모델을 이용한 당뇨병 예측 메커니즘)

  • Jung, Juho;Lee, Naeun;Kim, Sumin;Seo, Gaeun;Oh, Hayoung
    • Journal of the Korea Institute of Information and Communication Engineering / v.25 no.10 / pp.1296-1301 / 2021
  • With the recent worldwide increase in diabetes incidence, research has been conducted on predicting diabetes with various machine learning and deep learning technologies. In this work, we present a model for predicting diabetes using machine learning techniques and data from Frankfurt Hospital, Germany. We handle outliers using the interquartile range (IQR), apply Pearson correlation, and compare the diabetes prediction performance of Decision Tree, Random Forest, KNN (k-nearest neighbor), SVM (support vector machine), and Bayesian Network models and of the ensemble techniques XGBoost, Voting, and Stacking. XGBoost showed the best performance across the various scenarios, with 97% accuracy. This study is therefore meaningful in that the model can be used to accurately predict and prevent diabetes, which is prevalent in modern society.
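The two preprocessing steps named in this abstract, IQR-based outlier handling and Pearson correlation, can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the fence multiplier `k = 1.5` (the common Tukey rule), the choice to drop rather than cap outliers, and the sample values are all assumptions that the abstract does not specify.

```python
import math
import statistics


def iqr_bounds(values, k=1.5):
    """Return (lower, upper) fences: Q1 - k*IQR and Q3 + k*IQR.

    k = 1.5 is an assumed default; the abstract does not give one.
    """
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def drop_outliers(values, k=1.5):
    """Keep only the values that fall inside the IQR fences."""
    lower, upper = iqr_bounds(values, k)
    return [v for v in values if lower <= v <= upper]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


# Hypothetical measurements with one extreme value.
sample = [1, 2, 3, 4, 5, 6, 7, 8, 9, 100]
cleaned = drop_outliers(sample)
```

In a feature-selection setting, `pearson` would be computed between each feature column and the outcome column, and weakly correlated features could then be set aside.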

Ensemble data assimilation using WRF-Hydro and DART (WRF-Hydro와 DART를 이용한 분포형 수문모형의 자료동화)

  • Noh, Seong Jin;Choi, Hyeonjin;Kim, Bomi;Lee, Garim;Lee, Songhee
    • Proceedings of the Korea Water Resources Association Conference / 2021.06a / pp.392-392 / 2021
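  • Data assimilation is a method grounded in Bayesian filtering theory that uses information from both observations and a prediction model to update the model's state variables or model parameters in real time, and techniques that use it to improve the accuracy of hydrologic simulation have been advancing rapidly. This study examines, through regional-scale simulation, the applicability to a distributed hydrologic model of along-the-stream localization and inflation, two detailed methods for improving the accuracy of ensemble data assimilation. WRF-Hydro (Weather Research and Forecasting Model Hydrological Modeling System) and DART (Data Assimilation Research Testbed) are used as the distributed hydrologic model and the data assimilation framework, respectively. WRF-Hydro is the core engine of the National Water Model, a hydrologic modeling framework for the continental United States (CONUS), and DART is a general-purpose data assimilation tool developed at the U.S. National Center for Atmospheric Research (NCAR). The study analyzes how along-the-stream localization and inflation, techniques developed for data assimilation in surface-water hydrologic models, affect channel routing. Along-the-stream localization considers the hydrologic connectivity of channels in addition to spatial proximity, and it can reduce excessive assimilation in channels with relatively low hydrologic correlation. Inflation increases the diversity of the ensemble and can be applied before or after the Kalman filter update to further improve the accuracy of ensemble predictions. The paper focuses on the challenges that remain when ensemble data assimilation is applied to surface-water hydrologic simulation and on the methods that can be applied.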
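The idea of inflation described above can be illustrated with the generic multiplicative form, in which each ensemble member is pushed away from the ensemble mean by a fixed factor. This is only a sketch under assumptions: the function names, the scalar state, and the constant factor are illustrative, and DART's own inflation options are not reproduced here.

```python
def ensemble_mean(ensemble):
    """Arithmetic mean of a scalar ensemble."""
    return sum(ensemble) / len(ensemble)


def ensemble_variance(ensemble):
    """Unbiased sample variance of a scalar ensemble."""
    mean = ensemble_mean(ensemble)
    return sum((x - mean) ** 2 for x in ensemble) / (len(ensemble) - 1)


def inflate(ensemble, factor):
    """Multiplicative inflation: x_i' = mean + factor * (x_i - mean).

    The mean is unchanged and the variance scales by factor**2,
    so a factor above 1 widens the ensemble spread.
    """
    mean = ensemble_mean(ensemble)
    return [mean + factor * (x - mean) for x in ensemble]


# Hypothetical streamflow ensemble for one channel reach.
prior = [1.0, 2.0, 3.0]
inflated = inflate(prior, factor=1.5)
```

Widening the spread before the Kalman update gives observations more weight, which is one common reason to inflate an ensemble that has become overconfident.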

Machine Learning-based Data Analysis for Designing High-strength Nb-based Superalloys (고강도 Nb기 초내열 합금 설계를 위한 기계학습 기반 데이터 분석)

  • Eunho Ma;Suwon Park;Hyunjoo Choi;Byoungchul Hwang;Jongmin Byun
    • Journal of Powder Materials / v.30 no.3 / pp.217-222 / 2023
  • Machine learning-based data analysis approaches have been employed to overcome the limitations in accurately analyzing data and to predict the results of the design of Nb-based superalloys. In this study, a database containing the composition of the alloying elements and their room-temperature tensile strengths was prepared based on a previous study. After computing the correlation between the tensile strength at room temperature and the composition, a material science analysis was conducted on the elements with high correlation coefficients. These alloying elements were found to have a significant effect on the variation in the tensile strength of Nb-based alloys at room temperature. Through this process, a model was derived to predict the properties using four machine learning algorithms. The Bayesian ridge regression algorithm proved to be the optimal model when Y, Sc, W, Cr, Mo, Sn, and Ti were used as input features. This study demonstrates the successful application of machine learning techniques to effectively analyze data and predict outcomes, thereby providing valuable insights into the design of Nb-based superalloys.

Study on Genetic Evaluation using Genomic Information in Animal Breeding - Simulation Study for Estimation of Marker Effects (가축 유전체정보 활용 종축 유전능력 평가 연구 - 표지인자 효과 추정 모의실험)

  • Cho, Chung-Il;Lee, Deuk-Hwan
    • Journal of Animal Science and Technology / v.53 no.1 / pp.1-6 / 2011
  • This simulation study investigated the accuracy of breeding values estimated from genomic information (GEBV) within a Bayesian framework. Genomic information in the form of single nucleotide polymorphisms (SNPs) on a chromosome 100 cM long was simulated with different marker distances (0.1 cM, 0.5 cM), heritabilities (0.1, 0.5), and half-sib family sizes (20 head, 4 head). In generating the simulated population, we assumed that the number of quantitative trait loci (QTL) was equal to the number of markers with no effect. Markers were placed at even intervals and QTL at scattered positions. Accuracy, expressed as the correlation between true and estimated breeding values, was compared across marker distances, heritabilities, and family sizes. For animals having only genomic information, the accuracies were 0.87 and 0.81 at marker distances of 0.1 cM and 0.5 cM, respectively. Accuracy was also influenced by heritability (0.87 at $h^2$ = 0.10, 0.94 at $h^2$ = 0.50). By half-sib family size, the accuracies were 0.87 and 0.84 for family sizes of 20 and 4, respectively, so accuracy was higher with larger half-sib families. Based on these results, we conclude that the amount of marker information, heritability, and family size influence the accuracy of estimated breeding values in genomic selection for animal breeding.

Genetic Variation of Korean Fir Sub-Populations in Mt. Jiri for the Restoration of Genetic Diversity (유전다양성 복원을 위한 지리산 구상나무 아집단의 유전변이)

  • Ahn, Ji Young;Lim, Hyo-In;Ha, Hyun-Woo;Han, Jingyu;Han, Sim-Hee
    • Journal of Korean Society of Forest Science / v.106 no.4 / pp.417-423 / 2017
  • To provide an ecological restoration strategy that considers the genetic diversity of Abies koreana in Mt. Jiri, the genetic diversity of, and the genetic differentiation among, the Banyabong, Byeoksoryeong, and Cheonwangbong sub-populations were investigated. The average number of alleles (A) was 7.8, the average number of effective alleles ($A_e$) was 4.9, the observed heterozygosity ($H_o$) was 0.578, and the expected heterozygosity ($H_e$) was 0.672. The level of genetic diversity within sub-populations ($H_e=0.672$) was lower than at both the population level ($H_e=0.778$) and the species level ($H_e=0.759$). However, the level of genetic diversity was high compared with that of the genus Abies. Genetic differentiation was 0.014 from F-statistics ($F_{ST}$) and 0.004 from AMOVA (${\Phi}_{ST}$). Bayesian clustering showed almost no genetic differentiation among the sub-populations in Mt. Jiri. Therefore, if seeds are sampled sufficiently by selecting the parameters from the three sub-populations, genetically appropriate material for ecological restoration could be obtained.
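The diversity statistics reported above have standard single-locus definitions: $H_e = 1 - \sum_i p_i^2$ and $A_e = 1 / \sum_i p_i^2$ for allele frequencies $p_i$, with $H_o$ the fraction of heterozygous individuals. The sketch below applies them to hypothetical data. The paper's figures are averages over loci and its exact estimators (for example, any small-sample correction) are not stated in the abstract, so this is an assumption-laden illustration only.

```python
def expected_heterozygosity(freqs):
    """H_e = 1 - sum(p_i^2) for the allele frequencies at one locus."""
    return 1.0 - sum(p * p for p in freqs)


def effective_alleles(freqs):
    """A_e = 1 / sum(p_i^2): allele count weighted by evenness."""
    return 1.0 / sum(p * p for p in freqs)


def observed_heterozygosity(genotypes):
    """H_o: fraction of individuals carrying two different alleles."""
    heterozygous = sum(1 for a, b in genotypes if a != b)
    return heterozygous / len(genotypes)


# Hypothetical locus with two equally frequent alleles.
freqs = [0.5, 0.5]
# Hypothetical diploid genotypes for four trees at that locus.
genotypes = [("A", "B"), ("A", "A"), ("B", "B"), ("A", "B")]

h_e = expected_heterozygosity(freqs)
a_e = effective_alleles(freqs)
h_o = observed_heterozygosity(genotypes)
```

An $H_o$ below $H_e$, as in the reported 0.578 versus 0.672, means fewer heterozygotes were observed than the allele frequencies alone would predict.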

Fault Localization for Self-Managing Based on Bayesian Network (베이지안 네트워크 기반에 자가관리를 위한 결함 지역화)

  • Piao, Shun-Shan;Park, Jeong-Min;Lee, Eun-Seok
    • The KIPS Transactions:PartB / v.15B no.2 / pp.137-146 / 2008
  • Fault localization plays a significant role in large distributed systems because it can automatically identify the root cause of observed faults, supporting self-management, which remains an open topic in managing and controlling complex distributed systems to improve system reliability. Although many artificial intelligence techniques have been introduced to support fault localization in recent research, especially in increasingly complex ubiquitous environments, the functions they provide, such as diagnosis and prediction, are limited. In this paper, we propose fault localization for self-managing in performance evaluation, which improves system reliability by learning from and analyzing real-time streams of system performance events. We use probabilistic reasoning based on Bayes' rule to provide an effective mechanism for managing and evaluating system performance parameters automatically. Moreover, because of the large number of factors considered in diverse and complex fault reasoning domains, we develop an efficient method that extracts the parameters most strongly related to the observed problems and ranks them. The resulting node ordering lists are used in network modeling, which improves learning efficiency. The approach enables us to diagnose the most probable causal factor responsible for the underlying performance problems and to predict the system situation so that potential abnormalities can be avoided through post-treatment or pretreatment, respectively. An experimental application to system performance analysis and various estimates of efficiency and accuracy show that the proposed approach is promising in the performance evaluation domain.
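The diagnostic step described above, finding the most probable cause of an observed problem with Bayes' rule, can be sketched for a single observation as follows. The cause names, prior probabilities, and likelihoods are hypothetical, and the paper's actual Bayesian network structure and parameter-ranking method are not reproduced.

```python
def posterior(priors, likelihoods):
    """Bayes' rule over mutually exclusive candidate causes.

    priors[c]      = P(cause c)
    likelihoods[c] = P(observed symptom | cause c)
    Returns P(cause c | observed symptom) for every cause.
    """
    joint = {c: priors[c] * likelihoods[c] for c in priors}
    evidence = sum(joint.values())  # P(observed symptom)
    return {c: j / evidence for c, j in joint.items()}


def rank_causes(post):
    """Order causes from most to least probable."""
    return sorted(post, key=post.get, reverse=True)


# Hypothetical numbers for a "slow response time" symptom.
priors = {"cpu_saturation": 0.5, "disk_failure": 0.3, "network_congestion": 0.2}
likelihoods = {"cpu_saturation": 0.8, "disk_failure": 0.1, "network_congestion": 0.4}

post = posterior(priors, likelihoods)
ranking = rank_causes(post)
```

A full Bayesian network generalizes this by chaining such conditional probabilities across many dependent variables instead of treating each cause in isolation.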

The development of water circulation model based on quasi-realtime hydrological data for drought monitoring (수문학적 가뭄 모니터링을 위한 실적자료 기반 물순환 모델 개발)

  • Kim, Jin-Young;Kim, Jin-Guk;Kim, Jang-Gyeng;Chun, Gun-il;Kang, Shin-uk;Lee, Jeong-Ju;Nam, Woo-Sung;Kwon, Hyun-Han
    • Journal of Korea Water Resources Association / v.53 no.8 / pp.569-582 / 2020
  • Recently, Korea has faced a change in the pattern of water use due to urbanization, which has made it difficult to understand the rainfall-runoff process and to optimize the allocation of available water resources. From this perspective, spatially downscaled analysis of the water balance is required for the efficient operation of water resources under the National Water Management Plan and the River Basin Water Resource Management Plan. However, the existing water balance analysis does not fully consider water circulation and availability in the basin, so its results provide limited information for decision making. This study aims to develop a novel water circulation analysis model designed to support quasi-real-time assessment of water availability along the river. The proposed water circulation model addresses the problems that appear in the existing water balance analysis. More importantly, the results showed a significant improvement over the existing model, especially in low-flow simulation. The proposed modeling framework is expected to provide primary information for more realistic hydrological drought monitoring and drought countermeasures by supplying streamflow information in quasi-real time through a more accurate natural flow estimation approach with a highly complex network.

Genetic Variation of Pinus densiflora Populations in South Korea Based on ESTP Markers (ESTP 표지를 이용한 국내 소나무 집단의 유전변이)

  • Ahn, Ji Young;Hong, Kyung Nak;Lee, Jei Wan;Hong, Yong Pyo;Kang, Hoduck
    • Korean Journal of Plant Resources / v.28 no.2 / pp.279-289 / 2015
  • Genetic diversity and genetic differentiation of thirteen Pinus densiflora populations in South Korea were estimated using nine ESTP (Expressed Sequence Tag Polymorphism) markers. The number of alleles and the effective number of alleles were 2.2 and 1.8, respectively. The percentage of polymorphic loci (P) was 98.8%. The observed and expected heterozygosities were 0.391 and 0.402, respectively, and eleven populations, all except the Ahngang and Gangneung populations, were in Hardy-Weinberg equilibrium. The level of genetic differentiation (Wright’s FST = 0.057) was higher than that found with isozyme or nSSR markers. Cluster analysis showed no relationship between genetic distance and the geographic distribution of the populations. Likewise, genetic differentiation between populations was not correlated with geographic distance (r = 0.017 and P = 0.344 from the Mantel test). In the FST-outlier analysis to identify loci under selection, six loci were detected at the 99% confidence level by the frequentist method. However, only three loci (sams2+AluⅠ, sams2+RsaⅠ, PtNCS_p14A9+HaeⅢ) were presumed to be outliers by the Bayesian method. The sams2+AluⅠ and sams2+RsaⅠ loci originated from the sams2 gene and appeared to be under balancing selection.