• Title/Summary/Keyword: Probabilistic Data Association

Nonstandard Machine Learning Algorithms for Microarray Data Mining

  • Zhang, Byoung-Tak
    • Proceedings of the Korean Society for Bioinformatics Conference / 2001.10a / pp.165-196 / 2001
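  • DNA chips, or microarrays, are a technology in which a large number of genes or gene fragments (typically thousands to tens of thousands) are immobilized on a chip and DNA hybridization reactions are used to analyze gene expression patterns. This high-throughput technology can answer problems in molecular biology that were previously out of reach, and its potential applications are nearly unlimited, including molecular-level disease diagnosis, drug development, and solutions to environmental pollution. For practical use, the hardware/wetware technology for fabricating DNA chips is not enough; the bioinformatics technology for extracting as much useful and new knowledge as possible from the resulting data is essential. The problem of mining gene expression patterns can be broadly divided into clustering, classification, and dependency analysis, and these techniques are grounded in statistics and in machine learning from artificial intelligence. The methods most commonly used include principal component analysis, hierarchical clustering, k-means, self-organizing maps, decision trees, multilayer perceptron neural networks, and association rules. Beyond these basic machine learning techniques, this seminar introduces the probabilistic graphical model (PGM), a newer learning technique under active study, and reviews research applying it to DNA chip data analysis. A PGM is a machine learning model that combines artificial neural networks, graph theory, and probability theory and is based on the memory and learning mechanisms of the human brain; one of its major differences from other machine learning models is that it is a generative model. That is, once a model has been built it can generate new data, so the model can be validated and new facts inferred from it, which makes it well suited to exploratory analysis aimed at discovering new knowledge, as in biological data mining. In addition, unlike conventional neural network models, a PGM performs probability-based soft inference rather than deterministic decision making, and the learned model is well suited to analyzing causal relationships or dependencies among the relevant factors. As concrete examples of PGMs, the structure and the learning and inference algorithms of Bayesian networks, nonnegative matrix factorization (NMF), and generative topographic mapping (GTM) are introduced, and results are reviewed from applying them to the cancer diagnosis and gene-drug dependency analysis problems used in CAMDA-2000 and CAMDA-2001, the evaluation competitions for DNA chip data analysis.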

Drought Frequency Analysis Using Hidden Markov Chain Model and Bivariate Copula Function (Hidden Markov Chain 모형과 이변량 코플라함수를 이용한 가뭄빈도분석)

  • Chun, Si-Young;Kim, Yong-Tak;Kwon, Hyun-Han
    • Journal of Korea Water Resources Association / v.48 no.12 / pp.969-979 / 2015
  • This study applied a probabilistic hidden Markov model (HMM) to better characterize drought patterns. In addition, a copula-based bivariate drought frequency analysis was employed to investigate the return period of the drought conditions in 2015. Based on 40 years of data, the results showed that the western Kangwon area was generally more vulnerable to drought risk than the eastern Kangwon area. The Imjin River watershed, including the Cheorwon area, was the most vulnerable to severe drought events. Four stations in the Han River watershed showed a joint return period exceeding 1,000 years for the drought duration and severity in 2014-2015. In particular, the current drought in the Northern Han River and Imjin River watersheds is the most severe, exceeding a 100-year return period.
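
A joint return period of drought duration and severity, as reported above, is commonly derived from a copula. The following is a minimal sketch of that calculation, not the authors' implementation. It assumes a Gumbel-Hougaard copula (the abstract does not name the copula family), and the function names, the parameter `theta`, and the mean interarrival time are all hypothetical.

```python
import math

def gumbel_copula(u, v, theta):
    """Gumbel-Hougaard copula C(u, v) for u, v in (0, 1).

    theta >= 1 controls dependence; theta = 1 gives independence (C = u * v).
    The copula family is an assumption, not taken from the abstract.
    """
    s = (-math.log(u)) ** theta + (-math.log(v)) ** theta
    return math.exp(-(s ** (1.0 / theta)))

def joint_return_period_and(u, v, theta, mean_interarrival_years):
    """Return period of events exceeding BOTH a duration and a severity level.

    u = F_D(d) and v = F_S(s) are the marginal non-exceedance probabilities.
    P(D > d and S > s) = 1 - u - v + C(u, v).
    """
    p_exceed_both = 1.0 - u - v + gumbel_copula(u, v, theta)
    return mean_interarrival_years / p_exceed_both

# Hypothetical inputs: 90th-percentile duration and severity,
# one drought event per year on average.
independent = joint_return_period_and(0.9, 0.9, 1.0, 1.0)
dependent = joint_return_period_and(0.9, 0.9, 2.0, 1.0)
print(independent, dependent)
```

With positive dependence (`theta > 1`), long and severe droughts tend to occur together, so the joint return period is shorter than under independence.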

Return Period Estimation of Droughts Using Drought Variables from Standardized Precipitation Index (표준강수지수 시계열의 가뭄특성치를 이용한 가뭄 재현기간 산정)

  • Kwak, Jae Won;Lee, Sung Dae;Kim, Yon Soo;Kim, Hung Soo
    • Journal of Korea Water Resources Association / v.46 no.8 / pp.795-805 / 2013
  • Drought is one of the most severe natural disasters and can profoundly affect society and ecosystems. It is also an important variable for water resources planning and management. This study therefore analyzes drought to understand its distribution and trend. The Standardized Precipitation Index (SPI) is estimated using precipitation data from 55 rain gauge stations in South Korea, and SPI-based drought variables such as drought duration and drought severity are defined. Drought occurrence and joint probabilistic analysis of the SPI-based drought variables were performed with run theory and copula functions. The return period and spatial distribution of droughts in South Korea were then estimated. The results show that Gongju and Chungju in Chungcheong-do and Wonju, Inje, Jeongseon, and Taebeak in Gangwon-do are vulnerable to droughts.
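
Run theory, mentioned above, extracts drought events from an index series as runs of consecutive values below a truncation level. The sketch below illustrates the idea under stated assumptions: the threshold of 0.0 and the function name are hypothetical, since the abstract does not give the truncation level the authors used.

```python
def drought_events(spi, threshold=0.0):
    """Split an SPI series into drought events using run theory.

    A drought event is a maximal run of consecutive values below `threshold`
    (an assumed truncation level). Returns a list of (duration, severity)
    tuples: duration is the run length in time steps, and severity is the
    accumulated deficit below the threshold, reported as a positive number.
    """
    events = []
    duration, severity = 0, 0.0
    for value in spi:
        if value < threshold:
            duration += 1
            severity += threshold - value
        elif duration > 0:
            # The run has ended; record it and reset the accumulators.
            events.append((duration, severity))
            duration, severity = 0, 0.0
    if duration > 0:  # the series ended while still in drought
        events.append((duration, severity))
    return events

# Hypothetical monthly SPI values.
print(drought_events([0.5, -1.0, -0.5, 0.2, -2.0]))
```

The resulting duration and severity pairs are the drought variables that a bivariate copula analysis would then take as input.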

Low-flow simulation and forecasting for efficient water management: case-study of the Seolmacheon Catchment, Korea

  • Birhanu, Dereje;Kim, Hyeon Jun;Jang, Cheol Hee;ParkYu, Sanghyun
    • Proceedings of the Korea Water Resources Association Conference / 2015.05a / pp.243-243 / 2015
  • Low-flow simulation and forecasting is an emerging issue in hydrology due to the increasing demand for water in dry periods. Although it remains difficult for hydrologists, better simulation and earlier prediction of low flows are crucial for efficient water management. The UN has never stated that South Korea is in a water shortage. However, a recent study by MOLIT indicates that Korea will probably face a water deficit of 4.3 billion $m^3$ in 2020 due to several factors, including land cover and climate change impacts. The two main situations that generate low-flow events are an extended dry period (summer low flow) and an extended period of low temperature (winter low flow). This situation demands that hydrologists concentrate more on low-flow hydrology. Korea's annual average precipitation is about 127.6 billion $m^3$, of which runoff into rivers accounts for 57% and losses for 43%; of the 57% runoff, discharge to the ocean accounts for 31% and total water use for about 26%. Saving 6% of the runoff would therefore solve the water shortage problem mentioned above. The main objective of this study is to present a hydrological modelling approach for low-flow simulation and forecasting, using a model that can represent the real hydrological behavior of the catchment, and to address water management for both summer and winter low flows. Two lumped hydrological models (GR4J and CAT) will be applied to calibrate and simulate the streamflow. The models will be applied to the Seolmacheon catchment using daily streamflow data at Jeonjeokbigyo station, and the Nash-Sutcliffe efficiencies will be calculated to check model performance. The expected results will be summarized in different ways so as to provide decision makers with probabilistic forecasts and the associated risks of low flows. Finally, the results will be presented and the capacity of the models to provide useful information for efficient water management practice will be discussed.
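
The Nash-Sutcliffe efficiency mentioned above compares the simulation error with the variance of the observations. Below is a minimal sketch of the standard formula, NSE = 1 - Σ(obs - sim)² / Σ(obs - mean(obs))². The function name and sample values are hypothetical and not taken from the study.

```python
def nash_sutcliffe_efficiency(observed, simulated):
    """Nash-Sutcliffe efficiency of a simulated streamflow series.

    1.0 means a perfect fit; 0.0 means the simulation is no better than
    always predicting the mean of the observations; negative values are worse.
    """
    if len(observed) != len(simulated) or len(observed) == 0:
        raise ValueError("series must be non-empty and of equal length")
    mean_obs = sum(observed) / len(observed)
    squared_error = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    variance_sum = sum((o - mean_obs) ** 2 for o in observed)
    if variance_sum == 0:
        raise ValueError("observed series has zero variance")
    return 1.0 - squared_error / variance_sum

# Hypothetical daily streamflow values.
obs = [1.0, 2.0, 3.0]
print(nash_sutcliffe_efficiency(obs, obs))              # perfect simulation
print(nash_sutcliffe_efficiency(obs, [2.0, 2.0, 2.0]))  # mean-only simulation
```

Because the squared error is dominated by high flows, low-flow studies often compute the efficiency on transformed flows (for example logarithms) as well. Whether this study does so is not stated in the abstract.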

A study on prediction method for flood risk using LENS and flood risk matrix (국지 앙상블자료와 홍수위험매트릭스를 이용한 홍수위험도 예측 방법 연구)

  • Choi, Cheonkyu;Kim, Kyungtak;Choi, Yunseok
    • Journal of Korea Water Resources Association / v.55 no.9 / pp.657-668 / 2022
  • When localized heavy rain occurs while river flow is already elevated, both flow and rainfall cause riverside flood damage. Because the degree of damage varies with the level of social and economic impact, sufficient forecast lead time for flood response must be secured in areas with high population and asset density. In this study, the authors established a flood risk matrix using ensemble rainfall-runoff modeling and evaluated its applicability, with the aim of reducing damage by securing the time required for flood response. The flood risk matrix represents the flood damage impact level (X-axis) using flood damage data and the likelihood of flood occurrence (Y-axis) using the results of ensemble rainfall-runoff modeling driven by LENS rainfall data, together with probabilistic forecasting. The authors also introduced a method for determining the impact level of flood damage using historical flood damage data and quantitative flood damage assessment methods. The matrix was compared with existing flood warning data and the damage situation at flood warning points in the Taehwa River Basin and the Hyeongsan River Basin in the Nakdong River Region. The analysis showed that the time and degree of flood risk could be predicted up to three days in advance. The method is therefore expected to support damage reduction activities by securing lead time for flood response.

Utility of Climate Model Information For Water Resources Management in Korea

  • Jeong, Chang-Sam
    • Journal of the Korean Society of Hazard Mitigation / v.8 no.6 / pp.37-45 / 2008
  • Water resources conditions in Korea are expected to change along with worldwide climate change. To deal with this problem and find ways of minimizing the effects of future climate change, this study examines the usefulness of climate model simulation information. The objective is to assess the applicability of GCM (General Circulation Model) information for Korean water resources management through uncertainty analysis. The methods are based on probabilistic measures of the effectiveness of GCM simulations of an indicator variable for discriminating high versus low regional observations of a target variable. The formulation uses the significance probability of the Kolmogorov-Smirnov test for detecting differences between two variables. An estimator that accounts for climate model simulation and spatial association between the GCM data and observed data is used. Atmospheric general circulation model (AGCM) simulations by ECMWF (European Centre for Medium-Range Weather Forecasts) with a resolution of $2^{\circ}{\times}2^{\circ}$, and by METRI (Meteorological Research Institute, Korea) with resolutions of $2^{\circ}{\times}2^{\circ}$ and $4^{\circ}{\times}5^{\circ}$, were used for indicator variables, while observed mean areal precipitation (MAP) data, discharge data, and mean areal temperature data for the seven major river basins in Korea were used for target variables. The results show that GCM simulations are useful in discriminating high from low observed precipitation, discharge, and temperature values. Temperature in particular can be useful regardless of model and season.

Development of A Multi-sensor Fusion-based Traffic Information Acquisition System with Robust to Environmental Changes using Mono Camera, Radar and Infrared Range Finder (환경변화에 강인한 단안카메라 레이더 적외선거리계 센서 융합 기반 교통정보 수집 시스템 개발)

  • Byun, Ki-hoon;Kim, Se-jin;Kwon, Jang-woo
    • The Journal of The Korea Institute of Intelligent Transport Systems / v.16 no.2 / pp.36-54 / 2017
  • The purpose of this paper is to develop a multi-sensor fusion-based traffic information acquisition system that is robust to environmental changes. It combines the characteristics of each sensor and is more robust to environmental changes than a video detector. Moreover, it is not affected by the time of day or night, and has a lower maintenance cost than an inductive-loop traffic detector. This is accomplished by synthesizing object tracking information from a radar, vehicle classification information from a video detector, and reliable object detections from an infrared range finder. To demonstrate the effectiveness of the proposed system, experiments were conducted for 6 hours over 5 days, in the daytime and early evening, on a pedestrian-accessible road. According to the experimental results, the system achieves 88.7% classification accuracy and a 95.5% vehicle detection rate. If the parameters of the system are optimized to adapt to changes in the experimental environment, it is expected to contribute to the advancement of ITS.

A Study on the Model for Determining the Deceptive Status of Attackers using Markov Chain (Markov Chain을 이용한 기만환경 칩입 공격자의 기만 여부 예측 모델에 대한 연구)

  • Sunmo Yoo;Sungmo Wi;Jonghwa Han;Yonghyoun Kim;Jungsik Cho
    • Convergence Security Journal / v.23 no.2 / pp.37-45 / 2023
  • Cyber deception technology plays a crucial role in monitoring attacker activities and detecting new types of attacks. However, as deception technology has advanced, the development of anti-honeypot technology has allowed attackers who recognize the deceptive environment either to cease their activities or to exploit the environment in reverse. Currently, deception technology is unable to identify or respond to such situations. In this study, we propose a predictive model that uses Markov chain analysis to determine whether attackers who infiltrate a deceptive environment have identified it as deceptive. The proposed model for determining deception status is the first attempt of its kind and is expected to overcome the limitations of existing deception-based attacker analysis, which does not consider attackers who identify the deceptive environment. The classification model proposed in this study demonstrated a high accuracy of 97.5% in identifying and categorizing attackers operating in deceptive environments. By predicting whether an attacker has identified the deceptive environment, the model is expected to provide refined data for the many studies that analyze intrusions into deceptive environments.

An Evaluation Method of Water Supply Reliability for a Dam by Firm Yield Analysis (보장 공급량 분석에 의한 댐의 물 공급 안전도 평가기법 연구)

  • Lee, Sang-Ho;Kang, Tae-Uk
    • Journal of Korea Water Resources Association / v.39 no.5 s.166 / pp.467-478 / 2006
  • Water supply reliability for a dam is defined using the concept of probabilistic reliability. A procedure for evaluating water supply reliability is presented, based on an analysis of long-term firm yield reliability. The water supply reliabilities of Soyanggang Dam and Chungju Dam were evaluated. To evaluate the water supply reliability, forty-one sets of monthly runoff series were generated by SAMS-2000. The HEC-5 model was applied to the reservoir simulation to compute the firm yield from each monthly time series. The water supply reliability of the firm yield from the design runoff data of Soyanggang Dam is evaluated at 80.5% for a planning period of 50 years. The water supply reliability of the firm yield from the historic runoff after dam construction is evaluated at 53.7%. The firm yield from the design runoff is 1.491 billion $m^3$/yr and the firm yield from the historic runoff is 1.585 billion $m^3$/yr. If the target draft is 1.585 billion $m^3$/yr, an additional 0.094 billion $m^3$ of water could be supplied every year, with the associated risk. By a similar procedure, the firm yield from the design runoff of Chungju Dam is evaluated at 3.377 billion $m^3$/yr and the firm yield from the historic runoff at 2.960 billion $m^3$/yr. If the target draft is 3.377 billion $m^3$/yr, water supply insufficiency occurs for all the generated time series. This may result from overestimation of the spring runoff used for design. The procedure shown can serve as a more objective method for evaluating the water supply reliability of a dam.
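
One way to read the reliability figures above is as the share of generated runoff series whose firm yield meets a target draft. The sketch below works through that reading; it is an assumption, not the authors' stated definition. Under it, the reported 80.5% and 53.7% are consistent with 33 and 22 of the 41 series, which is an inference from the numbers rather than something the abstract states. The function name and the sample yields are hypothetical.

```python
def supply_reliability(firm_yields, target_draft):
    """Fraction of runoff series whose firm yield meets the target draft.

    `firm_yields` holds one firm yield per generated series (same unit as
    `target_draft`, e.g. billion m^3/yr). This definition is an assumption.
    """
    if not firm_yields:
        raise ValueError("at least one firm yield is required")
    met = sum(1 for y in firm_yields if y >= target_draft)
    return met / len(firm_yields)

# Hypothetical firm yields for 41 generated series: 33 meet the target, 8 fall short.
yields = [1.6] * 33 + [1.4] * 8
reliability = supply_reliability(yields, target_draft=1.491)
print(round(100 * reliability, 1))  # percentage of series meeting the draft

# Additional annual supply when the target moves from the design-runoff firm
# yield to the historic-runoff firm yield (values from the abstract).
print(round(1.585 - 1.491, 3))
```

Raising the target draft can only lower this fraction, which matches the pattern in the abstract: the higher draft of 1.585 billion m^3/yr carries the lower reliability.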

Understanding of Generative Artificial Intelligence Based on Textual Data and Discussion for Its Application in Science Education (텍스트 기반 생성형 인공지능의 이해와 과학교육에서의 활용에 대한 논의)

  • Hunkoog Jho
    • Journal of The Korean Association For Science Education / v.43 no.3 / pp.307-319 / 2023
  • This study aims to explain the key concepts and principles of text-based generative artificial intelligence (AI) that has been receiving increasing interest and utilization, focusing on its application in science education. It also highlights the potential and limitations of utilizing generative AI in science education, providing insights for its implementation and research aspects. Recent advancements in generative AI, predominantly based on transformer models consisting of encoders and decoders, have shown remarkable progress through optimization of reinforcement learning and reward models using human feedback, as well as understanding context. Particularly, it can perform various functions such as writing, summarizing, keyword extraction, evaluation, and feedback based on the ability to understand various user questions and intents. It also offers practical utility in diagnosing learners and structuring educational content based on provided examples by educators. However, it is necessary to examine the concerns regarding the limitations of generative AI, including the potential for conveying inaccurate facts or knowledge, bias resulting from overconfidence, and uncertainties regarding its impact on user attitudes or emotions. Moreover, the responses provided by generative AI are probabilistic based on response data from many individuals, which raises concerns about limiting insightful and innovative thinking that may offer different perspectives or ideas. In light of these considerations, this study provides practical suggestions for the positive utilization of AI in science education.