• Title/Abstract/Keywords: Statistical Model Validation

Search results: 261 items (processing time 0.027 s)

Development of statistical forecast model for PM10 concentration over Seoul

  • 손건태;김다홍
    • Journal of the Korean Data and Information Science Society / Vol. 26, No. 2 / pp. 289-299 / 2015
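  • This study aims to develop a quantitative forecast model for PM10 concentration. Three kinds of data (meteorological observations, Chinese observations from the Global Telecommunication System, and output from a numerical air-quality chemistry model) were used as predictors. So that the model can be applied easily to a daily short-range forecasting system, hourly data were converted to daily data and lag transformations were applied. Predictors were selected through correlation analysis and multicollinearity diagnostics, and two models (a multiple regression model and a threshold regression model) were each fitted. Model validation was carried out to check model stability. The models were then re-estimated using the full data set and compared using scatter plots and time-series plots of predicted versus observed values, RMSE, and forecast skill measures. The threshold regression model showed somewhat better predictive performance for high PM10 concentrations.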

Regulatory Oversight of Nuclear Safety Culture and the Validation Study on the Oversight Model Components

  • Choi, Young Sung;Jung, Su Jin;Chung, Yun Hyung
    • 대한인간공학회지 / Vol. 35, No. 4 / pp. 263-275 / 2016
  • Objective: This paper introduces regulatory oversight approaches and the issues to consider when developing a safety culture oversight model in the nuclear field. The common understanding of regulatory oversight and the current practices of the international community are briefly reviewed. The nuclear safety culture oversight model of Korea is explained, focusing on the development of the safety culture definition and components and their basic meanings. Oversight components are identified to represent the multiple human and organizational elements that can affect and reinforce elements of the defense-in-depth system for nuclear safety. The results of a validation study on the safety culture components are also briefly introduced. Finally, the results of applying the model are presented to show its effectiveness and feasibility. Background: Oversight of nuclear licensees' safety culture has been an important regulatory issue in the international nuclear safety regulation community. Concurrent with the significant events that started to occur in the early 2000s and that had implications for the safety culture of operating organizations, it has been natural for regulators to pay attention to appropriate methods, and even philosophy, for intervening in a licensee's safety culture. Although safety culture has been emphasized for the last 30 years as a prerequisite for a high level of nuclear safety, it has not been within regulatory scope, and it poses a unique dilemma between external oversight and the voluntary nature of culture. Safety culture oversight is a new regulatory challenge that needs to be approached with consideration of the uncontrollable aspects of cultural change and the impacts on the licensee's safety culture. Although researchers and industry practitioners still struggle with measuring, evaluating, managing, and changing safety culture, it was recognized that efforts to observe and influence licensees' safety culture should not be delayed.
Method: The safety culture components on which regulatory oversight will have to focus are developed by benchmarking the concept of physical barriers and introducing the defense-in-depth philosophy into the organizational system. Accordingly, this paper begins with a review of international regulatory oversight approaches and the issues associated with regulatory oversight of safety culture, followed by the development of the oversight model. The validity of the model was verified by statistical analysis of the results of a survey administered to NPP employees in Korea. The developed safety culture oversight model and components were used in the "safety culture inspection" activities of the Korean regulatory body. Results: The developed safety culture model was confirmed to be valid in terms of content, construct, and criterion validity. Its actual applicability in nuclear operating organizations was verified after a series of pilot "safety culture inspection" activities. Conclusion: Applying the nuclear safety culture oversight model to NPP operating organizations showed promising results for the regulatory tools needed for the organizations to improve their safety culture. Application: The developed oversight model and components might be used in inspection activities and regulatory oversight of the safety culture of NPP operating organizations.

A Study on the Forecasting of Daily Streamflow using the Multilayer Neural Networks Model

  • 김성원
    • 한국수자원학회논문집 / Vol. 33, No. 5 / pp. 537-550 / 2000
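  • In this study, a neural network model is presented for forecasting daily streamflow at the Jindong station on the Nakdong River. Two network structures were configured, CASE 1 (5-5-1) and CASE 2 (5-5-5-1), which differ in the number of hidden layers. Each network was trained with the Fletcher-Reeves conjugate gradient backpropagation algorithm (FR-CGBP) and the scaled conjugate gradient backpropagation algorithm (SCGBP), both of which converge to the global minimum and the training threshold better than the conventional backpropagation algorithm (BP). The data used for training and validation were grouped into wet years, normal years, dry years, wet+normal years, wet+dry years, normal+dry years, and wet+normal+dry years. During training, the optimal connection weights and biases were determined for each data set, and daily streamflow was computed at the same time. Statistical analysis of the forecast errors showed that the training results were good except for the wet+dry-year data. Validation used the connection weights and biases obtained by training CASE 1 with the SCGBP algorithm, and the validation results were found to be as satisfactory as the training results. In addition, a multiple regression model was applied to forecast daily streamflow for comparison with the selected neural network model, and the neural network model was found to give somewhat better results. The neural network model thus offers the advantages of a systematic approach, fewer parameters, and reduced model development time.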


Forecast of Geo-effective Halo CMEs Using SVM

  • 최성환;문용재;박영득
    • 천문학회보 / Vol. 38, No. 1 / pp. 61.1-61.1 / 2013
  • In this study we apply a Support Vector Machine (SVM) to the prediction of geo-effective halo coronal mass ejections (CMEs). The SVM, a machine learning algorithm, is used for classification and regression analysis. We use halo and partial halo CMEs from January 1996 to April 2010 in the SOHO/LASCO CME Catalog for training and prediction. We also use their associated X-ray flare classes to identify front-side halo CMEs (stronger than B1 class), and the Dst index to determine geo-effective halo CMEs (stronger than -50 nT). Combinations of the speed and angular width of the CMEs and their associated X-ray classes are used as input features of the SVM. We search for the best model using cross-validation, varying the kernel functions of the SVM and their parameters. As a result, we obtain the following statistical parameters for the best model, which uses the CME speed and its associated X-ray flare class as input features: Accuracy=0.66, PODy=0.76, PODn=0.49, FAR=0.72, Bias=1.06, CSI=0.59, TSS=0.25. On these statistical parameters, the SVM performs much better than simple classifications based on constant classifiers.
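
The verification scores listed in this abstract are usually derived from a 2x2 contingency table of forecasts against observations. The sketch below uses commonly used textbook definitions, which may differ from the conventions the authors applied. The function name and the counts are hypothetical.

```python
def verification_scores(hits, false_alarms, misses, correct_negatives):
    """Compute common yes/no forecast verification scores.

    hits: event forecast and observed
    false_alarms: event forecast but not observed
    misses: event observed but not forecast
    correct_negatives: event neither forecast nor observed
    """
    a, b, c, d = hits, false_alarms, misses, correct_negatives
    pod_yes = a / (a + c)   # probability of detection for "yes" events
    pod_no = d / (b + d)    # probability of detection for "no" events
    return {
        "accuracy": (a + d) / (a + b + c + d),
        "PODy": pod_yes,
        "PODn": pod_no,
        "FAR": b / (a + b),           # false alarm ratio
        "bias": (a + b) / (a + c),    # forecast "yes" count over observed "yes" count
        "CSI": a / (a + b + c),       # critical success index
        "TSS": pod_yes + pod_no - 1,  # true skill statistic
    }


# Made-up counts, for illustration only.
scores = verification_scores(hits=30, false_alarms=10, misses=10, correct_negatives=50)
```

Under these definitions TSS = PODy + PODn - 1, which agrees with the reported values (0.76 + 0.49 - 1 = 0.25).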


Econometric Estimation of the Climate Change Policy Effect in the U.S. Transportation Sector

  • Choi, Jaesung
    • 한국기후변화학회지 / Vol. 8, No. 1 / pp. 1-10 / 2017
  • Over the past centuries, industrialization in developed and developing countries has contributed to global warming by releasing $CO_2$ emissions into the Earth's atmosphere. In recent years, the transportation sector, which emits one-third of total $CO_2$ emissions in the United States, has adapted by implementing a climate change action plan to reduce $CO_2$ emissions. Having an environmental policy might be an essential factor in mitigating man-made global warming threats, protecting public health, and meeting the coexistent needs of current and future generations; however, to the best of my knowledge, no research has used an appropriate statistical validation process to evaluate the effects of climate change policy on $CO_2$ emission reduction in the U.S. transportation sector in recent years. The empirical findings, obtained with an entity fixed-effects model and valid statistical tests, show positive effects of climate change policy on $CO_2$ emission reduction in a state. With all 49 states joining the climate change action plans, the U.S. transportation sector is expected to reduce its $CO_2$ emissions by 20.2 MMT per year, and over the next 10 years the cumulative $CO_2$ emission reduction is projected to reach 202.3 MMT, which is almost equivalent to the 2012 transportation-sector $CO_2$ emissions of California, the state with the largest $CO_2$ emissions in the nation.

A Study on the Attributes Classification of Agricultural Land Based on Deep Learning Comparison of Accuracy between TIF Image and ECW Image

  • 김지영;위성승
    • 한국농공학회논문집 / Vol. 65, No. 6 / pp. 15-22 / 2023
  • In this study, we conduct a comparative study of deep learning-based classification of agricultural field attributes using Tagged Image File (TIF) and Enhanced Compression Wavelet (ECW) images. The goal is to interpret and classify the attributes of agricultural fields by analyzing the differences between these two image formats. "FarmMap," initiated by the Ministry of Agriculture, Food and Rural Affairs in 2014, serves as the first digital map of agricultural land in South Korea. It comprises attributes such as paddy, field, orchard, agricultural facility, and ginseng cultivation areas. To compare deep learning-based agricultural attribute classification, we consider the location and class information of objects, as well as the attribute information of FarmMap. We use the ResNet-50 instance segmentation model, which is suitable for this task, to conduct simulated experiments. The two image formats are compared in terms of classification accuracy. The experimental results indicate that the accuracy is 90.44% for TIF images and 91.72% for ECW images, so the ECW image model is 1.28 percentage points more accurate. However, statistical validation with Wilcoxon rank-sum tests did not reveal a significant difference in accuracy between the two image formats.

Predictive Bayesian Network Model Using Electronic Patient Records for Prevention of Hospital-Acquired Pressure Ulcers

  • 조인숙;정은자
    • 대한간호학회지 / Vol. 41, No. 3 / pp. 423-431 / 2011
  • Purpose: The study was designed to determine the discriminating ability of a Bayesian network (BN) for predicting risk for pressure ulcers. Methods: The analysis used a retrospective cohort: nursing records representing 21,114 hospital days for 3,348 patients at risk for ulcers who were admitted to the intensive care unit of a tertiary teaching hospital between January 2004 and January 2007. A BN model and two logistic regression (LR) versions, model-I and model-II, were compared, varying the nature, number, and quality of input variables. Classification competence and case coverage of the models were tested and compared using a threefold cross-validation method. Results: Average incidence of ulcers was 6.12%. Of the two LR models, model-I demonstrated better indexes of statistical model fit. The BN model had a sensitivity of 81.95%, a specificity of 75.63%, and positive and negative predictive values of 35.62% and 96.22%, respectively. The area under the receiver operating characteristic curve (AUROC) was 85.01%, implying moderate to good overall performance, which was similar to LR model-I. However, the BN model covered 100% of cases, compared with 15.88% for LR. Conclusion: Discriminating ability of the BN model was found to be acceptable, and case coverage proved to be excellent for clinical use.
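
Predictive values depend on how common the event is among the evaluated cases, as well as on sensitivity and specificity. The sketch below applies Bayes' rule to show that dependence. The function name is hypothetical, and the prevalence of 0.1413 is an assumption back-solved from the reported figures, not a number stated in the abstract.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) from sensitivity, specificity, and prevalence.

    All arguments are proportions between 0 and 1.
    """
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    true_neg = specificity * (1 - prevalence)
    false_neg = (1 - sensitivity) * prevalence
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Sensitivity and specificity are the values reported in the abstract;
# the prevalence is an assumed value chosen for illustration.
ppv, npv = predictive_values(sensitivity=0.8195, specificity=0.7563, prevalence=0.1413)
```

With this assumed prevalence, the computed values come out close to the reported 35.62% and 96.22%. This illustrates why a model with reasonable sensitivity and specificity can still have a modest positive predictive value when the event is uncommon.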

Development and Validation of the Coupled System of Unified Model (UM) and PArameterized FOG (PAFOG)

  • 김원흥;염성수
    • 대기 / Vol. 25, No. 1 / pp. 149-154 / 2015
  • In an attempt to improve fog predictability at Incheon International Airport (IIA), we couple the 3D weather forecasting model currently operational at the Korea Meteorological Administration (regional Unified Model, UM_RE) with a 1D turbulence model (PAFOG). The coupling is done by extracting meteorological data from the 3D model and inserting them into the PAFOG model as initial conditions and external forcing. The initial conditions include surface temperature, 2 m temperature and dew point temperature, geostrophic wind at 850 hPa, and vertical profiles of temperature and dew point temperature. Moisture and temperature advections are included as external forcing and updated every hour. To validate the performance of the coupled system, its simulation results are compared with those of the 3D model alone for the 22 sea fog cases observed over the Yellow Sea. Three statistical indices, i.e., Root Mean Square Error (RMSE), linear correlation coefficient (R), and Critical Success Index (CSI), are examined, and they all indicate that the coupled system performs better than the 3D model alone. These are promising results, but more improvement is required before the coupled system can be used as an operational fog forecasting model, because its RMSE, R, and CSI values are still not good enough for operational fog forecasting.
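
Two of the indices named in this abstract, RMSE and the linear correlation coefficient R, can be computed as in the following minimal sketch. The function names and the sample values are hypothetical and are not taken from the paper.

```python
import math


def rmse(observed, simulated):
    """Root mean square error between observations and simulations."""
    squared = [(o - s) ** 2 for o, s in zip(observed, simulated)]
    return math.sqrt(sum(squared) / len(squared))


def pearson_r(x, y):
    """Pearson linear correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return covariance / (spread_x * spread_y)


# Made-up observed and simulated values, for illustration only.
observed = [1.0, 2.0, 3.0, 4.0]
simulated = [1.5, 2.5, 2.5, 4.5]

error = rmse(observed, simulated)
correlation = pearson_r(observed, simulated)
```

A lower RMSE and an R closer to 1 both indicate closer agreement with observations, so comparing these values between two models run on the same cases gives a simple ranking.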

An Ensemble Approach to Detect Fake News Spreaders on Twitter

  • Sarwar, Muhammad Nabeel;UlAmin, Riaz;Jabeen, Sidra
    • International Journal of Computer Science & Network Security / Vol. 22, No. 5 / pp. 294-302 / 2022
  • Detection of fake news is a complex and challenging task. The generation of fake news is very hard to stop; only steps to control its circulation may help minimize its impact. Humans tend to believe misleading false information. Researchers started with social media sites, categorizing content as real or fake news. False information can mislead an individual or an organization, which may cause major failures and financial loss. Automatic detection of false information circulating on social media is an emerging area of research, and it has been gaining attention from both industry and academia since the 2016 US presidential election. Fake news has negative and severe effects on individuals and organizations, and its hostile effects extend to society. Timely prediction of fake news is important. This research focuses on the detection of fake news spreaders. In this context, six models in total are developed, trained, and tested with the PAN 2020 dataset. Four N-gram based approaches and user statistics-based models are trained with different hyperparameter values. An extensive grid search with cross-validation is applied to each machine learning model. For the N-gram based models, out of numerous machine learning algorithms this research focused on those yielding better results, as assessed by a close reading of state-of-the-art related work in the field. For better accuracy, the authors aimed at developing models using Random Forest, Logistic Regression, SVM, and XGBoost. All four machine learning algorithms were trained with hyperparameters from a cross-validated grid search. The advantages of this research over previous work are the user statistics-based model and the ensemble learning model, which were designed to help classify Twitter users as fake news spreaders or not with the highest reliability. The user statistics model used 17 features, on the basis of which it categorized a Twitter user as malicious. A new dataset based on the predictions of the machine learning models was constructed. Three techniques (simple mean, logistic regression, and random forest) were then applied in combination with the ensemble model. Logistic regression combined in the ensemble model gave the best training and testing results, achieving an accuracy of 72%.
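
The simple-mean variant of the ensemble step can be sketched as below. This is an illustration under stated assumptions, not the authors' implementation: the function name, the 0.5 cutoff, and the probability values are hypothetical, and each base model is assumed to output a probability that a user is a fake news spreader.

```python
def mean_ensemble(probabilities_per_model, cutoff=0.5):
    """Average per-user probabilities across models and threshold the result.

    probabilities_per_model: one list per base model, each holding one
    probability per user, in the same user order.
    """
    n_models = len(probabilities_per_model)
    averaged = [sum(column) / n_models for column in zip(*probabilities_per_model)]
    labels = [1 if p >= cutoff else 0 for p in averaged]
    return averaged, labels


# Made-up outputs of four base models for three users.
base_predictions = [
    [0.9, 0.2, 0.6],
    [0.7, 0.4, 0.3],
    [0.8, 0.1, 0.5],
    [0.6, 0.3, 0.4],
]

averaged, labels = mean_ensemble(base_predictions)
```

The logistic regression and random forest variants mentioned in the abstract would instead train a second-level model on these base predictions, which lets the ensemble weight the base models unequally.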

Prediction of box office using data mining

  • 전성현;손영숙
    • 응용통계연구 / Vol. 29, No. 7 / pp. 1257-1270 / 2016
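  • This study addresses the prediction of total audience size as a measure of a film's box-office success. Predictions are made at four time points (before release, on release day, one week after release, and two weeks after release) using data mining classification techniques such as decision trees, MLP neural networks, multinomial logit models, and support vector machines. The predictors include variables describing intrinsic attributes of a film, such as nationality, rating, release month, release season, director, actors, distributor, audience count, and number of screens, as well as online word-of-mouth variables such as portal ratings, number of raters, number of blog posts, and number of news articles. In 10-fold cross-validation, the neural network model showed high predictive accuracy of over 90% even at the pre-release time point. Adding estimates of the final online word-of-mouth variables as predictors further improved prediction accuracy.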
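
The 10-fold cross-validation mentioned in this abstract partitions the data into ten folds and uses each fold once as the test set. The sketch below shows only the index bookkeeping. The function name and the sample size are hypothetical, and the folds are contiguous, whereas in practice the data would usually be shuffled first.

```python
def k_fold_indices(n_samples, k=10):
    """Yield (train_indices, test_indices) for each of k contiguous folds.

    Fold sizes differ by at most one when n_samples is not divisible by k.
    """
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size


# Made-up sample size, for illustration only.
folds = list(k_fold_indices(25, k=10))
```

Each sample appears in exactly one test fold, so averaging the per-fold accuracy gives an estimate that uses every observation for testing once.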