• Title/Summary/Keyword: Cross Validation

A Method for Extracting Equipment Specifications from Plant Documents and Cross-Validation Approach with Similar Equipment Specifications (플랜트 설비 문서로부터 설비사양 추출 및 유사설비 사양 교차 검증 접근법)

  • Jae Hyun Lee;Seungeon Choi;Hyo Won Suh
    • Journal of Korea Society of Industrial Information Systems
    • /
    • v.29 no.2
    • /
    • pp.55-68
    • /
    • 2024
  • Plant engineering companies create or refer to requirements documents for each related field, such as plant process/equipment/piping/instrumentation, in different engineering departments. The process-related requirements document includes not only a description of the process but also the requirements of the equipment or related facilities that will operate it. Since the authors and reviewers of the requirements documents are different, inconsistencies may occur between equipment or parts design specifications described in different requirements documents. Ensuring consistency in these matters can increase the reliability of the overall plant design information. However, the volume of documents and the scattering of requirements for the same equipment and parts across different documents make it challenging for engineers to trace and manage requirements. This paper proposes a method to analyze requirement sentences and calculate their similarity in order to identify semantically identical sentences. To calculate the similarity of requirement sentences, we propose a named entity recognition method to identify compound words for the parts and properties that are semantically central to the requirements. A method to calculate the similarity of the identified compound words for parts and properties is also proposed. The proposed method is explained using sentences from practical documents, and experimental results are described.
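
The abstract does not say how the similarity of part and property compound words is computed. As a rough illustration only, the sketch below uses token-level Jaccard overlap as a stand-in, not the authors' method. The function names, the `(part, property)` tuple layout, and the weight `w_part` are all assumptions made for this example.

```python
def jaccard(a: set, b: set) -> float:
    """Jaccard overlap of two token sets; defined as 1.0 when both are empty."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def requirement_similarity(req_a, req_b, w_part=0.5):
    """Compare two requirements, each given as a (part, property) pair of
    compound words that an upstream NER step is assumed to have extracted."""
    part_sim = jaccard(set(req_a[0].lower().split()), set(req_b[0].lower().split()))
    prop_sim = jaccard(set(req_a[1].lower().split()), set(req_b[1].lower().split()))
    # Weighted mix of part and property similarity (weight is hypothetical).
    return w_part * part_sim + (1 - w_part) * prop_sim


a = ("feed water pump", "design pressure")
b = ("feed pump", "design pressure")
print(round(requirement_similarity(a, b), 3))
```

Pairs scoring above a chosen threshold could then be flagged as candidates for a consistency check across documents.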

Ultrafast MRI and T1 and T2 Radiomics for Predicting Invasive Components in Ductal Carcinoma in Situ Diagnosed With Percutaneous Needle Biopsy

  • Min Young Kim;Heera Yoen;Hye Ji;Sang Joon Park;Sun Mi Kim;Wonshik Han;Nariya Cho
    • Korean Journal of Radiology
    • /
    • v.24 no.12
    • /
    • pp.1190-1199
    • /
    • 2023
  • Objective: This study aimed to investigate the feasibility of ultrafast magnetic resonance imaging (MRI) and radiomic features derived from breast MRI for predicting the upstaging of ductal carcinoma in situ (DCIS) diagnosed using percutaneous needle biopsy. Materials and Methods: Between August 2018 and June 2020, 95 patients with 98 DCIS lesions who underwent preoperative breast MRI, including an ultrafast sequence, and subsequent surgery were included. Four ultrafast MRI parameters were analyzed: time-to-enhancement, maximum slope (MS), area under the curve for 60 s after enhancement, and time-to-peak enhancement. One hundred and seven radiomic features were extracted for the whole tumor on the first post-contrast T1WI and T2WI using PyRadiomics. Clinicopathological characteristics, ultrafast MRI findings, and radiomic features were compared between the pure DCIS and DCIS with invasion groups. Prediction models, incorporating clinicopathological, ultrafast MRI, and radiomic features, were developed. Receiver operating characteristic curve analysis and area under the curve (AUC) were used to evaluate model performance in distinguishing between the two groups using leave-one-out cross-validation. Results: Thirty-six of the 98 lesions (36.7%) were confirmed to have invasive components after surgery. Compared to the pure DCIS group, the DCIS with invasion group had a higher nuclear grade (P < 0.001), larger mean lesion size (P = 0.038), larger mean MS (P = 0.002), and different radiomic-related characteristics, including a more extensive tumor volume; higher maximum gray-level intensity; coarser, more complex, and heterogeneous texture; and a greater concentration of high gray-level intensity. No significant differences in AUCs were found between the model incorporating nuclear grade and lesion size (0.687) and the models integrating additional ultrafast MRI and radiomic features (0.680-0.732). 
Conclusion: High nuclear grade, larger lesion size, larger MS, and multiple radiomic features were associated with DCIS upstaging. However, the addition of MS and radiomic features to the prediction model did not significantly improve the prediction performance.
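
The evaluation scheme described above (leave-one-out cross-validation scored by AUC) can be sketched in a few lines. This is a minimal illustration under stated assumptions: the model is a toy one-feature class-mean classifier rather than the study's prediction models, the held-out scores are pooled before computing a single AUC, and the data are invented.

```python
from statistics import mean


def auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly;
    ties count as half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def loocv_scores(x, y):
    """Hold out each sample once and score it with a model fit on the rest.
    Assumes every training split still contains both classes."""
    scores = []
    for i in range(len(x)):
        train = [(xj, yj) for j, (xj, yj) in enumerate(zip(x, y)) if j != i]
        mean_pos = mean(xj for xj, yj in train if yj == 1)
        mean_neg = mean(xj for xj, yj in train if yj == 0)
        midpoint = (mean_pos + mean_neg) / 2
        direction = 1 if mean_pos >= mean_neg else -1
        # Signed distance from the midpoint between the two class means.
        scores.append(direction * (x[i] - midpoint))
    return scores


x = [1.0, 2.0, 3.0, 6.0, 7.0, 8.0]  # invented feature values
y = [0, 0, 0, 1, 1, 1]              # 1 = invasive component (illustrative)
print(auc(loocv_scores(x, y), y))
```

With n samples this fits n models, which is affordable for a cohort of about a hundred lesions.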

Non Destructive Fast Determination of Fatty Acid Composition by Near Infrared Reflectance Spectroscopy in Sesame

  • Kang, Churl-Whan;Kim, Dong-Hwi;Lee, Sung-Woo;Kim, Ki-Jong;Cho, Kyu-Chae;Shim, Kang-Bo
    • KOREAN JOURNAL OF CROP SCIENCE
    • /
    • v.51 no.spc1
    • /
    • pp.283-291
    • /
    • 2006
  • A non-destructive, rapid seed analysis technique based on near infrared reflectance spectroscopy (NIRS) was investigated at the National Institute of Crop Science (NICS) for screening sesame varieties with ultra-high oleic (C18:1) and linoleic (C18:2) fatty acid content among genetic resources and pedigree-generation lines from cross and mutation breeding. Of 378 landraces and introduced cultivars, 150 were analysed for fatty acids by NIRS and gas chromatography (GC). By GC, the average content was 9.64% for palmitic acid (C16:0), 4.73% for stearic acid (C18:0), 42.26% for oleic acid, and 43.38% for linoleic acid. The content ranged from 7.29 to 12.27% for palmitic acid, from 2.39 to 8.88% (a span of 6.49%) for stearic acid, from 37.36 to 49.95% (a span of 12.59%, wider than for stearic and palmitic acid) for oleic acid, and from 30.60 to 47.40% (the widest span) for linoleic acid. NIRS spectra covered wavelengths from 400 to 2,500 nm, and the varietal distribution of fatty acids appeared as a regular distribution. By NIRS, the varietal range of oleic acid content, which is valued for food processing and human health, was 14.08% (38.31 to 52.39%), 1.49% wider than by GC. The varietal range of linoleic acid content by NIRS was 16.41% (30.60 to 47.01%), 0.39% narrower than by GC. Varietal differences in oleic and linoleic acid content by NIRS showed a trend similar to those by GC. For NIRS calibration, partial least squares regression (PLSR), one of the multivariate regression (MVR) methods, was applied to the spectra in the 700 to 2,500 nm wavelength range for oleic and linoleic acids. The squared correlation coefficient (RSQ) for oleic acid content was 0.724, that is, 72.4 percent of the sample varieties fell within the 0.570 percent standard error of calibration (SEC), which was considered statistically acceptable for comparing NIRS and GC analysis. The standard error of cross validation (SECV) for oleic acid was 0.725, meaning that the NIRS and GC values for samples of the parent population differed within a standard error of 0.725 percent. The RSQ for linoleic acid content was 0.735, that is, 73.5 percent of the sample varieties fell within the 0.643 percent SEC. The SECV for linoleic acid was 0.711, meaning that the NIRS and GC values for samples of the parent population differed within a standard error of 0.711 percent. Consequently, adopting NIRS analysis instead of GC for oleic and linoleic acids was judged statistically sound, because the majority of samples fell within a negligible SEC and SECV. To strengthen the statistical significance of NIRS analysis, sesame germplasm with a wider range of fatty acid contents should continue to be added in order to increase RSQ and reduce SEC and SECV.
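
To make the difference between SEC and SECV concrete, the sketch below computes both for a toy calibration. Several things here are assumptions rather than the study's procedure: a one-variable least-squares line replaces PLSR on full spectra, cross validation is leave-one-out, and both errors are root mean squares with divisor n (chemometrics packages often apply bias and degrees-of-freedom corrections). The data are invented.

```python
import math


def fit_line(x, y):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    slope = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    return slope, my - slope * mx


def rms(residuals):
    """Root mean square of residuals (divisor n, an assumed convention)."""
    return math.sqrt(sum(r * r for r in residuals) / len(residuals))


def sec_and_secv(x, y):
    """SEC: error of the model on its own calibration samples.
    SECV: error on each sample when it is left out of the fit."""
    slope, intercept = fit_line(x, y)
    sec = rms([yi - (slope * xi + intercept) for xi, yi in zip(x, y)])
    cv_residuals = []
    for i in range(len(x)):
        s, b = fit_line(x[:i] + x[i + 1:], y[:i] + y[i + 1:])
        cv_residuals.append(y[i] - (s * x[i] + b))
    return sec, rms(cv_residuals)


nirs_signal = [1.0, 2.0, 3.0, 4.0, 5.0]     # invented predictor values
gc_reference = [1.1, 1.9, 3.2, 3.8, 5.1]    # invented reference values
print(sec_and_secv(nirs_signal, gc_reference))
```

For least squares, SECV is never smaller than SEC, which matches the oleic acid figures above (0.725 versus 0.570).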

Exploration of the Multiple Structure of Relational Self and Construct Validation among Korean Adults (한국남녀의 관계적 자아의 특성: 다원적 구성요인 탐색 및 타당성 분석)

  • Ji Kyung Kim;Myoung So Kim
    • Korean Journal of Culture and Social Issue
    • /
    • v.9 no.2
    • /
    • pp.41-59
    • /
    • 2003
  • The present study was conducted to (1) explore the perceptions of Korean men and women about what is an important relationship for them and how do each gender group construe relational self, and (2) develop the scale to assess the factors of relational self and verify construct validity of the scale. 40 college students and 60 adults participated in survey and FGI (Focused Group Interview) respectively, and content analysis of their responses yielded 2 categories with 39 characteristics of relational self. The one category was named 'instrumentality' which was important to men and the other was named 'expressivity' which was important to women. The list of 39 items was administered to a nationwide sample of 1503 Korean adults to assess their construal of relational self through the 6-point Likert scale. Principal axis factor analysis showed that the two categories were unidimensional with high reliability. As a result of factor analysis on each category, a total of 9 factors were extracted. Specifically, the instrumentality consisted of factors such as utilitarianism, independence, initiativeness, self-assurance, and competence. And the factors of expressivity were empathy, passiveness, dependency, consideration. The tests of mean difference revealed that men had higher scores in most of the instrumental factors, while women had higher scores in most of the expressive factors. But there was no sex difference in the interdependent self-construal scale(Cross, 2000) which has been frequently used for measuring relational self. This is related to the Korean's collective cultural characteristics, and it was concluded that the relationship with others is very important to both Korean men and women, but the meaning and expectation of the relationship as well as the method for its preservation are different to each sex group. 
In addition, the correlation analyses indicated that the femininity score was positively correlated with expressivity, while the masculinity score was positively correlated with instrumentality. This result implies that the differences in relational self among Korean people are related to the socialization process of each sex, i.e., sex role identity. Finally, limitations of this study and directions for future research are discussed.

Ensemble Learning with Support Vector Machines for Bond Rating (회사채 신용등급 예측을 위한 SVM 앙상블학습)

  • Kim, Myoung-Jong
    • Journal of Intelligence and Information Systems
    • /
    • v.18 no.2
    • /
    • pp.29-45
    • /
    • 2012
  • Bond rating is regarded as an important event for measuring financial risk of companies and for determining the investment returns of investors. As a result, it has been a popular research topic for researchers to predict companies' credit ratings by applying statistical and machine learning techniques. The statistical techniques, including multiple regression, multiple discriminant analysis (MDA), logistic models (LOGIT), and probit analysis, have been traditionally used in bond rating. However, one major drawback is that they rely on strict assumptions. Such strict assumptions include linearity, normality, independence among predictor variables and pre-existing functional forms relating the criterion variables and the predictor variables. Those strict assumptions of traditional statistics have limited their application to the real world. Machine learning techniques also used in bond rating prediction models include decision trees (DT), neural networks (NN), and Support Vector Machine (SVM). In particular, SVM is recognized as a new and promising classification and regression analysis method. SVM learns a separating hyperplane that can maximize the margin between two categories. SVM is simple enough to be analyzed mathematically, and leads to high performance in practical applications. SVM implements the structural risk minimization principle and seeks to minimize an upper bound of the generalization error. In addition, the solution of SVM may be a global optimum and thus, overfitting is unlikely to occur with SVM. Moreover, SVM does not require many data samples for training since it builds prediction models using only a few representative samples near the boundaries, called support vectors. A number of experimental studies have indicated that SVM has been successfully applied in a variety of pattern recognition fields. However, there are three major drawbacks that can be potential causes for degrading SVM's performance. 
First, SVM is originally proposed for solving binary-class classification problems. Methods for combining SVMs for multi-class classification such as One-Against-One, One-Against-All have been proposed, but they do not improve the performance in multi-class classification problem as much as SVM for binary-class classification. Second, approximation algorithms (e.g. decomposition methods, sequential minimal optimization algorithm) could be used for effective multi-class computation to reduce computation time, but it could deteriorate classification performance. Third, the difficulty in multi-class prediction problems is in data imbalance problem that can occur when the number of instances in one class greatly outnumbers the number of instances in the other class. Such data sets often cause a default classifier to be built due to skewed boundary and thus the reduction in the classification accuracy of such a classifier. SVM ensemble learning is one of machine learning methods to cope with the above drawbacks. Ensemble learning is a method for improving the performance of classification and prediction algorithms. AdaBoost is one of the widely used ensemble learning techniques. It constructs a composite classifier by sequentially training classifiers while increasing weight on the misclassified observations through iterations. The observations that are incorrectly predicted by previous classifiers are chosen more often than examples that are correctly predicted. Thus Boosting attempts to produce new classifiers that are better able to predict examples for which the current ensemble's performance is poor. In this way, it can reinforce the training of the misclassified observations of the minority class. This paper proposes a multiclass Geometric Mean-based Boosting (MGM-Boost) to resolve multiclass prediction problem. 
Since MGM-Boost introduces the notion of geometric mean into AdaBoost, its learning process can take into account the geometric mean-based accuracy and errors across multiple classes. This study applies MGM-Boost to a real-world bond rating case for Korean companies to examine its feasibility. 10-fold cross validation is performed three times with different random seeds in order to ensure that the comparison among the three classifiers does not happen by chance. For each 10-fold cross validation, the entire data set is first partitioned into ten equal-sized sets, and then each set is in turn used as the test set while the classifier trains on the other nine sets. That is, cross-validated folds have been tested independently of each algorithm. Through these steps, we have obtained the results for classifiers on each of the 30 experiments. In the comparison of arithmetic mean-based prediction accuracy between individual classifiers, MGM-Boost (52.95%) shows higher prediction accuracy than both AdaBoost (51.69%) and SVM (49.47%). MGM-Boost (28.12%) also shows higher prediction accuracy than AdaBoost (24.65%) and SVM (15.42%) in terms of geometric mean-based prediction accuracy. A t-test is used to examine whether the performance of the classifiers over the 30 folds is significantly different. The results indicate that the performance of MGM-Boost is significantly different from that of the AdaBoost and SVM classifiers at the 1% level. These results mean that MGM-Boost can provide robust and stable solutions to multi-class problems such as bond rating.
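
The evaluation protocol (three repetitions of 10-fold cross validation, 30 test folds in total) and the two accuracy measures can be sketched as follows. The abstract does not define geometric mean-based accuracy; the sketch assumes it is the geometric mean of per-class recalls, which would explain why it sits far below arithmetic accuracy when minority classes are poorly predicted. Function names and seeds are illustrative.

```python
import random
from math import prod


def kfold_indices(n, k, seed):
    """Shuffle indices 0..n-1 with a fixed seed and deal them into k folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def repeated_kfold(n, k=10, seeds=(0, 1, 2)):
    """Yield (train, test) index lists: each fold is the test set once per seed."""
    for seed in seeds:
        for test in kfold_indices(n, k, seed):
            held_out = set(test)
            yield [i for i in range(n) if i not in held_out], test


def arithmetic_accuracy(y_true, y_pred):
    """Plain fraction of correct predictions."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def geometric_mean_accuracy(y_true, y_pred):
    """Geometric mean of per-class recalls (assumed definition)."""
    recalls = []
    for c in sorted(set(y_true)):
        preds_for_c = [p for t, p in zip(y_true, y_pred) if t == c]
        recalls.append(sum(p == c for p in preds_for_c) / len(preds_for_c))
    return prod(recalls) ** (1 / len(recalls))


y_true = [0, 0, 0, 0, 1, 1]
y_pred = [0, 0, 0, 0, 1, 0]
print(len(list(repeated_kfold(100))))
print(arithmetic_accuracy(y_true, y_pred), geometric_mean_accuracy(y_true, y_pred))
```

A classifier that always predicts the majority class can score well on arithmetic accuracy, but under this definition its geometric mean accuracy is zero.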

Optimization of Multiclass Support Vector Machine using Genetic Algorithm: Application to the Prediction of Corporate Credit Rating (유전자 알고리즘을 이용한 다분류 SVM의 최적화: 기업신용등급 예측에의 응용)

  • Ahn, Hyunchul
    • Information Systems Review
    • /
    • v.16 no.3
    • /
    • pp.161-177
    • /
    • 2014
  • Corporate credit rating assessment consists of complicated processes in which various factors describing a company are taken into consideration. Such assessment is known to be very expensive since domain experts must be employed to assess the ratings. As a result, data-driven corporate credit rating prediction using statistical and artificial intelligence (AI) techniques has received considerable attention from researchers and practitioners. In particular, statistical methods such as multiple discriminant analysis (MDA) and multinomial logistic regression analysis (MLOGIT), and AI methods including case-based reasoning (CBR), artificial neural network (ANN), and multiclass support vector machine (MSVM) have been applied to corporate credit rating. Among them, MSVM has recently become popular because of its robustness and high prediction accuracy. In this study, we propose a novel optimized MSVM model, and apply it to corporate credit rating prediction in order to enhance the accuracy. Our model, named 'GAMSVM (Genetic Algorithm-optimized Multiclass Support Vector Machine),' is designed to simultaneously optimize the kernel parameters and the feature subset selection. Prior studies like Lorena and de Carvalho (2008), and Chatterjee (2013) show that proper kernel parameters may improve the performance of MSVMs. Also, the results from studies such as Shieh and Yang (2008) and Chatterjee (2013) imply that appropriate feature selection may lead to higher prediction accuracy. Based on these prior studies, we propose to apply GAMSVM to corporate credit rating prediction. As a tool for optimizing the kernel parameters and the feature subset selection, we suggest genetic algorithm (GA). GA is known as an efficient and effective search method that attempts to simulate the biological evolution phenomenon. By applying genetic operations such as selection, crossover, and mutation, it is designed to gradually improve the search results. 
In particular, the mutation operator prevents GA from falling into local optima, so it can find a globally optimal or near-optimal solution. GA has been widely applied to search for optimal parameters or feature subsets of AI techniques including MSVM. For these reasons, we also adopt GA as an optimization tool. To empirically validate the usefulness of GAMSVM, we applied it to a real-world case of credit rating in Korea. Our application is in bond rating, which is the most frequently studied area of credit rating for specific debt issues or other financial obligations. The experimental dataset was collected from a large credit rating company in South Korea. It contained 39 financial ratios of 1,295 companies in the manufacturing industry, and their credit ratings. Using various statistical methods including the one-way ANOVA and the stepwise MDA, we selected 14 financial ratios as the candidate independent variables. The dependent variable, i.e. credit rating, was labeled as four classes: 1(A1); 2(A2); 3(A3); 4(B and C). 80 percent of the data for each class was used for training, and the remaining 20 percent was used for validation. To overcome the small sample size, we applied five-fold cross validation to our dataset. In order to examine the competitiveness of the proposed model, we also experimented with several comparative models including MDA, MLOGIT, CBR, ANN and MSVM. For MSVM, we adopted the One-Against-One (OAO) and DAGSVM (Directed Acyclic Graph SVM) approaches because they are known to be the most accurate among the various MSVM approaches. GAMSVM was implemented using LIBSVM, an open-source software package, and Evolver 5.5, a commercial software package that provides GA. Other comparative models were tested using various statistical and AI packages such as SPSS for Windows, Neuroshell, and Microsoft Excel VBA (Visual Basic for Applications). Experimental results showed that the proposed model, GAMSVM, outperformed all the competing models. 
In addition, the model was found to use fewer independent variables while showing higher accuracy. In our experiments, five variables, namely X7 (total debt), X9 (sales per employee), X13 (years after founded), X15 (accumulated earning to total asset), and X39 (the index related to the cash flows from operating activity), were found to be the most important factors in predicting the corporate credit ratings. However, the values of the finally selected kernel parameters were found to be almost the same among the data subsets. To examine whether the predictive performance of GAMSVM was significantly greater than that of the other models, we used the McNemar test. As a result, we found that GAMSVM was better than MDA, MLOGIT, CBR, and ANN at the 1% significance level, and better than OAO and DAGSVM at the 5% significance level.
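
The McNemar test mentioned above compares two classifiers on the same validation cases by looking only at the cases where they disagree. The following is a minimal sketch of the continuity-corrected chi-square form; whether the study used this variant or an exact binomial version is not stated, and the example predictions are invented.

```python
import math


def mcnemar(y_true, pred_a, pred_b):
    """Continuity-corrected McNemar test on paired predictions.
    Returns (chi-square statistic, two-sided p-value with 1 degree of freedom)."""
    # b: model A correct and model B wrong; c: the reverse.
    b = sum(a == t and m != t for t, a, m in zip(y_true, pred_a, pred_b))
    c = sum(a != t and m == t for t, a, m in zip(y_true, pred_a, pred_b))
    if b + c == 0:
        return 0.0, 1.0  # the models never disagree on correctness
    stat = max(abs(b - c) - 1, 0) ** 2 / (b + c)
    # Survival function of chi-square with 1 df: erfc(sqrt(x / 2)).
    return stat, math.erfc(math.sqrt(stat / 2))


y_true = [1] * 20
model_a = [1] * 10 + [0] * 2 + [1] * 8   # invented predictions
model_b = [0] * 10 + [1] * 2 + [1] * 8
stat, p = mcnemar(y_true, model_a, model_b)
print(round(stat, 3), round(p, 4))
```

Cases that both models get right or both get wrong do not enter the statistic, so the test suits comparisons on a shared hold-out set.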

Accuracy evaluation of microwave water surface current meter for measurement angles in middle flow condition (전자파표면유속계의 측정 각도에 따른 평수기 유속 측정 정확도 분석)

  • Son, Geunsoo;Kim, Dongsu;Kim, Kyungdong;Kim, Jongmin
    • Journal of Korea Water Resources Association
    • /
    • v.53 no.1
    • /
    • pp.15-27
    • /
    • 2020
  • Streamflow discharge, a fundamental riverine quantity, plays a crucial role in water resources management and therefore requires accurate in-situ measurement. Recent advances in instrumentation for streamflow discharge measurement have complemented or replaced classical devices and methods. Among the various candidate methods, microwave surface current meters are increasingly applied to normal flow as well as flood discharge measurement, because they let practitioners measure flow velocity remotely and safely without direct contact. With minimal field preparation, this method has eased flood discharge measurement in difficult in-situ conditions such as extreme floods, since it actively emits 24.125 GHz microwaves and does not rely on natural light. In South Korea, a rectangular instrument named the Microwave Water Surface Current Meter (MWSCM) was developed and commercially released around 2010, and domestic agencies in charge of streamflow observation have taken note of it as a potential substitute. Although this new device has been highlighted for efficient flow measurement, there have been few notable efforts to evaluate its performance systematically and comprehensively under various measurement and riverine conditions, which has held back its prompt and widespread use in practice. This study attempted to evaluate the MWSCM in terms of the instrument's monitoring configuration, particularly tilt and yaw angle. When the instrument is pointed at the measurement spot in a given cross-section, the observation campaign inevitably raises accuracy issues related to the different tilt and yaw angles of the instrument, which are conventionally a major source of error for this type of instrument. 
Focusing on instrument configuration, the instrument was tested in a controlled outdoor river channel at the KICT River Experiment Center under a fixed flow condition with a flow speed of around 1 m/s, a steady flow supply, a channel width of 6 m, and a shallow flow depth of less than 1 m, where detailed velocity measurements with a SonTek micro-ADV were used for validation. As a result, tilt angles of less than 15 degrees generated much higher deviation, and the coefficient of variation increased in proportion to the yaw angle. Yaw angles affected accuracy in terms of measurement area.

A Multimodal Profile Ensemble Approach to Development of Recommender Systems Using Big Data (빅데이터 기반 추천시스템 구현을 위한 다중 프로파일 앙상블 기법)

  • Kim, Minjeong;Cho, Yoonho
    • Journal of Intelligence and Information Systems
    • /
    • v.21 no.4
    • /
    • pp.93-110
    • /
    • 2015
  • A recommender system recommends products to customers who are likely to be interested in them. Based on automated information filtering technology, various recommender systems have been developed. Collaborative filtering (CF), one of the most successful recommendation algorithms, has been applied in a number of different domains such as recommending Web pages, books, movies, music and products. However, CF is known to have a critical shortcoming. CF finds neighbors whose preferences are like those of the target customer and recommends the products those neighbors have liked most. Thus, CF works properly only when there is a sufficient number of ratings on common products from customers. When there is a shortage of customer ratings, CF forms an inaccurate neighborhood, resulting in poor recommendations. To improve the performance of CF-based recommender systems, most related studies have focused on developing novel algorithms under the assumption of a single profile, which is created from users' item ratings, purchase transactions, or Web access logs. With the advent of big data, companies have come to collect more data and to use a wide variety of large-scale information. Many companies therefore recognize the importance of utilizing big data, because it helps them improve their competitiveness and create new value. In particular, the use of personal big data in recommender systems is an emerging issue, because personal big data facilitate more accurate identification of the preferences or behaviors of users. The proposed recommendation methodology is as follows: First, multimodal user profiles are created from personal big data in order to grasp the preferences and behavior of users from various viewpoints. We derive five user profiles based on personal information such as rating, site preference, demographic, Internet usage, and topic in text. 
Next, the similarity between users is calculated based on the profiles, and the neighbors of each user are then found from the results. One of three ensemble approaches is applied to calculate the similarity. The three approaches use the similarity of the combined profile, the average similarity of each profile, and the weighted average similarity of each profile, respectively. Finally, the products that people in the neighborhood prefer most are recommended to the target users. For the experiments, we used demographic data and a very large volume of Web log transactions for 5,000 panel users of a company that specializes in analyzing the rankings of Web sites. R and SAS E-miner were used to implement the proposed recommender system and to conduct the topic analysis using keyword search, respectively. To evaluate the recommendation performance, we used 60% of the data for training and 40% for testing. 5-fold cross validation was also conducted to enhance the reliability of our experiments. A widely used combination metric called the F1 metric, which gives equal weight to both recall and precision, was employed for our evaluation. In the evaluation, the proposed methodology achieved a significant improvement over the single profile-based CF algorithm. In particular, the ensemble approach using weighted average similarity shows the highest performance. That is, the rate of improvement in F1 is 16.9 percent for the ensemble approach using weighted average similarity and 8.1 percent for the ensemble approach using average similarity of each profile. From these results, we conclude that the multimodal profile ensemble approach is a viable solution to the problems encountered when there is a shortage of customer ratings. This study is significant in suggesting what kinds of information can be used to create profiles in a big data environment and how they can be combined and utilized effectively. 
However, our methodology requires further study before it is applied in the real world. We need to compare the differences in recommendation accuracy by applying the proposed method to different recommendation algorithms and then to identify which combination of them would show the best performance.
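
Two pieces of the pipeline described above lend themselves to a short sketch: combining per-profile user similarities by plain or weighted averaging, and the F1 metric used for evaluation. The abstract does not say how the profile weights are chosen, so the weights and similarity values below are invented, and the function names are illustrative.

```python
def ensemble_similarity(profile_sims, weights=None):
    """Combine one similarity value per profile into a single score.
    With weights=None this is the plain average; otherwise a weighted average."""
    if weights is None:
        weights = [1.0] * len(profile_sims)
    return sum(w * s for w, s in zip(weights, profile_sims)) / sum(weights)


def f1_score(precision, recall):
    """Harmonic mean of precision and recall, weighting both equally."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Similarity between two users under five profiles, in the abstract's order:
# rating, site preference, demographic, Internet usage, topic in text.
sims = [0.2, 0.6, 0.9, 0.5, 0.3]
print(ensemble_similarity(sims))
print(ensemble_similarity(sims, weights=[3, 1, 1, 1, 1]))  # hypothetical weights
print(f1_score(precision=0.5, recall=0.25))
```

Because the ensemble score still depends on profiles other than ratings, a neighborhood can be formed even for users with very few ratings.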

A Management Plan According to the Estimation of Nutria (Myocastor coypus) Distribution Density and Potential Suitable Habitat (뉴트리아(Myocastor coypus) 분포밀도 및 잠재적 서식가능지역 예측에 따른 관리방향)

  • Kim, Areum;Kim, Young-Chae;Lee, Do-Hun
    • Journal of Environmental Impact Assessment
    • /
    • v.27 no.2
    • /
    • pp.203-214
    • /
    • 2018
  • The purpose of this study is to estimate the concentrated distribution area and potential suitable habitat of nutria (Myocastor coypus) and to provide useful data for setting an effective management direction. Based on nationwide distribution data for nutria, the cross-validation value was applied to analyze the distribution density. As a result, concentrated distribution areas that require preferential elimination were found in 14 administrative areas, including Busan Metropolitan City, Daegu Metropolitan City, 11 cities and counties in Gyeongsangnam-do, and 1 county in Gyeongsangbuk-do. In the potential suitable habitat estimation using a MaxEnt (Maximum Entropy) model, a possibility of emergence was found in the middle and lower reaches of the Nakdong River, the lower reaches of the Seomjin River, and the Gahwacheon River area. In terms of variable contribution to the model, DEM, precipitation of the driest month, minimum temperature of the coldest month, and distance from river contributed in descending order. Regarding the relation with the probability of appearance, the probability of emergence was higher than the threshold value in areas with an altitude of less than 34 m, a minimum temperature of the coldest month of -5.7°C to -0.6°C, precipitation of the driest month of 15-30 mm, and a distance of less than 1,373 m from the river. Considering the research results and the physiological and ecological characteristics of nutria, altitude, the existence of water, and winter temperature affected the settlement and expansion of nutria. Therefore, it is necessary to reflect them as important variables in future habitable area detection and expansion estimation modeling. It is essential to distinguish the concentrated distribution area and the management area of invasive alien species such as nutria and to establish and apply a suitable management strategy to the management site for permanent control. 
The results in this study can be used as useful data for a strategic management such as rapid management on the preferential management area and preemptive and preventive management on the possible spreading area.

Impacts assessment of Climate changes in North Korea based on RCP climate change scenarios II. Impacts assessment of hydrologic cycle changes in Yalu River (RCP 기후변화시나리오를 이용한 미래 북한지역의 수문순환 변화 영향 평가 II. 압록강유역의 미래 수문순환 변화 영향 평가)

  • Jeung, Se Jin;Kang, Dong Ho;Kim, Byung Sik
    • Journal of Wetlands Research
    • /
    • v.21 no.spc
    • /
    • pp.39-50
    • /
    • 2019
  • This study aims to assess the influence of climate change on the hydrological cycle at a basin level in North Korea. The selected model for this study is MRI-CGCM 3, the one used for the Coupled Model Intercomparison Project Phase 5 (CMIP5). Moreover, this study adopted the Spatial Disaggregation-Quantile Delta Mapping (SDQDM), which is one of the stochastic downscaling techniques, to conduct the bias correction for climate change scenarios. The comparison between the preapplication and postapplication of the SDQDM supported the study's review on the technique's validity. In addition, as this study determined the influence of climate change on the hydrological cycle, it also observed the runoff in North Korea. In predicting such influence, parameters of a runoff model used for the analysis should be optimized. However, North Korea is classified as an ungauged region for its political characteristics, and it was difficult to collect the country's runoff observation data. Hence, the study selected 16 basins with secured high-quality runoff data, and the M-RAT model's optimized parameters were calculated. The study also analyzed the correlation among variables for basin characteristics to consider multicollinearity. Then, based on a phased regression analysis, the study developed an equation to calculate parameters for ungauged basin areas. To verify the equation, the study assumed the Osipcheon River, Namdaecheon Stream, Yongdang Reservoir, and Yonggang Stream as ungauged basin areas and conducted cross-validation. As a result, for all the four basin areas, high efficiency was confirmed with the efficiency coefficients of 0.8 or higher. The study used climate change scenarios and parameters of the estimated runoff model to assess the changes in hydrological cycle processes at a basin level from climate change in the Amnokgang River of North Korea. 
The results showed that climate change would lead to an increase in precipitation, and the corresponding rise in temperature is predicted to cause elevating evapotranspiration. However, it was found that the storage capacity in the basin decreased. The result of the analysis on flow duration indicated a decrease in flow on the 95th day; an increase in the drought flow during the periods of Future 1 and Future 2; and an increase in both flows for the period of Future 3.
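
The abstract reports "efficiency coefficients of 0.8 or higher" for the cross-validated basins without naming the measure. A common choice for runoff models is the Nash-Sutcliffe efficiency, and the sketch below assumes that is what was meant. The runoff values are invented.

```python
def nash_sutcliffe(observed, simulated):
    """Nash-Sutcliffe efficiency: 1.0 is a perfect fit, 0.0 means the model
    is no better than predicting the mean of the observations."""
    mean_obs = sum(observed) / len(observed)
    error = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    variance = sum((o - mean_obs) ** 2 for o in observed)
    return 1 - error / variance


observed_runoff = [12.0, 30.0, 55.0, 41.0, 18.0]   # invented values
simulated_runoff = [14.0, 27.0, 50.0, 44.0, 20.0]  # invented values
print(round(nash_sutcliffe(observed_runoff, simulated_runoff), 3))
```

In this kind of validation a gauged basin is treated as ungauged, its parameters come from the regression equation, and the score is computed against the observations that were held back.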