• Title/Summary/Keyword: Prediction accuracy


An Integrated Model based on Genetic Algorithms for Implementing Cost-Effective Intelligent Intrusion Detection Systems (비용효율적 지능형 침입탐지시스템 구현을 위한 유전자 알고리즘 기반 통합 모형)

  • Lee, Hyeon-Uk;Kim, Ji-Hun;Ahn, Hyun-Chul
    • Journal of Intelligence and Information Systems
    • /
    • v.18 no.1
    • /
    • pp.125-141
    • /
    • 2012
  • Malicious attacks on networked systems are increasing dramatically, and their patterns are changing rapidly. Consequently, handling these attacks appropriately has become more important, and there is considerable interest in and demand for effective network security systems such as intrusion detection systems. Intrusion detection systems are network security systems for detecting, identifying, and responding appropriately to unauthorized or abnormal activities. Conventional intrusion detection systems have generally been designed using experts' implicit knowledge of network intrusions or hackers' abnormal behaviors. However, although they perform very well under normal conditions, they cannot handle new or unknown attack patterns. As a result, recent studies on intrusion detection systems use artificial intelligence techniques, which can respond proactively to unknown threats. Researchers have long adopted and tested various artificial intelligence techniques, such as artificial neural networks, decision trees, and support vector machines, to detect intrusions on the network. However, most of them have applied these techniques individually, even though combining them may lead to better detection. For this reason, we propose a new integrated model for intrusion detection. Our model combines the prediction results of four different binary classification models, namely logistic regression (LOGIT), decision trees (DT), artificial neural networks (ANN), and support vector machines (SVM), which may be complementary to each other. Genetic algorithms (GA) are used to find the optimal combining weights. The proposed model is built in two steps. In the first step, the integration model with the lowest prediction error (i.e., misclassification rate) is generated. In the second step, the model searches for the optimal classification threshold for determining intrusions, which minimizes the total misclassification cost. To calculate the total misclassification cost of an intrusion detection system, we need to understand its asymmetric error cost scheme. Generally, there are two common types of error in intrusion detection. The first is the False-Positive Error (FPE), in which normal activity is wrongly judged to be an intrusion, which may result in unnecessary remediation. The second is the False-Negative Error (FNE), in which malware is misjudged as normal. Compared to FPE, FNE is more fatal. Thus, the total misclassification cost is affected more by FNE than by FPE. To validate the practical applicability of our model, we applied it to a real-world dataset for network intrusion detection. The experimental dataset was collected from the IDS sensor of an official institution in Korea from January to June 2010. We collected 15,000 log records in total and selected 10,000 samples from them by random sampling. We also compared the results of our model with those of the single techniques to confirm the superiority of the proposed model. LOGIT and DT were run in PASW Statistics v18.0, and ANN was run in Neuroshell R4.0. For SVM, LIBSVM v2.90, a freeware package for training SVM classifiers, was used. Empirical results showed that our proposed GA-based model outperformed all the other comparative models in detecting network intrusions in terms of accuracy. They also showed that the proposed model outperformed all the other comparative models in terms of total misclassification cost. Consequently, we expect that our study may contribute to building cost-effective intelligent intrusion detection systems.
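
The two-step idea described in this abstract (combine the classifiers' outputs, then pick the cost-minimizing threshold) can be illustrated with a minimal sketch. It is not the authors' implementation: the genetic algorithm is not reproduced here (the combining weights are simply fixed by hand), and the function names, the toy data, and the 1:5 cost ratio between FPE and FNE are all assumptions made for illustration.

```python
from typing import List, Sequence


def combine_scores(model_probs: Sequence[Sequence[float]],
                   weights: Sequence[float]) -> List[float]:
    """Weighted average of the per-model intrusion probabilities of each sample."""
    total = sum(weights)
    return [sum(w * p for w, p in zip(weights, row)) / total
            for row in model_probs]


def total_cost(scores: Sequence[float], labels: Sequence[int], threshold: float,
               cost_fp: float = 1.0, cost_fn: float = 5.0) -> float:
    """Total misclassification cost when samples with score >= threshold are flagged."""
    fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
    fn = sum(1 for s, y in zip(scores, labels) if s < threshold and y == 1)
    return cost_fp * fp + cost_fn * fn


def best_threshold(scores: Sequence[float], labels: Sequence[int],
                   candidates: Sequence[float],
                   cost_fp: float = 1.0, cost_fn: float = 5.0) -> float:
    """Candidate threshold with the lowest total misclassification cost."""
    return min(candidates,
               key=lambda t: total_cost(scores, labels, t, cost_fp, cost_fn))


# Hypothetical outputs of four classifiers (LOGIT, DT, ANN, SVM) for four samples.
model_probs = [
    [0.9, 0.8, 0.7, 0.9],
    [0.2, 0.1, 0.3, 0.2],
    [0.6, 0.4, 0.5, 0.5],
    [0.4, 0.5, 0.3, 0.4],
]
labels = [1, 0, 1, 0]  # 1 = intrusion, 0 = normal

# Equal weights stand in for the GA-optimized weights (an assumption).
scores = combine_scores(model_probs, [1.0, 1.0, 1.0, 1.0])
threshold = best_threshold(scores, labels, [0.3, 0.45, 0.6])
```

Because a missed intrusion is priced higher than a false alarm, the search tends to favor lower thresholds than a pure accuracy criterion would.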

Estimation of Ground-level PM10 and PM2.5 Concentrations Using Boosting-based Machine Learning from Satellite and Numerical Weather Prediction Data (부스팅 기반 기계학습기법을 이용한 지상 미세먼지 농도 산출)

  • Park, Seohui;Kim, Miae;Im, Jungho
    • Korean Journal of Remote Sensing
    • /
    • v.37 no.2
    • /
    • pp.321-335
    • /
    • 2021
  • Particulate matter (PM10 and PM2.5, with diameters less than 10 and 2.5 ㎛, respectively) can be absorbed by the human body and adversely affect human health. Although most PM monitoring relies on ground-based observations, these are limited to point-based measurement sites, which leads to uncertainty in PM estimation for regions without observation sites. Satellite data can overcome this spatial limitation. In this study, we developed a machine learning-based retrieval algorithm for ground-level PM10 and PM2.5 concentrations using aerosol parameters from the Geostationary Ocean Color Imager (GOCI) satellite and various meteorological parameters from a numerical weather prediction model for January to December 2019. Gradient Boosted Regression Trees (GBRT) and Light Gradient Boosting Machine (LightGBM) were used to estimate PM concentrations. Model performance was examined for two feature sets: all input parameters (Feature set 1) and a subset of input parameters without meteorological and land-cover parameters (Feature set 2). Both models showed higher accuracy (about 10 % higher in R2) with Feature set 1 than with Feature set 2. The GBRT model using Feature set 1 was chosen as the final model for further analysis (PM10: R2 = 0.82, nRMSE = 34.9 %; PM2.5: R2 = 0.75, nRMSE = 35.6 %). The spatial distribution of the seasonal and annual-averaged PM concentrations was similar to in-situ observations, except for the northeastern part of China with bright surface reflectance. The spatial distribution and seasonal changes matched the in-situ measurements well.

Development of a Retrieval Algorithm for Adjustment of Satellite-viewed Cloudiness (위성관측운량 보정을 위한 알고리즘의 개발)

  • Son, Jiyoung;Lee, Yoon-Kyoung;Choi, Yong-Sang;Ok, Jung;Kim, Hye-Sil
    • Korean Journal of Remote Sensing
    • /
    • v.35 no.3
    • /
    • pp.415-431
    • /
    • 2019
  • The satellite-viewed cloudiness, a ratio of cloudy pixels to total pixels ($C_{sat,\;prev}$), inevitably differs from the "ground-viewed" cloudiness ($C_{grd}$) because of the different viewpoints. Here we develop an algorithm to retrieve the satellite-viewed cloudiness adjusted to $C_{grd}$ ($C_{sat,\;adj}$). The key process of the algorithm is to convert the cloudiness projected on the plane surface into the cloudiness on the celestial hemisphere as seen by the observer. For this conversion, supplementary satellite retrievals such as cloud detection and cloud top pressure are used, as they provide the locations of cloudy pixels and cloud base height information, respectively. The algorithm is tested on Himawari-8 level 1B data. $C_{sat,\;adj}$ and $C_{sat,\;prev}$ are retrieved and validated against $C_{grd}$ from SYNOP stations over Korea (22 stations) and China (724 stations), in daytime only, for the first seven days of every month from July 2016 to June 2017. As a result, the mean error of $C_{sat,\;adj}$ (0.61) is less than that of $C_{sat,\;prev}$ (1.01). The percent of detection for the 'Cloudy' scenario of $C_{sat,\;adj}$ (73%) is higher than that of $C_{sat,\;prev}$ (60%). The percent correct (i.e., the accuracy) of $C_{sat,\;adj}$ is 61%, while that of $C_{sat,\;prev}$ is 55% for all seasons. For the December-January-February period, when cloudy pixels are readily overestimated, the percent correct of $C_{sat,\;adj}$ is 60%, while that of $C_{sat,\;prev}$ is 56%. Therefore, we conclude that the present algorithm can effectively bring the satellite cloudiness closer to the ground-viewed cloudiness.

Application of deep learning method for decision making support of dam release operation (댐 방류 의사결정지원을 위한 딥러닝 기법의 적용성 평가)

  • Jung, Sungho;Le, Xuan Hien;Kim, Yeonsu;Choi, Hyungu;Lee, Giha
    • Journal of Korea Water Resources Association
    • /
    • v.54 no.spc1
    • /
    • pp.1095-1105
    • /
    • 2021
  • Dam operation needs to be further advanced to cope with the upcoming rainy season, typhoons, and torrential rains. In addition, physical models based on specific rules may have limitations in controlling the release discharge of a dam because of inherent uncertainty and complex factors. This study aims to forecast, multiple time steps ahead, the water level at the station nearest to the dam, and to evaluate the applicability of an LSTM (Long Short-Term Memory) deep learning model to decisions on dam release discharge. The LSTM model was trained and tested on eight data sets with a 1-hour temporal resolution, including primary data used in dam operation and downstream water level station data, covering about 13 years (2009~2021). The trained model forecasted the water level time series for six lead times (1, 3, 6, 9, 12, and 18 hours), and the forecasts were compared with the observed data. The 1-hour-ahead predictions exhibited the best performance in all cases, with an average MAE of 0.01 m, RMSE of 0.015 m, and NSE of 0.99. As the lead time increases, the predictive performance of the model tends to decrease slightly. The model reliably reproduces the temporal pattern of the observed water level. Thus, we judge that the LSTM model can produce predictive data by extracting the characteristics of complex, non-linear hydrological data and can be used to determine the release discharge when simulating the operation of the dam.
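
The three scores reported in this abstract (MAE, RMSE, and NSE, the Nash-Sutcliffe efficiency) follow standard definitions, sketched below. The function names and the toy water levels are assumptions for illustration and are not taken from the study.

```python
import math
from typing import Sequence


def mae(obs: Sequence[float], sim: Sequence[float]) -> float:
    """Mean absolute error between observed and simulated values."""
    return sum(abs(o - s) for o, s in zip(obs, sim)) / len(obs)


def rmse(obs: Sequence[float], sim: Sequence[float]) -> float:
    """Root mean square error between observed and simulated values."""
    return math.sqrt(sum((o - s) ** 2 for o, s in zip(obs, sim)) / len(obs))


def nse(obs: Sequence[float], sim: Sequence[float]) -> float:
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 matches the observed mean."""
    mean_obs = sum(obs) / len(obs)
    residual = sum((o - s) ** 2 for o, s in zip(obs, sim))
    variance = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - residual / variance


# Hypothetical hourly water levels in metres.
observed = [1.0, 2.0, 3.0]
perfect = [1.0, 2.0, 3.0]      # identical to the observations
mean_only = [2.0, 2.0, 2.0]    # always predicts the observed mean
```

An NSE close to 1, as reported for the 1-hour lead time, means the forecast explains almost all of the variance in the observed series.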

The Study on Gyeokguk and Sangshin (격국과 상신에 대한 소고)

  • Hwangbo, Kwan
    • Industry Promotion Research
    • /
    • v.7 no.3
    • /
    • pp.115-124
    • /
    • 2022
  • The most difficult situation in studying the future-telling science of human destiny arises when an individual's Saju-Palza(四柱八字) indicates a bad fate. In that case, we face the question of how to live: whether to follow our fate, or to deny it in the belief that we can improve our lives through hard effort regardless of the bad Saju-Palza(四柱八字). However, a clear answer to this question is hard to find. 『Liao Fan 4 lessons(了凡四訓)』 shows that one's destiny can be improved by accumulating good deeds despite a bad Saju-Palza(四柱八字). Some say that the future can be created, not foreseen. Likewise, Dr. Steven Coby says that the surest way to forecast the future is to create it. In any case, the strong desire and curiosity to know one's own future has persisted since the beginning of time, and we suppose that this desire may be one of our basic instincts. If so, the function and role of the future-telling science is to increase the accuracy of future prediction, whether our fate is fixed or changeable. Therefore, this study summarizes the definitions of confusing terms, focusing on Gyeokguk(格局) and Sangshin(相神), the core of Myeongrihak(命理學), which is considered one of the most popular future-telling sciences. Concerning Gyeok(格), this paper mainly considers Nae-Gyeok(內格) and does not address Oi-Gyeok(外格) or Special-Gyeok(別格). Specifically, it summarizes the views of the classical Myeongri(命理) books and modern scholars on Gyeokguk(格局) and Yongshin(用神). In particular, it compares the various concepts of Gyeokguk(格局) and summarizes the advantages and disadvantages of each Nae-Gyeok(內格), the order in which Nae-Gyeok(內格) is determined, and the good and bad cases of each Gyeok(格). In addition, it summarizes the concept of Sangshin(相神) as discussed in 『Japyeongjinjeon』 and briefly summarizes Heeshin(喜神), a broader concept than Sangshin(相神). It also analyzes the different usages of Sangshin(相神), contrasting the interpretation that gives priority to Cheongan(天干) in the Day-Column(日柱) with the interpretation based on Jijee(地支) in the Month-Column(月柱). Finally, the confusion that arises from scholars accepting widely differing meanings for the same term is left as a task for future research.

Prediction of Maximal Oxygen Uptake in Men Aged 18~34 Years (18~34 남성의 최대산소 섭취량 추정)

  • Jeon, Yoo-Joung;Im, Jae-Hyeng;Lee, Byung-Kun;Kim, Chang-Hwan;Kim, Byeong-Wan
    • 한국체육학회지인문사회과학편
    • /
    • v.51 no.3
    • /
    • pp.373-382
    • /
    • 2012
  • The purpose of this study is to predict VO2max from body indices and submaximal metabolic responses. The subjects were 250 males aged 18 to 34, randomly separated into two groups: 179 for the sample group and 71 for the cross-validation group. They underwent maximal exercise testing with the Bruce protocol, and we measured the metabolic responses at the end of the first (3 minute) and second (6 minute) stages. To predict VO2max, we applied stepwise multiple regression analysis to the sample group. Model 1's variables are weight, 6 minute HR, and 6 minute VO2 (R=0.64, SEE=4.74, CV=11.7%, p<.01), and the equation is VO2max(ml/kg/min) = 72.256-0.340(Weight)-0.220(6minHR)+0.013(6minVO2). Model 2's variables are weight, 6 minute HR, 6 minute VO2, and 6 minute VCO2 (R=0.67, SEE=4.59, CV=11.3%, p<.01), and the equation is VO2max(ml/kg/min) = 68.699-0.277(Weight)-0.206(6minHR)+0.020(6minVO2)-0.009(6minVCO2). Neither model showed multicollinearity. Model 2 showed a higher correlation than Model 1. When we cross-validated the models with the 71 men, measured VO2max and estimated VO2max were significantly correlated (R=0.53 and 0.56, p<.01). Although both models are valid and useful given their simplicity and utility, Model 2 is more accurate.
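
The two regression equations quoted in this abstract can be evaluated directly, as in the sketch below. The coefficients are copied from the abstract; the function names and the example subject are hypothetical, and the units of the inputs (kg, beats/min, ml/min) are assumed, since the abstract does not state them.

```python
def vo2max_model1(weight: float, hr_6min: float, vo2_6min: float) -> float:
    """Model 1: VO2max (ml/kg/min) from weight, 6-minute HR, and 6-minute VO2."""
    return 72.256 - 0.340 * weight - 0.220 * hr_6min + 0.013 * vo2_6min


def vo2max_model2(weight: float, hr_6min: float,
                  vo2_6min: float, vco2_6min: float) -> float:
    """Model 2: Model 1's predictors plus 6-minute VCO2."""
    return (68.699 - 0.277 * weight - 0.206 * hr_6min
            + 0.020 * vo2_6min - 0.009 * vco2_6min)


# Hypothetical subject: 70 kg, HR 150 at 6 minutes, VO2 2000, VCO2 1800.
estimate_1 = vo2max_model1(70.0, 150.0, 2000.0)
estimate_2 = vo2max_model2(70.0, 150.0, 2000.0, 1800.0)
```

Both models lower the estimate as weight and submaximal heart rate rise, which is the direction one would expect for a fitness index expressed per kilogram.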

The differences of STO between before and after presurgical orthodontics in skeletal Class III malocclusions (골격성 III급 부정교합자에서 술 전 교정치료 전과 후의 수술계획의 차이)

  • Lee, Eun-Ju;Son, Woo-Sung;Park, Soo-Byung;Kim, Seong-Sik
    • The korean journal of orthodontics
    • /
    • v.38 no.3
    • /
    • pp.175-186
    • /
    • 2008
  • Objective: To evaluate the discrepancies between initial STO and final STO in Class III malocclusions and to find which factors are related to the discrepancies. Methods: Twenty patients were selected for the extraction group and 20 patients for the non-extraction group. They were diagnosed as skeletal Class III and received presurgical orthodontic treatment and mandibular set-back surgery at Pusan National University Hospital. The lateral cephalograms were analyzed for initial STO (T1s) at pretreatment and final STO (T2s) after presurgical orthodontic treatment, and the landmarks' coordinates on the X and Y axes were recorded. Results: Differences in hard tissue points (T1s-T2s) in the X coordinates of upper central incisor edge, upper first molar mesial end surface, lower central incisor apex, lower first molar mesial end surface and mesio-buccal cusp and Y coordinates of upper central incisor edge, upper central incisor apex, upper first molar mesio-buccal cusp were statistically significant in the extraction group. Differences in hard tissue points (T1s-T2s) in the X coordinates of upper central incisor edge, lower central incisor apex, lower first molar mesial end surface and Y coordinates of lower central incisor apex were statistically significant in the non-extraction group. In the extraction group, the upper arch length discrepancy (UALD) had a statistically significant effect on maxillary incisor and first molar estimation. Lower arch length discrepancy and IMPA had statistically significant effects on mandibular incisor estimation in both groups. Conclusions: Discrepancies between initial STO and final STO and factors contributing to the accuracy of initial STO must be considered in treatment planning of Class III surgical patients to increase the accuracy of prediction.

Prediction of Forest Fire Danger Rating over the Korean Peninsula with the Digital Forecast Data and Daily Weather Index (DWI) Model (디지털예보자료와 Daily Weather Index (DWI) 모델을 적용한 한반도의 산불발생위험 예측)

  • Won, Myoung-Soo;Lee, Myung-Bo;Lee, Woo-Kyun;Yoon, Suk-Hee
    • Korean Journal of Agricultural and Forest Meteorology
    • /
    • v.14 no.1
    • /
    • pp.1-10
    • /
    • 2012
  • The Digital Forecast of the Korea Meteorological Administration (KMA) is a 5 km gridded weather forecast over the Korean Peninsula and the surrounding oceanic regions in Korean territory. The Digital Forecast provides 12 weather forecast elements, such as three-hour interval temperature, sky condition, wind direction, wind speed, relative humidity, wave height, probability of precipitation, and 12-hour accumulated rain and snow, as well as daily minimum and maximum temperatures. These forecast elements are updated every three hours for the next 48 hours. The objective of this study was to construct a Forest Fire Danger Rating System for the Korean Peninsula (FFDRS_KORP) based on the daily weather index (DWI) and to improve its accuracy using the digital forecast data. We produced thematic maps of temperature, humidity, and wind speed over the Korean Peninsula to analyze the DWI. To calculate the DWI of the Korean Peninsula, we applied a forest fire occurrence probability model derived by logistic regression analysis, i.e. $[1+{\exp}\{-(2.494+(0.004{\times}T_{max})-(0.008{\times}EF))\}]^{-1}$. A verification test among the real-time observation data, the digital forecast, and RDAPS data showed that the digital forecast predicted better than the RDAPS data. In a comparison of the average forest fire danger rating index (sampled at 233 administrative districts), the digital forecast showed higher relative accuracy than the RDAPS data. The coefficient of determination of the forest fire danger rating was $R^2$=0.854. There was a difference of 0.5 between the national mean fire danger rating index based on the real-time observation data (70) and that based on the digital forecast (70.5).
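
The logistic model quoted in this abstract can be evaluated as in the sketch below. The coefficients come from the formula in the abstract; the function name is hypothetical, and reading $T_{max}$ as the daily maximum temperature and $EF$ as effective humidity is an assumption, since the abstract does not define either symbol.

```python
import math


def fire_occurrence_probability(t_max: float, ef: float) -> float:
    """Logistic model from the abstract: 1 / (1 + exp(-(2.494 + 0.004*Tmax - 0.008*EF)))."""
    z = 2.494 + 0.004 * t_max - 0.008 * ef
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical inputs: the probability rises with Tmax and falls with EF.
p_dry_warm = fire_occurrence_probability(t_max=25.0, ef=30.0)
p_humid_cool = fire_occurrence_probability(t_max=10.0, ef=80.0)
```

The signs of the coefficients carry the interpretation: a positive weight on $T_{max}$ and a negative weight on $EF$ mean that warmer and drier conditions push the predicted probability up.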

Studies on the Changes of Sex Hormone Concentrations in Milk during the Reproductive Stages of Dairy Cows (유우의 번식과정에 따른 유즙중의 성호르몬 수준 변화에 관한 연구)

  • 김상근;이재근
    • Korean Journal of Animal Reproduction
    • /
    • v.9 no.1
    • /
    • pp.9-30
    • /
    • 1985
  • The study was carried out to determine the changes in sex hormone levels in the milk of Holstein cows during reproductive stages such as the estrous cycle, pregnancy, and the periparturient period. FSH, LH, estradiol-17$\beta$, and progesterone in the milk samples were assayed by radioimmunoassay. The results of this study are summarized as follows: 1. The levels of progesterone and estradiol-17$\beta$ were similar among quarters, and they were higher after milking than before milking, though without statistical significance. 2. The milk progesterone levels during the estrous cycle reached a peak mean level of 3.55$\pm$0.26ng/$m\ell$ at 15 days after estrus and did not differ with the length of the estrous cycle. The estradiol-17$\beta$ levels during the estrous cycle showed a peak level of 36.40$\pm$2.38pg/$m\ell$ at estrus and decreased (17.20$\pm$0.46pg/$m\ell$ to 18.65$\pm$1.26pg/$m\ell$) in the luteal phase. 3. The FSH levels during the estrous cycle ranged from 2.25$\pm$0.23mIU/$m\ell$ to 4.35$\pm$0.24mIU/$m\ell$, showing significant changes. The LH levels during the estrous cycle gradually increased and remained at a peak level of 10.90$\pm$0.36mIU/$m\ell$ from 20 to 25 days after estrus. 4. The progesterone levels during pregnancy decreased from 30 to 60 days after artificial insemination, and thereafter continuously increased until 240 days. The estradiol-17$\beta$ levels during pregnancy were 24.56$\pm$1.19pg/$m\ell$ at day 30 after artificial insemination and increased rapidly until 180 days. The levels then decreased again to 26.17$\pm$3.03pg/$m\ell$ by 210 days and increased markedly to 68.00$\pm$8.70pg/$m\ell$ by 240 days. 5. The prolactin levels during pregnancy were 31.27$\pm$2.31ng/$m\ell$ and 42.60$\pm$2.37ng/$m\ell$ at days 150 and 240 after artificial insemination, respectively. The LH levels during pregnancy reached a peak of 27.47$\pm$7.90mIU/$m\ell$ at day 30 after artificial insemination, and thereafter gradually decreased. 6. The progesterone levels during the periparturient period reached a peak of 4.61$\pm$0.34ng/$m\ell$ at day 3 prepartum, and thereafter gradually decreased to 2.05$\pm$0.60ng/$m\ell$ at day 7 postpartum. The estradiol-17$\beta$ levels during the periparturient period were high, from 207.23$\pm$6.04pg/$m\ell$ at day 1 prepartum to 239.90$\pm$13.90pg/$m\ell$ at day 2 prepartum, and thereafter began to decline, reaching 51.87$\pm$1.72pg/$m\ell$ at day 7 postpartum. 7. The prolactin levels during the periparturient period were relatively high at the time of parturition. The LH levels during the periparturient period ranged from 6.32$\pm$0.32mIU/$m\ell$ to 13.90$\pm$1.37mIU/$m\ell$, showing significant changes. 8. The progesterone levels (4.6$\pm$0.8ng/$m\ell$) of pregnant cows were significantly higher than those (1.84$\pm$1.4ng/$m\ell$) of nonpregnant cows. Cows artificially inseminated 61 to 90 days after parturition showed higher progesterone levels. 9. From 20 to 25 days after artificial insemination, the accuracy of pregnancy diagnosis from milk progesterone levels was 94.4% for nonpregnant cows (<2.3ng/$m\ell$) and 75.0% for pregnant cows (3.2ng/$m\ell$). The average overall accuracy of pregnancy prediction for nonpregnant and pregnant cows was 83.3%. 10. The results obtained in this study suggest that understanding the endocrinological mechanisms through milk hormone analysis during the estrous cycle, pregnancy, and parturition would provide the basic information needed to increase reproductive efficiency. This study would not only provide an accurate method of early pregnancy diagnosis from milk progesterone levels but also contribute to research on methods for detecting FSH levels in milk, which has been difficult in blood serum.


Comparison of Size Criteria in Mediastinal Lymph Node Involvement of Adenocarcinoma of Lungs (폐 선암의 종격동 림프절 전이에 있어서 림프절 크기 기준의 비교)

  • Gu, Ki-Seon;Kuk, Hiang;Koh, Hyeck-Jae;Yang, Sei-Hun;Jeong, Eun-Taik
    • Tuberculosis and Respiratory Diseases
    • /
    • v.46 no.4
    • /
    • pp.542-547
    • /
    • 1999
  • Background: Determining mediastinal lymph node involvement in lung cancer by CT scan is very important and valuable for treatment planning and prognosis prediction. In general, a long diameter of a mediastinal lymph node of more than 15mm is used as the criterion for lung cancer involvement. Adenocarcinoma has a tendency toward early distant metastasis and micrometastasis, so adenocarcinoma may involve lymph nodes earlier, and the involvement cannot be detected until the lymph nodes are sufficiently enlarged. The authors tried to determine the difference between two size criteria (15mm, 10mm) in adenocarcinoma for the detection of cancer involvement. Methods: The sample comprised 60 cases (male 46, female 14, median age: 61.5 years). By pathology, there were 41 squamous cancers, 2 large cell cancers, and 17 adenocarcinomas. By TNM stage: I 23, III 24, IIIA 13. Results: The mean long diameter of involved lymph nodes was 16.0($\pm8.0$) mm in the non-adenocarcinoma group and 12.0($\pm3.2$) mm in the adenocarcinoma group (p<0.05). When a long diameter larger than 15mm is applied as the involvement criterion, the sensitivity, specificity, positive predictive index, negative predictive index, and accuracy of the non-adenocarcinoma group are 54%, 100%, 100%, 83%, and 86%, and those of the adenocarcinoma group are 43%, 90%, 75%, 69%, and 71%. When a long diameter larger than 10mm is applied as the involvement criterion, the sensitivity, specificity, positive predictive index, negative predictive index, and accuracy of the non-adenocarcinoma group are 65%, 77%, 61%, 92%, and 79%, and those of the adenocarcinoma group are 100%, 80%, 78%, 100%, and 88%. Conclusion: A long diameter larger than 10mm is a more valuable criterion for lymph node involvement in adenocarcinoma of the lungs.
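
The five indices reported in this abstract are all derived from a 2x2 table of CT result against pathology, as the sketch below shows. The abstract does not give the underlying counts: the counts used here are hypothetical, chosen only because they add up to the 17 adenocarcinoma cases and round to the reported percentages. The function name is also an assumption.

```python
from typing import Dict


def diagnostic_metrics(tp: int, fp: int, fn: int, tn: int) -> Dict[str, float]:
    """Sensitivity, specificity, predictive values, and accuracy from a 2x2 table."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),   # positive predictive index
        "npv": tn / (tn + fn),   # negative predictive index
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
    }


# Hypothetical counts for the 17 adenocarcinoma cases (not stated in the abstract).
criterion_10mm = diagnostic_metrics(tp=7, fp=2, fn=0, tn=8)
criterion_15mm = diagnostic_metrics(tp=3, fp=1, fn=4, tn=9)
```

Under these assumed counts, lowering the cutoff from 15mm to 10mm recovers the missed involved nodes at the price of one extra false positive, which is the trade-off the conclusion describes.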
