• Title/Summary/Keyword: the Combination Data

A Novel Network Anomaly Detection Method based on Data Balancing and Recursive Feature Addition

  • Liu, Xinqian;Ren, Jiadong;He, Haitao;Wang, Qian;Sun, Shengting
    • KSII Transactions on Internet and Information Systems (TIIS) / v.14 no.7 / pp.3093-3115 / 2020
  • Network anomaly detection systems play an essential role in detecting network anomalies and ensuring network security. Anomaly detection systems based on machine learning have become an increasingly popular solution. However, because network traffic is imbalanced and high-dimensional, existing methods are unable to achieve both high accuracy and a low false alarm rate. To address this problem, a new network anomaly detection method based on data balancing and recursive feature addition is proposed. First, a data balancing algorithm based on improved KNN outlier detection is designed to select a representative part of the data in each category. The parameters of the improved KNN outlier detection are optimized in combination by a genetic algorithm. Next, a recursive feature addition algorithm based on correlation analysis is proposed to select effective features, in which a cross contingency test is used to analyze correlation and obtain a strongly correlated feature subset. Then, a random forest model is used as the classification model to detect anomalies. Finally, the proposed algorithm is evaluated on the benchmark datasets KDD Cup 1999 and UNSW_NB15. The results show that the proposed strategies improve accuracy and recall and decrease the false alarm rate. Compared with other algorithms, this algorithm still achieves significant improvements, especially in recall for the small categories.
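
As a rough illustration of the recursive feature addition idea described in this abstract, the minimal Python sketch below ranks features by their correlation with the label and keeps each one only if a scoring function improves. It is not the authors' algorithm: Pearson correlation stands in for the cross contingency test, a nearest-centroid scorer stands in for the random forest, and the names `recursive_feature_addition`, `centroid_accuracy`, and the toy columns `f1`, `f2`, `f3` are all hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy) if sx and sy else 0.0


def centroid_accuracy(columns, labels, subset):
    """Toy scorer (assumption): training accuracy of a nearest-centroid rule."""
    rows = list(zip(*(columns[name] for name in subset)))
    centroids = {}
    for cls in set(labels):
        members = [r for r, y in zip(rows, labels) if y == cls]
        centroids[cls] = [sum(v) / len(members) for v in zip(*members)]

    def predict(row):
        return min(
            centroids,
            key=lambda c: sum((a - b) ** 2 for a, b in zip(row, centroids[c])),
        )

    return sum(predict(r) == y for r, y in zip(rows, labels)) / len(labels)


def recursive_feature_addition(columns, labels, evaluate):
    """Add features in order of |correlation|; keep one only if the score rises."""
    ranked = sorted(
        columns, key=lambda name: abs(pearson(columns[name], labels)), reverse=True
    )
    selected, best = [], float("-inf")
    for name in ranked:
        score = evaluate(selected + [name])
        if score > best:
            selected.append(name)
            best = score
    return selected, best


# Toy data: f1 separates the classes, f2 and f3 are uninformative.
labels = [0, 0, 0, 0, 1, 1, 1, 1]
columns = {
    "f1": [1, 2, 1, 2, 8, 9, 8, 9],
    "f2": [5, 1, 4, 2, 3, 5, 1, 4],
    "f3": [0, 1, 0, 1, 1, 0, 1, 0],
}
selected, best = recursive_feature_addition(
    columns, labels, lambda subset: centroid_accuracy(columns, labels, subset)
)
print(selected, best)
```

In this toy case only the informative feature is kept, because adding the other two never raises the score.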

Modeling of Policy Making for Big Data (빅데이터를 위한 정책결정 설계)

  • Lee, Sangwon;Park, Sungbum;Kim, Sunghyun;Chae, Seong Wook
    • Proceedings of the Korean Society of Computer Information Conference / 2015.01a / pp.281-282 / 2015
  • Data, by itself, will not reveal the optimal policy choice. Nor will data alone tell us what problems to focus on or how to direct resources. It should be recognized upfront that data-driven policy making cannot provide all the answers to the challenges of good governance. Policy decisions always depend on a combination of facts, analysis, judgment, and values. In this paper, we examine the factors involved in designing organizational policy making for Big Data.

New techniques in Echoview for fisheries acoustic data analysis

  • Higginbottom, Ian
    • Proceedings of the Korean Society of Fisheries Technology Conference / 2003.10a / pp.1-8 / 2003
  • Acoustics is widely used in marine and inland fisheries research and management. In June 2002, ICES (the International Council for the Exploration of the Sea) held a symposium titled “Acoustics in Fisheries and Aquatic Ecology” in Montpellier, France. Topics presented included the ecology of marine waters, combinations of methods, target strength (TS) methods and results, TS modeling, survey design, behavior, avoidance, technology, and identification. (omitted)

PSF Deconvolution on the Integral Field Unit Spectroscopy Data

  • Chung, Haeun;Park, Changbom
    • The Bulletin of The Korean Astronomical Society / v.44 no.1 / pp.58.4-58.4 / 2019
  • We present the application of the Point Spread Function (PSF) deconvolution method to astronomical Integral Field Unit (IFU) spectroscopy data, focusing on the restoration of galaxy kinematics. We apply the Lucy-Richardson deconvolution algorithm to the 2D image at each wavelength slice. We construct a set of mock IFU data that resemble IFU observations of model galaxies with diverse combinations of surface brightness profile, S/N, line-of-sight geometry, and Line-Of-Sight Velocity Distribution (LOSVD). Using the mock IFU data, we demonstrate that the algorithm can effectively recover the stellar kinematics of the galaxy. We also show that $\lambda_{R_e}$, the proxy of the spin parameter, can be correctly measured from the deconvolved IFU data. Applying the algorithm to actual SDSS-IV MaNGA IFU survey data yields noticeable differences in the 2D LOSVD, geometry, and $\lambda_{R_e}$. The algorithm can be applied to any other regular-grid IFS data to extract PSF-deconvolved spatial information.
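
To make the Lucy-Richardson step mentioned in this abstract concrete, here is a minimal one-dimensional sketch in pure Python. It applies the standard multiplicative update (estimate times the ratio image convolved with the flipped PSF) to a blurred point source. The 1D setting, the three-tap PSF, the zero-padded edges, and the function names are illustrative assumptions; the authors work on 2D images per wavelength slice and their implementation is not shown in the source.

```python
def convolve_same(signal, kernel):
    """Discrete convolution with zero-padded edges; output has len(signal)."""
    n, k = len(signal), len(kernel)
    half = k // 2
    out = [0.0] * n
    for i in range(n):
        acc = 0.0
        for j in range(k):
            src = i + half - j
            if 0 <= src < n:
                acc += signal[src] * kernel[j]
        out[i] = acc
    return out


def richardson_lucy(observed, psf, iterations=50, eps=1e-12):
    """Iteratively sharpen `observed`, assuming it was blurred by `psf`."""
    psf_flipped = psf[::-1]
    # Start from a flat, positive estimate carrying the same total flux.
    estimate = [sum(observed) / len(observed)] * len(observed)
    for _ in range(iterations):
        blurred = convolve_same(estimate, psf)
        ratio = [o / (b + eps) for o, b in zip(observed, blurred)]
        correction = convolve_same(ratio, psf_flipped)
        estimate = [e * c for e, c in zip(estimate, correction)]
    return estimate


# Toy example: a single point source blurred by a small symmetric PSF.
truth = [0.0, 0.0, 0.0, 0.0, 10.0, 0.0, 0.0, 0.0, 0.0]
psf = [0.25, 0.5, 0.25]
observed = convolve_same(truth, psf)
restored = richardson_lucy(observed, psf)
print([round(v, 3) for v in restored])
```

On noiseless data like this, the restored peak is sharper than the blurred one; with real, noisy data the number of iterations has to be limited to avoid amplifying noise.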

Combining Regression Model and Time Series Model to a Set of Autocorrelated Data

  • Jee, Man-Won
    • Journal of the military operations research society of Korea / v.8 no.1 / pp.71-76 / 1982
  • A procedure is established for combining a regression model and a time series model to fit a set of autocorrelated data. The procedure is based on an iterative method that computes the regression parameter estimates and the time series parameter estimates simultaneously. The time series model discussed is essentially the AR(p) model, since an MA(q) or ARMA(p,q) model can be inverted to an AR($\infty$) model, which can be approximated by an AR(p) model. The procedure discussed in this article applies in general to any combination of regression model and time series model.
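
The abstract does not spell out the iteration, so the sketch below uses the textbook Cochrane-Orcutt procedure as a stand-in: it alternates between estimating the regression coefficients and estimating an AR(1) coefficient from the residuals. The restriction to a single regressor and AR(1) errors, the synthetic data, and all function names are assumptions made for illustration, not the article's method.

```python
import random


def ols(x, y):
    """Simple least squares fit y = a + b*x; returns (a, b)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


def cochrane_orcutt(x, y, tol=1e-8, max_iter=100):
    """Fit y_t = a + b*x_t + e_t with e_t = rho*e_{t-1} + u_t."""
    a, b = ols(x, y)
    rho = 0.0
    for _ in range(max_iter):
        # Time series step: estimate rho from the current residuals.
        resid = [yi - a - b * xi for xi, yi in zip(x, y)]
        num = sum(resid[t] * resid[t - 1] for t in range(1, len(resid)))
        den = sum(r * r for r in resid[:-1])
        new_rho = num / den
        # Regression step: refit on quasi-differenced data.
        xs = [x[t] - new_rho * x[t - 1] for t in range(1, len(x))]
        ys = [y[t] - new_rho * y[t - 1] for t in range(1, len(y))]
        a_star, b = ols(xs, ys)
        a = a_star / (1.0 - new_rho)
        converged = abs(new_rho - rho) < tol
        rho = new_rho
        if converged:
            break
    return a, b, rho


# Synthetic data (assumption): a = 2, b = 3, rho = 0.6.
rng = random.Random(0)
x = [rng.uniform(0.0, 10.0) for _ in range(500)]
errors, prev = [], 0.0
for _ in range(500):
    prev = 0.6 * prev + rng.gauss(0.0, 0.5)
    errors.append(prev)
y = [2.0 + 3.0 * xi + ei for xi, ei in zip(x, errors)]
a_hat, b_hat, rho_hat = cochrane_orcutt(x, y)
print(a_hat, b_hat, rho_hat)
```

Extending the same alternation to AR(p) errors would replace the single lag-1 ratio with a fit of p autoregressive coefficients to the residuals.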

Isolation of a New Carotenoid Pigment from an Undescribed Gorgonian of the Genus Muricella

  • 노정래;서영완;조기웅;송준임;신종헌
    • Bulletin of the Korean Chemical Society / v.17 no.6 / pp.529-531 / 1996
  • Muricellaxanthin, a novel carotenoid pigment, has been isolated by activity-guided separation from an undescribed gorgonian of the genus Muricella collected from Jaeju Island. The structure of this compound was determined by a combination of spectral methods. Its stereochemistry was defined by interpretation of nOe data and by comparison of CD data with those of related compounds. Muricellaxanthin exhibited potent lethality against brine-shrimp larvae.

A STUDY ON MIDDLE AGED PEOPLE'S COMPLIANCE FOR PREVENTIVE HEALTH BEHAVIOR OF CANCER (우리나라 일부 중년층 남녀의 암에 대한 예방적 건강행위 이행에 관한 연구)

  • 김은주;문인옥
    • Korean Journal of Health Education and Promotion / v.4 no.2 / pp.9-31 / 1987
  • This study was conducted because of the investigator's concern about the high incidence and fatal nature of cancer in the prime years of human life. Its purpose was to investigate the factors affecting compliance with preventive health behavior for cancer. The analysis was based on data from a survey of 828 married men and women aged 40-59. The study instrument was based on Becker's Health Belief Model. The data were analyzed using the $\chi^2$-test, t-test, ANOVA, Pearson's correlation coefficient, and stepwise multiple regression. The results were as follows: 1. The examined group had higher scores than the non-examined group on the health belief variables (p<0.001). 2. The higher the level of the health belief variables, the higher the level of compliance with preventive health behavior (p<0.001). 3. In the stepwise multiple regression of compliance with preventive health behavior on the variables in the health belief model, approximately 65.5% of the variance in compliance was accounted for by health concern, susceptibility, and barriers in combination. This means that other factors seem to influence preventive health behavior, since the linear combination of variables failed to explain the remaining 34.5% of preventive health behavior for cancer. This tended to cast doubt on the usefulness of the 5 variables in this model. Therefore, further study is needed to investigate the factors that influence preventive health behavior for cancer.

Dissatisfaction with and design preferences for mountain gear as determined by specialization activity-pursued for recreational mountaineering (여가적 등산에서의 전문화 활동 추구에 따른 등산복 불만족과 선호 디자인)

  • Han, Heejung;Kim, Mi Sook
    • The Research Journal of the Costume Culture / v.22 no.4 / pp.526-542 / 2014
  • The purpose of this study was to investigate differences in dissatisfaction with and design preferences for mountain gear among segments divided by the specialization activity pursued in recreational mountaineering. Data were collected by questionnaire survey from 900 subjects who had gone mountaineering and purchased mountain gear in the past year, and 891 responses were used for the data analysis. The results of the study were as follows. Three factors were formulated based on the mountaineering specialization activity pursued: expertise-pursued mountaineering, mountaineering with psychological attachment, and activity-oriented mountaineering. Four segments were identified based on the specialization activity pursued: the emotionally-committed, the continuously-participated, the expertise-pursued, and the passively-participated. Significant differences were found among the segments in dissatisfaction with and design preference for mountain gear. The expertise-pursued tended to be more dissatisfied with color and fabric than the others, and preferred varied mountain gear designs in shape, color combination, and construction line. On the other hand, the passively-participated tended to prefer a simple and comfortable style with solid colors and simple color combinations.

A Scheme for Filtering SNPs Imputed in 8,842 Korean Individuals Based on the International HapMap Project Data

  • Lee, Ki-Chan;Kim, Sang-Soo
    • Genomics & Informatics / v.7 no.2 / pp.136-140 / 2009
  • Genome-wide association (GWA) studies may benefit from the inclusion of imputed SNPs into their dataset. Due to its predictive nature, the imputation process is typically not perfect. Thus, it would be desirable to develop a scheme for filtering out the imputed SNPs by maximizing the concordance with the observed genotypes. We report such a scheme, which is based on the combination of several parameters that are calculated by PLINK, a popular GWA analysis software program. We imputed the genotypes of 8,842 Korean individuals, based on approximately 2 million SNP genotypes of the CHB+JPT panel in the International HapMap Project Phase II data, complementing the 352k SNPs in the original Affymetrix 5.0 dataset. A total of 333,418 SNPs were found in both datasets, with a median concordance rate of 98.7%. The concordance rates were calculated at different ranges of parameters, such as the number of proxy SNPs (NPRX), the fraction of successfully imputed individuals (IMPUTED), and the information content (INFO). The poor concordance that was observed at the lower values of the parameters allowed us to develop an optimal combination of the cutoffs (IMPUTED $\geq$ 0.9 and INFO $\geq$ 0.9). A total of 1,026,596 SNPs passed the cutoff, of which 94,364 were found in both datasets and had 99.4% median concordance. This study illustrates a conservative scheme for filtering imputed SNPs that would be useful in GWA studies.
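
The cutoff rule reported in this abstract (IMPUTED $\geq$ 0.9 and INFO $\geq$ 0.9) and the notion of a concordance rate can be sketched in a few lines of Python. The record layout, the SNP identifiers, the genotype strings, and the function names below are hypothetical; the sketch does not read PLINK output and only illustrates the filtering logic.

```python
from dataclasses import dataclass


@dataclass
class ImputedSnp:
    snp_id: str     # hypothetical identifier
    imputed: float  # fraction of individuals successfully imputed (IMPUTED)
    info: float     # information content of the imputation (INFO)


def passes_cutoffs(snp, min_imputed=0.9, min_info=0.9):
    """Apply the abstract's cutoffs: IMPUTED >= 0.9 and INFO >= 0.9."""
    return snp.imputed >= min_imputed and snp.info >= min_info


def concordance(observed, imputed):
    """Fraction of individuals whose imputed genotype matches the observed one."""
    pairs = list(zip(observed, imputed))
    return sum(o == i for o, i in pairs) / len(pairs)


snps = [
    ImputedSnp("snp_a", imputed=0.97, info=0.95),
    ImputedSnp("snp_b", imputed=0.97, info=0.62),
    ImputedSnp("snp_c", imputed=0.71, info=0.93),
]
kept = [s.snp_id for s in snps if passes_cutoffs(s)]
rate = concordance(["AA", "AG", "GG", "AG"], ["AA", "AG", "GG", "GG"])
print(kept, rate)
```

Concordance can only be computed for SNPs present in both the genotyped and imputed sets, which is why the abstract reports it for the overlapping subset.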

Limiting conditions prediction using machine learning for loss of condenser vacuum event

  • Dong-Hun Shin;Moon-Ghu Park;Hae-Yong Jeong;Jae-Yong Lee;Jung-Uk Sohn;Do-Yeon Kim
    • Nuclear Engineering and Technology / v.55 no.12 / pp.4607-4616 / 2023
  • We implement machine learning regression models to predict the peak pressures of the primary and secondary systems, a major safety concern in a Loss Of Condenser Vacuum (LOCV) accident. We selected the Multi-dimensional Analysis of Reactor Safety-KINS standard (MARS-KS) code to analyze the LOCV accident, and the reference plant is the Korean Optimized Power Reactor 1000MWe (OPR1000). eXtreme Gradient Boosting (XGBoost) is selected as the machine learning tool. The MARS-KS code is used to generate LOCV accident data, and the data are used to train the machine learning model. Hyperparameter optimization is performed using simulated annealing. Randomly generated combinations of initial conditions within the operating range are fed into the XGBoost model to predict the peak pressure. The initial conditions that produce the peak pressure are then run through MARS-KS to generate results. After this process, the error between the predicted value and the code output is calculated. The uncertainty of the machine learning model is also calculated to verify the model accuracy. The machine learning model presented in this paper successfully identifies a combination of initial conditions that produces a more conservative peak pressure than the values calculated with existing methodologies.
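
The abstract names simulated annealing for hyperparameter optimization without giving details, so the following is only a generic sketch of that technique in pure Python. The objective `toy_validation_error` is a made-up stand-in for a real cross-validated model error, and the two tuned values (a learning rate and a tree depth treated as continuous), the cooling schedule, and the step sizes are all assumptions.

```python
import math
import random


def toy_validation_error(params):
    """Hypothetical error surface with its minimum at (0.1, 6.0)."""
    learning_rate, depth = params
    return (learning_rate - 0.1) ** 2 + (depth - 6.0) ** 2 / 100.0


def simulated_annealing(objective, start, step, t0=1.0, cooling=0.95,
                        n_iter=500, seed=0):
    """Minimize `objective` by random perturbation with a cooling temperature."""
    rng = random.Random(seed)
    current, current_cost = list(start), objective(start)
    best, best_cost = list(current), current_cost
    temp = t0
    for _ in range(n_iter):
        # Propose a neighbour by perturbing each parameter within its step.
        cand = [c + rng.uniform(-s, s) for c, s in zip(current, step)]
        cost = objective(cand)
        # Always accept improvements; accept worse moves with a probability
        # that shrinks as the temperature drops.
        if cost < current_cost or rng.random() < math.exp((current_cost - cost) / temp):
            current, current_cost = cand, cost
            if cost < best_cost:
                best, best_cost = list(cand), cost
        temp *= cooling
    return best, best_cost


start_point = [0.5, 3.0]
best, best_cost = simulated_annealing(
    toy_validation_error, start_point, step=[0.05, 0.5]
)
print(best, best_cost)
```

In a real setting the objective would train and validate a model for each candidate, so the number of iterations is usually bounded by training cost rather than chosen freely.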