• Title/Summary/Keyword: Random samples

An Integrated Model based on Genetic Algorithms for Implementing Cost-Effective Intelligent Intrusion Detection Systems (비용효율적 지능형 침입탐지시스템 구현을 위한 유전자 알고리즘 기반 통합 모형)

  • Lee, Hyeon-Uk;Kim, Ji-Hun;Ahn, Hyun-Chul
    • Journal of Intelligence and Information Systems
    • /
    • v.18 no.1
    • /
    • pp.125-141
    • /
    • 2012
  • Malicious attacks and hacks on networked systems are increasing dramatically, and their patterns are changing rapidly. Handling them appropriately has therefore become more important, and there is considerable interest in and demand for effective network security systems such as intrusion detection systems. Intrusion detection systems are network security systems that detect, identify, and respond appropriately to unauthorized or abnormal activities. Conventional intrusion detection systems have generally been designed using experts' implicit knowledge of network intrusions or hackers' abnormal behaviors. Although they perform very well under normal conditions, they cannot handle new or unknown attack patterns. As a result, recent studies on intrusion detection systems use artificial intelligence techniques, which can respond proactively to unknown threats. Researchers have long adopted and tested various artificial intelligence techniques, such as artificial neural networks, decision trees, and support vector machines, to detect network intrusions. However, most studies have applied these techniques individually, even though combining them may lead to better detection. For this reason, we propose a new integrated model for intrusion detection. Our model combines the prediction results of four different binary classification models, namely logistic regression (LOGIT), decision trees (DT), artificial neural networks (ANN), and support vector machines (SVM), which may be complementary to each other. Genetic algorithms (GA) are used to find the optimal combining weights. The proposed model is built in two steps. In the first step, the integration model with the lowest prediction error (i.e., misclassification rate) is generated. In the second step, the model searches for the classification threshold for determining intrusions that minimizes the total misclassification cost. Calculating the total misclassification cost of an intrusion detection system requires understanding its asymmetric error cost scheme. There are generally two types of error in intrusion detection. The first is the False-Positive Error (FPE), in which normal activity is misjudged as an intrusion and may trigger unnecessary remediation. The second is the False-Negative Error (FNE), in which malware is misjudged as a normal program. FNE is more damaging than FPE, so the total misclassification cost is affected more by FNE than by FPE. To validate the practical applicability of our model, we applied it to a real-world dataset for network intrusion detection. The experimental dataset was collected from the IDS sensor of an official institution in Korea from January to June 2010. We collected 15,000 log records in total and selected 10,000 of them by random sampling. We also compared the results of our model with those of the single techniques to confirm the superiority of the proposed model. LOGIT and DT were run in PASW Statistics v18.0, and ANN was run in Neuroshell R4.0. For SVM, LIBSVM v2.90, a freeware package for training SVM classifiers, was used. Empirical results showed that the proposed GA-based model outperformed all the comparative models in detecting network intrusions in terms of accuracy. They also showed that the proposed model outperformed all the comparative models in terms of total misclassification cost. Consequently, our study is expected to contribute to building cost-effective intelligent intrusion detection systems.
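
The two-step procedure described in this abstract can be sketched as follows. This is a minimal illustration and not the authors' implementation: the GA operators (elitism, uniform crossover, Gaussian mutation), the population settings, the cost values `cost_fp=1.0` and `cost_fn=5.0`, and all function names are assumptions chosen for the example. Each row of `rows` is assumed to hold the four models' predicted intrusion probabilities for one sample.

```python
import random

def combine(probs, weights):
    """Weighted average of the base models' intrusion probabilities for one sample."""
    return sum(p * w for p, w in zip(probs, weights)) / sum(weights)

def error_rate(rows, labels, weights, threshold=0.5):
    """Share of samples misclassified by the weighted ensemble."""
    wrong = sum((combine(r, weights) >= threshold) != bool(y)
                for r, y in zip(rows, labels))
    return wrong / len(labels)

def ga_weights(rows, labels, pop_size=30, generations=40, seed=0):
    """Step 1: evolve combining weights that minimise the error rate."""
    rng = random.Random(seed)
    n_models = len(rows[0])
    pop = [[rng.uniform(0.01, 1.0) for _ in range(n_models)]
           for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=lambda w: error_rate(rows, labels, w))
        elite = pop[: pop_size // 2]              # elitism: keep the better half
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = rng.sample(elite, 2)
            child = [rng.choice(pair) for pair in zip(a, b)]            # uniform crossover
            child = [max(0.01, g + rng.gauss(0, 0.05)) for g in child]  # mutation
            children.append(child)
        pop = elite + children
    best = min(pop, key=lambda w: error_rate(rows, labels, w))
    total = sum(best)
    return [g / total for g in best]              # normalise so weights sum to 1

def total_cost(scores, labels, threshold, cost_fp=1.0, cost_fn=5.0):
    """Asymmetric cost: a missed intrusion (FN) is assumed costlier than a false alarm (FP)."""
    fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
    fn = sum(1 for s, y in zip(scores, labels) if s < threshold and y == 1)
    return cost_fp * fp + cost_fn * fn

def best_threshold(scores, labels, cost_fp=1.0, cost_fn=5.0):
    """Step 2: sweep thresholds 0.01..0.99 and keep the cheapest one."""
    candidates = [i / 100 for i in range(1, 100)]
    return min(candidates,
               key=lambda t: total_cost(scores, labels, t, cost_fp, cost_fn))
```

With a higher `cost_fn` than `cost_fp`, the selected threshold tends to fall below 0.5, because flagging more traffic as intrusive is cheaper than missing an attack.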

High-Speed Implementation and Efficient Memory Usage of Min-Entropy Estimation Algorithms in NIST SP 800-90B (NIST SP 800-90B의 최소 엔트로피 추정 알고리즘에 대한 고속 구현 및 효율적인 메모리 사용 기법)

  • Kim, Wontae;Yeom, Yongjin;Kang, Ju-Sung
    • Journal of the Korea Institute of Information Security & Cryptology
    • /
    • v.28 no.1
    • /
    • pp.25-39
    • /
    • 2018
  • NIST (National Institute of Standards and Technology) has recently published the second draft of SP 800-90B, a document for evaluating the security of entropy sources, which are a key element of a cryptographic random number generator (RNG), and has provided an accompanying tool implemented in Python. In SP 800-90B, the security evaluation of an entropy source is a process of estimating its min-entropy with several estimators. The estimation process is divided into an IID track and a non-IID track. In the IID track, the min-entropy is estimated with the MCV estimator only. In the non-IID track, it is estimated with 10 estimators, including the MCV estimator. The running time of NIST's tool in the non-IID track is approximately 20 minutes, and its memory usage exceeds 5.5 GB. For evaluation agencies that must repeatedly evaluate various samples, and for developers or researchers who must run experiments in various environments, estimating entropy with the tool may be inconvenient and, depending on the environment, impossible. In this paper, we propose high-speed implementations and an efficient memory usage technique for the min-entropy estimation algorithms of SP 800-90B. Our major contributions are three methods for improving speed and reducing memory usage: a method that exploits the advantages of C++ to speed up the MultiMCW estimator, a method that reduces the memory usage and improves the speed of MultiMMC by rebuilding its data storage structure, and a method that improves the speed of LZ78Y by rebuilding its data structure. The tool with our proposed methods applied is 14 times faster and uses 13 times less memory than NIST's tool.
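
For orientation, the MCV (most common value) estimator mentioned above can be sketched in a few lines. The formula follows the final SP 800-90B text (upper 99% confidence bound on the most common value's probability); the second draft discussed in the abstract may differ in detail, and the function name is an assumption. This is not the authors' C++ code.

```python
import math
from collections import Counter

def mcv_min_entropy(samples):
    """Most Common Value estimate of min-entropy, in bits per sample."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    # Proportion of the most frequent symbol in the data.
    p_hat = Counter(samples).most_common(1)[0][1] / n
    # Upper bound of the 99% confidence interval, capped at 1.
    p_upper = min(1.0, p_hat + 2.576 * math.sqrt(p_hat * (1 - p_hat) / (n - 1)))
    return -math.log2(p_upper)
```

For 20 samples in which the most common value occurs 8 times, the bound is about 0.6895 and the estimate is about 0.536 bits per sample.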

Electroencephalographic Changes Induced by a Neurofeedback Training : A Preliminary Study in Primary Insomniac Patients (뉴로피드백 훈련에 의한 뇌파 변화 연구 : 일차성 불면증 환자에 대한 예비 연구)

  • Lee, Jin Han;Shin, Hong-Beom;Kim, Jong Won;Suh, Ho-Suk;Lee, Young Jin
    • Sleep Medicine and Psychophysiology
    • /
    • v.26 no.1
    • /
    • pp.44-48
    • /
    • 2019
  • Objectives: Insomnia is one of the most prevalent sleep disorders. Recent studies suggest that cognitive and physical arousal play an important role in the generation of primary insomnia. Studies have also shown that information processing disorders due to cortical hyperactivity might interfere with normal sleep onset and sleep continuity. Therefore, targeting central nervous system arousal and normalizing information processing have become topics of current interest. Neurofeedback is known to reduce brain hyperarousal by modulating patients' brain waves during a sequence of behavior therapy. The purpose of this study was to investigate the effects of neurofeedback therapy on electroencephalography (EEG) characteristics in patients with primary insomnia. Methods: Thirteen subjects who met the criteria for an insomnia diagnosis and 14 control subjects matched for sex and age were included. Neurofeedback and sham treatments were each performed for 30 minutes, in random order. EEG spectral power analyses were performed to quantify the effects of neurofeedback therapy on brain waveforms. Results: In patients with primary insomnia, relative spectral theta and sigma power during a therapeutic neurofeedback session were significantly higher than during a sham session ($13.9 \pm 2.6$ vs. $12.2 \pm 3.8$ and $3.6 \pm 0.9$ vs. $3.2 \pm 1.0$ in %, respectively; p < 0.05). There were no statistically significant changes in other EEG spectral bands. Conclusion: For the first time in Korea, EEG spectral power in the theta band was found to increase when a neurofeedback session was applied to patients with insomnia. This outcome might provide some insight into new interventions for improving sleep onset. However, the treatment response of insomniacs could not be evaluated precisely because of the limitations of this pilot study, and follow-up studies with larger samples are required.
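
Relative spectral power, the measure reported in the Results, is the power in one frequency band expressed as a percentage of the total power. A dependency-free sketch is given below. The band limits (theta 4-8 Hz, sigma 12-15 Hz), the 0.5-30 Hz total range, and the function names are assumptions, since the abstract does not state them, and a naive DFT is used only to avoid external libraries; in practice an FFT-based estimator would be the usual choice.

```python
import cmath
import math

def power_spectrum(signal, fs, f_max):
    """Naive DFT periodogram up to f_max Hz, returned as (frequency, power) pairs."""
    n = len(signal)
    mean = sum(signal) / n
    x = [v - mean for v in signal]            # remove the DC offset
    k_max = int(f_max * n / fs)
    spectrum = []
    for k in range(1, k_max + 1):
        coeff = sum(v * cmath.exp(-2j * math.pi * k * i / n)
                    for i, v in enumerate(x))
        spectrum.append((k * fs / n, abs(coeff) ** 2))
    return spectrum

def relative_band_power(signal, fs, band, total=(0.5, 30.0)):
    """Power in `band` as a percentage of the power in the `total` range."""
    spec = power_spectrum(signal, fs, total[1])
    in_total = sum(p for f, p in spec if total[0] <= f <= total[1])
    in_band = sum(p for f, p in spec if band[0] <= f < band[1])
    return 100.0 * in_band / in_total

# Synthetic 4-second trace sampled at 128 Hz: a 6 Hz (theta) and a 13 Hz (sigma) component.
fs = 128
trace = [2.0 * math.sin(2 * math.pi * 6 * i / fs) + math.sin(2 * math.pi * 13 * i / fs)
         for i in range(4 * fs)]
theta_pct = relative_band_power(trace, fs, (4.0, 8.0))
sigma_pct = relative_band_power(trace, fs, (12.0, 15.0))
```

Comparing such percentages between a neurofeedback session and a sham session, per subject, is the kind of contrast the abstract reports.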

The Validity Test of Statistical Matching Simulation Using the Data of Korea Venture Firms and Korea Innovation Survey (벤처기업정밀실태조사와 한국기업혁신조사 데이터를 활용한 통계적 매칭의 타당성 검증)

  • An, Kyungmin;Lee, Young-Chan
    • Knowledge Management Research
    • /
    • v.24 no.1
    • /
    • pp.245-271
    • /
    • 2023
  • The shift to a data economy calls for new kinds of analysis beyond conventional research in the management field. Data matching refers to a technique or processing method that combines data sets collected from different samples of the same population. In this study, statistical matching was performed with random hot deck and Mahalanobis distance functions, using data from the 2020 Survey of Korea Venture Firms and the 2020 Korea Innovation Survey. Among the variables used for the statistical matching simulation, industry and number of workers were required to match exactly, and region, business power, listed market, and sales were set as common variables. The simulation was verified with a mean test and kernel density estimates. The analysis found that, although the mean test showed a difference, the kernel densities showed a similar pattern, so the statistical matching was judged appropriate. This study expands the range of research methods by experimenting with a data matching methodology that has not been sufficiently attempted in the management field, and it suggests implications in terms of data utilization and diversity.
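
Random hot deck matching within exact-match classes can be sketched as below. This is an illustration only: the record fields (`industry`, `size`, `rnd`, `sales`) and the function name are hypothetical, and the Mahalanobis distance variant mentioned in the abstract is not shown.

```python
import random
from collections import defaultdict

def random_hot_deck(recipients, donors, class_keys, donate_vars, seed=0):
    """Fill `donate_vars` in each recipient from a random donor in the same class.

    `class_keys` are the variables that must match exactly (for example
    industry and size class); recipients without a donor in their class
    receive None for the donated variables.
    """
    rng = random.Random(seed)
    pools = defaultdict(list)
    for d in donors:
        pools[tuple(d[k] for k in class_keys)].append(d)
    fused = []
    for r in recipients:
        pool = pools.get(tuple(r[k] for k in class_keys))
        record = dict(r)                      # copy so the input is not modified
        donor = rng.choice(pool) if pool else None
        for v in donate_vars:
            record[v] = donor[v] if donor else None
        fused.append(record)
    return fused

# Hypothetical records: donors carry `rnd`, recipients carry `sales`.
donors = [
    {"industry": "IT", "size": "small", "rnd": 1},
    {"industry": "IT", "size": "small", "rnd": 2},
    {"industry": "Bio", "size": "large", "rnd": 9},
]
recipients = [
    {"industry": "IT", "size": "small", "sales": 10},
    {"industry": "Bio", "size": "large", "sales": 50},
    {"industry": "Food", "size": "small", "sales": 5},
]
fused = random_hot_deck(recipients, donors, ["industry", "size"], ["rnd"])
```

The fused file can then be checked as in the abstract, by comparing means and kernel densities of the donated variable between the donor file and the fused file.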

Vegetation classification based on remote sensing data for river management (하천 관리를 위한 원격탐사 자료 기반 식생 분류 기법)

  • Lee, Chanjoo;Rogers, Christine;Geerling, Gertjan;Pennin, Ellis
    • Proceedings of the Korea Water Resources Association Conference
    • /
    • 2021.06a
    • /
    • pp.6-7
    • /
    • 2021
  • Vegetation development in rivers is an important issue not only in academic fields such as geomorphology, ecology, and hydraulics, but also in river management practice. The problem of river vegetation is directly connected to reconciling the conflicting values of flood management and ecosystem conservation. In Korea, since the 2000s, the issue of river vegetation and land formation has been raised continuously under various conditions, such as regulated rivers downstream of dams, small eutrophic tributary rivers, and the floodplain sites for the four major river projects. Against this background, this study proposes a method for classifying the distribution of vegetation in rivers based on remote sensing data and presents the results of applying it to the Naeseong Stream. The Naeseong Stream is a representative example of a river landscape that has changed due to vegetation development since 2014. The remote sensing data used in the study are images from the Sentinel-1 and Sentinel-2 satellites, which are operated by the European Space Agency (ESA), provided through Google Earth Engine. For the ground truth, a manually classified dataset of the surface of the Naeseong Stream in 2016 was used, in which the area is divided into eight types, including water, sand, and herbaceous and woody vegetation. The classification used a random forest, a machine learning algorithm. From 10 pre-selected polygon regions, 1,000 samples were extracted, half of which were used as training data and half as verification data. The accuracy on the verification data was 82-85%. The trained model was also applied to images from 2016 to 2020, and the year-by-year changes in vegetation zones were presented. The technical limitations of the method and possible improvements were also considered. By providing quantitative information on the vegetation distribution, this technique is expected to be useful in the practical management of river vegetation, such as thinning and rejuvenation, as well as in technical fields such as flood level calculation and flow-vegetation coupled modeling in rivers.
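
The evaluation protocol in this abstract (a half-and-half split of the labelled samples, then accuracy on the held-out half) can be sketched as follows. The random forest itself is not implemented here; any classifier's predictions can be passed to `overall_accuracy`. The function names and the fixed seed are assumptions.

```python
import random
from collections import Counter

def split_half(samples, seed=0):
    """Shuffle the samples and split them into training and verification halves."""
    rng = random.Random(seed)
    shuffled = list(samples)
    rng.shuffle(shuffled)
    mid = len(shuffled) // 2
    return shuffled[:mid], shuffled[mid:]

def confusion_matrix(true_labels, predicted_labels):
    """Count (true class, predicted class) pairs."""
    return Counter(zip(true_labels, predicted_labels))

def overall_accuracy(true_labels, predicted_labels):
    """Fraction of verification samples whose predicted class matches the truth."""
    cm = confusion_matrix(true_labels, predicted_labels)
    correct = sum(n for (t, p), n in cm.items() if t == p)
    return correct / sum(cm.values())
```

The confusion matrix also shows which of the surface types are confused with each other, which overall accuracy alone hides.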

Effects of vowel types and sentence positions in standard passage on auditory and cepstral and spectral measures in patients with voice disorders (모음 유형과 표준문단의 문장 위치가 음성장애 환자의 청지각적 및 켑스트럼 및 스펙트럼 분석에 미치는 효과)

  • Mi-Hyeon Choi;Seong Hee Choi
    • Phonetics and Speech Sciences
    • /
    • v.15 no.4
    • /
    • pp.81-90
    • /
    • 2023
  • Auditory perceptual assessment and acoustic analysis are commonly used in clinical practice for voice evaluation. This study aims to explore the effects of speech task context on auditory perceptual assessment and acoustic measures in patients with voice disorders. Sustained vowel phonations (/a/, /e/, /i/, /o/, /u/, /ɯ/, /ʌ/) and connected speech (a standardized paragraph 'kaeul' and nine sub-sentences) were obtained from a total of 22 patients with voice disorders. GRBAS ('G', 'R', 'B', 'A', 'S') and CAPE-V ('OS', 'R', 'B', 'S', 'P', 'L') auditory-perceptual assessments were performed by two certified speech language pathologists specializing in voice disorders, using blinded and randomized voice samples. Additionally, spectral and cepstral measures were analyzed using the Analysis of Dysphonia in Speech and Voice (ADSV) model. When voice quality was assessed with the GRBAS scale, the ratings were not significantly affected by vowel type except for 'B', while 'OS', 'R', and 'B' in CAPE-V were affected by vowel type (p<.05). In addition, measurements of CPP and L/H ratio were influenced by vowel types and sentence positions. CPP values in the standard paragraph showed significant negative correlations with all vowels, with the highest correlation observed for the /e/ vowel (r=-.739). The CPP of the second sentence had the strongest correlation with all vowels. Depending on the speech stimulus, CAPE-V may have a greater impact on auditory-perceptual assessment than GRBAS. Vowel types and sentence position with consonants influenced the 'B' scale, CPP, and L/H ratio. When using vowels in the voice assessment of patients with voice disorders, it would be beneficial to use not only /a/ but also the vowel /i/, which is acoustically highly correlated with 'breathy'. In addition, the /e/ vowel was highly correlated acoustically with the standardized passage and sub-sentences. Furthermore, given that most dysphonic signals are aperiodic, the second sentence of the 'kaeul' passage, which is the most acoustically correlated with all vowels, can be used with CPP. These results provide clinical evidence of the impact of speech tasks on auditory perceptual and acoustic measures, which may help to provide guidelines for voice evaluation in patients with voice disorders.