• Title/Summary/Keyword: Probability Score

A Study of Anomaly Detection for ICT Infrastructure using Conditional Multimodal Autoencoder (ICT 인프라 이상탐지를 위한 조건부 멀티모달 오토인코더에 관한 연구)

  • Shin, Byungjin;Lee, Jonghoon;Han, Sangjin;Park, Choong-Shik
    • Journal of Intelligence and Information Systems / v.27 no.3 / pp.57-73 / 2021
  • Maintenance and failure prevention through anomaly detection in ICT infrastructure are becoming increasingly important. System monitoring data are multidimensional time series, and it is difficult to account for the characteristics of multidimensional data and of time series data at the same time. With multidimensional data, the correlation between variables must be considered. Existing probability-based, linear, and distance-based methods degrade because of the curse of dimensionality. In addition, time series data are usually preprocessed with sliding windows and time series decomposition for autocorrelation analysis. These techniques increase the dimension of the data, so they need to be supplemented. Anomaly detection is a long-established research field in which statistical methods and regression analysis were used in the early days. Currently, machine learning and artificial neural networks are being actively applied to the field. Statistical methods are difficult to apply when the data are non-homogeneous, and they do not detect local outliers well. Regression-based methods learn a regression formula based on parametric statistics and then detect anomalies by comparing the predicted value with the actual value. Their performance drops when the model is not robust or when the data contain noise or outliers, so the training data must be free of noise and outliers. An autoencoder built with an artificial neural network is trained to produce output as similar as possible to its input. It has many advantages over existing probabilistic and linear models, cluster analysis, and supervised learning. It can be applied to data that do not satisfy a probability distribution or linearity assumption. 
In addition, unsupervised learning is possible, with no labeled data required. However, autoencoder-based anomaly detection is limited in identifying local outliers in multidimensional data, and the dimension of the data increases greatly because of the characteristics of time series data. In this study, we propose a Conditional Multimodal Autoencoder (CMAE) that improves anomaly detection performance by considering local outliers and time series characteristics. First, we applied a Multimodal Autoencoder (MAE) to overcome the limitation in identifying local outliers in multidimensional data. Multimodal models are commonly used to learn from different types of input, such as voice and image. The different modalities share the autoencoder's bottleneck and thereby learn their correlation. In addition, a Conditional Autoencoder (CAE) was used to learn the characteristics of time series data effectively without increasing the dimension of the data. Conditional inputs are usually categorical variables, but in this study time was used as the condition in order to learn periodicity. The proposed CMAE model was verified by comparison with a Unimodal Autoencoder (UAE) and a Multimodal Autoencoder (MAE). The reconstruction performance for 41 variables was checked in the proposed model and the comparison models. Reconstruction performance differs by variable; in all three autoencoder models, reconstruction works well for the Memory, Disk, and Network modalities, where the loss is small. The Process modality showed no significant difference among the three models, and the CPU modality showed excellent performance in CMAE. ROC curves were drawn to evaluate anomaly detection performance in the proposed model and the comparison models, and AUC, accuracy, precision, recall, and F1-score were compared. On all indicators, performance ranked in the order CMAE, MAE, and AE. 
In particular, the recall of CMAE was 0.9828, which confirms that it detects almost all anomalies. The accuracy of the model also improved to 87.12%, and the F1-score was 0.8883, which is considered suitable for anomaly detection. From a practical standpoint, the proposed model has an advantage beyond the performance improvement. Techniques such as time series decomposition and sliding windows add procedures that must be managed, and the dimensional increase they cause can slow down inference. The proposed model is easy to apply to practical tasks in terms of inference speed and model management.
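
As a rough illustration of two ideas in the abstract above (time used as a condition, and reconstruction error used as the anomaly signal), the sketch below encodes hour of day as a cyclic sin/cos pair and flags samples whose reconstruction error exceeds a threshold. The sin/cos encoding, the function names, the sample data, and the threshold are all assumptions made for illustration; the abstract does not say how the authors encoded time or chose a threshold, and no trained autoencoder is included here.

```python
import math

def time_condition(hour, period=24.0):
    """Encode a time value as a cyclic (sin, cos) pair (assumed encoding)."""
    angle = 2.0 * math.pi * (hour % period) / period
    return (math.sin(angle), math.cos(angle))

def reconstruction_error(x, x_hat):
    """Mean squared error between an input vector and its reconstruction."""
    return sum((a - b) ** 2 for a, b in zip(x, x_hat)) / len(x)

def flag_anomalies(inputs, reconstructions, threshold):
    """Return True for each sample whose reconstruction error exceeds the threshold."""
    return [
        reconstruction_error(x, r) > threshold
        for x, r in zip(inputs, reconstructions)
    ]

# Hypothetical data: the second sample is reconstructed poorly.
inputs = [[0.2, 0.4, 0.1], [0.9, 0.1, 0.8]]
reconstructions = [[0.21, 0.39, 0.1], [0.3, 0.5, 0.2]]
flags = flag_anomalies(inputs, reconstructions, threshold=0.05)
```

A two-value condition of this kind adds a fixed, small number of inputs regardless of how long the period is, which is one way to read the abstract's point about learning periodicity without sliding windows.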

New index for the gifted students(G-Index) with EEG analysis (뇌파검사 자료를 기반으로 한 과학영재 판별 지수(G-Index) 개발과 적용)

  • Kim, Kyung-Hwa;Kim, Kyu-Han;Lee, Sun-Kil;Hur, Myung;Kim, Yong-Jin
    • Journal of Gifted/Talented Education / v.15 no.1 / pp.67-84 / 2005
  • In this study we examined the adequacy of tools for identifying gifted students by comparing the relationships among paper test scores, in-depth interview scores, other data (TTCT: Torrance Tests of Creative Thinking; IQ test; FASP: Find A Shape Puzzle; V.T: Visualization Tests; Exp: experimental ability test), and EEG analysis data. We then developed a brain-wave gifted index (G-Index) to provide an additional discriminator based on brain wave data. The index is built on the brain characteristics of gifted students in the eyes-closed resting state, in which the difference between gifted and normal students was judged to be clearest and most consistent. Specifically, the degree of agreement between each subject's pattern and the gifted PCA pattern was defined by a Pearson correlation to which a spatial mutual index was added as a weight. This refers to the mean number of spatial PCA patterns. When the index was used to discriminate gifted students, it was effective in 76% of cases. A regression analysis based on the relationships among the data gave the following probability formula for identifying the gifted group: $$P=\frac{1}{1+e^{-[-0.018(\text{TTCT})+0.057(\text{IQ})+1.916(\text{FASP})+0.682(\text{V.T})+0.088(\text{Exp.})+0.034(\text{G-Index})-57.510]}}$$ With this formula, the probability of correct identification in the gifted group was very good (95.0%). Based on these results, tools for identifying gifted students should draw on many-sided assessment data wherever possible. Further studies should address the development and operation of identification tools suited to students gifted in science.
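
The probability formula in the abstract above is a logistic function of a weighted sum, which can be written out as a short sketch. The coefficients and intercept are copied from the formula; the function name, the keyword names, and the scale of each input score are assumptions, since the abstract does not give score ranges.

```python
import math

# Coefficients and intercept copied from the formula in the abstract.
COEFFS = {
    "ttct": -0.018,
    "iq": 0.057,
    "fasp": 1.916,
    "vt": 0.682,
    "exp": 0.088,
    "g_index": 0.034,
}
INTERCEPT = -57.510

def gifted_probability(**scores):
    """Logistic probability P = 1 / (1 + e^(-z)), z = intercept + sum(coef * score)."""
    z = INTERCEPT + sum(COEFFS[name] * scores[name] for name in COEFFS)
    return 1.0 / (1.0 + math.exp(-z))

# With every score at zero, z equals the intercept and P is effectively zero.
p_zero = gifted_probability(ttct=0, iq=0, fasp=0, vt=0, exp=0, g_index=0)
```

Because the sum sits inside a logistic function, each coefficient shifts the log-odds linearly: a one-point rise in FASP adds 1.916 to z, while a one-point rise in G-Index adds 0.034.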

Development of a Gifted Behavior Checklist Based on the Observation Probability and Importance of the Behavior in Class (관찰가능성과 중요도를 고려한 관찰·추천용 초등 영재 행동 특성 체크리스트 개발)

  • Lee, In-Ho;Han, Ki-Soon
    • Journal of Gifted/Talented Education / v.25 no.6 / pp.817-836 / 2015
  • This research develops a behavior checklist for gifted children that can be applied to the nationwide observation-recommendation method for identifying gifted children. The measure is significant because it reflects the actual observations of teachers who teach gifted children first-hand and includes an importance rating for each characteristic. Lists of behavioral characteristics and specific behavior patterns of gifted children were collected from gifted-education teachers through an open-ended survey, and the checklist was developed through expert group review, a pre-test, and confirmatory factor analysis. Earlier checklists were difficult to apply to observation-recommendation in the field because they included behaviors that cannot be observed in school, less important behaviors, and conflicting or duplicated behaviors. To address these problems, problematic items were removed based on the observation probability and importance of each behavior. The resulting checklist contains 32 behavioral characteristics in ten sub-factors (logical thinking, high achievement, originality, perfectionism, creative problem solving, curiosity, task commitment, conversation ability, creativity, passion), with two to three questions on each factor. Internal consistency tests and item-total score correlations showed that each item consistently evaluates its corresponding variable. In addition, confirmatory factor analysis showed that every item loads appropriately on its sub-factor, which strongly suggests that the checklist is suitable for the observation-recommendation of gifted elementary school children.

The MeSH-Term Query Expansion Models using LDA Topic Models in Health Information Retrieval (MeSH 기반의 LDA 토픽 모델을 이용한 검색어 확장)

  • You, Sukjin
    • Journal of Korean Library and Information Science Society / v.52 no.1 / pp.79-108 / 2021
  • Information retrieval in the health field has several challenges. Health information terminology is difficult for consumers (laypeople) to understand. Formulating a query with professional terms is not easy for consumers because health-related terms are more familiar to health professionals. If health terms related to a query were added automatically, consumers could find relevant information more easily. The proposed query expansion (QE) models show how to expand a query using MeSH terms. The documents were represented by the MeSH terms found in the full-text articles (i.e., Bag-of-MeSH). The MeSH terms were then used to generate LDA (Latent Dirichlet Allocation) topic models. A query and the top k retrieved documents were used to find MeSH terms as topic words related to the query. LDA topic words were filtered by threshold values of topic probability (TP) and word probability (WP). Threshold values were effective in an LDA model with a specific number of topics in increasing IR performance in terms of infAP (inferred Average Precision) and infNDCG (inferred Normalized Discounted Cumulative Gain), which are common IR metrics for large data collections with incomplete judgments. The top k words were chosen by a word score based on TP × WP and on retrieved document ranking in an LDA model with specific thresholds. The QE model with specific thresholds for TP and WP showed improved mean infAP and infNDCG scores in an LDA model compared with the baseline result.
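
The filtering and scoring step described in the abstract above can be sketched as follows. This is a minimal sketch under stated assumptions: the input layout (a list of topic-probability and word-probability pairs), the function name, the threshold values, and the sample terms are hypothetical, and the score uses TP × WP only. The abstract also mentions retrieved document ranking in the word score, which is not modeled here.

```python
def expand_query(topics, tp_min, wp_min, k):
    """Pick the top-k expansion terms by TP * WP after threshold filtering.

    topics: list of (topic_probability, {term: word_probability}) pairs,
            an assumed layout for the LDA output.
    """
    scores = {}
    for tp, words in topics:
        if tp < tp_min:
            continue  # topic is too weakly related to the query
        for term, wp in words.items():
            if wp < wp_min:
                continue  # term is too weakly tied to the topic
            # Keep the best score if a term appears under several topics.
            scores[term] = max(scores.get(term, 0.0), tp * wp)
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [term for term, _ in ranked[:k]]

# Hypothetical topic distribution for one query.
topics = [
    (0.60, {"Hypertension": 0.30, "Blood Pressure": 0.20, "Diet": 0.01}),
    (0.05, {"Neoplasms": 0.50}),
]
expansion_terms = expand_query(topics, tp_min=0.10, wp_min=0.05, k=2)
```

In this example the second topic is dropped by the TP threshold even though its word probability is high, and "Diet" is dropped by the WP threshold.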

Application of peak based-Bayesian statistical method for isotope identification and categorization of depleted, natural and low enriched uranium measured by LaBr3:Ce scintillation detector

  • Haluk Yucel;Selin Saatci Tuzuner;Charles Massey
    • Nuclear Engineering and Technology / v.55 no.10 / pp.3913-3923 / 2023
  • Today, medium energy resolution detectors are preferred in radioisotope identification devices (RIDs) for categorizing nuclear and radioactive material. However, there is still a need to develop or enhance "automated identifiers" for useful RID algorithms. In deciding whether a material is SNM or NORM, a key parameter is the energy resolution of the detector. Although masking, shielding, gain shift/stabilization, and other on-site parameters are also important for successful operation, the suitability of the RID algorithm is critical for enhancing identification reliability when extracting features from the spectral analysis. In this study, a RID algorithm based on a Bayesian statistical method was modified for medium energy resolution detectors and applied to uranium gamma-ray spectra taken with a LaBr3:Ce detector. The present Bayesian RID algorithm covers the energy range up to 2000 keV. It uses the peak centroids and the peak areas from the measured gamma-ray spectra. The extracted features are used in peak-based Bayesian classifiers to estimate a posterior probability for each isotope in the ANSI library. The program was tested on the MATLAB platform. The present peak-based Bayesian RID algorithm was validated using single isotopes (241Am, 57Co, 137Cs, 54Mn, 60Co) and then applied to five standard nuclear materials (0.32-4.51% at. 235U), as well as natural U and Th ores. The ID performance of the RID algorithm was quantified in terms of F-score for each isotope. The posterior probability was calculated to be 54.5-74.4% for 238U and 4.7-10.5% for 235U in EC-NRM171 uranium materials. For the more complex gamma-ray spectra from CRMs, the total scoring (ST) method was preferred for evaluating ID performance. 
It was shown that the present peak-based Bayesian RID algorithm can be applied to identify 235U and 238U isotopes in LEU or natural U-Th samples if a medium energy resolution detector is used in the measurements.

The Elderly Families' Daily Food Cultivation, Preservation in Rural, Korea -Comparison with middle aged families- (농촌거주 노년가족의 일상 식품 생산과 가공 및 저장 -중년가족과의 비교-)

  • Rhie Seung Gyo;Chung Kum Ju;Won Hyang Ryu
    • The Korean Journal of Community Living Science / v.16 no.2 / pp.111-120 / 2005
  • Recently, the number of elderly people in rural Korea has increased remarkably, and their food security has deteriorated, mainly because of low economic status. To investigate the food security of elderly people, data were obtained by questionnaire from rural elderly people engaged in traditional agricultural production of daily foods. A total of 1,870 subjects were sampled from 9 provinces by PPS (Probability Proportional to Size). The questionnaire covered dietary habits and food cultivation, production, and preservation, and the survey was conducted by trained interviewers. SAS (ver. 8.1) was used for the statistical analyses, which consisted of chi-square tests and general linear models. Elderly families made up 45.4% of the total; the age of the male head was 82.1 years and that of the female was 67.7 years, and 68.8% of elderly women were working for family income or pocket money. The elderly families cultivated pepper (59.1%), Chinese cabbage (61.91%), and sesame (48.6%) for their own consumption, but bean sprouts (6.5%), tofu (7.7%), and eggs (5.1%) showed low rates of home production. The rates of cultivating Chinese cabbage (61.9%) and sesame (48.6%) were significantly higher than those of middle-aged families. For fermented foods, elderly families maintained higher production rates than middle-aged families for Doenjang (87.4%), Gochujang (86.3%), Kanjang (84.0%), Kimchi (92.9%), Jangachi (27.6%), and Meju (91.61%). Food preservation by elderly families was low, limited to jam (5.3%) and bottled products (1.4%). Slightly higher rates were observed for preserved foods such as alcohol (9.9%) and powder (9.8%). For elderly families, the food cultivation score was 4.08/12 points and the food preservation score was 0.62/12 points. 
The fermented food production score of elderly families was 10.24/12 points, which was significantly different from that of middle-aged families (9.58/12 points, p<0.001). These results suggest that elderly people need to produce more protein-rich foods.

A Comparative Study on HSI and MaxEnt Habitat Prediction Models: About Prionailurus bengalensis (HSI와 MaxEnt를 통한 삵의 서식지 예측 모델 비교 연구)

  • Yoo, Da-Young;Lim, Tai-Yang;Kim, Whee-Moon;Song, Won-Kyong
    • Journal of the Korean Society of Environmental Restoration Technology / v.24 no.5 / pp.1-14 / 2021
  • Excessive development and urbanization have destroyed animal and plant habitats and reduced biodiversity. To preserve species diversity, habitat prediction studies have been conducted at home and overseas using various modeling techniques. This study compares HSI and MaxEnt, two widely used habitat modeling techniques, in order to suggest directions for optimal habitat modeling research. The study targeted the endangered species Prionailurus bengalensis in an area (5,460.35 km²) including and surrounding Cheonan City, and the same data were used for both models so that they could be compared. According to the HSI analysis, the habitat probability of Prionailurus bengalensis was less than 0.5 for 74.65% and more than 0.5 for 25.34%, and the top 30% was forest (99.07%). In the MaxEnt analysis, 56.22% was below 0.5 and 43.79% was above 0.5, and the model showed high explanatory power with an AUC of 78.3%. The paired Wilcoxon test, which evaluated the significance of the difference between the models, confirmed that the mean difference between the two models was statistically significant (p<0.05). A matrix table comparing the results of the two models shows that the HSI and MaxEnt scores agreed for 24.43%: 12.44% in the 0.0 to 0.2 section, 7.22% in the 0.2 to 0.4 section, 2.73% in the 0.4 to 0.6 section, 1.96% in the 0.6 to 0.8 section, and 0.08% in the 0.9 to 1.0 section. To verify where the score differences appear, the result values of the models were reclassified to values from 1 to 5 and overlaid. The overlay analysis gave 30.26% Strongly agree, 56.77% Agree, and 11.92% Disagree. The places where the scores differ were, in order, forest (45.23%), agricultural land (34.57%), and urbanized area (7.65%). This confirms that forecasts for the same target species within the same target site differ depending on the modeling method. 
Therefore, a novel analysis method that combines the advantages of each modeling technique should be developed for habitat prediction studies, and the results may be used in the future to select protection areas for Prionailurus bengalensis and other species. Further research will require higher accuracy through the use of various modeling techniques and on-site verification.
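
The matrix-table comparison described in the abstract above can be illustrated with a short sketch that bins two sets of habitat scores into classes and reports the share of cells where both models fall in the same class. The equal 0.2-wide classes, the function names, and the sample scores are assumptions for illustration only; the abstract does not describe the authors' exact binning procedure.

```python
from bisect import bisect_right

# Assumed class boundaries: five equal-width classes over [0, 1].
EDGES = [0.2, 0.4, 0.6, 0.8]

def score_class(value):
    """Map a score in [0, 1] to a class index 0..4."""
    return bisect_right(EDGES, value)

def agreement_by_class(hsi_scores, maxent_scores):
    """Share of all cells, per class, where both models land in the same class."""
    n = len(hsi_scores)
    shares = {}
    for h, m in zip(hsi_scores, maxent_scores):
        c = score_class(h)
        if c == score_class(m):
            shares[c] = shares.get(c, 0.0) + 1.0 / n
    return shares

# Hypothetical cell scores from the two models.
hsi = [0.1, 0.3, 0.9, 0.5]
maxent = [0.15, 0.7, 0.95, 0.55]
shares = agreement_by_class(hsi, maxent)
total_agreement = sum(shares.values())
```

Summing the per-class shares gives the overall agreement, which is how the per-section percentages in the abstract add up to its total.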

C7 Fracture as a Complication of C7 Dome-Like Laminectomy : Impact on Clinical and Radiological Outcomes and Evaluation of the Risk Factors

  • Yang, Seung Heon;Kim, Chi Heon;Lee, Chang Hyun;Ko, Young San;Won, Youngil;Chung, Chun Kee
    • Journal of Korean Neurosurgical Society / v.64 no.4 / pp.575-584 / 2021
  • Objective : Cervical expansive laminoplasty is an effective surgical method to address multilevel cervical spinal stenosis. During surgery, the spinous processes of C2 and C7 are usually preserved to keep the insertion points of the cervical musculature and nuchal ligament intact. In this regard, dome-like laminectomy (undercutting of C7 lamina) instead of laminoplasty is performed on C7 in selected cases. However, resection of the lamina can weaken the C7 lamina, and stress fractures may occur, but this complication has not been characterized in the literature. The objective of the present study was to investigate the incidence and risk factors for C7 laminar fracture after C7 dome-like laminectomy and its impact on clinical and radiological outcomes. Methods : Patients who underwent cervical open-door laminoplasty combined with C7 dome-like laminectomy (n=123) were classified according to the presence of C7 laminar fracture. Clinical parameters (neck/arm pain score and neck disability index) and radiologic parameters (C2-7 angle, C2-7 sagittal vertical axis, and C7-T1 angle) were compared between the groups preoperatively and postoperatively at 3, 6, 12, and 24 months. Risk factors for complications were evaluated, and a formula estimating C7 fracture risk was suggested. Results : C7 lamina fracture occurred in 32/123 (26%) patients and occurred at the bilateral isthmus in 29 patients and at the spinolaminar junction in three patients. All fractures appeared on X-ray within 3 months postoperatively, but patients did not present any neurological deterioration. The fracture spontaneously healed in 27/32 (84%) patients at 1 year and in 29/32 (91%) at 2 years. During follow-up, clinical outcomes were not significantly different between the groups. However, patients with C7 fractures showed a more lordotic C2-7 angle and kyphotic C7-T1 angle than patients without C7 fractures. C7 fracture was significantly associated with the extent of bone removal. 
By incorporating significant factors, the probability of C7 laminar fracture could be assessed with the formula 'Risk score = 1.08 × depth (%) + 1.03 × length (%, of the posterior height of C7 vertebral body)', and a cut-off value of 167.9% demonstrated a sensitivity of 90.3% and a specificity of 65.1% (area under the curve, 0.81). Conclusion : C7 laminar fracture can occur after C7 dome-like laminectomy when a substantial amount of lamina is resected. Although C7 fractures may not cause deleterious clinical outcomes, they can lead to an unharmonized cervical curvature. The chance of C7 fracture should be discussed in the shared decision-making process.
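
The risk score formula quoted in the abstract above is a plain weighted sum, so its arithmetic can be shown directly. The coefficients and the 167.9 cut-off are taken from the abstract; the function names, the sample percentages, and the choice to treat a score equal to the cut-off as above it are assumptions, since the abstract does not state how ties are handled.

```python
CUTOFF = 167.9  # cut-off value reported in the abstract

def c7_risk_score(depth_pct, length_pct):
    """Risk score = 1.08 * depth (%) + 1.03 * length (%)."""
    return 1.08 * depth_pct + 1.03 * length_pct

def exceeds_cutoff(depth_pct, length_pct, cutoff=CUTOFF):
    """True when the score reaches the cut-off (tie handling is an assumption)."""
    return c7_risk_score(depth_pct, length_pct) >= cutoff

# Hypothetical resection extents, in percent.
high = c7_risk_score(80, 80)   # 86.4 + 82.4
low = c7_risk_score(50, 50)    # 54.0 + 51.5
```

With nearly equal weights on depth and length, the score behaves roughly like the sum of the two percentages, so the cut-off corresponds to a combined resection extent of about 160%.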

A Prognostic Factor for Prolonged Mechanical Ventilator-Dependent Respiratory Failure after Cervical Spinal Cord Injury : Maximal Canal Compromise on Magnetic Resonance Imaging

  • Lee, Subum;Roh, Sung Woo;Jeon, Sang Ryong;Park, Jin Hoon;Kim, Kyoung-Tae;Lee, Young-Seok;Cho, Dae-Chul
    • Journal of Korean Neurosurgical Society / v.64 no.5 / pp.791-798 / 2021
  • Objective : The period of mechanical ventilator (MV)-dependent respiratory failure after cervical spinal cord injury (CSCI) varies from patient to patient. This study aimed to identify predictors of MV at hospital discharge (MVDC) due to prolonged respiratory failure among patients with MV after CSCI. Methods : Two hundred forty-three patients with CSCI were admitted to our institution between May 2006 and April 2018. Their medical records and radiographic data were retrospectively reviewed. Level and completeness of injury were defined according to the American Spinal Injury Association (ASIA) standards. Respiratory failure was defined as the requirement for definitive airway and assistance of MV. We also evaluated magnetic resonance imaging characteristics of the cervical spine. These characteristics included : maximum canal compromise (MCC); intramedullary hematoma or cord transection; and integrity of the disco-ligamentous complex for assessment of the Subaxial Cervical Spine Injury Classification (SLIC) scoring. The inclusion criteria were patients with CSCI who underwent decompression surgery within 48 hours after trauma with respiratory failure during hospital stay. Patients with Glasgow coma scale 12 or lower, major fatal trauma of vital organs, or stroke caused by vertebral artery injury were excluded from the study. Results : Out of 243 patients with CSCI, 30 required MV during their hospital stay, and 27 met the inclusion criteria. Among them, 48.1% (13/27) of patients had MVDC with greater than 30 days MV or death caused by aspiration pneumonia. In total, 51.9% (14/27) of patients could be weaned from MV during 30 days or less of hospital stay (MV days : MVDC 38.23±20.79 vs. MV weaning, 13.57±8.40; p<0.001). Vital signs at hospital arrival, smoking, the American Society of Anesthesiologists classification, Associated injury with Injury Severity Score, SLIC score, and length of cord edema did not differ between the MVDC and MV weaning groups. 
The ASIA impairment scale, level of injury within C3 to C6, and MCC significantly affected MVDC. The MCC significantly correlated with MVDC, and the optimal cutoff value was 51.40%, with 76.9% sensitivity and 78.6% specificity. In multivariate logistic regression analysis, MCC >51.4% was a significant risk factor for MVDC (odds ratio, 7.574; p=0.039). Conclusion : As a method of predicting which patients would be able to undergo weaning from MV early, the MCC is a valid factor. If the MCC exceeds 51.4%, prognosis of respiratory function becomes poor and the probability of MVDC is increased.

Cyber attack group classification based on MITRE ATT&CK model (MITRE ATT&CK 모델을 이용한 사이버 공격 그룹 분류)

  • Choi, Chang-hee;Shin, Chan-ho;Shin, Sung-uk
    • Journal of Internet Computing and Services / v.23 no.6 / pp.1-13 / 2022
  • As the information and communication environment develops, the environment of military facilities is also developing remarkably. Cyber threats are increasing in proportion, and in particular APT attacks, which are difficult to prevent with existing signature-based cyber defense systems, frequently target military and national infrastructure. Identifying the attack group is important for an appropriate response, but it is very difficult because cyber attacks are conducted in secret using methods such as anti-forensics. In the past, after an attack was detected, a security expert had to spend a long time on high-level analysis of the large amount of collected evidence to get a clue about the attack group. To solve this problem, this paper proposes an automated technique that can classify an attack group within a short time after detection. Compared with general cyber attacks, APT attacks are few in number, little data about them is known, and they are designed to bypass signature-based cyber defense techniques. As the attack model, we used MITRE ATT&CK®, which models many aspects of cyber attacks. We designed an impact score that considers the versatility of the attack techniques and proposed a group similarity score based on it. Experimental results show that the proposed method classified the attack group with a Top-5 accuracy of 72.62%.
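
Top-5 accuracy, the metric reported in the abstract above, counts a prediction as correct when the true group appears anywhere in the five highest-ranked candidates. The sketch below shows only that metric; the function name and the placeholder group labels are hypothetical, and the paper's impact score and group similarity score are not reproduced because the abstract does not define them.

```python
def top_k_accuracy(ranked_candidates, true_groups, k=5):
    """Fraction of cases whose true group is among the first k ranked candidates."""
    hits = sum(
        1
        for ranked, truth in zip(ranked_candidates, true_groups)
        if truth in ranked[:k]
    )
    return hits / len(true_groups)

# Hypothetical rankings (most similar group first) for three incidents.
ranked_candidates = [
    ["G1", "G4", "G2"],
    ["G3", "G1", "G5"],
    ["G2", "G5", "G4"],
]
true_groups = ["G4", "G5", "G1"]
top2 = top_k_accuracy(ranked_candidates, true_groups, k=2)
top3 = top_k_accuracy(ranked_candidates, true_groups, k=3)
```

A Top-k metric suits this task because the output is a short list of candidate groups for an analyst to review, not a single final attribution.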