• Title/Summary/Keyword: Score distribution

Pattern Selection Using the Bias and Variance of Ensemble (앙상블의 편기와 분산을 이용한 패턴 선택)

  • Shin, Hyunjung;Cho, Sungzoon
    • Journal of Korean Institute of Industrial Engineers
    • /
    • v.28 no.1
    • /
    • pp.112-127
    • /
    • 2002
  • A useful pattern is a pattern that contributes much to learning. For a classification problem those patterns near the class boundary surfaces carry more information to the classifier. For a regression problem the ones near the estimated surface carry more information. In both cases, the usefulness is defined only for those patterns either without error or with negligible error. Using only the useful patterns gives several benefits. First, computational complexity in memory and time for learning is decreased. Second, overfitting is avoided even when the learner is over-sized. Third, learning results in more stable learners. In this paper, we propose a pattern 'utility index' that measures the utility of an individual pattern. The utility index is based on the bias and variance of a pattern trained by a network ensemble. In classification, the pattern with a low bias and a high variance gets a high score. In regression, on the other hand, the one with a low bias and a low variance gets a high score. Based on the distribution of the utility index, the original training set is divided into a high-score group and a low-score group. Only the high-score group is then used for training. The proposed method is tested on synthetic and real-world benchmark datasets. The proposed approach gives a better or at least similar performance.
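
As a rough illustration of the selection idea in the abstract above, the sketch below scores classification patterns from the votes of an ensemble. The abstract does not give the formula for the utility index, so everything here is an assumption: bias and variance are taken under 0-1 loss (bias is 0 when the majority vote matches the target, variance is the fraction of members that disagree with the majority), the score `(1 - bias) * variance` is a hypothetical stand-in for the paper's index, and the votes and threshold are made-up values.

```python
from collections import Counter


def vote_bias_variance(votes, target):
    """Return (bias, variance) of one pattern under 0-1 loss (assumed definition)."""
    main_prediction = Counter(votes).most_common(1)[0][0]  # majority vote
    bias = 0 if main_prediction == target else 1
    variance = sum(v != main_prediction for v in votes) / len(votes)
    return bias, variance


def classification_utility(votes, target):
    """Hypothetical score: high only when bias is low and variance is high."""
    bias, variance = vote_bias_variance(votes, target)
    return (1 - bias) * variance


def split_by_score(patterns, threshold):
    """Split pattern indices into a high-score group and a low-score group."""
    high, low = [], []
    for index, (votes, target) in enumerate(patterns):
        score = classification_utility(votes, target)
        (high if score >= threshold else low).append(index)
    return high, low


# Made-up ensemble votes (five members) and targets for three patterns.
patterns = [
    ([1, 1, 1, 1, 1], 1),  # unanimous and correct: far from the boundary
    ([1, 1, 1, 0, 0], 1),  # correct majority, members disagree: near the boundary
    ([0, 0, 0, 1, 0], 1),  # wrong majority: high bias
]
high_group, low_group = split_by_score(patterns, threshold=0.2)
print(high_group, low_group)
```

Only the indices in `high_group` would then be kept for training; a regression variant would instead favor low bias together with low variance.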

Machine learning-based Predictive Model of Suicidal Thoughts among Korean Adolescents. (머신러닝 기반 한국 청소년의 자살 생각 예측 모델)

  • YeaJu JIN;HyunKi KIM
    • Journal of Korea Artificial Intelligence Association
    • /
    • v.1 no.1
    • /
    • pp.1-6
    • /
    • 2023
  • This study developed models using decision forest, support vector machine, and logistic regression methods to predict and prevent suicidal ideation among Korean adolescents. The study sample consisted of 51,407 individuals after removing missing data from the raw data of the 18th (2022) Youth Health Behavior Survey conducted by the Korea Centers for Disease Control and Prevention. Analysis was performed using the MS Azure program with Two-Class Decision Forest, Two-Class Support Vector Machine, and Two-Class Logistic Regression. The results of the study showed that the decision forest model achieved an accuracy of 84.8% and an F1-score of 36.7%. The support vector machine model achieved an accuracy of 86.3% and an F1-score of 24.5%. The logistic regression model achieved an accuracy of 87.2% and an F1-score of 40.1%. Applying the logistic regression model with SMOTE to address data imbalance resulted in an accuracy of 81.7% and an F1-score of 57.7%. Although the accuracy slightly decreased, the recall, precision, and F1-score improved, demonstrating excellent performance. These findings have significant implications for the development of prediction models for suicidal ideation among Korean adolescents and can contribute to the prevention of youth suicide.
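
The gap between accuracy and F1-score reported above is typical of imbalanced data. The sketch below computes both from a confusion matrix using the standard definitions; the counts are hypothetical and are not taken from the study.

```python
def classification_metrics(tp, fp, fn, tn):
    """Accuracy, precision, recall and F1 for the positive class."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical counts: 130 positives among 1,000 cases, most of them missed.
metrics = classification_metrics(tp=30, fp=20, fn=100, tn=850)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With these made-up counts the accuracy is high because the many true negatives dominate it, while F1 stays low because recall on the minority class is poor. Resampling methods such as SMOTE aim to trade some accuracy for better minority-class recall.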

Color Image Segmentation by statistical approach (확률적 방법을 통한 컬러 영상 분할)

  • Gang Seon-Do;Yu Heon-U;Jang Dong-Sik
    • Proceedings of the Korean Operations and Management Science Society Conference
    • /
    • 2006.05a
    • /
    • pp.1677-1683
    • /
    • 2006
  • Color image segmentation is useful for fast retrieval in large image databases. For that purpose, a new image segmentation technique based on the probability distribution of pixels in the image is proposed. A color image is first divided into R, G, and B channel images. Then the pixel distribution of each channel image is extracted and matched against well-known probability distribution functions (Weibull, Exponential, Beta, Gamma, Normal, and Uniform) to select the one it most resembles. The sum of squared errors is used to measure how well an image fits a distribution, and the p.d.f. with the minimum sum of squared errors is chosen. Next, each channel image is quantized into 4 gray levels by applying thresholds to the c.d.f. of the distribution selected for that channel. Finally, the three quantized images are combined into one color image to obtain the final segmentation result. To show the validity of the proposed method, experiments on several images are performed.
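
The per-channel steps described above (fit candidate distributions, keep the one with the smallest squared error, quantize through its c.d.f.) can be sketched as follows. This is a simplified illustration under several assumptions that the abstract does not confirm: only three of the six candidate families are included, parameters are estimated by simple moments, the error is measured between the model c.d.f. and the empirical c.d.f., and the four gray levels use equal-probability thresholds at 0.25, 0.5 and 0.75.

```python
import math
from statistics import NormalDist, mean, pstdev


def candidate_cdfs(values):
    """Moment-fitted c.d.f.s for a subset of the candidate families."""
    mu, sigma = mean(values), pstdev(values)
    lo, hi = min(values), max(values)
    normal = NormalDist(mu, sigma)
    return {
        "normal": normal.cdf,
        "exponential": lambda x: 1.0 - math.exp(-x / mu) if x >= 0 else 0.0,
        "uniform": lambda x: min(1.0, max(0.0, (x - lo) / (hi - lo))),
    }


def sse_against_empirical(values, cdf):
    """Sum of squared differences between a model c.d.f. and the empirical one."""
    ordered = sorted(values)
    n = len(ordered)
    return sum((cdf(x) - (i + 1) / n) ** 2 for i, x in enumerate(ordered))


def best_fit(values):
    """Return (name, cdf) of the candidate with the minimum squared error."""
    cdfs = candidate_cdfs(values)
    errors = {name: sse_against_empirical(values, cdf) for name, cdf in cdfs.items()}
    name = min(errors, key=errors.get)
    return name, cdfs[name]


def quantize(values, cdf, levels=4):
    """Map each pixel to a gray level by thresholding its c.d.f. value."""
    return [min(levels - 1, int(cdf(v) * levels)) for v in values]


# Hypothetical channel: evenly spread intensities from 0 to 255.
channel = list(range(0, 256, 5))
name, cdf = best_fit(channel)
print(name, quantize([0, 100, 150, 255], cdf))
```

Running the same procedure on the R, G, and B channels and stacking the three quantized results would give the combined segmentation the abstract describes.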

Impact of Climate Change on Business Process in the Distribution Industry

  • Kim, Young-Ei
    • Journal of Distribution Science
    • /
    • v.12 no.12
    • /
    • pp.5-17
    • /
    • 2014
  • Purpose - The purpose of this study is to examine the possible ways to minimize damage by analyzing the influence that may be exerted upon the business process of the distribution industry by unexpected climate change. Research design, data, and methodology - The optimum business process is to be implemented after dividing the diversified business process of the distribution industry into the four stages of the Business Continuity Plan (BCP). Results - First, the upper-level risks that would be impacted most sensitively by climate change have been selected. Second, the impact and characteristics of the environment have been discovered. Third, weighted values by criteria item of upper-level business risks have been analyzed. Fourth, it was possible to define the business priority order based on the individual and then to adjust the Recovery Time Objective (RTO). Conclusion - In this study, the priority order has been defined quantitatively by calculating the priority order score. Further, the priority order has been determined depending on whether any targeted business unit is applicable to the items of the business nature criteria.

Evaluation of Oil Spill Detection Models by Oil Spill Distribution Characteristics and CNN Architectures Using Sentinel-1 SAR data (Sentinel-1 SAR 영상을 활용한 유류 분포특성과 CNN 구조에 따른 유류오염 탐지모델 성능 평가)

  • Park, Soyeon;Ahn, Myoung-Hwan;Li, Chenglei;Kim, Junwoo;Jeon, Hyungyun;Kim, Duk-jin
    • Korean Journal of Remote Sensing
    • /
    • v.37 no.5_3
    • /
    • pp.1475-1490
    • /
    • 2021
  • Detecting oil spill areas using the statistical characteristics of SAR images has limitations: the classification algorithm is complicated and is greatly affected by outliers. To overcome these limitations, studies using neural networks to classify oil spills have recently been conducted. However, few studies have evaluated whether a model shows consistent detection performance across various oil spill cases. Therefore, in this study, two CNNs (Convolutional Neural Networks) with basic structures (Simple CNN and U-net) were used to determine whether detection performance differs according to the CNN structure and the distribution characteristics of the oil spill. With the method proposed in this study, the Simple CNN, which has a contracting path only, detected oil spills with an F1 score of 86.24%, and U-net, which has both contracting and expansive paths, showed an F1 score of 91.44%. Both models successfully detected oil spills, but the detection performance of U-net was higher than that of the Simple CNN. Additionally, to compare the accuracy of the models across various oil spill cases, the cases were classified into four categories according to the spatial distribution characteristics of the oil spill (presence of land near the oil spill area) and the clarity of the border between oil and seawater. The Simple CNN had F1 scores of 85.71%, 87.43%, 86.50%, and 85.86% for the four categories, with a maximum difference of 1.71%. For U-net, the values were 89.77%, 92.27%, 92.59%, and 92.66%, with a maximum difference of 2.90%. These results indicate that neither model showed significant differences in detection performance by the characteristics of oil spill distribution. However, detection tendencies differed according to the model structure and the oil spill distribution characteristics. In all four oil spill categories, the Simple CNN tended to overestimate the oil spill area and U-net tended to underestimate it. These tendencies were more pronounced when the border between oil and seawater was unclear.

Coping methods related with post-traumatic stress types for the firefighters who experienced the Dae-gu subway fire disaster (대구지하철 참사를 경험한 소방관의 외상 후 스트레스유형에 따른 대처방식)

  • Baek, Mi-Lye
    • The Korean Journal of Emergency Medical Services
    • /
    • v.11 no.3
    • /
    • pp.5-15
    • /
    • 2007
  • The purpose of this study was to identify the distribution of post-traumatic stress types and coping methods, and to find the relationship between the post-traumatic stress types and the coping methods, among firefighters who experienced the Dae-gu Subway Fire Disaster. The subjects of this study were 126 firefighters who experienced the Dae-gu Subway Disaster. A Q questionnaire developed through a Q-study and a coping methods instrument, based on that of Folkman & Lazarus as revised and supplemented by Kim Jung Hee, were used. Data were analyzed by t-test and ANOVA using SPSS. The results of this study were as follows: 1. The distribution of post-traumatic stress types was 52.4% Emotional arousal trauma, 34.1% Trauma experience persistence, and 13.5% Physiological symptom experience. 2. The difference in post-traumatic stress types according to general characteristics was significantly related to physical injury (p = .010). 3. The minimum score of coping with post-traumatic stress types was 0.07, the maximum was 2.96, and the mean score was 1.27. 4. The coping methods according to general characteristics differed significantly in the active coping method according to educational level (p = .001), the passive coping method according to educational level (p = .003), and the passive coping method according to diagnosis (p = 0.20). 5. The mean scores of the active coping method by type were Emotional arousal trauma (1.505), Trauma experience persistence (1.322), and Physiological symptom experience (1.276). The mean scores of the passive coping method by type were Emotional arousal trauma (1.328), Trauma experience persistence (1.254), and Physiological symptom experience (1.219).

The Characteristics and Implementations of Quality Metrics for Analyzing Innovation Effects in Six Sigma Projects (식스시그마 프로젝트 사례에서 혁신효과 분석을 위한 품질척도의 특성 및 적용)

  • Choi, Sungwoon
    • Journal of the Korea Safety Management & Science
    • /
    • v.16 no.1
    • /
    • pp.169-176
    • /
    • 2014
  • This research discusses the characteristics and implementation strategies of two types of quality metrics for analyzing innovation effects in six sigma projects: the fixed specification type and the moving specification type. $Z_{st}$ and $P_{pk}$ are quality metrics of the fixed specification type, which are influenced by a predetermined specification. In contrast, quality metrics of the moving specification type, such as the Strictly Standardized Mean Difference (SSMD), Z-Score, F-Statistic, and t-Statistic, are independent of the predetermined specification. The $Z_{st}$ sigma level yields defective rates in Parts Per Million (PPM) and Defects Per Million Opportunities (DPMO). However, defective rates are not comparable between industrial sectors because of their inherent technological differences. To compare defective rates between different industrial sectors on a relative basis, the ratio of specification to natural tolerance, called $P_{pk}$, is used. The drawback of the $P_{pk}$ metric is that it is highly dependent on the specification. The F-Statistic and t-Statistic identify the innovation effect by comparing accuracy and precision before and after. These statistics are not affected by the specification, but they are affected by the type of statistical distribution model and the sample size. Hence, the statistical significance determined by these two statistics may not lead to the same conclusion as practical significance. In conclusion, SSMD and Z-Score are the best quality metrics, being uninfluenced by fixed specification, theoretical distribution model, and arbitrary sample size. These metrics also identify the innovation effects on accuracy and precision before and after. It is beneficial to use the SSMD and Z-Score methods along with the popular $Z_{st}$ sigma level and $P_{pk}$ methods that are commonly employed in six sigma projects. Case studies from the national six sigma contests of 2011 and 2012 are presented and analyzed to provide quality practitioners with guidelines for the use of quality metrics.
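
To make the contrast between the two metric types concrete, the sketch below computes SSMD (which needs no specification limits) and $P_{pk}$ (which depends on them). It uses the commonly cited textbook forms, SSMD $= (\mu_2 - \mu_1)/\sqrt{\sigma_1^2 + \sigma_2^2}$ for independent groups and $P_{pk} = \min(USL - \mu,\ \mu - LSL)/(3\sigma)$ with the overall standard deviation; the paper's exact definitions are not given in the abstract, and the measurements and specification limits are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def ssmd(before, after):
    """Strictly standardized mean difference for two independent samples."""
    spread = sqrt(stdev(before) ** 2 + stdev(after) ** 2)
    return (mean(after) - mean(before)) / spread


def ppk(samples, lsl, usl):
    """Process performance index against lower/upper specification limits."""
    mu, sigma = mean(samples), stdev(samples)
    return min(usl - mu, mu - lsl) / (3 * sigma)


# Hypothetical measurements before and after an improvement project.
before = [10, 12, 14]
after = [13, 15, 17]

print(f"SSMD = {ssmd(before, after):.4f}")
print(f"Ppk (before, limits 6..18) = {ppk(before, lsl=6, usl=18):.2f}")
```

Changing `lsl` or `usl` changes the `ppk` result but leaves `ssmd` untouched, which is the dependence on specification that the abstract points out.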

Relationships among Health behavior, Social Support, Behavior Pattern, and Self-Efficacy of Hospital Nurses (임상간호사의 건강행위, 사회적지지, 행동유형과 자기효능감과의 관련성)

  • Lee, Young-Mee
    • Journal of the Korea Academia-Industrial cooperation Society
    • /
    • v.12 no.11
    • /
    • pp.4861-4868
    • /
    • 2011
  • This study investigates the health behavior, self-efficacy, behavior pattern, and social support of hospital nurses, to provide basic data for programs that promote self-efficacy and social support. Self-administered questionnaires were given to 202 nurses employed in hospitals during the period from June 10 to June 26, 2011. As a result, the self-efficacy score differed significantly according to educational status, working period, marital status, place of duty, religion, night duty, distribution of household labor, exercise, and stress management. The social support score differed significantly according to place of duty, religion, distribution of household labor, alcohol drinking, smoking, and stress management. However, the behavior pattern score did not differ significantly according to self-efficacy or social support. In the correlation analysis, the self-efficacy score correlated positively with social support but not with behavior pattern.

Characteristics and Socio-Demographic Distribution of Precarious Employment Among Korean Wage Workers: A Proposition of Multidimensional Approach Using a Summative Score

  • Seong-Uk Baek;Min-Seok Kim;Myeong-Hun Lim;Taeyeon Kim;Jin-Ha Yoon;Jong-Uk Won
    • Safety and Health at Work
    • /
    • v.14 no.4
    • /
    • pp.476-482
    • /
    • 2023
  • Introduction: There is a growing global interest in the issue of precarious employment. We aimed to analyze the characteristics and socio-demographic distribution of precarious employment using a summative score approach. Methods: To operationalize precarious employment, we utilized data from the Korean Working Conditions Survey and focused on three distinct dimensions: employment insecurity, income inadequacy, and a lack of rights and protections. By constructing a summative scale ranging from -16 to 2, with lower scores indicating higher precariousness, we measured employment precariousness among Korean wage workers. To compare employment precariousness according to survey participant characteristics, we employed the Wilcoxon Rank Sum Test. Results: We analyzed a weighted number of 38,432 workers. The overall sample showed a median (Q1, Q3) summative scale score of -3 (-6, -1). The median summative score was lower for women compared to men (men: -2; women: -5; p < 0.001), as well as for young or older workers compared to middle-aged workers (young: -4; middle-aged: -2; older: -5; p < 0.001). Similarly, workers with lower educational levels (middle school or below: -8; high school: -5; college or above: -2; p < 0.001) and non-white collar workers (blue collar: -5; service/sales worker: -6; white collar: -2; p < 0.001) experienced higher levels of employment precariousness. Conclusion: Our findings indicate that certain vulnerable groups, such as women, young or older adults, workers with low educational attainment, and caregiving or low-skilled elementary workers, are disproportionately exposed to high employment precariousness. Active policy interventions are needed to improve the employment quality of vulnerable groups.
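
The abstract above reports group medians with quartiles and compares groups with the Wilcoxon rank-sum test. The sketch below shows how such a summary and the rank-sum statistic could be computed with the standard library. The scores are made-up values, the quartile method (`inclusive`) is an assumption, and only the test statistic is computed; a p-value would normally come from a statistics package.

```python
from statistics import median, quantiles


def summarize(scores):
    """Return (median, Q1, Q3) of a list of summative scores."""
    q1, _, q3 = quantiles(scores, n=4, method="inclusive")
    return median(scores), q1, q3


def rank_sum(group_a, group_b):
    """Wilcoxon rank-sum statistic of group_a, with ties given average ranks.

    Computed via the Mann-Whitney U count: W = U + n_a * (n_a + 1) / 2.
    """
    u = sum(
        1.0 if a > b else 0.5 if a == b else 0.0
        for a in group_a
        for b in group_b
    )
    n_a = len(group_a)
    return u + n_a * (n_a + 1) / 2


# Hypothetical summative scores (lower means more precarious), not survey data.
group_a = [-2, -1, 0, -3]
group_b = [-5, -6, -2, -8]

print(summarize(group_a + group_b))
print(rank_sum(group_a, group_b))
```

A rank sum far above or below its expected value under no group difference, here `n_a * (n_a + n_b + 1) / 2`, is what the test treats as evidence that the two score distributions differ.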

The Study on the Distribution of Sasang Constitution and UPDRS(Unified Parkinson's Disease Rating Scale) among Parkinson's Disease Patients (파킨슨 환자의 사상채질 및 UPDRS 분포 연구)

  • Jung, Ji-Chul;Kim, Kun-Hyung;Park, Sang-Min;Lee, Sang-Hoon;Chang, Dae-Il;Lee, Yun-Ho
    • Journal of Acupuncture Research
    • /
    • v.22 no.4
    • /
    • pp.47-54
    • /
    • 2005
  • Objectives: This study was performed to identify Sasang constitutional therapies for Parkinson's disease and to establish a basis for clinical application. Methods: We recruited thirty-five people as the disease group, tested them with the QSCCII, and assessed them with the UPDRS. Results: In the distribution of Sasang constitution among Parkinson's disease patients and controls, Taeumin accounted for a large share of the Parkinson's disease patients, but the difference was not statistically significant. The distribution of UPDRS Std. scores by Sasang constitution showed a statistically significant difference. Conclusion: We did not find a statistically significant difference in Sasang constitution between Parkinson's disease patients and controls. However, the distribution of UPDRS Std. scores by Sasang constitution differed significantly. Consequently, further study with a larger sample is necessary.
