• Title/Summary/Keyword: binomial classification

Selecting the optimal threshold based on impurity index in imbalanced classification (불균형 자료에서 불순도 지수를 활용한 분류 임계값 선택)

  • Jang, Shuin;Yeo, In-Kwon
    • The Korean Journal of Applied Statistics / v.34 no.5 / pp.711-721 / 2021
  • In this paper, we propose a method of adjusting thresholds using impurity indices in classification analysis of imbalanced data. Suppose that, for imbalanced binomial data, the minority category is Positive and the majority category is Negative. When categories are assigned using the commonly used threshold of 0.5, specificity tends to be high in imbalanced data while sensitivity is relatively low. Increasing sensitivity matters when correctly classifying objects in the minority category is relatively important. We explore how to increase sensitivity by adjusting thresholds. Existing studies have adjusted thresholds based on measures such as G-Mean and F1-score, but in this paper, we propose a method to select optimal thresholds using the chi-square statistic of CHAID, the Gini index of CART, and the entropy of C4.5. We also describe how to obtain a unique value when multiple optimal thresholds are obtained. Empirical analysis uses classification performance metrics to show the improvements over the results based on 0.5.
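The threshold search described in this abstract can be illustrated with a minimal sketch that uses only the Gini index. The abstract does not give the authors' exact procedure, so the candidate grid (the observed scores), the `>=` rule, the tie-break (smallest tied threshold), and all function names below are assumptions made for illustration.

```python
def gini(pos, neg):
    """Gini impurity of a node holding `pos` positives and `neg` negatives."""
    n = pos + neg
    if n == 0:
        return 0.0
    p = pos / n
    return 2.0 * p * (1.0 - p)


def weighted_gini(scores, labels, threshold):
    """Size-weighted Gini impurity of the split score >= threshold vs. score < threshold."""
    hi = [y for s, y in zip(scores, labels) if s >= threshold]
    lo = [y for s, y in zip(scores, labels) if s < threshold]
    hi_imp = gini(sum(hi), len(hi) - sum(hi))
    lo_imp = gini(sum(lo), len(lo) - sum(lo))
    return (len(hi) * hi_imp + len(lo) * lo_imp) / len(scores)


def best_threshold(scores, labels):
    """Scan the observed scores and return the threshold with the lowest impurity."""
    scored = [(weighted_gini(scores, labels, t), t) for t in sorted(set(scores))]
    best = min(imp for imp, _ in scored)
    ties = [t for imp, t in scored if abs(imp - best) < 1e-12]
    # Assumed tie-break: the smallest tied threshold, which favours sensitivity.
    return ties[0]


def sensitivity(scores, labels, threshold):
    """Share of true positives (label 1) predicted positive at this threshold."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    return sum(1 for s in positives if s >= threshold) / len(positives)


# Hypothetical imbalanced data: 3 positives among 10 observations.
scores = [0.05, 0.10, 0.15, 0.20, 0.25, 0.30, 0.35, 0.40, 0.45, 0.60]
labels = [0, 0, 0, 0, 0, 0, 1, 0, 1, 1]
t_star = best_threshold(scores, labels)
print(t_star, sensitivity(scores, labels, 0.5), sensitivity(scores, labels, t_star))
```

On this toy data the scan selects 0.35, and sensitivity rises from one third at the 0.5 threshold to 1.0. Replacing `gini` with an entropy or chi-square criterion would follow the same scan.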

Texture Descriptor for Texture-Based Image Retrieval and Its Application in Computer-Aided Diagnosis System (질감 기반 이미지 검색을 위한 질감 서술자 및 컴퓨터 조력 진단 시스템의 적용)

  • Saipullah, Khairul Muzzammil;Peng, Shao-Hu;Kim, Deok-Hwan
    • Journal of the Institute of Electronics Engineers of Korea CI / v.47 no.4 / pp.34-43 / 2010
  • Texture information plays an important role in object recognition and classification. To perform an accurate classification, the texture feature used in the classification must be highly discriminative. This paper presents a novel texture descriptor for texture-based image retrieval and its application in a Computer-Aided Diagnosis (CAD) system for emphysema classification. The texture descriptor is based on the combination of local surrounding neighborhood difference and centralized neighborhood difference and is named Combined Neighborhood Difference (CND). The local differences between pixels, from both the surrounding neighborhood difference and the centralized neighborhood difference, are compared and converted into binary codewords. A binomial factor is then assigned to the codewords to convert them into highly discriminative unique values. The distribution of these unique values is computed and used as the texture feature vector. Texture classification experiments on the Outex and Brodatz datasets show that CND achieves an average accuracy of 92.5%, whereas LBP, LND and Gabor filter achieve 89.3%, 90.7% and 83.6%, respectively. The implementation of CND in the computer-aided diagnosis of emphysema is also presented in this paper.
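The codeword step in this abstract (thresholded differences, a binomial factor, then a histogram of the resulting values) can be sketched with a plain LBP-style code. This is a simplified stand-in, not the CND descriptor itself: the abstract does not define CND's difference terms, so the 3x3 neighbourhood, the neighbour ordering, and the function names are assumptions.

```python
# Clockwise 3x3 neighbour offsets, starting at the top-left (assumed ordering).
OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]


def neighbourhood_code(image, r, c):
    """8-bit code for pixel (r, c): bit p is 1 when neighbour p >= centre."""
    center = image[r][c]
    code = 0
    for p, (dr, dc) in enumerate(OFFSETS):
        bit = 1 if image[r + dr][c + dc] - center >= 0 else 0
        code += bit * (2 ** p)  # binomial factor 2^p makes each bit pattern unique
    return code


def code_histogram(image):
    """Normalised 256-bin histogram of codes over all interior pixels."""
    hist = [0] * 256
    for r in range(1, len(image) - 1):
        for c in range(1, len(image[0]) - 1):
            hist[neighbourhood_code(image, r, c)] += 1
    total = sum(hist)
    return [h / total for h in hist] if total else hist


# Hypothetical 3x3 grayscale patch with a single interior pixel.
patch = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
print(neighbourhood_code(patch, 1, 1))
```

The histogram returned by `code_histogram` plays the role of the texture feature vector; two images can then be compared by any histogram distance.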

Unsupervised Multispectral Image Segmentation Based on 1D Combined Neighborhood Differences (1D 통합된 근접차이에 기반한 자율적인 다중분광 영상 분할)

  • Saipullah, Khairul Muzzammil;Yun, Byung-Choon;Kim, Deok-Hwan
    • Proceedings of the Korea Information Processing Society Conference / 2010.11a / pp.625-628 / 2010
  • This paper proposes a novel feature extraction method for unsupervised multispectral image segmentation based on one-dimensional combined neighborhood differences (1D CND). In contrast with the original CND, which is applied to traditional images, 1D CND is computed on a single pixel with multiple bands. The proposed algorithm utilizes the sign of the differences between the bands of the pixel. The difference values are thresholded to form a binary codeword. A binomial factor is assigned to this codeword to form a unique value. These values are then grouped to construct the 1D CND feature image, which is used in the unsupervised image segmentation. Various experiments using two LANDSAT multispectral images have been performed to evaluate the segmentation and classification accuracy of the proposed method. The results show that the 1D CND feature outperforms the spectral feature, with an average classification accuracy of 87.55%, whereas that of the spectral feature is 55.81%.
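A per-pixel version of the coding step can be sketched as follows. The abstract does not say which bands are differenced, so pairing each band with the next one, the zero threshold, and the function names are assumptions, and this is not presented as the authors' 1D CND definition.

```python
def band_code(pixel_bands, threshold=0):
    """Code for one pixel: bit i is 1 when band i+1 minus band i >= threshold."""
    code = 0
    for i in range(len(pixel_bands) - 1):
        diff = pixel_bands[i + 1] - pixel_bands[i]
        bit = 1 if diff >= threshold else 0
        code += bit * (2 ** i)  # binomial factor 2^i
    return code


def feature_image(cube):
    """Map a rows x cols x bands cube (nested lists) to a rows x cols code image."""
    return [[band_code(pixel) for pixel in row] for row in cube]


# Hypothetical 1x2 image with four bands per pixel.
cube = [[[10, 20, 15, 30], [30, 20, 10, 5]]]
print(feature_image(cube))
```

Pixels whose spectra rise and fall in the same pattern receive the same code, which is what lets the code image be grouped into segments.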

Smoking Rate of Workers according to Employment Status and Industry: 1992-2006 (산업군별 고용형태에 따른 근로자 흡연율 변화 추이: 1992-2006)

  • Kim, Il-Ho;Park, Ki-Soo;Chun, Hee-Ran;Noh, Samuel
    • Korean Journal of Health Education and Promotion / v.28 no.4 / pp.15-25 / 2011
  • Objectives: The present study examined whether the smoking rate declined over 1992-2006 and which groups, by industry classification and employment type, were at high risk. Methods: Data from 91,263 persons aged 25-64 years were analyzed from three rounds of the Social Statistical Surveys of Korea between 1992 and 2006. Industry indicators were divided by the 9th Korean Standard Industrial Classification. Age-adjusted prevalence of smoking was calculated. Prevalence ratios (PR) and prevalence differences (PD) were estimated using log-binomial regression analysis. Results: Age-adjusted prevalence of smoking decreased between 1992 and 2006, and the smoking prevalence of regular employees decreased the most. The PD in age-adjusted prevalence of smoking was largest between regular and daily employees. The PRs of temporary employees, daily employees, and self-employed persons, in that order, were larger than that of regular employees. PR increased significantly between 1999 and 2006 for those in the manufacturing, construction, wholesale & retail trade, and service industries. Increases in PR (regular/irregular) for women in the service industry were statistically significant. Conclusions: Despite the overall reduction in cigarette smoking rates among males, the smoking rate was not reduced equally across industry classifications and employment types in both genders. More adaptable antismoking policies and consideration of employment type are needed to reduce inequalities in smoking.
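The two effect measures named in the Methods can be illustrated with a small sketch. In a log-binomial model log(p) = b0 + b1*x with a single binary covariate x, exp(b1) equals the crude prevalence ratio, so the unadjusted quantities can be computed directly from counts. The counts and function name below are hypothetical and are not taken from the study, which additionally adjusts for age.

```python
import math


def prevalence_ratio_and_difference(group_cases, group_total, ref_cases, ref_total):
    """Crude prevalence ratio (PR) and prevalence difference (PD) versus a reference group."""
    p_group = group_cases / group_total
    p_ref = ref_cases / ref_total
    return p_group / p_ref, p_group - p_ref


# Hypothetical counts: 300 smokers among 500 daily employees,
# 400 smokers among 1000 regular employees (reference group).
pr, pdiff = prevalence_ratio_and_difference(300, 500, 400, 1000)
b1 = math.log(pr)  # slope a single-covariate log-binomial fit would return
print(round(pr, 3), round(pdiff, 3), round(b1, 3))
```

A PR above 1 and a PD above 0 both indicate higher smoking prevalence than in the reference group; the PR is a relative measure and the PD an absolute one.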

Freeway Crash Frequency Model Development Based on the Classification of Geometric Alignment Type (선형유형 구분을 통한 고속도로 사고빈도모형 개발 연구)

  • Kim, Sang-Youp;Choi, Jai-Sung;Lee, Soo-Beom;Kim, Seong-Min;Cho, Won-Bum;Kim, Yong-Seok
    • International Journal of Highway Engineering / v.13 no.1 / pp.97-105 / 2011
  • This paper presents how one can investigate the effects on crash occurrence of freeway geometric design elements, including the horizontal and vertical alignment and the road environment. At present, most available research results involve analysis of geometric data obtained along relatively long sections of freeway, and, because of the diverse geometric conditions within a long section, the results tend to miss the specific local geometric impacts on vehicle crashes. In this regard, this research attempts to establish vehicle crash models based on a set of freeway geometric patterns whose crash-generating characteristics are identical because they are homogeneous in terms of producing the same vehicle operating speeds, and their actual relationships are then described through the statistical analysis made in this research. Each classification type comprises straight, curve, and continuous-curve parts. This research has revealed that each type of model has a different relationship between accidents and geometric structure. These research results should be useful for more reasonable highway design and safety audit analysis.

Analysis of Influencing Factors on the Outpatient Prescription of Antipsychotic Drugs in the Elderly Patients (노인환자의 항정신병 약물 원외처방 내역에 미친 영향 요인 분석)

  • Dong, Jae Yong;Lee, Hyun Ji;Lee, Tae Hoon;Kim, Yujeong
    • Korean Journal of Clinical Pharmacy / v.31 no.4 / pp.268-277 / 2021
  • Background: Most studies of antipsychotic drugs have focused on side effects, randomized clinical trials, utilization rates, and trends, and there have been few studies on the influencing factors in elderly patients. The purpose of this study was to analyze the factors influencing the outpatient prescription of antipsychotic drugs in elderly patients. Methods: Active ingredients of antipsychotic drugs in Korea were selected according to the Korean Pharmaceutical Information Center (KPIC)'s classification. The data source was Korean Health Insurance Review and Assessment Service (HIRA) claims data in 2020, and the target patient group was elderly patients. We extracted patients who had been prescribed one or more antipsychotic drugs and had visited only one medical institution. Data were analyzed using descriptive statistics, the chi-square test, the t-test, and negative binomial regression. Results: The number of outpatients was 245,197 and the number of prescriptions was 1,379,092. The most common patient characteristics were age 75-85 years, female, health insurance type, no disease (dementia, schizophrenia), atypical drugs, and CCI score (>2), and the most common medical institution characteristics were neurology specialty, rural region, and general hospital. The regression results showed that patient characteristics and medical institution characteristics had a significant effect on the outpatient prescription of antipsychotic drugs in elderly patients. Conclusion: This study suggests that a national policy on antipsychotic drugs in elderly patients, taking into account the characteristics of patients and medical institutions, is needed.
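For reference, a negative binomial regression for a count outcome can be written as below. The abstract does not state the parameterization or the exact outcome variable, so the common NB2 form, the reading of $Y_i$ as a per-patient prescription count, and the symbols $\mu_i$, $\beta_j$, and $\alpha$ are assumptions.

```latex
\log \mathbb{E}[Y_i \mid x_i] = \log \mu_i = \beta_0 + \sum_{j} \beta_j x_{ij},
\qquad
\operatorname{Var}(Y_i \mid x_i) = \mu_i + \alpha \mu_i^{2}
```

The dispersion parameter $\alpha > 0$ lets the variance exceed the mean, and as $\alpha \to 0$ the model reduces to Poisson regression. Each $\exp(\beta_j)$ is read as a rate ratio for a one-unit change in $x_{ij}$.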

Empirical Examination of Determinants Affecting Safety Incidents in Building Construction (건축공사 안전사고에 대한 현장 요인별 영향력 분석)

  • Hur, Youn-Kyoung;Lee, Seung-Woo;Yoo, Wi-Sung;Song, Tae-Geun
    • Journal of the Korea Institute of Building Construction / v.23 no.5 / pp.583-593 / 2023
  • For a holistic and precise assessment of safety benchmarks within a construction venture, it's paramount to delineate between the intrinsic features of the construction and its real-time, on-site performance metrics. In this study, we delved into genuine accident instances to discern the interplay between these construction attributes and on-ground performance determinants in relation to safety mishaps, employing the binomial logit analytical framework. Our scrutiny underscored that construction expenditure profoundly modulates the likelihood of fatal occurrences. Notably, variables pertinent to on-site safety protocols wielded considerable influence over both fatal mishaps and accidents implicating multiple personnel. These revelations intimate that while ascertaining the safety quotient of a construction initiative, a mere classification and recalibration based on fiscal dimensions can elucidate much. Yet, a comprehensive safety appraisal necessitates transcending quantitative indices, such as frequency of mishaps or casualty rates, to encapsulate the multifaceted interventions and strategies adopted at the construction locale.
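The binomial logit framework mentioned in this abstract has the standard form below. The symbols are generic: $p_i$ stands for the probability of the binary outcome (for example, a fatal accident at site $i$) and $x_{ij}$ for the site-level covariates. The abstract does not list the covariates, so these readings are assumptions.

```latex
\log \frac{p_i}{1 - p_i} = \beta_0 + \sum_{j} \beta_j x_{ij},
\qquad
p_i = \frac{1}{1 + \exp\!\left(-\left(\beta_0 + \sum_{j} \beta_j x_{ij}\right)\right)}
```

Under this model, $\exp(\beta_j)$ is the odds ratio associated with a one-unit increase in $x_{ij}$, holding the other covariates fixed.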

Development of a Failure Probability Model based on Operation Data of Thermal Piping Network in District Heating System (지역난방 열배관망 운영데이터 기반의 파손확률 모델 개발)

  • Kim, Hyoung Seok;Kim, Gye Beom;Kim, Lae Hyun
    • Korean Chemical Engineering Research / v.55 no.3 / pp.322-331 / 2017
  • District heating was first introduced in Korea in 1985. As the service life of the underground thermal piping network has exceeded 30 years, the maintenance of underground thermal pipes has become an important issue. A variety of complex technologies are required for periodic inspection and operation management in the maintenance of the aged thermal piping network. In particular, a model is needed that can support decision making in the field to derive the optimal maintenance and replacement point from an economic viewpoint. In this study, the analysis was based on the repair history and accident data from the operation of the thermal pipe network in five districts of the Korea District Heating Corporation. A failure probability model was developed by introducing statistical techniques of qualitative analysis and binomial logistic regression analysis. The qualitative analysis of maintenance history and accident data showed that the most important causes of pipeline damage were construction erosion, pipe corrosion, and bad material, which together accounted for about 82%. In the statistical model analysis, setting the classification cutoff (separation point) to 0.25 improved the accuracy of classifying thermal pipe breakage and non-breakage to 73.5%. To establish the failure probability model, the fit of the model was verified through the Hosmer and Lemeshow test, the independence test of the independent variables, and the chi-square test of the model. According to the analysis of the risk of thermal pipe network damage, the highest failure probability was found for reducer pipes of less than 250 mm, constructed by construction company F, more than 10 years old, located along motorways in the Seoul area, in winter. The results of this study can be used to prioritize maintenance, preventive inspection, and replacement of thermal piping systems. In addition, establishing and acting on accident prevention plans in advance, such as inspection and maintenance, can reduce the frequency of thermal pipeline damage and allow the results to be used more actively in managing the thermal piping network.
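The effect of moving the classification cutoff of a fitted logistic model can be sketched as follows. The predicted probabilities and outcomes are hypothetical and do not come from the study, and the `>=` rule and function names are assumptions.

```python
def classify(probabilities, cutoff):
    """Predict breakage (1) when the fitted probability is at least `cutoff`."""
    return [1 if p >= cutoff else 0 for p in probabilities]


def confusion(actual, predicted):
    """Return (tp, fp, tn, fn) for binary labels."""
    tp = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 1)
    fp = sum(1 for a, p in zip(actual, predicted) if a == 0 and p == 1)
    tn = sum(1 for a, p in zip(actual, predicted) if a == 0 and p == 0)
    fn = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 0)
    return tp, fp, tn, fn


def accuracy(actual, predicted):
    """Fraction of observations classified correctly."""
    tp, fp, tn, fn = confusion(actual, predicted)
    return (tp + tn) / (tp + fp + tn + fn)


# Hypothetical fitted probabilities and observed breakage (1) / non-breakage (0).
probs = [0.05, 0.10, 0.12, 0.20, 0.30, 0.35, 0.40, 0.60]
actual = [0, 0, 0, 0, 1, 0, 1, 1]
for cutoff in (0.5, 0.25):
    predicted = classify(probs, cutoff)
    print(cutoff, confusion(actual, predicted), accuracy(actual, predicted))
```

When breakage is the rarer outcome, fitted probabilities tend to sit below 0.5, so a lower cutoff can recover missed breakages at the cost of some false alarms. On this toy data accuracy moves from 0.75 to 0.875.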