• Title/Summary/Keyword: classification error

Validity of the diagnosis of diabetic microvascular complications in Korean national health insurance claim data

  • Kim, Hyung Jun;Park, Moo-Seok;Kim, Jee-Eun;Song, Tae-Jin
    • Annals of Clinical Neurophysiology / v.24 no.1 / pp.7-16 / 2022
  • Background: There is inadequate information on the validation of diabetic microvascular complications in the Korean National Health Insurance Service data set. We aimed to validate the diagnostic algorithms for diabetic nephropathy, neuropathy, and retinopathy. Methods: From various secondary and tertiary medical centers, we selected 6,493 patients aged ≥ 40 years who were diagnosed with diabetic microvascular complications more than once based on codes in the 10th version of the International Classification of Diseases (ICD-10). During 2019 and 2020, we randomly selected the diagnoses of 200 patients, 100 from each of two hospitals. The positive predictive value (PPV), negative predictive value, error rate, sensitivity, and specificity were determined for each diabetic microvascular complication according to the ICD-10 codes, laboratory findings, diagnostic studies, and treatment procedure codes. Results: Among the 200 patients who visited the hospital more than once and had diagnostic codes for diabetic microvascular complications, 142, 110, and 154 patients were confirmed against the gold standard to have diabetic nephropathy (PPV, 71.0%), diabetic neuropathy (PPV, 55.0%), and diabetic retinopathy (PPV, 77.0%), respectively. The PPV and specificity for diabetic nephropathy (PPV, 71.0-81.4%; specificity, 10.3-53.4%), diabetic neuropathy (PPV, 55.0-81.3%; specificity, 66.7-76.7%), and diabetic retinopathy (PPV, 77.0-96.6%; specificity, 2.2-89.1%) increased after the ICD-10 codes were combined with laboratory findings, diagnostic studies, and treatment procedure codes. Similar trends were observed at both hospitals. Conclusions: Defining diabetic microvascular complications using ICD-10 codes and their related examination codes may be a feasible method for studying diabetic complications.
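
The validation metrics named in this abstract all follow from a 2x2 table of code-based diagnosis against the gold standard. The following is a minimal Python sketch, not the authors' code: the function name `diagnostic_metrics` is hypothetical, and the only figures taken from the abstract are the 142 confirmed cases among 200 coded patients. The false-negative and true-negative cells are not reported in the abstract, so they are set to zero here purely as placeholders.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Return PPV, NPV, sensitivity, specificity, and error rate from a 2x2 table."""
    total = tp + fp + fn + tn

    def ratio(num, den):
        # Return NaN when a metric is undefined (empty denominator).
        return num / den if den else float("nan")

    return {
        "ppv": ratio(tp, tp + fp),          # confirmed among code-positive
        "npv": ratio(tn, tn + fn),          # truly negative among code-negative
        "sensitivity": ratio(tp, tp + fn),  # code-positive among true cases
        "specificity": ratio(tn, tn + fp),  # code-negative among non-cases
        "error_rate": ratio(fp + fn, total),
    }


# Nephropathy, ICD-10 code alone: 142 of 200 coded patients were confirmed.
# fn and tn are placeholders (assumption); the abstract does not give them.
metrics = diagnostic_metrics(tp=142, fp=58, fn=0, tn=0)
print(round(metrics["ppv"] * 100, 1))  # 71.0
```

With real counts for all four cells, the same function would return the remaining metrics; with the placeholder zeros, only the PPV is meaningful.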

A Novel Grasshopper Optimization-based Particle Swarm Algorithm for Effective Spectrum Sensing in Cognitive Radio Networks

  • Ashok, J;Sowmia, KR;Jayashree, K;Priya, Vijay
    • KSII Transactions on Internet and Information Systems (TIIS) / v.17 no.2 / pp.520-541 / 2023
  • In cognitive radio networks (CRNs), spectrum sensing (SS) is of utmost significance. During the training phase, every cognitive radio (CR) user generates a sensing report under various conditions and, depending on a collective decision, either communicates or remains silent. In the training stage, the fusion centre combines the local decisions made by CR users by majority vote and returns a final conclusion to every CR user. Sufficient data about the environment, including the activity of the primary user (PU) and every CR user's response to that activity, is acquired, and sensing classes are created during the training stage. During the classification stage, every CR user compares its most recent sensing report with the previous sensing classes, and distance vectors are generated. The posterior probability of every sensing class is derived from quantitative data, and the sensing report is then classified as signifying either the presence or the absence of the PU. The ISVM technique is used to compute the quantitative variables needed for the posterior probability. Here, the iterations of the SVM are tuned by the novel GO-PSA, which combines the grasshopper optimization algorithm (GOA) and particle swarm optimization (PSO). The novel GO-PSA was developed because it overcomes the problem of computational complexity, returns minimum error, and saves time compared with various state-of-the-art algorithms. The reliability of every CR user is taken into account when these local decisions are integrated at the fusion centre using an innovative decision combination technique. Depending on the collective decision, the CR users then communicate or remain silent.
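
The majority-vote fusion step described for the training stage can be illustrated with a short Python sketch. This is only an illustration under stated assumptions, not the paper's algorithm: the function name `majority_vote`, the 0/1 encoding of local decisions, and the tie-breaking rule are all hypothetical, and the reliability-weighted combination mentioned at the end of the abstract is not modelled.

```python
def majority_vote(local_decisions):
    """Fuse binary local decisions (1 = PU present, 0 = PU absent) by majority."""
    if not local_decisions:
        raise ValueError("need at least one local decision")
    votes_present = sum(local_decisions)
    # Assumption: a tie counts as "present" so the primary user is protected.
    return 1 if 2 * votes_present >= len(local_decisions) else 0


# Five CR users report their local decisions to the fusion centre.
reports = [1, 0, 1, 1, 0]
global_decision = majority_vote(reports)

# CR users may transmit only when the fused decision says the PU is absent.
may_transmit = global_decision == 0
```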

Development of Functional Scenarios for Automated Vehicle Assessment : Focused on Tollgate and Ramp Sections (자율주행차 평가용 상황 시나리오 개발 : 톨게이트, 램프 구간을 중심으로)

  • Jongmin Noh;Woori Ko;Joong Hyo Kim;Seok Jin Oh;Ilsoo Yun
    • The Journal of The Korea Institute of Intelligent Transport Systems / v.21 no.6 / pp.250-265 / 2022
  • The introduction of automated vehicles (AVs) is expected to bring positive effects such as a significant reduction in traffic accidents caused by human error. However, because new traffic safety issues are expected to arise from hardware or software errors and functional limitations of AVs, scenarios are needed to evaluate the driving safety of AVs. Therefore, in this study, functional scenarios were developed to evaluate the driving safety of AVs based on traffic accident data from the National Police Agency. Using the GIS program QGIS, traffic accidents that occurred in the tollgate and ramp sections of expressways were extracted, and the accident summary items were checked to classify the accident types. In addition, based on the results of the accident type classification, functional scenarios were developed that cover various dangerous situations in the tollgate and ramp sections.

Comparison of Feature Selection Methods Applied on Risk Prediction for Hypertension (고혈압 위험 예측에 적용된 특징 선택 방법의 비교)

  • Khongorzul, Dashdondov;Kim, Mi-Hye
    • KIPS Transactions on Software and Data Engineering / v.11 no.3 / pp.107-114 / 2022
  • In this paper, we enhance the risk prediction of hypertension using a feature selection method on the Korean National Health and Nutrition Examination Survey (KNHANES) database of the Korea Centers for Disease Control and Prevention. The study identified various risk factors correlated with chronic hypertension. The paper is divided into three parts. First, the data preprocessing step removes missing values and performs z-transformation. Next, the feature selection (FS) step applies a factor analysis (FA)-based feature selection method to the dataset, and feature importance (FI) and multicollinearity analysis (MC) are compared on the basis of FS. Finally, in the predictive analysis stage, the method is applied to detect and predict the risk of hypertension. We compare the accuracy, f-score, area under the ROC curve (AUC), and mean standard error (MSE) of each classification model. In the test, the proposed MC-FA-RF model achieved the highest accuracy of 80.12%, an MSE of 0.106, an f-score of 83.49%, and an AUC of 85.96%. These results demonstrate that the proposed MC-FA-RF method for hypertension risk prediction outperforms the other methods.
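
The preprocessing step mentioned in this abstract (removing missing values, then z-transformation) can be sketched in plain Python as follows. This is an illustrative sketch, not the authors' pipeline: the helper names, the use of `None` to mark missing values, the population standard deviation, and the toy two-column records are all assumptions.

```python
from statistics import mean, pstdev


def drop_missing(rows):
    """Keep only rows with no missing (None) values."""
    return [row for row in rows if all(value is not None for value in row)]


def z_transform(rows):
    """Standardize each column to zero mean and unit (population) variance."""
    columns = list(zip(*rows))
    means = [mean(col) for col in columns]
    stds = [pstdev(col) for col in columns]
    return [
        # A constant column has zero spread; map it to 0.0 instead of dividing by 0.
        [(v - m) / s if s else 0.0 for v, m, s in zip(row, means, stds)]
        for row in rows
    ]


# Hypothetical records with two numeric features per person.
raw = [[120.0, 25.0], [None, 30.0], [140.0, 27.0], [130.0, 23.0]]
clean = drop_missing(raw)    # the row containing None is removed
scaled = z_transform(clean)  # each column now has mean 0
```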

Development of Methodology for Measuring Water Level in Agricultural Water Reservoir through Deep Learning Analysis of CCTV Images (딥러닝 기법을 이용한 농업용저수지 CCTV 영상 기반의 수위계측 방법 개발)

  • Joo, Donghyuk;Lee, Sang-Hyun;Choi, Gyu-Hoon;Yoo, Seung-Hwan;Na, Ra;Kim, Hayoung;Oh, Chang-Jo;Yoon, Kwang-Sik
    • Journal of The Korean Society of Agricultural Engineers / v.65 no.1 / pp.15-26 / 2023
  • This study aimed to evaluate the performance of water level classification from CCTV images in agricultural facilities such as reservoirs. CCTV systems, widely used for facility monitoring and disaster detection, can now automatically detect and identify people and objects in images thanks to new technologies such as deep learning. Accordingly, we applied the ResNet-50 deep learning system, which is based on a convolutional neural network, and analyzed the water level of agricultural reservoirs from CCTV images obtained from TOMS (Total Operation Management System) of the Korea Rural Community Corporation. As a result, the accuracy of water level detection improved when night and rainfall CCTV images were excluded and further measures were applied. For example, the error rate decreased significantly from 24.39% to 1.43% in the Bakseok reservoir. We believe that the utilization of CCTVs should be further improved when calculating the amount of water supply and establishing a supply plan according to the integrated water management policy.

Joint Reasoning of Real-time Visual Risk Zone Identification and Numeric Checking for Construction Safety Management

  • Ali, Ahmed Khairadeen;Khan, Numan;Lee, Do Yeop;Park, Chansik
    • International conference on construction engineering and project management / 2020.12a / pp.313-322 / 2020
  • Recognizing hazards is a vital step in effectively preventing accidents on a construction site. Advances in computer vision systems and the availability of large visual databases of construction sites have made it possible to take quick action in the event of human error and disaster situations that may occur during management supervision. Therefore, it is necessary to analyze the risk factors that need to be managed at the construction site and to review appropriate and effective technical methods for each risk factor. This research focuses on analyzing Occupational Safety and Health Administration (OSHA) rules related to risk zone identification that can be adopted by image recognition technology and classifies their risk factors according to the effective technical method. To this end, this research developed a pattern-oriented classification of OSHA rules that can support large-scale safety hazard recognition. This research uses joint reasoning of risk zone identification and numeric input by utilizing a stereo camera integrated with an image detection algorithm such as YOLOv3 and the Pyramid Stereo Matching Network (PSMNet). The resulting system identifies risk zones and raises an alarm if a target object enters a zone. It also determines numerical information about a target, recognizing its length, spacing, and angle. Applying joint-logic image detection algorithms might improve the speed and accuracy of hazard detection, because merging more than one factor helps prevent accidents on the job site.

Voice Activity Detection Based on SVM Classifier Using Likelihood Ratio Feature Vector (우도비 특징 벡터를 이용한 SVM 기반의 음성 검출기)

  • Jo, Q-Haing;Kang, Sang-Ki;Chang, Joon-Hyuk
    • The Journal of the Acoustical Society of Korea / v.26 no.8 / pp.397-402 / 2007
  • In this paper, we apply a support vector machine (SVM) that incorporates an optimized nonlinear decision rule over different sets of feature vectors to improve the performance of statistical model-based voice activity detection (VAD). The conventional method performs VAD by setting up statistical models for the speech-absence and speech-presence hypotheses and comparing the geometric mean of the likelihood ratios (LRs) of the individual frequency bands extracted from the input signal with a given threshold. We propose a novel SVM-based VAD technique that treats the LRs computed in each frequency bin as the elements of a feature vector, in order to minimize the classification error probability, instead of the conventional decision rule based on the geometric mean. In experiments, the SVM-based VAD using the proposed feature performed better than previously reported VADs in various noise environments.
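
The conventional decision rule that this abstract contrasts with the SVM (comparing the geometric mean of the per-band likelihood ratios with a threshold) can be sketched as follows. This is a minimal illustration of that baseline only, not the proposed SVM-based detector: the function name, the example LR values, and the thresholds are hypothetical.

```python
import math


def geometric_mean_lr_decision(likelihood_ratios, threshold):
    """Return True (speech present) if the geometric mean of the LRs exceeds the threshold."""
    if not likelihood_ratios or any(lr <= 0 for lr in likelihood_ratios):
        raise ValueError("likelihood ratios must be a non-empty sequence of positive values")
    # The mean of the logs equals the log of the geometric mean; working in the
    # log domain avoids overflow when many bands are multiplied together.
    log_gm = sum(math.log(lr) for lr in likelihood_ratios) / len(likelihood_ratios)
    return log_gm > math.log(threshold)


# Hypothetical LRs for four frequency bands; their geometric mean is 8 ** 0.25 (about 1.68).
frame_lrs = [4.0, 1.0, 0.25, 8.0]
speech_at_low_threshold = geometric_mean_lr_decision(frame_lrs, threshold=1.5)
speech_at_high_threshold = geometric_mean_lr_decision(frame_lrs, threshold=2.0)
```

In the approach the abstract proposes, the same per-band LRs would instead be passed as a feature vector to a trained SVM, which is not shown here.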

A Study on the Visiting Areas Classification of Cargo Vehicles Using Dynamic Clustering Method (화물차량의 방문시설 공간설정 방법론 연구)

  • Bum Chul Cho;Eun A Cho
    • The Journal of The Korea Institute of Intelligent Transport Systems / v.22 no.6 / pp.141-156 / 2023
  • This study aims to improve understanding of freight movement, crucial for logistics facility investment and policy making. It addresses the limitations of traditional freight truck traffic data, aggregated only at city and county levels, by developing a new methodology. This method uses trip chain data for more detailed, facility-level analysis of freight truck movements. It employs DTG (Digital Tachograph) data to identify individual truck visit locations and creates H3 system-based polygons to represent these visits spatially. The study also involves an algorithm to dynamically determine the optimal spatial resolution of these polygons. Tested nationally, the approach resulted in polygons with 81.26% spatial fit and 14.8% error rate, offering insights into freight characteristics and enabling clustering based on traffic chain characteristics of freight trucks and visited facility types.

Corporate Bond Rating Using Various Multiclass Support Vector Machines (다양한 다분류 SVM을 적용한 기업채권평가)

  • Ahn, Hyun-Chul;Kim, Kyoung-Jae
    • Asia pacific journal of information systems / v.19 no.2 / pp.157-178 / 2009
  • Corporate credit rating is a very important factor in the market for corporate debt. Information concerning corporate operations is often disseminated to market participants through the changes in credit ratings that are published by professional rating agencies, such as Standard and Poor's (S&P) and Moody's Investor Service. Since these agencies generally require a large fee for the service, and the periodically provided ratings sometimes do not reflect the default risk of the company at the time, it may be advantageous for bond-market participants to be able to classify credit ratings before the agencies actually publish them. As a result, it is very important for companies (especially, financial companies) to develop a proper model of credit rating. From a technical perspective, the credit rating constitutes a typical, multiclass, classification problem because rating agencies generally have ten or more categories of ratings. For example, S&P's ratings range from AAA for the highest-quality bonds to D for the lowest-quality bonds. The professional rating agencies emphasize the importance of analysts' subjective judgments in the determination of credit ratings. However, in practice, a mathematical model that uses the financial variables of companies plays an important role in determining credit ratings, since it is convenient to apply and cost efficient. These financial variables include the ratios that represent a company's leverage status, liquidity status, and profitability status. Several statistical and artificial intelligence (AI) techniques have been applied as tools for predicting credit ratings. Among them, artificial neural networks are most prevalent in the area of finance because of their broad applicability to many business problems and their preeminent ability to adapt. 
However, artificial neural networks also have many defects, including the difficulty of determining the values of the control parameters and the number of processing elements in the layer, as well as the risk of over-fitting. Of late, because of their robustness and high accuracy, support vector machines (SVMs) have become popular as a solution for problems that require accurate prediction. An SVM's solution may be globally optimal because SVMs seek to minimize structural risk. On the other hand, artificial neural network models may tend to find locally optimal solutions because they seek to minimize empirical risk. In addition, no parameters need to be tuned in SVMs, barring the upper bound for non-separable cases in linear SVMs. However, since SVMs were originally devised for binary classification, they are not intrinsically geared for multiclass classification as in credit ratings. Thus, researchers have tried to extend the original SVM to multiclass classification. Hitherto, a variety of techniques to extend standard SVMs to multiclass SVMs (MSVMs) have been proposed in the literature. However, only a few types of MSVM have been tested in prior studies that apply MSVMs to credit ratings. In this study, we examined six different MSVM techniques: (1) One-Against-One, (2) One-Against-All, (3) DAGSVM, (4) ECOC, (5) Method of Weston and Watkins, and (6) Method of Crammer and Singer. In addition, we examined the prediction accuracy of some modified versions of conventional MSVM techniques. To find the most appropriate MSVM technique for corporate bond rating, we applied all the MSVM techniques to a real-world case of credit rating in Korea. The best application is in corporate bond rating, which is the most frequently studied area of credit rating for specific debt issues or other financial obligations. For our study, the research data were collected from National Information and Credit Evaluation, Inc., a major bond-rating company in Korea. 
The data set is comprised of the bond-ratings for the year 2002 and various financial variables for 1,295 companies from the manufacturing industry in Korea. We compared the results of these techniques with one another, and with those of traditional methods for credit ratings, such as multiple discriminant analysis (MDA), multinomial logistic regression (MLOGIT), and artificial neural networks (ANNs). As a result, we found that DAGSVM with an ordered list was the best approach for the prediction of bond rating. In addition, we found that the modified version of ECOC approach can yield higher prediction accuracy for the cases showing clear patterns.

A Study on Status of Landscape Architecture Industry with National Statistics (국가통계자료를 활용한 조경산업 현황 연구)

  • Choi, Ja-Ho;Yoon, Young-Kwan;Koo, Bon-Hak
    • Journal of the Korean Institute of Landscape Architecture / v.50 no.5 / pp.40-53 / 2022
  • This study was carried out to provide a methodology and baseline data, drawn from Korean national statistics, for determining the actual state of the landscape architecture industry. The landscape architecture industry was classified into the 'Design', 'Construction Management', 'Construction', 'Maintenance & Management', 'Materials', 'Research', 'Education', and 'Administration' areas. In each area, business types were systematized and associated in accordance with the Korean standard industrial classification and construction-related legislation. Among them, the business types directly defined in the construction-related legislation under the Ministry of Land, Infrastructure and Transport were the focus, and the establishment, association, integration, distribution, duplication, and omission of national statistics were analyzed. As a result, the business types for statistical analysis were selected. Semantic analysis was conducted to ensure commonality of statistical items and to minimize errors of interpretation. Finally, the number of registered business types, the number of workers, and sales were selected as items. Based on these, an analysis framework applicable to fundamental analysis and evaluation of the actual state of the industry was proposed. Actual national statistical data were applied for analysis and evaluation. In 2019, the number of registered business types related to the landscape architecture industry was 12,160, the number of workers by business type was 106,296, and the sales by business type were 8,308.5 billion KRW. The number of registered business types and the number of workers had been rising since 2017, whereas sales had been decreasing. A plan for industrial development is required. This study relied on national statistics established by multiple public institutions, so there are limitations in securing consistency and reliability. 
Therefore, it is necessary to establish systematic and consistent national statistics in accordance with the 「Landscaping Promotion Act」. In the future, we plan to research application and development plans for national statistics by subject, including parks and green spaces.