• Title/Summary/Keyword: Multiple missing values


Predicting Personal Credit Rating with Incomplete Data Sets Using Frequency Matrix technique (Frequency Matrix 기법을 이용한 결측치 자료로부터의 개인신용예측)

  • Bae, Jae-Kwon;Kim, Jin-Hwa;Hwang, Kook-Jae
    • Journal of Information Technology Applications and Management / v.13 no.4 / pp.273-290 / 2006
  • This study proposes a frequency matrix technique for predicting personal credit ratings more efficiently from incomplete data sets. It first tests multiple discriminant analysis (MDA) and logistic regression on incomplete data, with missing values filled in by mean imputation and by regression imputation. An artificial neural network and the frequency matrix technique are then evaluated on the same prediction task. The test data comprise personal credit information for 8,234 customers of Bank A, collected in 2004. The experimental results show that the frequency matrix technique outperforms all the other models: MDA-mean, Logit-mean, MDA-regression, Logit-regression, and the artificial neural network.
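The two baseline imputation methods named in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the single-predictor linear fit, and the toy income/score data are all assumptions made for illustration.

```python
from statistics import mean


def mean_impute(values):
    """Replace None entries with the mean of the observed entries."""
    observed = [v for v in values if v is not None]
    fill = mean(observed)
    return [fill if v is None else v for v in values]


def regression_impute(x, y):
    """Replace None entries in y with predictions from a linear fit of y on x.

    The line is fitted by ordinary least squares on the complete (x, y) pairs.
    """
    pairs = [(xi, yi) for xi, yi in zip(x, y) if yi is not None]
    x_bar = mean(xi for xi, _ in pairs)
    y_bar = mean(yi for _, yi in pairs)
    sxx = sum((xi - x_bar) ** 2 for xi, _ in pairs)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in pairs)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return [intercept + slope * xi if yi is None else yi for xi, yi in zip(x, y)]


# Hypothetical toy data: a fully observed predictor and a target with two gaps.
income = [20.0, 30.0, 40.0, 50.0, 60.0]
score = [2.0, None, 4.0, None, 6.0]

mean_filled = mean_impute(score)               # gaps take the overall mean
reg_filled = regression_impute(income, score)  # gaps follow the fitted line
```

Mean imputation gives every gap the same value, whereas regression imputation lets each filled value depend on the other variables observed for that record.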


Anomaly detection in particulate matter sensor using hypothesis pruning generative adversarial network

  • Park, YeongHyeon;Park, Won Seok;Kim, Yeong Beom
    • ETRI Journal / v.43 no.3 / pp.511-523 / 2021
  • The World Health Organization provides guidelines for managing particulate matter (PM) levels because higher PM levels threaten human health. Managing the PM level first requires a procedure for measuring it. We use a PM sensor based on the laser-based light scattering (LLS) method because it is more cost-effective than sensors based on a beta attenuation monitor or a tapered element oscillating microbalance. However, an LLS-based sensor is more likely to malfunction than these higher-cost sensors. In this paper, we treat all malfunctions, including the collection of abnormal values and missing data, as anomalies, and we aim to detect them to support the maintenance of PM measuring sensors. To this end, we propose a novel architecture that we call the hypothesis pruning generative adversarial network (HP-GAN). In comparative experiments, it achieves AUROC and AUPRC values of 0.948 and 0.967, respectively, for anomaly detection in LLS-based PM measuring sensors. We conclude that our HP-GAN is a cutting-edge model for anomaly detection.

A Multiple Imputation for Reducing Outlier Effect (이상점 영향력 축소를 통한 무응답 대체법)

  • Kim, Man-Gyeom;Shin, Key-Il
    • The Korean Journal of Applied Statistics / v.27 no.7 / pp.1229-1241 / 2014
  • Most sample surveys contain outliers and missing values from non-response at the same time. In that case, the outliers can degrade the imputation so that it fails to meet the required precision. To overcome this, outliers should be treated before imputation. To reduce the effect of outliers, this paper studies outlier imputation methods and outlier weight adjustment methods. Outliers are detected with the method suggested by She and Owen (2011). A small simulation study is conducted, and the Monthly Labor Statistic and Briquette Consumption Survey data are used for the real-data analysis.

The Determinants of Customer Loyalty: The Case Study of Saigon Co.op Supermarkets in Vietnam

  • NGUYEN, Cuong Quoc;PHAM, Ngan
    • Journal of Distribution Science / v.19 no.5 / pp.61-68 / 2021
  • Purpose: Retailing is one of the fastest-growing sectors in Vietnam, and foreign investors see the Vietnamese retail market as very promising. However, competition is intense, and retailers are keen to gain new customers. Hence, Vietnamese retailers have paid more attention to customer loyalty. Saigon Co.op is one of the largest retailers in Vietnam, but its customers now have more retailers to choose from. As a result, Saigon Co.op has realized the significance of customer loyalty. This study aims to determine the factors that affect customer loyalty to Saigon Co.op supermarkets in Vietnam. Research design, data and methodology: This study applied multiple regression analysis to 250 samples collected from Saigon Co.op customers. The questionnaire was provided to respondents via Google Form, and the link was posted on the Co.op mart fan page on Facebook. Two hundred eighty-seven samples were collected, but 37 were removed due to missing values. Exploratory Factor Analysis (EFA) and regression analysis were performed in SPSS version 20. Results: The findings identify four determinants of customer loyalty to Saigon Co.op: Product Quality, Brand Image, Price Strategy and Service Quality. Conclusions: Managerial recommendations are provided for supermarkets to improve customer loyalty in Vietnam and other emerging markets. Limitations and suggestions for further research are also discussed.

Prediction of genomic breeding values of carcass traits using whole genome SNP data in Hanwoo (Korean cattle) (한우에 있어서 유전체 육종가 추정)

  • Lee, Seung Hwan;Kim, Heong Cheul;Lim, Dajeong;Dang, Chang Gwan;Cho, Yong Min;Kim, Si Dong;Lee, Hak Kyo;Lee, Jun Heon;Yang, Boh Suk;Oh, Sung Jong;Hong, Seong Koo;Chang, Won Kyung
    • Korean Journal of Agricultural Science / v.39 no.3 / pp.357-364 / 2012
  • Genomic breeding value (GEBV) has recently become available in the beef cattle industry. Genomic selection methods are exceptionally valuable for selecting traits, such as marbling, that are difficult to measure until later in life. One method for utilizing information from sparse marker panels is the Bayesian model selection method with RJMCMC. When using SNP information only, the accuracy of prediction is 0.47 to 0.73 for a multiple SNP model with RJMCMC and 0.11 to 0.41 for a least squares method; when a polygenic effect is included, the accuracy increases to 0.56 to 0.90 for the multiple SNP model and 0.21 to 0.63 for the least squares method. In the multiple SNP model with the RJMCMC model selection method, the accuracy ($r^2$) of GEBV for marbling predicted from SNP effects only was 0.47, while the $r^2$ of GEBV predicted from SNP plus polygenic effects was 0.56. The accuracies of GEBV predicted using only SNP information were 0.62, 0.68 and 0.73 for CWT, EMA and BF, respectively. When polygenic effects were included, the accuracies of GEBV increased to 0.89, 0.90 and 0.89 for CWT, EMA and BF, respectively. Our data demonstrate that SNP information alone misses some of the genetic variation that contributes to phenotypes for carcass traits, and that polygenic effects compensate for the genetic variation that whole genome SNP data do not explain. Overall, the multiple SNP model with the RJMCMC model selection method provides a better prediction of GEBV than does the least squares method (single marker regression).

A Study on Hotel Chef Subtropical Vegetable Purchase Intention and Word of Mouth (호텔 조리사들의 아열대 채소 구매의도 및 구전에 관한 연구)

  • Kim, Hayun
    • Culinary science and hospitality research / v.21 no.3 / pp.181-197 / 2015
  • This study examined the influence of the perceived value, perceived quality, and reasonable price of subtropical vegetables on trust, purchase intention and word of mouth among hotel chefs. A survey was carried out targeting hotel chefs in Korea who had experience with subtropical vegetables. A total of 380 questionnaires were distributed to selected chefs over 20 days from October 1st to October 20th, 2014, of which 353 valid questionnaires were used after excluding responses with missing values or heavily skewed response tendencies. Frequency analysis, factor analysis, correlation analysis, and multiple regression analysis were conducted using the SPSS 18.0 package. The results are as follows. First, perceived value, perceived quality and reasonable price had a positive influence on trust. Second, trust had a positive effect on purchase intention and word of mouth. Third, purchase intention positively influenced word of mouth.

Effect of Local Child Care Centers' Social Workers Perceptions Professionalism on Organizational Commitment (지역아동센터 사회복지사의 전문성 인식이 조직헌신에 미치는 영향)

  • Im, Dong-Ho;Kim, Dae-Seok
    • The Journal of the Korea Contents Association / v.14 no.11 / pp.196-204 / 2014
  • The objective of this study was to empirically analyze the effect that perceptions of professionalism among social workers at local child care centers have on organizational commitment. For this study, social workers at local child care centers in Jeollanam-do were surveyed. Of the collected questionnaires, 286 were used for analysis after excluding those with missing values. The results showed that the professionalism score was above the medium level, and the service conviction score was the highest. A positive (+) correlation was observed between perceptions of professionalism and organizational commitment. In particular, the correlation was highest between occupational consciousness of mission and organizational commitment. Meanwhile, the results of multiple regression analysis suggested that organizational commitment was affected by occupational consciousness of mission and utilization of professional organization, both sub-variables of perceptions of professionalism. Moreover, occupational consciousness of mission had the greatest influence on organizational commitment. Based on these results, this study presents directions for improving perceptions of professionalism and organizational commitment among social workers at local child care centers, along with challenges for subsequent studies.

The Influence of Customer Satisfaction on Customer Loyalty and the Moderating Effect of Gender (항공서비스에 대한 고객만족이 거래지속의도에 미치는 영향에 있어서 성별의 역할)

  • Kim, Moon-Seop
    • Journal of Distribution Science / v.14 no.10 / pp.73-79 / 2016
  • Purpose - Customer satisfaction has been considered important as a way to retain current customers, especially in industries where competition between companies is fierce. Major Korean airlines have confronted fierce competition with the growth of low cost carriers (LCCs). In order to survive, these airlines need to retain their customers. This research aims to investigate the relationship between customer satisfaction and the customer's intention to remain loyal. Moreover, this study examines how the influence of customer satisfaction on customer loyalty is moderated by gender. Research design, data, and methodology - A regression model is developed in which customer satisfaction, gender, and the interaction of satisfaction and gender are predictors and the customer's intention to remain loyal is the dependent variable. To analyze this research model, data were collected from 402 university students taking a marketing class at universities in Seoul, Chung-Cheong province, and Kangwon province. After eliminating data from students who had never flown and data with missing values, a final sample of 201 was analyzed. The hypotheses were tested using SPSS 21.0. Internal reliability was supported by the results of Cronbach's α. Multiple regression was performed. Results - Empirical results showed that customer satisfaction with the airline's service had a positive influence on the customer's intention to remain loyal to the airline. Moreover, this influence was moderated by gender. More specifically, a male customer's intention to remain loyal was determined more strongly by his satisfaction with the airline's service than a female customer's. Conclusions - This research contributes to the aviation service marketing literature by showing how customer satisfaction influences the customer's intention to remain loyal and how this influence is moderated by gender. These results have managerial implications for major Korean airlines in terms of customer satisfaction and gender as ways to enhance customer retention.

Data Processing of AutoML-based Classification Models for Improving Performance in Unbalanced Classes (불균형 클래스에서 AutoML 기반 분류 모델의 성능 향상을 위한 데이터 처리)

  • Lee, Dong-Joon;Kang, Ji-Soo;Chung, Kyungyong
    • Journal of Convergence for Information Technology / v.11 no.6 / pp.49-54 / 2021
  • With the recent development of smart healthcare technology, interest in everyday diseases is increasing. However, healthcare data are imbalanced between positive and negative cases. The imbalance arises because data are hard to collect: people who do not have a given disease greatly outnumber patients who do. Data imbalance needs to be corrected because it affects model performance during training for disease prediction and analysis. Therefore, in this paper, we replace missing values through multiple imputation in models that detect whether a subject has a disease, and we resolve data imbalance through over-sampling. Using AutoML on the preprocessed data, we generate several models and select the top 3 to build ensemble models.
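The over-sampling step mentioned in this abstract can be sketched as follows. The abstract does not say which over-sampling method was used, so plain random over-sampling (duplicating minority-class rows) is an assumption here, as are the function name and the toy data.

```python
import random
from collections import Counter


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen rows of smaller classes until every class
    has as many rows as the largest class."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for cls, n in counts.items():
        # Original rows of this class are the pool we draw duplicates from.
        pool = [r for r, lab in zip(rows, labels) if lab == cls]
        for _ in range(target - n):
            out_rows.append(rng.choice(pool))
            out_labels.append(cls)
    return out_rows, out_labels


# Hypothetical toy data: four negative cases (0) and one positive case (1).
rows = [[0.1], [0.2], [0.3], [0.4], [0.9]]
labels = [0, 0, 0, 0, 1]

bal_rows, bal_labels = random_oversample(rows, labels)
balanced_counts = Counter(bal_labels)
```

In practice, over-sampling is usually applied only to the training split, so that duplicated rows do not leak into the evaluation data.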

Numerical Model for Cerebrovascular Hemodynamics with Indocyanine Green Fluorescence Videoangiography

  • Hwayeong Cheon;Young-Je Son;Sung Bae Park;Pyoung-Seop Shim;Joo-Hiuk Son;Hee-Jin Yang
    • Journal of Korean Neurosurgical Society / v.66 no.4 / pp.382-392 / 2023
  • Objective: The use of indocyanine green videoangiography (ICG-VA) to assess blood flow in the brain during cerebrovascular surgery has been increasing. Clinical studies on ICG-VA have predominantly focused on qualitative analysis. However, quantitative analysis based on numerical modelling of time profiles enables a more accurate evaluation of blood flow kinetics. In this study, we established a multiple exponential modified Gaussian (multi-EMG) model for quantitative ICG-VA to accurately understand the status of cerebral hemodynamics. Methods: We obtained clinical data on cerebral blood flow acquired by quantitative ICG-VA during cerebrovascular surgery. Various asymmetric peak functions were compared, using a nonlinear regression algorithm, to find the functional form that best matches the clinical data. To verify the result of the nonlinear regression, the model function was applied to various types of data. Results: The proposed multi-EMG model fits the clinical data well. Because the primary parameters of the model (growth and decay rates, peak centers, and peak heights) are characteristics of the model function, they provide accurate reference values for assessing cerebral hemodynamics in various conditions. In addition, the primary parameters can be estimated from curves with partially missing data. The accuracy of the model estimation was verified by repeated curve fitting with manipulated missing data. Conclusion: Compared with other asymmetric peak functions, the multi-EMG model can possibly serve as a universal model for cerebral hemodynamics. According to the results, the model can be helpful for the assessment of cerebrovascular hemodynamics in clinical research and clinical settings.
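A sum of exponentially modified Gaussian peaks, the general shape a multi-EMG model refers to, can be sketched as below. The abstract does not give the paper's exact parameterization, so this uses the standard EMG density scaled by an amplitude; the parameter names (`amplitude`, `mu`, `sigma`, `tau`) and the example values are assumptions.

```python
import math


def emg(t, amplitude, mu, sigma, tau):
    """Exponentially modified Gaussian evaluated at time t.

    Standard EMG density (a Gaussian with centre mu and width sigma convolved
    with an exponential decay of time constant tau), scaled by `amplitude`.
    """
    lam = 1.0 / tau
    exponent = 0.5 * lam * (2.0 * mu + lam * sigma ** 2 - 2.0 * t)
    tail = math.erfc((mu + lam * sigma ** 2 - t) / (math.sqrt(2.0) * sigma))
    return amplitude * 0.5 * lam * math.exp(exponent) * tail


def multi_emg(t, components):
    """Sum of EMG peaks; `components` holds (amplitude, mu, sigma, tau) tuples."""
    return sum(emg(t, *c) for c in components)


# Hypothetical two-peak intensity curve sampled every 0.5 time units.
components = [(1.0, 5.0, 1.0, 2.0), (0.4, 12.0, 1.5, 3.0)]
times = [0.5 * i for i in range(80)]
curve = [multi_emg(t, components) for t in times]
```

Fitting such a model to measured intensity curves would additionally need a nonlinear least-squares routine, which is outside this sketch.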