• Title/Abstract/Keywords: Multiple missing values

52 search results (processing time: 0.02 seconds)

Design of the Integrated Incomplete Information Processing System based on Rough Set

  • Jeong, Gu-Beom;Chung, Hwan-Mook;Kim, Guk-Boh;Park, Kyung-Ok
    • 한국지능시스템학회논문지 / Vol. 11, No. 5 / pp.441-447 / 2001
  • In general, Rough Set theory is used for classification, inference, and decision analysis of incomplete data by applying approximation space concepts to an information system. An information system can include quantitative attribute values with interval characteristics, or incomplete data such as multiple or unknown (missing) values. Such incomplete data cause inconsistency in the information system and reduce the classification ability of systems that use Rough Sets. In this paper, we present the various types of incomplete data that may occur in an information system and propose the INcomplete information Processing System (INiPS), which uses Rough Sets to convert an incomplete information system into a complete one.

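To illustrate the approximation-space idea mentioned in the abstract above, here is a minimal sketch of lower and upper approximations in plain Python. It assumes a toy decision table and treats a missing value (`None`) as compatible with any value, which is one common textbook convention for incomplete tables; the abstract does not say how INiPS itself handles this, so every name and data value below is hypothetical.

```python
def compatible(a, b):
    """Two attribute values match if they are equal or if either one is missing."""
    return a is None or b is None or a == b


def tolerance_class(table, i):
    """Indices of all objects that cannot be told apart from object i."""
    return {
        j
        for j, row in enumerate(table)
        if all(compatible(x, y) for x, y in zip(table[i], row))
    }


def approximations(table, target):
    """Return (lower, upper) approximations of the index set `target`."""
    lower, upper = set(), set()
    for i in range(len(table)):
        cls = tolerance_class(table, i)
        if cls <= target:   # every look-alike is in the concept: certain member
            lower.add(i)
        if cls & target:    # at least one look-alike is in the concept: possible member
            upper.add(i)
    return lower, upper


# Hypothetical table: (temperature, headache); None marks a missing value.
table = [("high", "yes"), ("high", None), ("normal", "no"), ("normal", "no")]
concept = {0}  # indices of objects known to belong to the concept
lower, upper = approximations(table, concept)
```

In this toy table the missing value makes objects 0 and 1 indistinguishable, so the concept has an empty lower approximation while its upper approximation contains both objects. This is the kind of lost classification ability the abstract attributes to incomplete data.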

Frequency Matrix 기법을 이용한 결측치 자료로부터의 개인신용예측 (Predicting Personal Credit Rating with Incomplete Data Sets Using Frequency Matrix technique)

  • 배재권;김진화;황국재
    • Journal of Information Technology Applications and Management / Vol. 13, No. 4 / pp.273-290 / 2006
  • This study proposes a frequency matrix technique for predicting personal credit ratings more efficiently from incomplete data sets. First, multiple discriminant analysis (MDA) and logistic regression analysis are tested for predicting personal credit ratings with incomplete data sets, with missing values filled in by mean imputation and by regression imputation. An artificial neural network and the frequency matrix technique are also tested on the same prediction task. A data set of personal credit information on 8,234 customers of Bank A in 2004 was collected for the test. The experimental results show that the frequency matrix technique outperforms all the other models: MDA-mean, Logit-mean, MDA-regression, Logit-regression, and artificial neural networks.

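The two baseline imputation methods named in the abstract above can be sketched as follows. This is only an illustration on made-up numbers, assuming a single predictor and a straight-line fit; the study's actual variables and model specification are not given in the abstract, and the frequency matrix technique itself is not reproduced here.

```python
from statistics import mean


def mean_impute(values):
    """Replace each None with the mean of the observed values."""
    m = mean(v for v in values if v is not None)
    return [m if v is None else v for v in values]


def regression_impute(x, y):
    """Fill each missing y from a least-squares line fitted on the complete (x, y) pairs."""
    pairs = [(a, b) for a, b in zip(x, y) if b is not None]
    mx = mean(a for a, _ in pairs)
    my = mean(b for _, b in pairs)
    slope = sum((a - mx) * (b - my) for a, b in pairs) / sum(
        (a - mx) ** 2 for a, _ in pairs
    )
    intercept = my - slope * mx
    return [intercept + slope * a if b is None else b for a, b in zip(x, y)]


# Hypothetical toy data: income (x) and credit score (y), with one score missing.
income = [1.0, 2.0, 3.0, 4.0]
score = [10.0, 20.0, None, 40.0]
by_mean = mean_impute(score)
by_regression = regression_impute(income, score)
```

Mean imputation ignores the predictor and fills the gap with the average observed score, while regression imputation uses the relationship between income and score, so the two methods can give quite different fills for the same record.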

Anomaly detection in particulate matter sensor using hypothesis pruning generative adversarial network

  • Park, YeongHyeon;Park, Won Seok;Kim, Yeong Beom
    • ETRI Journal / Vol. 43, No. 3 / pp.511-523 / 2021
  • The World Health Organization provides guidelines for managing the particulate matter (PM) level because a higher PM level represents a threat to human health. To manage the PM level, a procedure for measuring the PM value is first needed. We use a PM sensor that collects the PM level by the laser-based light scattering (LLS) method because it is more cost effective than a beta attenuation monitor-based sensor or a tapered element oscillating microbalance-based sensor. However, an LLS-based sensor has a higher probability of malfunctioning than the higher cost sensors. In this paper, we treat all malfunctions, including the collection of abnormal values and missing data, as anomalies, and we aim to detect these anomalies to support the maintenance of PM measuring sensors. To this end, we propose a novel architecture that we call the hypothesis pruning generative adversarial network (HP-GAN). Through comparative experiments, we achieve AUROC and AUPRC values of 0.948 and 0.967, respectively, in the detection of anomalies in LLS-based PM measuring sensors. We conclude that our HP-GAN is a cutting-edge model for anomaly detection.
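For readers unfamiliar with the AUROC metric reported above, the following sketch computes it directly from its definition: the probability that a randomly chosen anomaly receives a higher score than a randomly chosen normal sample, with ties counted as one half. The labels and scores are invented for illustration and have nothing to do with the HP-GAN experiments.

```python
def auroc(labels, scores):
    """Area under the ROC curve via pairwise comparison of anomaly and normal scores."""
    pos = [s for l, s in zip(labels, scores) if l == 1]  # anomaly scores
    neg = [s for l, s in zip(labels, scores) if l == 0]  # normal scores
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical anomaly scores: label 1 = anomaly, label 0 = normal.
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
value = auroc(labels, scores)
```

The pairwise form is quadratic in the number of samples, so it suits small examples; rank-based formulas give the same value more cheaply on large data sets.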

이상점 영향력 축소를 통한 무응답 대체법 (A Multiple Imputation for Reducing Outlier Effect)

  • 김만겸;신기일
    • 응용통계연구 / Vol. 27, No. 7 / pp.1229-1241 / 2014
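  • When outliers and nonresponse occur together, nonresponse imputation performs worse than when only nonresponse is present. In such cases, outliers should first be detected and their influence reduced before imputation is carried out. This paper studies methods that improve the performance of nonresponse imputation by reducing the influence of outliers. To this end, we review the outlier detection method proposed by She and Owen (2011), along with weight adjustment and outlier replacement, two methods commonly used to reduce the influence of detected outliers. We also examine imputation methods that incorporate outlier treatment, and assess the effect of reducing outlier influence through simulation studies and a case analysis.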
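The detect, reduce, then impute sequence described in this abstract can be sketched on a single variable. The sketch below is an assumption-laden stand-in: it uses a simple median/MAD rule in place of the She and Owen (2011) detector, replaces flagged values by clipping them to the rule's bounds, and then fills missing entries with the mean. None of these specific choices comes from the paper.

```python
from statistics import mean, median


def impute_after_outlier_reduction(values, k=3.0):
    """Clip observed values to median +/- k * MAD, then mean-impute the None entries."""
    observed = [v for v in values if v is not None]
    med = median(observed)
    mad = median(abs(v - med) for v in observed)  # median absolute deviation
    lo, hi = med - k * mad, med + k * mad         # note: collapses to med if mad == 0
    clipped = [min(max(v, lo), hi) for v in observed]
    fill = mean(clipped)                          # fill value with outlier influence reduced
    return [fill if v is None else v for v in values]


# Hypothetical survey variable with one gross outlier and one nonresponse.
data = [10.0, 11.0, 9.0, 10.0, 1000.0, None]
naive_fill = mean(v for v in data if v is not None)  # plain mean imputation value
robust = impute_after_outlier_reduction(data)
```

On this toy input the plain mean is dragged far above the typical responses by the single outlier, whereas the fill computed after clipping stays close to them, which is the effect the abstract describes.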

The Determinants of Customer Loyalty: The Case Study of Saigon Co.op Supermarkets in Vietnam

  • NGUYEN, Cuong Quoc;PHAM, Ngan
    • 유통과학연구 / Vol. 19, No. 5 / pp.61-68 / 2021
  • Purpose: Retailing is one of the fastest-growing sectors in Vietnam, and foreign investors regard the Vietnamese retail market as very promising. However, competition is intense, and retailers are keen to gain new customers. Hence, Vietnamese retailers have paid more attention to customer loyalty. Saigon Co.op is one of the largest retailers in Vietnam, but its consumers now have a wider choice of retailers. As a result, Saigon Co.op has recognized the significance of customer loyalty. This study aims to determine the factors affecting customer loyalty at Saigon Co.op supermarkets in Vietnam. Research design, data and methodology: This study applied multiple regression analysis to 250 samples collected from Saigon Co.op customers. The questionnaire was provided to respondents via Google Form, with the link posted on the Co.op mart fan page on Facebook. Two hundred eighty-seven samples were collected, of which 37 were removed due to missing values. Exploratory Factor Analysis (EFA) and regression analysis were carried out in SPSS version 20. Results: The findings identify four determinants of customer loyalty at Saigon Co.op: Product Quality, Brand Image, Price Strategy and Service Quality. Conclusions: Managerial recommendations are provided for supermarkets to improve customer loyalty in Vietnam and other emerging markets. Limitations and suggestions for further research are also discussed.
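The sample reduction reported above (287 collected, 37 removed, 250 analyzed) is an instance of listwise deletion: any response with at least one missing answer is dropped before analysis. A minimal sketch follows, assuming responses are stored as dictionaries with `None` for unanswered items; the field names and values are hypothetical and not taken from the study's questionnaire.

```python
def drop_incomplete(rows):
    """Listwise deletion: keep only the rows in which every field was answered."""
    return [r for r in rows if all(v is not None for v in r.values())]


# Hypothetical questionnaire responses; None marks an unanswered item.
responses = [
    {"quality": 4, "price": 3, "loyalty": 5},
    {"quality": None, "price": 2, "loyalty": 4},
    {"quality": 5, "price": 4, "loyalty": None},
    {"quality": 3, "price": 3, "loyalty": 3},
]
complete = drop_incomplete(responses)
removed = len(responses) - len(complete)
```

Listwise deletion is simple and keeps the regression input rectangular, at the cost of discarding the answered items in every partially completed response.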

한우에 있어서 유전체 육종가 추정 (Prediction of genomic breeding values of carcass traits using whole genome SNP data in Hanwoo (Korean cattle))

  • 이승환;김형철;임다정;당창권;조용민;김시동;이학교;이준헌;양보석;오성종;홍성구;장원경
    • 농업과학연구 / Vol. 39, No. 3 / pp.357-364 / 2012
  • Genomic breeding value (GEBV) has recently become available in the beef cattle industry. Genomic selection methods are exceptionally valuable for selecting traits, such as marbling, that are difficult to measure until later in life. One method to utilize information from sparse marker panels is the Bayesian model selection method with RJMCMC. The accuracy of prediction varies between a multiple SNP model with RJMCMC (0.47 to 0.73) and a least squares method (0.11 to 0.41) when using SNP information, while the accuracy of prediction increases in the multiple SNP (0.56 to 0.90) and least squares methods (0.21 to 0.63) when including a polygenic effect. In the multiple SNP model with the RJMCMC model selection method, the accuracy ($r^2$) of GEBV for marbling predicted based only on SNP effects was 0.47, while the $r^2$ of GEBV predicted by SNP plus polygenic effect was 0.56. The accuracies of GEBV predicted using only SNP information were 0.62, 0.68 and 0.73 for CWT, EMA and BF, respectively. However, when polygenic effects were included, the accuracies of GEBV increased to 0.89, 0.90 and 0.89 for CWT, EMA and BF, respectively. Our data demonstrate that SNP information alone misses genetic variation that contributes to phenotypes for carcass traits, and that polygenic effects compensate for genetic variation that whole genome SNP data do not explain. Overall, the multiple SNP model with the RJMCMC model selection method provides a better prediction of GEBV than does the least squares method (single marker regression).

호텔 조리사들의 아열대 채소 구매의도 및 구전에 관한 연구 (A Study on Hotel Chef Subtropical Vegetable Purchase Intention and Word of Mouth)

  • 김하윤
    • 한국조리학회지 / Vol. 21, No. 3 / pp.181-197 / 2015
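  • This study surveyed hotel chefs' perceptions of the perceived value, quality, and price of subtropical vegetables in order to examine their trust in subtropical vegetables, their intention to use them, and their word-of-mouth intention. The study targeted chefs at hotels in Seoul who had experience with subtropical vegetables. Over 20 days, from October 1 to October 20, 2014, 380 questionnaires were distributed; after excluding insincere responses, 353 were used for analysis. Frequency analysis, factor analysis, correlation analysis, and regression analysis were performed on these 353 questionnaires. The results showed that the perceived value, perceived quality, and reasonable price of subtropical vegetables positively affected trust, and that trust positively affected purchase intention and word-of-mouth intention. Finally, purchase intention positively affected word of mouth. Therefore, efforts should be made to raise the quality of subtropical vegetables, to increase production at reasonable prices, and to study the nutritional and functional properties of the products so that they can become good food ingredients.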

지역아동센터 사회복지사의 전문성 인식이 조직헌신에 미치는 영향 (Effect of Local Child Care Centers' Social Workers Perceptions Professionalism on Organizational Commitment)

  • 임동호;김대석
    • 한국콘텐츠학회논문지 / Vol. 14, No. 11 / pp.196-204 / 2014
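  • This study empirically analyzes the effect of perceived professionalism on organizational commitment among social workers at local child care centers. A questionnaire survey was conducted among social workers at local child care centers in Jeollanam-do, and of the returned questionnaires, 286 were used for analysis after excluding those with missing values. The results showed that perceived professionalism was above the midpoint score, with belief in service rated highest. Perceived professionalism and organizational commitment were positively correlated, and the correlation between sense of calling to the profession and organizational commitment was the highest. Multiple regression analysis showed that two sub-factors of perceived professionalism, sense of calling to the profession and use of professional organizations, affected organizational commitment, with sense of calling having the greatest effect. Based on these results, the study suggests ways to improve the perceived professionalism and organizational commitment of social workers at local child care centers, as well as topics for follow-up research.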

항공서비스에 대한 고객만족이 거래지속의도에 미치는 영향에 있어서 성별의 역할 (The Influence of Customer Satisfaction on Customer Loyalty and the Moderating Effect of Gender)

  • 김문섭
    • 유통과학연구 / Vol. 14, No. 10 / pp.73-79 / 2016
  • Purpose - Customer satisfaction has long been considered an important way to retain current customers, particularly in industries where competition between companies is fierce. Major Korean airlines have faced such competition with the growth of low cost carriers (LCCs), and to survive they need to retain their customers. This research investigates the relationship between customer satisfaction and the customer's intention to remain loyal, and examines how gender moderates the influence of satisfaction on loyalty. Research design, data, and methodology - A regression model is developed in which customer satisfaction, gender, and the interaction of satisfaction and gender are predictors and the customer's intention to remain loyal is the dependent variable. To analyze this research model, data were collected from 402 university students taking marketing classes at universities in Seoul, Chung-Cheong province, and Kangwon province. After eliminating data from students who had never flown and data with missing values, a final sample of 201 was analyzed. The hypotheses were tested using SPSS 21.0. Internal reliability was supported by Cronbach's α, and multiple regression was performed. Results - Customer satisfaction with the airline's service had a positive influence on the customer's intention to remain loyal to the airline. Moreover, this influence was moderated by gender: a male customer's intention to remain loyal was determined more strongly by his satisfaction with the airline's service than a female customer's was. Conclusions - This research contributes to the aviation service marketing literature by showing how customer satisfaction influences the intention to remain loyal and how gender moderates this influence. These results have managerial implications for major Korean airlines, which can use customer satisfaction and gender to enhance customer retention.
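The moderation model described in this abstract regresses loyalty on satisfaction, gender, and their product. With a 0/1 gender dummy, the least-squares interaction coefficient equals the difference between the satisfaction slopes fitted separately in the two groups, which gives a compact way to sketch the idea. The data below are invented, and coding male as 1 is an assumption made for illustration only.

```python
from statistics import mean


def slope(x, y):
    """Least-squares slope of y on x for a single group."""
    mx, my = mean(x), mean(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum(
        (a - mx) ** 2 for a in x
    )


# Hypothetical (satisfaction, loyalty) scores for each group.
male_sat, male_loyalty = [1.0, 2.0, 3.0], [2.0, 4.0, 6.0]
female_sat, female_loyalty = [1.0, 2.0, 3.0], [2.0, 3.0, 4.0]

# Interaction coefficient for the male = 1 dummy: difference between group slopes.
interaction = slope(male_sat, male_loyalty) - slope(female_sat, female_loyalty)
```

A positive value here means loyalty rises faster with satisfaction in the group coded 1 than in the group coded 0, which is the pattern the abstract reports for male customers. Testing whether the difference is statistically significant requires standard errors, which this sketch does not compute.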

불균형 클래스에서 AutoML 기반 분류 모델의 성능 향상을 위한 데이터 처리 (Data Processing of AutoML-based Classification Models for Improving Performance in Unbalanced Classes)

  • 이동준;강지수;정경용
    • 융합정보논문지 / Vol. 11, No. 6 / pp.49-54 / 2021
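  • With recent advances in smart healthcare technology, interest in common everyday diseases is growing. Accordingly, more studies are using prediction models to analyze or predict diseases from healthcare data. However, healthcare data are imbalanced between positive and negative samples. This occurs because people without a given disease far outnumber patients who have it, which makes data collection difficult. Because data imbalance affects the performance of models used for disease prediction and detection, it needs to be removed. This study therefore resolves data imbalance through oversampling and missing-value imputation. The performance of several models is evaluated with AutoML, and the top three models are combined into an ensemble.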
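The oversampling step mentioned in this abstract can be sketched with plain random oversampling, in which minority-class rows are duplicated at random until every class is as large as the biggest one. The abstract does not say which oversampling method the authors used, so this is an assumed stand-in, and the function name, seed, and data are hypothetical.

```python
import random
from collections import Counter


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen minority-class rows until all classes are equal in size."""
    rng = random.Random(seed)          # fixed seed so the sketch is repeatable
    counts = Counter(labels)
    target = max(counts.values())      # size of the largest class
    out_rows, out_labels = list(rows), list(labels)
    for cls, n in counts.items():
        pool = [r for r, l in zip(rows, labels) if l == cls]
        for _ in range(target - n):    # zero iterations for the majority class
            out_rows.append(rng.choice(pool))
            out_labels.append(cls)
    return out_rows, out_labels


# Hypothetical imbalanced data: three negative samples and one positive sample.
rows = [[0.1], [0.2], [0.3], [0.9]]
labels = [0, 0, 0, 1]
balanced_rows, balanced_labels = random_oversample(rows, labels)
```

Oversampling should be applied to the training split only, after any missing values have been filled, so that duplicated rows do not leak into the data used to compare the candidate models.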