• Title/Summary/Keyword: Outlier test


Deep Learning-based Abnormal Behavior Detection System for Dementia Patients (치매 환자를 위한 딥러닝 기반 이상 행동 탐지 시스템)

  • Kim, Kookjin;Lee, Seungjin;Kim, Sungjoong;Kim, Jaegeun;Shin, Dongil;Shin, Dong-kyoo
    • Journal of Internet Computing and Services / v.21 no.3 / pp.133-144 / 2020
  • The number of elderly people with dementia is rising as fast as the share of older people in the population, creating a social and economic burden. In particular, dementia care costs, including indirect costs such as lost caregiver hours, have grown exponentially over the years. To reduce these costs, a management system for dementia care is urgently needed. This study therefore proposes a sensor-based abnormal behavior detection system for dementia patients who live alone or in an environment where a caregiver cannot always be present. Existing studies only evaluated behavior in general or normal behavior, or recognized behavior by processing images rather than sensor data. Recognizing the limits of real-world data collection, this study uses both an autoencoder, an unsupervised learning model, and an LSTM, a supervised learning model. The autoencoder is trained on normal behavioral data to learn patterns of normal behavior, and the LSTM refines the classification by learning behaviors that sensors can perceive. In testing, the two models reached about 96% and 98% accuracy, respectively, and the system is designed to pass data to the LSTM model when the autoencoder's outlier rate exceeds 3%. The system is expected to help manage elderly people and dementia patients who live alone and to reduce the cost of care.

Color Contrast Evaluation Algorithm Considering Color Temperature Feeling (색 온도 느낌을 고려한 색 대비 평가 알고리즘)

  • Jang Young-Gun
    • The KIPS Transactions:PartB / v.13B no.4 s.107 / pp.471-478 / 2006
  • In this paper, two color contrast evaluation algorithms, the W3C and NSSC algorithms, are compared to select proper criteria for the color contrast of text-background color combinations in web documents. The relationship between color contrast as defined by the existing formulas and the readability rating is imperfect and shows considerable variance; in particular, there are some substantial outliers. I modify the NSSC algorithm so that it applies to all colors and compare the two algorithms on the same combinations of web-safe colors. A new algorithm that treats color temperature feeling as a component of color contrast is proposed and implemented. The results show that the two existing algorithms do not contradict each other, and that 82% of all combinations of web-safe colors are improper according to the W3C guideline, which restricts color selection in web documents more severely than the NSSC algorithm does. Experimental tests show that the proposed algorithm is superior to the W3C algorithm in the linearity of the relationship between color contrast and readability rating, which suggests that color temperature feeling is an effective component of color contrast. Determining the best contribution ratio of color temperature feeling requires further study, however, and is related to Hangul font style and size. The more widely mobile color displays are used, the more important color contrast becomes as an accessibility factor.

The Classification of Forest Cover Types by Consecutive Application of Multivariate Statistical Analysis in the Natural Forest of Western Mt. Jiri (다변량 통계 분석법의 연속 적용에 의한 서부 지리산 천연림의 산림 피복형 분류)

  • Chung, Sang Hoon;Kim, Ji Hong
    • Journal of Korean Society of Forest Science / v.102 no.3 / pp.407-414 / 2013
  • This study was conducted to classify forest cover types in the natural forest of western Mt. Jiri using multivariate statistical analysis. Based on vegetation data collected by point-quarter sampling, the analytical methods adopted were the species-area curve (SAC), hierarchical cluster analysis (HCA), indicator species analysis (ISA), and multiple discriminant analysis (MDA). SAC identified outlier tree species that were unlikely to influence the classification of forest cover types; these were excluded from all subsequent analyses. Based on the forest vegetation information, HCA classified the study area into 2 to 10 clusters, and ISA indicated that the optimal number of clusters was seven. MDA was used to test the clusters classified by HCA and ISA. The seven clusters were classified appropriately, with an overall classification success of 91.3%. The classified forest cover types were named according to the ratio of dominant species in the upper layer of each cluster: (1) Quercus mongolica pure forest, (2) Mixed mesophytic forest, (3) Q. mongolica - Q. serrata forest, (4) Abies koreana - Q. mongolica forest, (5) Fraxinus mandshurica forest, (6) Q. serrata forest, and (7) Carpinus laxiflora forest.

Molecular Classification and Characterization of Human Gastric Adenocarcinoma through DNA Microarray

  • Xie, Hongjian;Eun, Jung-Woo;Noh, Ji-Heon;Jeong, Kwang-Wha;Kim, Jung-Kyu;Kim, Su-Young;Lee, Sug-Hyung;Park, Won-Sang;Yoo, Nam-Jin;Lee, Jung-Young;Nam, Suk-Woo
    • Molecular & Cellular Toxicology / v.3 no.3 / pp.190-194 / 2007
  • Gastric adenocarcinoma (GA) is a major tumor type of gastric cancer and is subdivided by histopathological determination into several different tumors, such as papillary, tubular, mucinous, signet-ring cell, and adenosquamous carcinoma. On the other hand, GA is also subdivided into intestinal and diffuse types of adenocarcinoma by Lauren's classification. In this study, we analyzed differential gene expression patterns in 24 samples of three histologically different GAs using a DNA microarray containing approximately 19,000 genetic elements. Hierarchical clustering analysis of the 24 gastric adenocarcinomas (12 of intestinal type, 7 of diffuse type, and 5 of mixed type) produced two major subgroups on the dendrogram, which contained most of the intestinal and diffuse type GAs, respectively. Supervised analysis of the 19 intestinal and diffuse type GAs using the Wilcoxon rank T-test (P<0.01) yielded 100 outlier genes whose differential expression exactly separated intestinal and diffuse type GA. In conclusion, genome-wide analysis of gene expression suggests that GAs may be subclassified into intestinal and diffuse types by their characteristic molecular expression. Our results also provide large-scale genetic elements that reflect the molecular differences between intestinal and diffuse type GAs, which may help in understanding the different molecular carcinogenesis of gastric cancer.

Establishment of National Quality Control System for Analytical Laboratory of Pesticide Products by Proficiency Testing (농약 이화학시험 분석기관의 숙련도시험을 통한 정도관리체계 확립 연구)

  • Chang, Hee-Ra;Park, Hyo-Kyung;Lim, Youngjoo;Kim, Kwang-Ho;Kim, Chan Sub;Kim, Kyun
    • The Korean Journal of Pesticide Science / v.16 no.4 / pp.350-356 / 2012
  • Proficiency testing and validation of analytical methods are part of the quality assurance scheme of an analytical chemistry laboratory, used to monitor the laboratory's performance and to produce consistently reliable data. This study assessed the applicability of a proficiency testing scheme proposed for domestic analytical laboratories of pesticide products. The validation of analytical methods, stability, and homogeneity was confirmed for formulated pesticide products (emulsifiable concentrates) of emamectin benzoate and lufenuron used in the proficiency testing. Among the z-scores of the 33 participating laboratories for emamectin benzoate, 2 laboratories (6.0%) were outliers, 2 laboratories had z-scores outside the range from -3 to 3, designated "unacceptable", and 3 laboratories (9.0%) had z-scores in the ranges -3 to -2 and 2 to 3, designated "questionable". For lufenuron, three laboratories (9.0%) had z-scores designated "questionable". Additional proficiency testing for various product types will be needed to establish the quality control scheme.
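
The z-score bands described in this abstract can be illustrated with a minimal Python sketch. The abstract does not give the formula or the handling of values exactly at 2 or 3, so the usual proficiency-testing form z = (x - assigned value) / standard deviation for proficiency assessment is assumed here, and the function names, parameters, and boundary handling are hypothetical.

```python
def z_score(result, assigned_value, sigma_p):
    """Standardize a laboratory result (assumed form, not taken from the paper)."""
    return (result - assigned_value) / sigma_p


def classify(z):
    """Map a z-score to the bands named in the abstract.

    Boundary handling (|z| <= 2 satisfactory, |z| >= 3 unacceptable)
    is an assumption; the abstract only names the ranges.
    """
    magnitude = abs(z)
    if magnitude <= 2:
        return "satisfactory"
    if magnitude < 3:
        return "questionable"
    return "unacceptable"


if __name__ == "__main__":
    # Hypothetical laboratory results for one active ingredient (percent content).
    assigned, sigma = 10.0, 0.2
    for lab, value in {"lab A": 10.1, "lab B": 10.5, "lab C": 9.2}.items():
        z = z_score(value, assigned, sigma)
        print(lab, round(z, 2), classify(z))
```

The sample values are invented for illustration and do not come from the study.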

A Comparative Study on Methods for Outlier Test of Rainfall in Korea (국내 강우의 이상치검정 방법의 비교 연구)

  • Lee, Jung Sik;Shin, Chang Dong
    • Proceedings of the Korea Water Resources Association Conference / 2018.05a / pp.359-359 / 2018
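  • An outlier is defined as a data point that deviates greatly from the sample and lies apart from the other data, with a very low probability of actually occurring. The annual-maximum rainfall data of extreme-value series, which are used to estimate design floods, contain instrument malfunctions and reading errors by engineers, and outliers are also observed among extreme values caused by super typhoons and localized heavy rainfall under climate change. Because outliers can distort the inherent characteristics of the data in statistical analysis and produce biased results, an outlier analysis procedure should be carried out during frequency analysis to confirm the adequacy of the data. In current practice, the guidelines for design flood estimation and the commentary on river design standards describe the relevant procedures, but because domestic rainfall records are short, outlier analysis is not carried out during frequency analysis; if outliers bias the data, the resulting probability rainfall may be distorted. This study therefore performed outlier tests on rainfall data from major Korean cities. The target stations were metropolitan cities with relatively long observation records, namely Seoul, Busan, Daejeon, Daegu, Incheon, Gwangju, and Ulsan, and rainfall data for 25 durations (10 minutes and 1 to 24 hours) were used. Two methods known to have greater outlier detection power than others were adopted: the modified z-score method based on the median absolute deviation (MAD) (Hoaglin, 1993), an improved variant of the z-score method, which identifies values exceeding upper and lower limits using z values standardized by the sample mean and standard deviation; and the box plot method (Tukey, 1969). The box plot method divides the whole data set into quartiles of 25% each and partitions the sorted series into the median, the box, the whiskers, and outliers. The sorted values between the 25th and 75th percentiles form the box, and the outlying values beyond the whiskers are classified as outliers; plotting the quartiles makes the distribution of the data easy to grasp and makes the location of outliers and any asymmetry in the data easy to identify. With the modified z-score method, no outliers were found at the Seoul and Daegu stations, while 13 were found at Busan, 7 at Daejeon, 5 at Incheon, 32 at Gwangju, and 26 at Ulsan. With the box plot method, 35 outliers were found at Seoul, 39 at Busan, 32 at Daejeon, 38 at Daegu, 51 at Incheon, 61 at Gwangju, and 65 at Ulsan. The box plot method thus identified more outliers than the modified z-score method, and the outliers found by each method were examined by duration and by year. The occurrence of outliers was analyzed by method and counted by station, and the usefulness of the results is expected to increase once additional stations and data are included.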
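
A minimal Python sketch of the two tests named in this abstract follows. The abstract gives no constants, so the customary ones are assumed: the modified z-score 0.6745 (x - median) / MAD with a cutoff of 3.5, and box plot fences at 1.5 times the interquartile range beyond the quartiles. The function names and the sample rainfall series are hypothetical and are not data from the study.

```python
from statistics import median, quantiles


def modified_z_outliers(values, threshold=3.5):
    """Flag values whose MAD-based modified z-score exceeds the threshold.

    The constant 0.6745 and the cutoff 3.5 are the customary choices,
    assumed here because the abstract does not state them.
    """
    med = median(values)
    mad = median([abs(v - med) for v in values])
    if mad == 0:
        return []  # score is undefined when more than half the values are equal
    return [v for v in values if abs(0.6745 * (v - med) / mad) > threshold]


def boxplot_outliers(values, k=1.5):
    """Flag values beyond the box plot fences Q1 - k*IQR and Q3 + k*IQR."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < lower or v > upper]


if __name__ == "__main__":
    # Hypothetical annual-maximum rainfall depths (mm) for one duration.
    rainfall = [52, 48, 61, 55, 50, 58, 47, 63, 54, 120]
    print("modified z-score:", modified_z_outliers(rainfall))
    print("box plot:", boxplot_outliers(rainfall))
```

The quartile convention affects where the fences fall, so counts from this sketch need not match those of a particular statistics package.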


Design of Fetal Health Classification Model for Hospital Operation Management (효율적인 병원보건관리를 위한 태아건강분류 모델)

  • Chun, Je-Ran
    • Journal of Digital Convergence / v.19 no.5 / pp.263-268 / 2021
  • The purpose of this study was to propose a model suited to the actual delivery system by designing a hospital operation management scheme for fetal delivery and a fetal health classification model. The number of deaths during childbirth is similar to the number of maternal deaths, 295,000 as of 2017. Of these deaths, 94% are preventable in most cases. In this paper, we therefore propose a model that uses a random forest to predict the health condition of the fetus from data such as fetal heart rate, fetal movements, and uterine contractions extracted from cardiotocogram (CTG) tests. Even when the data are imbalanced, the proposed model ensures stable operation of the fetal delivery health management system. To secure the accuracy of the system, we remove outliers in the data by setting upper and lower thresholds based on the standard deviation. In addition, because the class labels represent fetal health status, the minority classes were replicated by data resampling to balance the classes. This improved accuracy by 4~5%, and the model reached an accuracy of 97.75%. The developed model is expected to contribute to preventing deaths, to effective fetal health management, and to disease prevention by accurately predicting and managing fetal deaths and diseases in advance.
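
The two preprocessing steps mentioned in this abstract, standard-deviation thresholds and replication of minority classes, can be sketched in Python as follows. The abstract does not state the threshold multiplier or the resampling scheme, so the default of three standard deviations, random oversampling with replacement, and all function and field names are assumptions made for illustration.

```python
import random
from collections import defaultdict
from statistics import mean, pstdev


def remove_outliers(rows, feature, k=3.0):
    """Keep rows whose feature lies within mean +/- k standard deviations.

    k=3.0 is an assumed default; the abstract does not give the multiplier.
    """
    values = [row[feature] for row in rows]
    mu, sigma = mean(values), pstdev(values)
    lower, upper = mu - k * sigma, mu + k * sigma
    return [row for row in rows if lower <= row[feature] <= upper]


def oversample(rows, label, seed=0):
    """Replicate minority-class rows at random until every class is equally large."""
    rng = random.Random(seed)
    groups = defaultdict(list)
    for row in rows:
        groups[row[label]].append(row)
    target = max(len(group) for group in groups.values())
    balanced = []
    for group in groups.values():
        balanced.extend(group)
        balanced.extend(rng.choices(group, k=target - len(group)))
    return balanced


if __name__ == "__main__":
    # Hypothetical records: a heart-rate feature and a health-status label.
    rates = [120, 125, 130, 135, 140, 128, 132, 138, 126, 300]
    labels = ["A"] * 6 + ["B"] * 3 + ["C"]
    rows = [{"rate": r, "status": s} for r, s in zip(rates, labels)]
    cleaned = remove_outliers(rows, "rate", k=2.0)
    print(len(cleaned), len(oversample(rows, "status")))
```

Oversampling should be applied to the training split only, since replicated rows that leak into a test split inflate the measured accuracy.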

Enhanced Block Matching Scheme for Denoising Images Based on Bit-Plane Decomposition of Images (영상의 이진화평면 분해에 기반한 확장된 블록매칭 잡음제거)

  • Pok, Gouchol
    • The Journal of Korea Institute of Information, Electronics, and Communication Technology / v.12 no.3 / pp.321-326 / 2019
  • Image denoising methods based on block matching are founded on the experimental observation that neighboring patches or blocks in images retain similar features, and they have been shown to perform well in removing different kinds of noise. These methods, however, take into account only neighboring blocks when searching for similar blocks and ignore the characteristic features of the reference block itself. Consequently, denoising performance suffers when outliers of the Gaussian distribution are included in the reference block to be denoised. In this paper, we propose an expanded block matching method in which noisy images are first decomposed into a number of bit-planes, the range of true signals is then estimated from the distribution of pixels on the bit-planes, and finally outliers are replaced by neighboring pixels belonging to the estimated range. In this way, the advantages of the conventional Gaussian filter can be added to the block matching method. We tested the proposed method through extensive experiments with well-known test-bed images and observed that it achieves a performance gain.

Product Recommender Systems using Multi-Model Ensemble Techniques (다중모형조합기법을 이용한 상품추천시스템)

  • Lee, Yeonjeong;Kim, Kyoung-Jae
    • Journal of Intelligence and Information Systems / v.19 no.2 / pp.39-54 / 2013
  • The recent explosive growth of electronic commerce gives customers many advantageous purchase opportunities. In this situation, customers who lack sufficient knowledge about their purchases may accept product recommendations. Product recommender systems automatically reflect a user's preferences and provide a recommendation list to the user. Thus, the product recommender system in an online shopping store is known as one of the most popular tools for one-to-one marketing. However, recommender systems that do not properly reflect a user's preferences cause disappointment and waste the user's time. In this study, we propose a novel recommender system that uses data mining and multi-model ensemble techniques to enhance recommendation performance by reflecting the user's preferences precisely. The research data were collected from a real-world online shopping store that sells products from famous art galleries and museums in Korea. The data initially contained 5759 transactions, of which 3167 remained after null data were deleted. We transform the categorical variables into dummy variables and exclude outlier data. The proposed model consists of two steps. The first step predicts customers who have a high likelihood of purchasing products in the online shopping store. In this step, we first use logistic regression, decision trees, and artificial neural networks to predict customers who have a high likelihood of purchasing products in each product group. We apply these data mining techniques using SAS E-Miner software. We partition the datasets into two sets, modeling and validation, for the logistic regression and decision trees, and into three sets, training, test, and validation, for the artificial neural network model. The validation dataset is the same for all experiments.
Then we combine the results of each predictor using multi-model ensemble techniques such as bagging and bumping. Bagging is short for "Bootstrap Aggregation"; it combines outputs from several machine learning techniques to raise the performance and stability of prediction or classification, and is a special form of the averaging method. Bumping is short for "Bootstrap Umbrella of Model Parameter"; it considers only the model with the lowest error value. The results show that bumping outperforms bagging and the other predictors except for the "Poster" product group, where the artificial neural network model performs better than the other models. In the second step, we use market basket analysis to extract association rules for co-purchased products. We extract thirty-one association rules according to the values of the Lift, Support, and Confidence measures. We set the minimum transaction frequency to support an association to 5%, the maximum number of items in an association to 4, and the minimum confidence for rule generation to 10%. This study also excludes extracted association rules with a lift value below 1. After excluding duplicate rules, we obtain fifteen association rules. Of these, eleven rules contain associations between products in the "Office Supplies" product group, one rule includes an association between the "Office Supplies" and "Fashion" product groups, and the other three rules contain associations between the "Office Supplies" and "Home Decoration" product groups. Finally, the proposed product recommender system provides a list of recommendations to the appropriate customers. We test the usability of the proposed system using a prototype and real-world transaction and profile data. To this end, we construct the prototype system using ASP, JavaScript, and Microsoft Access.
In addition, we survey user satisfaction with the product list recommended by the proposed system and with randomly selected product lists. The survey participants are 173 people who use MSN Messenger, Daum Café, and P2P services. We evaluate user satisfaction on a five-point Likert scale and perform a paired-sample t-test on the survey results. The results show that the proposed model outperforms the random selection model at the 1% statistical significance level, meaning that users were significantly more satisfied with the recommended product list. The results also suggest that the proposed system may be useful in real-world online shopping stores.
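
The rule filtering described in this abstract (minimum support 5%, minimum confidence 10%, lift of at least 1) can be sketched in Python using the standard definitions of the three measures. The authors worked in SAS E-Miner, so this is an illustrative sketch and not their implementation; the function names and the sample baskets are hypothetical.

```python
def rule_metrics(transactions, antecedent, consequent):
    """Return (support, confidence, lift) for the rule antecedent -> consequent."""
    n = len(transactions)
    count_a = sum(1 for t in transactions if antecedent <= t)
    count_c = sum(1 for t in transactions if consequent <= t)
    count_both = sum(1 for t in transactions if (antecedent | consequent) <= t)
    support = count_both / n
    confidence = count_both / count_a if count_a else 0.0
    lift = confidence / (count_c / n) if count_c else 0.0
    return support, confidence, lift


def keep_rule(metrics, min_support=0.05, min_confidence=0.10, min_lift=1.0):
    """Apply the thresholds quoted in the abstract."""
    support, confidence, lift = metrics
    return support >= min_support and confidence >= min_confidence and lift >= min_lift


if __name__ == "__main__":
    # Hypothetical baskets; the item names are invented for illustration.
    baskets = [
        {"pen", "notebook"},
        {"pen", "notebook"},
        {"pen"},
        {"scarf"},
    ]
    metrics = rule_metrics(baskets, {"pen"}, {"notebook"})
    print(metrics, keep_rule(metrics))
```

A lift above 1 means the consequent is bought more often alongside the antecedent than its overall frequency would predict, which is why rules below 1 are discarded.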