• Title/Summary/Keyword: Clustering sampling

Evaluation of Multivariate Stream Data Reduction Techniques (다변량 스트림 데이터 축소 기법 평가)

  • Jung, Hung-Jo;Seo, Sung-Bo;Cheol, Kyung-Joo;Park, Jeong-Seok;Ryu, Keun-Ho
    • The KIPS Transactions:PartD / v.13D no.7 s.110 / pp.889-900 / 2006
  • Although sensor networks differ in user requests and data characteristics across application areas, existing research on the stream data transmission problem focuses on improving the performance of individual methods rather than considering the inherent characteristics of stream data. In this paper, we introduce a hierarchical or distributed sensor network architecture and data model, and then evaluate multivariate data reduction methods suited to user requirements and data features so that reduction methods can be applied selectively. To assess the relative performance of the multivariate data reduction methods, we used conventional techniques, namely Wavelet, HCL (Hierarchical Clustering), Sampling, and SVD (Singular Value Decomposition), and experimental data sets, namely multivariate time series, synthetic data, and robot execution failure data. The experimental results show that the SVD and Sampling methods are superior to Wavelet and HCL with respect to relative error ratio and execution time. In particular, because the relative error ratio of each data reduction method varies with the data characteristics, selecting the reduction method to match the experimental data set yields good performance. The findings reported in this paper can serve as a useful guideline for the design and construction of sensor network applications that involve multivariate stream data.
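
As a rough illustration of how a relative error ratio can be used to compare reduction methods, the sketch below reduces a synthetic series in two ways and measures the reconstruction error. It does not reproduce the paper's experiments: the error definition (L2 norm of the residual divided by the L2 norm of the original), the test signal, and the function names are all assumptions made for this example. The block-mean reduction corresponds to keeping only the coarse Haar wavelet approximation when the block size is a power of two.

```python
import math

def sample_and_hold(series, step):
    """Systematic sampling: keep every `step`-th value and hold it until the next kept value."""
    return [series[(i // step) * step] for i in range(len(series))]

def block_mean(series, step):
    """Replace each block of `step` values by the block mean."""
    out = []
    for start in range(0, len(series), step):
        block = series[start:start + step]
        out.extend([sum(block) / len(block)] * len(block))
    return out

def relative_error(original, reconstructed):
    """Assumed definition: ||x - x_hat||_2 / ||x||_2."""
    num = math.sqrt(sum((a - b) ** 2 for a, b in zip(original, reconstructed)))
    den = math.sqrt(sum(a * a for a in original))
    return num / den

# Hypothetical sensor reading: a slow sine wave with an offset.
signal = [math.sin(2 * math.pi * t / 64) + 2.0 for t in range(256)]

err_sampling = relative_error(signal, sample_and_hold(signal, 4))
err_block = relative_error(signal, block_mean(signal, 4))
print(f"sampling: {err_sampling:.4f}, block mean: {err_block:.4f}")
```

Both reductions keep one value per four inputs, so the comparison is at equal compression. Which method wins on real data depends on the data characteristics, which is the point the abstract makes.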

Reinforced Generator GAN Model for Tabular Data Learning (Tabular Data 학습을 위한 강화형 생성자 GAN Model)

  • Chan-sik Sung;Joon-sik Lim
    • Journal of Internet Computing and Services / v.25 no.5 / pp.121-130 / 2024
  • Tabular data is a mixture of numerical and categorical data, and conventional machine learning models have been regarded as more suitable than generative models for learning from it. This is because generative models suffered from an excessive increase in parameters or failed to find a direction for learning, owing to the multimodal numerical distributions and imbalanced categorical frequencies that characterize tabular data. However, as data increasingly becomes big data and arrives in real time, existing machine learning models have shown limitations in their application. In this paper, as a methodology for applying generative models to tabular data, we propose RGGAN (Reinforced Generator GAN), a generative adversarial network with a reinforced generator that combines clustering sampling, which leverages conjugate prior distributions, with a loss function improved with Gower coefficients and mutual information. We constructed an anomaly detector from the discriminators learned by RGGAN and measured the AUC for detecting fraudulent transactions in the IEEE-CIS Fraud Detection Dataset. It showed a performance improvement of 1-7% over existing generative models, showing that the proposed model is effective for learning tabular data and for detecting fraudulent transactions.
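
The abstract does not say how the Gower coefficient enters the RGGAN loss. The sketch below shows only the standard Gower distance for records that mix numerical and categorical fields: range-normalized absolute difference for numerical fields, simple mismatch for categorical fields, averaged over all fields. The field names, values, and ranges are hypothetical.

```python
def gower_distance(a, b, numeric_ranges):
    """Gower distance between two records given as dicts.

    `numeric_ranges` maps each numerical field to its (max - min) over the data set;
    every other field is treated as categorical.
    """
    total = 0.0
    for key in a:
        if key in numeric_ranges:
            # Numerical field: absolute difference scaled to [0, 1] by the field range.
            total += abs(a[key] - b[key]) / numeric_ranges[key]
        else:
            # Categorical field: 0 when the categories match, 1 otherwise.
            total += 0.0 if a[key] == b[key] else 1.0
    return total / len(a)

# Hypothetical transaction records and field ranges.
tx1 = {"amount": 120.0, "hour": 23, "card_type": "credit"}
tx2 = {"amount": 20.0, "hour": 11, "card_type": "debit"}
ranges = {"amount": 1000.0, "hour": 23.0}

d = gower_distance(tx1, tx2, ranges)
print(f"Gower distance: {d:.4f}")
```

Because every per-field term lies in [0, 1], the distance treats numerical and categorical fields on a common scale, which is why it is a natural fit for tabular data.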

Multivariate Stratification Method for the Multipurpose Sample Survey : A Case Study of the Sample Design for Fisher Production Survey (다목적 표본조사를 위한 다변량 층화 : 어업비계통생산량조사를 위한 표본설계 사례)

  • Park, Jin-Woo;Kim, Young-Won;Lee, Seok-Hoon;Shin, Ji-Eun
    • Survey Research / v.9 no.1 / pp.69-85 / 2008
  • Stratification is a feature of the majority of field sample designs. This paper considers a multivariate stratification strategy for multipurpose sample surveys with several auxiliary variables. In a multipurpose survey, the stratification procedure is very complicated because the efficiency of stratification must be considered simultaneously for several variables of interest. We propose a stratification strategy based on factor analysis and cluster analysis using several stratification variables. To improve the efficiency of stratification, we first select the stratification variables by factor analysis and then apply the K-means clustering algorithm to form the strata. An application of the stratification strategy to the sampling design for the Fisher Production Survey is discussed, and the variances of the resulting estimators turn out to be significantly smaller than those obtained by simple random sampling.
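
The idea of forming strata by clustering and then checking the variance gain can be sketched as follows. This is a simplified, assumed setup rather than the paper's procedure: it clusters a single synthetic auxiliary variable with a small hand-written K-means (the paper selects several variables by factor analysis first), uses proportional allocation, and ignores the finite population correction. All names and numbers are hypothetical.

```python
import random
import statistics

def kmeans_1d(values, k, iterations=50):
    """Lloyd's algorithm on one variable; returns a stratum label for each value."""
    srt = sorted(values)
    # Deterministic initialisation: centres at evenly spaced quantiles.
    centres = [srt[(2 * j + 1) * len(srt) // (2 * k)] for j in range(k)]
    labels = [0] * len(values)
    for _ in range(iterations):
        labels = [min(range(k), key=lambda j: abs(v - centres[j])) for v in values]
        for j in range(k):
            members = [v for v, lab in zip(values, labels) if lab == j]
            if members:
                centres[j] = sum(members) / len(members)
    return labels

random.seed(0)
# Synthetic skewed auxiliary variable and a survey variable correlated with it.
aux = [random.lognormvariate(3.0, 1.0) for _ in range(2000)]
y = [2.0 * a + random.gauss(0.0, 5.0) for a in aux]

k = 4
labels = kmeans_1d(aux, k)

n = 200  # total sample size
# Variance of the sample mean under simple random sampling: S^2 / n.
var_srs = statistics.pvariance(y) / n
# Under proportional allocation: sum over strata of W_h * S_h^2 / n.
var_strat = 0.0
for h in range(k):
    y_h = [v for v, lab in zip(y, labels) if lab == h]
    if y_h:
        var_strat += (len(y_h) / len(y)) * statistics.pvariance(y_h) / n

print(f"SRS variance: {var_srs:.3f}, stratified variance: {var_strat:.3f}")
```

The stratified variance only counts within-stratum variation, so the more the clustering separates units with different values of the survey variable, the larger the gain over simple random sampling.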

Comparison of the oral microbial composition between healthy individuals and periodontitis patients in different oral sampling sites using 16S metagenome profiling

  • Kim, Yeon-Tae;Jeong, Jinuk;Mun, Seyoung;Yun, Kyeongeui;Han, Kyudong;Jeong, Seong-Nyum
    • Journal of Periodontal and Implant Science / v.52 no.5 / pp.394-410 / 2022
  • Purpose: The purpose of this study was to compare the microbial composition of 3 types of oral samples through 16S metagenomic sequencing to determine how to resolve some sampling issues that occur during the collection of sub-gingival plaque samples. Methods: In total, 20 subjects were recruited. In both the healthy and periodontitis groups, samples of saliva and supra-gingival plaque were collected. Additionally, in the periodontitis group, sub-gingival plaque samples were collected from the deepest periodontal pocket. After DNA extraction from each sample, polymerase chain reaction amplification was performed on the V3-V4 hypervariable region on the 16S rRNA gene, followed by metagenomic sequencing and a bioinformatics analysis. Results: When comparing the healthy and periodontitis groups in terms of alpha-diversity, the saliva samples demonstrated much more substantial differences in bacterial diversity than the supra-gingival plaque samples. Moreover, in a comparison between the samples in the case group, the diversity score of the saliva samples was higher than that of the supra-gingival plaque samples, and it was similar to that of the sub-gingival plaque samples. In the beta-diversity analysis, the sub-gingival plaque samples exhibited a clustering pattern similar to that of the periodontitis group. Bacterial relative abundance analysis at the species level indicated lower relative frequencies of bacteria in the healthy group than in the periodontitis group. A statistically significant difference in frequency was observed in the saliva samples for specific pathogenic species (Porphyromonas gingivalis, Treponema denticola, and Prevotella intermedia). The saliva samples exhibited a similar relative richness of bacterial communities to that of sub-gingival plaque samples. 
Conclusions: In this 16S oral microbiome study, we confirmed that saliva samples had a microbial composition that was more similar to that of sub-gingival plaque samples than to that of supra-gingival plaque samples within the periodontitis group.
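
The abstract does not name the diversity measures used. As a hedged illustration, the sketch below computes two common choices: the Shannon index for alpha-diversity (diversity within one sample) and the Bray-Curtis dissimilarity for beta-diversity (difference between two samples). The taxon counts are invented for the example and are not data from the study.

```python
import math

def shannon_index(counts):
    """Shannon diversity H = -sum(p_i * ln p_i) over taxa with non-zero counts."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)

def bray_curtis(counts_a, counts_b):
    """Bray-Curtis dissimilarity between two samples with the same taxon order."""
    num = sum(abs(a - b) for a, b in zip(counts_a, counts_b))
    den = sum(counts_a) + sum(counts_b)
    return num / den

# Hypothetical read counts for five taxa in three sample types.
saliva = [30, 25, 20, 15, 10]
supra_gingival = [70, 20, 5, 5, 0]
sub_gingival = [28, 22, 25, 15, 10]

for name, counts in [("saliva", saliva),
                     ("supra-gingival", supra_gingival),
                     ("sub-gingival", sub_gingival)]:
    print(f"{name}: Shannon index = {shannon_index(counts):.3f}")

print(f"saliva vs sub-gingival: {bray_curtis(saliva, sub_gingival):.3f}")
print(f"saliva vs supra-gingival: {bray_curtis(saliva, supra_gingival):.3f}")
```

Samples that cluster together in a beta-diversity analysis are those with small pairwise dissimilarities.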

Visual Model of Pattern Design Based on Deep Convolutional Neural Network

  • Jingjing Ye;Jun Wang
    • KSII Transactions on Internet and Information Systems (TIIS) / v.18 no.2 / pp.311-326 / 2024
  • The rapid development of neural network technology has helped big-data-driven neural network models overcome the texture effects of complex objects. Because of the limitations encountered in complex scenes, it is necessary to establish custom template matching and apply it to research in many fields of computer vision. Dependence on high-quality, small labeled sample databases is not very strong, and machine learning systems that rely on deep feature connections perform relatively poorly at inferring texture effects. The style transfer algorithm based on neural networks collects and preserves pattern data, then extracts and modernizes the pattern features. Through the algorithm model, it is easier to present the texture and color of patterns and to display them digitally. In this paper, following the texture effect reasoning of custom template matching, the 3D visualization of the target is transformed into a 3D model. The high similarity between the scene to be inferred and the user-defined template is calculated using the user-defined template of multi-dimensional external feature labels. A convolutional neural network is adopted to optimize the external area of the object and thereby improve the sampling quality and computational performance of the sample pyramid structure. The results indicate that the proposed algorithm can accurately capture the significant target, achieve more ablation noise, and improve the visualization results. The proposed deep convolutional neural network optimization algorithm is fast, accurate, and robust. It can adapt to more task scenes, display the redundant vision-related information of image conversion, enhance computing power, and further improve the computational efficiency and accuracy of convolutional networks, which is of high research significance for the study of image information conversion.

An Empirical Study of the Usage Performance of Mobile Emoticons : Applying to the Five Construct Model by Huang et al.

  • Lim, Se-Hun;Kim, Dae-Kil;Watts, Sean
    • Journal of Information Technology Applications and Management / v.18 no.4 / pp.21-40 / 2011
  • Emoticons play an important role as an enhancement to written communication in areas such as Windows Live Messenger instant messaging, e-mail, mobile Short Message Service (SMS), and others. Emoticons are graphic images used in communications to indicate the feelings of people exchanging messages via mobile technology. In this research, the perceived usefulness of emoticons in mobile phone text messages is verified with consumers using the five construct model of Huang et al. A K-means clustering technique is used to separate respondents into three groups by level of perceived usefulness of mobile emoticons, together with a structural equation model test using Smart PLS 2.0 and the bootstrap re-sampling procedure. We analyzed the relationships among use of emoticons, enjoyment, interaction, information richness, and perceived usefulness. The results show that there are relationships among use of emoticons, enjoyment, interaction, perceived usefulness, and information richness; however, enjoyment of emoticons alone did not significantly affect the perceived usefulness of messages with emoticons. The results suggest that emoticons have different effects on emotion in mobile and Messenger contexts. Our study did not consider more detailed media properties, and thus more studies are needed. Our research results contribute to the activation of mobile communication, provide companies with an understanding of the key characteristics of consumers who use emoticons, and provide useful implications for improving management and marketing strategies.

Menstrual Attitudes and Maternal Child Rearing Attitudes in Middle School Female Students (여중생의 월경태도와 어머니 양육태도)

  • Hong, Kyoung-Ja;Kim, Hae-Won;Ahn, Hye-Young
    • Journal of Korean Academy of Nursing / v.38 no.5 / pp.748-757 / 2008
  • Purpose: This correlational study was performed to identify the impact of maternal child rearing attitudes on menstrual attitudes and the determinants of positive menstrual attitudes in female middle school students. Methods: Using convenience sampling, 198 female middle school students living in one major city and its surrounding areas in Korea were recruited. Data were collected from April 1 to July 15, 2008, using a self-administered questionnaire covering menstrual attitudes and maternal child rearing attitudes. Results: Among the maternal child rearing attitudes, affectionate, achievement oriented, and rational attitudes were positively correlated with a positive menstrual attitude, and an autonomous attitude was negatively correlated with a negative menstrual attitude. Feeling at menarche, the mother's response at first menstruation, and rational maternal child rearing attitudes were identified as determinants of positive menstrual attitudes, and together they explained 18.5% of the variance in positive menstrual attitude. There was no difference in menstrual attitudes by K clustering in terms of maternal child rearing attitudes. Conclusion: These results support the critical role of the mother. The maternal child rearing attitudes that are especially desirable for a positive menstrual attitude in early adolescent girls are affectionate, achievement oriented, and rational ones. Further studies should consider menstruation-related education and research for early adolescents and the active involvement of mother and daughter together.

Complex sample design effects and inference for Korea National Health and Nutrition Examination Survey data (국민건강영양조사 자료의 복합표본설계효과와 통계적 추론)

  • Chung, Chin-Eun
    • Journal of Nutrition and Health / v.45 no.6 / pp.600-612 / 2012
  • Nutritional researchers worldwide are using large-scale sample survey methods to study nutritional health epidemiology and services utilization in general, non-clinical populations. This article reviews important statistical methods and software that apply to descriptive and multivariate analysis of data collected in sample surveys, such as the national health and nutrition examination survey. A comparative analysis of data from the Korea National Health and Nutrition Examination Survey (KNHANES) is used to illustrate analytical procedures and design effects for survey estimates of population statistics, model parameters, and test statistics. The article focuses on the following points: approaches to analyzing sample survey data, the software tools available to perform these analyses, and the correct survey analysis methods that are important to the interpretation of survey data. The latest developments in software tools for the analysis of complex sample survey data are covered, and empirical examples are presented that illustrate the impact of survey sample design effects on parameter estimates, test statistics, and significance probabilities (p values) for univariate and multivariate analyses.
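
To make the notion of a design effect concrete, the sketch below uses Kish's approximation for cluster sampling with equal cluster sizes, deff = 1 + (m - 1) * rho, where m is the number of respondents per cluster and rho is the intraclass correlation. It ignores stratification and unequal weights, which real analyses of complex survey data must also handle, and the numbers are hypothetical rather than KNHANES values.

```python
import math

def design_effect(cluster_size, icc):
    """Kish's approximate design effect for equal-sized clusters."""
    return 1.0 + (cluster_size - 1) * icc

def proportion_standard_error(p, n, deff=1.0):
    """Standard error of a proportion, inflated by the square root of the design effect."""
    return math.sqrt(deff * p * (1.0 - p) / n)

# Hypothetical survey: 8000 respondents, 20 per cluster, intraclass correlation 0.05.
n, m, rho = 8000, 20, 0.05
deff = design_effect(m, rho)
effective_n = n / deff

p = 0.30  # hypothetical estimated prevalence
se_naive = proportion_standard_error(p, n)
se_design = proportion_standard_error(p, n, deff)

print(f"design effect: {deff:.2f}, effective sample size: {effective_n:.0f}")
print(f"naive SE: {se_naive:.4f}, design-based SE: {se_design:.4f}")
```

Treating clustered data as a simple random sample understates the standard error, which in turn inflates test statistics and makes p values look smaller than they should be.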

Light Contribution Based Importance Sampling for the Many-Light Problem (다광원 문제를 위한 광원 기여도 기반의 중요도 샘플링)

  • Kim, Hyo-Won;Ki, Hyun-Woo;Oh, Kyoung-Su
    • Proceedings of the Korean Information Science Society Conference / 2008.06b / pp.240-245 / 2008
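  • In computer graphics, realistically rendering a scene that contains many light sources requires a large amount of lighting computation. The Monte Carlo method is one of the techniques commonly used to compute lighting from many light sources quickly. Building on the Monte Carlo method, this paper proposes a new importance sampling technique that can sample many light sources effectively. The technique rests on two key observations. First, even when a scene contains many light sources, often only a few of them strongly affect any particular region. Second, pixels with low spatial coherence, or pixels located at shadow boundaries, are affected by different dominant light sources. Based on these observations, the proposed technique evaluates how much each light source contributes to a particular region and determines a probability density function (PDF) in proportion to that contribution. To do so, it clusters pixels in image space and selects representative samples based on the cluster structure. The contributions of the light sources are evaluated from the selected representative samples, a per-cluster PDF is determined from them, and the final rendering is performed. Compared with conventional sampling methods, the proposed sampling technique produced better image quality with less noise at the same number of samples. The proposed technique can effectively render scenes with many lights, diverse materials, and complex occlusion.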
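
The core idea, sampling lights in proportion to their estimated contribution, can be sketched in a few lines. This is an assumed toy setup, not the paper's renderer: light contributions are plain numbers, one representative sample stands in for a pixel cluster, and a small uniform floor is mixed into the PDF so that every light keeps a non-zero probability. All names and constants are hypothetical.

```python
import random

def contribution_pdf(representative_contributions, floor=0.05):
    """Discrete PDF over lights, proportional to contributions at the representative sample.

    A fraction `floor` of the probability is spread uniformly so no light has zero probability.
    """
    total = sum(representative_contributions)
    n = len(representative_contributions)
    return [(1.0 - floor) * c / total + floor / n for c in representative_contributions]

def estimate_lighting(contributions, pdf, n_samples, rng):
    """Monte Carlo estimate of sum(contributions): average of f(i) / p(i) over sampled lights."""
    indices = rng.choices(range(len(contributions)), weights=pdf, k=n_samples)
    return sum(contributions[i] / pdf[i] for i in indices) / n_samples

rng = random.Random(1)
n_lights = 200
# Contributions at the cluster's representative sample: a few lights dominate.
representative = [rng.random() ** 8 for _ in range(n_lights)]
# Contributions at a nearby pixel in the same cluster: similar but not identical.
pixel = [c * rng.uniform(0.8, 1.2) for c in representative]
exact = sum(pixel)

uniform_pdf = [1.0 / n_lights] * n_lights
cluster_pdf = contribution_pdf(representative)

def rmse(pdf, trials=500, samples_per_pixel=16):
    """Root-mean-square error of the estimator over repeated trials."""
    errors = [(estimate_lighting(pixel, pdf, samples_per_pixel, rng) - exact) ** 2
              for _ in range(trials)]
    return (sum(errors) / trials) ** 0.5

rmse_uniform = rmse(uniform_pdf)
rmse_importance = rmse(cluster_pdf)
print(f"RMSE uniform: {rmse_uniform:.3f}, RMSE importance: {rmse_importance:.3f}")
```

Both estimators are unbiased, so the difference shows up only as noise: at the same sample count, the contribution-based PDF concentrates samples on the lights that matter for the cluster.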

Fauna and Relative Abundance of the Insects Collected by Black Light Traps in Gotjawal Terrains of Jeju Island, Korea (Exclusion of Lepidoptera)

  • Yang, Kyoung-Sik;Kim, Sang-Bum;Kim, Seong-Yoon;Jeong, Sang-Bae;Kim, Won-Taek
    • Journal of Ecology and Environment / v.29 no.2 / pp.85-103 / 2006
  • An investigation of the insect fauna and community in the Gotjawal terrains of Jeju-do was conducted using black light traps from July to September 2005. The insects collected were classified into 217 species, 75 families, and 11 orders. Coleoptera, comprising 120 species (55.3 percent of the total), was the richest group, followed by Hemiptera. The density of Physopelta gutta was the highest, but Physopelta cincticollis was the dominant species overall across all sampling areas. The species diversity index was highest at Jocheon-Hamdeog Gotjawal and lowest at Gujwa-Sungsan Gotjawal. Clustering analysis revealed that the insect communities of the four localities were grouped in a single cluster. Species not previously reported from Jeju Island included Menida musiva and Pentatoma japonica in Hemiptera, Philonthus wuesthoffi in Coleoptera, and Phanerotoma flava in Hymenoptera.