• Title/Summary/Keyword: Data Group


Group Search Optimization Data Clustering Using Silhouette (실루엣을 적용한 그룹탐색 최적화 데이터클러스터링)

  • Kim, Sung-Soo;Baek, Jun-Young;Kang, Bum-Soo
    • Journal of the Korean Operations Research and Management Science Society / v.42 no.3 / pp.25-34 / 2017
  • K-means is a popular and efficient data clustering method that only uses intra-cluster distance to establish a validity index with a previously fixed number of clusters. K-means is useless without a suitable number of clusters for unsupervised data. This paper aims to propose Group Search Optimization (GSO) using Silhouette to find the optimal data clustering solution together with the number of clusters for unsupervised data. Silhouette can be used as a validity index to decide the number of clusters and the optimal solution by simultaneously considering intra- and inter-cluster distances. The performance of GSO using Silhouette is validated through several experiments and analyses of data sets.
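To make the validity index concrete, the following is a minimal sketch of the mean silhouette coefficient, which combines the intra-cluster distance a and the nearest inter-cluster distance b for each point. It is not the paper's GSO procedure; the function name `silhouette_score`, the Euclidean distance, and the toy points are assumptions chosen for illustration.

```python
import math

def silhouette_score(points, labels):
    """Mean silhouette coefficient over all points (Euclidean distance assumed)."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")
    total = 0.0
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            continue  # common convention: s(i) = 0 for a singleton cluster
        # a: mean distance to the other members of the point's own cluster
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster
        b = min(sum(math.dist(p, q) for q in other) / len(other)
                for key, other in clusters.items() if key != lab)
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / len(points)

# Toy data (made up): two well-separated pairs of points.
points = [(0, 0), (0, 1), (10, 0), (10, 1)]
good = silhouette_score(points, [0, 0, 1, 1])  # labels follow the true pairs
bad = silhouette_score(points, [0, 1, 0, 1])   # labels cut across the pairs
```

A search over candidate partitions, whether by GSO or by any other optimizer, could use such a score as its objective, preferring the labeling with the higher value.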

Data Mining Approach for Real-Time Processing of Large Data Using Case-Based Reasoning : High-Risk Group Detection Data Warehouse for Patients with High Blood Pressure (사례기반추론을 이용한 대용량 데이터의 실시간 처리 방법론 : 고혈압 고위험군 관리를 위한 자기학습 시스템 프레임워크)

  • Park, Sung-Hyuk;Yang, Kun-Woo
    • Journal of Information Technology Services / v.10 no.1 / pp.135-149 / 2011
  • In this paper, we propose a high-risk group detection model for patients with high blood pressure using case-based reasoning. The proposed model can be applied by public health maintenance organizations to effectively manage knowledge related to high blood pressure and efficiently allocate limited health care resources. In particular, the focus is on developing a model that can handle constraints such as managing a large volume of data, enabling automatic learning to adapt to external environmental changes, and operating the system on a real-time basis. Using real data collected from local public health centers, the optimal high-risk group detection model was derived incorporating optimal parameter sets. The results of the performance test on test data show that the prediction accuracy of the proposed model is two times better than the natural risk of high blood pressure.
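As a rough illustration of the case-based reasoning cycle (retrieve similar cases, reuse their outcome, retain the new case), the sketch below uses nearest-neighbor retrieval with a majority vote. It is not the model from the paper: the feature layout (systolic, diastolic, age), the values, and the function names are all hypothetical.

```python
import math

def predict_high_risk(case_base, query, k=3):
    """Retrieve the k most similar stored cases and reuse their majority outcome."""
    nearest = sorted(case_base, key=lambda case: math.dist(case[0], query))[:k]
    votes = sum(1 for _, high_risk in nearest if high_risk)
    return votes * 2 > len(nearest)

def retain(case_base, features, high_risk):
    """Store a confirmed case so later queries can learn from it."""
    case_base.append((features, high_risk))

# Hypothetical cases: (systolic, diastolic, age) -> high-risk flag. Values are made up.
case_base = [
    ((120, 80, 30), False), ((118, 76, 25), False), ((125, 82, 40), False),
    ((160, 100, 60), True), ((155, 95, 58), True), ((165, 105, 65), True),
]
flag_a = predict_high_risk(case_base, (158, 98, 61))
flag_b = predict_high_risk(case_base, (119, 78, 28))
retain(case_base, (158, 98, 61), flag_a)
```

Retaining each resolved case is what lets such a system adapt over time without retraining, at the cost of a growing case base that a real-time deployment would need to index or prune.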

Pre-Adjustment of Incomplete Group Variable via K-Means Clustering

  • Hwang, S.Y.;Hahn, H.E.
    • Journal of the Korean Data and Information Science Society / v.15 no.3 / pp.555-563 / 2004
  • In classification and discrimination, we often face an incomplete group variable, typically arising from many missing values and/or unreliable cases. This paper suggests using K-means clustering to pre-adjust the incompleteness, after which classification based on generalized statistical distance is performed. To illustrate the proposed procedure, a simulation study compares it with CART in data mining and with traditional techniques that ignore the incompleteness of the group variable. The simulation study shows that our methodology outperforms these alternatives.


A Comparative Analysis on Korean Medical and Western Medical Service Usage Tendency of Rotator Cuff Surgery Patients - Using HIRA's Patients Sample Data

  • Khang, Hyun-jin;Lee, Hye-Yoon;Lee, Se-Yeon;Kim, NamKwen;Song, YunKyung
    • The Journal of Korean Medicine / v.42 no.4 / pp.133-149 / 2021
  • Objectives: To lay the foundation for future research into Korean Medicine treatment for rotator cuff repair surgery patients by analyzing Korean Medical and Western Medical service utilization and treatment duration. Methods: Data sampling was performed on 2015's HIRA patient data (confidence level of 97%) to analyze patients' Korean Medical and Western Medical service usage tendency. Sampled patients were divided into two groups: i) patients who completed their treatment within five months after the rotator cuff surgery (termination group), and ii) patients who were treated for more than five months after the surgery (continuation group). The patients' Korean Medical and Western Medical service usage tendency was then investigated and information on these patients was organized. Results: Out of 1,453,486 patients who were gathered for sampling, 2,461 patients in total had gone through rotator cuff repair surgery. The termination group had 517 patients and the continuation group had 541 patients. The proportion of patients who visited a Korean Medicine clinic was lower in the termination group than in the continuation group. Conclusion: The continuation group received more treatments (both in Western Medicine and Korean Medicine) and spent more on medical expenses than the termination group. Further research is highly recommended for more efficient Western Medicine and Korean Medicine treatments and reduced medical expenditure.

A Study on Region-based Secure Multicast in Mobile Ad-hoc Network (Mobile Ad-hoc Network에서 영역기반 보안 멀티캐스트 기법 연구)

  • Yang, Hwanseok
    • Journal of Korea Society of Digital Industry and Information Management / v.12 no.3 / pp.75-85 / 2016
  • A MANET is a network composed only of mobile nodes with limited resources and has dynamic topology characteristics. Therefore, every mobile node acts as a router and delivers data using a multi-hop method. In particular, group communication such as multicast is strongly needed because of constraints such as limited wireless bandwidth and the battery life of mobile nodes. However, the efficiency of multicast data transmission can vary with how the virtual topology is configured as the nodes move, and multicast performance can be significantly degraded. In this paper, a region-based secure multicast technique is proposed to increase the efficiency of data transmission by maintaining an optimal path and to enhance security in data transmission. The whole network is divided into areas for efficient management of multicast member nodes, and a group management node manages the state information of the member nodes. A member node encrypts data with its member key for secure transmission, and security is strengthened by having the group management node encrypt the data with the group key before sending it. The superiority of the proposed technique was confirmed through experiments.

Finding associations between genes by time-series microarray sequential patterns analysis

  • Nam, Ho-Jung;Lee, Do-Heon
    • Proceedings of the Korean Society for Bioinformatics Conference / 2005.09a / pp.161-164 / 2005
  • Data mining techniques can be applied to identify patterns of interest in gene expression data. One goal in mining gene expression data is to determine how the expression of any particular gene might affect the expression of other genes. To find relationships between different genes, association rules have been applied to gene expression data sets [1]. A notable limitation of association rule mining is that only associations within a single profile experiment can be detected. It cannot be used to find rules across different condition profiles or different time point profile experiments. However, with the appearance of time-series microarray data, it became possible to analyze the temporal relationship between genes. In this paper, we analyze time-series microarray gene expression data to extract sequential patterns, which are similar to association rules between genes across different time points in the yeast cell cycle. The sequential patterns found in our work can capture associations between different genes that are expressed or repressed at different time points. We applied a sequential pattern mining method to time-series microarray gene expression data and discovered a number of sequential patterns from two groups of genes (test, control); more sequential patterns were discovered from the test group (same GO term group) than from the control group (different GO term group). This result supports the potential of sequential patterns to capture biologically meaningful associations between genes.
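The difference from single-profile association rules is that a sequential pattern must occur in time order. A minimal sketch of the standard containment and support definitions follows; the event encoding ("A+" for gene A expressed, "C-" for gene C repressed) and the toy database are hypothetical and are not taken from the paper.

```python
def contains(sequence, pattern):
    """True if the pattern's itemsets occur, in order, as subsets of the sequence's itemsets."""
    pos = 0
    for itemset in sequence:
        # Greedy matching: take the earliest itemset that covers the next pattern element.
        if pos < len(pattern) and pattern[pos] <= itemset:
            pos += 1
    return pos == len(pattern)

def support(database, pattern):
    """Fraction of sequences in the database that contain the pattern."""
    return sum(contains(seq, pattern) for seq in database) / len(database)

# Each sequence lists the events observed at consecutive time points (made-up data).
database = [
    [{"A+"}, {"B+"}, {"C-"}],
    [{"A+", "D+"}, {"C-"}, {"B+"}],
    [{"B+"}, {"A+"}],
    [{"A+"}, {"D+"}, {"B+", "C-"}],
]
a_then_b = support(database, [{"A+"}, {"B+"}])
b_then_a = support(database, [{"B+"}, {"A+"}])
```

Here "A+ followed by B+" is supported by three of the four sequences, while the reversed order is supported by only one, which is the kind of temporal asymmetry an unordered association rule cannot express.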


Shopping Propensity for Clothes Consumers in the Internet Shopping Mall - Focused on university students in Busan -

  • Jhun, Mi-Ran;Beum, Su-Gyun;Choi, Seung-Bae
    • Journal of the Korean Data and Information Science Society / v.19 no.3 / pp.775-788 / 2008
  • In this study, consumers were classified by shopping propensity into a Risk Avoidance Group, an Enjoyment and Economy Group, a Low Involving Group, and a High Involving Group, and the differences in their evaluation criteria and purchasing intention were investigated. The sample consists of 214 university students in the Busan area who had visited an internet shopping mall or had bought clothes over the internet. The purpose of this study is to identify how university students' internet clothes shopping propensity and demographic factors affect the evaluation criteria and purchasing intention for clothes. Based on the results, implications for internet clothes shopping malls are suggested and subjects for future research are provided.


A Study on Testing Tools for Hierarchical Cooperative Analysis in Cloud of Things Environment (CoT(Cloud of Things)환경에서 계층적 협업 분석 SW의 시험 검증 사례 연구)

  • Lee, Yong-Ju;Park, Hwin Dol;Choi, Chang-Ho;Park, JunYong;Min, Ok-Gee
    • Proceedings of the Korea Information Processing Society Conference / 2017.04a / pp.1183-1184 / 2017
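  • With the spread of lightweight systems such as the Raspberry Pi, research is moving from conventional server-centric analysis toward hierarchical cooperative analysis, in which lightweight/instantaneous analysis runs on the lightweight systems and cumulative/heavyweight analysis runs on servers. This paper presents a case study on the test and verification of hierarchical cooperative analysis software implemented to logically group IoT Things, as in a Cloud of Things, and thereby ease the collection and processing of sensor data.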

A Resetting Scheme for Process Parameters using the Mahalanobis-Taguchi System

  • Park, Chang-Soon
    • The Korean Journal of Applied Statistics / v.25 no.4 / pp.589-603 / 2012
  • The Mahalanobis-Taguchi system (MTS) is a statistical tool for classifying the normal group and the abnormal group in multivariate data structures. In addition to the classification itself, the MTS uses a method for selecting variables useful for the classification. This method can be used efficiently, especially when the abnormal group data are scattered without a specific directionality. When a feedback adjustment procedure that controls process input variables through measurements of the process output is not practically possible, a reset procedure can be an alternative. This article proposes a reset procedure using the MTS. Moreover, a method for identifying the input variables to reset is proposed based on the contribution. Identifying the root-cause parameters using the existing dimension-reduced contribution tends to be difficult due to the variety of correlation relationships in multivariate data structures. However, an improved decision becomes possible when it is used together with the location-centered contribution and the individual-parameter contribution.
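The classification step rests on the Mahalanobis distance of an observation from the normal group. The sketch below computes the squared distance for two variables only, using the closed-form inverse of a 2x2 sample covariance matrix; it does not cover the variable selection or the reset procedure from the article, and the function name and data are made up for illustration.

```python
def mahalanobis_sq_2d(normal_group, x):
    """Squared Mahalanobis distance of x from the normal group (two variables)."""
    n = len(normal_group)
    m0 = sum(p[0] for p in normal_group) / n
    m1 = sum(p[1] for p in normal_group) / n
    # Sample covariance matrix [[s00, s01], [s01, s11]] of the normal group
    s00 = sum((p[0] - m0) ** 2 for p in normal_group) / (n - 1)
    s11 = sum((p[1] - m1) ** 2 for p in normal_group) / (n - 1)
    s01 = sum((p[0] - m0) * (p[1] - m1) for p in normal_group) / (n - 1)
    det = s00 * s11 - s01 ** 2
    d0, d1 = x[0] - m0, x[1] - m1
    # d^T S^{-1} d, with S^{-1} = (1/det) * [[s11, -s01], [-s01, s00]]
    return (s11 * d0 * d0 - 2 * s01 * d0 * d1 + s00 * d1 * d1) / det

# Made-up normal group with positively correlated variables; its mean is (3, 4).
normal_group = [(1, 2), (2, 3), (3, 5), (4, 4), (5, 6)]
along_trend = mahalanobis_sq_2d(normal_group, (5, 6))    # follows the correlation
against_trend = mahalanobis_sq_2d(normal_group, (5, 2))  # violates the correlation
```

Both test points are equally far from the mean in Euclidean terms, but the one that breaks the correlation pattern gets a much larger Mahalanobis distance, which is why the measure suits abnormal data scattered without a specific direction.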

MP-Lasso chart: a multi-level polar chart for visualizing group Lasso analysis of genomic data

  • Min Song;Minhyuk Lee;Taesung Park;Mira Park
    • Genomics & Informatics / v.20 no.4 / pp.48.1-48.7 / 2022
  • Penalized regression has been widely used in genome-wide association studies for joint analyses to find genetic associations. Among penalized regression models, the least absolute shrinkage and selection operator (Lasso) method effectively removes some coefficients from the model by shrinking them to zero. To handle group structures, such as genes and pathways, several modified Lasso penalties have been proposed, including group Lasso and sparse group Lasso. Group Lasso ensures sparsity at the level of pre-defined groups, eliminating unimportant groups. Sparse group Lasso performs group selection as in group Lasso, but also performs individual selection as in Lasso. While these sparse methods are useful in high-dimensional genetic studies, interpreting the results with many groups and coefficients is not straightforward. Lasso's results are often expressed as trace plots of regression coefficients. However, few studies have explored the systematic visualization of group information. In this study, we propose a multi-level polar Lasso (MP-Lasso) chart, which can effectively represent the results from group Lasso and sparse group Lasso analyses. An R package to draw MP-Lasso charts was developed. Through a real-world genetic data application, we demonstrated that our MP-Lasso chart package effectively visualizes the results of Lasso, group Lasso, and sparse group Lasso.
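The three kinds of sparsity described in the abstract can be seen in the shrinkage (proximal) operators of the penalties. This is a minimal Python sketch under simplifying assumptions: unit group weights, with no square-root-of-group-size factor, and hypothetical function names. It is unrelated to the authors' R package.

```python
import math

def soft_threshold(b, lam):
    """Lasso: shrink one coefficient toward zero; exactly zero when |b| <= lam."""
    return math.copysign(max(abs(b) - lam, 0.0), b)

def group_shrink(group, lam):
    """Group Lasso: scale a whole group; the group vanishes when its norm <= lam."""
    norm = math.sqrt(sum(b * b for b in group))
    if norm <= lam:
        return [0.0] * len(group)
    scale = 1.0 - lam / norm
    return [scale * b for b in group]

def sparse_group_shrink(group, lam1, lam2):
    """Sparse group Lasso: individual soft-thresholding, then group-level shrinkage."""
    return group_shrink([soft_threshold(b, lam1) for b in group], lam2)

kept = group_shrink([3.0, 4.0], 1.0)              # norm 5 > 1: group kept, shrunk
dropped = group_shrink([0.3, 0.4], 1.0)           # norm 0.5 <= 1: whole group removed
mixed = sparse_group_shrink([3.0, 0.2], 0.5, 1.0) # group kept, small member zeroed
```

In the last call the group survives while its small member is set to zero, which is the combined group-level and individual-level selection that distinguishes sparse group Lasso from the other two penalties.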