• Title/Summary/Keyword: Grouped data

A Study on Eating-out Behavior by Cluster Analysis according to The Lifestyle of Female Consumers in Seoul (서울시 여성 소비자의 라이프스타일에 따른 군집분석과 외식행동에 대한 연구)

  • Van, Ju-Won
    • Journal of the Korean Society of Food Culture / v.23 no.3 / pp.377-387 / 2008
  • The objective of this study was to use cluster analysis to determine differences in eating-out behavior among clusters of female consumers grouped by lifestyle pattern. The data were collected by interview survey from a biased sample of 1,300 females, aged 20 to 59, living in residential districts of Seoul. Reliability analysis, factor analysis, cluster analysis, cross-tabulation analysis, and analysis of variance (ANOVA) were applied to the data. Four lifestyle factors were extracted by lower-division and classified as follows: health condition, consuming, food, and housing lifestyles. Based on these four factors, the female consumers were grouped into three clusters: the consuming-individuality type, rational-pursuit type, and conservative-stability type. The eating-out behavior of the clusters differed significantly in terms of frequency of eating out, eating-out expenditures, restaurant selection criteria, food preferences, and the purpose of eating out. Since this study surveyed females aged 20 to 59, age and demographics were the differential factors in determining the various lifestyle types. Thus, to reach the consumers who form a target market, the food industry should consider market segmentation that combines demographic factors such as age, income, and marital status.

Microarray data analysis using relative hierarchical clustering (상대적 계층적 군집 방법을 이용한 마이크로어레이 자료의 군집분석)

  • Woo, Sook Young;Lee, Jae Won;Jhun, Myoungshic
    • Journal of the Korean Data and Information Science Society / v.25 no.5 / pp.999-1009 / 2014
  • Hierarchical clustering analysis makes it easy to explore massive microarray data and to understand biological phenomena through dendrograms. However, because hierarchical clustering algorithms consider only absolute similarity, they have difficulty capturing relative dissimilarity, which considers not only the distance between a pair of clusters but also how distant they are from the rest of the clusters. In this study, we introduce the relative hierarchical clustering method proposed by Mollineda and Vidal (2000) and compare the standard hierarchical clustering method with the relative hierarchical method using simulated and real data in various situations. The quality of the two methods was evaluated using the percentage of incorrectly grouped points (PIGP), homogeneity, and separation.

Estimation of the Parameters of the New Generalized Weibull Distribution

  • Zaindin, M.
    • International Journal of Reliability and Applications / v.11 no.1 / pp.23-40 / 2010
  • Recently, Zaindin and Sarhan (2009) introduced a new distribution named the new generalized Weibull distribution. This paper deals with the problem of estimating the parameters of this distribution when the data are grouped and censored. We use both maximum likelihood and Bayes techniques. The results obtained are illustrated on a set of real data.
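
To make the grouped and censored setting concrete, the following is a minimal sketch of a maximum likelihood fit when only interval counts and a right-censored count are observed. It is not the paper's method: the ordinary two-parameter Weibull is used as a stand-in for the new generalized Weibull distribution, the data are hypothetical, and the crude grid search is an assumption chosen only to keep the example self-contained.

```python
import math

def weibull_cdf(t, shape, scale):
    """CDF of the ordinary two-parameter Weibull (a stand-in distribution)."""
    if t <= 0:
        return 0.0
    return 1.0 - math.exp(-((t / scale) ** shape))

def grouped_censored_loglik(boundaries, counts, n_censored, shape, scale):
    """Log-likelihood for grouped data with right censoring.

    counts[j] units failed in the interval (boundaries[j-1], boundaries[j]],
    with the first interval starting at 0; n_censored units were still
    alive at the last boundary.
    """
    loglik = 0.0
    lower = 0.0
    for upper, n in zip(boundaries, counts):
        p = weibull_cdf(upper, shape, scale) - weibull_cdf(lower, shape, scale)
        if p <= 0.0:
            return float("-inf")  # parameters give this interval no mass
        loglik += n * math.log(p)
        lower = upper
    survival = 1.0 - weibull_cdf(boundaries[-1], shape, scale)
    if survival <= 0.0:
        return float("-inf")
    return loglik + n_censored * math.log(survival)

# Hypothetical data: interval end points, failures per interval, survivors.
boundaries = [1.0, 2.0, 3.0]
counts = [5, 8, 4]
n_censored = 3

# Crude grid search over (shape, scale); a real analysis would use an optimizer.
best_loglik, best_shape, best_scale = max(
    (grouped_censored_loglik(boundaries, counts, n_censored, k / 10, s / 10),
     k / 10, s / 10)
    for k in range(5, 41)
    for s in range(5, 61)
)
print(best_shape, best_scale, best_loglik)
```

Each interval contributes the probability mass the model assigns to it, raised to the number of failures observed there, and the censored units contribute the survival probability at the last boundary.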

A Typology of Urban Married Women's Leisure Activities (도시기혼여성의 여가 활동유형)

  • 김외숙;이기춘
    • Journal of Families and Better Life / v.10 no.2 / pp.61-74 / 1992
  • The purpose of this study is to identify a typology of urban married women's leisure activities based on participation data. The survey was conducted by means of interviews with 606 married women in Seoul. The survey instrument was a questionnaire including a leisure participation scale. Data were analysed by means of frequency, percentage, arithmetic mean, standard deviation, and factor analysis, using the SPSS-X and SPSS/PC+ programs. The result was that the leisure activities of urban married women could be grouped into 5 factors: self-developing, family-oriented, religious-social, sociable, and time-spending activities. Several proposals for further research are suggested.

Design an Indexing Structure System Based on Apache Hadoop in Wireless Sensor Network

  • Keo, Kongkea;Chung, Yeongjee
    • Proceedings of the Korea Information Processing Society Conference / 2013.05a / pp.45-48 / 2013
  • In this paper, we propose an Indexing Structure System (ISS) based on Apache Hadoop for Wireless Sensor Networks (WSN). Sensor data keep growing and need to be managed, and they are constantly updated to provide the newest information to users. As the data grow, retrieving and storing them face challenges. By using the ISS, we can maximize processing quality and minimize data retrieval time. To design the ISS, Indexing Types have to be defined depending on each sensor type. After identification, each sensor goes through Indexing Structure Processing (ISP) in order to be indexed. After ISP, the indexed data are streamed and stored in the Hadoop Distributed File System (HDFS) across a number of separate machines. The indexed data are split and processed by MapReduce tasks, and are sorted and grouped depending on sensor data object categories. Thus, when users send requests, all queries are filtered by sensor data object and the tasks are managed by the MapReduce processing framework.
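
The sort-and-group step described in this abstract can be illustrated with a small in-memory simulation of the map, shuffle, and reduce phases. This is only a sketch of the general MapReduce pattern in plain Python, not the Hadoop API and not the authors' ISS; the sensor records, field names, and function names are all hypothetical.

```python
from collections import defaultdict

# Hypothetical indexed sensor records.
readings = [
    {"sensor_id": "t-01", "category": "temperature", "value": 20.0},
    {"sensor_id": "h-01", "category": "humidity", "value": 40.0},
    {"sensor_id": "t-02", "category": "temperature", "value": 22.0},
    {"sensor_id": "h-02", "category": "humidity", "value": 50.0},
    {"sensor_id": "p-01", "category": "pressure", "value": 1000.0},
]

def map_phase(records):
    """Emit one (category, value) pair per record."""
    for record in records:
        yield record["category"], record["value"]

def shuffle_phase(pairs):
    """Group values by key and sort the keys, as the framework would."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return dict(sorted(groups.items()))

def reduce_phase(groups):
    """Summarise each category with a count and a mean."""
    return {
        key: {"count": len(values), "mean": sum(values) / len(values)}
        for key, values in groups.items()
    }

summary = reduce_phase(shuffle_phase(map_phase(readings)))
print(summary)
```

A query for one sensor category then only needs to read the group stored under that key rather than scan every record.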

Hierarchically penalized support vector machine for the classification of imbalanced data with grouped variables (그룹변수를 포함하는 불균형 자료의 분류분석을 위한 서포트 벡터 머신)

  • Kim, Eunkyung;Jhun, Myoungshic;Bang, Sungwan
    • The Korean Journal of Applied Statistics / v.29 no.5 / pp.961-975 / 2016
  • The hierarchically penalized support vector machine (H-SVM) has been developed to perform simultaneous classification and input variable selection when input variables are naturally grouped or generated by factors. However, the H-SVM may suffer from estimation inefficiency because it applies the same amount of shrinkage to each variable without assessing its relative importance. In addition, when analyzing imbalanced data with uneven class sizes, the classification accuracy of the H-SVM may drop significantly in predicting the minority class because its classifiers are undesirably biased toward the majority class. To remedy these problems, we propose the weighted adaptive H-SVM (WAH-SVM) method, which uses adaptive tuning parameters to improve the performance of variable selection and weights to differentiate the misclassification of data points between classes. Numerical results are presented to demonstrate the competitive performance of the proposed WAH-SVM over existing SVM methods.

Characteristics of Source Acupoints: Data Mining of Clinical Trials Database (데이터 마이닝을 이용한 임상연구 데이터베이스 기반 원혈의 주치 특성)

  • Choi, Dha-Hyun;Lee, Seoyoung;Lee, In-Seon;Ryu, Yeonhee;Chae, Younbyoung
    • Korean Journal of Acupuncture / v.38 no.2 / pp.100-109 / 2021
  • Objectives: The Source acupoint is one of the representative acupoints used to treat various diseases in each meridian. We aimed to identify the patterns of selection of Source acupoints and their associations with diseases using clinical trials data. Methods: We extracted the frequency of Source acupoints across 30 diseases from a clinical trials database. Acupuncture treatment regimens were retrieved from the Cochrane Database of Systematic Reviews. The frequency of Source acupoint use was calculated as the number of studies using a certain acupoint divided by the total number of included studies. Using hierarchical clustering and multidimensional scaling, the characteristics of Source acupoints were analyzed based on the similarity of the relationships between the Source acupoints and the diseases. Results: A total of 421 clinical trials were included in this analysis. The LR3, HT7, KI3, and LI4 acupoints were most frequently used for the treatment of the 30 diseases. Cluster analysis showed that the LR3 and LI4 acupoints were grouped together and the HT7 and KI3 acupoints were grouped together. Multidimensional scaling revealed that the LR3, LI4, HT7, and KI3 acupoints have intrinsic properties in the two-dimensional space. Conclusions: The present study identified the selection patterns of the Source acupoints using clinical trials data. Our findings provide an understanding of the characteristics of Source acupoints.
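
The usage frequency defined in the Methods (studies using an acupoint divided by all included studies) can be sketched in a few lines. The trial records below are hypothetical and are not taken from the study's database; the function name is likewise only illustrative.

```python
from collections import Counter

# Hypothetical trials: each entry lists the acupoints one study used.
trials = [
    ["LR3", "LI4"],
    ["HT7", "KI3"],
    ["LR3", "LI4", "HT7"],
    ["LR3", "LR3"],  # a repeated mention still counts the study once
]

def usage_frequency(trials):
    """Fraction of studies that used each acupoint at least once."""
    counts = Counter()
    for points in trials:
        counts.update(set(points))  # de-duplicate within a study
    total = len(trials)
    return {point: count / total for point, count in counts.items()}

freq = usage_frequency(trials)
for point in sorted(freq):
    print(point, freq[point])
```

Computing one such frequency vector per disease gives the acupoint-by-disease table on which clustering or multidimensional scaling could then operate.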

Diversity of the genus Sheathia (Batrachospermales, Rhodophyta) in northeast India and east Nepal

  • Necchi, Orlando Jr.;West, John A.;Ganesan, E.K.;Yasmin, Farishta;Rai, Shiva Kumar;Rossignolo, Natalia L.
    • ALGAE / v.34 no.4 / pp.277-288 / 2019
  • Freshwater red algae of the order Batrachospermales are poorly studied in India and Nepal, especially on a molecular basis. During a survey in northeast India and east Nepal, six populations of the genus Sheathia were found and analyzed using molecular and morphological evidence. Phylogenetic analyses based on the rbcL gene sequences grouped all populations in a large clade including our S. arcuata specimens and others from several regions. Sheathia arcuata represents a species complex with a high sequence divergence and several smaller clades. Samples from India and Nepal were grouped in three distinct clades with high support, representing new cryptic species: a clade formed by two samples from India, which was named Sheathia assamica sp. nov.; one sample from India and one from Nepal formed another clade, named Sheathia indonepalensis sp. nov.; two samples from Nepal grouped with sequences from Hawaii and Indonesia (only 'Chantransia' stages) and gametophytes from Taiwan, named Sheathia dispersa sp. nov. Morphological characters of the specimens from these three species overlap one another and with the general circumscription of S. arcuata, which lacks the heterocortication (presence of bulbous cells in the cortical filaments) present in other species of the genus Sheathia. Although the region sampled is relatively restricted, the genetic diversity among specimens of these three groups was high, and the groups were not closely related phylogenetically to the other clades of S. arcuata. These data corroborate information from other groups of organisms (e.g., land and aquatic plants) that indicates this region (Eastern Himalaya) as a hotspot of biodiversity.

Investigation of ground behaviour between plane-strain grouped pile and 2-arch tunnel station excavation (2-arch 터널 정거장 굴착 시 평면변형률 조건에서 군말뚝의 이격거리에 따른 지반거동 분석)

  • Kong, Suk-Min;Oh, Dong-Wook;Ahn, Ho-Yeon;Lee, Hyun-Gu;Lee, Yong-Joo
    • Journal of Korean Tunnelling and Underground Space Association / v.18 no.6 / pp.535-544 / 2016
  • Special tunnel design and construction methods have been suggested as subways and tunnels have developed. Tunnel collapse accidents cause enormous damage, so observing and analysing the safety of tunnelling and the behaviour of the surrounding ground is important. However, it is not economical to carry out a field test every time. Therefore, this study measured ground behaviour due to the excavation of a 2-arch tunnel station, according to the offset between a grouped pile and the tunnel, by means of a laboratory model test. For the model test, a trapdoor device was adopted. Tunnelling was simulated by volume loss of the 2-arch tunnel. Ground displacements were observed by close-range photogrammetry and image processing. In addition, these data were compared with numerical analysis.

A small review and further studies on the LASSO

  • Kwon, Sunghoon;Han, Sangmi;Lee, Sangin
    • Journal of the Korean Data and Information Science Society / v.24 no.5 / pp.1077-1088 / 2013
  • High-dimensional data analysis arises in almost all scientific areas, evolving with the development of computing, and has encouraged penalized estimation methods that play important roles in statistical learning. In past years, various penalized estimation methods have been developed, and the least absolute shrinkage and selection operator (LASSO) proposed by Tibshirani (1996) has shown outstanding ability, earning a leading place in the development of penalized estimation. In this paper, we first introduce a number of recent advances in high-dimensional data analysis using the LASSO. The topics include various statistical problems such as variable selection and grouped or structured variable selection under sparse high-dimensional linear regression models. Several unsupervised learning methods, including inverse covariance matrix estimation, are presented. In addition, we address further studies on new applications, which may establish a guideline on how to use the LASSO for the statistical challenges of high-dimensional data analysis.
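
As a rough illustration of how the LASSO performs variable selection, the sketch below fits a linear model by cyclic coordinate descent with soft-thresholding, minimising (1/(2n))·||y − Xb||² + λ·||b||₁. This is one common textbook algorithm and is not claimed to be the one reviewed in the paper; the data, the penalty value, and the function names are hypothetical.

```python
def soft_threshold(rho, lam):
    """Shrink rho toward zero by lam; values within [-lam, lam] become 0."""
    if rho > lam:
        return rho - lam
    if rho < -lam:
        return rho + lam
    return 0.0

def lasso_coordinate_descent(X, y, lam, n_sweeps=100):
    """Minimise (1/(2n)) * ||y - X b||^2 + lam * ||b||_1 by cyclic updates."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    for _ in range(n_sweeps):
        for j in range(p):
            # Correlation of column j with the residual that excludes column j.
            rho = 0.0
            for i in range(n):
                pred = sum(X[i][k] * beta[k] for k in range(p))
                rho += X[i][j] * (y[i] - pred + X[i][j] * beta[j])
            rho /= n
            z = sum(X[i][j] ** 2 for i in range(n)) / n
            beta[j] = soft_threshold(rho, lam) / z
    return beta

# Hypothetical data: the first predictor matters, the second barely does.
X = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0], [0.0, 1.0]]
y = [2.0, 0.1, 2.0, 0.1]

beta = lasso_coordinate_descent(X, y, lam=0.1)
print(beta)
```

The weak second coefficient is set exactly to zero while the first is shrunk but retained, which is the selection behaviour that distinguishes the L1 penalty from ridge-type shrinkage.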