• Title/Summary/Keyword: categorical data analysis

RESEARCH ON THE DEVELOPMENT OF COLLEGE STUDENT EDUCATION BASED ON MACHINE LEARNING - TAKE THE PHYSICAL EDUCATION OF YANBIAN UNIVERSITY AS AN EXAMPLE

  • Quan, Yu;Guo, Wei-Jie;He, Lin;Jin, Zhe-Zhi
    • East Asian mathematical journal / v.38 no.1 / pp.65-84 / 2022
  • This paper uses statistical analysis of Yanbian University's physical test data to study the relationships among college students' physical test scores, with the goal of promoting college physical education. First, using gender as a categorical variable, we conduct a general analysis of students in different majors and grades and identify the strengths and weaknesses of male and female college students. We then use Decision Tree and Random Forest algorithms for modeling analysis to provide useful suggestions for the relevant university departments. The aim of this analysis of undergraduates' physical test results is to give universities targeted suggestions for improving the college graduation rate, promoting the overall development of higher education, and laying the foundation for achieving universal health.

The Marginal Model for Categorical Data Analysis of $3\times3$ Cross-Trials ($3\times3$ 교차실험을 범주형 자료 분석을 위한 주변확률모형)

  • 안주선
    • The Korean Journal of Applied Statistics / v.14 no.1 / pp.25-37 / 2001
  • A marginal model is proposed for the analysis of data with $c\,(\geq 3)$ response categories in $3\times3$ cross-over trials with three periods and three treatments. The model can serve as a counterpart to the Kenward-Jones joint probability model and generalizes the univariate marginal logits model of Balagtas et al., used to analyze treatment effects in $3\times3$ cross-over trials with binary response variables [Kenward and Jones (1991), Balagtas et al. (1995)]. The model equations for the marginal probabilities are constructed using three types of link functions. Methods for constructing the link function matrices and the model matrices are given, and the estimation of parameters is discussed. The proposed model is applied to the analysis of Kenward and Jones's data.

Information Theory and Data Visualization Approach to Poll Analysis (정보이론과 시각화 방법에 의한 여론조사 분석의 새로운 접근방법)

  • Huh, Moon-Yul;Cha, Woon-Ock
    • The Korean Journal of Applied Statistics / v.20 no.1 / pp.61-78 / 2007
  • This paper proposes a method for poll analysis using information theory and data visualization. The questions of an opinion poll consist of a target variable and many explanatory variables, each of which is either numerical or categorical. In this study, explanatory variables of mixed types are ranked by the magnitude of their effect on the target variable using mutual information. The ordering of the explanatory variables is likewise evaluated using data visualization. This is the first study to quantify the impact of a specific explanatory variable on the related target variable.
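
    To illustrate ranking by mutual information, here is a minimal sketch rather than the authors' implementation. It assumes every variable is already categorical (a numerical variable would first need to be binned), and the function names and toy poll data are hypothetical.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Mutual information I(X;Y) in bits between two categorical sequences."""
    n = len(xs)
    if n == 0 or n != len(ys):
        raise ValueError("xs and ys must be non-empty and of equal length")
    count_x = Counter(xs)
    count_y = Counter(ys)
    count_xy = Counter(zip(xs, ys))
    mi = 0.0
    for (x, y), c in count_xy.items():
        # p(x,y) * log2( p(x,y) / (p(x) * p(y)) ), written with raw counts
        mi += (c / n) * log2(c * n / (count_x[x] * count_y[y]))
    return mi


def rank_by_mutual_information(columns, target):
    """Sort explanatory variables by decreasing mutual information with target."""
    scores = {name: mutual_information(values, target)
              for name, values in columns.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical toy poll: 'region' determines the answer, 'gender' is unrelated.
target = ["yes", "yes", "no", "no"]
columns = {
    "gender": ["m", "f", "m", "f"],
    "region": ["a", "a", "b", "b"],
}
ranking = rank_by_mutual_information(columns, target)
print(ranking)
```

    In this toy data the variable that fully determines the answer carries one bit of information and ranks first, while the unrelated variable carries none.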

A Study on the Analysis for Life-cycle of Quasi-Market Oriented SOC Public Enterprise and Effective Management (준시장형 SOC 공기업의 수명주기 분석과 효율적 관리방안에 관한 연구)

  • Park, Dong-Sun;Kang, Myung-Soo;Kim, Nam-Jung
    • Land and Housing Review / v.6 no.4 / pp.165-175 / 2015
  • This study addresses the need for policy decision making grounded in clear definitions of 'business life cycle' and 'public enterprises' for the proper management of public enterprises. To this end, it defines categorical variables for the enterprise life cycle and provides basic data for public enterprise management policy. The study examines 'Korea Expressway Corporation', 'K-water', 'Korea Railroad', and 'Korea Land and Housing Corporation', public institutions that recently underwent a 'management normalization policy' because of rapidly increasing debt. First, the priority and standards of the categorical variables for the life cycle of quasi-market oriented SOC public enterprises are analyzed using AHP and a frequency analysis of an expert survey. Next, the 'declining period' is forecast through a second expert survey, and enterprise management plans for the expected 'declining period' are investigated and analyzed.

Critical Evaluation of Fine Needle Aspiration Cytology as a Diagnostic Technique in Bone Tumors and Tumor-like Lesions

  • Chakrabarti, Sudipta;Datta, Alok Sobhan;Hira, Michael
    • Asian Pacific Journal of Cancer Prevention / v.13 no.7 / pp.3031-3035 / 2012
  • Background: Though open surgical biopsy is the procedure of choice for the diagnosis of bone tumors, many disadvantages are associated with this approach. The present study was undertaken to evaluate the role of fine needle aspiration cytology (FNAC) as a diagnostic tool in cases of bone tumors and tumor-like lesions, which may be conducted in centers where facilities for surgical biopsies are inadequate. Methods: The study population consisted of 51 cases presenting with a skeletal mass. After clinical evaluation, radiological correlation was done to assess the nature and extent of each lesion. Fine needle aspiration was performed aseptically and smears were prepared. Patients subsequently underwent open surgical biopsy and tissue samples were obtained for histopathological examination. Standard statistical methods were applied for analysis of data. Results: Adequate material was not obtained even after repeated aspiration in seven cases, six of which were benign. Among the remaining 44 cases, a diagnosis of malignancy was correctly provided in 28 (93.3%) out of 30 cases and a categorical diagnosis in 20 (66.67%). Interpretation of cytology was more difficult in cases of benign and tumor-like lesions, with a categorical opinion only possible in seven (50%) cases. Statistical analysis showed FNAC for malignant tumors to have high sensitivity (93.3%), specificity (92.9%) and a positive predictive value of 96.6%, whereas the negative predictive value was 86.7%. Conclusion: FNAC should be included in the diagnostic workup of a skeletal tumor because of its simplicity and reliability. However, a definitive pathologic diagnosis depends heavily on compatible clinical and radiologic features, which can only be established by teamwork. The cytological technique applied in this study could detect many bone tumors and tumor-like conditions and appears particularly suitable as a diagnostic technique for rural regions of India as well as other developing countries.
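
    The reported accuracy figures can be reproduced from a 2x2 table, as in the sketch below. The abstract states 28 correct malignant diagnoses out of 30; the split of the remaining 14 cases into 13 true negatives and 1 false positive is not stated and is inferred here as the split consistent with the reported percentages. The function name is hypothetical.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Standard accuracy measures of a diagnostic test from a 2x2 table."""
    return {
        "sensitivity": tp / (tp + fn),  # malignant cases correctly detected
        "specificity": tn / (tn + fp),  # non-malignant cases correctly cleared
        "ppv": tp / (tp + fp),          # positive calls that are truly malignant
        "npv": tn / (tn + fn),          # negative calls that are truly non-malignant
    }


# tp and fn come from the abstract; tn and fp are inferred (an assumption).
metrics = diagnostic_metrics(tp=28, fn=2, tn=13, fp=1)
for name, value in metrics.items():
    print(f"{name}: {value:.1%}")
```

    Under these assumed counts the four values round to the 93.3%, 92.9%, 96.6% and 86.7% quoted in the abstract.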

Multiple Testing in Genomic Sequences Using Hamming Distance

  • Kang, Moonsu
    • Communications for Statistical Applications and Methods / v.19 no.6 / pp.899-904 / 2012
  • High-dimensional categorical data models with small sample sizes have not been used extensively for genomic sequences that involve count (or discrete) or purely qualitative responses. A basic task is to identify differentially expressed genes (or positions) among a number of genes. This requires an appropriate test statistic and a corresponding multiple testing procedure, since a multivariate analysis of variance is not feasible in this setting. The family-wise error rate (FWER) is not appropriate for testing thousands of genes simultaneously in a multiple testing procedure; the false discovery rate (FDR) is better suited to multiple testing problems. Data from the 2002-2003 SARS epidemic show that a conventional FDR procedure combined with the proposed test statistic, based on a pseudo-marginal approach with Hamming distance, performs better.
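
    The contrast between FWER and FDR control can be sketched as follows. This assumes that the "conventional FDR procedure" is the Benjamini-Hochberg step-up rule and uses Bonferroni as the FWER baseline; the abstract names neither. The p-values are hypothetical, and the paper's pseudo-marginal test statistic is not reproduced, only the Hamming distance it builds on.

```python
def hamming_distance(a, b):
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def bonferroni(pvalues, alpha=0.05):
    """FWER control: reject only p-values at or below alpha / m."""
    m = len(pvalues)
    return [p <= alpha / m for p in pvalues]


def benjamini_hochberg(pvalues, alpha=0.05):
    """FDR control: reject the k smallest p-values, where k is the
    largest rank with p_(k) <= k * alpha / m."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    cutoff = 0
    for rank, idx in enumerate(order, start=1):
        if pvalues[idx] <= rank * alpha / m:
            cutoff = rank
    rejected = [False] * m
    for idx in order[:cutoff]:
        rejected[idx] = True
    return rejected


# Hypothetical p-values for eight sequence positions.
pvalues = [0.03, 0.001, 0.5, 0.012, 0.2, 0.008, 0.9, 0.024]
fwer_rejections = bonferroni(pvalues)
fdr_rejections = benjamini_hochberg(pvalues)
print(sum(fwer_rejections), "rejections under Bonferroni")
print(sum(fdr_rejections), "rejections under Benjamini-Hochberg")
```

    On the same p-values the FDR rule rejects more hypotheses than the FWER rule, which is why it is preferred when thousands of positions are tested at once.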

Bayesian Modeling of Random Effects Covariance Matrix for Generalized Linear Mixed Models

  • Lee, Keunbaik
    • Communications for Statistical Applications and Methods / v.20 no.3 / pp.235-240 / 2013
  • Generalized linear mixed models (GLMMs) are frequently used for the analysis of longitudinal categorical data when subject-specific effects are of interest. In GLMMs, the structure of the random effects covariance matrix is important for estimating the fixed effects and for explaining subject and time variations. Estimating this matrix is not simple because of its high dimension and the positive definiteness constraint; consequently, in practice a simple covariance structure such as AR(1) is used. However, this strong assumption can result in biased estimates of the fixed effects. In this paper, we introduce Bayesian modeling approaches for the random effects covariance matrix using a modified Cholesky decomposition. The modified Cholesky decomposition approach has been used to explain a heterogeneous random effects covariance matrix, and the resulting estimated covariance matrix is guaranteed to be positive definite. We analyze metabolic syndrome data from a Korean Genomic Epidemiology Study using these methods.
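
    For reference, the modified Cholesky decomposition of a covariance matrix can be written as below. The symbols ($\Sigma$, $T$, $D$, $\phi_{jk}$, $\sigma_j^2$, $b_j$, $e_j$, $q$) are notation assumed here and are not taken from the abstract.

```latex
% Sigma: q x q covariance matrix of the random effects b = (b_1, ..., b_q)^T.
% T: unit lower-triangular matrix whose (j,k) entry is -phi_{jk} for k < j.
% D: diagonal matrix of innovation variances.
\[
  T \Sigma T^{\top} = D, \qquad
  D = \operatorname{diag}(\sigma_1^2, \dots, \sigma_q^2).
\]
% Equivalent autoregressive form:
\[
  b_1 = e_1, \qquad
  b_j = \sum_{k=1}^{j-1} \phi_{jk}\, b_k + e_j \quad (j = 2, \dots, q), \qquad
  \operatorname{Var}(e_j) = \sigma_j^2 .
\]
% Inverting the decomposition recovers the covariance matrix:
\[
  \Sigma = T^{-1} D \, T^{-\top}.
\]
```

    The $\phi_{jk}$ are unconstrained, and $\Sigma$ is positive definite whenever every $\sigma_j^2 > 0$. The positive definiteness constraint is therefore met automatically, with no need to impose a fixed structure such as AR(1).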

Analytical Study for the Data Formation of Chair Structure (의자구조 데이터화를 위한 분석 연구)

  • 인석일
    • Archives of design research / v.17 no.1 / pp.347-360 / 2004
  • This study analyzes the structure of chairs that are significant in the history of furniture, drawing objective data from 347 samples. Categorical standards applicable to all of these samples were established, and the samples were organized into 43 structural cases. From these, six structural complexities connected to form can be identified, which provides a penetrating perspective on the structure of chairs.

Annotation Technique Development based on Apparel Attributes for Visual Apparel Search Technology (비주얼 의류 검색기술을 위한 의류 속성 기반 Annotation 기법 개발)

  • Lee, Eun-Kyung;Kim, Yang-Weon;Kim, Seon-Sook
    • Fashion & Textile Research Journal / v.17 no.5 / pp.731-740 / 2015
  • Mobile (smartphone) search engine marketing is increasingly important. Accordingly, visual apparel search technology that gives easier and faster access to visual information in the apparel field is urgently needed. This study helps establish a proper classification system for apparel search, based on an analysis of the search techniques used by apparel search applications and by existing domestic and overseas apparel sites. An annotation technique is developed in accordance with visual attributes and apparel categories, based on data collected by web crawling and apparel image collection. The categorical composition of apparel is divided into wearing, image and style. A web evaluation site traces the correlations between the apparel category and apparel factors, depending on visual attributes. An appraisal team of 10 individuals evaluated 2860 merchandise images. Data analysis consisted of correlations between apparel, sleeve length and apparel category (based on an average analysis), and the correlation between fastener and apparel category (based on an average analysis). The results can be regarded as the basis for an epoch-making mobile apparel search system that can enhance consumer convenience, since it enables an effective search by type, price, distributor, and apparel image from a mobile photograph of the garment as worn.

Steal Success Model for 2007 Korean Professional Baseball Games (2007년 한국프로야구에서 도루성공모형)

  • Hong, Chong-Sun;Choi, Jeong-Min
    • The Korean Journal of Applied Statistics / v.21 no.3 / pp.455-468 / 2008
  • Extensive baseball game records show that the steal plays an important role in the outcome of games. To study the success or failure of steals, logistic regression models are developed based on the 2007 Korean professional baseball games. The results of the logistic regression models are compared with those of discriminant models, and the logistic regression analysis is found to perform better than the discriminant analysis. We also consider an alternative logistic regression model based on categorical data transformed from continuous data that are difficult to obtain.