• Title/Summary/Keyword: hierarchical data

Determinants of student course evaluation using hierarchical linear model (위계적 선형모형을 이용한 강의평가 결정요인 분석)

  • Cho, Jang Sik
    • Journal of the Korean Data and Information Science Society / v.24 no.6 / pp.1285-1296 / 2013
  • This paper analyzes how subject characteristics and student characteristics affect student course evaluations. We use a 2-level hierarchical linear model because the data on subject and student characteristics have a multilevel structure. We consider four models: (1) the null model, (2) the random coefficient model, (3) the means-as-outcomes model, and (4) the intercepts- and slopes-as-outcomes model. The results are as follows. First, the null model showed that subject characteristics had a much larger effect on course evaluation than student characteristics. Second, the conditional model specifying subject- and student-level predictors revealed that class size, grade, tenure, mean GPA of the class, and native class at level 1, and sex, department category, admission method, and mean GPA of the student at level 2, had statistically significant effects on course evaluation. The explained variance was 13% at the subject level and 13% at the student level.
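
The null model named in this abstract can be written out as follows. This is a sketch in standard two-level notation, not notation taken from the paper: the indices $i$ (lower-level unit) and $j$ (higher-level unit) and the symbols $\gamma_{00}$, $\tau_{00}$, $\sigma^2$ are assumptions made for illustration.

```latex
% Two-level null (unconditional) model, standard textbook notation (assumed)
\begin{aligned}
\text{Level 1:}\quad & Y_{ij} = \beta_{0j} + r_{ij}, & r_{ij} &\sim N(0, \sigma^2) \\
\text{Level 2:}\quad & \beta_{0j} = \gamma_{00} + u_{0j}, & u_{0j} &\sim N(0, \tau_{00}) \\
\text{Intraclass correlation:}\quad & \rho = \frac{\tau_{00}}{\tau_{00} + \sigma^2}
\end{aligned}
```

Under this formulation, $\rho$ is the share of total variance lying between higher-level units, which is the kind of quantity a null model is typically used to compare across levels.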

Bayesian methods in clinical trials with applications to medical devices

  • Campbell, Gregory
    • Communications for Statistical Applications and Methods / v.24 no.6 / pp.561-581 / 2017
  • Bayesian statistics can play a key role in the design and analysis of clinical trials, and this has been demonstrated for medical device trials. By 1995 Bayesian statistics had been well developed, and the revolution in computing power and the development of Markov chain Monte Carlo brought the calculation of posterior distributions within computational reach. The Food and Drug Administration (FDA) initiative of Bayesian statistics in medical device clinical trials, which began almost 20 years ago, is reviewed in detail along with some of the key decisions that were made along the way. Both Bayesian hierarchical modeling using data from previous studies and Bayesian adaptive designs, usually with a non-informative prior, are discussed. The leveraging of prior study data has been accomplished through Bayesian hierarchical modeling. An enormous advantage of Bayesian adaptive designs is achieved when they are accompanied by modeling of the primary endpoint to produce the predictive posterior distribution. Simulations are crucial to providing the operating characteristics of the Bayesian design, especially for a complex adaptive design. The 2010 FDA Bayesian guidance for medical device trials addressed both approaches as well as exchangeability, Type I error, and sample size. Treatment response adaptive randomization using the famous extracorporeal membrane oxygenation example is discussed. An interesting real example of a Bayesian analysis using a failed trial with an interesting subgroup as prior information is presented. The implications of the likelihood principle are considered. A recent exciting area using Bayesian hierarchical modeling has been pediatric extrapolation using adult data in clinical trials. Historical control information from previous trials is an underused area that lends itself easily to Bayesian methods. The future, including recent trends, decision theoretic trials, Bayesian benefit-risk, virtual patients, and the appalling lack of penetration of Bayesian clinical trials in the medical literature, is discussed.

Data Pattern Estimation with Movement of the Center of Gravity

  • Ahn Tae-Chon;Jang Kyung-Won;Shin Dong-Du;Kang Hak-Soo;Yoon Yang-Woong
    • International Journal of Fuzzy Logic and Intelligent Systems / v.6 no.3 / pp.210-216 / 2006
  • In rule-based modeling, data partitioning plays a crucial role because each partitioned sub data set carries particular information about the given data set or system. In this paper, we present the results of an empirical study of data pattern estimation for finding the underlying patterns of given data. The presented method performs crisp clustering of n given data samples by means of the sequential agglomerative hierarchical nested model (SAHN). In each sequence, the average value of the sum of all distances between centroids and data points is computed. Next, the derivation of the weighted average distance is computed to observe the pattern distribution. In the final step, after the overall clustering process is completed, the weighted average distance value is applied to estimate the range of the number of clusters in the given data set. The proposed estimation method and its results are examined using the FCM demo data set in the MATLAB fuzzy logic toolbox and Box and Jenkins's gas furnace data.
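
The procedure in this abstract can be sketched roughly as below. This is only an illustration under stated assumptions: the abstract does not specify the linkage rule or the weighting, so the sketch assumes centroid linkage and an unweighted mean of point-to-centroid distances, and the function names `centroid`, `mean_centroid_distance`, and `agglomerate` are hypothetical.

```python
import math


def centroid(points):
    """Coordinate-wise mean of a non-empty list of equal-length tuples."""
    n = len(points)
    return tuple(sum(p[d] for p in points) / n for d in range(len(points[0])))


def mean_centroid_distance(clusters):
    """Average distance between every point and the centroid of its cluster."""
    total, count = 0.0, 0
    for cluster in clusters:
        c = centroid(cluster)
        for p in cluster:
            total += math.dist(p, c)
            count += 1
    return total / count


def agglomerate(points):
    """Merge the two clusters with the closest centroids until one remains.

    Assumption: centroid linkage. Returns a list of
    (number_of_clusters, mean_centroid_distance) pairs, one per step.
    """
    clusters = [[p] for p in points]
    history = [(len(clusters), mean_centroid_distance(clusters))]
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = math.dist(centroid(clusters[i]), centroid(clusters[j]))
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
        history.append((len(clusters), mean_centroid_distance(clusters)))
    return history
```

Calling `agglomerate` on a list of coordinate tuples gives the distance recorded at each cluster count; a sharp increase between consecutive steps is one plausible signal for a candidate number of clusters, though the paper's exact criterion may differ.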

The effect of missing levels of nesting in multilevel analysis

  • Park, Seho;Chung, Yujin
    • Genomics & Informatics / v.20 no.3 / pp.34.1-34.11 / 2022
  • Multilevel analysis is an appropriate and powerful tool for analyzing hierarchically structured data and is widely applied in fields ranging from public health to genomics. In practice, however, information on multiple nesting levels may be lost in a multilevel analysis, either because the data fail to capture all levels of the hierarchy or because the top or intermediate levels of the hierarchy are ignored in the analysis. In this study, we consider a multilevel linear mixed effect model (LMM) with single imputation that can involve all data hierarchy levels in the presence of missing top or intermediate-level clusters. We evaluate and compare the performance of a multilevel LMM with single imputation with other models ignoring the data hierarchy or missing intermediate-level clusters. To this end, we applied a multilevel LMM with single imputation and other models to hierarchically structured cohort data with some intermediate levels missing and to simulated data with various cluster sizes and missing rates of intermediate-level clusters. A thorough simulation study demonstrated that an LMM with single imputation estimates fixed coefficients and variance components of a multilevel model more accurately than other models ignoring data hierarchy or missing clusters in terms of mean squared error and coverage probability. In particular, when models ignoring data hierarchy or missing clusters were applied, the variance components of random effects were overestimated. We observed similar results from the analysis of hierarchically structured cohort data.

Hierarchical Bayesian analysis for a forest stand volume (산림재적 추정을 위한 계층적 베이지안 분석)

  • Song, Se Ri;Park, Joowon;Kim, Yongku
    • Journal of the Korean Data and Information Science Society / v.28 no.1 / pp.29-37 / 2017
  • Estimating forest stand volume from LiDAR data has gradually become important. Recently, various statistical models, including linear regression models, have been introduced to estimate forest stand volume using LiDAR data. One limitation of the current approaches is that the accuracy of the observed forest stand volume data, which are used as the response variable, is questionable. To overcome this limitation, we consider a spatial structure for forest stand volume. In this research, we propose a hierarchical model that applies a spatial structure to forest stand volume. The proposed model is applied to LiDAR data and forest stand volume for Bonghwa, Gyeongsangbuk-do.

On the Hierarchical Modeling of Spatial Measurements from Different Station Networks (다양한 관측네트워크에서 얻은 공간자료들을 활용한 계층모형 구축)

  • Choi, Jieun;Park, Man Sik
    • The Korean Journal of Applied Statistics / v.26 no.1 / pp.93-109 / 2013
  • Geostatistical data, or point-referenced data, carry information on the monitoring stations of interest where the observations are measured. Practical geostatistical data are obtained from a wide variety of observational monitoring networks that are mainly operated by the Korean government. When we analyze geostatistical data and predict the expectations at unobserved locations, we can improve the reliability of the prediction by utilizing relevant spatial data obtained from different observational monitoring networks and blending them with the measurements of main interest. In this paper, we consider a hierarchical spatial linear model that enables us to link spatial variables from different sources but with similar patterns and to improve the precision of the prediction. We compare the proposed model to a classical linear regression model and simple kriging in terms of some information criteria and leave-one-out cross-validation. A real application deals with sulfur dioxide ($SO_2$) measurements from the urban air pollution monitoring network and wind speed data from the surface observation network.
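
Leave-one-out cross-validation, one of the comparison criteria mentioned in this abstract, can be sketched as below. This is a minimal illustration only: it uses a one-predictor least-squares line as a stand-in for the classical regression baseline, not the paper's hierarchical spatial model or kriging, and the function names `fit_line` and `loocv_mse` are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def loocv_mse(xs, ys):
    """Mean squared prediction error over all leave-one-out splits.

    Each observation is held out once, the line is refit on the rest,
    and the held-out value is predicted from the refit line.
    """
    squared_errors = []
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        intercept, slope = fit_line(train_x, train_y)
        prediction = intercept + slope * xs[i]
        squared_errors.append((ys[i] - prediction) ** 2)
    return sum(squared_errors) / len(squared_errors)
```

The same loop applies to any model that can be refit on the reduced data set, so competing models can be ranked by the resulting error on identical held-out points.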

Performance evaluation of principal component analysis for clustering problems

  • Kim, Jae-Hwan;Yang, Tae-Min;Kim, Jung-Tae
    • Journal of Advanced Marine Engineering and Technology / v.40 no.8 / pp.726-732 / 2016
  • Clustering analysis is widely used in data mining to classify data into categories on the basis of their similarity. Over the decades, many clustering techniques have been developed, including hierarchical and non-hierarchical algorithms. In gene profiling problems, because of the large number of genes and the complexity of biological networks, dimensionality reduction techniques are critical exploratory tools for clustering analysis of gene expression data. Recently, clustering analysis that applies dimensionality reduction techniques has also been proposed. PCA (principal component analysis) is a popular dimensionality reduction method for clustering problems. However, previous studies analyzed the performance of PCA only for full data sets. In this paper, to specifically and robustly evaluate the performance of PCA for clustering analysis, we use an improved FCBF (fast correlation-based filter), a feature selection method, on supervised clustering data sets, and employ two well-known clustering algorithms: k-means and k-medoids. Computational results from supervised data sets show that the performance of PCA is very poor for large-scale features.

Galaxy Clusters at High Redshift

  • Im, Myungshin
    • The Bulletin of The Korean Astronomical Society / v.40 no.1 / pp.41.1-41.1 / 2015
  • Hierarchical galaxy formation models under LCDM cosmology predict that the most massive structures such as galaxy clusters (M > $10^{14}M_{\odot}$) appear late (z < 1) in the history of the universe through hierarchical clustering of small objects. Galaxy formation is also expected to be accelerated in overdense environments, with the star formation rate-density relation expected to be established at z ~ 2. In this talk, we present our search for massive structures of galaxies at 0.7 < z < 4, using data from the GOODS survey and our own imaging survey, the Infrared Medium-deep Survey (IMS). From these studies, we find that there is an excess of massive structures of galaxies at z > 2 in comparison to the Millennium simulation data. At 1 < z < 2, the number density of massive structures is consistent with the simulation data, but the star formation history is more or less identical between field and cluster. The star formation quenching process is dominated by internal processes (stellar mass). The environmental effect becomes important only at z < 1, where it contributes to creating the well-known star formation-density relation in the local universe. Our results suggest that galaxy formation models under LCDM cosmology may require further refinements to match the observations.

Hierarchical Associative Frame with Learning and Episode memory for the intelligent Knowledge Retrieval

  • Shim, Jeon-Yon
    • 제어로봇시스템학회:학술대회논문집 / 2004.08a / pp.694-698 / 2004
  • In this paper, as part of the effort to build an intelligent data mining system, we propose an associative memory frame constructed in three steps. First, a structured frame for performing the main brain functions is built; this frame incorporates the concepts of learning memory and episode memory. Second, a learning mechanism for data acquisition and a storing mechanism in the memory frame are provided. The acquired data are arranged and stored in memory following the rules of the structured memory frame. Third, inference and knowledge retrieval are performed using the knowledge stored in the associative memory frame. The system is applied to estimating the degree of purchasing from the type of customer tastes, the pattern of commodities, and the evaluation of a company.
