• Title/Summary/Keyword: a mixed data set


Derivation of a benchmark dose lower bound of lead for attention deficit hyperactivity disorder using a longitudinal data set (경시적 자료의 주의력 결핍 과잉행동 장애를 종점으로 한 납의 벤치마크 용량 하한 도출)

  • Lee, Juhyung;Kim, Si Yeon;Ha, Mina;Kwon, Hojang;Kim, Byung Soo
    • The Korean Journal of Applied Statistics / v.29 no.7 / pp.1295-1309 / 2016
  • This paper reproduces the result of Kim et al. (2014) by deriving a benchmark dose lower bound (BMDL) of lead from the 2005 cohort of the Children's Health and Environmental Research (CHEER) data set. The ADHD rating scales in the 2005 cohort were not consistent across the three follow-ups, since two different ADHD rating scales were used in the cohort. We first unified the ADHD rating scales in the 2005 cohort by deriving a conversion formula using a penalized linear spline. We then constructed two linear mixed models for the 2005 cohort that reflect the longitudinal characteristics of the data set. The first model introduced random intercept and random slope terms, and the second model assumed a first-order autoregressive structure for the error term. Using these two models, we derived the BMDLs of lead and reconfirmed the "regression to the mean" nature of the ADHD score discovered by Kim et al. (2014). We also noticed that there was a definite difference between the sampling distributions of the two cohorts. Taking this difference into account, we were able to obtain a result consistent with Kim et al. (2014).
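
The two model structures named in this abstract can be written out as a rough sketch. The abstract does not give the formulas, so every symbol here is an assumption: $y_{ij}$ is the ADHD score of child $i$ at follow-up $j$, $t_{ij}$ the follow-up time, $x_i$ the lead exposure, and the single-covariate fixed part is only illustrative.

```latex
% Model 1 (assumed form): random intercept b_{0i} and random slope b_{1i}
y_{ij} = \beta_0 + \beta_1 t_{ij} + \beta_2 x_i + b_{0i} + b_{1i} t_{ij} + \varepsilon_{ij},
\qquad (b_{0i}, b_{1i})^\top \sim N(0, G), \quad \varepsilon_{ij} \sim N(0, \sigma^2)

% Model 2 (assumed form): same fixed part, first-order autoregressive errors
y_{ij} = \beta_0 + \beta_1 t_{ij} + \beta_2 x_i + \varepsilon_{ij},
\qquad \operatorname{Cov}(\varepsilon_{ij}, \varepsilon_{ik}) = \sigma^2 \rho^{|j-k|}
```

Whether the second model also keeps random effects is not stated in the abstract; the form above leaves them out purely for simplicity.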

Pagoda Data Management and Metadata Requirements for Libraries in Myanmar

  • Tin Tin Pipe;Kulthida Tuamsuk
    • Journal of Information Science Theory and Practice / v.11 no.3 / pp.79-91 / 2023
  • The storage of data documentation for Myanmar pagodas has various issues, and its retrieval method causes problems for users and libraries. This study utilized a mixed-methods approach, combining qualitative and quantitative methods to investigate pagoda data management in Myanmar libraries. The study aims to achieve the following objectives: to study the library collection management of pagodas in Myanmar, to investigate the management of pagoda data in Myanmar libraries, and to identify the pagoda data requirements for metadata development from the library professional perspective. The study findings revealed several challenges facing librarians and library users in accessing and managing Myanmar pagoda data, including limited stocks and retrieval tools, difficulty in accessing all available data online, and a lack of a centralized database or repository for storing and retrieving pagoda data. The study recommends the establishment of metadata criteria for managing a set of pagoda data and improving access to technology to address these challenges.

A Kurtosis-based Algorithm for Blind Sources Separation Using the Cayley Transformation And Its Application to Multi-channel Electrogastrograms

  • Ohata, Masashi;Matsumoto, Takahiro;Shigematsu, Akio;Matsuoka, Kiyotoshi
    • 제어로봇시스템학회:학술대회논문집 / 2000.10a / pp.471-471 / 2000
  • This paper presents a new kurtosis-based algorithm for blind separation of convolutively mixed source signals. The algorithm first whitens the signals both spatially and temporally. A separator is then built for the whitened signals, and it belongs to the set of para-unitary matrices. Since this set forms a curved manifold, its elements are hard to handle directly. To avoid this difficulty, the paper applies the Cayley transformation to the para-unitary matrices. The transformed matrix is referred to as a para-skew-Hermitian matrix, and the set of such matrices forms a linear space. The kurtosis-based algorithm searches the set of all para-skew-Hermitian matrices for the desired separator. The paper also shows an application of the algorithm to electrogastrogram data observed by four electrodes placed on the subjects' abdomens around their stomachs. An electrogastrogram contains signals from the stomach and other organs. The paper obtains independent components with the algorithm and then extracts the signal corresponding to the stomach from the data.
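
The reason for the Cayley transformation can be illustrated with a minimal sketch. It is not the paper's algorithm: it uses constant real 2x2 matrices rather than para-unitary (filter) matrices, and the helper names `matmul`, `transpose`, and `cayley_2x2` are hypothetical. It only shows that a matrix from a linear space (skew-symmetric, one free parameter `a`) is mapped to the curved set of orthogonal matrices.

```python
def matmul(x, y):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def transpose(x):
    """Transpose of a 2x2 matrix."""
    return [[x[j][i] for j in range(2)] for i in range(2)]

def cayley_2x2(a):
    """Cayley transform Q = (I - A)(I + A)^{-1} of A = [[0, a], [-a, 0]]."""
    det = 1.0 + a * a                    # det(I + A), never zero for real a
    inv = [[1.0 / det, -a / det],        # (I + A)^{-1}
           [a / det, 1.0 / det]]
    i_minus_a = [[1.0, -a], [a, 1.0]]    # I - A
    return matmul(i_minus_a, inv)

q = cayley_2x2(0.5)
gram = matmul(transpose(q), q)           # equals the identity if q is orthogonal
```

Any real `a` yields an orthogonal `q`, so a search can move freely over `a` without leaving the admissible set, which is the kind of convenience the abstract attributes to working in the transformed space.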


An Application of ISODATA Method for Regional Lithological Mapping (광역지질도 작성을 위한 ISODATA 응용)

  • 朴鍾南;徐延熙
    • Korean Journal of Remote Sensing / v.5 no.2 / pp.109-122 / 1989
  • The ISODATA method, one of the best-known square-error clustering methods, has been applied to two Chungju multivariate data sets to evaluate its effectiveness for regional lithological mapping. One is an airborne radiometric data set and the other is a mixed data set of the airborne radiometric and Landsat TM data. In both cases, the classification of the Bulguksa granite and the Kyemyongsan biotite-quartz gneiss is the most successful. The Hyangsanni dolomitic limestone and the neighboring Daehyangsan quartzite are also classified by their typically low radioactive intensities, though they are still confused with some other units such as water-covered areas, nearby alluvials, and unaltered limestone areas. Topographically rugged valleys are also assigned to the same cluster. This could be due to unavoidable variations in the flight height and attitude of the airborne system over such rugged terrain. The regional geological mapping of the sedimentary rock units of the Ockchun System is in general confused. This might be due to similarities between different sediments. Considerable discrepancies that occurred in mapping some lithological boundaries might also be due to secondary effects such as contamination or smoothing in the digitizing process. Further study of the variable selection scheme is needed, since no absolutely superior method exists yet and the choice seems to be rather data dependent. Data preprocessing could also be studied in order to reduce the erratic effects mentioned above and thus, hopefully, obtain much better results in regional geological mapping.

MINIMIZATION OF PARENT ROLL TRIM LOSS FOR THE PAPER INDUSTRY

  • Bae, Hee-Man
    • Journal of the Korean Operations Research and Management Science Society / v.3 no.2 / pp.95-108 / 1978
  • This paper discusses an application of mathematical programming techniques in the paper industry to determine optimal parent roll widths. Parent rolls are made from the reels produced at wide paper machines by slitting them to more manageable widths. The problem is to find a set of slitting patterns that minimizes the trim loss involved in the sheeting operation. Two programming models, one linear and one mixed integer linear, are presented in this paper. Also presented are the computational experience, the model sensitivity, and a comparison of the optimal solutions with simulated operational data.
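
The notion of trim loss for a slitting pattern can be made concrete with a small sketch. This is not either of the paper's programming models: it only enumerates single patterns exhaustively for one reel, and the function name `best_pattern`, the reel width, and the roll widths are hypothetical values chosen for illustration (integer widths are assumed).

```python
def best_pattern(reel_width, roll_widths):
    """Return (trim_loss, counts) for the single slitting pattern with least trim.

    counts[i] is how many rolls of width roll_widths[i] are cut from the reel.
    """
    best = (reel_width, tuple(0 for _ in roll_widths))

    def search(index, remaining, counts):
        nonlocal best
        if index == len(roll_widths):
            # All widths decided: whatever is left on the reel is trim loss.
            if remaining < best[0]:
                best = (remaining, tuple(counts))
            return
        width = roll_widths[index]
        for n in range(remaining // width + 1):
            search(index + 1, remaining - n * width, counts + [n])

    search(0, reel_width, [])
    return best

# Hypothetical example: a 200-unit reel slit into rolls of width 45, 36, or 31.
loss, counts = best_pattern(200, [45, 36, 31])
```

A linear or mixed integer model, as described in the abstract, would instead choose how often to run each of many such patterns so that demand is met with minimal total trim.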


A Decision Tree Algorithm using Genetic Programming

  • Park, Chongsun;Ko, Young Kyong
    • Communications for Statistical Applications and Methods / v.10 no.3 / pp.845-857 / 2003
  • We explore the use of genetic programming to evolve decision trees directly for classification problems with both discrete and continuous predictors. We demonstrate that the hypotheses derived by standard algorithms can deviate substantially from the optimum. This deviation is partly due to their top-down procedures. The performance of the system is measured on a set of real and simulated data sets and compared with the performance of well-known algorithms such as CHAID, CART, C5.0, and QUEST. The proposed algorithm seems to be effective in handling problems caused by the top-down procedures of existing algorithms.

Supremacy of Realized Variance MIDAS Regression in Volatility Forecasting of Mutual Funds: Empirical Evidence From Malaysia

  • WAN, Cheong Kin;CHOO, Wei Chong;HO, Jen Sim;ZHANG, Yuruixian
    • The Journal of Asian Finance, Economics and Business / v.9 no.7 / pp.1-15 / 2022
  • Combining the strengths of Mixed Data Sampling (MIDAS) regression and realized variance measures, this paper pursues two objectives: (1) to evaluate the post-sample performance of the proposed weekly Realized Variance-MIDAS (RVar-MIDAS) model in one-week-ahead volatility forecasting against the established Generalized Autoregressive Conditional Heteroskedasticity (GARCH) model and the less explored but robust Smooth Transition Exponential Smoothing (STES) methods, and (2) to compare forecast error performance between realized variance and squared residual measures as proxies for actual volatility. Data from seven private equity mutual fund indices (generated from 57 individual funds) over two different time periods (with and without a financial crisis) are applied to 21 models. The robustness of the post-sample volatility forecasts of all models is validated by the Model Confidence Set (MCS) procedure, which revealed that (1) the weekly RVar-MIDAS model emerged as the best model, outperforming the robust DAILY-STES methods and the weekly DAILY-GARCH models, particularly during a volatile period, and (2) models using realized variance measures in estimation and as a proxy for actual volatility outperformed those using squared residuals. This study contributes an empirical approach to one-week-ahead volatility forecasting of mutual fund returns, which is less explored in the literature on financial volatility forecasting than stock volatility.
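
The two volatility proxies contrasted in this abstract can be sketched in a few lines. The abstract does not state the paper's exact construction, so this assumes the standard definition of realized variance as the sum of squared higher-frequency (here daily) log returns within the week, and it uses the squared weekly log return as a simplified stand-in for a squared residual. The prices are made up.

```python
import math

def log_returns(prices):
    """Log returns between consecutive prices."""
    return [math.log(b / a) for a, b in zip(prices, prices[1:])]

def realized_variance(returns):
    """Sum of squared intra-period returns (standard definition, assumed here)."""
    return sum(r * r for r in returns)

# Hypothetical daily closing values of a fund index over one week.
daily_prices = [100.0, 101.0, 99.5, 100.2, 100.9, 100.4]

weekly_rv = realized_variance(log_returns(daily_prices))

# Simplified alternative proxy: the squared return over the whole week.
weekly_squared_return = math.log(daily_prices[-1] / daily_prices[0]) ** 2
```

In this example the weekly return is small even though prices moved every day, so the squared weekly return understates the within-week variation that the realized variance picks up.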

Consensus Clustering for Time Course Gene Expression Microarray Data

  • Kim, Seo-Young;Bae, Jong-Sung
    • Communications for Statistical Applications and Methods / v.12 no.2 / pp.335-348 / 2005
  • The rapid development of microarray technologies has enabled the expression levels of thousands of genes to be monitored simultaneously. Recently, time course gene expression data have often been measured to study dynamic biological systems and gene regulatory networks. With such data, biologists attempt to group genes based on the temporal pattern of their expression levels. We apply the consensus clustering algorithm to time course gene expression data in order to infer statistically meaningful information from the measurements. We evaluate consensus clustering and existing clustering methods with various validation measures. Among existing methods, we consider hierarchical clustering and Diana; for consensus clustering, we consider hierarchical clustering, Diana, and a mix of hierarchical and Diana methods. We evaluate their performance on a real microarray data set and two simulated data sets.

Optimization-Based Pattern Generation for LAD (최적화에 근거한 LAD의 패턴생성 기법)

  • Jang, In-Yong;Ryoo, Hong-Seo
    • Proceedings of the Korean Operations and Management Science Society Conference / 2005.10a / pp.409-413 / 2005
  • The logical analysis of data (LAD) is an effective Boolean-logic-based data mining tool. A critical step in analyzing data by LAD is the pattern generation stage, where useful knowledge and hidden structural information in the data are discovered in the form of patterns. The conventional method for pattern generation in LAD is based on term enumeration, which renders the generation of higher-degree patterns practically impossible. In this paper, we present a new optimization-based pattern generation methodology and propose two mathematical programming models: a mixed 0-1 integer and linear programming (MILP) formulation for the generation of optimal patterns and a well-studied set covering problem (SCP) formulation for the generation of heuristic patterns. With benchmark data sets, we demonstrate the effectiveness of our models by easily and automatically generating patterns of high complexity that cannot be generated with the conventional approach.
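
For readers unfamiliar with the set covering problem mentioned here, the following is a minimal sketch of the classic greedy heuristic for it. It is a standard textbook technique swapped in for illustration, not the formulation or solution method of the paper, and the observation ids and candidate patterns (`p1` to `p4`) are hypothetical.

```python
def greedy_set_cover(universe, subsets):
    """Pick subset names greedily until every element of universe is covered."""
    uncovered = set(universe)
    chosen = []
    while uncovered:
        # Take the subset that covers the most still-uncovered elements.
        best = max(subsets, key=lambda name: len(subsets[name] & uncovered))
        gained = subsets[best] & uncovered
        if not gained:
            raise ValueError("universe cannot be covered by the given subsets")
        chosen.append(best)
        uncovered -= gained
    return chosen

# Hypothetical data: observations 1..5 and the observations each pattern covers.
observations = {1, 2, 3, 4, 5}
patterns = {
    "p1": {1, 2, 3},
    "p2": {2, 4},
    "p3": {3, 4},
    "p4": {4, 5},
}
cover = greedy_set_cover(observations, patterns)
```

The greedy rule gives a cover quickly but not necessarily a minimum one, which matches the abstract's distinction between optimal and heuristic patterns only in spirit.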


Individual Tree Growth Models for Natural Mixed Forests in Changbai Mountains, Northeast China

  • Lu, Jun;Li, Fengri
    • Journal of Korean Society of Forest Science / v.96 no.2 / pp.160-169 / 2007
  • The data used to develop distance-independent individual tree models for natural mixed forests were collected from 712 remeasured permanent sample plots (25,526 trees) over a 10-year period from 1990 to 2000 in the Baihe Forest Bureau of the Changbai Mountains, northeast China. Based on an analysis of the relationship between the diameter increment of individual trees and tree size, competitive status, and site condition, diameter growth models for individual trees of 15 species growing in mixed-species uneven-aged forest stands were developed using stepwise regression; the models have a simple form, good predictive precision, and are easy to apply. The main variables influencing the diameter increment of individual trees were tree size and competition; site conditions were not significantly related to diameter increment. The tree size variables (lnDBH and $DBH^2$) were the most significant and important predictors of diameter growth, appearing in all 15 growth models. The diameter increment was directly proportional to tree diameter for each species. Among the competitive factors in the growth models, the relative diameter (RD), canopy closure (P), and the ratio of the diameter of the subject tree to the maximum diameter (DDM) contributed to the diameter increment to a certain extent. Other measures of stand density, such as stand basal area (G) and stand density index (SDI), did not significantly influence diameter increment. Site factors, such as site index, slope, and aspect, were not important to diameter increment and were excluded from the final models. The total variance of squared diameter increment explained by the final models ($R^2$) for all 15 species ranged from 35% to 72%, and these results compared quite closely with those of Wykoff (1990) for mixed conifer stands. Using an independent data set, validation measures were evaluated for the diameter increment models developed in this study. The results indicated that the estimated precision was greater than 94% in all cases and that the models were suitable for describing diameter increment.