• Title/Summary/Keyword: Cumulative data

A Cumulative Logit Mixed Model for Ordered Response Data

  • Choi, Jae-Sung
    • Journal of the Korean Data and Information Science Society
    • /
    • v.17 no.1
    • /
    • pp.123-130
    • /
    • 2006
  • This paper discusses how to build a mixed-effects model using cumulative logits when some factors are fixed and others are random. Location effects are treated as random effects because the locations are chosen at random from a population of locations. An estimation procedure for the unknown parameters of the proposed model is also discussed through an illustrative example.

Regression analysis of interval censored competing risk data using a pseudo-value approach

  • Kim, Sooyeon;Kim, Yang-Jin
    • Communications for Statistical Applications and Methods
    • /
    • v.23 no.6
    • /
    • pp.555-562
    • /
    • 2016
  • Interval censored data often occur in observational studies where subjects are followed periodically. Instead of an exact failure time, only the two inspection times that bracket it are available. There are several methods for analyzing interval censored failure time data (Sun, 2006). However, in the presence of competing risks, few methods have been suggested for estimating covariate effects on interval censored competing risk data. A sub-distribution hazard model is a commonly used regression model because it has a one-to-one correspondence with the cumulative incidence function. Alternatively, Klein and Andersen (2005) proposed a pseudo-value approach that directly uses the cumulative incidence function. In this paper, we consider an extension of the pseudo-value approach to interval censored data in order to estimate regression coefficients. The pseudo-values generated from the estimated cumulative incidence function then become the response variables in a generalized estimating equation. Simulation studies show that the suggested method performs well in several situations, and an HIV-AIDS cohort study is analyzed as a real data example.
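
The pseudo-value idea in the abstract above can be illustrated with a minimal sketch. It assumes the usual jackknife form of a pseudo-value, n times the full-sample estimate minus (n - 1) times the leave-one-out estimate. The estimator `empirical_cif` is a hypothetical stand-in for complete (uncensored) data, not the interval-censored estimator used in the paper, and the names and default time point are illustrative only.

```python
def pseudo_values(records, estimator):
    """Jackknife pseudo-values: n * theta_hat - (n - 1) * theta_hat_without_i."""
    n = len(records)
    full = estimator(records)
    return [
        n * full - (n - 1) * estimator(records[:i] + records[i + 1:])
        for i in range(n)
    ]


def empirical_cif(records, t=3.0, cause=1):
    """Stand-in estimator (assumption): the fraction of subjects that failed
    from `cause` by time `t`, valid only when there is no censoring."""
    return sum(1 for time, status in records if time <= t and status == cause) / len(records)


# Hypothetical records: (failure time, cause of failure)
records = [(1.0, 1), (2.0, 2), (2.5, 1), (4.0, 1), (5.0, 2)]
values = pseudo_values(records, empirical_cif)
```

With complete data each pseudo-value reduces to the subject's own indicator of having failed from the cause of interest by time `t`, which is why the values can serve as responses in a regression; with censoring they are no longer 0 or 1.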

A cumulative logit mixed model for ordered response data

  • Choi, Jae-Sung
    • Proceedings of the Korean Data and Information Science Society Conference
    • /
    • 2004.04a
    • /
    • pp.121-126
    • /
    • 2004
  • This paper discusses how to build a mixed-effects model using cumulative logits when some factors are fixed and others are random. The random factors are assumed to come from a two-way nested design for choosing the individuals or experimental units to which treatments are applied. An estimation procedure for the unknown parameters of the proposed model is also discussed through an illustrative example.

Competing Risks Regression Analysis (경쟁적 위험하에서의 회귀분석)

  • Baik, Jaiwook
    • Journal of Applied Reliability
    • /
    • v.18 no.2
    • /
    • pp.130-142
    • /
    • 2018
  • Purpose: The purpose of this study is to introduce regression methods in the presence of competing risks and to show how to use them with hypothetical data. Methods: Survival analysis has been widely used in biostatistics, but the same methods have seen little use in reliability. In particular, competing risks, where there are several causes of failure and the occurrence of one event precludes the occurrence of the others, are common in the reliability field, yet they are either not exploited or analysed incorrectly. Specifically, the Kaplan-Meier method is used to calculate the probability of failure in the presence of competing risks, which overestimates the true probability of failure. Hence, the cumulative incidence function is introduced. In addition, sample competing risks data are analysed using the cumulative incidence function, together with some graphs. Lastly, we briefly compare cumulative incidence functions with regression-type analysis. Results: We used the cumulative incidence function to calculate the survival or failure probability in the presence of competing risks. We also drew some useful graphs depicting the failure trend over the lifetime. Conclusion: This research shows that the Kaplan-Meier method is not appropriate for evaluating survival or failure over the lifetime in the presence of competing risks. The cumulative incidence function is shown to be useful instead. Some graphs based on the cumulative incidence functions are also shown to be informative.
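
The overestimation described in the abstract above can be seen in a small sketch. It compares the naive estimate (one minus Kaplan-Meier, treating failures from other causes as censoring) with a nonparametric cumulative incidence estimate of the usual form: the sum over failure times of overall survival just before t, multiplied by the cause-specific hazard at t. The function name and the data are hypothetical and are not taken from the paper.

```python
def km_vs_cif(records, cause=1):
    """records: list of (time, status); status 0 = censored, k >= 1 = failure from cause k.
    Returns (naive 1 - KM, cumulative incidence) for `cause` at the last failure time."""
    failure_times = sorted({t for t, s in records if s != 0})
    overall_surv = 1.0  # survival from any cause, evaluated just before each time
    naive_surv = 1.0    # Kaplan-Meier that treats other causes as censoring
    cif = 0.0
    for t in failure_times:
        at_risk = sum(1 for u, _ in records if u >= t)
        d_cause = sum(1 for u, s in records if u == t and s == cause)
        d_any = sum(1 for u, s in records if u == t and s != 0)
        cif += overall_surv * d_cause / at_risk
        overall_surv *= 1 - d_any / at_risk
        naive_surv *= 1 - d_cause / at_risk
    return 1 - naive_surv, cif


# Hypothetical data: four units, two failure causes, no censoring
records = [(1, 1), (2, 2), (3, 1), (4, 2)]
naive, cif = km_vs_cif(records, cause=1)
```

Here two of the four units fail from cause 1, so the cumulative incidence is about 0.5, while the naive estimate is about 0.625. The gap arises because units removed by the competing cause are treated as if they could still fail from cause 1 later.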

Development of a Semi-quantitative Food Frequency Questionnaire Based on Dietary Data from the Korea National Health and Nutrition Examination Survey

  • Younjhin Ahn;Lee, Ji-Eun;Paik, Hee-Young;Lee, Hong-Kyu;Inho Jo;Kim, Kuchan m
    • Nutritional Sciences
    • /
    • v.6 no.3
    • /
    • pp.173-184
    • /
    • 2003
  • Objective : This study was carried out to develop a semi-quantitative food frequency questionnaire (SQFFQ) for estimating average dietary intake in order to determine the risk factors for lifestyle-related diseases in a conjoint cohort study. Design : We developed an SQFFQ for genomic epidemiological studies based on the data in the '98 Korea Health and Nutrition Examination Survey. A subset of data on informative food items was collected using the 24-hr recall method with 2,714 adults aged 40 or older living in middle-sized cities or in rural areas in Korea. The cumulative percent contribution and cumulative multiple regression coefficients of 17 nutrients (energy, fat, carbohydrate, protein, fiber, iron, potassium, sodium, calcium, phosphorus, vitamin A, retinol, $\beta$-carotene, vitamin $B_1$, vitamin $B_2$, niacin and vitamin C) were computed for each food. Results : Two hundred and forty-nine foods selected on the basis of a 0.9 cumulative percent contribution, and 254 foods selected on the basis of 0.9 cumulative multiple regression coefficients, were grouped into 97 food groups according to their nutrient contents. Several popular Korean foods, which were missing from the list due to the seasonality of the survey, were included. The portion sizes were derived from the same data set. The SQFFQ covered 84.8 percent of the intake of 17 nutrients in the one-day diet record data of our 326 cohort study subjects. Conclusions : The final list included 103 food items. The food list in the SQFFQ described herein accounted for 84.8 percent of the average intake of 17 nutrients. Therefore, the list could be used to assess the baseline dietary intakes in conjoint cohort studies.
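
Selection by cumulative percent contribution, as mentioned in the abstract above, might look like the following sketch for a single nutrient. It assumes the common procedure of ranking foods by their share of total intake and keeping foods until the running share reaches the threshold; the function name and the intake figures are hypothetical, and the paper's exact procedure across 17 nutrients may differ.

```python
def select_by_cumulative_contribution(intake_by_food, threshold=0.9):
    """Return the top-ranked foods whose combined share of total intake
    first reaches `threshold`."""
    total = sum(intake_by_food.values())
    ranked = sorted(intake_by_food.items(), key=lambda item: item[1], reverse=True)
    selected, running = [], 0.0
    for food, amount in ranked:
        selected.append(food)
        running += amount
        if running / total >= threshold:
            break
    return selected


# Hypothetical intake of one nutrient by food (arbitrary units)
intake = {"rice": 50, "kimchi": 25, "pork": 17, "tofu": 5, "apple": 3}
selected = select_by_cumulative_contribution(intake, threshold=0.9)
```

In this example the first three foods account for 92 percent of the intake, so the remaining two are dropped from the list.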

The difference between two distribution functions

  • Hong, Chong Sun
    • Journal of the Korean Data and Information Science Society
    • /
    • v.24 no.6
    • /
    • pp.1449-1454
    • /
    • 2013
  • There are many methods for measuring the difference between two location parameters. In this paper, statistics are proposed for estimating the difference between two location parameters. The statistics are based not on means, variances, signs, or ranks, but on the cumulative distribution functions. Hence, they are measured as the area between two univariate cumulative distribution functions. It is found that the area between two empirical cumulative distribution functions is the difference of the two sample means, and its integral is also the difference of the two population means.
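
The identity stated in the abstract above, that the signed area between two empirical distribution functions equals the difference of the sample means, can be checked numerically. The sketch below integrates the step functions exactly over the pooled sample points; the function name and the data are hypothetical.

```python
from bisect import bisect_right


def ecdf_area_difference(x, y):
    """Signed area between two ECDFs: the integral of F_y(t) - F_x(t) over t."""
    xs, ys = sorted(x), sorted(y)
    grid = sorted(xs + ys)
    area = 0.0
    # Both ECDFs are constant on each interval [left, right) between pooled points.
    for left, right in zip(grid, grid[1:]):
        f_x = bisect_right(xs, left) / len(xs)
        f_y = bisect_right(ys, left) / len(ys)
        area += (f_y - f_x) * (right - left)
    return area


# Hypothetical samples with means 5 and 4
x = [2.0, 4.0, 9.0]
y = [1.0, 3.0, 5.0, 7.0]
area = ecdf_area_difference(x, y)
mean_difference = sum(x) / len(x) - sum(y) / len(y)
```

Up to floating-point rounding, `area` and `mean_difference` agree. The sign convention is that a sample with the larger mean has the smaller distribution function, so the integrand is F_y minus F_x when measuring the mean of x minus the mean of y.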

Nonparametric Inference for the Recurrent Event Data with Incomplete Observation Gaps

  • Kim, Jin-Heum;Nam, Chung-Mo;Kim, Yang-Jin
    • The Korean Journal of Applied Statistics
    • /
    • v.25 no.4
    • /
    • pp.621-632
    • /
    • 2012
  • Recurrent event data arise frequently in longitudinal studies such as clinical trials, reliability studies, and the social sciences; however, a few observations temporarily drop out of sight during follow-up and then reappear without notice, as in the Young Traffic Offenders Program (YTOP) data collected by Farmer et al. (2000). In this article we focus on inference for the cumulative mean function of recurrent event data with such incomplete observation gaps. Defining the corresponding risk set would be easy if we knew the exact intervals in which the observation gaps occur. However, when the gaps are incomplete (their starting times are known but their terminating times are unknown), we need to estimate a distribution function for the terminating times of the observation gaps. To do this, we treated the terminating times as interval-censored and estimated their distribution using the EM algorithm proposed by Turnbull (1976). We proposed a nonparametric estimator for the cumulative mean function and a nonparametric test to compare the cumulative mean functions of two groups. Through simulation we investigated the finite-sample performance of the proposed estimator and test. Finally, we applied the proposed methods to the YTOP data.

Long-term Driving Data Analysis of Hybrid Electric Vehicle

  • Woo, Ji-Young;Yang, In-Beom
    • Journal of the Korea Society of Computer and Information
    • /
    • v.23 no.3
    • /
    • pp.63-70
    • /
    • 2018
  • In this work, we analyze the relationship between the accumulated mileage of a hybrid electric vehicle (HEV) and the data provided by its parts. Data were collected while traveling over 70,000 km on various routes. The data, collected every second, are aggregated over 10-minute windows and characterized in terms of centrality, variability, normality, and so on. We examined whether the statistical properties of the vehicle parts differ across cumulative mileage intervals of a hybrid car. When the cumulative mileage is categorized into ≤ 30,000, ≤ 50,000, and > 50,000, the mileage interval is classified from the statistical properties with 82.3% accuracy. This indicates that if data from the vehicle parts are collected by operating the hybrid vehicle for 10 minutes, the cumulative mileage interval of the vehicle can be estimated. This makes it possible to detect abnormality of a vehicle part relative to the accumulated mileage. It can be used to detect abnormal aging of vehicle parts and to signal the need for maintenance.

Mode identifiability of a multi-span cable-stayed bridge utilizing stabilization diagram and singular values

  • Goi, Y.;Kim, C.W.
    • Smart Structures and Systems
    • /
    • v.17 no.3
    • /
    • pp.391-411
    • /
    • 2016
  • This study investigates the mode identifiability of a multi-span cable-stayed bridge in a benchmark study, using stabilization diagrams of a system model identified with stochastic subspace identification (SSI). Cumulative contribution ratios (CCRs) estimated from the singular values of system models under different wind conditions were also considered. Observations revealed that wind speed might influence the identifiability of a specific mode of a cable-stayed bridge. Moreover, the cumulative contribution ratio showed that the time histories monitored during strong winds, such as those of a typhoon, can be modeled with a lower system order than those monitored under weak winds. The blind data Acc 1 and Acc 2 were categorized as data obtained under a typhoon. Blind data Acc 3 and Acc 4 were categorized as data obtained under wind conditions of critical wind speeds around 7.5 m/s. Finally, blind data Acc 5 and Acc 6 were categorized as data measured under weak wind conditions.

Classify and Quantify Cumulative Impact of Change Orders On Productivity Using ANN Models

  • Lee, Min-Jae
    • Korean Journal of Construction Engineering and Management
    • /
    • v.6 no.5 s.27
    • /
    • pp.69-77
    • /
    • 2005
  • Change is inevitable and is a reality of construction projects. Most construction contracts include change clauses that allow contractors an equitable adjustment to the contract price and duration caused by change. However, the actions of a contractor can cause a loss of productivity and can furthermore disrupt the whole project because of a cumulative or ripple effect. Because of its complicated nature, determining the cumulative impact (ripple effect) caused by single or multiple change orders is a complex issue. Furthermore, owners and contractors do not always agree on the adjusted contract price for the cumulative impact of the changes. A number of studies have attempted to quantify the impact of change orders on project costs and schedule. Many of these attempted to develop regression models to quantify the loss. However, regression analysis has shortcomings in dealing with many qualitative or noisy input data. This study develops ANN models to classify and quantify the labor productivity losses that are caused by the cumulative impact of change orders. The results show that ANN models give significantly improved performance compared to traditional statistical models.