• Title/Summary/Keyword: Pooled data

BaSDAS: a web-based pooled CRISPR-Cas9 knockout screening data analysis system

  • Park, Young-Kyu;Yoon, Byoung-Ha;Park, Seung-Jin;Kim, Byung Kwon;Kim, Seon-Young
    • Genomics & Informatics
    • /
    • v.18 no.4
    • /
    • pp.46.1-46.4
    • /
    • 2020
  • We developed the BaSDAS (Barcode-Seq Data Analysis System), a GUI-based pooled knockout screening data analysis system, to facilitate the analysis of pooled knockout screen data easily and effectively by researchers with limited bioinformatics skills. The BaSDAS supports the analysis of various pooled screening libraries, including yeast, human, and mouse libraries, and provides many useful statistical and visualization functions with a user-friendly web interface for convenience. We expect that BaSDAS will be a useful tool for the analysis of genome-wide screening data and will support the development of novel drugs based on functional genomics information.

A pooled Bayes test of independence using restricted pooling model for contingency tables from small areas

  • Jo, Aejeong;Kim, Dal Ho
    • Communications for Statistical Applications and Methods
    • /
    • v.29 no.5
    • /
    • pp.547-559
    • /
    • 2022
  • For a chi-squared test, which is a statistical method used to test the independence of a contingency table of two factors, the expected frequency of each cell must be greater than 5. The percentage of cells with an expected frequency below 5 must be less than 20% of all cells. However, there are many cases in which the regional expected frequency is below 5 in general small area studies. Even in large-scale surveys, it is difficult to forecast the expected frequency to be greater than 5 when there is small area estimation with subgroup analysis. Another statistical method to test independence is to use the Bayes factor, but since there is a high ratio of data dependency due to the nature of the Bayesian approach, the low expected frequency tends to decrease the precision of the test results. To overcome these limitations, we will borrow information from areas with similar characteristics and pool the data statistically to propose a pooled Bayes test of independence in target areas. Jo et al. (2021) suggested hierarchical Bayesian pooling models for small area estimation of categorical data, and we will introduce the pooled Bayes factors calculated by expanding their restricted pooling model. We applied the pooled Bayes factors using bone mineral density and body mass index data from the Third National Health and Nutrition Examination Survey conducted in the United States and compared them with chi-squared tests often used in tests of independence.
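
The expected-frequency condition described in this abstract can be checked directly from a contingency table. The following is a minimal sketch, not the authors' code: the function names and the example table are hypothetical, and the rule is applied as the share of cells whose expected count falls below 5.

```python
def expected_counts(table):
    """Expected cell counts under independence: row total * column total / n."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


def small_cell_share(table, threshold=5.0):
    """Fraction of cells whose expected count is below the threshold."""
    cells = [e for row in expected_counts(table) for e in row]
    return sum(e < threshold for e in cells) / len(cells)


# Hypothetical 2x3 table for one small area (made-up counts).
table = [[3, 2, 1],
         [4, 6, 5]]
share = small_cell_share(table)
# The rule of thumb quoted in the abstract fails when this share reaches 20%.
chi_squared_rule_ok = share < 0.20
```

With sparse tables like this one the share is large, which is the situation in which the abstract proposes borrowing information from similar areas.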

Empirical Analysis on Agent Costs against Ownership Structure in Accordance with Verification of Suitability of the Model (모형의 적합성 검증에 따른 소유구조대비 대리인 비용의 실증분석)

  • Kim, Dae-Lyong;Lim, Kee-Soo;Sung, Sang-Hyeon
    • Journal of the Korea Academia-Industrial cooperation Society
    • /
    • v.13 no.8
    • /
    • pp.3417-3426
    • /
    • 2012
  • This study aims to determine, through empirical analysis, how ownership structure (share-holding ratio of insiders, foreigners) affects agent costs (the portion of asset efficiency or non-operating expenses). Existing studies on the correlation between ownership structure and agent costs adopted the Pooled OLS Model, and this study starts from the premise that the Pooled OLS Model is not reliable enough for large panel data. To improve the credibility and statistical validity of the empirical results, the study therefore verified the suitability of the Pooled OLS Model and, based on that verification, additionally formulated a Fixed Effect Model and a Random Effect Model, which reflect time and corporate effects, before the comparative analysis. The data cover the 10 years from 1998 to 2007, after the IMF crisis hit the nation, for 331 companies excluding financial institutions. The verification of model suitability determined that the Random Effect Model is appropriate for asset efficiency among the agent cost items, whereas the Fixed Effect Model is appropriate for non-operating costs. In the empirical analysis under the appropriate model, no hypothesis adopted under the Pooled OLS Model was accepted. Because different empirical results are produced depending on the type of empirical analysis, this suggests that developing an appropriate model is more important than other factors for generating statistically significant empirical results.
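
Why a pooled OLS fit and a fixed-effect fit can disagree is easy to see on a toy panel. The sketch below is illustrative only and is not the study's model: it assumes a single regressor, uses made-up data for two hypothetical firms, and implements the fixed-effect estimate as the usual within (firm-demeaned) slope.

```python
from collections import defaultdict
from statistics import mean


def pooled_slope(x, y):
    """OLS slope that ignores which firm each observation belongs to."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def within_slope(ids, x, y):
    """Fixed-effect (within) slope: demean x and y inside each firm first."""
    by_id = defaultdict(list)
    for i, a, b in zip(ids, x, y):
        by_id[i].append((a, b))
    sxy = sxx = 0.0
    for obs in by_id.values():
        mx = mean([a for a, _ in obs])
        my = mean([b for _, b in obs])
        sxy += sum((a - mx) * (b - my) for a, b in obs)
        sxx += sum((a - mx) ** 2 for a, _ in obs)
    return sxy / sxx


# Made-up panel: within each firm y rises with x, but firm "A" has a much
# higher intercept and lower x values than firm "B".
ids = ["A", "A", "A", "B", "B", "B"]
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [11.0, 12.0, 13.0, 4.0, 5.0, 6.0]

b_pooled = pooled_slope(x, y)       # negative: firm effects are confounded with x
b_within = within_slope(ids, x, y)  # positive: firm effects are removed
```

In this constructed case the pooled slope even has the wrong sign, which is the kind of discrepancy that motivates testing model suitability before interpreting coefficients.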

Calculating Sample Variance for the Combined Data (두 자료들의 평균과 분산을 이용한 혼합자료의 분산 계산)

  • Shin, Mi-Young;Cho, Tae-Kyoung
    • The Korean Journal of Applied Statistics
    • /
    • v.21 no.1
    • /
    • pp.177-182
    • /
    • 2008
  • There are times when we need more samples to achieve a more accurate estimator. Since the two sets of samples carry information about the same population, it is necessary to treat them as a single combined data set. In this paper we present the exact sample variance for the combined data when only the sample mean and variance of each data set are known, without the raw data. It is shown that the pooled variance $s^2_p$ is always greater than the exact variance $s^2_t$ when ${\bar{x}}_n = {\bar{y}}_m$. Moreover, the larger the difference between the two sample means, ${\bar{x}}_n-{\bar{y}}_m$, the larger the difference between $s^2_p$ and $s^2_t$.
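
The two quantities compared in this abstract can be computed from summary statistics alone. The sketch below uses the standard decomposition of the total sum of squares into within-sample and between-sample parts; the function names and example data are hypothetical, and the code is not taken from the paper.

```python
from statistics import mean, variance


def pooled_variance(n, var_x, m, var_y):
    """Pooled variance s_p^2: weighted average of the two sample variances."""
    return ((n - 1) * var_x + (m - 1) * var_y) / (n + m - 2)


def combined_variance(n, mean_x, var_x, m, mean_y, var_y):
    """Sample variance s_t^2 of the merged data, from summary statistics only."""
    within = (n - 1) * var_x + (m - 1) * var_y
    between = n * m / (n + m) * (mean_x - mean_y) ** 2
    return (within + between) / (n + m - 1)


# Made-up samples with different means.
x = [1.0, 2.0, 3.0, 4.0]
y = [2.0, 4.0, 6.0]
s2_t = combined_variance(len(x), mean(x), variance(x),
                         len(y), mean(y), variance(y))

# Made-up samples with equal means (both average 2).
u = [1.0, 2.0, 3.0]
v = [0.0, 2.0, 4.0]
s2_p_equal = pooled_variance(len(u), variance(u), len(v), variance(v))
s2_t_equal = combined_variance(len(u), mean(u), variance(u),
                               len(v), mean(v), variance(v))
```

Here `s2_t` agrees with `variance(x + y)` computed from the raw data, and in the equal-means case the between-sample term vanishes, so `s2_p_equal` exceeds `s2_t_equal` only through the different divisors.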

Interval Estimation of the Difference of two Population Proportions using Pooled Estimator

  • Hong, Chong-Sun
    • Communications for Statistical Applications and Methods
    • /
    • v.9 no.2
    • /
    • pp.389-399
    • /
    • 2002
  • To examine whether the difference between two point estimates of population proportions is statistically significant, data analysts use two techniques. The first is to examine the overlap between the two associated confidence intervals. The second is the significance test introduced in most statistical textbooks under the common assumptions of consistency, asymptotic normality, and asymptotic independence of the estimates. Under the null hypothesis that the two population proportions are equal, the pooled estimator of the population proportion is preferred as a point estimator, since the two independent random samples are considered to be collected from one population. Hence, as an alternative method, we can obtain another confidence interval for the difference of the population proportions using the pooled estimate. We conclude that, among the three methods and taking the proposed method as the basis, the overlap method is under-estimated and the difference-of-proportions method is over-estimated.
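
To make the role of the pooled estimate concrete, the sketch below computes a normal-approximation (Wald-type) interval for the difference of two proportions, once with the usual unpooled standard error and once with a standard error built from the pooled proportion. This is an assumed textbook-style construction for illustration; the abstract does not give the paper's exact interval, and the function name and counts are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def diff_ci(x1, n1, x2, n2, level=0.95, pooled=False):
    """Normal-approximation interval for p1 - p2 from success counts x1, x2."""
    p1, p2 = x1 / n1, x2 / n2
    z = NormalDist().inv_cdf(0.5 + level / 2)
    if pooled:
        # One common proportion estimated from both samples together.
        p = (x1 + x2) / (n1 + n2)
        se = sqrt(p * (1 - p) * (1 / n1 + 1 / n2))
    else:
        se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    d = p1 - p2
    return d - z * se, d + z * se


# Made-up counts: 40 of 100 successes versus 30 of 100.
unpooled_lo, unpooled_hi = diff_ci(40, 100, 30, 100)
pooled_lo, pooled_hi = diff_ci(40, 100, 30, 100, pooled=True)
```

Both intervals are centred on the observed difference; only the standard error changes. With equal sample sizes the pooled standard error is never smaller than the unpooled one.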

Forecast and Demand Analysis of Oyster as Kimchi's Ingredients (김장굴의 수요 분석 및 예측)

  • Nam, Jong-Oh;Nho, Seung-Guk
    • The Journal of Fisheries Business Administration
    • /
    • v.42 no.2
    • /
    • pp.69-83
    • /
    • 2011
  • This paper estimates demand functions for oysters as a kimchi ingredient in the capital area, in other areas excluding the capital area, and in Korea as a whole, in order to forecast demand quantities for 2011~2015. To estimate the oyster demand functions, the paper uses pooled data collected from Korean housewives over 30 years old in 2009 and 2010, and adopts econometric methods such as Ordinary Least Squares (OLS) and Feasible Generalized Least Squares (FGLS). First, to choose appropriate variables for the demand function of each area, the paper specifies the models using joint significance tests. Second, to check for heteroscedasticity in the pooled data, it plots the estimated squared residuals against the estimated dependent variable and, where heteroscedasticity appears, applies the White test to address the problem. Third, to check for multicollinearity in the pooled data, it examines the correlations between variables by area. In this analysis, the oyster demand functions of the capital area and the whole area require the price of oysters, the price of cabbage for Gimjang, and income as independent variables, whereas the function for other areas excluding the capital area requires only the price of oysters and income. In addition, the demand function of the whole area required the White test to address heteroscedasticity, while the demand functions of the other two regions did not show the problem. Thus, the whole-area model was estimated by FGLS and the other two models by OLS. The results suggest that oyster demand per household as a kimchi ingredient will increase slightly in the capital area and the whole area, but decrease slightly in other areas excluding the capital area, in 2011~2015. The results also show that total household oyster demand as a kimchi ingredient, for households of housewives over 30 years old, will increase slightly in all three areas in 2011~2015.

Constructing Simultaneous Confidence Intervals for the Difference of Proportions from Multivariate Binomial Distributions

  • Jeong, Hyeong-Chul;Kim, Dae-Hak
    • The Korean Journal of Applied Statistics
    • /
    • v.22 no.1
    • /
    • pp.129-140
    • /
    • 2009
  • In this paper, we consider simultaneous confidence intervals for the difference of proportions between two groups taken from multivariate binomial distributions in a nonparametric way. We briefly discuss the construction of simultaneous confidence intervals using the method of adjusting the p-values in multiple tests. The features of bootstrap simultaneous confidence intervals using non-pooled samples are presented. We also compute confidence intervals from the adjusted p-values of multiple tests in the Westfall (1985) style based on a pooled sample. The average coverage probabilities of the bootstrap simultaneous confidence intervals are compared with those of the Bonferroni simultaneous confidence intervals and the Sidak simultaneous confidence intervals. Finally, we give an example that shows how the proposed bootstrap simultaneous confidence intervals can be utilized through data analysis.
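
The Bonferroni and Sidak intervals used as comparisons in this abstract both work by tightening the per-comparison error level. The sketch below shows only that adjustment and the resulting normal critical values; it is an assumed illustration with hypothetical function names, and it does not reproduce the bootstrap procedure proposed in the paper.

```python
from statistics import NormalDist


def bonferroni_level(alpha, k):
    """Per-comparison level so that k intervals have family-wise error <= alpha."""
    return alpha / k


def sidak_level(alpha, k):
    """Per-comparison level giving family-wise error alpha under independence."""
    return 1.0 - (1.0 - alpha) ** (1.0 / k)


def critical_z(per_comparison_alpha):
    """Two-sided normal critical value for one interval."""
    return NormalDist().inv_cdf(1.0 - per_comparison_alpha / 2.0)


# Hypothetical setting: 5 simultaneous intervals at family-wise level 0.05.
alpha, k = 0.05, 5
z_bonf = critical_z(bonferroni_level(alpha, k))
z_sidak = critical_z(sidak_level(alpha, k))
```

The Sidak level is slightly larger than the Bonferroni level, so its critical value is slightly smaller and the resulting intervals are slightly narrower.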

Pooled Analysis of the Cow's Milk-related-Symptom-Score (CoMiSS™) as a Predictor for Cow's Milk Related Symptoms

  • Vandenplas, Yvan;Steenhout, Philippe;Jarvi, Anette;Garreau, Anne-Sophie;Mukherjee, Rajat
    • Pediatric Gastroenterology, Hepatology & Nutrition
    • /
    • v.20 no.1
    • /
    • pp.22-26
    • /
    • 2017
  • Purpose: The diagnosis of cow's milk (CM) allergy is a challenge. The Cow's Milk-related-Symptom-Score (CoMiSS™) was developed to offer primary health care providers a reliable diagnostic tool for CM related symptoms. The predictive prospective value of the CoMiSS™ was evaluated in three clinical trials. Methods: Pooled analyses of the three studies were conducted based on regressing the results of the month-1 challenge test on the month-1 CoMiSS™, adjusting for baseline CoMiSS™ using a logistic regression model. In addition, a logistic regression model was also fitted to the month-1 challenge test result with the change in CoMiSS™ from baseline as a predictor. Results: Results suggest that infants having a low CoMiSS™ (median, 5) after 1 month dietary treatment free from intact CM protein have a significant risk of having a positive challenge test (odds ratio, 0.83; 95% confidence interval, 0.75-0.93; p=0.002). Pooled data suggest that the change in CoMiSS™ from baseline to month-1 can predict CM related symptoms as a confirmed diagnosis according to the challenge test at month-1. However, in order to validate such a tool, infants without CM related symptoms would also need to be enrolled in a validation trial. A concern is that it may not be ethical to expose healthy infants to a therapeutic formula and a challenge test. Conclusion: Pooled data analysis emphasizes that the CoMiSS™ has the potential to be of interest in infants suspected to have CM-related-symptoms. A prospective validation trial is needed.

Is Stent-Assisted Coil Embolization for the Treatment of Ruptured Blood Blister-Like Aneurysms of the Supraclinoid Internal Carotid Artery Effective? : An Analysis of Single Institutional Experience with Pooled Data

  • Roh, Haewon;Kim, Junwon;Suh, Sang-il;Kwon, Taek-Hyun;Yoon, Wonki
    • Journal of Korean Neurosurgical Society
    • /
    • v.64 no.2
    • /
    • pp.217-228
    • /
    • 2021
  • Objective : Given the high risk of rebleeding and recurrence of blood blister-like aneurysms (BBAs), we treated ruptured BBAs of the internal carotid artery (ICA) with stent-assisted coil embolization (SAC). This study aimed to evaluate the efficacy and safety of SACs. Methods : We retrospectively reviewed clinical and radiological data from eight patients with ruptured BBAs of the supraclinoid ICA. The modified Rankin Scale (mRS) was used to assess clinical outcomes, while radiological outcomes were evaluated on angiographs. For a pooled analysis, data from literature reporting the outcomes of ruptured BBAs treated with SAC were collected and analyzed in conjunction with our data. Results : In our cohort, the mean Raymond classification score was 1.57±0.53 immediately after initial endovascular treatment. There were no perioperative complications or rebleeding events during the follow-up period. The mean mRS score at patient discharge was 1.00±0.81 and improved to 0.28±0.48 by the last follow-up day. The recurrence rate was 25% with an asymptomatic presentation and successful treatment with multiple stent insertion. Pooled analysis of 76 cases of SAC revealed a complete occlusion rate immediately after treatment of 54.8%, rebleeding rate 7.94%, and recurrence rate 24.2%. Good clinical outcomes with mRS score 0-2 were observed in 89.9% by the last clinical follow-up. Total mortality rate was 7.7%. Conclusion : This treatment appears to not only minimize the hemodynamic burden on the fragile dome specific to this type of aneurysm, but also provides an opportunity for safe and effective treatment in recurrent cases.

An analysis of the effect of the inequality of income to the inequality of health: Using Panel Analysis of the OECD Health data from 1980 to 2013

  • Lee, Hun-Hee;Lee, Jung-Seo
    • Journal of the Korea Society of Computer and Information
    • /
    • v.22 no.10
    • /
    • pp.145-150
    • /
    • 2017
  • This study analyzes panel data from 34 years of OECD Health data to examine how significant income inequality is for health inequality. The data come from the OECD's pooled Health data for 32 countries from 1980 to 2013. The analysis model was determined as follows. First, we examined the means and standard deviations of the variables through descriptive statistics. Second, a Lagrange multiplier test was performed. Third, through the F-test, we compared the least squares method and the Fixed Effect Model. Lastly, using the Hausman test, we determined the proper model and used it to examine the effective factors. As a result, the Fixed Effect Model, rather than the Pooled OLS Model, was shown to be effective for taking the characteristics of individuals in the panel into account. The results are as follows. First, as the relative poverty rate ($\beta=-19.264$, p<.01) grows, people's life expectancy decreases. Second, as the smoking rate ($\beta=-.125$, p<.05) and the unemployment rate ($\beta=-.081$, p<.01) grow, people's life expectancy decreases. Third, as health expenditure ($\beta=.414$, p<.01) takes up a larger share of GDP and as the number of hospital beds ($\beta=-.190$, p<.05) grows, people's life expectancy increases.