• Title/Summary/Keyword: Pseudo data

A Flexible Modeling Approach for Current Status Survival Data via Pseudo-Observations

  • Han, Seungbong;Andrei, Adin-Cristian;Tsui, Kam-Wah
    • The Korean Journal of Applied Statistics
    • /
    • v.25 no.6
    • /
    • pp.947-958
    • /
    • 2012
  • When modeling event times in biomedical studies, the outcome might be incompletely observed. In this paper, we assume that the outcome is recorded as current status failure time data. Despite a well-developed literature, the routine practical use of many current status data modeling methods remains infrequent due to the lack of specialized statistical software, the difficulty of assessing model goodness-of-fit, and the possible loss of information caused by covariate grouping or discretization. We propose a model based on pseudo-observations that is convenient to implement and that allows for flexibility in the choice of the outcome. Parameter estimates are obtained based on generalized estimating equations. Examples from studies in bile duct hyperplasia and breast cancer in conjunction with simulated data illustrate the practical advantages of this model.
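
The abstract does not say how the pseudo-observations are built. As a rough illustration only, the sketch below uses the generic jackknife construction (n times the full-sample estimate minus n-1 times the leave-one-out estimate), which is an assumption about the approach rather than the authors' procedure. The function name, the toy indicators, and the use of a plain proportion as the estimator are all hypothetical stand-ins.

```python
from statistics import fmean


def jackknife_pseudo_observations(sample, estimator):
    """Return one pseudo-observation per subject.

    pseudo_i = n * theta_hat(all) - (n - 1) * theta_hat(all except i)
    """
    n = len(sample)
    theta_full = estimator(sample)
    pseudo = []
    for i in range(n):
        leave_one_out = sample[:i] + sample[i + 1:]
        pseudo.append(n * theta_full - (n - 1) * estimator(leave_one_out))
    return pseudo


# Hypothetical current status indicators (made-up data):
# 1 if the event had already occurred at the inspection time, else 0.
status = [1, 0, 0, 1, 1, 0, 1, 0]

# The sample proportion is a stand-in for a real current status estimator.
pseudo_obs = jackknife_pseudo_observations(status, fmean)
```

With the sample mean as the estimator, each pseudo-observation reduces to the subject's own value, which makes the construction easy to check by hand. In a regression setting the pseudo-observations would then serve as responses in generalized estimating equations, as the abstract describes.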

The Integration of GIS with LANDSAT TM Data for Ground Water Potential Area Mapping (I) - Extraction of the Ground Water Potential Area using LANDSAT TM Data - (지하수 부존 가능지역 추출을 위한 LANDSAT TM 자료와 GIS의 통합(I) - LANDSAT TM 자료에 의한 지하수 부존 가능지역 추출 -)

  • 지종훈
    • Korean Journal of Remote Sensing
    • /
    • v.7 no.1
    • /
    • pp.29-43
    • /
    • 1991
  • The study was performed to extract the ground water potential area using LANDSAT TM data. The image processing techniques developed for the study are contrast transformation, differential filtering and pseudo stereoscopic image methods. These were examined for lineament extraction, lineament interpretation and the integration of vector data with LANDSAT data. The differential filtering method is very useful for lineament extraction, and lineaments in all directions are clearly shown on the band 5 image of LANDSAT TM. The pseudo stereoscopic images are made using the color differential method, and the image pairs are useful for lineament interpretation. The results of the analysis are as follows. 1) There is a close correlation between lineaments and cased wells in the study area, because 33 of the 45 developed cased wells coincide with the lineaments. 2) 21 sites in the study area were selected for pumping tests, and 11 of them produced more than 200 ton/day.

Design of Thin-Client Framework for Application Sharing & Optimization of Data Access (애플리케이션 공유 및 데이터 접근 최적화를 위한 씬-클라이언트 프레임워크 설계)

  • Song, Min-Gyu
    • Journal of Korea Society of Industrial Information Systems
    • /
    • v.14 no.5
    • /
    • pp.19-32
    • /
    • 2009
  • In this paper, we design a thin-client framework capable of application sharing and data access over the Internet, applying related technologies such as the X Window System, a pseudo server, the CODA file system, and MPI (Message Passing Interface). We propose a framework that lets the thin client optimally access data produced by work on a server and run server-side applications, even when the network is down. In addition, all local computing changes need to be reflected on the remote server when the network is restored. To design a thin-client framework with these characteristics, we apply a distributed pseudo server and the CODA file system to our framework, and use MPI for more efficient computing and management. This allows a network-independent computing environment to be implemented for the thin client, and provides a scalable application service to numerous users by eliminating the bottleneck caused by server overload. We discuss the implementation method of the thin-client framework in detail.

Rhipe Platform for Big Data Processing and Analysis (빅데이터 처리 및 분석을 위한 Rhipe 플랫폼)

  • Jung, Byung Ho;Shin, Ji Eun;Lim, Dong Hoon
    • The Korean Journal of Applied Statistics
    • /
    • v.27 no.7
    • /
    • pp.1171-1185
    • /
    • 2014
  • Rhipe, which integrates R and the Hadoop environment, makes it possible to process and analyze massive amounts of data in a distributed processing environment. In this paper, we implemented multiple regression analysis using Rhipe with various sizes of actual and simulated data. Experimental results comparing the computing speeds of the pseudo-distributed and fully-distributed modes of a Hadoop cluster showed that the fully-distributed mode was faster than the pseudo-distributed mode, and that the fully-distributed mode became faster as the number of data nodes increased. We also compared the performance of Rhipe with the stats and biglm packages available on bigmemory. The results showed that Rhipe was faster than the other packages owing to parallel processing, with the number of map tasks increasing as the size of the data increases.

The Optimal Seeding Quantity of Lespedeza cyrtobotrya Miquel and Indigofera pseudo-tinctoria MATSUMURA as Leguminous Woody Plants for the Cut-slope Revegetation (훼손비탈면 복원을 위한 콩과목본류로서 참싸리 및 낭아초의 적정파종량에 관한 연구)

  • Yu, Byeong-Deuk;Shim, Sang-Ryul
    • Journal of the Korean Society of Environmental Restoration Technology
    • /
    • v.19 no.1
    • /
    • pp.61-71
    • /
    • 2016
  • The purpose of this research is to identify the optimal seeding quantity of Lespedeza cyrtobotrya and Indigofera pseudo-tinctoria as leguminous woody plants for cut-slope revegetation. To investigate the coverage ratio and appearance frequency, we divided Lespedeza cyrtobotrya and Indigofera pseudo-tinctoria into five treatment groups with seeding quantities of $0.0g/m^2$, $1.0g/m^2$, $2.4g/m^2$, $3.8g/m^2$, and $5.2g/m^2$. For each treatment group, we mixed in identical quantities of seeds of herbaceous flowers (Lotus corniculatus var. japonicus, Dianthus sinensis, Aster yomena and Pennisetum alopecuroides) and cool-season turfgrasses (Festuca arundinacea and Poa pratensis). As the seeding quantity of Lespedeza cyrtobotrya and Indigofera pseudo-tinctoria in the spray increased, the coverage ratio of leguminous woody plants tended to increase, whereas the coverage ratio of herbaceous flowers and cool-season turfgrasses decreased. However, when the seeding quantity of Lespedeza cyrtobotrya and Indigofera pseudo-tinctoria exceeded $3.8g/m^2$, the coverage ratio of leguminous woody plants decreased compared with the four treatment groups seeded at lower quantities. Based on the longitudinal coverage ratio data for the five treatment groups, we observed a gradual short-term succession in which the dominant species shifted in the following order: first, cool-season turfgrasses; second, herbaceous flowers; third, leguminous woody plants. Comparing the appearance frequency of the two species, Lespedeza cyrtobotrya appeared more frequently in 2014, whereas Indigofera pseudo-tinctoria appeared relatively more frequently in 2015. As a result, Indigofera pseudo-tinctoria was found to be a dominant species among the woody plants. In this study, we observed that the optimal seeding quantity of Lespedeza cyrtobotrya and Indigofera pseudo-tinctoria was $2.4g/m^2{\sim}3.8g/m^2$. Moreover, a coverage ratio of 29.1% to 35.4% and an appearance frequency of 4.6 to 5.8 plants were found at the optimal seeding quantity.

Revisiting the Bradley-Terry model and its application to information retrieval

  • Jeon, Jong-June;Kim, Yongdai
    • Journal of the Korean Data and Information Science Society
    • /
    • v.24 no.5
    • /
    • pp.1089-1099
    • /
    • 2013
  • The Bradley-Terry model is widely used for the analysis of pairwise preference data. We explain that the popularity of the Bradley-Terry model is due not only to its easy computation but also to some nice asymptotic properties when the model is misspecified. For information retrieval, which requires analyzing big ranking data, we propose to use a pseudo likelihood based on the Bradley-Terry model even when the true model is different from the Bradley-Terry model. We justify using the Bradley-Terry model by proving that the estimated ranking based on the proposed pseudo likelihood is consistent when the true model belongs to the class of Thurstone models, which is much bigger than the Bradley-Terry model.
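
To make the "easy computation" concrete, here is a minimal sketch of fitting Bradley-Terry strengths from pairwise win counts. It uses a standard fixed-point (minorization-maximization style) update, which is a common way to fit the model and not necessarily the authors' procedure. The function names and the win counts are hypothetical, and the sketch assumes every item has at least one win.

```python
def fit_bradley_terry(wins, iterations=200):
    """Estimate Bradley-Terry strengths from a matrix of win counts.

    wins[i][j] is the number of times item i was preferred over item j.
    Returns strengths normalized to sum to 1.
    """
    k = len(wins)
    strength = [1.0] * k
    for _ in range(iterations):
        updated = []
        for i in range(k):
            total_wins = sum(wins[i][j] for j in range(k) if j != i)
            # Each pair contributes (number of comparisons) / (sum of strengths).
            denom = sum(
                (wins[i][j] + wins[j][i]) / (strength[i] + strength[j])
                for j in range(k)
                if j != i
            )
            updated.append(total_wins / denom)
        norm = sum(updated)
        strength = [s / norm for s in updated]
    return strength


def preference_probability(strength, i, j):
    """Model probability that item i is preferred over item j."""
    return strength[i] / (strength[i] + strength[j])


# Hypothetical win counts for three items (made-up data).
wins = [
    [0, 7, 9],
    [3, 0, 6],
    [1, 4, 0],
]
strength = fit_bradley_terry(wins)
ranking = sorted(range(len(strength)), key=lambda i: strength[i], reverse=True)
```

Sorting the items by estimated strength gives the estimated ranking, which is the quantity whose consistency the abstract discusses.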

Censored varying coefficient regression model using Buckley-James method

  • Shim, Jooyong;Seok, Kyungha
    • Journal of the Korean Data and Information Science Society
    • /
    • v.28 no.5
    • /
    • pp.1167-1177
    • /
    • 2017
  • Censored regression using the pseudo-response variable proposed by Buckley and James has been one of the most well-known models. Recently, the varying coefficient regression model has received a great deal of attention as an important tool for modeling. In this paper we propose a censored varying coefficient regression model using the Buckley-James method to consider situations where the regression coefficients of the model are not constant but change as the smoothing variables change. By using the formulation of the least squares support vector machine (LS-SVM), the coefficient estimators of the proposed model can be easily obtained from simple linear equations. Furthermore, a generalized cross validation function can be easily derived. We evaluated the proposed method and demonstrated its adequacy through simulated data sets and real data sets.

Bootstrap Simulation for Performance Evaluation of Optical Multifiber Connectors (붓스크랩 기법을 이용한 다심 광커넥터 손실특성 예측)

  • 전오곤;강기훈
    • Journal of Korean Society for Quality Management
    • /
    • v.26 no.4
    • /
    • pp.250-264
    • /
    • 1998
  • The purpose of this thesis is to develop a simulation program for forecasting the performance of optical connectors, which saves time and money in manufacturing them. The optical performance (insertion loss) of an optical connector relies mainly on three misalignment factors: the ferrule factor, due to manufacturing deviations from the design; the auto-centering effect, a phenomenon of fiber behavior between the hole and the fiber; and the fiber misalignment factor. The simulation uses experimental data for the auto-centering effect and the fiber factor, and pseudo data generated from random numbers for the ferrule factor, because the ferrule is still at the development stage. In this study we apply the kernel density estimation method to the experimental data to determine whether they belong to a specific parametric distribution family. We then simulate the insertion loss of an optical multifiber connector under a specific design model using nonparametric bootstrap resampling data and parametric pseudo samples from the uniform distribution. We obtain tolerance specifications for the misalignment factors such that the insertion loss does not exceed a maximum of 1.0dB, and choose the optimal hole diameter.
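
The sketch below illustrates only the sampling scheme the abstract describes: nonparametric bootstrap draws from experimental data combined with parametric pseudo samples from a uniform distribution. The measurement values, the uniform range, the function name, and the rule of simply adding the three offsets are all hypothetical, since the abstract gives neither the data nor the loss model.

```python
import random


def simulate_total_offsets(auto_centering, fiber_offsets, ferrule_range, draws, seed=0):
    """Simulate combined misalignment offsets.

    auto_centering, fiber_offsets: experimental measurements, resampled
        with replacement (nonparametric bootstrap).
    ferrule_range: (low, high) bounds of the uniform pseudo samples.
    """
    rng = random.Random(seed)
    low, high = ferrule_range
    totals = []
    for _ in range(draws):
        a = rng.choice(auto_centering)   # bootstrap draw from experimental data
        f = rng.choice(fiber_offsets)    # bootstrap draw from experimental data
        e = rng.uniform(low, high)       # parametric pseudo sample
        # Hypothetical combination rule: a plain sum of the three offsets.
        totals.append(a + f + e)
    return totals


# Made-up offsets in micrometres, for illustration only.
auto_centering = [0.10, 0.15, 0.12, 0.20, 0.18]
fiber_offsets = [0.05, 0.07, 0.04, 0.09]
totals = simulate_total_offsets(auto_centering, fiber_offsets, (0.0, 0.5), draws=1000)
```

A real study would convert each simulated offset to insertion loss with a physical loss model and then read tolerance limits off the simulated distribution.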

Application of Data Dictionary to BIM for Small and Medium Project (중소규모 사업용 BIM을 위한 데이터 사전의 활용)

  • Lee, Hwan Woo;Lee, Kyung Sub;Kim, Kwang Yang
    • Journal of the Computational Structural Engineering Institute of Korea
    • /
    • v.26 no.6
    • /
    • pp.431-438
    • /
    • 2013
  • The systemization of construction information is required over the whole life cycle of facilities to improve the productivity of the construction industry. BIM (Building Information Modeling), a technology for managing information based on a 3D information model, has been actively suggested as one of the alternatives. However, BIM is currently concentrated on large projects, while small and medium projects based on BIM receive little attention. In small and medium projects, the loss of information is more serious than in large projects. However, it is hard to introduce BIM to small and medium companies due to their lack of investment resources. This study was performed to set up a BIM-based information management system that considers the characteristics of small and medium projects without excessive investment. In this study, pseudo BIM is defined as BIM for small and medium projects, and the concept of pseudo BIM is proposed. The PLIB of ISO and the construction information classification system of MOLIT in Korea are used to construct a data dictionary for pseudo BIM. A pilot test is performed to verify the effectiveness of pseudo BIM.

On a Multiple Data Handling Method under Online Parameter Estimation

  • Takeyasu, Kazuhiro;Amemiya, Takashi;Iino, Katsuhiro;Masuda, Shiro
    • Industrial Engineering and Management Systems
    • /
    • v.1 no.1
    • /
    • pp.64-72
    • /
    • 2002
  • In the field of plant maintenance, data gathered by sensors on multiple machines are handled and analyzed. Online or pseudo online data handling is required in such fields. When the data occurrence speed exceeds the data handling speed, multiple data should be handled at a time (batch data handling or pseudo online data handling). If l data points are received at one time following N data points, how to estimate the new parameters effectively is a great concern. A new simplified calculation method, which calculates the N data's weights, is introduced. Numerical examples show that this new method has fairly good estimation accuracy and that its calculation time is less than 1/10 of that needed when the whole data set is re-calculated. Even when the calculation capacity of the apparatus is limited, the proposed method makes it possible to detect equipment failures at an early stage from a few newly arriving data. This method would be applicable in many data handling fields.
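
The abstract does not give the weighting formula. As a stand-in, the sketch below shows the general idea of absorbing l new data points into a summary of N earlier ones without revisiting the old data. It uses the standard formula for merging means and sums of squared deviations, which is an assumption and not the authors' method. The function name and the sensor readings are hypothetical.

```python
def merge_batch(n_old, mean_old, m2_old, batch):
    """Merge a new batch into a running summary without reprocessing old data.

    n_old:    number of earlier data points (N in the abstract)
    mean_old: mean of the earlier data
    m2_old:   sum of squared deviations of the earlier data from mean_old
    batch:    the newly received data points (l of them)
    """
    batch_size = len(batch)
    batch_mean = sum(batch) / batch_size
    batch_m2 = sum((x - batch_mean) ** 2 for x in batch)

    n_total = n_old + batch_size
    delta = batch_mean - mean_old
    # Old and new means are weighted by N / (N + l) and l / (N + l).
    mean_total = mean_old + delta * batch_size / n_total
    m2_total = m2_old + batch_m2 + delta ** 2 * n_old * batch_size / n_total
    return n_total, mean_total, m2_total


# Made-up sensor readings, for illustration only.
old_data = [2.0, 2.1, 1.9, 2.2, 2.0, 2.1]
new_batch = [2.6, 2.7, 2.5]

n_old = len(old_data)
mean_old = sum(old_data) / n_old
m2_old = sum((x - mean_old) ** 2 for x in old_data)

n_total, mean_total, m2_total = merge_batch(n_old, mean_old, m2_old, new_batch)
variance_total = m2_total / (n_total - 1)
```

The update touches only the l new points plus three stored numbers, which is why this kind of scheme can be much cheaper than recomputing over all N + l points.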