• Title/Abstract/Keywords: Pseudo data

Search results: 793 items (processing time: 0.024 s)

A Flexible Modeling Approach for Current Status Survival Data via Pseudo-Observations

  • Han, Seungbong;Andrei, Adin-Cristian;Tsui, Kam-Wah
    • 응용통계연구
    • /
    • Vol. 25, No. 6
    • /
    • pp.947-958
    • /
    • 2012
  • When modeling event times in biomedical studies, the outcome might be incompletely observed. In this paper, we assume that the outcome is recorded as current status failure time data. Despite a well-developed literature, the routine practical use of many current status data modeling methods remains infrequent due to the lack of specialized statistical software, the difficulty of assessing model goodness-of-fit, and the possible loss of information caused by covariate grouping or discretization. We propose a model based on pseudo-observations that is convenient to implement and that allows for flexibility in the choice of the outcome. Parameter estimates are obtained based on generalized estimating equations. Examples from studies in bile duct hyperplasia and breast cancer, in conjunction with simulated data, illustrate the practical advantages of this model.

지하수 부존 가능지역 추출을 위한 LANDSAT TM 자료와 GIS의 통합(I) - LANDSAT TM 자료에 의한 지하수 부존 가능지역 추출 - (The Integration of GIS with LANDSAT TM Data for Ground Water Potential Area Mapping (I) - Extraction of the Ground Water Potential Area using LANDSAT TM Data -)

  • 지종훈
    • 대한원격탐사학회지
    • /
    • Vol. 7, No. 1
    • /
    • pp.29-43
    • /
    • 1991
  • This study was performed to extract ground water potential areas using LANDSAT TM data. The image processing techniques developed for the study are contrast transformation, differential filtering, and pseudo-stereoscopic imaging. These were examined for lineament extraction, lineament interpretation, and the integration of vector data with LANDSAT data. The differential filtering method is very useful for lineament extraction, and lineaments in all directions are clearly shown on the band 5 image of LANDSAT TM. The pseudo-stereoscopic images are made using a color differential method, and the image pairs are useful for lineament interpretation. The results of the analysis are as follows. 1) There is a close correlation between lineaments and cased wells in the study area: 33 of the 45 developed cased wells coincide with the lineaments. 2) Twenty-one sites in the study area were selected for pumping tests, and 11 of them produce more than 200 ton/day.

애플리케이션 공유 및 데이터 접근 최적화를 위한 씬-클라이언트 프레임워크 설계 (Design of Thin-Client Framework for Application Sharing & Optimization of Data Access)

  • 송민규
    • 한국산업정보학회논문지
    • /
    • Vol. 14, No. 5
    • /
    • pp.19-32
    • /
    • 2009
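  • In this paper, we design a thin-client framework that enables application sharing and data access over the Internet, drawing on the X Window System, virtual servers, the CODA file system, and MPI (Message Passing Interface) as supporting technologies. We propose a thin-client framework in which applications that were running on the server can be run locally even if the network connection is lost, and in which the client has optimized access to data generated by work performed on the server. In addition, when the network is restored, the work performed locally should be reflected effectively on the server. To design such a framework, we combine distributed pseudo servers and CODA file system technology with the existing system, and use MPI for more efficient task execution and management. This makes it possible to build a network-independent thin-client working environment and, by avoiding server bottlenecks, to provide scalable application services to many users. This paper discusses the design of the thin-client framework that forms the basis for this implementation.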

빅데이터 처리 및 분석을 위한 Rhipe 플랫폼 (Rhipe Platform for Big Data Processing and Analysis)

  • 정병호;신지은;임동훈
    • 응용통계연구
    • /
    • Vol. 27, No. 7
    • /
    • pp.1171-1185
    • /
    • 2014
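  • The development of Rhipe, an integrated environment for R and Hadoop, has made large-scale data analysis possible in a distributed processing environment. In this paper, we use Rhipe to implement multiple regression on real and simulated data of various sizes. In a comparison of systems built in Hadoop's pseudo-distributed mode and fully-distributed mode, the fully-distributed system was faster than the pseudo-distributed system, and computation time decreased steadily as the number of data nodes increased. In addition, to evaluate the performance of the proposed Rhipe platform, we compared its processing speed with that of stats, a base R package, and biglm, a package useful with bigmemory. The experiments showed that as the data size grows, Rhipe increases the number of map tasks and, owing to the resulting parallel processing, runs faster than the other packages.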

훼손비탈면 복원을 위한 콩과목본류로서 참싸리 및 낭아초의 적정파종량에 관한 연구 (The Optimal Seeding Quantity of Lespedeza cyrtobotrya Miquel and Indigofera pseudo-tinctoria MATSUMURA as Leguminous Woody Plants for the Cut-slope Revegetation)

  • 유병득;심상렬
    • 한국환경복원기술학회지
    • /
    • Vol. 19, No. 1
    • /
    • pp.61-71
    • /
    • 2016
  • The purpose of this research is to identify the optimal seeding quantity of Lespedeza cyrtobotrya and Indigofera pseudo-tinctoria as leguminous woody plants for cut-slope revegetation. To investigate the coverage ratio and appearance frequency, we divided Lespedeza cyrtobotrya and Indigofera pseudo-tinctoria into five treatment groups with seeding quantities of $0.0g/m^2$, $1.0g/m^2$, $2.4g/m^2$, $3.8g/m^2$, and $5.2g/m^2$. For each treatment group, we mixed in identical quantities of seeds of herbaceous flowers (Lotus corniculatus var. japonicus, Dianthus sinensis, Aster yomena and Pennisetum alopecuroides) and cool-season turfgrasses (Festuca arundinacea and Poa pratensis). As the seeding quantity of Lespedeza cyrtobotrya and Indigofera pseudo-tinctoria in the spray increased, the coverage ratio of leguminous woody plants tended to increase, whereas the coverage ratio of herbaceous flowers and cool-season turfgrasses decreased. However, when the seeding quantity of Lespedeza cyrtobotrya and Indigofera pseudo-tinctoria exceeded $3.8g/m^2$, the coverage ratio of leguminous woody plants instead decreased compared with the four treatment groups seeded at $3.8g/m^2$ or less. Based on the longitudinal coverage-ratio data for the five treatment groups, we observed a gradual short-term succession in which the dominant species shifted in the following order: first, cool-season turfgrasses; second, herbaceous flowers; third, leguminous woody plants. Comparing the appearance frequency of the two species, Lespedeza cyrtobotrya appeared more frequently in 2014, whereas Indigofera pseudo-tinctoria appeared relatively more frequently in 2015. As a result, Indigofera pseudo-tinctoria was found to be a dominant species among the woody plants. In this study, we observed that the optimal seeding quantity of Lespedeza cyrtobotrya and Indigofera pseudo-tinctoria was $2.4g/m^2{\sim}3.8g/m^2$. Moreover, a coverage ratio of 29.1% to 35.4% and an appearance frequency of 4.6 to 5.8 plants were found at the optimal seeding quantity.

Revisiting the Bradley-Terry model and its application to information retrieval

  • Jeon, Jong-June;Kim, Yongdai
    • Journal of the Korean Data and Information Science Society
    • /
    • Vol. 24, No. 5
    • /
    • pp.1089-1099
    • /
    • 2013
  • The Bradley-Terry model is widely used for the analysis of pairwise preference data. We explain that the popularity of the Bradley-Terry model is due not only to its easy computation but also to some nice asymptotic properties when the model is misspecified. For information retrieval, which requires the analysis of big ranking data, we propose to use a pseudo likelihood based on the Bradley-Terry model even when the true model is different from the Bradley-Terry model. We justify using the Bradley-Terry model by proving that the estimated ranking based on the proposed pseudo likelihood is consistent when the true model belongs to the class of Thurstone models, which is much bigger than the Bradley-Terry model.
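
For readers unfamiliar with the model named in this abstract, the following is a minimal sketch of estimating Bradley-Terry strengths from pairwise win counts. It uses the standard minorization-maximization (Zermelo) iteration for the ordinary Bradley-Terry likelihood, not the pseudo-likelihood procedure proposed in the paper. The function name `fit_bradley_terry` and the toy win matrix are illustrative assumptions.

```python
def fit_bradley_terry(wins, iters=200, tol=1e-10):
    """Estimate Bradley-Terry strengths with the classical MM iteration.

    wins[i][j] is the number of times item i beat item j.
    Returns strengths normalized to sum to 1.
    """
    n = len(wins)
    p = [1.0 / n] * n  # start from equal strengths
    for _ in range(iters):
        new_p = []
        for i in range(n):
            # Total wins of item i against all opponents.
            total_wins = sum(wins[i][j] for j in range(n) if j != i)
            # Comparisons with each opponent, weighted by current strengths.
            denom = sum(
                (wins[i][j] + wins[j][i]) / (p[i] + p[j])
                for j in range(n)
                if j != i
            )
            new_p.append(total_wins / denom if denom > 0 else p[i])
        scale = sum(new_p)
        new_p = [x / scale for x in new_p]
        delta = max(abs(a - b) for a, b in zip(new_p, p))
        p = new_p
        if delta < tol:
            break
    return p


# Hypothetical toy data: three items, each pair compared 10 times.
toy_wins = [
    [0, 7, 9],
    [3, 0, 6],
    [1, 4, 0],
]
strengths = fit_bradley_terry(toy_wins)
# Indices sorted from strongest to weakest estimated item.
ranking = sorted(range(len(strengths)), key=lambda i: -strengths[i])
print(strengths, ranking)
```

The iteration assumes every item has at least one win and one loss within a connected set of comparisons; otherwise the maximum likelihood estimate does not exist and the strengths drift toward 0 or 1.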

Censored varying coefficient regression model using Buckley-James method

  • Shim, Jooyong;Seok, Kyungha
    • Journal of the Korean Data and Information Science Society
    • /
    • Vol. 28, No. 5
    • /
    • pp.1167-1177
    • /
    • 2017
  • Censored regression using the pseudo-response variable proposed by Buckley and James has been one of the most well-known models. Recently, the varying coefficient regression model has received a great deal of attention as an important modeling tool. In this paper we propose a censored varying coefficient regression model using the Buckley-James method to handle situations where the regression coefficients of the model are not constant but change as the smoothing variables change. By using the formulation of the least squares support vector machine (LS-SVM), the coefficient estimators of the proposed model can be easily obtained from simple linear equations. Furthermore, a generalized cross validation function can be easily derived. We evaluate the proposed method and demonstrate its adequacy through simulated data sets and real data sets.

붓스크랩 기법을 이용한 다심 광커넥터 손실특성 예측 (Bootstrap Simulation for Performance Evaluation of Optical Multifiber Connectors)

  • 전오곤;강기훈
    • 품질경영학회지
    • /
    • Vol. 26, No. 4
    • /
    • pp.250-264
    • /
    • 1998
  • The purpose of this thesis is to develop a simulation program for forecasting the performance of optical connectors, which saves time and money in manufacturing them. The optical performance (insertion loss) of an optical connector relies mainly on three misalignment factors: the ferrule factor, due to manufacturing deviation from the design; the auto-centering effect, a fiber behavior phenomenon between the hole and the fiber; and the fiber misalignment factor. The simulation uses experimental data for the auto-centering effect and the fiber factor, and uses pseudo data generated from random numbers for the ferrule factor because it is still at the development stage. In this study we apply the kernel density estimation method to the experimental data in order to determine whether they belong to a specific parametric distribution family. We then simulate the insertion loss of an optical multifiber connector under a specific design model, using nonparametric bootstrap resampling data and parametric pseudo samples from a uniform distribution. We obtain tolerance specifications for the misalignment factors such that the insertion loss does not exceed a maximum of 1.0 dB, and choose the optimal hole diameter.
중소규모 사업용 BIM을 위한 데이터 사전의 활용 (Application of Data Dictionary to BIM for Small and Medium Project)

  • 이환우;이경섭;김광양
    • 한국전산구조공학회논문집
    • /
    • Vol. 26, No. 6
    • /
    • pp.431-438
    • /
    • 2013
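  • To improve overall productivity in the construction industry, information needs to be systematized across the entire life cycle of a facility. BIM (Building Information Modeling), a technology for managing information based on a 3D information model, is being actively studied as one way to systematize information. However, BIM research has focused on large projects, and research on BIM for small and medium projects remains insufficient. Information loss is more severe in small and medium projects than in large ones, yet limited investment resources make it difficult for them to adopt BIM. This paper therefore presents research toward a BIM system tailored to small and medium projects that delivers the benefits of BIM without excessive investment. We define this as Pseudo BIM. Following the concept and construction method of Pseudo BIM, we present a method for building the data dictionary that serves as the engine structure of Pseudo BIM, using PLIB Part 42, the construction information classification system, and related resources, and we verify the validity of Pseudo BIM through a pilot test.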

On a Multiple Data Handling Method under Online Parameter Estimation

  • Takeyasu, Kazuhiro;Amemiya, Takashi;Iino, Katsuhiro;Masuda, Shiro
    • Industrial Engineering and Management Systems
    • /
    • Vol. 1, No. 1
    • /
    • pp.64-72
    • /
    • 2002
  • In the field of plant maintenance, data gathered by sensors on multiple machines are handled and analyzed. Online or pseudo-online data handling is required in such fields. When the data occurrence speed exceeds the data handling speed, multiple data should be handled at a time (batch data handling or pseudo-online data handling). If l data points are received at one time following N data points, how to estimate the new parameters effectively is a major concern. A new simplified calculation method, which calculates the weights of the N data points, is introduced. Numerical examples show that this new method has fairly good estimation accuracy and that the calculation time is less than 1/10 of that required when the whole data set is re-calculated. Even when the calculation ability of the apparatus is limited, the proposed method makes early failure detection of equipment possible with only a few newly arriving data points. This method would be applicable in many data handling fields.
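
The batch setting described in this abstract (l new points arriving after N earlier points) can be illustrated with a generic sketch. The example below updates a straight-line least-squares fit from running sums, so a new batch is absorbed without revisiting the earlier data. This is a textbook sufficient-statistics update, not the weight-based method of the paper; the class name `RunningLineFit` and the sample values are hypothetical.

```python
class RunningLineFit:
    """Least-squares fit of y = intercept + slope * x from running sums."""

    def __init__(self):
        self.n = 0
        self.sx = 0.0
        self.sy = 0.0
        self.sxx = 0.0
        self.sxy = 0.0

    def update(self, xs, ys):
        """Absorb a batch of new points; cost depends only on the batch size."""
        for x, y in zip(xs, ys):
            self.n += 1
            self.sx += x
            self.sy += y
            self.sxx += x * x
            self.sxy += x * y

    def params(self):
        """Return (intercept, slope) for all points seen so far."""
        denom = self.n * self.sxx - self.sx ** 2
        if denom == 0:
            raise ValueError("need at least two distinct x values")
        slope = (self.n * self.sxy - self.sx * self.sy) / denom
        intercept = (self.sy - slope * self.sx) / self.n
        return intercept, slope


# Hypothetical data: N = 4 earlier points, then a batch of l = 2 new points.
fit = RunningLineFit()
fit.update([0, 1, 2, 3], [2, 5, 8, 11])
fit.update([4, 5], [14, 17])
print(fit.params())
```

Because the sums are exact sufficient statistics for the straight-line model, the updated estimate matches a full refit on all N + l points, up to floating-point rounding.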