• Title/Summary/Keyword: Data Heterogeneity

Bayesian modeling of random effects precision/covariance matrix in cumulative logit random effects models

  • Kim, Jiyeong;Sohn, Insuk;Lee, Keunbaik
    • Communications for Statistical Applications and Methods / v.24 no.1 / pp.81-96 / 2017
  • Cumulative logit random effects models are typically used to analyze longitudinal ordinal data. The random effects covariance matrix is used in these models to account for both subject-specific and time variation. The covariance matrix may also be heterogeneous; however, its structure is commonly assumed to be homoscedastic and restricted because the matrix is high-dimensional and must be positive definite. To satisfy these restrictions, two Cholesky decomposition methods were proposed in linear (mixed) models for the random effects precision matrix and the random effects covariance matrix, respectively: the modified Cholesky and moving average Cholesky decompositions. In this paper, we use these two methods to model the random effects precision matrix and the random effects covariance matrix in cumulative logit random effects models for longitudinal ordinal data. The methods are illustrated with a lung cancer data set.
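As a hedged aside (reconstructed from the standard mixed-model literature, not taken from the paper itself), the two decompositions named above are usually written as follows: the modified Cholesky decomposition parameterizes the precision matrix through generalized autoregressive parameters and innovation variances, while the moving average Cholesky decomposition parameterizes the covariance matrix directly.

```latex
% Modified Cholesky decomposition (used for the random effects *precision* matrix):
% T is unit lower triangular (generalized autoregressive parameters),
% D is diagonal (innovation variances).
T \Sigma T^{\top} = D
\quad\Longleftrightarrow\quad
\Sigma^{-1} = T^{\top} D^{-1} T

% Moving average Cholesky decomposition (used for the *covariance* matrix):
% L is unit lower triangular (generalized moving average parameters).
\Sigma = L D L^{\top}
```

Both forms guarantee positive definiteness without constraining the free parameters, which is why they are convenient for Bayesian modeling of high-dimensional covariance structures.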

Regression Analysis of Longitudinal Data Based on M-estimates

  • Jung, Sin-Ho;Therneau, Terry M.
    • Journal of the Korean Statistical Society / v.29 no.2 / pp.201-217 / 2000
  • The method of generalized estimating equations (GEE) has become very popular for the analysis of longitudinal data. We extend this work to the use of M-estimators; the resulting regression estimates are robust to heavy-tailed errors and to outliers. The proposed method does not require correct specification of the dependence structure between observations and allows for heterogeneity of the errors. However, an estimate of the dependence structure may be incorporated, and if it is correct this guarantees higher efficiency for the regression estimators. A goodness-of-fit test for checking the adequacy of the assumed M-estimation regression model is also provided. Simulation studies are conducted to show the finite-sample performance of the new methods. The proposed methods are applied to a real-life data set.
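For orientation only (a generic form from the GEE/M-estimation literature, not the exact equation of this paper), the combination described above amounts to replacing the raw residual in the usual GEE score with a bounded score function \psi, such as Huber's:

```latex
% Generic robust GEE-type estimating equation for the regression parameter \beta:
% D_i = \partial \mu_i / \partial \beta,  V_i = working covariance for subject i,
% \psi(\cdot) = bounded score function (e.g. Huber) applied componentwise.
\sum_{i=1}^{n} D_i^{\top} V_i^{-1}\, \psi\!\bigl( y_i - \mu_i(\beta) \bigr) = 0
```

A bounded \psi limits the influence of heavy-tailed errors and outliers, while a well-chosen working covariance V_i improves efficiency without being required for consistency.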

Comparison of synthetic seismograms referred to inhomogeneous medium (불균질 매질에 따른 인공 합성 탄성파 자료 비교)

  • Kim, Young-Wan;Jang, Seung-Hyung;Yoon, Wang-Joong;Suh, Sang-Yong
    • 한국지구물리탐사학회:학술대회논문집 / 2007.06a / pp.197-202 / 2007
  • Most seismic reflection prospecting assumes the subsurface formation to be a homogeneous medium. Such models cannot represent the small-scale heterogeneity that is verified by well log data or drilling cores, and synthetic seismograms computed for homogeneous media are therefore limited in explaining the variety of features observed in field data. We developed an inhomogeneous velocity model that accounts for the heterogeneity of the background medium and performed numerical modeling on both the homogeneous and the inhomogeneous versions of the model. The inhomogeneous background velocity media were generated using three autocorrelation functions, selected according to the dominant wavelength of the background medium and the correlation length of the random medium, and the resulting shot gathers were compared. The results show that numerical modeling performed on the inhomogeneous medium reproduces the complex wave propagation seen in field data.
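A minimal sketch of the general technique of building a correlated random velocity perturbation (the paper's three autocorrelation functions are not specified here, so a Gaussian autocorrelation, background velocity, and perturbation level are assumed purely for illustration):

```python
# Hedged sketch, not the authors' code: build a 2-D random velocity perturbation
# with a prescribed Gaussian autocorrelation by filtering white noise in the
# wavenumber domain, then superimpose it on a homogeneous background velocity.
import numpy as np

def random_velocity_model(nx, nz, dx, corr_len, v0=3000.0, std_frac=0.05, seed=0):
    """Background velocity v0 [m/s] plus a spatially correlated random perturbation."""
    rng = np.random.default_rng(seed)
    noise = rng.standard_normal((nz, nx))            # uncorrelated white noise

    kx = 2.0 * np.pi * np.fft.fftfreq(nx, d=dx)      # horizontal wavenumbers
    kz = 2.0 * np.pi * np.fft.fftfreq(nz, d=dx)      # vertical wavenumbers
    KX, KZ = np.meshgrid(kx, kz)
    k2 = KX**2 + KZ**2

    # Amplitude filter corresponding (up to a constant) to a Gaussian
    # autocorrelation with correlation length corr_len.
    amp = np.exp(-k2 * corr_len**2 / 8.0)
    field = np.real(np.fft.ifft2(np.fft.fft2(noise) * amp))

    field *= std_frac * v0 / field.std()             # scale to the target std
    return v0 + field                                 # heterogeneous velocity model

velocity = random_velocity_model(nx=400, nz=200, dx=10.0, corr_len=100.0)
print(velocity.shape, round(velocity.mean(), 1), round(velocity.std(), 1))
```

Exponential or von Karman autocorrelation functions could be substituted by changing the amplitude filter, which is presumably how different characters of background heterogeneity would be compared.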

A visualizing method for investigating individual frailties using frailtyHL R-package

  • Ha, Il Do;Noh, Maengseok
    • Journal of the Korean Data and Information Science Society / v.24 no.4 / pp.931-940 / 2013
  • For the analysis of clustered survival data, inference for the parameters of semi-parametric frailty models has been widely studied. It is also important to investigate the potential heterogeneity in event times among clusters (e.g. centers, patients), and for this purpose interval estimation of the frailties is useful. In this paper we propose a visualizing method that presents confidence intervals of individual frailties across clusters using the frailtyHL R-package, which implements h-likelihood methods for frailty models. The proposed method is demonstrated using two practical examples.
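As a loose illustration of the kind of display being proposed (not code from the paper or from frailtyHL; the log-frailty estimates and their standard errors are assumed to have been obtained already), an interval plot of per-cluster frailties can be drawn as follows:

```python
# Hedged sketch: a generic "caterpillar" plot of per-cluster log-frailty
# estimates with approximate 95% confidence intervals.
import numpy as np
import matplotlib.pyplot as plt

def plot_frailty_intervals(log_frailty, se, z=1.96):
    """Plot approximate confidence intervals of individual frailties across clusters."""
    log_frailty, se = np.asarray(log_frailty), np.asarray(se)
    order = np.argsort(log_frailty)                 # sort clusters for readability
    idx = np.arange(len(log_frailty))
    err = z * se[order]

    plt.errorbar(idx, log_frailty[order], yerr=err, fmt="o", capsize=3)
    plt.axhline(0.0, linestyle="--")                # log-frailty 0 = "average" cluster
    plt.xlabel("cluster (sorted by estimate)")
    plt.ylabel("log-frailty with ~95% CI")
    plt.show()

# Example with made-up numbers:
rng = np.random.default_rng(1)
plot_frailty_intervals(rng.normal(0.0, 0.4, 20), np.full(20, 0.25))
```

Intervals that exclude zero flag clusters whose event-time risk differs noticeably from the overall level, which is the heterogeneity the paper aims to make visible.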

A Data Value Heterogeneity Solving Method In A GSN Based DataBase Integration Model (GSN 기반 DB통합 모델에서의 data value 이질성 해결 기법)

  • 홍종하;박성공;이종옥;백두권
    • Proceedings of the Korean Information Science Society Conference / 2001.10a / pp.331-333 / 2001
  • There have been continuing efforts to integrate information sources in distributed and heterogeneous environments. Developing tools that integrate information extracted from multiple heterogeneous information sources is of great interest, since it makes diverse information available in real time over the Internet. The main difficulty in building such tools is how to effectively represent information that resides in different sources but refers to the same real-world concept. Many concept-based approaches using WordNet or a common thesaurus have been proposed to resolve this semantic heterogeneity; however, they only address schema heterogeneity and do not show how to resolve heterogeneity in the data values themselves. This paper presents examples of data value heterogeneity that still arise in database systems after schema heterogeneity has been resolved with a GSN (Global Semantic Network), and proposes a technique for resolving this data value heterogeneity.
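To make the distinction concrete (an illustrative sketch only; the value mappings and attribute names below are hypothetical and are not the GSN-based technique of the paper), data value heterogeneity means that two sources agree on the schema-level concept yet encode its values differently, so a mapping to a canonical value domain is needed before integration:

```python
# Hedged sketch: resolve data *value* heterogeneity by normalizing
# source-specific encodings of the same attribute to one canonical domain.
CANONICAL_GENDER = {
    "M": "male", "MALE": "male", "1": "male",
    "F": "female", "FEMALE": "female", "2": "female",
}

def normalize_row(row: dict) -> dict:
    """Map source-specific value encodings to canonical values."""
    out = dict(row)
    if "gender" in out:
        out["gender"] = CANONICAL_GENDER.get(str(out["gender"]).upper(), out["gender"])
    return out

source_a = [{"name": "Kim", "gender": "M"}]     # source A encodes gender as M/F
source_b = [{"name": "Lee", "gender": "2"}]     # source B encodes gender as 1/2
integrated = [normalize_row(r) for r in source_a + source_b]
print(integrated)                               # both rows now use male/female
```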

A Study on the Calculation and Provision of Accruals-Quality by Big Data Real-Time Predictive Analysis Program

  • Shin, YeounOuk
    • International Journal of Advanced Smart Convergence / v.8 no.3 / pp.193-200 / 2019
  • Accruals-Quality (AQ) is an important proxy for evaluating the quality of accounting information disclosures. High-quality accounting information provides high predictability and precision in the disclosure of earnings and increases the stock-price response. High Accruals-Quality also provides information usefulness in capital markets, for example by mitigating heterogeneity in the interpretation of accounting information. The purpose of this study is to suggest how AQ, which represents the quality of accounting information disclosure, can be transformed into digitized data in real time in combination with IT and provided to the financial analyst's information environment in real time, within a framework for predictive analysis based on a big data log analysis system. This real-time AQ information will help financial analysts increase their activity and reduce information asymmetry. In addition, AQ provided in real time through IT can serve as an important basis for decision-making by users of capital market information and is expected to contribute by giving companies incentives to voluntarily improve the quality of accounting information disclosure.

Resource Management Strategies in Fog Computing Environment - A Comprehensive Review

  • Alsadie, Deafallah
    • International Journal of Computer Science & Network Security / v.22 no.4 / pp.310-328 / 2022
  • The Internet of Things (IoT) has emerged as one of the most popular technologies for enhancing humans' quality of life. However, most time-sensitive IoT applications require a quick response time, so processing these applications in cloud servers may not be effective. Fog computing has therefore emerged as a promising solution to the problem of managing the large data bandwidth requirements of devices while providing quick response times, since it processes a large amount of data near the data source rather than in the cloud. However, efficient management of computing resources, involving balancing workloads, allocating resources, provisioning resources, and scheduling tasks, is a primary consideration for effective computing-based solutions, particularly for time-sensitive applications. This paper provides a comprehensive review of resource management strategies that consider resource limitations, heterogeneity, and unpredicted traffic in the fog computing environment. It presents recent developments in resource management for fog computing and discusses significant management issues such as resource allocation, resource provisioning, resource scheduling, and task offloading. Related studies are compared along several dimensions to suggest promising directions for future research by fellow researchers in the field.

User Identification Using Real Environmental Human Computer Interaction Behavior

  • Wu, Tong;Zheng, Kangfeng;Wu, Chunhua;Wang, Xiujuan
    • KSII Transactions on Internet and Information Systems (TIIS) / v.13 no.6 / pp.3055-3073 / 2019
  • In this paper, a new user identification method is presented using real environmental human-computer interaction (HCI) behavior data to improve method usability. The user behavior data in this paper are collected continuously, without setting experimental scenes such as text length, action number, etc. To illustrate the characteristics of real environmental HCI data, the probability density distributions and the performance of keyboard and mouse data are analyzed through the random sampling method and the Support Vector Machine (SVM) algorithm. Based on this analysis of HCI behavior data in a real environment, the Multiple Kernel Learning (MKL) method is first used for user HCI behavior identification due to the heterogeneity of keyboard and mouse data. All possible kernel methods are compared to determine the MKL algorithm's parameters and ensure the robustness of the algorithm. The data analysis results show that keyboard data have a narrower range of probability density distribution than mouse data; keyboard data perform best with a 1-min time window, while mouse data perform best with a 10-min time window. Finally, experiments using the MKL algorithm with three global polynomial kernels and ten local Gaussian kernels achieve a user identification accuracy of 83.03% on a real environmental HCI dataset, which demonstrates that the proposed method achieves an encouraging performance.
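A rough sketch of the underlying idea of combining heterogeneous feature blocks through multiple kernels (the kernel counts, weights, and parameters below are illustrative assumptions, not the paper's learned MKL solution):

```python
# Hedged sketch: a fixed weighted sum of a polynomial kernel on keyboard
# features and an RBF (Gaussian) kernel on mouse features, used as a
# precomputed kernel for an SVM. Real MKL would learn the kernel weights.
import numpy as np
from sklearn.metrics.pairwise import polynomial_kernel, rbf_kernel
from sklearn.svm import SVC

def combined_kernel(X_kbd, X_mouse, Y_kbd=None, Y_mouse=None, w=0.5, gamma=0.1):
    """Weighted sum of a polynomial kernel (keyboard) and an RBF kernel (mouse)."""
    Y_kbd = X_kbd if Y_kbd is None else Y_kbd
    Y_mouse = X_mouse if Y_mouse is None else Y_mouse
    K_poly = polynomial_kernel(X_kbd, Y_kbd, degree=3)
    K_rbf = rbf_kernel(X_mouse, Y_mouse, gamma=gamma)
    return w * K_poly + (1.0 - w) * K_rbf

# Toy data: 40 samples split into keyboard and mouse feature blocks, 2 users.
rng = np.random.default_rng(0)
X_kbd, X_mouse = rng.normal(size=(40, 5)), rng.normal(size=(40, 8))
y = np.repeat([0, 1], 20)

K_train = combined_kernel(X_kbd, X_mouse)
clf = SVC(kernel="precomputed").fit(K_train, y)
print(clf.predict(K_train)[:5])
```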

A Design of DBaaS-Based Collaboration System for Big Data Processing

  • Jung, Yean-Woo;Lee, Jong-Yong;Jung, Kye-Dong
    • International Journal of Advanced Smart Convergence / v.5 no.2 / pp.59-65 / 2016
  • With the recent growth in cloud computing, big data processing and collaboration between businesses are emerging as new paradigms in the IT industry. In environments where a large amount of data is generated in real time, such as SNS, big data processing techniques are useful for extracting the valid data; MapReduce is a good example of a programming model used in such big data extraction. With the growing collaboration between companies, problems of duplication and heterogeneity among data have arisen from the integration of old and new information storage systems, because the existing databases differ across the various companies. These problems, however, can be mitigated by applying the MapReduce technique. This paper proposes a collaboration system based on Database as a Service (DBaaS) to solve data integration problems in collaboration between companies. The proposed system can reduce the overhead of data integration and can be applied to both structured and unstructured data.
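For intuition only (a toy sketch of the MapReduce pattern mentioned above, with hypothetical record layouts; it is not the proposed DBaaS system), records from two companies' stores can be mapped to a shared key and then reduced to one merged, de-duplicated record per key:

```python
# Hedged sketch: MapReduce-style de-duplication and merging of heterogeneous
# customer records from two companies before integration.
from collections import defaultdict

company_a = [{"id": "C001", "name": "Kim", "email": "kim@a.com"}]
company_b = [{"customer_id": "C001", "name": "Kim", "phone": "010-0000-0000"}]

def map_record(source, record):
    """Map phase: emit (shared key, record tagged with its source)."""
    key = record.get("id") or record.get("customer_id")   # heterogeneous key names
    return key, {**record, "source": source}

def reduce_records(records):
    """Reduce phase: merge all records sharing a key into one record."""
    merged = {}
    for rec in records:
        merged.update({k: v for k, v in rec.items() if k not in ("id", "customer_id")})
    return merged

grouped = defaultdict(list)
for src, data in (("A", company_a), ("B", company_b)):
    for rec in data:
        key, value = map_record(src, rec)
        grouped[key].append(value)

integrated = {key: reduce_records(vals) for key, vals in grouped.items()}
print(integrated)   # one merged record per shared customer key
```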

Mobile Cloud System based on EMRA for Inbody Data

  • Lee, Jong-Sub;Moon, Seok-Jae
    • International Journal of Advanced Culture Technology / v.9 no.3 / pp.327-333 / 2021
  • InBody is a tool that measures health information with high reliability and accuracy in order to analyze body composition. Unlike the existing approach, in which data are stored, processed, and output on the server side, the health information generated by InBody must be reliably delivered to health-sharing and data-analysis services on mobile devices. However, problems can occur in data transmission and reception while body composition measurements are being delivered to a mobile service: because the service runs over the network of a cloud environment, a global service rather than a temporary local one must be provided when the connection is dropped or changed, with the mobility of the InBody information in mind. In addition, since InBody information is delivered to mobile devices, a standard schema should be defined in the mobile cloud environment so that information can be exchanged between standardized InBody data and mobile devices. We propose a mobile cloud system using EMRA (Extended Metadata Registry Access), in which a mobile device processes and transmits the body data generated by InBody and the data of each local organization are managed with a standard schema. The proposed system processes the data generated by InBody and converts them into the standard schema using EMRA so that standardized data can be transmitted. Even when the mobile device moves between areas, a coordinator subsystem is responsible for providing access services. EMRA is also applied to the collision problem caused by schema heterogeneity that occurs when accessing the data generated by InBody.
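A minimal sketch of the standard-schema conversion step described above (the attribute names, units, and mapping table are hypothetical illustrations; the actual EMRA interface is not reproduced here):

```python
# Hedged sketch: convert a local-schema InBody-style record to a shared
# standard schema via a metadata-registry-style mapping table.
LOCAL_TO_STANDARD = {
    # local attribute name -> (standard attribute name, value converter)
    "weight_kg": ("body_weight", float),
    "pbf":       ("percent_body_fat", float),
    "smm":       ("skeletal_muscle_mass", float),
}

def to_standard_schema(local_record: dict) -> dict:
    """Map a local-schema record onto the standard schema shared across organizations."""
    standard = {}
    for local_name, value in local_record.items():
        if local_name in LOCAL_TO_STANDARD:          # mapped attribute
            std_name, convert = LOCAL_TO_STANDARD[local_name]
            standard[std_name] = convert(value)
        else:                                        # pass unmapped attributes through
            standard[local_name] = value
    return standard

print(to_standard_schema({"weight_kg": "72.4", "pbf": "18.3", "smm": "31.2"}))
```

Resolving schema heterogeneity this way before transmission is what allows different local organizations and the mobile side to exchange data without per-pair conversion logic.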