• Title/Summary/Keyword: data heterogeneity

Bayesian Analysis for Multiple Capture-Recapture Models using Reference Priors

  • Younshik;Pongsu
    • Communications for Statistical Applications and Methods / v.7 no.1 / pp.165-178 / 2000
  • Bayesian methods are considered for multiple capture-recapture data. Reference priors are developed for such models, and a sampling-based approach using the Gibbs sampler is used for inference from the posterior distributions. Furthermore, approximate Bayes factors are obtained for model selection between trap and nontrap response models. Finally, the methodology is applied to a capture-recapture model using generated data and real data.

Effect of Heterogeneous Variance by Sex and Genotypes by Sex Interaction on EBVs of Postweaning Daily Gain of Angus Calves

  • Oikawa, T.;Hammond, K.;Tier, B.
    • Asian-Australasian Journal of Animal Sciences / v.12 no.6 / pp.850-853 / 1999
  • Angus postweaning daily gain (PWDG) was analyzed to investigate the effects of heterogeneous variance and genotype by sex interaction on the prediction of EBVs, using data sets of various environmental levels. The whole data set (16,239 records) was divided into six data sets according to averages of the best linear unbiased estimator (BLUE) of herd environment. Comparison of the prediction models showed that a single-trait model is adequate for most of the data sets, except for the poor-environment data set for both bulls and heifers, where heterogeneity of variance and genotype by sex interaction exist. In the prediction with the data set of the low environmental level, the bulls' EBVs from the single-trait model had high product-moment correlations with the male EBVs of the bulls from the multitrait model, whereas the heifers' EBVs had moderate correlations with the female EBVs from the multitrait model. This moderate correlation seems to result from the heterogeneity of variance and the low heritability of the heifers' PWDG. Prediction models with heterogeneity of variance had little effect on the prediction of EBVs for the data sets with moderate to high genetic correlations.

FedGCD: Federated Learning Algorithm with GNN based Community Detection for Heterogeneous Data

  • Wooseok Shin;Jitae Shin
    • Journal of Internet Computing and Services / v.24 no.6 / pp.1-11 / 2023
  • Federated learning (FL) is a groundbreaking machine learning paradigm that allows multiple participants to collaboratively train models in a cloud environment, all while maintaining the privacy of their raw data. This approach is invaluable in applications involving sensitive or geographically distributed data. However, one of the challenges in FL is dealing with heterogeneous and non-independent and identically distributed (non-IID) data across participants, which can result in suboptimal model performance compared to traditional machine learning methods. To tackle this, we introduce FedGCD, a novel FL algorithm that employs Graph Neural Network (GNN)-based community detection to enhance model convergence in federated settings. In our experiments, FedGCD consistently outperformed existing FL algorithms in various scenarios: for instance, in a non-IID environment, it achieved an accuracy of 0.9113, a precision of 0.8798, and an F1-Score of 0.8972. In a semi-IID setting, it demonstrated the highest accuracy at 0.9315 and an F1-Score of 0.9312. We also introduce a new metric, nonIIDness, to quantitatively measure the degree of data heterogeneity. Our results indicate that FedGCD not only addresses the challenges of data heterogeneity and non-IIDness but also sets new benchmarks for FL algorithms. The community detection approach adopted in FedGCD has broader implications, suggesting that it could be adapted for other distributed machine learning scenarios, thereby improving model performance and convergence across a range of applications.

Channel Heterogeneity Aware Channel Assignment for IEEE 802.11 Multi-Radio Multi-Rate Wireless Networks (IEEE 802.11 다중 라디오 다중 전송률 무선 네트워크를 위한 채널 이질성 인지 채널 할당)

  • Kim, Sok-Hyong;Kim, Dong-Wook;Suh, Young-Joo
    • The Journal of Korean Institute of Communications and Information Sciences / v.36 no.11A / pp.870-877 / 2011
  • IEEE 802.11 devices are widely used, and terminals can be equipped with multiple IEEE 802.11 interfaces as low-cost IEEE 802.11 devices are deployed. Off-the-shelf IEEE 802.11 devices provide multiple channels and multiple data rates. In practical multi-channel networks there is channel heterogeneity, meaning that channels have different signal characteristics for the same node, so channels should be assigned efficiently to improve network capacity. In addition, in multi-rate networks, low-rate links severely degrade the performance of high-rate links on the same channel, which is known as the performance anomaly. Therefore, in this paper, we propose a heterogeneity aware channel assignment (HACA) algorithm that improves network performance by reflecting channel heterogeneity and the performance anomaly. Through NS-2 simulations, we validate that the HACA algorithm shows improved performance compared with existing channel assignment algorithms that do not reflect channel heterogeneity.
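
The performance anomaly named in this abstract can be illustrated with a minimal sketch. It rests on an idealized model that is an assumption and is not taken from the paper: all stations on a channel send equal-sized frames, get equal long-run transmission opportunities, and protocol overheads are ignored. The function name `per_station_throughput` and the example rates are hypothetical.

```python
def per_station_throughput(rates_mbps):
    """Idealized per-station throughput (Mbps) on one shared channel.

    Assumption (not from the paper): equal frame sizes, equal transmission
    opportunities, no protocol overhead. Each round of one frame per station
    then takes sum(L / r_i) seconds, so every station gets 1 / sum(1 / r_i).
    """
    if not rates_mbps or any(r <= 0 for r in rates_mbps):
        raise ValueError("rates must be a non-empty list of positive numbers")
    return 1.0 / sum(1.0 / r for r in rates_mbps)


# Two fast stations share the channel evenly (about 27 Mbps each).
fast_only = per_station_throughput([54.0, 54.0])

# Replacing one of them with a slow station drags both down (about 5.4 Mbps each).
mixed = per_station_throughput([54.0, 6.0])

print(f"two 54 Mbps stations: {fast_only:.1f} Mbps each")
print(f"54 Mbps + 6 Mbps stations: {mixed:.1f} Mbps each")
```

Under these assumptions the fast station's throughput falls below the slow station's nominal rate, which is why placing low-rate and high-rate links on separate channels can help.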

Inherent Random Heterogeneity Logit Model for Stated Preference Freight Mode Choice (SP 화물수단선택을 위한 Inherent Random Heterogeneity 로짓 모형 연구)

  • KIM, Kang-Soo
    • Journal of Korean Society of Transportation / v.20 no.3 / pp.83-92 / 2002
  • Freight mode choice models are essential to the analysis of many areas of transport research. However, observations of actual market choices have only been made in a limited number of situations. Therefore, stated preference (SP) techniques have emerged as an alternative to actual market choices for estimating freight mode choice models. Considerable confidence exists in SP data, but little consideration has been given to the potential for estimation bias. This paper is motivated by the theoretical side of estimating SP discrete choice models, focusing on a case study of freight mode choice. Recently developed simulation methods are used to construct inherent random heterogeneity logit models, which account for individual heterogeneity and its inheritance in subsequent choices, and overcome the independence from irrelevant alternatives (IIA) property. This paper contributes to the development of models dealing with heterogeneity and its inheritance, and sheds light on the heterogeneity of freight transport.
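
The IIA property mentioned in this abstract can be seen in a minimal sketch of the standard conditional logit formula, P_j = exp(V_j) / sum_k exp(V_k). This is the baseline model whose limitation the paper addresses, not the paper's inherent random heterogeneity model; the mode names and utility values are hypothetical.

```python
import math


def logit_probabilities(utilities):
    """Standard conditional logit choice probabilities.

    `utilities` maps each alternative to its systematic utility V_j.
    The maximum is subtracted before exponentiating for numerical stability.
    """
    v_max = max(utilities.values())
    weights = {alt: math.exp(v - v_max) for alt, v in utilities.items()}
    total = sum(weights.values())
    return {alt: w / total for alt, w in weights.items()}


# Hypothetical utilities for illustration only.
two_modes = logit_probabilities({"road": 1.0, "rail": 0.5})
three_modes = logit_probabilities({"road": 1.0, "rail": 0.5, "sea": 0.8})

# IIA: the road/rail odds depend only on V_road - V_rail, so adding a third
# alternative leaves the ratio unchanged.
ratio_before = two_modes["road"] / two_modes["rail"]
ratio_after = three_modes["road"] / three_modes["rail"]
print(ratio_before, ratio_after)
```

Models with random, individual-specific coefficients relax this restriction because the odds between two alternatives are no longer fixed across the population.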

Using Mixed Logit Model and Latent Class Model to Analyze Preference Heterogeneity in Choice Experiment Data (선택실험법 자료에서의 선호이질성 분석을 위한 혼합로짓모형 및 잠재계층모형의 활용)

  • Yoo, Byong Kook
    • Environmental and Resource Economics Review / v.21 no.4 / pp.921-945 / 2012
  • The Conditional Logit (CL) model is widely used because its estimation and the interpretation of its results are relatively easy; however, it is limited in that the preference heterogeneity of respondents is not fully considered. In this study we used two models, the Mixed Logit (ML) model and the Latent Class Model (LCM), to explain the preference heterogeneity of respondents regarding protection of the Boryeong Dam wetland. Examining heterogeneity in Boryeong city and six metropolitan areas, we found a significant difference between the two regions. While there was explicit preference heterogeneity among respondents in Boryeong city, we found little heterogeneity among respondents in the six metropolitan areas. Thus, for the six metropolitan areas the CL model can be used for parameter estimation, while for Boryeong city the WTP estimates are based on parameter estimates from the ML model to reflect the heterogeneity among respondents. Additionally, the ML model with interaction and a 2-class LCM for respondents in Boryeong city were used to explain the sources of the heterogeneity. The ML model with interaction has the advantage of explaining individual unobserved heterogeneity. However, the comparison between these two models shows that the LCM provided additional information that was not conveyed by the ML model with interaction. Thus, preference heterogeneity among respondents in this study may be better explained at the class level through the LCM than at the individual level through the ML model.

The Production and Spatial Heterogeneity of Litterfall in the Mixed Broadleaved-Korean Pine Forest of Xiaoxing'an Mountains, China

  • Jin, Guangze;Zhao, Fengxia;Liu, Liang;Kim, Ji Hong
    • Journal of Korean Society of Forest Science / v.97 no.2 / pp.165-170 / 2008
  • Litterfall has been recognized as an important part of forest ecosystem production, serving as a major pathway for energy flow and nutrient cycling through the ecosystem. This study was carried out to examine the quantity and components, temporal variation, and spatial heterogeneity of litterfall in the mixed broadleaved-Korean pine forest. The data were collected from a 9 ha permanent experimental plot, in the central area of which ($150\,\mathrm{m} \times 150\,\mathrm{m}$) a total of 319 circular litterfall traps, each $0.5\,\mathrm{m}^2$ in size, were established to collect litterfall. The results showed that the annual amount of litterfall totaled 3,033.7 kg/ha, comprising broad leaves (39.3%), conifer leaves (29.5%), others (18.5%), branches (10.4%), and seeds (2.3%). Litterfall production peaked at the end of September, accounting for 32.2% of the total amount. Semivariogram analysis revealed high spatial heterogeneity, with the scale of spatial heterogeneity ranging from 11.6 m to 29.1 m. The proportion C/(C0+C) showed that spatially autocorrelated heterogeneity accounted for 97.0% to 100% of total spatial heterogeneity. The relatively heavy components, branches and others, differed significantly in litterfall production between canopy gap and closed canopy areas at the 95% probability level, but the other components did not show statistical differences.
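
The proportion C/(C0+C) in this abstract compares the structured (spatially autocorrelated) part of the semivariogram sill, C, with the total sill, where C0 is the nugget. A minimal sketch follows. The abstract does not say which semivariogram model was fitted, so the spherical model used here is an assumption, and the numeric values are hypothetical, chosen only to fall within the reported ranges.

```python
def spherical_semivariogram(h, nugget, partial_sill, range_):
    """Spherical semivariogram model (assumed here, not stated in the abstract).

    nugget       -- C0, variance at vanishingly small lag
    partial_sill -- C, the spatially structured variance
    range_       -- lag distance beyond which samples are uncorrelated
    """
    if h < 0:
        raise ValueError("lag distance must be non-negative")
    if h == 0:
        return 0.0
    if h >= range_:
        return nugget + partial_sill
    ratio = h / range_
    return nugget + partial_sill * (1.5 * ratio - 0.5 * ratio ** 3)


def structured_proportion(nugget, partial_sill):
    """C / (C0 + C): share of total variance that is spatially structured."""
    return partial_sill / (nugget + partial_sill)


# Hypothetical parameters: a small nugget and a 20 m range.
c0, c, a = 0.03, 0.97, 20.0
print(structured_proportion(c0, c))
print(spherical_semivariogram(10.0, c0, c, a))
```

A proportion close to 1 means almost all of the observed variance is explained by spatial structure at the sampled scale and very little by the nugget.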

The Design of XMDR Data Hub for Efficient Business Process Operation (효율적인 비즈니스 프로세스 운용을 위한 XMDR 데이터 허브 설계)

  • Hwang, Chi-Gon;Jung, Gye-Dong;Choi, Young-Keun
    • The KIPS Transactions:PartD / v.18D no.3 / pp.149-156 / 2011
  • Recently, enterprise systems have come to require integration for data sharing and cooperation. As methodologies for integration, Service-Oriented Architecture for service integration and Master Data for integration of the data used by services have emerged. This paper suggests a method for operating a BP (Business Process) efficiently. We build XMDR (eXtended Meta Data Registry) as a knowledge repository to support the BP and construct data hubs to operate it. XMDR manages MDM (Master Data Management) to integrate the data, resolves heterogeneity between the data, and provides relationships to the business efficiently. XMDR is composed of MDR (Meta Data Registry), ontology, and BR (Business Relations). MDR describes relationships between metadata to resolve structural heterogeneity. Ontology describes semantic heterogeneity and relationships between data. BR describes relationships between tasks. The XMDR data hub effectively supports the management of master data and the interaction of different processes.

A Study of Data Interoperability System using DBaaS for Mobility Handicapped

  • Kwon, TaeWoo;Lee, Jong-Yong;Jung, Kye-Dong
    • International Journal of Internet, Broadcasting and Communication / v.11 no.1 / pp.97-102 / 2019
  • As the number of "Mobility Handicapped" people increases, the incidence of traffic accidents involving the "Mobility Handicapped" is also increasing. To reduce the incidence of such traffic accidents, a service providing system for the "Mobility Handicapped" is required. Since these services have different data formats, data heterogeneity occurs. Therefore, the system should resolve the data heterogeneity by mapping the format of the data. In this paper, we design a DBaaS system for the mobility handicapped to support data interoperability. This system provides a service that extends the flashing time of the traffic lights according to the condition of the "Mobility Handicapped" when a fall or a crossing occurs at a crosswalk where there is a risk of a traffic accident. These services can reduce the incidence of traffic accidents involving the "Mobility Handicapped".

A Development of Traffic Accident Models at 4-legged Signalized Intersections using Random Parameter : A Case of Busan Metropolitan City (Random Parameter를 이용한 4지 신호교차로에서의 교통사고 예측모형 개발 : 부산광역시를 대상으로)

  • Park, Minho;Lee, Dongmin;Yoon, Chunjoo;Kim, Young Rok
    • International Journal of Highway Engineering / v.17 no.6 / pp.65-73 / 2015
  • PURPOSES : This study develops accident models for 4-legged signalized intersections in Busan Metropolitan City, using random parameters in a count model to understand the factors that mainly influence accident frequencies. METHODS : To model traffic accidents, this study uses the RP (random parameter) negative binomial model, which can account for heterogeneity in the data. By using the RP model, each intersection's specific geometric characteristics were considered. RESULTS : Comparing the FP (fixed parameter) and RP models confirmed that the RP model has slightly higher explanatory power than the FP model. Out of 17 statistically significant variables, 4 variables, including traffic volumes on minor roads, pedestrian crossing on major roads, and distance of pedestrian crossing on major/minor roads, were found to have random parameters. In addition, the marginal effects and elasticities of the variables are analyzed to understand the variables' impact on the likelihood of accident occurrences. CONCLUSIONS : This study shows that the RP model fits the accident data better, since each observation's specific characteristics can be considered. Thus, methods that can account for the heterogeneity of data are recommended for analyzing the relationship between accidents and the factors affecting them (for example, traffic safety facilities or geometrics at signalized 4-legged intersections).