• Title/Summary/Keyword: Distribution data


Diagnosis of Observations after Fit of Multivariate Skew t-Distribution: Identification of Outliers and Edge Observations from Asymmetric Data

  • Kim, Seung-Gu
    • The Korean Journal of Applied Statistics
    • /
    • v.25 no.6
    • /
    • pp.1019-1026
    • /
    • 2012
  • This paper presents a method for identifying "edge observations" located on a boundary area constructed by a truncation variable, as well as for identifying outliers, after fitting a multivariate skew $t$-distribution (MST) to asymmetric data. The detection of edge observations is important in data analysis because it provides information on a certain critical area in the observation space. The proposed method is applied to the Australian Institute of Sport (AIS) dataset, which is well known for asymmetry in its data space.

A Study on Air-distribution method for the Thermal Environmental Control in the Data Center (데이터센터의 합리적인 환경제어를 위한 공기분배 시스템에 대한 연구)

  • Cho, Jin-Kyun;Cha, Ji-Hyoung;Hong, Min-Ho;Yeon, Chang-Kun
    • Proceedings of the SAREK Conference
    • /
    • 2008.11a
    • /
    • pp.487-492
    • /
    • 2008
  • The cooling of data centers has emerged as a significant challenge as the density of IT servers increases. Server installations, along with the shrinking physical size of servers and storage systems, have resulted in high power density and high heat density. The introduction of high-density enclosures into a data center creates the potential for "hot spots" within the room that the cooling system may not be able to address, since traditional designs assume relatively uniform cooling patterns within a data center. The cooling system for a data center consists of a CRAC or CRAH unit and the associated air distribution system. It is the configuration of the distribution system that primarily distinguishes the different types of data center cooling systems, and this configuration is the main subject of this paper.


A Federated Multi-Task Learning Model Based on Adaptive Distributed Data Latent Correlation Analysis

  • Wu, Shengbin;Wang, Yibai
    • Journal of Information Processing Systems
    • /
    • v.17 no.3
    • /
    • pp.441-452
    • /
    • 2021
  • Federated learning provides an efficient integrated model for distributed data, allowing the local training of different data. Meanwhile, the goal of multi-task learning is to simultaneously establish models for multiple related tasks and to obtain the underlying main structure. However, traditional federated multi-task learning models not only have strict requirements for the data distribution, but also demand large amounts of computation and converge slowly, which has hindered their adoption in many fields. In our work, we apply a rank constraint on the weight vectors of the multi-task learning model to adaptively adjust the learning of task similarity according to the distribution of the federated node data. The proposed model has a general framework for solving for optimal solutions, which can be used to deal with various data types. Experiments show that our model achieves the best results on different datasets. Notably, our model still obtains stable results on datasets with large distribution differences. In addition, compared with traditional federated multi-task learning models, our algorithm is able to converge to a local optimal solution within a limited number of training iterations.

Methodology for determining optimal data sampling frequencies in water distribution systems (상수관망 데이터 수집의 최적 빈도 결정을 위한 방법론적 접근)

  • Hyunjun Kim;Eunhye Jeong;Kyungyup Hwang
    • Journal of Korean Society of Water and Wastewater
    • /
    • v.37 no.6
    • /
    • pp.383-394
    • /
    • 2023
  • Currently, there is no definitive regulation for the appropriate frequency of data sampling in water distribution networks, yet this frequency plays a crucial role in the efficient operation of these systems. This study proposes a new methodology for determining the optimal frequency of data acquisition in water distribution networks. The methodology is based on the decomposition of signals using harmonic series and has been validated using actual data from water distribution networks. By analyzing 12 types of data collected from two points, it was demonstrated that using the factors and cumulative periodograms of harmonic series achieves accuracy similar to that of the original signals at lower data acquisition frequencies.
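
As a rough illustration of the cumulative periodogram mentioned in this abstract, the sketch below computes a normalized cumulative periodogram with a plain discrete Fourier transform. It is a minimal sketch under stated assumptions, not the authors' procedure: the function name `cumulative_periodogram` is hypothetical, the signal is assumed to be evenly sampled, and the abstract does not say how the periodogram is normalized or how the sampling frequency is then chosen.

```python
import cmath
import math


def cumulative_periodogram(signal):
    """Return the normalized cumulative periodogram of an evenly sampled signal.

    Entry i is the share of total (mean-removed) power carried by Fourier
    frequencies 1/n, 2/n, ..., (i + 1)/n cycles per sample.
    Hypothetical helper for illustration only.
    """
    n = len(signal)
    if n < 2:
        raise ValueError("need at least two samples")
    mean = sum(signal) / n
    centered = [x - mean for x in signal]

    # Periodogram ordinates at the positive Fourier frequencies k/n.
    powers = []
    for k in range(1, n // 2 + 1):
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * t / n)
            for t, x in enumerate(centered)
        )
        powers.append(abs(coeff) ** 2 / n)

    total = sum(powers)
    if total == 0:
        raise ValueError("signal is constant; periodogram is undefined")

    # Running sum of power, scaled so the last entry equals 1.
    cumulative = []
    running = 0.0
    for p in powers:
        running += p
        cumulative.append(running / total)
    return cumulative
```

If nearly all of the cumulative power is reached at a low frequency index, the signal contains little high-frequency content, which is the kind of evidence that could support sampling less often.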

The Marshall-Olkin generalized gamma distribution

  • Barriga, Gladys D.C.;Cordeiro, Gauss M.;Dey, Dipak K.;Cancho, Vicente G.;Louzada, Francisco;Suzuki, Adriano K.
    • Communications for Statistical Applications and Methods
    • /
    • v.25 no.3
    • /
    • pp.245-261
    • /
    • 2018
  • Attempts have been made to define new classes of distributions that provide more flexibility for modelling skewed data in practice. In this work we define a new extension of the generalized gamma distribution (Stacy, The Annals of Mathematical Statistics, 33, 1187-1192, 1962), called the Marshall-Olkin generalized gamma (MOGG) distribution, based on the generator pioneered by Marshall and Olkin (Biometrika, 84, 641-652, 1997). This new lifetime model is very flexible and includes twenty-one special models. The main advantage of the new family is that practitioners will have a quite flexible distribution to fit real data from several fields, such as engineering, hydrology and survival analysis. Further, we also define a MOGG mixture model, a modification of the MOGG distribution for analyzing lifetime data in the presence of a cure fraction. This proposed model can be seen as a model of competing causes, where the parameter associated with the Marshall-Olkin distribution controls the activation mechanism of the latent risks (Cooner et al., Statistical Methods in Medical Research, 15, 307-324, 2006). The asymptotic properties of the maximum likelihood estimators of the model parameters are evaluated by means of simulation studies. The proposed distribution is fitted to two real data sets, one arising from measurements of the strength of fibers and the other from melanoma data.
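
For readers unfamiliar with the Marshall and Olkin generator cited in this abstract: it maps a baseline survival function $\bar F(x)$ to $\bar G(x) = \alpha \bar F(x) / (1 - (1 - \alpha)\bar F(x))$ with $\alpha > 0$. The sketch below applies that transform in Python. It is illustrative only: the function names are hypothetical, and a Weibull baseline (a special case of Stacy's generalized gamma) is assumed as a stand-in because its survival function has a closed form.

```python
import math


def weibull_survival(x: float, shape: float, scale: float) -> float:
    """Survival function of a Weibull baseline (assumed stand-in baseline)."""
    if x <= 0:
        return 1.0
    return math.exp(-((x / scale) ** shape))


def marshall_olkin_survival(base_survival: float, alpha: float) -> float:
    """Apply the Marshall-Olkin transform to a baseline survival probability."""
    if alpha <= 0:
        raise ValueError("alpha must be positive")
    if not 0.0 <= base_survival <= 1.0:
        raise ValueError("base_survival must lie in [0, 1]")
    return alpha * base_survival / (1.0 - (1.0 - alpha) * base_survival)


def mo_weibull_survival(x: float, shape: float, scale: float, alpha: float) -> float:
    """Survival function of the Marshall-Olkin extended Weibull distribution."""
    return marshall_olkin_survival(weibull_survival(x, shape, scale), alpha)
```

Setting `alpha = 1` recovers the baseline distribution unchanged, while values above 1 raise the survival probability at every point and values below 1 lower it.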

Design and Implementation of Cyber Warfare Training Data Set Generation Method based on Traffic Distribution Plan (트래픽 유통계획 기반 사이버전 훈련데이터셋 생성방법 설계 및 구현)

  • Kim, Yong Hyun;Ahn, Myung Kil
    • Convergence Security Journal
    • /
    • v.20 no.4
    • /
    • pp.71-80
    • /
    • 2020
  • In order to provide realistic traffic to a cyber warfare training system, it is necessary to prepare a traffic distribution plan in advance and to create a training data set from normal and threat data sets. This paper presents the design and implementation of a method for creating a traffic distribution plan and a training data set that provide realistic background traffic to a cyber warfare training system. We propose a method for creating a traffic distribution plan that uses the network topology of the training environment and the traffic attribute information collected in real and simulated environments. We also propose a method for generating a training data set according to the traffic distribution plan, using a unit traffic method and a mixed traffic method based on protocol ratios. Using the implemented tool, we created a traffic distribution plan and confirmed the training data set generated according to that plan.

Reliability In a Half-Triangle Distribution and a Skew-Symmetric Distribution

  • Woo, Jung-Soo
    • Journal of the Korean Data and Information Science Society
    • /
    • v.18 no.2
    • /
    • pp.543-552
    • /
    • 2007
  • We consider estimation of the right-tail probability in a half-triangle distribution, consider inference on reliability, and derive the k-th moment of the ratio of two independent half-triangle distributions with different supports. We also define a skew-symmetric random variable from a triangle distribution that is symmetric about the origin and derive its k-th moment.


A Test Based on Euler Angles of a Rotationally Symmetric Spherical Distribution

  • Shin, Yang-Kyu
    • Journal of the Korean Data and Information Science Society
    • /
    • v.10 no.1
    • /
    • pp.67-77
    • /
    • 1999
  • For an orientation-shift model supported on the unit sphere, Euler angles are the conventional measure used to parametrize orientation shifts. The essential role played by the rotational symmetry of the underlying distribution is reviewed. In this paper we propose an inference procedure based on Euler angles for the rotationally symmetric spherical distribution. The likelihood ratio test (LRT) based on the Euler angles is worked out. The asymptotic distribution of the test under the null hypothesis and certain contiguous alternatives is obtained.


Design and Analysis of the Data Distribution Service System (데이타 분배 서비스 시스템 설계 및 분석)

  • Park, Choong-Bum;Kwon, Ki-Jeong;Cha, Da-Ham;Choi, Hoon;Kim, Chum-Su
    • Journal of KIISE:Computing Practices and Letters
    • /
    • v.14 no.2
    • /
    • pp.211-215
    • /
    • 2008
  • Data-centric publish/subscribe middleware is suitable for a communication environment in which various devices dynamically form a network domain and the same type of data is frequently exchanged. For this purpose, the OMG has standardized the DDS (Data Distribution Service) specification. In this study, we designed RiTiCoM, a data distribution service system that conforms to the OMG DDS standard specification and supports the automation of system management. We also analyzed its performance and compared it with JMS.

An Analysis on the Data Distribution of Construction Equipment Operations - A Case on Muck Hauling System - (건설 장비 운영 데이터 분포 특성에 관한 연구 - 버력 처리 시스템을 중심으로 -)

  • Seo, Hyeong Beom;Jung, Won Ji;Kim, Kyoungmin;Kim, Kyong Ju
    • KSCE Journal of Civil and Environmental Engineering Research
    • /
    • v.26 no.4D
    • /
    • pp.661-670
    • /
    • 2006
  • The use of simulation in planning construction processes has been limited because it is difficult to collect data and build a model using simulation methods. This study collects construction operation data and analyzes the characteristics of its distribution. Through statistical analysis of the empirical data, this study identifies the Beta distribution as one of the most suitable functions for reproducing the characteristics of construction equipment operation data in a computer simulation. The information obtained in this study can support the preparation of input data for other simulations.
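
To make the idea of preparing Beta-distributed simulation input concrete, the sketch below estimates Beta shape parameters from observed durations by the method of moments. This is an assumption-laden illustration, not the study's procedure: the abstract does not state how the distribution was fitted, the function name `fit_beta_moments` is hypothetical, and the lower and upper bounds of the duration range are assumed to be supplied by the analyst.

```python
from statistics import mean, variance


def fit_beta_moments(durations, lower, upper):
    """Estimate Beta(a, b) shape parameters by the method of moments.

    Durations are rescaled to [0, 1] using the assumed bounds `lower` and
    `upper` before the sample mean and variance are matched to the Beta
    distribution's mean and variance.
    """
    if upper <= lower:
        raise ValueError("upper must exceed lower")
    scaled = [(d - lower) / (upper - lower) for d in durations]
    if any(s < 0.0 or s > 1.0 for s in scaled):
        raise ValueError("all durations must lie within [lower, upper]")

    m = mean(scaled)
    v = variance(scaled)  # sample variance (n - 1 denominator)
    if v <= 0.0 or v >= m * (1.0 - m):
        raise ValueError("sample moments are not compatible with a Beta fit")

    common = m * (1.0 - m) / v - 1.0
    return m * common, (1.0 - m) * common
```

The returned pair can be passed to `random.betavariate(a, b)` and rescaled back to `[lower, upper]` to draw simulated operation times.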