• Title/Abstract/Keywords: unbalanced data

324 results (processing time: 0.024 s)

Data De-weighting in Matrix Pencil Method

  • 고진환;쉬샤오웬;류병주;이제훈;이정섭
    • 한국통신학회논문지
    • /
    • Vol. 36, No. 8A
    • /
    • pp.741-747
    • /
    • 2011
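  • The well-known matrix pencil method is a direction-of-arrival estimation technique that works even in non-stationary environments and in the presence of multipath signals at the same frequency. It offers better resolution than conventional methods based on the covariance matrix and is also very efficient computationally. This paper describes the effects of the unbalanced data weighting that arises during the computation of the matrix pencil method. We propose a new matrix pencil method in which the data are weighted evenly and show that it resolves the data imbalance.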

BAYESIAN INFERENCE FOR FIELLER-CREASY PROBLEM USING UNBALANCED DATA

  • Lee, Woo-Dong;Kim, Dal-Ho;Kang, Sang-Gil
    • Journal of the Korean Statistical Society
    • /
    • Vol. 36, No. 4
    • /
    • pp.489-500
    • /
    • 2007
  • In this paper, we consider a Bayesian approach to the Fieller-Creasy problem using noninformative priors. Specifically, we extend the results of Yin and Ghosh (2000) to the unbalanced case. We develop noninformative priors such as the first- and second-order matching priors and reference priors. We also prove posterior propriety under the derived noninformative priors. We compare these priors in terms of how accurately the coverage probabilities of Bayesian credible intervals match the corresponding frequentist coverage probabilities.

Tests for Panel Regression Model with Unbalanced Data

  • Song, Suck-Heun;Jung, Byoung-Cheol
    • Journal of the Korean Statistical Society
    • /
    • Vol. 30, No. 3
    • /
    • pp.511-527
    • /
    • 2001
  • This paper considers the problem of testing variance components in the unbalanced two-way error component model. We provide a conditional LM test statistic for testing zero individual (time) effects, assuming that the other, time-specific (individual) effects are present. This test is an extension of Baltagi, Chang and Li (1998, 1992). Monte Carlo experiments are conducted to study the performance of this LM test.

Confidence Interval for the Variance Component in a Unbalanced One-way Random Effects Model

  • 송규문
    • Journal of the Korean Data and Information Science Society
    • /
    • Vol. 13, No. 2
    • /
    • pp.329-340
    • /
    • 2002
  • Two methods are proposed for constructing a confidence interval on the among-group variance component in an unbalanced one-way random effects model. Computer simulation is used to compare these methods with alternative procedures. The results indicate that Method 1 and Method 2 perform well for small group sizes and large sample sizes, respectively.

On the Fitting ANOVA Models to Unbalanced Data

  • Jong-Tae Park;Jae-Heon Lee;Byung-Chun Kim
    • Communications for Statistical Applications and Methods
    • /
    • Vol. 2, No. 1
    • /
    • pp.48-54
    • /
    • 1995
  • A direct method for fitting analysis-of-variance models to unbalanced data is presented. This method exploits the sparsity and rank deficiency of the model matrix and is based on Gram-Schmidt orthogonalization of a set of sparse columns of the model matrix. A computational algorithm for the sums of squares used in testing estimable hypotheses is given.

Detecting Malicious Social Robots with Generative Adversarial Networks

  • Wu, Bin;Liu, Le;Dai, Zhengge;Wang, Xiujuan;Zheng, Kangfeng
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • Vol. 13, No. 11
    • /
    • pp.5594-5615
    • /
    • 2019
  • Malicious social robots, which disseminate malicious information on social networks, seriously affect information security and network environments. The detection of malicious social robots is a hot topic and a significant concern for researchers. Classification-based methods have been widely used for social robot detection. However, classification is limited by unbalanced data sets in which legitimate (negative) samples outnumber malicious robots (positive samples), which leads to unsatisfactory detection results. This paper proposes the use of generative adversarial networks (GANs) to extend the unbalanced data sets before training classifiers, in order to improve the detection of social robots. Five popular oversampling algorithms were compared in the experiments, and the effects of the imbalance degree and the expansion ratio of the original data on oversampling were studied. The experimental results showed that the proposed method achieved better detection performance than the other algorithms in terms of the F1 measure. The GAN method also performed well when the imbalance degree was smaller than 15%.
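
The abstract above evaluates detectors with the F1 measure rather than plain accuracy. The following is a minimal sketch of why that matters on unbalanced data; the toy labels (95 legitimate, 5 malicious) and the function names are assumptions for illustration and do not come from the paper.

```python
def f1_score(y_true, y_pred, positive=1):
    """F1 = harmonic mean of precision and recall for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    if tp == 0:
        return 0.0  # no true positives, so F1 is taken as zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label equals the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical toy data: 95 legitimate (0) and 5 malicious (1) samples.
y_true = [0] * 95 + [1] * 5
# A degenerate classifier that never flags anything as malicious.
always_negative = [0] * 100

print("accuracy:", accuracy(y_true, always_negative))
print("F1:", f1_score(y_true, always_negative))
```

A classifier that never predicts the minority class still scores high accuracy here while its F1 is zero, which is the failure mode that oversampling is meant to address.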

Time series representation for clustering using unbalanced Haar wavelet transformation

  • 이세훈;백창룡
    • 응용통계연구
    • /
    • Vol. 31, No. 6
    • /
    • pp.707-719
    • /
    • 2018
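  • Various time series representation methods have been proposed for efficient classification and clustering of time series data. This study investigates an improvement to the symbolic aggregate approximation (SAX) method proposed by Lin et al. (2007), which reduces the dimension of a time series by local mean approximation and then discretizes it into symbolic data. Because SAX computes the local means over an arbitrary number of equal-length segments, its performance depends heavily on the number of segments. We therefore propose a method that uses the unbalanced Haar wavelet transformation to choose the local mean levels in a data-dependent way that reflects the characteristics of the data, rather than at equal intervals, thereby reducing the dimension of the time series effectively while limiting the loss of information. An empirical data analysis confirms that the proposed method improves on SAX.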
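
As an illustration of the baseline this abstract refers to, here is a minimal sketch of plain SAX: equal-length local means followed by discretization into symbols. It does not implement the unbalanced Haar variant the paper proposes. The function names, the four-letter alphabet, and the use of equiprobable standard-normal breakpoints are assumptions chosen for illustration.

```python
from statistics import NormalDist, mean, pstdev


def paa(series, n_segments):
    """Local mean approximation: the mean of each of n roughly equal-length segments.

    Assumes 1 <= n_segments <= len(series) so that no segment is empty.
    """
    n = len(series)
    bounds = [round(i * n / n_segments) for i in range(n_segments + 1)]
    return [mean(series[bounds[i]:bounds[i + 1]]) for i in range(n_segments)]


def sax(series, n_segments, alphabet="abcd"):
    """Turn a numeric series into a SAX word with one symbol per segment."""
    mu, sigma = mean(series), pstdev(series)
    z = [(x - mu) / sigma for x in series]  # assumes a non-constant series
    k = len(alphabet)
    # Breakpoints that split the standard normal into k equiprobable bins.
    cuts = [NormalDist().inv_cdf(i / k) for i in range(1, k)]
    word = ""
    for m in paa(z, n_segments):
        # The symbol index is the number of breakpoints below the segment mean.
        word += alphabet[sum(m > c for c in cuts)]
    return word


print(sax([0, 0, 0, 0, 10, 10, 10, 10], n_segments=2))
```

The segment boundaries are fixed by `n_segments` alone, regardless of where the series actually changes level. That is the dependence on the number of segments that the abstract identifies as the weakness of SAX.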

Methods and Techniques for Variance Component Estimation in Animal Breeding - Review -

  • Lee, C.
    • Asian-Australasian Journal of Animal Sciences
    • /
    • Vol. 13, No. 3
    • /
    • pp.413-422
    • /
    • 2000
  • In the class of models that include random effects, variance component estimates are important for obtaining accurate predictors and estimators. Variance component estimation is straightforward for balanced data but not for unbalanced data. Since orthogonality among factors is absent in unbalanced data, various methods for variance component estimation are available. REML estimation is the most widely used method in animal breeding because of its attractive statistical properties. Recently, the Bayesian approach has become feasible through Markov chain Monte Carlo methods and increasingly powerful computers. Furthermore, advances in variance component estimation with complicated models such as generalized linear mixed models have enabled animal breeders to analyze non-normal data.

Machine Learning-based landslide susceptibility mapping - Inje area, South Korea

  • Chanul Choi;Le Xuan Hien;Seongcheon Kwon;Giha Lee
    • 한국수자원학회:학술대회논문집
    • /
    • Korea Water Resources Association 2023 Conference
    • /
    • pp.248-248
    • /
    • 2023
  • In recent years, the number of landslides in Korea has been increasing due to extreme weather events such as localized heavy rainfall and typhoons. Landslides often occur together with debris flows, land subsidence, and earthquakes, and they cause significant damage to life and property. Because 64% of Korea's land area is mountainous, the government has sought to predict landslides in order to reduce damage. In response, the Korea Forest Service has established a 'Landslide Information System' to predict the likelihood of landslides. This system selects a total of 13 landslide factors based on past landslide events and uses logistic regression (LR) to predict the possibility of a landslide; its accuracy is reported to be 0.75. However, most of the data used to train the current system covers landslides that occurred from 2005 to 2011, so it does not reflect recent typhoons or heavy rain. Therefore, in this study, we apply a total of six machine learning techniques (KNN, LR, SVM, XGB, RF, GNB) to predict the occurrence of landslides based on data for Inje, Gangwon-do, recently produced by the National Institute of Forest. To do so, the landslide event and factor data must be converted into a form suitable for machine learning techniques using ArcGIS and Python. In addition, the number of samples differs greatly between areas where landslides occurred and areas where they did not. Therefore, prediction was performed after correcting the unbalanced data using the Tomek Links and Near Miss techniques. Moreover, to control the imbalance, a model that reflects soil properties will be used to remove absolutely safe areas.
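
The abstract names Tomek Links as one of its undersampling techniques but gives no details. Below is a minimal pure-Python sketch of the standard definition: a Tomek link is a pair of mutual nearest neighbours with different class labels, and undersampling drops the majority-class member of each link. The function names and toy points are hypothetical, and this is not the authors' pipeline.

```python
from math import dist


def nearest(i, X):
    """Index of the point closest to X[i] (Euclidean distance), excluding i itself."""
    return min((j for j in range(len(X)) if j != i), key=lambda j: dist(X[i], X[j]))


def tomek_links(X, y):
    """Pairs (i, j), i < j, that are mutual nearest neighbours with different labels."""
    nn = [nearest(i, X) for i in range(len(X))]
    return [(i, j) for i, j in enumerate(nn) if i < j and nn[j] == i and y[i] != y[j]]


def drop_majority_in_links(X, y, majority):
    """Undersample by removing the majority-class member of every Tomek link."""
    drop = {k for pair in tomek_links(X, y) for k in pair if y[k] == majority}
    keep = [i for i in range(len(X)) if i not in drop]
    return [X[i] for i in keep], [y[i] for i in keep]


# Hypothetical toy data: label 1 = landslide (minority), 0 = no landslide (majority).
X = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.2, 5.0), (9.0, 9.0)]
y = [0, 1, 0, 0, 0]

print(tomek_links(X, y))
X_res, y_res = drop_majority_in_links(X, y, majority=0)
print(y_res)
```

In practice a library implementation such as the `TomekLinks` and `NearMiss` samplers in imbalanced-learn would normally be used. The brute-force neighbour search here is only meant to make the definition concrete.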

Analysis of induced voltage of CCPU with unbalanced current from Distribution Line on Underground Transmission Cable System

  • 강지원;장태인;홍동석;정채균;윤동수;윤종건;김형호
    • 대한전기학회:학술대회논문집
    • /
    • The Korean Institute of Electrical Engineers, Proceedings of the 36th Summer Conference 2005, A
    • /
    • pp.459-461
    • /
    • 2005
  • This paper analyses the induced voltage characteristics of the CCPU caused by unbalanced current from a distribution line in underground transmission power cable systems. To obtain data on the voltage and current induced on the CCPU during switching surges, an actual proof test was carried out. This paper is expected to contribute to the establishment of proper methods for protecting the CCPU against unbalanced current from distribution lines in underground transmission power cable systems.
