• Title/Summary/Keyword: 다변량분석 (multivariate analysis)

Detection of the Change in Blogger Sentiment using Multivariate Control Charts (다변량 관리도를 활용한 블로거 정서 변화 탐지)

  • Moon, Jeounghoon;Lee, Sungim
    • The Korean Journal of Applied Statistics / v.26 no.6 / pp.903-913 / 2013
  • Social network services generate a considerable amount of social data every day about personal feelings and thoughts. This social data reveals changing patterns of information production and consumption and also serves as a tool that reflects social phenomena. We analyze negative emotional words from daily blogs to detect changes in blogger sentiment using multivariate control charts. We used all the blogs produced between 1 January 2008 and 31 December 2009. Hotelling's T-square control chart is commonly used to monitor multivariate quality characteristics; however, it assumes that the quality characteristics follow a multivariate normal distribution. Because the performance of a multivariate control chart is affected by this assumption, we introduce the support vector data description and its extension (the K-control chart) suggested by Sun and Tsung (2003) and apply them to detect the change in blogger sentiment.
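As a rough illustration of the Hotelling's T-square statistic mentioned in the abstract above, the sketch below scores two-dimensional observations against an in-control reference sample. The function name `hotelling_t2` and the toy numbers are assumptions for illustration only; they are not the authors' data or implementation, and no control limit is computed here.

```python
from statistics import mean


def hotelling_t2(reference, observations):
    """Return Hotelling's T-square for each 2-D observation.

    reference: in-control samples used to estimate the mean vector and
    the sample covariance matrix (hypothetical toy data in this sketch).
    """
    n = len(reference)
    mx = mean(r[0] for r in reference)
    my = mean(r[1] for r in reference)
    # Sample covariance matrix entries (divisor n - 1).
    sxx = sum((r[0] - mx) ** 2 for r in reference) / (n - 1)
    syy = sum((r[1] - my) ** 2 for r in reference) / (n - 1)
    sxy = sum((r[0] - mx) * (r[1] - my) for r in reference) / (n - 1)
    det = sxx * syy - sxy ** 2
    if det <= 0:
        raise ValueError("covariance matrix is singular")
    scores = []
    for x, y in observations:
        dx, dy = x - mx, y - my
        # d^T S^{-1} d, using the closed-form inverse of a 2x2 matrix.
        scores.append((syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det)
    return scores


# Toy reference sample with positively correlated variables.
reference = [(1.0, 2.0), (2.0, 1.0), (3.0, 4.0), (4.0, 3.0)]
scores = hotelling_t2(reference, [(2.5, 2.5), (4.5, 4.5), (4.5, 0.5)])
print(scores)  # approximately [0.0, 3.0, 12.0]
```

The last two test points lie at the same Euclidean distance from the mean, but the one that moves against the positive correlation receives the larger score, which is the behaviour that makes the statistic useful for jointly monitored characteristics.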

Evaluation and Comparison of seasonal multivariate time series model construction with rainfall and site characteristics (강우 및 지점특성치를 이용한 계절형 다변량 시계열 모형 구축 평가 및 비교)

  • Kim, Taereem;Choi, Wonyoung;Shin, Hongjoon;Heo, Jun-Haeng
    • Proceedings of the Korea Water Resources Association Conference / 2015.05a / pp.29-29 / 2015
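  • Predicting and analyzing hydrologic quantities is essential for the sustainable management and efficient use of water resources, and a variety of hydrologic models have accordingly been built to predict representative quantities such as rainfall and streamflow. Among them, hydrologic time series models treat hydrologic data recorded at regular intervals as a stochastic process and use the fitted model to predict future hydrologic quantities, on the assumption that previously recorded hydrologic patterns will persist into the future. Time series models are generally either univariate models, built from a single data series, or multivariate models, which consider additional data series besides the original one; by including exogenous variables that influence the original series, multivariate models have the advantage of reflecting the correlation between the two series. In addition, building a time series model that accounts for the seasonality of the data series allows the seasonal effects in hydrologic time series to be well represented. In this study, inflow to a dam, a representative hydraulic structure, was therefore predicted using the SARIMAX (Seasonal AutoRegressive Integrated Moving Average with eXogenous variables) model, a multivariate time series model that accounts for seasonality. Rainfall, which is highly correlated with dam inflow, has generally been used as the exogenous variable in dam inflow prediction; here, models that also consider site characteristics that may affect inflow were built and compared.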

Bivariate regional frequency analysis of extreme rainfalls in Korea (이변량 지역빈도해석을 이용한 우리나라 극한 강우 분석)

  • Shin, Ju-Young;Jeong, Changsam;Ahn, Hyunjun;Heo, Jun-Haeng
    • Journal of Korea Water Resources Association / v.51 no.9 / pp.747-759 / 2018
  • Multivariate regional frequency analysis combines the advantages of the regional and multivariate frameworks: it draws on a large regional dataset and models phenomena that cannot be considered in univariate frequency analysis. To the best of our knowledge, multivariate regional frequency analysis has not been employed for hydrological variables in South Korea. Its applicability to hydrological variables in South Korea should be investigated in order to improve our capacity to model them. The current study focused on estimating the parameters of regional copula and regional marginal models, selecting the most appropriate distribution models, and estimating the regional multivariate growth curve in the multivariate regional frequency analysis. Annual maximum rainfall and duration data observed at 71 stations were used for the analysis. The results indicate that the Frank and Gumbel copula models were selected as the most appropriate regional copula models for the employed regions. Several distributions, e.g. Gumbel and log-normal, were the representative regional marginal models. Based on the relative root mean square error of the quantile growth curves, the multivariate regional frequency analysis provided more stable and accurate quantiles than the multivariate at-site frequency analysis, especially for long return periods. Application of regional frequency analysis in bivariate rainfall-duration analysis can provide more stable quantile estimation for hydraulic infrastructure design criteria and more accurate modelling of the rainfall-duration relationship.
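For readers unfamiliar with the two copula families named in the abstract above, their standard bivariate forms are given below. The symbols $u$, $v$ (marginal non-exceedance probabilities, here of rainfall and duration) and $\theta$ (dependence parameter) are notation chosen for this note and may differ from the paper's.

```latex
% Gumbel copula, \theta \ge 1 (\theta = 1 gives independence)
C^{\mathrm{Gumbel}}_{\theta}(u, v)
  = \exp\!\left\{ -\left[ (-\ln u)^{\theta} + (-\ln v)^{\theta} \right]^{1/\theta} \right\}

% Frank copula, \theta \ne 0
C^{\mathrm{Frank}}_{\theta}(u, v)
  = -\frac{1}{\theta}
    \ln\!\left( 1 + \frac{\left(e^{-\theta u} - 1\right)\left(e^{-\theta v} - 1\right)}{e^{-\theta} - 1} \right)
```

The Gumbel family captures upper-tail dependence, while the Frank family is symmetric with no tail dependence, which is one reason the two are often compared as candidates.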

Detecting Influential Observations in Multivariate Statistical Analysis of Incomplete Data by PCA (주성분분석에 의한 결손 자료의 영향값 검출에 대한 연구)

  • 김현정;문승호;신재경
    • The Korean Journal of Applied Statistics / v.13 no.2 / pp.383-392 / 2000
  • Since the late 1970s, methods of influence or sensitivity analysis for detecting influential observations have been studied not only in regression and related methods but also in various multivariate methods. When the results of a multivariate analysis depend heavily on a small number of observations, conclusions should be drawn with great care. Similar phenomena may also occur in the case of incomplete data. In this research we study such influential observations in the multivariate statistical analysis of incomplete data. The case of principal component analysis is studied with a numerical example.

Multivariate Region Growing Method with Image Segments (영상분할단위 기반의 다변량 영역확장기법)

  • 이종열
    • Proceedings of the Korean Association of Geographic Information Studies Conference / 2004.03a / pp.273-278 / 2004
  • Feature identification is one of the largest issues in high spatial resolution satellite imagery. A popular method associated with feature identification is image segmentation, which produces image segments that are more likely to correspond to the features of interest. Here, a combination of edge extraction and region growing methods is proposed to improve the result of image segmentation. In the initial step, an image was segmented by an edge detection method. The segments were assigned IDs, and the polygon topology of the segments was built. Based on the topology, each segment was tested for similarity with its adjacent segments using multivariate analysis. Segments with similar spectral characteristics were merged into a region. The test application shows that segments belonging to individual large, spectrally homogeneous structures, such as buildings and roads, were merged into shapes that more closely resemble those structures.

Extreme Rainfall Reproduction Ability Assessment of Multivariate Downscaling Model (다변량 Downscaling 모델의 극치 강수량 재현 능력 평가)

  • Moon, Young-Il;Moon, Jang-Won;Kwon, Hyun-Han
    • Proceedings of the Korea Water Resources Association Conference / 2011.05a / pp.393-393 / 2011
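  • In recent years, climate change has brought abnormal weather and climate, and unexpected extreme events now occur frequently. Various models are being developed to predict extreme events, but most studies focus on simulating changes in runoff characteristics. Accurate estimation of the underlying precipitation data, however, is arguably the most important element of climate change research. Moreover, previous studies generated daily precipitation without considering the spatial correlation among precipitation stations and then used it as input to rainfall-runoff models, and so failed to reflect the characteristics of precipitation falling over the whole basin. To address these issues, we propose a multivariate downscaling model that can simulate the actual rainfall patterns in a basin, and, to resolve the failure of previous studies to reproduce extreme events, we carry out the analysis after transforming the input data into extreme values. In this paper, the model is applied to an actual basin to evaluate its validity, and the variability of extreme hydrologic quantities is analyzed and evaluated in comparison with previous studies.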

Fault Detection Method for Multivariate Process using ICA (독립성분분석을 이용한 다변량 공정에서의 고장탐지 방법)

  • Jung, Seunghwan;Kim, Minseok;Lee, Hansoo;Kim, Jonggeun;Kim, Sungshin
    • Journal of the Korea Institute of Information and Communication Engineering / v.24 no.2 / pp.192-197 / 2020
  • Multivariate processes, such as large-scale power plants or chemical processes, are operated in very hazardous environments, and a fault can lead to significant human and material losses. On-line monitoring technology is therefore essential to detect system faults. In this paper, an ICA-based fault detection method is applied to three different multivariate process datasets. The ICA-based fault detection procedure is divided into off-line and on-line processes. The off-line process determines a threshold for fault detection from a dataset obtained while the system is normal. The on-line process computes statistics of query vectors measured in real time. A fault is detected by comparing the computed statistics with the previously defined threshold. For comparison, a PCA-based fault detection method is also implemented. Experimental results show that the ICA-based fault detection method detects system faults earlier and better than the PCA-based method.
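The off-line/on-line split described above can be sketched as follows. This is not ICA: as a stand-in, the monitoring statistic here is simply the sum of squared standardized deviations, and the threshold is an empirical quantile of that statistic on fault-free data. All function names, the `quantile` parameter, and the toy data are assumptions made for illustration.

```python
from statistics import mean, stdev


def monitor_statistic(row, mus, sigmas):
    """Sum of squared standardized deviations (a stand-in for an ICA statistic)."""
    return sum(((x - m) / s) ** 2 for x, m, s in zip(row, mus, sigmas))


def fit_offline(normal_data, quantile=0.99):
    """Off-line phase: learn scaling and a threshold from fault-free data.

    Assumes every variable actually varies in normal_data (nonzero stdev).
    """
    columns = list(zip(*normal_data))
    mus = [mean(c) for c in columns]
    sigmas = [stdev(c) for c in columns]
    stats = sorted(monitor_statistic(row, mus, sigmas) for row in normal_data)
    # Empirical quantile of the statistic under normal operation.
    index = min(len(stats) - 1, int(quantile * len(stats)))
    return mus, sigmas, stats[index]


def is_fault(row, model):
    """On-line phase: flag a query vector whose statistic exceeds the threshold."""
    mus, sigmas, threshold = model
    return monitor_statistic(row, mus, sigmas) > threshold


# Hypothetical two-sensor readings recorded during normal operation.
normal_data = [(10.0, 5.0), (10.2, 5.1), (9.8, 4.9), (10.1, 5.0), (9.9, 5.0)]
model = fit_offline(normal_data)
print(is_fault((10.0, 5.0), model))  # False: matches the normal mean
print(is_fault((12.0, 5.0), model))  # True: first sensor far outside normal range
```

In an ICA- or PCA-based scheme, `monitor_statistic` would instead be computed on the extracted components, but the threshold-then-compare structure stays the same.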

Evaluation of applicability of pan coefficient estimation method by multiple linear regression analysis (다변량 선형회귀분석을 이용한 증발접시계수 산정방법 적용성 검토)

  • Rim, Chang-Soo
    • Journal of Korea Water Resources Association / v.55 no.3 / pp.229-243 / 2022
  • The effects of monthly meteorological data measured at 11 stations in South Korea on the pan coefficient were analyzed to develop four types of multiple linear regression models for estimating pan coefficients. To evaluate the applicability of the developed models, they were compared with six previous models. Pan coefficients were most affected by air temperature in January, February, March, July, November and December, and by solar radiation in the other months. Over the 12 months of the year as a whole, the effects of wind speed and relative humidity on the pan coefficient were less significant than those of air temperature and solar radiation. For all meteorological stations and months, the model developed by applying 5 independent variables (wind speed, relative humidity, air temperature, ratio of sunshine duration to daylight duration, and solar radiation) for each station was the most effective for evaporation estimation. The model validation results indicate that the multiple linear regression models can be applied to some particular stations and months.
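A minimal sketch of fitting a multiple linear regression by ordinary least squares, the model class named in the abstract above, is shown below. It solves the normal equations directly; the function name `fit_linear_regression` and the synthetic two-predictor data are assumptions for illustration and are unrelated to the study's station records or fitted coefficients.

```python
def fit_linear_regression(X, y):
    """Least-squares coefficients [intercept, b1, ..., bp] via the normal equations."""
    rows = [[1.0] + list(r) for r in X]  # prepend the intercept column
    p = len(rows[0])
    # Build the augmented system (A^T A | A^T y).
    aug = [
        [sum(r[i] * r[j] for r in rows) for j in range(p)]
        + [sum(r[i] * t for r, t in zip(rows, y))]
        for i in range(p)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(p):
        pivot = max(range(col, p), key=lambda k: abs(aug[k][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("predictors are collinear")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for k in range(p):
            if k != col:
                factor = aug[k][col] / aug[col][col]
                aug[k] = [a - factor * b for a, b in zip(aug[k], aug[col])]
    return [aug[i][p] / aug[i][i] for i in range(p)]


# Synthetic data generated exactly from y = 1 + 2*x1 + 3*x2 (not real measurements).
X = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (2.0, 1.0)]
y = [1.0, 3.0, 4.0, 6.0, 8.0]
coefs = fit_linear_regression(X, y)
print(coefs)  # approximately [1.0, 2.0, 3.0]
```

With real meteorological predictors the columns of `X` would be the chosen variables per station and month; a library routine is usually preferable to hand-rolled elimination once the number of predictors grows.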

A Comparative Study on the Multivariate Thomas-Fiering and Matalas Model (다변량 Thomas-Fiering 모형과 Matalas 모형의 비교연구)

  • 이주헌;이은태
    • Water for future / v.24 no.4 / pp.59-66 / 1991
  • The purpose of synthesizing monthly river flows from short-term observed data by means of multivariate stochastic models is to provide abundant input data to water resources systems whose performance and operation policy are to be determined beforehand. In this study, multivariate Thomas-Fiering and Matalas models for synthetic generation based on stream flows in a neighboring basin were employed to check whether they can be applied to the modeling of monthly flows. Statistical parameters estimated by the method of moments and by Fourier series analysis, respectively, were reproduced to assess the statistical features. For comparison, the statistical parameters of the monthly flows generated by each model were compared with those of the observed monthly flows. The results of this study suggest that the Matalas model is applicable to the synthetic generation of monthly river flows.
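For orientation, the textbook forms of the two generators are written out below. The first is the classical single-site Thomas-Fiering recursion, shown here as an assumption about the building block rather than the multivariate variant used in the paper; the second is the lag-one multivariate model of Matalas. The notation ($q$, $b_j$, $r_j$, $s_j$, $M_0$, $M_1$) is chosen for this note.

```latex
% Thomas-Fiering (single site): flow in month j+1 from flow in month j
q_{i+1} = \bar{q}_{j+1} + b_j \left( q_i - \bar{q}_j \right)
          + t_i \, s_{j+1} \sqrt{1 - r_j^{2}},
\qquad b_j = r_j \, \frac{s_{j+1}}{s_j}
% \bar{q}_j, s_j : mean and standard deviation of month-j flows
% r_j            : correlation between flows of months j and j+1
% t_i            : standard normal random deviate

% Matalas lag-one multivariate model for the standardized flow vector X_t
X_{t+1} = A X_t + B \varepsilon_{t+1},
\qquad A = M_1 M_0^{-1},
\qquad B B^{\mathsf{T}} = M_0 - M_1 M_0^{-1} M_1^{\mathsf{T}}
% M_0 : lag-zero covariance matrix, M_1 : lag-one cross-covariance matrix
% \varepsilon_{t+1} : vector of independent standard normal deviates
```

Both recursions preserve the first two moments and the lag-one correlation structure of the observed series, which is why comparing generated and observed statistical parameters is a natural validation step.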
