• Title/Abstract/Keywords: method: data analysis

Search results: 22,243 items (processing time: 0.051 s)

The study for effectiveness of golf skills to adjust average score using path analysis in 2010 PGA

  • 민대기
    • Journal of the Korean Data and Information Science Society
    • /
    • Vol. 22, No. 1
    • /
    • pp.65-71
    • /
    • 2011
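  • Path analysis is a method for identifying the relationships among variables. Its advantage is that it can capture direct effects, indirect effects, and spurious effects, which are difficult to identify with regression analysis. This study used path analysis to examine how the most important playing-skill factors in golf affect scoring average, both directly and indirectly. The data cover the top 186 male professional golfers of 2010.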
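
The decomposition into direct and indirect effects can be illustrated with a minimal sketch. It assumes a hypothetical three-variable model in which a skill X affects scoring average Y both directly and through a mediating skill M. The function name, the variable roles, and the correlation values are illustrative assumptions, not the paper's model or data.

```python
def path_effects(r_xm, r_xy, r_my):
    """Standardized path coefficients for the model X -> M -> Y plus X -> Y.

    Inputs are Pearson correlations among X, M, and Y (hypothetical values).
    """
    a = r_xm                                   # path X -> M
    denom = 1.0 - r_xm ** 2
    b = (r_my - r_xm * r_xy) / denom           # path M -> Y, controlling for X
    direct = (r_xy - r_xm * r_my) / denom      # path X -> Y, controlling for M
    indirect = a * b                           # effect of X on Y routed through M
    return {"direct": direct, "indirect": indirect, "total": direct + indirect}


# Made-up correlations, for illustration only.
effects = path_effects(r_xm=0.5, r_xy=-0.6, r_my=-0.7)
print(effects)
```

In this standardized model the direct and indirect effects sum to the correlation between X and Y, which is the split that a single regression coefficient does not show.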

Predicting Unknown Composition of a Mixture Using Independent Component Analysis

  • Lee, Hye-Seon;Park, Hae-Sang;Jun, Chi-Hyuck
    • Korean Data and Information Science Society: Conference Proceedings
    • /
    • Korean Data and Information Science Society 2005 Spring Conference
    • /
    • pp.127-134
    • /
    • 2005
  • In statistics and signal processing, a suitable, conceptually simple representation of the data is essential for subsequent analysis such as prediction, pattern recognition, and spatial analysis. Independent component analysis (ICA) is a statistical method for transforming observed high-dimensional multivariate data into statistically independent components. ICA has been applied increasingly in a wide range of spectral applications because it can extract the unknown components of a mixture from spectra. We focus on applying ICA to separate independent sources and to predict each composition using the extracted components. The theory of ICA is introduced and an application to metal surface spectral data is described, in which a subsequent analysis using the non-negative least squares method predicts the composition ratio of each sample. Furthermore, simulation experiments are performed to demonstrate the performance of the proposed approach.
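
The second stage, predicting composition ratios by non-negative least squares, can be sketched as follows. The sketch assumes the component spectra have already been extracted (the ICA step is omitted) and uses projected gradient descent as a simple stand-in for a dedicated NNLS solver. The spectra `s1` and `s2` and the 0.7/0.3 mixture are made-up values, not the paper's data.

```python
def nnls_projected_gradient(components, mixture, iterations=500):
    """Minimize ||sum_k x_k * components[k] - mixture||^2 subject to x_k >= 0."""
    n = len(components)
    # Gram matrix G = C C^T and vector h = C y define the gradient G x - h.
    gram = [[sum(a * b for a, b in zip(ci, cj)) for cj in components]
            for ci in components]
    h = [sum(a * y for a, y in zip(ci, mixture)) for ci in components]
    # The trace of G bounds its largest eigenvalue, so 1/trace is a safe step size.
    step = 1.0 / sum(gram[i][i] for i in range(n))
    x = [0.0] * n
    for _ in range(iterations):
        grad = [sum(gram[i][j] * x[j] for j in range(n)) - h[i] for i in range(n)]
        # Gradient step followed by projection onto the constraint x >= 0.
        x = [max(0.0, xi - step * gi) for xi, gi in zip(x, grad)]
    return x


# Two hypothetical component spectra over five channels.
s1 = [1.0, 0.8, 0.2, 0.0, 0.0]
s2 = [0.0, 0.1, 0.5, 1.0, 0.6]
observed = [0.7 * a + 0.3 * b for a, b in zip(s1, s2)]
ratios = nnls_projected_gradient([s1, s2], observed)
print([round(r, 4) for r in ratios])
```

The non-negativity constraint matters because a composition ratio below zero has no physical meaning, which an unconstrained least squares fit does not rule out.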

On Logistic Regression Analysis Using Propensity Score Matching

  • 김소연;백종일
    • 한국신뢰성학회지:신뢰성응용연구
    • /
    • Vol. 16, No. 4
    • /
    • pp.323-330
    • /
    • 2016
  • Purpose: Propensity score matching has recently been used in a large number of research papers; nonetheless, no study has tested model fit before and after propensity score matching. The main contribution of this research is therefore to compare the fit of logistic regression models before and after propensity score matching, using data from the 'online survey of adolescent health'. Method: Observations with similar propensity scores in the two groups are extracted by propensity score matching, and logistic regression analysis is then carried out separately on the data before and after matching. Results: To test the fit of the logistic regression model, we use the model summary, -2 log likelihood, and the Hosmer-Lemeshow test. The results confirm that the data after matching are more suitable for logistic regression analysis than the data before matching. Conclusion: Using propensity score matching yields a logistic regression model with better fit.
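
The matching step can be sketched as below. The abstract does not say which matching algorithm was used, so this assumes greedy 1:1 nearest-neighbor matching without replacement inside a caliper. The unit ids, the propensity scores, and the `caliper` value are hypothetical; in practice the scores would come from a fitted logistic regression.

```python
def match_by_propensity(treated, control, caliper=0.05):
    """Greedy 1:1 nearest-neighbor matching without replacement.

    treated, control: dicts mapping unit id -> propensity score in [0, 1].
    Returns a list of (treated_id, control_id) pairs.
    """
    available = dict(control)
    pairs = []
    # Process treated units in ascending score order so the result is reproducible.
    for t_id, t_score in sorted(treated.items(), key=lambda kv: kv[1]):
        if not available:
            break
        c_id = min(available, key=lambda c: abs(available[c] - t_score))
        if abs(available[c_id] - t_score) <= caliper:
            pairs.append((t_id, c_id))
            del available[c_id]  # each control unit is used at most once
    return pairs


# Made-up propensity scores, for illustration only.
treated = {"t1": 0.62, "t2": 0.33, "t3": 0.90}
control = {"c1": 0.30, "c2": 0.60, "c3": 0.66, "c4": 0.10}
pairs = match_by_propensity(treated, control)
print(pairs)
```

Treated unit `t3` stays unmatched because no control lies within the caliper, which is how matching discards observations that have no comparable counterpart.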

Detecting Anomalies in Time-Series Data using Unsupervised Learning and Analysis on Infrequent Signatures

  • Bian, Xingchao
    • 전기전자학회논문지
    • /
    • Vol. 24, No. 4
    • /
    • pp.1011-1016
    • /
    • 2020
  • We propose a framework called Stacked Gated Recurrent Unit - Infrequent Residual Analysis (SG-IRA) that detects anomalies in time-series data and can be trained on streams of raw sensor data without any pre-labeled dataset. To enable such unsupervised learning, SG-IRA includes an estimation model that uses a stacked Gated Recurrent Unit (GRU) structure and an analysis method that detects anomalies based on the difference between the estimated value and the actual measurement (the residual). Unlike the baseline model, which relies on a constant threshold, SG-IRA's residual analysis method dynamically adapts the detection threshold from the population using frequency analysis. In this paper, SG-IRA is evaluated on industrial control system (ICS) datasets. SG-IRA improves the detection performance (F1 score) by 5.9% compared to the baseline model.
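
The idea of deriving the threshold from residual frequencies rather than a constant can be sketched as follows. This is not the SG-IRA algorithm: the stacked GRU is replaced by precomputed estimates, and "frequency analysis" is assumed here to mean a histogram of residual magnitudes in which rarely occupied bins count as anomalous. The function name, `bin_width`, `min_share`, and the data are hypothetical.

```python
from collections import Counter


def infrequent_residual_flags(actual, estimated, bin_width=0.5, min_share=0.05):
    """Flag points whose residual falls in a rarely occupied histogram bin.

    No fixed residual threshold is used: what counts as anomalous depends on
    how often each residual magnitude occurs in the data being analyzed.
    """
    residuals = [abs(a - e) for a, e in zip(actual, estimated)]
    bins = [int(r // bin_width) for r in residuals]
    counts = Counter(bins)
    n = len(residuals)
    rare = {b for b, c in counts.items() if c / n < min_share}
    return [b in rare for b in bins]


# Made-up sensor stream: small estimation errors plus one large deviation.
estimated = [10.0] * 30
actual = [10.0 + 0.1 * (i % 3) for i in range(30)]
actual[17] = 14.0
flags = infrequent_residual_flags(actual, estimated)
print([i for i, flagged in enumerate(flags) if flagged])
```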

Flow Analysis of Centrifugal Compressor Using Quasi-Three-Dimensional Analysis

  • 안상준;김광용
    • 한국유체기계학회 논문집
    • /
    • Vol. 6, No. 1
    • /
    • pp.30-36
    • /
    • 2003
  • This paper presents an analysis of the flows through three different types of radial compressor impellers using a quasi-three-dimensional analysis method. The method obtains a two-dimensional solution for the velocity distribution on the meridional plane and then approximately calculates the static pressure distributions on the blade surfaces. A finite difference method is used to solve the governing equations. The compressors have a low compression ratio and 12 straight radial blades with no backsweep. The results are compared with experimental data and with the results of a three-dimensional inviscid analysis obtained by the finite element method. Agreement with the experimental data is good in cases where viscous effects are not dominant.

Classification of Terrestrial LiDAR Data Using Factor and Cluster Analysis

  • 최승필;조지현;김열;김준성
    • 대한공간정보학회지
    • /
    • Vol. 19, No. 4
    • /
    • pp.139-144
    • /
    • 2011
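  • This study presents a classification method for LiDAR data that uses the color information (R, G, B) and the reflectance intensity information (I) obtained from terrestrial LiDAR data together, analyzing their interrelationships with statistical classification techniques. To this end, factors that maximize variance were first extracted from the variables R, G, B, and I, and the factor matrix relating the principal factors to each variable was computed. Although the factor matrix presents the raw data in reduced form, it does not show clearly which variables are strongly related to which factors. The Varimax method, an orthogonal rotation method, was therefore used to obtain a rotated factor matrix, from which factor scores were computed. Cluster analysis was then performed on these factor scores using the K-means method, a non-hierarchical clustering method, and the classification accuracy for the terrestrial LiDAR data was evaluated.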
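
The final clustering step can be sketched with a plain K-means (Lloyd's algorithm). The sketch assumes the factor scores are already available, so factor extraction and Varimax rotation are omitted. The two-factor scores and the choice of initial centers are made-up values, not the paper's data.

```python
def kmeans(points, centers, max_iter=100):
    """Lloyd's K-means on tuples of floats, starting from the given centers."""
    centers = [tuple(c) for c in centers]
    labels = []
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest center.
        labels = [
            min(range(len(centers)),
                key=lambda k: sum((p - c) ** 2 for p, c in zip(pt, centers[k])))
            for pt in points
        ]
        # Update step: each center moves to the mean of its members.
        new_centers = []
        for k, old in enumerate(centers):
            members = [pt for pt, lab in zip(points, labels) if lab == k]
            if members:
                new_centers.append(
                    tuple(sum(dim) / len(members) for dim in zip(*members)))
            else:
                new_centers.append(old)  # an empty cluster keeps its center
        if new_centers == centers:
            break  # converged: no center moved
        centers = new_centers
    return labels, centers


# Hypothetical factor scores (factor 1, factor 2) for six LiDAR points.
factor_scores = [(-1.2, 0.3), (-1.0, 0.1), (-0.9, 0.4),
                 (1.1, -0.2), (0.9, -0.5), (1.3, -0.1)]
labels, centers = kmeans(factor_scores, [factor_scores[0], factor_scores[3]])
print(labels)
```

Clustering on factor scores rather than on raw R, G, B, and I values means the distances are measured along the rotated, uncorrelated factors.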

Mode identifiability of a cable-stayed bridge using modal contribution index

  • Huang, Tian-Li;Chen, Hua-Peng
    • Smart Structures and Systems
    • /
    • Vol. 20, No. 2
    • /
    • pp.115-126
    • /
    • 2017
  • The modal identification of large civil structures such as bridges under ambient vibration conditions has been widely investigated during the past decade. Many operational modal analysis methods have been proposed and successfully used to identify the dynamic characteristics of constructed bridges in service. However, very little research is available on reliable criteria for the robustness of the identified modal parameters of bridge structures. In this study, two time-domain operational modal analysis methods, the data-driven stochastic subspace identification (SSI-DATA) method and the covariance-driven stochastic subspace identification (SSI-COV) method, are employed to identify the modal parameters from field-recorded ambient acceleration data. On the basis of the SSI-DATA method, the modal contribution indexes of all identified modes to the measured acceleration data are computed using the Kalman filter, and their applicability for evaluating the robustness of identified modes is also investigated. The benchmark problem developed by Hong Kong Polytechnic University, with field acceleration measurements of a cable-stayed bridge under different excitation conditions, is adopted to show the effectiveness of the proposed method. The results of the benchmark study show that the robustness of identified modes can be judged from their modal contributions to the measured vibration data. A critical modal contribution index of 2% is tentatively suggested for reliable identifiability of modal parameters in the benchmark problem.

Outlier detection of GPS monitoring data using relational analysis and negative selection algorithm

  • Yi, Ting-Hua;Ye, X.W.;Li, Hong-Nan;Guo, Qing
    • Smart Structures and Systems
    • /
    • Vol. 20, No. 2
    • /
    • pp.219-229
    • /
    • 2017
  • Outlier detection is an essential task for identifying abnormal events before structures suffer sudden failure during their service lives. This paper proposes a two-phase method for outlier detection in Global Positioning System (GPS) monitoring data. First, the occurrence of abnormal data is judged promptly using relational analysis, since the data obtained from adjacent locations are related according to a certain rule. Then, a negative selection algorithm (NSA) is adopted to localize the abnormal data more accurately. To reduce the computation cost of the NSA, an improved scheme that integrates an adjustable radius into the training stage is designed and implemented. Numerical simulations and experimental verifications demonstrate that the proposed method compares favorably with the original method in terms of efficiency and reliability. The method relies only on the monitoring data and does not require engineering expertise on the structural operational characteristics, so it can be easily embedded in a software system for continuous and reliable monitoring of civil infrastructure.

Performance Analysis of Perturbation-based Privacy Preserving Techniques: An Experimental Perspective

  • Ritu Ratra;Preeti Gulia;Nasib Singh Gill
    • International Journal of Computer Science & Network Security
    • /
    • Vol. 23, No. 10
    • /
    • pp.81-88
    • /
    • 2023
  • In the present scenario, enormous amounts of data are produced every second. These data also contain private information from sources including media platforms, the banking sector, finance, healthcare, and criminal histories. Data mining is a method for searching and analyzing massive volumes of data to find usable information. Preserving personal data during data mining has become difficult, so privacy-preserving data mining (PPDM) is used for this purpose. Data perturbation is one of several techniques used in PPDM to protect data privacy. In perturbation, datasets are perturbed in order to preserve personal information, which addresses both data accuracy and data privacy. This paper explores and compares several perturbation strategies that may be used to protect data privacy. For this experiment, two perturbation techniques were used, based on random projection and principal component analysis respectively: Improved Random Projection Perturbation (IRPP) and Enhanced Principal Component Analysis based Technique (EPCAT). The Naive Bayes classification algorithm is used as the data mining method, and the techniques are assessed in terms of precision, run time, and accuracy. The random projection-based technique (IRPP) is found to be the best perturbation method under Naive Bayes classification for both the cardiovascular and hypothyroid datasets.
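
A minimal sketch of perturbation by plain random projection is given below. It is not the paper's IRPP, whose improvements the abstract does not describe. The function names, the seed, and the records are hypothetical.

```python
import math
import random


def random_projection_matrix(d, k, seed=0):
    """d x k matrix with N(0, 1/k) entries.

    With this scaling, squared vector lengths are preserved in expectation.
    """
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) / math.sqrt(k) for _ in range(k)]
            for _ in range(d)]


def perturb(rows, matrix):
    """Project each record (row) onto the random directions in `matrix`."""
    k = len(matrix[0])
    return [
        [sum(x * matrix[i][j] for i, x in enumerate(row)) for j in range(k)]
        for row in rows
    ]


# Made-up numeric records with four attributes each.
records = [[63.0, 145.0, 233.0, 1.0],
           [41.0, 130.0, 204.0, 0.0],
           [57.0, 120.0, 354.0, 0.0]]
R = random_projection_matrix(d=4, k=3, seed=42)
masked = perturb(records, R)
print(masked)
```

The released records no longer contain the original attribute values, while distances between records are roughly retained on average, which is what lets a classifier still be trained on the perturbed data.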

Clustering non-stationary advanced metering infrastructure data

  • Kang, Donghyun;Lim, Yaeji
    • Communications for Statistical Applications and Methods
    • /
    • Vol. 29, No. 2
    • /
    • pp.225-238
    • /
    • 2022
  • In this paper, we propose a clustering method for advanced metering infrastructure (AMI) data in Korea. Because AMI data are non-stationary, we consider time-dependent frequency domain principal components analysis, which is a suitable method for locally stationary time series data. We develop a new clustering method based on time-varying eigenvectors, and our method provides a meaningful result that differs from the clustering results obtained with conventional methods such as K-means and K-centres functional clustering. A simulation study demonstrates the superiority of the proposed approach. We further apply the clustering results to the evaluation of the electricity price system in South Korea, and validate the reform of the progressive electricity tariff system.