• Title/Abstract/Keywords: method: data analysis

A Study on Selecting Principle Component Variables Using Adaptive Correlation (적응적 상관도를 이용한 주성분 변수 선정에 관한 연구)

  • Ko, Myung-Sook
    • KIPS Transactions on Software and Data Engineering / Vol. 10, No. 3 / pp. 79-84 / 2021
  • Processing high-dimensional data requires a feature extraction method that reflects the features well while maintaining the properties of the data. Principal component analysis, which converts high-dimensional data into low-dimensional data and expresses it with fewer variables than the original data, is a representative feature extraction method. In this study, we propose a principal component analysis method based on adaptive correlation for selecting principal component variables when extracting features from high-dimensional data. The proposed method analyzes the principal components of the data by adaptively reflecting the correlation among the input data. It ranks the principal components by their eigenvector coefficient values so that low-ranked components are excluded from the candidate list, and it uses correlation analysis to minimize the data duplication that induces data bias. In this way, we propose a method for selecting principal component variables that represent the characteristics of the actual data well by reducing the influence of data bias during selection.
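
The abstract does not give the algorithm in detail, so the following is only a minimal sketch of the general idea it describes: rank variables by the size of their coefficient in the leading eigenvector of the correlation matrix, and skip any variable that is highly correlated with one already selected. The function names, the power-iteration solver, and the `max_abs_corr` cutoff are assumptions made for illustration, not the author's method.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length sequences (neither may be constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def correlation_matrix(columns):
    k = len(columns)
    return [[pearson(columns[i], columns[j]) for j in range(k)] for i in range(k)]

def leading_eigenvector(matrix, iterations=200):
    """Approximate the leading eigenvector by power iteration.

    Assumes the start vector is not orthogonal to the leading eigenvector.
    """
    k = len(matrix)
    v = [float(i + 1) for i in range(k)]
    for _ in range(iterations):
        w = [sum(matrix[i][j] * v[j] for j in range(k)) for i in range(k)]
        norm = math.sqrt(sum(c * c for c in w))
        v = [c / norm for c in w]
    return v

def select_variables(columns, max_abs_corr=0.9):
    """Pick variables by descending |loading|, skipping near-duplicates.

    max_abs_corr is a hypothetical cutoff chosen for this sketch.
    """
    corr = correlation_matrix(columns)
    loadings = leading_eigenvector(corr)
    order = sorted(range(len(columns)), key=lambda i: abs(loadings[i]), reverse=True)
    selected = []
    for i in order:
        if all(abs(corr[i][j]) <= max_abs_corr for j in selected):
            selected.append(i)
    return selected

# Hypothetical data: columns 0 and 1 are nearly duplicates of each other.
columns = [
    [1, 2, 3, 4, 5, 6],
    [2, 4, 6, 8, 10, 12.5],
    [3, 1, 4, 1, 5, 9],
]
selected = select_variables(columns)
```

With this toy input, only one of the two nearly duplicated columns can be selected, which is the kind of duplication-induced bias the abstract aims to reduce.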

Development of Realtime GRID Analysis Method based on the High Precision Streaming Data

  • Lee, HyeonSoo;Suh, YongCheol
    • Journal of the Korean Society of Surveying, Geodesy, Photogrammetry and Cartography / Vol. 34, No. 6 / pp. 569-578 / 2016
  • With recent advances in surveying and technology, the acquisition rate and precision of spatial data have improved continually. Because spatial data is updated rapidly and its size grows with advancing technology, the LOD (Level of Detail) algorithm has been adopted to render data in real time in a streaming format, with the spatial data divided into discrete levels of precision. Existing GRID analysis uses a single DEM as is, examining and analyzing data outside the analysis area as well, so analysis time grows in proportion to the quantity of data. Hence, this study suggests a method that reduces analysis time and data throughput by acquiring and analyzing, in real time, only the DEM data needed for GRID analysis, based on the area of analysis and the level of precision. The method targets streaming DEM data, which is used mostly for 3D geographic information services.
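
As a rough illustration of the idea in this abstract, the sketch below restricts a grid computation to the cells inside the analysis window and samples them at a chosen stride. The in-memory list-of-lists DEM, the function names, and the use of a stride as a stand-in for the level of precision are all assumptions; the paper's streaming and LOD machinery is not reproduced here.

```python
def clip_dem(dem, row_start, row_stop, col_start, col_stop, step=1):
    """Return only the cells inside the analysis window, sampled every `step` cells.

    `step` is a hypothetical stand-in for a coarser level of detail.
    """
    return [row[col_start:col_stop:step] for row in dem[row_start:row_stop:step]]

def mean_elevation(tile):
    """A simple grid statistic computed on the clipped tile only."""
    cells = [z for row in tile for z in row]
    return sum(cells) / len(cells)

# Hypothetical 6 x 6 DEM whose cell value encodes its row and column.
dem = [[r * 10 + c for c in range(6)] for r in range(6)]
tile = clip_dem(dem, 2, 6, 2, 6, step=2)
average = mean_elevation(tile)
```

The statistic touches 4 cells instead of 36, so the work scales with the analysis window and the chosen precision, not with the full DEM.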

A FAST REDUCTION METHOD OF SURVEY DATA IN RADIO ASTRONOMY

  • LEE YOUNGUNG
    • Journal of The Korean Astronomical Society / Vol. 34, No. 1 / pp. 1-8 / 2001
  • We present a fast reduction method for survey data obtained using a single-dish radio telescope. Along with a brief review of the classical method, a new method for identifying and eliminating negative and positive bad channels is introduced, using a cloud identification code and several statistics-related IRAF (Image Reduction and Analysis Facility) tasks. Removal of several ripple patterns using the Fourier transform is also discussed. The BACKGROUND task within IRAF is found to be very efficient for fitting and subtracting baselines with varying functions. The cloud identification method is described, along with the possibility of applying it to the analysis of cloud structure, and future data reduction methods are discussed.
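
The two reduction steps named in the abstract, bad-channel removal and baseline subtraction, can be sketched generically as follows. This is not the IRAF workflow from the paper: the median-based outlier rule, the straight-line baseline, and all function and parameter names are assumptions chosen to keep the example short.

```python
import statistics

def flag_bad_channels(spectrum, threshold=5.0):
    """Return indices of channels far from the median, in either direction.

    Uses the median absolute deviation (MAD) scaled by 1.4826 as a robust
    noise estimate; `threshold` is a hypothetical cutoff. If the MAD is zero,
    every channel that differs from the median is flagged.
    """
    median = statistics.median(spectrum)
    mad = statistics.median([abs(v - median) for v in spectrum])
    scale = 1.4826 * mad
    return [i for i, v in enumerate(spectrum) if abs(v - median) > threshold * scale]

def subtract_linear_baseline(spectrum, line_free):
    """Fit a straight line to the line-free channels and subtract it everywhere."""
    xs = list(line_free)
    ys = [spectrum[i] for i in xs]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    intercept = my - slope * mx
    return [v - (slope * i + intercept) for i, v in enumerate(spectrum)]

# Hypothetical spectrum with one positive and one negative bad channel.
noisy = [1.0, 1.1, 0.9, 1.0, 50.0, 1.05, 0.95, 1.0, -40.0, 1.0]
bad = flag_bad_channels(noisy)

# Hypothetical sloping baseline with an emission line in channel 5.
sloped = [0.1 * i + 2.0 for i in range(10)]
sloped[5] += 3.0
line_free = [0, 1, 2, 3, 7, 8, 9]
flattened = subtract_linear_baseline(sloped, line_free)
```

Fitting only the line-free channels keeps the emission line from biasing the baseline, so the line survives the subtraction at its full height.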

Neo-Chinese Style Furniture Design Based on Semantic Analysis and Connection

  • Ye, Jialei;Zhang, Jiahao;Gao, Liqian;Zhou, Yang;Liu, Ziyang;Han, Jianguo
    • KSII Transactions on Internet and Information Systems (TIIS) / Vol. 16, No. 8 / pp. 2704-2719 / 2022
  • Lately, neo-Chinese style furniture has drawn frequent attention from product design professionals for the large part it has played in promoting traditional Chinese culture. This article attempts to use a big data semantic analysis method to provide an effective design research method for neo-Chinese furniture design. Using the big data mining program TEXTOM for data collection and analysis, we sort and analyze the data obtained from typical websites over a set time period. On the basis of "neo-Chinese furniture" samples, we compare key data and perform classification analysis of the overall data and horizontal analysis of typical data by means of word frequency analysis, connection centrality analysis, and TF-IDF analysis. We then summarize the findings according to related views and theories of design. The results show that the data analysis findings are close to the relevant definitions of design. The core high-frequency vocabulary obtained from the data analysis, such as "popular", "furniture", and "modern", can provide a reasonable and effective focus of attention for designs. The results obtained through systematic sorting and summary of the data can serve as reliable guidance for the direction of design. This research attempts to introduce big data mining and semantic analysis methods into the product design industry, to supply scientific and objective data and channels for design studies, and to provide a case of the practical application of big data analysis in the industry.
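
For readers unfamiliar with TF-IDF, one of the three analyses named above, here is a minimal sketch using the textbook definition (term frequency times the natural log of the inverse document frequency). The exact weighting that TEXTOM applies is not stated in the abstract, so this variant, the function name, and the toy documents are assumptions.

```python
import math
from collections import Counter

def tf_idf(documents):
    """Score every term in every tokenized document.

    tf  = occurrences of the term in the document / document length
    idf = ln(number of documents / number of documents containing the term)
    """
    n_docs = len(documents)
    doc_freq = Counter(term for doc in documents for term in set(doc))
    scores = []
    for doc in documents:
        counts = Counter(doc)
        scores.append({
            term: (count / len(doc)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return scores

# Hypothetical tokenized documents.
documents = [
    ["modern", "furniture", "popular"],
    ["modern", "furniture"],
    ["modern", "wood"],
]
scores = tf_idf(documents)
```

A word that appears in every document, such as "modern" here, scores zero under this definition, which is why TF-IDF complements raw word frequency counts.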

Probabilistic Graphical Model for Transaction Data Analysis (트랜잭션 데이터 분석을 위한 확률 그래프 모형)

  • Ahn, Gil Seung;Hur, Sun
    • Journal of Korean Institute of Industrial Engineers / Vol. 42, No. 4 / pp. 249-255 / 2016
  • Recently, transaction data has been accumulating very rapidly everywhere. Association analysis methods are usually applied to analyze transaction data, but these methods have several problems. For example, they can consider only one-way relations among items and cannot incorporate domain knowledge into the analysis process. To overcome these shortcomings, we suggest a transaction data analysis method based on a probabilistic graphical model (PGM). The suggested method has several advantages over association analysis methods. For example, it is highly flexible and can solve various probability problems concerning transaction data with relationships among items.

Approaches to Studying Low Birth Rate in Korea: A Critical Review (우리나라 저출산 관련 연구 동향 분석)

  • Na, Yu-Mi;Kim, Mi-Kyung
    • Korean Journal of Human Ecology / Vol. 19, No. 5 / pp. 817-833 / 2010
  • This study sought a better course for low birth rate research in Korea by carefully analyzing past and present studies on low birth rates. To this end, 179 studies (101 master's theses and 78 journal articles) published from 1991 to 2009 were analyzed. Using SPSS Win 12.0, the research type, topic, participants, data collection, and method of data analysis were compared across the studies' years of publication. The most frequently applied research approach, topic, sampling method, data collection procedure, and data analysis method were found to be literature study, policies for solving and preventing low birth rates, literature study, and literacy analysis. In conclusion, low birth rate studies should become more diversified in terms of research type, data collection method, and data analysis. Additionally, research topics should become more realistic and specific. Moreover, research results should be verified before they are applied to policy.

Meta-analysis of the programming learning effectiveness depending on the teaching and learning method

  • Jeon, SeongKyun;Lee, YoungJun
    • Journal of the Korea Society of Computer and Information / Vol. 22, No. 11 / pp. 125-133 / 2017
  • Recently, as programming education has become essential in schools, the question of how to teach programming has become important. This study performed a meta-analysis of effect sizes according to the teaching and learning method used in programming education. A total of 78 research data items selected from 45 papers were analyzed from cognitive and affective aspects according to their dependent variables. The analysis of the cognitive aspect showed no statistically significant difference in effect size depending on whether the teaching and learning method was specified in the research paper. Meta-analysis of the research data for which the teaching and learning method was specified showed statistical significance for CPS, PBL, and Storytelling. Unlike the cognitive aspect, the analysis of the affective aspect showed that the effect size of the research data without a specified teaching and learning method was larger than that of the data with a specified method, with statistical significance. Meta-analysis of the data by teaching and learning method showed no statistical significance. Based on these results, this study suggests implications for effective programming education.
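
The abstract does not say which pooling model was used, so the following is only a sketch of one common way to combine effect sizes in a meta-analysis: a fixed-effect, inverse-variance weighted mean with a 95% confidence interval. The function name and the input numbers are hypothetical and are not taken from the paper.

```python
import math

def fixed_effect_summary(effects, variances):
    """Pool effect sizes with inverse-variance weights (fixed-effect model).

    Returns the pooled effect, its standard error, and a 95% confidence interval.
    """
    weights = [1.0 / v for v in variances]
    total = sum(weights)
    pooled = sum(w * d for w, d in zip(weights, effects)) / total
    standard_error = math.sqrt(1.0 / total)
    interval = (pooled - 1.96 * standard_error, pooled + 1.96 * standard_error)
    return pooled, standard_error, interval

# Hypothetical effect sizes and sampling variances from two studies.
pooled, standard_error, interval = fixed_effect_summary([0.2, 0.5], [0.04, 0.01])
```

The more precise study (variance 0.01) receives four times the weight of the other, so the pooled effect of 0.44 lies closer to 0.5 than to 0.2.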

Effective Data Management Method for Operational Data on Accredited Engineering Programs (공학교육인증 프로그램의 효과적인 운영 데이터 관리 방법)

  • Han, Kyoung-Soo
    • Journal of Engineering Education Research / Vol. 17, No. 5 / pp. 51-58 / 2014
  • This study proposes an effective data management method that eases the burden of preparing the self-study report by analyzing operational data on accredited engineering programs. Four analysis criteria are developed: variability, difficulty of collection, urgency of analysis, and timeliness. After the operational data are analyzed against these criteria, the data that should be managed in a timely manner are extracted according to the analysis results. The proposed method performs the tasks of managing this time-sensitive data according to the regular academic schedule, so the results of this study may be used as working-level reference material.

Complex Segregation Analysis of Categorical Traits in Farm Animals: Comparison of Linear and Threshold Models

  • Kadarmideen, Haja N.;Ilahi, H.
    • Asian-Australasian Journal of Animal Sciences / Vol. 18, No. 8 / pp. 1088-1097 / 2005
  • The main objectives of this study were to investigate the accuracy, bias, and power of linear and threshold model segregation analysis methods for detecting major genes in categorical traits in farm animals. The Maximum Likelihood Linear Model (MLLM), Bayesian Linear Model (BALM), and Bayesian Threshold Model (BATM) were applied to simulated data on normal, categorical, and binary scales as well as to disease data in pigs. Simulated data on the underlying normally distributed liability (NDL) were used to create categorical and binary data. The MLLM method was applied to data on all scales (normal, categorical, and binary), and the BATM method was developed and applied only to binary data. The MLLM analyses underestimated parameters for binary as well as categorical traits compared to normal traits, with the bias being very severe for binary traits. The accuracy of major gene and polygene parameter estimates was also very low for binary data compared with categorical data; the latter gave results similar to normal data. When disease incidence (on the binary scale) is close to 50%, segregation analysis has more accuracy and less bias than for diseases with rare incidence. NDL data were always better than categorical data. Under the MLLM method, the test statistics for categorical and binary data were consistently and unusually high (while the opposite is expected due to loss of information in categorical data), indicating high false discovery rates of major genes if linear models are applied to categorical traits. With Bayesian segregation analysis, the 95% highest probability density regions of major gene variances were checked to see whether they included the value of zero (boundary parameter); by the nature of this difference between likelihood and Bayesian approaches, the Bayesian methods are likely to be more reliable for categorical data. The BATM segregation analysis of binary data also showed a significant advantage over MLLM in terms of higher accuracy. Based on the results, threshold models are recommended when the trait distributions are discontinuous. Further, segregation analysis could be used in an initial scan of the data for evidence of major genes before embarking on molecular genome mapping.
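
The step in which liability values are turned into binary and categorical data can be sketched as follows. This illustrates the general threshold concept only: the standard normal liability, the function names, and the example cut points are assumptions, and the paper's actual simulation settings are not given in the abstract.

```python
import bisect
from statistics import NormalDist

def liability_threshold(incidence):
    """Threshold on a standard normal liability scale for a given incidence.

    `incidence` must lie strictly between 0 and 1.
    """
    return NormalDist().inv_cdf(1.0 - incidence)

def liabilities_to_binary(liabilities, incidence):
    """Score 1 (affected) when the liability exceeds the threshold, else 0."""
    threshold = liability_threshold(incidence)
    return [1 if value > threshold else 0 for value in liabilities]

def liabilities_to_categories(liabilities, cutpoints):
    """Assign an ordered category index using sorted cut points."""
    ordered = sorted(cutpoints)
    return [bisect.bisect_right(ordered, value) for value in liabilities]

# Hypothetical liability values.
liabilities = [-1.0, 0.3, 2.0]
binary = liabilities_to_binary(liabilities, incidence=0.5)
categories = liabilities_to_categories(liabilities, cutpoints=[-0.5, 1.0])
rare_threshold = liability_threshold(0.025)
```

Collapsing a continuous liability into two classes discards most of its variation, which is consistent with the abstract's finding that binary data gave the least accurate estimates.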

Improving the Gumbel analysis by using M-th highest extremes

  • Cook, Nicholas J.
    • Wind and Structures / Vol. 1, No. 1 / pp. 25-42 / 1998
  • Improvements to the Gumbel method of extreme value analysis of wind data made over the last two decades are reviewed and illustrated using sample data for Jersey. A new procedure for extending the Gumbel method to include M-th highest annual extremes is shown to be less effective than the standard method, but leads to a method for calibrating peak-over-threshold methods against the standard Gumbel approach. Peak-over-threshold methods that include at least the 3rd highest annual extremes, specifically the modified Jensen and Franck method and the "Method of independent storms" are shown to give the best estimates of extremes from observations.
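
As background for this abstract, a Gumbel (type I extreme value) distribution can be fitted to annual maxima and used to estimate the level exceeded on average once in T years. The sketch below uses a method-of-moments fit, which is a simpler substitute for the fitting procedures reviewed in the paper; the function names and the sample values are hypothetical and are not the Jersey data.

```python
import math
import statistics

EULER_GAMMA = 0.5772156649015329

def fit_gumbel_moments(annual_maxima):
    """Estimate Gumbel location and scale from the sample mean and standard deviation."""
    scale = statistics.stdev(annual_maxima) * math.sqrt(6.0) / math.pi
    location = statistics.mean(annual_maxima) - EULER_GAMMA * scale
    return location, scale

def return_level(location, scale, return_period_years):
    """Value with annual exceedance probability 1 / return_period_years (period > 1)."""
    non_exceedance = 1.0 - 1.0 / return_period_years
    return location - scale * math.log(-math.log(non_exceedance))

# Hypothetical annual maximum wind speeds in m/s.
location, scale = fit_gumbel_moments([20, 22, 24, 26, 28])
level_10 = return_level(location, scale, 10)
level_50 = return_level(location, scale, 50)
```

Using one maximum per year discards the other strong storms in each year, which is the limitation that the M-th highest extremes and peak-over-threshold methods in the abstract try to address.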