• Title/Summary/Keyword: data association


Multi-block Analysis of Genomic Data Using Generalized Canonical Correlation Analysis

  • Jun, Inyoung;Choi, Wooree;Park, Mira
    • Genomics & Informatics / v.16 no.4 / pp.33.1-33.9 / 2018
  • Many recent medical studies involve genetic analysis, and many of them aim to find genes associated with complex diseases. To find out how genes are related to disease, we need to understand not only the simple relationships among genotypes but also how they relate to phenotypes. Multi-block data, which combines several sets of variables, is used to enhance the analysis of relationships between different blocks. Identifying relationships in multi-block form lets us understand the association between blocks in terms of their correlation. Several statistical methods have been developed to analyze the relationships within multi-block data. In this paper, we use generalized canonical correlation analysis to analyze multi-block data from the Korean Association Resource project, which combines single nucleotide polymorphism blocks, phenotype blocks, and disease blocks.

A Study for Antecedent Association Rules

  • Park, Hee-Chang;Cho, Kwang-Hyun
    • Journal of the Korean Data and Information Science Society / v.17 no.4 / pp.1077-1083 / 2006
  • Association rule mining searches for interesting relationships among items in a given database. Association rules are frequently used by retail stores to assist in marketing, advertising, floor placement, and inventory control. There are three primary quality measures for association rules: support, confidence, and lift. In this paper, we present association rule mining based on antecedent variables. We call these rules antecedent association rules. An antecedent variable is a variable that occurs before the independent variable and the dependent variable.

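The three quality measures named in the abstract above can be illustrated with a minimal sketch. It uses the standard textbook definitions of support, confidence, and lift, not the antecedent-variable method of the paper; the function name `rule_measures` and the toy transactions are hypothetical.

```python
def rule_measures(transactions, antecedent, consequent):
    """Return (support, confidence, lift) for the rule antecedent -> consequent.

    Standard definitions (an assumption; the paper's own variants may differ):
      support    = P(A and C)
      confidence = P(A and C) / P(A)
      lift       = confidence / P(C)
    """
    antecedent, consequent = frozenset(antecedent), frozenset(consequent)
    n = len(transactions)
    if n == 0:
        raise ValueError("transactions must not be empty")

    # Count the transactions that contain each item set.
    count_a = sum(1 for t in transactions if antecedent <= set(t))
    count_c = sum(1 for t in transactions if consequent <= set(t))
    count_ac = sum(1 for t in transactions if (antecedent | consequent) <= set(t))

    support = count_ac / n
    confidence = count_ac / count_a if count_a else 0.0
    lift = confidence / (count_c / n) if count_c else 0.0
    return support, confidence, lift


# Hypothetical toy data: four market-basket transactions.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]
support, confidence, lift = rule_measures(transactions, {"bread"}, {"milk"})
print(f"support={support:.3f} confidence={confidence:.3f} lift={lift:.3f}")
```

A lift below 1, as in this toy data, suggests that the antecedent and consequent co-occur less often than they would if they were independent.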

Temporal Associative Classification based on Calendar Patterns (캘린더 패턴 기반의 시간 연관적 분류 기법)

  • Lee Heon Gyu;Noh Gi Young;Seo Sungbo;Ryu Keun Ho
    • Journal of KIISE:Databases / v.32 no.6 / pp.567-584 / 2005
  • Temporal data mining, the incorporation of temporal semantics into existing data mining techniques, refers to a set of techniques for discovering implicit and useful temporal knowledge from temporal data. Association rules and classification, which are typical data mining problems, are applied in various applications. However, these approaches do not consider temporal attributes and have been pursued for discovering knowledge from static data, although a large proportion of data contains a temporal dimension. Moreover, data mining research on temporal data addresses the discovery of knowledge from data stamped with time points and the addition of time constraints. Therefore, such research does not consider the temporal semantics and temporal relationships contained in the data. This paper proposes a temporal associative classification technique based on temporal class association rules. The technique applies rules discovered by temporal class association rule mining, which extends existing associative classification with a temporal dimension, to generate temporal classification rules. Therefore, this technique can discover more useful knowledge than typical classification techniques.

Proposition of causally confirmed measures in association rule mining (인과적 확인 측도에 의한 연관성 규칙 탐색)

  • Park, Hee Chang
    • Journal of the Korean Data and Information Science Society / v.25 no.4 / pp.857-868 / 2014
  • Data mining is a representative analysis methodology in the era of big data; it is the process of analyzing a massive database and summarizing it into meaningful information. The association rule technique finds relationships among items in a huge database using interestingness measures such as support, confidence, and lift. However, these interestingness measures cannot be used to establish a causal relationship between antecedent and consequent item sets. Moreover, they do not reveal the direction of association. This paper proposes causally confirmed association thresholds to compensate for these problems and then checks the three conditions of interestingness measures. Simulation studies compare basic association thresholds, causal association thresholds, and causally confirmed association thresholds. The results show that causally confirmed association thresholds are better than basic and causal association thresholds.

Pattern Analysis of Nonconforming Farmers in Residual Pesticides using Exploratory Data Analysis and Association Rule Analysis (탐색적 자료 분석 및 연관규칙 분석을 활용한 잔류농약 부적합 농업인 유형 분석)

  • Kim, Sangung;Park, Eunsoo;Cho, Hyunjeong;Hong, Sunghie;Sohn, Byungchul;Hong, Jeehwa
    • Journal of Korean Society for Quality Management / v.49 no.1 / pp.81-95 / 2021
  • Purpose: The purpose of this study was to analyze the patterns of nonconforming farmers, who are one of the factors of nonconformity in residual pesticides. Methods: Patterns of nonconforming farmers were analyzed by combining safety data with farmers' DB data. Exploratory data analysis and association rule analysis were used to extract factors related to nonconformity. Results: The results of this study are as follows. The exploratory data analysis identified a total of nine farmer-related factors influencing nonconformity in residual pesticides: sampling time, gender, age, cultivation region, farming career, agricultural start form, type of agriculture, cultivation area, and classification of agricultural products. The association rule analysis found nonconformity association rules over the past three years. The pattern of nonconforming farmers differed depending on the cultivation period. Conclusion: Exploratory data analysis and association rule analysis will be useful tools for establishing a more efficient and economical safety management plan for agricultural products.

Data Bias Optimization based Association Reasoning Model for Road Risk Detection (도로 위험 탐지를 위한 데이터 편향성 최적화 기반 연관 추론 모델)

  • Ryu, Seong-Eun;Kim, Hyun-Jin;Koo, Byung-Kook;Kwon, Hye-Jeong;Park, Roy C.;Chung, Kyungyong
    • Journal of the Korea Convergence Society / v.11 no.9 / pp.1-6 / 2020
  • In this study, we propose an association inference model based on data bias optimization for road hazard detection. This is a mining model based on association analysis that collects a user's personal characteristics and surrounding environment data and provides traffic accident prevention services. It creates transaction data composed of various context variables. Based on the generated information, meaningful correlations among the variables in each transaction are derived through correlation pattern analysis. Considering the bias of the classified categorical data, pruning is performed with optimized support and reliability values. Based on the extracted high-level association rules, a risk detection model for personal characteristics and driving road conditions is provided to users. This enables traffic services that overcome the data bias problem and prevent potential road accidents by considering the association between data. In the performance evaluation, the proposed method achieves an accuracy of 0.778 and a Kappa coefficient of 0.743.
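The two evaluation metrics reported in the abstract above, accuracy and the Kappa coefficient, can be sketched as follows. This assumes Cohen's kappa in its standard form, kappa = (p_o - p_e) / (1 - p_e); the abstract does not state which variant the authors used. The function name `accuracy_and_kappa` and the labels are hypothetical.

```python
def accuracy_and_kappa(y_true, y_pred):
    """Return (accuracy, Cohen's kappa) for two equal-length label lists."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(y_true)
    labels = set(y_true) | set(y_pred)

    # Observed agreement p_o is the plain accuracy.
    p_o = sum(t == p for t, p in zip(y_true, y_pred)) / n

    # Chance agreement p_e from the marginal label frequencies.
    p_e = sum((y_true.count(l) / n) * (y_pred.count(l) / n) for l in labels)

    # If chance agreement is 1, both lists hold a single identical label.
    kappa = (p_o - p_e) / (1 - p_e) if p_e != 1 else 1.0
    return p_o, kappa


# Hypothetical toy labels, not data from the paper.
y_true = ["risk", "risk", "safe", "safe", "safe", "risk", "safe", "safe"]
y_pred = ["risk", "safe", "safe", "safe", "safe", "risk", "risk", "safe"]
accuracy, kappa = accuracy_and_kappa(y_true, y_pred)
print(f"accuracy={accuracy:.3f} kappa={kappa:.3f}")
```

Kappa discounts the agreement expected by chance, which is why it is often reported alongside accuracy when class frequencies are imbalanced.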

Development of Temporal Disaggregation Model using Neural Networks 1. Application of the Historic Data (신경망모형을 이용한 시간적 분해모형의 개발 1. 실측자료의 적용)

  • Kim, Seong-Won;Kim, Jeong-Heon;Park, Gi-Beom
    • Proceedings of the Korea Water Resources Association Conference / 2009.05a / pp.1207-1210 / 2009
  • The goal of this research is to apply neural network models to the disaggregation of pan evaporation (PE) data in the Republic of Korea. The neural network models are the generalized regression neural networks model (GRNNM) and the multilayer perceptron neural networks model (MLP-NNM). Disaggregation here means dividing the yearly PE data into monthly PE data. The performance of the neural network models is evaluated in terms of training and test performance, both of which use only historic data. In this research, we evaluate the impact of GRNNM and MLP-NNM on the disaggregation of nonlinear time series data. Furthermore, credible monthly PE data can be constructed by disaggregating the yearly PE data, and a methodology for irrigation and drainage network systems can be suggested.


Effects of Forcing Data Resolution in Macro Scale River Discharge Simulation

  • Tachikawa, Yasuto;Shrestha, Roshan
    • Proceedings of the Korea Water Resources Association Conference / 2002.05b / pp.1179-1186 / 2002
  • Macro scale distributed hydrological models simulate river discharge with better accuracy, but the accuracy depends on the grid resolution of the input data. The effects of different input resolutions are studied here. Input data at three different grid resolutions, obtained from HUBEX-IOP EEWB data and GAME Re-analysis data, are used to simulate river discharge, and the results are compared against observed discharge. GAME Re-analysis 1.25-degree resolution data are found to be quite satisfactory in larger basins, while HUBEX-IOP EEWB 10-minute resolution data are better for small catchments. GAME Re-analysis 2.5-degree resolution data did not give good results. Results simulated with spatially interpolated data are worse than those using the original data. The Huaihe River basin ($132{,}350\,\textrm{km}^2$) is taken as the case study.


A Study on the Direction of Data Triggers and Elements for Automated Vehicle Data Recorder (자율주행자동차 데이터 기록장치의 기록 조건 및 항목에 대한 방향성 연구)

  • Heejin Kang;Naeun Woo;Giok Park;Jihyun Song
    • Journal of Auto-vehicle Safety Association / v.15 no.4 / pp.71-78 / 2023
  • This study presents the direction of data triggers and elements to be recorded in future automated vehicles in relation to the event data recorder (EDR) and the data storage system for automated driving (DSSAD). It does not distinguish between the EDR and DSSAD, but suggests data triggers and elements in preparation for automated vehicle accidents and dangerous situations in general. To this end, the current status of international discussions on EDR/DSSAD and cases of accident investigations involving automated vehicles operating under temporary driving licenses in Korea were analyzed. Based on the analysis, the direction of data triggers and elements for the EDR/DSSAD of automated vehicles is presented.