• Title/Abstract/Keywords: Multiple Data Sets


Gene Set and Pathway Analysis of Microarray Data

  • 김선영
    • 유전체소식지 / Vol. 6, No. 1 / pp. 29-33 / 2006
  • Gene set analysis is a new concept and method for analyzing and interpreting microarray gene expression data; it tries to extract biological meaning from gene expression data at the gene set level rather than at the gene level. Compared with methods that select a few tens or hundreds of genes before gene ontology and pathway analysis, gene set analysis identifies important gene ontology terms and pathways more consistently and performs well even in gene expression data sets with minimal or moderate gene expression changes. Moreover, gene set analysis is useful for comparing multiple gene expression data sets that deal with similar biological questions. This review briefly summarizes the rationale behind gene set analysis and introduces several algorithms and tools now available for it.

Finding Pseudo Periods over Data Streams based on Multiple Hash Functions

  • 이학주;김재완;이원석
    • 한국IT서비스학회지 / Vol. 16, No. 1 / pp. 73-82 / 2017
  • Recently, in-memory data stream processing has been actively applied to various subjects such as query processing, OLAP, and data mining (e.g., frequent item sets, association rules, and clustering). However, finding regular periodic patterns of events in an infinite data stream has received less attention. Most research on finding periods uses autocorrelation functions to detect certain changes in periodic patterns, not the period itself, and it usually looks for periodic patterns in time-series databases, not in data streams. Literally, a period means the length of time after which some phenomenon recurs at a certain interval. In real applications, however, a data set evolves with tiny differences as time elapses. This kind of period is called a pseudo-period. This paper proposes a new scheme, the FPMH (Finding Periods using Multiple Hash functions) algorithm, to find such a set of pseudo-periods over a data stream based on multiple hash functions. According to the type of pseudo-period, this paper categorizes FPMH into three variants: FPMH-E, FPMH-PC, and FPMH-PP. To maximize the performance of the algorithm in the data stream environment and to keep the most recent periodic patterns in memory, a decay mechanism is applied to the FPMH algorithms. The FPMH algorithm minimizes memory usage as well as processing time with acceptable accuracy.
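
The abstract does not spell out how FPMH works, so the following is only a rough sketch of the general idea of estimating an item's recurrence gap in bounded memory with several hash functions. It is not the FPMH algorithm: the class name `LastSeenSketch`, its parameters, and the "take the maximum gap across rows" rule are all assumptions made for illustration, and the decay mechanism mentioned in the abstract is left out.

```python
import hashlib


class LastSeenSketch:
    """Hypothetical fixed-memory structure (not FPMH itself).

    Each of `num_hashes` rows stores, per hashed cell, the time at which
    any item mapping to that cell was last observed.
    """

    def __init__(self, num_hashes=3, width=64):
        self.num_hashes = num_hashes
        self.width = width
        self.last_seen = [[None] * width for _ in range(num_hashes)]

    def _cell(self, row, item):
        # Derive one hash function per row by salting SHA-256 with the row index.
        digest = hashlib.sha256(f"{row}:{item}".encode()).digest()
        return int.from_bytes(digest[:8], "big") % self.width

    def observe(self, item, t):
        """Record `item` at time `t`; return the estimated gap since its
        previous arrival, or None if the item has certainly not been seen."""
        gaps = []
        for row in range(self.num_hashes):
            col = self._cell(row, item)
            prev = self.last_seen[row][col]
            gaps.append(None if prev is None else t - prev)
            self.last_seen[row][col] = t
        if any(g is None for g in gaps):
            return None
        # A collision can only make a cell's timestamp more recent, so every
        # row underestimates the true gap; the largest row gap is the best bound.
        return max(gaps)


sketch = LastSeenSketch()
for t, item in [(0, "a"), (5, "a"), (7, "b"), (10, "a")]:
    print(t, item, sketch.observe(item, t))
```

An item whose successive gaps stay within a small tolerance of each other could then be reported as having a pseudo-period; that tolerance check is deliberately omitted here.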

A Study on the Reliability Growth of Multiple Launch Rocket System Using Accelerated Life Testing

  • 이용준;류정민;손권일;송석봉;김상부;박우재
    • 한국군사과학기술학회지 / Vol. 22, No. 2 / pp. 241-248 / 2019
  • In this paper, we aim to check the reliability growth of multiple launch rocket components through life evaluation. We apply the Crow-AMSAA model to the sets of test data obtained in the development phase. The data analysis shows that the reliability of some components needs to be improved. To improve their reliability, we analyze the failure mechanisms and change the designs. The reliability growth of those components is verified by analyzing the data sets obtained from accelerated life tests. As a result, we show that the MTBF of those components has increased and their reliability has improved.
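
As a minimal sketch of what fitting the Crow-AMSAA model involves, the code below uses the standard maximum-likelihood estimates for a time-terminated test, where the expected cumulative number of failures is N(t) = λ·t^β. The failure times and the total test time are made-up values for illustration, not data from the paper, and the function names are hypothetical.

```python
import math


def crow_amsaa_fit(failure_times, total_time):
    """MLE of (lam, beta) for N(t) = lam * t**beta, time-terminated test.

    `failure_times` are cumulative test times at each failure, all in
    (0, total_time], and at least one must be smaller than total_time.
    """
    n = len(failure_times)
    beta = n / sum(math.log(total_time / t) for t in failure_times)
    lam = n / total_time ** beta
    return lam, beta


def instantaneous_mtbf(lam, beta, t):
    """Reciprocal of the failure intensity lam * beta * t**(beta - 1)."""
    return 1.0 / (lam * beta * t ** (beta - 1.0))


# Hypothetical cumulative failure times (hours) over a 500-hour test.
times = [10.0, 30.0, 70.0, 150.0, 400.0]
lam, beta = crow_amsaa_fit(times, 500.0)
print(f"beta = {beta:.3f}, MTBF at end of test = "
      f"{instantaneous_mtbf(lam, beta, 500.0):.1f} h")
```

A fitted β below 1 means the failure intensity is decreasing, which is the usual indication of reliability growth under this model.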

Multiple Model Fuzzy Prediction Systems with Adaptive Model Selection Based on Rough Sets and its Application to Time Series Forecasting

  • 방영근;이철희
    • 한국지능시스템학회논문지 / Vol. 19, No. 1 / pp. 25-33 / 2009
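  • Recently, TS fuzzy models with linear equations in the consequent part have been widely used for time series forecasting, and their prediction performance is closely related to data characteristics such as stationarity. This paper therefore proposes a new prediction method that is particularly effective for non-stationary time series. A data preprocessing step is introduced to better extract the patterns and regularities of the time series, and multiple model TS fuzzy predictors are constructed; an adaptive model selection method based on rough sets then selects a suitable prediction model according to the characteristics of the input data, and the forecast is made with that model. Finally, an error compensation mechanism is added to reduce the prediction error and further improve prediction performance. The performance of the proposed method is verified through simulation. Because the proposed method reflects the characteristics of the time series data well both when building the prediction models and when carrying out the prediction, it can be used very effectively to forecast time series with uncertainty and non-stationarity.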

A Modelling of Multi-derived Data and Its Retrieval Scheme

  • 이춘열
    • Asia pacific journal of information systems / Vol. 4, No. 1 / pp. 115-138 / 1994
  • Current database systems are based on the assumption that a datum denotes the same meaning; in reality, however, violations of this assumption are not unusual. Some data are created in such a way that they represent different sets of attribute values. The current research formulates this phenomenon as dissimilarities of derivation rules and defines multi-derived data as data that are derived by multiple rules. For multi-derived data, this research proposes a new retrieval scheme and analyzes its implications for data retrieval.

Fully connecting the Observational Health Data Science and Informatics (OHDSI) initiative with the world of linked open data

  • Banda, Juan M.
    • Genomics & Informatics / Vol. 17, No. 2 / pp. 13.1-13.3 / 2019
  • The usage of controlled biomedical vocabularies is the cornerstone that enables seamless interoperability when using a common data model across multiple data sites. The Observational Health Data Science and Informatics (OHDSI) initiative combines over 100 controlled vocabularies into its own. However, the OHDSI vocabulary is limited in the sense that it combines multiple terminologies and does not provide a direct way to link them outside of their own self-contained scope. This issue makes the task of enriching feature sets with external resources extremely difficult. To address these shortcomings, we have created a linked data version of the OHDSI vocabulary, connecting it with already established linked resources such as BioPortal and Bio2RDF, with the ultimate purpose of enabling the interoperability of resources previously foreign to the OHDSI universe.

Reinforcement learning multi-agent using unsupervised learning in a distributed cloud environment

  • Gu, Seo-Yeon;Moon, Seok-Jae;Park, Byung-Joon
    • International Journal of Internet, Broadcasting and Communication / Vol. 14, No. 2 / pp. 192-198 / 2022
  • Companies are building and utilizing their own data analysis systems according to business characteristics in the distributed cloud. However, as businesses and data types become more complex and diverse, the demand for more efficient analytics has increased. In response to these demands, this paper proposes an unsupervised learning-based data analysis agent to which reinforcement learning is applied for effective data analysis. The proposed agent consists of a reinforcement learning processing manager module and an unsupervised learning manager module. These two modules configure an agent with k-means clustering on multiple nodes and then perform distributed training on multiple data sets. This enables data analysis in a relatively short time compared to conventional systems that analyze large-scale data in one batch.

Design of Multiple Model Fuzzy Predictors using Data Preprocessing and its Application

  • 방영근;이철희
    • 전기학회논문지 / Vol. 58, No. 1 / pp. 173-180 / 2009
  • It is difficult to predict non-stationary or chaotic time series that include drift and/or non-linearity as well as uncertainty. To address this, we propose an effective prediction method that adopts data preprocessing and multiple model TS fuzzy predictors combined with a model selection mechanism. In the data preprocessing procedure, the candidates for the optimal difference interval are determined based on correlation analysis, and the corresponding difference data sets are generated and used as predictor input instead of the original data, because the difference data can stabilize the statistical characteristics of those time series and better reveal their implicit properties. Then, TS fuzzy predictors are constructed for the multiple model bank, where the k-means clustering algorithm is used for the fuzzy partition of the input space and the least squares method is applied to the parameter identification of the fuzzy rules. Among the predictors in the model bank, the one that best minimizes the performance index is selected and used for prediction thereafter. Finally, an error compensation procedure based on correlation analysis is added to improve the prediction accuracy. Computer simulations are performed to verify the effectiveness of the proposed method.
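
The preprocessing and model selection steps described above can be illustrated with a heavily simplified sketch. It builds difference data for several candidate intervals, fits one predictor per interval by least squares, and keeps the interval whose predictor gives the smallest mean squared error. A single linear predictor stands in for the TS fuzzy predictors, the candidate intervals are passed in by hand rather than found by correlation analysis, and all names and the test series are assumptions, so this is not the authors' method.

```python
def difference(series, d):
    """Difference data with interval d: x[t] - x[t - d]."""
    return [series[i] - series[i - d] for i in range(d, len(series))]


def fit_linear(xs, ys):
    """Ordinary least squares for y = a * x + b."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx if sxx else 0.0  # constant input: fall back to the mean
    return a, mean_y - a * mean_x


def mse_for_interval(series, d):
    """Performance index: one-step-ahead MSE on the d-interval differences."""
    diff = difference(series, d)
    xs, ys = diff[:-1], diff[1:]
    a, b = fit_linear(xs, ys)
    return sum((a * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def select_interval(series, candidates):
    """Pick the candidate interval whose predictor has the lowest MSE."""
    return min(candidates, key=lambda d: mse_for_interval(series, d))


# Hypothetical series: linear drift plus a pattern that repeats every 4 steps.
pattern = [0, 3, -1, 2]
series = [0.5 * t + pattern[t % 4] for t in range(40)]
print(select_interval(series, [1, 2, 4]))
```

Because the value being subtracted is already known at prediction time, an error on the differenced value equals the error on the original series, which is why the MSE values are comparable across intervals.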

Rapid Data Allocation Technique for Multiple Memory Bank Architectures

  • 조정훈;백윤홍;최준식
    • 한국정보과학회:학술대회논문집 / 한국정보과학회 2003년도 가을 학술발표논문집 Vol.30 No.2 (1) / pp. 196-198 / 2003
  • Virtually every digital signal processor (DSP) supports on-chip multi-memory banks that allow the processor to access multiple words of data from memory in a single instruction cycle. Also, all existing fixed-point DSPs have an irregular architecture of heterogeneous registers, with multiple register files that are distributed and dedicated to different sets of instructions. Although several studies have addressed the efficient assignment of data to multi-memory banks, most of them assumed processors with relatively simple, homogeneous general-purpose registers. Therefore, several vendor-provided compilers for DSPs were unable to efficiently assign data to multiple data memory banks, thereby often failing to generate highly optimized code for their machines. This paper presents an algorithm that helps the compiler to efficiently assign data to multi-memory banks. Our algorithm differs from previous work in that it assigns variables to memory banks in separate, decoupled code generation phases, instead of a single, tightly coupled phase. The experimental results show that our decoupled algorithm greatly simplifies the code generation process; thus our compiler runs extremely fast, yet generates target code that is comparable in quality to the code generated by a coupled approach.

MULTI-VIEW STEREO CAMERA CALIBRATION USING LASER TARGETS FOR MEASUREMENT OF LONG OBJECTS

  • Yoshimi, Takashi;Yoshimura, Takaharu;Takase, Ryuichi;Kawai, Yoshihiro;Tomita, Fumiaki
    • 한국방송∙미디어공학회:학술대회논문집 / 한국방송공학회 2009년도 IWAIT / pp. 566-571 / 2009
  • A calibration method for multiple sets of stereo vision cameras is proposed. To measure the three-dimensional shape of a very long object, it is necessary to measure the object from different viewpoints and to register the resulting data. In this study, two laser beams generate two strings of calibration targets, which form straight lines in the world coordinate system. An evaluation function is defined as the sum of the squares of the distances between each transformed target and the fitted line representing the laser beam for that target, together with the distances between points appearing in the data sets of two adjacent viewpoints. The calculation process for the approximation method based on data linearity is presented. The experimental results show the effectiveness of the method.
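
The evaluation function described above can be sketched as follows, under several assumptions: each viewpoint's data are mapped into world coordinates by a rigid transform (a rotation matrix `R` and a translation `t`), the laser line is given by a point and a unit direction, and the term for points shared by adjacent viewpoints also uses squared distances. The function names and the sample values are hypothetical, and the abstract does not give the exact weighting of the two terms.

```python
def transform(R, t, p):
    """Apply the rigid transform p -> R p + t to a 3D point."""
    return tuple(sum(R[i][j] * p[j] for j in range(3)) + t[i] for i in range(3))


def sq_dist(p, q):
    """Squared Euclidean distance between two 3D points."""
    return sum((p[i] - q[i]) ** 2 for i in range(3))


def sq_dist_to_line(p, line_point, line_dir):
    """Squared distance from p to the line through line_point with unit
    direction line_dir."""
    v = tuple(p[i] - line_point[i] for i in range(3))
    along = sum(v[i] * line_dir[i] for i in range(3))
    return sum(c * c for c in v) - along * along


def evaluation(R, t, targets, line_point, line_dir, shared_pairs):
    """Line-fit term plus the term for points seen from adjacent viewpoints.

    `shared_pairs` holds (point in this view, same point in the neighbouring
    view, already in world coordinates).
    """
    line_term = sum(
        sq_dist_to_line(transform(R, t, p), line_point, line_dir) for p in targets
    )
    overlap_term = sum(sq_dist(transform(R, t, p), q) for p, q in shared_pairs)
    return line_term + overlap_term


identity = ((1, 0, 0), (0, 1, 0), (0, 0, 1))
targets = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
# Targets already on the x-axis line: the evaluation value is zero.
print(evaluation(identity, (0, 0, 0), targets, (0, 0, 0), (1, 0, 0), []))
```

A calibration routine would search over `R` and `t` for each viewpoint to minimize this value; the optimization itself is not shown.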
