• Title/Summary/Keyword: Large Data Set

An Active Candidate Set Management Model on Association Rule Discovery using Database Trigger and Incremental Update Technique (트리거와 점진적 갱신기법을 이용한 연관규칙 탐사의 능동적 후보항목 관리 모델)

  • Hwang, Jeong-Hui;Sin, Ye-Ho;Ryu, Geun-Ho
    • Journal of KIISE:Databases
    • /
    • v.29 no.1
    • /
    • pp.1-14
    • /
    • 2002
  • Association rule discovery mines associated item sets from large databases based on support and confidence thresholds. The discovered association rules can be applied to marketing pattern analysis in e-commerce, large shopping malls, and so on. Because association rule discovery makes multiple scans over a database storing large volumes of transaction data, algorithms with such high overhead may not be useful for real-time association rule discovery in a dynamic environment. This paper therefore proposes an active candidate set management model based on triggers and an incremental update mechanism to overcome the non-real-time limitation of association rule discovery. To implement the proposed model, we describe an implementation model for the incremental update operation and evaluate the performance characteristics of the model through experiments.
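The idea of keeping candidate counts current instead of rescanning the database can be illustrated with a minimal sketch. It assumes an in-memory counter in place of a real database trigger; the class `IncrementalItemsetCounter` and its `on_insert` hook are hypothetical names, not the paper's implementation.

```python
from collections import Counter
from itertools import combinations


class IncrementalItemsetCounter:
    """Keeps itemset counts up to date as transactions arrive (no rescan)."""

    def __init__(self, max_size=2):
        self.max_size = max_size      # largest candidate itemset size tracked
        self.counts = Counter()       # itemset (sorted tuple) -> occurrence count
        self.n_transactions = 0

    def on_insert(self, transaction):
        """Hypothetical trigger hook: call once per newly inserted transaction."""
        items = sorted(set(transaction))
        self.n_transactions += 1
        for k in range(1, self.max_size + 1):
            for itemset in combinations(items, k):
                self.counts[itemset] += 1

    def support(self, itemset):
        """Fraction of transactions seen so far that contain the itemset."""
        if self.n_transactions == 0:
            return 0.0
        return self.counts[tuple(sorted(set(itemset)))] / self.n_transactions

    def confidence(self, antecedent, consequent):
        """support(antecedent and consequent) / support(antecedent)."""
        base = self.support(antecedent)
        if base == 0:
            return 0.0
        return self.support(set(antecedent) | set(consequent)) / base


counter = IncrementalItemsetCounter()
for t in [["bread", "milk"], ["bread", "butter"],
          ["bread", "milk", "butter"], ["milk"]]:
    counter.on_insert(t)

print(counter.support(["bread", "milk"]))       # support of {bread, milk}
print(counter.confidence(["bread"], ["milk"]))  # confidence of bread -> milk
```

Each insert costs work proportional to the number of candidate itemsets in that one transaction, which is the property that makes near-real-time rule evaluation plausible.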

The Effects of Selection Attributes for HMR on Satisfaction and Repurchase Intention: Comparative Analysis of Convenience Store and Large Market (HMR 선택속성이 만족과 재구매의도에 미치는 영향: 편의점과 대형마트의 비교 분석)

  • Yang, Dong-Hwi
    • Culinary science and hospitality research
    • /
    • v.24 no.3
    • /
    • pp.204-214
    • /
    • 2018
  • This study sets up research models and hypotheses to examine the influence of HMR (home meal replacement) selection attributes on satisfaction and repurchase intention for each distribution channel (convenience store/large market), and verifies the hypotheses through empirical analysis. HMR purchasers at convenience stores and large markets in the Seoul and Gyeonggi area were surveyed using convenience sampling. The survey was conducted from January 8 to January 26, 2018; 300 questionnaires were distributed, and 289 were used as valid data. SPSS 20.0 was used for the empirical analysis. The results are as follows. First, among the HMR selection attributes at convenience stores, only product quality has a significant effect on satisfaction; product safety and convenience do not. Second, among the HMR selection attributes at large markets, only convenience has a significant effect on satisfaction; product safety and product quality do not. Third, HMR satisfaction at both convenience stores and large markets has a significant effect on repurchase intention. The study is meaningful in that it examines the relationships among HMR selection attributes, satisfaction, and repurchase intention, which are important in existing HMR research, separately for each distribution channel, and can thereby help establish an effective sales strategy for each segment.

The Sliding Window Gene-Shaving Algorithm for Microarray Data Analysis

  • 이혜선;최대우;전치혁
    • Proceedings of the Korean Society for Bioinformatics Conference
    • /
    • 2002.06a
    • /
    • pp.139-152
    • /
    • 2002
  • Gene-shaving (Hastie et al., 2000) is a very useful method for identifying a meaningful group of genes when the variation in expression is large. By shaving off the genes that have low correlation with the leading principal component, the primary genes with a coherent expression pattern can be identified. The gene-shaving method works well if expression levels vary enough, but it may miss a meaningful cluster that has a low expression level or a different expression time, even when its pattern is coherent. The sliding window gene-shaving method, which applies gene-shaving within each sliding window after hierarchical clustering, compensates for this loss of meaningful gene sets whose variation is not large but is distinct. Its performance in identifying expression patterns is compared on simulated profile data with different variances and expression levels.

The Design and Implementation of a Reusable Viewer Component

  • Kim, Hong-Gab;Lim, Young-Jae;Kim, Kyung-Ok
    • Proceedings of the KSRS Conference
    • /
    • 2002.10a
    • /
    • pp.66-69
    • /
    • 2002
  • This article outlines the capabilities of a viewer component called GridViewer and demonstrates its reusability. GridViewer was designed for building the image display part of GIS or remote sensing application software, and consequently it is particularly straightforward to couple GridViewer closely with access to very large images. Display is performed through a pyramid structure, which makes it possible to handle very large datasets, up to several gigabytes in size, within the limited capability of a PC. GridViewer is freed from the responsibility of handling various raster data file formats by taking a grid coverage, which is designed by the OGC to promote interoperability between implementations by data vendors and by software vendors providing analysis and grid processing. GridViewer differs from other such viewers in that it allows clients to extend its functions and capabilities using the small set of methods originally implemented in it. We show its reusability and extensibility by applying it to the development of application programs that perform various functions not originally supported by the GridViewer COM component.
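The pyramid structure mentioned above can be sketched in a few lines. This is a generic illustration that assumes 2x2 averaging on a small in-memory grid; the function names are hypothetical and nothing here reflects the actual GridViewer COM interface.

```python
def downsample_2x(grid):
    """Halve a 2D grid by averaging each 2x2 block (odd trailing row/column dropped)."""
    rows, cols = len(grid) // 2, len(grid[0]) // 2
    return [
        [
            (grid[2 * r][2 * c] + grid[2 * r][2 * c + 1]
             + grid[2 * r + 1][2 * c] + grid[2 * r + 1][2 * c + 1]) / 4
            for c in range(cols)
        ]
        for r in range(rows)
    ]


def build_pyramid(grid):
    """Return [full resolution, 1/2, 1/4, ...] until a level can no longer be halved."""
    levels = [grid]
    while len(levels[-1]) >= 2 and len(levels[-1][0]) >= 2:
        levels.append(downsample_2x(levels[-1]))
    return levels


def pick_level(levels, view_width):
    """Index of the coarsest level that still has at least view_width columns."""
    chosen = 0
    for i, level in enumerate(levels):
        if len(level[0]) >= view_width:
            chosen = i
    return chosen


# Toy 8x8 "image"; a real viewer would read tiles from disk instead.
image = [[float(r * 8 + c) for c in range(8)] for r in range(8)]
pyramid = build_pyramid(image)
level = pick_level(pyramid, view_width=3)
print([len(lv) for lv in pyramid], level)
```

A viewer that draws from the coarsest adequate level touches only as many pixels as the screen needs, which is why the approach scales to images far larger than memory.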

A Study on the Improvement of the Batch-means Method in Simulation Analysis (모의실험 분석중 구간평균기법의 개선을 위한 연구)

  • 천영수
    • Journal of the Korea Society for Simulation
    • /
    • v.5 no.2
    • /
    • pp.59-72
    • /
    • 1996
  • The purpose of this study is to improve the batch-means method, a procedure for constructing a confidence interval (c.i.) for the steady-state mean of a stationary simulation output process. In the batch-means method, the data in the output process are grouped into batches. The sequence of the means of the individual batches is called a batch-means process and can be treated as a set of independent and identically distributed variables if each batch includes a sufficiently large number of observations. The traditional batch-means method therefore uses as large a batch size as possible in order to destroy the autocovariance remaining in the batch-means process. The c.i. procedure developed and empirically tested in this study instead uses a small batch size for which the batch-means process can be fitted well by a simple ARMA model, and then uses the dependence structure of the fitted model to correct for bias in the variance estimator of the sample mean.
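For orientation, the traditional batch-means interval described above can be sketched as follows. This is a minimal illustration of the classical procedure only, not the ARMA-corrected procedure the study proposes. It also assumes a normal quantile in place of the Student's t quantile usually used, which is reasonable only when the number of batches is fairly large. The function name and the AR(1) test process are illustrative choices.

```python
import math
import random
from statistics import NormalDist, mean, variance


def batch_means_ci(output, n_batches, confidence=0.95):
    """Classical batch-means confidence interval for the steady-state mean.

    Returns (point_estimate, lower, upper). Trailing observations that do not
    fill a whole batch are ignored.
    """
    batch_size = len(output) // n_batches
    if n_batches < 2 or batch_size < 1:
        raise ValueError("need at least 2 batches with 1 observation each")
    means = [
        mean(output[i * batch_size:(i + 1) * batch_size])
        for i in range(n_batches)
    ]
    point = mean(means)
    # Treat the batch means as i.i.d.; valid only if batches are large enough.
    std_err = math.sqrt(variance(means) / n_batches)
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # normal approximation (assumption)
    return point, point - z * std_err, point + z * std_err


# Autocorrelated test data: AR(1) process with mean 10 and coefficient 0.8.
rng = random.Random(0)
x, series = 10.0, []
for _ in range(20000):
    x = 10.0 + 0.8 * (x - 10.0) + rng.gauss(0.0, 1.0)
    series.append(x)

point, lower, upper = batch_means_ci(series, n_batches=40)
print(point, lower, upper)
```

With small batches the batch means stay correlated and this interval becomes too narrow, which is the bias the study's model-based correction targets.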

High Performance Data Cache Memory Architecture (고성능 데이터 캐시 메모리 구조)

  • Kim, Hong-Sik;Kim, Cheong-Ghil
    • Journal of the Korea Academia-Industrial cooperation Society
    • /
    • v.9 no.4
    • /
    • pp.945-951
    • /
    • 2008
  • In this paper, a new high-performance data cache scheme that improves the exploitation of both spatial and temporal locality is proposed. The proposed data cache consists of a hardware prefetch unit and two sub-caches: a direct-mapped (DM) cache with a large block size and a fully associative buffer with a small block size. Spatial locality is exploited by fetching and storing large blocks in the DM cache, and is enhanced by prefetching a neighboring block when a DM cache hit occurs. Temporal locality is exploited by storing small blocks from the DM cache in the fully associative buffer, according to their activity in the DM cache, when they are replaced. Experimental results on SPEC2000 programs show that the proposed scheme can reduce the average miss ratio by $12.53\%\sim23.62\%$ and the average memory access time (AMAT) by $14.67\%\sim18.60\%$ compared with previous schemes such as the direct-mapped cache, the 4-way set associative cache, and the SMI (selective mode intelligent) cache [8].
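The two metrics reported above, miss ratio and AMAT, can be made concrete with a toy simulation. The sketch assumes a plain direct-mapped cache with no prefetch unit and no fully associative buffer, so it is not the proposed scheme; the trace, sizes, and latencies are made-up illustration values.

```python
def direct_mapped_miss_ratio(addresses, n_lines, block_size):
    """Simulate a direct-mapped cache and return misses / accesses."""
    lines = [None] * n_lines          # block number currently held by each line
    misses = 0
    for addr in addresses:
        block = addr // block_size    # which memory block the byte address falls in
        index = block % n_lines       # the single line that block may occupy
        if lines[index] != block:
            misses += 1
            lines[index] = block
    return misses / len(addresses)


def amat(hit_time, miss_ratio, miss_penalty):
    """Average memory access time = hit time + miss ratio * miss penalty."""
    return hit_time + miss_ratio * miss_penalty


# Sequential 4-byte accesses over 256 bytes, performed twice.
trace = list(range(0, 256, 4)) * 2

# Same number of lines, different block sizes (so capacity differs as well).
small_blocks = direct_mapped_miss_ratio(trace, n_lines=8, block_size=16)
large_blocks = direct_mapped_miss_ratio(trace, n_lines=8, block_size=64)

print(small_blocks, amat(1, small_blocks, 20))
print(large_blocks, amat(1, large_blocks, 20))
```

On a sequential trace, larger blocks convert many would-be misses into hits, which is the spatial-locality effect the large-block DM cache is meant to capture.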

Experimental Investigation on Onset Criteria of Liquid/Gas Entrainment in the Header-Feeder System of CANDU

  • Lee Jae-Young;Hwang Gi-Suk;Kim Man-Woong
    • Journal of Mechanical Science and Technology
    • /
    • v.20 no.7
    • /
    • pp.1030-1042
    • /
    • 2006
  • An experimental study was performed to investigate off-take phenomena in the header-feeder system (a horizontal header pipe with multiple feeder branch pipes) of a CANDU (CANadian Deuterium Uranium) reactor, with branch orientations of $\pm 36^{\circ}$ or $\pm 72^{\circ}$. To evaluate the applicability of the conventional correlations used in the safety analysis code RELAP5-MOD3, the test facility was designed at 1/2 scale of the CANDU 6. The data sets for the top, bottom, and side branches were found to be in good agreement with the correlations used. However, for the angled branches at $\pm 36^{\circ}$ and $\pm 72^{\circ}$, the onset-of-off-take data and the quality data showed large deviations from the conventional model in RELAP5-MOD3. Furthermore, the uncertainty analysis indicates that the conventional 2.5 power law needs to be modified. The present experimental data set can be useful for constructing a general correlation that accounts for arbitrary branch orientation.

Extended Equal Service and Differentiated Service Models for Peer-to-Peer File Sharing

  • Zhang, Jianwei;Wang, Yongchao;Xing, Wei;Lu, Dongming
    • Journal of Communications and Networks
    • /
    • v.15 no.2
    • /
    • pp.228-239
    • /
    • 2013
  • Peer-to-peer (P2P) systems have proved the most effective and popular file sharing applications in recent years. Previous studies mainly focused on equal service and differentiated service strategies when peers have no initial data before their downloads. For an upload-constrained P2P file sharing system, we model both the equal service process and the differentiated service process when the initial data distribution of peers satisfies some special conditions. Moreover, we show how to minimize the time required to distribute the file to any number of peers. The proposed fluid-based models can reveal the intrinsic relations among the initial data amount, the peer set size, and the minimum last finish time. The closed-form expressions derived from the extended models can closely approximate chunk-based models and systems, especially for relatively large files. As an application of the extended models, we show how to provide differentiated service efficiently to multiple peer sets. Since no limits are imposed on the upload bandwidth of peers or the size of each peer set, we believe that our analytic process and the results achieved can provide not only fundamental insights into bandwidth allocation and data scheduling but also a helpful reference for both improving system performance and building an effective incentive mechanism for P2P file sharing systems.
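As background for the minimum distribution time discussed above, the sketch below evaluates the classic fluid-model lower bound for an upload-constrained system in which peers start with no data. It is the baseline case, not the paper's extended models with initial data, and the function and parameter names are illustrative.

```python
def min_distribution_time(file_size, server_upload, peer_uploads):
    """Classic fluid lower bound on the time to deliver a file to all peers.

    Assumes download bandwidth is never the bottleneck and that peers hold no
    data initially. Two constraints apply:
      * the server must push at least one full copy: file_size / server_upload
      * total demand is limited by the aggregate upload capacity:
        n * file_size / (server_upload + sum(peer_uploads))
    """
    n = len(peer_uploads)
    if n == 0:
        return 0.0
    total_upload = server_upload + sum(peer_uploads)
    return max(file_size / server_upload, n * file_size / total_upload)


# Weak peers: aggregate upload capacity is the bottleneck.
slow = min_distribution_time(100.0, 10.0, [5.0, 5.0, 5.0, 5.0])

# Strong peers: the server's own upload rate is the bottleneck.
fast = min_distribution_time(100.0, 10.0, [50.0, 50.0])

print(slow, fast)
```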

Vegetation Change Detection in the Sihwa Embankment using Multi-Temporal Satellite Data (다중시기 위성영상을 이용한 시화 방조제 내만 식생변화탐지)

  • Jeong, Jong-Chul;Suh, Young-Sang;Kim, Sang-Wook
    • Journal of Environmental Science International
    • /
    • v.15 no.4
    • /
    • pp.373-378
    • /
    • 2006
  • The western coast of South Korea is famous for its large and broad tidal lands. However, large-scale land reclamation, such as the Sihwa embankment construction project, has accelerated coastal environmental change inside the embankment. To monitor this environmental change, vegetation change detection was carried out inside the embankment, and field survey data were compared with remotely sensed data from the Landsat TM, ETM+, IKONOS, and EOC satellites. To use the multi-temporal remotely sensed images effectively, all data sets, with their respective pixel sizes, were processed with the same geometric correction method. To detect tidal land vegetation change, the spectral characteristics and spatial resolution of the Landsat TM and ETM+ images were analyzed by spectral mixture analysis (SMA). A classification accuracy of 78.96% and a Kappa index of 0.2376 were obtained using the March 2000 Landsat data. The SMA results were considered in comparison with a seasonal vegetation change detection method.
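The two figures reported above, classification accuracy and the Kappa index, are both derived from a confusion matrix. The sketch below computes overall accuracy and Cohen's kappa; the matrix is made up for illustration and is not the study's data. It shows one way a high accuracy can coexist with a low kappa when one class dominates.

```python
def accuracy_and_kappa(confusion):
    """Overall accuracy and Cohen's kappa from a square confusion matrix.

    confusion[i][j] = number of samples of reference class i labeled as class j.
    """
    k = len(confusion)
    total = sum(sum(row) for row in confusion)
    observed = sum(confusion[i][i] for i in range(k)) / total
    # Agreement expected by chance, from the row and column marginals.
    expected = sum(
        sum(confusion[i]) * sum(row[i] for row in confusion)
        for i in range(k)
    ) / (total * total)
    if expected == 1.0:
        return observed, 0.0   # degenerate case: only one class present
    return observed, (observed - expected) / (1.0 - expected)


# Hypothetical two-class result dominated by the first class.
matrix = [
    [90, 5],
    [4, 1],
]
acc, kappa = accuracy_and_kappa(matrix)
print(acc, kappa)
```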

Development of Lightweight Molding CAE Data for Efficient Exchange (사출성형 해석 결과 데이터의 효율적 공유를 위한 경량데이터 개발)

  • Park, Ji-Hun;Park, Byoung-Keon;Kim, Jay-Jung
    • Korean Journal of Computational Design and Engineering
    • /
    • v.16 no.5
    • /
    • pp.344-350
    • /
    • 2011
  • In the injection molding industry, CAE analyses are generally used to find problems expected during the manufacturing process. The results of CAE analyses contain a great deal of information, such as meshes and stresses, so the data are quite large. To reduce the data size and make the results easy to share, this paper proposes a translator from CAE results to the JT format. The translator consists of three modules: the extracting module obtains ASCII data for the product shape and the result values of the CAE analysis, and the sorting module and the mapping module build an element data set and a JT file, respectively, from the extracted data. Engineers can append product properties and their comments to the JT files, so that they can share the whole history of the analysis process. In addition, our case study shows that the JT format reduces the data size by almost 90% compared with the original data format.