Search | Korea Science

Discretizing Spatio-Temporal Data using Data Reduction and Clustering (데이타 축소와 군집화를 사용하는 시공간 데이타의 이산화 기법)

Kang, Ju-Young;Yong, Hwan-Seung
Journal of KIISE:Computing Practices and Letters / v.15 no.1 / pp.57-61 / 2009
To increase the efficiency of the mining process and derive accurate spatio-temporal patterns, continuous attribute values should be discretized before mining. In this paper, we propose a discretization method that improves mining efficiency by reducing the data size without losing the correlations in the data. The proposed method first converts the original trajectories into approximations using line simplification and then groups similar approximations into clusters. Our experiments show that the proposed approach improves mining efficiency and extracts more intuitive patterns than existing discretization methods.
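The abstract does not say which line simplification algorithm is used. As a rough illustration of the data reduction step only, the sketch below assumes the Douglas-Peucker algorithm on 2-D points; the function names, the tolerance value, and the sample trajectory are all hypothetical and are not taken from the paper.

```python
from math import hypot

def point_segment_distance(p, a, b):
    """Distance from point p to the segment a-b (all 2-D tuples)."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        return hypot(px - ax, py - ay)
    # Project p onto the line through a and b, clamped to the segment.
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))
    return hypot(px - (ax + t * dx), py - (ay + t * dy))

def simplify(points, tolerance):
    """Douglas-Peucker: keep only points farther than tolerance from the chord."""
    if len(points) < 3:
        return list(points)
    first, last = points[0], points[-1]
    index, dmax = 0, 0.0
    for i in range(1, len(points) - 1):
        d = point_segment_distance(points[i], first, last)
        if d > dmax:
            index, dmax = i, d
    if dmax <= tolerance:
        return [first, last]
    # Split at the farthest point and simplify each half recursively.
    left = simplify(points[: index + 1], tolerance)
    right = simplify(points[index:], tolerance)
    return left[:-1] + right

# Hypothetical trajectory: a nearly straight run followed by a sharp turn.
trajectory = [(0, 0), (1, 0.05), (2, -0.05), (3, 0), (3, 3)]
approximation = simplify(trajectory, tolerance=0.1)
```

With a tolerance of 0.1 the small wiggles along the first run are dropped and only the turning point is kept, which is the kind of size reduction the abstract describes before clustering.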

TEMPORAL CLASSIFICATION METHOD FOR FORECASTING LOAD PATTERNS FROM AMR DATA

Lee, Heon-Gyu;Shin, Jin-Ho;Ryu, Keun-Ho
Proceedings of the KSRS Conference / 2007.10a / pp.594-597 / 2007
We present a novel mid- and long-term power load prediction method that uses temporal pattern mining on AMR (Automatic Meter Reading) data. Because power load patterns vary over time and differ considerably by hour, day, week, and so on, traditional data mining alone gives uninformative results. Moreover, research on data mining for analyzing electric load patterns has focused on cluster analysis and classification methods. However, despite the usefulness of rules that include a temporal dimension and the fact that AMR data has temporal attributes, those methods were limited to static pattern extraction and did not consider temporal attributes. Therefore, we propose a new classification method for predicting power load patterns. The main tasks are a clustering method and a temporal classification method. Cluster analysis is used to create load pattern classes and the representative load profile for each class. Next, the classification method uses the representative load profiles to build a classifier that can assign different load patterns to the existing classes. The proposed classification method is calendar-based temporal mining, and it discovers electric load patterns at multiple time granularities. Lastly, we show that the proposed method, applied to AMR data, discovered more interesting patterns.
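The abstract gives no detail on how the calendar-based mining is carried out. The sketch below only illustrates the idea of looking at the same readings at several time granularities; the granularity names, the `(timestamp, kWh)` record layout, and the sample readings are assumptions made for illustration.

```python
from collections import defaultdict
from datetime import datetime

def calendar_keys(ts):
    """Map one timestamp to a key at each (assumed) calendar granularity."""
    return {
        "hour": (ts.hour,),
        "weekday": (ts.weekday(),),            # Monday is 0
        "weekday_hour": (ts.weekday(), ts.hour),
        "month": (ts.month,),
    }

def mean_load_by_granularity(readings):
    """Average load per calendar key; readings are (datetime, kWh) pairs."""
    sums = defaultdict(lambda: [0.0, 0])
    for ts, kwh in readings:
        for granularity, key in calendar_keys(ts).items():
            acc = sums[(granularity, key)]
            acc[0] += kwh
            acc[1] += 1
    return {k: total / n for k, (total, n) in sums.items()}

# Hypothetical meter readings: two Mondays and one Tuesday, all at 09:00.
readings = [
    (datetime(2007, 10, 1, 9), 10.0),
    (datetime(2007, 10, 8, 9), 14.0),
    (datetime(2007, 10, 2, 9), 6.0),
]
profile = mean_load_by_granularity(readings)
```

A pattern such as "Mondays at 09:00" is visible only at the combined weekday and hour granularity, which is why a single fixed granularity can hide it.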

A Study for Determining the Best Number of Clusters on Temporal Data (Temporal 데이터의 최적의 클러스터 수 결정에 관한 연구)

Cho Young-Hee;Lee Gye-Sung;Jeon Jin-Ho
The Journal of the Korea Contents Association / v.6 no.1 / pp.23-30 / 2006
A clustering method for temporal data takes a model-based approach, using an automata-based model for each cluster. Global models must be constructed for the whole data set in order to elicit the individual models for each cluster. The preparation for building individual models is completed by determining the number of clusters inherent in the data set. In this paper, a BIC (Bayesian Information Criterion) approximation is used to determine the number of clusters, and its applicability is confirmed. A search technique to improve efficiency is also suggested, based on an analysis of the relationship between data size and BIC values. A number of experiments on artificially generated data sets were performed to check its validity. The experiments confirm that the BIC approximation suggests the best number of clusters, provided that the amount of data is relatively large.
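As a minimal sketch of BIC-based selection, the code below uses the common "lower is better" form, BIC = -2 ln L + k ln n. The paper's exact approximation and sign convention are not given in the abstract, and the log-likelihoods and parameter counts in the example are made-up placeholder values, not results from the paper.

```python
from math import log

def bic(log_likelihood, n_params, n_samples):
    """BIC in the 'lower is better' form: -2 ln L + k ln n."""
    return -2.0 * log_likelihood + n_params * log(n_samples)

def best_cluster_count(candidates, n_samples):
    """candidates maps cluster count -> (log_likelihood, n_params)."""
    return min(candidates, key=lambda k: bic(*candidates[k], n_samples))

# Placeholder fits for 1, 2 and 3 clusters on 500 sequences (illustrative only).
candidates = {1: (-1200.0, 4), 2: (-1000.0, 9), 3: (-990.0, 14)}
chosen = best_cluster_count(candidates, n_samples=500)
```

Here the third cluster improves the likelihood only slightly, so the penalty term k ln n outweighs the gain and two clusters are selected.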

Labeling Big Spatial Data: A Case Study of New York Taxi Limousine Dataset

AlBatati, Fawaz;Alarabi, Louai
International Journal of Computer Science & Network Security / v.21 no.6 / pp.207-212 / 2021
Clustering unlabeled spatial datasets to convert them into labeled spatial datasets is a challenging task, especially for geographical information systems. In this study, we investigated the NYC Taxi and Limousine Commission dataset and found that all of its spatio-temporal trajectories are unlabeled, which makes them unsuitable for data mining tasks such as classification and regression. Therefore, it is necessary to convert the unlabeled spatial datasets into labeled ones. We use a clustering technique to do this for all the trajectory datasets. A key difficulty in applying machine learning classification algorithms in many applications is that they require large amounts of labeled data, and labeling big data is in many cases a costly process. In this paper, we show the effectiveness of using a clustering technique to label spatial data, which leads to a high-accuracy classifier.
https://doi.org/10.22937/IJCSNS.2021.21.6.27
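The abstract above does not name the clustering technique. The sketch below assumes plain k-means on (longitude, latitude) pairs and uses the resulting cluster index as the label; the coordinates are invented sample points, and the deterministic initialization is a simplification for readability.

```python
def kmeans_labels(points, k, iterations=20):
    """Label 2-D points with k-means; returns (labels, centroids)."""
    centroids = [points[i] for i in range(k)]  # simple init: first k points
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: nearest centroid by squared Euclidean distance.
        labels = [
            min(range(k),
                key=lambda c: (x - centroids[c][0]) ** 2 + (y - centroids[c][1]) ** 2)
            for x, y in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = (
                    sum(x for x, _ in members) / len(members),
                    sum(y for _, y in members) / len(members),
                )
    return labels, centroids

# Invented pickup points: two near one location, two near another.
pickups = [(-73.99, 40.75), (-73.98, 40.76), (-73.78, 40.64), (-73.79, 40.65)]
labels, centroids = kmeans_labels(pickups, k=2)
```

The cluster indices can then serve as class labels for training a classifier, which is the labeling role the abstract assigns to clustering.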

County Level Clustering on Alcohol and HIV Mortality

Park, Byeonghwa
Communications for Statistical Applications and Methods / v.20 no.1 / pp.53-62 / 2013
This study focuses on the spatial/temporal relationship between deaths caused by Human Immunodeficiency Virus (HIV) and Alcohol Use Disorder (AUD). Several studies have found links between these two diseases. By looking for clusters in alcohol- and HIV-related mortality, this study contributes to the field by identifying the exact spatial/temporal locations of high and low occurrence risk, based on the observed over the expected number of deaths. This study does not provide political or social interpretations of the data; it merely aims to show where clusters are found.
https://doi.org/10.5351/CSAM.2013.20.1.053

Volatility clustering in data breach counts

Shim, Hyunoo;Kim, Changki;Choi, Yang Ho
Communications for Statistical Applications and Methods / v.27 no.4 / pp.487-500 / 2020
Insurers face increasing demands for cyber liability, entailed in part by a variety of new forms of data breach risk. As data breach occurrences develop, understanding the volatility of data breach counts has become as important as understanding their expected occurrences. Volatility clustering, the tendency of large changes in a random variable to cluster together in time, is frequently observed in many financial asset prices and asset returns, and it is questioned whether the volatility of data breach occurrences is also clustered in time. We present a volatility analysis based on INGARCH models, i.e., integer-valued generalized autoregressive conditional heteroskedasticity time series models for frequency counts due to data breaches. Using the INGARCH(1, 1) model with data breach samples, we show evidence of temporal volatility clustering for data breaches. In addition, we show that the firms' volatilities are correlated between some they belong to and that such a clustering effect remains even after excluding the effect of financial covariates, such as the VIX and the S&P 500 stock return, that have their own volatility clustering.
https://doi.org/10.29220/CSAM.2020.27.4.487
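For readers unfamiliar with INGARCH(1, 1), the sketch below shows the usual Poisson form of its conditional mean recursion, lambda_t = omega + alpha * X_(t-1) + beta * lambda_(t-1). Whether the paper uses exactly this parameterization is an assumption, and the parameter values and counts are illustrative only.

```python
def ingarch_intensity(counts, omega, alpha, beta, lambda0):
    """Conditional means of a Poisson INGARCH(1, 1) process.

    lambda_t = omega + alpha * X_(t-1) + beta * lambda_(t-1),
    starting from the assumed initial value lambda0.
    """
    intensities = [lambda0]
    for previous_count in counts[:-1]:
        intensities.append(omega + alpha * previous_count + beta * intensities[-1])
    return intensities

# Illustrative breach counts per period and made-up parameters.
counts = [2, 10, 1]
intensities = ingarch_intensity(counts, omega=1.0, alpha=0.5, beta=0.25, lambda0=2.0)
```

Because a Poisson count has variance equal to its mean, the jump to 10 breaches raises both the expected count and the variance of the next period, which is how the model produces clustered volatility.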

Spatial pattern and temporal mode analysis of microarray time-series data by independent component analysis (독립성분분석에 의한 유전자 발현 시계열 데이터의 공간적 패턴과 시간적 모드 분석)

Sookjeong, Kim;Seungjin, Choi
Proceedings of the Korean Information Science Society Conference / 2004.10b / pp.250-252 / 2004
In this paper we apply several variations of independent component analysis (ICA) methods, such as spatial ICA (sICA), temporal ICA (tICA), and spatiotemporal ICA (stICA), to yeast cell cycle datasets, and compare their performance in finding components that result in gene clusters coherent with annotations and in extracting meaningful temporal modes. It turns out that the results of tICA are superior to those of PCA, sICA, and stICA in terms of gene clustering, and the temporal modes extracted by stICA highlight particular cellular processes.

Identifying Temporal Pattern Clusters to Predict Events in Time Series

Heesoo Hwang
KIEE International Transaction on Systems and Control / v.2D no.2 / pp.125-134 / 2002
This paper proposes a method for identifying temporal pattern clusters to predict events in time series. Instead of predicting future values of the time series, the proposed method forecasts specific events that may be arbitrarily defined by the user. The prediction is defined by an event characterization function, which is the target of prediction. An event is predicted when the time series belongs to a temporal pattern cluster. To identify the optimal temporal pattern clusters, fuzzy goal programming is formulated to combine multiple objectives and is solved by an adaptive differential evolution technique that overcomes the sensitivity to control parameters of conventional differential evolution. To evaluate the prediction method, five test examples are considered. The adaptive differential evolution is also tested on twelve optimization problems.
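The abstract does not describe how a temporal pattern cluster is represented. The sketch below assumes one common choice: the series is turned into sliding windows, and a cluster is a center window plus a radius, so an event is predicted whenever the current window falls inside that ball. The optimization that finds the cluster (fuzzy goal programming with adaptive differential evolution) is not shown, and all names and values are hypothetical.

```python
from math import dist

def embed(series, dim):
    """Return (t, window) pairs, where window = (x[t-dim+1], ..., x[t])."""
    return [
        (t, tuple(series[t - dim + 1 : t + 1]))
        for t in range(dim - 1, len(series))
    ]

def predicted_event_times(series, center, radius):
    """Times whose window lies within radius of the cluster center."""
    return [
        t for t, window in embed(series, len(center))
        if dist(window, center) <= radius
    ]

# Toy series in which the pattern (1, 2) is followed by a spike.
series = [1, 1, 2, 5, 1, 1, 2, 6]
alerts = predicted_event_times(series, center=(1, 2), radius=0.5)
```

In this toy series the alerts fire at t = 2 and t = 6, one step before the spikes at t = 3 and t = 7.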

A Spatio-Temporal Clustering Technique for the Moving Object Path Search (이동 객체 경로 탐색을 위한 시공간 클러스터링 기법)

Lee, Ki-Young;Kang, Hong-Koo;Yun, Jae-Kwan;Han, Ki-Joon
Journal of Korea Spatial Information System Society / v.7 no.3 s.15 / pp.67-81 / 2005
Recently, interest in and research on new application services such as Location Based Service and Telematics, which provide emergency services, neighbor information search, and route search, have been increasing with the development of Geographic Information Systems. A user's search in the spatio-temporal database used in Location Based Service or Telematics usually fixes the current time on the time axis and queries the spatial and aspatial attributes. Thus, if the query range on the time axis is extensive, it is difficult to process the search efficiently. To solve this problem, the snapshot, a method for summarizing the location data of moving objects, was introduced. However, if the range of stored data is wide, more storage space is required, and a snapshot is created even for unnecessary space that is not frequently searched. Thus, the snapshot method generally wastes storage space and memory. Therefore, in this paper, we suggest the Hash-based Spatio-Temporal Clustering Algorithm (H-STCA), which extends the two-dimensional spatial hash algorithm previously used for spatial clustering to a three-dimensional spatial hash algorithm, to overcome the disadvantages of the snapshot method. This paper also suggests a knowledge extraction algorithm that extracts knowledge for the path search of moving objects from past location data, based on the suggested H-STCA algorithm. Moreover, in the performance evaluation, the snapshot clustering method using H-STCA outperformed the spatio-temporal index methods and the original snapshot method in search time, storage structure construction time, and optimal path search time for huge amounts of moving object data. In particular, the more moving objects there were, the more the performance of the snapshot clustering method using H-STCA improved compared with the existing spatio-temporal index methods and the original snapshot method.
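The abstract does not give the H-STCA hash function. The sketch below only illustrates the general idea of extending a 2-D grid hash with a time dimension: each location record is bucketed by an (x cell, y cell, time slot) key, and only non-empty cells are stored. The cell size, time slot length, and record layout are assumptions.

```python
from collections import defaultdict

def cell_key(x, y, t, cell_size, time_slot):
    """Hash a location record into a 3-D (x cell, y cell, time slot) key."""
    return (int(x // cell_size), int(y // cell_size), int(t // time_slot))

def build_hash(records, cell_size, time_slot):
    """Bucket (object_id, x, y, t) records; empty cells take no space."""
    buckets = defaultdict(list)
    for object_id, x, y, t in records:
        buckets[cell_key(x, y, t, cell_size, time_slot)].append(object_id)
    return buckets

# Hypothetical moving-object records: (id, x, y, seconds since start).
records = [
    ("a", 12.0, 7.0, 30),
    ("b", 18.0, 3.0, 45),
    ("c", 95.0, 40.0, 50),
    ("a", 14.0, 8.0, 130),
]
buckets = build_hash(records, cell_size=10, time_slot=60)
```

Because buckets exist only for cells that objects actually visited, unused regions cost nothing, which is the storage drawback of the snapshot method that the abstract points to.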

A MapReduce-Based Workflow BIG-Log Clustering Technique (맵리듀스기반 워크플로우 빅-로그 클러스터링 기법)

Jin, Min-Hyuck;Kim, Kwanghoon Pio
Journal of Internet Computing and Services / v.20 no.1 / pp.87-96 / 2019
In this paper, we propose a MapReduce-supported clustering technique for collecting and classifying distributed workflow enactment event logs as a preprocessing tool. In particular, we call the distributed workflow enactment event logs Workflow BIG-Logs, because they satisfy and fit well the 5V properties of BIG-Data: Volume, Velocity, Variety, Veracity, and Value. The clustering technique we develop in this paper is intentionally devised for the preprocessing phase of a specific workflow process mining and analysis algorithm based upon the Workflow BIG-Logs. In other words, it uses the MapReduce framework as a Workflow BIG-Logs processing platform, it supports the IEEE XES standard data format, and it is ultimately dedicated to the preprocessing phase of the $\rho$-Algorithm, a typical workflow process mining algorithm based on structured information control nets. More precisely, the Workflow BIG-Logs can be classified into two types of clustering patterns, activity-based and performer-based, and we implement an activity-based clustering pattern algorithm on the MapReduce framework. Finally, we try to verify the proposed clustering technique by carrying out an experimental study on the workflow enactment event log dataset released by the BPI Challenges.
https://doi.org/10.7472/jksii.2019.20.1.87
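To make the activity-based clustering pattern in the abstract above concrete, the sketch below imitates the map and reduce steps in plain Python rather than on a real MapReduce cluster. Events are dictionaries keyed by the standard XES attribute names `concept:name`, `org:resource`, and `time:timestamp`; the sample events and the function names are hypothetical, and the paper's actual algorithm may differ.

```python
from collections import defaultdict

def map_event(event):
    """Map step: emit (activity name, event) so events group by activity."""
    return event["concept:name"], event

def reduce_events(pairs):
    """Reduce step: collect events per activity, ordered by timestamp."""
    clusters = defaultdict(list)
    for activity, event in pairs:
        clusters[activity].append(event)
    return {
        activity: sorted(events, key=lambda e: e["time:timestamp"])
        for activity, events in clusters.items()
    }

# Hypothetical workflow enactment events.
log = [
    {"concept:name": "Review", "org:resource": "kim", "time:timestamp": "2019-01-02T10:00"},
    {"concept:name": "Submit", "org:resource": "lee", "time:timestamp": "2019-01-01T09:00"},
    {"concept:name": "Review", "org:resource": "park", "time:timestamp": "2019-01-01T12:00"},
]
clusters = reduce_events(map(map_event, log))
```

Switching the map key from `concept:name` to `org:resource` would give the performer-based pattern that the abstract mentions as the other clustering type.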
