• Title/Summary/Keyword: synthetic approaches

The Performance Bottleneck of Subsequence Matching in Time-Series Databases: Observation, Solution, and Performance Evaluation (시계열 데이타베이스에서 서브시퀀스 매칭의 성능 병목 : 관찰, 해결 방안, 성능 평가)

  • 김상욱
    • Journal of KIISE:Databases / v.30 no.4 / pp.381-396 / 2003
  • Subsequence matching is an operation that finds, in a time-series database, the subsequences whose changing patterns are similar to a given query sequence. This paper identifies the performance bottleneck in subsequence matching and proposes an effective method that significantly improves overall subsequence matching performance by resolving it. First, through preliminary experiments, we analyze the disk access and CPU processing times required during the index searching and post-processing steps. Based on the results, we show that the post-processing step is the main performance bottleneck in subsequence matching, and then claim that its optimization is a crucial issue overlooked in previous approaches. To resolve the bottleneck, we propose a simple but quite effective method that performs the post-processing step in the optimal way. By rearranging the order in which candidate subsequences are compared with the query sequence, our method completely eliminates the redundant disk accesses and CPU processing that occur in the post-processing step. We formally prove that our method is optimal and does not incur any false dismissal. We show the effectiveness of our method through extensive experiments. The results show that our method speeds up the post-processing step by 3.91 to 9.42 times on a data set of real-world stock sequences and by 4.97 to 5.61 times on large-volume synthetic data sets. The results also show that our method reduces the share of the post-processing step in overall subsequence matching from about 90% to less than 70%. This implies that our method successfully resolves the performance bottleneck in subsequence matching. As a result, our method provides excellent performance in overall subsequence matching.
The experimental results show that, compared with the previous method, overall subsequence matching is 3.05 to 5.60 times faster on a data set of real-world stock sequences and 3.68 to 4.21 times faster on large-volume synthetic data sets.
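
The idea of rearranging candidate order in the post-processing step can be illustrated with a minimal Python sketch. This is not the paper's algorithm or proof, only one plausible reading of it: candidates returned by the index search are grouped by data sequence and sorted by offset, so each sequence is read once and duplicate candidates are compared once. The names `post_process` and `read_sequence`, the `(seq_id, offset)` candidate format, and the Euclidean distance threshold `eps` are all assumptions made for illustration.

```python
import math
from collections import defaultdict


def post_process(candidates, read_sequence, query, eps):
    """Verify index-search candidates against the query sequence.

    candidates    -- iterable of (seq_id, offset) pairs (hypothetical format)
    read_sequence -- callable seq_id -> list of floats; stands in for a disk read
    query         -- list of floats
    eps           -- Euclidean distance tolerance
    """
    # Group offsets by sequence; using a set drops duplicate candidates.
    offsets_by_seq = defaultdict(set)
    for seq_id, offset in candidates:
        offsets_by_seq[seq_id].add(offset)

    n = len(query)
    matches = []
    for seq_id in sorted(offsets_by_seq):
        data = read_sequence(seq_id)  # each data sequence is read exactly once
        for offset in sorted(offsets_by_seq[seq_id]):
            window = data[offset:offset + n]
            if len(window) < n:
                continue  # candidate runs past the end of the sequence
            dist = math.sqrt(sum((a - b) ** 2 for a, b in zip(window, query)))
            if dist <= eps:
                matches.append((seq_id, offset))
    return matches
```

Because every candidate is still compared with the query using the true distance, the reordering changes only the access pattern, not the answer set, which is consistent with the abstract's claim of no false dismissal.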

Detection of Forest Fire Damage from Sentinel-1 SAR Data through the Synergistic Use of Principal Component Analysis and K-means Clustering (Sentinel-1 SAR 영상을 이용한 주성분분석 및 K-means Clustering 기반 산불 탐지)

  • Lee, Jaese;Kim, Woohyeok;Im, Jungho;Kwon, Chunguen;Kim, Sungyong
    • Korean Journal of Remote Sensing / v.37 no.5_3 / pp.1373-1387 / 2021
  • Forest fires pose a significant threat to the environment and society, affecting the carbon cycle and surface energy balance and causing socioeconomic losses. Widely used approaches to burned area detection based on multi-spectral satellite images do not work under cloudy conditions. Therefore, in this study, Sentinel-1 Synthetic Aperture Radar (SAR) data from the European Space Agency, which can be collected in all weather conditions, were used to identify forest fire damaged areas through a series of processes including Principal Component Analysis (PCA) and K-means clustering. Four forest fire cases, which occurred in Gangneung·Donghae and Goseong·Sokcho in Gangwon-do of South Korea and in two areas in North Korea on April 4, 2019, were examined. The estimated burned areas were evaluated using fire reference data provided by the National Institute of Forest Science (NIFOS) for the two forest fire cases in South Korea, and using the differenced normalized burn ratio (dNBR) for all four cases. The average accuracy against the NIFOS reference data was 86% for the Gangneung·Donghae and Goseong·Sokcho fires. Evaluation using dNBR showed an average accuracy of 84% for all four forest fire cases. It was also confirmed that the stronger the burn intensity, the higher the detection accuracy, and vice versa. Given the advantages of SAR remote sensing, the proposed approach based on statistical processing and K-means clustering can be used to quickly identify forest fire damaged areas across the Korean Peninsula, where the cloud cover rate is high and small-scale forest fires occur frequently.
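
A rough sketch of the PCA plus K-means idea, in plain Python, may help make the processing chain concrete. It is not the authors' pipeline: the abstract does not state which input features were used, so the sketch assumes each pixel is described by two hypothetical values, the pre-fire to post-fire change in VV and in VH backscatter (in dB). It projects the pixels onto the first principal component and then splits the scores into two clusters. The function names `first_pc_scores` and `kmeans_two_clusters` are made up for this example.

```python
import math


def first_pc_scores(rows):
    """Project 2-feature rows onto their first principal component."""
    n = len(rows)
    mx = sum(r[0] for r in rows) / n
    my = sum(r[1] for r in rows) / n
    sxx = sum((r[0] - mx) ** 2 for r in rows) / n
    syy = sum((r[1] - my) ** 2 for r in rows) / n
    sxy = sum((r[0] - mx) * (r[1] - my) for r in rows) / n
    # Orientation of the leading eigenvector of [[sxx, sxy], [sxy, syy]].
    theta = 0.5 * math.atan2(2.0 * sxy, sxx - syy)
    ux, uy = math.cos(theta), math.sin(theta)
    return [(r[0] - mx) * ux + (r[1] - my) * uy for r in rows]


def kmeans_two_clusters(values, iters=20):
    """1-D K-means with k=2, initialised at the minimum and maximum value."""
    c0, c1 = min(values), max(values)
    labels = [0] * len(values)
    for _ in range(iters):
        labels = [0 if abs(v - c0) <= abs(v - c1) else 1 for v in values]
        g0 = [v for v, lab in zip(values, labels) if lab == 0]
        g1 = [v for v, lab in zip(values, labels) if lab == 1]
        if g0:
            c0 = sum(g0) / len(g0)
        if g1:
            c1 = sum(g1) / len(g1)
    return labels
```

K-means only separates the pixels into two groups; deciding which group is the burned one (for example, the cluster with the larger backscatter change) is a separate step that this sketch leaves to the caller.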

Comparison of Association Rule Learning and Subgroup Discovery for Mining Traffic Accident Data (교통사고 데이터의 마이닝을 위한 연관규칙 학습기법과 서브그룹 발견기법의 비교)

  • Kim, Jeongmin;Ryu, Kwang Ryel
    • Journal of Intelligence and Information Systems / v.21 no.4 / pp.1-16 / 2015
  • Traffic accidents have been one of the major causes of death worldwide for the last several decades. According to World Health Organization statistics, approximately 1.24 million deaths occurred on the world's roads in 2010. To reduce future traffic accidents, multipronged approaches have been adopted, including traffic regulations, injury-reducing technologies, and driver training programs. Records of traffic accidents are generated and maintained for this purpose. To make these records meaningful and effective, it is necessary to analyze the relationship between traffic accidents and related factors such as vehicle design, road design, weather, and driver behavior. Insights derived from such analyses can be used in accident prevention. Traffic accident data mining is the activity of finding useful knowledge about such relationships that is not well known and that may interest the user. Many studies on mining accident data have been reported over the past two decades. Most of them focused on predicting the risk of an accident from accident-related factors. Supervised learning methods such as decision trees, logistic regression, k-nearest neighbors, and neural networks are used for this prediction. However, the prediction models derived by these algorithms are too complex for humans to understand, because the main purpose of these algorithms is prediction, not explanation of the data. Some studies use unsupervised clustering algorithms to divide the data into several groups, but the derived groups are still not easy for humans to understand, so additional analysis is necessary. Rule-based learning methods are adequate when we want to derive a comprehensible form of knowledge about the target domain. They derive a set of if-then rules that represent the relationship between the target feature and other features.
Rules are fairly easy for humans to understand, so they can help provide insight and comprehensible results. Association rule learning methods and subgroup discovery methods are representative rule-based learning methods for descriptive tasks. These two kinds of algorithms have been used in a wide range of areas, including transaction analysis, accident data analysis, detection of statistically significant patient risk groups, and discovery of key persons in social communities. We use both the association rule learning method and the subgroup discovery method to discover useful patterns from a traffic accident dataset consisting of many features, including the driver's profile, the accident location, the accident type, vehicle information, and regulation violations. The association rule learning method, which is an unsupervised learning method, searches for frequent item sets in the data and translates them into rules. In contrast, the subgroup discovery method is a supervised learning method that discovers rules for user-specified concepts that satisfy a certain degree of generality and unusualness. Depending on which aspect of the data we focus on, we may combine multiple relevant features of interest into a synthetic target feature and give it to the rule learning algorithms. After a set of rules is derived, postprocessing steps are taken to make the rule set more compact and easier to understand by removing uninteresting or redundant rules. We conducted a set of experiments mining our traffic accident data in both unsupervised and supervised modes to compare these rule-based learning algorithms. Experiments with the traffic accident data reveal that association rule learning, in its pure unsupervised mode, can discover some hidden relationships among the features.
Under the supervised learning setting with a combinatorial target feature, however, the subgroup discovery method finds good rules much more easily than the association rule learning method, which requires a lot of effort to tune its parameters.
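
To make the contrast between the two rule-scoring styles concrete, here is a small Python sketch. The abstract does not say which quality measures the authors used, so the choices are assumptions: support and confidence are the standard association rule measures, and weighted relative accuracy (WRAcc) is one widely used subgroup discovery measure that trades off generality (coverage) against unusualness (deviation from the overall target rate). The feature names and helper functions, including `add_synthetic_target`, are hypothetical.

```python
def matches(row, cond):
    """True if the row has every feature value listed in cond."""
    return all(row.get(k) == v for k, v in cond.items())


def support(rows, cond):
    """Fraction of rows that satisfy cond."""
    return sum(matches(r, cond) for r in rows) / len(rows)


def confidence(rows, antecedent, consequent):
    """Estimate of P(consequent | antecedent) for an association rule."""
    s_a = support(rows, antecedent)
    if s_a == 0:
        return 0.0
    return support(rows, {**antecedent, **consequent}) / s_a


def wracc(rows, cond, target):
    """Weighted relative accuracy: coverage * (P(target|cond) - P(target))."""
    return support(rows, cond) * (confidence(rows, cond, target)
                                  - support(rows, target))


def add_synthetic_target(rows, features, name):
    """Combine several boolean features into one synthetic target feature."""
    for r in rows:
        r[name] = all(r[f] for f in features)
```

For example, combining hypothetical `fatal` and `speeding` flags into a `fatal_speeding` target lets a subgroup description such as `{"weather": "rain", "night": True}` be scored with `wracc`, while the same description and target can be scored as an association rule with `support` and `confidence`.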

A Study on the Material Characteristics and Functionality Evaluation of a Size Layer of a Canvas (캔버스 차단층(Size Layer)의 재료특성 및 기능평가 연구)

  • Kim, Hwan Ju;Lee, Hwa Soo;Chung, Yong Jae
    • Journal of Conservation Science / v.32 no.2 / pp.167-178 / 2016
  • Although the size layer is an important component for conserving oil paintings, it has not yet been studied from a conservation science perspective. Therefore, this study produced standard samples based on the results of analyzing oil painting works and evaluated the functions of size layer materials. The literature review showed that animal glue was traditionally used for the size layer, whereas synthetic resins have been used in combination with animal glue since the modern age; in particular, Polyvinyl Acetate (PVAc) was found to be in general use. Analysis of oil painting works detected a size layer on the support and identified it as animal glue. Analysis of the ground, based on Funaoka canvas, showed that lead oxide and titanium dioxide were the main constituents. On the basis of these results, standard samples were produced. In the functional evaluation of the size layer materials, animal glue showed a stable shrinkage-expansion rate but slight weakness in moisture proofing, color, and tensile strength, and dense cracks were found on its surface. PVAc(A) showed stable results in moisture proofing, color, and tensile strength, but a higher shrinkage rate was observed and cracks with wide gaps were found on its surface. PVAc(B) showed stable results in tensile strength, shrinkage-expansion rate, and surface observation, but poor moisture proofing. Different behavior was observed in each experiment, and these phenomena were attributed to the density and the adhesion properties between the hydrophilic and hydrophobic molecules in the size layer materials. The results are expected to serve as reference material for the conservation of oil paintings.

A Study on the Technical and Administrative Innovation of Library Organization in the Perspective of the Contingency Theory (도서관조직의 기술혁신 및 행정혁신에 관한 조직상황론적 연구)

  • Hong Hyun-Jin
    • Journal of the Korean Society for Library and Information Science / v.25 / pp.343-388 / 1993
  • The ability of an organization to innovate in a rapidly changing environment is a matter of the organization's survival. Innovative activity is achieved in different ways according to the objectives of the organization, the characteristics of external environmental factors, and various attributes of the organization. In the present study, the existing approaches to the innovative nature of organizations were synthetically compared and evaluated; then, for a more rational approach, a research model was built and proposed by establishing inclusive variables of the innovative nature of library organizations and categorizing the types of such nature. Additionally, an empirical, analytical study of the model was carried out. That is, given that innovation is closely related to the circumstantial factors of an organization, the synthetic, circumstantial relations were clarified, considering the external environmental factors and internal characteristics of the organization. In the study, the innovation of library organizations was viewed in two parts, i.e., the feasible degree of technical innovation and the feasible degree of administrative innovation. According to the feasible degree of innovation, the types of innovative implementation were classified into four: a stationary type, a technic-oriented type, an organization-oriented type, and a technical-socio systematic type. There were nine independent variables, i.e., the scale of the organization, available resources of the organization, formalization, differentiation, specialization, decentralization, recognizant degree of the technical attribute, degree of response to the change of technical environment, and professional activities. There were three dependent variables, i.e., technical innovation, administrative innovation, and the performance of the organization.
Through the establishment of these variables, the factors that might influence the innovation of library organizations were identified, and, with the types of innovative implementation classified according to the feasible degree of innovation, the characteristics of library organizations were reviewed for each type. Also, the performance of library organizations by type of innovative implementation was analyzed, and the relations between the types of innovative implementation according to circumstantial variables and the performance of library organizations were clarified. To verify the adequacy of the research model empirically, data were collected from 72 university libraries and 38 special libraries, and for hypothesis testing of the research model, correlation analysis, stepwise regression analysis, and one-way ANOVA were used. The major findings of the study are as follows. 1) There was a trend that the larger the scale of the organization and its available resources, the more active the professional activity of the managerial class, and the higher the recognizant degree of technical environment (recognizant degree of technical attributes and the degree of response to the change of technical environment), the higher the feasible degree of innovation. 2) Among the variables influencing the feasible degree of technical innovation, the most influential was the recognizant degree of technical innovation, followed by the available resources of the organization and then professional activity. The variables influencing the feasible degree of administrative innovation were, in order of influence, the available resources of the organization, the differentiation of the organization, and the degree of response to the change of technical environment.
3) The higher the educational level of the managerial class, the more active the professional activity. There was a trend that library managers with mid-level experience as librarians (three to six years) were more active in research than those with higher-level experience (more than ten years). Also, there was a trend that the younger the library managers, the higher their recognizant degree of technical attributes, and managers with mid-level experience (three to six years) viewed the technical aspect more affirmatively than those with higher-level experience (more than ten years). Also, when professional association activity and research activity are active, the recognizant degree of technology becomes higher, and as a result it influences the innovative nature of the organization (the feasible degree of technical innovation and the feasible degree of administrative innovation). 4) The comparison and analysis of the characteristics of library organizations by type of innovative implementation indicated a trend that the larger the available resources of the library organization, the higher the organic nature of the organization (such as differentiation and decentralization), and the higher the level of system development operation, the more the type of innovative implementation tends to be the technical-socio systematic type, which is high in the practical degrees of both technical innovation and administrative innovation.
5) The comparison and analysis of the relations between the types of innovative implementation and organizational performance showed that performance is highest for the technical-socio systematic type, followed by the technic-oriented type, the organization-oriented type, and finally the stationary type, which is lowest. That is, since the performance of library organizations is highest in libraries of the technical-socio systematic type and lowest in libraries whose practical degrees of both technical and administrative innovation are low, the performance of library organizations differs significantly according to the type of innovative implementation. The present study has extracted the factors influencing innovation, systematically classified the types of innovative implementation, inferred the synthetic, circumstantial correlations between the types and organizational performance, and empirically examined those factors. However, because of the constraints of the study and the limits of its research design, its results should be interpreted with caution. Also, as an exploratory study of the types of innovative implementation with few preceding studies, the present study calls for more complete hypothesis development based on its results. In other words, more systematic studies of these relations will contribute to the proposal and demonstration of a more useful theory.
