• Title/Summary/Keyword: large database

Temporal Data Mining Framework (시간 데이타마이닝 프레임워크)

  • Lee, Jun-Uk;Lee, Yong-Jun;Ryu, Geun-Ho
    • The KIPS Transactions:PartD / v.9D no.3 / pp.365-380 / 2002
  • Temporal data mining, the incorporation of temporal semantics into existing data mining techniques, refers to a set of techniques for discovering implicit and useful temporal knowledge from large quantities of temporal data. Temporal knowledge, expressible in the form of rules, is knowledge with temporal semantics and relationships, such as cyclic patterns, calendric patterns, and trends. There are many examples of temporal data from which useful temporal knowledge can be discovered, including patient histories, purchaser histories, and web logs. Many studies on data mining have been pursued, and some of them have addressed temporal data mining issues for discovering temporal knowledge from temporal data, such as sequential patterns, similar time sequences, and cyclic and temporal association rules. However, all of these works treated the data in a database at best as a data series in chronological order and did not consider the temporal semantics and temporal relationships contained in the data. To solve this problem, we propose a theoretical framework for temporal data mining. This paper surveys the work to date and explores the issues involved in temporal data mining. We then define a model for temporal data mining, suggest an SQL-like mining language able to express temporal mining tasks, and show the architecture of a temporal mining system.

Signature-based Indexing Scheme for Similar Sub-Trajectory Retrieval of Moving Objects (이동 객체의 유사 부분궤적 검색을 위한 시그니쳐-기반 색인 기법)

  • Shim, Choon-Bo;Chang, Jae-Woo
    • The KIPS Transactions:PartD / v.11D no.2 / pp.247-258 / 2004
  • Recently, there has been research on storage and retrieval techniques for moving objects, which are of great interest to users in database application areas such as video databases, spatio-temporal databases, and mobile databases. In this paper, we propose a new signature-based indexing scheme that supports similar sub-trajectory retrieval as well as good retrieval performance on moving-object trajectories. According to the method used to generate a trajectory signature from the trajectory data of a moving object, our scheme is classified into a concatenated signature-based indexing scheme for similar sub-trajectory retrieval, called the CISR scheme, and a superimposed signature-based indexing scheme for similar sub-trajectory retrieval, called the SISR scheme. Our indexing scheme can improve retrieval performance by greatly reducing the number of disk accesses to the data file, because it first scans all signatures and filters them before accessing the data file. In addition, we improve retrieval efficiency by applying the k-warping algorithm to measure the similarity between a query trajectory and a data trajectory. Finally, we evaluate the performance of the sequential scan method (SeqScan), the CISR scheme, and the SISR scheme in terms of data insertion time, retrieval time, and storage overhead. Our experimental results show that both the CISR scheme and the SISR scheme are better than sequential scan in terms of retrieval performance, and that the SISR scheme is especially superior to the CISR scheme.
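
The filter-then-verify idea in this abstract (scan compact signatures first, touch the data file only for survivors) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration, not the paper's CISR or SISR design: trajectories are modeled as lists of direction symbols, one bit per consecutive pair of moves is superimposed (OR-ed) into a 64-bit signature, and verification is an exact contiguous match rather than the k-warping similarity measure.

```python
import zlib

SIG_BITS = 64  # assumed signature width, chosen only for this sketch


def signature(moves):
    """Superimpose (OR) one hashed bit per consecutive pair of moves."""
    sig = 0
    for a, b in zip(moves, moves[1:]):
        sig |= 1 << (zlib.crc32(f"{a}>{b}".encode()) % SIG_BITS)
    return sig


def contains(moves, query):
    """Exact check: does `query` occur as a contiguous run inside `moves`?"""
    n = len(query)
    return any(moves[i:i + n] == query for i in range(len(moves) - n + 1))


def search(trajectories, signatures, query):
    """Filter on signatures first, then verify only the surviving candidates."""
    qsig = signature(query)
    # A trajectory can contain the query only if every query bit is set in it.
    candidates = [i for i, s in enumerate(signatures) if (s & qsig) == qsig]
    matches = [i for i in candidates if contains(trajectories[i], query)]
    return matches, candidates


# Hypothetical toy data: N/E/S/W are direction symbols.
trajectories = [list("NEES"), list("SSWW"), list("WNEE")]
signatures = [signature(t) for t in trajectories]
matches, candidates = search(trajectories, signatures, list("NEE"))
print(matches, candidates)
```

The signature test can produce false positives (hash collisions) but no false negatives, which is why the verification step is still needed on the candidates.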

Development of Traffic Accident Index Considering Driving Behavior of a Data Based (데이터 기반의 도로구간별 운전자의 통행행태를 고려한 교통사고지표 개발)

  • LEE, Soongbong;CHANG, Hyunho;CHEON, Seunghoon;BAEK, Seungkirl;LEE, Young-Ihn
    • Journal of Korean Society of Transportation / v.34 no.4 / pp.341-353 / 2016
  • Highways mainly serve medium- and long-distance vehicular travel. Trip lengths have shown a growing trend due to longer commute distances caused by the relocation of public agencies. For this reason, the proportion of driver-caused accidents attributable to fatigue or sleepiness is very high on highways. However, existing studies on accident prediction have mainly considered external factors, such as road conditions, environmental factors, and vehicle factors, without considering driving behavior. In this study, we propose an accident index (FDR, Fatigued Driving Rate) based on travel behavior, derived from large-scale car navigation path data, and explore the relationship between FDR and traffic accidents. As a result, FDR and traffic accidents showed a high correlation. This confirms the need for a paradigm shift (from facilities to travel behavior) in traffic accident prediction studies. The FDR proposed in this study can be utilized in a variety of fields: for example, providing information to prevent traffic accidents (sleepiness, reckless driving, etc.) in advance, serving as a core technology in highway safety diagnostics, selecting priority locations for rest areas and shelters, and selecting attention-raising measures (rumble strips, grooving) for sections prone to driver fatigue.

Properties of a Social Network Topology of Livestock Movements to Slaughterhouse in Korea (도축장 출하차량 이동의 사회연결망 특성 분석)

  • Park, Hyuk;Bae, Sunhak;Pak, Son-Il
    • Journal of Veterinary Clinics / v.33 no.5 / pp.278-285 / 2016
  • Epidemiological studies have shown the association between transportation of live animals and the potential transmission of infectious disease between premises. This finding was also observed in the 2014-2015 foot-and-mouth disease (FMD) outbreak in Korea. Furthermore, slaughterhouses played a key role in the global spread of the FMD virus during the epidemic. In this context, in-depth knowledge of the structure of direct and indirect contact between slaughterhouses is paramount for understanding the dynamics of FMD transmission. However, the social network structure of vehicle movements to slaughterhouses in Korea remains unclear. Hence, the aim of this study was to configure a social network topology of vehicle movements between slaughterhouses for a better understanding of how they are potentially connected, and to explore whether FMD outbreaks can be explained by the network properties constructed in the study. We created five monthly directed networks based on the frequency and chronology of on- and off-slaughterhouse vehicle movements. For the monthly network, a node represented a slaughterhouse, and an edge (or link) denoted vehicle movement between two slaughterhouses. Movement data were retrieved from the national Korean Animal Health Integrated System (KAHIS) database, which tracks the routes of individual vehicle movements using a global positioning system (GPS). Electronic registration of livestock movements has been a mandatory requirement since 2013 to ensure traceability of such movements. For each of the five studied networks, the network structures were characterized by small-world properties, with a short mean distance, a high clustering coefficient, and a short diameter. In addition, a strongly connected component was observed in each of the created networks, and this giant component included 94.4% to 100% of all network nodes. The characteristic hub-and-spoke type of structure was not identified. Such a structural vulnerability in the network suggests that once an infectious disease (such as FMD) is introduced in a random slaughterhouse within the cohesive component, it can spread to every other slaughterhouse in the component. From an epidemiological perspective, for disease management, empirically derived small-world networks could inform decision-makers on the higher potential for a large FMD epidemic within the livestock industry, and could provide insights into the rapid-transmission dynamics of the disease across long distances, despite a standstill of animal movements during the epidemic, given a single incursion of infection in any slaughterhouse in the country.
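
The "giant strongly connected component" figure quoted in this abstract is the share of nodes in the largest set where every node can reach every other along directed edges. A minimal sketch of how such a share can be computed is given here, using Kosaraju's two-pass algorithm in plain Python. The function name and the toy network are hypothetical; the abstract does not say which software the authors used.

```python
from collections import defaultdict


def largest_scc_fraction(nodes, edges):
    """Fraction of nodes in the largest strongly connected component.

    Assumes every edge endpoint also appears in `nodes`.
    """
    graph, reverse = defaultdict(list), defaultdict(list)
    for u, v in edges:
        graph[u].append(v)
        reverse[v].append(u)

    # Pass 1: record nodes in order of DFS completion (iterative DFS).
    seen, order = set(), []
    for start in nodes:
        if start in seen:
            continue
        seen.add(start)
        stack = [(start, iter(graph[start]))]
        while stack:
            node, neighbours = stack[-1]
            for nxt in neighbours:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append((nxt, iter(graph[nxt])))
                    break
            else:
                order.append(node)  # all neighbours explored
                stack.pop()

    # Pass 2: sweep the reversed graph in reverse completion order;
    # each sweep collects exactly one strongly connected component.
    seen.clear()
    best = 0
    for start in reversed(order):
        if start in seen:
            continue
        seen.add(start)
        size, todo = 0, [start]
        while todo:
            node = todo.pop()
            size += 1
            for nxt in reverse[node]:
                if nxt not in seen:
                    seen.add(nxt)
                    todo.append(nxt)
        best = max(best, size)
    return best / len(nodes)


# Hypothetical toy network: A, B, C form a cycle; D only receives vehicles.
nodes = ["A", "B", "C", "D"]
edges = [("A", "B"), ("B", "C"), ("C", "A"), ("C", "D")]
print(largest_scc_fraction(nodes, edges))  # 0.75
```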

A Scene Boundary Detection Scheme using Audio Information in MPEG System Stream (MPEG 시스템 스트림상에서 오디오 정보를 이용한 장면 경계 검출 방법)

  • Kim, Jae-Hong;Nang, Jong-Ho;Park, Soo-Yong
    • Journal of KIISE:Software and Applications / v.27 no.8 / pp.864-876 / 2000
  • This paper proposes a new scene boundary detection scheme for the MPEG System stream using MPEG Audio information and demonstrates its usefulness through extensive experiments. A scene boundary is characterized by rapid changes in both the audio and the video information. This paper first classifies scene boundaries into three cases with respect to the audio changes: Radical, Gradual, and Micro changes. The Radical change has a large-scale change in decibel value and pitch value at a scene boundary, the Gradual change shows a long transition of decibel and pitch values from max to min or vice versa, and the Micro change displays some change in pitch or frequency distribution without decibel changes. Based on this analysis, a new scene change detection algorithm that detects these three cases is proposed, in which a progressive window with a time line is used to trace the changes in the audio information. Experiments with various movies show that the proposed algorithm produces a high detection ratio for the Radical change, which is the most common scene change in movies, while producing a moderate detection ratio for Gradual and Micro changes. The proposed scene boundary detection scheme could be used to build a database for visual information such as MPEG System streams.

A Image Retrieval Model Based on Weighted Visual Features Determined by Relevance Feedback (적합성 피드백을 통해 결정된 가중치를 갖는 시각적 특성에 기반을 둔 이미지 검색 모델)

  • Song, Ji-Young;Kim, Woo-Cheol;Kim, Seung-Woo;Park, Sang-Hyun
    • Journal of KIISE:Databases / v.34 no.3 / pp.193-205 / 2007
  • The increasing number of digital images calls for more accurate and faster image retrieval. So far, image retrieval methods have included content-based retrieval and keyword-based retrieval, the former utilizing visual features such as color and brightness and the latter utilizing keywords that describe the image. However, the effectiveness of these methods in providing the exact images the user wants has been questioned. Hence, many researchers have been working on relevance feedback, a process in which responses from the user are given as feedback during the retrieval session in order to define the user’s need and provide improved results. Yet methods that employ relevance feedback also have drawbacks, since several rounds of feedback are necessary to obtain appropriate results and the feedback information cannot be reused. In this paper, a novel retrieval model is proposed that annotates an image with a keyword and modifies the confidence level of the keyword in response to the user’s feedback. In the proposed model, not only the images that have received positive feedback but also other images with visual features similar to the features used to distinguish the positive image are subjected to confidence modification. This enables modifying a large number of images with only a few rounds of feedback, ultimately leading to faster and more accurate retrieval results. An experiment was performed to verify the effectiveness of the proposed model, and the results demonstrated a rapid increase in recall and precision for the same number of feedback rounds.

An Efficient VEB Beats Detection Algorithm Using the QRS Width and RR Interval Pattern in the ECG Signals (ECG신호의 QRS 폭과 RR Interval의 패턴을 이용한 효율적인 VEB 비트 검출 알고리듬)

  • Chung, Yong-Joo
    • Journal of the Institute of Convergence Signal Processing / v.12 no.2 / pp.96-101 / 2011
  • Recently, the demand for remote ECG monitoring systems has been increasing, and the automation of such monitoring systems is becoming a significant concern. Automatic detection of abnormal ECG beats is a necessity for the successful commercialization of these real-time remote ECG monitoring systems. From this viewpoint, in this paper we propose an automatic detection algorithm for abnormal ECG beats using QRS width and RR interval patterns. In previous research, much effort has been devoted to classifying ECG beats into detailed categories. However, these approaches have the disadvantages that they produce many misclassification errors and show variability in classification performance. They also require a large amount of training data for accurate classification and heavy computation during the classification process. We believe, however, that detecting abnormality in ECG beats is more important than detailed classification for an automatic ECG monitoring system. In this paper, we attempt to detect the VEB, which is the most frequently occurring of the abnormal ECG beats, and we achieved satisfactory detection performance when applying the proposed algorithm to the MIT/BIH database.

n-Gram/2L: A Space and Time Efficient Two-Level n-Gram Inverted Index Structure (n-gram/2L: 공간 및 시간 효율적인 2단계 n-gram 역색인 구조)

  • Kim Min-Soo;Whang Kyu-Young;Lee Jae-Gil;Lee Min-Jae
    • Journal of KIISE:Databases / v.33 no.1 / pp.12-31 / 2006
  • The n-gram inverted index has two major advantages: it is language-neutral and error-tolerant. Due to these advantages, it has been widely used in information retrieval and in similar sequence matching for DNA and protein databases. Nevertheless, the n-gram inverted index also has drawbacks: its size tends to be very large, and its query performance tends to be poor. In this paper, we propose the two-level n-gram inverted index (simply, the n-gram/2L index), which significantly reduces the size and improves the query performance while preserving the advantages of the n-gram inverted index. The proposed index eliminates the redundancy of the position information that exists in the n-gram inverted index. The proposed index is constructed in two steps: 1) extracting subsequences of length m from documents and 2) extracting n-grams from those subsequences. We formally prove that this two-step construction is identical to the relational normalization process that removes the redundancy caused by a non-trivial multivalued dependency. The n-gram/2L index has excellent properties: 1) it significantly reduces the size and improves the performance compared with the n-gram inverted index, with these improvements becoming more marked as the database size gets larger; 2) the query processing time increases only very slightly as the query length gets longer. Experimental results using databases of 1 GByte show that the size of the n-gram/2L index is reduced by up to 1.9 to 2.7 times and, at the same time, the query performance is improved by up to 13.1 times compared with those of the n-gram inverted index.
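
The two-step construction described in this abstract can be sketched as a toy example. This is a simplified reading, not the authors' implementation: the function names are hypothetical, and the sketch assumes that consecutive subsequences overlap by n-1 characters so that no n-gram spanning a boundary is lost. Position information is stored once per distinct subsequence (front-end) and once per subsequence occurrence (back-end), rather than once per n-gram occurrence.

```python
from collections import defaultdict


def build_two_level_index(docs, m=4, n=2):
    """Toy two-level n-gram index over a list of strings."""
    if m < n:
        raise ValueError("m must be at least n")
    step = m - (n - 1)  # assumed overlap of n-1 characters between subsequences
    subseq_ids = {}                 # subsequence text -> subsequence id
    back_end = defaultdict(list)    # subsequence id -> [(doc id, offset in doc)]
    front_end = defaultdict(list)   # n-gram -> [(subsequence id, offset in subsequence)]
    for doc_id, text in enumerate(docs):
        # Step 1: extract subsequences of length m (the last one may be shorter).
        for off in range(0, max(len(text) - n + 1, 1), step):
            sub = text[off:off + m]
            if len(sub) < n:
                continue
            if sub not in subseq_ids:
                sid = subseq_ids[sub] = len(subseq_ids)
                # Step 2: extract n-grams once per distinct subsequence.
                for i in range(len(sub) - n + 1):
                    front_end[sub[i:i + n]].append((sid, i))
            back_end[subseq_ids[sub]].append((doc_id, off))
    return subseq_ids, dict(back_end), dict(front_end)


def ngram_positions(ngram, back_end, front_end):
    """Join the two levels to recover (doc id, offset) pairs for an n-gram."""
    hits = set()
    for sid, i in front_end.get(ngram, []):
        for doc_id, off in back_end[sid]:
            hits.add((doc_id, off + i))
    return sorted(hits)


docs = ["ABCDABCD", "BCDA"]
ids, back_end, front_end = build_two_level_index(docs, m=4, n=2)
print(ngram_positions("AB", back_end, front_end))  # [(0, 0), (0, 4)]
```

Joining the two levels at query time returns the same positions a plain n-gram inverted index would store directly, which is the sense in which the construction resembles normalizing a relation into two smaller ones.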

Online Signature Verification by Visualization of Dynamic Characteristics using New Pattern Transform Technique (동적 특성의 시각화를 수행하는 새로운 패턴변환 기법에 의한 온라인 서명인식 기술)

  • Chi Suyoung;Lee Jaeyeon;Oh Weongeun;Kim Changhun
    • Journal of KIISE:Software and Applications / v.32 no.7 / pp.663-673 / 2005
  • An analysis model for the dynamic information of two-dimensional time-series patterns is described. In the proposed model, two novel transforms that visualize the dynamic characteristics are proposed. The first transform, referred to as speed equalization, reproduces a time-series pattern assuming a constant linear velocity to effectively model the temporal characteristics of the signing process. The second transform, referred to as velocity transform, maps the signal onto a horizontal vs. vertical velocity plane where the variation of the velocities over time is represented as a visible shape. With the transforms, the dynamic characteristics in the original signing process are reflected in the shape of the transformed patterns. An analysis in the context of these shapes then naturally results in an effective analysis of the dynamic characteristics. The proposed transform technique is applied to an online signature verification problem for evaluation. In experiments on a large signature database, the performance measured in EER (Equal Error Rate) improved to 1.17%, compared with 1.93% for the traditional signature verification algorithm in which no transformed patterns are utilized. In the skilled forgery experiments, the improvement was even more pronounced; it was demonstrated that the parameter set extracted from the transformed patterns was more discriminative in rejecting forgeries.
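
The two transforms named in this abstract can be illustrated with a minimal sketch. The abstract does not give the exact definitions, so both functions are one plausible reading and their names are hypothetical: the velocity transform is taken as finite differences of pen positions sampled at a fixed interval, and speed equalization as resampling the stroke at equal arc-length steps, that is, as if it had been drawn at constant speed.

```python
import math


def velocity_transform(points, dt=1.0):
    """Map (x, y) samples taken every dt seconds to (vx, vy) points."""
    return [((x1 - x0) / dt, (y1 - y0) / dt)
            for (x0, y0), (x1, y1) in zip(points, points[1:])]


def speed_equalize(points, num_samples):
    """Resample a polyline at equal arc-length steps (needs >= 2 points/samples)."""
    # Cumulative arc length at each original sample.
    cum = [0.0]
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        cum.append(cum[-1] + math.hypot(x1 - x0, y1 - y0))
    total = cum[-1]

    out, seg = [], 0
    for k in range(num_samples):
        target = total * k / (num_samples - 1)
        # Advance to the segment that contains the target arc length.
        while seg < len(points) - 2 and cum[seg + 1] < target:
            seg += 1
        span = cum[seg + 1] - cum[seg]
        t = 0.0 if span == 0 else (target - cum[seg]) / span
        (x0, y0), (x1, y1) = points[seg], points[seg + 1]
        out.append((x0 + t * (x1 - x0), y0 + t * (y1 - y0)))
    return out


# Hypothetical stroke: slow first step, fast second step along the x axis.
stroke = [(0.0, 0.0), (1.0, 0.0), (4.0, 0.0)]
print(velocity_transform(stroke))
print(velocity_transform(speed_equalize(stroke, 5)))
```

In the toy stroke the raw velocities differ between the two steps, while the equalized stroke has (approximately) the same velocity everywhere, so only the shape of the path remains.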

Trajectory Indexing for Efficient Processing of Range Queries (영역 질의의 효과적인 처리를 위한 궤적 인덱싱)

  • Cha, Chang-Il;Kim, Sang-Wook;Won, Jung-Im
    • The KIPS Transactions:PartD / v.16D no.4 / pp.487-496 / 2009
  • This paper addresses an indexing scheme capable of efficiently processing range queries in a large-scale trajectory database. After discussing the drawbacks of previous indexing schemes, we propose a new scheme that divides the temporal dimension into multiple time intervals and then builds an index for the line segments of each interval. Additionally, a supplementary index is built for the line segments within each time interval. This scheme can dramatically improve the performance of insert and search operations by using a main-memory index, particularly for the time interval consisting of the segments taken by those objects which are currently moving or have just completed their movements, in contrast to previous schemes that store the index entirely on disk. Each time interval index is built as follows. First, the extent of the spatial dimension is divided into multiple spatial cells to which the line segments are assigned evenly. We use a 2D-tree to maintain information on those cells. Then, for each cell, an additional 3D R*-tree is created on the spatio-temporal space (x, y, t). Such a multi-level indexing strategy can remedy the shortcomings of the legacy schemes. Performance results obtained from intensive experiments show that our scheme enhances the performance of retrieval operations by 3 to 10 times, with much less storage space.
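
The partitioning idea in this abstract (split time into intervals, then split space into cells within each interval) can be sketched in a much simplified form. The sketch deliberately swaps in simpler structures than the paper names: a uniform grid stands in for the 2D-tree over evenly loaded cells, a plain list per cell stands in for the 3D R*-tree, and segments are matched by bounding box only. The class name and parameters are hypothetical.

```python
from collections import defaultdict
from itertools import product


class IntervalGridIndex:
    """Toy index: buckets keyed by (time interval, grid cell x, grid cell y)."""

    def __init__(self, interval_len=10.0, cell_size=5.0):
        self.interval_len = interval_len
        self.cell_size = cell_size
        self.buckets = defaultdict(list)  # key -> [(object id, bounding box)]

    def _keys(self, xmin, xmax, ymin, ymax, tmin, tmax):
        """All (interval, cell x, cell y) keys overlapped by a box."""
        ts = range(int(tmin // self.interval_len), int(tmax // self.interval_len) + 1)
        xs = range(int(xmin // self.cell_size), int(xmax // self.cell_size) + 1)
        ys = range(int(ymin // self.cell_size), int(ymax // self.cell_size) + 1)
        return product(ts, xs, ys)

    def insert(self, obj_id, p0, p1):
        """Insert a line segment from p0 = (x, y, t) to p1 = (x, y, t)."""
        (x0, y0, t0), (x1, y1, t1) = p0, p1
        box = (min(x0, x1), max(x0, x1), min(y0, y1), max(y0, y1),
               min(t0, t1), max(t0, t1))
        for key in self._keys(*box):
            self.buckets[key].append((obj_id, box))

    def range_query(self, xmin, xmax, ymin, ymax, tmin, tmax):
        """Ids of objects whose segment bounding box overlaps the query box."""
        hits = set()
        # Only buckets overlapping the query in both time and space are visited.
        for key in self._keys(xmin, xmax, ymin, ymax, tmin, tmax):
            for obj_id, (bx0, bx1, by0, by1, bt0, bt1) in self.buckets.get(key, ()):
                if (bx0 <= xmax and bx1 >= xmin and by0 <= ymax and by1 >= ymin
                        and bt0 <= tmax and bt1 >= tmin):
                    hits.add(obj_id)
        return hits


index = IntervalGridIndex()
index.insert("a", (1, 1, 0), (2, 2, 5))
index.insert("b", (20, 20, 0), (21, 21, 5))
index.insert("c", (1, 1, 30), (2, 2, 35))
print(index.range_query(0, 3, 0, 3, 0, 9))  # {'a'}
```

Keeping the most recent interval's buckets in main memory and older ones on disk would correspond to the memory/disk split the abstract describes; the sketch keeps everything in memory.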