• Title/Summary/Keyword: Analysis of Query

Design of Lazy Classifier based on Fuzzy k-Nearest Neighbors and Reconstruction Error (퍼지 k-Nearest Neighbors 와 Reconstruction Error 기반 Lazy Classifier 설계)

  • Roh, Seok-Beom;Ahn, Tae-Chon
    • Journal of the Korean Institute of Intelligent Systems / v.20 no.1 / pp.101-108 / 2010
  • In this paper, we propose a new lazy classifier that combines a fuzzy k-nearest neighbors approach with feature selection based on reconstruction error. Reconstruction error is the performance index for locally linear reconstruction. When a new query point is given, the fuzzy k-nearest neighbors approach defines the local area in which the local classifier is valid and assigns weighting values to the data patterns that fall within that area. After the local area is defined and the weighting values are assigned, feature selection is carried out to reduce the dimension of the feature space. Once features have been selected in terms of the reconstruction error, the local classifier, which is a kind of polynomial, is developed using weighted least squares estimation. In addition, the experimental section includes a comparative analysis with several commonly used methods such as standard neural networks, support vector machines, linear discriminant analysis, and C4.5 trees.
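The neighbor-weighting step described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the inverse-distance weight form with fuzzifier `m`, the function names, and the toy data are all assumptions, and the feature selection and local polynomial fitting steps are omitted.

```python
import math

def fuzzy_knn_weights(query, points, k=3, m=2.0):
    """Return (index, weight) pairs for the k points nearest to `query`.

    Assumed weight form: 1 / d^(2/(m-1)), normalised so the weights sum to 1.
    """
    nearest = sorted((math.dist(query, p), i) for i, p in enumerate(points))[:k]
    eps = 1e-12  # guards against division by zero when the query hits a data point
    raw = [(i, 1.0 / (d ** (2.0 / (m - 1.0)) + eps)) for d, i in nearest]
    total = sum(w for _, w in raw)
    return [(i, w / total) for i, w in raw]

def class_memberships(weights, labels):
    """Sum the neighbor weights per class label to get fuzzy memberships."""
    memb = {}
    for i, w in weights:
        memb[labels[i]] = memb.get(labels[i], 0.0) + w
    return memb

# Hypothetical toy data: five 2-D patterns with two class labels.
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (5.0, 5.0), (6.0, 5.0)]
labels = ["a", "a", "b", "b", "b"]
weights = fuzzy_knn_weights((0.2, 0.1), points, k=3)
memberships = class_memberships(weights, labels)
```

In the method the abstract describes, weights like these would then feed a weighted least squares fit of the local classifier rather than a direct membership vote.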

Experiment and Simulation for Evaluation of Jena Storage Plug-in Considering Hierarchical Structure (계층 구조를 고려한 Jena Plug-in 저장소의 평가를 위한 실험 및 시뮬레이션)

  • Shin, Hee-Young;Jeong, Dong-Won;Baik, Doo-Kwon
    • Journal of the Korea Society for Simulation / v.17 no.2 / pp.31-47 / 2008
  • As OWL (Web Ontology Language) has been selected as a standard ontology description language by the W3C, many ontologies have been built and developed in OWL. Jena, developed by HP as an Application Programming Interface (API), provides various APIs for developing inference engines as well as storage, and it is widely used for system development. However, the storage model of Jena2 stores most of an OWL document in a single table, and it shows low processing performance for large ontology data sets. Above all, the Jena2 storage model does not consider the hierarchical structures of classes and properties. In addition, queries that use the hierarchical structure perform poorly because they require many join operations. To solve these issues, this paper proposes an OWL ontology relational database model. The proposed model semantically classifies and stores information such as classes, properties, and instances. It improves query processing performance by managing hierarchical information in a separate table. This paper also describes the implementation, presents the experiment and evaluation results, and gives a comparative analysis of both. The experiment and evaluation show that our proposal provides notably better performance than Jena2.
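One way to picture "hierarchical information in a separate table" is a closure table that stores every ancestor/descendant pair, so that a subclass-aware query needs a single join. The sketch below uses Python's `sqlite3` module; the table names, column names, and sample classes are hypothetical, and the abstract does not state that the paper uses this exact schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE class_hierarchy (ancestor TEXT, descendant TEXT);
CREATE TABLE instance (name TEXT, class_name TEXT);
""")

# Assumed layout: the transitive closure of the subclass relation,
# with each class also listed as its own ancestor.
conn.executemany("INSERT INTO class_hierarchy VALUES (?, ?)", [
    ("Animal", "Animal"), ("Animal", "Mammal"), ("Animal", "Dog"),
    ("Mammal", "Mammal"), ("Mammal", "Dog"),
    ("Dog", "Dog"),
])
conn.executemany("INSERT INTO instance VALUES (?, ?)", [
    ("rex", "Dog"), ("generic_mammal", "Mammal"), ("tweety", "Animal"),
])

def instances_of(cls):
    """Return instances of `cls` and of all its subclasses, using one join."""
    rows = conn.execute(
        "SELECT i.name FROM instance i "
        "JOIN class_hierarchy h ON i.class_name = h.descendant "
        "WHERE h.ancestor = ? ORDER BY i.name",
        (cls,),
    ).fetchall()
    return [r[0] for r in rows]

mammals = instances_of("Mammal")
animals = instances_of("Animal")
```

The trade-off in this kind of design is extra rows in the hierarchy table in exchange for avoiding repeated self-joins at query time.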

Content-Based Video Search Using Eigen Component Analysis and Intensity Component Flow (고유성분 분석과 휘도성분 흐름 특성을 이용한 내용기반 비디오 검색)

  • 전대홍;강대성
    • Journal of the Institute of Convergence Signal Processing / v.3 no.3 / pp.47-53 / 2002
  • In this paper, we propose a content-based video search method using the eigenvalue of the key frame and the intensity component. We divide the video stream into shot units to extract a key frame representing each shot, and obtain the intensity distribution of the shot from the database generated by using ECA (Eigen Component Analysis). The generated codebook, the index value for each key frame, and the intensity values are used for the database. The query image is used to find the video stream that has the most similar frame by applying the Euclidean distance measure among the codewords in the codebook. The experimental results show that the proposed algorithm gives better search results than other methods because it makes use of eigenvalue and intensity elements, and that it also reduces the processing time.
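The codebook lookup step mentioned above amounts to a nearest-neighbor search under Euclidean distance. A minimal sketch, assuming the query image has already been reduced to a feature vector; the codewords, the shot identifiers, and the function name are made up for illustration.

```python
import math

def nearest_codeword(query_vec, codebook):
    """Return the index of the codeword closest to query_vec (Euclidean distance)."""
    return min(range(len(codebook)), key=lambda i: math.dist(query_vec, codebook[i]))

# Hypothetical codebook and a mapping from codeword index to the indexed shot.
codebook = [(0.1, 0.9, 0.3), (0.8, 0.2, 0.5), (0.4, 0.4, 0.4)]
key_frame_index = {0: "shot_03", 1: "shot_11", 2: "shot_27"}

best = nearest_codeword((0.75, 0.25, 0.45), codebook)
matched_shot = key_frame_index[best]
```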

Design and Performance Analysis of a Parallel Cell-Based Filtering Scheme using Horizontally-Partitioned Technique (수평 분할 방식을 이용한 병렬 셀-기반 필터링 기법의 설계 및 성능 평가)

  • Chang, Jae-Woo;Kim, Young-Chang
    • The KIPS Transactions:PartD / v.10D no.3 / pp.459-470 / 2003
  • Research on high-dimensional index structures is needed to retrieve high-dimensional data efficiently, because attribute vectors in data warehousing and feature vectors in multimedia databases are high-dimensional. Many high-dimensional index structures have been proposed for this purpose, but they suffer from the so-called 'curse of dimensionality': retrieval performance degrades sharply as the dimensionality increases. To solve this problem, the cell-based filtering (CBF) scheme was proposed. However, the performance of the CBF scheme still decreases linearly as the dimensionality increases. To cope with this, it is necessary to make use of parallel processing techniques. In this paper, we propose a parallel CBF scheme that uses a horizontally-partitioned technique for declustering. To maximize retrieval performance, we build the parallel CBF scheme on an SN (Shared Nothing) cluster architecture. In addition, we present a data insertion algorithm, a range query processing algorithm, and a k-NN query processing algorithm that are suitable for the SN cluster architecture. Finally, we show that the retrieval performance of our parallel CBF scheme improves in proportion to the number of servers in the SN cluster architecture, compared with the conventional CBF scheme.
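The general shape of a k-NN query over horizontally partitioned data can be sketched as follows: each server answers the query over its own partition, and a coordinator merges the local results. This is only an in-process simulation under stated assumptions. Round-robin placement stands in for the declustering method, the cell-based filtering step itself is not modeled, and all function names are hypothetical.

```python
import heapq
import math

def partition_round_robin(vectors, n_servers):
    """Assign vector i to server i mod n_servers (assumed declustering rule)."""
    parts = [[] for _ in range(n_servers)]
    for i, v in enumerate(vectors):
        parts[i % n_servers].append((i, v))
    return parts

def local_knn(partition, query, k):
    """Work done on one server: its k best (distance, id) pairs."""
    return heapq.nsmallest(k, ((math.dist(query, v), i) for i, v in partition))

def parallel_knn(partitions, query, k):
    """Coordinator: gather each server's local top-k and keep the global top-k."""
    candidates = []
    for part in partitions:  # in a shared-nothing cluster these calls run concurrently
        candidates.extend(local_knn(part, query, k))
    return [i for _, i in heapq.nsmallest(k, candidates)]

vectors = [(0, 0), (1, 1), (2, 2), (3, 3), (4, 4), (5, 5)]
parts = partition_round_robin(vectors, 3)
result = parallel_knn(parts, (1.2, 1.2), 2)
```

Because every server returns its own top-k, the merged candidate set always contains the true global top-k, so the coordinator only has to re-rank at most `k * n_servers` items.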

Applying an Aggregate Function AVG to OLAP Cubes (OLAP 큐브에서의 집계함수 AVG의 적용)

  • Lee, Seung-Hyun;Lee, Duck-Sung;Choi, In-Soo
    • Journal of the Korea Society of Computer and Information / v.14 no.1 / pp.217-228 / 2009
  • Data analysis applications typically aggregate data across many dimensions, looking for unusual patterns in the data. Even though such analyses can usually be expressed as standard structured query language (SQL) queries, the queries may become very complex. A complex query may result in many scans of the base table, leading to poor performance. Because online analytical processing (OLAP) queries are usually complex, it is desirable to define a new operator for aggregation, called the data cube or simply the cube. The data cube supports OLAP tasks such as aggregation and sub-totals. Many aggregate functions can be used to construct a data cube. These functions can be classified into three categories: distributive, algebraic, and holistic. It has been thought that distributive functions such as SUM, COUNT, MAX, and MIN can be used to construct a data cube, and that an algebraic function such as AVG can also be used if it is replaced by an intermediate function. The common belief is that although AVG is not distributive, the intermediate function (SUM, COUNT) is distributive, and AVG can certainly be computed from (SUM, COUNT). In this paper, however, it is found that the intermediate function (SUM, COUNT) cannot be applied to OLAP cubes, and that using it leads to erroneous conclusions and decisions. The objective of this study is to identify some problems in applying the aggregate function AVG to OLAP cubes, and to design a process for solving these problems.
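The conventional view that the abstract summarizes, namely that AVG is not distributive while the pair (SUM, COUNT) is, can be illustrated with a small roll-up. The cell names and values below are invented, and the sketch does not reproduce the specific problems the paper reports, which the abstract does not detail.

```python
def merge(a, b):
    """Combine two (SUM, COUNT) pairs; this merge step is distributive."""
    return (a[0] + b[0], a[1] + b[1])

# Hypothetical base data for two cube cells.
cells = {"east": [10, 20, 30], "west": [100]}

# Intermediate (SUM, COUNT) pair per cell.
partials = {name: (sum(vals), len(vals)) for name, vals in cells.items()}

# Roll up to the total by merging pairs, then derive AVG at the end.
total = (0, 0)
for pair in partials.values():
    total = merge(total, pair)
avg_from_pairs = total[0] / total[1]

# Averaging the per-cell averages instead gives a different number,
# which is why AVG itself cannot be rolled up directly.
avg_of_avgs = sum(s / c for s, c in partials.values()) / len(partials)
```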

The Design and Implementation of Restructuring Tool with Logical Analysis of Object-Oriented Architecture and Design Information Recovery (설계 정보 복구와 객체 지향 구조의 논리적 분석을 통한 재구성 툴 설계 및 구현)

  • Kim, Haeng-Gon;Choe, Ha-Jeong;Byeon, Sang-Yong;Jeong, Yeon-Gi
    • The Transactions of the Korea Information Processing Society / v.3 no.7 / pp.1739-1752 / 1996
  • Software reengineering involves improving the software maintenance process and improving existing systems by applying new technologies and software tools. Software reengineering can help us understand existing systems and discover software components that are common across systems. In this paper, we discuss program analysis and an environment to assist reengineering. Program analysis takes an existing program as input and generates information about its structured part and its object-oriented part. This information is used for restructuring by extracting code through a reengineering methodology. The restructuring information, together with the object-oriented architecture, is mapped to Prolog form for querying by using direct relations and summary relations.

Distributed Processing System for Aggregate/Analytical Functions on CUBRID Shard Distributed Databases (큐브리드 샤드 분산 데이터베이스에서 집계/분석 함수의 분산 처리 시스템 개발)

  • Won, Jiseop;Kang, Suk;Jo, Sunhwa;Kim, Jinho
    • KIISE Transactions on Computing Practices / v.21 no.8 / pp.537-542 / 2015
  • Database sharding is a technique for storing and querying data by horizontally dividing one logical table across multiple databases. To analyze sharded data with aggregate or analytic functions, a process is required that integrates the partial results from each shard database. In this paper, we introduce the design and implementation of a distributed processing system for aggregation and analysis on the CUBRID Shard distributed database, which is an open source database management system. The implemented system can accelerate analysis over multiple shards of partitioned tables, and it shows efficient aggregation on shard distributed databases compared to stand-alone databases.
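The "integrate partial results from each shard" step can be sketched with in-memory SQLite databases standing in for the shards. This does not use any CUBRID API. The `sales` table, its columns, and the helper names are assumptions made only for this illustration.

```python
import sqlite3

def make_shard(rows):
    """Create one stand-in shard holding a horizontal slice of a `sales` table."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE sales (region TEXT, amount INTEGER)")
    db.executemany("INSERT INTO sales VALUES (?, ?)", rows)
    return db

# Each shard computes grouped partial aggregates locally.
PARTIAL_SQL = (
    "SELECT region, COUNT(*), SUM(amount), MIN(amount), MAX(amount) "
    "FROM sales GROUP BY region"
)

def merged_aggregates(shards):
    """Merge per-shard partial results group by group, then derive AVG."""
    merged = {}
    for db in shards:
        for region, cnt, total, lo, hi in db.execute(PARTIAL_SQL):
            if region in merged:
                c, t, l, h = merged[region]
                merged[region] = (c + cnt, t + total, min(l, lo), max(h, hi))
            else:
                merged[region] = (cnt, total, lo, hi)
    return {
        region: {"count": c, "sum": t, "min": l, "max": h, "avg": t / c}
        for region, (c, t, l, h) in merged.items()
    }

shards = [
    make_shard([("north", 10), ("south", 40)]),
    make_shard([("north", 30), ("north", 50)]),
]
result = merged_aggregates(shards)
```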

Incremental Ensemble Learning for The Combination of Multiple Models of Locally Weighted Regression Using Genetic Algorithm (유전 알고리즘을 이용한 국소가중회귀의 다중모델 결합을 위한 점진적 앙상블 학습)

  • Kim, Sang Hun;Chung, Byung Hee;Lee, Gun Ho
    • KIPS Transactions on Software and Data Engineering / v.7 no.9 / pp.351-360 / 2018
  • The LWR (Locally Weighted Regression) model, traditionally a lazy learning model, obtains a prediction for a given input variable, the query point; it is a kind of regression equation over a short interval, obtained by learning that gives higher weights to data closer to the query point. We study an incremental ensemble learning approach for LWR, a form of lazy learning and memory-based learning. The proposed incremental ensemble learning method sequentially generates LWR models over time and integrates them using a genetic algorithm to obtain a solution at a specific query point. A weakness of existing LWR models is that multiple LWR models can be generated depending on the indicator function and the data sample selection, and the quality of the predictions can vary depending on the model. However, no research has been conducted on the problem of selecting or combining multiple LWR models. In this study, after generating the initial LWR model according to the indicator function and the sample data set, we iterate an evolutionary learning process to obtain a proper indicator function and assess the LWR models applied to the other sample data sets to overcome data set bias. We adopt an eager learning method to generate and store LWR models gradually as data is generated for all sections. To obtain a prediction at a specific point in time, an LWR model is generated from newly generated data within a predetermined interval and then combined with the existing LWR models in a section using a genetic algorithm. The proposed method shows better results than combining multiple LWR models by the simple average method. The results of this study are compared with predictions from multiple regression analysis on real data such as the hourly traffic volume in a specific area and the hourly sales of a highway rest area.
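A locally weighted regression prediction at a single query point can be sketched in one dimension as a weighted straight-line fit. The Gaussian kernel, the `bandwidth` parameter, and the fixed combination weights below are assumptions chosen for illustration. In the paper the combination is found by a genetic algorithm, which is not implemented here.

```python
import math

def lwr_predict(xq, xs, ys, bandwidth=1.0):
    """Predict y at query point xq with a locally weighted straight-line fit."""
    # Gaussian kernel: samples closer to the query point get higher weight.
    w = [math.exp(-((x - xq) ** 2) / (2 * bandwidth ** 2)) for x in xs]
    sw = sum(w)
    mx = sum(wi * x for wi, x in zip(w, xs)) / sw
    my = sum(wi * y for wi, y in zip(w, ys)) / sw
    sxx = sum(wi * (x - mx) ** 2 for wi, x in zip(w, xs))
    sxy = sum(wi * (x - mx) * (y - my) for wi, x, y in zip(w, xs, ys))
    slope = sxy / sxx if sxx > 0 else 0.0
    return my + slope * (xq - mx)

def ensemble_predict(xq, xs, ys, bandwidths, weights):
    """Combine several LWR models with given weights (placeholder for a GA search)."""
    preds = [lwr_predict(xq, xs, ys, b) for b in bandwidths]
    return sum(wt * p for wt, p in zip(weights, preds)) / sum(weights)

# Toy data lying exactly on y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2 * x + 1 for x in xs]
single = lwr_predict(2.5, xs, ys, bandwidth=0.5)
combined = ensemble_predict(2.5, xs, ys, bandwidths=[0.5, 1.0, 2.0], weights=[0.5, 0.3, 0.2])
```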

Factors Clustering Approach to Parametric Cost Estimates And OLAP Driver

  • JaeHo, Cho;BoSik, Son;JaeYoul, Chun
    • International conference on construction engineering and project management / 2009.05a / pp.707-716 / 2009
  • The role of the cost modeller is to facilitate the design process through the systematic application of cost factors, so as to maintain a sensible and economic relationship between cost, quantity, utility, and appearance, which helps in achieving the client's requirements within an agreed budget. There have been a number of studies on cost estimates in the early design stage that focus on improving accuracy or on impact factors. The related research to date shows that cost estimates are undertaken progressively throughout the design stage and make use of the information that is available at each phase. In addition, cost estimates in the early design stage must analyze information under various preconditions before a more developed design is reached, because a design can be modified and changed at any point in the process depending on the client's requirements. Parametric cost estimating models have been adopted to support decision making in a changeable environment in the early design stage. These models use similar instances or patterns of historical cases, composed of project information, geographic design features, data relevant to quantity or cost, and so on. The OLAP technique analyzes subject data from multi-dimensional points of view; it supports querying, analysis, and comparison of the required information through diverse queries. OLAP's data structure matches well with a multiview-analysis framework. Accordingly, this study implements a multi-dimensional information system, using OLAP technology, for case-based quantity data related to design information, and then analyzes the impact factors of quantity by design criteria or parameters of the same meaning. On the basis of the factors examined above, this study will generate rules on quantity measures and produce resemblance classes using data mining clustering. This kind of knowledge base consists of a set of classified data as group patterns, which will provide an appropriate basis for the parametric cost estimating method.

Morphology and Molecular Phylogeny of Raillietina spp. (Cestoda: Cyclophyllidea: Davaineidae) from Domestic Chickens in Thailand

  • Butboonchoo, Preeyaporn;Wongsawad, Chalobol;Rojanapaibul, Amnat;Chai, Jong-Yil
    • Parasites, Hosts and Diseases / v.54 no.6 / pp.777-786 / 2016
  • Raillietina species are prevalent in domestic chickens (Gallus gallus domesticus) in Phayao province, northern Thailand. Their infection may cause disease and death, which affects the public health and economic situation in chicken farms. The identification of Raillietina has been based on morphology and molecular analysis. In this study, morphological observations using light (LM) and scanning electron microscopies (SEM) coupled with molecular analysis of the internal transcribed spacer 2 (ITS2) region and the nicotinamide adenine dinucleotide dehydrogenase subunit 1 (ND1) gene were employed for precise identification and phylogenetic relationship studies of Raillietina spp. Four Raillietina species, including R. echinobothrida, R. tetragona, R. cesticillus, and Raillietina sp., were recovered in domestic chickens from 4 districts in Phayao province, Thailand. LM and SEM observations revealed differences in the morphology of the scolex, position of the genital pore, number of eggs per egg capsule, and rostellar opening surface structures in all 4 species. Phylogenetic relationships were found among the phylogenetic trees obtained by the maximum likelihood and distance-based neighbor-joining methods. ITS2 and ND1 sequence data recorded from Raillietina sp. appeared to be monophyletic. The query sequences of R. echinobothrida, R. tetragona, R. cesticillus, and Raillietina sp. were separated according to the different morphological characters. This study confirmed that morphological studies combined with molecular analyses can differentiate related species within the genus Raillietina in Thailand.