• Title/Summary/Keyword: data partitioning

Hybrid multiple component neural network design and learning by efficient pattern partitioning method (효과적인 패턴분할 방법에 의한 하이브리드 다중 컴포넌트 신경망 설계 및 학습)

  • 박찬호;이현수
    • Journal of the Korean Institute of Telematics and Electronics C / v.34C no.7 / pp.70-81 / 1997
  • In this paper, we propose HMCNN (hybrid multiple component neural networks), which enhances the performance of MCNN by adopting a new pattern partitioning algorithm that can cluster many input patterns efficiently. The added neural network follows a learning procedure similar to that of a Kohonen network, but it dynamically determines its number of output neurons using algorithms that decide, in a self-organized manner, the number of clusters and the patterns in each cluster. The proposed network can be applied effectively to problems with large data sets as well as large network sizes. As a result, the proposed pattern partitioning network improves performance and addresses weaknesses of MCNN such as generalization capability. In addition, parallel learning makes it faster than other supervised learning networks.

Branch-and-bound method for solving n-ary vertical partitioning problems in physical design of database (데이타베이스의 물리적 설계에서 분지한계법을 이용한 n-ary 수직분할문제)

  • Yoon, Byung-Ik;Kim, Jae-Yern
    • Journal of Korean Institute of Industrial Engineers / v.22 no.4 / pp.567-578 / 1996
  • In relational databases, the number of disk accesses depends on the amount of data transferred from disk to main memory to process transactions. N-ary vertical partitioning of a relation can often reduce the number of disk accesses, since not every transaction requires all attributes of a tuple. In this paper, a 0-1 integer programming model of the n-ary vertical partitioning problem that minimizes the number of disk accesses is formulated, and a branch-and-bound method is used to solve it. A preprocessing procedure that reduces the number of variables is presented. The algorithm is illustrated with numerical examples and is shown to be computationally efficient. Numerical experiments reveal that the proposed method is more effective in reducing access costs than the existing algorithms.
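
The trade-off behind vertical partitioning can be illustrated with a minimal Python sketch. It is not the paper's 0-1 integer program or its branch-and-bound procedure: it assumes a simplified cost model in which a transaction reads every fragment that holds at least one attribute it needs, and each fragment read costs the fragment's total attribute width plus a fixed per-fragment overhead (for example, a replicated key). The names `set_partitions`, `access_cost`, `best_partition`, and the sample widths and workload are all hypothetical. The search is brute force over all set partitions, so it is only feasible for a handful of attributes.

```python
from typing import Dict, Iterator, List, Set, Tuple

Workload = List[Tuple[Set[str], int]]  # (attributes needed, frequency)


def set_partitions(items: List[str]) -> Iterator[List[List[str]]]:
    """Yield every way to split `items` into non-empty groups."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for partial in set_partitions(rest):
        # Put `first` into each existing group in turn...
        for i in range(len(partial)):
            yield partial[:i] + [[first] + partial[i]] + partial[i + 1:]
        # ...or give it a group of its own.
        yield [[first]] + partial


def access_cost(partition: List[List[str]], width: Dict[str, int],
                workload: Workload, overhead: int) -> int:
    """Assumed cost model: data volume read, weighted by frequency."""
    total = 0
    for needed, freq in workload:
        for fragment in partition:
            if needed & set(fragment):  # fragment holds a needed attribute
                total += freq * (overhead + sum(width[a] for a in fragment))
    return total


def best_partition(width: Dict[str, int], workload: Workload,
                   overhead: int) -> Tuple[int, List[List[str]]]:
    """Exhaustively pick the cheapest partition under the assumed model."""
    candidates = set_partitions(sorted(width))
    return min(((access_cost(p, width, workload, overhead), p)
                for p in candidates), key=lambda pair: pair[0])


# Hypothetical relation: two narrow, frequently read attributes and one
# wide, rarely read attribute.
width = {"a": 4, "b": 4, "c": 100}
workload = [({"a", "b"}, 10), ({"c"}, 1)]
cost, partition = best_partition(width, workload, overhead=8)
print(cost, partition)  # cost is 268, with fragments {a, b} and {c}
```

Under these assumed numbers the unpartitioned relation costs 1276, so separating the wide attribute pays off even with the per-fragment overhead. A branch-and-bound search would prune most of the candidates that this sketch enumerates.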

AN INTERFERENCE FRINGE REMOVAL METHOD BASED ON MULTI-SCALE DECOMPOSITION AND ADAPTIVE PARTITIONING FOR NVST IMAGES

  • Li, Yongchun;Zheng, Sheng;Huang, Yao;Liu, Dejian
    • Journal of The Korean Astronomical Society / v.52 no.2 / pp.49-55 / 2019
  • The New Vacuum Solar Telescope (NVST) is the largest solar telescope in China. When using CCDs for imaging, equal-thickness fringes caused by thin-film interference can occur. Such fringes reduce the quality of NVST data but cannot be removed using standard flat fielding. In this paper, a correction method based on multi-scale decomposition and adaptive partitioning is proposed. The original image is decomposed into several sub-scales by multi-scale decomposition. The region containing fringes is found and divided by an adaptive partitioning method. The interference fringes are then filtered by a frequency-domain Gaussian filter on every partitioned image. Our analysis shows that this method can effectively remove the interference fringes from a solar image while preserving useful information.

An efficient storing method of multiple streams based on fixed blocks in disk partitions (디스크 파티션내 고정 블록에 기반한 다중 스트림의 효율적 저장 방식)

  • 최성욱;박승규;최덕규
    • The Journal of Korean Institute of Communications and Information Sciences / v.22 no.9 / pp.2080-2089 / 1997
  • Recent advances in computing technology have made multimedia processing widely available. Conventional storage systems do not meet the requirements of multimedia data, and several approaches have been suggested to improve disk storage methods for such data. Bocheck proposed a disk partitioning technique for multiple streams, assuming that all streams have the same retrieval interval and the same amount of data per access. While Bocheck's technique works well for streams with the same period, it does not consider continuous media streams with different periods. This paper proposes a new partitioning technique in which a fixed number of blocks is assigned to streams with different retrieval periods. The analysis shows that this problem is equivalent to scheduling the streams into a given sequence. A simulation was performed to compare the proposed m-sequence merge method with the conventional Scan-EDF and Partitioning methods.

Implementation and Performance Evaluation of Parallel Programming Translator for High Performance Fortran (High Performance Fortran 병렬 프로그래밍 변환기의 구현 및 성능 평가)

  • Kim, Jung-Gwon;Hong, Man-Pyo;Kim, Dong-Gyu
    • The Transactions of the Korea Information Processing Society / v.6 no.4 / pp.901-915 / 1999
  • Parallel computers are known for excellent performance per cost while also offering scalability and high performance. However, parallel machines have enjoyed limited success because of the difficulty of parallel programming and the lack of portability between machines. Recently, researchers have sought to develop data parallel languages that provide machine-independent programming systems. A data parallel language such as High Performance Fortran provides a basis for writing a parallel program over a global name space by partitioning data and computation and generating message-passing functions. In this paper, we describe the Parallel Programming Translator (PPTran), a source-to-source data parallel compiler that generates an MPI SPMD parallel program from an HPF input program in four phases: data dependence analysis, data partitioning, computation partitioning, and code generation with explicit message passing. We also verify the performance of PPTran.

Global Optimization of Clusters in Gene Expression Data of DNA Microarrays by Deterministic Annealing

  • Lee, Kwon Moo;Chung, Tae Su;Kim, Ju Han
    • Genomics & Informatics / v.1 no.1 / pp.20-24 / 2003
  • The analysis of DNA microarray data is one of the most important tasks in functional genomics research. The matrix representation of microarray data and its successive 'optimal' incisional hyperplanes provide a useful platform for developing optimization algorithms that determine the optimal partitioning of a pairwise proximity matrix representing a completely connected, weighted graph. We developed a Deterministic Annealing (DA) approach to determine the successive optimal binary partitioning. The DA algorithm demonstrated good performance, with the ability to find the 'globally optimal' binary partitions. In addition, objects that have not been clustered at a small nonzero temperature are considered very sensitive to even small randomness and can be used to estimate the reliability of the clustering.

Performance of Distributed Database System built on Multicore Systems

  • Kim, Kangseok
    • Journal of Internet Computing and Services / v.18 no.6 / pp.47-53 / 2017
  • Recently, huge datasets have been generated rapidly in a variety of fields, creating an urgent need for technologies that process them efficiently and effectively. Partitioning a huge dataset effectively and reducing the overhead of processing the partitioned data have therefore become critical factors for scalability and performance in distributed database systems. In our work, we used multicore servers to provide scalable service in our distributed system. Partitioning a database over multicore servers emerged from the need for a new architectural design of distributed database systems, driven by scalability and performance concerns in today's data deluge. The system allows uniform access through a web service interface to concurrently distributed databases over multicore servers, using an SQMD (Single Query Multiple Database) mechanism based on the publish/subscribe paradigm. We present performance results for the distributed database system built on multicore servers, a task that is time-intensive with traditional architectures. We also discuss future work.

A Comparison of Clustering Algorithm in Data Mining

  • Lee, Yung-Seop;An, Mi-Young
    • Journal of the Korean Data and Information Science Society / v.14 no.4 / pp.725-736 / 2003
  • To provide the information needed to make a decision, it is important to know the relationships or patterns among variables in a database. Grouping objects that have similar characteristics or patterns is called cluster analysis, one of the data mining techniques. This study compares several partitioning clustering algorithms based on the statistical distance or total variance within each cluster.
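
As one concrete instance of a partitioning clustering algorithm scored by within-cluster variance, here is a minimal k-means sketch in plain Python. The abstract does not say which algorithms the study compares, so k-means is an assumption chosen for illustration; the function names, the naive "first k points" initialization, and the sample data are likewise hypothetical.

```python
from typing import List, Optional, Sequence, Tuple

Point = Tuple[float, ...]


def sq_dist(p: Point, q: Point) -> float:
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def centroid(points: Sequence[Point]) -> Point:
    """Coordinate-wise mean of a non-empty group of points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def kmeans(points: Sequence[Point], k: int,
           max_iters: int = 100) -> Tuple[List[int], List[Point]]:
    """Lloyd's algorithm with a naive deterministic start (assumption)."""
    centers: List[Point] = list(points[:k])
    labels: Optional[List[int]] = None
    for _ in range(max_iters):
        # Assignment step: each point joins its nearest center.
        new_labels = [min(range(k), key=lambda j: sq_dist(p, centers[j]))
                      for p in points]
        if new_labels == labels:  # assignments stable, so converged
            break
        labels = new_labels
        # Update step: move each center to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centers[j] = centroid(members)
    return labels or [], centers


def within_ss(points: Sequence[Point], labels: List[int],
              centers: List[Point]) -> float:
    """Total within-cluster sum of squared distances (lower is tighter)."""
    return sum(sq_dist(p, centers[lab]) for p, lab in zip(points, labels))


data = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
        (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
labels, centers = kmeans(data, k=2)
print(labels)  # [0, 0, 0, 1, 1, 1]
print(within_ss(data, labels, centers))
```

Comparing algorithms on a criterion such as `within_ss` only makes sense for a fixed number of clusters, since the value always shrinks as k grows.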

Development of Parsimonious Semi-Distributed Hydrologic Partitioning Model Based on Soil Moisture Storages (토양수분 저류 기반의 간결한 준분포형 수문분할모형 개발)

  • Choi, Jeonghyeon;Kim, Ryoungeun;Kim, Sangdan
    • Journal of Korean Society on Water Environment / v.36 no.3 / pp.229-244 / 2020
  • Hydrologic models, as a useful tool for understanding the hydrologic phenomena in the watershed, have become more complex with the increase of computer performance. The hydrologic model, with complex configurations and powerful performance, facilitates a broader understanding of the effects of climate and soil in hydrologic partitioning. However, the more complex the model is, the more effort and time is required to drive the model, and the more parameters it uses, the less accessible to the user and less applicable to the ungauged watershed. Rather, a parsimonious hydrologic model may be effective in hydrologic modeling of the ungauged watershed. Thus, a semi-distributed hydrologic partitioning model was developed with minimal composition and number of parameters to improve applicability. In this study, the validity and performance of the proposed model were confirmed by applying it to the Namgang Dam, Andong Dam, Hapcheon Dam, and Milyang Dam watersheds among the Nakdong River watersheds. From the results of the application, it was confirmed that despite the simple model structure, the hydrologic partitioning process of the watershed can be modeled relatively well through three vertical layers comprising the surface layer, the soil layer, and the aquifer. Additionally, discussions were conducted on antecedent soil moisture conditions widely applied to stormwater estimation using the soil moisture data simulated by the proposed model.

Performance Improvement of Declustering Algorithm by Efficient Grid-Partitioning Multi-Dimensional Space (다차원 공간의 효율적인 그리드 분할을 통한 디클러스터링 알고리즘 성능향상 기법)

  • Kim, Hak-Cheol
    • Journal of Korea Spatial Information System Society / v.12 no.1 / pp.37-48 / 2010
  • In this paper, we analyze the shortcomings of previous declustering methods for high-dimensional space, which are based on grid-like partitioning and a mapping function from a cell to a disk number, and propose a solution. The problems arise because the number of splits is small (for the most part, binary partitioning is sufficient) and the side length of a range query is quite large even when its selectivity is small. To solve this problem, we propose a mathematical model to estimate the performance of a grid-like partitioning method. With the proposed estimation model, we can choose a good grid-like partitioning among the possible schemes, which improves overall declustering performance. Several experimental results show that we can improve the performance of a previous declustering method by up to 2.7 times.
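
The idea of a grid cell to disk mapping can be sketched in Python. The abstract does not name the mapping function it studies, so this example swaps in the well-known Disk Modulo scheme, which sends cell (i1, ..., id) to disk (i1 + ... + id) mod M. It is not the paper's estimation model. The function names and the measure used here (the largest number of cells any single disk must serve for a range query) are illustrative assumptions.

```python
from collections import Counter
from itertools import product
from typing import Sequence


def disk_modulo(cell: Sequence[int], num_disks: int) -> int:
    """Disk Modulo mapping: sum of the cell's grid coordinates mod M."""
    return sum(cell) % num_disks


def query_load(lo: Sequence[int], hi: Sequence[int],
               num_disks: int) -> Counter:
    """Cells per disk for a range query over cells lo..hi (inclusive)."""
    ranges = [range(a, b + 1) for a, b in zip(lo, hi)]
    return Counter(disk_modulo(cell, num_disks) for cell in product(*ranges))


def response_time(lo: Sequence[int], hi: Sequence[int],
                  num_disks: int) -> int:
    """Parallel cost: the busiest disk bounds the query's response time."""
    return max(query_load(lo, hi, num_disks).values())


# A 4x4 block of cells spreads evenly over 4 disks (16 cells / 4 disks).
print(response_time((0, 0), (3, 3), num_disks=4))  # 4

# A 2x2 block does not: cells (0,1) and (1,0) land on the same disk,
# so the busiest disk serves 2 cells although 4 disks are available.
print(response_time((0, 0), (1, 1), num_disks=4))  # 2
```

The second query shows the kind of loss a performance model for a grid partitioning would need to capture: the ideal response time is 1 cell per disk, but the mapping delivers 2.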