• Title/Summary/Keyword: parallel computers

Search results: 141

Enhanced NOW-Sort on a PC Cluster with a Low-Speed Network (저속 네트웍 PC 클러스터상에서 NOW-Sort의 성능향상)

  • Kim, Ji-Hyoung;Kim, Dong-Seung
    • Journal of KIISE:Computer Systems and Theory
    • /
    • v.29 no.10
    • /
    • pp.550-560
    • /
    • 2002
  • External sorting on cluster computers requires not only fast internal sorting but also careful scheduling of disk I/O and interprocessor communication over the network, because the overall execution time reflects all of these jobs, and the share taken by interprocessor communication and disk I/O is significant. In this paper, we improve sorting throughput on a cluster of PCs with a low-speed network by developing a new algorithm that distributes the load evenly among processors and overlaps disk read and write operations with other computation and communication activities during the sort. Experimental results support the effectiveness of the algorithm: it reduces the sort time by 45% compared with the previous NOW-Sort [1], and it scales better as computing nodes are added to the cluster.
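As a rough illustration of the even-load-distribution idea (my own sketch, not the authors' code), the following Python snippet shows sample-based splitter selection, a standard way NOW-Sort-style external sorts balance partition sizes across nodes; the node count and oversampling factor are arbitrary choices for the example.

```python
# Illustrative sketch: sample-based key partitioning so that each node
# receives a roughly equal share of records before sorting locally.
import bisect
import random

def choose_splitters(records, num_nodes, oversample=32):
    """Sample keys and pick num_nodes-1 splitters that cut the key
    range into partitions of roughly equal size."""
    sample = sorted(random.sample(records,
                                  min(len(records), num_nodes * oversample)))
    step = len(sample) // num_nodes
    return [sample[i * step] for i in range(1, num_nodes)]

def partition(records, splitters):
    """Route each record to the node owning its key range."""
    buckets = [[] for _ in range(len(splitters) + 1)]
    for r in records:
        buckets[bisect.bisect_right(splitters, r)].append(r)
    return buckets

if __name__ == "__main__":
    data = [random.randint(0, 1_000_000) for _ in range(100_000)]
    buckets = partition(data, choose_splitters(data, num_nodes=4))
    # Each node would sort its bucket locally and write it to disk; the
    # concatenation of the sorted buckets is then globally sorted.
    print([len(b) for b in buckets])  # roughly equal sizes
```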

AMG-CG method for numerical analysis of high-rise structures on heterogeneous platforms with GPUs

  • Li, Zuohua;Shan, Qingfei;Ning, Jiafei;Li, Yu;Guo, Kaisheng;Teng, Jun
    • Computers and Concrete
    • /
    • v.29 no.2
    • /
    • pp.93-105
    • /
    • 2022
  • The degrees of freedom (DOFs) of high-rise structures increase rapidly due to the need for refined analysis, which calls for a computationally efficient method for the numerical analysis of high-rise structures using the finite element method (FEM). This paper presents an efficient iterative method, an algebraic multigrid preconditioned conjugate gradient method (AMG-CG) with a Jacobi over-relaxation (JOR) smoother, for solving large-scale structural system equations on heterogeneous platforms with graphics processing units (GPUs) as parallel accelerators. Furthermore, an AMG-CG FEM application framework was established for the numerical analysis of high-rise structures. In the proposed method, the coarsening method, the optimal relaxation coefficient of the JOR smoother, the number of smoothing sweeps, and the solution method for the coarsest grid of the AMG preconditioner were investigated via several numerical benchmarks of high-rise structures. The accuracy and efficiency of the proposed FEM application framework were compared against the mature software Abaqus, with speedups of up to 18.4x on an NVIDIA K40C GPU hosted in a workstation. The results demonstrate that the proposed method improves the computational efficiency of solving structural system equations and that the AMG-CG FEM application framework is inherently suitable for the numerical analysis of high-rise structures.
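The following NumPy sketch is a simplification of mine, not the paper's GPU implementation: it shows conjugate gradient preconditioned by a few JOR sweeps, the smoother the abstract names. A full AMG preconditioner would apply such smoothing on a hierarchy of coarsened grids rather than on the fine grid alone.

```python
# Minimal sketch: CG preconditioned by Jacobi over-relaxation (JOR) sweeps.
import numpy as np

def jor_preconditioner(A, r, omega=0.8, sweeps=3):
    """Approximate z = A^{-1} r with JOR: z <- z + omega * D^{-1}(r - A z)."""
    d = np.diag(A)
    z = np.zeros_like(r)
    for _ in range(sweeps):
        z = z + omega * (r - A @ z) / d
    return z

def pcg(A, b, tol=1e-8, max_iter=500):
    x = np.zeros_like(b)
    r = b - A @ x
    z = jor_preconditioner(A, r)
    p = z.copy()
    rz = r @ z
    for _ in range(max_iter):
        Ap = A @ p
        alpha = rz / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        if np.linalg.norm(r) < tol * np.linalg.norm(b):
            break
        z = jor_preconditioner(A, r)
        rz_new = r @ z
        p = z + (rz_new / rz) * p
        rz = rz_new
    return x

if __name__ == "__main__":
    n = 200  # 1-D Poisson stiffness matrix as a stand-in test system
    A = 2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
    b = np.ones(n)
    x = pcg(A, b)
    print(np.linalg.norm(A @ x - b))  # small residual after convergence
```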

Benchmark Results of a Monte Carlo Treatment Planning System (몬테카를로 기반 치료계획시스템의 성능평가)

  • Cho, Byung-Chul
    • Progress in Medical Physics
    • /
    • v.13 no.3
    • /
    • pp.149-155
    • /
    • 2002
  • Recent advances in radiation transport algorithms, computer hardware performance, and parallel computing make the clinical use of Monte Carlo based dose calculations possible. To compare the speed and accuracy of dose calculations between different codes, benchmark tests were proposed at the XIIth ICCR (International Conference on the Use of Computers in Radiation Therapy, Heidelberg, Germany, 2000). A Monte Carlo treatment planning system comprising 28 Intel Pentium CPUs was implemented for routine clinical use. The purpose of this study was to evaluate the performance of our system using these benchmark tests, which comprise three parts: a) speed of photon beam dose calculation inside a given phantom of 30.5 cm × 39.5 cm × 30 cm deep, filled with 5 mm³ voxels, to within 2% statistical uncertainty; b) speed of electron beam dose calculation inside the same phantom; and c) accuracy of photon and electron beam calculations inside a heterogeneous slab phantom, compared with reference EGS4/PRESTA results. In the speed benchmarks, it took 5.5 minutes to reach less than 2% statistical uncertainty for 18 MV photon beams. Although the net calculation for electron beams was an order of magnitude faster than for photon beams, the overall calculation time was similar because of the overhead of maintaining parallel processing. Since our Monte Carlo code is EGSnrc, an improved version of EGS4, the accuracy tests of our system showed, as expected, very good agreement with the reference data. In conclusion, our Monte Carlo treatment planning system produces clinically meaningful results. Although more efficient codes such as MCDOSE and VMC++ have been developed, BEAMnrc, based on the EGSnrc code system, may be used for routine clinical Monte Carlo treatment planning in conjunction with clustering techniques.
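A hedged sketch of the 2% stopping rule behind the speed benchmark: histories run in batches, and the run stops once the relative standard error of the batch means falls below the target. The score() function below is a placeholder for a real photon or electron transport history (e.g. one simulated by EGSnrc); in the actual system the batches would be distributed over the 28 CPUs.

```python
# Toy sketch: batch Monte Carlo until relative uncertainty < 2%.
import random
import statistics

def score():
    """Placeholder dose score for one particle history (assumption)."""
    return random.expovariate(1.0)

def run_until_uncertainty(target=0.02, batch_size=10_000, max_batches=1000):
    batch_means = []
    for _ in range(max_batches):
        batch_means.append(sum(score() for _ in range(batch_size)) / batch_size)
        if len(batch_means) >= 2:
            mean = statistics.fmean(batch_means)
            sem = statistics.stdev(batch_means) / len(batch_means) ** 0.5
            if sem / mean < target:
                break
    return mean, sem / mean, len(batch_means)

if __name__ == "__main__":
    mean, rel_unc, n = run_until_uncertainty()
    print(f"dose={mean:.4f} rel. uncertainty={rel_unc:.3%} after {n} batches")
```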


Statistical Issues in the Articles Published in the Journal of Veterinary Clinics (한국임상수의학회지에 발표된 논문의 통계분석 검토)

  • Pak, Son-Il;Oh, Tae-Ho
    • Journal of Veterinary Clinics
    • /
    • v.27 no.2
    • /
    • pp.170-174
    • /
    • 2010
  • With the easy availability of statistical software and powerful computers, the application of statistical methods in domestic veterinary journals is increasing. Alongside this benefit, statistical errors are not uncommon even in renowned scientific and medical journals, and such errors may lead to misinterpretation of the data and hence to faulty conclusions. A systematic review of articles published in 8 issues of the Journal of Veterinary Clinics during 2006-2007 was performed to assess statistical methodology and reporting. Ninety-four (72.9%) of the 129 original articles screened included some inferential statistical analysis, most often comparison of 3 or more groups (53, or 56.4%), followed by comparison of 2 independent groups (40, or 42.6%) and the paired t-test (9, or 9.6%). Of the 94 articles in which statistical analysis was done, 62 (66.0%) had at least 1 statistical error. Errors included applying the independent Student's t-test to paired data or vice versa, using t-tests for 3 or more groups, and failing to apply the continuity correction in chi-square tests with small expected frequencies. The common errors in ANOVA were failure to validate the assumptions of the test, inappropriate post-hoc multiple comparisons, and incorrectly assuming independence of data in repeated-measures designs. Reporting errors included failure to state the statistical methods used and failure to state the specific test when more than 1 test was done. An editorial effort, such as publishing statistical guidelines for authors, would be necessary to improve the use of appropriate statistical procedures.
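For concreteness, the snippet below (with made-up data, not the survey's) illustrates three of the flagged errors using SciPy: paired versus independent t-tests, ANOVA for three or more groups, and the continuity correction in a small chi-square table.

```python
# Correct vs. commonly misapplied tests, per the errors the survey flags.
import numpy as np
from scipy import stats

before = np.array([5.1, 6.0, 5.5, 6.3, 5.8, 6.1])
after = np.array([5.6, 6.4, 5.9, 6.8, 6.0, 6.7])

# Paired measurements need a paired test, not an independent one.
print(stats.ttest_rel(before, after))   # correct
print(stats.ttest_ind(before, after))   # common error: ignores pairing

# Three or more groups need ANOVA (plus post-hoc tests), not repeated t-tests.
g1, g2, g3 = np.random.default_rng(0).normal([0, 0.5, 1], 1, (10, 3)).T
print(stats.f_oneway(g1, g2, g3))

# Small expected frequencies: apply Yates' continuity correction.
table = [[8, 2], [3, 7]]
print(stats.chi2_contingency(table, correction=True)[1])  # p-value
```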

A Comprehensive Groundwater Modeling using Multicomponent Multiphase Theory: 1. Development of a Multidimensional Finite Element Model (다중 다상이론을 이용한 통합적 지하수 모델링: 1. 다차원 유한요소 모형의 개발)

  • Joon Hyun Kim
    • Journal of Korea Soil Environment Society
    • /
    • v.1 no.1
    • /
    • pp.89-102
    • /
    • 1996
  • An integrated model is presented to describe underground flow and mass transport using a multicomponent multiphase approach. The comprehensive governing equation is derived by considering mass and force balances of chemical species over four phases (water, oil, air, and soil) in a schematic elementary volume. Compact and systematic notations for the relevant variables and equations are introduced to facilitate the inclusion of complex migration and transformation processes and of variable spatial dimensions. The resulting nonlinear system is solved by a multidimensional finite element code. The developed code, with dynamic array allocation, is sufficiently flexible to run on a wide spectrum of computers, including an IBM ES 9000/900 vector facility, an SP2 cluster machine, Unix workstations, and PCs, for one-, two-, and three-dimensional problems. To reduce computation time and storage requirements, the system equations are decoupled and solved using a banded global matrix solver, with vector and parallel processing on the IBM 9000. To avoid numerical oscillations in nonlinear problems with convection-dominated transport, the techniques of upstream weighting, mass lumping, and element-wise parameter evaluation are applied. The instability and convergence criteria of the nonlinear problems are studied for the one-dimensional analogues of the FEM and FDM. The modeling capability is demonstrated in a simulation of three-dimensional composite multiphase TCE migration. The comprehensive simulation features of the code are presented in a companion paper in this issue for specific groundwater flow and contamination problems.
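As a minimal stand-in for the banded global matrix solver mentioned in the abstract (not the original code), the following shows how a tridiagonal FEM-style system is stored and solved in LAPACK banded form via SciPy, which is what makes the memory footprint proportional to the bandwidth rather than to the full matrix.

```python
# Banded-storage solve: only the diagonals of the stiffness matrix are kept.
import numpy as np
from scipy.linalg import solve_banded

n = 6
# Tridiagonal system in banded form: rows are (super-, main, sub-) diagonal.
ab = np.array([
    [0., -1., -1., -1., -1., -1.],   # superdiagonal (first entry unused)
    [2.,  2.,  2.,  2.,  2.,  2.],   # main diagonal
    [-1., -1., -1., -1., -1., 0.],   # subdiagonal (last entry unused)
])
b = np.ones(n)
x = solve_banded((1, 1), ab, b)      # (lower, upper) bandwidths
print(x)
```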


Development of a C++ Compiler and Programming Environment (C++컴파일러 및 프로그래밍 환경 개발)

  • Jang, Cheon-Hyeon;O, Se-Man
    • The Transactions of the Korea Information Processing Society
    • /
    • v.4 no.3
    • /
    • pp.831-845
    • /
    • 1997
  • In this paper, we propose and develop a compiler and interactive programming environment for C++, one of the most noteworthy object-oriented languages. To develop the C++ compiler, we adopted a front-end/back-end model using the EM virtual machine. In developing the front end, we formalized the C++ grammar with the context-sensitive tokens that the lexical scanner must handle, and designed an AST class library consisting of a hierarchy of AST node classes with well-defined interfaces among them. In developing the back end, we proposed a model with three major components: code optimizer, code generator, and run-time environment. We emphasized a retargetable back end that can be systematically reconfigured to generate code for a variety of distinct target computers. We also developed a tree pattern matching algorithm and implemented a target code generator that produces SPARC code. We further proposed a theory and model for constructing interactive programming environments: we adopt the AST as the internal representation of language features and propose an incremental analysis algorithm and visual diagrams. We also studied unparsing schemes, visual diagrams, and graphical user interfaces for generating interactive environments automatically. The results of our research will be useful for developing compilers and programming environments, and can also be applied to compilers for parallel and distributed environments.
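Tree-pattern-matching code generation of the kind the abstract describes can be sketched with a toy maximal-munch matcher. This illustrates the general technique only, not the authors' SPARC back end; the node kinds, instruction names, and patterns are invented for the example.

```python
# Maximal-munch sketch: match IR trees against instruction patterns,
# preferring larger patterns (add-immediate) over smaller ones.
from dataclasses import dataclass

@dataclass
class Node:
    op: str            # 'const', 'temp', 'add'
    kids: tuple = ()
    value: object = None

count = 0
def new_temp():
    global count
    count += 1
    return f"%r{count}"

def munch(n):
    """Emit pseudo-assembly for the tree rooted at n; return result register."""
    # Larger pattern first: add(x, const) -> one immediate-add instruction.
    if n.op == "add" and n.kids[1].op == "const":
        r, t = munch(n.kids[0]), new_temp()
        print(f"add {r}, {n.kids[1].value}, {t}")
        return t
    if n.op == "add":
        a, b, t = munch(n.kids[0]), munch(n.kids[1]), new_temp()
        print(f"add {a}, {b}, {t}")
        return t
    if n.op == "const":
        t = new_temp()
        print(f"mov {n.value}, {t}")
        return t
    if n.op == "temp":
        return n.value
    raise ValueError(n.op)

# (a + 4) + b  -- the add-immediate pattern covers the inner node.
tree = Node("add", (Node("add", (Node("temp", value="%a"),
                                 Node("const", value=4))),
                    Node("temp", value="%b")))
munch(tree)
```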


AS B-tree: A study on the enhancement of the insertion performance of B-tree on SSD (AS B-트리: SSD를 사용한 B-트리에서 삽입 성능 향상에 관한 연구)

  • Kim, Sung-Ho;Roh, Hong-Chan;Lee, Dae-Wook;Park, Sang-Hyun
    • The KIPS Transactions:PartD
    • /
    • v.18D no.3
    • /
    • pp.157-168
    • /
    • 2011
  • Recently, flash memory has been widely used as the main storage device in mobile devices, and flash SSDs are gaining popularity as a major storage device in laptop and desktop computers, and even in enterprise-level server machines. Unlike HDDs, flash memory cannot perform an overwrite operation unless it is preceded by an erase operation on the same block. To address this, a flash translation layer (FTL) is employed on flash memory: even when a modified data block is overwritten at the same logical address, the FTL writes the updated data block to a physical address different from the previous one and maps the logical address to the new physical address, allowing flash memory to avoid the high block-erase cost. A flash SSD contains an array of NAND flash memory packages, so it can access multiple flash memory packages in parallel. To take advantage of this internal parallelism, it is beneficial for DBMSs to request I/O operations on sequential logical addresses. However, the B-tree, the representative index structure of current relational DBMSs, produces excessive I/O operations in random order when its nodes are updated; the original B-tree is therefore not favorable to SSDs. In this paper, we propose the AS (Always Sequential) B-tree, which, on every update operation, writes the updated node contiguously after the previously written node in the logical address space. In our experiments, the AS B-tree improved insertion performance by 21% over the B-tree.
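A toy sketch of the write path as I read the abstract, under my own simplifications: every node update is appended at the next sequential logical address, and a node-id-to-address map is kept, so the SSD sees one contiguous write stream instead of random in-place overwrites.

```python
# Append-only node store: updates never overwrite, they append sequentially.
class SequentialNodeStore:
    def __init__(self):
        self.log = []          # append-only sequence of node images
        self.addr_of = {}      # node id -> latest logical address

    def write_node(self, node_id, node_bytes):
        addr = len(self.log)   # always the next sequential address
        self.log.append(node_bytes)
        self.addr_of[node_id] = addr
        return addr

    def read_node(self, node_id):
        return self.log[self.addr_of[node_id]]

store = SequentialNodeStore()
store.write_node("root", b"keys v1")
store.write_node("root", b"keys v2")   # update appends, never overwrites
print(store.read_node("root"), store.addr_of["root"])  # b'keys v2' 1
```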

Efficient Neural Network Architecture for Fast Target Detection and Recognition (목표물의 고속 탐지 및 인식을 위한 효율적인 신경망 구조)

  • Weon, Yong-Kwan;Baek, Yong-Chang;Lee, Jeong-Su
    • The Transactions of the Korea Information Processing Society
    • /
    • v.4 no.10
    • /
    • pp.2461-2469
    • /
    • 1997
  • Target detection and recognition problems, in which neural networks are widely used, require translation invariance and real-time processing in addition to the requirements of general pattern recognition problems. This paper presents a novel architecture that meets these requirements and explains an effective methodology for training the network. The proposed neural network is an architectural extension of the shared-weight neural network, composed of a feature extraction stage followed by a pattern recognition stage. The feature extraction stage performs a correlational operation on the input with a weight kernel, so the entire neural network can be considered a nonlinear correlation filter: its output is a correlation plane with peak values at the locations of targets. The architecture is suitable for implementation on parallel or distributed computers, which allows its application to problems requiring real-time processing. A training methodology that overcomes the imbalance between the numbers of targets and non-targets is also introduced. To verify its performance, the proposed network is applied to the detection and recognition of a specific automobile driving around a parking lot. The results show no false alarms and processing fast enough to track a target moving at about 190 km per hour.
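The correlation-filter view can be demonstrated in a few lines. This is an illustration with a random kernel, not the trained network: correlating a scene with a kernel yields a plane whose peak sits at the target location, which is what makes the architecture translation invariant.

```python
# Correlation plane with a peak at the embedded target's location.
import numpy as np
from scipy.signal import correlate2d

rng = np.random.default_rng(1)
kernel = rng.normal(size=(8, 8))          # stands in for learned weights
scene = rng.normal(scale=0.1, size=(64, 64))
scene[20:28, 30:38] += kernel             # embed the target at (20, 30)

plane = correlate2d(scene, kernel, mode="valid")
peak = np.unravel_index(np.argmax(plane), plane.shape)
print(peak)                               # -> (20, 30)
```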


Data Mining Algorithm Based on Fuzzy Decision Tree for Pattern Classification (퍼지 결정트리를 이용한 패턴분류를 위한 데이터 마이닝 알고리즘)

  • Lee, Jung-Geun;Kim, Myeong-Won
    • Journal of KIISE:Software and Applications
    • /
    • v.26 no.11
    • /
    • pp.1314-1323
    • /
    • 1999
  • With the widespread use of computers, it has become easy to generate and collect data, creating a need to acquire useful knowledge from data automatically. In data mining, the acquired knowledge needs to be both accurate and comprehensible. In this paper, we propose an efficient fuzzy rule generation algorithm based on a fuzzy decision tree for data mining. It combines the comprehensibility of rules generated by decision-tree methods such as ID3 and C4.5 with the expressive power of fuzzy sets. In particular, fuzzy rules allow us to effectively classify patterns with non-axis-parallel decision boundaries, which are difficult to handle with attribute-based classification methods. Our algorithm first determines an appropriate set of membership functions for each attribute using histogram analysis. Given this set of membership functions, it then constructs a fuzzy decision tree in a manner similar to ID3 and C4.5. A genetic algorithm is also applied to tune the initial membership functions. We have tested the algorithm on several benchmark data sets, including the IRIS data, the Wisconsin breast cancer data, and the credit screening data. The results show that our method is more efficient in performance and in comprehensibility of rules than other methods, including C4.5.
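A compact sketch of the two building blocks the abstract names, under simplifying assumptions of mine: evenly spaced triangular membership functions (the paper derives them from histogram analysis instead) and a fuzzy-weighted entropy of the kind used to choose split attributes in a fuzzy ID3/C4.5-style tree.

```python
# Triangular memberships plus fuzzy-weighted class entropy.
import numpy as np

def triangular_memberships(values, n_sets=3):
    """Evenly spaced triangular fuzzy sets over the attribute's range."""
    lo, hi = float(np.min(values)), float(np.max(values))
    centers = np.linspace(lo, hi, n_sets)
    width = (hi - lo) / (n_sets - 1)
    # membership of each value in each fuzzy set: shape (len(values), n_sets)
    return np.maximum(0.0, 1.0 - np.abs(values[:, None] - centers) / width)

def fuzzy_entropy(memberships, labels):
    """Entropy of class labels weighted by fuzzy membership degrees."""
    total = 0.0
    for j in range(memberships.shape[1]):
        w = memberships[:, j]
        if w.sum() == 0:
            continue
        p = np.array([w[labels == c].sum() for c in np.unique(labels)]) / w.sum()
        p = p[p > 0]
        total += (w.sum() / memberships.sum()) * -(p * np.log2(p)).sum()
    return total

x = np.array([4.9, 5.1, 6.0, 6.3, 7.0, 7.2])   # e.g. one IRIS attribute
y = np.array([0, 0, 1, 1, 2, 2])
print(fuzzy_entropy(triangular_memberships(x), y))
```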

Horizontal only and horizontal-vertical combined earthquake effects on three R/C frame building structures through linear time-history analysis (LTHA): An implementation to Turkey

  • Selcuk Bas;Mustafa A. Bilgin
    • Computers and Concrete
    • /
    • v.34 no.3
    • /
    • pp.329-346
    • /
    • 2024
  • This study investigates the vertical seismic performance of reinforced concrete (R/C) frame buildings in two building stocks: one designed to the previous Turkish Seismic Code (TSC-2007), which does not consider the vertical earthquake load, and the other designed to the new Turkish Seismic Code (TSCB-2018), which does. To this end, three R/C buildings with heights of 15 m, 24 m, and 33 m are designed separately to TSC-2007 and TSCB-2018 under certain limitations on seismic zone, soil class, structural behavior factor (Rx/Ry), etc. The effects of vertical earthquake motion are identified through linear time-history analyses (LTHA) performed separately for horizontal-only (H) and combined horizontal+vertical (H+V) earthquake motions. The LTHA predicts how vertical earthquake motion affects the response of the designed buildings by comparing the linear response parameters: base shear force, base overturning moment, base axial force, and top-story vertical displacement. Nonlinear time-history analysis (NLTHA) is generally required for energy-dissipative buildings rather than for the design of ordinary buildings; in this study the earthquake records are scaled to keep the buildings in the linear range, so no nonlinear behavior is expected and NLTHA is not considered. Eleven earthquake acceleration records are used, scaled to the design spectrum given in TSCB-2018. The base shear force is found to be unaffected by the combined H+V earthquake load. The base overturning moment results underline that the rigidity of the frame system, in terms of the dimensions of the columns, can be a critical parameter for the influence of vertical earthquake motion. In addition, the building stock designed to TSC-2007 is estimated to show better vertical earthquake performance than that designed to TSCB-2018. The vertical earthquake motion strongly affects the base axial force of the 33 m building rather than of the 15 m and 24 m buildings, so the building height is a particularly important parameter for the base axial force. The percentage changes in the top-story vertical displacement of the buildings designed to both codes increase in parallel with those in the base axial force. To extrapolate more general results, many more buildings would need to be analyzed.
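Reduced to a single degree of freedom, the linear time-history analysis the study relies on can be sketched with the Newmark average-acceleration method. The toy sinusoidal record and SDOF properties below are assumptions for illustration; the paper integrates full 3-D building models under eleven scaled records.

```python
# Newmark-beta (average acceleration) LTHA for a linear SDOF system.
import numpy as np

def newmark_lth(m, c, k, ground_acc, dt, beta=0.25, gamma=0.5):
    """Linear SDOF response to ground acceleration; returns displacements."""
    n = len(ground_acc)
    u, v, a = np.zeros(n), np.zeros(n), np.zeros(n)
    p = -m * ground_acc                      # effective earthquake force
    a[0] = (p[0] - c * v[0] - k * u[0]) / m
    kh = k + gamma / (beta * dt) * c + m / (beta * dt**2)
    for i in range(n - 1):
        dp = (p[i+1] - p[i]
              + (m / (beta * dt) + gamma / beta * c) * v[i]
              + (m / (2 * beta) + dt * (gamma / (2 * beta) - 1) * c) * a[i])
        du = dp / kh
        dv = (gamma / (beta * dt) * du - gamma / beta * v[i]
              + dt * (1 - gamma / (2 * beta)) * a[i])
        u[i+1], v[i+1] = u[i] + du, v[i] + dv
        a[i+1] = (p[i+1] - c * v[i+1] - k * u[i+1]) / m
    return u

dt = 0.01
t = np.arange(0, 10, dt)
record = 0.3 * 9.81 * np.sin(2 * np.pi * 1.5 * t)    # toy ground motion
u = newmark_lth(m=1.0, c=0.05, k=(2 * np.pi)**2, ground_acc=record, dt=dt)
print(float(np.max(np.abs(u))))                      # peak displacement
```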