• Title/Summary/Keyword: Data partitioning

An Iterative Algorithm for the Bottom Up Computation of the Data Cube using MapReduce (맵리듀스를 이용한 데이터 큐브의 상향식 계산을 위한 반복적 알고리즘)

  • Lee, Suan;Jo, Sunhwa;Kim, Jinho
    • Journal of Information Technology and Architecture / v.9 no.4 / pp.455-464 / 2012
  • Due to the recent data explosion, methods that can meet the requirements of large-scale data analysis have been actively studied. This paper proposes the MRIterativeBUC algorithm, which enables efficient computation of large data cubes through distributed parallel processing with the MapReduce framework. MRIterativeBUC is designed to run the BUC method efficiently as an iterative operation on MapReduce, and it overcomes the storage-size and processing-capacity limitations that arise in large data cube computation. It borrows the idea of the iceberg cube, which computes only the portions of the cube that are of interest to analysts, and distributes the cube computation in parallel by partitioning and sorting. As a result, it reduces the amount of data emitted, which in turn reduces network overload, the processing load on each node, and eventually the overall cube computation cost. The bottom-up cube computation and the iterative MapReduce algorithm proposed in this paper can be extended in various ways and applied to many applications.
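
The bottom-up, iceberg-style computation described in this abstract can be illustrated with a minimal single-machine sketch. This is not the MRIterativeBUC algorithm itself (the abstract does not give its MapReduce details); it only shows the general BUC idea of partitioning rows on one dimension at a time and pruning partitions below a minimum support. The function name `buc`, the COUNT aggregate, and the `min_support` parameter are assumptions made for illustration.

```python
from collections import defaultdict

def buc(rows, dims, min_support, start=0, cell=None, out=None):
    """Bottom-up cube sketch: COUNT per group-by cell, with iceberg pruning.

    rows        -- list of dicts, one per fact row (hypothetical input format)
    dims        -- ordered list of dimension names
    min_support -- partitions with fewer rows than this are not expanded
    """
    if cell is None:
        cell = {}
    if out is None:
        out = {}
    # Record the aggregate (a simple COUNT) for the current cell.
    out[tuple(sorted(cell.items()))] = len(rows)
    # Expand the cell by each remaining dimension.
    for d in range(start, len(dims)):
        partitions = defaultdict(list)
        for row in rows:
            partitions[row[dims[d]]].append(row)
        for value, part in partitions.items():
            # Iceberg pruning: a partition below min_support cannot contain
            # any finer cell that meets min_support, so skip it entirely.
            if len(part) >= min_support:
                buc(part, dims, min_support, d + 1,
                    {**cell, dims[d]: value}, out)
    return out

rows = [
    {"city": "A", "item": "x"},
    {"city": "A", "item": "x"},
    {"city": "A", "item": "y"},
    {"city": "B", "item": "x"},
]
cube = buc(rows, ["city", "item"], min_support=2)
```

In this sketch the pruning step is what keeps the output small: cells such as `city=B` never reach the result, so nothing derived from them is computed or emitted.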

Numerical Study on the Drag of a Car Model under Road Condition (주행조건에서의 자동차 모델 항력에 대한 수치해석적 연구)

  • Kim, Beom-Jun;Kang, Sung-Woo;Choi, Hyoung-gwon;Yoo, Jung-Yul
    • Transactions of the Korean Society of Mechanical Engineers B / v.27 no.8 / pp.1182-1190 / 2003
  • A parallelized FEM code based on the domain decomposition method has recently been developed for large-scale computational fluid dynamics. A 4-step splitting finite element algorithm is adopted for unsteady flow computation of the incompressible Navier-Stokes equations, and the Smagorinsky LES model is chosen for turbulent flow computation. The METIS and MPI libraries are used for domain partitioning and for data communication between processors, respectively. The Tiburon model of Hyundai Motor Company is chosen as the computational model at $Re = 7.5 \times 10^{5}$, based on the car height. The calculation is carried out under both the wind-tunnel condition and the road condition using the IBM SP parallel architecture at the KISTI Super Computing Center. Compared with existing experimental data, both the velocity and pressure fields are predicted reasonably well, and the drag coefficient is in good agreement. Furthermore, it is confirmed that the drag under the road condition is smaller than that under the wind-tunnel condition.

A New Method to Retrieve Sensible Heat and Latent Heat Fluxes from the Remote Sensing Data

  • Liou Yuei-An;Chen Yi-Ying;Chien Tzu-Chieh;Chang Tzu-Yin
    • Proceedings of the KSRS Conference / 2005.10a / pp.415-417 / 2005
  • To retrieve the latent and sensible heat fluxes, this paper uses high-resolution airborne imagery with visible, near-infrared, and thermal infrared bands together with ground-based meteorological measurements. The retrieval scheme is based on the surface energy budget balance and the momentum equations. It relies on three basic surface parameters: surface albedo $(\alpha)$, the normalized difference vegetation index (NDVI), and the surface kinetic temperature ($T_0$). The LOWTRAN 7 code is used to correct for atmospheric effects. The imagery was acquired on 28 April and 5 May 2003. From the scatter plot of the data set, we identified the extreme dry and wet pixels and used them to fit the dry-controlled and wet-controlled lines, respectively. The sensible and latent heat fluxes are then derived through a partitioning factor A. The retrieved latent and sensible heat fluxes are compared with in situ measurements, including eddy correlation and porometer measurements. The fluxes retrieved by our scheme match the measurements better than those derived from the S-SEBI model.

SOC Verification Based on WGL

  • Du, Zhen-Jun;Li, Min
    • Journal of Korea Multimedia Society / v.9 no.12 / pp.1607-1616 / 2006
  • The growing market for multimedia and digital signal processing requires SoCs with significant data-path portions. However, the common verification models are not suitable for such SoCs. A novel model, WGL (Weighted Generalized List), is proposed. It is based on the general-list decomposition of polynomials, with three different weights and manipulation rules introduced to achieve node sharing and canonicity. Timing parameters and operations on them are also considered. Examples show that the word-level WGL is the only model that represents the common word-level functions linearly, and that the bit-level WGL is especially suitable for arithmetic-intensive circuits. The model is proved to be uniform and efficient for both bit-level and word-level functions. Based on the WGL model, a backward-construction logic-verification approach is then presented, which reduces the time and space complexity for multipliers to polynomial complexity (time complexity less than $O(n^{3.6})$ and space complexity less than $O(n^{1.5})$) without hierarchical partitioning. Finally, a construction methodology for word-level polynomials is presented in order to implement complex high-level verification; it combines order computation and coefficient solving and adopts an efficient backward approach. The construction complexity is much lower than that of existing methods: for example, the construction time for multipliers grows with a power of less than 1.6 in the size of the input word, without increasing the maximal space required. The WGL model and the WGL-based verification methods show both theoretical and practical significance in SoC design.

A Placement Policy improving Retrieval Efficiency of video streams in Clustered VOD Servers (클러스터드 주문형 비디오 서버에서 비디오 스트림의 검색효율을 높이는 배치정책)

  • 안유정;원유헌
    • The Journal of Korean Institute of Communications and Information Sciences / v.24 no.9B / pp.1652-1660 / 1999
  • One of the most important goals of VOD servers is to deliver the requested services to as many clients as possible. Although several policies can help provide service efficiently and rapidly, efficient placement of data at storage time directly improves retrieval efficiency. In this paper, we propose an efficient placement policy for encoded video data stored in clustered VOD servers. The proposed policy partitions a large disk array into smaller disk groups, each consisting of a few disks with similar performance, especially disk I/O bandwidth. In the final section, we compare the proposed placement policy with conventional policies and show the performance improvement obtained with the proposed policy.
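
The grouping step in this abstract (splitting a disk array into small groups of disks with similar I/O bandwidth) can be sketched as follows. The abstract does not state how the groups are formed, so the sort-then-chunk rule, the function name `group_disks`, and the `group_size` parameter are all assumptions made for illustration, not the authors' policy.

```python
def group_disks(bandwidths, group_size):
    """Partition disks into groups of similar I/O bandwidth.

    bandwidths -- list of per-disk I/O bandwidths (hypothetical units, e.g. MB/s)
    group_size -- number of disks per group (the last group may be smaller)

    Returns a list of groups, each a list of disk indices.
    """
    # Assumed rule: sort disks by bandwidth so neighbours are similar,
    # then cut the sorted order into consecutive chunks.
    order = sorted(range(len(bandwidths)), key=lambda i: bandwidths[i])
    return [order[i:i + group_size] for i in range(0, len(order), group_size)]

groups = group_disks([40, 10, 42, 12, 41, 11], group_size=3)
```

With the sample bandwidths above, the three slow disks end up in one group and the three fast disks in another, so a video striped within a group is not held back by a much slower disk.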

Mesh Decimation for Polygon Rendering Based Real-Time 3-Axis NC Milling Simulation (실시간 3축 NC 밀링 시뮬레이션을 위한 메쉬 간략화 방법)

  • Joo, S.W.;Lee, S.H.;Park, K.H.
    • Korean Journal of Computational Design and Engineering / v.5 no.4 / pp.347-358 / 2000
  • The view dependency of typical spatial-partitioning-based NC simulation methods is overcome by the polygon rendering technique, which generates polygons to represent the workpiece and thus enables dynamic viewing transformations without reconstructing the entire data structure. However, the polygon rendering technique still has difficulty achieving real-time simulation because of the unsatisfactory performance of current graphics devices. Therefore, it is necessary to develop a mesh decimation method that enables rapid rendering without loss of display quality. In this paper, we propose a new mesh decimation algorithm for a workpiece whose shape varies dynamically. In this algorithm, the Z-map data for a given workpiece is divided into several regions, and a triangular mesh is first constructed for each region. Then, if any region is cut by the tool, its mesh is regenerated and decimated again. Since mesh decimation is confined to a few regions, the reduced set of polygons for rendering can be obtained rapidly. Our method enables polygon-rendering-based NC simulation to be applied to computers equipped with a wider range of graphics cards.

A Clustering-Based Fault Detection Method for Steam Boiler Tube in Thermal Power Plant

  • Yu, Jungwon;Jang, Jaeyel;Yoo, Jaeyeong;Park, June Ho;Kim, Sungshin
    • Journal of Electrical Engineering and Technology / v.11 no.4 / pp.848-859 / 2016
  • System failures in thermal power plants (TPPs) can lead to serious losses because the equipment is operated under very high pressure and temperature. Therefore, it is indispensable for alarm systems to inform field workers in advance of any abnormal operating conditions in the equipment. In this paper, we propose a clustering-based fault detection method for steam boiler tubes in TPPs. For data clustering, the k-means algorithm is employed, and the number of clusters is systematically determined by the slope statistic. The clustering-based method assumes that normal data samples lie close to the cluster centers and abnormal samples lie far from them. After partitioning training samples collected from normal target systems, fault scores (FSs) are assigned to unseen samples according to the distances between the samples and their closest cluster centroids. Alarm signals are generated if the FSs exceed predefined threshold values. The validity of the exponentially weighted moving average for reducing false alarms is also investigated. To verify its performance, the proposed method is applied to failure cases caused by boiler tube leakage. The experimental results show that the proposed method can successfully detect the abnormal conditions of the target system.
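
The scoring and alarm steps described in this abstract can be sketched in a few lines. The sketch assumes the cluster centroids have already been obtained by k-means on normal training data (the clustering itself and the slope statistic are omitted), uses Euclidean distance as the fault score, and smooths the scores with a standard exponentially weighted moving average. The function names, the smoothing factor `lam`, and the threshold value are hypothetical choices, not values from the paper.

```python
import math

def fault_score(sample, centroids):
    """Fault score: Euclidean distance from a sample to its closest centroid."""
    return min(math.dist(sample, c) for c in centroids)

def ewma(scores, lam=0.2):
    """Exponentially weighted moving average, initialised at the first score."""
    if not scores:
        return []
    smoothed = []
    z = scores[0]
    for s in scores:
        z = lam * s + (1 - lam) * z
        smoothed.append(z)
    return smoothed

def alarms(samples, centroids, threshold, lam=0.2):
    """Raise an alarm wherever the smoothed fault score exceeds the threshold."""
    scores = [fault_score(s, centroids) for s in samples]
    return [z > threshold for z in ewma(scores, lam)]

# Hypothetical centroids learned from normal operating data.
centroids = [(0.0, 0.0), (10.0, 10.0)]
flags = alarms([(0.0, 0.0), (3.0, 4.0), (3.0, 4.0)], centroids,
               threshold=3.0, lam=0.5)
```

Smoothing delays the alarm by one sample in this example: the first deviating sample alone does not cross the threshold, which is how an EWMA suppresses isolated spikes at the cost of slightly later detection.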

A GPU scheduling framework for applications based on dataflow specification (데이터 플로우 기반 응용들을 위한 GPU 스케줄링 프레임워크)

  • Lee, Yongbin;Kim, Sungchan
    • Journal of Korea Multimedia Society / v.17 no.10 / pp.1189-1197 / 2014
  • Recently, general-purpose graphics processing units (GPGPUs) have come into wide use in mobile embedded systems such as smartphones and tablet PCs. Because of the architectural limitations of mobile GPGPUs, only a single program can occupy a GPU at a time, in a non-preemptive way. As a result, it is difficult to meet application performance requirements such as frame rate or response time if the applications running on a GPU are not scheduled properly. To tackle this difficulty, we propose specifying applications with the synchronous data flow model of computation, so that each application is expressed as a graph of nodes and edges. The nodes of the applications are then scheduled onto a GPU, unlike conventional approaches that schedule an application as a whole. This approach allows applications to share a GPU at a finer granularity, at the node (or task) level, and provides several benefits, such as eliminating the need to partition applications manually and improving GPU utilization. Furthermore, any scheduling policy can be applied in response to the characteristics of the applications.

Trace Metal Contamination and Solid Phase Partitioning of Metals in National Roadside Sediments Within the Watershed of Hoidong Reservoir in Pusan City (부산시 회동저수지 집수분지 내 국도도로변 퇴적물의 미량원소 오염 및 존재형태)

  • Lee Pyeong-Koo;Kang Min-Joo;Youm Seung-Jun;Lee In-Gyeong;Park Sung-Won;Lee Wook-Jong
    • Journal of Soil and Groundwater Environment / v.11 no.5 / pp.20-34 / 2006
  • This study was undertaken to assess the anthropogenic impact on trace metal concentrations (Zn, Cu, Pb, Cr, Ni, and Cd) in roadside sediments (N = 70) along National Road No. 7 within the watershed of Hoidong Reservoir in Pusan City, and to estimate the potential mobility of selected metals using sequential extraction. We generally found high concentrations of metals, especially Zn, Cu, and Pb, affected by anthropogenic inputs. Compared with the trace metal concentrations of uncontaminated stream sediments, the arithmetic mean concentrations in roadside sediments were about 7 times higher for Cu, 4 times higher for Zn, 3 times higher for Pb and Cr, and 2 times higher for Ni and As. Speciation data based on sequential extraction indicate that most of the trace metals considered do not occur in significant quantities in the exchangeable fraction, except for Cd and Ni, whose exchangeable fractions are appreciable (average 29.3 and 25.8%, respectively). Other metals such as Zn (51.4%) and Pb (45.2%) are preferentially bound to the reducible fraction and can therefore potentially be released by a pH decrease and/or a redox change. Copper is mainly found in the organic fraction, while Cd is highest in the exchangeable fraction, and Cr and Ni in the residual fraction. Considering the proportion of metals bound to the exchangeable and carbonate fractions, the comparative mobility of metals probably decreases in the order Cd>Ni>Pb>Zn>Cr>Cu. Although the total concentration data showed that Zn was typically present at potentially harmful concentration levels, the data on metal partitioning indicated that Cd, Ni, and Pb pose the highest potential hazard for runoff water. Because potential changes in redox state and pH may remobilize the metals bound to carbonates, amorphous oxides, and/or organic matter, and may release and flush them through drain networks into the watershed of Hoidong Reservoir, careful monitoring of environmental conditions appears to be very important.

A Data Gathering Protocol for Multihop Transmission for Large Sensor Networks (대형 센서네트워크에서 멀티홉 전송을 이용한 데이터 수집 프로토콜)

  • Park, Jang-Su;Ahn, Byoung-Chul
    • Journal of KIISE:Information Networking / v.37 no.1 / pp.50-56 / 2010
  • This paper proposes a data gathering method that adopts a mobile sink to prolong the overall operating time of large WSNs. After the network is partitioned into several clusters, a mobile sink visits each cluster and collects data from it. An efficient protocol improves energy efficiency by delivering messages from the mobile sink to the cluster head, and it also reduces the data gathering delay, which is the main disadvantage of a mobile sink. For the scalability of the sensor network, the network architecture should support multihop transmission within the cluster rather than single-hop transmission. A data aggregation process linked to the travelling path is proposed to reduce the energy consumption of intermediate nodes. The experimental results show that the proposed model is more efficient than legacy methods in terms of energy consumption and data gathering time.