• Title/Summary/Keyword: 분산 병렬 처리

Search Result 411, Processing Time 0.032 seconds

The Application and Integration of an Improvement Technique for Layers of NETCONF (NETCONF 계층에 대한 개선 기법 적용 및 통합)

  • Lee, YangMin;Lee, JaeKee
    • Journal of KIISE
    • /
    • v.43 no.2
    • /
    • pp.256-268
    • /
    • 2016
  • Modern networks consisting of various heterogeneous equipment are often installed in a distributed manner. Thus the NETCONF standard was established to manage networks centrally and efficiently. In this paper, we present a method that integrates each NETCONF layer into a single system based on the results of previous studies. In the RPC Layer, an asynchronous communication channel and parallel processes are possible using multi-threading. In the Operation Layer, operational efficiency is increased by using a data group with dependencies between the equipment configuration data and by improving the data structure, enabling efficiently processing of XML queries even with multiple managers. The data modeling techniques and grouping methods in the Content Layer are presented in detail for interoperability between the Operation Layer and the Content Layer. Finally, the GUI program was implemented and its implementation is reported. We performed an experiment comparing the improved NETCONF with the standard NETCONF to measure factors, such as query processing ratio, query processing speed, and CPU utilization. The improved NETCONF demonstrated excellent query processing ratio and query processing speed, whereas the standard NETCONF had excellent CPU utilization.

A Workqueue Replication Scheduling Algorithm Using Static Information on Grid Systems (그리드 시스템에서 정적정보를 활용한 작업큐 중복 스케줄링 알고리즘)

  • Kang, Oh-Han;Kang, Sang-Sung;Song, Hee-Heon
    • The KIPS Transactions:PartA
    • /
    • v.16A no.1
    • /
    • pp.9-16
    • /
    • 2009
  • Because Grid system consists of heterogenous computing resources, which are distributed on a wide scale, it is impossible to efficiently execute applications with scheduling algorithms of a conventional parallel system that, in contrast, aim at homogeneous and controllable resources. To suggest an algorithm that can fully reflect the characteristics of a grid system, our research is focused on examining the type of information used in current scheduling algorithms and consequently, deriving factors that could develop algorithms further. The results from the analysis of these algorithms not only show that static information of resources such as capacity or the number of processors can facilitate the scheduling algorithms but also verified a decrease in efficiency in case of utilizing real time load information of resources due to the intrinsic characteristics of a grid system relatively long computing time, and the need for the means to evade unfeasible resources or ones with slow processing time. In this paper, we propose a new algorithm, which is revised to reflect static information in the logic of WQR(Workqueue Replication) algorithms and show that it provides better performance than the one used in the existing method through simulation.

Parallel k-Modes Algorithm for Spark Framework (스파크 프레임워크를 위한 병렬적 k-Modes 알고리즘)

  • Chung, Jaehwa
    • KIPS Transactions on Software and Data Engineering
    • /
    • v.6 no.10
    • /
    • pp.487-492
    • /
    • 2017
  • Clustering is a technique which is used to measure similarities between data in big data analysis and data mining field. Among various clustering methods, k-Modes algorithm is representatively used for categorical data. To increase the performance of iterative-centric tasks such as k-Modes, a distributed and concurrent framework Spark has been received great attention recently because it overcomes the limitation of Hadoop. Spark provides an environment that can process large amount of data in main memory using the concept of abstract objects called RDD. Spark provides Mllib, a dedicated library for machine learning, but Mllib only includes k-means that can process only continuous data, so there is a limitation that categorical data processing is impossible. In this paper, we design RDD for k-Modes algorithm for categorical data clustering in spark environment and implement an algorithm that can operate effectively. Experiments show that the proposed algorithm increases linearly in the spark environment.

Spatial Computation on Spark Using GPGPU (GPGPU를 활용한 스파크 기반 공간 연산)

  • Son, Chanseung;Kim, Daehee;Park, Neungsoo
    • KIPS Transactions on Computer and Communication Systems
    • /
    • v.5 no.8
    • /
    • pp.181-188
    • /
    • 2016
  • Recently, as the amount of spatial information increases, an interest in the study of spatial information processing has been increased. Spatial database systems extended from the traditional relational database systems are difficult to handle large data sets because of the scalability. SpatialHadoop extended from Hadoop system has a low performance, because spatial computations in SpationHadoop require a lot of write operations of intermediate results to the disk, resulting in the performance degradation. In this paper, Spatial Computation Spark(SC-Spark) is proposed, which is an in-memory based distributed processing framework. SC-Spark is extended from Spark in order to efficiently perform the spatial operation for large-scale data. In addition, SC-Spark based on the GPGPU is developed to improve the performance of the SC-Spark. SC-Spark uses the advantage of the Spark holding intermediate results in the memory. And GPGPU-based SC-Spark can perform spatial operations in parallel using a plurality of processing elements of an GPU. To verify the proposed work, experiments on a single AMD system were performed using SC-Spark and GPGPU-based SC-Spark for Point-in-Polygon and spatial join operation. The experimental results showed that the performance of SC-Spark and GPGPU-based SC-Spark were up-to 8 times faster than SpatialHadoop.

Reconfigurable Architecture Design for H.264 Motion Estimation and 3D Graphics Rendering of Mobile Applications (이동통신 단말기를 위한 재구성 가능한 구조의 H.264 인코더의 움직임 추정기와 3차원 그래픽 렌더링 가속기 설계)

  • Park, Jung-Ae;Yoon, Mi-Sun;Shin, Hyun-Chul
    • Journal of KIISE:Computer Systems and Theory
    • /
    • v.34 no.1
    • /
    • pp.10-18
    • /
    • 2007
  • Mobile communication devices such as PDAs, cellular phones, etc., need to perform several kinds of computation-intensive functions including H.264 encoding/decoding and 3D graphics processing. In this paper, new reconfigurable architecture is described, which can perform either motion estimation for H.264 or rendering for 3D graphics. The proposed motion estimation techniques use new efficient SAD computation ordering, DAU, and FDVS algorithms. The new approach can reduce the computation by 70% on the average than that of JM 8.2, without affecting the quality. In 3D rendering, midline traversal algorithm is used for parallel processing to increase throughput. Memories are partitioned into 8 blocks so that 2.4Mbits (47%) of memory is shared and selective power shutdown is possible during motion estimation and 3D graphics rendering. Processing elements are also shared to further reduce the chip area by 7%.

Array Localization for Multithreaded Code Generation (다중스레드 코드 생성을 위한 배열 지역화)

  • Yang, Chang-Mo;Yu, Won-Hui
    • The Transactions of the Korea Information Processing Society
    • /
    • v.3 no.6
    • /
    • pp.1407-1417
    • /
    • 1996
  • In recent researches on thread partitioning algorithms break a thread at the long latency operation and merge threads to get the longer threads under the given constraints. Due to this limitation, even a program with little parallelism is partitioned into small-sized threads and context-swithings occur frequently. In the paper, we propose another method array localization about the array name, dependence distance(the difference of accessed element index from loop index), and the element usage that indicates whether element is used or defined. Using this information we can allocate array elements to the node where the corresponding loop activation is executed. By array localization, remote accesses to array elements can be replaced with local accesses to localized array elements. As a resuit,the boundaries of some threads are removed, programs can be partitioned into the larger threads and the number of context switchings reduced.

  • PDF

Hierarchical Visualization of Cloud-Based Social Network Service Using Fuzzy (퍼지를 이용한 클라우드 기반의 소셜 네트워크 서비스 계층적 시각화)

  • Park, Sun;Kim, Yong-Il;Lee, Seong Ro
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • v.38B no.7
    • /
    • pp.501-511
    • /
    • 2013
  • Recently, the visualization method of social network service have been only focusing on presentation of visualizing network data, which the methods do not consider an efficient processing speed and computational complexity for increasing at the ratio of arithmetical of a big data regarding social networks. This paper proposes a cloud based on visualization method to visualize a user focused hierarchy relationship between user's nodes on social network. The proposed method can intuitionally understand the user's social relationship since the method uses fuzzy to represent a hierarchical relationship of user nodes of social network. It also can easily identify a key role relationship of users on social network. In addition, the method uses hadoop and hive based on cloud for distributed parallel processing of visualization algorithm, which it can expedite the big data of social network.

Design and Implementation of MongoDB-based Unstructured Log Processing System over Cloud Computing Environment (클라우드 환경에서 MongoDB 기반의 비정형 로그 처리 시스템 설계 및 구현)

  • Kim, Myoungjin;Han, Seungho;Cui, Yun;Lee, Hanku
    • Journal of Internet Computing and Services
    • /
    • v.14 no.6
    • /
    • pp.71-84
    • /
    • 2013
  • Log data, which record the multitude of information created when operating computer systems, are utilized in many processes, from carrying out computer system inspection and process optimization to providing customized user optimization. In this paper, we propose a MongoDB-based unstructured log processing system in a cloud environment for processing the massive amount of log data of banks. Most of the log data generated during banking operations come from handling a client's business. Therefore, in order to gather, store, categorize, and analyze the log data generated while processing the client's business, a separate log data processing system needs to be established. However, the realization of flexible storage expansion functions for processing a massive amount of unstructured log data and executing a considerable number of functions to categorize and analyze the stored unstructured log data is difficult in existing computer environments. Thus, in this study, we use cloud computing technology to realize a cloud-based log data processing system for processing unstructured log data that are difficult to process using the existing computing infrastructure's analysis tools and management system. The proposed system uses the IaaS (Infrastructure as a Service) cloud environment to provide a flexible expansion of computing resources and includes the ability to flexibly expand resources such as storage space and memory under conditions such as extended storage or rapid increase in log data. Moreover, to overcome the processing limits of the existing analysis tool when a real-time analysis of the aggregated unstructured log data is required, the proposed system includes a Hadoop-based analysis module for quick and reliable parallel-distributed processing of the massive amount of log data. Furthermore, because the HDFS (Hadoop Distributed File System) stores data by generating copies of the block units of the aggregated log data, the proposed system offers automatic restore functions for the system to continually operate after it recovers from a malfunction. Finally, by establishing a distributed database using the NoSQL-based Mongo DB, the proposed system provides methods of effectively processing unstructured log data. Relational databases such as the MySQL databases have complex schemas that are inappropriate for processing unstructured log data. Further, strict schemas like those of relational databases cannot expand nodes in the case wherein the stored data are distributed to various nodes when the amount of data rapidly increases. NoSQL does not provide the complex computations that relational databases may provide but can easily expand the database through node dispersion when the amount of data increases rapidly; it is a non-relational database with an appropriate structure for processing unstructured data. The data models of the NoSQL are usually classified as Key-Value, column-oriented, and document-oriented types. Of these, the representative document-oriented data model, MongoDB, which has a free schema structure, is used in the proposed system. MongoDB is introduced to the proposed system because it makes it easy to process unstructured log data through a flexible schema structure, facilitates flexible node expansion when the amount of data is rapidly increasing, and provides an Auto-Sharding function that automatically expands storage. The proposed system is composed of a log collector module, a log graph generator module, a MongoDB module, a Hadoop-based analysis module, and a MySQL module. When the log data generated over the entire client business process of each bank are sent to the cloud server, the log collector module collects and classifies data according to the type of log data and distributes it to the MongoDB module and the MySQL module. The log graph generator module generates the results of the log analysis of the MongoDB module, Hadoop-based analysis module, and the MySQL module per analysis time and type of the aggregated log data, and provides them to the user through a web interface. Log data that require a real-time log data analysis are stored in the MySQL module and provided real-time by the log graph generator module. The aggregated log data per unit time are stored in the MongoDB module and plotted in a graph according to the user's various analysis conditions. The aggregated log data in the MongoDB module are parallel-distributed and processed by the Hadoop-based analysis module. A comparative evaluation is carried out against a log data processing system that uses only MySQL for inserting log data and estimating query performance; this evaluation proves the proposed system's superiority. Moreover, an optimal chunk size is confirmed through the log data insert performance evaluation of MongoDB for various chunk sizes.

Binary Locally Repairable Codes from Complete Multipartite Graphs (완전다분할그래프 기반 이진 부분접속복구 부호)

  • Kim, Jung-Hyun;Nam, Mi-Young;Song, Hong-Yeop
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • v.40 no.9
    • /
    • pp.1734-1740
    • /
    • 2015
  • This paper introduces a generalized notion, referred to as joint locality, of the usual locality in distributed storage systems and proposes a code construction of binary locally repairable codes with joint locality ($r_1$=2, $r_2$=3 or 4). Joint locality is a set of numbers of nodes for repairing various failure patterns of nodes. The proposed scheme simplifies the code design problem utilizing complete multipartite graphs. Moreover, our construction can generate binary locally repairable codes achieving (2,t)-availability for any positive integer t. It means that each node can be repaired by t disjoint repair sets of cardinality 2. This property is useful for distributed storage systems since it permits parallel access to hot data.

Adaptive Dynamic Load Balancing Strategies for Network-based Cluster Systems (네트워크 기반 클러스터 시스템을 위한 적응형 동적 부하균등 방법)

  • Jeong, Hun-Jin;Jeong, Jin-Ha;Choe, Sang-Bang
    • Journal of KIISE:Computer Systems and Theory
    • /
    • v.28 no.11
    • /
    • pp.549-560
    • /
    • 2001
  • Cluster system provides attractive scalability in terms of compution power and memory size. With the advances in high speed computer network technology, cluster systems are becoming increasingly competitive compared to expensive MPPs (massively parallel processors). Load balancing is very important issue since an inappropriate scheduling of tasks cannot exploit the true potential of the system and can offset the gain from parallelization. In parallel processing program, it is difficult to predict the load of each task before running the program. Furthermore, tasks are interdependent each other in many ways. The dynamic load balancing algorithm, which evaluates each processor's load in runtime, partitions each task into the appropriate granularity and assigns them to processors in proportion to their performance in cluster systems. However, if the communication cost between processing nodes is expensive, it is not efficient for all nodes to attend load balancing process. In this paper, we restrict a processor that attend load balancing by the communication cost and the deviation of its load from the average. We simulate various models of the cluster system with parameters such as communication cost, node number, and range of workload value to compare existing load balancing methods with the proposed dynamic algorithms.

  • PDF