• Title/Summary/Keyword: Distributed Processing

Performance Comparison of Python and Scala APIs in Spark Distributed Cluster Computing System (Spark 기반에서 Python과 Scala API의 성능 비교 분석)

  • Ji, Keung-yeup;Kwon, Youngmi
    • Journal of Korea Multimedia Society
    • /
    • v.23 no.2
    • /
    • pp.241-246
    • /
    • 2020
  • Hadoop is a framework for processing large data sets in a distributed way across clusters of nodes. It has been a popular platform for big data processing, but in recent years other platforms have become competitive, depending on the characteristics of the application. Spark is a distributed platform that enables real-time data processing and improves overall processing performance over Hadoop by using in-memory processing instead of disk I/O. Whereas Hadoop is designed to work with Java and data analysis is performed through the Java API, Spark provides APIs for Scala, Python, Java, and R. The goal of this paper is to find out whether the APIs of different programming languages affect performance in Spark. We chose two popular APIs: Python and Scala. Python is easy to learn and widely used in the AI domain. Scala is a programming language with advantages in parallelism. Our experiment shows much faster processing with the Scala API than with the Python API. Further study is needed on performance issues in AI-based analysis.

Effects of Hypervisor on Distributed Big Data Processing in Virtualizated Cluster Environment (가상화 클러스터 환경에서 빅 데이터 분산 처리 성능에 하이퍼바이저가 미치는 영향)

  • Chung, Haejin;Nah, Yunmook
    • KIISE Transactions on Computing Practices
    • /
    • v.22 no.2
    • /
    • pp.89-94
    • /
    • 2016
  • Recently, cluster computing environments have been shifting toward virtualized cluster environments. This change has a great impact on the performance of large-volume distributed processing, and many domestic and international IT companies have therefore invested heavily in research on cluster environments. In this paper, we show how the hypervisor affects the performance of distributed processing of a large volume of data. We present a performance comparison of MapReduce processing in two virtualized cluster environments, one built using the Xen hypervisor and the other built using container-based Docker. Our results show that Docker is faster than Xen.

The Distributed Encryption Processing System for Large Capacity Personal Information based on MapReduce (맵리듀스 기반 대용량 개인정보 분산 암호화 처리 시스템)

  • Kim, Hyun-Wook;Park, Sung-Eun;Euh, Seong-Yul
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.18 no.3
    • /
    • pp.576-585
    • /
    • 2014
  • Collecting and utilizing huge amounts of personal data has caused severe security issues such as leakage of personal information. Several encryption algorithms for collected personal information have been widely adopted to prevent such problems. In this paper, a novel algorithm based on MapReduce is proposed for encrypting such private information. Furthermore, a test environment has been built to verify the performance of the distributed encryption processing method. In the test, average time efficiency improved by 15.3% compared to encryption processing on a token server and by 3.13% compared to parallel processing.

A Simulated Distributed Database System for Response Time Evaluation (응답시간평가를 위한 분산데이터베이스 시뮬레이션시스템)

  • Rho, Sang-Kyu
    • Asia pacific journal of information systems
    • /
    • v.7 no.3
    • /
    • pp.23-37
    • /
    • 1997
  • Although numerous models and solution algorithms for designing efficient distributed databases have been developed, very few have been validated for their effectiveness. In this paper, we develop a simulation system that can be used to analyze and validate the average response time of distributed database designs. Our simulation system models comprehensive query processing strategies such as semijoin as well as a concurrency control mechanism. We analyze and validate an average response time distributed database design model using our simulation system.
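
The semijoin strategy named in this abstract can be illustrated with a minimal Python sketch. This is not the paper's simulation system: the two in-memory lists standing in for sites, and the names `semijoin_reduce`, `join`, `orders`, and `customers`, are hypothetical. The idea shown is that only the join-column values travel to the remote site, which then ships back just the rows that can participate in the join.

```python
def semijoin_reduce(local_rows, remote_rows, key):
    """Reduce remote_rows to those that can join with local_rows on key."""
    # Step 1: project the join column at the local site; only these
    # values would be sent over the network.
    join_keys = {row[key] for row in local_rows}
    # Step 2: the remote site keeps only rows whose key was shipped.
    reduced = [row for row in remote_rows if row[key] in join_keys]
    return join_keys, reduced


def join(left, right, key):
    """Plain hash join, used here to check that the reduction is lossless."""
    index = {}
    for r in right:
        index.setdefault(r[key], []).append(r)
    return [{**l, **r} for l in left for r in index.get(l[key], [])]


# Hypothetical data: 'orders' lives at one site, 'customers' at another.
orders = [
    {"cust_id": 1, "item": "disk"},
    {"cust_id": 3, "item": "cable"},
]
customers = [
    {"cust_id": 1, "name": "Kim"},
    {"cust_id": 2, "name": "Lee"},
    {"cust_id": 3, "name": "Park"},
    {"cust_id": 4, "name": "Choi"},
]

shipped_keys, reduced_customers = semijoin_reduce(orders, customers, "cust_id")
full_result = join(orders, customers, "cust_id")
reduced_result = join(orders, reduced_customers, "cust_id")
```

Joining against the reduced relation gives the same result as joining against the full one, while fewer rows need to cross the network; whether that saving outweighs the extra round trip depends on the selectivity of the join.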

An Intelligent New Dynamic Load Redistribution Mechanism in Distributed Environments

  • Lee, Seong-Hoon
    • International Journal of Contents
    • /
    • v.3 no.1
    • /
    • pp.34-38
    • /
    • 2007
  • Load redistribution is a critical issue in computer systems. In sender-initiated load redistribution algorithms, while the system load is heavy the sender keeps sending unnecessary request messages for load transfer until a receiver is found. These unnecessary request messages result in inefficient communication, low CPU utilization, and low system throughput in distributed systems. To solve these problems, we propose a genetic algorithm based approach for improved sender-initiated load redistribution in distributed systems. Compared with conventional sender-initiated algorithms, the proposed algorithm decreases the response time and task processing time.
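
The message overhead described in this abstract can be seen in a small Python sketch of conventional sender-initiated probing. It does not implement the paper's genetic algorithm; the function name `find_receiver` and the parameters `threshold`, `probe_limit`, and `probe_order` are assumptions made for illustration.

```python
def find_receiver(loads, sender, threshold, probe_limit, probe_order):
    """Probe nodes in probe_order until one is below threshold.

    Returns (receiver index or None, number of request messages sent).
    """
    messages = 0
    for node in probe_order:
        if node == sender:
            continue  # a node never probes itself
        if messages == probe_limit:
            break  # give up after probe_limit requests
        messages += 1  # one request message per probed node
        if loads[node] < threshold:
            return node, messages
    return None, messages


# Lightly loaded system: the first probe finds a receiver.
light = find_receiver([9, 1, 2, 3], sender=0, threshold=5,
                      probe_limit=3, probe_order=[1, 2, 3])

# Heavily loaded system: every probe is wasted and no receiver is found.
heavy = find_receiver([9, 8, 7, 6], sender=0, threshold=5,
                      probe_limit=3, probe_order=[1, 2, 3])
```

In the heavy case all request messages are spent without transferring any load, which is the inefficiency the abstract attributes to sender-initiated algorithms.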

Towards the Distributed Brain for Collectively Behaving Robots

  • Tomoo, Aoyama;Zhang, Y.G.
    • Institute of Control, Robotics and Systems (제어로봇시스템학회): Conference Proceedings
    • /
    • 2001.10a
    • /
    • pp.88.1-88
    • /
    • 2001
  • The paper describes a new approach to the organization of an artificial brain for mobile multi-robot systems, where individual robots are not considered as independent entities, but rather forming together a universal parallel and distributed machine capable of processing both information and physical matter in distributed worlds. This spatial machine, operating without any central control, is driven on top by distributed mission scenarios in WAVE-WP language. The scenarios can be written on a variety of levels, and any mixture of them, supporting the needed system flexibility and freedom ...

Hilbert-curve based Multi-dimensional Indexing Key Generation Scheme and Query Processing Algorithm for Encrypted Databases (암호화 데이터를 위한 힐버트 커브 기반 다차원 색인 키 생성 및 질의처리 알고리즘)

  • Kim, Taehoon;Jang, Miyoung;Chang, Jae-Woo
    • Journal of Korea Multimedia Society
    • /
    • v.17 no.10
    • /
    • pp.1182-1188
    • /
    • 2014
  • Recently, research on database outsourcing has been active owing to the popularity of cloud computing. However, because users' data may contain sensitive personal information, such as health, financial, and location information, data encryption methods have attracted much interest. Existing data encryption schemes process a query without decrypting the encrypted databases in order to protect user privacy. On the other hand, to efficiently handle the large amount of data in cloud computing, it is necessary to study distributed index structures. However, existing index structures and query processing algorithms are limited in that they consider only single-column query processing. In this paper, we propose a grid-based multi-column indexing scheme and an encrypted query processing algorithm. To support multi-column query processing, the multi-dimensional index keys are generated using a space decomposition method, i.e., a grid index. To support encrypted query processing over encrypted data, we adopt the Hilbert curve when generating an index key. Finally, we show that the proposed scheme is more efficient than existing schemes for processing exact and range queries.
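
As a rough illustration of how a Hilbert curve turns a grid cell into a one-dimensional index key, the following Python sketch uses the standard iterative cell-to-distance mapping for a two-dimensional grid. It is not the paper's key generation scheme and involves no encryption; the function names `hilbert_key` and `_rotate`, and the 8 x 8 grid, are assumptions for illustration.

```python
def _rotate(n, x, y, rx, ry):
    """Rotate or flip the quadrant so the curve keeps its orientation."""
    if ry == 0:
        if rx == 1:
            x = n - 1 - x
            y = n - 1 - y
        x, y = y, x
    return x, y


def hilbert_key(n, x, y):
    """Map grid cell (x, y) to its distance along the Hilbert curve.

    n is the grid side length and must be a power of two.
    """
    d = 0
    s = n // 2
    while s > 0:
        rx = 1 if (x & s) else 0
        ry = 1 if (y & s) else 0
        d += s * s * ((3 * rx) ^ ry)
        x, y = _rotate(n, x, y, rx, ry)
        s //= 2
    return d


# Keys for every cell of a hypothetical 8 x 8 grid index.
N = 8
keys = {(x, y): hilbert_key(N, x, y) for x in range(N) for y in range(N)}
cell_of = {d: cell for cell, d in keys.items()}
```

Cells with consecutive keys are always neighbours in the grid, so a range query over several columns tends to map to a small number of contiguous key ranges; this locality is the usual reason for choosing a Hilbert curve over a simple row-major ordering.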

On the Current Status and Future Trend of Distributed Object System (분산 객체 시스템의 현찰과 기술 전망)

  • 윤석환;김평중
    • Journal of the Korean Professional Engineers Association
    • /
    • v.30 no.2
    • /
    • pp.79-86
    • /
    • 1997
  • As networks gain speed and wider communication capability, users demand diverse new software to satisfy their needs. To meet these needs, software for multimedia, groupware, or distributed virtual environments must communicate widely distributed information quickly and accurately. Although the technology for this is under development, it is not yet sufficient to support reliable computer communication. As a new paradigm for distributed system software development, the Distributed Object System aims to overcome this problem: it handles the complexity of communication processing through object-oriented methods, bringing to distributed environments the ease of development and management, expandability, and reusability that object-oriented technologies support. This paper introduces the distributed object system, its technological properties, and the current status and trends of technology development related to its standardization. It also describes the Replicated Shared Object System (RSOS), a distributed object system developed in Korea, and discusses its future prospects and technical issues.

In-network Distributed Event Boundary Computation in Wireless Sensor Networks: Challenges, State of the art and Future Directions

  • Jabeen, Farhana;Nawaz, Sarfraz
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.7 no.11
    • /
    • pp.2804-2823
    • /
    • 2013
  • A wireless sensor network (WSN) is a promising technology for monitoring physical phenomena at fine-grained spatial and temporal resolution. However, the typical approach of sending each sensed measurement out of the network for detailed spatial analysis of transient physical phenomena may not be efficient or scalable. This paper focuses on in-network schemes for detecting physical phenomena, particularly the distributed computation of the boundary of a physical phenomenon (i.e., an event), to support energy-efficient spatial analysis in wireless sensor networks. In-network processing reduces the amount of network traffic and thus improves network scalability and lifetime. This study investigates recent advances in distributed event detection based on in-network processing and includes a concise comparison of existing schemes. These boundary detection schemes identify not only the sensor nodes that lie on the boundary of the physical phenomenon but also the interior nodes. Together these constitute an event geometry, which is a basic building block of many spatial queries. We introduce the challenges and opportunities for research in in-network distributed event geometry boundary detection and illustrate the current status of research in this field. We also present new areas where event geometry boundary detection can be of significant importance.

A Study on the Efficient Energy Management using Mobility Management in Distributed Wireless Network Environments (분산 무선 네트워크 환경에서의 이동성 관리를 통한 효율적인 에너지 사용에 관한 연구)

  • Kim, Tae-Kyung
    • Journal of Internet Computing and Services
    • /
    • v.8 no.3
    • /
    • pp.57-63
    • /
    • 2007
  • Providing sufficient energy to mobile devices is essential for processing jobs in distributed wireless networks. To address the energy constraints of mobile devices, this paper proposes an efficient method of processing distributed jobs using mobility management in wireless networks. Energy consumption can be analyzed using a statistical model, and the energy required to process a distributed job on a mobile device can be predicted using mobility management. The paper therefore proposes a reliable algorithm for processing distributed jobs on mobile devices with regular mobility and shows the efficiency of the proposed algorithm.
