Search | Korea Science

A Study on the Effect of the Name Node and Data Node on the Big Data Processing Performance in a Hadoop Cluster (Hadoop 클러스터에서 네임 노드와 데이터 노드가 빅 데이터처리 성능에 미치는 영향에 관한 연구)

Lee, Younghun;Kim, Yongil
- Smart Media Journal / v.6 no.3 / pp.68-74 / 2017
Big data processing handles various types of data, such as files, images, and video, to solve problems and provide useful insights. Various platforms are currently used for big data processing, but many organizations and enterprises use Hadoop because of its simplicity, productivity, scalability, and fault tolerance. In addition, Hadoop can build clusters on various hardware platforms and handles big data by dividing the cluster into a name node (master) and data nodes (slaves). In this paper, we use the fully distributed mode, the operation mode used by actual institutions and companies. We constructed a Hadoop cluster from low-power, low-cost single-board computers to facilitate the experiments. To analyze name node performance, we compare the same data processing job with a single-board computer and a laptop as the name node. To analyze the influence of the number of data nodes, we double the number of data nodes in the existing cluster. We then analyze the results of these experiments.

A Study on Routing Message Retransmission Scheme for Big data (빅데이터를 위한 라우팅 메시지 재전송 기법 연구)

Lee, Byung-Jun;Youn, Hee-Yong
- Proceedings of the Korean Society of Computer Information Conference / 2014.01a / pp.395-396 / 2014
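With the rapid development of information distribution media, typified by social network services, data has been growing explosively in what may be called a Data Big Bang, and interest in big data has risen sharply. Because big data technologies manage large volumes of data not on a single node but across many connected nodes, the routing algorithms responsible for inter-node connections are also becoming more important for efficient data management. In this paper, we propose a congestion probability computation algorithm for a new routing message retransmission scheme, aimed at efficient routing of large-volume data.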

Study of In-Memory based Hybrid Big Data Processing Scheme for Improve the Big Data Processing Rate (빅데이터 처리율 향상을 위한 인-메모리 기반 하이브리드 빅데이터 처리 기법 연구)

Lee, Hyeopgeon;Kim, Young-Woon;Kim, Ki-Young
- The Journal of Korea Institute of Information, Electronics, and Communication Technology / v.12 no.2 / pp.127-134 / 2019
With the advancement of IT, the amount of data generated has been growing exponentially every year. In response, research on distributed systems and in-memory-based big data processing schemes has been actively underway. Traditional big data processing schemes process data faster as the number of nodes and the memory capacity increase. However, increasing the number of nodes inevitably raises the frequency of failures in a big data infrastructure, and the number of infrastructure management points and the operating costs increase accordingly. In addition, increasing memory capacity raises the infrastructure cost of each node configuration. Therefore, this paper proposes an in-memory-based hybrid big data processing scheme to improve the big data processing rate. The proposed scheme adds a combiner step to a distributed-system processing scheme and applies in-memory processing technology at that step, which reduces the number of nodes compared with traditional distributed-system-based schemes. It decreases the big data processing time by approximately 22%. In the future, realistic performance evaluation in a big data infrastructure environment with more nodes will be required to verify the proposed scheme in practice.
https://doi.org/10.17661/jkiiect.2019.12.2.127
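
The abstract above mentions adding a combiner step that runs in memory. As a rough illustration of the general combiner idea only (not the paper's actual scheme), the sketch below uses a word-count job: each node pre-aggregates its own mapper output in memory, so fewer records have to be shuffled to the reducer. The function names `map_phase`, `combine`, and `reduce_phase` and the sample chunks are hypothetical.

```python
from collections import Counter

def map_phase(chunk):
    # Plain mapper: emit one (word, 1) pair per word in the chunk.
    return [(word, 1) for word in chunk.split()]

def combine(pairs):
    # Combiner (assumed design): aggregate one node's pairs in memory
    # before anything is sent over the network.
    local = Counter()
    for key, value in pairs:
        local[key] += value
    return list(local.items())

def reduce_phase(all_pairs):
    # Reducer: merge the partial counts from every node.
    totals = Counter()
    for key, value in all_pairs:
        totals[key] += value
    return dict(totals)

# Hypothetical input: one text chunk per node.
chunks = ["a b a", "b b c"]
mapped = [map_phase(c) for c in chunks]
combined = [combine(p) for p in mapped]

# Number of records that would be shuffled without and with the combiner.
shuffled_without = sum(len(p) for p in mapped)
shuffled_with = sum(len(p) for p in combined)

result = reduce_phase(pair for part in combined for pair in part)
```

The final counts are the same either way; the combiner only shrinks the intermediate data. How much that helps depends on how often keys repeat within a node.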

A Study on Modified PBFT Study for Effective Convergence of IoT Big Data and Blockchain Technology (IoT 빅데이터와 블록체인 기술의 효과적 융합을 위한 수정된 PBFT연구)

Baek, Yeong-Tae;Min, Youn-A
- Proceedings of the Korean Society of Computer Information Conference / 2020.01a / pp.193-194 / 2020
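As uses of blockchain diversify, its adoption by industry and government is spreading. In particular, convergence with blockchain is frequently discussed as a way to manage big data such as that from the Internet of Things (IoT). Managing such big data effectively requires, alongside collection and storage, transparent, accurate, trust-based data management. PBFT is currently the consensus algorithm most often proposed and used on private blockchain platforms. With PBFT, the slowdown caused by heavier computation as the number of nodes increases can become a problem. This paper proposes a method that modifies the PBFT consensus procedure to lower complexity even as the number of nodes grows. Simulations with varying node counts demonstrate the superiority of the proposed algorithm over the existing PBFT algorithm.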

Management of Distributed Nodes for Big Data Analysis in Small-and-Medium Sized Hospital (중소병원에서의 빅데이터 분석을 위한 분산 노드 관리 방안)

Ryu, Wooseok
- Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2016.05a / pp.376-377 / 2016
The performance of Hadoop, a distributed data processing framework for big data analysis, is affected by several characteristics of each node in a distributed cluster, such as processing power and network bandwidth. This paper analyzes previous approaches to heterogeneous Hadoop clusters and presents several requirements for distributed node clustering in small-and-medium sized hospitals, taking the computing environments of such hospitals into account.

In-Memory Based Incremental Processing Method for Stream Query Processing in Big Data Environments (빅데이터 환경에서 스트림 질의 처리를 위한 인메모리 기반 점진적 처리 기법)

Bok, Kyoungsoo;Yook, Misun;Noh, Yeonwoo;Han, Jieun;Kim, Yeonwoo;Lim, Jongtae;Yoo, Jaesoo
- The Journal of the Korea Contents Association / v.16 no.2 / pp.163-173 / 2016
Recently, distributed processing of massive amounts of stream data has been widely studied. In this paper, we propose an in-memory-based incremental stream data processing method for big data environments. The proposed method stores input data in a temporary queue and compares it with data in a master node. If the data is in the master node, the proposed method reuses the previous processing results located in the node chosen by the master node. If there are no previous results for the data in the node, the proposed method processes the data and stores the result in a separate node. We also propose a job scheduling technique that considers the load and performance of each node. To show the superiority of the proposed method, we compare it with the existing method in terms of query processing time. Our experimental results show that our method outperforms the existing method in terms of query processing time.
https://doi.org/10.5392/JKCA.2016.16.02.163
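
The queue-and-reuse flow described in the abstract above can be sketched roughly as follows. This is a single-process toy under stated assumptions, not the authors' implementation: the `MasterNode` class, the `run` helper, the `worker-0` node name, and the use of the raw input value as the lookup key are all hypothetical, and the job scheduling part is omitted.

```python
from collections import deque

class MasterNode:
    """Toy master: remembers which node holds the result for each input."""

    def __init__(self):
        self.index = {}   # input key -> name of the node holding its result
        self.nodes = {}   # node name -> {input key: stored result}

    def process(self, key, compute, node="worker-0"):
        # Known input: reuse the result stored on the recorded node.
        if key in self.index:
            return self.nodes[self.index[key]][key], True
        # Unknown input: compute, store the result on a node, register it.
        result = compute(key)
        self.nodes.setdefault(node, {})[key] = result
        self.index[key] = node
        return result, False

def run(stream, compute):
    master = MasterNode()
    queue = deque(stream)          # temporary queue for incoming data
    outputs, reused = [], 0
    while queue:
        item = queue.popleft()
        result, hit = master.process(item, compute)
        outputs.append(result)
        reused += hit              # True counts as 1
    return outputs, reused

# Usage: the second occurrence of 3 is served from the stored result.
outputs, reused = run([3, 4, 3], lambda x: x * x)
```

A real system would also need to decide when stored results expire and how to pick the node that stores each result; both are outside this sketch.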

A Cloud-based Big Data System for Performance Comparison of Edge Computing (Edge Computing 성능 비교를 위한 Cloud 기반 빅데이터 시스템 구축 방안)

Lim, Hwan-Hee;Lee, Tae-Ho;Lee, Byung-Jun;Kim, Kyung-Tae;Youn, Hee-Yong
- Proceedings of the Korean Society of Computer Information Conference / 2019.01a / pp.5-6 / 2019
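Evaluating and validating the performance of algorithms that analyze data generated in edge computing is essential. Such evaluation and validation require comparable data. In this paper, we build a cloud-based big data analysis system to evaluate the analysis results for data generated in edge computing and the performance of computing resources. The system builds, in the cloud, an environment similar to that of edge computing on real IoT nodes, and runs the edge computing algorithm under study in a Data Analysis Cluster Container. The goal of this paper is to identify improvements by comparing the analysis results and computing resource utilization data with data from edge computing on existing IoT nodes.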

Distributed Framework for Data Processing of IoT Node (IoT 노드의 데이터 처리를 위한 분산 프레임워크)

Kim, Min-Woo;Lee, Tae-Ho;Lee, Byung-Jun;Kim, Kyung-Tae;Youn, Hee-Yong
- Proceedings of the Korean Society of Computer Information Conference / 2018.07a / pp.215-216 / 2018
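Big data file systems used in distributed computing environments search every storage device when locating data to be processed on an IoT (Internet of Things) node, so they are slow and can incur traffic overhead. This study proposes a distributed framework that uses machine learning techniques to process big data more efficiently and search faster in a distributed computing environment of IoT nodes. By preventing unnecessary accesses to other storage devices in advance, it aims to produce fast, accurate results and improve efficiency.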

Analysis of Encryption Algorithm Performance by Workload in BigData Platform (빅데이터 플랫폼 환경에서의 워크로드별 암호화 알고리즘 성능 분석)

Lee, Sunju;Hur, Junbeom
- Journal of the Korea Institute of Information Security & Cryptology / v.29 no.6 / pp.1305-1317 / 2019
Although encryption for data protection is essential in the big data platform environments of public institutions and corporations, few performance verification studies of encryption algorithms have considered actual big data workloads. In this paper, we analyze how the performance of AES, ARIA, and 3DES changes for each of six big data workloads as data and nodes are added in a MongoDB environment. This enables us to identify the optimal block-based cryptographic algorithm for each workload in the big data platform environment, and to test the performance of MongoDB under various workloads and data and node configurations using the NoSQL Database Benchmark (YCSB). We propose an optimized architecture that takes into account.
https://doi.org/10.13089/JKIISC.2019.29.6.1305

Method for Selecting a Big Data Package (빅데이터 패키지 선정 방법)

Byun, Dae-Ho
- Journal of Digital Convergence / v.11 no.10 / pp.47-57 / 2013
Big data analysis needs new tools for decision making in view of data volume, speed, and variety. Many global IT enterprises are announcing a variety of Big data products that emphasize ease of use, functionality, and modeling capability. Big data packages are defined as solutions comprising analytic tools, infrastructures, and platforms, including hardware and software. They can acquire, store, analyze, and visualize Big data. There are many types of products with varied and complex functionalities. Because of the inherent characteristics of Big data, selecting the best Big data package requires more expertise and a more appropriate decision making method than the selection of other software packages. The objective of this paper is to suggest a decision making method for selecting a Big data package. We compare the characteristics and functionalities of packages through literature reviews and suggest selection criteria. To evaluate the feasibility of adopting packages, we develop two Analytic Hierarchy Process (AHP) models, where the goal node of one model consists of costs and benefits and that of the other consists of selection criteria. We show with a numerical example how the best package is evaluated by combining the two models.
https://doi.org/10.14400/JDPM.2013.11.10.047
