Search | Korea Science

A Study on the Effect of the Name Node and Data Node on the Big Data Processing Performance in a Hadoop Cluster (Hadoop 클러스터에서 네임 노드와 데이터 노드가 빅 데이터처리 성능에 미치는 영향에 관한 연구)

Lee, Younghun;Kim, Yongil
- Smart Media Journal / v.6 no.3 / pp.68-74 / 2017
Big data processing handles various types of data, such as files, images, and video, to solve problems and provide useful insights. Various platforms are currently used for big data processing, but many organizations and enterprises use Hadoop because of its simplicity, productivity, scalability, and fault tolerance. In addition, Hadoop can build clusters on various hardware platforms and handles big data by dividing the cluster into a name node (master) and data nodes (slaves). In this paper, we use the fully distributed mode, the operation mode used by actual institutions and companies. We constructed a Hadoop cluster from low-power, low-cost single-board computers to facilitate the experiments. To analyze name node performance, we compare the same data processing job with a single-board computer and with a laptop as the name node. To analyze the influence of the number of data nodes, we double the number of data nodes in the existing cluster. The effects observed in these experiments are analyzed.

A Study on Finding Emergency Conditions for Automatic Authentication Applying Big Data Processing and AI Mechanism on Medical Information Platform

Ham, Gyu-Sung;Kang, Mingoo;Joo, Su-Chong
- KSII Transactions on Internet and Information Systems (TIIS) / v.16 no.8 / pp.2772-2786 / 2022
We previously studied a medical information platform that supports automatic authentication [6]. The proposed automatic authentication consists of user authentication and mobile terminal authentication, which are performed simultaneously when a patient is in an emergency condition. In this paper, we study how to detect emergency conditions for automatic authentication by applying big data processing and AI mechanisms on the extended medical information platform, to which an edge computing system has been added. We use big data processing, SVM, and a one-dimensional CNN to detect emergency conditions as a means of authentication, considering patients' underlying diseases such as hypertension, diabetes mellitus, and arrhythmia. To determine a patient's emergency condition quickly, we place edge computing at the end of the platform. The medical information server derives decision values for patients' emergency conditions using big data processing and AI mechanisms and transmits the values to an edge node. If the edge node determines that the patient is in an emergency condition, it notifies the medical information server. The medical information server then sends an emergency message to the medical staff in charge of the patient. The medical staff perform the automatic authentication using a mobile terminal. Once the automatic authentication is complete, the medical staff can access the patient's higher-level medical information, which is not visible under normal conditions.
https://doi.org/10.3837/tiis.2022.08.017

An Analysis of Utilization on Virtualized Computing Resource for Hadoop and HBase based Big Data Processing Applications (Hadoop과 HBase 기반의 빅 데이터 처리 응용을 위한 가상 컴퓨팅 자원 이용률 분석)

Cho, Nayun;Ku, Mino;Kim, Baul;Xuhua, Rui;Min, Dugki
- Journal of Information Technology and Architecture / v.11 no.4 / pp.449-462 / 2014
In the big data era, processing systems comprise many components for capturing, storing, and analyzing stored or streaming data. Unlike traditional data handling systems, a big data processing system must take into account the characteristics (format, velocity, and volume) of the data it handles. In this situation, virtualized computing platforms are emerging as an effective way to handle big data, since virtualization technology makes it possible to manage computing resources dynamically and elastically with minimal effort. In this paper, we analyze the utilization of virtualized computing resources to identify suitable deployment models for Apache Hadoop and HBase-based big data processing environments. The results show that the Task Tracker service exhibits high CPU utilization and high disk I/O overhead during MapReduce phases. Moreover, the HRegion service shows high network resource consumption when transferring the traffic data from DataNode to Task Tracker. DataNode shows high memory utilization and disk I/O overhead when reading stored data.

In-Memory Based Incremental Processing Method for Stream Query Processing in Big Data Environments (빅데이터 환경에서 스트림 질의 처리를 위한 인메모리 기반 점진적 처리 기법)

Bok, Kyoungsoo;Yook, Misun;Noh, Yeonwoo;Han, Jieun;Kim, Yeonwoo;Lim, Jongtae;Yoo, Jaesoo
- The Journal of the Korea Contents Association / v.16 no.2 / pp.163-173 / 2016
Recently, distributed processing of massive amounts of stream data has been studied. In this paper, we propose an in-memory incremental stream data processing method for big data environments. The proposed method stores input data in a temporary queue and compares it with the data recorded in a master node. If the data is found in the master node, the proposed method reuses the previous processing results located in the node chosen by the master node. If there are no previous results for the data, the proposed method processes the data and stores the result in a separate node. We also propose a job scheduling technique that considers the load and performance of each node. To show the superiority of the proposed method, we compare it with the existing method in terms of query processing time. Our experimental results show that our method outperforms the existing method in terms of query processing time.
https://doi.org/10.5392/JKCA.2016.16.02.163
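
The reuse step described in this abstract can be illustrated with a minimal single-process sketch. It is not the authors' implementation: the class name `IncrementalProcessor`, its methods, and the use of one dictionary in place of a master node and separate result nodes are all assumptions made for illustration, and the job scheduling technique is not modeled.

```python
from collections import deque
from typing import Callable, Deque, Dict, Hashable, List


class IncrementalProcessor:
    """Hypothetical stand-in for the master node's index of earlier results."""

    def __init__(self, process: Callable[[Hashable], object]) -> None:
        self.process = process                      # the expensive per-item computation
        self.results: Dict[Hashable, object] = {}   # previously computed results
        self.queue: Deque[Hashable] = deque()       # temporary input queue
        self.reused = 0                             # items answered from earlier results
        self.computed = 0                           # items that had to be processed

    def submit(self, item: Hashable) -> None:
        """Buffer an incoming stream item in the temporary queue."""
        self.queue.append(item)

    def drain(self) -> List[object]:
        """Process queued items in arrival order, reusing results when available."""
        out: List[object] = []
        while self.queue:
            item = self.queue.popleft()
            if item in self.results:
                self.reused += 1                    # previous result exists: reuse it
            else:
                self.results[item] = self.process(item)  # no result yet: compute and store
                self.computed += 1
            out.append(self.results[item])
        return out
```

In this sketch, submitting the same item twice triggers only one call to `process`; the second occurrence is served from the stored result.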

Study of MongoDB Architecture by Data Complexity for Big Data Analysis System (빅데이터 분석 시스템 구현을 위한 데이터 구조의 복잡성에 따른 MongoDB 환경 구성 연구)

Hyeopgeon Lee;Young-Woon Kim;Jin-Woo Lee;Seong Hyun Lee
- The Journal of Korea Institute of Information, Electronics, and Communication Technology / v.16 no.5 / pp.354-361 / 2023
Big data analysis systems apply NoSQL databases like MongoDB to store, process, and analyze diverse forms of large-scale data. MongoDB offers scalability and fast data processing speeds through distributed processing and data replication, depending on its configuration. This paper investigates the suitable MongoDB environment configurations for implementing big data analysis systems. For performance evaluation, we configured both single-node and multi-node environments. In the multi-node setup, we expanded the number of data nodes from two to three and measured the performance in each environment. According to the analysis, the processing speeds for complex data structures with three or more dimensions are approximately 5.75% faster in the single-node environment compared to an environment with two data nodes. However, a setting with three data nodes processes data about 25.15% faster than the single-node environment. On the other hand, for simple one-dimensional data structures, the multi-node environment processes data approximately 28.63% faster than the single-node environment. Further research is needed to practically validate these findings with diverse data structures and large volumes of data.
https://doi.org/10.17661/jkiiect.2023.16.5.354

Estimating Station Transfer Trips of Seoul Metropolitan Urban Railway Stations -Using Transportation Card Data - (수도권 도시철도 역사환승량 추정방안 -교통카드자료를 활용하여 -)

Lee, Mee-Young
- KSCE Journal of Civil and Environmental Engineering Research / v.38 no.5 / pp.693-701 / 2018
Transfer types at Seoul Metropolitan Urban Railway stations can be classified into transfer between lines and station transfer. Station transfer is defined as occurring when either 1) the line operating the tag-in card reader differs from the line operating the first train boarded by the passenger; or 2) the line operating the final train alighted differs from the line operating the tag-out card reader. Existing research uses transportation card data to estimate transfer volume between lines but excludes station transfer volume, which leads to underestimation of the volume passing through transfer passages. This research applies transportation card data to a method for estimating station transfer volume. To achieve this, the passenger path choice model is adapted for station transfer estimation using a modified big-node based network construction and data structure method. A case study is performed using about 8 million daily data records from the metropolitan urban railway.
https://doi.org/10.12652/Ksce.2018.38.5.0693
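
The two-part definition of station transfer in this abstract can be written as a simple predicate. The sketch below is only an illustration of that definition: the `Trip` record and its field names are hypothetical and do not reflect the actual transportation card data schema or the paper's path choice model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Trip:
    """Hypothetical per-trip record; all field names are assumptions."""
    tag_in_line: str        # line operating the tag-in card reader
    first_train_line: str   # line operating the first train boarded
    last_train_line: str    # line operating the final train alighted
    tag_out_line: str       # line operating the tag-out card reader


def is_station_transfer(trip: Trip) -> bool:
    """True if either condition of the abstract's definition holds."""
    at_entry = trip.tag_in_line != trip.first_train_line   # condition 1
    at_exit = trip.last_train_line != trip.tag_out_line    # condition 2
    return at_entry or at_exit
```

A trip that tags in on one line's reader but first boards a train of another line satisfies condition 1, even if the passenger never changes trains afterwards.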

Big Data Refining System for Environmental Sensor of Continuous Manufacturing Process using IIoT Middleware Platform (IIoT 미들웨어 플랫폼을 활용한 연속 제조공정의 환경센서 빅데이터 정제시스템)

Yoon, Yeo-Jin;Kim, Tea-Hyung;Lee, Jun-Hee;Kim, Young-Gon
- The Journal of the Institute of Internet, Broadcasting and Communication / v.18 no.4 / pp.219-226 / 2018
IIoT (Industrial Internet of Things) refers to the informatization of all manufacturing processes, going beyond conventional process automation. The objective of such a system is to build an information system based on the data collected from the sensors installed in each process and to maintain optimal productivity by managing and automating each process in real time. The data collected from sensors in each process is unstructured, and many studies have been conducted on collecting and processing such unstructured data effectively. In this paper, we propose a system that uses Node-RED as middleware for effective big data collection and processing.
https://doi.org/10.7236/JIIBC.2018.18.4.219

The Big Data Analysis and Medical Quality Management for Wellness (웰니스를 위한 빅데이터 분석과 의료 질 관리)

Cho, Young-Bok;Woo, Sung-Hee;Lee, Sang-Ho
- Journal of the Korea Society of Computer and Information / v.19 no.12 / pp.101-109 / 2014
With the development of medical technology and rising income levels, interest in a "long and healthy life = wellness" has grown, along with interest in actively promoting and maintaining health. In addition, demand for personalized health care services is growing, and medicine is moving toward extensive use of big data, including for disease prevention. This paper highlights wellness, a major interest in the market, and aims to support big data-driven healthcare quality through patient-centered medical services. Rather than relying on drug-dependent treatment or diet, the aim is to improve disease prevention and treatment based on big data analysis. Daily information from tweets is analyzed against a purpose-specific dictionary for wellness, disease prevention, and treatment. To evaluate the efficiency of big data analysis, we measured processing time while increasing the number of nodes. In the test results, going from one node to three nodes improved efficiency by 26% for total access time, 63% for data storage, and 18% for data aggregation.
https://doi.org/10.9708/jksci.2014.19.12.101

A Novel Node Management in Hadoop Cluster by using DNA

Balaraju. J;PVRD. Prasada Rao
- International Journal of Computer Science & Network Security / v.23 no.9 / pp.134-140 / 2023
Distributed systems play a vital role in storing and processing big data, and data generation from various sources is increasing rapidly every second. Hadoop is a scalable and efficient distributed system that supports commodity hardware by combining different networks in the topographical locality. The number of nodes supported in a Hadoop cluster has increased rapidly across versions, which makes clusters difficult to manage. Hadoop does not provide node management or features for adding and deleting nodes. Node identification in a cluster depends entirely on DHCP servers, which manage IP addresses and hostnames based on the physical (MAC) address of each node. This leaves scope for a hacker to steal data using an IP address or hostname and to disrupt the distributed system by adding a malicious node or assigning a duplicate IP address. This paper proposes a novel node management scheme for distributed systems that uses DNA hiding and generates a unique key from each node's unique physical (MAC) address and hostname. The proposed mechanism provides better node management for the Hadoop cluster, offering a mechanism for adding and deleting nodes with limited computation and better protection of nodes from hackers. The main goal of this paper is to propose an algorithm that hides node information in DNA sequences to improve node security against hackers.
https://doi.org/10.22937/IJCSNS.2023.23.9.17

Neighbor Cooperation Based In-Network Caching for Content-Centric Networking

Luo, Xi;An, Ying
- KSII Transactions on Internet and Information Systems (TIIS) / v.11 no.5 / pp.2398-2415 / 2017
Content-Centric Networking (CCN) is a new Internet architecture in which routing and caching are centered on content. Through its receiver-driven and connectionless communication model, CCN natively supports seamless node mobility and scalable content acquisition. In-network caching is one of the core technologies in CCN, and research on efficient caching schemes has become increasingly attractive. To address the problem of unbalanced cache load distribution in some existing caching strategies, this paper presents a neighbor cooperation based in-network caching scheme. In this scheme, the node with the highest betweenness centrality on the content delivery path is selected as the central caching node, and the area of its ego network is selected as the caching area. When the caching node lacks sufficient resources, part of its cached content is picked out and transferred to an appropriate neighbor, chosen by jointly considering factors such as available node cache, cache replacement rate, and link stability between nodes. Simulation results show that our scheme can effectively enhance the utilization of cache resources and improve cache hit rate and average access cost.
https://doi.org/10.3837/tiis.2017.05.005
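
The selection of the central caching node described in this abstract can be sketched as follows. This is a rough illustration, not the paper's scheme: it computes betweenness centrality with Brandes' algorithm on an unweighted graph, treats the ego network as the node plus its one-hop neighbors (an assumption), and omits the neighbor cooperation step. The function names `betweenness` and `pick_caching_node` are hypothetical.

```python
from collections import deque
from typing import Dict, Hashable, List, Set, Tuple

Graph = Dict[Hashable, List[Hashable]]


def betweenness(adj: Graph) -> Dict[Hashable, float]:
    """Brandes' algorithm for an unweighted graph given as {node: [neighbors]}."""
    bc = {v: 0.0 for v in adj}
    for s in adj:
        order = []                              # nodes in non-decreasing distance from s
        preds = {v: [] for v in adj}            # shortest-path predecessors
        sigma = {v: 0 for v in adj}             # number of shortest paths from s
        dist = {v: -1 for v in adj}
        sigma[s], dist[s] = 1, 0
        queue = deque([s])
        while queue:                            # breadth-first search from s
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):               # accumulate dependencies back toward s
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                bc[w] += delta[w]
    return bc  # undirected pairs are counted twice, which does not affect ranking


def pick_caching_node(adj: Graph, path: List[Hashable]) -> Tuple[Hashable, Set[Hashable]]:
    """Return the highest-betweenness node on the path and its one-hop ego network."""
    bc = betweenness(adj)
    center = max(path, key=lambda v: bc[v])
    return center, {center, *adj[center]}
```

For a delivery path that crosses a hub with many neighbors, the hub is chosen as the central caching node and its neighbors form the caching area.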

