Search | Korea Science

Design of a Large-scale Task Dispatching & Processing System based on Hadoop (하둡 기반 대규모 작업 배치 및 처리 기술 설계)

Kim, Jik-Soo;Cao, Nguyen;Kim, Seoyoung;Hwang, Soonwook
- Journal of KIISE / v.43 no.6 / pp.613-620 / 2016
This paper presents MOHA (Many-Task Computing on Hadoop), a framework that aims to effectively apply Many-Task Computing (MTC) technologies, originally developed for high-performance processing of many tasks, to the existing Big Data processing platform Hadoop. We present the basic concepts, motivation, preliminary results of a PoC based on a distributed message queue, and future research directions of MOHA. MTC applications may have relatively low I/O requirements per task. However, a very large number of tasks must be processed efficiently, with potentially heavy file-based inter-task communication. MTC applications can therefore exhibit a different pattern of data-intensive workload from existing Hadoop applications, which are typically based on relatively large data block sizes. Through an effective convergence of MTC and Big Data technologies, the new MOHA framework can support large-scale scientific applications alongside the Hadoop ecosystem, which is evolving into a multi-application platform.
https://doi.org/10.5626/JOK.2016.43.6.613

A Study on Adaptive Parallel Computability in Many-Task Computing on Hadoop Framework (하둡 기반 대규모 작업처리 프레임워크에서의 Adaptive Parallel Computability 기술 연구)

Kim, Jik-Soo
- Journal of Broadcast Engineering / v.24 no.6 / pp.1122-1133 / 2019
We have designed and implemented a new data processing framework called MOHA (MTC on Hadoop), which can effectively support Many-Task Computing (MTC) applications on a YARN-based Hadoop platform. An MTC application can be composed of a very large number of computational tasks, ranging from hundreds of thousands to millions, and each MTC application may have a different resource usage pattern. We have therefore implemented Adaptive Parallel Computability in the MOHA-TaskExecutor (a pilot job that executes the actual MTC application tasks), which adaptively executes multiple tasks simultaneously in order to improve the parallel computability of a YARN container and the overall system throughput. We have implemented a multi-threaded version of the TaskExecutor that can "independently and dynamically" adjust the number of concurrently running tasks, and we employ a hill-climbing algorithm to find the optimal number of concurrent tasks.
https://doi.org/10.5909/JBE.2019.24.6.1122
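
The abstract above mentions a hill-climbing search for the number of concurrent tasks. The following is only a minimal sketch of that general idea, not the MOHA implementation: the function name `hill_climb_concurrency`, the `measure_throughput` callback, and the `max_tasks` bound are all hypothetical, and the throughput function used in the usage lines is synthetic.

```python
from typing import Callable


def hill_climb_concurrency(
    measure_throughput: Callable[[int], float],  # hypothetical probe: tasks/sec at a given concurrency
    start: int = 1,
    max_tasks: int = 64,  # assumed upper bound on concurrent tasks per container
) -> int:
    """Return a locally optimal number of concurrent tasks via hill climbing."""
    current = start
    best = measure_throughput(current)
    while True:
        # Neighbours are one step down and one step up, kept within bounds.
        neighbours = [n for n in (current - 1, current + 1) if 1 <= n <= max_tasks]
        scored = [(measure_throughput(n), n) for n in neighbours]
        if not scored:
            return current
        top_score, top_n = max(scored)
        # Stop as soon as no neighbour improves on the current setting.
        if top_score <= best:
            return current
        current, best = top_n, top_score


# Synthetic throughput curve that peaks at 8 concurrent tasks (illustrative only).
optimum = hill_climb_concurrency(lambda n: -(n - 8) ** 2)
```

Hill climbing only finds a local optimum, so on a noisy or multi-peaked throughput curve the result depends on the starting point.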

A Case Study of Drug Repositioning Simulation based on Distributed Supercomputing Technology (분산 슈퍼컴퓨팅 기술에 기반한 신약재창출 시뮬레이션 사례 연구)

Kim, Jik-Soo;Rho, Seungwoo;Lee, Minho;Kim, Seoyoung;Kim, Sangwan;Hwang, Soonwook
- Journal of KIISE / v.42 no.1 / pp.15-22 / 2015
In this paper, we present a case study for a drug repositioning simulation based on distributed supercomputing technology that requires highly efficient processing of large-scale computations. Drug repositioning is the application of known drugs and compounds to new indications (i.e., new diseases), and this process requires efficient processing of a large number of docking tasks with relatively short per-task execution times. This mechanism shows the main characteristics of a Many-Task Computing (MTC) application, and as a representative case of MTC applications, we have applied a drug repositioning simulation in our HTCaaS system which can leverage distributed supercomputing infrastructure, and show that efficient task dispatching, dynamic resource allocation and load balancing, reliability, and seamless integration of multiple computing resources are crucial to support these challenging scientific applications.
https://doi.org/10.5626/JOK.2015.42.1.15

Efficient Task Distribution for Pig Monitoring Applications Using OpenCL (OpenCL을 이용한 돈사 감시 응용의 효율적인 태스크 분배)

Kim, Jinseong;Choi, Younchang;Kim, Jaehak;Chung, Yeonwoo;Chung, Yongwha;Park, Daihee;Kim, Hakjae
- KIPS Transactions on Computer and Communication Systems / v.6 no.10 / pp.407-414 / 2017
Pig monitoring applications consisting of many tasks can take advantage of inherent data parallelism and enable parallel processing using performance accelerators. In this paper, we propose a method for distributing the tasks of pig monitoring applications across a heterogeneous computing platform consisting of a multicore CPU and a manycore GPU. Specifically, a parallel program is written in OpenCL, and the most suitable processor for each task is then determined from that task's measured execution time. The proposed method is simple but very effective, and can be applied to parallelize other applications consisting of many tasks on a heterogeneous computing platform consisting of a CPU and a GPU. Experimental results on three different heterogeneous computing platforms show that the proposed task distribution method improves performance over the typical GPU-only method, in which every task is executed on the GPU, by factors of 1.5, 8.7, and 2.7, respectively.
https://doi.org/10.3745/KTCCS.2017.6.10.407
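
The selection rule described in the abstract above (pick a processor per task from measured execution times) can be sketched roughly as follows. This is an illustration under assumptions, not the authors' code: the function `assign_tasks`, the task names, and the timings are hypothetical, and the OpenCL measurement step itself is not shown.

```python
def assign_tasks(cpu_times: dict[str, float], gpu_times: dict[str, float]) -> dict[str, str]:
    """Map each task to the device with the smaller measured execution time."""
    assignment = {}
    for task, cpu_t in cpu_times.items():
        gpu_t = gpu_times[task]
        # Ties go to the GPU here; that tie-break is an arbitrary assumption.
        assignment[task] = "CPU" if cpu_t < gpu_t else "GPU"
    return assignment


# Hypothetical per-task timings in milliseconds (not taken from the paper).
cpu_ms = {"background_subtraction": 12.0, "labeling": 3.0, "tracking": 5.0}
gpu_ms = {"background_subtraction": 2.0, "labeling": 9.0, "tracking": 5.0}
plan = assign_tasks(cpu_ms, gpu_ms)
```

A per-task comparison like this ignores data transfer between devices; a fuller model would add that cost to the measured times.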

Applications of R package for statistical engineering (통계공학을 위한 R 패키지 응용)

Jang, Dae-Heung
- The Korean Journal of Applied Statistics / v.33 no.1 / pp.87-105 / 2020
Statistical engineering comprises the design of experiments, quality control/management, and reliability engineering. R is a free software environment for statistical computing and graphics that is supported by the R Foundation for Statistical Computing. R packages provide many functions and libraries for statistical engineering, which makes R a useful tool for the field. This paper shows applications of R packages for statistical engineering and suggests an R Task View for statistical engineering.
https://doi.org/10.5351/KJAS.2020.33.1.087

"Multi-use Data Platform": Hadoop 2.0 and Related Data Processing Framework Technologies ("Multi-use Data Platform" 하둡 2.0과 관련 데이터 처리 프레임워크 기술)

Kim, Jik-Su
- Broadcasting and Media Magazine / v.22 no.4 / pp.11-17 / 2017
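This article describes the main features of Hadoop 2.0, which is evolving into a multi-application data platform, and the various data processing frameworks associated with it. Unlike Hadoop 1.0, which was optimized for MapReduce-based batch processing, the Hadoop 2.0 platform that began with the introduction of YARN can support various types of data processing workflows (batch, interactive, streaming, etc.). In addition, technologies that were mainly used in the high-performance computing field have recently come to be supported on the Hadoop 2.0 platform. Finally, as a case study of YARN application development, we introduce a new data processing framework for Many-Task Computing (MTC) applications that our research team is developing.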

An Offloading Decision Scheme Considering the Scheduling Latency of the Cloud in Real-time Applications (실시간 응용에서 클라우드의 스케줄링 지연 시간을 고려한 오프로딩 결정 기법)

Min, Hong;Jung, Jinman;Kim, Bongjae;Heo, Junyoung
- KIISE Transactions on Computing Practices / v.23 no.6 / pp.392-396 / 2017
Although mobile device-related technologies have developed rapidly, many problems arising from resource constraints have not been solved. Computation offloading, which uses the resources of cloud servers over the Internet, was proposed to overcome these physical limitations, and many studies have examined it in terms of energy saving. In real-time applications, however, completing tasks within their deadlines is more important than saving energy. In this paper, we propose an offloading decision scheme that considers the scheduling latency in the cloud in order to support real-time applications. The proposed scheme can improve the reliability of real-time tasks by comparing the estimated laxity of offloading a task with the estimated laxity of executing it on the mobile device and selecting the more effective way to satisfy the task's deadline.
https://doi.org/10.5626/KTCP.2017.23.6.392
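
The laxity comparison described in the abstract above might look roughly like the sketch below. The abstract does not give the formulas, so this assumes the usual definition of laxity as the deadline minus the estimated completion time; the `TaskEstimate` fields and function names are hypothetical, and all times are measured from the moment the decision is made.

```python
from dataclasses import dataclass


@dataclass
class TaskEstimate:
    """Hypothetical per-task estimates, all in milliseconds."""
    deadline: float             # time remaining until the task's deadline
    local_exec: float           # estimated execution time on the mobile device
    transfer: float             # estimated network time to send input and receive output
    cloud_sched_latency: float  # estimated scheduling latency in the cloud
    cloud_exec: float           # estimated execution time in the cloud


def local_laxity(t: TaskEstimate) -> float:
    # Slack left if the task runs on the mobile device.
    return t.deadline - t.local_exec


def offload_laxity(t: TaskEstimate) -> float:
    # Slack left if the task is offloaded, including cloud scheduling latency.
    return t.deadline - (t.transfer + t.cloud_sched_latency + t.cloud_exec)


def should_offload(t: TaskEstimate) -> bool:
    # Offload only when doing so leaves strictly more slack before the deadline.
    return offload_laxity(t) > local_laxity(t)


fast_cloud = TaskEstimate(deadline=100, local_exec=80, transfer=10, cloud_sched_latency=20, cloud_exec=30)
busy_cloud = TaskEstimate(deadline=100, local_exec=80, transfer=10, cloud_sched_latency=50, cloud_exec=30)
```

With these illustrative numbers, the same task is offloaded when the cloud schedules it quickly and kept local when the scheduling latency grows.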

Response Time Analysis Considering Sensing Data Synchronization in Mobile Cloud Applications (모바일 클라우드 응용에서 센싱 데이터 동기화를 고려한 응답 시간 분석)

Min, Hong;Heo, Junyoung
- The Journal of the Institute of Internet, Broadcasting and Communication / v.15 no.3 / pp.137-141 / 2015
Mobile cloud computing uses cloud services to address the resource constraints of mobile devices. Offloading means that a task that would be executed on the mobile device is committed to the cloud instead, and many studies have examined it in terms of energy consumption. In this paper, we design a response time model that considers sensing data synchronization in order to estimate the efficiency of an offloading scheme in terms of response time. The proposed model accounts for synchronization of the required sensing data to improve the accuracy of response time estimation when the cloud processes a task requested by a mobile device. Simulation results show that the response time is affected by the generation rate of new sensing data and by the synchronization period.
https://doi.org/10.7236/JIIBC.2015.15.3.137

Study on Memory Performance Improvement based on Machine Learning (머신러닝 기반 메모리 성능 개선 연구)

Cho, Doosan
- The Journal of the Convergence on Culture Technology / v.7 no.1 / pp.615-619 / 2021
This study focuses on memory systems that are optimized to increase performance and energy efficiency in many embedded systems such as IoT, cloud computing, and edge computing, and proposes a performance improvement technique. The proposed technique improves memory system performance based on machine learning algorithms that are widely used in many applications. The machine learning technique can be used for various applications through supervised learning, and can be applied to a data classification task used in improving memory system performance. Data classification based on highly accurate machine learning techniques enables data to be appropriately arranged according to data usage patterns, thereby improving overall system performance.
https://doi.org/10.17703/JCCT.2021.7.1.615

DATA MINING AND PREDICTION OF SAI TYPE MATRIX PRECONDITIONER

Kim, Sang-Bae;Xu, Shuting;Zhang, Jun
- Journal of applied mathematics & informatics / v.28 no.1_2 / pp.351-361 / 2010
The solution of large sparse linear systems is one of the most important problems in large scale scientific computing. Among the many methods developed, the preconditioned Krylov subspace methods are considered the preferred methods. Selecting a suitable preconditioner with appropriate parameters for a specific sparse linear system presents a challenging task for many application scientists and engineers who have little knowledge of preconditioned iterative methods. The prediction of ILU type preconditioners was considered in [27] where support vector machine(SVM), as a data mining technique, is used to classify large sparse linear systems and predict best preconditioners. In this paper, we apply the data mining approach to the sparse approximate inverse(SAI) type preconditioners to find some parameters with which the preconditioned Krylov subspace method on the linear systems shows best performance.
