Search | Korea Science

Processing large-scale data with Apache Spark (Apache Spark를 활용한 대용량 데이터의 처리)

Ko, Seyoon;Won, Joong-Ho
- The Korean Journal of Applied Statistics / v.29 no.6 / pp.1077-1094 / 2016
Apache Spark is a fast and general-purpose cluster computing package. It provides a new abstraction named the resilient distributed dataset, which supports fault tolerance while keeping data in memory. This abstraction yields a significant speedup over the legacy large-scale data framework MapReduce. In particular, Spark is well suited to iterative machine learning applications such as logistic regression and K-means clustering, and to interactive data querying. Thanks to its versatility, Spark also supports high-level libraries for various applications such as machine learning, streaming data processing, database querying, and graph data mining. In this work, we introduce the concept and programming model of Spark and show implementations of some simple statistical computing applications. We also review the machine learning package MLlib and the R language interface SparkR.
https://doi.org/10.5351/KJAS.2016.29.6.1077

Resource Availability-based Multi Auction Model for Cloud Service Reservation and Resource Brokering System (자원 가용성 기반 다중 경매 모델을 이용한 서비스 예약형 클라우드 자원 거래 시스템)

Lee, Seok Woo;Kim, Tae Young;Lee, Jong Sik
- Journal of the Korea Society for Simulation / v.23 no.1 / pp.1-10 / 2014
Cloud computing is a form of parallel and distributed computing that provides services to users through virtual resources. However, users' service requests do not follow a regular time pattern. As a result, resources differ in availability at any given time. This difference affects quality of service (QoS) and resource selection for users. Therefore, we propose a resource availability-based multi-auction model for a cloud service reservation and resource brokering system. The proposed system selects the proper resource provider based on users' requests. It adopts a multi-phase auction to transact resources. The system evaluates the availability factor of each resource in the auction phase and finally reserves the service in an adaptive queue. The proposed model performs better than existing methods.
https://doi.org/10.9709/JKSS.2014.23.1.001

Implementation of Massive FDTD Simulation Computing Model Based on MPI Cluster for Semi-conductor Process (반도체 검증을 위한 MPI 기반 클러스터에서의 대용량 FDTD 시뮬레이션 연산환경 구축)

Lee, Seung-Il;Kim, Yeon-Il;Lee, Sang-Gil;Lee, Cheol-Hoon
- The Journal of the Korea Contents Association / v.15 no.9 / pp.21-28 / 2015
In the semiconductor process, a simulation is performed to detect defects by analyzing the behavior of impurities through calculation of the physical quantities of the inner elements. The Finite-Difference Time-Domain (FDTD) algorithm is used to perform the simulation. As semiconductors composed of nanoscale elements improve, the size of the simulation keeps growing. Problems may arise in which a single processor such as a CPU or GPU cannot perform the simulation because of the massive matrix size, or in which a computer consisting of multiple processors cannot handle a massive FDTD simulation. To address these problems, studies have applied parallel/distributed computing. However, past work used only a single type of processor. A GPU is fast but has limited memory, whereas a CPU is slower than a GPU. To solve this problem, we implemented a computing model that can handle any FDTD simulation regardless of size on a cluster consisting of heterogeneous processors. We tested the simulation on the processors using MPI libraries based on point-to-point communication and verified that it operates correctly regardless of the number and type of nodes. We also analyzed performance by measuring the total execution time and the specific time for the simulation in each test.
https://doi.org/10.5392/JKCA.2015.15.09.021

Distributed Construction of the Multiple-Ring Topology of the Connected Dominating Set for the Mobile Ad Hoc Networks: Boltzmann Machine Approach (무선 애드혹 망을 위한 연결 지배 집합 다중-링 위상의 분산적 구성-볼츠만 기계적 접근)

Park, Jae-Hyun
- Journal of KIISE:Information Networking / v.34 no.3 / pp.226-238 / 2007
In this paper, we present a novel, fully distributed topology control protocol that constructs a multiple-ring topology of the Minimal Connected Dominating Set (MCDS) as the transport backbone for mobile ad hoc networks. It builds the topology from a minimal set of nodes chosen from all the nodes, and the constructed topology comprises a minimal number of physical links while preserving connectivity. This topology reduces interference. All nodes act as nodes of a distributed parallel Boltzmann machine whose objective function consists of two Boltzmann factors: the link degree and the connection domination degree. To define these Boltzmann factors, we extend the Connected Dominating Set into a fuzzy set, and we also define the fuzzy set of nodes from which the multiple-ring topology can be constructed. To construct the transport backbone of the mobile ad hoc network, the proposed protocol chooses as clusterheads the nodes that are strong members of these two fuzzy sets. We also ran simulations to provide a quantitative comparison with related work in terms of packet loss rate and energy consumption rate. The results show that the network constructed by the proposed protocol performs far better than the others with respect to packet loss rate and energy consumption rate.

Performance Analysis of MVDR and RLS Beamforming Using Systolic Array Structure (시스토릭 어레이 구조를 갖는 최소분산 비왜곡응답 및 최소자승 회귀 빔형성기법 성능 분석)

이호중;서상우;이원철
- The Journal of the Acoustical Society of Korea / v.22 no.1 / pp.1-6 / 2003
This paper analyzes the performance of the minimum variance distortionless response (MVDR) and recursive least squares (RLS) beamformers structured on a systolic array. Suppose that a snapshot vector containing the desired user's signal, the interference, and noise is received at the array antenna. To improve the quality of the received signal, the MVDR or RLS algorithm can be used to update the beamformer weights recursively. Furthermore, to increase channel capacity, these schemes provide spatial filtering that constructively combines the multipath components of the desired user while nulling out multiple access interference (MAI) in the spatial domain. This paper introduces MVDR and RLS beamformers structured on a systolic array that perform spatial filtering, and analyzes their performance over a multipath fading channel in the presence of multiple access interference. Computer simulations are carried out to show the superior spatial filtering performance of the proposed scheme, which employs the systolic-array-structured beamformer. The validity of practical deployment of the proposed scheme is confirmed through the BER behavior and the beam patterns.

Analysis of Ultimate Bearing Capacity of Piles Using Artificial Neural Networks Theory (I) -Theory (인공 신경망 이론을 이용한 말뚝의 극한지지력 해석(I)-이론)

이정학;이인모
- Geotechnical Engineering / v.10 no.4 / pp.17-28 / 1994
It is well known that the human brain has the advantage of handling dispersed and parallel distributed data efficiently. On the basis of this fact, artificial neural network theory was developed and has been applied successfully to various fields of science. In this study, the error back-propagation algorithm, one of the training techniques for artificial neural networks, is applied to predict the ultimate bearing capacity of pile foundations. To verify the applicability of this system, a total of 28 model pile test results are used. Subsets of 9, 14, and 21 of the 28 results are used to train the networks, and the remaining results are used to compare the predicted and measured values. The results show that the developed system matches the model pile test results well when trained with more than 14 data points. These limited results show the possibility of using neural networks for pile capacity prediction problems.

The Study of the Object Replication Management using Adaptive Duplication Object Algorithm (적응적 중복 객체 알고리즘을 이용한 객체 복제본 관리 연구)

박종선;장용철;오수열
- Journal of the Korea Society of Computer and Information / v.8 no.1 / pp.51-59 / 2003
In distributed object replication systems, it is effective to replicate an object across nodes so that the object the nodes share has the same contents. As nodes access the system, they store access information in their local caches and fetch and use it when needed. Over time, however, coherence problems arise because the data can be updated by other nodes. To keep the system coherent, we therefore need a mechanism that manages the objects so as to improve the performance and availability of the system effectively. In this paper, to keep coherence under shared memory conditions, we use the proposed adaptive duplication object (ADO) algorithm to maintain the objects, which allows limited parallel performance with no additional cost other than the coherence cost. Also, to minimize the coherence maintenance cost, which is the biggest overhead in the duplication method, the number and location of object replicas, which are the most important factors determining the cost, must be managed effectively. We also study an adaptive duplication object management mechanism that improves the overall run time.

Hierarchical Visualization of Cloud-Based Social Network Service Using Fuzzy (퍼지를 이용한 클라우드 기반의 소셜 네트워크 서비스 계층적 시각화)

Park, Sun;Kim, Yong-Il;Lee, Seong Ro
- The Journal of Korean Institute of Communications and Information Sciences / v.38B no.7 / pp.501-511 / 2013
Recently, visualization methods for social network services have focused only on presenting network data, and they do not consider efficient processing speed or the computational complexity that increases at an arithmetic rate with big social network data. This paper proposes a cloud-based visualization method that visualizes user-focused hierarchical relationships between user nodes on a social network. The proposed method makes a user's social relationships intuitively understandable because it uses fuzzy techniques to represent the hierarchical relationships among user nodes. It also makes key role relationships among users easy to identify. In addition, the method uses cloud-based Hadoop and Hive for distributed parallel processing of the visualization algorithm, which expedites the processing of big social network data.
https://doi.org/10.7840/kics.2013.38B.7.501

Development of Information Technology Infrastructures through Construction of Big Data Platform for Road Driving Environment Analysis (도로 주행환경 분석을 위한 빅데이터 플랫폼 구축 정보기술 인프라 개발)

Jung, In-taek;Chong, Kyu-soo
- Journal of the Korea Academia-Industrial cooperation Society / v.19 no.3 / pp.669-678 / 2018
This study developed information technology infrastructures for building a driving environment analysis platform using various big data, such as vehicle sensing data and public data. First, a small platform server with a parallel structure for distributed big data processing was developed as the H/W component. Next, programs for big data collection/storage, processing/analysis, and information visualization were developed as the S/W components. The collection S/W was developed as a collection interface using Kafka, Flume, and Sqoop. The storage S/W was divided into a Hadoop distributed file system and a Cassandra DB according to how the data are used. The processing S/W was developed for spatial unit matching and time interval interpolation/aggregation of the collected data by applying the grid index method. The analysis S/W was developed as an analytical tool based on the Zeppelin notebook for applying and evaluating developed algorithms. Finally, the information visualization S/W was developed as a Web GIS engine program for providing and visualizing various driving environment information. The performance evaluation yielded the number of executors, the optimal memory capacity, and the number of cores for the development server, and the computation performance was superior to that of the other cloud computing environments compared.
https://doi.org/10.5762/KAIS.2018.19.3.669

QoS-Guaranteed IP Mobility Management For Fast Moving Vehicles Using Multiple Tunnels (멀티 터널링을 이용한 고속 차량에서 QoS 보장 IP 이동성 관리 방법)

Chun, Seung-Man;Nah, Jae-Wook;Park, Jong-Tae
- Journal of the Institute of Electronics Engineers of Korea TC / v.48 no.11 / pp.44-52 / 2011
In this article, we present a QoS-guaranteed IP mobility management scheme for Internet service in fast moving vehicles with multiple wireless network interfaces. The proposed mechanism rests on two ideas. First, new wireless connections are established on available wireless channels whenever the data rate measured at the vehicle equipped with the mobile gateway drops below the data rate required by the user. Second, parallel distribution packet tunnels between an access router and the mobile gateway are dynamically constructed using multiple wireless network interfaces in order to guarantee the required data rate while the mobile gateway moves. With these methods, the required data rate of the mobile gateway can be preserved while eliminating possible delay and packet loss during handover, thus guaranteeing QoS. The architecture of the IETF standard HMIPv6 has been extended to realize the proposed scheme, and detailed algorithms for the extension of HMIPv6 have been designed. Finally, simulations were performed for performance evaluation, and the results show that the proposed mechanism guarantees QoS during handover with regard to handover delay, packet loss, and throughput.

Search Result 170, Processing Time 0.021 seconds
