• Title/Summary/Keyword: Mapreduce


A Dynamic Weapon Allocation Algorithm using Genetic Algorithm in Mapreduce Environments (맵리듀스 환경에서 유전자 알고리즘 기반의 동적 무기할당 알고리즘)

  • Park, Junho;Kim, Jieun;Cho, Kilseok
    • Proceedings of the Korea Contents Association Conference / 2014.11a / pp.469-470 / 2014
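  • The dynamic weapon allocation problem is a typical NP-complete problem: assigning friendly weapons appropriately to threatening targets. It is highly time-critical, so a suitable allocation and response must be derived in the shortest possible time, which is difficult in a highly fluid battlefield environment. Recently, analysis and processing with distributed processing systems has drawn great attention for applications based on such highly complex big data, and MapReduce is used as a representative framework. However, because MapReduce provides only batch processing over the entire data set, running a genetic algorithm on dynamic data is difficult, and deriving the final result still takes a long time. This paper proposes a genetic-algorithm-based dynamic weapon allocation algorithm for MapReduce environments. To support continuous data processing by the genetic algorithm in a MapReduce environment, the proposed scheme analyzes only newly added and removed weapon-target data and combines the result with previously analyzed data to derive the final result. This enables rapid dynamic weapon allocation.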


Implementation of big web logs analyzer in estimating preferences for web contents (웹 컨텐츠 선호도 측정을 위한 대용량 웹로그 분석기 구현)

  • Choi, Eun Jung;Kim, Myuhng Joo
    • Journal of Korea Society of Digital Industry and Information Management / v.8 no.4 / pp.83-90 / 2012
  • With the rapid growth of internet infrastructure, the World Wide Web has gone beyond simple information sharing. It began to provide services such as e-business, remote control and management, and virtual services, and has recently evolved into services such as cloud computing and social network services. Communication over the Web has increasingly focused on user-centric, customized services rather than provider-centric information. In this environment it is very important to check and analyze user requests to a website, and estimating user preferences is the most important task. Web logs are analyzed for this reason, but most of the analyzed data are page-level statistics. Page-level statistics alone are not enough to evaluate user preferences, because the main content of recent web pages consists of media files such as images and of dynamic pages built with techniques such as CSS, div, and iframe. In this paper, a large-scale log analyzer was designed and implemented to analyze web server logs and estimate users' web content preferences. Using MapReduce on Hadoop, large logs were analyzed and preferences for media files such as images, sounds, and videos were estimated.
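
The media-file counting described in this abstract can be illustrated with a small, single-process sketch of the map and reduce steps. It is not the authors' analyzer: the log format (Common Log Format), the `MEDIA_TYPES` extension table, and the function names are all assumptions made for illustration, and a real Hadoop job would run the two steps on separate nodes.

```python
import re
from collections import defaultdict

# Hypothetical mapping from file extension to media category (an assumption).
MEDIA_TYPES = {
    "jpg": "image", "jpeg": "image", "png": "image", "gif": "image",
    "mp3": "sound", "wav": "sound",
    "mp4": "video", "avi": "video",
}

# Matches the request field of a Common Log Format line,
# e.g. "GET /img/logo.png HTTP/1.1".
REQUEST_RE = re.compile(r'"(?:GET|POST) (\S+) HTTP/[\d.]+"')


def map_line(line):
    """Map step: emit ((media_type, path), 1) for a media file request."""
    match = REQUEST_RE.search(line)
    if not match:
        return []
    path = match.group(1).split("?", 1)[0]  # drop any query string
    ext = path.rsplit(".", 1)[-1].lower() if "." in path else ""
    media = MEDIA_TYPES.get(ext)
    return [((media, path), 1)] if media else []


def reduce_counts(pairs):
    """Reduce step: sum the counts emitted for each key."""
    totals = defaultdict(int)
    for key, count in pairs:
        totals[key] += count
    return dict(totals)


def analyze(lines):
    """Run map then reduce over an iterable of log lines."""
    pairs = [pair for line in lines for pair in map_line(line)]
    return reduce_counts(pairs)


# Made-up sample log lines, for illustration only.
SAMPLE_LOG = [
    '10.0.0.1 - - [01/Jan/2012:00:00:01 +0900] "GET /img/logo.png HTTP/1.1" 200 512',
    '10.0.0.2 - - [01/Jan/2012:00:00:02 +0900] "GET /index.html HTTP/1.1" 200 1024',
    '10.0.0.3 - - [01/Jan/2012:00:00:03 +0900] "GET /img/logo.png?v=2 HTTP/1.1" 200 512',
    '10.0.0.1 - - [01/Jan/2012:00:00:04 +0900] "GET /media/intro.mp4 HTTP/1.1" 200 4096',
]

if __name__ == "__main__":
    for (media, path), hits in sorted(analyze(SAMPLE_LOG).items()):
        print(media, path, hits)
```

Keying on the media type and path, rather than on the page, is what lets the counts reflect content-level rather than page-level preferences.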

User-based Collaborative Filtering Recommender Technique using MapReduce (맵리듀스를 이용한 사용자 기반 협업 필터링 추천 기법)

  • Yun, So-young;Youn, Sung-dae
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2015.10a / pp.331-333 / 2015
  • Data is increasing explosively with the spread of networks and mobile devices, and existing recommendation techniques have trouble processing this rapidly growing data effectively. Research is therefore being conducted on how to solve the scalability problem of collaborative filtering. This paper applies MapReduce, a distributed parallel processing framework, to collaborative filtering to reduce the scalability problem and improve accuracy. The proposed technique applies MapReduce and an indexing technique to user-based collaborative filtering. Because it improves the number and suitability of the neighbors used in similarity calculations, improvements in scalability and accuracy can be expected.
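
For readers unfamiliar with user-based collaborative filtering, the following single-machine sketch shows the similarity and neighbor-selection steps that the abstract refers to. It is only an illustration under assumptions: the ratings in `RATINGS` are made up, cosine similarity is one common choice that the abstract does not name, and the paper's MapReduce distribution and indexing technique are not reproduced here.

```python
import math

# Made-up ratings for illustration: user -> {item: rating}.
RATINGS = {
    "u1": {"a": 5.0, "b": 3.0, "c": 4.0},
    "u2": {"a": 5.0, "b": 3.0, "d": 2.0},
    "u3": {"a": 1.0, "b": 5.0, "d": 5.0},
}


def cosine(r1, r2):
    """Cosine similarity between two users' rating vectors."""
    common = set(r1) & set(r2)
    if not common:
        return 0.0
    dot = sum(r1[i] * r2[i] for i in common)
    norm1 = math.sqrt(sum(v * v for v in r1.values()))
    norm2 = math.sqrt(sum(v * v for v in r2.values()))
    return dot / (norm1 * norm2)


def neighbors(ratings, user, k):
    """Return the k most similar other users as (similarity, user) pairs."""
    sims = [
        (cosine(ratings[user], other_ratings), other)
        for other, other_ratings in ratings.items()
        if other != user
    ]
    sims.sort(reverse=True)
    return sims[:k]


def predict(ratings, user, item, k=2):
    """Similarity-weighted average of the neighbors' ratings for item.

    Returns None when no neighbor has rated the item.
    """
    num = den = 0.0
    for sim, other in neighbors(ratings, user, k):
        if sim > 0 and item in ratings[other]:
            num += sim * ratings[other][item]
            den += sim
    return num / den if den else None


if __name__ == "__main__":
    print(neighbors(RATINGS, "u1", 2))
    print(predict(RATINGS, "u1", "d"))
```

The pairwise similarity loop in `neighbors` is the part that grows quadratically with the number of users, which is why it is the natural target for distribution.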


Cost-Effective MapReduce Processing in the Cloud (클라우드 환경에서의 비용 효율적인 맵리듀스 처리)

  • Ryu, Wooseok
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2018.10a / pp.114-115 / 2018
  • This paper studies a mechanism for cost-effective analysis of big data in the cloud. Recently, as electronic medical records can be stored and managed outside the hospital, demand for cloud-based big data analysis is growing among small and medium-sized hospitals. This paper first analyzes Amazon Elastic MapReduce (EMR), a popular cloud framework for big data analysis, and proposes a cost model for analyzing big data with Amazon EMR at lower cost. Using the proposed model, the user can construct a cost-effective computing cluster that maximizes the effectiveness of the analysis per unit of operational cost.


A Study On Recommend System Using Co-occurrence Matrix and Hadoop Distribution Processing (동시발생 행렬과 하둡 분산처리를 이용한 추천시스템에 관한 연구)

  • Kim, Chang-Bok;Chung, Jae-Pil
    • Journal of Advanced Navigation Technology / v.18 no.5 / pp.468-475 / 2014
  • Real-time recommendation is becoming more difficult for recommender systems because of larger preference data sets and the limits of computing power and recommendation algorithms. For this reason, distributed processing of large preference data sets is being actively studied. This paper studies a distributed processing method for large preference data sets using the Hadoop distributed processing platform and the Mahout machine learning library. The recommendation algorithm uses a co-occurrence matrix, similar to item-based collaborative filtering. The co-occurrence matrix requires a large amount of computation, but it can be processed in a distributed manner across the many nodes of a Hadoop cluster, which reduces the computational load. This paper simplifies the distributed processing of the co-occurrence matrix by reducing it from four stages to three. As a result, it reduces the number of MapReduce jobs while still generating the recommendation file, speeds up processing, and reduces the map output data.
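
The co-occurrence approach mentioned in this abstract can be sketched in a few lines on a single machine. This is an illustration under assumptions, not the paper's three-stage Hadoop pipeline or Mahout's implementation: the data in `PREFERENCES` is made up, preferences are treated as unweighted, and the function names are hypothetical.

```python
from collections import defaultdict
from itertools import permutations

# Made-up preference data for illustration: user -> items the user liked.
PREFERENCES = {
    "u1": ["a", "b", "c"],
    "u2": ["a", "b"],
    "u3": ["b", "c", "d"],
}


def build_cooccurrence(prefs):
    """Count, for every ordered item pair, how many users liked both."""
    matrix = defaultdict(lambda: defaultdict(int))
    for items in prefs.values():
        for i, j in permutations(set(items), 2):
            matrix[i][j] += 1
    return matrix


def recommend(prefs, user, top_n=3):
    """Score unseen items by summing co-occurrences with the user's items."""
    matrix = build_cooccurrence(prefs)
    owned = set(prefs[user])
    scores = defaultdict(int)
    for item in owned:
        for other, count in matrix[item].items():
            if other not in owned:
                scores[other] += count
    # Highest score first; ties broken alphabetically for a stable order.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))[:top_n]


if __name__ == "__main__":
    print(recommend(PREFERENCES, "u2"))
```

Each user's item list contributes to the matrix independently, so the counting step splits naturally into map tasks (emit item pairs per user) and reduce tasks (sum per pair).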

A Hadoop-based Multimedia Transcoding System for Processing Social Media in the PaaS Platform of SMCCSE

  • Kim, Myoungjin;Han, Seungho;Cui, Yun;Lee, Hanku;Jeong, Changsung
    • KSII Transactions on Internet and Information Systems (TIIS) / v.6 no.11 / pp.2827-2848 / 2012
  • Previously, we described a social media cloud computing service environment (SMCCSE). This SMCCSE supports the development of social networking services (SNSs) that include audio, image, and video formats. A social media cloud computing PaaS platform, a core component of an SMCCSE, processes large amounts of social media in a parallel and distributed manner to support a reliable SNS. Here, we propose a Hadoop-based multimedia system for image and video transcoding, which are necessary functions of our PaaS platform. Our system consists of two modules: an image transcoding module and a video transcoding module. We design and implement the system using a MapReduce framework running on the Hadoop Distributed File System (HDFS) and the media processing libraries Xuggler and JAI. In this way, our system exponentially reduces the encoding time for transcoding large amounts of image and video files into specific formats depending on user-requested options (such as resolution, bit rate, and frame rate). To evaluate system performance, we measure the total image and video transcoding time for image and video data sets, respectively, under various experimental conditions. In addition, we compare the video transcoding performance of our cloud-based approach with that of the traditional frame-level parallel processing approach. Based on experiments performed on a 28-node cluster, the proposed Hadoop-based multimedia transcoding system delivers excellent speed and quality.

Reverse k-Nearest Neighbor Query Processing Method for Continuous Query Processing in Bigdata Environments (빅데이터 환경에서 연속 질의 처리를 위한 리버스 k-최근접 질의 처리 기법)

  • Lim, Jongtae;Park, Sunyong;Seo, Kiwon;Lee, Minho;Bok, Kyoungsoo;Yoo, Jaesoo
    • The Journal of the Korea Contents Association / v.14 no.10 / pp.454-462 / 2014
  • With the development of location-aware technologies and mobile devices, location-based services have been widely studied. To provide location-based services, many researchers have proposed methods for processing various query types with MapReduce (MR). One of these is a reverse k-nearest neighbor (RkNN) query processing method with MR. However, the existing methods incur too much cost when processing continuous RkNN queries. In this paper, we propose an efficient continuous RkNN query processing method with MR to resolve the problems of the existing methods. The proposed method uses the 60-degree pruning method. It does not need to reprocess the query for continuous query processing, because it draws and monitors a monitoring area that includes the candidate objects of an RkNN query. To show the superiority of the proposed method, we compare its query processing performance with that of the existing method.
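
As background for this abstract, an RkNN query returns every object that counts the query point among its own k nearest neighbors. The sketch below is a brute-force check of that definition only, assuming 2-D Euclidean distance and made-up coordinates; it does not implement the 60-degree pruning, the monitoring area, or the MapReduce distribution that the paper proposes.

```python
import math

# Made-up object positions for illustration: name -> (x, y).
POINTS = {
    "A": (1.0, 0.0),
    "B": (5.0, 0.0),
    "C": (5.5, 0.0),
    "D": (0.0, 3.0),
}


def dist(p, q):
    """Euclidean distance between two 2-D points."""
    return math.hypot(p[0] - q[0], p[1] - q[1])


def reverse_knn(points, query, k):
    """Return the objects that have the query among their k nearest neighbors.

    An object qualifies when fewer than k other objects lie strictly
    closer to it than the query point does. Brute force: O(n^2) distances.
    """
    result = []
    for name, pos in points.items():
        d_query = dist(pos, query)
        closer = sum(
            1
            for other, other_pos in points.items()
            if other != name and dist(pos, other_pos) < d_query
        )
        if closer < k:
            result.append(name)
    return sorted(result)


if __name__ == "__main__":
    print(reverse_knn(POINTS, (0.0, 0.0), k=1))
```

Because every object must be compared against every other object, re-running this check each time objects move is expensive, which is the cost that pruning and monitoring techniques aim to avoid.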

Knowledge Creation Structure of Big Data Research Domain (빅데이터 연구영역의 지식창출 구조)

  • Namn, Su-Hyeon
    • Journal of Digital Convergence / v.13 no.9 / pp.129-136 / 2015
  • We investigate the underlying structure of the big data research domain, which is diversified and complicated, using a bottom-up approach. For that purpose, we derive a set of articles by searching for "big data" in the Korea Citation Index system provided by the National Research Foundation of Korea. After some preprocessing of the author-provided keywords, we analyze bibliometric data such as author-provided keywords, publication year, author, and journal characteristics. From the analysis, we identify the major sub-domains of the big data research area and uncover the hidden issues that make big data complex. Major keywords identified include SOCIAL NETWORK ANALYSIS, HADOOP, MAPREDUCE, PERSONAL INFORMATION POLICY/PROTECTION/PRIVATE INFORMATION, CLOUD COMPUTING, VISUALIZATION, and DATA MINING. Finally, we suggest missing research themes needed to make big data a sustainable medium for management innovation and convergence.

Initial Authentication Protocol of Hadoop Distribution System based on Elliptic Curve (타원곡선기반 하둡 분산 시스템의 초기 인증 프로토콜)

  • Jeong, Yoon-Su;Kim, Yong-Tae;Park, Gil-Cheol
    • Journal of Digital Convergence / v.12 no.10 / pp.253-258 / 2014
  • Recently, cloud computing technology has developed along with the spread of smartphones, and more users want to receive big data services. The Hadoop framework provides big data services through the Hadoop file system and Hadoop MapReduce, which support data-intensive distributed applications. However, smartphone services that use a Hadoop system are very vulnerable with respect to data authentication. In this paper, we propose an initial authentication protocol for Hadoop systems used by smartphone services. The proposed protocol combines symmetric-key cryptography with an ECC algorithm to support secure multiple data processing systems. In particular, when a user accesses the Hadoop system to process data, the proposed protocol improves security by using an elliptic-curve-based public key in place of the initial authentication key and the symmetric key.

Design of Spark SQL Based Framework for Advanced Analytics (Spark SQL 기반 고도 분석 지원 프레임워크 설계)

  • Chung, Jaehwa
    • KIPS Transactions on Software and Data Engineering / v.5 no.10 / pp.477-482 / 2016
  • As advanced analytics on big data becomes indispensable for agile decision-making and tactical planning in enterprises, distributed processing platforms such as Hadoop and Spark, which distribute and handle large volumes of data across multiple nodes, are receiving great attention. In the Spark platform stack, Spark SQL was recently introduced to let Spark support SQL-based distributed processing. However, Spark SQL cannot effectively handle advanced analytics involving machine learning and graph processing, in terms of iterative tasks and task allocation. Motivated by these issues, this paper proposes the design of an SQL-based optimal big data processing engine and a processing framework to support advanced analytics in Spark environments. The optimal big data processing engine handles complex SQL queries that involve multiple parameters and join, aggregation, and sorting operations in a distributed and parallel manner, and the proposed framework optimizes the machine learning process in terms of relational operations.