• Title/Summary/Keyword: MapReduce

GPS Data Partitioning Method for POI Extraction Based MapReduce (MapReduce 기반 POI를 추출하기 위한 GPS 데이터 분할 방법)

  • Oh, Joo-Seong;Jeon, Hye-Ji;Lee, Hye-Jin;Jeong, Min-A;Lee, Seong-Ro
    • Proceedings of the Korea Information Processing Society Conference / 2015.10a / pp.1199-1201 / 2015
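  • Location-based services are used in many fields. To provide users with accurate information, large volumes of location data must be analyzed to extract and analyze POIs. This paper uses DBSCAN clustering to extract POIs and implements it in a MapReduce environment. It also proposes a data partitioning method to improve the algorithm's execution speed.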
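The abstract above does not say how the data are partitioned, so the following is only a minimal sketch of one common option, assuming a fixed-size latitude/longitude grid. The function names (`map_point`, `reduce_by_cell`, `partition_points`) and the `cell_size` parameter are hypothetical and are not taken from the paper.

```python
import math
from collections import defaultdict


def map_point(point, cell_size):
    """Map step: emit (cell_key, point) for one (lat, lon) GPS fix."""
    lat, lon = point
    key = (math.floor(lat / cell_size), math.floor(lon / cell_size))
    return key, point


def reduce_by_cell(pairs):
    """Shuffle/reduce step: group the points that share a grid cell key."""
    partitions = defaultdict(list)
    for key, point in pairs:
        partitions[key].append(point)
    return dict(partitions)


def partition_points(points, cell_size=0.01):
    """Split GPS points into grid cells (assumed scheme, not the paper's)."""
    return reduce_by_cell(map_point(p, cell_size) for p in points)
```

Each resulting partition could then be clustered independently, for example by running DBSCAN inside a reducer. A practical version would also need to handle clusters that cross cell borders, for instance by overlapping neighbouring cells by the DBSCAN radius, which this sketch leaves out.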

Documents Filtering and Topic Prediction for SNS using Naïve Bayesian Classifier and MapReduce (나이브 베이지안 분류기와 MapReduce 를 이용한 SNS 문서 필터링 및 토픽 예측)

  • Park, Hosik;Kang, Namyong;Park, Seulgi;Moon, Jungmin;Oh, Sangyoon
    • Proceedings of the Korea Information Processing Society Conference / 2014.04a / pp.109-111 / 2014
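  • Social network services (SNS), as a new means of communication, have a large influence not only on personal networks but also on society and culture. In particular, as the spread of wireless Internet and smartphones has made the volume of circulated information grow exponentially, processing and analyzing data has become a major topic. This paper aims to process and analyze rapidly growing SNS data and to extract meaningful data centered on keywords. To this end, it presents a processing method that filters SNS data and predicts topics in a MapReduce environment suited to big data processing, rather than using conventional data processing approaches. The method was also implemented on top of a web service that visually presents and reproduces the analyzed data, and its performance was verified through experiments.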

Design and Implementation of a Book Recommendation System based on the MapReduce Model (MapReduce Model에 기반한 도서 추천 시스템의 설계 및 구현)

  • Lim, Chan-Shik;Lee, Won-Jae;Lee, Ha-Na;Lee, Se-Hwa;Lee, Sang-Jun
    • Proceedings of the Korean Information Science Society Conference / 2010.06c / pp.201-204 / 2010
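  • With countless books published every day, it is difficult for users to find and read books that suit their purposes. In this paper, we used the MapReduce model to extract association relationships among books from a vast amount of book data. Using the extracted association database, we aim to develop a system that can recommend related books to users.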
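The abstract does not describe how the associations are computed. As a rough illustration only, one way to extract book-to-book associations in a map/reduce style is to count how often two books appear together. The input format (one list of book titles per user or order) and all function names here are assumptions, not details from the paper.

```python
from collections import Counter
from itertools import combinations


def map_pairs(basket):
    """Map step: emit ((book_a, book_b), 1) for every pair in one basket."""
    for a, b in combinations(sorted(set(basket)), 2):
        yield (a, b), 1


def reduce_counts(pairs):
    """Reduce step: sum the emitted counts for each book pair."""
    counts = Counter()
    for key, value in pairs:
        counts[key] += value
    return counts


def related_books(counts, book, top_n=3):
    """Rank the books that co-occur most often with the given book."""
    scores = Counter()
    for (a, b), n in counts.items():
        if a == book:
            scores[b] += n
        elif b == book:
            scores[a] += n
    return [title for title, _ in scores.most_common(top_n)]
```

Raw co-occurrence counts favour popular titles, so a real system would probably normalize them, for example by each book's overall frequency.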

A Study on Efficient Cluster Analysis of Bio-Data Using MapReduce Framework

  • Yoo, Sowol;Lee, Kwangok;Bae, Sanghyun
    • Journal of Integrative Natural Science / v.7 no.1 / pp.57-61 / 2014
  • This study measures stream data from several sensors, stores it in a database in a MapReduce framework environment, and aims to design a system with a small performance cost and a low cluster analysis error rate using the KM-SVM algorithm. The data that the KM-SVM algorithm clustered effectively were used for a U-health system. In experiments using 2003 data sets obtained from 52 test subjects, the k-NN algorithm showed 79.29% cluster analysis accuracy, the K-means algorithm 87.15%, the SVM algorithm 83.72%, and KM-SVM 90.72%. As a result, the KM-SVM algorithm achieved better processing speed and cluster analysis effectiveness.

Pattern mining for large distributed dataset: A parallel approach (PMLDD)

  • Pal, Amrit;Kumar, Manish
    • KSII Transactions on Internet and Information Systems (TIIS) / v.12 no.11 / pp.5287-5303 / 2018
  • Handling the vast amounts of data found in large transactional datasets is an obvious challenge for conventional data mining algorithms. To address this challenge, our paper proposes a parallel approach that decomposes the mining problem into sub-problems in order to find frequent patterns in these datasets. The proposed approach, Pattern Mining for Large Distributed Dataset (PMLDD), ensures minimal dependencies and minimal communication among sub-problems. It establishes a linear aggregation of the intermediate results so that it can be adapted to large-scale programming models such as MapReduce. In this context, an algorithmic structure for the MapReduce programming model is presented. PMLDD guarantees efficient load balancing among the sub-problems through a specific selection criterion. Further, it reduces the number of iterations over the dataset required to mine frequent patterns compared with existing approaches. Finally, we believe that our approach is scalable enough to handle larger datasets in terms of performance evaluation, and the result analysis supports all of these points.
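The idea of linearly aggregating intermediate results can be sketched as follows: each partition counts itemset supports on its own, and the partial counts are summed before a support threshold is applied. This is not the PMLDD decomposition or its selection criterion, which the abstract does not detail. The function names and the `max_size` limit are hypothetical.

```python
from collections import Counter
from itertools import combinations


def local_counts(transactions, max_size=2):
    """Map step: count itemsets of up to max_size items in one partition."""
    counts = Counter()
    for transaction in transactions:
        items = sorted(set(transaction))
        for k in range(1, max_size + 1):
            for itemset in combinations(items, k):
                counts[itemset] += 1
    return counts


def aggregate(partials):
    """Reduce step: sum the partial counts; the result is order-independent."""
    total = Counter()
    for partial in partials:
        total.update(partial)
    return total


def frequent_itemsets(partitions, min_support, max_size=2):
    """Keep the itemsets whose global count reaches min_support."""
    total = aggregate(local_counts(p, max_size) for p in partitions)
    return {s: c for s, c in total.items() if c >= min_support}
```

Because the aggregation is a plain sum, partitions can be processed in any order and their results combined pairwise, which is what makes this style of computation fit a MapReduce reducer or combiner.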

Cloud Computing Platforms for Big Data Adoption and Analytics

  • Hussain, Mohammad Jabed;Alsadie, Deafallah
    • International Journal of Computer Science & Network Security / v.22 no.2 / pp.290-296 / 2022
  • Big Data is a data analysis technology enabled by recent advances in technology and architecture. However, big data requires a huge commitment of hardware and processing resources, which makes the adoption costs of big data technology prohibitive for small and medium-sized organizations. Cloud computing offers the promise of big data implementation to small and medium-sized organizations. Big Data processing is performed through a programming paradigm known as MapReduce. Typically, implementing the MapReduce paradigm requires network-attached storage and parallel processing. The computing needs of MapReduce programming are often beyond what small and medium-sized businesses can commit. Cloud computing is on-demand network access to computing resources provided by an external entity. Common deployment models for cloud computing include platform as a service (PaaS), software as a service (SaaS), infrastructure as a service (IaaS), and hardware as a service (HaaS).

Parallel Evolution Strategy Using an Extended MapReduce (확장된 MapReduce를 이용한 병렬 진화 전략)

  • Choi, Hyun Hwa;Lee, Mi Young;Lee, Kyu Chul
    • Proceedings of the Korea Information Processing Society Conference / 2009.11a / pp.97-98 / 2009
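  • An evolution strategy is one of the population-based combinatorial optimization algorithms that solve complex problems by modeling biological evolutionary processes such as reproduction, mutation, and recombination. Because evolution strategies are data-intensive and time-consuming, they are a representative example of a workload well suited to being offered as an IT service under cloud computing. This paper therefore proposes a method for running evolution strategies by extending MapReduce, a programming model that makes it easy to develop parallel processing applications in distributed environments.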

VotingRank: A Case Study of e-Commerce Recommender Application Using MapReduce

  • Ren, Jian-Ji;Lee, Jae-Kee
    • Proceedings of the Korea Information Processing Society Conference / 2009.04a / pp.834-837 / 2009
  • There is a growing need for ad-hoc analysis of extremely large data sets, especially at e-Commerce companies that depend on recommender applications. Nowadays, as the number of e-Commerce web pages grows to tremendous proportions, vertical recommender services can help customers find what they need. Recommender applications are one of the reasons for e-Commerce success in today's world. Compared with general e-Commerce recommender application, obviously, general e-Commerce recommender application's processing scope is greatly narrowed down. MapReduce is emerging as an important programming model for large-scale data-parallel applications such as web indexing, data mining, and scientific simulation. The objective of this paper is to explore the MapReduce framework for the e-Commerce recommender application, based on major general and dedicated link analysis for e-Commerce recommender applications; as a result, the response time decreased and the recommender application's accuracy improved.

Pre-processing of Depth map for Multi-view Stereo Image Synthesis (다시점 영상 합성을 위한 깊이 정보의 전처리)

  • Seo Kwang-Wug;Han Chung-Shin;Yoo Ji-Sang
    • Journal of Broadcast Engineering / v.11 no.1 s.30 / pp.91-99 / 2006
  • Pre-processing is an image processing technique used to enhance image quality or to convert a given image into another form suited to a specific purpose. An 8-bit depth map obtained by a depth camera usually contains many noisy components caused by the characteristics of the depth camera, and its edges are more distorted by the quality of the source object and the illumination conditions than the edges in an RGB texture image. Noise removal filters are used to reduce this distortion, but they only reduce noise components, so distorted edges of the depth map cannot be properly recovered. In this paper, we propose an algorithm that reduces noise components and also enhances the quality of the depth map's edges by using edges in the RGB texture. Consequently, we can reduce errors in the multi-view stereo image synthesis process.

A GPU-enabled Face Detection System in the Hadoop Platform Considering Big Data for Images (이미지 빅데이터를 고려한 하둡 플랫폼 환경에서 GPU 기반의 얼굴 검출 시스템)

  • Bae, Yuseok;Park, Jongyoul
    • KIISE Transactions on Computing Practices / v.22 no.1 / pp.20-25 / 2016
  • With the advent of the era of digital big data, the Hadoop platform has become widely used in various fields. However, when processing a large number of small files, the Hadoop MapReduce framework suffers from increased main memory usage on the name node and an increased number of map tasks. In addition, a method for running C++-based tasks in the MapReduce framework is required in order to use GPUs, which support hardware-based data parallelism, within that framework. Therefore, in this paper, we present a face detection system that generates a sequence file from images in order to process image big data in the Hadoop platform. The system also handles GPU-based face detection tasks in the MapReduce framework using Hadoop Pipes. We demonstrate a performance increase of around 6.8-fold compared with a single CPU process.