• Title/Summary/Keyword: data scalability

574 search results (processing time: 0.022 s)

Hadoop Based Wavelet Histogram for Big Data in Cloud

  • Kim, Jeong-Joon
    • Journal of Information Processing Systems
    • /
    • Vol. 13, No. 4
    • /
    • pp.668-676
    • /
    • 2017
  • The importance of big data has recently grown with the spread of smartphones and web/SNS services. As a result, MapReduce, which can process big data efficiently, is receiving worldwide attention for its excellent scalability and stability. Because big data is large in volume, generated quickly, and varied in its properties, it is more efficient to process summary information about the data than the data itself. The wavelet histogram, a typical technique for generating data summaries, can produce optimal summary information without losing information from the original data. Systems that apply MapReduce-based wavelet histogram generation have therefore been actively studied. However, existing approaches are slow because they generate the wavelet histogram through one or more MapReduce jobs, and the error of the data restored from the wavelet histogram is likely to be large. In contrast, the MapReduce-based wavelet histogram generation system developed in this paper generates the wavelet histogram in a single MapReduce job, which greatly increases generation speed. In addition, because the wavelet histogram is generated according to an error bound specified by the user, the error of the data restored from the wavelet histogram can be controlled. Finally, we verified the efficiency of the developed system through performance evaluation.
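The wavelet histogram idea summarized above can be illustrated with a minimal single-machine sketch. It is not the paper's MapReduce job: it assumes an unnormalized Haar decomposition of a frequency vector whose length is a power of two, and it assumes a simple thresholding rule (drop detail coefficients no larger than the error bound divided by the number of levels) so that every restored value stays within the user-specified bound. The function names `haar_transform`, `inverse_haar`, and `summarize` are hypothetical.

```python
def haar_transform(values):
    """Unnormalized Haar decomposition: [overall average, details coarse -> fine]."""
    data = list(values)
    n = len(data)
    assert n and n & (n - 1) == 0, "length must be a power of two"
    details = []
    while len(data) > 1:
        averages = [(data[i] + data[i + 1]) / 2 for i in range(0, len(data), 2)]
        diffs = [(data[i] - data[i + 1]) / 2 for i in range(0, len(data), 2)]
        details = diffs + details  # coarser levels are placed in front
        data = averages
    return data + details


def inverse_haar(coeffs):
    """Rebuild the frequency vector from the coefficients of haar_transform."""
    data = list(coeffs[:1])
    pos = 1
    while pos < len(coeffs):
        diffs = coeffs[pos:pos + len(data)]
        # Each (average, detail) pair expands into two neighbouring values.
        data = [v for a, d in zip(data, diffs) for v in (a + d, a - d)]
        pos += len(diffs)
    return data


def summarize(values, error_bound):
    """Zero out small detail coefficients (assumed rule, not the paper's).

    Each restored value is the average plus or minus one detail per level,
    so dropping only details with |c| <= error_bound / levels keeps the
    maximum absolute error within error_bound.
    """
    coeffs = haar_transform(values)
    levels = max(1, len(values).bit_length() - 1)
    threshold = error_bound / levels
    return [c if i == 0 or abs(c) > threshold else 0.0
            for i, c in enumerate(coeffs)]


if __name__ == "__main__":
    freq = [2, 2, 0, 2, 3, 5, 4, 4]          # frequency of each key
    summary = summarize(freq, error_bound=1.5)
    restored = inverse_haar(summary)
    print(summary)
    print(max(abs(a - b) for a, b in zip(freq, restored)))
```

A tighter error bound keeps more coefficients and a looser one keeps fewer, which is the trade-off between summary size and restoration error that the abstract describes.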

A Visualization System for Multiple Heterogeneous Network Security Data and Fusion Analysis

  • Zhang, Sheng;Shi, Ronghua;Zhao, Jue
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • Vol. 10, No. 6
    • /
    • pp.2801-2816
    • /
    • 2016
  • Owing to their low scalability, weak support for big data, insufficient collaborative data analysis, and inadequate situational awareness, traditional methods fail to meet the needs of security data analysis. This paper proposes visualization methods that fuse multi-source security data to grasp the network situation. First, data sources are classified by their collection positions, with the security data objects taken from three different layers. Second, a Heatmap is adopted to show host status, a Treemap is used to visualize Netflow logs, and a radial Node-link diagram is employed to express IPS logs. Finally, a Labeled Treemap is devised to perform fusion at the data level, and time-series features are extracted to fuse data at the feature level. Comparative analyses with prize-winning works show that this method offers substantial advantages: it helps network analysts fuse data features and better understand the network security situation in a unified, convenient, and accurate way.

A Study on the Development of Platform-based MyData Service in Financial Industry

  • 최재섭;차상훈;최정일
    • 한국IT서비스학회지
    • /
    • Vol. 22, No. 1
    • /
    • pp.29-42
    • /
    • 2023
  • Amid the global movement to harness individual data and boost the data economy, MyData services that utilize personal data are being implemented in earnest in Korea's financial sector, driven by the government's active encouragement policy. To this end, MyData service providers need a service system that, with individual consent, collects and efficiently loads personal information scattered across various financial institutions, then comprehensively analyzes and provides it. The system must not only have strict security management capabilities but also be built in a flexible form that accommodates future data scalability and additional services. This paper proposes implementing, in the form of a platform, the essential functions that a MyData service system must have and the core functions that manage the entire data life cycle, from collection and distribution to disposal. In addition, the strengths of the platform structure were reviewed, and the effectiveness of the platform model was examined upon application.

Big data platform for health monitoring systems of multiple bridges

  • Wang, Manya;Ding, Youliang;Wan, Chunfeng;Zhao, Hanwei
    • Structural Monitoring and Maintenance
    • /
    • Vol. 7, No. 4
    • /
    • pp.345-365
    • /
    • 2020
  • At present, many machine learning and data mining methods are used to analyze and predict structural response characteristics. However, a platform that combines big data analysis methods with online and offline analysis modules has not yet been used in actual projects. This work develops a multifunctional Hadoop-Spark big data platform that monitors and evaluates the serviceability of bridges based on a structural health monitoring system. It provides rapid processing, analysis, and storage of the collected health monitoring data. The platform contains offline computing and online analysis modules running in a Hadoop-Spark environment. Hadoop provides the overall framework and storage subsystem for the big data platform, while Spark is used for online computing. Finally, the computational performance of the Hadoop-Spark big data platform is verified through several actual analysis tasks. Experiments show that the platform has good fault tolerance, scalability, and online analysis performance. It can meet the daily analysis requirements of 5s/time for one bridge and 40s/time for 100 bridges.

A Sensing Data Collection Strategy in Software-Defined Mobile-Edge Vehicular Networks (SDMEVN)

  • 라이오넬;장종욱
    • 한국정보통신학회:학술대회논문집
    • /
    • 한국정보통신학회 2018 Fall Conference
    • /
    • pp.62-65
    • /
    • 2018
  • This paper studies a sensing data collection strategy in Software-Defined Mobile Edge vehicular networking. The two cooperative data dissemination modes are the direct vehicular cloud mode and the edge cell trajectory prediction decision mode. In the direct vehicular cloud mode, a vehicle observes its neighboring vehicles and sets up a vehicular cloud for cooperative sensing data collection. The data collection output can be transmitted through V2V communication from the vehicles participating in the cooperative sensing data collection computation to the vehicle from which the sensing data collection request originated. The vehicle from which the computation originated reassembles the computation output and sends it to the closest RSU. The SDMEVN (Software Defined Mobile Edge Vehicular Network) Controller determines how much effort the sensing data collection request requires and calculates the number of RSUs required to support coverage from one RSU to another. We set up a simulation scenario based on realistic traffic and communication features and demonstrate the scalability of the proposed solution.

Implementation of a P2P-based Data Sharing System (JDSS) using JXTA

  • 양광민;주형렬
    • Journal of Information Technology Applications and Management
    • /
    • Vol. 10, No. 3
    • /
    • pp.1-22
    • /
    • 2003
  • P2P systems have been studied by many researchers in universities and commercial firms in recent years. In this study, we design and implement a system that makes up for the shortcomings of the currently available P2P systems Gnutella and Napster. The study also includes an efficiency analysis based on a series of experimental data. The data sharing system of the study demonstrates the dual roles (client, service) of peers, although these roles are distinct from those in existing client-server systems. The study also implements a mechanism that exposes the redundancy of data so that peers can communicate efficiently when transferring data. Performance measurements of the system show that the amount of information shared by peers increases as the number of peers increases, with no significant increase in response time. This constant response time is far more stable and faster than that of current file sharing systems such as Gnutella and Napster. Business applications such as knowledge management, enterprise information portal management, and data transfer run on supercomputers. These systems must be extended with more capacity and throughput as the number of clients increases. Moreover, they face more complicated problems when integration with new systems is required. If JDSS is introduced to these business applications, it could easily increase the scalability of the system with high performance at lower cost.

Design and Development of Framework for Wireless Data Broadcast of XML-based CCR Documents

  • 임석진;황희정
    • 한국인터넷방송통신학회논문지
    • /
    • Vol. 15, No. 5
    • /
    • pp.169-175
    • /
    • 2015
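  • In medical information technology, where ICT and medical technology converge, XML-based CCR documents ensure the continuity and portability of patient data. When a large number of clients access CCR documents simultaneously, wireless data broadcast, whose scalability allows data to be distributed regardless of the number of clients, is one alternative for efficient information delivery. This paper designs and implements a wireless data broadcast simulation framework for delivering CCR documents over wireless data broadcast. The framework applies various CCR document scheduling and indexing schemes so that a broadcast environment achieving optimal client performance can be set up. Simulations that apply various scheduling schemes and indexes to the implemented framework demonstrate the efficiency of the proposed framework.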

Study of Data Placement Schemes for SNS Services in Cloud Environment

  • Chen, Yen-Wen;Lin, Meng-Hsien;Wu, Min-Yan
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • Vol. 9, No. 8
    • /
    • pp.3203-3215
    • /
    • 2015
  • Due to the rapid growth of the SNS population, service scalability is one of the critical issues to be addressed. The cloud environment provides flexible computing and storage resources for service deployment, which fits the characteristics of scalable SNS deployment. However, if SNS-related information is not properly placed, it causes unbalanced load and heavy transmission cost on the storage virtual machines (VMs) and the cloud data center (CDC) network. In this paper, we characterize the SNS as a graph model based on users' associations and interest correlations. The node weight represents the degree of association, which can be indexed by the number of friends or data sources, and the link weight denotes the correlation between users/data sources. Then, based on the SNS graph, a two-step algorithm is proposed to determine the placement of SNS-related data among VMs. Two k-means based clustering schemes are proposed to allocate social data to the proper VMs and physical servers for pre-configured VM and dynamic VM environments, respectively. An experimental example was conducted to illustrate and compare the performance of the proposed schemes.
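To make the graph model in this abstract concrete, the following sketch scores a candidate data placement on a weighted SNS graph. It does not reproduce the paper's two-step algorithm or its k-means schemes. The cost definitions are assumptions for illustration: per-VM load is taken as the sum of node weights placed on that VM, and transmission cost as the sum of link weights whose endpoints sit on different VMs. The function name `evaluate_placement` and the sample data are hypothetical.

```python
from collections import defaultdict


def evaluate_placement(node_weight, link_weight, placement):
    """Return (load per VM, cross-VM link weight) for a placement.

    node_weight: user -> weight (e.g. number of friends or data sources)
    link_weight: (user, user) -> correlation between the two users
    placement:   user -> VM identifier
    Both cost definitions are illustrative assumptions.
    """
    load = defaultdict(float)
    for user, weight in node_weight.items():
        load[placement[user]] += weight
    # Only links that cross a VM boundary generate network traffic here.
    cross = sum(weight for (u, v), weight in link_weight.items()
                if placement[u] != placement[v])
    return dict(load), cross


if __name__ == "__main__":
    nodes = {"a": 3, "b": 1, "c": 2, "d": 2}
    links = {("a", "b"): 5, ("b", "c"): 1, ("c", "d"): 4}
    clustered = {"a": "vm1", "b": "vm1", "c": "vm2", "d": "vm2"}
    scattered = {"a": "vm1", "b": "vm2", "c": "vm1", "d": "vm2"}
    print(evaluate_placement(nodes, links, clustered))
    print(evaluate_placement(nodes, links, scattered))
```

In this toy graph, keeping strongly correlated users on the same VM gives both a balanced load and a low cross-VM weight, which is the kind of outcome a clustering-based placement aims for.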

QoS-aware Data Delivery Infrastructure for IoT Computing Environments

  • 이윤석
    • 디지털콘텐츠학회 논문지
    • /
    • Vol. 19, No. 2
    • /
    • pp.407-413
    • /
    • 2018
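  • With the recent advance of Internet of Things (IoT) technology, a new computing environment composed of numerous sensors and small actuators has emerged. This paper proposes a scalable data delivery infrastructure that serves as a sharing platform, letting data providers and consumers easily share and access sensing data in such IoT-based computing environments. To provide scalability and efficiency, the paper presents a method for constructing delivery paths that use transmission bandwidth effectively by exploiting the differing QoS requirements among consumers. Because delivery path construction and reconstruction is judged to be the largest overhead of the proposed architecture, we carried out basic experiments to estimate its cost. The results confirm that the overhead is relatively small compared with the good scalability of the proposed architecture.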

A Multi-dimensional Query Processing Scheme for Stream Data using Range Query Indexing

  • 이동언;이윤석
    • 한국컴퓨터정보학회논문지
    • /
    • Vol. 14, No. 2
    • /
    • pp.69-77
    • /
    • 2009
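  • In stream service environments, real-time query processing is required to search massive amounts of continuously arriving data for desired conditions. Existing R-tree-based query processing techniques must repeat the same search over the entire tree for every event, and therefore cannot handle this load efficiently. Meanwhile, most stream data, including sensor measurements, exhibit very high locality, which can be exploited to greatly reduce the search space. This study therefore proposes a query processing scheme suited to stream environments that exploits the locality of stream data. The framework is also expected to enable the development of the various query processing services that applications require in stream environments. Experimental results obtained by applying the prototype system implemented in this study to a stream environment confirm that it is better suited to stream environments than existing query processing techniques and greatly improves efficiency.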
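The locality idea in this abstract can be sketched in one dimension. This is not the paper's index. It assumes half-open range queries `[low, high)` over a single attribute, splits the axis into elementary segments at the query endpoints, and remembers the segment hit by the previous event, so a new event that falls in the same segment skips the search entirely. The class name `RangeQueryIndex` and its `full_searches` counter are hypothetical.

```python
import bisect


class RangeQueryIndex:
    """1-D range query matcher that reuses the previous event's segment."""

    def __init__(self, queries):
        # queries: query id -> (low, high), matching low <= value < high
        self.bounds = sorted({b for low, high in queries.values()
                              for b in (low, high)})
        # Segment i covers [bounds[i], bounds[i + 1]); precompute its matches.
        self.matches = [
            frozenset(q for q, (low, high) in queries.items()
                      if low <= self.bounds[i] and self.bounds[i + 1] <= high)
            for i in range(len(self.bounds) - 1)
        ]
        self.last = None         # segment index hit by the previous event
        self.full_searches = 0   # how often the cached segment was not reusable

    def match(self, value):
        i = self.last
        if i is None or not (self.bounds[i] <= value < self.bounds[i + 1]):
            # Locality miss: fall back to a binary search over all segments.
            self.full_searches += 1
            i = bisect.bisect_right(self.bounds, value) - 1
            if i < 0 or i >= len(self.matches):
                self.last = None
                return frozenset()
            self.last = i
        return self.matches[i]


if __name__ == "__main__":
    index = RangeQueryIndex({"q1": (0, 10), "q2": (5, 20)})
    for reading in [6, 7, 8, 12]:
        print(reading, sorted(index.match(reading)))
    print("full searches:", index.full_searches)
```

With slowly changing sensor readings, most events land in the cached segment, so the number of full searches stays well below the number of events.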