• Title/Summary/Keyword: data scalability

Search Result 579, Processing Time 0.032 seconds

Dynamic Cluster Management of Hadoop Distributed Filesystem (하둡 분산 파일시스템의 동적 클러스터 관리 기법)

  • Ryu, Wooseok
    • Proceedings of the Korean Institute of Information and Commucation Sciences Conference
    • /
    • 2016.10a
    • /
    • pp.435-437
    • /
    • 2016
  • Hadoop Distributed File System(HDFS) is a file system for distributed processing of big data by replicating data to distributed data nodes. HDFS cluster shows a great scalability up to thousands of nodes, but it assumes a exclusive node cluster with numerous nodes for the big data processing. Various operational-purpose worker systems used by office are hardly considered as a part of cluster. This paper discusses this problem and proposes a dynamic cluster management technique to increase storage capability and analytic performance of hadoop cluster. The propsed technique can add legacy systems to the cluster and can remove them from the cluster dynamically depending on their availability.

  • PDF

Controller Backup and Replication for Reliable Multi-domain SDN

  • Mao, Junli;Chen, Lishui;Li, Jiacong;Ge, Yi
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.14 no.12
    • /
    • pp.4725-4747
    • /
    • 2020
  • Software defined networking (SDN) is considered to be one of the most promising paradigms in the future. To solve the scalability and performance problem that a single and centralized controller suffers from, the distributed multi-controller architecture is adopted, thus forms multi-domain SDN. In a multi-domain SDN network, it is of great importance to ensure a reliable control plane. In this paper, we focus on the reliability problem of multi-domain SDN against controller failure from perspectives of backup controller deployment and controller replication. We firstly propose a placement algorithm for backup controllers, which considers both the reliability and the cost factors. Then a controller replication mechanism based on shared data storage is proposed to solve the inconsistency between the active and standby controllers. We also propose a shared data storage layout method that considers both reliability and performance. Besides, a fault recovery and repair process is designed based on the controller backup and shared data storage mechanism. Simulations show that our approach can recover and repair controller failure. Evaluation results also show that the proposed backup controller placement approach is more effective than other methods.

Knowledge-guided artificial intelligence technologies for decoding complex multiomics interactions in cells

  • Lee, Dohoon;Kim, Sun
    • Clinical and Experimental Pediatrics
    • /
    • v.65 no.5
    • /
    • pp.239-249
    • /
    • 2022
  • Cells survive and proliferate through complex interactions among diverse molecules across multiomics layers. Conventional experimental approaches for identifying these interactions have built a firm foundation for molecular biology, but their scalability is gradually becoming inadequate compared to the rapid accumulation of multiomics data measured by high-throughput technologies. Therefore, the need for data-driven computational modeling of interactions within cells has been highlighted in recent years. The complexity of multiomics interactions is primarily due to their nonlinearity. That is, their accurate modeling requires intricate conditional dependencies, synergies, or antagonisms between considered genes or proteins, which retard experimental validations. Artificial intelligence (AI) technologies, including deep learning models, are optimal choices for handling complex nonlinear relationships between features that are scalable and produce large amounts of data. Thus, they have great potential for modeling multiomics interactions. Although there exist many AI-driven models for computational biology applications, relatively few explicitly incorporate the prior knowledge within model architectures or training procedures. Such guidance of models by domain knowledge will greatly reduce the amount of data needed to train models and constrain their vast expressive powers to focus on the biologically relevant space. Therefore, it can enhance a model's interpretability, reduce spurious interactions, and prove its validity and utility. Thus, to facilitate further development of knowledge-guided AI technologies for the modeling of multiomics interactions, here we review representative bioinformatics applications of deep learning models for multiomics interactions developed to date by categorizing them by guidance mode.

Batch Processing Algorithm for Moving k-Farthest Neighbor Queries in Road Networks (도로망에서 움직이는 k-최원접 이웃 질의를 위한 일괄 처리 알고리즘)

  • Cho, Hyung-Ju
    • Proceedings of the Korean Society of Computer Information Conference
    • /
    • 2021.07a
    • /
    • pp.223-224
    • /
    • 2021
  • Recently, k-farthest neighbor (kFN) queries have not as much attention as k-nearest neighbor (kNN) queries. Therefore, this study considers moving k-farthest neighbor (MkFN) queries for spatial network databases. Given a positive integer k, a moving query point q, and a set of data points P, MkFN queries can constantly retrieve k data points that are farthest from the query point q. The challenge with processing MkFN queries in spatial networks is to avoid unnecessary or superfluous distance calculations between the query and associated data points. This study proposes a batch processing algorithm, called MOFA, to enable efficient processing of MkFN queries in spatial networks. MOFA aims to avoid dispensable distance computations based on the clustering of both query and data points. Moreover, a time complexity analysis is presented to clarify the effect of the clustering method on the query processing time. Extensive experiments using real-world roadmaps demonstrated the efficiency and scalability of the MOFA when compared with a conventional solution.

  • PDF

Data Level Parallelism for H.264/AVC Decoder on a Multi-Core Processor and Performance Analysis (멀티코어 프로세서에서의 H.264/AVC 디코더를 위한 데이터 레벨 병렬화 성능 예측 및 분석)

  • Cho, Han-Wook;Jo, Song-Hyun;Song, Yong-Ho
    • Journal of the Institute of Electronics Engineers of Korea SD
    • /
    • v.46 no.8
    • /
    • pp.102-116
    • /
    • 2009
  • There have been lots of researches for H.264/AVC performance enhancement on a multi-core processor. The enhancement has been performed through parallelization methods. Parallelization methods can be classified into a task-level parallelization method and a data level parallelization method. A task-level parallelization method for H.264/AVC decoder is implemented by dividing H.264/AVC decoder algorithms into pipeline stages. However, it is not suitable for complex and large bitstreams due to poor load-balancing. Considering load-balancing and performance scalability, we propose a horizontal data level parallelization method for H.264/AVC decoder in such a way that threads are assigned to macroblock lines. We develop a mathematical performance expectation model for the proposed parallelization methods. For evaluation of the mathematical performance expectation, we measured the performance with JM 13.2 reference software on ARM11 MPCore Evaluation Board. The cycle-accurate measurement with SoCDesigner Co-verification Environment showed that expected performance and performance scalability of the proposed parallelization method was accurate in relatively high level

A Performance Improvement Scheme for a Wireless Internet Proxy Server Cluster (무선 인터넷 프록시 서버 클러스터 성능 개선)

  • Kwak, Hu-Keun;Chung, Kyu-Sik
    • Journal of KIISE:Information Networking
    • /
    • v.32 no.3
    • /
    • pp.415-426
    • /
    • 2005
  • Wireless internet, which becomes a hot social issue, has limitations due to the following characteristics, as different from wired internet. It has low bandwidth, frequent disconnection, low computing power, and small screen in user terminal. Also, it has technical issues to Improve in terms of user mobility, network protocol, security, and etc. Wireless internet server should be scalable to handle a large scale traffic due to rapidly growing users. In this paper, wireless internet proxy server clusters are used for the wireless Internet because their caching, distillation, and clustering functions are helpful to overcome the above limitations and needs. TranSend was proposed as a clustering based wireless internet proxy server but it has disadvantages; 1) its scalability is difficult to achieve because there is no systematic way to do it and 2) its structure is complex because of the inefficient communication structure among modules. In our former research, we proposed the All-in-one structure which can be scalable in a systematic way but it also has disadvantages; 1) data sharing among cache servers is not allowed and 2) its communication structure among modules is complex. In this paper, we proposed its improved scheme which has an efficient communication structure among modules and allows data to be shared among cache servers. We performed experiments using 16 PCs and experimental results show 54.86$\%$ and 4.70$\%$ performance improvement of the proposed system compared to TranSend and All-in-one system respectively Due to data sharing amount cache servers, the proposed scheme has an advantage of keeping a fixed size of the total cache memory regardless of cache server numbers. On the contrary, in All-in-one, the total cache memory size increases proportional to the number of cache servers since each cache server should keep all cache data, respectively.

Fat Client-Based Abstraction Model of Unstructured Data for Context-Aware Service in Edge Computing Environment (에지 컴퓨팅 환경에서의 상황인지 서비스를 위한 팻 클라이언트 기반 비정형 데이터 추상화 방법)

  • Kim, Do Hyung;Mun, Jong Hyeok;Park, Yoo Sang;Choi, Jong Sun;Choi, Jae Young
    • KIPS Transactions on Computer and Communication Systems
    • /
    • v.10 no.3
    • /
    • pp.59-70
    • /
    • 2021
  • With the recent advancements in the Internet of Things, context-aware system that provides customized services become important to consider. The existing context-aware systems analyze data generated around the user and abstract the context information that expresses the state of situations. However, these datasets is mostly unstructured and have difficulty in processing with simple approaches. Therefore, providing context-aware services using the datasets should be managed in simplified method. One of examples that should be considered as the unstructured datasets is a deep learning application. Processes in deep learning applications have a strong coupling in a way of abstracting dataset from the acquisition to analysis phases, it has less flexible when the target analysis model or applications are modified in functional scalability. Therefore, an abstraction model that separates the phases and process the unstructured dataset for analysis is proposed. The proposed abstraction utilizes a description name Analysis Model Description Language(AMDL) to deploy the analysis phases by each fat client is a specifically designed instance for resource-oriented tasks in edge computing environments how to handle different analysis applications and its factors using the AMDL and Fat client profiles. The experiment shows functional scalability through examples of AMDL and Fat client profiles targeting a vehicle image recognition model for vehicle access control notification service, and conducts process-by-process monitoring for collection-preprocessing-analysis of unstructured data.

Performance evaluation of approximate frequent pattern mining based on probabilistic technique (확률 기법에 기반한 근접 빈발 패턴 마이닝 기법의 성능평가)

  • Pyun, Gwangbum;Yun, Unil
    • Journal of Internet Computing and Services
    • /
    • v.14 no.1
    • /
    • pp.63-69
    • /
    • 2013
  • Approximate Frequent pattern mining is to find approximate patterns, not exact frequent patterns with tolerable variations for more efficiency. As the size of database increases, much faster mining techniques are needed to deal with huge databases. Moreover, it is more difficult to discover exact results of mining patterns due to inherent noise or data diversity. In these cases, by mining approximate frequent patterns, more efficient mining can be performed in terms of runtime, memory usage and scalability. In this paper, we study the characteristics of an approximate mining algorithm based on probabilistic technique and run performance evaluation of the efficient approximate frequent pattern mining algorithm. Finally, we analyze the test results for more improvement.

Scrambling Technology using Scalable Encryption in SVC (SVC에서 스케일러블 암호화를 이용한 스크램블링 기술)

  • Kwon, Goo-Rak
    • Journal of Korea Multimedia Society
    • /
    • v.13 no.4
    • /
    • pp.575-581
    • /
    • 2010
  • With widespread use of the Internet and improvements in streaming media and compression technology, digital music, video, and image can be distributed instantaneously across the Internet to end-users. However, most conventional Digital Right Management are often not secure and not fast enough to process the vast amount of data generated by the multimedia applications to meet the real-time constraints. The SVC offers temporal, spatial, and SNR scalability to varying network bandwidth and different application needs. Meanwhile, for many multimedia services, security is an important component to restrict unauthorized content access and distribution. This suggests the need for new cryptography system implementations that can operate at SVC. In this paper, we propose a new scrambling encryption for reserving the characteristic of scalability in MPEG4-SVC. In the base layer, the proposed algorithm is applied and performed the selective scambling. And it encrypts various MVS and intra-mode scrambling in the enhancement layer. In the decryption, it decrypts each encrypted layers by using another encrypted keys. Throughout the experimental results, the proposed algorithms have low complexity in encryption and the robustness of communication errors.

Modeling and Analysis of Vehicle Detection Using Roadside Ultrasonic Sensors in Wireless Sensor Networks (WSN 기반 노변 초음파 센서를 이용한 차량인식에 대한 모델링 및 분석)

  • Jo, Youngtae;Jung, Inbum
    • Journal of KIISE
    • /
    • v.41 no.10
    • /
    • pp.745-761
    • /
    • 2014
  • To address the problems of existing traffic information acquisition systems such as high cost and low scalability, wireless sensor networks (WSN)-based traffic information acquisition systems have been studied. WSN-based systems have many benefits including high scalability and low maintenance cost. Recently, various sensors are studied for traffic surveillance based on WSN, such as magnetic, acoustic, and accelerometer sensors. However, ultrasonic sensor based systems have not been studied. There are many issues for WSN-based systems, such as battery driven operation and low computing power. Thus, power saving methods and specific algorithms with low complexity are necessary. In this paper, we introduce optimal methodologies for power saving of ultrasonic sensors based on the modeling and analysis in detail. Moreover, a new vehicle detection algorithm for low complexity using ultrasonic data is presented. The proposed methodologies are implemented in a tiny microprocessor. The evaluation results show that our algorithm has high detection accuracy.