• Title/Summary/Keyword: Fault-Tolerance Computing System

Synchronize Ethernet-based Fault Injection Algorithm Implementation for Intelligent Automotive Network (차량용 지능형 네트워크에서의 동기식 이더넷중심 오류 주입 알고리즘 구현)

  • Jang, Eunji;Kim, Inyoung;Lee, Woongjae
    • Journal of Internet Computing and Services / v.17 no.4 / pp.43-50 / 2016
  • In this paper, we propose an Ethernet protocol, which is attracting growing interest for intelligent automotive networks, together with a fault tolerance algorithm for data transfer over it, and we implement and verify both through simulation and experiments. The results demonstrate the usefulness of the system for application to existing automotive communication systems. For the real-time automotive data, we generated random data, packed as payloads in a standard format, to carry out the experiments. Comparing against the performance of existing implemented algorithms, we confirmed the effectiveness of the proposed algorithm over the full range from single data to mixed (hybrid-type) data.

Big data platform for health monitoring systems of multiple bridges

  • Wang, Manya;Ding, Youliang;Wan, Chunfeng;Zhao, Hanwei
    • Structural Monitoring and Maintenance / v.7 no.4 / pp.345-365 / 2020
  • At present, many machine learning and data mining methods are used to analyze and predict structural response characteristics. However, a platform that combines big data analysis methods with online and offline analysis modules has not yet been used in actual projects. This work develops a multifunctional Hadoop-Spark big data platform for monitoring and evaluating the serviceability of bridges based on structural health monitoring systems. It provides rapid processing, analysis, and storage of collected health monitoring data. The platform contains offline computing and online analysis modules in a Hadoop-Spark environment: Hadoop provides the overall framework and storage subsystem, while Spark is used for online computing. The computational performance of the platform is verified through several actual analysis tasks. Experiments show that the platform has good fault tolerance, scalability, and online analysis performance. It can meet daily analysis requirements of 5 s per analysis for one bridge and 40 s per analysis for 100 bridges.

Long-Term Container Allocation via Optimized Task Scheduling Through Deep Learning (OTS-DL) And High-Level Security

  • Muthakshi S;Mahesh K
    • KSII Transactions on Internet and Information Systems (TIIS) / v.17 no.4 / pp.1258-1275 / 2023
  • Cloud computing is a new technology that has adapted to the traditional way of providing services. Service providers are responsible for managing the allocation of resources, and selecting suitable containers and bandwidth for job scheduling has been a challenging task for them. Several existing systems have introduced algorithms for resource allocation. To overcome these challenges, the proposed system introduces an Optimized Task Scheduling Algorithm with Deep Learning (OTS-DL). When a job is assigned to a Cloud Service Provider (CSP), the containers are allocated automatically. The article classifies containers as 'Long-Term Container (LTC)' and 'Short-Term Container (STC)' for resource allocation. To maximize resource utilisation, the system leverages an 'Optimized Task Scheduling Algorithm' that first checks for micro-task and macro-task dependencies. The bottleneck task is then chosen and acted upon accordingly. Further, the system applies 'Deep Learning' (DL) to implement all the progressive steps of job scheduling in the cloud. To overcome container attacks and errors, the system also formulates a Container Convergence (Fault Tolerance) theory with high-level security. The results demonstrate that the optimization algorithm used is more effective for implementing complete resource allocation and for solving the large-scale optimization problem of resource allocation and security issues.

Autonomic Self Healing-Based Load Assessment for Load Division in OKKAM Backbone Cluster

  • Chaudhry, Junaid Ahsenali
    • Journal of Information Processing Systems / v.5 no.2 / pp.69-76 / 2009
  • Self-healing systems are considered a cognition-enabled subform of fault tolerance systems. However, the experiments we report in this paper show that self-healing systems can also be used for performance optimization, configuration management, access control management, and a number of other functions. The exponential complexity that results from interaction between autonomic systems and users (software and human users) has long hindered the deployment and use of intelligent systems. We show that if that complexity is converted into self-growing knowledge (policies in our case), it can make up for the initial development cost of building an intelligent system. In this paper, we report the application of AHSEN (Autonomic Healing-based Self management Engine) to the OKKAM Project infrastructure backbone cluster, which mimics the web service based architecture of the u-Zone gateway infrastructure. 'Blind' load division on a per-request basis is not optimal for a distributed, performance-hungry infrastructure such as OKKAM. The approach adopted assesses the active threads on the virtual machine and produces resource estimates for active processes. The availability of a given server is represented through worker modules at the load server. Our simulation results on the OKKAM infrastructure show that self-healing significantly improves performance and clearly demarcates the logical ambiguities in contemporary designs of self-healing infrastructures proposed for large-scale computing infrastructures.

Designing of Network based Tiny Ubiquitous Networked Systems (네트워크 기반의 소형 유비쿼터스 시스템의 개발)

  • Hwang, Kwang-Il;Eom, Doo-Seop
    • Journal of KIISE:Computing Practices and Letters / v.13 no.3 / pp.141-152 / 2007
  • In this paper, we present a network-oriented lightweight real-time system composed of an event-driven operating system called the Embedded Lightweight Operating System (ELOS) and a generic multi-hop ad hoc routing protocol suite. In the ELOS, a conditional preemptive FCFS scheduling method with a guaranteed time slot is designed for efficient real-time processing. For more elaborate configurations, we reinforce fault tolerance by adding semi-automatic configuration using wireless agent nodes. We also introduce the developed hardware platform, a scalable prototype constructed from off-the-shelf components. In addition, to evaluate the performance of the proposed system, we developed a ubiquitous network test-bed on which several experiments were conducted in various environments. The results show that the ELOS is well suited to tiny ubiquitous networked systems with real-time constraints.

An ORB Extension for support of Fault-Tolerant CORBA (고장감내 CORBA를 지원하기 위한 객체중개자의 확장)

  • Shin, Bum-Joo;Son, Duk-Joo;Kim, Myung-Joon
    • Journal of KIISE:Computing Practices and Letters / v.7 no.2 / pp.121-131 / 2001
  • The failure of the network and/or the node on which a server object executes is a single point of system failure in a CORBA application. One possible way to overcome this problem is to replicate server objects on several independent nodes. The replicated objects executing the same tasks are called an object group. To provide fault tolerance for server objects, this paper proposes and implements a new CORBA model that supports object groups based on active replication. The proposed model not only provides interoperability with existing CORBA applications but also minimizes the additional application interface required to support object groups, because it uses IIOP to exchange messages between client and server. This paper also extends the IDL structure, which, depending on the application logic, makes it possible to prevent the performance degradation caused by consistency maintenance. At present, the model supports only active replication, but it can easily be extended to provide warm and/or cold passive replication without modifying the architecture required for active replication.

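The active replication idea in the abstract above (every replica in an object group executes the same request, so a failed replica is masked by the others) can be illustrated with a minimal Python sketch. This is not the paper's CORBA implementation: `ObjectGroup`, `invoke`, and the `Counter` replica are hypothetical names introduced only for illustration, and plain in-process method calls stand in for IIOP messages.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


class ObjectGroup:
    """Hypothetical client-side proxy for a group of actively replicated objects."""

    def __init__(self, replicas):
        if not replicas:
            raise ValueError("an object group needs at least one replica")
        self.replicas = list(replicas)

    def invoke(self, method, *args):
        """Send the same request to every replica; return the first successful reply."""
        errors = []
        with ThreadPoolExecutor(max_workers=len(self.replicas)) as pool:
            futures = [pool.submit(getattr(r, method), *args) for r in self.replicas]
            for future in as_completed(futures):
                try:
                    return future.result()
                except Exception as exc:  # a failed replica is masked by the others
                    errors.append(exc)
        raise RuntimeError(f"all {len(errors)} replicas failed")


class Counter:
    """Toy server object; `healthy=False` simulates a crashed node (assumption)."""

    def __init__(self, healthy=True):
        self.healthy = healthy
        self.value = 0

    def add(self, n):
        if not self.healthy:
            raise ConnectionError("replica unreachable")
        self.value += n
        return self.value


group = ObjectGroup([Counter(), Counter(healthy=False), Counter()])
result = group.invoke("add", 5)  # succeeds despite one failed replica
```

Because every healthy replica applies each request, their states stay identical as long as requests are deterministic and delivered in the same order; a real system needs an ordered group communication layer to guarantee that, which this sketch omits.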

Analysis of Available Performance Satisfying Waiting Time Deadline for (n, k)-way Systems (대기시간 데드라인 조건을 고려한(n, k)-way 시스템의 가용 성능 분석)

  • 박기진;김성수
    • Journal of KIISE:Computer Systems and Theory / v.30 no.9 / pp.445-453 / 2003
  • Because cluster systems used for high performance computing consist of a large number of running servers, one has to solve the low availability problems caused by the high chance of server failures. To handle these problems, a precise definition of the available performance of cluster systems is needed, one that represents the availability and performability of the systems simultaneously. Previous research on availability does not account for system performance, such as waiting time and response time, in its definition of availability. In this paper, we propose a new availability metric for (n, k)-way cluster systems, which consist of n primary servers and k backup servers. With this metric, the change in system performance according to arrival rates is captured, and the waiting time of a request can be kept below a certain level. Using various system operating parameters, we calculate the availability and downtime of cluster systems subject to a waiting time deadline.
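To make the link between a waiting time deadline and the number of working servers concrete, the following Python sketch uses the textbook M/M/c queue (Erlang C) as a stand-in. This queueing model is an assumption and is not taken from the paper; the function names and the parameters `deadline` and `max_miss_prob` are hypothetical.

```python
import math


def erlang_c(servers, offered_load):
    """Probability that an arriving request must wait in an M/M/c queue."""
    if offered_load >= servers:
        return 1.0  # unstable queue: every request waits
    # Sum of a^j / j! for j = 0 .. c-1
    partial = sum(offered_load ** j / math.factorial(j) for j in range(servers))
    tail = (offered_load ** servers / math.factorial(servers)
            * servers / (servers - offered_load))
    return tail / (partial + tail)


def prob_wait_exceeds(servers, arrival_rate, service_rate, deadline):
    """P(waiting time > deadline) with `servers` working servers (M/M/c assumption)."""
    offered_load = arrival_rate / service_rate
    if servers <= 0 or offered_load >= servers:
        return 1.0
    decay = math.exp(-(servers * service_rate - arrival_rate) * deadline)
    return erlang_c(servers, offered_load) * decay


def min_servers_for_deadline(arrival_rate, service_rate, deadline, max_miss_prob, n):
    """Smallest number of working servers (at most n) that meets the deadline, or None."""
    for c in range(1, n + 1):
        if prob_wait_exceeds(c, arrival_rate, service_rate, deadline) <= max_miss_prob:
            return c
    return None


# Example: 3 requests/s arrive, each server completes 1 request/s,
# and at most 5% of requests may wait longer than 0.5 s.
needed = min_servers_for_deadline(3.0, 1.0, 0.5, 0.05, n=8)
```

Under these assumptions, a cluster would count as "available" only while at least `needed` servers are working, so the availability figure depends on the arrival rate as well as on failure and repair behavior. Backup servers matter because they restore the working-server count after a primary fails.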

Intelligent Multi-Agent Distributed Platform based on Dynamic Object Group Management using Fk-means (Fk means를 이용한 동적객체그룹관리기반 지능형 멀티 에이전트 분산플랫폼)

  • Lee, Jae-wan;Na, Hye-Young;Mateo, Romeo Mark A.
    • Journal of Internet Computing and Services / v.10 no.1 / pp.101-110 / 2009
  • Multi-agent systems are mostly used to integrate intelligent and distributed approaches into various systems for effective sharing of resources and dynamic system reconfiguration. Object replication is usually used to implement fault tolerance and to handle unexpected system failures. This paper presents an intelligent multi-agent distributed platform based on dynamic object group management and proposes an object search technique based on filtered k-means (Fk-means). We use Fk-means as the search mechanism to find alternative objects in the event of object failures and to transparently reconnect the client to the object. The filtering range of Fk-means is set to include only relevant objects within the group so that the search is performed efficiently. The simulation results show that the proposed mechanism provides fast and accurate search for distributed object groups.


A video transmission system for a high quality and fault tolerance based on multiple paths using TCP/IP (다중 경로를 이용한 TCP/IP 기반 고품질 및 고장 감내 비디오 전송 시스템)

  • Kim, Nam-Su;Lee, Jong-Yeol;Pyun, Kihyun
    • Journal of Internet Computing and Services / v.15 no.6 / pp.1-8 / 2014
  • As e-learning spreads and demand for Internet video services grows, transmitting video data to many users over the Internet is becoming common. To satisfy these needs, the traditional approach uses a tree structure with the video server as the root node. However, this approach risks stopping the video service when even one of the nodes along the path has a problem. In this paper, we propose a video-on-demand service that uses multiple paths. We add new paths both for backup and to speed up transmission of the video data. We show through simulation experiments that our approach provides a high-quality video service.

A Study on the Architecture for Avionics System of Jet Fighters (제트 전투기의 항공전자 시스템 아키텍처에 관한 연구)

  • Gook, Kwon Byeong;Won, Son Il
    • Journal of Aerospace System Engineering / v.16 no.1 / pp.86-96 / 2022
  • The development trend in jet fighter avionics system architecture is toward digitization of subsystem component functions, increased RF sensor sharing, fiber optic channel networks, and modularized integrated structures. The avionics system architecture of the fifth-generation jet fighters (F-22, F-35) has evolved into an integrated modular avionics system based on computing function integration and RF integrated sensor systems. The integrated modular avionics system of a jet fighter should provide improved combat power, fault tolerance, and ease of control. To this end, this paper presents the direction and requirements of the next-generation jet fighter's avionics system architecture through an analysis of the fifth-generation jet fighters' avionics system architecture. The core challenge among the requirements for the integrated modular avionics system architecture of next-generation fighters is to build a platform that integrates major components and sensors into the aircraft. In other words, the architecture of next-generation fighters calls for standardization of systems, sensor integration of each subsystem through open interfaces, integration of functional elements, network integration, and integration of pilots and fighters to improve their ability to respond and control.