• Title/Summary/Keyword: fault detection and recovery

Search Result 40, Processing Time 0.026 seconds

A fault detection and recovery mechanism for the fault-tolerance of a Mini-MAP system (Mini-MAP 시스템의 결함 허용성을 위한 결함 감지 및 복구 기법)

  • Mun, Hong-Ju;Kwon, Wook-Hyun
    • Journal of Institute of Control, Robotics and Systems
    • /
    • v.4 no.2
    • /
    • pp.264-272
    • /
    • 1998
  • This paper proposes a fault detection and recovery mechanism for a fault-tolerant Mini-MAP system, and provides detailed techniques for its implementation. This paper considers the fault-tolerant Mini-MAP system which has dual layer structure from the LLC sublayer down to the physical layer to cope with the faults of those layers. For a good fault detection, a redundant and hierarchical fault supervision architecture is proposed and its implementation technique for a stable detection operation is provided. Information for the fault location is provided from data reported with a fault detection and obtained by an additional network diagnosis. The faults are recovered by the stand-by sparing method applied for a dual network composed of two equivalent networks. A network switch mechanism is proposed to achieve a reliable and stable network function. A fault-tolerant Mini-MAP system is implemented by applying the proposed fault detection and recovery mechanism.

  • PDF

A Fault Detection and Self-Recovery System for Space-Borne Dual Ring Counters (우주용 중복구조 링 카운터를 위한 고장 진단 및 자가 복구 시스템)

  • Kwak, Seong Woo;Yang, Jung-Min
    • The Transactions of The Korean Institute of Electrical Engineers
    • /
    • v.62 no.1
    • /
    • pp.120-126
    • /
    • 2013
  • This paper proposes a novel scheme of fault detection and self-recovery for space-borne dual ring counters subject to transient faults. The considered ring counter is equipped with hardware redundancy, but it has a limited output domain where direct access to the current state is unavailable. We employ the theory of corrective control to detect any transient fault occurring to the counter bits and to realize immediate self-recovery of the ring counter back to the normal state. The structure of the fault-tolerant controller is designed to be minimal regardless of the counter size. To validate the applicability, we implement the proposed system on a commercial FGPA board.

Design and Implementation of a Architecture For Fault-Tolerant and Real-Time System (결함허용 실시간 시스템 구조에 대한 설계 및 구현)

  • 유종상;김범식;신인철
    • Proceedings of the Korea Society for Industrial Systems Conference
    • /
    • 1997.11a
    • /
    • pp.417-433
    • /
    • 1997
  • A real-time operating system has focused primary on techniques to minimize processing time, with a secondary emphasis on system reliability issues. Conversely, fault-tolerant system has concentrated on using recourse and information redundancy to maximize the availability and reliability of the system, with a lesser emphasis on performance. We have developed a fault-tolerant and real-time operations system which support a powerful concurrent runtime environment under the above requirements. In this paper, we present an overview of real-time systems, design and implementation of a duplex architecture using advanced concepts and technologies such as fast " fault detection", "fault isolation" and "fault recovery" Because the duplex architecture has two dentical hardware elements and has several recovery steps and hierarchy to facilitate a fast recovery which must be proceeded by a prompt fault detection and isolation. Thus it makes possible to minimize the overhead of the systems including hardware and software and guarantee the service continuity of he systems.

  • PDF

Analytical fault tolerant navigation system for an aerospace launch vehicle using sliding mode observer

  • Hasani, Mahdi;Roshanian, Jafar;Khoshnooda, A. Majid
    • Advances in aircraft and spacecraft science
    • /
    • v.4 no.1
    • /
    • pp.53-64
    • /
    • 2017
  • Aerospace Launch Vehicles (ALV) are generally designed with high reliability to operate in complete security through fault avoidance practices. However, in spite of such precaution, fault occurring is inevitable. Hence, there is a requirement for on-board fault recovery without significant degradation in the ALV performance. The present study develops an advanced fault recovery strategy to improve the reliability of an Aerospace Launch Vehicle (ALV) navigation system. The proposed strategy contains fault detection features and can reconfigure the system against common faults in the ALV navigation system. For this purpose, fault recovery system is constructed to detect and reconfigure normal navigation faults based on the sliding mode observer (SMO) theory. In the face of pitch channel sensor failure, the original gyro faults are reconstructed using SMO theory and by correcting the faulty measurement, the pitch-rate gyroscope output is constructed to provide fault tolerant navigation solution. The novel aspect of the paper is employing SMO as an online tuning of analytical fault recovery solution against unforeseen variations due to its hardware/software property. In this regard, a nonlinear model of the ALV is simulated using specific navigation failures and the results verified the feasibility of the proposed system. Simulation results and sensitivity analysis show that the proposed techniques can produce more effective estimation results than those of the previous techniques, against sensor failures.

An Architecture-based Multi-level Self-Adaptive Monitoring Method for Software Fault Detection (소프트웨어 오류 탐지를 위한 아키텍처 기반의 다계층적 자가적응형 모니터링 방법)

  • Youn, Hyun-Ji;Park, Soo-Yong
    • Journal of KIISE:Software and Applications
    • /
    • v.37 no.7
    • /
    • pp.568-572
    • /
    • 2010
  • Self-healing is one of the techniques that assure dependability of mission-critical system. Self-healing consists of fault detection and fault recovery and fault detection is important first step that enables fault recovery but it causes overhead. We can detect fault based on model, the detection tasks that notify system's behavior and compare normal behavior model and system's behavior are heavy jobs. In this paper, we propose architecture-based multi-level self-adaptive monitoring method that complements model-based fault detection. The priority of fault detection per component is different in the software architecture. Because the seriousness and the frequency of fault per component are different. If the monitor is adapted to intensive to the component that has high priority of monitoring and loose to the component that has low priority of monitoring, the overhead can be decreased and the efficiency can be maintained. Because the environmental changes of software and the architectural changes bring the changes at the priority of fault detection, the monitor learns the changes of fault frequency and that is adapted to intensive to the component that has high priority of fault detection.

KOMPSAT-2 Fault and Recovery Management

  • Baek, Myung-Jin;Lee, Na-Young;Keum, Jung-Hoon
    • International Journal of Aeronautical and Space Sciences
    • /
    • v.3 no.2
    • /
    • pp.31-39
    • /
    • 2002
  • In this paper, KOMPSAT-2 on-board fault and ground recovery management design is addressesed in terms of hardware and software components which provide failure detection and spacecraft safing for anomalies which threaten spacecraft survival. It also includes ground real time up-commanding operation to recover the system safely. KOMPSAT-2 spacecraft fault and recovery management is designed such that the subsequent system configuration due to system initialization is initiated and controlled by processors. This paper will show that KOMPSAT-2 has a new design feature of CPU SEU mitigation for the possible upsets in the processor CPUs as a part of on-board fault management design. Recovery management of processor switching has two different ways: gang switching and individual switching. This paper will show that the difficulties of using multiple-processor system can be managed by proper design implementation and flight operation.

Design of a Fault-tolerant Embedded Controllerfor Rail-way Signaling Systems

  • Cho, Yong-Gee;Lim, Jae-Sik
    • 제어로봇시스템학회:학술대회논문집
    • /
    • 2002.10a
    • /
    • pp.68.4-68
    • /
    • 2002
  • $\textbullet$ This report presents an implementation a set of reusable software components which use of fault-tolerance embedded controller for railway signalling systems. These components can be used in real-time applications without application reprogramming. $\textbullet$ This library runs under VxWorks operating system and is oriented on real-time embedded systems. The library includes fault detection, fault containment, checkpointing and recovery components. $\textbullet$ The library enables to support high-speed response to fault occurrence in application software. Garbage collector together with VxWorks Watchdog provides both dead tasks detection and useless resources removing to avoid an overflow. Control flow...

  • PDF

A Study on Software Based Fault-Tolerance Techniques for Flight Control Computer (비행조종컴퓨터 소프트웨어 기반 고장허용 설계 기법 연구)

  • Yoon, Hyung-Sik;Kim, Yeon-Gyun
    • Journal of the Korean Society for Aeronautical & Space Sciences
    • /
    • v.44 no.3
    • /
    • pp.256-265
    • /
    • 2016
  • Software based fault tolerance techniques are designed to allow a system to tolerate software faults in the system. Fault tolerance techniques are divided into two groups : software based fault tolerance techniques and hardware based fault tolerance techniques. We need a proper design method according to characteristics of the system. In this paper, the concepts of software based fault tolerance techniques for Dual Flight Control Computer are described. For software based fault tolerance design, we classified software failure, designed a way for failure detection and the way of recovery. Eventually the effectiveness of software based fault tolerance techniques was verified through the Software Test Environment(STE).

Fault Diagnosis and Recovery of a Thermal Error Compensation System in a CNC Machine Tool (CNC 공작기계에서 열변형 오차 보정 시스템의 고장진단 및 복구)

  • 황석현;이진현;양승한
    • Journal of the Korean Society for Precision Engineering
    • /
    • v.17 no.4
    • /
    • pp.135-141
    • /
    • 2000
  • The major role of temperature sensors in thermal error compensation system of machine tools is improving machining accuracy by supplying reliable temperature data on the machine structure. This paper presents a new method for fault diagnosis of temperature sensors and recovery of faulted data to establish the reliability of thermal error compensation system. The detection of fault and its location is based on the correlation coefficients among temperature data from the sensors. The multiple linear regression model which is prepared using complete normal data is also used fur the recovery of faulted data. The effectiveness of this method was tested by comparing the computer simulation results and measured data in a CNC machining center.

  • PDF

OPRoS based Fault Tolerance Support for Reliability of Service Robots (서비스로봇의 신뢰성 향상을 위한 OPRoS 기반 Fault-tolerance 기법)

  • Ahn, Hee-June;Lee, Dong-Su;Ahn, Sang-Chul
    • Journal of Institute of Control, Robotics and Systems
    • /
    • v.16 no.6
    • /
    • pp.601-607
    • /
    • 2010
  • For commercial success of emerging service robots, the fault tolerant technology for system reliability and human safety is crucial. Traditionally fault tolerance methods have been implemented in application level. However, from our studies on the common design patterns in fault tolerance, we argue that a framework-based approach provides many benefits in providing reliability for system development. To demonstrate the benefits, we build a framework-based fault tolerant engine for OPRoS (Open Platform for Robotic Services) standards. The fault manager in framework provides a set of fault tolerant measures of detection, isolation, and recovery. The system integrators choose the appropriate fault handling tools by declaring XML configuration descriptors, considering the constraints of components and operating environment. By building a fault tolerant navigation application from the non-faulttolerant components, we demonstrate the usability and benefits of the proposed framework-based approach.