• Title/Abstract/Keywords: fault detection and recovery

A fault detection and recovery mechanism for the fault-tolerance of a Mini-MAP system

  • 문홍주;권욱현
    • Journal of Institute of Control, Robotics and Systems / Vol. 4, No. 2 / pp.264-272 / 1998
  • This paper proposes a fault detection and recovery mechanism for a fault-tolerant Mini-MAP system and describes techniques for implementing it. The system considered has a dual-layer structure from the LLC sublayer down to the physical layer, so that it can cope with faults in those layers. For reliable fault detection, a redundant, hierarchical fault supervision architecture is proposed, together with an implementation technique for stable detection. Fault location information is derived from the data reported at fault detection and from an additional network diagnosis. Faults are recovered by the stand-by sparing method applied to a dual network composed of two equivalent networks. A network switching mechanism is proposed to achieve reliable and stable network operation. A fault-tolerant Mini-MAP system is implemented by applying the proposed fault detection and recovery mechanism.
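
The stand-by sparing recovery mentioned in this abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the `Network` and `DualNetwork` classes, the `report_fault` method, and the log messages are hypothetical names introduced here, and fault detection itself is assumed to happen elsewhere.

```python
from dataclasses import dataclass, field


@dataclass
class Network:
    """One of two equivalent networks (hypothetical model)."""
    name: str
    healthy: bool = True


@dataclass
class DualNetwork:
    """Stand-by sparing: one network is active, the other is a spare."""
    active: Network
    spare: Network
    log: list = field(default_factory=list)

    def report_fault(self, name: str) -> None:
        """Mark the named network as faulty and switch over if needed."""
        for net in (self.active, self.spare):
            if net.name == name:
                net.healthy = False
        if not self.active.healthy and self.spare.healthy:
            # Swap roles: the healthy spare takes over as the active network.
            self.active, self.spare = self.spare, self.active
            self.log.append(f"switched to {self.active.name}")
        elif not self.active.healthy:
            # Both networks are faulty, so no switch-over is possible.
            self.log.append("no healthy network available")


dual = DualNetwork(Network("A"), Network("B"))
dual.report_fault("A")
print(dual.active.name, dual.log)
```

A fault on the spare alone leaves the active network in place; a switch happens only when the active network is the one reported faulty and the spare is still healthy.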

A Fault Detection and Self-Recovery System for Space-Borne Dual Ring Counters

  • 곽성우;양정민
    • The Transactions of the Korean Institute of Electrical Engineers / Vol. 62, No. 1 / pp.120-126 / 2013
  • This paper proposes a novel scheme of fault detection and self-recovery for space-borne dual ring counters subject to transient faults. The ring counter considered is equipped with hardware redundancy, but it has a limited output domain in which direct access to the current state is unavailable. We employ the theory of corrective control to detect any transient fault occurring in the counter bits and to realize immediate self-recovery of the ring counter to the normal state. The structure of the fault-tolerant controller is designed to be minimal regardless of the counter size. To validate its applicability, we implement the proposed system on a commercial FPGA board.
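
To make the idea of a dual ring counter with transient faults concrete, here is a simplified sketch. It does not implement the corrective control scheme of the paper: it assumes, unlike the paper, that the full state of both counters can be read directly, and it uses a plain one-hot validity check in place of the authors' controller. The function names `step`, `is_one_hot`, and `recover` are hypothetical.

```python
def step(state: int, n: int) -> int:
    """Rotate an n-bit ring counter left by one position."""
    mask = (1 << n) - 1
    return ((state << 1) | (state >> (n - 1))) & mask


def is_one_hot(state: int) -> bool:
    """A healthy ring counter state has exactly one bit set."""
    return state != 0 and (state & (state - 1)) == 0


def recover(primary: int, backup: int):
    """Return (primary, backup, fault_detected) after repairing one copy.

    Assumption: a transient fault corrupts only one of the two copies and
    leaves it in a non-one-hot state, so the other copy can be trusted.
    """
    if primary == backup and is_one_hot(primary):
        return primary, backup, False
    if is_one_hot(primary) and not is_one_hot(backup):
        return primary, primary, True
    if is_one_hot(backup) and not is_one_hot(primary):
        return backup, backup, True
    raise RuntimeError("cannot decide which copy is correct")


state = step(0b0100, 4)          # normal operation: 0b0100 -> 0b1000
corrupted = state | 0b0010       # a bit flip in the primary copy
print(recover(corrupted, state))
```

The sketch raises an error when both copies look valid but disagree, since a one-hot check alone cannot tell which one is right.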

Design and Implementation of a Architecture For Fault-Tolerant and Real-Time System

  • 유종상;김범식;신인철
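    • Korea Society of Industrial Information Systems: Conference Proceedings / Korea Society of Industrial Information Systems 1997 Fall Conference Proceedings: Prospects of Information and Communication Technology toward the 21st Century / pp.417-433 / 1997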
  • Real-time operating systems have focused primarily on techniques to minimize processing time, with a secondary emphasis on system reliability. Conversely, fault-tolerant systems have concentrated on using resource and information redundancy to maximize availability and reliability, with a lesser emphasis on performance. We have developed a fault-tolerant, real-time operating system that supports a powerful concurrent runtime environment under both sets of requirements. In this paper, we present an overview of real-time systems and the design and implementation of a duplex architecture that uses advanced concepts and technologies such as fast "fault detection", "fault isolation", and "fault recovery". The duplex architecture has two identical hardware elements and a hierarchy of several recovery steps to facilitate fast recovery, which must be preceded by prompt fault detection and isolation. This makes it possible to minimize the hardware and software overhead of the system and to guarantee its service continuity.

Analytical fault tolerant navigation system for an aerospace launch vehicle using sliding mode observer

  • Hasani, Mahdi;Roshanian, Jafar;Khoshnooda, A. Majid
    • Advances in Aircraft and Spacecraft Science / Vol. 4, No. 1 / pp.53-64 / 2017
  • Aerospace Launch Vehicles (ALV) are generally designed with high reliability to operate in complete security through fault avoidance practices. However, in spite of such precautions, the occurrence of faults is inevitable. Hence, there is a requirement for on-board fault recovery without significant degradation in ALV performance. The present study develops an advanced fault recovery strategy to improve the reliability of an ALV navigation system. The proposed strategy contains fault detection features and can reconfigure the system against common faults in the ALV navigation system. For this purpose, a fault recovery system is constructed to detect and reconfigure normal navigation faults based on sliding mode observer (SMO) theory. In the face of a pitch channel sensor failure, the original gyro faults are reconstructed using SMO theory, and by correcting the faulty measurement, the pitch-rate gyroscope output is constructed to provide a fault-tolerant navigation solution. The novel aspect of the paper is employing the SMO as an online tuning of the analytical fault recovery solution against unforeseen variations due to its hardware/software property. In this regard, a nonlinear model of the ALV is simulated using specific navigation failures, and the results verify the feasibility of the proposed system. Simulation results and sensitivity analysis show that the proposed techniques can produce more effective estimation results than previous techniques in the presence of sensor failures.

An Architecture-based Multi-level Self-Adaptive Monitoring Method for Software Fault Detection

  • 윤현지;박수용
    • Journal of KIISE: Software and Applications / Vol. 37, No. 7 / pp.568-572 / 2010
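  • For mission-critical systems, self-healing is one of the techniques for ensuring reliability. Self-healing consists of fault detection and fault recovery. Fault detection is the important first step that makes recovery possible, but it places a heavy load on the system. Faults can be detected with model-based methods, for example, but these must report every behavior of the system and compare the reported behavior with a model of normal behavior, so the volume of data is large and the overhead is high. This paper proposes an architecture-based, multi-level, self-adaptive monitoring method that complements model-based fault detection. Within a software architecture, the importance of fault detection differs from component to component, because the severity and frequency of faults differ for each component. If the monitor adapts so that monitoring is intensive for components of high monitoring importance and light for components of low importance, the overhead of fault detection can be reduced while its effectiveness is maintained. In addition, because fault frequency changes with changes in the software's environment and architecture, and each component's fault detection importance changes with it, the method tracks these changes through learning and self-adaptively concentrates monitoring on the components of high importance.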
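
One way to picture importance-driven monitoring intensity is the sketch below. It is an assumption-laden illustration, not the method from the paper: the `AdaptiveMonitor` class is hypothetical, importance is taken to be severity multiplied by an estimated fault frequency, and the "learning" step is replaced by a simple exponential moving average.

```python
class AdaptiveMonitor:
    """Adjust per-component sampling rates from observed fault frequency."""

    def __init__(self, severity, alpha=0.5, min_rate=0.1):
        self.severity = dict(severity)   # component name -> severity weight
        self.alpha = alpha               # smoothing factor of the moving average
        self.min_rate = min_rate         # floor so no component goes unmonitored
        self.fault_rate = {name: 0.0 for name in self.severity}

    def observe(self, name, faulted):
        """Update the estimated fault frequency of one component."""
        sample = 1.0 if faulted else 0.0
        old = self.fault_rate[name]
        self.fault_rate[name] = (1 - self.alpha) * old + self.alpha * sample

    def importance(self, name):
        """Assumed importance measure: severity times estimated frequency."""
        return self.severity[name] * self.fault_rate[name]

    def sampling_rates(self):
        """Scale rates so the most important component is sampled at 1.0."""
        top = max(self.importance(n) for n in self.severity)
        if top == 0:
            return {n: self.min_rate for n in self.severity}
        return {n: max(self.min_rate, self.importance(n) / top)
                for n in self.severity}


monitor = AdaptiveMonitor({"nav": 1.0, "ui": 1.0})
monitor.observe("nav", True)
print(monitor.sampling_rates())
```

The minimum rate is a design choice in this sketch: it keeps lightly monitored components observable, so that a rise in their fault frequency can still be noticed and their importance raised.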

KOMPSAT-2 Fault and Recovery Management

  • Baek, Myung-Jin;Lee, Na-Young;Keum, Jung-Hoon
    • International Journal of Aeronautical and Space Sciences / Vol. 3, No. 2 / pp.31-39 / 2002
  • In this paper, the KOMPSAT-2 on-board fault and ground recovery management design is addressed in terms of the hardware and software components that provide failure detection and spacecraft safing for anomalies that threaten spacecraft survival. It also includes ground real-time up-commanding operations to recover the system safely. KOMPSAT-2 spacecraft fault and recovery management is designed such that the subsequent system configuration due to system initialization is initiated and controlled by processors. This paper shows that KOMPSAT-2 has a new design feature, CPU SEU mitigation for possible upsets in the processor CPUs, as part of the on-board fault management design. Recovery management of processor switching is done in two different ways: gang switching and individual switching. This paper also shows that the difficulties of using a multiple-processor system can be managed by proper design implementation and flight operation.

Design of a Fault-tolerant Embedded Controller for Rail-way Signaling Systems

  • Cho, Yong-Gee;Lim, Jae-Sik
    • Institute of Control, Robotics and Systems: Conference Proceedings / Institute of Control, Robotics and Systems, ICCAS 2002 / pp.68.4-68 / 2002
  • This report presents an implementation of a set of reusable software components for a fault-tolerant embedded controller for railway signalling systems. These components can be used in real-time applications without application reprogramming. The library runs under the VxWorks operating system and is oriented toward real-time embedded systems. It includes fault detection, fault containment, checkpointing, and recovery components. The library supports high-speed response to faults occurring in application software. A garbage collector together with the VxWorks watchdog provides both detection of dead tasks and removal of useless resources to avoid an overflow. Control flow...

A Study on Software Based Fault-Tolerance Techniques for Flight Control Computer

  • 윤형식;김연균
    • Journal of the Korean Society for Aeronautical and Space Sciences / Vol. 44, No. 3 / pp.256-265 / 2016
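  • Software-based fault tolerance means designing equipment so that it can tolerate a software fault occurring in part of the equipment. Fault-tolerant design methods fall broadly into hardware-based and software-based methods, and an appropriate method must be chosen according to the characteristics of the system. This paper describes a software-based fault-tolerant design technique for a flight control computer whose hardware is configured with dual redundancy. For the software-based fault-tolerant design, software faults were classified, detection methods for those faults were designed, and recovery methods for use when a fault occurs were then designed. To confirm the effectiveness of the designed methods, the validity of the software-based fault-tolerant design was verified in a dedicated software test environment.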

Fault Diagnosis and Recovery of a Thermal Error Compensation System in a CNC Machine Tool

  • 황석현;이진현;양승한
    • Journal of the Korean Society for Precision Engineering / Vol. 17, No. 4 / pp.135-141 / 2000
  • The major role of temperature sensors in the thermal error compensation system of a machine tool is to improve machining accuracy by supplying reliable temperature data on the machine structure. This paper presents a new method for diagnosing faults in temperature sensors and recovering faulted data, in order to establish the reliability of the thermal error compensation system. Detection of a fault and its location is based on the correlation coefficients among the temperature data from the sensors. A multiple linear regression model, prepared using complete normal data, is used for the recovery of faulted data. The effectiveness of this method was tested by comparing computer simulation results with measured data from a CNC machining center.
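
The correlation-based detection and regression-based recovery described in this abstract can be illustrated with the sketch below. It is a simplified stand-in rather than the authors' method: the function names, the 0.8 threshold, and the rule "a sensor is suspect if it correlates well with no other sensor" are assumptions made here, and a single-predictor least-squares line replaces the multiple linear regression model of the paper.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient; returns 0.0 for a constant series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


def find_faulty(readings, threshold=0.8):
    """Flag sensors whose best correlation with any other sensor is low."""
    faulty = []
    for name, series in readings.items():
        best = max(pearson(series, other)
                   for key, other in readings.items() if key != name)
        if best < threshold:
            faulty.append(name)
    return faulty


def fit_line(x, y):
    """Least-squares slope and intercept; x must not be constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sxx
    return slope, my - slope * mx


# Hypothetical temperature data: sensor t3 is stuck at a constant value.
readings = {
    "t1": [20, 21, 22, 23, 24],
    "t2": [25, 26.5, 28, 29.5, 31],
    "t3": [30, 30, 30, 30, 30],
}
print(find_faulty(readings))

# Recovery: fit t3 against t1 on earlier fault-free data, then estimate t3.
slope, intercept = fit_line([20, 21, 22, 23, 24], [30, 32, 34, 36, 38])
print(slope * 25 + intercept)
```

Using the best pairwise correlation, rather than the average, keeps healthy sensors from being flagged just because one faulty sensor drags their average down.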

OPRoS based Fault Tolerance Support for Reliability of Service Robots

  • 안희준;이동수;안상철
    • Journal of Institute of Control, Robotics and Systems / Vol. 16, No. 6 / pp.601-607 / 2010
  • For the commercial success of emerging service robots, fault-tolerant technology for system reliability and human safety is crucial. Traditionally, fault tolerance methods have been implemented at the application level. However, from our studies of the common design patterns in fault tolerance, we argue that a framework-based approach offers many benefits in providing reliability for system development. To demonstrate the benefits, we build a framework-based fault-tolerant engine for the OPRoS (Open Platform for Robotic Services) standards. The fault manager in the framework provides a set of fault-tolerant measures for detection, isolation, and recovery. System integrators choose the appropriate fault handling tools by declaring XML configuration descriptors, considering the constraints of the components and the operating environment. By building a fault-tolerant navigation application from non-fault-tolerant components, we demonstrate the usability and benefits of the proposed framework-based approach.