• Title/Summary/Keyword: Software fault tolerance techniques

A Study on Software Based Fault-Tolerance Techniques for Flight Control Computer (비행조종컴퓨터 소프트웨어 기반 고장허용 설계 기법 연구)

  • Yoon, Hyung-Sik; Kim, Yeon-Gyun
    • Journal of the Korean Society for Aeronautical & Space Sciences / v.44 no.3 / pp.256-265 / 2016
  • Software-based fault tolerance techniques are designed to allow a system to tolerate its own software faults. Fault tolerance techniques fall into two groups: software-based and hardware-based. The appropriate design method depends on the characteristics of the system. This paper describes the concepts of software-based fault tolerance techniques for a dual flight control computer. For the software-based fault tolerance design, we classified software failures and designed methods for failure detection and recovery. Finally, the effectiveness of the techniques was verified in the Software Test Environment (STE).

Reliability Analysis for Train Control System by Software Fault Tolerance Techniques (소프트웨어 결함허용 기법에 의한 열차제어시스템 신뢰도 분석)

  • Suh, Seog-Chul; Lee, Jong-Woo
    • Journal of the Korean Society for Railway / v.12 no.6 / pp.1043-1048 / 2009
  • A PES (Programmable Electronic System) is used in software development for train control systems. PES are widely used in practice and consist of hardware, firmware and application software. They are easily applied to many applications because their implementation is highly flexible. Many safety-critical functions in safety-critical systems are realized in software. It is normally difficult to detect failures in a PES because the system is too sophisticated for the sources of a failure to be identified. Reliability analysis using software fault tolerance techniques is therefore needed. Current fault tolerance techniques include the recovery block, the distributed recovery block, N-version programming and N self-checking programming. In this paper, models of the recovery block and N-version programming are proposed using the Markov model, and the reliability of the train control system is analyzed over time. The fault occurrence rates of the program, the acceptance test and the voter are stationary. The relation between time and reliability is then presented using MATLAB. The results show that the recovery block is more reliable than N-version programming for the same number of alternate blocks.
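
As a rough illustration of the two techniques this abstract compares, the sketch below implements a recovery block (alternates tried in order, each result checked by an acceptance test) and N-version programming (all versions run, a voter picks the strict majority). It is not the paper's Markov model. The function names and the integer-square-root example are hypothetical, chosen only for illustration.

```python
import math
from collections import Counter


def recovery_block(alternates, acceptance_test, x):
    """Try each alternate in order; return the first result that passes the test."""
    for alternate in alternates:
        try:
            result = alternate(x)
        except Exception:
            continue  # a crash counts as a failed alternate
        if acceptance_test(x, result):
            return result
    raise RuntimeError("all alternates failed the acceptance test")


def n_version(versions, x):
    """Run every version and return the output that a strict majority agrees on."""
    results = []
    for version in versions:
        try:
            results.append(version(x))
        except Exception:
            pass  # a crashed version simply casts no vote
    if results:
        value, count = Counter(results).most_common(1)[0]
        if count > len(versions) // 2:
            return value
    raise RuntimeError("no majority among versions")


# Hypothetical example: one correct and one faulty integer square root.
def isqrt_good(n):
    return math.isqrt(n)


def isqrt_buggy(n):
    return math.isqrt(n) + 1  # deliberate off-by-one fault


def accept_isqrt(n, r):
    """Acceptance test: r is the floor of the square root of n."""
    return r * r <= n < (r + 1) * (r + 1)


if __name__ == "__main__":
    print(recovery_block([isqrt_buggy, isqrt_good], accept_isqrt, 17))
    print(n_version([isqrt_good, isqrt_buggy, isqrt_good], 17))
```

The design difference is visible in the signatures: the recovery block depends on an acceptance test and runs alternates sequentially, while N-version programming needs no acceptance test but depends on a voter and on a majority of versions being correct.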

A Study on the Design Techniques and Analysis of Fault-Tolerant Computers

  • Cho, Jai-Rip
    • Journal of Korean Society for Quality Management / v.21 no.1 / pp.78-95 / 1993
  • The art of designing and analyzing fault-tolerant computers is surveyed, with special emphasis on the problems of analyzing the behavior of computers that have autonomous repair capability. The survey covers the following topics: (1) general issues in computer reliability, (2) fault-tolerance state relations and requirements, (3) computational hierarchy, (4) fault characteristics, (5) fault diagnosis, (6) fault-tolerance schemes for logic networks and machines, (7) fault-coverage effects, and (8) fault-tree analysis of coverage. This paper does not include techniques for verifying nonredundant hardware or system software designs or for verifying the correctness of application programs.

Comparative Study of the System Operational Method for Fault-Tolerance (Fault-Tolerance를 위한 시스템의 동작방식에 대한 비교 연구)

  • 양성현; 이기서
    • The Journal of Korean Institute of Communications and Information Sciences / v.17 no.11 / pp.1279-1289 / 1992
  • A fault-tolerant system improves reliability and safety by using hardware and software redundancy. Fault masking, detection and identification techniques are applied selectively, depending on the system's application area. Here, a DMR system is operated with the standby and fail-safe module methods, which need minimal hardware and software redundancy, and the reliability and safety of each method are compared. This paper also proposes an effective method of dealing with transient faults, based on comparing the system's MTTF against the transient-fault tolerance capability of a self-diagnosis program.

A Study on the Correlation Hazard Analysis for Signaling System Safety (안전성 확보를 위한 위험원 분석 기법간 상관관계에 대한 연구)

  • Han, Chan-Hee; Lee, Young-Soo; Ahn, Jin; Jo, Woo-Sic
    • Proceedings of the KSR Conference / 2007.11a / pp.638-645 / 2007
  • Computers are increasingly being introduced into safety- and reliability-critical systems. The safe and reliable operation of these systems cannot be taken for granted. Malfunctions of these systems can have catastrophic consequences, and they have already been involved in serious accidents. Software fault prevention, fault tolerance, fault removal and fault forecasting are the techniques to be used, implemented and verified for embedded software in critical systems, as contributors to the safety and reliability of the software. To use them when developing a software product, a relationship must be established between these techniques and the development processes, the methods and techniques used to develop the software, and the different product architectures. Railroad signaling system software is safety-critical embedded software with real-time and high-reliability requirements. The primary purpose of safety management is to prevent loss of life or physical damage arising from potential hazards in the railroad signaling system. This study provides a systematic approach to analyzing potential hazards so that they can be managed throughout the system life cycle, ensuring that the most relevant hazards are identified and defined.

Switchover Time Analysis of Primary-Backup Server Systems Based on Software Rejuvenation (소프트웨어 재활기법에 기반한 주-여분 서버 시스템의 작업전이 시간 분석)

  • Lee, Jae-Sung; Park, Kie-Jin; Kim, Sung-Soo
    • The KIPS Transactions:PartA / v.8A no.2 / pp.81-90 / 2001
  • With the rapid growth of the Internet, computer systems are growing in size and complexity. To meet the high availability requirements of such systems, both hardware and software fault tolerance techniques are usually used. To prevent failures caused by the software-aging phenomenon that arises from long mission times, we adopt the software rejuvenation method, which intentionally stops and restarts the software on the servers. The method returns the system to a clean and healthy state in which the probability of fault occurrence is very low. In this paper, we study how switchover time affects software rejuvenation in primary-backup server systems. Through experiments, we find that switchover time is an essential factor in deciding the rejuvenation policy.

A Study on the Application of Virtual Track Circuit by Considering Software Fault Tolerance Techniques in Depot (검수고에서 소프트웨어 결함허용기법을 고려한 가상궤도회로의 적용에 대한 연구)

  • Lee, Myoung-Chol; Ko, Young-Hwan; Kim, Min-Seok; Lee, Jong-Woo
    • Journal of the Korean Society for Railway / v.15 no.2 / pp.122-128 / 2012
  • Given the structure of a depot, it is impossible to install track circuit systems because of the iron beams. Because the iron beams connect the rails to earth, there is a large leakage current, so track circuit systems are hard to apply. Thus, when trains enter the depot, a sign indicating the presence of a train is operated manually. A wrong sign can cause accidents such as collisions and derailments. Currently, optical sensors are used to locate trains in the depot and prevent such accidents. However, optical sensors are expensive to install and maintain, so train operating organizations rarely use this method. This paper introduces virtual track circuit systems implemented in software for the depot and proposes an algorithm for them. When a signal is set toward a depot track that is occupied by a train, safety is ensured by displaying the sign that indicates the presence of a train together with a stop signal. Appropriate fault tolerance techniques for the software are also proposed, based on an analysis of reliability and availability.

Analytic Model for Optimal Checkpoints in Mobile Real-time Systems

  • Lim, Sung-Hwa; Lee, Byoung-Hoon; Kim, Jai-Hoon
    • KSII Transactions on Internet and Information Systems (TIIS) / v.10 no.8 / pp.3689-3700 / 2016
  • It is not practically feasible to apply hardware-based fault-tolerant schemes, such as hardware replication, in mobile devices. Therefore, software-based fault-tolerance techniques, such as checkpoint and rollback schemes, are required. In checkpoint and rollback schemes, the optimal checkpoint interval should be applied to obtain the best performance. Most previous studies focused on minimizing the expected execution time or response time for completing a given task. Currently, most mobile applications run in real-time environments. It is therefore essential for mobile devices to employ optimal checkpoint intervals determined by the real-time constraints of their tasks. In this study, we tackle the problem of determining the optimal inter-checkpoint interval of checkpoint and rollback schemes so as to maximize the deadline meet ratio in real-time systems, and we build a probabilistic cost model. From this cost model, the optimal checkpoint interval can be found numerically using mathematical tools. The performance of the proposed solution is evaluated using analytical estimates.
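
To make the idea of an optimal checkpoint interval concrete, the sketch below uses the classic first-order approximation by Young, which minimizes expected overhead, i.e. the objective the abstract attributes to earlier studies. It is not the paper's deadline-meet-ratio model, which the abstract does not spell out. The overhead model, the function names and the example numbers are assumptions for illustration only.

```python
import math


def overhead_fraction(interval, checkpoint_cost, mtbf):
    """First-order overhead per unit of useful work (an assumed model).

    checkpoint_cost / interval : time spent writing checkpoints
    interval / (2 * mtbf)      : expected rework after a failure, since on
                                 average half an interval is lost per failure
    """
    return checkpoint_cost / interval + interval / (2.0 * mtbf)


def young_interval(checkpoint_cost, mtbf):
    """Closed-form minimizer of overhead_fraction: sqrt(2 * C * MTBF)."""
    return math.sqrt(2.0 * checkpoint_cost * mtbf)


def numeric_optimum(checkpoint_cost, mtbf, step=0.5):
    """Grid search over candidate intervals, as a check on the closed form."""
    candidates = [step * k for k in range(1, int(mtbf / step) + 1)]
    return min(
        candidates,
        key=lambda t: overhead_fraction(t, checkpoint_cost, mtbf),
    )


if __name__ == "__main__":
    cost, mtbf = 2.0, 100.0  # hypothetical values, in seconds
    print(young_interval(cost, mtbf))
    print(numeric_optimum(cost, mtbf))
```

Checkpointing too often wastes time on the checkpoints themselves, and checkpointing too rarely wastes time on rework after a failure. A real-time variant would replace `overhead_fraction` with the probability of meeting a deadline and search for the interval that maximizes it.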

Supporting Adaptability and Modularity of System Software

  • Netinant, Paniti
    • Proceedings of the IEEK Conference / 2002.07b / pp.1339-1342 / 2002
  • It is difficult to design system software that achieves a good separation of concerns, which can provide benefits such as adaptability, extensibility and modularity in design and implementation. During design, some aspectual properties, such as synchronization, scheduling, performance and fault tolerance, crosscut the basic functionality of the system software. By separating the functional components from the different aspectual components in the design, we can provide a better generic design model of system software. Aspect-Oriented Programming is a methodology that aims to separate components and aspects from the early stages of the software life cycle and uses techniques to combine them at the implementation phase. In this paper we discuss an aspect-oriented framework that can simplify system software design and implementation by expressing it at a higher level of abstraction. Our work concentrates on how to achieve a higher degree of separation of aspectual components, functional components and layers from each other. Our goal is to achieve a better design model for implementing system software in terms of modularity, reusability and adaptability.

A Fault-Recovery Agent for Distance Education on Home Network Environment (홈 네트워크 환경에서 원격 교육을 위한 결함 복구 에이전트)

  • Ko, Eung-Nam
    • Journal of Advanced Navigation Technology / v.11 no.4 / pp.479-484 / 2007
  • This paper explains the design and implementation of the FRA (Fault Recovery Agent). The FRA is a system suited to recovering from software errors in multimedia distance education based on a home network environment. In distributed multimedia systems, the most important categories of quality of service are timeliness, volume and reliability. In this paper, we discuss a method for increasing reliability through fault tolerance. The paper presents a performance analysis of an error recovery system running in a distributed multimedia environment, using rule-based DEVS modeling and simulation techniques. In DEVS, a system has a time base, inputs, states, outputs and functions. The proposed method is more efficient than the other method in terms of error ratio and processing time.
