• Title/Summary/Keyword: Fault-tolerance


Error Recovery Technique for Improving Reliability of Embedded Systems

  • Son, Sunghoon
    • Journal of the Korea Society of Computer and Information, v.22 no.6, pp.1-8, 2017
  • In this paper, we propose a fault tolerance technique that enables embedded systems to run without interruption even when their operating system or tasks fail. To improve reliability, the proposed scheme runs the embedded system as a virtual machine on a virtual machine monitor. It also prepares a standby virtual machine on which periodic backups of the embedded system are saved. When an error occurs in the main virtual machine, the corresponding standby virtual machine takes over the role of the main virtual machine and continues its operation. In particular, these backups and switches of virtual machines are performed with minor performance degradation by manipulating page table entries in the virtual machine monitor. Through performance evaluation studies, we show that the proposed scheme makes the embedded system robust against errors without significantly degrading its performance.

Design and Implementation of HA NAS with Fault-Tolerance (Fault-Tolerance 기능을 갖는 HA NAS 시스템의 설계 및 구현)

  • 김주영;박준희;권혁빈;서희정;정영준
    • Proceedings of the Korean Information Science Society Conference, 2004.10a, pp.664-666, 2004
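  • Recently, as computers are increasingly used for work and the various content service industries develop, the use of NAS (Network Attached Storage), a network storage device built to handle only independent file server functions, has been steadily increasing. NAS addresses the shortcomings of conventional file servers and features file sharing across heterogeneous platforms, storage scalability, and ease of management. However, a failure in a NAS system causes enormous economic loss. In this paper, we therefore design and implement an HA (High Availability) NAS system that can recover efficiently when a NAS system failure occurs, and we confirm that services using the NAS are provided without interruption under a variety of failure scenarios.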


Double-Input DC-DC Converter for Applications with Wide-Input-Voltage-Ranges

  • Hu, Renjun;Zeng, Jun;Liu, Junfeng;Yang, Jinming
    • Journal of Power Electronics, v.18 no.6, pp.1619-1626, 2018
  • The output power of most renewable energy generation facilities is unstable due to external environmental conditions. In distributed power systems with two or more sources, a stable output can be achieved through complementary power supply among the different input sources. In this paper, a double-input DC-DC converter with a wide input voltage range is proposed for renewable energy generation. This converter has the following advantages: the circuit is simple, the input voltage range is wide, and the fault tolerance is excellent. The operation modes and the steady-state analysis are examined. Finally, experimental results are presented to verify the correctness of the analysis and the feasibility of the proposed converter.

Design and Analysis of Leader Election Algorithm in Wireless Network based on Fixed Stations (기지국 기반 무선 통신망에서 리더 선택 알고리즘의 설계 및 분석)

  • Park, Sung-Hoon
    • Journal of the Korea Academia-Industrial cooperation Society, v.15 no.7, pp.4554-4561, 2014
  • In recent years, several paradigms have been identified to simplify the design of fault-tolerant distributed applications in a conventional static system. Leader election is among the most notable, particularly because it is closely related to group communication, which provides a powerful basis for implementing active replication, among other uses. Despite its usefulness, to our knowledge, no study has focused on this problem in a mobile computing environment. The aim of this paper is to propose a leader election algorithm for a wireless network environment based on fixed stations. The election algorithm is much more efficient than other election algorithms in terms of fault tolerance.

Robust State Feedback Control of Asynchronous Machines with Intermittent Faults (간헐 고장이 존재하는 비동기 머신의 견실한 상태 피드백 제어)

  • Yang, Jung-Min
    • Journal of the Institute of Electronics Engineers of Korea SC, v.48 no.3, pp.40-47, 2011
  • This paper addresses the problem of fault detection and tolerance for asynchronous sequential machines using state feedback control. The considered asynchronous machine is affected by intermittent faults. When intermittent faults occur, the machine undergoes unauthorized state transitions and, for a finite duration, remains at the fault state, not responding to the change of the external input. In this paper, we postulate the scheme of detecting intermittent faults and present the existence condition and design algorithm for a robust state feedback controller that overcomes the adversarial effect of intermittent faults. We also undertake a comparative study between the previous control scheme for transient faults and the present strategy for intermittent faults. The design procedure for the proposed controller is described in a case study.

SSR (Simple Sector Remapper) the fault tolerant FTL algorithm for NAND flash memory

  • Lee, Gui-Young;Kim, Bumsoo;Kim, Shin-han;Byungsoo Jung
    • Proceedings of the IEEK Conference, 2002.07b, pp.932-935, 2002
  • In this paper, we introduce a new FTL (Flash Translation Layer) driver algorithm that tolerates power-off errors. An FTL driver is software that provides a block device interface to upper-layer software, such as file systems or application programs, that uses flash memory as block-device storage. Flash memory is commonly used as the storage device of mobile systems because of its low power consumption and small form factor. In a mobile system, the power supply is not stable because it relies on a small battery with limited capacity, so a sudden power-off failure can occur while data is being read from or written to the flash memory. During a write operation, a power-off failure may leave the write incomplete, and an incomplete write leaves the data in flash memory inconsistent. To provide stable storage with flash memory in a mobile system, the FTL should provide fault tolerance against power-off failures. SSR (Simple Sector Remapper) is a fault-tolerant FTL driver that provides a block device interface and tolerates power-off errors.


Design and Implementation of Reliable Distributed Programming Environment based on HORB (HORB에 기반한 신뢰성 있는 분산 프로그래밍 환경의 설계 및 구현)

  • Hyun, Mu-Yong;Kim, Shik;Kim, Myung-Jun
    • Journal of the Institute of Electronics Engineers of Korea CI, v.39 no.2, pp.1-9, 2002
  • The use of Object-Oriented Distributed Programming (OODP) environments such as DCOM, DSOM, Java RMI, and CORBA to implement distributed applications is becoming increasingly popular. However, the absence of a fault-tolerance feature in these middleware platforms complicates the design and implementation of reliable distributed object-based applications, although the platforms greatly enhance the quality and reusability of such applications. In this paper, we propose Evergreen, a fault-tolerant programming environment based on RMI, for reliable distributed computing with checkpointing and a rollback-recovery mechanism. Based on a series of experiments, we evaluate the performance of Evergreen and find that it can be extended to fully support our optimal design goal.

Switchover Time Analysis of Primary-Backup Server Systems Based on Software Rejuvenation (소프트웨어 재활기법에 기반한 주-여분 서버 시스템의 작업전이 시간 분석)

  • Lee, Jae-Sung;Park, Kie-Jin;Kim, Sung-Soo
    • The KIPS Transactions:PartA, v.8A no.2, pp.81-90, 2001
  • With the rapid growth of the Internet, computer systems are growing in size and complexity. To meet the high availability requirements of such systems, both hardware and software fault tolerance techniques are usually used. To prevent failures caused by the software-aging phenomenon that arises from long mission times, we adopt a software rejuvenation method that intentionally stops and restarts the software on the servers. The method returns the systems to a clean and healthy state in which the probability of fault occurrence is very low. In this paper, we study how switchover time affects software rejuvenation of primary-backup server systems. Through experiments, we find that switchover time is an essential factor in deciding the rejuvenation policy.


Realtime Monitoring System using AJAX + XML (AJAX+XML 기반의 모니터링 시스템)

  • Choi, Yun Jeong;Park, Seung Soo
    • Journal of Korea Society of Digital Industry and Information Management, v.5 no.4, pp.39-49, 2009
  • With the rapid development of computing environments, information processing and analysis systems have become an active research area. From the viewpoint of data preparation, processing, and analysis in knowledge technology, the goal of an automated information system is to achieve high reliability and confidence and to minimize intervention by human administrators. In addition, such a system is expected to deal effectively with problems and abnormal errors through fault detection and fault tolerance. In this paper, we design a monitoring system as follows. Monitoring information produced by various systems is unstructured in form and characteristics, so informative data is crawled according to conditions and gathering rules. To present the monitoring information requested by an administrator, running status, such as connection/closed status, can be checked dynamically and systematically in real time. The proposed system can easily correct and process monitoring information from various types of servers, and it supports the administrator in making objective judgments and analyses of the information systems being operated. We implement a semi-real-time monitoring system using AJAX technology for dynamic browsing of web information, with information processing based on XML and XPath. We apply our system to an SMS server to check its running status, and the results show that the system has high utility and reliability.

A Study on the CSMP Multistage Interconnection Network having Fault Tolerance & Dynamic Reroutability (내고장성 및 동적 재경로선택 SCMP 다단상호접속망에 관한 연구)

  • 김명수;임재탁
    • Journal of the Korean Institute of Telematics and Electronics B, v.28B no.10, pp.807-821, 1991
  • A multipath MIN (Multistage Interconnection Network), the CSMP (Chained Shuffle Multi-Path) network, is proposed, offering fault tolerance and dynamic reroutability. The number of stages and the number of links between adjacent stages are the same as in single-path MINs, so the overall hardware complexity is considerably reduced in comparison with other multipath MINs. CSMP networks feature links between switches belonging to the same stage, forming loops of switches. The network can tolerate multiple faults, up to $(N/4)(\log_2 N - 1)$, occurring in any stage, including the first and the last (N: number of inputs). To analyze reliability, the terminal reliability (TR) and mean time to failure (MTTF) are given for the networks, and the TR figures are compared with those of other static and dynamic rerouting multipath MINs. The MTTF figures are also compared. The performance of the proposed network with respect to its bandwidth (BW) and probability of acceptance (PA) is analyzed and compared with that of other, more complex multipath MINs. The cost-efficiency analysis of reliability and performance shows that the network is more cost-effective than previously proposed fault-tolerant multipath MINs.
