Search | Korea Science

Distributed Fault-Tolerant System using Dual Channel Ethernet (이중 채널 이더넷을 이용한 분산 결함 허용 시스템)

최보곤;김진용;함명호;신현식
- Proceedings of the Korean Information Science Society Conference / 2002.10c / pp.307-309 / 2002
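This paper addresses the design and implementation of a distributed fault-tolerant system with high availability and high reliability. The system organizes its nodes into a manager node and a pool of worker nodes, and the nodes communicate over a fault-tolerant network. This network consists of two redundant networks, which guarantees normal data exchange even when one of them fails. The design requires a fault detection and recovery technique for the redundant networks, and it includes fault-tolerant middleware to manage the manager node and the worker nodes. An adaptive fault-tolerance technique is introduced into the middleware so that the fault-tolerance mode can be selected at run time, resulting in a fault-tolerant system with higher availability and reliability.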
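The failover behavior described in this abstract can be illustrated with a minimal Python sketch. It assumes two interchangeable channels modeled as callables that raise `ConnectionError` on failure; the names `DualChannelSender`, `broken_channel`, and `healthy_channel` are hypothetical and are not taken from the paper, whose actual detection and recovery technique is not described in the abstract.

```python
from typing import Callable, List


class DualChannelSender:
    """Send over the first channel; fall back to the second when it fails."""

    def __init__(self, primary: Callable[[bytes], None],
                 secondary: Callable[[bytes], None]) -> None:
        # Channels are tried in this order on every send.
        self.channels: List[Callable[[bytes], None]] = [primary, secondary]

    def send(self, payload: bytes) -> int:
        """Return the index of the channel that delivered the payload."""
        for index, channel in enumerate(self.channels):
            try:
                channel(payload)
                return index
            except ConnectionError:
                continue  # fault detected on this channel; try the next one
        raise ConnectionError("both channels failed")


# Hypothetical stand-ins for the two Ethernet channels.
delivered: List[bytes] = []


def broken_channel(payload: bytes) -> None:
    raise ConnectionError("link down")


def healthy_channel(payload: bytes) -> None:
    delivered.append(payload)


sender = DualChannelSender(broken_channel, healthy_channel)
used = sender.send(b"heartbeat")  # delivered over the secondary channel
```

The sketch detects a fault only when a send fails; a real system would more likely also probe both channels periodically so that a dead link is noticed before data is lost.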

Methodology of Fault Tolerance for Integrated Management and Monitoring System based on Information Model of Naval Combat System (해군 전투 체계의 정보 모델 기반 통합 관리 및 모니터링 시스템을 위한 결함허용 방법)

Min, Bup-Ki;Kim, Hyeon-Soo;Kuk, Seung-Hak;Kim, Chum-Su
- Proceedings of the Korean Information Science Society Conference / 2012.06b / pp.114-116 / 2012
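This paper presents a fault-tolerance method for an information-model-based integrated management and monitoring system in large-scale weapon systems. Such a system is a central management and control system that uses an abstracted information model to manage the hardware and applications of a large-scale weapon system built on a heterogeneous distributed environment. In a large-scale weapon system, an error in a single subsystem can affect the entire system, so the central management and control system needs a fault-tolerance method. To address this, we define fault-tolerance groups for managing the information model and configure a different fault-tolerance method for each group, so that fault tolerance is applied in different ways according to the importance of each application.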

Dependability Modeling of Software Fault Tolerance Techniques (소프트웨어 결함허용 기법들의 의존도 모델링)

김용규;김성수
- Proceedings of the Korean Information Science Society Conference / 1999.10a / pp.614-616 / 1999
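The need to develop highly reliable software is nothing new. As software has grown in size and complexity, system failures caused by software faults have come to account for a large share of all system failures. Software for systems that require high reliability uses software fault-tolerance techniques such as recovery blocks, distributed recovery blocks, N-version programming, and N self-checking programming. Alongside research on these techniques, research on measuring their dependability is also very important. This paper therefore uses Markov models to present more detailed reliability modeling of software fault-tolerance techniques, together with modeling of availability and safety. Analysis of the proposed models shows that, for the same number of alternates, dependability is highest for distributed recovery blocks, followed by recovery blocks, N self-checking programming, and N-version programming. Reliability sensitivity analysis further shows that recovery blocks and distributed recovery blocks are more sensitive to the fault rate of the acceptance test, whereas N-version programming is more sensitive to the fault rate of the program versions.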
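As a rough illustration of the recovery block technique named in this abstract, the following Python sketch runs alternates in order and returns the first result that passes an acceptance test. It is a textbook-style sketch, not the paper's model: the function names and the square-root example are hypothetical, and the state checkpoint and rollback that a real recovery block performs before each alternate is omitted because the alternates here are side-effect free.

```python
from typing import Callable, Sequence, TypeVar

T = TypeVar("T")


def recovery_block(alternates: Sequence[Callable[[], T]],
                   acceptance_test: Callable[[T], bool]) -> T:
    """Run alternates in order; return the first result that passes the test."""
    for alternate in alternates:
        try:
            result = alternate()
        except Exception:
            continue  # a crashing alternate counts as a failed attempt
        if acceptance_test(result):
            return result
    raise RuntimeError("all alternates failed the acceptance test")


# Hypothetical alternates computing the square root of 9.
def faulty_sqrt() -> float:
    return -3.0  # deliberately wrong primary version


def backup_sqrt() -> float:
    return 9.0 ** 0.5


result = recovery_block(
    [faulty_sqrt, backup_sqrt],
    lambda r: r >= 0 and abs(r * r - 9.0) < 1e-9,
)
```

The sketch also shows why the acceptance test matters for reliability: a faulty test can reject correct results or accept wrong ones regardless of how many alternates exist.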

A Striped Checkpointing Scheme for the Cluster System with the Distributed RAID (분산 RAID 기반의 클러스터 시스템을 위한 분할된 결함허용정보 저장 기법)

Chang, Yun-Seok
- The KIPS Transactions:PartA / v.10A no.2 / pp.123-130 / 2003
This paper presents a new striped checkpointing scheme for serverless cluster computers, where the local disks attached to the cluster nodes collectively form a distributed RAID with a single I/O space. Striping enables parallel I/O on the distributed disks, and staggering avoids a network bottleneck in the distributed RAID. We demonstrate how to reduce the checkpointing overhead and increase availability by striping and staggering dynamically for communication-intensive applications. The Linpack HPC Benchmark and MPI programs are applied to these checkpointing schemes for performance evaluation on a 16-node cluster system. Benchmark results show the benefits of the striped checkpointing scheme compared to the existing schemes, and these results are useful for designing an efficient checkpointing scheme for fast rollback recovery from any single-node failure in a cluster system.
https://doi.org/10.3745/KIPSTA.2003.10A.2.123

Advanced Method to Improve the Transparency for Fault-Tolerance in Distributed System (분산 처리 시스템의 결함 허용을 위한 투명성 향상 기법)

Kim, Boon-Hee
- Proceedings of the Korea Contents Association Conference / 2006.11a / pp.609-611 / 2006
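Fault-tolerance techniques, which keep a distributed system operating normally even when a fault or error occurs in one of its components, contribute to the usefulness of that system. Among the fault-tolerance techniques for distributed systems, this study adopts the redundancy-based technique, which is strong with respect to timing constraints. The application server, a component of this technique, is handled differently depending on whether its state is deterministic or nondeterministic. Among these approaches, SAR (Semi-Active Replication) has been shown to be efficient in terms of resource utilization. This paper proposes an infrastructure that addresses two drawbacks of SAR: delayed response time and the lack of fault tolerance on the client side.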

Development of a Fault-tolerant IoT System Based on the EVENODD Method (EVENODD 방법 기반 결함허용 사물인터넷 시스템 개발)

Woo, Min-Woo;Park, KeeHyun;An, Donghyeok
- Asia-pacific Journal of Multimedia Services Convergent with Art, Humanities, and Sociology / v.7 no.3 / pp.263-272 / 2017
The Internet of Things (IoT) has become increasingly popular, and its areas of application have broadened. However, if the data stored in an IoT system is damaged and cannot be recovered, society could suffer considerable damage and disruption. Thus far, most studies on fault tolerance have focused on computer systems, and there has been little research on fault tolerance for IoT systems. This study therefore proposes a fault-tolerance method for IoT environments. Specifically, based on the EVENODD method, one of the traditional fault-tolerance methods, a fault-tolerant storage and recovery method for the data stored in the IoT server is proposed and implemented on a oneM2M IoT system. The proposed method consists of two phases: fault-tolerant data storage and recovery. In the storage phase, some F-T gateways are designated and fault-tolerant data are distributed across the F-T gateways' storage using the EVENODD method. In the recovery phase, the IoT server initiates the recovery procedure after it receives fault-tolerant data from non-faulty F-T gateways: an EVENODD array is reconstructed and the received data are merged to obtain the original data.
https://doi.org/10.14257/AJMAHS.2017.03.21

Fault Tolerant System Modeling based on Real-Time Object (실시간 객체 기반 결함허용 시스템 모델링)

Im, Hyeong-Taek;Yang, Seung-Min
- The Transactions of the Korea Information Processing Society / v.6 no.8 / pp.2233-2244 / 1999
It is essential to guarantee high reliability in embedded real-time systems, since the failure of such systems may cause large financial damage or threaten human life. Although much research has been devoted to fault-tolerance mechanisms, most of them are object-level mechanisms that detect errors occurring in a single object and handle them at the object level. As embedded real-time systems become larger and more complex, there exist faults that cannot be detected or tolerated by object-level fault tolerance. Hence, system-level fault tolerance is needed. System-level fault tolerance determines whether a system is operating normally by analyzing the status of its objects. When an error is detected, it should be capable of locating the fault and performing appropriate recovery and reconfiguration actions. In this paper, we propose RobustRTO (Robust Real-Time Object), which provides object-level fault tolerance, and RMO (Region Monitor real-time Object), which offers system-level fault tolerance. We then show how highly dependable fault-tolerant systems can be modeled with RobustRTO and RMO. The model is presented based on real-time objects.

Search Technique for the Design of Cost Effective Fault Tolerant Systems (효율적인 결함허용 시스템 설계를 위한 탐색기법)

이효순;신현식
- Proceedings of the Korean Information Science Society Conference / 2000.04a / pp.6-8 / 2000
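Fault-tolerant systems can improve reliability by using various forms of redundancy, but doing so greatly increases system cost. This paper defines the design problem of determining a fault-tolerant computer system architecture that achieves satisfactory reliability at little additional cost, and proposes a search-based solution. It describes three useful facts that reduce the size of the search space visited by the search technique. Building on these, it derives reliability constraints that can be used to prune the search space when fault-tolerance techniques such as triple modular redundancy (TMR), backup sparing, and hybrid redundancy are applied to the system architecture.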
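To make the notion of a reliability constraint concrete, here is a minimal Python sketch based on the standard textbook formula for TMR reliability with a perfect voter, R_TMR = 3R^2 - 2R^3, where R is the reliability of a single module. The pruning check `meets_constraint` is a hypothetical illustration; the constraints actually derived in the paper are not given in the abstract.

```python
def tmr_reliability(module_reliability: float) -> float:
    """Reliability of TMR with a perfect voter: 3R^2 - 2R^3."""
    if not 0.0 <= module_reliability <= 1.0:
        raise ValueError("reliability must be in [0, 1]")
    r = module_reliability
    # The system works if all three modules work (R^3) or exactly two
    # work (3 * R^2 * (1 - R)); the sum simplifies to 3R^2 - 2R^3.
    return 3 * r**2 - 2 * r**3


def meets_constraint(module_reliability: float, required: float) -> bool:
    """Hypothetical pruning check: keep a TMR candidate only if reliable enough."""
    return tmr_reliability(module_reliability) >= required
```

For example, a module reliability of 0.9 gives a TMR reliability of about 0.972. TMR improves on a single module only when R exceeds 0.5, which is the kind of fact a search procedure can use to discard candidates early.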

Establishing detours for Fault-Tolerance Real-Time Communication in K-ary n-cube Networks (k-ary n-cube네트웍에서 결함허용실시간통신을 위한 우회경로 설정)

이경희
- Proceedings of the Korean Information Science Society Conference / 1998.10a / pp.627-629 / 1998
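As real-time applications grow larger and more complex, the need to cope with faults in the system or network increases. Even if such measures are rarely exercised, a single fault can always affect the entire system, so they are required in order to provide reliability. Traditional fault-tolerance methods cope with faults by using redundant hardware or software. This paper proposes a method that tolerates network faults not by duplicating the components of the network but by rerouting the communication path when a fault occurs.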

Fault Tolerance Design of Uplink Command Processor (상향링크 명령 처리기의 결함 허용 설계)

Gu, Cheol Hoe
- Journal of the Korean Society for Aeronautical & Space Sciences / v.31 no.3 / pp.95-100 / 2003
Electronic equipment used in satellites requires extremely high reliability, so it should be designed with redundant components to withstand certain critical faults. Communication satellites are generally required to meet a 15-year mission lifetime, so fault analysis must be performed on the satellite's electronic equipment. This paper summarizes research on the fault-tolerant design of a command processor, the resulting improvement in reliability, and a trade-off study of the design results. The reliability prediction values of the satellite components used in this research were taken from Koreasat 3 and Kompsat 1. It is important to perform many trade-off studies in fault-tolerant design, especially to choose the most suitable fault-tolerance method for a specified fault scenario.
https://doi.org/10.5139/JKSAS.2003.31.3.095

Search Result 2,489, Processing Time 0.027 seconds
