• Title/Abstract/Keywords: Fault-Tolerance Computing System

51 search results (processing time: 0.025 s)

Design of Reliable Adaptive Filter with Fault Tolerance Using DSP

  • 유동완;이전우;서보혁
    • 대한전기학회논문지:시스템및제어부문D
    • /
    • Vol. 50, No. 1
    • /
    • pp.8-13
    • /
    • 2001
  • The LMS algorithm has been used for plant identification and noise cancellation, and has been studied extensively to improve filtering performance. The design and development of reliable systems has become a key issue in industry because reliability is an important factor in whether a system performs its function successfully. Computing with reliability and fault tolerance is especially important in aviation, communication systems, and nuclear plants. This paper presents the design of a reliable adaptive filter with fault tolerance. Redundancy is generally used to achieve reliability, which normally requires additional computation or circuitry for a voting mechanism or fault detection. The proposed filter does not need this, so its computation is simple and it is practical to apply. The reliability of the adaptive filter is also analyzed. The effectiveness of the proposed filter is demonstrated in case studies of plant identification and noise cancellation using a DSP.
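
For readers unfamiliar with the LMS algorithm named in this abstract, the following is a minimal sketch of LMS used for plant identification. It is a textbook form of the algorithm, not the paper's fault-tolerant design; the plant coefficients `true_h`, the step size `mu`, and all function names are assumptions chosen for illustration.

```python
import random

def simulate_plant(x, h):
    """Hypothetical unknown plant: an FIR filter with impulse response h."""
    buf = [0.0] * len(h)                 # most recent input sample first
    out = []
    for xn in x:
        buf = [xn] + buf[:-1]
        out.append(sum(hi * bi for hi, bi in zip(h, buf)))
    return out

def lms_identify(x, d, num_taps, mu):
    """Estimate FIR weights from input x and desired (plant) output d using LMS."""
    w = [0.0] * num_taps
    buf = [0.0] * num_taps
    for xn, dn in zip(x, d):
        buf = [xn] + buf[:-1]
        y = sum(wi * bi for wi, bi in zip(w, buf))        # adaptive filter output
        e = dn - y                                        # estimation error
        w = [wi + mu * e * bi for wi, bi in zip(w, buf)]  # LMS weight update
    return w

rng = random.Random(0)
x = [rng.uniform(-1.0, 1.0) for _ in range(5000)]   # white excitation signal
true_h = [0.5, -0.3, 0.1]                           # assumed plant, for the demo only
d = simulate_plant(x, true_h)
w_est = lms_identify(x, d, num_taps=3, mu=0.05)
```

With a noise-free plant and a small step size, `w_est` approaches `true_h`; a larger `mu` converges faster but can become unstable.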

Service Deployment Strategy for Customer Experience and Cost Optimization under Hybrid Network Computing Environment

  • Ning Wang;Huiqing Wang;Xiaoting Wang
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • Vol. 17, No. 11
    • /
    • pp.3030-3049
    • /
    • 2023
  • With the development and wide adoption of hybrid network computing modes such as cloud, edge, and fog computing, handling customer service requests and collaboratively optimizing diverse computing resources have become major challenges. Given the characteristics of networked resources, optimized deployment of service resources is a feasible solution. This paper therefore takes customer experience and service cost as the optimization goals for deploying service resources, focusing on how deployment affects system load, fault tolerance, service cost, and quality of service (QoS). An alternate node filtering algorithm (ANF) and an adjustment factor for the cost matrix are proposed to improve service performance without changing the minimum total service cost, and a corresponding theoretical proof is provided. To improve fault tolerance, an alternate node preference factor and algorithm (ANP) are presented, which effectively reduce the probability of data copy loss; on this basis, an improved cost-efficient replica deployment strategy named ICERD is given. Finally, experiments that simulate random cloud node failures and compare ICERD with representative strategies show that ICERD not only reduces customer access latency, meets customers' QoS requests, and improves service quality, but also maintains load balance across the system, reduces service cost, and enhances fault tolerance, which further confirms its effectiveness and reliability.

Design of Reliable Adaptive Filter with Fault Tolerance Using TMS320C32

  • 유동완;서보혁
    • 대한전기학회:학술대회논문집
    • /
    • 대한전기학회 2000년도 하계학술대회 논문집 D
    • /
    • pp.2429-2432
    • /
    • 2000
  • Adaptive filter algorithms have been used for plant identification and noise cancellation, and have been studied extensively to improve filtering performance. The design and development of reliable systems has become a key issue in industry because reliability is an important factor in whether a system performs its function successfully. Computing with reliability and fault tolerance is especially important in aviation and nuclear plants. This paper presents the design of a reliable adaptive filter with fault tolerance. Redundancy is generally used to achieve reliability, which requires computation or circuitry for a voting mechanism, or computation for fault detection or a switching part. The filter presented here needs no computation for a voting mechanism or for fault detection, so its computation is simple and it is practical to apply. The reliability of the adaptive filter is also analyzed. The effectiveness of the proposed filter is demonstrated in case studies of plant identification and noise cancellation using a DSP.
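
To make the "voting mechanism" concrete, here is a minimal sketch of the conventional redundancy scheme that the abstract contrasts itself with: three redundant channels whose outputs pass through a mid-value-select voter. This is not the paper's method (the paper states it avoids a voter); the function name and sample values are assumptions for illustration.

```python
import statistics

def tmr_vote(a, b, c):
    """Mid-value select: the median of three channel outputs masks one faulty channel."""
    return statistics.median([a, b, c])

# Two healthy channels agree closely; the third has failed and outputs garbage.
healthy_1, healthy_2, faulty = 1.00, 1.01, 99.0
voted = tmr_vote(healthy_1, healthy_2, faulty)
```

The voter itself is extra computation and a potential single point of failure, which is the overhead the abstract says the proposed filter avoids.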

A Fault Tolerant Data Management Scheme for Healthcare Internet of Things in Fog Computing

  • Saeed, Waqar;Ahmad, Zulfiqar;Jehangiri, Ali Imran;Mohamed, Nader;Umar, Arif Iqbal;Ahmad, Jamil
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • Vol. 15, No. 1
    • /
    • pp.35-57
    • /
    • 2021
  • Fog computing aims to solve the bandwidth, network latency, and energy consumption problems of cloud computing. Managing the data generated by healthcare IoT devices is one of its significant applications. These devices generate huge amounts of data that must be managed efficiently: with low latency, without failure, and with minimum energy consumption and low cost. Task or node failures increase latency, energy consumption, and cost. A failure-free, cost-efficient, and energy-aware management and scheduling scheme for healthcare IoT data therefore not only improves system performance but also helps save patients' lives, thanks to minimum latency and the provision of fault tolerance. To address these data management and fault tolerance challenges, we present a Fault Tolerant Data Management (FTDM) scheme for healthcare IoT in fog computing. In FTDM, the data generated by healthcare IoT devices is efficiently organized and managed through well-defined components and steps. FTDM provides a two-way fault-tolerant mechanism, task-based and node-based, through which task and node failures are handled. The paper uses energy consumption, execution cost, network usage, latency, and execution time as performance evaluation parameters. Simulations performed with iFogSim show significant improvements: compared with the existing Greedy Knapsack Scheduling (GKS) strategy, the proposed FTDM strategy reduces energy consumption by 3.97%, execution cost by 5.09%, network usage by 25.88%, latency by 44.15%, and execution time by 48.89%. Moreover, patients sometimes have to be treated remotely because facilities are unavailable or because of infectious diseases such as COVID-19; in such circumstances the proposed strategy is significantly efficient.

Design and Implementation of Adaptive Fault-Tolerant Management System over Grid

  • 김은경;김지영;김윤희
    • 정보처리학회논문지A
    • /
    • Vol. 15A, No. 3
    • /
    • pp.151-154
    • /
    • 2008
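  • Grid computing environments, in which the execution environment changes frequently through service migration and changes in resource state, require middleware that supports high availability in order to support diverse application working environments and to guarantee users an uninterrupted working environment. High-availability support services for existing distributed middleware have been studied by some researchers, but they are not open standards and do not take the variety of grid services into account. This paper presents a method in which a runtime service management system for service middleware adapts to its environment and autonomously reconfigures the working environment, increasing the availability of the middleware and reliably guaranteeing service continuity and data consistency. The applicability of the approach in a real environment is confirmed with the prototype Wapee (Web-Service based Application Execution Environment).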

A Fault Tolerance Mechanism with Dynamic Detection Period in Multiple Gigabit Server NICs

  • 이진영;이시진
    • 인터넷정보학회논문지
    • /
    • Vol. 3, No. 5
    • /
    • pp.31-39
    • /
    • 2002
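  • With the explosive growth of the Internet and the rapid increase in multimedia data, high-speed transmission media and interface systems are in demand. Multiple NICs have been developed and studied as a way to support such high network bandwidth. Using multiple NICs makes it possible to build a high-speed LAN without major changes to the existing network environment, yielding high performance at low cost. However, if a SPOF (Single Point Of Failure) fault in a high-capacity multiple-NIC configuration brings the system down, the loss is severe because the system serves large volumes of multimedia data. This paper therefore studies a 'fault-tolerant multiple NIC' that uses fault tolerance techniques to prevent the losses caused by faults. Considering the inefficient resource availability and durability of the existing TMR, Primary-Standby, and Watchdog Timer techniques, it designs and proposes an efficient fault tolerance mechanism that minimizes downtime by changing the detection period dynamically. As a result, the proposed technique reduces the timeout spent in fault detection in order to cut the overhead time incurred when a fault occurs, minimizing overall system downtime.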

A Verification of Replicated Operation in P2P Computing

  • 박찬열
    • 컴퓨터교육학회논문지
    • /
    • Vol. 7, No. 3
    • /
    • pp.35-43
    • /
    • 2004
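  • P2P computing, in which independent devices participate over the Internet, suffers frequent disconnections and security attacks on the way to its goal because of peer departure, failures, network conditions, anonymity, and similar factors. Many studies and implementations address these problems by replicating the shared resources. This paper proposes a correctness verification scheme for P2P computing aimed at sharing computing resources: by executing tasks redundantly, it obtains correct results despite disconnections and security attacks. For unit tasks with dependencies, the proposed scheme verifies correctness periodically without global message exchange across the whole system, and each verified result becomes a checkpoint, enabling fault tolerance with rollback recovery.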

Design and Evaluation of a Fault-tolerant Publish/Subscribe System for IoT Applications

  • 배인한
    • 한국멀티미디어학회논문지
    • /
    • Vol. 24, No. 8
    • /
    • pp.1101-1113
    • /
    • 2021
  • The rapid growth of sense-and-respond applications and the emerging cloud computing model present a new challenge: providing publish/subscribe middleware as a scalable and elastic cloud service. The publish/subscribe interaction model is a promising solution for scalable data dissemination over wide-area networks. In addition, there has been some work on publish/subscribe messaging that guarantees reliability and availability in the face of node and link failures. These publish/subscribe systems are commonly used in information-centric networks and in edge-fog-cloud infrastructures for IoT. The IoT relies on an edge-fog-cloud infrastructure to efficiently process massive amounts of sensing data collected from the surrounding environment. In this paper, we propose a quorum-based hierarchical fault-tolerant publish/subscribe system (QHFPS) to enable reliable delivery of messages in the presence of link and node failures. QHFPS efficiently distributes IoT messages to the publish/subscribe brokers in fog overlay layers on the basis of the proposed extended stepped grid (xS-grid) quorum, which provides tolerance to node failures and network partitions. We evaluate the performance of QHFPS with an analytical model in three aspects: number of transmitted Pub/Sub messages, average subscription delay, and subscription delivery rate.
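
The abstract does not define the xS-grid quorum, so the following is only a sketch of the plain grid quorum that such schemes typically build on: each node's quorum is its full row plus its full column, which guarantees that any two quorums share at least one node. The function names and the 3 x 4 layout are assumptions for illustration, not QHFPS itself.

```python
def grid_quorum(node, rows, cols):
    """Quorum of `node` in a rows x cols grid: every node in its row and its column."""
    r, c = divmod(node, cols)
    row = {r * cols + j for j in range(cols)}
    col = {i * cols + c for i in range(rows)}
    return row | col

def all_quorums_intersect(rows, cols):
    """Check the defining quorum property: every pair of quorums shares a node."""
    n = rows * cols
    quorums = [grid_quorum(k, rows, cols) for k in range(n)]
    return all(quorums[a] & quorums[b] for a in range(n) for b in range(n))

# Twelve hypothetical brokers arranged as a 3 x 4 grid, numbered 0..11 row by row.
q5 = grid_quorum(5, rows=3, cols=4)
ok = all_quorums_intersect(3, 4)
```

Each quorum holds `rows + cols - 1` nodes, so a message stored on one quorum can be found by contacting any other quorum without contacting every broker.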

40-TFLOPS artificial intelligence processor with function-safe programmable many-cores for ISO26262 ASIL-D

  • Han, Jinho;Choi, Minseok;Kwon, Youngsu
    • ETRI Journal
    • /
    • Vol. 42, No. 4
    • /
    • pp.468-479
    • /
    • 2020
  • The proposed AI processor architecture delivers high throughput for accelerating neural networks while reducing the external memory bandwidth that neural network processing requires. To achieve high throughput, the proposed super thread core (STC) includes 128 × 128 nano cores operating at a clock frequency of 1.2 GHz. A function-safe architecture is proposed for fault-tolerant systems such as the electronics of autonomous cars. A general-purpose processor (GPP) core is integrated with the STC to control it and to process the AI algorithm; it has a self-recovering cache and a dynamic lockstep function. The function-safe design has been shown to meet ASIL D, one of the fault tolerance levels of the ISO26262 standard. The entire AI processor was fabricated in a 28-nm CMOS process as a prototype chip. Its peak computing performance is 40 TFLOPS at 1.2 GHz with a supply voltage of 1.1 V, and the measured energy efficiency is 1.3 TOPS/W. A GPP for control with a function-safe design can achieve ISO26262 ASIL-D with a single-point fault-tolerance rate of 99.64%.
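
As a rough consistency check on the headline figure, the quoted core count and clock rate reproduce roughly 40 TFLOPS if one assumes each nano core completes one multiply-accumulate per cycle and that this is counted as two floating-point operations. The abstract does not state that assumption; it is introduced here only for the arithmetic.

```latex
\begin{align*}
N_{\text{cores}} &= 128 \times 128 = 16\,384 \\
\text{Peak} &\approx N_{\text{cores}} \times f_{\text{clk}} \times 2\ \tfrac{\text{FLOP}}{\text{cycle}}
  \quad \text{(assumed: one MAC per core per cycle)} \\
  &= 16\,384 \times 1.2 \times 10^{9} \times 2 \\
  &\approx 3.93 \times 10^{13}\ \text{FLOP/s} \approx 39.3\ \text{TFLOPS}
\end{align*}
```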

Determination of the profit-maximizing configuration for the modular cell manufacturing system using stochastic process

  • 박승규
    • 제어로봇시스템학회논문지
    • /
    • Vol. 5, No. 5
    • /
    • pp.614-621
    • /
    • 1999
  • This paper presents analytical approaches for jointly determining the profit-maximizing configuration of a fault-tolerant, real-time modular cell manufacturing system. First, transient (time-dependent) analysis of Markovian models is applied to the modular cell manufacturing system from a performability viewpoint, whose modeling advantage lies in its ability to express the performance that truly matters (the user's perception of it) as well as various performance measures in combination, in the context of the application. The modular cells are modeled with a hybrid decomposition method, and availability measures such as instantaneous availability, interval availability, and expected cumulative operational time are evaluated as special cases of performability. In addition, sensitivity analysis is performed on the entire manufacturing system and on each machining cell, from which the timing of a major repair policy and the optimal configuration among the alternatives can be determined. Second, the recovery policies are optimized to meet the timing requirements: recovery from machine failures, by computing the minimal number of redundant machines, and recovery from task failures, by computing the minimum number of tasks that are equipped with failure detection schemes and reworked upon failure detection. Numerical examples demonstrate the effectiveness of the work.
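
The three availability measures named in this abstract have closed forms in the simplest Markov model: a single repairable machine with two states (up, down), constant failure rate `lam` and repair rate `mu`, starting in the up state. The sketch below uses that textbook model only; the paper's modular cell model is more elaborate, and the rates and names here are assumptions for illustration.

```python
import math

def instantaneous_availability(lam, mu, t):
    """P(machine is up at time t), given it was up at t = 0."""
    s = lam + mu
    return mu / s + (lam / s) * math.exp(-s * t)

def interval_availability(lam, mu, horizon):
    """Expected fraction of [0, horizon] spent up: time average of A(t)."""
    s = lam + mu
    return mu / s + lam / (s * s * horizon) * (1.0 - math.exp(-s * horizon))

def expected_uptime(lam, mu, horizon):
    """Expected cumulative operational time over [0, horizon]."""
    return horizon * interval_availability(lam, mu, horizon)

# Hypothetical rates: one failure per 10 h on average, mean repair time about 1.1 h.
lam, mu = 0.1, 0.9
a_now = instantaneous_availability(lam, mu, 0.0)
a_late = instantaneous_availability(lam, mu, 100.0)
uptime_8h = expected_uptime(lam, mu, 8.0)
```

Instantaneous availability starts at 1 and decays toward the steady-state value `mu / (lam + mu)`, while interval availability always lies between the two, which is why transient analysis matters for short production horizons.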
