• Title/Summary/Keyword: Recovery probability

A Multi-objective Optimization Approach to Workflow Scheduling in Clouds Considering Fault Recovery

  • Xu, Heyang;Yang, Bo;Qi, Weiwei;Ahene, Emmanuel
    • KSII Transactions on Internet and Information Systems (TIIS) / v.10 no.3 / pp.976-995 / 2016
  • Workflow scheduling is one of the challenging problems in cloud computing, especially when service reliability is considered. To improve cloud service reliability, fault tolerance techniques such as fault recovery can be employed. In practice, fault recovery affects the performance of workflow scheduling, and this impact deserves detailed study, yet only a few works on workflow scheduling consider it. In this paper, we investigate the problem of workflow scheduling in clouds, considering the probability that cloud resources may fail during execution. We formulate this problem as a multi-objective optimization model. The first optimization objective is to minimize the overall completion time and the second is to minimize the overall execution cost. Based on the proposed optimization model, we develop a heuristic-based algorithm called Min-min based time and cost tradeoff (MTCT). We perform extensive simulations with four different real-world scientific workflows to verify the validity of the proposed model and evaluate the performance of our algorithm. The results show that, as expected, fault recovery has a significant impact on the two performance criteria, and that the proposed MTCT algorithm is useful for real-life workflow scheduling when both optimization objectives are considered.
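
    As a rough illustration of the Min-min rule that MTCT is named after, the sketch below schedules independent tasks by repeatedly choosing the task whose earliest completion time is smallest. This is the classic Min-min heuristic, not the MTCT algorithm from the paper: the function name `min_min` and the execution-time table are hypothetical, and task dependencies, execution cost, and fault recovery are all ignored.

    ```python
    def min_min(exec_time):
        """Classic Min-min. exec_time[t][r] is the run time of task t on resource r.

        Hypothetical helper for illustration; ignores cost, precedence and faults.
        """
        n_resources = len(next(iter(exec_time.values())))
        ready = [0.0] * n_resources      # time at which each resource becomes free
        unscheduled = set(exec_time)
        schedule = {}                    # task -> (resource index, finish time)
        while unscheduled:
            # Smallest completion time over all (task, resource) pairs, which is
            # the task whose own minimum completion time is smallest.
            finish, task, res = min(
                (ready[r] + exec_time[t][r], t, r)
                for t in unscheduled
                for r in range(n_resources)
            )
            schedule[task] = (res, finish)
            ready[res] = finish
            unscheduled.remove(task)
        return schedule, max(ready)      # assignment and overall completion time
    ```

    Calling `min_min({"a": [3, 5], "b": [2, 4], "c": [6, 1]})` returns the assignment together with the makespan; a cost-aware variant would need a second objective in the selection step.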

Determination of Optimal Checkpoint Interval for RM Scheduled Real-time Tasks (RM 스케줄링된 실시간 태스크에서의 최적 체크 포인터 구간 선정)

  • Kwak, Seong-Woo;Jung, Young-Joo
    • The Transactions of The Korean Institute of Electrical Engineers / v.56 no.6 / pp.1122-1129 / 2007
  • For a system with multiple real-time tasks with different deadlines, it is very difficult to find the optimal checkpoint interval because of the complexity of accounting for task scheduling. In this paper, we determine the optimal checkpoint interval for multiple real-time tasks that are scheduled by the RM (Rate Monotonic) algorithm. Faults are assumed to occur according to a Poisson distribution. Checkpoints are inserted at equal intervals within a task, but the interval may differ from task to task. When a fault occurs, the task rolls back to the latest checkpoint and re-executes from there. We derive an equation for the maximum slack time of each task and determine the number of checkpoint intervals that can be re-executed for fault recovery. An equation for checking the schedulability of the tasks is also derived. Based on these equations, we find the probability that all tasks complete within their deadlines. The checkpoint intervals that maximize this probability are optimal.
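
    The idea of trading slack for re-executable checkpoint intervals can be sketched for a single task as follows. This is a simplified model, not the paper's derivation: it assumes one task in isolation (no RM interference), a fixed checkpoint overhead, faults arriving as a Poisson process with rate `fault_rate`, and a faulty interval costing one full re-execution of that interval. All names and parameters are hypothetical.

    ```python
    import math

    def success_probability(exec_time, deadline, n, overhead, fault_rate):
        """P(task meets its deadline) with n equal checkpoint intervals (simplified)."""
        interval = exec_time / n + overhead        # one interval plus its checkpoint
        slack = deadline - n * interval
        if slack < 0:
            return 0.0                             # cannot finish even without faults
        k = int(slack // interval)                 # intervals that can be re-executed
        p = math.exp(-fault_rate * interval)       # an interval runs fault-free
        # Negative binomial: n fault-free intervals with at most k faulty attempts.
        return sum(math.comb(n - 1 + j, j) * p**n * (1 - p)**j for j in range(k + 1))

    def best_checkpoint_count(exec_time, deadline, overhead, fault_rate, n_max=50):
        """Number of intervals (1..n_max) that maximizes the success probability."""
        return max(
            range(1, n_max + 1),
            key=lambda n: success_probability(exec_time, deadline, n, overhead, fault_rate),
        )
    ```

    More checkpoints shorten each interval and so cheapen a rollback, but every checkpoint adds overhead that eats into the slack, which is why the probability has an interior maximum under these assumptions.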

Probabilistic Modeling for Evaluation of Information Security Investment Portfolios (확률모형을 이용한 정보보호 투자 포트폴리오 분석)

  • Yang, Won-Seok;Kim, Tae-Sung;Park, Hyun-Min
    • Journal of the Korean Operations Research and Management Science Society / v.34 no.3 / pp.155-163 / 2009
  • We develop a probability model to evaluate information security investment portfolios. We assume that organizations install portfolios of information security countermeasures to mitigate damage such as the loss of transactions being processed and damage to hardware and data. A queueing model and its expected value analysis are used to derive the cost of lost transactions, the replacement cost of hardware, and the recovery cost of data. The net present value of each portfolio is derived, and organizations can select the optimal information security investment portfolio by comparing portfolios.
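
    The final comparison step can be illustrated with a standard net-present-value calculation. This is a generic sketch, not the paper's queueing-based model: it assumes each portfolio is summarized by an up-front investment and a constant yearly reduction in expected loss, and the function names, discount rate, and horizon are hypothetical.

    ```python
    def net_present_value(investment, annual_loss_reduction, rate, years):
        """NPV of a portfolio: discounted yearly loss reduction minus up-front cost."""
        discounted = sum(
            annual_loss_reduction / (1 + rate) ** t for t in range(1, years + 1)
        )
        return discounted - investment

    def best_portfolio(portfolios, rate, years):
        """portfolios maps a name to (investment, annual_loss_reduction)."""
        return max(
            portfolios,
            key=lambda name: net_present_value(*portfolios[name], rate, years),
        )
    ```

    In the paper's setting, the yearly loss reduction would come from the expected transaction, hardware, and data-recovery costs produced by the queueing analysis rather than from a fixed number.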

Nonuniform Encoding and Hybrid Decoding Schemes for Equal Error Protection of Rateless Codes

  • Lim, Hyung Taek;Joo, Eon Kyeong
    • ETRI Journal / v.34 no.5 / pp.719-726 / 2012
  • Messages are generally selected with the same probability in the encoding scheme of rateless codes for equal error protection. In addition, a belief propagation (BP) decoding scheme is generally used because of its low computational complexity. However, the probability of recovering a new message by BP decoding is reduced if both the recovered and unrecovered messages are selected uniformly. Thus, more codeword symbols than expected are required for the perfect recovery of message symbols. Therefore, a new encoding scheme with a nonuniform selection of messages is proposed in this paper. In addition, a BP-Gaussian elimination hybrid decoding scheme that compensates for the drawback of the BP decoding scheme is proposed. The performance of the proposed schemes is analyzed and compared with that of the conventional schemes.
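
    To make the BP drawback concrete, the sketch below implements the usual peeling form of BP decoding over an erasure channel: a codeword symbol that covers exactly one unrecovered message reveals it, and recovered messages are XORed out of the remaining symbols. It is a minimal illustration, assuming integer message values and XOR combining; the paper's nonuniform encoder and hybrid decoder are not reproduced, and `peel_decode` is a hypothetical name.

    ```python
    def peel_decode(n_messages, symbols):
        """Peeling (BP) decoder. symbols is a list of (message_indices, xor_value)."""
        symbols = [(set(idx), val) for idx, val in symbols]   # work on copies
        recovered = {}                                        # message index -> value
        progress = True
        while progress and len(recovered) < n_messages:
            progress = False
            for i, (idx, val) in enumerate(symbols):
                # XOR already-recovered messages out of this symbol.
                for m in [m for m in idx if m in recovered]:
                    idx.discard(m)
                    val ^= recovered[m]
                symbols[i] = (idx, val)
                # A degree-one symbol directly reveals its single message.
                if len(idx) == 1:
                    m = next(iter(idx))
                    recovered[m] = val
                    progress = True
        return recovered
    ```

    When no degree-one symbol is left, the loop stops with messages still missing even though the received symbols may be linearly solvable; that stall is the gap a Gaussian elimination stage is meant to close.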

A Quantitative Model of System-Man Interaction Based on Discrete Function Theory

  • Kim, Man-Cheol;Seong, Poong-Hyun
    • Nuclear Engineering and Technology / v.36 no.5 / pp.430-449 / 2004
  • A quantitative model for a control system that integrates human operators, systems, and their interactions is developed based on discrete functions. After identifying the major entities and the key factors that are important to each entity in the control system, a quantitative analysis to estimate the recovery failure probability from an abnormal state is performed. A numerical analysis based on assumed values of related variables shows that this model produces reasonable results. The concept of 'relative sensitivity' is introduced to identify the major factors affecting the reliability of the control system. The analysis shows that the hardware factor and the design factor of the instrumentation system have the highest relative sensitivities in this model. The probability of human operators performing incorrect actions, along with factors related to human operators, are also found to have high relative sensitivities. This model is applied to an analysis of the TMI-2 nuclear power plant accident and systematically explains how the accident took place.

An Analysis of System Error Rate (시스템 오류 발생률 분석)

  • Seong, Soon-Yong
    • Journal of the Korea Institute of Information and Communication Engineering / v.13 no.3 / pp.475-481 / 2009
  • The frequency and probability of deadlock are influential factors in the design of algorithms for handling deadlock. However, little work has been done in this area because it is not easy to analyze how factors such as the characteristics of processes and resources, resource request and release patterns, or the number of processes affect the occurrence of deadlock. This study substantially reduces the number of states by adopting the model 'state (a,b)t' to represent the resource allocation state, and includes the effect of the resource error rate and recovery rate in the system analysis. Several formulas for deadlock occurrence are derived, such as the average time interval between deadlocks, the probability that a process requesting a resource waits or deadlocks, and the probability that a request deadlocks in a cycle of length 2.

A Heuristic Buffer Management and Retransmission Control Scheme for Tree-Based Reliable Multicast

  • Baek, Jin-Suk;Paris, Jehan-Francois
    • ETRI Journal / v.27 no.1 / pp.1-12 / 2005
  • We propose a heuristic buffer management scheme that uses both positive and negative acknowledgments to provide scalability and reliability. Under our scheme, most receiver nodes only send negative acknowledgments to their repair nodes to request packet retransmissions, while some representative nodes also send positive acknowledgments to indicate which packets can be discarded from the repair node's buffer. Our scheme provides scalability because it significantly reduces the number of feedback messages sent by the receiver nodes. In addition, it provides fast recovery from transmission errors, since the packets requested by the receiver nodes are almost always available in the repair nodes' buffers. Our scheme also reduces the number of additional retransmissions from the original sender node or upstream repair nodes. These features satisfy the original goal of tree-based protocols, since most packet retransmissions are performed within a local group.
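
    The division of labor between negative and positive acknowledgments can be sketched as a small repair-node buffer. This is a simplified illustration under stated assumptions, not the paper's heuristic: positive acknowledgments are taken to be cumulative sequence numbers, a packet is discarded once every representative has acknowledged it, and the way representatives are chosen is not modeled. The class and method names are hypothetical.

    ```python
    class RepairNode:
        """Buffers packets for a local group (illustrative sketch only)."""

        def __init__(self, representatives):
            self.buffer = {}                                    # seq -> packet
            # Highest sequence number each representative has ACKed so far.
            self.acked = {rep: -1 for rep in representatives}

        def store(self, seq, packet):
            self.buffer[seq] = packet

        def on_nack(self, seq):
            """Return the packet to retransmit, or None if it must come from upstream."""
            return self.buffer.get(seq)

        def on_ack(self, rep, seq):
            """Cumulative ACK from a representative; drop packets all of them hold."""
            self.acked[rep] = max(self.acked[rep], seq)
            safe = min(self.acked.values())
            for s in [s for s in self.buffer if s <= safe]:
                del self.buffer[s]
    ```

    Because only the representatives send positive acknowledgments, the feedback load stays small, while any receiver can still trigger a local retransmission with a negative acknowledgment.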

Dynamic Behavior Analysis of Bridges under the Combined Effect of Earthquake and Scour (지진 및 기초의 세굴을 고려한 교량시스템의 동적거동분석)

  • 김상효;최성욱;이상우;김호상
    • Proceedings of the Earthquake Engineering Society of Korea Conference / 2002.03a / pp.187-194 / 2002
  • Bridge dynamic behaviors and the failure of the foundation are examined in this study under seismic excitations including the local scour effect. A simplified mechanical model that can account for various influencing factors is proposed to simulate the bridge motions. The scour depths around the foundations are estimated by the CSU equation recommended by HEC-18, and the local scour effect on global bridge motions is then considered by applying various foundation stiffnesses based on the reduced embedment depths. From the simulation results, it is found that the seismic responses of a bridge with the same scour depth at both foundations increase due to the local scour effect. The effect of bridge scour is found to be significant under weak and moderate seismic intensity. The recovery duration of the foundation stiffness after local scour is found to be critical in estimating the probability of foundation failure under earthquakes. Therefore, the safety of the whole bridge system should be assessed with consideration of the scour effect on the foundations, and the recovery duration of stiffness should be determined rationally.

An Evaluation of Operator's Action Time for Core Cooling Recovery Operation in Nuclear Power Plant (원자력발전소의 노심냉각회복 조치에 대한 운전원 조치시간 평가)

  • Bae, Yeon-Kyoung
    • Journal of the Korean Society of Safety / v.27 no.5 / pp.229-234 / 2012
  • Operator's action time is evaluated from MAAP4 analysis in the conventional probabilistic safety assessment (PSA) of a nuclear power plant. The MAAP4 code, which was developed for severe accident analysis, is too conservative for a realistic PSA. Best-estimate codes such as RELAP5/MOD3 and MARS have been used to reduce the conservatism of thermal-hydraulic analysis. In this study, the operator's action time for the core cooling recovery operation, whose Fussell-Vesely (F-V) value was evaluated as highly important in small break loss of coolant accident (SBLOCA) and loss of component cooling water (LOCCW) events in a previous PSA, is evaluated using the MARS code. The main conclusions are: (1) MARS analysis provides a larger time window for the operator's action than MAAP4 analysis and gives a more realistic time window for PSA; (2) sufficient operator action time can reduce the human error probability and core damage frequency in PSA.

A Study on the Transmission Protocol Including Error Recovery Strategy for Ethernet (에러 회복 기능을 포함하는 Ethernet 전송 프로토콜에 관한 연구)

  • Park, Seong-Rae;Shin, Woo-Cheol;Lee, Sang-Bae;Park, Mi-Gnon
    • Proceedings of the KIEE Conference / 1988.07a / pp.261-264 / 1988
  • In this paper, a transmission protocol that includes an error recovery strategy on the data link layer is proposed for Ethernet, which uses the CSMA/CD media access method. Its performance is analyzed through simulation, taking into account the actual transmission error probability on the channel. The parameters required for the simulation were taken from those given by the Ethernet controller interface board.