The only recent book on dependability and fault tolerance that covers both software and hardware aspects, Dependable Computing: Design and Assessment addresses the new reality of dependability. After a discussion of reliability, availability, and hardware and software fault models, the authors explore hardware redundancy, coding techniques, processor-level error detection and recovery, checkpointing and recovery, software fault tolerance techniques, and network-specific issues. Ideal for both students and practitioners, the book illustrates the capabilities and applicability of every technique with examples of actual applications and systems.
Dependable Computing: Design and Assessment
Preface
Acknowledgements
Chapter 01 Classical Dependability Techniques & Modern Computing Systems
Chapter 02 Hardware Error Detection through Hardware Implemented Techniques
Chapter 03 Processor-Level Error Detection and Recovery
Chapter 04 Hardware Error Detection through Software Implemented Techniques
Chapter 05 Software Error Detection and Recovery through Software Analysis
Chapter 06 Measurement-based Analysis of System Software Operating System Failure Behavior Overview
Chapter 07 Reliable Networked and Distributed Systems
Chapter 08 Checkpointing and Rollback Error Recovery
Chapter 09 Dependability in Large-Scale Systems
Chapter 10 Measurement-based Analysis of Large Scale Clusters
Chapter 11 Internals of Fault Injection Techniques
Chapter 12 Safeguarding Current Technologies
Index