Reliable Cloud Applications

Fault Tolerance

Typically a crash-stop/crash-recovery system works by…

This, however, means that it is difficult to identify…

Unreliable Networks

Remember that even networks internal to a data center can be faulty…

Typically the receiver sends a response (ACK) message, but that too may be lost, so the sender cannot tell a lost request from a lost response.
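To make the lost-ACK problem concrete, here is a minimal sketch of a sender that retries until it sees an ACK. Everything in it is an assumption made for illustration: `LossyLink`, `Receiver`, and `send_with_retry` are hypothetical names, the network is simulated in-process, and message loss follows a fixed script instead of real timeouts.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Optional


@dataclass
class LossyLink:
    """Hypothetical one-way link that drops messages by a scripted pattern."""
    drops: Iterator[bool]  # True means "drop the next message"

    def deliver(self) -> bool:
        # Once the script is exhausted, no further messages are dropped.
        return not next(self.drops, False)


@dataclass
class Receiver:
    """Hypothetical receiver that records every request it processes."""
    processed: List[str] = field(default_factory=list)

    def handle(self, request: str) -> str:
        self.processed.append(request)
        return "ACK"


def send_with_retry(request: str, receiver: Receiver,
                    to_receiver: LossyLink, to_sender: LossyLink,
                    max_attempts: int = 5) -> Optional[int]:
    """Return the number of attempts needed to see an ACK, or None."""
    for attempt in range(1, max_attempts + 1):
        if not to_receiver.deliver():
            continue  # request lost: the sender only observes a timeout
        receiver.handle(request)
        if not to_sender.deliver():
            continue  # ACK lost: the sender observes the very same timeout
        return attempt
    return None


# Drop the first request and the first ACK, then let everything through.
receiver = Receiver()
attempts = send_with_retry("debit 10", receiver,
                           LossyLink(iter([True])), LossyLink(iter([True])))
print(attempts, len(receiver.processed))  # prints: 3 2
```

The sender needs three attempts, yet the receiver has processed the request twice: the retry after the lost ACK was a duplicate. From the sender's side both failures look identical, which is why retried requests usually need to be idempotent or deduplicated by the receiver.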

Detecting Faults

System Models

System models capture the assumptions we make about our system, specifically:

Network Behavior

Node Behavior

Executed algorithm could…

Synchronous Timing

Synchrony Violations


Additional Content

  • Network Behavior
  • Node Behavior
  • Timing Behavior