Optimism vs Pessimism in Distributed Systems
Avoiding coordination is the one fundamental thing that allows us to build distributed systems that out-scale the performance of a single machine1. When we build systems that avoid coordinating, we end up building components that make assumptions about what other components are doing. This, too, is fundamental. If two components can’t check in with each other after every single step, they need to make assumptions about the ongoing behavior of the other component.
One way to classify these assumptions is into optimistic and pessimistic assumptions. I find it very useful, when thinking through the design of a distributed system, to be explicit about each assumption each component is making, whether that assumption is optimistic or pessimistic, and what exactly happens if the assumption is wrong. The choice between pessimistic and optimistic assumptions can make a huge difference to the scalability and performance of systems.