Understanding Distributed Systems
Scribed by Roberto Vitillo
- Year: 2021
- Tongue: English
- Leaves: 236
- Category: Library
No coin nor oath required. For personal study only.
Table of Contents
Copyright
About the author
Acknowledgements
Preface
Who should read this book
Introduction
Communication
Coordination
Scalability
Resiliency
Operations
Anatomy of a distributed system
I Communication
Reliable links
Reliability
Connection lifecycle
Flow control
Congestion control
Custom protocols
Secure links
Encryption
Authentication
Integrity
Handshake
Discovery
APIs
HTTP
Resources
Request methods
Response status codes
OpenAPI
Evolution
II Coordination
System models
Failure detection
Time
Physical clocks
Logical clocks
Vector clocks
Leader election
Raft leader election
Practical considerations
Replication
State machine replication
Consensus
Consistency models
Strong consistency
Sequential consistency
Eventual consistency
CAP theorem
Practical considerations
Transactions
ACID
Isolation
Concurrency control
Atomicity
Two-phase commit
Asynchronous transactions
Log-based transactions
Sagas
Isolation
III Scalability
Functional decomposition
Microservices
Benefits
Costs
Practical considerations
API gateway
Routing
Composition
Translation
Cross-cutting concerns
Caveats
CQRS
Messaging
Guarantees
Exactly-once processing
Failures
Backlogs
Fault isolation
Reference plus blob
Partitioning
Sharding strategies
Range partitioning
Hash partitioning
Rebalancing
Static partitioning
Dynamic partitioning
Practical considerations
Duplication
Network load balancing
DNS load balancing
Transport layer load balancing
Application layer load balancing
Geo load balancing
Replication
Single leader replication
Multi-leader replication
Leaderless replication
Caching
Policies
In-process cache
Out-of-process cache
IV Resiliency
Common failure causes
Single point of failure
Unreliable network
Slow processes
Unexpected load
Cascading failures
Risk management
Downstream resiliency
Timeout
Retry
Exponential backoff
Retry amplification
Circuit breaker
State machine
Upstream resiliency
Load shedding
Load leveling
Rate-limiting
Single-process implementation
Distributed implementation
Bulkhead
Health endpoint
Health checks
Watchdog
V Testing and operations
Testing
Scope
Size
Practical considerations
Continuous delivery and deployment
Review and build
Pre-production
Production
Rollbacks
Monitoring
Metrics
Service-level indicators
Service-level objectives
Alerts
Dashboards
Best practices
On-call
Observability
Logs
Traces
Putting it all together
Final words
SIMILAR VOLUMES
This book helps readers easily learn basic model checking by presenting examples, exercises and case studies. The toolset mCRL2 provides a language to specify the behaviour of distributed systems, in particular where there is concurrency with inter-process communication. This language allows us to a
Learning to build distributed systems is hard, especially if they are large scale. It's not that there is a lack of information out there. You can find academic papers, engineering blogs, and even books on the subject. The problem is that the available information is spread out al