Showing posts from September, 2020

Resiliency in distributed messaging

A common unit of work in a distributed system involves at least two things:

1) Storing the result of the work to the DB
2) Notifying consumers about the changes

The problem is that these often can't be done in the same transaction: Kafka, Redis, RabbitMQ, AlmostAnyCloud messaging system - none of them support XA transactions. (And not because their authors are too lazy to implement them, but that deserves a separate post.)

Let's say we have opened a transaction scope (Tb) and we want to store some changes to the DB and dispatch an event:

Tb -> DB -> Message -> Tc

Looks valid, right? If the DB changes fail, we roll back the DB transaction. If dispatching the event fails, we still roll back. Seems quite transactional, so where is the problem ¯\(°_o)/¯?

Problem 1: Leaked notifications: on failure

The problem is with the last part: Tc - Transaction commit (end). By that point the message has already been dispatched, so if the commit fails, consumers are notified about changes that were never stored. Why would it fail?

- It can be a network outage
- Our DB transaction can time out while we are dispatching messages
- SQL Server might run ou...