Enable an application to handle transient failures when it tries to connect to a service or network resource, by transparently retrying a failed operation.
When to use this pattern?
The application may hit transient faults when it interacts with a remote service or accesses a remote resource.
The faults are expected to be short-lived, so repeating a request that previously failed could succeed on a subsequent attempt.
When NOT to use this pattern?
Long-lasting faults.
Errors in the business logic of an application.
As an alternative to addressing scalability issues in a system.
Retry Strategies
1 - Cancel
Not transient
Unlikely to succeed if repeated
Should cancel the operation and report an exception
Example: Authentication failure
2 - Retry
Uncommon or rare faults
Unlikely to be repeated
Should retry the failed operation immediately
3 - Retry after delay
Connectivity or busy faults
The network or service may need a short period to recover
Should wait for a suitable time before retrying the failed operation.
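The three strategies above can be sketched as a small dispatcher. This is only an illustration under stated assumptions: the exception classes AuthenticationError, RareFault, and BusyFault, the helper call_with_strategy, and its parameters are hypothetical names invented for this example. A real application would map its own client library's errors onto the three categories.

```python
import time


class AuthenticationError(Exception):
    """Hypothetical non-transient fault: repeating the call will not help."""


class RareFault(Exception):
    """Hypothetical uncommon fault that is unlikely to happen twice."""


class BusyFault(Exception):
    """Hypothetical connectivity or 'service busy' fault."""


def call_with_strategy(operation, max_attempts=3, delay_seconds=0.5,
                       sleep=time.sleep):
    """Run operation, choosing cancel, retry, or retry after delay per fault."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except AuthenticationError:
            # Strategy 1, cancel: not transient, so report the exception at once.
            raise
        except RareFault:
            # Strategy 2, retry: try again immediately unless attempts are used up.
            if attempt == max_attempts:
                raise
        except BusyFault:
            # Strategy 3, retry after delay: give the service time to recover.
            if attempt == max_attempts:
                raise
            sleep(delay_seconds)
```

The sleep function is passed in as a parameter so that tests can replace it and run without real waiting.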
Considerations
Match the retry policy to the business requirements and the nature of the failure.
Avoid an aggressive retry policy with minimal delay between attempts.
Consider implementing the Circuit Breaker pattern.
Make sure the retried operation is idempotent, so that repeating it is safe.
Adjust the time between retry attempts based on the type of the exception.
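Consider how retrying an operation that's part of a transaction affects the consistency of the overall transaction.
Implement retry logic only where the full context of a failing operation is understood.
Testing - the retry code needs to be tested under a variety of failure conditions.
Logging - log the failures that trigger a retry so the underlying faults can be investigated.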
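Several of the considerations above (a bounded, non-aggressive policy, a delay between attempts, and logging of each fault) can be combined in one helper. The sketch below assumes exponential backoff with random jitter, which is one common choice and not something these notes prescribe. The names retry_with_backoff, backoff_delay, TransientError, and every parameter are hypothetical, and the wrapped operation is assumed to be idempotent.

```python
import logging
import random
import time

logger = logging.getLogger("retry")


class TransientError(Exception):
    """Hypothetical marker for faults that are worth retrying."""


def backoff_delay(attempt, base=0.5, cap=30.0):
    """Upper bound of the wait after failed attempt number `attempt` (1-based)."""
    return min(cap, base * 2 ** (attempt - 1))


def retry_with_backoff(operation, max_attempts=4, base=0.5, cap=30.0,
                       sleep=time.sleep, rng=random.random):
    """Call an idempotent operation, backing off between transient failures."""
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except TransientError as exc:
            if attempt == max_attempts:
                # Retries exhausted: log at error level and surface the fault.
                logger.error("giving up after %d attempts: %s", attempt, exc)
                raise
            # Jitter spreads out retries from many clients hitting one service.
            delay = backoff_delay(attempt, base, cap) * rng()
            logger.warning("attempt %d failed (%s); retrying in %.2fs",
                           attempt, exc, delay)
            sleep(delay)
```

Injecting sleep and rng keeps the helper easy to test under different failure situations: a test can pass a fake sleep that records the requested delays and a fixed rng so the delays are predictable.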