TMS message publish does not retry on socket timeout

Description

We have seen cases where, due to a messaging service restart, a TMS message publish gets an IO socket timeout. As we do not retry on any IO exception, this can cause the publish to fail.

We have seen this cause issues during provisioning: failing to publish the message after the cluster is provisioned marks the provisioning as failed.
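
For illustration, a minimal sketch of what retrying a publish on IOException could look like. This is not the actual TMS client code: the `RetryingPublisher` class, the `Publisher` interface, and the `maxAttempts` and `initialBackoffMillis` parameters are hypothetical names chosen for the example, and the exponential backoff is an assumed policy. It also assumes that republishing the same message is safe.

```java
import java.io.IOException;
import java.io.InterruptedIOException;

final class RetryingPublisher {

  /** Hypothetical stand-in for the real TMS publish call. */
  @FunctionalInterface
  interface Publisher {
    void publish(String message) throws IOException;
  }

  /**
   * Publishes the message, retrying on any IOException (including
   * SocketTimeoutException, which is a subclass) up to maxAttempts times.
   * The wait between attempts doubles after each failure.
   */
  static void publishWithRetry(Publisher publisher, String message,
                               int maxAttempts, long initialBackoffMillis)
      throws IOException {
    if (maxAttempts < 1) {
      throw new IllegalArgumentException("maxAttempts must be at least 1");
    }
    long backoffMillis = initialBackoffMillis;
    IOException lastFailure = null;
    for (int attempt = 1; attempt <= maxAttempts; attempt++) {
      try {
        publisher.publish(message);
        return; // published successfully
      } catch (IOException e) {
        lastFailure = e;
        if (attempt == maxAttempts) {
          break; // out of attempts, rethrow below
        }
        try {
          Thread.sleep(backoffMillis);
        } catch (InterruptedException ie) {
          // Preserve the interrupt flag and stop retrying.
          Thread.currentThread().interrupt();
          InterruptedIOException interrupted =
              new InterruptedIOException("Interrupted while waiting to retry publish");
          interrupted.addSuppressed(e);
          throw interrupted;
        }
        backoffMillis *= 2;
      }
    }
    throw lastFailure;
  }
}
```

With a bounded number of attempts, a short messaging service restart would be absorbed by the retries, while a longer outage would still surface the last IOException to the caller.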

Release Notes

None

Activity

Samik Gupta, March 13, 2024 at 6:13 PM

Discussed with . Generally, it should be safe to republish program state messages, hence it is OK to retry on IOExceptions.

Avinash Achar, March 12, 2024 at 9:18 AM

Need to check with Albert

Details

Triaged: No

Created March 7, 2024 at 8:54 AM
Updated April 16, 2024 at 9:04 AM