No retry when system worker is temporarily unavailable due to a 5xx error

Description

https://github.com/cdapio/cdap/blob/c2743d8d69cd8538f579fba9a7f684ede32ca33e/cdap-app-fabric/src/main/java/io/cdap/cdap/internal/app/worker/system/SystemWorkerHttpHandlerInternal.java#L112

When the system worker is temporarily unavailable and returns a 5xx error, no retries are in place to check whether it becomes available again after some time. Retries should be added for this case, since the service may come back up after a while.

The issue was identified when the pod containing the system worker was deleted during pipeline execution.
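
A minimal sketch of the requested behaviour, assuming the caller can observe the HTTP status of a failed request. It retries only on 5xx responses, with capped exponential backoff, and gives up after a fixed number of attempts. The class `RetryOn5xx`, the exception `HttpStatusException`, and the method `callWithRetry` are hypothetical names made up for illustration; this is not the code from the linked handler and not necessarily how the fix was implemented in CDAP.

```java
import java.util.concurrent.Callable;

/** Illustrative helper (hypothetical, not part of CDAP). */
final class RetryOn5xx {

  /** Hypothetical exception that carries the HTTP status code of a failed call. */
  public static final class HttpStatusException extends Exception {
    private final int statusCode;

    public HttpStatusException(int statusCode, String message) {
      super(message);
      this.statusCode = statusCode;
    }

    public int getStatusCode() {
      return statusCode;
    }
  }

  private RetryOn5xx() {
  }

  /**
   * Runs the call, retrying only when it fails with a 5xx status.
   * The delay doubles after each failed attempt, up to maxDelayMillis.
   */
  public static <T> T callWithRetry(Callable<T> call, int maxAttempts,
                                    long initialDelayMillis, long maxDelayMillis)
      throws Exception {
    if (maxAttempts < 1) {
      throw new IllegalArgumentException("maxAttempts must be at least 1");
    }
    long delay = initialDelayMillis;
    for (int attempt = 1; ; attempt++) {
      try {
        return call.call();
      } catch (HttpStatusException e) {
        int code = e.getStatusCode();
        boolean retryable = code >= 500 && code <= 599;
        // Client errors (4xx) and exhausted attempts are surfaced to the caller.
        if (!retryable || attempt >= maxAttempts) {
          throw e;
        }
      }
      // The worker may still be restarting (e.g. its pod was deleted), so wait before retrying.
      Thread.sleep(delay);
      delay = Math.min(delay * 2, maxDelayMillis);
    }
  }
}
```

A caller would wrap the request to the system worker in `callWithRetry`, for example with 5 attempts, a 1-second initial delay, and a 30-second cap. Those values are assumptions; the right limits depend on how long a replacement pod typically takes to become ready.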

Release Notes

None

Fixed

Details

Assignee

Reporter

Triaged

Yes

Size

M

Components

Fix versions

Priority

Created February 8, 2024 at 8:58 AM
Updated February 15, 2024 at 10:08 AM
Resolved February 15, 2024 at 10:08 AM