Log current YARN memory usage when an application is submitted or accepted

Description

Current job submission flow via YARN:
Submitted > Accepted > Running (if resources are available)

However, if resources are not available, i.e. YARN available memory is less than YARN pending memory, the application is killed.
There is currently no LOG or WARN message indicating the root cause of this pipeline failure.

1. Better logging of YARN resources:

When the application enters the submitted or accepted state:

  • Log YARN overall available memory/cores and YARN pending memory.

  • Log how many resources this application is requesting. [Multiple jobs may be running on the cluster, and this application might be requesting fewer resources; this is just for better visibility.]

  2. If the pipeline fails because of resource unavailability, e.g.:
    [ STARTING:o.a.t.y.YarnTwillController@138] - Yarn application worker.edw.aa_df_test_truncate.DeltaWorker application_1705099731196_0153 is not in running state. Shutting down controller.

  • Log that the reason is insufficient resources (ideally with the error code from YARN, if there is one).

The above would require interacting with the YARN client to fetch this information.
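
To make the request concrete, here is a minimal sketch of what the two proposed log messages could look like. It is not existing Twill or YARN code: the class `YarnResourceLogSketch`, its methods, and the message wording are all hypothetical, and the numbers are passed in as plain values. How they would be fetched from YARN is left open; the diagnostics string could plausibly come from `ApplicationReport.getDiagnostics()`. The "starved" check is only a heuristic based on the condition described in this ticket.

```java
import java.util.Locale;

/**
 * Hypothetical helper (not part of YARN or Twill): builds the two log
 * messages proposed in this ticket from plain numbers.
 */
final class YarnResourceLogSketch {

    /** Cluster-wide figures. Assumption: the caller fetches these from YARN. */
    static final class ClusterSnapshot {
        final long availableMb;
        final long pendingMb;
        final int availableVcores;

        ClusterSnapshot(long availableMb, long pendingMb, int availableVcores) {
            this.availableMb = availableMb;
            this.pendingMb = pendingMb;
            this.availableVcores = availableVcores;
        }
    }

    /**
     * Heuristic from the ticket text: available memory is below pending
     * memory, or below what this application itself requests.
     */
    static boolean looksStarved(ClusterSnapshot c, long requestedMb, int requestedVcores) {
        return c.availableMb < c.pendingMb
                || c.availableMb < requestedMb
                || c.availableVcores < requestedVcores;
    }

    /** Message for when the application enters SUBMITTED or ACCEPTED. */
    static String onStateChange(String appId, String state, ClusterSnapshot c,
                                long requestedMb, int requestedVcores) {
        // Escalate to WARN only when the heuristic suggests a shortage.
        String level = looksStarved(c, requestedMb, requestedVcores) ? "WARN" : "INFO";
        return String.format(Locale.ROOT,
                "%s Yarn application %s is %s: cluster availableMB=%d, pendingMB=%d, "
                        + "availableVCores=%d; requestedMB=%d, requestedVCores=%d",
                level, appId, state, c.availableMb, c.pendingMb, c.availableVcores,
                requestedMb, requestedVcores);
    }

    /** Message for when the application never reaches the running state. */
    static String onNotRunning(String appId, String diagnostics, ClusterSnapshot c,
                               long requestedMb, int requestedVcores) {
        StringBuilder sb = new StringBuilder("ERROR Yarn application ")
                .append(appId).append(" is not in running state.");
        if (diagnostics != null && !diagnostics.trim().isEmpty()) {
            // Prefer whatever YARN itself reported.
            sb.append(" YARN diagnostics: ").append(diagnostics.trim());
        } else if (looksStarved(c, requestedMb, requestedVcores)) {
            // Inferred from the snapshot, so say so explicitly.
            sb.append(" Likely cause (inferred, not reported by YARN): insufficient cluster resources.");
        } else {
            sb.append(" No cause reported.");
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Illustrative numbers only; the application id is the one from this ticket.
        ClusterSnapshot c = new ClusterSnapshot(2048, 8192, 1);
        String appId = "application_1705099731196_0153";
        System.out.println(onStateChange(appId, "ACCEPTED", c, 4096, 2));
        System.out.println(onNotRunning(appId, "", c, 4096, 2));
    }
}
```

Keeping the message building separate from the YARN calls means the wording can be reviewed and unit-tested without a cluster, and the inferred cause is clearly marked as a guess whenever YARN supplies no diagnostics.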

Release Notes

None


Details

Triaged: No
Size: M

Created February 21, 2024 at 8:08 AM
Updated February 21, 2024 at 8:08 AM