
[SPARK-51011][CORE] Add logging for whether a task is going to be interrupted when killed #49699

Closed
neilramaswamy wants to merge 4 commits into master from neilramaswamy/spark-51011

Conversation

neilramaswamy
Contributor

What changes were proposed in this pull request?

We now log the value of `interruptThread` when a `TaskRunner`'s `kill` method is invoked. This should help with debugging cases where potential zombie Spark tasks do not seem to be exiting.
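
For illustration, here is a minimal, self-contained sketch of the kind of log statement this adds. The `ExampleTaskRunner` class, its fields, and the exact message wording are hypothetical and not the actual patch; Spark's real `TaskRunner` is an inner class of `Executor` and uses Spark's own logging framework.

```scala
import org.slf4j.LoggerFactory

// Hypothetical, simplified stand-in for Spark's TaskRunner, for illustration only.
class ExampleTaskRunner(taskId: Long, workerThread: Thread) {
  private val log = LoggerFactory.getLogger(classOf[ExampleTaskRunner])

  @volatile private var killed = false

  def kill(interruptThread: Boolean, reason: String): Unit = {
    // The point of the change: surface interruptThread in the log, so that when a
    // task refuses to exit (a "zombie" task) it is clear whether it was ever sent
    // a Java interrupt or only marked as killed.
    log.info(s"Killing task $taskId (reason: $reason, interruptThread: $interruptThread)")
    killed = true
    if (interruptThread) {
      workerThread.interrupt()
    }
  }
}
```

In this sketch, a task that never exits and whose log shows `interruptThread: false` was only marked as killed and never actually interrupted, which is exactly the ambiguity the logging is meant to resolve.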

Why are the changes needed?

Today, it's tricky to debug why a task is not exiting (and thus, why executors might be getting lost) without knowing for sure if it was issued a Java interrupt.

Does this PR introduce any user-facing change?

No.

How was this patch tested?

Ran `org.apache.spark.executor.ExecutorSuite` and verified the log looked as expected.

Was this patch authored or co-authored using generative AI tooling?

No.

@github-actions bot added the CORE label Jan 28, 2025
@HyukjinKwon changed the title from [SPARK-51011] Add logging for whether a task is going to be interrupted when killing tasks to [SPARK-51011][CORE] Add logging for whether a task is going to be interrupted when killing tasks Jan 28, 2025
@neilramaswamy
Contributor Author

cc: @JoshRosen, but let me know if you do not want to review this.

Also, I don't plan on backporting this, but reviewers, please let me know if you think it's a good idea.

@neilramaswamy changed the title from [SPARK-51011][CORE] Add logging for whether a task is going to be interrupted when killing tasks to [SPARK-51011][CORE] Add logging for whether a task is going to be interrupted when killed Jan 28, 2025
@JoshRosen
Contributor

LGTM: this seems like a straightforward and useful logging enhancement.

@HyukjinKwon
Member

Merged to master and branch-4.0.

HyukjinKwon added a commit that referenced this pull request Feb 3, 2025
…errupted when killed

We now log the value of `interruptThread` when a `TaskRunner`'s `kill` method is invoked. This should help with debugging when potential zombie Spark tasks do not seem to be exiting.

Today, it's tricky to debug why a task is not exiting (and thus, why executors might be getting lost) without knowing for sure if it was issued a Java interrupt.

No.

Ran `org.apache.spark.executor.ExecutorSuite` and verified the log looked as expected.

No.

Closes #49699 from neilramaswamy/spark-51011.

Lead-authored-by: Neil Ramaswamy <[email protected]>
Co-authored-by: Hyukjin Kwon <[email protected]>
Signed-off-by: Hyukjin Kwon <[email protected]>
(cherry picked from commit f03f45e)
Signed-off-by: Hyukjin Kwon <[email protected]>