DRA: admin-controlled device attributes (device health, maintenance, priority) #5027

Open
4 tasks
pohly opened this issue Jan 8, 2025 · 3 comments
Labels
sig/node: Categorizes an issue or PR as relevant to SIG Node.
sig/scheduling: Categorizes an issue or PR as relevant to SIG Scheduling.
wg/device-management: Categorizes an issue or PR as relevant to WG Device Management.

Comments

@pohly
Contributor

pohly commented Jan 8, 2025

Enhancement Description

Please keep this description up to date. This will help the Enhancement Team to track the evolution of the enhancement efficiently.

k8s-ci-robot added the needs-sig label (indicates an issue or PR lacks a `sig/foo` label and requires one) on Jan 8, 2025
@pohly
Contributor Author

pohly commented Jan 8, 2025

/sig node
/sig scheduling
/wg device-management

k8s-ci-robot added the sig/node, sig/scheduling, and wg/device-management labels and removed the needs-sig label on Jan 8, 2025
github-project-automation bot moved this to Needs Triage in SIG Scheduling on Jan 8, 2025
@eero-t

eero-t commented Jan 13, 2025

By overriding device attributes, cluster admins can prevent usage of certain devices (unhealthy, taken offline for maintenance)

How should an admin run test workloads on a device for which scheduling has been disabled (e.g. for a firmware upgrade), to find out whether it can be enabled again for production workloads?

With node taints, one would use taint tolerations for this, but I don't see from the KEP description how something similar would be achieved for DRA devices.

@pohly
Contributor Author

pohly commented Jan 13, 2025

@eero-t: let's discuss in the PR, there we have threading => https://github.com/kubernetes/enhancements/pull/5034/files#r1913015929

Projects
Status: Needs Triage

3 participants