Gang Scheduling of JobSets #969
Open
Labels
kind/feature: Categorizes issue or PR as related to a new feature.
triage/accepted: Indicates an issue or PR is ready to be actively worked on.
wg/workload-aware-scheduling: Categorizes issue or PR as relevant to workload-aware scheduling WG.
Feature
What would you like to be added:
Following the gang scheduling support in LWS (LeaderWorkerSet), I think we should have a way to support gang scheduling of JobSets via PodGroups.
Why is this needed:
JobSets usually require tight grouping, and gang scheduling would help ensure that all of their pods are scheduled at the same time.
Projects like Kubeflow Trainer and KAI use PodGroups to enforce gang scheduling, but our controller does not create PodGroups.
This means that these projects have to create their own PodGroups, which is not ideal.
Why not Kueue:
In previous requests for this, we recommended Kueue for gang scheduling. I still think that is valid, but there are other options. Also, recommending Kueue is heavyweight if someone only wants a JobSet to be scheduled together.
So I think having this repo create the PodGroups would make sense.
This enhancement requires the following artifacts:
The artifacts should be linked in subsequent comments.