Create Kueue service backend #706

Open
5 of 13 tasks
cortadocodes opened this issue Jan 6, 2025 · 0 comments

cortadocodes commented Jan 6, 2025

Epic

End User Goal

Allow a user to run a question of any size without it timing out.

Overview

Cloud Run limits our ability to run questions that take longer than an hour or that need more powerful hardware. It also locks us into a set of recurring problems, such as pointless question reruns and having no way to cancel or monitor individual questions.

Creating a Kueue service backend will let us:

  • Run questions that take any amount of time (specifically, runs longer than 1 hour)
  • Access hardware we can't currently access (e.g. GPUs)
  • Provision arbitrary hardware (CPU, memory, storage etc.)
  • Stop pointless question reruns by controlling when we acknowledge question events
  • Cancel running questions
  • Monitor running questions individually
  • Run questions on providers other than Google (i.e. on any Kubernetes cluster)
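
As a rough illustration of how several of these goals could map onto Kueue, here is a minimal sketch that builds a Kubernetes `batch/v1` Job manifest for a single question. It is not the planned implementation. The function name `build_question_job`, the queue name `question-queue`, the `question-<id>` naming scheme and the default resource values are all hypothetical. The `kueue.x-k8s.io/queue-name` label and the `suspend` field are the standard way of handing a Job to Kueue, and `nvidia.com/gpu` is the usual resource name for NVIDIA GPUs.

```python
import json


def build_question_job(question_id, image, queue_name="question-queue",
                       cpu="2", memory="4Gi", gpus=0):
    """Build a batch/v1 Job manifest that Kueue can admit from a LocalQueue.

    All names and defaults here are illustrative assumptions, not part of
    the project. `question_id` must be a valid DNS label fragment
    (lowercase alphanumerics and hyphens).
    """
    resources = {"cpu": cpu, "memory": memory}
    if gpus:
        # Standard extended resource name for NVIDIA GPUs.
        resources["nvidia.com/gpu"] = str(gpus)

    return {
        "apiVersion": "batch/v1",
        "kind": "Job",
        "metadata": {
            # One Job per question, so each run can be monitored or
            # deleted (cancelled) individually.
            "name": f"question-{question_id}",
            # Label Kueue uses to pick the LocalQueue for this Job.
            "labels": {"kueue.x-k8s.io/queue-name": queue_name},
        },
        "spec": {
            # Created suspended; Kueue unsuspends it when quota is available.
            "suspend": True,
            # Assumption: no automatic pod retries on failure.
            "backoffLimit": 0,
            # No activeDeadlineSeconds is set, so the Job itself has no
            # run-time limit.
            "template": {
                "spec": {
                    "restartPolicy": "Never",
                    "containers": [{
                        "name": "question",
                        "image": image,
                        "resources": {
                            "requests": dict(resources),
                            "limits": dict(resources),
                        },
                    }],
                }
            },
        },
    }


if __name__ == "__main__":
    # Hypothetical image name, for illustration only.
    manifest = build_question_job("abc123", "example.com/my-service:latest", gpus=1)
    print(json.dumps(manifest, indent=2))
```

Because each question is its own Job, cancelling a question would amount to deleting that Job, and monitoring it would amount to reading that Job's status. The manifest uses only core Kubernetes and Kueue fields, so it should work on any cluster with Kueue installed, not only on Google's.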

Contents

@cortadocodes cortadocodes added epic Contains links to a collection of issues major-missing-feature labels Jan 6, 2025
@cortadocodes cortadocodes changed the title Create Kueue backend Create Kueue service backend Jan 6, 2025
@cortadocodes cortadocodes self-assigned this Jan 9, 2025