Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add moderate way to retry docker login failures #849

Open
wants to merge 22 commits into
base: master
Choose a base branch
from

Conversation

fedordikarev
Copy link

Ref: #750
Ref: https://github.com/neondatabase/cloud/issues/20164

Current problem:

  • occasionally we hit 502 from docker registry (from our experience, no registry with 100% uptime, and issues could happen no matter who owns the registry)
  • docker/login-action doesn't have retry mechanism
  • so we either have to fail the whole Workflow
  • or we have to restart whole Workflow
  • we do "restarting whole Workflow", and most of the times second attempt works.
  • and if second-third attempt dont work, often it's bigger issue so just need to wait and we introduce local caching for our case

that mechanism at the same time should be conservative, not adding extra unneeded load to DockerHub or another Registry, so for example we shouldn't retry "Bad credentials", we should have timeouts, and reasonable max_attempts

so with that approach:

  • having no retries in default config
  • having max number of attempts
  • we expect changes to be safe
  • people who doesn't care much, they will continue to use default settings with no retries
  • people who knows, they understand no reason for over-intensive retries, and should set timeout and max attempts wisely
  • as safer option, we could add extra checks, like retry timeout 15 seconds or more, max attempts 4 or less, etc
  • we could also add exponential backoff here, just for me it seems extra complexity, while just give up after 3 attempts and 1 minute IMO should work best for the login action

Please let me know if it's good approach or if anything should be adjusted to be accepted

Signed-off-by: Fedor Dikarev <[email protected]>
Signed-off-by: Fedor Dikarev <[email protected]>
Signed-off-by: Fedor Dikarev <[email protected]>
Signed-off-by: Fedor Dikarev <[email protected]>
Signed-off-by: Fedor Dikarev <[email protected]>
Signed-off-by: Fedor Dikarev <[email protected]>
Signed-off-by: Fedor Dikarev <[email protected]>
Signed-off-by: Fedor Dikarev <[email protected]>
Signed-off-by: Fedor Dikarev <[email protected]>
Signed-off-by: Fedor Dikarev <[email protected]>
Signed-off-by: Fedor Dikarev <[email protected]>
Signed-off-by: Fedor Dikarev <[email protected]>
Signed-off-by: Fedor Dikarev <[email protected]>
Signed-off-by: Fedor Dikarev <[email protected]>
Signed-off-by: Fedor Dikarev <[email protected]>
Signed-off-by: Fedor Dikarev <[email protected]>
Signed-off-by: Fedor Dikarev <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant