Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

install: several fixes to avoid containerd config file corruption #482

Open
wants to merge 5 commits into
base: main
Choose a base branch
from

Conversation

ldoktor
Copy link
Contributor

@ldoktor ldoktor commented Jan 14, 2025

The corruption seems to be mainly caused by double-execution of the pre-reqs setup. This part should avoid the corruption but it still might collide with kata-deploy as the operator currently uses the same label to start cleanup and to detect the cleanup is done. This should be handled separately later...

Fixes #481

@ldoktor ldoktor requested a review from a team as a code owner January 14, 2025 11:53
it might happen that someone else is also modifying containerd
configuration at the same time. This patch ensures our modification is
atomic and only applied when the source file is not changed. In case it
fails to update the file in 10 iterations it fails.

Signed-off-by: Lukáš Doktor <[email protected]>
containerd fails to start when import is missing. Let's first remove the
import and only when it's gone remove the drop-in nydus configuration.

Signed-off-by: Lukáš Doktor <[email protected]>
the install pod can fail and be re-scheduled. Modify the containerd
setup to allow re-execution without breaking the setup.

Fixes: confidential-containers#481

Signed-off-by: Lukáš Doktor <[email protected]>
the labeling itself as well as all the following jobs require functional
node. Let's wait for the node to be up and running before labeling it.

Signed-off-by: Lukáš Doktor <[email protected]>
@mythi
Copy link
Contributor

mythi commented Jan 16, 2025

@ldoktor nice work on #481!

What's your preference on getting these two PRs reviewed and merged. Should this be completed first?

@ldoktor
Copy link
Contributor Author

ldoktor commented Jan 16, 2025

@ldoktor nice work on #481!

What's your preference on getting these two PRs reviewed and merged. Should this be completed first?

I don't really have a preference. This one is less intrusive but I only tested it briefly. It should decrease the probability of hitting the issue, but it is only a workaround, while the #483 is a proper fix. On the other hand #483 contains intrusive changes (changing names of labels) which might not be a good idea this early before a release (on the other hand I have tested that one a lot and it seems to work well)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Corrupted containerd/config.toml after several operator install/uninstall iterations
2 participants