Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

🌱 sync defined machine labels to node #11650

Open
wants to merge 1 commit into
base: main
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
3 changes: 3 additions & 0 deletions controllers/alias.go
Original file line number Diff line number Diff line change
Expand Up @@ -72,6 +72,8 @@ type MachineReconciler struct {
WatchFilterValue string

RemoteConditionsGracePeriod time.Duration

SyncMachineLabels []string
}

func (r *MachineReconciler) SetupWithManager(ctx context.Context, mgr ctrl.Manager, options controller.Options) error {
Expand All @@ -81,6 +83,7 @@ func (r *MachineReconciler) SetupWithManager(ctx context.Context, mgr ctrl.Manag
ClusterCache: r.ClusterCache,
WatchFilterValue: r.WatchFilterValue,
RemoteConditionsGracePeriod: r.RemoteConditionsGracePeriod,
SyncMachineLabels: r.SyncMachineLabels,
}).SetupWithManager(ctx, mgr, options)
}

Expand Down
11 changes: 7 additions & 4 deletions docs/book/src/reference/api/metadata-propagation.md
Original file line number Diff line number Diff line change
Expand Up @@ -63,7 +63,10 @@ Top-level labels that meet a specific cretria are propagated to the Node labels
- `.labels.[label-meets-criteria]` => `Node.labels`
- `.annotations` => Not propagated.

Label should meet one of the following criteria to propagate to Node:
- Has `node-role.kubernetes.io` as prefix.
- Belongs to `node-restriction.kubernetes.io` domain.
- Belongs to `node.cluster.x-k8s.io` domain.
Labels on a Machine that match the list of target domains provided by the combined list of the default sync machine label domains and `--additional-sync-machine-labels` on the manager are synced down to the node.
fabriziopandini marked this conversation as resolved.
Show resolved Hide resolved
The default sync machine label domains are:
- `node-role.kubernetes.io`
- `node-restriction.kubernetes.io`
- `*.node-restriction.kubernetes.io`
- `node.cluster.x-k8s.io`
- `*.node.cluster.x-k8s.io`
14 changes: 9 additions & 5 deletions docs/proposals/20220927-label-sync-between-machine-and-nodes.md
Original file line number Diff line number Diff line change
Expand Up @@ -66,8 +66,9 @@ With the "divide and conquer" principle in mind this proposal aims to address th

### Goals

- Support label sync from Machine to the linked Kubernetes node, limited to `node-role.kubernetes.io/` prefix and the `node-restriction.kubernetes.io` domain.
- Support label sync from Machine to the linked Kubernetes node.
- Support syncing labels from Machine to the linked Kubernetes node for the Cluster API owned `node.cluster.x-k8s.io` domain.
- Support to configure the domains of labels to be synced from the Machine to the Node.

### Non-Goals

Expand Down Expand Up @@ -98,14 +99,16 @@ While designing a solution for syncing labels between Machine and underlying Kub

### Label domains & prefixes

The idea of scoping synchronization to a well defined set of labels is a first answer to security/concurrency concerns; labels to be managed by Cluster API have been selected based on following criteria:
A default list of labels would always be synced from the Machines to the Nodes. This list can be extended using a flag on the manager by providing the additional list of label domains.

- The `node-role.kubernetes.io` label has been used widely in the past to identify the role of a Kubernetes Node (e.g. `node-role.kubernetes.io/worker=''`). For example, `kubectl get node` looks for this specific label when displaying the role to the user.
The following is the default list of label domains that would always be sync from Machines to Nodes:

- The `node-restriction.kubernetes.io/` domain is recommended in the [Kubernetes docs](https://kubernetes.io/docs/reference/access-authn-authz/admission-controllers/#noderestriction) for things such as workload placement; please note that in this case
- `node-role.kubernetes.io`: The `node-role.kubernetes.io` label has been used widely in the past to identify the role of a Kubernetes Node (e.g. `node-role.kubernetes.io/worker=''`). For example, `kubectl get node` looks for this specific label when displaying the role to the user.

- `node-restriction.kubernetes.io`, `*.node-restriction.kubernetes.io`: The `node-restriction.kubernetes.io/` domain is recommended in the [Kubernetes docs](https://kubernetes.io/docs/reference/access-authn-authz/admission-controllers/#noderestriction) for things such as workload placement; please note that in this case
we are considering the entire domain, thus also labels like `example.node-restriction.kubernetes.io/fips=true` fall in this category.

- Cluster API owns a specific domain: `node.cluster.x-k8s.io`.
- `node.cluster.x-k8s.io`, `*node.cluster.x-k8s.io`: Cluster API owns a specific domain: `node.cluster.x-k8s.io`.

#### Synchronization of CAPI Labels

Expand Down Expand Up @@ -163,3 +166,4 @@ Users could also implement their own label synchronizer in their tooling, but th

- [ ] 09/27/2022: First Draft of this document
- [ ] 09/28/2022: First Draft of this document presented in the Cluster API office hours meeting
- [ ] 01/09/2025: Update to support configurable label syncing Ref:[11657](https://github.com/kubernetes-sigs/cluster-api/issues/11657)
2 changes: 2 additions & 0 deletions internal/controllers/machine/machine_controller.go
Original file line number Diff line number Diff line change
Expand Up @@ -100,6 +100,8 @@ type Reconciler struct {

RemoteConditionsGracePeriod time.Duration

SyncMachineLabels []string

controller controller.Controller
recorder record.EventRecorder
externalTracker external.ObjectTracker
Expand Down
24 changes: 14 additions & 10 deletions internal/controllers/machine/machine_controller_noderef.go
Original file line number Diff line number Diff line change
Expand Up @@ -19,6 +19,7 @@ package machine
import (
"context"
"fmt"
"path/filepath"
"strings"

"github.com/pkg/errors"
Expand Down Expand Up @@ -127,7 +128,7 @@ func (r *Reconciler) reconcileNode(ctx context.Context, s *scope) (ctrl.Result,
// Compute labels to be propagated from Machines to nodes.
// NOTE: CAPI should manage only a subset of node labels, everything else should be preserved.
// NOTE: Once we reconcile node labels for the first time, the NodeUninitializedTaint is removed from the node.
nodeLabels := getManagedLabels(machine.Labels)
nodeLabels := r.getManagedLabels(machine.Labels)

// Get interruptible instance status from the infrastructure provider and set the interruptible label on the node.
interruptible := false
Expand Down Expand Up @@ -178,24 +179,27 @@ func (r *Reconciler) reconcileNode(ctx context.Context, s *scope) (ctrl.Result,

// getManagedLabels gets a map[string]string and returns another map[string]string
// filtering out labels not managed by CAPI.
func getManagedLabels(labels map[string]string) map[string]string {
func (r *Reconciler) getManagedLabels(labels map[string]string) map[string]string {
managedLabels := make(map[string]string)
for key, value := range labels {
dnsSubdomainOrName := strings.Split(key, "/")[0]
if dnsSubdomainOrName == clusterv1.NodeRoleLabelPrefix {
managedLabels[key] = value
}
if dnsSubdomainOrName == clusterv1.NodeRestrictionLabelDomain || strings.HasSuffix(dnsSubdomainOrName, "."+clusterv1.NodeRestrictionLabelDomain) {
managedLabels[key] = value
}
if dnsSubdomainOrName == clusterv1.ManagedNodeLabelDomain || strings.HasSuffix(dnsSubdomainOrName, "."+clusterv1.ManagedNodeLabelDomain) {
if r.allowedDomainSync(dnsSubdomainOrName) {
managedLabels[key] = value
}
}

return managedLabels
}

func (r *Reconciler) allowedDomainSync(domain string) bool {
for _, v := range r.SyncMachineLabels {
matched, _ := filepath.Match(v, domain)
Copy link
Member

@sbueringer sbueringer Jan 17, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the reason we're using filepath.Match that as per the documentation "matches shell file name patterns" to select labels to sync?

I would have expected something like regexp as it should be more intuitive to match regexp against the full labels keys instead of using file patc matching on a substring of the label key

if matched {
return true
}
}
return false
}

// summarizeNodeConditions summarizes a Node's conditions and returns the summary of condition statuses and concatenate failed condition messages:
// if there is at least 1 semantically-negative condition, summarized status = False;
// if there is at least 1 semantically-positive condition when there is 0 semantically negative condition, summarized status = True;
Expand Down
51 changes: 45 additions & 6 deletions internal/controllers/machine/machine_controller_noderef_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -700,7 +700,6 @@ func TestSummarizeNodeConditions(t *testing.T) {
}

func TestGetManagedLabels(t *testing.T) {
// Create managedLabels map from known managed prefixes.
managedLabels := map[string]string{
clusterv1.NodeRoleLabelPrefix + "/anyRole": "",

Expand All @@ -713,24 +712,64 @@ func TestGetManagedLabels(t *testing.T) {
"custom-prefix." + clusterv1.NodeRestrictionLabelDomain: "",
clusterv1.NodeRestrictionLabelDomain + "/anything": "",
"custom-prefix." + clusterv1.NodeRestrictionLabelDomain + "/anything": "",
"test.foo.com": "",
}

// Append arbitrary labels.
allLabels := map[string]string{
additionalLabels := map[string]string{
"foo": "",
"bar": "",
"company.xyz/node.cluster.x-k8s.io": "not-managed",
"gpu-node.cluster.x-k8s.io": "not-managed",
"company.xyz/node-restriction.kubernetes.io": "not-managed",
"gpu-node-restriction.kubernetes.io": "not-managed",
"wrong.test.foo.com": "",
}

allLabels := map[string]string{}
for k, v := range managedLabels {
allLabels[k] = v
}
for k, v := range additionalLabels {
allLabels[k] = v
}

g := NewWithT(t)
got := getManagedLabels(allLabels)
g.Expect(got).To(BeEquivalentTo(managedLabels))
tests := []struct {
name string
syncMachineLabels []string
allLabels map[string]string
managedLabels map[string]string
}{
{
name: "sync defined labels",
syncMachineLabels: []string{
"node-role.kubernetes.io",
"node-restriction.kubernetes.io",
"*.node-restriction.kubernetes.io",
"node.cluster.x-k8s.io",
"*.node.cluster.x-k8s.io",
"test.foo.com",
},
allLabels: allLabels,
managedLabels: managedLabels,
},
{
name: "sync all labels",
syncMachineLabels: []string{"*"},
allLabels: allLabels,
managedLabels: allLabels,
},
}

for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
g := NewWithT(t)
r := &Reconciler{
SyncMachineLabels: tt.syncMachineLabels,
}
got := r.getManagedLabels(tt.allLabels)
g.Expect(got).To(BeEquivalentTo(tt.managedLabels))
})
}
}

func TestPatchNode(t *testing.T) {
Expand Down
7 changes: 7 additions & 0 deletions internal/controllers/machine/suite_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -94,6 +94,13 @@ func TestMain(m *testing.M) {
APIReader: mgr.GetAPIReader(),
ClusterCache: clusterCache,
RemoteConditionsGracePeriod: 5 * time.Minute,
SyncMachineLabels: []string{
"node-role.kubernetes.io",
"node-restriction.kubernetes.io",
"*.node-restriction.kubernetes.io",
"node.cluster.x-k8s.io",
"*.node.cluster.x-k8s.io",
},
}).SetupWithManager(ctx, mgr, controller.Options{MaxConcurrentReconciles: 1}); err != nil {
panic(fmt.Sprintf("Failed to start MachineReconciler: %v", err))
}
Expand Down
21 changes: 21 additions & 0 deletions main.go
Original file line number Diff line number Diff line change
Expand Up @@ -124,6 +124,7 @@ var (
clusterResourceSetConcurrency int
machineHealthCheckConcurrency int
useDeprecatedInfraMachineNaming bool
syncMachineLabels []string
)

func init() {
Expand Down Expand Up @@ -254,11 +255,30 @@ func InitFlags(fs *pflag.FlagSet) {
"Use deprecated infrastructure machine naming")
_ = fs.MarkDeprecated("use-deprecated-infra-machine-naming", "This flag will be removed in v1.9.")

fs.StringArrayVar(&syncMachineLabels, "additional-sync-machine-labels", []string{}, `The list of additional domains used to sync labels from machines to nodes.
The flag also supports standard glob processing:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How sure are we that this definition from the filepath.Match godoc is identical to glob patterns?

'*' matches any sequence of non-Separator characters
'?' matches any single non-Separator character
'[' [ '^' ] { character-range } ']'
character class (must be non-empty)
c matches character c (c != '*', '?', '\\', '[')
'\\' c matches character c`)

flags.AddManagerOptions(fs, &managerOptions)

feature.MutableGates.AddFlag(fs)
}

var (
defaultSyncMachineLabels = []string{
"node-role.kubernetes.io",
"node-restriction.kubernetes.io",
"*.node-restriction.kubernetes.io",
"node.cluster.x-k8s.io",
"*.node.cluster.x-k8s.io",
}
)

// Add RBAC for the authorized diagnostics endpoint.
// +kubebuilder:rbac:groups=authentication.k8s.io,resources=tokenreviews,verbs=create
// +kubebuilder:rbac:groups=authorization.k8s.io,resources=subjectaccessreviews,verbs=create
Expand Down Expand Up @@ -565,6 +585,7 @@ func setupReconcilers(ctx context.Context, mgr ctrl.Manager, watchNamespaces map
ClusterCache: clusterCache,
WatchFilterValue: watchFilterValue,
RemoteConditionsGracePeriod: remoteConditionsGracePeriod,
SyncMachineLabels: append(defaultSyncMachineLabels, syncMachineLabels...),
}).SetupWithManager(ctx, mgr, concurrency(machineConcurrency)); err != nil {
setupLog.Error(err, "Unable to create controller", "controller", "Machine")
os.Exit(1)
Expand Down
Loading