Microsoft.AspNetCore.DataProtection.KeyManagement cache invalidation issue #58074
-
Hi there. We use Microsoft.AspNetCore.DataProtection.KeyManagement for encryption purposes: multiple microservices rely on a shared keys.xml file held in blob storage that all of them have access to. This week we had an issue where one of the services unexpectedly created a new key more than two weeks before the expiration date of the current key, and then started encrypting data with that new key. Our other services didn't know anything about the new key because of the caching, and we started getting failures.
It seems there is no way to invalidate the cache other than modifying the keys by manually creating a new one, which is what our code does in these error cases. However, that committed a new keys.xml file to storage with another new key, while the earlier new key wasn't persisted. I assume that's because the second service committed its new key directly from its cache, so we lost the key from the first service. We managed to restore the version that had the first of the two keys after some challenges, so the situation is resolved.
But I would like to know: what should we be doing in this case? Should we force a restart of the other microservices when a new key is created? How else can we force the cache to be reloaded apart from modifying the keys file? Perhaps @amcasey you might have some ideas, as I know you are familiar with this code. Thanks.
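(For reference, the "manually creating a new one" step in our error handling looks roughly like the sketch below. This is a simplified illustration rather than our exact code; the ErrorRecovery name and the 90-day expiration window are placeholders.)

```csharp
using System;
using Microsoft.AspNetCore.DataProtection.KeyManagement;
using Microsoft.Extensions.DependencyInjection;

// Simplified sketch of our error path: create a brand-new key so that keys.xml
// changes and (we hoped) the other services would pick it up.
public static class ErrorRecovery
{
    public static void CreateReplacementKey(IServiceProvider services)
    {
        var keyManager = services.GetRequiredService<IKeyManager>();

        keyManager.CreateNewKey(
            activationDate: DateTimeOffset.UtcNow,
            expirationDate: DateTimeOffset.UtcNow.AddDays(90));
    }
}
```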
-
Hey, @dougreesIG, thanks for reaching out! Without much data to go on, my initial guess would be that the app had trouble connecting to your key repository (Azure?) and, having failed to do so, concluded that a new key was required. Unfortunately, there's no way for other instances to know when that happens, so, as you observed, they tend to fail when they see data encrypted with it.
I believe (but have not personally verified) that you could also use a dummy revocation - say, revoking all keys created before 1900-01-01 - to invalidate the cache.
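Something along these lines, resolving IKeyManager from your container, is what I have in mind - again, a sketch I haven't verified, and the helper name is made up:

```csharp
using System;
using Microsoft.AspNetCore.DataProtection.KeyManagement;
using Microsoft.Extensions.DependencyInjection;

// Hypothetical helper: write a no-op revocation entry so every instance re-reads
// keys.xml. Revoking keys created before 1900 revokes nothing in practice, but it
// modifies the repository, which should invalidate the cached key ring.
public static class KeyRingRefresher
{
    public static void ForceRefresh(IServiceProvider services)
    {
        var keyManager = services.GetRequiredService<IKeyManager>();

        keyManager.RevokeAllKeys(
            revocationDate: new DateTimeOffset(1900, 1, 1, 0, 0, 0, TimeSpan.Zero),
            reason: "Dummy revocation to force a key ring refresh");
    }
}
```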
This makes sense to me.
But this doesn't. Which instance generated the new key (to force cache invalidation)? Was it the same one that created the problematic key or a different one? Either way, I wouldn't expect a key to get lost (unless, I suppose, the generating instance lost access to the backing store and was unable to persist it).
I think you mean you manually replaced keys.xml in the repository and each instance. That would suffice, but is definitely not something we want app authors to have to do. Before I answer your other questions, it would help to know more about your app. Which version of aspnetcore are you running? Where are you storing your keys (i.e. your repository)? How are you protecting your keys? Cheers,
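For example, a typical setup of the kind I'm asking about might look like the sketch below. The URIs, blob path, and application name are placeholders, and your registration may well use different overloads:

```csharp
using System;
using Azure.Identity;
using Microsoft.AspNetCore.DataProtection;
using Microsoft.Extensions.DependencyInjection;

// Hypothetical registration: keys persisted to a shared Azure blob and protected
// with a Key Vault key, shared across services via a common application name.
var services = new ServiceCollection();

services.AddDataProtection()
    .SetApplicationName("shared-app-name")
    .PersistKeysToAzureBlobStorage(
        new Uri("https://<account>.blob.core.windows.net/dataprotection/keys.xml?<sas-token>"))
    .ProtectKeysWithAzureKeyVault(
        new Uri("https://<vault>.vault.azure.net/keys/dataprotection-key"),
        new DefaultAzureCredential());
```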
-
I'll try to answer in more detail later, but my immediate reaction is that the Azure provider has logic to prevent clobbering - basically, an instance should only be allowed to update the shared key file if its (pre-update) local state matched the server state. So my guess would be that the instance that generated the bad key (service A) also failed to publish the bad key back to Azure. If that's what happened, then B didn't clobber the bad key - it simply never existed in shared storage. Do you have the option of testing out the 9.0 version of the data protection package(s)? We added some more resilience to server unavailability that we're hoping will prevent situations like this.
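To illustrate the kind of conditional update I mean (this is not the actual AzureBlobXmlRepository code, just the general ETag-based pattern, with made-up helper names):

```csharp
using System;
using System.Threading.Tasks;
using Azure;
using Azure.Storage.Blobs;
using Azure.Storage.Blobs.Models;

// Sketch of optimistic concurrency against a shared blob: the upload only succeeds
// if the blob's ETag still matches the one we read, i.e. nobody else has written
// the key file since we last fetched it.
public static class ConditionalKeyFileWriter
{
    public static async Task<bool> TryUpdateAsync(BlobClient blob, BinaryData newContent)
    {
        // Read the current blob so we know which version our update is based on.
        BlobDownloadResult current = await blob.DownloadContentAsync();
        ETag observedETag = current.Details.ETag;

        try
        {
            await blob.UploadAsync(
                newContent,
                new BlobUploadOptions
                {
                    Conditions = new BlobRequestConditions { IfMatch = observedETag }
                });
            return true;
        }
        catch (RequestFailedException ex) when (ex.Status == 412) // precondition failed
        {
            // Someone else updated the key file first; re-read the repository and
            // retry instead of clobbering the newer contents.
            return false;
        }
    }
}
```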
Yes, I have a feeling this is it! I'm looking at
https://github.com/Azure/azure-sdk-for-net/blob/Azure.Extensions.AspNetCore.DataProtection.Blobs_1.3.4/sdk/extensions/Azure.Extensions.AspNetCore.DataProtection.Blobs/src/AzureBlobXmlRepository.cs
and this seems to have logic to refresh first before overwriting. I'll get my devs to look at that first thing tomorrow.