There have been times when Rekor successfully writes a new entry to the transparency log but fails to write to Redis. We could improve this by adding retry logic to the Rekor API, but there will always be some edge case where the API server is unable to write to Redis before it shuts down and loses its in-memory retry queue.
Rather than relying on the API server to guarantee writes to Redis, we can use the MySQL database as the source of truth. GCP Datastream is a serverless change data capture offering that integrates with databases and emits events when writes occur. Datastream does not currently support sending events straight to GCP PubSub (that support is a work in progress), but it does support writing to GCS, and GCS object-creation events can trigger PubSub messages that would then be consumed by a new job. The new job would only ack a PubSub message after the entry was successfully written to Redis.
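To make the flow concrete, here is a minimal sketch in Go of what that consumer job could look like. The project ID, subscription name, Datastream row format, and Redis key layout below are assumptions for illustration rather than Rekor's actual schema; the load-bearing detail is that a message is acked only after every Redis write succeeds, so failed writes are redelivered.

```go
// Hypothetical consumer job: receive GCS object-creation notifications via
// PubSub, read the Datastream output object, write index entries to Redis,
// and ack only on success.
package main

import (
	"context"
	"encoding/json"
	"io"
	"log"

	"cloud.google.com/go/pubsub"
	"cloud.google.com/go/storage"
	"github.com/redis/go-redis/v9"
)

// gcsEvent holds the fields we need from a GCS notification payload.
type gcsEvent struct {
	Bucket string `json:"bucket"`
	Name   string `json:"name"`
}

// indexedEntry is an assumed shape for what a Datastream row yields: the
// entry UUID plus the search keys the Rekor API would have indexed.
type indexedEntry struct {
	UUID      string
	IndexKeys []string
}

// parseDatastreamRow is a placeholder; a real job would decode the JSON or
// Avro row written by Datastream and derive the index keys from the entry.
func parseDatastreamRow(b []byte) (indexedEntry, error) {
	var e indexedEntry
	err := json.Unmarshal(b, &e)
	return e, err
}

func main() {
	ctx := context.Background()

	psClient, err := pubsub.NewClient(ctx, "my-project") // assumed project ID
	if err != nil {
		log.Fatal(err)
	}
	gcsClient, err := storage.NewClient(ctx)
	if err != nil {
		log.Fatal(err)
	}
	rdb := redis.NewClient(&redis.Options{Addr: "localhost:6379"})

	sub := psClient.Subscription("rekor-datastream-gcs-events") // assumed name
	err = sub.Receive(ctx, func(ctx context.Context, msg *pubsub.Message) {
		var ev gcsEvent
		if err := json.Unmarshal(msg.Data, &ev); err != nil {
			log.Printf("bad notification payload: %v", err)
			msg.Nack()
			return
		}

		// Read the Datastream output object that triggered this notification.
		r, err := gcsClient.Bucket(ev.Bucket).Object(ev.Name).NewReader(ctx)
		if err != nil {
			msg.Nack()
			return
		}
		body, err := io.ReadAll(r)
		r.Close()
		if err != nil {
			msg.Nack()
			return
		}

		entry, err := parseDatastreamRow(body)
		if err != nil {
			msg.Nack()
			return
		}

		// Assumed Redis layout: each search key maps to a list of entry UUIDs.
		for _, key := range entry.IndexKeys {
			if err := rdb.LPush(ctx, key, entry.UUID).Err(); err != nil {
				msg.Nack() // redelivered until the Redis write succeeds
				return
			}
		}

		// Ack only after every Redis write has succeeded.
		msg.Ack()
	})
	if err != nil {
		log.Fatal(err)
	}
}
```

How the job derives the same index keys the API computes, and how it handles Datastream update or delete events, would need to be covered in a design.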
Open Questions:
Should the API server still attempt to write to Redis, with this flow used only for reconciliation, or should the API server not write to Redis at all and rely solely on this flow?
How do we prevent abuse by principals with the ability to write to the GCS bucket? Can we rely on IAM, or should the PubSub consumer job validate the entry against Rekor's signing keys?
How do we handle the lifecycle of the temporary GCS objects? Is deleting all objects older than N days sufficient? (A sketch of an age-based lifecycle rule follows this list.)
Would the combined costs of Datastream + GCS + PubSub be too high?
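For the lifecycle question, one simple option is an age-based bucket lifecycle rule. Here is a minimal sketch in Go, where the bucket name and the 7-day window are assumptions (whether age-based deletion alone is safe during long outages is an open question discussed below):

```go
// Hypothetical one-off setup: delete Datastream staging objects older than 7 days.
package main

import (
	"context"
	"log"

	"cloud.google.com/go/storage"
)

func main() {
	ctx := context.Background()
	client, err := storage.NewClient(ctx)
	if err != nil {
		log.Fatal(err)
	}

	bucket := client.Bucket("rekor-datastream-staging") // assumed bucket name
	if _, err := bucket.Update(ctx, storage.BucketAttrsToUpdate{
		Lifecycle: &storage.Lifecycle{
			Rules: []storage.LifecycleRule{{
				Action:    storage.LifecycleAction{Type: storage.DeleteAction},
				Condition: storage.LifecycleCondition{AgeInDays: 7},
			}},
		},
	}); err != nil {
		log.Fatal(err)
	}
}
```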
I very much like this idea of using the DB as the source of truth and relying on GCP to guarantee that entry-upload side effects occur.
Should the API server still attempt to write to Redis, with this flow used only for reconciliation, or should the API server not write to Redis at all and rely solely on this flow?
Given that this feature would be exclusive to GCP, supporting both would be ideal for those who only want to use Redis. I would disable writing directly to Redis when a flag indicating that Datastream is in use is enabled.
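To illustrate, a minimal sketch of that gating, where the flag name and the helper are hypothetical rather than Rekor's actual configuration surface:

```go
// Hypothetical package wrapping the API server's Redis index writes.
package indexstorage

import (
	"context"
	"flag"

	"github.com/redis/go-redis/v9"
)

// Hypothetical flag: when set, the API server skips direct Redis writes and
// relies on the Datastream -> GCS -> PubSub pipeline to populate the index.
var useDatastreamIndexing = flag.Bool("datastream_indexing", false,
	"rely on Datastream to populate the Redis search index")

// StoreIndexKey records an entry UUID under a search key, unless Datastream
// indexing is enabled, in which case the PubSub consumer job does the write.
func StoreIndexKey(ctx context.Context, rdb *redis.Client, key, uuid string) error {
	if *useDatastreamIndexing {
		return nil
	}
	return rdb.LPush(ctx, key, uuid).Err()
}
```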
How do we prevent abuse by principals with the ability to write to the GCS bucket? Can we rely on IAM, or should the PubSub consumer job validate the entry against Rekor's signing keys?
I think IAM should be sufficient, though this should be fleshed out in a design.
How do we handle the lifecycle of the temporary GCS objects? Is deleting all objects older than N days sufficient?
We should also consider multi-day outages. For example, if PubSub is down for a few days, what happens if the temporary object has already been deleted from GCS? Do we need a job to delete old entries from GCS?