
Create a login flow via pyDataverse to retrieve an API Token via Dataverse/a browser #209

Open
shoeffner opened this issue Jul 30, 2024 · 1 comment
Labels
type:feature New feature

Comments

@shoeffner
Collaborator

shoeffner commented Jul 30, 2024

In https://dataverse.zulipchat.com/#narrow/stream/377090-python/topic/auth.20options, we discussed different options to ease access to Dataverse for pyDataverse / CLI / API clients.

One of the options is similar to kubectl's and nomad's authentication mechanisms: You open a browser to retrieve a (bearer) token, log in to your favorite OIDC provider with a callback to localhost, and get the token passed to a temporarily started local webserver.

In this issue we want to track ideas about how to make this work.

@shoeffner added the type:feature and status:incoming labels and removed the status:incoming label on Jul 30, 2024
@shoeffner
Collaborator Author

Yesterday, @JR-1991 and I had a productive programming session around the OIDC login.
It seems that (at least in the example setup) Dataverse does not store a fixed redirect_uri in Keycloak, or at least Keycloak does not check it. This should allow us to "intercept" the login flow (not in a bad way, don't get me wrong) and retrieve the cookie response with the mechanism I outlined above and which we discussed in Zulip.

When doing a login via OIDC on Dataverse via the browser, the process is roughly like this:

sequenceDiagram
    participant B as Browser
    participant D as Dataverse
    participant K as Keycloak/IdP
    B->>D: requests OIDC information
    D->>B: returns auth URL and start code as <a href>
    B->>K: opens auth URL
    K->>B: returns login page
    B->>K: submits credentials
    K->>B: returns response including result code with redirect to redirect URI
    B->>D: follows redirect, passing result code
    D->>K: passes result code
    K->>D: returns identity information as JWT
    D->>B: responds with Cookie containing JWT

We managed to basically replace the browser for some of the steps, allowing us to a) find out the Dataverse Client ID, which tells Keycloak for which client we require credentials, and b) retrieve the Cookie Dataverse sets, so that we (and not the browser) get the Bearer token.

sequenceDiagram
    participant P as pyDataverse
    participant B as Browser
    participant D as Dataverse
    participant K as Keycloak/IdP
    P->>D: requests OIDC information
    D->>P: returns auth URL and start code as <a href>
    P->>P: rewrites redirect URL
    create participant L as local server
    P->>L: starts local server
    P->>B: opens auth URL in browser
    B->>K: opens auth URL
    K->>B: returns login page
    B->>K: submits credentials
    K->>B: returns response including result code with redirect to redirect URI
    B->>L: follows redirect, passing result code
    L->>P: passes result code
    destroy L
    P->>L: stops local server
    P->>D: passes result code
    D->>K: passes result code
    K->>D: returns identity information as JWT
    D->>P: responds with Cookie containing JWT
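
The pyDataverse-side steps of this flow (rewriting the redirect URL, starting the local server, receiving the result code) could be sketched roughly as follows. This is a minimal sketch using only the Python standard library, not the toy example from our session: the helper names `rewrite_redirect_uri`, `start_callback_server` and `wait_for_callback` are made up for illustration, and it assumes the auth URL carries a `redirect_uri` query parameter and that the IdP passes the result back as query parameters (typically `code` and `state` in an OIDC authorization code flow).

```python
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import parse_qs, urlencode, urlsplit, urlunsplit


def rewrite_redirect_uri(auth_url: str, local_redirect: str) -> str:
    """Return auth_url with its redirect_uri query parameter replaced."""
    parts = urlsplit(auth_url)
    query = parse_qs(parts.query, keep_blank_values=True)
    query["redirect_uri"] = [local_redirect]
    return urlunsplit(parts._replace(query=urlencode(query, doseq=True)))


class _CallbackHandler(BaseHTTPRequestHandler):
    """Stores the query parameters of the redirect on the server object."""

    def do_GET(self):
        params = parse_qs(urlsplit(self.path).query)
        # Keep the first value of each parameter, e.g. {"code": ..., "state": ...}
        self.server.result = {key: values[0] for key, values in params.items()}
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.end_headers()
        self.wfile.write(b"Login finished, you can close this tab.")

    def log_message(self, format, *args):
        pass  # do not print request logs to the terminal


def start_callback_server(port: int = 0) -> HTTPServer:
    """Bind a server on the loopback interface; port 0 picks a free port."""
    server = HTTPServer(("127.0.0.1", port), _CallbackHandler)
    server.result = None
    return server


def wait_for_callback(server: HTTPServer, timeout: float = 120.0):
    """Handle a single request (or time out), stop the server, return the params."""
    server.timeout = timeout
    server.handle_request()
    server.server_close()
    return server.result
```

A caller would start the server first, read the chosen port from `server.server_address[1]`, build the local redirect URL from it, open the rewritten auth URL with `webbrowser.open(...)`, and then block in `wait_for_callback(server)`. Passing the result code on to Dataverse is not covered here.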

We got a toy example for this flow working, but a few open questions remain:

  • The auth URL / client ID is embedded in a lazy-loaded partial HTML document, which we need to retrieve somehow. Maybe Dataverse can provide an API to retrieve the required information.
  • What if multiple OIDC providers are configured for Dataverse? How do we select the correct one? Maybe we can go a step further than having Dataverse provide the auth URL and have it cooperate instead: it does the login for us and passes down the cookie via a redirect URL. I am not sure that's a good idea, though. I guess if there's an API, one could simply let the user decide and fall back to the first provider or so.
  • Where to store the Cookie/JWT? We don't want to perform the full ping-pong for every request; that would defeat the purpose. Only in memory (only makes limited sense for interactive applications)? In the current working directory? In $XDG_CONFIG_HOME (~/.config by default)? In a credential manager? Let the user decide?
  • What about OIDC providers that store (and check) the callback_uri? Should we pick and document a specific port so that a fixed http://localhost:PORT can be registered there? Is there a convention or spec for those local server ports?
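
To make the storage question more concrete, here is a minimal sketch of the config directory option. Nothing here is decided: the directory name `pydataverse`, the file name `token.json`, and the helpers `token_path`, `save_token` and `load_token` are hypothetical, and a credential manager may well be the better choice.

```python
import json
import os
from pathlib import Path
from typing import Optional


def token_path(app_name: str = "pydataverse") -> Path:
    """Location below $XDG_CONFIG_HOME, falling back to ~/.config."""
    base = os.environ.get("XDG_CONFIG_HOME") or str(Path.home() / ".config")
    return Path(base) / app_name / "token.json"


def save_token(token: str, base_url: str, path: Optional[Path] = None) -> Path:
    """Write the token for one Dataverse installation to disk."""
    path = path or token_path()
    path.parent.mkdir(parents=True, exist_ok=True)
    # Mode 0o600 (owner read/write) is applied when the file is newly created.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w", encoding="utf-8") as fh:
        json.dump({"base_url": base_url, "token": token}, fh)
    return path


def load_token(base_url: str, path: Optional[Path] = None) -> Optional[str]:
    """Return the stored token if it belongs to base_url, else None."""
    path = path or token_path()
    try:
        data = json.loads(path.read_text(encoding="utf-8"))
    except (FileNotFoundError, json.JSONDecodeError):
        return None
    return data.get("token") if data.get("base_url") == base_url else None
```

This sketch stores a single token only and does not check expiry; a real implementation would probably key the file by installation and re-run the login flow once the JWT has expired.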

Anyways, this was a very productive session and we plan to continue with an actual implementation for pyDataverse next week.
