API OIDC authentication mechanism #10905

Draft
wants to merge 61 commits into
base: develop

Conversation

ErykKul
Collaborator

@ErykKul ErykKul commented Oct 3, 2024

What this PR does / why we need it:
It reimplements the OIDC authentication mechanism (including bearer token authentication).

Which issue(s) this PR closes:

Closes OIDC/bearer token related issues

Special notes for your reviewer:
This is conceptually very similar to oauth2-proxy, as in IQSS/dataverse-frontend#504. However, it turns out that the needed functionality is already supported in Payara: https://docs.payara.fish/enterprise/docs/Technical%20Documentation/Public%20API/OpenID%20Connect%20Support.html
This PR eliminates the need for a proxy. The implementation turned out to be very simple and elegant, in my humble opinion.

Suggestions on how to test this:

  • it is a good idea to test the entire flow; make sure the user you are testing with does not yet exist in the dev Dataverse instance but does exist in Keycloak. You can delete your Docker volumes with sudo rm -rf docker-dev-volumes, if that helps. Also, make sure that you have an entry for Keycloak in your hosts file (e.g., /etc/hosts on a Unix system), as described in the docker-compose file. It should look something like this:
127.0.1.1       keycloak.mydomain.com
  • you can run the dev env with mvn -Pct clean package docker:run
  • go to http://localhost:8080/ and click 'Log In', click "OpenID Connect" and then click "Log In with OpenID Connect"
  • log in with admin/admin

image

  • you will be redirected to create a new user (because this is the first login)

image

  • choose a user name and agree to the conditions, click on "Create Account" and now you are logged in:

image

image

image

  • copy the "session" field and try curl (replace the session-id with the copied value)
curl -v --cookie "JSESSIONID=session-id" http://localhost:8080/api/v1/users/:me | jq .

image

  • test the API/Bearer token flow by running the Python script, e.g.:
cd doc/sphinx-guides/_static/api/bearer-token-example
./run.sh
  • this will start a Python script that prompts you to log in in a new browser window:

image

image
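The two checks above (session cookie and bearer token) can also be sketched in a few lines of Python. This is only an illustration, not the script shipped in bearer-token-example: the function names are made up here, and it assumes the dev instance on localhost:8080 accepts the access token in the standard `Authorization: Bearer` header.

```python
from urllib.request import Request

# Assumption: the dev instance from the steps above listens on localhost:8080.
USERS_ME_URL = "http://localhost:8080/api/v1/users/:me"


def session_request(session_id: str, url: str = USERS_ME_URL) -> Request:
    """Same idea as the curl call above: authenticate with the JSESSIONID cookie."""
    return Request(url, headers={"Cookie": f"JSESSIONID={session_id}"})


def bearer_request(access_token: str, url: str = USERS_ME_URL) -> Request:
    """Authenticate with an OIDC access token in the Authorization header."""
    return Request(url, headers={"Authorization": f"Bearer {access_token}"})
```

To actually send one of these, pass it to `urllib.request.urlopen(...)` while the dev environment is running and read the JSON response.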

Does this PR introduce a user interface change? If mockups are available, please link/include them here:
No

Is there a release notes update needed for this change?:
Yes

Additional documentation:
https://docs.payara.fish/enterprise/docs/Technical%20Documentation/Public%20API/OpenID%20Connect%20Support.html

Doc preview: https://dataverse-guide--10905.org.readthedocs.build/en/10905/installation/oidc.html

@ErykKul ErykKul requested a review from pdurbin October 3, 2024 15:33
@pdurbin pdurbin changed the title api oidc authentication mechanism API OIDC authentication mechanism Oct 3, 2024
@coveralls

coveralls commented Oct 3, 2024

Coverage Status

coverage: 20.705% (-0.2%) from 20.869%
when pulling 0ec91dd on authn-arch-api-oidc
into a0cb73d on develop.

@pdurbin pdurbin self-assigned this Oct 3, 2024
@pdurbin pdurbin added the Size: 10 A percentage of a sprint. 7 hours. label Oct 3, 2024

@cmbz cmbz added the GREI Re-arch Issues related to the GREI Dataverse rearchitecture label Oct 3, 2024

docker-compose-dev.yml Outdated Show resolved Hide resolved
docker-compose-dev.yml Outdated Show resolved Hide resolved
Thanks!

Co-authored-by: Philip Durbin <philip_durbin@harvard.edu>
@ErykKul
Collaborator Author

ErykKul commented Oct 3, 2024

I forgot to mention this: the login is a one-time thing. The normal flow is to go to http://localhost:8080/api/v1/callback/session, and only if you are not authenticated (it now returns a better error message) do you go to http://localhost:8080/oidc/login.
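A minimal sketch of that client-side decision, for illustration only. It assumes the session endpoint answers an unauthenticated request with HTTP 401 or 403; the exact status code is not stated in this thread, and `next_url` is a made-up helper name.

```python
from typing import Optional

SESSION_URL = "http://localhost:8080/api/v1/callback/session"
LOGIN_URL = "http://localhost:8080/oidc/login"


def next_url(session_status: int) -> Optional[str]:
    """Decide where to go after calling SESSION_URL.

    Returns LOGIN_URL when the session call was rejected as unauthenticated
    (assumed here to be HTTP 401 or 403), otherwise None: no login needed.
    """
    if session_status in (401, 403):
        return LOGIN_URL
    return None
```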

@pdurbin thanks for the docker run command, I will try testing with that instead of my own dev environment.

Member

@pdurbin pdurbin left a comment

Some initial feedback. I haven't done any real testing yet.

docker-compose-dev.yml Outdated Show resolved Hide resolved
docker-compose-dev.yml Outdated Show resolved Hide resolved
docker-compose-dev.yml Show resolved Hide resolved
docker-compose-dev.yml Outdated Show resolved Hide resolved
docker-compose-dev.yml Outdated Show resolved Hide resolved
docker-compose-dev.yml Show resolved Hide resolved

@ErykKul
Collaborator Author

ErykKul commented Oct 7, 2024

I am a little bit lost in the comments right now. I tried to filter out the important things and I went through some of the comments quickly (sorry for that...). I think the most important thing is to decide whether we want to use this code. If we decide not to use it because of the use of cookies, or because of the security context/identity store/authentication servlet approach, then it does not make much sense to focus on the small details (at least not right now).

I have tried to explain how this code works here: #10905 (comment)

I think these comments are also relevant to the future discussions:

    useSession = true // If enabled state & nonce value stored in session otherwise in cookies.

#10905 (comment)

I might have missed some, but I think these were the important comments, and they illustrate some of the limitations. I still think it might be worth going for it.

There was also the question of using the standard Jakarta annotation vs. the Payara one. That might not be possible. I based the bearer token implementation on the identity store provided by the Payara code. I am not sure if the same is possible with Jakarta-only code. I am also not sure if multi-tenancy would work with that. If it is possible, we might want to do it, but maybe it can wait? I would prefer not to have too many features at once, and not everything in one PR.

@ErykKul
Collaborator Author

ErykKul commented Oct 8, 2024

I have tried this PKCE client implementation (added to doc/sphinx-guides/_static/frontend/PKCE-example/PKCE-example.html):

<!doctype html>
<html>

<body>
    <script src="http://unpkg.com/keycloak-js@25.0.6/dist/keycloak-authz.js"></script>
    <script src="http://unpkg.com/keycloak-js@25.0.6/dist/keycloak.js"></script>

    <script>
        const kc = new Keycloak({
            url: 'http://keycloak.mydomain.com:8090',
            realm: 'test',
            clientId: 'test'
        });
        kc.init({
            pkceMethod: 'S256',
            redirectUri: 'http://localhost:8080/api/v1/users/:me'
        });
        kc.login();
    </script>
</body>

</html>

At first it looked like it worked perfectly, but then I noticed I was redirected to a different page than intended... Then I checked the network log, and it looks a lot like what oauth2-proxy would do:

image

Then I tried http://localhost:8080/api/v1/users/:me in the browser (while not logged in), and a very similar thing happened; this really behaves like oauth2-proxy. I am not sure if I like it. At least it did not break PKCE clients, so to speak.

Anyway, I am putting this on hold until we have a better understanding of what we want to do.
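For reference, this is roughly what `pkceMethod: 'S256'` asks a PKCE client to compute, following RFC 7636. It is a standalone sketch and not code from keycloak-js or from this PR; the helper names are made up.

```python
import base64
import hashlib
import secrets


def make_code_verifier(n_bytes: int = 32) -> str:
    """Random, URL-safe code verifier (32 random bytes give 43 characters)."""
    raw = secrets.token_bytes(n_bytes)
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def code_challenge_s256(verifier: str) -> str:
    """S256 challenge: base64url(SHA-256(verifier)) without '=' padding."""
    digest = hashlib.sha256(verifier.encode("ascii")).digest()
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")
```

The client sends the challenge with the authorization request and the verifier with the token request, so the provider can check that both came from the same client.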

#!/bin/bash

python3 -m venv run_env
source run_env/bin/activate

📝 [shellcheck] reported by reviewdog 🐶
Not following: run_env/bin/activate: openBinaryFile: does not exist (No such file or directory) SC1091

@pdurbin pdurbin added the Type: Feature a feature request label Oct 9, 2024

📦 Pushed preview images as

ghcr.io/gdcc/dataverse:authn-arch-api-oidc
ghcr.io/gdcc/configbaker:authn-arch-api-oidc

🚢 See on GHCR. Use by referencing with full name as printed above, mind the registry name.

@ErykKul
Collaborator Author

ErykKul commented Oct 13, 2024

I have simplified the implementation a lot; the IdentityStore for bearer tokens is no longer needed. I still use the fish.payara.security.openid.api.AccessTokenCallerPrincipal class, so the fish dependency could not be removed yet. Note, however, that the implementation has become very short, so it should not be a problem to replace it with another implementation in the future, if needed.

I am still not sure about the auth filters. I like how it is now: it simply uses the SecurityContext added to the abstract API bean as context. Also, are we going to need all of those filters when we migrate to the new SPA? Maybe they will become deprecated?

The handling of the user IDs has also improved. The issuer and subject IDs used to create the UserIdentifier are now configurable. This would solve our problem, where we would like to differentiate between different providers configured in one broker (one Keycloak with only one URL), and where you would only get edit rights when you are authenticated with our internal two-factor authentication. You can also change where the subject name is retrieved from. We could, e.g., use "preferred_email" as configured in our Keycloak instead of the sub field, which would make mapping the users easier (we preprocess the users so that the internal KU Leuven users are known to Dataverse before they log in for the first time).
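To make the idea concrete, here is a language-neutral sketch in Python. It is not the Java code in this PR; `user_identifier`, `subject_claim` and `issuer_override` are hypothetical names, and the real configuration options may be named and shaped differently.

```python
from typing import Mapping, Optional, Tuple


def user_identifier(
    claims: Mapping[str, str],
    subject_claim: str = "sub",
    issuer_override: Optional[str] = None,
) -> Tuple[str, str]:
    """Build an (issuer, subject) pair from token claims.

    subject_claim selects which claim supplies the subject, e.g.
    "preferred_email" instead of "sub". issuer_override replaces the "iss"
    claim, so logins through one broker can map to distinct providers.
    """
    issuer = issuer_override or claims["iss"]
    subject = claims.get(subject_claim)
    if not subject:
        raise ValueError(f"token has no usable '{subject_claim}' claim")
    return issuer, subject
```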

@ErykKul
Collaborator Author

ErykKul commented Oct 15, 2024

Oh no! I looked at how the communication with Keycloak is done and it does not look good! It is full of JWT tokens in cookies and even a session cookie! (This might be a security risk and needs more investigation; I hope these can be turned off):

image
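When investigating cookies like these, it can help to look at what a JWT-shaped value actually carries. The helper below is a generic debugging sketch, not part of this PR: it decodes the payload segment without verifying the signature, so its output must never be trusted for authentication decisions.

```python
import base64
import json


def jwt_claims_unverified(token: str) -> dict:
    """Decode the payload of a compact JWT WITHOUT verifying its signature."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("expected three dot-separated segments")
    payload = parts[1]
    # base64url segments in JWTs omit padding; restore it before decoding.
    payload += "=" * (-len(payload) % 4)
    return json.loads(base64.urlsafe_b64decode(payload))
```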

@ErykKul
Collaborator Author

ErykKul commented Oct 16, 2024

If we do not want the Payara implementation, there are some alternatives. For example, there is a Spring Boot version that also uses the SecurityContext; it looks like it would not be difficult to switch to that. There are also plenty of other possibilities, so we really do not have to implement everything ourselves (all implementations look very similar, as I mentioned in the tech hours, and most likely they use sessions and cookies, like Keycloak does):

https://docs.spring.io/spring-security/reference/servlet/oauth2/index.html

Labels
GREI Re-arch Issues related to the GREI Dataverse rearchitecture Size: 10 A percentage of a sprint. 7 hours. Type: Feature a feature request
Projects
Status: On Hold ⌛
6 participants