Getting logged out after my laptop was in sleep mode #307

Open
1 task done
flippingbits opened this issue Sep 26, 2024 · 5 comments
Assignees
Labels
bug Something isn't working

Comments

@flippingbits
Contributor

flippingbits commented Sep 26, 2024

Code of Conduct

  • I agree to follow this project's Code of Conduct

On what operating system are you seeing the problem?

macOS (Apple/arm64)

VS Code version

Version: 1.93.1 (Universal)
Commit: 38c31bc77e0dd6ae88a4e9cc93428cc27a56ba40
Date: 2024-09-11T17:20:05.685Z (2 wks ago)
Electron: 30.4.0
ElectronBuildId: 10073054
Chromium: 124.0.6367.243
Node.js: 20.15.1
V8: 12.4.254.20-electron.0
OS: Darwin arm64 23.6.0

Version of Confluent extension

v0.16.3

To Reproduce

  1. Open Confluent for VS Code
  2. Authenticate with Confluent Cloud
  3. Put your machine/laptop in sleep or hibernation mode for more than 5 minutes (for instance, by closing the lid)
  4. Try using Confluent for VS Code again
  5. After a few seconds, you'll (1) lose the CCloud connection and (2) maybe see the error "Error authenticating with Confluent Cloud. Please try again." and get prompted to re-authenticate with Confluent Cloud

Current vs. Expected behavior

Current behavior:
After my laptop has been in sleep or hibernation mode for more than 5 minutes (sometimes even less; I explain why in "Additional context"), I get logged out of Confluent Cloud and must restart the authentication flow.

Expected behavior:
My laptop should be able to sleep for up to 4 hours without causing a log-out because the readme states "Confluent Cloud connections require reauthenticating after 4 hours, and you will be prompted to reauthenticate.".

Relevant log output

2024-09-26 11:23:00,119 c7mw2pkhk6 /Users/stefansprenger/.vscode/extensions/confluentinc.vscode-confluent-0.16.3-darwin-arm64/ide-sidecar-0.34.0-runner[47772] INFO  [io.con.ide.res.cac.ClusterCache] (executor-thread-49) Connected ConnectionSpec[id=vscode-confluent-cloud-connection, name=Confluent Cloud, type=CCLOUD, ccloudConfig=null]
2024-09-26 11:23:10,121 c7mw2pkhk6 /Users/stefansprenger/.vscode/extensions/confluentinc.vscode-confluent-0.16.3-darwin-arm64/ide-sidecar-0.34.0-runner[47772] INFO  [io.con.ide.res.cac.ClusterCache] (executor-thread-49) Connected ConnectionSpec[id=vscode-confluent-cloud-connection, name=Confluent Cloud, type=CCLOUD, ccloudConfig=null]
2024-09-26 11:23:20,116 c7mw2pkhk6 /Users/stefansprenger/.vscode/extensions/confluentinc.vscode-confluent-0.16.3-darwin-arm64/ide-sidecar-0.34.0-runner[47772] INFO  [io.con.ide.res.cac.ClusterCache] (executor-thread-49) Connected ConnectionSpec[id=vscode-confluent-cloud-connection, name=Confluent Cloud, type=CCLOUD, ccloudConfig=null]
2024-09-26 11:32:47,848 c7mw2pkhk6 /Users/stefansprenger/.vscode/extensions/confluentinc.vscode-confluent-0.16.3-darwin-arm64/ide-sidecar-0.34.0-runner[47772] ERROR [io.con.ide.res.aut.CCloudOAuthContext] (vert.x-eventloop-thread-4) Error in CCloud response while verifying the auth status of this connection: {"code":401,"message":"token is expired","error_code":"token_expired"}
2024-09-26 11:32:47,849 c7mw2pkhk6 /Users/stefansprenger/.vscode/extensions/confluentinc.vscode-confluent-0.16.3-darwin-arm64/ide-sidecar-0.34.0-runner[47772] INFO  [io.con.ide.res.cac.ClusterCache] (executor-thread-50) Disconnected ConnectionSpec[id=vscode-confluent-cloud-connection, name=Confluent Cloud, type=CCLOUD, ccloudConfig=null]
2024-09-26 11:32:48,056 c7mw2pkhk6 /Users/stefansprenger/.vscode/extensions/confluentinc.vscode-confluent-0.16.3-darwin-arm64/ide-sidecar-0.34.0-runner[47772] ERROR [io.con.ide.res.aut.CCloudOAuthContext] (vert.x-eventloop-thread-4) Error in CCloud response while verifying the auth status of this connection: {"code":401,"message":"token is expired","error_code":"token_expired"}
2024-09-26 11:32:48,057 c7mw2pkhk6 /Users/stefansprenger/.vscode/extensions/confluentinc.vscode-confluent-0.16.3-darwin-arm64/ide-sidecar-0.34.0-runner[47772] INFO  [io.con.ide.res.cac.ClusterCache] (executor-thread-50) Disconnected ConnectionSpec[id=vscode-confluent-cloud-connection, name=Confluent Cloud, type=CCLOUD, ccloudConfig=null]
2024-09-26 11:32:48,061 c7mw2pkhk6 /Users/stefansprenger/.vscode/extensions/confluentinc.vscode-confluent-0.16.3-darwin-arm64/ide-sidecar-0.34.0-runner[47772] INFO  [io.con.ide.res.cac.ClusterCache] (executor-thread-49) Deleted ConnectionSpec[id=vscode-confluent-cloud-connection, name=Confluent Cloud, type=CCLOUD, ccloudConfig=null]
2024-09-26 11:32:48,091 c7mw2pkhk6 /Users/stefansprenger/.vscode/extensions/confluentinc.vscode-confluent-0.16.3-darwin-arm64/ide-sidecar-0.34.0-runner[47772] INFO  [io.con.ide.res.res.ConfluentLocalQueryResource] (vert.x-eventloop-thread-4) Get local kafka clusters for connection vscode-local-connection
2024-09-26 11:32:48,567 c7mw2pkhk6 /Users/stefansprenger/.vscode/extensions/confluentinc.vscode-confluent-0.16.3-darwin-arm64/ide-sidecar-0.34.0-runner[47772] INFO  [io.con.ide.res.aut.RefreshCCloudTokensBean] (vert.x-eventloop-thread-1) Refreshed tokens of connection with ID=vscode-confluent-cloud-connection.

My laptop was in sleep mode between 11:23:20 and 11:32:47.

Which area(s) are affected? (Select all that apply)

Connections, Confluent Cloud

Additional context

Some background:

  • Every five minutes, the ide-sidecar refreshes the CCloud access tokens shortly before they expire. Technically, refreshing tokens requires performing multiple requests against the CCloud API.
  • Every ten seconds, the extension checks the health of the access tokens. If the access tokens are unhealthy, it resets the CCloud connection. Technically, checking the health of the access tokens requires performing a single request against the CCloud API.

When laptops are in sleep or hibernation mode, they pause the running processes, so neither token refreshes nor health checks are performed:

  • If the laptop was in sleep mode for more than five minutes, the access tokens are guaranteed to have expired because no token refresh was performed.
  • If the laptop was in sleep mode for less than five minutes, the access tokens may have expired depending on when the last token refresh was performed.
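
The two cases above come down to comparing timestamps. The following TypeScript sketch is only an illustration: `tokenExpiredOnWake` and `tokenLifetimeMs` are hypothetical names, and the five-minute lifetime is an assumption, since this issue does not state the exact lifetime of the access tokens.

```typescript
// Hypothetical helper, not part of the extension or the sidecar.
// Decides whether the current access token has expired by the time the
// machine wakes up. All timestamps are in milliseconds.
function tokenExpiredOnWake(
  lastRefreshAtMs: number, // when the sidecar last refreshed the tokens
  wakeAtMs: number, // when the machine resumed from sleep
  tokenLifetimeMs: number, // assumed lifetime of an access token
): boolean {
  // No refresh runs while processes are paused, so the token issued at
  // lastRefreshAtMs is still the current one when the machine wakes up.
  return wakeAtMs - lastRefreshAtMs >= tokenLifetimeMs;
}

const MINUTE_MS = 60_000;
// Assumption for illustration: tokens live for about five minutes.
const assumedLifetimeMs = 5 * MINUTE_MS;

// Refresh, then 9.5 minutes of sleep: the token is expired on wake.
const longSleep = tokenExpiredOnWake(0, 9.5 * MINUTE_MS, assumedLifetimeMs);
// Refresh, 1 minute awake, 2 minutes of sleep: the token is still valid.
const shortSleepFreshToken = tokenExpiredOnWake(0, 3 * MINUTE_MS, assumedLifetimeMs);
// Refresh, 4 minutes awake, 2 minutes of sleep: the token is expired.
const shortSleepOldToken = tokenExpiredOnWake(0, 6 * MINUTE_MS, assumedLifetimeMs);
```

The last two calls show why a sleep shorter than five minutes can still cause a log-out: what matters is the time since the last refresh, not the length of the sleep.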

After waking up, the extension will immediately check the health of the access tokens. When detecting that the access tokens are unhealthy, it will reset the CCloud connection and maybe prompt the user to re-authenticate. Although the ide-sidecar will immediately refresh tokens after the laptop wakes up, the extension will never see the refreshed tokens because the extension-side CCloud connection has already been reset.
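
The ordering problem described above can be modeled as a replay of events. This is a simplified sketch, not the extension's actual code: the event shapes and `replayAfterWake` are hypothetical, and the model assumes that the first failed health check resets the connection for good.

```typescript
// Hypothetical event model for what the extension observes after wake-up.
type WakeEvent =
  | { kind: "healthCheck"; tokenHealthy: boolean }
  | { kind: "tokensRefreshed" };

interface ReplayResult {
  connectionReset: boolean; // the extension reset the CCloud connection
  refreshIgnored: boolean; // a refresh arrived after the reset
}

// Replays events in order. The first failed health check resets the
// connection; any token refresh after that point is never seen.
function replayAfterWake(events: WakeEvent[]): ReplayResult {
  let connectionReset = false;
  let refreshIgnored = false;
  for (const event of events) {
    if (event.kind === "healthCheck" && !event.tokenHealthy) {
      connectionReset = true;
    } else if (event.kind === "tokensRefreshed" && connectionReset) {
      refreshIgnored = true;
    }
  }
  return { connectionReset, refreshIgnored };
}

// Order seen in the log output: the health check fails at 11:32:47,848 and
// the token refresh only completes at 11:32:48,567.
const observedOrder = replayAfterWake([
  { kind: "healthCheck", tokenHealthy: false },
  { kind: "tokensRefreshed" },
]);

// If the refresh had completed first, the health check would have passed.
const refreshFirst = replayAfterWake([
  { kind: "tokensRefreshed" },
  { kind: "healthCheck", tokenHealthy: true },
]);
```

In the observed order the result is a reset connection plus an ignored refresh, which matches the log-out; in the second order nothing is reset.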

This might also explain the behavior reported in #281.

I created this issue in the vscode repo for visibility reasons but feel that it should be addressed in the ide-sidecar repo.

flippingbits added the bug label Sep 26, 2024
flippingbits self-assigned this Sep 26, 2024
flippingbits changed the title from "Getting logged out after my laptop was in sleep or hibernation mode" to "Getting logged out after my laptop was in sleep mode" Sep 26, 2024
@flippingbits
Contributor Author

It seems like we need to improve on handling transient errors in the authentication flow. The following sections describe the status quo of the authentication flow and propose an improvement.

Status quo

Sidecar

At the moment, the sidecar returns three different authentication states for a Confluent Cloud connection:

  • If the connection does not hold tokens, it reports the state NO_TOKEN.
  • If the connection holds tokens and can use them to sign requests against the Confluent Cloud API, it reports the state VALID_TOKEN.
  • If the connection holds tokens but cannot perform requests against the Confluent Cloud API successfully, it reports the state INVALID_TOKEN.
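
As a rough illustration, the three states listed above map onto two yes/no questions. The type and function names below are hypothetical; only the state names come from this comment.

```typescript
// The three authentication states the sidecar currently reports.
type CCloudAuthState = "NO_TOKEN" | "VALID_TOKEN" | "INVALID_TOKEN";

// Hypothetical helper mirroring the list above.
function currentAuthState(
  holdsTokens: boolean, // does the connection hold tokens at all?
  apiRequestSucceeds: boolean, // can it use them against the CCloud API?
): CCloudAuthState {
  if (!holdsTokens) {
    return "NO_TOKEN";
  }
  return apiRequestSucceeds ? "VALID_TOKEN" : "INVALID_TOKEN";
}
```

Note that `INVALID_TOKEN` covers both an expired token and a plain network failure, which is the ambiguity the rest of this comment is about.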

The sidecar refreshes tokens every 5 minutes. It does not distinguish between transient (e.g., network loss) and non-transient (e.g., expired refresh token) errors. Instead, it aborts refreshing tokens after 50 failed attempts and resets the connection so that its auth status equals NO_TOKEN again.

Extension

The extension prompts the user to authenticate with Confluent Cloud if the state of the connection equals NO_TOKEN. The extension checks the authentication status of the connection every 10 seconds. If the extension detects that the auth status of the connection equals INVALID_TOKEN, it deletes the connection and re-creates it so that its state equals NO_TOKEN, prompting the user to sign in again.

Suggested change

Sidecar

The sidecar introduces a fourth authentication state, TOKEN_REFRESH_FAILED. If more than 50 token refresh attempts have failed or if the sidecar has experienced a non-transient error while refreshing tokens (e.g., if the Confluent Cloud API has returned that the refresh token is invalid), the Confluent Cloud connection ends up in the state TOKEN_REFRESH_FAILED, from which it cannot automatically recover. The sidecar does not attempt to refresh tokens of connections that are in the state TOKEN_REFRESH_FAILED.
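
A minimal sketch of the proposed sidecar behavior, written in TypeScript for illustration only (the sidecar itself is not written in TypeScript). `RefreshTracker`, `RefreshOutcome`, and `applyRefreshOutcome` are hypothetical names. Two details are assumptions: the connection gives up once the 50th attempt has failed, and a successful refresh puts it back into VALID_TOKEN.

```typescript
type ProposedAuthState =
  | "NO_TOKEN"
  | "VALID_TOKEN"
  | "INVALID_TOKEN"
  | "TOKEN_REFRESH_FAILED";

// Hypothetical classification of a single token refresh attempt.
type RefreshOutcome = "success" | "transientError" | "nonTransientError";

const MAX_REFRESH_ATTEMPTS = 50;

interface RefreshTracker {
  state: ProposedAuthState;
  failedAttempts: number;
}

function applyRefreshOutcome(
  tracker: RefreshTracker,
  outcome: RefreshOutcome,
): RefreshTracker {
  // TOKEN_REFRESH_FAILED is terminal: no further refreshes are attempted.
  if (tracker.state === "TOKEN_REFRESH_FAILED") {
    return tracker;
  }
  if (outcome === "success") {
    // Assumption: a successful refresh yields usable tokens again.
    return { state: "VALID_TOKEN", failedAttempts: 0 };
  }
  const failedAttempts = tracker.failedAttempts + 1;
  if (outcome === "nonTransientError") {
    // E.g. the Confluent Cloud API reports an invalid refresh token.
    return { state: "TOKEN_REFRESH_FAILED", failedAttempts };
  }
  // Transient error: keep the current state until the budget is used up.
  const exhausted = failedAttempts >= MAX_REFRESH_ATTEMPTS;
  return {
    state: exhausted ? "TOKEN_REFRESH_FAILED" : tracker.state,
    failedAttempts,
  };
}

// Usage: 50 transient failures in a row exhaust the retry budget.
let tracker: RefreshTracker = { state: "INVALID_TOKEN", failedAttempts: 0 };
for (let attempt = 0; attempt < MAX_REFRESH_ATTEMPTS; attempt++) {
  tracker = applyRefreshOutcome(tracker, "transientError");
}
```

Keeping the state unchanged on transient errors leaves it to the regular health check to decide between VALID_TOKEN and INVALID_TOKEN while retries are still in progress.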

Extension

If the state of the connection equals INVALID_TOKEN, the extension does not delete and re-create the connection but shows a notification like "Experiencing issues interacting with the Confluent Cloud API. Trying to reconnect..." or similar. The notification shows a button that allows the user to start the authentication flow so that they can try to re-authenticate early on. If the state of the connection equals TOKEN_REFRESH_FAILED, the extension deletes the connection and re-creates it so that its state is reset to NO_TOKEN.
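
The proposed extension-side reaction could look roughly like the mapping below. The action names and `proposedExtensionAction` are hypothetical, and the handling of NO_TOKEN and VALID_TOKEN is assumed to stay as it is today.

```typescript
type ConnectionAuthState =
  | "NO_TOKEN"
  | "VALID_TOKEN"
  | "INVALID_TOKEN"
  | "TOKEN_REFRESH_FAILED";

// Hypothetical names for what the extension does on each health check.
type ExtensionAction =
  | "promptSignIn"
  | "none"
  | "showReconnectingNotification"
  | "resetConnection";

function proposedExtensionAction(state: ConnectionAuthState): ExtensionAction {
  switch (state) {
    case "NO_TOKEN":
      // Unchanged from the status quo: ask the user to authenticate.
      return "promptSignIn";
    case "VALID_TOKEN":
      return "none";
    case "INVALID_TOKEN":
      // New: keep the connection and tell the user we are reconnecting,
      // with a button to start the authentication flow early.
      return "showReconnectingNotification";
    case "TOKEN_REFRESH_FAILED":
      // New: only now delete and re-create the connection (back to NO_TOKEN).
      return "resetConnection";
  }
}
```

Compared to the status quo, the only state whose handling changes is INVALID_TOKEN, which no longer triggers a reset on its own.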

@noeldevelops
Member

Having that extra state to help us create a better UX sounds great @flippingbits, thanks for working on this!

@MSeal
Contributor

MSeal commented Oct 2, 2024

How long will it take for 50 failed token refresh attempts to fully fail, and what's the user experience during that time if it takes a full 50 to reach retry failure?

@flippingbits
Contributor Author

flippingbits commented Oct 2, 2024

How long will it take for 50 failed token refresh attempts to fully fail

Good question! If the control plane token has expired, we attempt a token refresh every 5 seconds, so it will take roughly 4 minutes until the refresh fully fails. This allows us to tolerate transient network issues, such as a dropped Wi-Fi connection. We can adjust the number of attempts if needed.
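
For reference, the estimate works out as follows. This is a back-of-the-envelope sketch with a hypothetical helper name; it assumes every attempt waits the full interval and ignores request latency.

```typescript
// Hypothetical helper: upper bound on the time until the sidecar gives up.
function timeUntilRefreshGivesUp(
  attempts: number,
  retryIntervalSeconds: number,
): { totalSeconds: number; minutes: number; seconds: number } {
  const totalSeconds = attempts * retryIntervalSeconds;
  return {
    totalSeconds,
    minutes: Math.floor(totalSeconds / 60),
    seconds: totalSeconds % 60,
  };
}

// 50 attempts, one every 5 seconds: 250 seconds, i.e. 4 minutes 10 seconds.
const giveUpAfter = timeUntilRefreshGivesUp(50, 5);
```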

what's the user experience during that time if it takes a full 50 to reach retry failure?

Users will not be able to interact with the Confluent Cloud resources in the extension and will see the warning "Experiencing issues interacting with the Confluent Cloud API. Trying to reconnect...".

@MSeal
Contributor

MSeal commented Oct 2, 2024

Sounds reasonable then. We can adjust this down the road if users find friction there.

4 participants