
Perform read lock on LSP requests #1640

Open · wants to merge 2 commits into main
Conversation

@msujew (Member) commented Aug 21, 2024

I've noticed that some long-running requests (i.e. completion in large files, sometimes semantic highlighting) can break pretty heavily and lead to unresolved references if the specified document gets a new build triggered. This change adds a new type of read to the WorkspaceLock service that allows a read request to be served instantly - all other requests are queued up until this read request has finished (or has been aborted).
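
For illustration, a minimal usage sketch of what this enables (the service access path `services.shared.workspace.WorkspaceLock` and the completion call are illustrative, not taken from this diff):

const lock = services.shared.workspace.WorkspaceLock;

// A prioritized read is served right away, even while a rebuild is queued;
// all other reads and writes wait until it has resolved or has been aborted.
const completions = await lock.read(
    () => completionProvider.getCompletion(document, params),
    true // the new priority flag introduced by this PR
);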

msujew added the LSP (Language Server Protocol integration) label on Aug 21, 2024
msujew added this to the v3.2.0 milestone on Aug 22, 2024
@dhuebner (Contributor) commented:

As I understood from the discussion in the weekly dev meeting, the problem is the following:
We have a running read operation (e.g. semantic highlighting) and a new write operation request (e.g. document changed) arrives. The write operation resets the document while the read operation still operates on the old document reference. This seems to break the document, and the running read operation may fail as well?
Introducing a priority read doesn't feel right to me, as some read operations then take priority over write operations. It also makes the locking logic more complicated and difficult to understand.

Maybe the solution in this PR is the only one, but I still want to share some other ideas (not sure they work well with Langium):

  • Would creating a new Document instead of resetting the existing one work? That way, the running read operation could finish its work on the old, outdated document and nothing should break.
  • One could also actively trigger cancellation of all running read operations when a write arrives and wait until they are cancelled; only then apply the write operation (see the sketch below).
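
A rough sketch of the second idea (purely illustrative; the class and its bookkeeping are invented for this example and are not Langium API):

import { CancellationToken, CancellationTokenSource } from 'vscode-languageserver';

class CancellingReadLock {
    // Each in-flight read is tracked together with its cancellation source.
    private readonly runningReads = new Map<Promise<unknown>, CancellationTokenSource>();

    read<T>(action: (token: CancellationToken) => Promise<T>): Promise<T> {
        const source = new CancellationTokenSource();
        const promise = action(source.token);
        this.runningReads.set(promise, source);
        return promise.finally(() => this.runningReads.delete(promise));
    }

    async write<T>(action: () => Promise<T>): Promise<T> {
        // Signal cancellation to every in-flight read...
        for (const source of this.runningReads.values()) {
            source.cancel();
        }
        // ...wait until they have actually unwound (successfully or not)...
        await Promise.allSettled([...this.runningReads.keys()]);
        // ...and only then apply the write operation.
        return action();
    }
}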

@Lotes (Contributor) left a comment

Some small things I noticed

     */
-    read<T>(action: () => MaybePromise<T>): Promise<T>;
+    read<T>(action: () => MaybePromise<T>, priority?: boolean): Promise<T>;
A contributor commented:

I am associating priority with an ordering/number. I understand the usage here as hasPriority. Maybe there is a better name, like: isUrgent, important, handleFirst...

A contributor commented:

IMO a Boolean argument is problematic because seeing a method call like read(..., true) hides the semantics behind it. I'd use a string value like 'normal' | 'prioritized' instead.
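
For example, the declaration could look roughly like this (the ReadPriority name and the call site are only a suggestion, mirroring the interface above):

export type ReadPriority = 'normal' | 'prioritized';

export interface WorkspaceLock {
    read<T>(action: () => MaybePromise<T>, priority?: ReadPriority): Promise<T>;
}

// A call site then documents itself:
// lock.read(() => provideSemanticTokens(document, params), 'prioritized');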

@@ -59,8 +64,25 @@ export class DefaultWorkspaceLock implements WorkspaceLock {
         return this.enqueue(this.writeQueue, action, tokenSource.token);
     }

-    read<T>(action: () => MaybePromise<T>): Promise<T> {
-        return this.enqueue(this.readQueue, action);
+    read<T>(action: () => MaybePromise<T>, priority?: boolean): Promise<T> {
A contributor commented:

Just thinking out loud: is this the first time we have a "priority" queue?
If not, you could refactor it into a common component/function.

@msujew (Member, Author) replied:

The workspace lock is a mutex, not a queue. It features some very specific semantics about read/read with prio/write actions that are likely not relevant for a common component.

@spoenemann (Contributor) left a comment

I share @dhuebner's concerns. Is it right to push through an LSP request even though the text has changed in the meantime?

Reading the specification section Implementation Considerations, it looks like the best solution would be to return an error code ContentModified when a change has happened.

-    return await serviceCall(language, document, params, cancelToken);
+    const result = await lock.read(async () => {
+        return await serviceCall(language, document, params, cancelToken);
+    }, true); // Give this priority, since we already waited until the target state
A contributor commented:

Almost all LSP requests are now treated with priority? That seems too much to me: especially implicitly sent requests like DocumentHighlights should not block the build process. And what about Completion requests that are sent while typing?

@msujew (Member, Author) replied:

All of these are getting cancelled by the language client - they shouldn't be blocking the workspace in any way for longer than our timeout (5ms).

A contributor commented:

Where does that 5 ms timeout come from?

@msujew (Member, Author) commented Aug 29, 2024

@spoenemann The link you've posted literally says this about changed content:

servers should therefore not decide by themselves to cancel requests simply due to that fact that a state change notification is detected in the queue. As said the result could still be useful for the client.

We do exactly this: we prevent the response from failing by blocking the workspace lock and returning the result. In the meantime, the client is free to cancel the pending request, which we fully respect by unblocking the workspace lock.

@spoenemann (Contributor) commented Aug 29, 2024

But that section also says:

if a server detects an internal state change (for example, a project context changed) that invalidates the result of a request in execution the server can error these requests with ContentModified

This PR is about long running requests; when the state of a document is changed so that we can no longer finish the request, we should return an error. Note that this is not the same as canceling the request (which should be done only by the client).

@spoenemann (Contributor) commented:

Could it help to use a utility function like this in LSP request processing?

import { LangiumDocument, interruptAndCheck } from 'langium';
import { CancellationToken, LSPErrorCodes, ResponseError } from 'vscode-languageserver';

export async function interruptAndCheckDocument(document: LangiumDocument, token: CancellationToken): Promise<void> {
    const previousState = document.state;
    await interruptAndCheck(token);
    if (document.state < previousState) {
        throw new ResponseError(LSPErrorCodes.ContentModified, 'Document content has been modified.');
    }
}

@msujew (Member, Author) commented Aug 29, 2024

This PR is about long running requests; when the state of a document is changed so that we can no longer finish the request

The issue is: we cannot really know. For example, for the document highlight:

  1. User edits text on line 2 and the language client requests a highlighting delta
  2. Server is computing delta request for line 2
  3. Meanwhile, user edits text on line 10 and the language client requests a highlighting delta for that line
  4. ???

Aborting the initial operation for line 2 when receiving a document update is not the correct move here - the data for that highlighting is still valid and aborting the request might leave the user with incorrectly highlighted text.

This is the case for most LSP requests. The internal state changes the documentation is talking about (like project context switches) are much more fundamental than "normal" document changes; they make the result literally useless, whereas our results are probably still useful (unless the language client deems the change too large, in which case we get a cancellation anyway).

msujew removed this from the v3.2.0 milestone on Aug 29, 2024
@spoenemann (Contributor) left a comment

Yes, I see the problem with semantic highlighting (Semantic Tokens) (you said document highlighting, but that's a different service). The situation is different for other services. But it's true that it's the client that should decide whether to use a potentially outdated result or not.

We could apply this and then check how it affects the editing experience in larger projects. I'm particularly curious how it performs when completion is continuously triggered while typing.

@@ -89,7 +111,7 @@ export class DefaultWorkspaceLock implements WorkspaceLock {
         } else {
             return;
         }
         this.done = false;
         this.counter += entries.length;
         await Promise.all(entries.map(async ({ action, deferred, cancellationToken }) => {
             try {
                 // Move the execution of the action to the next event loop tick via `Promise.resolve()`
A contributor commented:

Unrelated to this change, but does awaiting Promise.resolve() really change the execution order in the event loop? Isn't that rather what setImmediate and our utility function delayNextTick are designed for?

We could rewrite this to:

await delayNextTick();
const result = await action(cancellationToken);
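
For reference, a small standalone illustration of the difference (assuming delayNextTick wraps setImmediate, as its name suggests):

// In Node, `await Promise.resolve()` only yields to the microtask queue: the awaiting
// code resumes before any pending I/O or `setImmediate` callbacks. `delayNextTick`
// (sketched here) defers to the check phase of a later event loop iteration.
function delayNextTick(): Promise<void> {
    return new Promise(resolve => setImmediate(() => resolve()));
}

async function viaMicrotask(): Promise<void> {
    await Promise.resolve();   // resumes via the microtask queue, still in the current tick
    console.log('after microtask');
}

async function viaNextTick(): Promise<void> {
    await delayNextTick();     // resumes in a later event loop iteration
    console.log('after next tick');
}

setImmediate(() => console.log('check-phase callback'));
void viaMicrotask();           // logs before the check-phase callback
void viaNextTick();            // logs after the check-phase callback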

Promise.resolve(action()).then(
    result => end(() => deferred.resolve(result)),
    err => end(() => deferred.reject(err))
);
A contributor commented:

I see two potential issues with this code:

  • The solution with the local end function is a more complicated way of expressing a finally block.
  • I suspect that Promise.resolve(action()) does not behave as we want when the action throws an error – the error would just be propagated instead of rejecting the resulting promise.

I suggest making this whole method async so we can await the action and wrap that in a try-finally block.
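
A sketch of that shape, assuming `end` only performs the remaining queue bookkeeping (its body is not visible in this diff); `action`, `deferred`, `cancellationToken` and `delayNextTick` come from the surrounding method:

try {
    // Defer the action to a later point in the event loop, then await it directly,
    // so both synchronous throws and rejected promises land in the catch block.
    await delayNextTick();
    const result = await action(cancellationToken);
    deferred.resolve(result);
} catch (err) {
    deferred.reject(err);
} finally {
    end(); // assumed to perform only the bookkeeping; resolve/reject moved above
}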

-    return await serviceCall(language, params, cancelToken);
+    const result = await lock.read(async () => {
+        return await serviceCall(language, params, cancelToken);
+    }, true); // Give this priority, since we already waited until the target state
A contributor commented:

I think this can be simplified to

await lock.read(() => serviceCall(language, params, cancelToken), true)

(same in createServerRequestHandler and createRequestHandler).

@spoenemann (Contributor) commented:

Question: with this change, does it make any sense for LSP services to use interruptAndCheck? I see it currently used by inlay-hint-provider, semantic-token-provider and workspace-symbol-provider.

Should we make it clear in the function documentation that it should be used only in the document building process and the associated services?
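
For context, a rough sketch of how those providers typically use interruptAndCheck (computeResultFor is a placeholder for the provider-specific work):

async function provide(document: LangiumDocument, cancelToken: CancellationToken): Promise<void> {
    for (const node of streamAllContents(document.parseResult.value)) {
        // Periodically yields to the event loop and throws `OperationCancelled`
        // once the client has cancelled the request.
        await interruptAndCheck(cancelToken);
        computeResultFor(node); // placeholder
    }
}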
