Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

inactivity_timeout is triggered regardless of having received pings #2716

Open
xim opened this issue Aug 10, 2023 · 13 comments · May be fixed by #2718
Open

inactivity_timeout is triggered regardless of having received pings #2716

xim opened this issue Aug 10, 2023 · 13 comments · May be fixed by #2718
Assignees

Comments

@xim
Copy link

xim commented Aug 10, 2023

Version of Beast

347

Steps necessary to reproduce the problem

I'm using plain, async boost::beast::websocket for a server and a client.

I have a server that's sending ping packets at a 15s interval and keeping the connection alive correctly. I have verified that the client receives these and responds with pongs.

Since I have these ping packets going, I've added a timeout in the client to try to detect if the connection goes down:

ws_.set_option(boost::beast::websocket::stream_base::timeout{
    60s,
    30s,
    false
});

I had assumed that this would be kept alive by the ping packets sent from the server, but it isn't. The client disconnects with the message The socket was closed due to a timeout.

Setting the final flag (keep_alive_pings) to true makes the client keep the connection alive by doing the same as the server's already doing. Given the server set to send ping frames at 15s intervals, this setting on the client:

ws_.set_option(boost::beast::websocket::stream_base::timeout{
    60s,
    80s,
    true
});

Means I'll get this behaviour:

00:00 Client connects
00:15 Server pings as expected
00:15 Client pongs as required
00:30 Server pings as expected
00:30 Client pongs as required
00:40 Client pings despite having seen activity in the form of a ping

I feel is the desired behaviour would be to count incoming ping frames as activity. If that can't be made the default behaviour, could we try to add it as a config on boost::beast::websocket::stream_base::timeout?

@xim
Copy link
Author

xim commented Aug 10, 2023

I made a somewhat-minimal reproduction at https://gist.github.com/xim/06d60c455363fbb764c14f21d7705c40

Compiled and run by running each in a separate terminal:

g++ -g -std=gnu++17 websocket_server_async.cpp && echo running && ./a.out 127.0.0.1 8080 1
g++ -g -std=gnu++17 websocket_client_async.cpp && ./a.out 127.0.0.1 8080 'test'

@klemens-morgenstern
Copy link
Collaborator

What are you using as the tcp stream? Are you calling async_read perpetually?

@xim
Copy link
Author

xim commented Aug 11, 2023

As shown in the gist, I made a somewhat-minimal reproduction with the example websocket_client_async.cpp as a base.

It's a websocket::stream<beast::tcp_stream> and I re-call async_read in the on_read handler

@xim
Copy link
Author

xim commented Aug 11, 2023

Full code instead of patch at https://gist.github.com/xim/97874f6d7e27402f26228a0da6448236

Here, the server sends pings ever 1s and the client has a 5s timeout.

Client output:

ping
ping
ping
ping
read: The socket was closed due to a timeout

Server output:

pong
pong
pong
pong
read: End of file

xim added a commit to xim/beast that referenced this issue Aug 11, 2023
If the stream is receiving control packets like ping, don't count it as
idle. This means you can enable `timeout_opt.keep_alive_ping` on only
one side to get heartbeat.

Addresses issue boostorg#2716
@xim
Copy link
Author

xim commented Aug 11, 2023

Made a PR. If you want this to be configurable, I can have a stab at that as well.

@ashtum ashtum self-assigned this Aug 18, 2023
@ashtum
Copy link
Collaborator

ashtum commented Aug 19, 2023

I couldn't find any indication in RFC 6455 that either permits or prohibits the counting of control frames as a form of activity. Considering that this approach could potentially eliminate the need for extra ping-pong messages every n seconds for clients with enabled keep_alive_pings, introducing an additional parameter to the configuration could unnecessarily complicate matters for a feature that might see very limited use.

The only plausible options appear to be either counting pings as activity (assuming the standard allows for it), or maintaining the status quo. Users could opt to utilize the low-level API and manually implement ping-pong if they wish to minimize the transmission of extra ping-pong messages.

@xim
Copy link
Author

xim commented Aug 25, 2023

I couldn't find any indication in RFC 6455 that either permits or prohibits the counting of control frames as a form of activity

Agree, the most precise is the phrasing A Ping frame may serve either as a keepalive or as a means to verify that the remote endpoint is still responsive.

IMHO it makes sense for control frames to count as activity.

this approach could potentially eliminate the need for extra ping-pong messages every n seconds for clients with enabled keep_alive_pings

Exactly, good summary there.

The only plausible options appear to be either counting pings as activity (assuming the standard allows for it), or maintaining the status quo

My vote is on the former =)

@vinniefalco
Copy link
Member

vinniefalco commented Aug 25, 2023

Without thinking about it too much, I believe received pings should reset the timer, as they represent activity.

Regardless of what is changed, the library documentation should be updated to describe either the current or the new behavior of received pings with respect to timeouts if it is not already explained.

@vinniefalco
Copy link
Member

assuming the standard allows for it

The specifications say nothing about what counts as idle activity. Ideally Beast would have great defaults that work for most people, while also allowing for the caller to have perfect control over customizing that behavior. However Beast has reaching close to the end of what can be done without incurring technical debt, so this ideal will likely only be approached in a new websocket library which will replace beast (this work has not started yet).

@xim
Copy link
Author

xim commented Aug 29, 2023

Without thinking about it too much, I believe received pings should reset the timer, as they represent activity.

Does that mean we should proceed towards merging the PR, @vinniefalco? I can make additional adjustments or try to document more inline if you wish!

@ashtum
Copy link
Collaborator

ashtum commented Aug 29, 2023

@xim It would be great if you could add the relevant test cases to the PR; This is the appropriate location for including them: https://github.com/boostorg/beast/blob/develop/test/beast/websocket/ping.cpp.

@xim
Copy link
Author

xim commented Aug 30, 2023

I don't see how I can test this without actually waiting for timers to expire. I mean, this test passes, but sleeping in the test like this is probably not OK?

echo_server es{log};
error_code ec;
stream<test::stream> ws{ioc_};
ws.set_option(stream_base::timeout{
    stream_base::none(),
    std::chrono::milliseconds(100),
    false
});
ws.next_layer().connect(es.stream());
ws.handshake("localhost", "/");
flat_buffer b;
bool got_timeout = false;
ws.async_read(b,
    [&](error_code ec, std::size_t)
    {
        std::cerr << ec.message() << std::endl;
        if(ec != beast::error::timeout)
            BOOST_THROW_EXCEPTION(
                system_error{ec});
        got_timeout = true;
    });
BEAST_EXPECT(ws.impl_->idle_counter == 0);
ioc_.run_for(std::chrono::milliseconds(60));
ioc_.restart();
BEAST_EXPECT(ws.impl_->idle_counter == 1);
ws.ping("", ec);
if(ec)
    BOOST_THROW_EXCEPTION(
        system_error{ec});
ioc_.run_for(std::chrono::milliseconds(10));
ioc_.restart();
BEAST_EXPECT(ws.impl_->idle_counter == 0);
BEAST_EXPECT(!got_timeout);
ioc_.run_for(std::chrono::milliseconds(100));
ioc_.restart();
BEAST_EXPECT(got_timeout);

@ashtum
Copy link
Collaborator

ashtum commented Aug 30, 2023

It seems this how is how timeout cases are tested in the entire code base (searching for stream_base::timeout or test::run_for).
Please ensure that you allocate a sufficient margin to prevent false failures caused by OS rescheduling; examining other test cases can provide a reference for determining the appropriate margin.
@klemens-morgenstern is this the right way?

xim added a commit to xim/beast that referenced this issue Aug 30, 2023
If the stream is receiving control packets like ping, don't count it as
idle. This means you can enable `timeout_opt.keep_alive_ping` on only
one side to get heartbeat.

Addresses issue boostorg#2716
xim added a commit to xim/beast that referenced this issue Oct 2, 2023
If the stream is receiving control packets like ping, don't count it as
idle. This means you can enable `timeout_opt.keep_alive_ping` on only
one side to get heartbeat.

Addresses issue boostorg#2716
xim added a commit to xim/beast that referenced this issue Nov 27, 2023
If the stream is receiving control packets like ping, don't count it as
idle. This means you can enable `timeout_opt.keep_alive_ping` on only
one side to get heartbeat.

Addresses issue boostorg#2716
xim added a commit to xim/beast that referenced this issue Dec 26, 2023
If the stream is receiving control packets like ping, don't count it as
idle. This means you can enable `timeout_opt.keep_alive_ping` on only
one side to get heartbeat.

Make tests that verify current behaviour.

Update documentation to match changes in PR boostorg#2718.

Addresses issue boostorg#2716
ashtum pushed a commit to xim/beast that referenced this issue Feb 15, 2024
If the stream is receiving control packets like ping, don't count it as
idle. This means you can enable `timeout_opt.keep_alive_ping` on only
one side to get heartbeat.

Make tests that verify current behaviour.

Update documentation to match changes in PR boostorg#2718.

Addresses issue boostorg#2716
ashtum pushed a commit to ashtum/beast that referenced this issue Jun 22, 2024
If the stream is receiving control packets like ping, don't count it as
idle. This means you can enable `timeout_opt.keep_alive_ping` on only
one side to get heartbeat.

Make tests that verify current behaviour.

Update documentation to match changes in PR boostorg#2718.

Addresses issue boostorg#2716
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants