This commit tries to address two issues.
First, a free port is obtained by binding with the port value set to zero, which lets the OS pick a free port. We then immediately close that socket and later create another socket on the same port. The problem is that once the socket is closed, the OS is free to allocate the same port to someone else. This can lead to more than one Bypass server listening on the same port (possible because of the SO_REUSEPORT flag). The fix is to keep the socket open instead of closing it.
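The difference between the two patterns can be sketched at the OS level as follows. This is an illustration in Python rather than Bypass's actual code; the function names `free_port_then_close` and `free_port_keep_open` are made up for this example.

```python
import socket


def free_port_then_close():
    """Racy pattern: learn a free port, then release it before use."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        probe.bind(("127.0.0.1", 0))   # port 0: the OS picks a free port
        return probe.getsockname()[1]  # socket closes on return; the port is up for grabs


def free_port_keep_open():
    """Safer pattern: return the bound, listening socket together with its port."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen()
    return listener, listener.getsockname()[1]
```

With the second pattern the caller serves requests on the returned socket directly, so there is no window between discovering the port and listening on it.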
Second, Bypass exposes a down API, which closes the socket. The issue is the same as above: once the socket is closed, the OS is free to allocate the port to someone else. This change tries to mitigate it by keeping track of which test owns which ports and trying not to reuse those ports. It is still not foolproof, since there is a small interval during which the socket is active, but it is better than the old logic.
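The ownership-tracking idea might look roughly like the sketch below. It is written in Python for illustration and is not the code in this patch; `PortRegistry`, `claim`, and `release` are hypothetical names. A port stays registered to its owner even while that owner's socket is closed, so another test that is handed the same port by the OS rejects it and asks again.

```python
import socket
import threading


class PortRegistry:
    """Hypothetical registry mapping each port to the test that owns it."""

    def __init__(self):
        self._owners = {}
        self._lock = threading.Lock()

    def claim(self, owner, max_attempts=20):
        """Bind to an OS-chosen port, skipping ports another owner already holds."""
        rejected = []
        try:
            for _ in range(max_attempts):
                sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
                sock.bind(("127.0.0.1", 0))
                port = sock.getsockname()[1]
                with self._lock:
                    if port not in self._owners:
                        self._owners[port] = owner
                        sock.listen()
                        return sock, port
                # Owned by a test whose server is currently down; hold the
                # socket so the OS offers a different port on the next attempt.
                rejected.append(sock)
            raise RuntimeError("no unowned port found")
        finally:
            for sock in rejected:
                sock.close()

    def release(self, owner, port):
        """Forget the port once its owning test is finished with it."""
        with self._lock:
            if self._owners.get(port) == owner:
                del self._owners[port]
```

This only protects against reuse among tests that share the registry. Another process on the machine can still take the port while the socket is closed.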
To give some background, we have a very large test suite and run into this issue a lot. We have been using this patch for some time and it has definitely improved our CI situation. We still hit the race condition from time to time due to the usage of Bypass.down. Since the fix is not perfect, I am open to different approaches as well.
Thanks to @akash-akya for helping me with debugging and brainstorming.