join the cgroup after the initial setup finished #4438

Open

lifubang wants to merge 1 commit into main

Conversation

lifubang (Member)

We should join the cgroup after the initial setup has finished, but before runc init clones new child processes. (#4427)

This is because we should try our best to reduce the influence of memory cgroup accounting from all runc init processes before we start the container init process.

If we cherry-pick this commit to release-1.1, it will also eliminate the impact of memory accounting from ensure_clone_binary.

Signed-off-by: lifubang <lifubang@acmcoder.com>
@kolyshkin (Contributor)

This made me look into why initWaiter was added, and here's what I found:

  • it was added by commit 4ecff8d (see its description for the rationale);
  • since commit 0e9a335, it might no longer be necessary to have initWaiter at all.

So I opened a draft PR to remove it (#4441).
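
For context, initWaiter follows roughly this pattern. The sketch below is simplified rather than the exact runc source: a goroutine reads a single status byte that runc init writes to the init pipe once its preliminary setup is done, and reports the result on a channel.

```go
package sketch

import (
	"errors"
	"fmt"
	"io"
)

// initWaiter returns a channel that fires once runc init has
// signalled, with a single 0 byte on the init pipe, that its
// preliminary setup is finished. Simplified sketch, not the
// exact runc implementation.
func initWaiter(r io.Reader) chan error {
	ch := make(chan error, 1)
	go func() {
		defer close(ch)
		buf := make([]byte, 1)
		n, err := r.Read(buf)
		switch {
		case err != nil:
			ch <- fmt.Errorf("waiting for init preliminary setup: %w", err)
		case n < 1:
			ch <- errors.New("short read waiting for init preliminary setup")
		case buf[0] != 0:
			ch <- fmt.Errorf("unexpected status byte %d from runc init", buf[0])
		default:
			ch <- nil // init finished its initial setup
		}
	}()
	return ch
}
```

The parent receives from this channel at some point after starting runc init; where exactly it does so is what this PR is about.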

@kolyshkin (Contributor)

Frankly, looking at the code in the main branch, I don't understand what this PR is doing or what difference it makes.

@lifubang (Member, Author)

> Frankly, looking at the code in the main branch, I don't understand what this PR is doing or what difference it makes.

There is really a race here; it is very obvious in the release-1.1 branch (containerd/containerd#10799 (comment)).
Yes, it is not very important now that we have moved the binary clone operation from runc init to runc create/run, but a minor race still exists in main. Keeping this initWaiter leaves runc in a stable state rather than a racy one.
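
As a rough sketch of the release-1.1 interleaving (the call names are real, but the timing details are simplified):

```
runc parent                       runc init (nsexec, release-1.1)
-----------                       -------------------------------
start runc init
manager.Apply(pid)  <--races-->   ensure_clone_binary()
(joins the cgroups)               (copies /proc/self/exe, several MB)
                                  writes status byte (setup done)
<-waitInit
sends bootstrap data              nl_parse(pipenum, &config)
                                  clones stage-1/stage-2 children
```

Depending on which side wins the race, the memory touched by ensure_clone_binary is either charged to the container's memory cgroup (which can push a container with a tiny memory limit over it) or escapes accounting entirely.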

@lifubang (Member, Author)

Like I mentioned in #4439 (comment):
if we remove this initWaiter, the cgroup accounting of whatever runc init does before nl_parse(pipenum, &config) may be lost, depending on when we join the cgroup.

nl_parse(pipenum, &config);

@lifubang (Member, Author)

> I don't understand what this PR is doing,

I think the place where we wait on the waitInit channel is wrong: it also causes the minor race mentioned above, so I moved the wait to the correct place. That is what this PR does.
I have also tested it; it does not introduce any performance regression in the main branch.
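
Concretely, the parent-side ordering this PR aims for looks roughly like the sketch below. initWaiter, p.manager.Apply, and the bootstrap data are real runc concepts, but the surrounding code is illustrative, not the literal diff:

```go
// Fragment of the process-start path in the parent; illustrative
// sketch of the proposed ordering, not the actual patch.
waitInit := initWaiter(p.messageSockPair.parent)

if err := p.cmd.Start(); err != nil { // start runc init
	return err
}

// Wait until runc init reports that its expensive initial setup
// (e.g. ensure_clone_binary in release-1.1) has finished...
if err := <-waitInit; err != nil {
	return err
}

// ...and only then join the container's cgroups, so the setup cost
// is not accounted to the container, while the children that
// runc init clones afterwards still are.
if err := p.manager.Apply(p.pid()); err != nil {
	return err
}

// Now send the bootstrap data so runc init proceeds to nl_parse()
// and clones its stage-1/stage-2 children inside the cgroup.
if _, err := io.Copy(p.messageSockPair.parent, p.bootstrapData); err != nil {
	return err
}
```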

@lifubang (Member, Author)

lifubang commented Oct 12, 2024

On another note, keeping this initWaiter may also be useful in the future: for example, at some point we may need to add another expensive function call, like ensure_clone_binary, to nsexec.

@kolyshkin (Contributor)

Yes, I can understand why this is needed in 1.1.

I still cannot understand why it is needed here in the main branch, what race it is fixing, or what difference it makes.

Can you please explain the race to me in detail? For example, draw two columns, one for the runc parent and another for runc init, and write down which one is doing what and how the race can happen.

Or, I will take another look next week (maybe I'm just dumb and tired).

> On another note, keeping this initWaiter may also be useful in the future: for example, at some point we may need to add another expensive function call, like ensure_clone_binary, to nsexec.

This is not a good argument. First of all, we hope to have less code in nsexec, not more; and in general, the less locking and synchronization there is, the better.
