composefs: random EINVALs #2042

edsantiago · 2024-07-24T12:01:51Z

Sometimes on commit, sometimes on diff

# podman-remote [options] commit -q testcon newimage
Error: copying layers and metadata for container "cecc9c90f4107cf89808b8234df98e173e1bf052a006665f460a1e86f8d070b8": initializing source containers-storage:testcon: extracting layer "1745071114b3933b11b73a9f099d2b82d4e80262c1211b1b6e7f388466ebdf3c": invalid argument

# podman [options] --pull-option=enable_partial_images=true --pull-option=convert_images=true container checkpoint --create-image alpine-checkpoint-bmjmtt --keep e3793c9b35baf7d6a62a976fb4da4e4fc139c535edd159cd6edcf96a8b4e1d02
Error: exporting root file-system diff for "e3793c9b35baf7d6a62a976fb4da4e4fc139c535edd159cd6edcf96a8b4e1d02": invalid argument

rawhide : int podman rawhide root host sqlite
- 07-24 07:14 in Podman checkpoint podman checkpoint --create-image with running container
rawhide : int remote rawhide root host sqlite [remote]
- 07-22 11:56 in Podman commit podman commit adds exposed ports

x	x	x	x	x	x
int(2)	podman(1)	rawhide(2)	root(2)	host(2)	sqlite(2)
	remote(1)


Motivated by containers#2042, where we just get a bare `invalid argument` out of the entire storage stack. My offhand guess from skimming some of the code is that by far the most likely source is the raw `lgetxattr` call. It'd be useful to know that for sure.

Signed-off-by: Colin Walters <walters@verbum.org>

edsantiago · 2024-07-24T18:48:33Z

Another one on commit:

# podman-remote [options] commit -q -f docker --message testing-commit test1 foobar.com/test1-image:latest
Error: copying layers and metadata for container "0e4bc8c8f422fea0f64b850c57307c45d172c136e1e503a60c0f2ea1c0c6c894": initializing source containers-storage:test1: extracting layer "8a6b0c5bf45deb44539988fe8f0214bff28b79dd03cacc9246febc6c7b3a1dce": failed to mount erofs image at "/tmp/CI_uMrH/podman-e2e-1025971979/imagecachedir/overlay/03901b4a2ea88eeaad62dbe59b072b28b6efa00491962b8741081c5df50c65e0/composefs-layers/0": invalid argument

giuseppe · 2024-07-24T21:25:20Z

this is more helpful, thanks! The erofs image could be corrupted

CC @cgwalters

cgwalters · 2024-07-29T17:40:47Z

> The erofs image could be corrupted

Hmm...you think possibly some container image -> mkcomposefs -> (internal mkfs.erofs-alike) -> kernel refusing to read it?

I'm unfortunately not yet familiar with the podman test suite, but is this 100% reproducible, so basically setting up composefs for c/storage and run that relevant test?

giuseppe · 2024-07-29T18:34:06Z

> I'm unfortunately not yet familiar with the podman test suite, but is this 100% reproducible, so basically setting up composefs for c/storage and run that relevant test?

I've tried to get as close as possible to the CI setup but I've not managed to reproduce it locally yet :/

Might be worth testing with a version of podman that stores somewhere the erofs image that fails on mount, so we can analyze it.

cgwalters · 2024-07-29T18:41:09Z

> Might be worth testing with a version of podman that stores somewhere the erofs image that fails on mount, so we can analyze it.

Maybe:

- If the image fails to mount, rename it as .bad or so (or just create a symlink/hardlink with that name)
- Change podman CI to scrape all such images and store them as cirrus CI artifacts

giuseppe · 2024-07-30T09:38:58Z

a couple of improvements for composefs that could help to detect potential failures when creating the erofs image:

hsiangkao · 2024-07-31T01:39:47Z

> Might be worth testing with a version of podman that stores somewhere the erofs image that fails on mount, so we can analyze it.
>
> Maybe:
>
> If the image fails to mount, rename it as .bad or so (or just create a symlink/hardlink with that name)

If an image fails to mount, it could just have a corrupted on-disk superblock and/or root inode. Running fsck.erofs with -d9 could give more information.
Also, if dmesg is available, it would give more hints too.

alexlarsson · 2024-08-05T15:09:23Z

Yeah, we probably need the dmesg to figure this one out.

edsantiago · 2024-08-05T15:17:08Z

At the top of each log is a link to journal. Here is the latest flake list as of this morning:

rawhide : int podman rawhide root host sqlite
- 08-02 08:11 in Podman checkpoint podman checkpoint container with export and try to change the runtime
- 07-31 20:50 in Podman run podman run --seccomp-policy image (bogus profile)
- 07-31 17:50 in Podman checkpoint podman checkpoint --create-image with running container
- 07-31 11:20 in Podman checkpoint podman checkpoint and run exec in restored container
- 07-26 07:53 in Podman images podman images filter intermediate
- 07-24 07:14 in Podman checkpoint podman checkpoint --create-image with running container
rawhide : int remote rawhide root host sqlite [remote]
- 08-05 07:47 in Podman run podman run --seccomp-policy image (block all syscalls)
- 08-04 16:23 in Podman run podman run --seccomp-policy image (bogus profile)
- 08-04 09:44 in Podman run entrypoint podman run entrypoint with cmd
- 08-02 08:12 in Podman run entrypoint podman run entrypoint with user cmd no image cmd
- 08-02 08:12 in Podman push podman push to local registry with authorization
- 08-01 12:50 in Podman commit podman commit should not commit env secret
- 08-01 12:50 in Podman checkpoint podman checkpoint --create-image with running container
- 07-31 23:06 in Podman diff podman diff latest container
- 07-31 20:51 in Podman checkpoint podman restore multiple containers from single checkpoint image
- 07-31 17:50 in Podman images podman builder prune
- 07-29 15:03 in Podman diff podman image diff
- 07-24 14:13 in Podman commit podman commit container with message
- 07-22 11:56 in Podman commit podman commit adds exposed ports

x	x	x	x	x	x
int(19)	remote(13)	rawhide(19)	root(19)	host(19)	sqlite(19)
	podman(6)

alexlarsson · 2024-08-06T09:06:47Z

I looked at the first one, and its clearly an error in this code:

if err := unix.Mount(loop.Name(), mountPoint, "erofs", unix.MS_RDONLY, mountOpts); err != nil {
	return fmt.Errorf("failed to mount erofs image at %q: %w", mountPoint, err)
}

Where mountOpts is either empty or "noacl". But I don't see any errors reported from erofs in the journal around that time (or indeed ever). So, where is the EINVAL coming from? If the image was somehow corrupt, shouldn't erofs give some kernel log? Is it the loop device name that is invalid or something?

alexlarsson · 2024-08-06T09:16:05Z

The second one has:
msg="Unmount "/tmp/CI_5HQJ/podman-e2e-1833510854/imagecachedir/overlay/7ffe79913b6cd452b42b4aef7f51ff4eeecbdde76a464ef55c321a345faaa4a6/composefs-layers/0": invalid argument"

Now, we're failing to unmount too, with EINVAL. That is very strange...

hsiangkao · 2024-08-06T09:18:23Z

> The second one has: msg="Unmount "/tmp/CI_5HQJ/podman-e2e-1833510854/imagecachedir/overlay/7ffe79913b6cd452b42b4aef7f51ff4eeecbdde76a464ef55c321a345faaa4a6/composefs-layers/0": invalid argument"
>
> Now, we're failing to unmount too, with EINVAL. That is very strange...

Anyway, erofs itself clearly doesn't return EINVAL on unmount(). I'm not sure how this happens; maybe the directory was already unmounted, so it's no longer a mount point.

alexlarsson · 2024-08-06T09:20:36Z

Yeah, or maybe the mount failed.

alexlarsson · 2024-08-06T09:22:03Z

umount:
EINVAL target is not a mount point.

hsiangkao · 2024-08-06T09:25:53Z

> Yeah, or maybe the mount failed.

yeah, anyway, it'd be better to have some dmesg result, since it seems (I think) there are enough kernel prints in the mount failure path.

alexlarsson · 2024-08-06T09:28:47Z

@hsiangkao Well, the thing is that we see these in the journal logs:

Aug 02 07:11:37 cirrus-task-6203064640602112 kernel: erofs: (device loop1): mounted with root inode @ nid 36.

So, should we not see if there were any other erofs error messages?

hsiangkao · 2024-08-06T09:35:45Z

> @hsiangkao Well, the thing is that we see these in the journal logs:
>
> Aug 02 07:11:37 cirrus-task-6203064640602112 kernel: erofs: (device loop1): mounted with root inode @ nid 36.
>
> So, should we not see if there were any other erofs error messages?

I think if that message is printed, the mount has already succeeded in erofs itself, see:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/fs/erofs/super.c?h=v6.10#n687

so mount() won't return EINVAL, at least not for this particular mount.

Is that the exact time of the failure?

alexlarsson · 2024-08-06T09:43:19Z

@hsiangkao No, that is just an example of some dmesg output from the journal logs we have. What I meant was, if these successful mounts are reported (and they are), should not also failed mounts be reported in the logs?

I have been unable to find any other erofs log output other than copies of the above example.

hsiangkao · 2024-08-06T09:46:13Z

> @hsiangkao No, that is just an example of some dmesg output from the journal logs we have. What I meant was, if these successful mounts are reported (and they are), should not also failed mounts be reported in the logs?
>
> I have been unable to find any other erofs log output other than copies of the above example.

If there is no "erofs" dmesg log returned in the kernel message, I guess EINVAL wasn't returned by erofs, maybe that is what you mentioned just now: loop device is invalid or likewise...

since it seems that currently erofs added error messages to all error paths in erofs_fc_fill_super():
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/fs/erofs/super.c?h=v6.10#n579

hsiangkao · 2024-08-06T09:50:07Z

> @hsiangkao No, that is just an example of some dmesg output from the journal logs we have. What I meant was, if these successful mounts are reported (and they are), should not also failed mounts be reported in the logs?
> I have been unable to find any other erofs log output other than copies of the above example.
>
> If there is no "erofs" dmesg log returned in the kernel message, I guess EINVAL wasn't returned by erofs, maybe that is what you mentioned just now: loop device is invalid or likewise...
>
> since it seems that currently erofs added error messages to all error paths in erofs_fc_fill_super(): https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/fs/erofs/super.c?h=v6.10#n579

oh, I'm not sure how errorfc() actually works now though...

alexlarsson · 2024-08-06T12:07:09Z

Hmm, yeah, errorfc() seems to log to the error buffer in the fs context, at least in some cases, and you can then extract these messages via the fd (as in e.g. https://github.com/torvalds/linux/blob/b446a2dae984fa5bd56dd7c3a02a426f87e05813/samples/vfs/test-fsmount.c#L20). Does this mean those errors are not logged though?

hsiangkao · 2024-08-06T12:45:33Z

> Hmm, yeah, errorfc() seems to log to the error buffer in the fs context, at least in some cases, and you can then extract these messages via the fd (as in e.g. https://github.com/torvalds/linux/blob/b446a2dae984fa5bd56dd7c3a02a426f87e05813/samples/vfs/test-fsmount.c#L20). Does this mean those errors are not logged though?

I haven't tried it, but I guess those errorfc() paths are rarely triggered:
super.c: errorfc(fc, "failed to set initial blksize");
super.c: errorfc(fc, "unsupported blksize for fscache mode");
super.c: errorfc(fc, "failed to set erofs blksize");
super.c: errorfc(fc, "DAX unsupported by block device. Turning off DAX.");
super.c: errorfc(fc, "unsupported blocksize for DAX");

Anyway, I tend to guess that EINVAL wasn't returned by erofs. It looks like an invalid device or something.
I think I will provide file-backed mounts in the next version.

alexlarsson · 2024-08-06T12:49:51Z

If anything I'd expect it to be this:

super.c: ret = -EINVAL;
super.c- if (le32_to_cpu(dsb->magic) != EROFS_SUPER_MAGIC_V1) {
super.c- erofs_err(sb, "cannot find valid erofs superblock");
super.c- goto out;
super.c- }

edsantiago · 2024-08-14T16:01:13Z

ping, what is the status of this?

rawhide : int podman rawhide root host sqlite
- 08-11 09:26 in Podman checkpoint podman checkpoint container with export and try to change the runtime
- 08-06 11:36 in Podman checkpoint podman restore multiple containers from single checkpoint image
- 08-05 09:48 in Podman play kube with build Check that image is built using Containerfile
- 08-02 08:11 in Podman checkpoint podman checkpoint container with export and try to change the runtime
- 07-31 20:50 in Podman run podman run --seccomp-policy image (bogus profile)
- 07-31 17:50 in Podman checkpoint podman checkpoint --create-image with running container
- 07-31 11:20 in Podman checkpoint podman checkpoint and run exec in restored container
- 07-26 07:53 in Podman images podman images filter intermediate
- 07-24 07:14 in Podman checkpoint podman checkpoint --create-image with running container
rawhide : int remote rawhide root host sqlite [remote]
- 08-14 11:43 in Podman checkpoint podman restore multiple containers from single checkpoint image
- 08-13 16:48 in Podman checkpoint podman restore multiple containers from single checkpoint image
- 08-06 17:29 in Podman commit podman commit container with change CMD flag
- 08-06 17:29 in Podman checkpoint podman restore multiple containers from single checkpoint image
- 08-06 17:29 in Podman checkpoint podman checkpoint a container started with --rm
- 08-06 15:38 in Podman checkpoint podman restore multiple containers from multiple checkpoint images
- 08-05 07:47 in Podman run podman run --seccomp-policy image (block all syscalls)
- 08-04 16:23 in Podman run podman run --seccomp-policy image (bogus profile)
- 08-04 09:44 in Podman run entrypoint podman run entrypoint with cmd
- 08-02 08:12 in Podman run entrypoint podman run entrypoint with user cmd no image cmd
- 08-02 08:12 in Podman push podman push to local registry with authorization
- 08-01 12:50 in Podman commit podman commit should not commit env secret
- 08-01 12:50 in Podman checkpoint podman checkpoint --create-image with running container
- 07-31 23:06 in Podman diff podman diff latest container
- 07-31 20:51 in Podman checkpoint podman restore multiple containers from single checkpoint image
- 07-31 17:50 in Podman images podman builder prune
- 07-29 15:03 in Podman diff podman image diff
- 07-24 14:13 in Podman commit podman commit container with message
- 07-22 11:56 in Podman commit podman commit adds exposed ports

x	x	x	x	x	x
int(28)	remote(19)	rawhide(28)	root(28)	host(28)	sqlite(28)
	podman(9)

cgwalters · 2024-08-16T20:55:01Z

I'm not totally sure this is all related to composefs; for example:

Error: exporting root file-system diff for "9c8758400f1191794e89029144d40892208ac0b3b11d01dc6e21703dee4a2589": failed to move mount: invalid argument

@edsantiago can you elaborate a bit on the background on this? Basically two questions:

- Are these tests only running against Fedora rawhide, and we have no data on whether they also appear on e.g. f40/c10s/c9s?
- Are we sure they only reproduce when composefs is enabled for c/storage? i.e. did we just enable composefs for rawhide or are there separate composefs-only jobs?

cgwalters · 2024-08-16T21:00:44Z

> super.c- erofs_err(sb, "cannot find valid erofs superblock");

Right, I agree that seems like the most likely source of an EINVAL from the erofs layer - and https://github.com/containers/composefs/releases/tag/v1.0.5 contains containers/composefs@76b4da5 which has a "possible" fix for this (though seeing it would just turn the question into how corrupt composefs erofs files were being generated).

Given that some of the EINVALs we're seeing here are coming from what looks like generic VFS operations, I think there's either a kernel bug/regression, or somehow us using composefs tickles other bugs generically for the VFS. It'd be really useful to have a bit more data on these tests on other OS versions.

edsantiago · 2024-08-19T11:04:39Z

> Are these tests only running against Fedora rawhide, and we have no data on whether they also appear on e.g. f40/c10s/c9s?

We are only testing composefs in rawhide.

> Are we sure they only reproduce when composefs is enabled for c/storage? i.e. did we just enable composefs for rawhide or are there separate composefs-only jobs?

When composefs is enabled, it applies to all tests, with the exception of any tests that use their own private $CONTAINERS_STORAGE_CONF, but I only see two tests that do so.

This was referenced Jul 24, 2024

composefs: random EINVALs containers/podman#23385

Open

test: disable artifacts cache with composefs containers/podman#23274

Merged

cgwalters mentioned this issue Jul 24, 2024

Add some error context in Changes codepaths #2043

Merged

cgwalters added the area/composefs composefs related changes label Jul 29, 2024

giuseppe mentioned this issue Aug 6, 2024

[CI][TEST] investigate https://github.com/containers/storage/issues/2042 containers/podman#23516

Closed

giuseppe mentioned this issue Aug 6, 2024

vendor: update c/storage containers/podman#23521

Merged

cgwalters mentioned this issue Aug 20, 2024

composefs fixes #2069

Merged

cgwalters mentioned this issue Sep 6, 2024

libcomposefs: detect short erofs files containers/composefs#333

Merged
