FEATURE: Add option to abstract the repository when using skopeo sync #1845

Open
ro11net opened this issue Jan 11, 2023 · 3 comments
Labels
kind/feature (A request for, or a PR adding, new functionality), stale-issue

Comments


ro11net commented Jan 11, 2023

I would first like to say that I am new to the open source community. If this is something you're willing to add, I would like to help tackle it; it sounds like a fun way to get more familiar with Go and with making contributions. Please let me know if there is already a fix for this or an option I may be missing in the documentation.

My team has recently started using Skopeo. It works great for pulling large numbers of images and staging them for "airgapped" image registries. We use the --scoped option when syncing images to a local device or proxy registry for scanning, but there seems to be no option to abstract the registry from the directory structure. Such an option would make Skopeo more useful in automation frameworks that pull images from public registries and push them to private registries.

UNSCOPED:

When the --scoped flag is NOT set: (skopeo sync --src yaml --dest dir), the following directory structure is created:

.
├── image:tag
├── image:tag
└── image:tag

This is not ideal when pushing images to a private registry as they are no longer separated into projects and the default image in the resource manifests will need to be changed.

SCOPED:

When the --scoped flag is added to sync images: (skopeo sync --scoped --src yaml --dest dir), the following directory structure is built:

.
├── docker.io
│   ├── project1
│   │   └── image:tag
│   └── project2
│       └── image:tag
└── quay.io
    └── project3
        └── image:tag

The problem with this is that now I need to move into the registry directories and run a sync from each in order to push all projects to the private registry.

  • Currently we move the projects out of the registry directories into a single directory so that all projects can be pushed at once.
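
As a rough illustration of that workaround, the sketch below flattens a scoped tree by dropping the top-level registry directory. The function name `flatten_registries` and both path arguments are hypothetical and not part of Skopeo; it assumes the `registry/project/image:tag` layout shown above and a `flat_root` that lives outside `scoped_root`.

```python
import shutil
from pathlib import Path


def flatten_registries(scoped_root: Path, flat_root: Path) -> list[Path]:
    """Move <scoped_root>/<registry>/<project> to <flat_root>/<project>.

    Returns the project directories under flat_root. Projects with the
    same name under two registries are merged; a clashing image directory
    raises instead of being overwritten.
    """
    moved = []
    for registry_dir in sorted(p for p in scoped_root.iterdir() if p.is_dir()):
        for project_dir in sorted(p for p in registry_dir.iterdir() if p.is_dir()):
            target = flat_root / project_dir.name
            target.mkdir(parents=True, exist_ok=True)
            for image_dir in sorted(project_dir.iterdir()):
                dest = target / image_dir.name
                if dest.exists():
                    raise FileExistsError(f"{dest} already exists")
                shutil.move(str(image_dir), str(dest))
            moved.append(target)
    return moved
```

Raising on a name clash is a deliberate choice here: two registries can host a project with the same name, and silently overwriting one image with another would be hard to notice later.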

Staying in the current directory and running a sync will result in the following push to the private registry:

https://private.io/docker.io/project/image:tag

When I would like it to be:

https://private.io/project/image:tag

This unnecessarily complicates pipelines that stage images for transfer into an airgapped environment, whether they end up in storage used by the automation tools or in a private registry.
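
The three path mappings discussed here can be compared side by side with a small sketch. The function `destination_ref` and the mode name "strip-registry" are hypothetical; the first two modes only mirror the directory layouts shown in this issue, and the third is the requested behavior, not something Skopeo provides.

```python
def destination_ref(source: str, dest_registry: str, mode: str) -> str:
    """Return where a fully qualified `source` would land on `dest_registry`.

    mode:
      "unscoped"       - keep only the final image:tag component
      "scoped"         - keep the full source path, registry included
      "strip-registry" - hypothetical: drop only the source registry
    """
    # Assumes `source` starts with a registry host, e.g. docker.io/...
    registry, _, path = source.partition("/")
    if mode == "unscoped":
        tail = path.rsplit("/", 1)[-1]
    elif mode == "scoped":
        tail = f"{registry}/{path}"
    elif mode == "strip-registry":
        tail = path
    else:
        raise ValueError(f"unknown mode: {mode}")
    return f"{dest_registry}/{tail}"


src = "docker.io/project1/image:tag"
for mode in ("unscoped", "scoped", "strip-registry"):
    print(mode, "->", destination_ref(src, "private.io", mode))
```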

FIX/FEATURE REQUEST

What I would like to see is an additional flag, --abstract-registry, that can be combined with --scoped to produce output like this:

.
├── project1
│   └── image:tag
├── project2
│   └── image:tag
└── project3
    └── image:tag

This would also make it possible to sync from a source yaml and push directly to a private docker registry destination with a single command: skopeo sync --scoped --abstract-registry --src yaml --dest docker sync.yaml https://private.io

  • I believe this feature would keep the projects associated with images intact, so manifests would not have to change between environments and the resources would be more portable.

This is a problem I see fairly often in on-prem Kubernetes clusters that use their own private registries. Issue 854, opened in 2020, appears to describe something similar, but there seems to have been some confusion about the intent of the option. That issue says it was fixed in issue 870, but I don't see the behavior described above implemented there. Please correct me if I'm wrong.

EXAMPLE USE CASE:

                                                      | (No internet past this point)
---------------             ---------------           | ---------------             ---------------
| docker.io   |             |             |           | |  airgapped  |             |             |
---------------   images    |  temporary  |   images  | | storage for |   images    |   private   |
| quay.io     |   ------>   |   storage   |   ------> | | automation  |   ------>   |    image    |
---------------             |             |           | |    tools    |             |  repository |
| k8s.gcr.io  |             |             |           | |             |             |(private.io) |
---------------             ---------------           | ---------------             ---------------
                                                      |                                    |
                                                      |                                    |   images
                                                      |                                   \ /
                                                      |                             ---------------
                                                      |  endpoint:                  |             |
                                                      |    "docker.io"="private.io" |     K8s     |                 
                                                      |     "quay.io"="private.io"  |   cluster   | 
                                                      | pull repo/project/image:tag |             | 
                                                      |                             |             |
                                                      |                             ---------------

EXAMPLE sync.yaml

docker.io:
  images:
    project1/image:
    - tag
    project2/image:
    - tag
quay.io:
  images:
    project3/image:
    - tag
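
Until such an option exists, one possible stopgap is to generate one `skopeo copy` command per image so that each destination path can be chosen freely. The sketch below only builds the command lines and does not run them. The dict mirrors the example sync.yaml above so no YAML parser is needed; the function name `copy_commands` is hypothetical, and only the `images` key with explicit tag lists is handled.

```python
# The example sync.yaml, written out as a plain dict.
SYNC = {
    "docker.io": {"images": {"project1/image": ["tag"], "project2/image": ["tag"]}},
    "quay.io": {"images": {"project3/image": ["tag"]}},
}


def copy_commands(sync: dict, dest_registry: str) -> list[list[str]]:
    """Build one `skopeo copy` argument list per image:tag in `sync`.

    The source registry is dropped from the destination path, which is
    the mapping requested in this issue.
    """
    commands = []
    for registry, entry in sync.items():
        for repo, tags in entry.get("images", {}).items():
            # Entries without explicit tags are skipped in this sketch.
            for tag in tags or []:
                commands.append([
                    "skopeo", "copy",
                    f"docker://{registry}/{repo}:{tag}",
                    f"docker://{dest_registry}/{repo}:{tag}",
                ])
    return commands


for cmd in copy_commands(SYNC, "private.io"):
    print(" ".join(cmd))
```

Each list can be passed to `subprocess.run` as is; registry credentials and TLS settings are left out of the sketch.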

mtrmac commented Jan 11, 2023

Thanks for reaching out.

There’s a fair bit of previous work and conversation in earlier pull requests, https://github.com/containers/skopeo/pulls?q=is%3Apr+is%3Aopen .

I think the very basic summary is:

  • We shouldn’t keep adding individual mapping options; that would never end. Instead, let the user just specify source/destination repo pairs explicitly.
  • The current yaml format doesn’t allow that, so we need a new one.
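
To make the idea of explicit pairs concrete, here is a purely illustrative sketch. The `PAIRS` and `TAGS` layouts and the `expand` function are invented for this example; they are not the format sketched in #1531 and not anything Skopeo accepts today. The point is only that once source and destination repositories are listed side by side, any renaming scheme becomes a matter of data instead of another flag.

```python
# Hypothetical explicit pairs: (source repository, destination repository).
# The layout is invented for illustration; it is not skopeo's YAML format.
PAIRS = [
    ("docker.io/project1/image", "private.io/project1/image"),
    ("quay.io/project3/image", "private.io/mirror/project3-image"),
]

# Tags to copy for each source repository (also hypothetical).
TAGS = {
    "docker.io/project1/image": ["tag"],
    "quay.io/project3/image": ["tag"],
}


def expand(pairs, tags):
    """Turn repository pairs into (source ref, destination ref) tuples."""
    refs = []
    for src_repo, dst_repo in pairs:
        for tag in tags.get(src_repo, []):
            refs.append((f"{src_repo}:{tag}", f"{dst_repo}:{tag}"))
    return refs


for src, dst in expand(PAIRS, TAGS):
    print(src, "->", dst)
```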

#1531 (comment) is one very rough sketch of what that might look like, and #1792 a start of an implementation effort.

I’m afraid this is not something I can promise to spend a lot of time on short-term.

mtrmac added the kind/feature label Jan 11, 2023
@github-actions

A friendly reminder that this issue had no activity for 30 days.


deric commented Feb 6, 2024

@mtrmac I understand that you're trying to find a universal solution and some rewrite schemes might be very complex. But the basic usage might be:

  1. Copy multiple registries to a single registry: SUPPORTED using the --scoped flag, e.g. docker.io/foo/bar -> myrepo/docker.io/foo/bar
  2. Mirror images while preserving the path: NOT SUPPORTED, e.g. docker.io/foo/bar -> myrepo/foo/bar

Would it be ok to add a simple flag to support this? I guess the hardest part is coming up with a good name, e.g. --preserve-path or --preserve-prefix.

When you're running

rsync -r /src/foo/ /dest/

you expect it to preserve the directory structure and not to copy child directories into the parent directory.
