[Bug]: Nested index: is it normal for an index to be deleted if its tag is deleted even when it is part of another index #2624

Closed
Sebastian-Maier opened this issue Aug 21, 2024 · 14 comments
Labels
bug Something isn't working rm-external Roadmap item submitted by non-maintainers

Comments

@Sebastian-Maier

Sebastian-Maier commented Aug 21, 2024

zot version

v2.1.1

Describe the bug

I have the following questions (not sure whether this is actually a bug):

I want to copy a multi-platform image (index) to a repository (via regctl image copy) and include its digest as a platform-independent entry in another multi-platform image (index) that I add to the same repository.
Since I no longer need (nor want) a specific tag referencing the copied image, I delete it (with regctl tag delete).
All these operations succeed. However, when I try to access/retrieve the image copied originally (via the digest stored in the root index), the image is no longer available.

Is this normal behavior even when the image/index is still referenced (via digest in the root index)?
Is this an OCI registry or a client tool issue? (I use ZOT v2.1.1 and regctl v0.7.1 on macOS)
Is there a way to copy a multi-platform image (index) without a specific target tag in the first place?

To reproduce

  1. Copy multi-platform image (index) to target repository with a tag specified (via regclient: regctl image copy)
  2. Use the digest of the copied image as an entry in another index and store it in the same repository (with a tag)
  3. Delete the tag of the image copied originally (via regclient: regctl tag delete)
  4. Try to access/retrieve the image copied originally (via the digest stored in the root index)

Expected behavior

I would expect the copied image to be still available since it is referenced via digest by the (root) index.

Screenshots

No response

Additional context

No response

@Sebastian-Maier Sebastian-Maier added the bug Something isn't working label Aug 21, 2024
@rchincha rchincha added the rm-external Roadmap item submitted by non-maintainers label Aug 21, 2024
@rchincha
Contributor

@Sebastian-Maier thanks for trying zot out.

Yes, this is likely a bug. However, just curious, is this just an experiment or a concrete use case - so we can prioritize.

@andaaron
Contributor

Hi @Sebastian-Maier,

Do you have retention / gc settings enabled?

Let me see if I understand the scenario:

  1. You push an index by digest (let's call it A), and it remains untagged.
  2. You push another index which references A (let's call it B), and it is tagged.
  3. At some point you explicitly deleted B. And A also gets removed, but you did not expect this.

If GC is enabled and set to delete untagged images after a while, it could delete A because it is not referenced in another image, and it is not tagged.

If at point 1 you actually tagged A separately, it should be retained regardless of removing B.

If by "root index" you mean the index zot manages internally for tracking all images pushed to a specific repo, that one is managed by zot, and zot can add or remove images when the user calls the respective APIs or when GC runs.
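
The retention rule described here (a manifest survives if it is tagged or is referenced by something that survives) can be sketched as a reachability walk. This is only an illustration of the expected behavior, not zot's implementation: the function name `retained_digests`, the "parent child" edge format, and the short placeholder digests R, N, M1, M2, X, M3 are all assumptions made for the example.

```shell
#!/bin/sh
set -eu

# retained_digests TAGGED EDGES  (hypothetical helper, not part of zot or regctl)
#   TAGGED: whitespace-separated digests that still have a tag (the GC roots)
#   EDGES:  newline-separated "parent child" pairs (index -> referenced manifest)
# Prints every digest reachable from a tagged digest, one per line.
retained_digests() {
    rd_queue=$1
    rd_edges=$2
    rd_seen=" "
    while :; do
        # Split the queue into positional parameters; stop when it is empty.
        set -- $rd_queue
        [ "$#" -gt 0 ] || break
        rd_node=$1
        shift
        rd_queue="$*"
        case "$rd_seen" in
            *" $rd_node "*) continue ;;   # already visited
        esac
        rd_seen="$rd_seen$rd_node "
        printf '%s\n' "$rd_node"
        # Enqueue everything this manifest references.
        rd_children=$(printf '%s\n' "$rd_edges" | awk -v p="$rd_node" '$1 == p { print $2 }')
        rd_queue="$rd_queue $rd_children"
    done
}

# Placeholder graph: root index R references nested index N, which references
# two platform manifests. X is an unrelated, untagged index.
edges="R N
N M1
N M2
X M3"

# Only R is tagged; N, M1 and M2 should still be retained through R.
retained_digests "R" "$edges"
```

Under this rule, removing the tag of a nested index should not make it collectable as long as a tagged index still references it.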

@Sebastian-Maier
Author

> @Sebastian-Maier thanks for trying zot out.
>
> Yes, this is likely a bug. However, just curious, is this just an experiment or a concrete use case - so we can prioritize.

We currently use this nested index in a PoC implementation but we plan to use it in production soon.

@Sebastian-Maier
Author

Sebastian-Maier commented Aug 22, 2024

> Do you have retention / gc settings enabled?

I use the docker image with the default settings. Not sure what that means for the setting, but I would expect that the nested index that is still referenced from another index is not deleted in either of these modes.

> Let me see if I understand the scenario:
>
>   1. You push an index by digest (let's call it A), and it remains untagged.
>   2. You push another index which references A (let's call it B), and it is tagged.
>   3. At some point you explicitly deleted B. And A also gets removed, but you did not expect this.

The scenario I described was a little bit different.
I performed the following steps to produce the error ("N" standing for nested index, "R" standing for root index):

  1. I copied index "N" to the repository "repo" and tagged it "tagN" (I had to use a target tag as regctl image copy does not support copying by digest and without a target tag)
  2. I created an index "R" in the same repository "repo" that references "N" and is tagged as "tagR"
  3. I deleted "tagN" to avoid tags of nested indexes in the shared namespace (that could confuse users and that would also prevent "N" from being deleted if "tagR" were deleted at some time in the future)
  4. I was no longer able to access "N" via the digest stored in "R" as it has seemingly been deleted

> If by "root index" you mean the index zot manages internally for tracking all images pushed to a specific repo, that one is managed by zot, and zot can add or remove images when the user calls the respective APIs or when GC runs.

By root index I mean my own index "R" that references the nested index "N" and should be the only one that has a tag in the end.

I also opened an issue/question for regclient to find out whether this is a ZOT or regctl issue: regclient/regclient#808
In addition, I just asked the question there whether regctl image copy could be extended to allow copying images by digest only (and without specifying a target tag).

@Sebastian-Maier
Author

I got an update from the maintainers of regclient: It is possible to use regctl image copy to copy an image by digest (and without a tag).
I could therefore adjust the steps accordingly:

  1. Copy index "N" to the repository "repo" (without a tag)
  2. Create index "R" in the same repository "repo" that references "N" and is tagged as "tagR"

This seems to work, as "N" seems to be accessible via its digest – at least initially. However, I am not sure whether some kind of GC might eventually delete "N" nevertheless. In addition, I also think the original behavior (from the steps we discussed before) is probably still a bug.
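
One way to probe whether "N" stays available over time is to poll it repeatedly by digest. The following is a minimal sketch, assuming a running registry and regctl on the PATH for the real usage line; the helper name `check_retained` and the attempt and interval values are hypothetical, and the total polling window would have to be longer than whatever GC delay and interval the registry is configured with.

```shell
#!/bin/sh
set -eu

# check_retained ATTEMPTS INTERVAL CMD...  (hypothetical helper)
# Runs CMD up to ATTEMPTS times, sleeping INTERVAL seconds between runs.
# Returns 0 if CMD succeeded every time, 1 as soon as it fails once.
check_retained() {
    cr_attempts=$1
    cr_interval=$2
    shift 2
    cr_i=0
    while [ "$cr_i" -lt "$cr_attempts" ]; do
        if ! "$@" > /dev/null 2>&1; then
            echo "probe failed on attempt $((cr_i + 1))" >&2
            return 1
        fi
        cr_i=$((cr_i + 1))
        sleep "$cr_interval"
    done
    return 0
}

# Real usage (commented out because it needs regctl and a running registry;
# PORT and DIGEST as in the reproduction scripts in this thread):
# check_retained 30 60 regctl manifest head "localhost:$PORT/repo@$DIGEST"
```

A successful run only shows that the manifest survived for the polled window; it does not prove that no later cleanup can remove it.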

@rchincha
Contributor

@Sebastian-Maier can you share your zot config file (anonymized of course), so we may try to understand and reproduce this locally.

@Sebastian-Maier
Author

Sebastian-Maier commented Aug 23, 2024

We currently do not use a custom ZOT config file. But below you'll find two scripts that start ZOT (in the same way we do) and in addition also implement the two approaches described above.

delete_nested_tag.sh implements the original steps and reproduces the problem.
avoid_nested_tag.sh avoids tags for the nested index entirely and does not seem to be affected by the problem (at least initially?)

delete_nested_tag.sh

#!/usr/bin/env bash

set -eu

CONT_NAME="zot-nested-index-test"
PORT=60000
TMP_DIR=${PWD}/tmp/${CONT_NAME}/registry-storage

perform_cleanup(){
    docker rm \
        -f \
        "${CONT_NAME}" &> /dev/null

    rm -rf $TMP_DIR
}

trap perform_cleanup EXIT

mkdir -p $TMP_DIR

docker run \
    -d \
    -p ${PORT}:5000 \
    --name "${CONT_NAME}" \
    -v ${TMP_DIR}:/var/lib/registry \
    ghcr.io/project-zot/zot-linux-arm64:latest # adjust architecture according to your needs

SOME_IMAGE_INDEX=ghcr.io/regclient/regctl:latest

regctl image copy $SOME_IMAGE_INDEX localhost:$PORT/repo:tagN

DIGEST=$(regctl manifest head localhost:$PORT/repo:tagN)
echo "Digest (nested index): $DIGEST"

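INFO=$(regctl manifest get --format raw-body "localhost:$PORT/repo:tagN")

echo "Nested Index:"
echo "$INFO" | jq .

# Byte length of the manifest body: printf avoids the newline that echo adds,
# and quoting keeps the whitespace of the body intact.
SIZE=$(printf '%s' "$INFO" | wc -c | xargs echo) # xargs echo to trim whitespace

TYPE=$(echo "$INFO" | jq --raw-output .mediaType)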

INDEX="{
    \"schemaVersion\":2,
    \"mediaType\":\"application/vnd.oci.image.index.v1+json\",
    \"manifests\":[{
        \"mediaType\":\"$TYPE\",
        \"size\":$SIZE,
        \"digest\":\"$DIGEST\"
    }]
}"

echo "Root Index:"
echo $INDEX | jq .

echo $INDEX | regctl manifest put localhost:$PORT/repo:tagR

regctl tag delete localhost:$PORT/repo:tagN

# sleep 10

regctl manifest get localhost:$PORT/repo@$DIGEST # this fails unexpectedly

avoid_nested_tag.sh

#!/usr/bin/env bash

set -eu

CONT_NAME="zot-nested-index-test"
PORT=60000
TMP_DIR=${PWD}/tmp/${CONT_NAME}/registry-storage

perform_cleanup(){
    docker rm \
        -f \
        "${CONT_NAME}" &> /dev/null

    rm -rf $TMP_DIR
}

trap perform_cleanup EXIT

mkdir -p $TMP_DIR

docker run \
    -d \
    -p ${PORT}:5000 \
    --name "${CONT_NAME}" \
    -v ${TMP_DIR}:/var/lib/registry \
    ghcr.io/project-zot/zot-linux-arm64:latest # adjust architecture according to your needs

SOME_IMAGE_INDEX=ghcr.io/regclient/regctl:latest

DIGEST=$(regctl manifest head $SOME_IMAGE_INDEX)
echo "Digest (nested index): $DIGEST"

regctl image copy $SOME_IMAGE_INDEX localhost:$PORT/repo@$DIGEST

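INFO=$(regctl manifest get --format raw-body "localhost:$PORT/repo@$DIGEST")

echo "Nested Index:"
echo "$INFO" | jq .

# Byte length of the manifest body: printf avoids the newline that echo adds,
# and quoting keeps the whitespace of the body intact.
SIZE=$(printf '%s' "$INFO" | wc -c | xargs echo) # xargs echo to trim whitespace

TYPE=$(echo "$INFO" | jq --raw-output .mediaType)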

INDEX="{
    \"schemaVersion\":2,
    \"mediaType\":\"application/vnd.oci.image.index.v1+json\",
    \"manifests\":[{
        \"mediaType\":\"$TYPE\",
        \"size\":$SIZE,
        \"digest\":\"$DIGEST\"
    }]
}"

echo "Root Index:"
echo $INDEX | jq .

echo $INDEX | regctl manifest put localhost:$PORT/repo:tagR

# sleep 10

regctl manifest get localhost:$PORT/repo@$DIGEST # this succeeds (at least initially ?)
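
Both scripts assemble the descriptor for the nested index by hand, and its size field is meant to be the exact byte length of the referenced manifest. A minimal sketch of helpers for that step follows; the names `byte_size` and `descriptor_json`, the placeholder body, and the placeholder digest are assumptions for illustration only.

```shell
#!/bin/sh
set -eu

# byte_size STRING: print the number of bytes in STRING (hypothetical helper).
# printf '%s' adds no trailing newline, and the quoted argument is not
# word-split, so runs of whitespace inside the body are counted as-is.
# Caveat: a body captured with $(...) has already lost any trailing newlines.
byte_size() {
    printf '%s' "$1" | wc -c | tr -d '[:space:]'
}

# descriptor_json MEDIA_TYPE DIGEST BODY: print an OCI descriptor for BODY.
# The arguments are inserted verbatim, so they must not contain double quotes.
descriptor_json() {
    printf '{"mediaType":"%s","size":%s,"digest":"%s"}\n' \
        "$1" "$(byte_size "$3")" "$2"
}

# Example with a tiny placeholder body and digest (not a real manifest).
body='{"schemaVersion":2}'
descriptor_json "application/vnd.oci.image.index.v1+json" "sha256:placeholder" "$body"
```

In the scripts, the equivalent call would pass "$TYPE", "$DIGEST" and "$INFO", and the result would go into the manifests array of the root index.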

@andaaron
Contributor

andaaron commented Aug 23, 2024

It's OK, I managed to reproduce yesterday, we are discussing how to best fix this.

@rchincha
Contributor

@Sebastian-Maier

If you are ok with it, can you pls try this patch and verify that it indeed solves your particular use case.
#2626

@Sebastian-Maier
Author

Thanks for all your efforts to fix this bug.
I just tried the patch with the slightly adjusted scripts I provided earlier. Both variants (delete_nested_tag.sh and avoid_nested_tag.sh) seem to work now. However, I could obviously not verify whether the nested index might still eventually be deleted by GC or another mechanism.

@rchincha
Contributor

rchincha commented Oct 3, 2024

PR #2626 is now merged. Closing this issue.

@rchincha rchincha closed this as completed Oct 3, 2024
@andaaron
Contributor

More tests for GC, covering indexes referencing other indexes: #2716

@andaaron
Contributor

This one also solved some issues with the UI and CVE scan breaking for nested indexes: #2732
