Allow deleting a subvolume with the retain-snapshot feature #233
Conversation
cmd/commands/subvolume.go
Outdated
```go
	},
}

func init() {
	SubvolumeCmd.AddCommand(listCmd)
	SubvolumeCmd.PersistentFlags().Bool("stale", false, "List only stale subvolumes")
	SubvolumeCmd.AddCommand(deleteCmd)
	SubvolumeCmd.PersistentFlags().Bool("retain-snapshots", false, "Delete subvolume but retain the snapshot")
}
```
This should be the default behaviour; we don't need a new flag from the user, because in the Ceph CLI you can never delete a subvolume that has snapshots without the retain-snapshots flag.
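A minimal sketch of this suggestion, assuming the plugin builds a Ceph CLI invocation; `buildRmArgs` is a hypothetical helper and the exact wiring is illustrative, but `ceph fs subvolume rm` does accept `--retain-snapshots`:

```go
// buildRmArgs sketches the idea above: always pass --retain-snapshots to
// `ceph fs subvolume rm`, so no user-facing flag is needed. Hypothetical
// helper; the real command construction lives elsewhere in the plugin.
func buildRmArgs(fs, subvol, group string) []string {
	return []string{"fs", "subvolume", "rm", fs, subvol, group, "--retain-snapshots"}
}
```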
The idea was to list the subvolumes along with their status (whether they have snapshots or not), so the user can pass this flag and delete the stale subvolumes even if they have snapshots.
IMHO we don't need any flag; this should be the default behaviour.
pkg/filesystem/subvolume.go
Outdated
```diff
@@ -53,7 +53,9 @@ func getK8sRefSubvolume(ctx context.Context, clientsets *k8sutil.Clientsets) map
 	subvolumeNames := make(map[string]subVolumeInfo)
 	for _, pv := range pvList.Items {
 		if pv.Spec.CSI != nil && pv.Spec.CSI.VolumeAttributes["subvolumeName"] != "" {
 			subvolumeNames[pv.Spec.CSI.VolumeAttributes["subvolumeName"]] = subVolumeInfo{}
 			if pv.Spec.CSI.VolumeAttributes["subvolumeName"] != "" {
```
In which case can the subvolumeName be empty?
@yati1998 this is still not answered
It was added as a check for when there are no CSI subvolumes, but with your last PR to fix this issue I came to understand the correct check. If there is a CSI volume, I don't think the subvolumeName will be empty. I will remove the check; it's not required.
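A minimal sketch of the check as resolved here, assuming `pvList` is a `*corev1.PersistentVolumeList` fetched elsewhere; the single combined condition makes a nested emptiness check redundant:

```go
import corev1 "k8s.io/api/core/v1"

// csiSubvolumeNames collects the subvolume names referenced by CSI PVs.
// The combined nil-and-non-empty condition is sufficient on its own.
func csiSubvolumeNames(pvList *corev1.PersistentVolumeList) map[string]struct{} {
	names := make(map[string]struct{})
	for _, pv := range pvList.Items {
		if pv.Spec.CSI != nil && pv.Spec.CSI.VolumeAttributes["subvolumeName"] != "" {
			names[pv.Spec.CSI.VolumeAttributes["subvolumeName"]] = struct{}{}
		}
	}
	return names
}
```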
Please rebase to pick up the recent CI fix.
@Mergifyio rebase
Please note the CI failures
> Please note the CI failures

```
Run set -ex
+ set -ex
+ kubectl rook-ceph rbd ls replicapool
Warning: rook version 'rook: v1.13.0-alpha.0.220.gf5fcf96e0' is running a pre-release version of Rook.
```

It says failure while listing the RBD pools, which doesn't seem related to my changes. Any idea what the issue is?
Looks like no OSDs were created; it seems related to other CI issues we are seeing, tracked in #234. You might try a force push to see if a fresh run succeeds.
The PR looks good; we should have a separate fix for the CI today, see discussion in rook/rook#13684
Force-pushed from c4194a5 to 9b47647
cmd/commands/subvolume.go
Outdated
Use: "delete", | ||
Short: "Deletes a stale subvolume.", | ||
Args: cobra.MinimumNArgs(2), | ||
Example: "kubectl rook-ceph delete <filesystem> <subvolume> <subvolumegroup>", |
Let's use square brackets around optional args:
Example: "kubectl rook-ceph delete <filesystem> <subvolume> <subvolumegroup>", | |
Example: "kubectl rook-ceph delete <filesystem> <subvolume> [subvolumegroup]", |
Small nits, LGTM.
```bash
kubectl rook-ceph subvolume delete csi-vol-427774b4-340b-11ed-8d66-0242ac110004,csi-vol-427774b4-340b-11ed-8d66-0242ac110005 ocs-storagecluster csi
```
Add a line to mention that you are deleting without a subvolumegroup, or else no one will notice it and we will have two delete commands in the doc.
pkg/filesystem/subvolume.go
Outdated
```go
var svg string
if len(args) == 2 {
	svg = "csi"
} else {
	svg = args[2]
}
```
```diff
-var svg string
-if len(args) == 2 {
-	svg = "csi"
-} else {
-	svg = args[2]
-}
+svg := "csi"
+if len(args) > 2 {
+	svg = args[2]
+}
```
pkg/filesystem/subvolume.go
Outdated
```go
	}

	check := isStaleSubvolume(subvol, k8sSubvolume)

```
Remove the extra blank line.
```diff
 }
-subvolumeNames[sv.Name] = subVolumeInfo{fs.Name, svg.Name}
+subvolumeNames[sv.Name] = subVolumeInfo{fs.Name, svg.Name, state}
+fmt.Println(fs.Name, sv.Name, svg.Name, status)
```
Println is not formatted output; with a large number of subvolumes it is difficult to read. Can you please open an issue to print formatted output, like kubectl does?
#211, it is already tracked there; I will check how we can have a proper format for it.
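A minimal sketch of one way to get kubectl-style aligned columns, using the standard-library `text/tabwriter`; the row data here is made up for illustration, and this is not the plugin's actual output code:

```go
package main

import (
	"fmt"
	"os"
	"text/tabwriter"
)

func main() {
	// Example rows; in the plugin these would come from the subvolume listing.
	rows := [][]string{
		{"myfs", "csi-vol-427774b4-340b-11ed-8d66-0242ac110004", "csi", "stale"},
	}
	w := tabwriter.NewWriter(os.Stdout, 0, 8, 2, ' ', 0)
	fmt.Fprintln(w, "FILESYSTEM\tSUBVOLUME\tSUBVOLUMEGROUP\tSTATE")
	for _, r := range rows {
		fmt.Fprintf(w, "%s\t%s\t%s\t%s\n", r[0], r[1], r[2], r[3])
	}
	w.Flush()
}
```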
pkg/filesystem/subvolume.go
Outdated
InUse = "in-use" | ||
Stale = "stale" | ||
StaleWithSnapshot = "stale-with-snapshot" |
These don't need to be exported variables, as they are not used anywhere outside this file?
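A minimal sketch of the unexported form being suggested, assuming the states are only consumed inside `pkg/filesystem`:

```go
// Lower-case identifiers keep these states package-private.
const (
	inUse             = "in-use"
	stale             = "stale"
	staleWithSnapshot = "stale-with-snapshot"
)
```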
Force-pushed from 66cb8cb to cad9411
@travisn @subhamkrai @Madhu-1 I have addressed all the comments; please review once you are free.
Force-pushed from cad9411 to 19a7dd3
This commit includes the retain-snapshot flag to delete the subvolume.
Signed-off-by: yati1998 <[email protected]>
Force-pushed from 19a7dd3 to a0359c0
```diff
@@ -166,36 +205,13 @@ func unMarshaljson(list string) []fsStruct {
 	return unmarshal
 }
 
-func Delete(ctx context.Context, clientsets *k8sutil.Clientsets, OperatorNamespace, CephClusterNamespace, subList, fs, svg string) {
-	subvollist := strings.Split(subList, ",")
+func Delete(ctx context.Context, clientsets *k8sutil.Clientsets, OperatorNamespace, CephClusterNamespace, fs, subvol, svg string) {
 	k8sSubvolume := getK8sRefSubvolume(ctx, clientsets)
```
This will query all of the PVs in the cluster, just to confirm if it exists. Instead, can we skip this query, and just attempt to delete the subvolume? If it doesn't exist, hopefully an error will already be returned that it doesn't exist, which is anyway what we will print as well.
It's not about the subvolume being present; this check was added to make sure we delete only stale subvolumes. We list all the subvolumes with the `ls` command, but the user may still enter an incorrect subvolume name, which could lead to deleting a non-stale subvolume. Hence it's good to check before deleting.
OK, I see now that it is checking to confirm the stale volume. But the perf implication is still concerning. In a test cluster with a handful of PVs it is fine, but what if there are 100, 1K, or 10K PVs in the cluster? How long will this take? Will these queries even fail if there are too many PVs? I'll open an issue to test it.
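A minimal sketch of one possible mitigation, assuming client-go is used for the PV query; `metav1.ListOptions` supports `Limit`/`Continue` for chunked listing, so a large cluster is walked in pages rather than in one huge call (illustrative, not the plugin's current code):

```go
import (
	"context"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
)

// listPVsChunked walks all PersistentVolumes 500 at a time instead of
// fetching them in a single, potentially huge, List call.
func listPVsChunked(ctx context.Context, cs kubernetes.Interface) error {
	opts := metav1.ListOptions{Limit: 500}
	for {
		pvs, err := cs.CoreV1().PersistentVolumes().List(ctx, opts)
		if err != nil {
			return err
		}
		for _, pv := range pvs.Items {
			_ = pv // collect subvolume names here
		}
		if pvs.Continue == "" {
			return nil
		}
		opts.Continue = pvs.Continue
	}
}
```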
This PR introduces the following enhancement to the stale subvolume cleanup process: a stale subvolume can now be deleted even if it has snapshots, with the snapshots retained during deletion.
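A minimal sketch of how the reworked arguments might be wired into `filesystem.Delete`, based on the signature in the diff above; `runDelete` is a hypothetical handler, and the "csi" default follows the review suggestion accepted earlier:

```go
// runDelete sketches the wiring: args[0] = filesystem, args[1] = subvolume,
// optional args[2] = subvolumegroup (defaults to "csi").
func runDelete(ctx context.Context, clientsets *k8sutil.Clientsets, operatorNS, clusterNS string, args []string) {
	svg := "csi"
	if len(args) > 2 {
		svg = args[2]
	}
	filesystem.Delete(ctx, clientsets, operatorNS, clusterNS, args[0], args[1], svg)
}
```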