Provide better error descriptions for CSI-S3 access and connection failures #249

Open
srikumar003 opened this issue Apr 28, 2023 · 6 comments
Labels: enhancement, good first issue, help wanted, s3

@srikumar003
Collaborator

Issue:
Currently, CSI-S3 emits the same error string in the logs most of the time: `failed to initialize S3 client: Endpoint: does not follow ip address or domain name standards`. However, the root cause of this error can differ from case to case (e.g. #160, #90), and in each case the Dataset is left pending.

Requirements:
Provide an error log message corresponding to the root cause and use it to populate the status field of the Dataset.
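
As a rough illustration of the requirement above, the following Go sketch checks an endpoint string before any S3 client is created and returns a different error for each root cause. It is not the CSI-S3 implementation: the sentinel errors, `diagnoseEndpoint`, and `statusMessage` are hypothetical names, and how the resulting string would actually be written to the Dataset status is left open.

```go
package main

import (
	"errors"
	"fmt"
	"net/url"
	"strings"
)

// Hypothetical sentinel errors, one per root cause.
// These names are illustrative and are not part of CSI-S3.
var (
	ErrEndpointEmpty     = errors.New("endpoint is empty")
	ErrEndpointNoScheme  = errors.New("endpoint has no http:// or https:// scheme")
	ErrEndpointBadScheme = errors.New("endpoint scheme is not http or https")
	ErrEndpointNoHost    = errors.New("endpoint has no host")
)

// diagnoseEndpoint returns nil for a usable endpoint, or an error that
// names the specific problem instead of one generic message.
func diagnoseEndpoint(raw string) error {
	trimmed := strings.TrimSpace(raw)
	if trimmed == "" {
		return ErrEndpointEmpty
	}
	if !strings.Contains(trimmed, "://") {
		return fmt.Errorf("%w: %q", ErrEndpointNoScheme, raw)
	}
	u, err := url.Parse(trimmed)
	if err != nil {
		return fmt.Errorf("endpoint %q cannot be parsed: %w", raw, err)
	}
	if u.Scheme != "http" && u.Scheme != "https" {
		return fmt.Errorf("%w: %q", ErrEndpointBadScheme, u.Scheme)
	}
	if u.Hostname() == "" {
		return fmt.Errorf("%w: %q", ErrEndpointNoHost, raw)
	}
	return nil
}

// statusMessage turns the diagnosis into a short string that could be
// surfaced to the user (for example in a status field).
func statusMessage(raw string) string {
	if err := diagnoseEndpoint(raw); err != nil {
		return "invalid endpoint: " + err.Error()
	}
	return "endpoint OK"
}

func main() {
	endpoints := []string{"", "host:9000", "ftp://host", "http://", "http://rgw-slow-dev01.ti.local"}
	for _, ep := range endpoints {
		fmt.Printf("%q -> %s\n", ep, statusMessage(ep))
	}
}
```

Because the errors wrap sentinels, a caller could use `errors.Is` to decide which condition to report without parsing the message text.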

@ashutosh887

I would like to take this @srikumar003

/assign

@starpit

starpit commented Jun 16, 2023

I am seeing the invalid endpoint errors when running in Kind. I have tried a bunch of variants for the endpoint field of my Dataset, and none of them work: for example, "codeflare-s3.codeflare-system.svc.cluster.local:9000", with and without the leading http://, with and without the svc... part, and so on.

How does one go about diagnosing, debugging, and fixing these kinds of issues?
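
One way to narrow this down is to look at what a URL parser makes of each variant. The sketch below assumes the driver parses the endpoint with Go's `net/url` and hands only the host part to the S3 client; that is an assumption and is not confirmed in this thread. Under it, a `host:port` string without a scheme parses without error but yields an empty host, because the text before the colon is read as the scheme. `hostFromEndpoint` is a hypothetical helper.

```go
package main

import (
	"fmt"
	"net/url"
)

// hostFromEndpoint returns the host[:port] that net/url extracts from raw.
// Hypothetical helper for local debugging; not part of CSI-S3.
func hostFromEndpoint(raw string) (string, error) {
	u, err := url.Parse(raw)
	if err != nil {
		return "", err
	}
	return u.Host, nil
}

func main() {
	variants := []string{
		"codeflare-s3.codeflare-system.svc.cluster.local:9000",
		"http://codeflare-s3.codeflare-system.svc.cluster.local:9000",
		"codeflare-s3:9000",
		"http://codeflare-s3:9000",
	}
	for _, v := range variants {
		host, err := hostFromEndpoint(v)
		fmt.Printf("endpoint=%q host=%q err=%v\n", v, host, err)
	}
}
```

If the host comes back empty for the value that actually reaches the driver, the endpoint string itself is the first thing to check, before looking at network connectivity.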

@srikumar003
Collaborator Author

@ashutosh887 apologies for not responding and thanks for taking on this issue!

@ashutosh887

Thanks @srikumar003
Is there a public channel for discussion in case I get stuck somewhere?

@zentavr

zentavr commented Dec 18, 2023

I have the same issue and cannot understand what's wrong:

I1218 00:06:07.573611       1 controller.go:1332] provision "mattermost/mm-db-dump" class "csi-s3": started
I1218 00:06:07.573990       1 event.go:282] Event(v1.ObjectReference{Kind:"PersistentVolumeClaim", Namespace:"mattermost", Name:"mm-db-dump", UID:"66e7a933-4e0d-4389-b1af-f187372ac075", APIVersion:"v1", ResourceVersion:"178444495", FieldPath:""}): type: 'Normal' reason: 'Provisioning' External provisioner is provisioning volume for claim "mattermost/mm-db-dump"
I1218 00:06:07.581240       1 connection.go:182] GRPC call: /csi.v1.Controller/CreateVolume
I1218 00:06:07.581273       1 connection.go:183] GRPC request: {"capacity_range":{"required_bytes":5368709120},"name":"pvc-66e7a933-4e0d-4389-b1af-f187372ac075","parameters":{"mounter":"goofys"},"secrets":"***stripped***","volume_capabilities":[{"AccessType":{"Mount":{}},"access_mode":{"mode":5}}]}
I1218 00:06:07.583155       1 connection.go:185] GRPC response: {}
I1218 00:06:07.583264       1 connection.go:186] GRPC error: rpc error: code = Unknown desc = failed to initialize S3 client: Endpoint:  does not follow ip address or domain name standards.
I1218 00:06:07.583350       1 controller.go:753] CreateVolume failed, supports topology = false, node selected false => may reschedule = false => state = Finished: rpc error: code = Unknown desc = failed to initialize S3 client: Endpoint:  does not follow ip address or domain name standards.
I1218 00:06:07.583480       1 controller.go:1099] Final error received, removing PVC 66e7a933-4e0d-4389-b1af-f187372ac075 from claims in progress
W1218 00:06:07.583524       1 controller.go:958] Retrying syncing claim "66e7a933-4e0d-4389-b1af-f187372ac075", failure 7
E1218 00:06:07.583574       1 controller.go:981] error syncing claim "66e7a933-4e0d-4389-b1af-f187372ac075": failed to provision volume with StorageClass "csi-s3": rpc error: code = Unknown desc = failed to initialize S3 client: Endpoint:  does not follow ip address or domain name standards.
I1218 00:06:07.583635       1 event.go:282] Event(v1.ObjectReference{Kind:"PersistentVolumeClaim", Namespace:"mattermost", Name:"mm-db-dump", UID:"66e7a933-4e0d-4389-b1af-f187372ac075", APIVersion:"v1", ResourceVersion:"178444495", FieldPath:""}): type: 'Warning' reason: 'ProvisioningFailed' failed to provision volume with StorageClass "csi-s3": rpc error: code = Unknown desc = failed to initialize S3 client: Endpoint:  does not follow ip address or domain name standards.

The Dataset custom resource is as follows:

    apiVersion: com.ie.ibm.hpsys/v1alpha1
    kind: Dataset
    metadata:
      name: mm-db-dump
    spec:
      local:
        type: "COS" # Cloud Object Storage
        secret-name: "s3-dump-secret"
        #secret-namespace: "{SECRET_NAMESPACE}" #optional if the secret is in the same ns as dataset
        endpoint: "http://rgw-slow-dev01.ti.local"
        bucket: "mm-db-dump"
        readonly: "false" # default is false
        provision: "false" # DLF will allocate bucket on the COS if it doesn't exist [Default: false]
        #region: "" #it can be empty

@zentavr

zentavr commented Dec 18, 2023

The issue was the one described in #90 (comment).
I had edited the endpoint, and that change was not caught.
