Not working with python3 + Nautilus #6

Open
cervigni opened this issue May 18, 2020 · 12 comments

Comments

@cervigni

Hello,
Is this still a supported tool? I cannot make it work with python3. After exporting credentials, it always hangs without doing anything.
Is there another tool for making requests to the elasticsearch tier to search ceph metadata, like ?mdsearch?

Many thanks

@yehudasa
Owner

Can you check this branch?
https://github.com/yehudasa/obo/tree/wip-python3

@cervigni
Author

cervigni commented May 18, 2020 via email

@yehudasa
Owner

What operations hang?

You can try using s3curl to generate these requests.

@cervigni
Author

I don't think s3curl is an option, as it does not support V4 as far as I know.
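For context, "V4" here is AWS Signature Version 4, where each request is signed with a key derived through a chain of HMAC-SHA256 steps. Below is a minimal sketch of that key derivation only, using the Python standard library. The function names, the example region, and the placeholder secret are illustrative assumptions and are not taken from obo or from any script in this thread; building the canonical request and string-to-sign is left out.

```python
import hashlib
import hmac


def _hmac_sha256(key: bytes, msg: str) -> bytes:
    """One HMAC-SHA256 step of the derivation chain."""
    return hmac.new(key, msg.encode("utf-8"), hashlib.sha256).digest()


def derive_signing_key(secret_key: str, date_stamp: str, region: str,
                       service: str = "s3") -> bytes:
    """Derive the SigV4 signing key.

    date_stamp is the request date as YYYYMMDD (e.g. "20200518").
    """
    k_date = _hmac_sha256(("AWS4" + secret_key).encode("utf-8"), date_stamp)
    k_region = _hmac_sha256(k_date, region)
    k_service = _hmac_sha256(k_region, service)
    return _hmac_sha256(k_service, "aws4_request")


def sign(signing_key: bytes, string_to_sign: str) -> str:
    """Hex signature that goes into the Authorization header."""
    return hmac.new(signing_key, string_to_sign.encode("utf-8"),
                    hashlib.sha256).hexdigest()


if __name__ == "__main__":
    # Placeholder credentials and region, for illustration only.
    key = derive_signing_key("EXAMPLE-SECRET", "20200518", "us-east-1")
    print(sign(key, "AWS4-HMAC-SHA256\n...string to sign goes here..."))
```

The derived key depends on the date, region, and service, so a mismatch in any of them produces a signature the server rejects.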

obo does not do anything. I also tried to enable pdb, but I am not a programmer, so that is probably too much for me. It times out with: Had an issue connecting: Remote end closed connection without response

Do you use this in production environments? Could you send me some examples of how to use obo?
We also wrote a request script by hand that does the AWS HMAC signing, but what it returns from an ES search is always:

./s3-rest.py -c config/xxx-metadata.json -t "query=name==file-020" -b test
Elapsed time: 0.16282385099475505 (s)
Response status: 200
Response headers: {'x-amz-request-id': 'tx000000000000000000010-005ec29c17-76b6eb35-objectstorage-metadata', 'content-type': 'application/xml', 'content-length': '98', 'date': 'Mon, 18 May 2020 14:30:48 GMT'}
Response body: <SearchMetadataResponse><Marker></Marker><IsTruncated>false</IsTruncated></SearchMetadataResponse>
SearchMetadataResponse: None
Marker: None
IsTruncated: false
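A small sketch of how that response body can be read with Python's standard `xml.etree.ElementTree`. It assumes the hand-written script prints each element's `.text`, which would explain the `None` values: an element that holds only child elements, or nothing at all, has no text. The `summarize` helper is hypothetical and only illustrates that the body contains no result entries besides `Marker` and `IsTruncated`.

```python
import xml.etree.ElementTree as ET

# Response body copied from the output above.
BODY = (
    "<SearchMetadataResponse><Marker></Marker>"
    "<IsTruncated>false</IsTruncated></SearchMetadataResponse>"
)


def summarize(xml_text: str) -> dict:
    """Return the pieces of a SearchMetadataResponse relevant to debugging."""
    root = ET.fromstring(xml_text)
    return {
        # None: the root holds only child elements, no text of its own.
        "root_text": root.text,
        # None: <Marker></Marker> is empty.
        "marker_text": root.find("Marker").text,
        "is_truncated": root.find("IsTruncated").text,
        # Any child other than the two bookkeeping elements would be a result.
        "result_children": [
            child.tag for child in root
            if child.tag not in ("Marker", "IsTruncated")
        ],
    }


if __name__ == "__main__":
    print(summarize(BODY))
```

An empty `result_children` list with status 200 means the request itself was accepted and the search simply matched nothing.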
The file is definitely there, as I can see it with mc:

mc stat s3-xxx-prod/test/file-020
Name      : file-020
Date      : 2020-05-18 22:15:20 AWST
Size      : 4.0 KiB
ETag      : 993ad389d5aecb23d78fdc2fd6fa88c2
Type      : file
Metadata  :
  Content-Type : application/octet-stream
  X-Rgw-Object-Type: Normal

Specifying the bucket gives the same result.
Trying to specify custom metadata (with explicit_custom_meta=false):

mc stat s3-nimbus-prod/test2/file-001
Name      : file-001
Date      : 2020-05-18 22:31:26 AWST
Size      : 4.0 KiB
ETag      : 95cbff1fbd2e883216c46d7a7792e4d2
Type      : file
Metadata  :
  Content-Type : application/octet-stream
  X-Rgw-Object-Type: Normal
  X-Amz-Meta-Key1 : value1
  X-Amz-Meta-Luca : thebest

./s3-rest.py -c config/xxx-metadata.json -t "query=X-Amz-Meta-Luca==thebest" -b test2
Elapsed time: 0.12439538100443315 (s)
Response status: 400
Response headers: {'content-length': '326', 'x-amz-request-id': 'tx000000000000000000025-005ec29cae-76b6eb35-objectstorage-metadata', 'accept-ranges': 'bytes', 'content-type': 'application/xml', 'date': 'Mon, 18 May 2020 14:33:18 GMT'}
Response body: <?xml version="1.0" encoding="UTF-8"?><Error><Code>InvalidArgument</Code><Message>unexpected generic field &apos;X-Amz-Meta-Luca&apos;</Message><BucketName>test2</BucketName><RequestId>tx000000000000000000025-005ec29cae-76b6eb35-objectstorage-metadata</RequestId><HostId>76b6eb35-objectstorage-metadata-pawsey</HostId></Error>
Error: None
Code: InvalidArgument
Message: unexpected generic field 'X-Amz-Meta-Luca'
BucketName: test2
RequestId: tx000000000000000000025-005ec29cae-76b6eb35-objectstorage-metadata
HostId: 76b6eb35-objectstorage-metadata-xxx

If you could point me in the right direction, it would be highly appreciated.

Thanks
Cheers, Luca

@cervigni
Author

cervigni commented May 19, 2020

Did a bit of messing around yesterday. I made OBO work with SSL disabled on the endpoints and using the IP only (the hostname does not work). The reply shows the same problem though, similar to the "handmade" request I did previously.

[root@obo3 bin]# ./obo mdsearch obo --query='name==file100'
{
  "Marker": "",
  "IsTruncated": "false",
  "Objects": []
}
[]

If I query ES manually for the data, it returns:

curl -X GET "localhost:9200/rgw-xxx-e763c3c3/_search?q=name:file100&pretty"
{
  "took" : 3,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : 1,
    "max_score" : 3.921734,
    "hits" : [
      {
        "_index" : "rgw-xxx-e763c3c3",
        "_type" : "object",
        "_id" : "2f453f51-9134-4e62-893f-88275f85b3fb.1991699160.1%3Afile100%3Anull",
        "_score" : 3.921734,
        "_source" : {
          "bucket" : "obo",
          "name" : "file100",
          "instance" : "null",
          "versioned_epoch" : 0,
          "owner" : {
            "id" : "xxx$xxx",
            "display_name" : "xxx-project"
          },
          "permissions" : [
            "xxx$xxx"
          ],
          "meta" : {
            "size" : 1048576,
            "mtime" : "2020-05-19T02:49:08.377Z",
            "content_type" : "application/octet-stream",
            "etag" : "b6d81b360a5672d80c27430f39153e2c",
            "tail_tag" : "2f453f51-9134-4e62-893f-88275f85b3fb.1991345516.2301949",
            "x-amz-content-sha256" : "UNSIGNED-PAYLOAD",
            "x-amz-date" : "20200519T024905Z"
          }
        }
      }
    ]
  }
}

The payload that RGW sends though (via OBO) is:
{
  "query": {
    "bool": {
      "must": [
        { "term": { "bucket": "obo" } },
        {
          "bool": {
            "must": [
              { "term": { "permissions": "xxx$xxx" } },
              { "term": { "name": "file100" } }
            ]
          }
        }
      ]
    }
  }
}
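To reproduce the mismatch outside RGW, the same payload can be replayed directly against the `_search` endpoint. The sketch below only builds the request with the standard library; the URL and index name are the ones from the curl call above, and the helper names are hypothetical. `count_hits` accepts both an integer `hits.total`, as in the response shown above, and the object form used by newer Elasticsearch versions.

```python
import json
import urllib.request

ES_URL = "http://localhost:9200"   # from the curl example above
INDEX = "rgw-xxx-e763c3c3"         # redacted index name from the thread

# The payload RGW sends, as captured above.
RGW_PAYLOAD = {
    "query": {"bool": {"must": [
        {"term": {"bucket": "obo"}},
        {"bool": {"must": [
            {"term": {"permissions": "xxx$xxx"}},
            {"term": {"name": "file100"}},
        ]}},
    ]}}
}


def build_search_request(payload: dict, es_url: str = ES_URL,
                         index: str = INDEX) -> urllib.request.Request:
    """Build (but do not send) a _search request carrying the payload."""
    return urllib.request.Request(
        url=f"{es_url}/{index}/_search",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def count_hits(response_body: str) -> int:
    """Extract the hit count from a _search response body."""
    total = json.loads(response_body)["hits"]["total"]
    return total["value"] if isinstance(total, dict) else total


if __name__ == "__main__":
    req = build_search_request(RGW_PAYLOAD)
    # Sending requires a reachable cluster:
    # with urllib.request.urlopen(req) as resp:
    #     print(count_hits(resp.read().decode("utf-8")))
    print(req.get_method(), req.full_url)
```

Removing one clause at a time from the payload and re-running it is a quick way to find which clause drops the hit count to zero.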

You might wanna pipe everything into jq

@cervigni
Author

OK, another update, after going crazy trying to understand what is going on.
The problem is the permission-matching query in the RGW ES code, which uses TERM. It should either use MATCH instead of TERM for querying the permissions, or use permissions.keyword so that the comparison is done against a single string. I don't think this is a problem with ES, and I don't think there is any way of fixing it via ES. The RGW ES code needs to change.

Do you want me to open a bug? Or do you think there is any way to "tune ES" to make it work until the patch is released?

Thanks

@cervigni
Author

https://tracker.ceph.com/issues/45607

I opened the bug. I am not a C programmer, but the change should be super easy to do. If there is a chance you could provide another idea, it would be incredibly welcome.

Thanks

@yehudasa
Owner

@cervigni what version of elasticsearch are you using?

@cervigni
Author

cervigni commented May 19, 2020 via email

@yehudasa
Owner

@cervigni did you reinitialize everything (e.g., by creating a new zone for the sync) after switching the elasticsearch versions? There could be a mismatch there.

@cervigni
Author

cervigni commented May 19, 2020 via email

@cervigni
Author

I did what you asked: I wiped the zone with radosgw-admin zone rm and re-created it from scratch. Same problem. I don't see how that is related. This is a problem with how the rgw ES tier zone builds the payload. A TERM query will never match something like "blablabla$testestest", because the analyzer will split it in half. Also, because permissions is an array of strings, changing the analyzer is a mess. For the moment I cannot see any way around this other than a change to the rgw ES code.
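The reasoning above can be illustrated with a toy inverted index. This is a deliberately simplified model and not Elasticsearch itself: `toy_analyze` assumes the analyzer lowercases and splits on every non-alphanumeric character (which is what the thread reports for `$`), and all function names are made up for the example.

```python
import re


def toy_analyze(text: str) -> list:
    """Simplified stand-in for a text analyzer: lowercase, split on symbols."""
    return [tok for tok in re.split(r"[^0-9a-z]+", text.lower()) if tok]


def build_analyzed_index(docs: dict) -> dict:
    """Index every token of every value, like an analyzed text field."""
    index = {}
    for doc_id, values in docs.items():
        for value in values:
            for token in toy_analyze(value):
                index.setdefault(token, set()).add(doc_id)
    return index


def build_keyword_index(docs: dict) -> dict:
    """Index each value as one unsplit string, like a keyword field."""
    index = {}
    for doc_id, values in docs.items():
        for value in values:
            index.setdefault(value, set()).add(doc_id)
    return index


def term_query(index: dict, value: str) -> set:
    """Exact lookup: the query string is NOT analyzed."""
    return index.get(value, set())


def match_query(index: dict, value: str) -> set:
    """The query string is analyzed first; any matching token counts (OR)."""
    result = set()
    for token in toy_analyze(value):
        result |= index.get(token, set())
    return result


if __name__ == "__main__":
    docs = {"doc1": ["blablabla$testestest"]}  # permissions as array of strings
    analyzed = build_analyzed_index(docs)
    keyword = build_keyword_index(docs)
    user = "blablabla$testestest"
    print("term on analyzed field:", term_query(analyzed, user))
    print("match on analyzed field:", match_query(analyzed, user))
    print("term on keyword field:", term_query(keyword, user))
```

In this model the term lookup on the analyzed field finds nothing because the index only holds the two halves, while the other two lookups find the document.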

The only two options I can see to make it work, both of which work with an array of strings in ES, are:

  • replace TERM with MATCH when searching for the permissions
  • keep TERM but replace permissions with permissions.keyword when searching
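The two options listed above, next to the current behaviour, can be sketched as request bodies. The `build_rgw_query` helper below is hypothetical and only mirrors the payload shape captured earlier in the thread; it is not RGW code. Whether a `permissions.keyword` sub-field exists depends on the index mapping, which this thread does not show, so treat that variant as an assumption to verify against the actual mapping.

```python
import json


def build_rgw_query(bucket: str, user: str, name: str,
                    variant: str = "term") -> dict:
    """Build the search body with one of three permission clauses.

    variant:
      "term"         - current behaviour reported in the thread
      "match"        - option 1: MATCH instead of TERM
      "term_keyword" - option 2: TERM against permissions.keyword
    """
    if variant == "term":
        perm_clause = {"term": {"permissions": user}}
    elif variant == "match":
        perm_clause = {"match": {"permissions": user}}
    elif variant == "term_keyword":
        perm_clause = {"term": {"permissions.keyword": user}}
    else:
        raise ValueError(f"unknown variant: {variant}")

    return {
        "query": {"bool": {"must": [
            {"term": {"bucket": bucket}},
            {"bool": {"must": [
                perm_clause,
                {"term": {"name": name}},
            ]}},
        ]}}
    }


if __name__ == "__main__":
    for variant in ("term", "match", "term_keyword"):
        print(variant)
        print(json.dumps(build_rgw_query("obo", "xxx$xxx", "file100", variant),
                         indent=2))
```

Each variant can be pasted into a manual `_search` request to compare hit counts before any RGW change is made.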

If there is something magical in ES that makes this work without changing the RGW code, please let me know.
