Fails to filter blobs by tags - HTTP 500 with multiple conditions on single tag #2514

Open
aaronenberg-msft opened this issue Dec 12, 2024 · 6 comments

aaronenberg-msft commented Dec 12, 2024

I am testing Azurite 3.33.0 support for finding blobs by index tags, and the request fails with an HTTP 500 for this WHERE clause:

@container='mycontainer' AND MyTag >= 'Foo/296ab642-162c-4db8-a4ae-9517189e411d/' AND MyTag < 'Foo/296ab642-162c-4db8-a4ae-9517189e411d/zzzzzzzzzzzzzzzzzzzzzzz'

The error message is:
Error: can't have multiple conditions for a single tag unless they define a range

I expect this to succeed, since this is the same format used with Azure Blob Storage.
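
For illustration, here is a minimal sketch of what "define a range" could mean for two conditions on one tag: one lower bound (> or >=) paired with one upper bound (< or <=). This is an assumption drawn from the error message, not Azurite's actual QueryParser logic; the `Condition` tuple and the `defines_range` name are hypothetical.

```python
from typing import Tuple

# Hypothetical representation of one comparison: (tag, operator, value).
# This is NOT Azurite's internal type; it only illustrates the idea.
Condition = Tuple[str, str, str]

LOWER_OPS = {">", ">="}
UPPER_OPS = {"<", "<="}


def defines_range(first: Condition, second: Condition) -> bool:
    """Return True if two conditions on the same tag pair a lower bound with an upper bound."""
    tag_a, op_a, _ = first
    tag_b, op_b, _ = second
    if tag_a != tag_b:
        return False
    # Either order is accepted: lower-then-upper or upper-then-lower.
    lower_then_upper = op_a in LOWER_OPS and op_b in UPPER_OPS
    upper_then_lower = op_a in UPPER_OPS and op_b in LOWER_OPS
    return lower_then_upper or upper_then_lower
```

Under this reading, the WHERE clause above (`MyTag >= ...` AND `MyTag < ...`) would qualify as a range, which is why the HTTP 500 looks like a bug rather than a rejected query.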

Here is the full debug log for the request:

2024-12-12T19:23:26.907Z e639ab74-1c14-41ac-97ca-86d1587c49f1 info: BlobStorageContextMiddleware: RequestMethod=GET RequestURL=http://127.0.0.1/devstoreaccount1/?comp=blobs&where=%40container%3D%27mycontainer%27%20AND%20MyTag%20%3E%3D%20%27Foo%2F296ab642-162c-4db8-a4ae-9517189e411d%2F%27%20AND%20MyTag%20%3C%20%27Foo%2F296ab642-162c-4db8-a4ae-9517189e411d%2Fzzzzzzzzzzzzzzzzzzzzzzz%27&maxresults=100 RequestHeaders:{"host":"127.0.0.1:10000","x-ms-version":"2021-12-02","accept":"application/xml","x-ms-client-request-id":"d132709d-3f10-4efe-86f5-012ada151d3c","x-ms-return-client-request-id":"true","user-agent":"azsdk-net-Storage.Blobs/12.15.1 (.NET 8.0.11; Microsoft Windows 10.0.22631)","x-ms-date":"Thu, 12 Dec 2024 19:23:26 GMT","authorization":"SharedKey devstoreaccount1:19ZJUVfGgitMomyBXGP8nQJ6+hyxu+SLqhHuhNoTijs=","traceparent":"00-3ea7bda368499cfd04c59fd7c1f64610-1307d5eb337eea57-01"} ClientIP=172.17.0.1 Protocol=http HTTPVersion=1.1
2024-12-12T19:23:26.908Z e639ab74-1c14-41ac-97ca-86d1587c49f1 info: BlobStorageContextMiddleware: Account=devstoreaccount1 Container= Blob=
2024-12-12T19:23:26.908Z e639ab74-1c14-41ac-97ca-86d1587c49f1 verbose: DispatchMiddleware: Dispatching request...
2024-12-12T19:23:26.909Z e639ab74-1c14-41ac-97ca-86d1587c49f1 info: DispatchMiddleware: Operation=Service_FilterBlobs
2024-12-12T19:23:26.910Z e639ab74-1c14-41ac-97ca-86d1587c49f1 verbose: AuthenticationMiddlewareFactory:createAuthenticationMiddleware() Validating authentications.
2024-12-12T19:23:26.910Z e639ab74-1c14-41ac-97ca-86d1587c49f1 info: PublicAccessAuthenticator:validate() Start validation against public access.
2024-12-12T19:23:26.910Z e639ab74-1c14-41ac-97ca-86d1587c49f1 debug: PublicAccessAuthenticator:validate() Getting account properties...
2024-12-12T19:23:26.910Z e639ab74-1c14-41ac-97ca-86d1587c49f1 debug: PublicAccessAuthenticator:validate() Retrieved account name from context: devstoreaccount1, container: , blob:
2024-12-12T19:23:26.912Z e639ab74-1c14-41ac-97ca-86d1587c49f1 debug: PublicAccessAuthenticator:validate() Skip public access authentication. Cannot get public access type for container
2024-12-12T19:23:26.913Z e639ab74-1c14-41ac-97ca-86d1587c49f1 info: BlobSharedKeyAuthenticator:validate() Start validation against account shared key authentication.
2024-12-12T19:23:26.914Z e639ab74-1c14-41ac-97ca-86d1587c49f1 info: BlobSharedKeyAuthenticator:validate() [STRING TO SIGN]:"GET\n\n\n\n\n\n\n\n\n\n\n\nx-ms-client-request-id:d132709d-3f10-4efe-86f5-012ada151d3c\nx-ms-date:Thu, 12 Dec 2024 19:23:26 GMT\nx-ms-return-client-request-id:true\nx-ms-version:2021-12-02\n/devstoreaccount1/devstoreaccount1/\ncomp:blobs\nmaxresults:100\nwhere:@container='mycontainer' AND MyTag >= 'Foo/296ab642-162c-4db8-a4ae-9517189e411d/' AND MyTag < 'Foo/296ab642-162c-4db8-a4ae-9517189e411d/zzzzzzzzzzzzzzzzzzzzzzz'"
2024-12-12T19:23:26.915Z e639ab74-1c14-41ac-97ca-86d1587c49f1 info: BlobSharedKeyAuthenticator:validate() Calculated authentication header based on key1: SharedKey devstoreaccount1:19ZJUVfGgitMomyBXGP8nQJ6+hyxu+SLqhHuhNoTijs=
2024-12-12T19:23:26.915Z e639ab74-1c14-41ac-97ca-86d1587c49f1 info: BlobSharedKeyAuthenticator:validate() Signature 1 matched.
2024-12-12T19:23:26.915Z e639ab74-1c14-41ac-97ca-86d1587c49f1 verbose: DeserializerMiddleware: Start deserializing...
2024-12-12T19:23:26.916Z e639ab74-1c14-41ac-97ca-86d1587c49f1 info: HandlerMiddleware: DeserializedParameters={"options":{"where":"@container='mycontainer' AND MyTag >= 'Foo/296ab642-162c-4db8-a4ae-9517189e411d/' AND MyTag < 'Foo/296ab642-162c-4db8-a4ae-9517189e411d/zzzzzzzzzzzzzzzzzzzzzzz'","maxresults":100,"include":[],"requestId":"d132709d-3f10-4efe-86f5-012ada151d3c"},"comp":"blobs","version":"2021-12-02"}
2024-12-12T19:23:26.917Z e639ab74-1c14-41ac-97ca-86d1587c49f1 error: ErrorMiddleware: Received an error, fill error information to HTTP response
2024-12-12T19:23:26.918Z e639ab74-1c14-41ac-97ca-86d1587c49f1 error: ErrorMiddleware: ErrorName=Error ErrorMessage=can't have multiple conditions for a single tag unless they define a range ErrorStack="Error: can't have multiple conditions for a single tag unless they define a range\n    at QueryParser.validateWithPreviousComparison (/opt/azurite/dist/src/blob/persistence/QueryInterpreter/QueryParser.js:81:27)\n    at QueryParser.visitBinary (/opt/azurite/dist/src/blob/persistence/QueryInterpreter/QueryParser.js:245:26)\n    at QueryParser.visitExpressionGroup (/opt/azurite/dist/src/blob/persistence/QueryInterpreter/QueryParser.js:208:25)\n    at QueryParser.visitUnary (/opt/azurite/dist/src/blob/persistence/QueryInterpreter/QueryParser.js:189:28)\n    at QueryParser.visitAnd (/opt/azurite/dist/src/blob/persistence/QueryInterpreter/QueryParser.js:170:27)\n    at QueryParser.visitAnd (/opt/azurite/dist/src/blob/persistence/QueryInterpreter/QueryParser.js:173:32)\n    at QueryParser.visitAnd (/opt/azurite/dist/src/blob/persistence/QueryInterpreter/QueryParser.js:173:32)\n    at QueryParser.visitOr (/opt/azurite/dist/src/blob/persistence/QueryInterpreter/QueryParser.js:149:27)\n    at QueryParser.visitExpression (/opt/azurite/dist/src/blob/persistence/QueryInterpreter/QueryParser.js:139:21)\n    at QueryParser.visitQuery (/opt/azurite/dist/src/blob/persistence/QueryInterpreter/QueryParser.js:126:27)"
2024-12-12T19:23:26.918Z e639ab74-1c14-41ac-97ca-86d1587c49f1 error: ErrorMiddleware: Set HTTP code: 500
2024-12-12T19:23:26.918Z e639ab74-1c14-41ac-97ca-86d1587c49f1 info: EndMiddleware: End response. TotalTimeInMS=11 StatusCode=500 StatusMessage=undefined Headers={"server":"Azurite-Blob/3.33.0"}
blueww added the blob-storage and alignment (Alignment between Azurite with Azure Storage production) labels on Dec 16, 2024
Member

blueww commented Dec 16, 2024

@EmmaZhu
Would you please help to look at the blob tag issue?


tobiasxg commented Jan 13, 2025

I am using Azurite 3.33.0 in my Python project, and I run into errors when querying blobs with both a greater-than and a less-than condition on the same tag, which works on the actual Blob Storage service. Specifically, the query fails with an HttpResponseError (Internal Server Error), and the Azurite log reports the error message "can't have multiple conditions for a single tag unless they define a range", even though these conditions do define a range.

The following query works fine for filtering by a single tag:

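from azure.storage.blob import BlobServiceClient

# azure_blob_storage_endpoint, credentials, and container_name are defined elsewhere.
blob_service_client = BlobServiceClient(
    account_url=azure_blob_storage_endpoint, credential=credentials
)

container_client = blob_service_client.get_container_client(
    container=container_name
)

start_year = 2012

# Single condition on the "year" tag: this request succeeds.
query = f"\"year\">='{start_year}'"
next(container_client.find_blobs_by_tags(filter_expression=query))["name"]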

However, when I attempt to combine multiple conditions using AND, like this:

start_year = 2012
end_year = 2022

query = (
    f"\"year\">='{start_year}' AND \"year\"<='{end_year}'"
)
next(container_client.find_blobs_by_tags(filter_expression=query))["name"]

I get the following error:
azure.core.exceptions.HttpResponseError: Internal Server Error
ErrorCode: None
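
For reference, the failing expression can be produced by a small helper like the sketch below. The `range_filter` name is hypothetical, and rejecting quote characters is a conservative assumption on my side rather than a documented escaping rule.

```python
def range_filter(tag: str, lower: object, upper: object) -> str:
    """Build an inclusive range filter on a single tag."""
    low, high = str(lower), str(upper)
    for text in (tag, low, high):
        # Conservative guard (assumption): refuse quotes instead of guessing an escape rule.
        if "'" in text or '"' in text:
            raise ValueError(f"quote character not supported in {text!r}")
    # Tag names go in double quotes, tag values in single quotes.
    return f"\"{tag}\">='{low}' AND \"{tag}\"<='{high}'"


print(range_filter("year", 2012, 2022))
```

This prints the same expression that appears in the `where` parameter of the debug log. Note that tag values are strings, so the bounds are compared as text rather than as numbers.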

The debug log shows:

2025-01-16T14:37:22.860Z 5d5a823f-8ca7-463c-8506-fae5aa4382a2 info: HandlerMiddleware: DeserializedParameters={"options":{"where":"\"year\">='2012' AND \"year\"<='2022'","include":[],"requestId":"65c52b18-d417-11ef-84c3-c84bd64ac0da"},"restype":"container","comp":"blobs","version":"2025-01-05"}
2025-01-16T14:37:22.860Z 5d5a823f-8ca7-463c-8506-fae5aa4382a2 error: ErrorMiddleware: Received an error, fill error information to HTTP response
2025-01-16T14:37:22.860Z 5d5a823f-8ca7-463c-8506-fae5aa4382a2 error: ErrorMiddleware: ErrorName=Error ErrorMessage=can't have multiple conditions for a single tag unless they define a range ErrorStack="Error: can't have multiple conditions for a single tag unless they define a range\n    at QueryParser.validateWithPreviousComparison (/opt/azurite/dist/src/blob/persistence/QueryInterpreter/QueryParser.js:81:27)\n    at QueryParser.visitBinary (/opt/azurite/dist/src/blob/persistence/QueryInterpreter/QueryParser.js:249:26)\n    at QueryParser.visitExpressionGroup (/opt/azurite/dist/src/blob/persistence/QueryInterpreter/QueryParser.js:208:25)\n    at QueryParser.visitUnary (/opt/azurite/dist/src/blob/persistence/QueryInterpreter/QueryParser.js:189:28)\n    at QueryParser.visitAnd (/opt/azurite/dist/src/blob/persistence/QueryInterpreter/QueryParser.js:170:27)\n    at QueryParser.visitAnd (/opt/azurite/dist/src/blob/persistence/QueryInterpreter/QueryParser.js:173:32)\n    at QueryParser.visitOr (/opt/azurite/dist/src/blob/persistence/QueryInterpreter/QueryParser.js:149:27)\n    at QueryParser.visitExpression (/opt/azurite/dist/src/blob/persistence/QueryInterpreter/QueryParser.js:139:21)\n    at QueryParser.visitQuery (/opt/azurite/dist/src/blob/persistence/QueryInterpreter/QueryParser.js:126:27)\n    at QueryParser.visit (/opt/azurite/dist/src/blob/persistence/QueryInterpreter/QueryParser.js:118:21)"
2025-01-16T14:37:22.860Z 5d5a823f-8ca7-463c-8506-fae5aa4382a2 error: ErrorMiddleware: Set HTTP code: 500
2025-01-16T14:37:22.861Z 5d5a823f-8ca7-463c-8506-fae5aa4382a2 info: EndMiddleware: End response. TotalTimeInMS=5 StatusCode=500 StatusMessage=undefined Headers={"server":"Azurite-Blob/3.33.0"}

Member

blueww commented Jan 14, 2025

@tobiasxg

Would you please share the Azurite debug log for both the successful and the failed request? (Run Azurite with a parameter like "-d c:\temp\debug.log".)

@EmmaZhu
Would you please help to look at the tag filter issue?

Member

blueww commented Jan 15, 2025

@blueww

I can get these docker logs:

172.17.0.1 - - [14/Jan/2025:14:09:59 +0000] "GET /devstoreaccount1/testcontainer?restype=container&comp=blobs&where=%22category_tag%22%3D%27valueA%27 HTTP/1.1" 200 -
172.17.0.1 - - [14/Jan/2025:14:10:54 +0000] "GET /devstoreaccount1/testcontainer?restype=container&comp=blobs&where=%22category_tag%22%3D%27ValueA%27%20AND%20%22custom_tag%22%3D%27valueB%27 HTTP/1.1" 500 -
172.17.0.1 - - [14/Jan/2025:14:11:11 +0000] "GET /devstoreaccount1/testcontainer?restype=container&comp=blobs&where=%22category_tag%22%3D%27ValueA%27%20AND%20%22custom_tag%22%3D%27valueB%27 HTTP/1.1" 500 -

This is not the debug log.
To generate the Azurite debug log, you need to run Azurite with the "-d [debugLogPath]" parameter.
Since you run Azurite in Docker, you need to start the container with:

  1. "-v c:/azurite:/workspace", which maps the host folder c:/azurite to Azurite's workspace location (you can use another host folder).
  2. "-d /workspace/debug.log", which writes the debug log to the workspace location, so you will find it at the host path c:/azurite/debug.log.

The following is a sample command line:

docker run -p 10000:10000 -p 10001:10001 -p 10002:10002 -v c:/azurite:/workspace mcr.microsoft.com/azure-storage/azurite azurite --blobHost 0.0.0.0 --queueHost 0.0.0.0 --tableHost 0.0.0.0 -d /workspace/debug.log
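
The Docker access-log lines quoted above carry the filter percent-encoded in the `where` parameter, which makes them hard to read. As a small aid, this sketch decodes it using only the standard library; the `where_clause` name is hypothetical.

```python
from urllib.parse import unquote, urlsplit


def where_clause(request_target: str) -> str:
    """Extract and percent-decode the `where` parameter from a request path or URL."""
    query = urlsplit(request_target).query
    for pair in query.split("&"):
        key, _, value = pair.partition("=")
        if key == "where":
            # unquote() leaves '+' untouched, so tag values containing '+' survive.
            return unquote(value)
    raise ValueError("no 'where' parameter in request target")


logged = (
    "/devstoreaccount1/testcontainer?restype=container&comp=blobs"
    "&where=%22category_tag%22%3D%27ValueA%27%20AND%20%22custom_tag%22%3D%27valueB%27"
)
print(where_clause(logged))
```

For the second and third access-log lines this prints `"category_tag"='ValueA' AND "custom_tag"='valueB'`. The access log still lacks the error message and stack trace, which only the debug log contains.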

@tobiasxg

The debug log indicates that the failure is caused by using multiple conditions on a single tag. However, the two conditions are meant to find blobs within a specific range, which works on actual Azure Blob Storage.

Member

EmmaZhu commented Jan 21, 2025

Hi @aaronenberg-msft, I can reproduce the issue and will fix it later.
