Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

S3 Select mocking support #35

Open
iori-yja opened this issue May 28, 2019 · 4 comments
Open

S3 Select mocking support #35

iori-yja opened this issue May 28, 2019 · 4 comments

Comments

@iori-yja
Copy link

Hello,

I am wondering if it's possible to deal with a code which uses "S3 Select" feature.

S3 Select is an S3 feature to get contents with SQL query. It is, in general, a tough challenge with a great benefit to test a query like SQL which leads its query complex, and S3 Select is something special in testing difficulty. S3 Select is used with large aws-sdk's interface and its backend is locally unavailable.

My questions:

  • Is it in gofakes3's scope of concern to support select feature?
  • If so, how the implementation supposed to be?

I am not sure even it is good idea to use other project's SQL parser for this purpose.

@shabbyrobe
Copy link
Collaborator

shabbyrobe commented May 28, 2019

Is it in gofakes3's scope of concern to support select feature?

Yep I think so, but I think it will be difficult to get right. Compliance will be especially tricky - even just the list of different errors that the S3 SQL parser can emit is very long, and that's before we start getting into all the corner cases of the expression parsing. It'll take a long time to replicate the behaviour exactly. Doesn't mean we shouldn't try, but we should be very up front with users that GoFakeS3's implementation will not be able to be accurate for a long time, if ever.

If so, how the implementation supposed to be?

I've got a branch here that shows how I think it should be structured, though this code is in no way ready for release yet and I haven't actually got the SQL part of it up and firing yet, just the structure: https://github.com/johannesboyne/gofakes3/compare/master...shabbyrobe:feature/s3-select?expand=1

It has very minimal impact on the existing code though. S3 Select is quite broad so I think we should focus on shipping a version with support for a small set of use cases and encouraging people to make their own more tailored plugin if they need more.

I am not sure even it is good idea to use other project's SQL parser for this purpose.

I think the plugin-based structure in this PR means that that's probably a good place to start if we can - the core of GoFakeS3 is unaffected, we can move the code that depends on a 3rd-party SQL parser off to the side. I started looking at this one, but it chokes on the select field semi-JMESPath syntax used by S3 Select; SELECT s._1, s._2 FROM S3Object[*].Pants[*] s WHERE s._3 > 100 somewhat unsurprisingly fails parsing with this error: syntax error at position 33 near '['.

@shabbyrobe
Copy link
Collaborator

Same with github.com/araddon/qlbridge. I suppose I shouldn't be surprised; S3 Select is a bit outside the standard. I think we'll probably have to either hand-roll a parser or borrow the bulk of one from an existing project; I'd prefer the latter if possible. I'll take a look around inside qlbridge to see if the guts of it can be airlifted in.

@iori-yja
Copy link
Author

iori-yja commented May 31, 2019

Nice to see your implementation! That's big.

It'll take a long time to replicate the behaviour exactly

Yes, it must be. For testing purpose, it is not always needed to be exact same error message, though. I believe that handling the working example of let's say, select, will come on the most needed part at first, especially seeing whether where part logic works as intended, for example.

Parsing S3 Select query is quite difficult because it is outside the standard as you pointed out but minio has already implemented [https://github.com/minio/minio/blob/master/pkg/s3select/sql/parser.go]. The approach of parser implementation is quite different with your PEG's one, yet I wonder if it is possible to borrow from them.

@shabbyrobe
Copy link
Collaborator

Oh that's a good find, I'll see if I can airlift minio's implementation directly. The PEG was just a "let's get started quickly" experiment. I've got the absolute basics almost working locally, but if minio's works nicely out of the box with the structure I added the other day, it makes no sense to continue with my own.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants