Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow dimension and datasource values to be string #61

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

jcmackie
Copy link

Dimension and datasource

While I was playing around with this plugin, I noticed that I was getting JSON unmarshal errors for what should be valid JSON queries.

After digging around I found that this schema validation code only accepts DimensionSpec and Datasource spec objects.
But the query language accepted by Druid does allow plain strings to be used.

I've put some extra validation in which creates the objects manually if plain string values are found

Intervals

A similar thing happened for parsing Intervals.
The parsing code in intervals.go appeared to be expecting a full JSON object:

{"intervals": ["2022-01-10T00:00:00.000/2022-01-20T00:00:00.000"]}

But the code that was calling intervals.go:Load() was only passing the value. Which caused the unmarshal to fail.

Let me know if this looks okay to you, or if there's anything you want to change.

Change intervals value to be loaded as string array
Comment on lines +58 to +61
d.(*Base).SetType("Default")
d.(*Base).SetDimension(tstr)
d.(*Base).SetOutputName(tstr)
d.(*Base).SetOutputType(types.String)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The (*Base) shouldn't be required in this context, as it's already a concrete type.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I remove the (*Base) type assertion, the compiler complains.
image

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/facepalm

My bad, I didn't realize d already existed, I read it as d := &Base{}. Keep it as is.

default:
return nil, errors.New("unsupported intervals type")
// Now cast the only array item into an actual "interval"
interval := Interval(intv[0])

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's no guarantee that we have an item, it could be []. We need to be more defensive here - remember, it's user supplied data.

Also, does the spec say it's only one?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately the Druid documentation for the query spec is... not very specific (from what I've seen).
The GroupBy documentation has this to say:

intervals | A JSON Object representing ISO-8601 Intervals. This defines the time ranges to run the query over.
The code example is a single object array

"intervals": [ "2012-01-01T00:00:00.000/2012-01-03T00:00:00.000" ],

However some other documentation shows that multiple time ranges are acceptable.

Returns a list of all segments, overlapping with any of given intervals, for a datasource as stored in the metadata store. Request body is array of string IS0 8601 intervals like [interval1, interval2,...] for example ["2012-01-01T00:00:00.000/2012-01-03T00:00:00.000", "2012-01-05T00:00:00.000/2012-01-07T00:00:00.000"]

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But this is a good point, the code should be able to handle multiple intervals.

@tiedotguy
Copy link

I had a further think about this, and I think it's worth taking a step back and asking/confirming exactly what is being done by this library. From the way I understand it, all of this loading is to validate the schema, optionally take modifications, and then submit it. It also allows building a query from the ground up in an object based way, and submitting that.

If we reduce the problem to just "validate the schema" though, then we're in a tricky position because if we decode / encode, then we don't have the same data - we're not validating, we've also changed strings to structs. It may not make a difference to druid (I'm not familiar enough with it), but if it ever does, then the user will be left confused at the result, because what they're entering isn't what's being executed.

I think we might be able to handle it by doing something like...

type DimensionString string

func (ds DimensionString) Type() builder.ComponentType {
  return "string"
}

...
	if err := json.Unmarshal(data, &t); err != nil {
		// the datasource may be a string which is supported by Druid
		var dim DimensionString
		if err := json.Unmarshal(data, &dim); err != nil {
			return nil, err
		}
		return dim, nil
	}
...

This will allow us to return a basic string instead of a structure, and I think it will reserialize properly, exactly matching the input.

I don't know the exact goals of this project though, but food for thought.

@jbguerraz
Copy link
Contributor

Thank you for such initiative!
This short / full spec is a thing we already faced & discussed (on grafadruid.slack.com => you're welcome!). At that time we decided to rely on full spec for the sake of simplicity (and since most - if not all - users until now have control on queries it was not an issue)
That being said, it would be awesome to support both variants.
I'll find a time to think more about it and come back to you!

@jcmackie
Copy link
Author

jcmackie commented Feb 1, 2022

Thanks @jbguerraz.

Okay I figured it was a concious choice to support full spec. But I couldn't find anything in the README of either go-druid or grafadruid. I can't seem to log into your Slack workspace so, I'm not sure if it's open or not...

In any case, are you okay if I continue working on this branch based on the comments currently put on by my teammate?

I will also probably be submitting a PR to support quantilesDoubleSketch soonish, I hope: grafadruid/druid-grafana#94

@jcmackie
Copy link
Author

jcmackie commented Feb 1, 2022

@tiedotguy your last comment makes a lot of sense, and a much more logical way of doing this.
I'll wait to see if the Grafadruid team want to go down this path before putting that much work into this...

@jbguerraz
Copy link
Contributor

Hello mates,

The proposal looks good! Thank you!
I've updated the slack invite link, here it is as well https://join.slack.com/t/grafadruid/shared_invite/zt-12x4szeeb-PbOhhlOo61Xc1rxv_q161A

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants