-
Notifications
You must be signed in to change notification settings - Fork 1.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Generate config Go code from schema #10694
base: main
Are you sure you want to change the base?
Conversation
25932da
to
81cb055
Compare
SendBatchMaxSize int `json:"send_batch_max_size,omitempty" yaml:"send_batch_max_size,omitempty" mapstructure:"send_batch_max_size,omitempty"` | ||
|
||
// SendBatchSize corresponds to the JSON schema field "send_batch_size". | ||
SendBatchSize int `json:"send_batch_size,omitempty" yaml:"send_batch_size,omitempty" mapstructure:"send_batch_size,omitempty"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
go-jsonschema doesn't seem to use uint
. In theory, the minimum: 0
in the schema would be a good enough workaround but it looks like it's not supported right now.
Would you be happy if instances of uint
are replaced with int
+ a validation that the number is >= 0
?
This PR was marked stale due to lack of activity. It will be closed in 14 days. |
81cb055
to
4c62d16
Compare
processor/batchprocessor/config.go
Outdated
if v, ok := raw["timeout"]; !ok || v == nil { | ||
var err error | ||
plain.Timeout, err = time.ParseDuration("200ms") | ||
if err != nil { | ||
return err | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here I cheated a little bit. The original code was:
if v, ok := raw["timeout"]; !ok || v == nil {
plain.Timeout = "200ms"
}
I edited it a bit so that it compiles. I will have to update the upstream PR for time.Duration
support to generate this kind of code correctly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I propose that we add files like this whenever we need to augment the auto-generated code with some additional behaviour. For example, in this file we added Validate()
and createDefaultConfig()
functions.
send_batch_size: | ||
type: integer | ||
minimum: 0 | ||
default: 8192 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Setting a default value is important, because otherwise g--jsonschema will autogenerate an *int
. Similarly, for string
properties which don't have a default value it'd generate a *string
. If we do set a default value, we will get an int
and a string
, with no pointers.
processor/batchprocessor/config.go
Outdated
Timeout time.Duration `mapstructure:"timeout"` | ||
// MetadataCardinalityLimit corresponds to the JSON schema field | ||
// "metadata_cardinality_limit". | ||
MetadataCardinalityLimit int `mapstructure:"metadata_cardinality_limit,omitempty"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I need to check whether having the omitempty
is a problem. OTel doesn't use it much right now, but it looks like go-jsonschema always puts it in.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think omitempty is used by mapstructure at all.
processor/batchprocessor/config.go
Outdated
// Default value is 0, that means no maximum size. | ||
SendBatchMaxSize uint32 `mapstructure:"send_batch_max_size"` | ||
// SendBatchMaxSize corresponds to the JSON schema field "send_batch_max_size". | ||
SendBatchMaxSize int `mapstructure:"send_batch_max_size,omitempty"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we care much that the uint
is now an int
?
I suppose we don't, because we can add a check to make sure it's not negative. Also, I doubt that the integers people configure are so large that an int
wouldn't fit them.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We do - at the very least we cannot change types like this. We need to deprecate and move over with a set of changes over multiple releases.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I could update my fork of go-jsonschema
to use uint whenever there is a minimum: 0
or an exclusiveMinimum: 0
? It seems like a reasonable feature - maybe it'll also get accepted upstream.
send_batch_max_size: | ||
type: integer | ||
default: 0 | ||
minimum: 0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
At the moment, go-jsonschema
's main
branch doesn't do anything when it sees constraints such as minimum
and maximum
. However, I patched my own fork of go-jsonschema
so that it supports it. You can see in the config.go
that there are checks to make sure some integers are not negative.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe we can keep uint32 and we won't have to set such constraints?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Json schema itself doesn't seem to have an unsigned integer type. It only has integer
and number
(float) types, which can have various constraints.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hello @atoulme @ptodev , go-jsonschema maintainer here. I have just merged a pr that adds support for numeric validation (multipleOf, maximum, exclusiveMaximum, minimum, exclusiveMinimum) and string patterns in main. But there is still no support for uint types. I am open to discuss it if it is needed, maybe we can fit it in with a bit of work.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
that's neat and appreciated.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @omissis, thank you so much for taking a look! I just realised that go-jsonschema already has a goJSONSchema
property which allows users to override the type. I think it should be sufficient for our purposes.
Hi @atoulme! I updated the PR to use things like squashing structs, using uint, and using opaque string types. You can do it with:
goJSONSchema:
type: uint32
And:
goJSONSchema:
type: configopaque.String
imports:
- go.opentelemetry.io/collector/config/configopaque
There are examples in the metadata.yaml files.
Are you happy with the way the uints and opaque strings are set up? If you are, I'll resolve this GitHub discussion.
|
||
// Config defines configuration for OTLP/HTTP exporter. | ||
type Config struct { | ||
confighttp.ClientConfig `mapstructure:",squash"` // squash ensures fields are correctly decoded in embedded struct. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unfortunately, I can't find a good way to schematise squashed fields. This is my main blocker right now.
I was hoping that we can use something like anyOf, but judging by some examples, it also looks awkward.
For now, I just listed all the squashed properties explicitly in metadata.yaml
. Does anyone know of a better way? I'm not sure what folks working with Go code normally do in such situations.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi, @atoulme! Good news! 🥳 I was able to find a way to create squashed structs with go-jsonschema. I will update this PR next week, once I polish my code a bit. The new generated code will look a lot more like the current OTel code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🚀
cmd/mdatagen/go.mod
Outdated
@@ -100,3 +113,5 @@ replace go.opentelemetry.io/collector/internal/globalgates => ../../internal/glo | |||
replace go.opentelemetry.io/collector/consumer/consumerprofiles => ../../consumer/consumerprofiles | |||
|
|||
replace go.opentelemetry.io/collector/consumer/consumertest => ../../consumer/consumertest | |||
|
|||
replace github.com/atombender/go-jsonschema => github.com/ptodev/go-jsonschema v0.0.0-20240813163654-5518ba93ee84 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had to fork go-jsonschema
, because it didn't contain some of the features which I needed. I hope we can upstream those.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
that would be best, especially if the maintainer of go-jsonschema
is involved in this PR
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hey @ptodev now that the support for numeric comparison has landed, do you think we can look at the duration? so that would be one less blocker here ?
config/confighttp/confighttp.json
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I need to rewrite these shared configs in yaml.
exporter/otlpexporter/otlp.go
Outdated
@@ -65,7 +65,7 @@ func (e *baseExporter) start(ctx context.Context, host component.Host) (err erro | |||
e.metricExporter = pmetricotlp.NewGRPCClient(e.clientConn) | |||
e.logExporter = plogotlp.NewGRPCClient(e.clientConn) | |||
headers := map[string]string{} | |||
for k, v := range e.config.ClientConfig.Headers { | |||
for k, v := range e.config.Headers { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please revert this change
processor/batchprocessor/config.go
Outdated
if cfg.SendBatchMaxSize > 0 && cfg.SendBatchMaxSize < cfg.SendBatchSize { | ||
return errors.New("send_batch_max_size must be greater or equal to send_batch_size") | ||
// UnmarshalJSON implements json.Unmarshaler. | ||
func (j *Config) UnmarshalJSON(b []byte) error { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't use UnmarshalJSON
at all. We use the Validate
function to validate.
exporter/otlpexporter/config.go
Outdated
} | ||
|
||
func (c *Config) sanitizedEndpoint() string { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
how is this functionality achieved?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
At the moment it is not. The migration of otlpexporter
and otlphttpexporter
to Json schema isn't complete, due to a blocker I hit with migrating squashed fields. I've been hoping we can find a solution to that blocker before we I fix the functionality of otlpexporter
and otlphttpexporter
.
But in general, we could put functions like this in .go
helper files which are not generated by Json schema.
This PR is too big - please start with something much smaller. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This PR is too big - please start with something much smaller.
Hi, @atoulme, thank you for your review! I agree that the PR is very large. The main reason for this is to demonstrate how using shared configuration could work between components. This was raised by @yurishkuro in the previous PR. The HTTP server/client settings are the primary example of shared config that I could find in the codebase.
Unfortunately, while working on shared configs, I realised that I don't know how to represent squashed fields in a clean way. I've been hoping that some of the OTel maintainers may have experience with Json schema and that the solution may be obvious to those folks.
exporter/otlpexporter/config.go
Outdated
} | ||
|
||
func (c *Config) sanitizedEndpoint() string { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
At the moment it is not. The migration of otlpexporter
and otlphttpexporter
to Json schema isn't complete, due to a blocker I hit with migrating squashed fields. I've been hoping we can find a solution to that blocker before we I fix the functionality of otlpexporter
and otlphttpexporter
.
But in general, we could put functions like this in .go
helper files which are not generated by Json schema.
processor/batchprocessor/config.go
Outdated
// Default value is 0, that means no maximum size. | ||
SendBatchMaxSize uint32 `mapstructure:"send_batch_max_size"` | ||
// SendBatchMaxSize corresponds to the JSON schema field "send_batch_max_size". | ||
SendBatchMaxSize int `mapstructure:"send_batch_max_size,omitempty"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I could update my fork of go-jsonschema
to use uint whenever there is a minimum: 0
or an exclusiveMinimum: 0
? It seems like a reasonable feature - maybe it'll also get accepted upstream.
send_batch_max_size: | ||
type: integer | ||
default: 0 | ||
minimum: 0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Json schema itself doesn't seem to have an unsigned integer type. It only has integer
and number
(float) types, which can have various constraints.
This PR was marked stale due to lack of activity. It will be closed in 14 days. |
b6a3daf
to
b3a0e99
Compare
goJSONSchema is a feature which already exists upstream. goType doesn't exist upstream. I added it, because I didn't know goJSONSchema exists.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @omissis, thank you very much for helping, and for maintaining such a useful project! I think the most important things from a Collector point of view are:
- Having a way to use schema that doesn't come from a file.
- Generating a SetDefaults function. We'd also need some sort of
Validate
function, but I'm not yet sure what that would look like. - Disabling omitempty.. I'm not 100% sure if it impacts us negatively, but it seems like a good feature to have to minimise disruption to codebases.
- Subschemas, so that squashed structs would work.
- A duration format, so that Go types don't change.
- Generating maps without declaring a custom type.
Apologies, I realise this is a lot 😅 There is no urgency, and I'd be happy to help! I'm happy to polish my go-jsonschema PRs, as long as we agree that they are going in the right direction :)
Hi @atoulme - I think the most important thing we need to decide right now is how we'd like to generate the SetDefaults
and Validate
functions. Currently, go-jsonschema doesn't generate such functions. Please feel free to opine on the thread.
send_batch_max_size: | ||
type: integer | ||
default: 0 | ||
minimum: 0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @omissis, thank you so much for taking a look! I just realised that go-jsonschema already has a goJSONSchema
property which allows users to override the type. I think it should be sufficient for our purposes.
Hi @atoulme! I updated the PR to use things like squashing structs, using uint, and using opaque string types. You can do it with:
goJSONSchema:
type: uint32
And:
goJSONSchema:
type: configopaque.String
imports:
- go.opentelemetry.io/collector/config/configopaque
There are examples in the metadata.yaml files.
Are you happy with the way the uints and opaque strings are set up? If you are, I'll resolve this GitHub discussion.
// SetDefaults sets the fields of Config to their defaults. | ||
// Fields which do not have a default value are left untouched. | ||
func (c *Config) SetDefaults() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @atoulme, would you be happy with an autogenerated SetDefaults()
function such as this one?
I think we will also need a func (c *Config) Validate()
function. It could check for constraints such as whether a number is within a certain range. I'm not yet sure how (and if) it would check for required properties though.
cc @omissis
This PR was marked stale due to lack of activity. It will be closed in 14 days. |
Hi @atoulme 👋 Would you be open to using a forked version of go-jsonschema? We haven't had traction on the upstream PRs yet. There are many things we need to upstream, and it will almost certainly take a while for that to happen. If we use a forked version, I could make a smaller PR which is mergable, and then gradually will try to expand it to more and more components. Such a change would be low-risk - in case of issues with the code generation, commenting out the schema in metadata.yaml should disable it. The main issue with a forked repo is that it is not clear who should host it and maintain it. I could host it on my own GitHub profile? Or we could use the OpenTelemetry one? Alternatively, I could try to embed the generation code into the Collector, similarly to the earlier PR? I've been trying to avoid doing that though, since it makes it harder to upstream changes. |
What do you mean by that? I spot checked a couple PRs, they are in Draft state, sometimes with CI breaking / no tests. I can understand your comment if you're waiting on maintainers to approve the direction of those PRs before investing more time. But if the alternative is to use a fork, then those PR still need to be completed to pass CI and have the required quality before we would consider taking a dependency on such fork in the collector. |
Hi @yurishkuro, yes, I agree that I'd have to polish these PRs by adding more tests, etc. I haven't done that yet because I'd prefer to firstly get feedback from the maintainers on whether I'm going in the right direction with those changes. I sent a reminder message to the upstream maintainer to ask about one of the small (but essential) features that we need. I'm honestly not sure to what extent we can keep a "good quality" fork of go-jsonschema. Some of the features are complicated, and we need some fundamental things like support for subschemas. Since we're just using go-jsonschema to autogenerate code which then has to pass manual review, I hope that might be ok. If the code generation doesn't run correctly, we could just disable it by commenting out the schema. Then we can fix the code generation issue in our own time without blocking Collector PRs. I don't know if this would be acceptable for you though? It could be a problem if a lot of tooling starts to depend on the Collector's schema files... The alternative is to wait for go-jsonschema to support all the things we need. If we want to go with this approach, we could close this PR and then reopen it one day if go-jsonschema has the required features? |
@ptodev agreed, less concerns about high quality of it's only code gen. Still, a fork of a reasonably complex project is something to avoid if possible. |
Hey all. I would like to avoid a fork if possible, and cooperate to incorporate the needed changes so that opentelemetry can use them. I do have some time Friday I can dedicate to a first round of reviews, so let's see what comes out of it. Thanks. |
Thank you, @omissis! I will be on vacation for the next two weeks, but I will take a look at your feedback when I'm back. |
This PR was marked stale due to lack of activity. It will be closed in 14 days. |
Description
This is an attempt to generate the
config.go
files from Json schema. The goal of the PR is to delete all theconfig.go
files for each component and replace them with some sort of Yaml or Json schema, which will most likely be sourced from themetadata.yaml
file for each component. This PR uses an earlier PR as a starting point.There are many unknowns related to this project. I will start threads on various problems via file comments and line comments, so that we can keep the discussion organised.
Link to tracking issue
Fixes #9769
Related to #5223