Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Eventhub target #63

Merged
merged 2 commits into from
Jul 23, 2021
Merged

Add Eventhub target #63

merged 2 commits into from
Jul 23, 2021

Conversation

colmsnowplow
Copy link
Collaborator

PR for an eventHub target

I haven't managed to find an easily available option to mock an eventhub for testing... I notice that we don't have one for pubsub either - so I think I'll need some input on the best way to approach implementing some kind of unit or integration test for a target like this.

ehBatch[i] = ehEvent
}

if !eht.batching { // Defaults to using non-batch when nothing is provided
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason to support batching and single event emission? Why not just have batching as the only option here?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Initially, I figured that batching would likely result in increased latency, which may be problematic for some use cases - which his how it is for the kafka target, where we support both.

Some initial rough tests have shown that this might not actually be the case - configuring it to send batches of 1 seems to be about the same as sending individual events. However I'm still not sure until I load test for it so I figured it's better to have both available than need to do a subsequent release to resolve an issue with just picking one.

batching: cfg.Batching,
batchByteLimit: cfg.BatchByteLimit,

log: log.WithFields(log.Fields{"target": "eventhub", "cloud": "AWS", "namespace": os.Getenv("EVENTHUB_NAMESPACE"), "eventhub": os.Getenv("EVENTHUB_NAME")}),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Getenv is not always super safe as you might not get a legit value out - is there any risk that things will blow up if the env vars are not set? What kind of errors does stream replicator toss out if these values are empty?

Copy link
Collaborator Author

@colmsnowplow colmsnowplow Jul 16, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It will fail to initialise the client before it gets to this point if these env vars aren't set. The error thrown would look like this:

ERRO[0000] environment var EVENTHUB_NAMESPACE must not be empty error="environment var EVENTHUB_NAMESPACE must not be empty"

However, I recognise that this is outside the error handling pattern we have for all the other config options, so indeed perhaps it's best not to use env vars to initialise the client, and to handle these via the config object...

@colmsnowplow colmsnowplow changed the base branch from master to release/0.5.0 July 19, 2021 11:11
Copy link
Contributor

@paulboocock paulboocock left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

cmd/config.go Outdated
// FailureEventHubTargetConfig configures the destination for records consumed
type FailureEventHubTargetConfig struct {
EventHubNamespace string `env:"FAILURE_TARGET_EVENTHUB_NAMESPACE"` // REQUIRED - namespace housing Eventhub
EventHubName string `env:"FAILURE_TARGET_EVENTHUB_NAME"` // REQUIRED - namespace housing Eventhub
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit:

Suggested change
EventHubName string `env:"FAILURE_TARGET_EVENTHUB_NAME"` // REQUIRED - namespace housing Eventhub
EventHubName string `env:"FAILURE_TARGET_EVENTHUB_NAME"` // REQUIRED - name housing Eventhub

batching: cfg.Batching,
batchByteLimit: cfg.BatchByteLimit,

log: log.WithFields(log.Fields{"target": "eventhub", "cloud": "AWS", "namespace": cfg.EventHubNamespace, "eventhub": cfg.EventHubName}),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit:

Suggested change
log: log.WithFields(log.Fields{"target": "eventhub", "cloud": "AWS", "namespace": cfg.EventHubNamespace, "eventhub": cfg.EventHubName}),
log: log.WithFields(log.Fields{"target": "eventhub", "namespace": cfg.EventHubNamespace, "eventhub": cfg.EventHubName}),

successes = messages
}

eht.log.Debugf("Successfully wrote %d/%d messages", len(successes), len(messages))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We print this twice, here and line 98. This makes it look broken in the logs, like its emitting the events twice.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fair - I was mimicking the kinesis target, which prints one message per chunk, and one message when all chunks are complete. Changing to ensure that the messages are different for clarity.

@paulboocock
Copy link
Contributor

I've noticed that if I don't set the Azure libraries Env Vars (such as key name and value), then the call to create the Event Hub simply hangs. This seems rather unfortunate and I can imagine a scenario where we spin this up and don't realise it's not processing events.
I'm not sure if there's a way to fix the library from hanging but maybe we could check at least one of the two types of auth env vars are set.

pkg/target/eventhub.go Outdated Show resolved Hide resolved
@colmsnowplow
Copy link
Collaborator Author

Created an issue in the client library repo about that issue with it hanging. Azure/azure-event-hubs-go#230

Just adding tests for the new checks on env vars, should be good to go shortly.

_, azCertPathPresent := os.LookupEnv("AZURE_CERTIFICATE_PATH")
_, azCertPwrdPresent := os.LookupEnv("AZURE_CERTIFICATE_PASSWORD")

if !(connStringPresent || (keyNamePresent && keyValuePresent) || (tenantIDPresent && clientIDPresent && ((azCertPathPresent && azCertPwrdPresent) || clientSecretPresent))) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Man, thats nasty. I hope they fix the hang in the library 😂 The ANDs and ORs look good to me here though 👍

@colmsnowplow colmsnowplow merged commit 27dcf58 into release/0.5.0 Jul 23, 2021
@colmsnowplow colmsnowplow deleted the feat/eventHubs branch July 23, 2021 15:20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants