feat(lambda-event-sources): starting position timestamp for kafka #31439

Open
wants to merge 1 commit into
base: main
from

Conversation

nikovirtala
Contributor

@nikovirtala nikovirtala commented Sep 13, 2024

Issue # (if applicable)

Closes #31808

Reason for this change

It was impossible to start consuming a Kafka topic from a specific point in time.

Description of changes

Users can now set startingPositionTimestamp on ManagedKafkaEventSource and SelfManagedKafkaEventSource to start consuming a Kafka topic from a specific point in time.

Lambda and CloudFormation have supported the functionality for a while, and other stream event sources like KinesisEventSource already had a CDK implementation for it. So, technically, this change is doing nothing new; it is only repeating the pattern that has already been proven to work on other sources.
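
As a rough illustration of the described usage, here is a minimal sketch. It assumes the timestamp is expressed in Unix time seconds, as it is for the existing Kinesis source. The `KafkaSourcePropsSketch` interface, the `epochSeconds` helper, and the topic name are hypothetical stand-ins so the snippet runs without aws-cdk-lib; they are not the real construct props.

```typescript
// Hypothetical stand-ins for the CDK types; not the real aws-cdk-lib API.
type StartingPositionSketch = 'TRIM_HORIZON' | 'LATEST' | 'AT_TIMESTAMP';

interface KafkaSourcePropsSketch {
  readonly topic: string;
  readonly startingPosition: StartingPositionSketch;
  /** Assumed to be Unix time in seconds; only meaningful with AT_TIMESTAMP. */
  readonly startingPositionTimestamp?: number;
}

// Convert an ISO 8601 instant to whole Unix seconds (Date works in milliseconds).
function epochSeconds(iso: string): number {
  const ms = Date.parse(iso);
  if (Number.isNaN(ms)) {
    throw new Error(`Not a parseable date: ${iso}`);
  }
  return Math.floor(ms / 1000);
}

const props: KafkaSourcePropsSketch = {
  topic: 'orders', // hypothetical topic name
  startingPosition: 'AT_TIMESTAMP',
  startingPositionTimestamp: epochSeconds('2024-09-13T00:00:00Z'),
};

console.log(props);
```

In a real stack the same two values would be passed to ManagedKafkaEventSource or SelfManagedKafkaEventSource alongside their other required props.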

Description of how you validated changes

The change is tested with unit tests similar to those of the other event sources that support this functionality.

Checklist


By submitting this pull request, I confirm that my contribution is made under the terms of the Apache-2.0 license

@aws-cdk-automation aws-cdk-automation requested a review from a team September 13, 2024 10:41
@github-actions github-actions bot added p2 valued-contributor [Pilot] contributed between 6-12 PRs to the CDK labels Sep 13, 2024
@aws-cdk-automation aws-cdk-automation added the pr/needs-community-review This PR needs a review from a Trusted Community Member or Core Team Member. label Sep 13, 2024
@moelasmar
Contributor

Could you please open a new GitHub issue as a feature request and link it to this PR?

Contributor

@moelasmar moelasmar left a comment

IMO, this is a feature and not a chore; it needs a README update and a new integration test case.

@aws-cdk-automation aws-cdk-automation removed the pr/needs-community-review This PR needs a review from a Trusted Community Member or Core Team Member. label Sep 16, 2024
Contributor

@sumupitchayan sumupitchayan left a comment

Agreed with @moelasmar, this seems like a feat to me - @nikovirtala can you please add an integ test and README documentation on this?

@aws-cdk-automation
Collaborator

This PR has been in the CHANGES REQUESTED state for 3 weeks, and looks abandoned. To keep this PR from being closed, please continue work on it. If not, it will automatically be closed in a week.

@moelasmar moelasmar changed the title chore(lambda-event-sources): starting position timestamp for kafka feat(lambda-event-sources): starting position timestamp for kafka Oct 14, 2024
Collaborator

@aws-cdk-automation aws-cdk-automation left a comment

The pull request linter has failed. See the aws-cdk-automation comment below for failure reasons. If you believe this pull request should receive an exemption, please comment and provide a justification.

A comment requesting an exemption should contain the text Exemption Request. Additionally, if clarification is needed add Clarification Request to a comment.

@nikovirtala nikovirtala force-pushed the chore/kafka-starting-position-timestamp branch 2 times, most recently from e521983 to ab944c3 Compare October 18, 2024 12:10
@nikovirtala
Contributor Author

nikovirtala commented Oct 18, 2024

> Agreed with @moelasmar, this seems like a feat to me - @nikovirtala can you please add an integ test and README documentation on this?

README is clear, but creating the integration test for this is a lot of work:

1) there are no existing integration tests for any other Event Source Mappings
2) testing a Kafka Event Source Mapping requires creating an MSK cluster or similar, which is not as trivial as creating a simple resource like a DynamoDB table (I hope there is something I can borrow from the MSK module, if one exists)

@nikovirtala
Contributor Author

nikovirtala commented Oct 18, 2024

Exemption Request

> 1. testing a Kafka Event Source Mapping requires creating an MSK cluster or similar, which is not as trivial as creating a simple resource like a DynamoDB table (I hope there is something I can borrow from the MSK module, if one exists)

OK, the MSK module is in alpha state, and even it doesn't have an integration test that creates a topic and publishes messages to it, which would be required to test this feature 😞

I suspect that creating all of that is too much for me.

@nikovirtala nikovirtala force-pushed the chore/kafka-starting-position-timestamp branch 2 times, most recently from 811d9b7 to 81e290b Compare October 18, 2024 13:07
@github-actions github-actions bot added the feature-request A feature should be added or improved. label Oct 18, 2024
@aws-cdk-automation aws-cdk-automation added the pr-linter/exemption-requested The contributor has requested an exemption to the PR Linter feedback. label Oct 18, 2024
@nikovirtala nikovirtala force-pushed the chore/kafka-starting-position-timestamp branch from 81e290b to 19100c2 Compare November 16, 2024 11:40
codecov bot commented Nov 16, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 81.52%. Comparing base (e4703c1) to head (b287c83).

Additional details and impacted files
@@           Coverage Diff           @@
##             main   #31439   +/-   ##
=======================================
  Coverage   81.52%   81.52%           
=======================================
  Files         224      224           
  Lines       13762    13762           
  Branches     2414     2414           
=======================================
  Hits        11220    11220           
  Misses       2270     2270           
  Partials      272      272           
Flag Coverage Δ
suite.unit 81.52% <ø> (ø)

Flags with carried forward coverage won't be shown.

Components Coverage Δ
packages/aws-cdk 80.93% <ø> (ø)
packages/aws-cdk-lib/core 82.15% <ø> (ø)

@github-actions github-actions bot added the effort/small Small work item – less than a day of effort label Jan 9, 2025
@nikovirtala
Contributor Author

nikovirtala commented Jan 9, 2025

It seems there are zero integration tests in this module. So, can anyone change this module before there is one? 🤔

@nikovirtala nikovirtala force-pushed the chore/kafka-starting-position-timestamp branch 4 times, most recently from 77978e6 to 2114f64 Compare January 9, 2025 15:10
@aws-cdk-automation aws-cdk-automation dismissed their stale review January 9, 2025 15:10

✅ Updated pull request passes all PRLinter validations. Dismissing previous PRLinter review.

@mergify mergify bot dismissed moelasmar’s stale review January 9, 2025 15:10

Pull request has been modified.

*
* @default - no timestamp
*/
readonly startingPositionTimestamp?: number;
Contributor

A minor comment: I know that we already use number as the type for this property in Kinesis, but I think it would be better for customers to pass a real timestamp here and have us calculate the epoch number on their behalf.

I can accept this property as is to keep it aligned with Kinesis, but I would prefer to add another property that uses a timestamp type instead of number. What do you think?

Contributor Author

> A minor comment: I know that we already use number as the type for this property in Kinesis, but I think it would be better for customers to pass a real timestamp here and have us calculate the epoch number on their behalf.
>
> I can accept this property as is to keep it aligned with Kinesis, but I would prefer to add another property that uses a timestamp type instead of number. What do you think?

Well, our current internal implementation of this very same functionality uses Date, but I chose number here for consistency. I will see if I can find a nice way to support both, and we can then choose which looks and feels best.

If we end up improving the API here, I can then apply those changes to the Kinesis source as well (naturally in a separate pull request).

Contributor

thanks @nikovirtala

@moelasmar
Contributor

sorry @nikovirtala for not coming back to this PR. I see that you already handled the integ testing point. I left one minor comment.

@nikovirtala
Contributor Author

> sorry @nikovirtala for not coming back to this PR. I see that you already handled the integ testing point. I left one minor comment.

No worries! I was stuck thinking that the integration test requirement meant something other than it actually does; I had set the bar "a little" too high in my mind 😅

@nikovirtala nikovirtala force-pushed the chore/kafka-starting-position-timestamp branch from 49c0903 to b8aec1b Compare January 10, 2025 14:45
*
* @default - no timestamp
*/
readonly startingPositionTimestamp?: number | Date;
Contributor Author

@moelasmar, here is one way to achieve it, but is it idiomatic to do it that way in CDK?

Contributor

In CDK we no longer accept union types in L2 constructs. Personally, I was thinking of replacing the type with a timestamp type instead of number, but to stay aligned with Kinesis, I think we can add another property and let customers choose which one to use; of course, they can only use one of them.
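
The "two properties, only one allowed" idea could look roughly like the sketch below. It is only an illustration under assumptions: `startingPositionDate` is a hypothetical property name that does not exist in aws-cdk-lib, and `resolveTimestamp` is an invented helper, not code from this PR.

```typescript
// Hypothetical props shape: `startingPositionDate` is an illustrative name,
// not a property that exists in aws-cdk-lib.
interface TimestampPropsSketch {
  /** Existing style: Unix time in seconds. */
  readonly startingPositionTimestamp?: number;
  /** Hypothetical alternative that accepts a Date. */
  readonly startingPositionDate?: Date;
}

// Resolve whichever property was given to Unix seconds,
// rejecting the ambiguous case where both are set.
function resolveTimestamp(props: TimestampPropsSketch): number | undefined {
  const { startingPositionTimestamp, startingPositionDate } = props;
  if (startingPositionTimestamp !== undefined && startingPositionDate !== undefined) {
    throw new Error('Specify only one of startingPositionTimestamp or startingPositionDate');
  }
  if (startingPositionDate !== undefined) {
    // Date stores milliseconds; the mapping is assumed to expect seconds.
    return Math.floor(startingPositionDate.getTime() / 1000);
  }
  return startingPositionTimestamp;
}
```

Keeping each property a single type avoids the union type while still letting the existing number-based property stay aligned with Kinesis.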

Contributor Author

After digging into how this kind of property type change should happen in CDK, I found out that there is no clearly defined convention for it. There is a long discussion, e.g., in the Enhanced L1s RFC about using the _v2 pattern, but no consensus on adopting it.

So, I suggest that we merge this as it is, in line with:

> I can accept to have this property to keep it aligned with Kinesis

and worry about the API later once there is a clear convention on how to proceed.

The changes I made to test the union type are now reverted.

@moelasmar moelasmar removed the pr-linter/exemption-requested The contributor has requested an exemption to the PR Linter feedback. label Jan 10, 2025
@nikovirtala nikovirtala force-pushed the chore/kafka-starting-position-timestamp branch from 8577db4 to c8245d4 Compare January 23, 2025 11:10
@nikovirtala nikovirtala requested a review from moelasmar January 23, 2025 11:22
@nikovirtala nikovirtala force-pushed the chore/kafka-starting-position-timestamp branch from c8245d4 to b287c83 Compare January 23, 2025 13:54
@aws-cdk-automation
Collaborator

AWS CodeBuild CI Report

  • CodeBuild project: AutoBuildv2Project1C6BFA3F-wQm2hXv2jqQv
  • Commit ID: b287c83
  • Result: SUCCEEDED
  • Build Logs (available for 30 days)

Powered by github-codebuild-logs, available on the AWS Serverless Application Repository

@aws-cdk-automation aws-cdk-automation added the pr/needs-community-review This PR needs a review from a Trusted Community Member or Core Team Member. label Jan 23, 2025
@samson-keung samson-keung self-assigned this Feb 19, 2025
Contributor

@samson-keung samson-keung left a comment

Left a small comment, then I think it will be good to merge!


if (props.startingPosition !== lambda.StartingPosition.AT_TIMESTAMP && props.startingPositionTimestamp) {
  throw new Error('startingPositionTimestamp can only be used when startingPosition is AT_TIMESTAMP');
}
Contributor

I'd suggest emitting warnings instead of throwing errors in these validations so that the following use cases become possible:

  • when customers use Tokens for props.startingPosition and/or props.startingPositionTimestamp (although probably a rare case)
  • when customers use escape hatches (e.g. using the addPropertyOverride method) to set either of the props after initializing the construct

The same comment applies to the other validations that throw errors (lines 260 - 268).
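
A dependency-free sketch of what "warn instead of throw" might look like follows. Everything in it is an assumption for illustration: `validateStartingPosition`, the `isUnresolved` callback, and the warning texts are invented here, and the check for AT_TIMESTAMP without a timestamp is a guess at what the other validations cover. In aws-cdk-lib the equivalents would presumably be `Token.isUnresolved` and `Annotations.of(scope).addWarningV2`, which are deliberately not used so the snippet runs on its own.

```typescript
// Hypothetical input shape; the real construct reads these from its props.
interface ValidationInputSketch {
  readonly startingPosition: string;
  readonly startingPositionTimestamp?: number;
  /** Stand-in for a token check: true if a value is only known at deploy time. */
  readonly isUnresolved?: (value: unknown) => boolean;
}

// Collect warnings instead of throwing, so synthesis can continue and
// escape hatches can still adjust the properties afterwards.
function validateStartingPosition(input: ValidationInputSketch): string[] {
  const warnings: string[] = [];
  const unresolved = input.isUnresolved ?? (() => false);

  // Values that cannot be inspected at synth time are not validated at all.
  if (unresolved(input.startingPosition) || unresolved(input.startingPositionTimestamp)) {
    return warnings;
  }
  if (input.startingPosition === 'AT_TIMESTAMP' && input.startingPositionTimestamp === undefined) {
    warnings.push('startingPositionTimestamp should be set when startingPosition is AT_TIMESTAMP');
  }
  if (input.startingPosition !== 'AT_TIMESTAMP' && input.startingPositionTimestamp !== undefined) {
    warnings.push('startingPositionTimestamp is only used when startingPosition is AT_TIMESTAMP');
  }
  return warnings;
}

console.log(validateStartingPosition({ startingPosition: 'LATEST', startingPositionTimestamp: 1726185600 }));
```

The trade-off is that a misconfiguration surfaces as a synth-time warning rather than a hard failure, so it is easier to overlook but no longer blocks the Token and escape-hatch cases.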

@aws-cdk-automation aws-cdk-automation removed the pr/needs-community-review This PR needs a review from a Trusted Community Member or Core Team Member. label Feb 25, 2025
Labels
effort/small Small work item – less than a day of effort feature-request A feature should be added or improved. p2 valued-contributor [Pilot] contributed between 6-12 PRs to the CDK
Projects
None yet
Development

Successfully merging this pull request may close these issues.

(lambda-event-sources): (starting position timestamp for kafka)
5 participants