-
Notifications
You must be signed in to change notification settings - Fork 44
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
349: Implemented round() function #359
base: main
Are you sure you want to change the base?
Conversation
Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA). View this failed invocation of the CLA check for more information. For the most up to date status, view the checks section at the bottom of the pull request. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a great start! 🚀
Missing:
- Tests
- Be thorough
- Test edge cases
- Test passing an EventSet with a single feature
- Test passing an EventSet with several features
- Test passing an EventSet with features of non-accepted types (e.g. string, ints)
- Test passing an EventSet with some float32 and some float64 features, ensure output types are the same as input ones
- Adding the new operation to the docs
See last two points in https://temporian.readthedocs.io/en/latest/contributing/#developing-a-new-operator for guidance and examples.
Let's keep this PR for the round() function only, and open a new one for floor() and ceil() once done.
Thanks!
temporian/core/operators/unary.py
Outdated
@classmethod | ||
def allowed_dtypes(cls) -> List[DType]: | ||
return [ | ||
DType.INT32, | ||
DType.INT64, | ||
] | ||
|
||
@classmethod | ||
def get_output_dtype(cls, feature_dtype: DType) -> DType: | ||
return feature_dtype |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These look off:
allowed_dtypes
specifies the types that the operator can consume - should be DType.FLOAT32 and DType.FLOAT64get_output_dtype
returns the output dtype, given the input dtype - in this case it should return the same type as received (I said it should return ints in the issue, edited it, needs to return floats because they have a much larger range than ints)
temporian/core/operators/unary.py
Outdated
operator_lib.register_operator(InvertOperator) | ||
operator_lib.register_operator(IsNanOperator) | ||
operator_lib.register_operator(NotNanOperator) | ||
operator_lib.register_operator(AbsOperator) | ||
operator_lib.register_operator(LogOperator) | ||
operator_lib.register_operator(RoundOperator) | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
extra newline here too
Thanks for your time! |
Let's do |
Hi.. |
with self.assertRaises(TypeError): | ||
_ = evset |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
missing calling round()
on the evset here - this shouldn't be raising anything? did you manage to run the tests locally to ensure they pass?
) | ||
expected = event_set( | ||
timestamps=[1, 2], | ||
features={"a": [11, 12], "b": [1, 3]}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think these will pass, since a list of ints will yield an eventset with int features, which won't be equal to the float results of round(). Ensure your new tests are passing by running bazel test //temporian/core/operators/test:test_unary --config=macos --test_output=errors
or bazel test //temporian/core/operators/test:test_unary --config=linux --test_output=errors
depending on your OS
def test_round_float32_and_float64_features(self): | ||
evset = event_set( | ||
timestamps=[1, 2], | ||
features={"a": [10.5, 11.7], "b": [1.2, 2.9]}, | ||
) | ||
expected = event_set( | ||
timestamps=[1, 2], | ||
features={"a": [11.0, 12.0], "b": [1.0, 3.0]}, | ||
same_sampling_as=evset, | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
which of these are being defined as f32 and which as f64? see the f64
and f32
methods in temporian/test/utils.py
to explicitly create an eventset with the desired feature types
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I need a clarification. Can I use the numpy for specifying the float32 and float64 separately as
float32
def test_round_float32(self):
evset = event_set(
timestamps=[1, 2],
features={"a": np.array([10.5, 11.7], dtype=np.float32), "b": np.array([1.2, 2.9], dtype=np.float32)},
)
expected = event_set(
timestamps=[1, 2],
features={"a": np.array([11.0, 12.0], dtype=np.float32), "b": np.array([1.0, 3.0], dtype=np.float32)},
same_sampling_as=evset,
)
assertOperatorResult(self, evset.round(), expected)
assertOperatorResult(self, round(evset), expected) # __round__ magic
float64
def test_round_float64(self):
evset = event_set(
timestamps=[1, 2],
features={"a": np.array([10.5, 11.7], dtype=np.float64), "b": np.array([1.2, 2.9], dtype=np.float64)},
)
expected = event_set(
timestamps=[1, 2],
features={"a": np.array([11.0, 12.0], dtype=np.float64), "b": np.array([1.0, 3.0], dtype=np.float64)},
same_sampling_as=evset,
)
assertOperatorResult(self, evset.round(), expected)
assertOperatorResult(self, round(evset), expected) # __round__ magic
temporian/core/operators/unary.py
Outdated
DType.INT32, | ||
DType.INT64, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ints shouldn't be allowed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good work so far, please run tests locally to make sure they pass before submitting
temporian/core/event_set_ops.py
Outdated
self: EventSetOrNode, | ||
) -> EventSetOrNode: | ||
"""Rounds the values of an [`EventSet`][temporian.EventSet]'s features to the nearest integer. | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add another line specifying that only float types are allowed, and the output type will always be the same as the input's
One more thing, please merge main branch into your PR's branch so that tests are ran on it (fixed in #363) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking good, There are some leftovers from the merge on index.md
Black should take care of the incorrect indentations and unnecessary empty lines. If you have any issue running black reach out to me on discord
assertOperatorResult(self, evset.round(), expected) | ||
assertOperatorResult(self, round(evset), expected) # __round__ magic | ||
|
||
def test_correct_sin(self) -> None: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The indentation of this line is off. Also no need for type annotations in these test
def test_correct_sin(self) -> None: | |
def test_correct_sin(self): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You should run black .
on the root of the repository (might need to call poetry shell
or pre-ped poetry run
) to fix the other formatting issues
@@ -44,6 +44,8 @@ Check the index on the left for a more detailed description of any symbol. | |||
| ---------------------------------------------------------------------------------------------------------- | -------------------------------------------------------------------------------------------------------------- | | |||
| [`tp.combine()`][temporian.combine] | Combines events from [`EventSets`][temporian.EventSet] with different samplings. | | |||
| [`tp.glue()`][temporian.glue] | Concatenates features from [`EventSets`][temporian.EventSet] with the same sampling. | | |||
| [`EventSet.abs()`][temporian.EventSet.abs] | Computes the absolute value of the features. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
abs
is repeated on line 49 below
@@ -44,6 +44,8 @@ Check the index on the left for a more detailed description of any symbol. | |||
| ---------------------------------------------------------------------------------------------------------- | -------------------------------------------------------------------------------------------------------------- | | |||
| [`tp.combine()`][temporian.combine] | Combines events from [`EventSets`][temporian.EventSet] with different samplings. | | |||
| [`tp.glue()`][temporian.glue] | Concatenates features from [`EventSets`][temporian.EventSet] with the same sampling. | | |||
| [`EventSet.abs()`][temporian.EventSet.abs] | Computes the absolute value of the features. | |||
| [`EventSet.add_index()`][temporian.EventSet.add_index] | Adds indexes to an [`EventSet`][temporian.EventSet]. | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
add_index
is also repeated below
self: EventSetOrNode, | ||
) -> EventSetOrNode: | ||
"""Rounds the values of an [`EventSet`][temporian.EventSet]'s features to the nearest integer. | ||
only float types are allowed and the output. Output type wil be same as the input type |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
indentation is incorrect, black should fix this
assertOperatorResult(self, evset.round(), expected) | ||
assertOperatorResult(self, round(evset), expected) # __round__ magic | ||
|
||
def test_round_float64(self): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
test_round_float64
and test_round_float32
are doing the same thing. You can use functions f64
and f32
to create an array with that type (link).
f64
is used above in test_correct_isnan
@@ -272,6 +277,7 @@ class ArcTanOperator(BaseUnaryOperator): | |||
def op_key_definition(cls) -> str: | |||
return "ARCTAN" | |||
|
|||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
unnecessary empty line
@@ -414,5 +431,6 @@ def arctan( | |||
assert isinstance(input, EventSetNode) | |||
|
|||
return ArcTanOperator( | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
unnecessary empty line
I have made changes the files
Kindly review my implementation.