-
-
Notifications
You must be signed in to change notification settings - Fork 2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Refactor WSMessage to use tagged unions #7319
base: master
Are you sure you want to change the base?
Conversation
❌ 3 Tests Failed:
View the top 3 failed tests by shortest run time
To view individual test run time comparison to the main branch, go to the Test Analytics Dashboard |
Creating the dataclasses will be significantly slower than the NamedTuple, but the property accesses will be faster...probably need to benchmark it to see what the tradeoff will be |
for more information, see https://pre-commit.ci
A quick test locally suggests that NamedTuple will also work with unions. Let me know which you want to go with and I'll update (and then try to revert the micro-optimisations that got caught in conflict resolutions). |
Also worth noting that with dataclass we get errors because DataQueue expects a Sized thing. This suggests to me that DataQueue is probably not the best solution here as there seems to be no meaning to the size when processing WS messages. Maybe an unsized queue could be used here to be a little more optimised. |
I'll write a benchmark for all the options later today |
@@ -428,25 +496,23 @@ def _feed_data(self, data: bytes) -> None: | |||
raise WebSocketError( | |||
WSCloseCode.INVALID_TEXT, "Invalid UTF-8 text message" | |||
) from exc | |||
msg = tuple.__new__( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be some much nicer if we could lazy decode these since it results in a lot of cycles that could be avoided
Options compared from dataclasses import dataclass
from typing import NamedTuple
from aiohttp.http_websocket import WSMsgType
from typing import Optional, Literal
import timeit
BINARY_TYPE = WSMsgType.BINARY
@dataclass
class _WSMessageDataClass:
data: object
type: WSMsgType
extra: Optional[str] = None
@dataclass
class WSMessageBinaryDataClass(_WSMessageDataClass):
data: bytes
type: Literal[WSMsgType.BINARY] = BINARY_TYPE
@dataclass(frozen=True)
class _WSMessageFrozenDataClass:
data: object
type: WSMsgType
extra: Optional[str] = None
@dataclass(frozen=True)
class WSMessageBinaryFrozenDataClass(_WSMessageFrozenDataClass):
data: bytes
type: Literal[WSMsgType.BINARY] = BINARY_TYPE
class _WSMessageNamedTuple(NamedTuple):
data: object
type: WSMsgType
extra: Optional[str] = None
class WSMessageBinaryNamedTuple(_WSMessageNamedTuple):
data: bytes
type: Literal[WSMsgType.BINARY]
data = b"Hello, world!"
named_tuple_direct = timeit.timeit(
"x=WSMessageBinaryNamedTuple(data, BINARY_TYPE);x.data;x.type",
globals=locals(),
number=1000000,
)
named_tuple_tuple_new = timeit.timeit(
"x=tuple.__new__(WSMessageBinaryNamedTuple, (data, BINARY_TYPE));x.data;x.type",
globals=locals(),
number=1000000,
)
dataclass_direct = timeit.timeit(
"x=WSMessageBinaryDataClass(data);x.data;x.type", globals=locals(), number=1000000
)
frozen_dataclass_direct = timeit.timeit(
"x=WSMessageBinaryFrozenDataClass(data);x.data;x.type",
globals=locals(),
number=1000000,
)
print(f"named_tuple_direct: {named_tuple_direct}")
print(f"named_tuple_tuple_new: {named_tuple_tuple_new}")
print(f"dataclass_direct: {dataclass_direct}")
print(f"frozen_dataclass_direct: {frozen_dataclass_direct}") named_tuple_direct: 0.13957041699904948
|
This reworks WSMessage to be a union of dataclasses which can provide much better type safety using tagged unions (on the
type
attribute).I need to go over a couple of details (like what should be exported and from where, imports are a little messy currently). But, my main question before I finish it off, is whether it makes sense to have
.json()
only on the TEXT/BINARY messages? If we do this, it becomes easy for mypy to validate the code, requiring amsg.type is WSMsgType.TEXT
check before calling.json()
.Fixes #7313